Available Now

Bantu Embeddings API

Vector embeddings purpose-built for Africa's languages

Generate high-quality vector embeddings for Bantu text. Power search, classification, and RAG in 22 African languages with a single API call.

Features

What you get

256- and 512-dimensional vectors
Trained with contrastive learning on Bantu language data
Sub-100ms latency
Supports IsiZulu, IsiXhosa, KiSwahili, ChiShona, and more
Powered by proprietary Bantu language models
REST API: a single POST to /v1/embed
Pricing: $0.02 per 1M tokens
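
To budget a workload, the listed rate can be turned into a quick estimate. This is a rough sketch that assumes billing is strictly linear in tokens, with no minimum charge, rounding, or volume tiers; the function name is hypothetical and not part of the API.

```python
PRICE_PER_MILLION_TOKENS_USD = 0.02  # rate from the pricing line above


def estimate_cost_usd(total_tokens):
    """Estimate spend for a token count, assuming purely linear pricing."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS_USD
```

Under that assumption, embedding 50 million tokens would come to about $1.00. Summing the `usage.total_tokens` field across responses gives the token count to plug in.
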
Request
curl -X POST https://api.bhala.ai/v1/embed \
  -H "Authorization: Bearer bh_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Sawubona, ngingakusiza kanjani?",
    "model": "bantu-embed-v1",
    "dimensions": 256
  }'
Response
{
  "object": "embedding",
  "data": [
    {
      "index": 0,
      "embedding": [0.0234, -0.0891, 0.1456, ...],
      "dimensions": 256
    }
  ],
  "model": "bantu-embed-v1",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
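
The same call can be made from Python. The sketch below uses only the standard library and mirrors the request and response shown above. The helper names (`build_request`, `parse_response`, `embed`) are illustrative, not an official SDK, and the check that `dimensions` equals the vector length is an assumption about the response.

```python
import json
import urllib.request

API_URL = "https://api.bhala.ai/v1/embed"  # endpoint from the curl example


def build_request(text, api_key, model="bantu-embed-v1", dimensions=256):
    """Build (but do not send) a POST request matching the curl example."""
    payload = {"input": text, "model": model, "dimensions": dimensions}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_response(body):
    """Return (embedding vector, total tokens) from a decoded response body."""
    first = body["data"][0]
    vector = first["embedding"]
    # Assumption: "dimensions" always equals the length of the vector.
    if len(vector) != first["dimensions"]:
        raise ValueError("embedding length does not match reported dimensions")
    return vector, body["usage"]["total_tokens"]


def embed(text, api_key, timeout=10.0, **kwargs):
    """Send the request and parse the JSON reply. Performs a network call."""
    request = build_request(text, api_key, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_response(json.load(response))
```

Calling `embed("Sawubona, ngingakusiza kanjani?", api_key)` with a valid key should return the vector and the token count. Error handling and retries are left out for brevity.
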

Use Cases

What you can build with the Bantu Embeddings API

Semantic search across multilingual documents

Retrieval-Augmented Generation (RAG) in Bantu languages

Text classification and clustering

Duplicate detection and content matching
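
Semantic search and duplicate detection both reduce to comparing embedding vectors. The sketch below uses cosine similarity, a common choice for embeddings, though this page does not state which metric the model is tuned for. The function names and the 0.95 threshold are illustrative assumptions, and the vectors are expected to come from the API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs, top_k=3):
    """Return up to top_k (doc_index, score) pairs, most similar first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def find_duplicates(vecs, threshold=0.95):
    """Return index pairs whose similarity meets the (assumed) threshold."""
    pairs = []
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            if cosine_similarity(vecs[i], vecs[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```

The pairwise loop is quadratic in the number of documents, which is fine for small collections. Larger corpora would typically use a vector index instead, and the threshold should be tuned on real data.
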

Language Support

Supported languages

Purpose-built for the Bantu language family, with more languages added on an ongoing basis.

IsiZulu: 12M+ speakers
IsiXhosa: 8M+ speakers
Sepedi: 5M+ speakers
Setswana: 5M+ speakers
Sesotho: 6M+ speakers
Xitsonga: 4M+ speakers
SiSwati: 2.5M+ speakers
Tshivenda: 1.3M+ speakers
Lingala: 20M+ speakers
Tshiluba: 6M+ speakers
Kikongo: 5M+ speakers
KiSwahili: 15M+ L1 speakers
Kinyarwanda: 13M+ speakers
ChiShona: 11M+ speakers
IsiNdebele: 2M+ speakers
ChiTonga: 1.5M+ speakers
Umbundu: 6M+ speakers
Kimbundu: 3M+ speakers
Kirundi: 11M+ speakers
Chichewa: 10M+ speakers
Bemba: 4M+ speakers
Luganda: 5.5M+ speakers

Ready to get started?

Start building with the Bantu Embeddings API today.