We broke the scaling laws.
15M parameters. One epoch on a laptop. 2M Zulu sentences. 61.7% zero-shot intent on Korean. 56.5% on Japanese. 40+ languages. Runs on a $50 phone. No data centers. No H100s. No parallel data.
Zero-shot intent on Korean and Japanese — from a 15M-parameter model that only saw Zulu. 60× above random.
CPU inference. No GPU, no cloud round-trip. Runs on a $50 phone. Your data never leaves the device.
Sentiment, negation, intent — composable transforms that transfer across languages. Every call signed, reversible, auditable.
Frontier models serve roughly 100 of Earth's 6,000 languages and demand GPUs to run (Stanford HAI, 2024). Bhala is the inversion — small, local, governable — and the substrate for every other language and every other stack.
The three inversions
A new class of AI.
Three bets the industry got wrong. Three results that change the valuation of every model trained under the old assumptions.
Capability isn't scale.
A 15M-parameter model beats InkubaLM-422M on Swahili intent and ties GPT-4o — trained on one language, in one epoch, on a laptop.
See Swahili head-to-head
Train structure once. Every language inherits it.
Zulu → Korean (61.7%). Zulu → Japanese (56.5%). Zulu → Hindi (60.3%). Zulu → Amharic (60.9%). No parallel data. No target-language training.
See cross-family matrix
Prompts are blunt. Operators are algebra.
Name them. Compose them. Sign them. Reverse them. Nine independent claims in the provisional filing protect the controllable-embedding API as a product.
See the API spec
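What that algebra could look like under the hood, sketched in Python: a minimal model in which each named operator is a direction vector composed additively in embedding space. The vectors, dimension, and operator names below are illustrative assumptions, not the shipped implementation.

import numpy as np

# Illustrative assumption: each named operator is a unit direction in embedding space.
rng = np.random.default_rng(0)
OPERATORS = {
    "sentiment_positive": rng.standard_normal(128),
    "intent_redirect":    rng.standard_normal(128),
}
OPERATORS = {k: v / np.linalg.norm(v) for k, v in OPERATORS.items()}

def apply_ops(embedding, ops):
    # Compose named operators additively: e' = e + sum(alpha_i * v_i)
    shifted = embedding.copy()
    for op_id, alpha in ops:
        shifted = shifted + alpha * OPERATORS[op_id]
    return shifted

e = rng.standard_normal(128)                                   # stand-in for a sentence embedding
shifted  = apply_ops(e, [("sentiment_positive", 1.0)])         # apply
restored = apply_ops(shifted, [("sentiment_positive", -1.0)])  # reverse: negate alpha
assert np.allclose(e, restored)                                # additive shifts invert exactly

Under this additive model, applying the same operator with a negated alpha undoes the shift exactly, which is what makes an intervention reversible rather than merely logged.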
Business Impact
Measurable outcomes for the enterprise.
We focus on the results that move the needle: reducing risk, ensuring compliance, and slashing operational costs through verifiable AI.
Your AI stops making things up when the question is sensitive.
When a customer asks a compliance-sensitive question, your AI pulls the right documents the first time — and you get a signed record of exactly why it chose them. No guessing. No arguing with auditors.
Show the regulator what the AI did, before they ask.
Every decision the model makes comes with a plain-language receipt: what it saw, what it chose, why. Remove bias on a specific prompt without retraining the model. Prove it was removed.
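One way such a receipt could be verified, sketched with HMAC-SHA256. The field names and signing scheme here are assumptions for illustration; the actual audit format and signature algorithm are not specified on this page.

import hashlib, hmac, json

def verify_receipt(receipt: dict, shared_key: bytes) -> bool:
    # Recompute the signature over the canonicalized receipt body and compare.
    body = {k: v for k, v in receipt.items() if k != "signature"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    expected = hmac.new(shared_key, canonical, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["signature"])

key = b"demo-key"
receipt = {
    "audit_id": "aud_01HXX...",
    "model": "sci-v3",
    "operators_applied": [{"id": "bias_gender", "alpha": -1.0}],
}
receipt["signature"] = hmac.new(
    key,
    json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode(),
    hashlib.sha256,
).hexdigest()
assert verify_receipt(receipt, key)

The point of the pattern: an auditor can recheck the signature without trusting the serving stack, so "prove it was removed" reduces to recomputing one hash.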
Your product keeps working when the internet doesn't.
The model fits on a cheap smartphone and answers in under a blink. No cloud call, no monthly bill per user, no data leaves the device. Ideal for emerging markets, field work, and anything offline.
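The footprint claim is easy to sanity-check with arithmetic: at 15M parameters, the weights alone fit in tens of megabytes. The quantization levels below are standard options, not a statement of how the model actually ships.

PARAMS = 15_000_000
for name, bytes_per_param in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    mb = PARAMS * bytes_per_param / 1024**2
    print(f"{name}: {mb:.0f} MB")   # fp32: 57 MB, fp16: 29 MB, int8: 14 MB

Even unquantized, that is well within the RAM of a $50 phone.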
Pick the product
Which one fits your situation?
Every use case below is a ready-made product. Pick the one closest to what you need — you can always compose your own from the same building blocks.
One call. A signed receipt.
Apply a named action — change sentiment, redirect intent, remove bias — to any query at inference. Every call returns the result and a signed record your compliance team can read.
• One HTTP call. No retraining. No re-encoding.
• Works across 40+ languages with the same action.
• Every intervention logged, reversible, auditable.
curl https://api.bhala.ai/v1/embeddings/shift \
-H "Authorization: Bearer $BHALA_KEY" \
-H "Content-Type: application/json" \
-d '{
"text": "I want to cancel my subscription",
"lang": "en",
"operators": [
{ "id": "sentiment_positive", "alpha": 1.0 }
]
}'
# → {
# "embedding": [ ... 128 floats ... ],
# "operators_applied": [
# { "id": "sentiment_positive", "alpha": 1.0,
# "shift_norm": 0.431, "latency_ms": 23 }
# ],
# "model": "sci-v3",
# "audit_id": "aud_01HXX..."
# }
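The same call from Python, plus the reversal pattern the audit record makes possible. Treating a negated alpha as the inverse of a shift is an assumption carried over from the additive operator sketch above, not documented API behavior.

import os
import requests

URL = "https://api.bhala.ai/v1/embeddings/shift"
HEADERS = {"Authorization": f"Bearer {os.environ['BHALA_KEY']}"}

def shift(text, ops):
    # One HTTP call: returns the shifted embedding plus a signed audit record.
    r = requests.post(URL, headers=HEADERS,
                      json={"text": text, "lang": "en", "operators": ops})
    r.raise_for_status()
    return r.json()

out = shift("I want to cancel my subscription",
            [{"id": "sentiment_positive", "alpha": 1.0}])
print(out["audit_id"], out["operators_applied"][0]["shift_norm"])

# Reversal sketch: reissue the call with alpha negated to undo the shift
# (an assumption based on additive composition, not documented behavior).
undone = shift("I want to cancel my subscription",
               [{"id": "sentiment_positive", "alpha": -1.0}])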
Roadmap
From one language to the substrate of every stack.
We proved the hardest case first. Next we ship the API, then we become the interpretability layer for every foundation model.
Proof on the hardest case
Pretrain on a single morphologically-rich language (isiZulu). Demonstrate zero-shot transfer to 40+ more across 10 families. Core patents filed on the architecture and the operators.
Governable Embeddings API
Ship the operator library, audit log, and per-operator billing. First enterprise pilots in regulated verticals — banking, healthcare, compliance.
Composition layer for every foundation model
A thin layer that wraps OpenAI, Cohere, or any customer's in-house embedding model — and makes it governable. Our operators, their backbone.
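A hypothetical sketch of that wrapper pattern: take any provider's embedding function, apply named operators on top, and emit an audit record per call. Every name and interface here is illustrative; this is the architecture described above, not shipped code.

from typing import Callable
import uuid
import numpy as np

def governable(embed: Callable[[str], np.ndarray], operators: dict):
    # Wrap any embedding backbone so every call is shifted and audited.
    def wrapped(text: str, ops: list):
        e = embed(text)                       # customer's backbone: OpenAI, Cohere, in-house
        for op_id, alpha in ops:
            e = e + alpha * operators[op_id]  # our operators, their backbone
        audit = {"audit_id": f"aud_{uuid.uuid4().hex[:8]}",
                 "operators_applied": [{"id": i, "alpha": a} for i, a in ops]}
        return e, audit
    return wrapped

# Usage with a stand-in backbone:
backbone = lambda text: np.random.default_rng(0).standard_normal(128)
ops = {"sentiment_positive": np.random.default_rng(1).standard_normal(128)}
shift = governable(backbone, ops)
embedding, audit = shift("refund my order", [("sentiment_positive", 1.0)])

Because the wrapper never touches the backbone's weights, the same operator library and audit log apply to any embedding model a customer already runs.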
The interpretability layer of the AI stack
Every regulator, bank, and health system requires auditable interventions. We are the substrate that makes their AI legal to ship.
Compose the intelligence you need.
Start with one core unit. Add more as you grow. Own the whole stack when you're ready for sovereign deployment.
Backed by