Composable AI

Five core units. Composed into every capability we ship.

Bhala is not a model. Bhala is a set of small, interpretable units that compose into language understanding, translation, edge AI, sovereign deployments, and auditable reasoning. Each core unit is inspectable, replaceable, and sovereign by default.

Core Unit

Sozisi Manifold

Universal linguistic structure as a composable substrate

A geometric representation of meaning learned from the structural regularities shared across human languages. The manifold is language-agnostic at its core — new languages attach to it with seconds of adaptation, not weeks of retraining. This is the substrate every other core unit builds on.
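To make the "seconds, not weeks" claim concrete, here is a minimal sketch of one way fast attachment can work: fit a closed-form linear adapter that maps a new language's embeddings onto a shared substrate. Everything here (the shapes, the synthetic data, the least-squares adapter) is an illustrative assumption, not Bhala's actual procedure.

```python
# Illustrative sketch only: attaching a new language to a shared manifold
# via a least-squares linear adapter. All names, shapes, and data are
# hypothetical; Bhala's real adaptation method is not described on this page.
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are embeddings for 200 pivot sentences: rows align across
# languages (same sentence, different language).
manifold_points = rng.normal(size=(200, 64))                     # shared substrate
new_lang_points = (manifold_points @ rng.normal(size=(64, 64))
                   + 0.01 * rng.normal(size=(200, 64)))          # new language

# Fit W so that new_lang_points @ W ~= manifold_points.
# A single closed-form solve is why adaptation can take seconds, not weeks.
W, *_ = np.linalg.lstsq(new_lang_points, manifold_points, rcond=None)

residual = np.linalg.norm(new_lang_points @ W - manifold_points)
print(f"adapter fit residual: {residual:.3f}")
```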

  • Zero-shot transfer to 17+ languages across 10 families
  • Adapts to a new language in <2 seconds
  • Script-agnostic: Arabic, Devanagari, Hangul, Cyrillic
  • Stable under perturbation (robust by geometry)
Composes with Morpheme-Aware Tokenization · Governable Embeddings · Edge-Class Backbone
Core Unit

Governable Embeddings

Named, composable operators applied at inference

A patented embedding space where semantic dimensions (sentiment, intent, bias) are accessible as named operators you can apply to any query or document vector. Shift a query toward positive sentiment before retrieval. Redirect a user's intent from 'complaint' to 'refund offer' for counterfactual retrieval. Subtract a gender dimension to measurably debias a CV-screening pipeline. Every intervention is a single vector operation, auditable per call, with 100% flip accuracy on held-out test data across Zulu, Swahili, and Xhosa.
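The mechanics are easiest to see in miniature. The sketch below treats a named operator as a unit direction in embedding space, applies a shift and a projection, and logs every call. The operator names, strengths, and log format are invented for illustration; this is not Bhala's API.

```python
# Toy model of named, auditable operators on an embedding space.
# Directions are random here; in a real system they would be learned.
import numpy as np

rng = np.random.default_rng(1)
DIM = 64

# A named operator is just a unit vector in embedding space.
operators = {
    "sentiment+": rng.normal(size=DIM),
    "gender": rng.normal(size=DIM),
}
operators = {k: v / np.linalg.norm(v) for k, v in operators.items()}

audit_log = []

def apply_operator(vec, name, strength):
    """Shift a vector along a named dimension and record the call."""
    audit_log.append({"op": name, "strength": strength})
    return vec + strength * operators[name]

def remove_dimension(vec, name):
    """Project out a named dimension (e.g. to debias) and record it."""
    d = operators[name]
    audit_log.append({"op": f"remove:{name}"})
    return vec - (vec @ d) * d

query = rng.normal(size=DIM)
query = apply_operator(query, "sentiment+", strength=0.8)  # bias retrieval positive
query = remove_dimension(query, "gender")                  # debias before scoring
print(audit_log)                                           # every shift is logged
```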

  • 100% sentiment-flip accuracy on AfriSenti Zulu and Swahili
  • 100% intent-redirect accuracy across 4 tested transitions
  • 93–99% cross-lingual transfer from Zulu-trained operators
  • Every operator shift logged for audit + compliance
Composes with Sozisi Manifold · Edge-Class Backbone · Self-Healing Inference
Core Unit

Morpheme-Aware Tokenization

Inspectable tokens, not opaque subwords

Tokens carry linguistic meaning — you can read them. This is the opposite of BPE, where tokens are statistical fragments with no human-auditable structure. Morpheme-aware tokenization makes downstream core units more sample-efficient and their outputs more explainable.
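As a toy contrast with BPE, the sketch below segments words greedily against a small morpheme lexicon, so every token is a readable linguistic unit. The lexicon and the greedy longest-match strategy are assumptions for illustration; this page does not specify Bhala's actual segmentation algorithm.

```python
# Toy illustration of morpheme-aware tokenization vs. opaque subwords.
# Lexicon and strategy are invented for this sketch.
MORPHEMES = {"un", "happi", "ness", "read", "able", "re"}

def segment(word):
    """Greedy longest-match segmentation against a morpheme lexicon."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):   # try the longest candidate first
            if word[i:j] in MORPHEMES:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])          # fall back to a single character
            i += 1
    return tokens

print(segment("unhappiness"))  # ['un', 'happi', 'ness'] -- readable units
print(segment("rereadable"))   # ['re', 'read', 'able']
```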

  • Human-readable tokens across all supported languages
  • Compact vocabulary: ~5K tokens cover 23 languages
  • Dramatically more sample-efficient than BPE
  • Enables downstream interpretability
Composes with Sozisi Manifold · Edge-Class Backbone
Available Now

Edge-Class Backbone

Linear-time architecture for on-device inference

A sequence model with linear complexity in sequence length, the compute budget that lets 15M parameters run on a smartphone, a feature phone, or a sensor. Paired with the Sozisi Manifold and Morpheme-Aware Tokenization, it delivers GPT-4-class NLU at a fraction of the cost, fully offline.
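The linear-complexity claim is the key architectural point. The generic recurrent scan below shows the shape of the idea: fixed work per token against a fixed-size state, so cost grows linearly with sequence length and no n-by-n attention matrix is ever materialized. It is a stand-in for the class of model, not Bhala's backbone.

```python
# Why a linear recurrence gives O(n) sequence processing: each step
# touches only a fixed-size state. Generic sketch, not Bhala's architecture.
import numpy as np

rng = np.random.default_rng(2)
DIM = 16

A = rng.normal(scale=0.1, size=(DIM, DIM))   # state transition
B = rng.normal(scale=0.1, size=(DIM, DIM))   # input projection

def scan(inputs):
    """Process a sequence in one left-to-right pass: O(n) time, O(1) state."""
    state = np.zeros(DIM)
    outputs = []
    for x in inputs:
        state = np.tanh(A @ state + B @ x)   # fixed work per token
        outputs.append(state)
    return np.stack(outputs)

seq = rng.normal(size=(1000, DIM))           # 1,000 tokens
print(scan(seq).shape)                       # (1000, 16), no n^2 attention matrix
```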

  • <50ms inference on commodity hardware
  • No GPU required for production workloads
  • Runs on Android, iOS, and embedded Linux
  • 24MB footprint — fits alongside your app
Composes with Sozisi Manifold · Morpheme-Aware Tokenization · Self-Healing Inference
Available Now

Self-Healing Inference

Robustness as a core unit, not a bolt-on

A structural correction mechanism that pulls perturbed inputs back toward the manifold during inference. The result is a model that degrades gracefully on noisy, OOD, or adversarial inputs — essential for production deployment in regulated and edge environments.
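The description above is qualitative, so here is one standard way to realize "pull perturbed inputs back toward the manifold": project onto a learned low-dimensional subspace before inference. The subspace, noise model, and heal function are illustrative assumptions, not Bhala's mechanism.

```python
# Toy structural correction: snap a perturbed input back onto a learned
# low-dimensional manifold (here, a linear subspace) before inference.
import numpy as np

rng = np.random.default_rng(3)

# Clean data lives on a 4-d subspace of a 32-d space.
basis = np.linalg.qr(rng.normal(size=(32, 4)))[0]   # orthonormal manifold basis
clean = rng.normal(size=(500, 4)) @ basis.T

def heal(x, basis):
    """Project a vector onto the manifold subspace."""
    return basis @ (basis.T @ x)

x_noisy = clean[0] + 0.5 * rng.normal(size=32)      # perturbed input
x_healed = heal(x_noisy, basis)

# Projection removes the off-manifold component of the noise.
print("error before:", np.linalg.norm(x_noisy - clean[0]).round(3))
print("error after :", np.linalg.norm(x_healed - clean[0]).round(3))
```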

  • Graceful degradation on out-of-distribution inputs
  • Resilient to typos, code-switching, and transliteration drift
  • Stable inference under input perturbation
  • No retraining needed per failure mode
Composes with Sozisi Manifold · Governable Embeddings · Edge-Class Backbone

Compose what you need.

Call one core unit from the API. License the stack for sovereign deployment. Or compose your own — we don't lock you in.