We don't train on the internet. We train on reality.
While others bet on bigger datasets, Mateza builds agents that learn from scratch. Our first step is critical infrastructure: open datasets and benchmarks for African languages.
Today's "AI" models—GPT, Claude, Llama—are miracles of engineering, but they are fundamentally Statistical Engines.
They work by compressing a large share of the public internet into a neural network. When you ask them a question, they aren't "thinking"; they are predicting the most likely next token (roughly, the next word or word fragment) from statistical patterns in the text they were trained on.
We are hitting the Data Wall. Returns on data are sharply diminishing: by our rough estimate, making current models 10% smarter takes on the order of 100x more data.
But here is the problem: We have already read the internet.
Tech giants are spending billions on bigger GPUs to squeeze marginal gains out of a stagnant pool of data. This is a dead end. You cannot build a superhuman intelligence by only reading human text.
If we cannot read more books, we must experience the world.
Mateza is building the Gymnasium for Minds. We are creating high-fidelity, physics-compliant simulations where our agents live, experiment, and learn.
By solving puzzles in these "Synthetic Realities", our agents derive the laws of logic from scratch. They don't memorize; they understand. This creates a Logic Kernel that is small, efficient, and designed to scale with compute rather than data.
Agents learn gravity, friction, and causality by interacting with the world.
Self-improving logic structures that build upon previous knowledge.
Intelligence scales with compute, not data. O(N) complexity.
Logic chains are fully auditable, eliminating black-box hallucinations.
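As a toy illustration of what "learning gravity by interacting with the world" can mean, here is a minimal Python sketch. It is not Mateza's simulator or training method. It assumes a hypothetical one-dimensional environment with no air resistance, and every name in it (`TRUE_G`, `simulate_fall_time`, `estimate_gravity`, `run_experiment`) is invented for this example. The agent drops an object from random heights, records noisy fall times, and recovers the gravitational constant with a least-squares fit, without ever reading the constant directly.

```python
import random

# Hidden constant inside the toy simulator. The "agent" never reads it directly;
# it only sees the fall times the environment returns.
TRUE_G = 9.81  # m/s^2


def simulate_fall_time(height_m, noise_s=0.0, rng=None):
    """Toy environment: seconds for an object dropped from rest to fall height_m (no drag)."""
    t = (2.0 * height_m / TRUE_G) ** 0.5
    if rng is not None and noise_s > 0:
        t += rng.gauss(0.0, noise_s)  # measurement noise on the observed time
    return t


def estimate_gravity(heights, times):
    """Least-squares fit of g in h = 0.5 * g * t^2 (a line through the origin in x = t^2 / 2)."""
    xs = [0.5 * t * t for t in times]
    numerator = sum(x * h for x, h in zip(xs, heights))
    denominator = sum(x * x for x in xs)
    return numerator / denominator


def run_experiment(n_trials=200, noise_s=0.01, seed=0):
    """The agent's loop: pick a height, drop the object, record the time, then fit g."""
    rng = random.Random(seed)
    heights = [rng.uniform(1.0, 20.0) for _ in range(n_trials)]
    times = [simulate_fall_time(h, noise_s, rng) for h in heights]
    return estimate_gravity(heights, times)


if __name__ == "__main__":
    print(f"estimated g = {run_experiment():.2f} m/s^2")
```

The estimate comes entirely from experiments the agent runs itself, so more trials (more compute) tighten it without any external text data. A real system would have to discover the form of the law as well; here the quadratic relationship is assumed in advance.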
While our long-term vision is synthetic intelligence, we are currently working on critical infrastructure for African languages.
We are collaborating with partners across Rwanda, South Africa, and Eswatini to build a large-scale, open speech dataset for siSwati—a language spoken by millions but dramatically underrepresented in AI systems.
"All datasets, code, and benchmarks we create will be released openly under permissive licenses. We believe African language technologies must be built with communities, not extracted from them."
Our Commitment
Automatic speech recognition (ASR) benchmarks, evaluation frameworks, and model fine-tuning for under-resourced African languages.