Credentum
"Truth, remembered."
We build telemetry and security tools for AI agents. Logging that captures what actually happened. Analysis that catches what shouldn't have.
Why We Exist
AI agents run in loops. They fabricate. They drift.
And nobody's watching.
We build the tools that watch. Structured logging for every output, confabulation detection, vocabulary drift measurement, and static analysis that catches security holes before deployment.
Not because we don't trust agents.
Because trust requires evidence.
What We Build
Open source tools for observing, measuring, and securing AI agent behavior.
Vivarium
An autonomous agent telemetry harness. Runs any OpenRouter model in a loop, logs every output with structured witness data, and measures three things: how much the agent fabricates (confabulation detector), how far its vocabulary drifts from baseline (Jaccard similarity), and how actual token usage compares with what was requested.
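To make the drift measurement concrete, here is a minimal sketch of a Jaccard-based vocabulary comparison. It is not Vivarium's actual code: the regex tokenizer, the function names, and the choice to report drift as one minus the Jaccard similarity are all assumptions made for illustration.

```python
import re


def vocabulary(text):
    """Return the set of lowercased word tokens in text.

    Assumption: a simple regex tokenizer stands in for whatever
    tokenization the real harness uses.
    """
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard_drift(baseline, current):
    """Return vocabulary drift as 1 - Jaccard similarity.

    0.0 means the two outputs use exactly the same vocabulary;
    1.0 means they share no words at all.
    """
    base_vocab = vocabulary(baseline)
    curr_vocab = vocabulary(current)
    union = base_vocab | curr_vocab
    if not union:
        # Two empty outputs: treat as no drift rather than dividing by zero.
        return 0.0
    return 1.0 - len(base_vocab & curr_vocab) / len(union)


if __name__ == "__main__":
    baseline_output = "The cat sat"
    later_output = "The dog sat"
    print(jaccard_drift(baseline_output, later_output))
```

In a loop, each new output could be compared against the first output of the run (the baseline), giving one drift score per iteration to log alongside the other telemetry.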
hello@credentum.ai
ao-lens
Static analysis and security auditing for AO processes. Tree-sitter parser, 25+ built-in checks, 20 community detection rules. Catches nil guard bypasses, determinism violations, missing auth, unsafe JSON, and state-wipe-on-replay bugs.
Vivarium Lab
Research findings from our work on AI agents and reliability. Real experiments. Honest results.
Agent Bootstrap
160 words of flat navigational anchors beat 1,000 words of memory. Four Laws of Agentic Context from 174 trials.
Movable Feast
LLMs know lunar holidays but can't find them on a calendar. Across frontier models, accuracy ranged from 0% to 100%.
Persona Skills
What we learned breaking Claude's skill system. 81% routing reliability, and where it fails completely.
We share research findings, tool updates, and honest takes on AI. No hype.