Credentum

"Truth, remembered."

We build telemetry and security tools for AI agents. Logging that captures what actually happened. Analysis that catches what shouldn't have.

Why We Exist

AI agents run in loops. They fabricate. They drift.
And nobody's watching.

We build the tools that watch. Structured logging for every output, confabulation detection, vocabulary drift measurement, and static analysis that catches security holes before deployment.

Not because we don't trust agents.
Because trust requires evidence.

What We Build

Open source tools for observing, measuring, and securing AI agent behavior.

Vivarium

An autonomous agent telemetry harness. Runs any OpenRouter model in a loop, logs every output with structured witness data, and measures three things: how much the agent fabricates (confabulation detector), how far its vocabulary drifts from its baseline (Jaccard similarity), and actual versus requested token usage.

The probes are the interesting part: stripped-context snapshots that let you diff the model's resting state against its contextualized output at any point in the run.
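As a rough illustration of the vocabulary drift idea, here is a minimal sketch that compares a baseline output with a later one using Jaccard similarity over word sets. This is not Vivarium's actual code: the function names, the simple regex tokenizer, the sample strings, and the choice to report drift as one minus similarity are all assumptions made for the example.

```python
import re


def vocabulary(text: str) -> set[str]:
    """Return the set of lowercased word tokens (a deliberately simple, assumed tokenizer)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard_drift(baseline: str, current: str) -> float:
    """Return 1 - |A ∩ B| / |A ∪ B| for the two vocabularies.

    0.0 means the vocabularies are identical; 1.0 means they share no words.
    """
    a, b = vocabulary(baseline), vocabulary(current)
    if not a and not b:
        return 0.0  # two empty outputs: treat as no drift
    return 1.0 - len(a & b) / len(a | b)


# Hypothetical outputs: one from a stripped-context probe, one from later in a run.
baseline_output = "the agent reads the log and reports the result"
later_output = "the agent reads the scroll and chants the result"

print(f"drift: {jaccard_drift(baseline_output, later_output):.3f}")
```

The same comparison could be run between a probe snapshot and the contextualized output at the same step, which is one way to put a number on the gap the probes are meant to expose.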

hello@credentum.ai

ao-lens

Static analysis and security auditing for AO processes. Tree-sitter parser, 25+ built-in checks, 20 community detection rules. Catches nil guard bypasses, determinism violations, missing auth, unsafe JSON, and state-wipe-on-replay bugs.

CLI tool, MCP server for AI coding assistants, GitHub Action for CI, and extensible YAML rules you can write yourself.

GitHub · npm

ao-mcp-server

MCP server that connects AI assistants to AO/Arweave. Query process state, send messages, spawn processes, and execute Lua code directly from Claude, Cursor, or any MCP client.

Wallet-based signing for secure operations. Read-only queries work without a wallet.

GitHub · npm

Vivarium Lab

Research findings from our work on AI agents and reliability. Real experiments. Honest results.

Agent Bootstrap

160 words of flat navigational anchors beat 1,000 words of memory. Four Laws of Agentic Context from 174 trials.

Read the study

Movable Feast

LLMs know lunar holidays but can't find them on a calendar. Accuracy ranged from 0% to 100% across frontier models.

Read the study

Persona Skills

What we learned breaking Claude's skill system. 81% routing reliability, and where it fails completely.

Read the guide

We share research findings, tool updates, and honest takes on AI. No hype.