← Back to Credentum

Agent Bootstrap

The Four Laws of Agentic Context: How 160 Words Beat 1,000

Key Finding: 160 words of deterministic navigational anchors — a flat-format repo map, file handles, and warnings — achieves 1.000 accuracy with zero noise across 174 trials. Every other strategy either adds noise (raw memory: 79%) or latency (exploration pointers: +20 seconds) with no proportional accuracy gain.

Abstract

We ran 174 trials across 12 startup context variants and 5 task types to determine what autonomous agents need at boot time. The answer: the agent doesn't need to know what happened. It needs to know where things are.

A 160-word flat-format briefing containing repo names, file paths, and warnings outperformed 1,000+ words of narrative memory, LLM-compressed summaries, and tool-based retrieval systems. We report four empirically validated laws governing agentic context injection.

The Four Laws

LawStatement
1The 160-Word Ceiling — Beyond 160 words, noise grows faster than accuracy
2Density ≠ Relevance — Navigational anchors beat compressed facts
3Curiosity is a Latency Tax — Never invite exploration; agents treat suggestions as obligations
4Agents Understand Document Hierarchies — CLAUDE.md > MEMORY.md authority resolution is solved

Method

Each trial spawned an isolated Claude Sonnet session via headless CLI (claude --print) with controlled startup context. We measured accuracy (ground truth keyword matching), noise (fraction of hallucinated or irrelevant citations), and latency.

Task Types

TaskExample
Orientation"What repos exist and what do they do?"
Discovery"Find a specific module and list its exports"
Task Execution"Read this code and extract implementation details"
Memory Recall"What does this project's CLAUDE.md say about X?"
Conflict"CLAUDE.md says X, MEMORY.md says Y — which is correct?"

Phases

PhaseTrialsFocus
Phase 1: Baseline768 variants across 4 task types
Phase 1b: Conflict27Authority resolution (CLAUDE.md vs MEMORY.md)
Phase 2: Efficiency Frontier56Stress test top 2 variants (N=24 each)
Phase 3: Learned Anchors15Data-driven budget allocation (train + holdout)

Results

Phase 1: Variant Rankings (76 trials)

VariantWordsAccuracyNoiseAdj. Score
anchor_compact1601.0000.0001.000
briefing_light1610.9900.0420.948
bare (nothing)00.9380.0000.938
memory_compact1401.0000.3300.667
tool_pull1641.0000.4500.550
briefing_full1,1851.0000.7600.243
personalized1,2790.9400.7700.219
memory_only1,0151.0000.7900.206

The Pattern: All variants above 160 words have noise-adjusted scores below 0.55. All variants at or below 160 words score 0.55 or higher. The 160-word ceiling is not arbitrary — it marks where context shifts from navigational to narrative.

Phase 2: Stress Test (N=24 per variant)

VariantNAccuracyNoiseAdj. ScoreMean Time
anchor_compact240.9790.0000.97960.2s
briefing_light240.9900.0420.94864.0s

Phase 3: Learned Anchors (15 tasks, train + holdout)

VariantTrain (N=8)Holdout (N=7)Overall (N=15)
bare0.8440.9290.883
anchor_compact1.0001.0001.000
anchor_learned0.9690.9290.950

Key Findings

1. The Triumph of Deterministic Navigation

anchor_compact uses zero LLM calls. It scans the filesystem deterministically — repos, recent git activity, CLAUDE.md locations, uncommitted changes — and produces 160 words of navigational anchors. It beat every other variant because every word is useful. Zero noise.

2. The Prohibition Paradox

We explicitly told the agent "DO NOT READ the index unless stuck." The agent still read it, generated 25% noise, and dropped to 0.75 on task execution. Mentioning a tool consumes attention whether you invite or prohibit its use.

3. Information Dilution

In Phase 3, the learned compactor allocated 47 words (31% of the 160-word budget) to warnings about gh auth login — content with zero navigational value. This budget theft caused a discovery miss (0.75 vs 1.00) by crowding out the repo description that would have helped. Fix: cap warnings at 15 words maximum.

4. Formatting is a Tax

With identical content at 160 words, flat-format anchor_compact (no markdown headers, no indentation) beat structured anchor_learned (section headers, bold labels) by 5%. Every # or ** is a token that could have been a file path.

5. Agents Resolve Document Hierarchies

27 adversarial conflict trials pitted CLAUDE.md against MEMORY.md with fake bug claims and contradictory instructions. Result: 1.00 accuracy, 0 hallucinations. Every agent correctly identified CLAUDE.md as authoritative.

6. Learned Selection Produces Tautological Anchors

Scoring file paths by grep frequency produces task-answer cheat sheets — 100% single-task tautology. The file agents grep for most is the file the task asks about. Journey-based scoring (orientation loops, search loops) is the correct abstraction but doesn't beat manual curation at 160 words.

Discussion

The 160-word ceiling represents where context shifts from navigational ("here is where things are") to narrative ("here is what happened"). Narrative competes with the agent's own reasoning for attention. Navigation does not.

Memory injection (1,000+ words) achieves high raw accuracy but generates 77-79% noise. The agent cites commit SHAs, phone numbers, and deploy configurations that have nothing to do with the task. The context acts as an "attention DDoS" — flooding the agent's working memory with plausible but irrelevant facts.

The Curiosity Tax is robust to framing. Positive ("read this if you need it"), neutral ("context index available"), and negative ("DO NOT READ") all trigger exploration. The only reliable strategy is omission.

Three Principles

Limitations

The Production Standard

The winning configuration: 160 words, flat format, deterministic, no LLM.

Repos (authoritative — do not re-verify with ls):
- veris-platform: Veris Platform
- vivarium: AO/HyperBEAM process development. Lua 5.3 on Arweave.
  [... top 8 repos with one-line descriptions ...]
- Also: repo1, repo2, repo3 [remaining repos, no descriptions]
CLAUDE.md locations:
- worktrees/persistent/agent-dev-config/CLAUDE.md
  [... up to 3 locations ...]
Key files (zero-grep handles):
- vivarium/ao/lib/safe.lua — SafeLibrary — auth, guards, audit trail
  [... up to 5 handles ...]
Quick actions:
- repo-name: N uncommitted change(s)
Warnings:
- NEVER run gh auth login --with-token [15 words max]

Materials

All materials available at: github.com/credentum/vivarium-lab

174 trials. 12 variants. 5 task types. Designed, executed, and written with AI assistance (Claude Opus 4.5/4.6).