Persona Skills for Claude

The Hierarchy of Agentic Reliability: From Routing to Orchestration

Key Finding: Natural Language Routing achieves 81%+ reliability for single-shot tool selection when skill descriptions use few-shot examples and negative constraints. But it fails completely on compound, multi-step prompts, and that failure marks the architectural ceiling of the approach.

What We Did

We used Anthropic's Agent Skills system to shape Claude's behavioral voice and persona rather than its procedural capabilities. This is off-label but effective: skills are designed for task workflows, but they work for disposition shaping too.

We ran a 16-phase evaluation testing activation rates, failure patterns, and architectural limits.

The Hypothesis

Can Claude's Skills system be used for persona/voice shaping rather than procedural task completion?

Traditional use: "When user wants to create a PDF, load these instructions."

Our use: "When user wants exploratory dialogue, adopt this conversational voice."

Key Findings

1. Activation is Bimodal

The overall activation rate of 56% is misleading. Activation is actually bimodal:

Prompt Type | Activation Rate
Abstract trigger phrase only | ~0%
Concrete dilemma with named options | ~100%
Mixed/ambiguous | Variable
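
To illustrate why a single aggregate figure can hide a bimodal split, here is a minimal sketch that groups trial outcomes by prompt type. The trial records and the `activation_rates` helper are hypothetical and invented for illustration; they are not the study's data or tooling.

```python
from collections import defaultdict

# Hypothetical trial log: (prompt_type, skill_activated).
# These values are made up for illustration; they are NOT the study's data.
trials = [
    ("abstract", False), ("abstract", False), ("abstract", False), ("abstract", False),
    ("concrete", True), ("concrete", True), ("concrete", True), ("concrete", True),
    ("mixed", True), ("mixed", False),
]


def activation_rates(trials):
    """Return (overall_rate, {prompt_type: rate}) for a list of trials."""
    counts = defaultdict(lambda: [0, 0])  # prompt_type -> [activations, total]
    for prompt_type, activated in trials:
        counts[prompt_type][0] += int(activated)
        counts[prompt_type][1] += 1
    per_type = {t: hits / total for t, (hits, total) in counts.items()}
    overall = sum(hits for hits, _ in counts.values()) / len(trials)
    return overall, per_type


overall, per_type = activation_rates(trials)
print(f"overall: {overall:.0%}")
for prompt_type, rate in per_type.items():
    print(f"{prompt_type}: {rate:.0%}")
```

In this toy log the overall rate sits in the middle even though two of the three groups are at the extremes, which is the pattern the table describes.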

2. Few-Shot Examples Double Activation

Description phrasing matters enormously. Examples create a "semantic cone": the model interpolates between them rather than litigating against rigid definitions.

3. Three Failure Classes (All Solved)

Failure Class | Symptom | Solution
Ambiguity | Wrong skill activates | Few-shot examples (semantic cone)
Overconfidence | Skill activates when it shouldn't | PROHIBITED constraints
Collision | Overlapping namespaces | Scope exclusion
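
The three fixes in the table above can be combined in one skill description. The sketch below assembles such a description as plain text, assuming the skill is selected from its description. The `build_description` helper, the section wording, and the example phrases are all hypothetical, not the phrasing used in the study.

```python
def build_description(summary, examples, prohibited, out_of_scope):
    """Assemble a skill description from a summary plus three guard sections."""
    lines = [summary.strip(), "", "Use when the user says things like:"]
    lines += [f'- "{e}"' for e in examples]          # few-shot examples (ambiguity)
    lines += ["", "PROHIBITED: do not activate for:"]
    lines += [f"- {p}" for p in prohibited]          # negative constraints (overconfidence)
    lines += ["", "Out of scope (handled by other skills):"]
    lines += [f"- {s}" for s in out_of_scope]        # scope exclusion (collision)
    return "\n".join(lines)


# Hypothetical content for an exploratory-dialogue persona skill.
description = build_description(
    summary="Adopt an exploratory conversational voice for open-ended dilemmas.",
    examples=[
        "Should I take the startup offer or stay at my current job?",
        "I'm torn between rewriting the service and patching it.",
    ],
    prohibited=["Requests for a direct factual answer", "Code generation tasks"],
    out_of_scope=["Document creation workflows"],
)
print(description)
```

Keeping each guard in its own labeled section makes it easy to test one change at a time, for example adding a PROHIBITED line and re-running the activation checks.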

4. The Hard Limit: Compound Prompts Fail

Phase 16 Discovery: Natural Language Routing is single-shot. Compound prompts like "do X, then trigger skill Y" fail due to:

  • Token Gravity: Early tasks absorb all attention ("Skip-to-Panel")
  • Context Decay: Later instructions fade ("First-Intent-Wins")

Conclusion

Natural Language Routing is sufficient for Tool Selection but insufficient for Task Orchestration.

Multi-step workflows require a dedicated Supervisor state-manager or explicit Human-in-the-Loop intervention.
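
A minimal sketch of the Supervisor idea, under stated assumptions: the workflow state lives in ordinary code, and the router only ever sees one single-intent prompt at a time. The `Supervisor` class, the `route` callable, and the `confirm` gate are hypothetical names; `fake_route` stands in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Supervisor:
    """Holds workflow state and issues one single-intent prompt per step."""
    route: Callable[[str], str]            # single-shot router: prompt -> result
    steps: List[str]
    results: List[str] = field(default_factory=list)

    def run(self, confirm: Callable[[str, str], bool] = lambda step, result: True):
        for step in self.steps:
            result = self.route(step)      # each call carries exactly one intent
            if not confirm(step, result):  # optional human-in-the-loop gate
                break
            self.results.append(result)
        return self.results


def fake_route(prompt: str) -> str:
    # Stand-in for a real model call; returns a label so the flow is visible.
    return f"handled: {prompt}"


supervisor = Supervisor(
    route=fake_route,
    steps=["Summarize the report", "Respond in the exploratory-dialogue voice"],
)
results = supervisor.run()
print(results)
```

Passing a `confirm` function that asks a person to approve each result turns the same loop into the Human-in-the-Loop variant.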

Recommendations

Use Case | Recommended Approach
Single-shot tool selection | Natural Language Routing
Multi-step workflows | Human-in-the-Loop OR Supervisor pattern
High-reliability routing | Few-shot examples + PROHIBITED constraints
Persona/behavioral skills | Observable signals + verbal tics + examples

Full Guide

The complete 1,450-line practitioner's guide is available on GitHub.

Materials

All materials are available at: github.com/credentum/vivarium-lab

This research was conducted collaboratively by Matt + Claude (Opus 4.5). Study complete December 2025.