May 29, 2026 • 6 min read • Agentic Harness Engineering

The Skills Registry: Hook Points, Auto-Activation, and 38 Skills

skills.py is the single source of truth for what the harness can do. Every skill is registered with a hook point that determines where in the pipeline it runs, and an optional auto-activation predicate that fires the skill without an explicit slash command.

The pipeline in agent.py doesn't import skill implementations directly — it asks skills.py which skills are active for a given task, then calls them at the appropriate stage. This indirection means adding a new skill is a two-step operation: register it in the REGISTRY dict, then implement it. The pipeline never needs to change.
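As a minimal sketch of that indirection, the lookup might look like the following. The two-entry registry, the `Plan` stand-in, and the `active_skills` helper are illustrative assumptions, not the actual contents of skills.py.

```python
import re
from dataclasses import dataclass


@dataclass
class Plan:
    # Hypothetical stand-in for the planner's Plan object.
    complexity: str = "low"


# Hypothetical two-entry registry; the real REGISTRY holds more entries and fields.
REGISTRY = {
    "deep": {
        "hook": "pre_research",
        "auto": lambda task, plan: bool(re.search(r"\bthorough\b", task, re.IGNORECASE)),
    },
    "cite": {"hook": "pre_synthesis", "auto": None},  # explicit only
}


def active_skills(task, plan, requested=()):
    """Group explicitly requested or auto-activated skill names by hook point."""
    by_hook = {}
    for name, entry in REGISTRY.items():
        auto = entry["auto"]
        if name in requested or (auto is not None and auto(task, plan)):
            by_hook.setdefault(entry["hook"], []).append(name)
    return by_hook


active = active_skills("Write a thorough survey of RAG", Plan(), requested={"cite"})
print(active)  # {'pre_research': ['deep'], 'pre_synthesis': ['cite']}
```

The pipeline only ever sees the grouped names, so a new registry entry becomes visible to it without any change to the calling code.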

Six hook points

standalone

Bypasses the research pipeline entirely: the skill handles the full task itself and returns its result

pre_research

Runs before any web searches — can modify search behaviour or inject file context

pre_synthesis

Injects additional instructions into the synthesis prompt via the prompt field

post_synthesis

Transforms or augments the synthesized output after it's written to disk

post_wiggum

Runs additional evaluation after the Wiggum loop — used by the 3-persona panel

modifier

Changes pipeline behaviour (e.g. /wiggum forces the evaluation loop on)
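One way to picture the ordering of the six hook points is a trace of where each would fire. This is a sketch under stated assumptions: the stage names (`research`, `synthesize`, `wiggum`) are placeholders, and for simplicity the evaluation loop runs here only when the `wiggum` modifier is present, which may not match the harness's real triggers.

```python
def trace_hooks(active):
    """Return the order in which stages and hooks fire.

    `active` maps a hook point name to the list of active skill names.
    """
    trace = []

    def fire(hook):
        trace.extend(f"{hook}:{name}" for name in active.get(hook, []))

    # standalone: the skill handles the whole task, nothing else runs.
    if active.get("standalone"):
        fire("standalone")
        return trace

    # modifier: changes pipeline behaviour rather than running at a stage.
    force_wiggum = "wiggum" in active.get("modifier", [])

    fire("pre_research")
    trace.append("research")
    fire("pre_synthesis")
    trace.append("synthesize")
    fire("post_synthesis")
    if force_wiggum:  # assumption: loop is off unless forced in this sketch
        trace.append("wiggum")
        fire("post_wiggum")
    return trace


print(trace_hooks({"standalone": ["annotate"], "pre_research": ["deep"]}))
# ['standalone:annotate']
```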

The full registry

[Figure: skill map, organized by pipeline hook type. Auto-activated skills are marked with a green ring.]

| Skill | Hook | Auto-activation |
| --- | --- | --- |
| /deep | pre_research | auto: task contains "comprehensive", "thorough", "exhaustive", "in-depth", "deep-dive" |
| /contextualize | pre_research | auto: task is self-referential ("what can you", "describe the agent", "about the harness") |
| /cite | pre_synthesis | explicit only (too broad to auto-trigger) |
| /date | pre_synthesis | auto: task mentions current date context ("today", "latest", "recent", "this year", "as of") |
| /time | pre_synthesis | explicit only |
| /scratchpad | pre_synthesis | explicit only (opt-in when output requires exact computed values) |
| /beige-book | pre_synthesis | auto: task mentions Fed, FOMC, inflation, labor market, GDP, or other macroeconomic terms |
| /knowledge-graph | post_synthesis | auto: task mentions "knowledge graph", "kg", or "visualize" |
| /panel | post_wiggum | auto: planner classifies task as complexity="high" |
| /validate-trades | post_wiggum | auto: task contains trading thesis keywords ("long thesis", "trade setup", "alpaca") |
| /wiggum | modifier | explicit only |
| /annotate | standalone | explicit only |
| /annotated-abstract | standalone | alias for /annotate |
| /email | standalone | explicit only |
| /github | standalone | explicit only |
| /review | standalone | explicit only |
| /lit-review | standalone | explicit only |
| /recall | standalone | explicit only |
| /queue | standalone | explicit only |
| /orientation | standalone | explicit only |
| /introspect | standalone | explicit only |
| /sync-wiki | standalone | explicit only |
| /playwright | standalone | explicit only |
| /sitemap | standalone | explicit only |
| /crawl | standalone | alias for /sitemap |
| /re-orient | standalone | explicit only |
| /debug | standalone | explicit only |
| /suggest | standalone | explicit only |
| /troubleshoot | standalone | explicit only |
| /transcribe | standalone | explicit only |
| /design | standalone | explicit only |
| /build-page | standalone | explicit only |
| /site | standalone | explicit only |
| /deck | standalone | explicit only |
| /forge:plugin | standalone | explicit only |
| /forge:list | standalone | explicit only |
| /test-harness | standalone | explicit only |
| /onboarding | standalone | explicit only |
| /grill-me | standalone | explicit only |
| /execute-trades | standalone | explicit only (never auto-executes) |
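The registry lists two aliases (/annotated-abstract and /crawl). How the harness resolves them is not shown; a minimal sketch, assuming the slash command is the first token of the task and that aliases live in a separate lookup table (both `ALIASES` and `parse_command` are hypothetical names):

```python
# Hypothetical alias table mirroring the two alias rows in the registry.
ALIASES = {"/annotated-abstract": "/annotate", "/crawl": "/sitemap"}


def parse_command(task):
    """Split a leading slash command from the task and map aliases to canonical names.

    Returns (command or None, remaining task text).
    """
    first, _, rest = task.strip().partition(" ")
    if not first.startswith("/"):
        return None, task.strip()
    return ALIASES.get(first, first), rest.strip()


print(parse_command("/crawl https://example.com"))
# ('/sitemap', 'https://example.com')
```

Resolving the alias before the registry lookup means each skill needs only one full entry.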

Auto-activation in practice

Seven skills in the registry are marked auto and can activate without a slash command. Each has an auto lambda that receives the task string and the planner's Plan object. Three of them:

# /deep: force max search rounds when task implies thoroughness
"auto": lambda task, plan: bool(re.search(
    r"\bcomprehensive\b|\bthorough\b|\bexhaustive\b|\bin.depth\b|\bdeep.dive\b",
    task, re.IGNORECASE
)),

# /panel: run 3-persona evaluation when planner sees a complex task
"auto": lambda task, plan: plan.complexity == "high",

# /contextualize: inject self-knowledge when task is self-referential
"auto": lambda task, plan: bool(re.search(
    r"\byour(?:self)?\b|\bwhat (?:can|are) you\b|\bthe harness\b",
    task, re.IGNORECASE,
)),

The panel's auto-activation is the most interesting case. It depends on the planner's output, so the pipeline has to run the planner's two LLM calls before it knows whether a third call, the panel, will be needed after synthesis and the Wiggum loop. This is by design: the planner's complexity classification is more reliable than keyword matching for detecting tasks that warrant multi-perspective evaluation.
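The three predicates excerpted above can be exercised directly to see what they match. In this sketch the variable names are made up, and `SimpleNamespace` stands in for the planner's Plan object, of which only the `complexity` attribute is assumed.

```python
import re
from types import SimpleNamespace

# The three predicates, copied from the excerpts above.
deep_auto = lambda task, plan: bool(re.search(
    r"\bcomprehensive\b|\bthorough\b|\bexhaustive\b|\bin.depth\b|\bdeep.dive\b",
    task, re.IGNORECASE))
panel_auto = lambda task, plan: plan.complexity == "high"
contextualize_auto = lambda task, plan: bool(re.search(
    r"\byour(?:self)?\b|\bwhat (?:can|are) you\b|\bthe harness\b",
    task, re.IGNORECASE))

low = SimpleNamespace(complexity="low")    # stand-in Plan objects
high = SimpleNamespace(complexity="high")

# "." matches any character, so "in-depth" and "in depth" both trigger /deep.
print(deep_auto("An in-depth look at vector databases", low))  # True
print(deep_auto("an in depth look at vector databases", low))  # True
print(deep_auto("Summarize this paper", low))                  # False

# /panel ignores the task text and looks only at the plan.
print(panel_auto("Summarize this paper", high))                # True

# \byour\b matches any task containing the word "your".
print(contextualize_auto("What can you do?", low))             # True
print(contextualize_auto("Review your last report", low))      # True
```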

Lazy loading

Importing skills.py does not import any skill implementation. The REGISTRY is pure data — strings, lambdas, and None. When agent.py needs to actually run a skill, it calls skills.run_post_synthesis(), skills.run_annotate_standalone(), etc., which import the relevant module at call time. This keeps startup fast and avoids circular imports between agent.py and the skill modules that call back into inference.py.
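A minimal sketch of call-time importing, assuming the module and entry point are named by strings. The source does not say whether skills.py uses `importlib` or function-local `import` statements, and the `module` and `func` fields, `LAZY_REGISTRY`, and `run_skill` are hypothetical. The stdlib `string.capwords` stands in for a real skill implementation so the example runs on its own.

```python
import importlib

# The registry stays pure data: the implementation is referenced by name only,
# so defining this dict imports nothing.
LAZY_REGISTRY = {
    "titlecase": {"hook": "post_synthesis", "module": "string", "func": "capwords"},
}


def run_skill(name, *args, **kwargs):
    """Import the skill's module at call time, then invoke its entry point."""
    entry = LAZY_REGISTRY[name]
    # import_module returns the cached module from sys.modules on later calls.
    module = importlib.import_module(entry["module"])
    return getattr(module, entry["func"])(*args, **kwargs)


print(run_skill("titlecase", "lazy loading works"))  # Lazy Loading Works
```

Because the import happens inside the function, a skill module that itself imports the pipeline can be loaded without a cycle at startup.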

To add a new skill, register an entry in REGISTRY with a hook, a description, and an optional auto predicate. The prompt field applies only to pre_synthesis hooks: it is the text appended to the synthesis prompt when the skill is active. All other hooks set prompt: None and implement their behaviour through the corresponding hook function in agent.py.
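As a sketch of what such an entry and the prompt handling could look like: the `units` skill, the prompt strings, and `build_synthesis_prompt` are invented for illustration, and the key format of the real REGISTRY is an assumption.

```python
REGISTRY = {
    "cite": {
        "hook": "pre_synthesis",
        "description": "Require inline citations",
        "auto": None,  # explicit only
        "prompt": "Cite a source for every factual claim.",
    },
    "units": {  # hypothetical new skill
        "hook": "pre_synthesis",
        "description": "Normalize units in the output",
        "auto": lambda task, plan: "convert" in task.lower(),
        "prompt": "Report all quantities in SI units.",
    },
    "panel": {
        "hook": "post_wiggum",
        "description": "3-persona evaluation panel",
        "auto": None,   # simplified for this sketch
        "prompt": None,  # non-pre_synthesis hooks carry no prompt text
    },
}


def build_synthesis_prompt(base, active_names):
    """Append the prompt text of every active pre_synthesis skill to the base prompt."""
    extras = [
        REGISTRY[name]["prompt"]
        for name in active_names
        if REGISTRY[name]["hook"] == "pre_synthesis" and REGISTRY[name]["prompt"]
    ]
    return "\n\n".join([base, *extras])


print(build_synthesis_prompt("Synthesize the findings.", ["units", "panel"]))
```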