May 31, 2026 • 5 min read • Agentic Harness Engineering

The Navigation Skills: /suggest and /re-orient

Two skills for maintaining situational awareness. /suggest synthesizes the single most valuable next task from project state (orientation cache, recent run history, git log, and autoresearch progress) and returns a structured recommendation with a whitelist-constrained runnable command. /re-orient fetches five live data sources in parallel (the local git log plus four GitHub queries made through the gh CLI) and synthesizes a fast project state snapshot against a focus question.

Most harness sessions start cold. You know something needs to happen next but not exactly what — experiments have been running, commits have landed, some runs passed and some failed. /suggest and /re-orient address this differently: /suggest makes a concrete recommendation from what it can read locally, while /re-orient fetches the current state of the GitHub remote and synthesizes it against the same local context. The difference is speed vs. freshness.

/suggest: synthesizing the next task

# Default: recommend the single most valuable next task
python agent.py "/suggest"

The skill assembles four context sources, each bounded to prevent context window overflow:

Orientation cache

Reads harness_orientation_raw.md from the system temp directory (written by /orientation). Truncated to 4,000 chars. If the cache is missing, it warns and continues with less context. If the cache is over 30 minutes old, it recommends running /re-orient first.
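
The cache handling described above can be sketched roughly as follows. This is a minimal illustration, not the harness's actual code: the function name load_orientation_cache, its cache_dir and now parameters, and the warning strings are all assumptions introduced here; only the file name, the 4,000-character limit, and the 30-minute threshold come from the text.

```python
import tempfile
import time
from pathlib import Path

CACHE_NAME = "harness_orientation_raw.md"
MAX_CHARS = 4_000          # truncation limit stated in the text
STALE_AFTER_S = 30 * 60    # 30-minute staleness threshold stated in the text


def load_orientation_cache(cache_dir=None, now=None):
    """Return (text, warnings); text is "" when the cache is missing.

    Hypothetical helper: cache_dir and now exist only so the sketch can be
    exercised against a scratch directory and a fixed clock.
    """
    path = Path(cache_dir or tempfile.gettempdir()) / CACHE_NAME
    warnings = []
    if not path.exists():
        warnings.append("orientation cache missing; continuing with less context")
        return "", warnings
    now = time.time() if now is None else now
    if now - path.stat().st_mtime > STALE_AFTER_S:
        warnings.append("orientation cache is over 30 minutes old; consider /re-orient first")
    # Bound the contribution to the prompt so it cannot crowd out other sources.
    text = path.read_text(encoding="utf-8", errors="replace")[:MAX_CHARS]
    return text, warnings
```

Returning warnings alongside the text, rather than raising, matches the "warn and continue" behavior: a missing or stale cache reduces the quality of the recommendation but does not stop it.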

Recent runs

Last 8 entries from runs.jsonl: timestamp, final status, last Wiggum score, model, and first 80 chars of the task string. Enough to spot a run-of-failures pattern or an incomplete benchmark without reading full records.
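
A rough sketch of how such a summary could be built. The key names used here (timestamp, status, wiggum_score, model, task) are guesses at the runs.jsonl schema, and summarize_recent_runs is a hypothetical helper, so treat this as an illustration of the bounding logic rather than the real implementation.

```python
import json
from pathlib import Path


def summarize_recent_runs(path="runs.jsonl", limit=8, task_chars=80):
    """Return one compact line per run for the last `limit` records."""
    p = Path(path)
    if not p.exists():
        return []
    lines = [ln for ln in p.read_text(encoding="utf-8").splitlines() if ln.strip()]
    summaries = []
    for ln in lines[-limit:]:
        try:
            rec = json.loads(ln)
        except json.JSONDecodeError:
            continue  # skip a partially written or corrupt record
        # Hypothetical key names; adjust to the real runs.jsonl schema.
        summaries.append(
            f"{rec.get('timestamp', '?')} | {rec.get('status', '?')} | "
            f"wiggum={rec.get('wiggum_score', '?')} | {rec.get('model', '?')} | "
            f"{str(rec.get('task', ''))[:task_chars]}"
        )
    return summaries
```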

Git log

git log --oneline -12 — the last 12 commit messages. Used to identify what was recently shipped and whether there are logical follow-ons (e.g. a commit adding a skill without a corresponding benchmark run).
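
For illustration, the git log call might be wrapped like this. The helper name recent_commits, the 15-second timeout, and the choice to return an empty list on failure are assumptions made for this sketch; only the command itself comes from the text.

```python
import subprocess


def recent_commits(count=12, cwd=None, timeout=15):
    """Return the last `count` one-line commit summaries, or [] on failure."""
    try:
        out = subprocess.run(
            ["git", "log", "--oneline", f"-{count}"],
            cwd=cwd, capture_output=True, text=True, timeout=timeout, check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # git not installed, not inside a repository, or the call timed out
        return []
    return out.stdout.splitlines()
```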

Autoresearch state

Header row + last 3 rows of autoresearch.tsv. Shows the current experiment count, recent keep/discard pattern, and whether the loop is stuck — relevant context for deciding whether to continue the loop or reset.
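
The "header plus tail" selection could look something like the sketch below. autoresearch_tail is a hypothetical name, and the sketch passes rows through as raw TSV text because the column layout of autoresearch.tsv is not specified here.

```python
from pathlib import Path


def autoresearch_tail(path="autoresearch.tsv", rows=3):
    """Return the header row plus the last `rows` data rows as raw TSV text."""
    p = Path(path)
    if not p.exists():
        return ""
    lines = [ln for ln in p.read_text(encoding="utf-8").splitlines() if ln.strip()]
    if not lines:
        return ""
    header, data = lines[0], lines[1:]
    # Keeping the header lets the model interpret the columns of the tail rows.
    return "\n".join([header, *data[-rows:]])
```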

Command whitelisting

The synthesis prompt constrains the recommended command to a fixed set of real, runnable invocations. This is one of the most important details in the implementation — without it, the model will hallucinate flags, module paths, and subcommands that don't exist:

Use only these signatures:
  python agent.py "<task description and output path>"
  python bench_model_compare.py --test-model <tag> --baseline-model <tag> [--run-both]
  python autoresearch.py [--tasks T_A,T_B] [--rounds N]
  python eval_suite.py [--fast] [--no-wiggum]
  python orchestrator.py "<compound task>"
Do not invent flags, subcommands, or module paths that are not listed above.

The output format is three labeled sections:

**Suggested task:** <one sentence describing the task>

**Why:** <2-3 sentences of rationale referencing specific evidence above>

**Command:** `<the exact command or action to take>`
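
The prompt-level whitelist can be paired with a check on the way out. The sketch below is an assumption about how one might do that, not something the skill is described as doing: parse_suggestion and command_is_whitelisted are hypothetical helpers, and the check only confirms that the command starts with python followed by one of the five listed scripts. It does not validate individual flags, and a response whose Command section describes an action rather than a shell command will simply fail the check.

```python
import re
import shlex

# Script names taken from the whitelist in the synthesis prompt.
ALLOWED_SCRIPTS = {
    "agent.py", "bench_model_compare.py", "autoresearch.py",
    "eval_suite.py", "orchestrator.py",
}

_LABELS = r"(?:Suggested task|Why|Command)"
_SECTION = re.compile(
    rf"\*\*({_LABELS}):\*\*\s*(.*?)(?=\n\s*\*\*{_LABELS}:\*\*|\Z)",
    re.DOTALL,
)


def parse_suggestion(text):
    """Split a response into its labeled sections, keyed by label."""
    return {label: body.strip() for label, body in _SECTION.findall(text)}


def command_is_whitelisted(command):
    """True if the command has the shape `python <allowed script> ...`."""
    try:
        parts = shlex.split(command.strip().strip("`"))
    except ValueError:
        return False  # unbalanced quotes
    return len(parts) >= 2 and parts[0] == "python" and parts[1] in ALLOWED_SCRIPTS
```

A caller could use this to flag a recommendation for manual review instead of presenting an unrunnable command as if it were safe to paste.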

/re-orient: live GitHub context

# Default: summarize project state, recent ships, and next priorities
python agent.py "/re-orient"

# Focus on a specific question
python agent.py "/re-orient what's blocking the autoresearch loop?"

Everything after the /re-orient token becomes the focus question. Without one, the default is: "Summarise the current project state, what was recently shipped, and what should be prioritised next."
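
A minimal sketch of that parsing rule, assuming the task string is passed through unchanged. extract_focus is a hypothetical name; the default question is the one quoted above.

```python
DEFAULT_FOCUS = (
    "Summarise the current project state, what was recently shipped, "
    "and what should be prioritised next."
)


def extract_focus(task, token="/re-orient"):
    """Return the text after the skill token, or the default question."""
    _, found, rest = task.partition(token)
    focus = rest.strip() if found else ""
    return focus or DEFAULT_FOCUS
```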

Five data sources are fetched in parallel using ThreadPoolExecutor(max_workers=5), each with a 15-second timeout:

_cmds = [
    (["git", "log", "--oneline", "-20"],                        "recent_commits"),
    ([gh, "pr", "list", "--state", "merged", "--limit", "10",
      "--json", "number,title,mergedAt,author"],                "merged_prs"),
    ([gh, "pr", "list",
      "--json", "number,title,author,createdAt,headRefName"],   "open_prs"),
    ([gh, "issue", "list", "--limit", "10",
      "--json", "number,title,labels,createdAt,state"],         "open_issues"),
    ([gh, "run", "list", "--limit", "5",
      "--json", "status,conclusion,name,createdAt,headBranch"], "ci_runs"),
]

Each section is capped at 1,200 characters in the prompt. The orientation cache contributes up to 6,000 characters. Failed commands (timeout, gh not installed, no GitHub remote) are silently omitted — the skill degrades gracefully to whatever sources are available.
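
Put together, the fetch-and-degrade pattern might look like the sketch below. It takes a list of (argv, label) pairs shaped like _cmds above; the helper names _run and fetch_sections, and the decision to treat a non-zero exit code or empty output as a failure, are assumptions made for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

SECTION_CAP = 1_200   # per-section character cap stated in the text
TIMEOUT_S = 15        # per-command timeout stated in the text


def _run(argv):
    """Run one command; return its stdout, or None on any failure."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=TIMEOUT_S)
    except (OSError, subprocess.SubprocessError):
        return None  # binary missing or the call timed out
    if proc.returncode != 0 or not proc.stdout.strip():
        return None  # e.g. no GitHub remote, or gh not authenticated
    return proc.stdout.strip()


def fetch_sections(cmds, max_workers=5):
    """Run (argv, label) pairs in parallel and omit the ones that fail."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outputs = list(pool.map(_run, [argv for argv, _ in cmds]))
    return {
        label: out[:SECTION_CAP]
        for (_, label), out in zip(cmds, outputs)
        if out is not None
    }
```

Threads are a reasonable fit here because each worker spends its time waiting on a subprocess, so total latency is close to that of the slowest single command rather than the sum of all five.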

When to use each

/suggest is faster and works offline — it reads only local files. Use it during active experimentation when you need a quick recommendation without leaving the terminal. The orientation cache must be reasonably fresh (run /orientation at the start of each session).

/re-orient makes network calls and takes a few extra seconds. Use it when returning to the project after time away, after a batch of PRs have been merged, or when you need the focus question answered with reference to specific issue numbers and CI status. The gh CLI must be authenticated.

The /suggest context is also reused by /troubleshoot, which assembles the same context as /suggest alongside the failure data from /debug and synthesizes both in a single LLM call. If you're starting from a failure and want both a diagnosis and a next step, use /troubleshoot rather than running /debug and /suggest separately. See The Diagnostic Skills.
