The Pipeline View: Data Enrichment DAG
A static SVG DAG showing the harness enrichment architecture for financial and economic tasks. Six layers — web search, Beige Book, FRED, BEA, Market Data, and Portfolio — all converge at a Context Merge node before LLM synthesis and Wiggum evaluation. A dashed conditional edge marks trading thesis tasks that proceed to Alpaca order execution.
The standard harness pipeline gathers context from DDGS web search and the research cache. For economic and financial tasks, that's insufficient — the synthesis needs time-series data from FRED, regional income tables from BEA, district narrative from the Federal Reserve Beige Book, market signals from a local SQLite database, live fundamentals from yfinance, and current portfolio state from Alpaca. The Pipeline view documents how all these sources connect.
Enrichment layers
Live DDGS web search for the task query, plus Beige Book retrieval — semantic and keyword search over the 1996–present Fed district narrative corpus. Both activate on every research task.
FRED supplies rates, inflation, and employment data from its catalog of 800k+ time series. BEA adds GDP, regional personal income, and trade balance data. Both fire when macroeconomic keywords appear in the task and the corresponding API key is set.
Three layers for equity-intent tasks: a momentum universe of trending tickers (monthly yfinance sweep), computed signals per ticker (Hurst, Bollinger z-score, cointegration rank), and live yfinance fundamentals including earnings and analyst targets.
Current Alpaca positions, available buying power, and pending orders — fetched at task submission, not during the research loop. Gives the synthesis step awareness of what you already hold before any thesis is generated.
All enrichment blocks merge into a single context window. The producer model generates a report or structured trading thesis. Wiggum then runs up to three revision rounds, scoring across six dimensions before issuing PASS or FAIL.
A scored Markdown report is written to disk on every run. For trading thesis tasks, a conditional edge fires /execute-trades, which builds and submits GTC bracket orders to Alpaca only when the thesis passes Wiggum.
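The merge and review steps described above can be sketched roughly as follows. This is a minimal illustration, not the harness code: `merge_context`, `synthesize_with_review`, and the `produce` and `evaluate` callables are hypothetical names, the section labels are invented, and the sketch assumes the three-round cap counts every produce-then-evaluate pass, including the first draft.

```python
from typing import Callable, Dict, Tuple

MAX_ROUNDS = 3  # the text says Wiggum runs up to three revision rounds


def merge_context(blocks: Dict[str, str]) -> str:
    """Join the non-empty enrichment blocks into one labelled context string."""
    parts = [
        f"## {name}\n{text.strip()}"
        for name, text in blocks.items()
        if text and text.strip()  # sources that did not activate are skipped
    ]
    return "\n\n".join(parts)


def synthesize_with_review(
    context: str,
    produce: Callable[[str, str], str],           # (context, feedback) -> draft
    evaluate: Callable[[str], Tuple[bool, str]],  # draft -> (passed, feedback)
) -> Tuple[str, bool, int]:
    """Alternate produce/evaluate until the draft passes or rounds run out."""
    feedback = ""
    draft = ""
    for round_no in range(1, MAX_ROUNDS + 1):
        draft = produce(context, feedback)
        passed, feedback = evaluate(draft)
        if passed:
            return draft, True, round_no
    return draft, False, MAX_ROUNDS
```

With stubbed callables, `synthesize_with_review(ctx, produce, evaluate)` returns the final draft, the PASS/FAIL flag, and the number of rounds used. The six scoring dimensions would live inside `evaluate`; they are left out here because the text does not name them.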
Conditional execution: the dashed edge
The DAG has one dashed edge: Synthesis → Alpaca Orders. This edge fires only for trading thesis tasks (those matched by _is_trading_task() in agent.py) where the /execute-trades skill is active, and, as noted above, only when the thesis passes Wiggum. For all other tasks, synthesis terminates at the Report (.md) output node and no orders are submitted. The dashed style distinguishes conditional from unconditional data flow at a glance.
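As a rough sketch of how such a conditional edge might be expressed in code: the keyword list, `looks_like_trading_task`, `SynthesisResult`, and `next_nodes` below are all hypothetical stand-ins. The real _is_trading_task() in agent.py may use entirely different logic.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical keywords; the actual matcher in agent.py is not shown in this article.
TRADING_KEYWORDS = ("trading thesis", "trade idea", "bracket order")


def looks_like_trading_task(task: str) -> bool:
    """Illustrative stand-in for _is_trading_task(): simple keyword match."""
    lowered = task.lower()
    return any(kw in lowered for kw in TRADING_KEYWORDS)


@dataclass
class SynthesisResult:
    report_path: str
    wiggum_passed: bool


def next_nodes(task: str, result: SynthesisResult, execute_trades_active: bool) -> List[str]:
    """Return the output nodes reached after synthesis."""
    nodes = ["report_md"]  # solid edge: the report is always written
    if (
        looks_like_trading_task(task)
        and execute_trades_active
        and result.wiggum_passed
    ):
        nodes.append("alpaca_orders")  # dashed edge: conditional
    return nodes
```

Keeping the decision in one function means the three conditions (task type, skill active, Wiggum PASS) are checked together, so a failed thesis can never reach order submission by another path.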
The Task → Alpaca Portfolio edge also gets special treatment: it is routed below the main diagram rather than through the column layout. Portfolio state is fetched at task submission time, not as a research step, so it doesn't belong in the Research or Knowledge columns. It enters the merge alongside market data but via a different path in the pipeline code.
Why a static diagram
Unlike the Explorer view (which renders a unique DAG per run from live data), the Pipeline view is a fixed SVG. The enrichment architecture doesn't change run-to-run — FRED is always available for economic tasks, Beige Book is always queried when relevant keywords appear, the Alpaca portfolio is always fetched for trading tasks. A static diagram communicates the invariant architecture more clearly than a run-specific trace, which would show only the sources that happened to activate on a given task.
The diagram uses six columns: Input, Research, Knowledge, Market Data, Synthesis, Output. Bézier curves connect nodes across columns. Same-column edges use straight vertical lines; one example is Trending DB → Market Signals, which represents the local data flow before external API calls. The layout is hand-tuned in pixel coordinates for readability at the standard dashboard card width.
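One plausible way to produce the two edge styles is shown below. The text does not say which kind of Bézier curve the diagram uses, so this sketch assumes a cubic curve with both control points at the horizontal midpoint; `edge_path` and the coordinates are illustrative only.

```python
def edge_path(x1: float, y1: float, x2: float, y2: float) -> str:
    """Return an SVG path 'd' attribute for an edge between two node anchors."""
    if x1 == x2:
        # Same column: straight vertical line.
        return f"M {x1:g},{y1:g} L {x2:g},{y2:g}"
    # Different columns: cubic Bezier whose control points sit at the
    # horizontal midpoint, so the curve leaves and arrives horizontally.
    mid = (x1 + x2) / 2
    return f"M {x1:g},{y1:g} C {mid:g},{y1:g} {mid:g},{y2:g} {x2:g},{y2:g}"


# Cross-column edge, e.g. from one column's node to a node further right.
print(edge_path(100, 50, 300, 120))  # M 100,50 C 200,50 200,120 300,120
# Same-column edge drawn as a vertical line.
print(edge_path(100, 50, 100, 120))  # M 100,50 L 100,120
```

The returned string can be placed in the `d` attribute of an SVG `<path>` element; a dashed edge would add `stroke-dasharray` to that element.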
Not every enrichment source activates on every task. FRED and BEA are only queried when FRED_API_KEY and BEA_API_KEY are set and the task classifier identifies a macroeconomic context. Market Signals require the local Trending Tickers SQLite database to be populated. The Pipeline view shows the full capability graph — the actual activated path for a specific run is visible in the Explorer view.
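The activation rule for FRED and BEA can be sketched as a small gate. Only the environment variable names FRED_API_KEY and BEA_API_KEY come from the text; the keyword list and `active_macro_sources` are hypothetical, a keyword match stands in for the task classifier, and the sketch assumes each source is gated on its own key independently.

```python
import os
from typing import List, Mapping, Optional

# Hypothetical keywords standing in for the task classifier.
MACRO_KEYWORDS = ("inflation", "gdp", "unemployment", "interest rate")

# Source name -> environment variable that must be set (names from the text).
REQUIRED_KEYS = {"FRED": "FRED_API_KEY", "BEA": "BEA_API_KEY"}


def active_macro_sources(task: str, env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the macro sources that would be queried for this task."""
    env = os.environ if env is None else env
    lowered = task.lower()
    if not any(kw in lowered for kw in MACRO_KEYWORDS):
        return []  # no macroeconomic context: skip both sources
    # Assumption: a source with a missing or empty key is skipped on its own.
    return [name for name, var in REQUIRED_KEYS.items() if env.get(var)]
```

For example, `active_macro_sources("US inflation outlook", {"FRED_API_KEY": "k"})` yields only `["FRED"]`, which is the kind of partial activation that the Explorer view would show for a given run.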