May 29, 2026 • 6 min read • Agentic Harness Engineering

harness/api: The FastAPI Backend and Dashboard Server

harness/api/main.py is an async-native FastAPI application that replaced the old Flask server.py. It serves the dashboard as a compiled Vite build, exposes one REST router per concern, streams live run data over a WebSocket, and publishes auto-generated OpenAPI docs at /docs.

The migration from Flask to FastAPI was motivated by three concrete problems: SSE streaming required threading hacks to avoid blocking the dev server; every async operation needed a thread executor workaround; and adding new endpoint groups meant piling more routes into a single file. FastAPI solves all three — async is the default, routers are first-class, and the framework generates OpenAPI docs for free.

Application structure

The app is assembled in main.py by including one router per concern, each defined in its own file under harness/api/routes/. All REST routes are mounted under /api; the WebSocket lives at /ws/runs. In production, the compiled dashboard Vite build is served as static files at /.

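from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles

# Routers (runs, queue, tasks, ...) come from harness/api/routes/;
# lifespan is the startup/shutdown context manager.
app = FastAPI(title="Harness", version="0.1.0", lifespan=lifespan)

app.include_router(runs.router,     prefix="/api")
app.include_router(queue.router,    prefix="/api")
app.include_router(tasks.router,    prefix="/api")
app.include_router(memory.router,   prefix="/api")
app.include_router(feedback.router, prefix="/api")
app.include_router(mcp.router,      prefix="/api/mcp")
# ... 9 more routers
app.include_router(ws_router)       # /ws/runs

# Mounted last so the catch-all static mount does not shadow /api or /ws routes.
app.mount("/", StaticFiles(directory="dashboard/dist", html=True))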

Key endpoint groups

Method	Path	Purpose
GET	`/api/runs`	Paginated run history from `runs.jsonl`, newest-first, 30s cache
GET	`/api/data`	Dashboard payload: runs, queue state, orientation cache
POST	`/api/run`	Submit a task for immediate execution
POST	`/api/run/{id}/cancel`	Cancel a running task
GET	`/api/queue`	List pending queue items
POST	`/api/queue`	Enqueue a task
DELETE	`/api/queue/{id}`	Remove a queued item
GET	`/api/memory`	Agent memory entries (semantic search supported)
POST	`/api/feedback`	Run-level thumbs-up/down rating with comment
POST	`/api/page-feedback`	Page-level feedback from the browser widget
DELETE	`/api/page-feedback`	Clear page feedback by URL
GET	`/api/mcp/...`	MCP tool proxy endpoints
GET	`/ws/runs`	WebSocket: live run records as they land in `runs.jsonl`
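
The `/api/runs` row combines three behaviours: reading `runs.jsonl`, returning records newest-first, and caching for 30 seconds. The sketch below shows one way those could fit together using only the standard library. The helper names (`load_runs`, `paginate`), the `page`/`per_page` parameters, and the response shape are assumptions for illustration, not the harness's actual implementation.

```python
import json
import time
from pathlib import Path
from typing import Optional

CACHE_TTL_SECONDS = 30.0  # matches the 30s cache noted in the table
_cache = {}  # path -> (timestamp, newest-first records)


def load_runs(path: Path, now: Optional[float] = None) -> list:
    """Return all run records newest-first, re-reading the file at most every 30s."""
    now = time.monotonic() if now is None else now
    cached = _cache.get(path)
    if cached is not None and now - cached[0] < CACHE_TTL_SECONDS:
        return cached[1]

    runs = []
    if path.exists():
        for line in path.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue
            try:
                runs.append(json.loads(line))
            except ValueError:
                continue  # skip lines that are not valid JSON
    runs.reverse()  # the file is append-only, so reversing gives newest-first
    _cache[path] = (now, runs)
    return runs


def paginate(runs: list, page: int = 1, per_page: int = 20) -> dict:
    """Slice one page out of the newest-first list (hypothetical response shape)."""
    start = (page - 1) * per_page
    return {"total": len(runs), "page": page, "items": runs[start:start + per_page]}
```

Passing `now` explicitly is only there to make the cache behaviour easy to exercise; a route handler would call `load_runs(path)` and let it read the clock.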

WebSocket streaming

The old Flask server used SSE with a threading.Event to push stdout lines to the browser, a workaround for Flask's synchronous request model. The FastAPI replacement uses a proper WebSocket at /ws/runs. The handler tails runs.jsonl by polling it, awaiting asyncio.sleep between checks, and sends each new JSON record as running agents append it.

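from fastapi import APIRouter, WebSocket, WebSocketDisconnect

router = APIRouter()
_connections: set[WebSocket] = set()  # open dashboard sockets, used by broadcast()

@router.websocket("/ws/runs")
async def ws_runs(ws: WebSocket):
    await ws.accept()
    _connections.add(ws)
    try:
        # _tail_runs() is an async generator over records appended to runs.jsonl
        async for record in _tail_runs():
            await ws.send_json(record)
    except WebSocketDisconnect:
        pass  # client closed the connection
    finally:
        _connections.discard(ws)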

The _tail_runs generator sleeps 1 second between checks, reads only the bytes appended since the last read (tracked by file position), and yields each valid JSON line as a parsed dict. Multiple dashboard tabs each get their own WebSocket connection; the _connections set supports broadcast() for push events from other parts of the API.
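
The tailing behaviour described above could look roughly like the following. This is a minimal sketch, not the actual `_tail_runs`: the function name `tail_jsonl`, its parameters, and the choice to buffer a trailing partial line until its newline arrives are assumptions made for illustration.

```python
import asyncio
import json
from pathlib import Path
from typing import AsyncIterator


async def tail_jsonl(
    path: Path, poll_seconds: float = 1.0, from_start: bool = False
) -> AsyncIterator[dict]:
    """Yield each complete, valid JSON object line appended to `path`."""
    # Start at the end of the file unless asked to replay existing records.
    position = 0 if from_start or not path.exists() else path.stat().st_size
    buffer = b""
    while True:
        if path.exists() and path.stat().st_size > position:
            with path.open("rb") as fh:
                fh.seek(position)       # read only bytes appended since last check
                buffer += fh.read()
                position = fh.tell()
            # Everything before the last newline is complete; keep the remainder.
            *lines, buffer = buffer.split(b"\n")
            for line in lines:
                if not line.strip():
                    continue
                try:
                    record = json.loads(line)
                except ValueError:
                    continue            # skip invalid JSON or undecodable bytes
                if isinstance(record, dict):
                    yield record
        await asyncio.sleep(poll_seconds)
```

Holding back the unterminated tail of the buffer avoids yielding a record that an agent is still in the middle of writing.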

Lifespan and startup

FastAPI's lifespan context manager replaces both Flask's before_first_request hook and the atexit handlers the old server registered. ensure_dirs() creates the data directories on startup; any code placed after the yield runs as teardown when the server shuts down. This is cleaner than registering atexit handlers for session close logic.

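from contextlib import asynccontextmanager

from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    ensure_dirs()  # startup: create data directories
    yield          # server runs here
    # teardown on exit goes after the yield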
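
To make the startup and teardown ordering concrete, here is a small standard-library sketch of the same pattern. The directory names, the `base` argument to `ensure_dirs`, and the `make_lifespan` factory are hypothetical; the real `ensure_dirs()` takes no arguments and its directory layout is not shown in this post.

```python
import asyncio
from contextlib import asynccontextmanager
from pathlib import Path

DATA_DIRS = ("runs", "queue", "memory")  # hypothetical directory names


def ensure_dirs(base: Path) -> None:
    """Create each data directory under `base` if it does not already exist."""
    for name in DATA_DIRS:
        (base / name).mkdir(parents=True, exist_ok=True)


def make_lifespan(base: Path, events: list):
    """Build a lifespan context manager that records when each phase runs."""

    @asynccontextmanager
    async def lifespan(app):
        ensure_dirs(base)          # runs once, before the first request
        events.append("startup")
        try:
            yield                  # the server handles requests here
        finally:
            events.append("shutdown")  # runs when the server stops

    return lifespan
```

Wrapping the `yield` in `try`/`finally` means the teardown step still runs if the block around it exits with an error.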

CORS and serving

CORS is configured with allow_origin_regex to accept any localhost or 127.0.0.1 origin (any port), plus the literal "null" origin for file:/// pages. This allows the browser feedback widget to POST from VS Code Live Server, Vite dev server, or local HTML files without a hardcoded allowlist.
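
A pattern along these lines would satisfy that description. The exact regex used by the harness is not shown here, so `LOCAL_ORIGIN_REGEX` and `is_allowed_origin` are illustrative names, and the check assumes the middleware matches the pattern against the whole Origin header value.

```python
import re

# Any http(s) origin on localhost or 127.0.0.1 with an optional port,
# plus the literal "null" origin that browsers send for file:/// pages.
LOCAL_ORIGIN_REGEX = r"(https?://(localhost|127\.0\.0\.1)(:\d+)?|null)"

_pattern = re.compile(LOCAL_ORIGIN_REGEX)


def is_allowed_origin(origin: str) -> bool:
    """Return True when the entire Origin value matches the local pattern."""
    return _pattern.fullmatch(origin) is not None


# The pattern would then be handed to the CORS middleware, for example:
#   app.add_middleware(CORSMiddleware, allow_origin_regex=LOCAL_ORIGIN_REGEX,
#                      allow_methods=["*"], allow_headers=["*"])
```

Matching the full string matters: a prefix match would also accept an origin such as http://localhost.evil.com.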

The server runs on port 7860 by default (configurable via PORT in .env). In development, uvicorn runs with reload=True watching the harness/ directory, so any change to a route file reloads the server without a manual stop and start. Start it with python start.py.
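
A launcher with that behaviour might be shaped like the sketch below. It is an assumption about what start.py could contain, not its actual contents: the `server_settings` helper, the `harness.api.main:app` import string, and the 127.0.0.1 host are illustrative, and loading .env into the environment is assumed to happen before this code runs.

```python
import os
from typing import Mapping, Optional


def server_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build uvicorn keyword arguments, reading PORT with a 7860 default."""
    env = os.environ if env is None else env
    return {
        "app": "harness.api.main:app",  # reload needs an import string, not an object
        "host": "127.0.0.1",            # assumed; the post does not state the host
        "port": int(env.get("PORT", "7860")),
        "reload": True,
        "reload_dirs": ["harness"],     # restart when files under harness/ change
    }


def main() -> None:
    # Imported here so the settings helper stays usable without uvicorn installed.
    import uvicorn

    uvicorn.run(**server_settings())
```

Keeping the settings in a plain function makes the port override easy to check without starting a server.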

Auto-generated OpenAPI docs are available at http://localhost:7860/docs while the server is running. Every endpoint is documented with its request/response schema, which is useful for exploring the API without reading the route source.
