harness/api: The FastAPI Backend and Dashboard Server
harness/api/main.py is an async-native FastAPI application that replaced the old Flask server.py. It serves the dashboard as a compiled Vite build, exposes one REST router per concern, streams live run data over WebSocket, and publishes auto-generated OpenAPI docs at /docs.
The migration from Flask to FastAPI was motivated by three concrete problems: SSE streaming required threading hacks to avoid blocking the dev server; every async operation needed a thread executor workaround; and adding new endpoint groups meant piling more routes into a single file. FastAPI solves all three — async is the default, routers are first-class, and the framework generates OpenAPI docs for free.
Application structure
The app is assembled in main.py by including one router per concern, each defined in its own file under harness/api/routes/. All REST routes are mounted under /api; the WebSocket lives at /ws/runs. In production, the compiled dashboard Vite build is served as static files at /.
from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles

app = FastAPI(title="Harness", version="0.1.0", lifespan=lifespan)
# REST routers, one per concern, all mounted under /api
app.include_router(runs.router, prefix="/api")
app.include_router(queue.router, prefix="/api")
app.include_router(tasks.router, prefix="/api")
app.include_router(memory.router, prefix="/api")
app.include_router(feedback.router, prefix="/api")
app.include_router(mcp.router, prefix="/api/mcp")
# ... 9 more routers
app.include_router(ws_router)  # /ws/runs
# Mounted last so the catch-all static mount at / does not shadow the API and WebSocket routes
app.mount("/", StaticFiles(directory="dashboard/dist", html=True))
Key endpoint groups
| Method | Path | Purpose |
|---|---|---|
| GET | /api/runs | Paginated run history from runs.jsonl, newest-first, 30s cache |
| GET | /api/data | Dashboard payload: runs, queue state, orientation cache |
| POST | /api/run | Submit a task for immediate execution |
| POST | /api/run/{id}/cancel | Cancel a running task |
| GET | /api/queue | List pending queue items |
| POST | /api/queue | Enqueue a task |
| DELETE | /api/queue/{id} | Remove a queued item |
| GET | /api/memory | Agent memory entries (semantic search supported) |
| POST | /api/feedback | Run-level thumbs-up/down rating with comment |
| POST | /api/page-feedback | Page-level feedback from the browser widget |
| DELETE | /api/page-feedback | Clear page feedback by URL |
| GET | /api/mcp/... | MCP tool proxy endpoints |
| GET | /ws/runs | WebSocket: live run records as they land in runs.jsonl |
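The /api/runs row combines three behaviours: reading runs.jsonl, returning records newest-first, and caching for 30 seconds. The following is a minimal sketch of how those pieces could fit together, using only the standard library. The function names, the `page` and `per_page` parameters, and the response shape are assumptions for illustration, not the actual route code.

```python
import json
import time
from pathlib import Path
from typing import List, Optional, Tuple

CACHE_TTL_SECONDS = 30.0  # matches the 30s cache noted in the table

# Hypothetical module-level cache: (expiry time, parsed records).
_cache: Optional[Tuple[float, List[dict]]] = None


def load_runs(path: Path, now: Optional[float] = None) -> List[dict]:
    """Return all run records newest-first, re-reading the file at most every 30s."""
    global _cache
    now = time.monotonic() if now is None else now
    if _cache is not None and now < _cache[0]:
        return _cache[1]  # still fresh, skip the disk read
    records: List[dict] = []
    if path.exists():
        for line in path.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                continue  # skip corrupt or partially written lines
    records.reverse()  # the file is append-only, so reversing gives newest-first
    _cache = (now + CACHE_TTL_SECONDS, records)
    return records


def paginate(records: List[dict], page: int = 1, per_page: int = 50) -> dict:
    """Slice one page out of the record list; page numbers start at 1."""
    start = (page - 1) * per_page
    return {
        "total": len(records),
        "page": page,
        "per_page": per_page,
        "items": records[start:start + per_page],
    }
```

A route handler would then return something like `paginate(load_runs(path), page, per_page)`. The `now` argument exists only so the expiry logic can be exercised without waiting 30 seconds.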
WebSocket streaming
The old Flask server used SSE with a threading.Event to push stdout lines to the browser, a workaround for Flask's synchronous request model. The FastAPI replacement uses a proper WebSocket at /ws/runs. The handler tails runs.jsonl, pausing with asyncio.sleep between reads, and sends each new JSON record to the client as running agents append it.
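from fastapi import APIRouter, WebSocket, WebSocketDisconnect

router = APIRouter()
_connections: set[WebSocket] = set()  # every currently open dashboard socket

@router.websocket("/ws/runs")
async def ws_runs(ws: WebSocket):
    await ws.accept()
    _connections.add(ws)
    try:
        # Stream each new record until the client goes away
        async for record in _tail_runs():
            await ws.send_json(record)
    except WebSocketDisconnect:
        pass
    finally:
        _connections.discard(ws)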
The _tail_runs generator sleeps 1 second between checks, reads only the bytes appended since the last read (tracked by file position), and yields each valid JSON line as a parsed dict. Multiple dashboard tabs each get their own WebSocket connection; the _connections set supports broadcast() for push events from other parts of the API.
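The tailing behaviour described above can be sketched with the standard library alone. This is an illustration, not the project's implementation: the `data/runs.jsonl` path, the `poll_seconds` parameter, the choice to start at the current end of the file, and the buffering of incomplete trailing lines are all assumptions.

```python
import asyncio
import json
from pathlib import Path
from typing import AsyncIterator

RUNS_PATH = Path("data/runs.jsonl")  # assumed location, adjust to the real data dir


async def tail_runs(path: Path = RUNS_PATH, poll_seconds: float = 1.0) -> AsyncIterator[dict]:
    """Yield each JSON object appended to `path`, checking every `poll_seconds`."""
    # Assumption: start at the current end of file so only new records are streamed.
    position = path.stat().st_size if path.exists() else 0
    buffer = b""
    while True:
        if path.exists():
            with path.open("rb") as f:
                f.seek(position)      # skip everything already read
                chunk = f.read()      # only the bytes appended since last time
                position = f.tell()   # remember where this read stopped
            buffer += chunk
            # Parse complete lines only; a trailing partial line stays buffered.
            *lines, buffer = buffer.split(b"\n")
            for raw in lines:
                if not raw.strip():
                    continue
                try:
                    record = json.loads(raw)
                except ValueError:    # invalid JSON or invalid UTF-8
                    continue
                if isinstance(record, dict):
                    yield record
        await asyncio.sleep(poll_seconds)  # yields control to the event loop
```

Because the wait is an `await asyncio.sleep(...)` rather than a blocking sleep, many connections can each run their own tail loop on one event loop without threads.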
Lifespan and startup
FastAPI's lifespan context manager replaces both Flask's before_first_request hook and the atexit handlers the old server registered. ensure_dirs() creates the data directories on startup; any code after the yield runs automatically when the server shuts down. This is cleaner than the old pattern of registering atexit handlers for session close logic.
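from contextlib import asynccontextmanager

from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    ensure_dirs()  # create data directories before serving requests
    yield  # server runs here
    # teardown on exit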
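To make the startup and teardown halves concrete, here is a hedged sketch of one way to parameterise the lifespan. The `build_lifespan` factory, the `on_shutdown` callback, and the subdirectory names are hypothetical. The source only states that ensure_dirs() runs at startup and that teardown follows the yield.

```python
from contextlib import asynccontextmanager
from pathlib import Path
from typing import Awaitable, Callable, Optional

SUBDIRS = ("queue", "memory")  # hypothetical data subdirectories


def ensure_dirs(root: Path) -> None:
    """Create the data directories; safe to call on every startup."""
    for name in SUBDIRS:
        (root / name).mkdir(parents=True, exist_ok=True)


def build_lifespan(
    root: Path,
    on_shutdown: Optional[Callable[[], Awaitable[None]]] = None,
):
    """Return a lifespan context manager bound to `root` (hypothetical helper)."""

    @asynccontextmanager
    async def lifespan(app):
        ensure_dirs(root)  # startup: runs once before the first request
        try:
            yield          # the server handles requests while suspended here
        finally:
            # teardown: runs when the context exits at server shutdown
            if on_shutdown is not None:
                await on_shutdown()

    return lifespan
```

Under these assumptions the app would be created with `FastAPI(lifespan=build_lifespan(Path("data")))`. The try/finally keeps the cleanup step next to the startup step it pairs with.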
CORS and serving
CORS is configured with allow_origin_regex to accept any localhost or 127.0.0.1 origin (any port), plus the literal "null" origin for file:/// pages. This allows the browser feedback widget to POST from VS Code Live Server, Vite dev server, or local HTML files without a hardcoded allowlist.
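As an illustration of the origin rule, the pattern below accepts localhost and 127.0.0.1 on any port, plus the literal "null" origin. The exact regex used by the project is not shown in this document, so treat this one as an assumption. Starlette's CORSMiddleware applies `allow_origin_regex` as a full match against the Origin header, which the helper mirrors with `re.fullmatch`.

```python
import re

# Assumed pattern: http or https, localhost or 127.0.0.1, optional port, or "null".
LOCAL_ORIGIN_REGEX = r"(https?://(localhost|127\.0\.0\.1)(:\d+)?|null)"


def is_allowed_origin(origin: str) -> bool:
    """Mirror the middleware's check: the whole Origin value must match."""
    return re.fullmatch(LOCAL_ORIGIN_REGEX, origin) is not None


# Wiring sketch (requires fastapi; shown as a comment to keep this block stdlib-only):
#
#   from fastapi.middleware.cors import CORSMiddleware
#   app.add_middleware(
#       CORSMiddleware,
#       allow_origin_regex=LOCAL_ORIGIN_REGEX,
#       allow_methods=["*"],
#       allow_headers=["*"],
#   )
```

Escaping the dots in `127\.0\.0\.1` and requiring a full match matters here: without them, an origin such as `http://localhost.evil.com` could slip through.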
The server runs on port 7860 by default (configurable via PORT in .env). In development, uvicorn runs with reload=True and watches the harness/ directory, so any change to a route file restarts the server automatically, with no need to stop and relaunch it by hand. Start it with python start.py.
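A launcher along these lines would match the behaviour described. This is a sketch, not the contents of start.py: it assumes the .env values have already been loaded into the process environment, that the bind host is 127.0.0.1, and that the app is importable as `harness.api.main:app`.

```python
import os
from typing import Mapping, Optional

DEFAULT_PORT = 7860


def resolve_port(env: Optional[Mapping[str, str]] = None) -> int:
    """Read PORT from the environment, falling back to 7860 when unset or blank."""
    env = os.environ if env is None else env
    raw = env.get("PORT", "").strip()
    return int(raw) if raw else DEFAULT_PORT


def main() -> None:
    # Imported lazily so resolve_port() can be used without uvicorn installed.
    import uvicorn

    uvicorn.run(
        "harness.api.main:app",   # import string, required for reload to work
        host="127.0.0.1",         # assumed bind address
        port=resolve_port(),
        reload=True,              # restart on source changes (development only)
        reload_dirs=["harness"],  # watch only the harness/ directory
    )
```

Calling `main()` from start.py would start the server. Passing the app as an import string rather than an object is what lets uvicorn re-import it after each reload.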
Auto-generated OpenAPI docs are available at http://localhost:7860/docs while the server is running. Every endpoint is documented with its request and response schema, which is useful for exploring the API without reading the route source.