May 31, 2026 • 7 min read • Agentic Harness Engineering

The Site Generation Skills: /design and /build-page

/design visits any URL with Playwright, extracts CSS custom properties and computed element styles, analyzes screenshots with a vision model, and synthesizes a structured 10-section design system document. /build-page takes that document and a folder of markdown files and produces a self-contained HTML page in three LLM passes: content clustering, shell generation, and per-file card injection.

The typical use case is turning a literature review — dozens of annotated markdown files — into a themed web page that matches an existing site's visual identity. The design system is the intermediary: it captures the site's colors, typography, spacing, and component styles in a portable format that the page generator can apply to arbitrary content without requiring the original site's codebase or assets.

/design: extracting a design system

# Extract design system from a URL
python agent.py "/design https://nickmccarty.me save to design.md"

# Bare domain also works
python agent.py "/design stripe.com save to stripe-design.md"
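The post does not say how a bare domain is turned into a navigable URL. One plausible approach, sketched here with a hypothetical helper named normalize_target, is to default to https when no scheme is present:

```python
from urllib.parse import urlparse


def normalize_target(raw: str) -> str:
    """Return an absolute URL, assuming https when no scheme is given.

    Hypothetical helper: the skill's real normalization is not shown in the post.
    """
    raw = raw.strip()
    if "://" not in raw:
        raw = "https://" + raw
    if not urlparse(raw).netloc:
        raise ValueError(f"not a usable URL: {raw!r}")
    return raw
```

With this sketch, both "stripe.com" and "https://nickmccarty.me" resolve to a URL that a browser can open directly.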

The skill launches Playwright Chromium at 1440×900 with a realistic user-agent string, navigates to the URL, and waits 2 seconds after domcontentloaded for fonts and JavaScript to settle. It then runs a JavaScript extraction in the browser context that collects four data types:

CSS custom properties — all --variable: value pairs from the :root selector, up to 60 variables
Computed styles — getComputedStyle() for 10 element types: h1, h2, h3, body, buttons, nav/header, hero section, cards, inputs, footer. Properties collected include fontFamily, fontSize, fontWeight, color, backgroundColor, borderRadius, padding, boxShadow
Background colors — distinct backgroundColor values across all section, header, footer, and main elements, converted from rgb() to hex
Google Fonts links — any <link> tag referencing fonts.googleapis.com
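The rgb()-to-hex conversion for background colors can be illustrated in a few lines. This is a minimal sketch, not the skill's code: the function names are hypothetical, the post does not say whether the conversion runs in the browser or in Python, and skipping fully transparent values is an assumption.

```python
import re
from typing import List, Optional

# Matches the comma syntax that getComputedStyle() reports, e.g. "rgb(10, 20, 30)"
# or "rgba(0, 0, 0, 0)". The optional fourth group is the alpha channel.
_RGB = re.compile(
    r"rgba?\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*(?:,\s*([\d.]+)\s*)?\)"
)


def rgb_to_hex(value: str) -> Optional[str]:
    """Convert an rgb()/rgba() string to #rrggbb, or None if unusable."""
    match = _RGB.fullmatch(value.strip())
    if match is None:
        return None
    r, g, b, alpha = match.groups()
    if alpha is not None and float(alpha) == 0:
        return None  # assumption: fully transparent backgrounds carry no color
    return "#{:02x}{:02x}{:02x}".format(*(min(int(c), 255) for c in (r, g, b)))


def distinct_hex(values: List[str]) -> List[str]:
    """Keep the first occurrence of each color, preserving page order."""
    seen: List[str] = []
    for value in values:
        hex_color = rgb_to_hex(value)
        if hex_color is not None and hex_color not in seen:
            seen.append(hex_color)
    return seen
```

Preserving first-seen order keeps the palette in roughly the order a visitor encounters it, which may help the later synthesis step tell page backgrounds from accent sections.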

Three screenshots are taken: a full-page capture, an above-fold hero shot, and a mid-page component section. The hero and mid-page shots are passed to the vision model (in daemon threads with a configurable timeout, default 300 seconds) for aesthetic description. If the vision model times out due to GPU contention, the skill continues with CSS tokens only.
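The timeout behavior described above can be sketched with a daemon thread and a bounded join. The names describe_with_timeout and slow_model are hypothetical, and the callable stands in for whatever vision-model call the skill actually makes.

```python
import threading
import time
from typing import Callable, Dict, Optional


def describe_with_timeout(
    describe: Callable[[], str], timeout_s: float = 300.0
) -> Optional[str]:
    """Run a vision call in a daemon thread; return None on timeout or error."""
    result: Dict[str, str] = {}

    def worker() -> None:
        try:
            result["text"] = describe()
        except Exception:
            pass  # treat a failed call like a timeout: no description

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)  # wait at most timeout_s seconds
    if thread.is_alive():
        return None  # still running; the daemon thread will not block exit
    return result.get("text")


def slow_model() -> str:
    """Stand-in for a vision call stalled by GPU contention."""
    time.sleep(1.0)
    return "late description"
```

A None result is the signal to continue with CSS tokens only. Because the thread is a daemon, a stalled call is abandoned rather than cancelled, so it does not keep the process alive.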

The design system document

All extracted data is assembled into a prompt that instructs the LLM to produce a structured markdown document with a fixed set of 10 sections.

Proprietary fonts are handled gracefully: the prompt instructs the LLM to document a Google Fonts substitute at the top of the document so the page generator can load it from CDN without needing the original font files.

/build-page: three-pass HTML generation

# Generate a page from a design system and content folder
python agent.py "/build-page design.md from ~/Desktop/lit-review/ save to index.html"

# With visual refinement (opens Chrome, compares screenshots, iterates)
python agent.py "/build-page design.md from ~/Desktop/lit-review/ save to index.html --refine 3"

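The page generator reads all .md files from the content directory and runs three sequential LLM passes. Each pass has a distinct responsibility and its own prompt. The passes do not share a conversation context: a later pass sees only the outputs explicitly handed to it, such as the cluster plan or a file's assigned role. This keeps each call focused and avoids context window bloat.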

Pass 1

Content analysis

LLM receives file summaries (title + opening paragraph per file) and returns a JSON cluster plan: 3–6 topic groups, each file assigned a display role — featured, card, or compact. Falls back to a flat "All Papers" cluster if JSON parsing fails.
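The fallback path might look like the sketch below. The JSON shape (a "clusters" list whose entries hold a "title" and a "files" list) is an assumption, since the post does not show the schema, and so is the choice to give every file the card role in the fallback plan.

```python
import json
from typing import Any, Dict, List

ROLES = {"featured", "card", "compact"}


def fallback_plan(filenames: List[str]) -> Dict[str, Any]:
    """Flat plan used when the model's JSON cannot be used."""
    return {
        "clusters": [
            {
                "title": "All Papers",
                # assumption: the default role in the fallback is "card"
                "files": [{"name": name, "role": "card"} for name in filenames],
            }
        ]
    }


def parse_cluster_plan(raw: str, filenames: List[str]) -> Dict[str, Any]:
    """Parse the Pass 1 response, falling back to one flat cluster."""
    try:
        plan = json.loads(raw)
        for cluster in plan["clusters"]:
            for entry in cluster["files"]:
                if entry["role"] not in ROLES:
                    raise ValueError("unknown role: {!r}".format(entry["role"]))
    except (ValueError, KeyError, TypeError):
        # json.JSONDecodeError is a subclass of ValueError
        return fallback_plan(filenames)
    return plan
```

Validating roles at parse time means a malformed plan fails here rather than later, when Pass 3 looks up a per-role budget.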

Pass 2

Shell generation

LLM receives the cluster plan JSON and design system (6,000 char cap) and generates a complete HTML document with placeholders where cards will go. Nav, hero, section headings, footer, and all CSS are included in this pass.

Pass 3

Card injection

One LLM call per file. Each call receives the filename, its assigned role, the CSS class to use, and the file content (capped by role: 5,000 chars featured, 3,000 card, 800 compact). Returns an <article> element that replaces the placeholder.
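The per-role caps amount to a lookup table plus a slice. In this minimal sketch, card_request and its dictionary keys are hypothetical names for whatever the skill passes to each Pass 3 call. Only the three cap values come from the description above.

```python
from typing import Dict

# Character caps per display role, as stated in the post.
ROLE_CHAR_CAPS: Dict[str, int] = {"featured": 5000, "card": 3000, "compact": 800}


def card_request(filename: str, role: str, css_class: str, content: str) -> Dict[str, str]:
    """Assemble the inputs for one card-injection call (hypothetical shape)."""
    if role not in ROLE_CHAR_CAPS:
        raise ValueError("unknown role: {!r}".format(role))
    return {
        "filename": filename,
        "role": role,
        "css_class": css_class,
        "content": content[: ROLE_CHAR_CAPS[role]],  # hard cut at the role's cap
    }
```

A plain slice can cut mid-sentence. Whether the real skill trims at a paragraph boundary instead is not stated.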

The role system controls both content depth and token budget. Featured cards get all body paragraphs, key findings, and tags; card-role files get an abstract paragraph plus 3–5 bullet findings; compact-role files get only a title and one sentence. This means a collection of 30 papers generates 32 LLM calls total (1 analysis + 1 shell + 30 cards) rather than attempting to fit everything into one enormous prompt.
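The call count follows directly from the pass structure. As a worked check, with call_budget as a hypothetical helper that leaves out any calls made by the optional refinement loop:

```python
from typing import Dict, List


def call_budget(roles: List[str]) -> Dict[str, int]:
    """Count LLM calls for one run; roles holds one display role per file."""
    cards = len(roles)
    return {"analysis": 1, "shell": 1, "cards": cards, "total": 2 + cards}


# 30 papers: 1 analysis + 1 shell + 30 cards
budget = call_budget(["card"] * 30)
```

The total grows linearly with the number of files, while each individual prompt stays bounded by its role's character cap.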

Visual refinement loop

When --refine N is specified and a vision model is available, the skill opens the generated HTML in headless Chromium, screenshots it, and compares the result to the original site screenshots captured during /design. A diff prompt identifies up to five specific visual differences (wrong hex color, font-size should be Xpx, spacing mismatch). A refine prompt then applies only those changes, outputting the complete updated HTML. The loop runs up to N iterations or until the diff prompt returns "NO CHANGES NEEDED".
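The loop's control flow can be sketched independently of the model calls. Here diff and refine are hypothetical callables standing in for the two prompts, and the screenshot and vision steps are assumed to happen inside diff.

```python
from typing import Callable, Tuple

STOP_SENTINEL = "NO CHANGES NEEDED"


def refine_loop(
    html: str,
    diff: Callable[[str], str],
    refine: Callable[[str, str], str],
    max_iters: int,
) -> Tuple[str, int]:
    """Run up to max_iters diff/refine rounds; return (html, rounds applied)."""
    applied = 0
    for _ in range(max_iters):
        fixes = diff(html)  # list of specific visual differences, or the sentinel
        if STOP_SENTINEL in fixes.upper():
            break  # the diff step found nothing left to fix
        html = refine(html, fixes)  # returns the complete updated document
        applied += 1
    return html, applied
```

Checking for the sentinel before calling refine means a page that already matches costs one diff call per run and no rewrite.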

# Diff step — low temperature, constrained to specific changes
"Compare these two page descriptions and list up to 5 SPECIFIC visual differences
to fix in the HTML (be precise: wrong color #hex, font-size should be Xpx, etc.)."

# Refine step — applies only the listed fixes, returns full HTML
"Apply ONLY the listed fixes. Output the complete updated HTML document.
Do not change content — only visual/layout properties."

The refinement loop truncates the current HTML to 24,000 characters before passing it to the refine prompt. For large pages with many content cards this may cause the model to lose later sections. Run refinement on pages with fewer than ~15 cards for best results, or use --refine 1 for a single pass that catches the most obvious structural issues.
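Before running refinement on a large page, a quick check can show how much of it falls outside the 24,000-character window. This is a hypothetical helper, not part of the skill, and it assumes each card is an <article> element as described for the card injection pass.

```python
from typing import Dict

REFINE_CHAR_CAP = 24_000  # truncation limit stated in the post


def truncation_report(html: str, cap: int = REFINE_CHAR_CAP) -> Dict[str, int]:
    """Report how much of the page the refine prompt would not see."""
    visible = html[:cap]
    return {
        "total_chars": len(html),
        "dropped_chars": max(0, len(html) - cap),
        "cards_total": html.count("<article"),
        # counts cards whose opening tag starts inside the window;
        # the last of these may still be cut off partway through
        "cards_visible": visible.count("<article"),
    }
```

If cards_visible is lower than cards_total, the refine step works from a partial document, so its complete updated HTML may omit the later cards.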

The shell generation prompt explicitly forbids generating card content: "Do NOT generate card titles, abstracts, or any paper content. Cards will be injected later." Without this constraint, the shell LLM call would hallucinate placeholder content that conflicts with the actual content injected in pass 3 — a common failure mode in single-pass page generation.
