May 26, 2026 • 6 min read • Agentic Harness Engineering

The Email Skill: Personalized Outreach Drafts from Conference Speaker CSVs

Two LLM calls per contact — a subject line and a warm-and-specific body — with slide content fetched automatically via MarkItDown, per-contact JSON output, and every draft logged to runs.jsonl for full dashboard visibility.

After attending a conference or watching a batch of recorded talks, following up with speakers is high-value but time-consuming to personalize at scale. email_skill.py turns a CSV of speaker records into a directory of ready-to-review JSON drafts. The drafts are not templates — each one references the speaker's specific talk topic, connects to the sender's context, and is grounded in extracted slide content when available.

CSV input format

The skill reads any CSV that contains speaker data. Column requirements are tiered:

Column	Notes
namerequired	Full name. First name is extracted for the salutation.
affiliationrequired	University, company, or research group.
markdownpreferred	Pre-converted slide text. If present, used directly (first 600 chars).
summarypreferred	Talk abstract or summary. Included in the body prompt (first 400 chars).
topic_keywordspreferred	Comma-separated or Python-list-format keywords. Used in both subject and body prompts.
content_urloptional	URL to slides or session page. Fetched via MarkItDown if `markdown` is empty.
emailsoptional	Email address(es). Falls back to `emails_regex`, `email`, `contact_email`, `speaker_email` in that order. If none found, draft is written with a `<email-not-found>` placeholder.

The email column fallback chain is intentionally broad — conference CSV exports use inconsistent column names, and the skill adapts rather than requiring a fixed schema.

Two LLM calls per contact

Subject line (64 tokens)

A separate system prompt instructs the model to write a concise, specific subject line under 60 characters. The prompt includes speaker name, affiliation, topic keywords, sender company, and goal. Output is direct — no "Subject:" prefix, no quotes.

Email body (512 tokens)

A warm, specific body in three paragraphs: sincere thanks referencing the talk topic; natural connection to the sender's context; light mention of the platform framed as useful to the speaker, not as a pitch. No bullet points, no em dashes. Plain professional prose.

Both calls use temperature 0.7, which allows variation across contacts without the outputs becoming generically repetitive. Qwen3 chain-of-thought <think>...</think> blocks are stripped from both outputs before saving.

Slide content via MarkItDown

When a row has markdown pre-populated, that text is used directly. When it's empty but content_url is present, the skill fetches the URL via MarkItDown (MarkItDown(enable_plugins=False).convert_url(url)), which handles PDFs, HTML pages, PPTX files, and DOCX documents. The first 600 characters of the converted text form the slide excerpt injected into the body prompt.

MarkItDown also handles local file paths — passing a local PDF path loads and converts it the same way. The skill treats the source field uniformly: URL → convert_url(), file path with recognized extension → convert(path), anything else → raw text up to 2000 characters.

Output structure

For a batch run, each contact produces a JSON file in the output directory:

{
  "name": "Jane Smith",
  "affiliation": "UC Berkeley BIDS",
  "to_email": "jsmith@berkeley.edu",
  "email_found": true,
  "sender_name": "Nick",
  "sender_email": "nick@upskilled.consulting",
  "subject": "Your geospatial AI talk at GeoWeek",
  "body": "Hi Jane,\n\nThank you for your excellent talk on...",
  "generated_at": "2026-06-01T14:22:31.045Z"
}

A manifest.json in the output directory consolidates all records. The final console summary reports how many contacts had email addresses found versus how many received a placeholder — useful for knowing which drafts need manual address lookup before sending.

Dashboard logging

Every email draft is logged to runs.jsonl as a distinct entry with task_type: "email_draft". The record includes the full token count (subject + body calls combined), the email address, subject line, body text, and the goal string. This means the Sessions view in the dashboard shows email drafting as part of the normal session activity, and the Artifacts view includes the JSON files as outputs.

The skill was built for conference follow-up from geo-week-talks.csv — a CSV of geospatial AI conference speakers. The invocation pattern was:
op /email geo-week-talks.csv reach out about our geospatial AI platform save to outreach/

Single-contact mode

Beyond batch CSV processing, generate_single_email() generates one draft for a named contact with an arbitrary source (URL, file, or inline text). This is the entry point used when the agent dispatches an /email <contact> <goal> slash command from the op CLI — the agent resolves the contact name to an email address from known contacts, assembles the source context, and calls this function directly.

CSV input format

Two LLM calls per contact

Subject line (64 tokens)

Email body (512 tokens)

Slide content via MarkItDown

Output structure

Dashboard logging

Single-contact mode

Related posts