"No code" is an overloaded term. In the context of the autoresearch loop, it means something specific: you don't write any code to run the optimization. You fill in a plain-English template, drop it into an AI agent's working directory, and the agent does the work — generating variants, applying changes, and producing a structured decision log. You read the output and make the keep-or-revert call. That's the entire workflow.

This guide walks through exactly what "filling in a template" looks like in practice, how to run it, and what to do with the output.

The program.md Format

Every autoresearch loop runs from a single file called program.md. It's a plain-text Markdown file with a fixed three-section structure. You don't create this file from scratch — the Autoresearch Playbook provides a ready-to-fill template for each optimization type. Your job is to replace the placeholder text with your specifics.

The three sections every program.md contains:

1. Context Block

This section tells the AI agent everything it needs to know about your situation. It typically includes:

  • What you're optimizing (cold email subject line, landing page headline, pricing page CTA)
  • Who your target audience is (ICP description — role, company size, the problem they're trying to solve)
  • The current version of the asset (paste in the actual text you want to improve)
  • Any hard constraints (character limits, brand voice notes, things not to change)

The quality of the context block is the biggest determinant of output quality. A vague ICP ("small business owners") produces generic variants. A specific ICP ("solo B2B founders selling consulting services to mid-market companies, struggling with lead generation from outbound") produces targeted, high-quality variants.

2. Metric Definition

This section defines what success looks like for this loop cycle. It has three required elements:

  • The metric: One number (open rate, reply rate, conversion rate, revenue per visitor)
  • The measurement method: How you measure it and over what window
  • The threshold: The minimum improvement required to keep the variant (15% relative is a reasonable default for most tasks)
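The threshold is relative to the baseline, not an absolute number of percentage points, and the difference is easy to get wrong. A minimal sketch of the arithmetic, assuming rates are expressed as fractions; the function names and the sample rates are illustrative and not part of the Playbook:

```python
def relative_lift(baseline_rate: float, variant_rate: float) -> float:
    """Relative improvement of the variant over the baseline (0.15 means 15%)."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (variant_rate - baseline_rate) / baseline_rate


def keep_variant(baseline_rate: float, variant_rate: float, threshold: float = 0.15) -> bool:
    """Keep the variant only if its relative lift meets the threshold."""
    return relative_lift(baseline_rate, variant_rate) >= threshold


# Hypothetical numbers: a baseline open rate of 20%.
print(keep_variant(0.20, 0.24))  # 20% relative lift -> True
print(keep_variant(0.20, 0.22))  # 10% relative lift -> False
```

In this example, moving from a 20% to a 22% open rate is two percentage points but only a 10% relative lift, so it would not clear the default threshold.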

3. Variant Instructions

This section tells the agent what kind of improvement to generate. It's not "write a better headline" — it's a specific brief:

  • What strategy to use (curiosity gap, outcome-first, problem-first)
  • What format constraints apply (maximum 10 words, must be a question, etc.)
  • What to avoid (overused phrases, competitor names, anything that implies a claim you can't substantiate)

The Autoresearch Playbook ships each template with a completed example. Before filling in your first program.md, read through the example once — it takes five minutes and dramatically reduces the friction of filling in your own.
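Before handing a filled-in program.md to the agent, a quick check can catch a missing section or a placeholder you forgot to replace. The sketch below assumes the three sections appear as Markdown headings with the names used in this guide and that placeholders look like [UPPERCASE TEXT]; both are assumptions, so adjust them to match your actual template.

```python
import re

# Assumed heading names; match them to whatever your template actually uses.
REQUIRED_SECTIONS = ("Context Block", "Metric Definition", "Variant Instructions")


def missing_sections(text: str) -> list[str]:
    """Return required section names that never appear as a Markdown heading."""
    headings = {
        m.group(1).strip().lower()
        for m in re.finditer(r"^#{1,6}\s+(.+?)\s*$", text, flags=re.MULTILINE)
    }
    return [name for name in REQUIRED_SECTIONS if name.lower() not in headings]


def unfilled_placeholders(text: str) -> list[str]:
    """Return leftover [PLACEHOLDER]-style tokens (a hypothetical placeholder style)."""
    return re.findall(r"\[[A-Z][A-Z0-9_ ]+\]", text)


# Hypothetical, partly filled-in file used only to exercise the checks.
sample = """# Context Block
Asset: cold email subject line
Audience: [ICP DESCRIPTION]

# Metric Definition
Metric: open rate
"""
print(missing_sections(sample))       # ['Variant Instructions']
print(unfilled_placeholders(sample))  # ['[ICP DESCRIPTION]']
```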

Running the Template in Claude Code

Claude Code is Anthropic's AI coding agent, a command-line tool that can read files, generate content, and apply changes to your project. It's the primary agent the Autoresearch Playbook is designed for, though other agents that can read and edit files in your project (Cursor, Continue) should also work.

To run a program.md template in Claude Code:

  1. Open your terminal and navigate to the directory containing your program.md and the asset you're optimizing (your landing page HTML file, your email sequence, etc.)
  2. Start Claude Code: claude
  3. Type: "Read program.md and run the autoresearch loop as specified"
  4. The agent reads the file, generates variants, selects the best one according to your instructions, applies it to the target file, and outputs a decision log
  5. Review the decision log. The agent explains each variant it considered and why it selected the one it did
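If you would rather script the run than type the prompt by hand, the steps above can be sketched as a small wrapper. This assumes the claude command is on your PATH and uses its non-interactive print flag (-p); in that mode the agent may not be allowed to edit files unless you have configured permissions for it, so treat this as a starting point rather than a replacement for the interactive session. The function names are illustrative.

```python
import shutil
import subprocess
from pathlib import Path

PROMPT = "Read program.md and run the autoresearch loop as specified"


def build_command(prompt: str = PROMPT) -> list[str]:
    """Argument list for a one-shot, non-interactive Claude Code run."""
    return ["claude", "-p", prompt]


def run_loop(workdir: str) -> str:
    """Run the agent inside workdir and return whatever it prints."""
    root = Path(workdir)
    if not (root / "program.md").is_file():
        raise FileNotFoundError(f"no program.md in {root}")
    if shutil.which("claude") is None:
        raise RuntimeError("claude CLI not found on PATH")
    result = subprocess.run(
        build_command(), cwd=root, capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling run_loop("path/to/project") returns the agent's printed output, which you can save alongside the asset as the decision log for that cycle.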

The decision log is where your judgment enters. You read what the agent generated, decide whether it's worth testing, and either accept the change or ask the agent to try a different approach. You're not approving every word — you're making a call on whether the direction is right.

Running with Ollama (Local Models, Free)

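If you'd rather not use a paid API, you can run the autoresearch loop entirely on your own machine using Ollama. Ollama is a tool that runs open-weight language models (Llama 3, Mistral, Phi-4, and others) locally, with no API cost. Ollama serves the model rather than acting as an agent itself, so you either point a compatible agent at it or copy the model's chosen variant into the target file yourself.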

The trade-off is output quality: local models produce competitive results for most optimization tasks (headline variants, subject line testing, offer framing) but may require more review cycles for complex tasks like market research synthesis. The Autoresearch Playbook's cost appendix includes a model selection guide that maps task type to the minimum model quality needed.

For most solo founders starting out, Ollama with Llama 3 is a reasonable default: you get unlimited iterations at zero marginal cost, and you can upgrade to Claude or GPT-4o later for tasks where you need higher-confidence outputs.
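As a rough illustration of the local route, the sketch below sends the contents of program.md to Ollama's local HTTP endpoint and returns the model's text. It assumes the Ollama server is running on its default port and that the llama3 model has already been pulled; the prompt wording and function names are illustrative. It only generates variants and reasoning, so applying the chosen variant to your asset remains a manual step here.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_payload(program_text: str, model: str = "llama3") -> dict:
    """Request body asking for one complete, non-streamed response."""
    prompt = (
        "Follow the program below. Generate the variants it asks for and "
        "explain which one you would select and why.\n\n" + program_text
    )
    return {"model": model, "prompt": prompt, "stream": False}


def generate_variants(program_text: str, model: str = "llama3") -> str:
    """Send the program to a locally running Ollama server and return its text."""
    body = json.dumps(build_payload(program_text, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

A typical call would read the file first, for example generate_variants(open("program.md").read()), and then save the returned text as that cycle's decision log.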

Reading the Output and Making the Decision

After the agent runs the loop, you have three things to read:

  1. The applied variant: What the agent changed. Review this before you deploy — the agent's judgment is good but not infallible. Look for anything that contradicts your brand voice or makes a claim you can't support.
  2. The decision log: Why the agent selected this variant over the others it generated. This is useful for understanding what the agent is optimizing toward, and for calibrating your next set of variant instructions if you want to try a different approach.
  3. The rejected variants: The other options the agent considered. These aren't waste — they're your backup variants if the selected one underperforms.

Your keep-or-revert decision comes after your measurement window, not before. You deploy the variant, collect data, compare to your baseline, and then decide. The agent's output is a recommendation, not a verdict.

Scaling Up: Running Multiple Loops in Parallel

Once you've run your first loop and gotten comfortable with the workflow, you can scale by running multiple loops on different assets simultaneously. A cold email subject line loop and a landing page headline loop can run in parallel — they're measuring different metrics on different assets and don't interfere with each other.

The discipline is to keep each loop focused on one variable. Two simultaneous tests on the same asset (say, testing both the headline and the offer at the same time on your landing page) invalidate both measurements, because you can't tell which change moved the metric. One loop, one asset, one variable at a time, though you can have multiple assets in optimization simultaneously.
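If you track your active loops in a list, the one-variable-per-asset rule is easy to check mechanically. A minimal sketch, assuming each loop is recorded as an asset plus the variable under test; the Loop type and the sample entries are hypothetical:

```python
from collections import defaultdict
from typing import NamedTuple


class Loop(NamedTuple):
    asset: str     # e.g. "landing page"
    variable: str  # e.g. "headline"


def conflicting_assets(loops: list[Loop]) -> dict[str, list[str]]:
    """Map each asset with more than one active loop to the variables under test."""
    by_asset: dict[str, list[str]] = defaultdict(list)
    for loop in loops:
        by_asset[loop.asset].append(loop.variable)
    return {asset: variables for asset, variables in by_asset.items() if len(variables) > 1}


active = [
    Loop("cold email", "subject line"),
    Loop("landing page", "headline"),
    Loop("landing page", "offer"),
]
print(conflicting_assets(active))  # {'landing page': ['headline', 'offer']}
```

An empty result means every asset has at most one variable in flight.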

A realistic steady state for a solo founder running the autoresearch system: two to three active loops at any given time (one on cold email, one on landing page, one on a pricing element), each cycling every three to four weeks. That's roughly one new experiment result every week or two, which compounds into significant improvements over a year.
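The cadence follows directly from the number of active loops divided by the cycle length. A small worked calculation, assuming every loop produces exactly one result per cycle and runs all year:

```python
def results_per_week(active_loops: int, weeks_per_cycle: float) -> float:
    """Average experiment results per week when each loop yields one result per cycle."""
    return active_loops / weeks_per_cycle


for loops, weeks in [(2, 4), (3, 4), (3, 3)]:
    rate = results_per_week(loops, weeks)
    print(f"{loops} loops, {weeks}-week cycles: {rate:.2f} results/week, ~{rate * 52:.0f} per year")
# 2 loops, 4-week cycles: 0.50 results/week, ~26 per year
# 3 loops, 4-week cycles: 0.75 results/week, ~39 per year
# 3 loops, 3-week cycles: 1.00 results/week, ~52 per year
```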

Take the free 12-point assessment to find out which of your assets has the most headroom for improvement and get a recommended starting point for your first loop.