Methodology

How this actually works.

How the panel is built, how reactions are generated, what the scores mean, and — importantly — what this is not. The numbers on this page are computed from the shipping panel files at build time, not maintained by hand.

  • Panel members: 510 (hand-written people)
  • Distinct occupations: 509 (no copy-paste personas)
  • States covered: 51 (incl. DC + PR)
  • Median age: 42 (range 16–78)
  • Audience packs: 18 (curated presets)
  • Data sets: 12 (+ 6 sample messages)

Computed at build time from src/personas/_compiled/americans.json — the same file the run engine loads.
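
A minimal sketch of how headline numbers like these could be derived from a compiled persona file. The field names (age, occupation, state), the assumption that the file is a JSON array, and the three-person sample are illustrative only; the real layout of americans.json is not documented on this page.

```python
import json
from pathlib import Path
from statistics import median


def load_panel(path: str) -> list:
    # Assumption: the compiled file is a JSON array of persona objects.
    return json.loads(Path(path).read_text(encoding="utf-8"))


def panel_stats(personas: list) -> dict:
    """Summarize a persona list. Field names are hypothetical."""
    ages = [p["age"] for p in personas]
    return {
        "members": len(personas),
        "distinct_occupations": len({p["occupation"] for p in personas}),
        "states_covered": len({p["state"] for p in personas}),
        "median_age": median(ages),
        "age_range": (min(ages), max(ages)),
    }


# Tiny made-up sample so the sketch runs without the real panel file.
sample = [
    {"age": 36, "occupation": "ex-marketing manager", "state": "IN"},
    {"age": 52, "occupation": "electrician", "state": "OH"},
    {"age": 24, "occupation": "line cook", "state": "IN"},
]
stats = panel_stats(sample)
```

Running the same function over the output of load_panel at build time would keep the displayed numbers in step with whatever file the engine loads.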

the method

An agentic focus group, not a survey.

Coldread takes any written asset (an ad, landing page, email, speech, debate answer, public statement, agent output) and runs it past a focus group of specifically-defined people. Each "person" is a written persona — name, age, occupation, region, voice, biographical detail — that the model speaks as. You get back individual reactions, scores, recurring objections, and ranked rewrite suggestions in minutes.

the run, step by step

Asset in → 510 independent agents → synthesized verdict.

The architecture is genuinely multi-agent: parallel calls with private context, a blind synthesis layer, and graceful degradation when a batch fails.

  1. Asset in

    Your ad, email, speech, statement, or URL becomes the stimulus. Optionally grounded with voter data and source notes.

  2. Private conditioning

    Each panel member is conditioned only on their own persona file — name, age, occupation, region, voice, biography.

  3. Parallel reactions

    One independent agent call per person. No cross-talk, no shared verdict, no consensus pressure. Disagreement is emergent.

  4. Blind synthesis

    A separate judge pass aggregates the reactions without ever seeing persona demographics — so it can't stereotype the read.

  5. Decision report

    Topline score, sentiment split, objection clusters, segment splits, trust and confusion risk, ranked rewrites.
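
The five steps above can be sketched roughly as follows. This is an illustration of the pattern (parallel calls with private context, a synthesis pass that never receives demographics, and tolerance for failed calls), not Coldread's actual engine. The react function is a stand-in for a model call, and every name here is hypothetical.

```python
import asyncio
import random
from dataclasses import dataclass
from statistics import mean


@dataclass
class Reaction:
    score: int  # 1-10
    text: str


async def react(persona: dict, asset: str) -> Reaction:
    """Stand-in for one model call; it sees only this persona and the asset."""
    await asyncio.sleep(0)  # placeholder for network latency
    if persona.get("fail"):
        raise RuntimeError("simulated call failure")
    rng = random.Random(persona["name"])  # deterministic fake score
    return Reaction(score=rng.randint(1, 10), text="placeholder reaction")


def synthesize(reactions: list) -> dict:
    """Blind synthesis: receives scores and text only, never persona fields."""
    scores = [r.score for r in reactions]
    return {"n": len(scores), "avg_score": mean(scores) if scores else None}


async def run_panel(personas: list, asset: str) -> dict:
    # One independent call per persona; failures come back as exceptions
    # instead of cancelling the rest of the run.
    results = await asyncio.gather(
        *(react(p, asset) for p in personas), return_exceptions=True
    )
    ok = [r for r in results if isinstance(r, Reaction)]
    report = synthesize(ok)
    report["failed"] = len(results) - len(ok)
    return report


personas = [{"name": "A"}, {"name": "B"}, {"name": "C", "fail": True}]
report = asyncio.run(run_panel(personas, "Sample ad copy"))
```

In this sketch degradation is handled per call; a real engine that dispatches in batches would apply the same idea at batch level and report how many reactions were dropped.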

Audience-mode runs work the same way but generate archetype agents from your audience description, source context, and the simulated population size you enter, then sample variant personas inside each archetype. The hosted website generates a 240-respondent sample (40 archetypes × 6 variants) to represent that population; CLI and MCP callers can request larger direct runs within server caps. Hybrid runs read the named panel and the generated audience together.
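
As a rough illustration of the hosted sample size, the 40 × 6 layout can be enumerated directly. The equal weighting of respondents and the example population of 1,200,000 are assumptions made for this sketch; the page does not say how the sample is weighted against the simulated population.

```python
from itertools import product

ARCHETYPES = 40  # from the hosted configuration described above
VARIANTS = 6


def build_sample(archetypes: int = ARCHETYPES, variants: int = VARIANTS) -> list:
    """Enumerate (archetype_index, variant_index) respondent slots."""
    return list(product(range(archetypes), range(variants)))


def people_per_respondent(simulated_population: int, sample_size: int) -> float:
    """People each respondent stands for, assuming equal weighting."""
    return simulated_population / sample_size


slots = build_sample()
weight = people_per_respondent(1_200_000, len(slots))  # hypothetical population
```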

the panel, measured

510 hand-written Americans.

Across trades, healthcare, public service, professional work, retail, agriculture, military, education, and creative work. Written one at a time, not generated in bulk — 509 distinct occupations across 510 people.

  • under 30: 73 · 14%
  • 30–44: 248 · 49%
  • 45–64: 180 · 35%
  • 65+: 9 · 2%
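
The percentages follow from the counts. A short sketch that recomputes them, with a bucketing helper whose boundaries are read off the labels above:

```python
def age_bucket(age: int) -> str:
    """Map an age to one of the four buckets used on this page."""
    if age < 30:
        return "under 30"
    if age <= 44:
        return "30-44"
    if age <= 64:
        return "45-64"
    return "65+"


# Counts as published in the breakdown above.
PUBLISHED = {"under 30": 73, "30-44": 248, "45-64": 180, "65+": 9}

total = sum(PUBLISHED.values())
shares = {bucket: round(100 * n / total) for bucket, n in PUBLISHED.items()}
```

The counts sum to 510, and the rounded shares match the published 14%, 49%, 35%, and 2%.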

Specific, not archetypal — not "suburban mom" but Beth Howell, 36, suburban Indianapolis, ex-marketing manager who sees the manipulation in your copy and dismisses it in two seconds.

meet the panel

Real entries from the shipping roster.

Pulled straight from the persona files the engine runs.

Rotating sample of the shipping roster · subscribers can browse all entries via coldread panel show americans

measured against humans

The MIT focus group benchmark.

A public focus group baseline on AI and deepfakes — 3 human groups, 39 participants — scored against a fixed 10-category insight rubric, 0–2 points per category.

  • Rubric points recovered: 18/20
  • Coldread reactions: 546 (same topic brief)
  • Model spend: $0.097 (for the full run)
  • Human baseline: 39 participants, 3 groups

What it recovered, and the two misses

Coldread recovered trust erosion, scams and impersonation, political misinformation, verification behavior, accessibility benefits, consent risk, bias, and audience relevance gaps. The two partial misses matter: the human groups gave more texture to the "liar's dividend" problem (real evidence later dismissed as fake), and surfaced institutional disruption themes more strongly.
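
The 18/20 figure is consistent with eight categories scored in full and two scored as partial. A worked sketch of that arithmetic follows. The mapping of the ten themes named above onto the ten rubric categories, and the reading of "partial miss" as 1 point out of 2, are assumptions for illustration; the rubric itself is not reproduced on this page.

```python
MAX_PER_CATEGORY = 2

# Assumed per-category points: 2 = recovered, 1 = partial miss.
RUBRIC_POINTS = {
    "trust erosion": 2,
    "scams and impersonation": 2,
    "political misinformation": 2,
    "verification behavior": 2,
    "accessibility benefits": 2,
    "consent risk": 2,
    "bias": 2,
    "audience relevance gaps": 2,
    "liar's dividend": 1,
    "institutional disruption": 1,
}

total = sum(RUBRIC_POINTS.values())
maximum = MAX_PER_CATEGORY * len(RUBRIC_POINTS)

# Model spend per reaction, from the published run figures.
cost_per_reaction = 0.097 / 546
```

On these assumptions the total is 18 of 20, and the published spend works out to roughly $0.00018 per reaction.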

That is why we do not claim AI replaces human research. The claim is narrower: a fast first read on what people will understand, reject, trust, ignore, or question — before you spend weeks recruiting a panel.

under the hood

The persona schema and the guard rails.

The research-grade detail, collapsed so the page stays readable.

The five-tier persona schema

Each panel member is described by a multi-tier profile. The ordering reflects empirical priority: a narrative biography predicts how a person responds far better than any structured demographic field. In Park et al., Generative Agent Simulations of 1,000 People (2024), interview-conditioned agents replicated their source individuals at 85% of the source's own two-week test–retest reliability; demographic-only agents reached 71%.

  • Tier 1 — stable disposition. Big Five percentiles, Schwartz value axes, Moral Foundations weights, worldview anchors, attachment style, need for cognition, regulatory focus.
  • Tier 2 — biography. A narrative life-history sketch with formative shocks captured dimensionally rather than as a cumulative count.
  • Tier 3 — active goals. What this person is trying to accomplish when they encounter the asset.
  • Tier 4 — state & context. Mood, cognitive load, social context.
  • Tier 5 — communicative style. Vocabulary level, hedging tendency, dissent willingness, scale-extremity tendency — without per-persona extremity priors, LLM personas over-emit scale endpoints (≈78% vs. humans' ≈35%).

The 510-person Americans panel was hand-authored before this schema landed; it uses Tier 2 narrative + a subset of Tier 1/5 fields and is being progressively backfilled. Auto-generated panels ship with the full schema.
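
One way to picture the five tiers is as a nested record, with a helper that reports which tiers are still empty for a partially backfilled persona. All class and field names below are hypothetical; they mirror the tier descriptions above rather than the actual persona file format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Disposition:  # Tier 1: stable disposition
    big_five: dict = field(default_factory=dict)  # trait -> percentile
    schwartz_values: dict = field(default_factory=dict)
    moral_foundations: dict = field(default_factory=dict)
    worldview_anchors: list = field(default_factory=list)
    attachment_style: Optional[str] = None
    need_for_cognition: Optional[float] = None
    regulatory_focus: Optional[str] = None


@dataclass
class Style:  # Tier 5: communicative style
    vocabulary_level: Optional[str] = None
    hedging: Optional[float] = None
    dissent_willingness: Optional[float] = None
    scale_extremity: Optional[float] = None


@dataclass
class Persona:
    name: str
    age: int
    occupation: str
    region: str
    biography: str = ""  # Tier 2: narrative life history
    disposition: Disposition = field(default_factory=Disposition)
    goals: list = field(default_factory=list)  # Tier 3: active goals
    context: dict = field(default_factory=dict)  # Tier 4: state and context
    style: Style = field(default_factory=Style)


def _has_data(obj) -> bool:
    # A tier counts as filled if any field holds something other than
    # None or an empty container.
    return any(v not in (None, {}, []) for v in vars(obj).values())


def missing_tiers(p: Persona) -> list:
    """List tiers with no data yet, e.g. to plan a backfill."""
    filled = {
        "tier1": _has_data(p.disposition),
        "tier2": bool(p.biography.strip()),
        "tier3": bool(p.goals),
        "tier4": bool(p.context),
        "tier5": _has_data(p.style),
    }
    return [tier for tier, ok in filled.items() if not ok]


beth = Persona(
    name="Beth Howell",
    age=36,
    occupation="ex-marketing manager",
    region="suburban Indianapolis",
    biography="Sees the manipulation in your copy and dismisses it fast.",
)
gaps = missing_tiers(beth)
```

A persona with only a biography reports tiers 1, 3, 4, and 5 as gaps, which is the kind of list a backfill pass would work through.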

Anti-bias guard rails
  • Mode collapse / extremity bias — personas are instructed to use the middle of the scale when genuinely mixed; each persona's scale-extremity tendency is a per-persona prior.
  • Sycophancy — "Praise must be earned by something specific in the asset." Uncertainty is normalized as a valid reaction.
  • WEIRD skew — the recruiter prompt explicitly resists the young/educated/urban/secular default unless that genuinely is the audience.
  • Voice flattening — Tier 5 style fields are rendered into the prompt so personas don't collapse to the LLM's default register.

These are mitigations, not eliminations. Validity work (held-out human benchmarks, GSS / ANES / WVS calibration) is ongoing.
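
Extremity bias can be checked with a simple endpoint-rate measure: the share of responses that land on the lowest or highest scale value. The sketch below assumes a 1–10 scale and uses made-up scores; it shows one way such a check could look and does not describe the metric Coldread actually tracks.

```python
def endpoint_rate(scores: list, low: int = 1, high: int = 10) -> float:
    """Fraction of scores that sit on either end of the scale."""
    if not scores:
        raise ValueError("need at least one score")
    return sum(s in (low, high) for s in scores) / len(scores)


# Made-up scores for illustration: five of ten sit on an endpoint.
example_scores = [1, 10, 10, 5, 6, 7, 10, 1, 3, 8]
rate = endpoint_rate(example_scores)
```

Comparing this rate across runs, or against a human baseline on the same scale, gives a quick signal of whether the mitigations are holding.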

What the scores mean

Each panel member gives the asset a 1–10 score and a short reaction in their voice. The aggregate reports average score, "would act" rate, sentiment buckets, consensus, polarization, confusion risk, trust risk, persuasion lift, backlash risk, message discipline, opposition vulnerability, factual exposure, top and bottom responders, recurring objections, and ranked rewrites. Use them as a directional read — the value is in the named reactions and recurring objections, not the single number at the top.
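
A few of these aggregates can be sketched from per-person reactions. The definitions below are assumptions chosen for illustration: sentiment cut-offs at 4 and 6, and polarization as the population standard deviation of scores. The page does not define how Coldread computes them.

```python
from statistics import mean, pstdev


def aggregate(reactions: list) -> dict:
    """Summarize reactions shaped like {"score": int, "would_act": bool}."""
    scores = [r["score"] for r in reactions]
    sentiment = {"negative": 0, "mixed": 0, "positive": 0}
    for s in scores:
        if s <= 4:  # assumed cut-off
            sentiment["negative"] += 1
        elif s <= 6:  # assumed cut-off
            sentiment["mixed"] += 1
        else:
            sentiment["positive"] += 1
    return {
        "average_score": round(mean(scores), 2),
        "would_act_rate": sum(r["would_act"] for r in reactions) / len(reactions),
        "sentiment": sentiment,
        "polarization": round(pstdev(scores), 2),  # spread of scores
    }


# Made-up reactions for illustration.
summary = aggregate([
    {"score": 2, "would_act": False},
    {"score": 8, "would_act": True},
    {"score": 9, "would_act": True},
    {"score": 5, "would_act": False},
])
```

An average of 6.0 with a spread of 2.74 illustrates why the single topline number is a weak summary: the same average can hide a split panel.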

read this if you read nothing else

What this is not.

This is not literal market research. The panel members are written personas, not recruited respondents. Scores are a structured pre-flight signal — directional, fast, and cheap — not a forecast of conversion lift or population truth.

It is not a substitute for real voter contact, user testing, or paid pilots. Use Coldread as the read you do before spending on those — to kill the obviously-bad version and walk into the expensive test with a stronger draft.

privacy

Processed, then discarded.

Runs are dispatched server-side with a Coldread-managed key; we don't write your asset or reactions to disk. Recent runs live in your browser's localStorage. Full disclosure at /privacy.

pricing

First run free.

Then Early Access $19/mo, Pro $29/mo, or Max $120/mo — every tier includes the full panel, hosted generated-audience reads for any simulated population size, follow-ups, MCP/CLI, and exports. Billing via Stripe.

disagree?

Tell us why.

The panel and prompts evolve based on real complaints. Reach us via /contact.
