
Methods & Validation

A practical playbook for running synthetic market research responsibly - including study types, protocols, reporting norms, and validation checks that turn “AI output” into something more like a measurable research instrument.

Important framing
Synthetic market research is a simulation method. Treat it as a structured way to generate and test hypotheses quickly, then validate what matters with benchmarks and, where appropriate, targeted fieldwork.
Recommended next step:
Start with the disclosure label, then apply the protocol checklist below.
Core workflow (the repeatable loop)
A minimal workflow that keeps synthetic research structured, reproducible, and auditable.
1. Grounding
Define the population frame, segments, and what evidence grounds the panel. Decide what context is allowed and what must be excluded.
2. Protocol
Run a standardised study design (questions, stimuli, controls, and run settings), with enough detail that another team can repeat it.
3. Validation
Report stability, sensitivity, and at least one external benchmark or known-truth check. Label limitations and uncertainty.
If you only do one thing…
Run the same study twice under identical conditions and compare the results. If the outputs are not stable, treat the work as exploratory and avoid quantitative claims.
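
A minimal sketch of that test-retest check, assuming each run yields one score per option; the concept names, scores, and the use of Spearman correlation are illustrative choices, not a fixed standard:

```python
# Minimal test-retest sketch: compare two identical runs of the same study.
# Scores below are illustrative placeholders; in practice they come from your
# own pipeline (e.g. mean agreement per concept, per run).
from scipy.stats import spearmanr

run_a = {"concept_1": 0.62, "concept_2": 0.55, "concept_3": 0.41}
run_b = {"concept_1": 0.58, "concept_2": 0.57, "concept_3": 0.39}

options = sorted(run_a)                       # fixed option order
a = [run_a[o] for o in options]
b = [run_b[o] for o in options]

# Rank stability: Spearman correlation between the two runs' scores.
rho, _ = spearmanr(a, b)

# Rank flips: options whose rank position changes between runs.
rank_a = {o: r for r, o in enumerate(sorted(options, key=run_a.get, reverse=True))}
rank_b = {o: r for r, o in enumerate(sorted(options, key=run_b.get, reverse=True))}
flips = [o for o in options if rank_a[o] != rank_b[o]]

# Per-option drift: absolute score change between runs.
drift = {o: abs(run_a[o] - run_b[o]) for o in options}

print(f"rank correlation: {rho:.2f}, rank flips: {flips}, max drift: {max(drift.values()):.2f}")
```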
Study types
Common research workflows and what they’re good for.
Concept tests
Use for fast iteration on product concepts, feature sets, positioning, and packaging directions. Best for ranking options, surfacing objections, and generating language to test in fieldwork.
Recommended protocol
Fixed stimuli → fixed questionnaire → two runs → stability check → segment breakdown → limitations noted.
Good for
Direction + iteration
Avoid
Hard incidence claims
Message tests
Compare taglines, claims, value propositions, and creative directions. Useful for diagnosing confusion, credibility concerns, and emotional reactions across segments.
Recommended protocol
Show controlled variants → ask standard comprehension + believability + differentiation items → run sensitivity checks by changing only one element at a time (see the sketch below).
Good for
Comparisons
Avoid
Absolute “% will buy”
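
To make the one-element-at-a-time sensitivity step concrete, here is a minimal sketch; run_message_test, the stimuli, and the placeholder scores are all illustrative stand-ins for your own pipeline:

```python
# One-element-at-a-time sensitivity sketch for a message test.
# run_message_test() is a stand-in for your own study call; here it returns
# fixed placeholder scores so the sketch runs end to end.

def run_message_test(stimuli: dict, instructions: str) -> dict:
    # Placeholder scores on 1-5 scales; replace with a call to your own pipeline.
    return {"comprehension": 4.1, "believability": 3.6, "differentiation": 3.2}

baseline_stimuli = {
    "tagline": "Fresh coffee in 60 seconds",
    "claim": "Brews twice as fast as a standard machine",
}
instructions = "Rate comprehension, believability and differentiation on 1-5 scales."

baseline = run_message_test(baseline_stimuli, instructions)

# Change exactly one element per variant and keep everything else fixed.
single_edits = {
    "tagline_swap": {**baseline_stimuli, "tagline": "Your morning, one minute faster"},
    "claim_softened": {**baseline_stimuli, "claim": "Brews faster than most machines"},
}

for name, stimuli in single_edits.items():
    result = run_message_test(stimuli, instructions)
    shifts = {item: round(result[item] - baseline[item], 2) for item in baseline}
    print(name, shifts)  # large shifts from a small edit flag a fragile conclusion
```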
Pricing exploration
Use to explore price sensitivity narratives, perceived fairness, and “why” behind price thresholds. Treat synthetic results as directional; validate with market tests or targeted surveys.
Recommended protocol
Multiple price points → consistent framing → repeat runs → measure stability of rank ordering (see the sketch below) → pair with at least one external benchmark (historical price points, category norms).
Good for
Threshold hypotheses
Avoid
Final price setting alone
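
One way to measure the rank-ordering stability called for above is pairwise Kendall tau across repeat runs; the price points and scores below are illustrative placeholders:

```python
# Rank-ordering stability for a pricing exploration: the same price points are
# scored in repeated runs, and we check how consistently they are ordered.
# Scores are illustrative placeholders (e.g. share calling the price "good value").
from itertools import combinations
from scipy.stats import kendalltau

price_points = [19, 24, 29, 34]
runs = [
    {19: 0.71, 24: 0.62, 29: 0.44, 34: 0.30},   # run 1
    {19: 0.69, 24: 0.58, 29: 0.47, 34: 0.28},   # run 2
    {19: 0.73, 24: 0.60, 29: 0.41, 34: 0.33},   # run 3
]

# Pairwise Kendall tau between runs: 1.0 means an identical rank ordering.
taus = []
for r1, r2 in combinations(runs, 2):
    tau, _ = kendalltau([r1[p] for p in price_points], [r2[p] for p in price_points])
    taus.append(tau)

print(f"min pairwise tau: {min(taus):.2f}")  # report the worst case, not the average
# A low minimum tau means the price-threshold story is not stable enough to act on.
```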
Segmentation exploration
Use to explore potential segments, motivations, trade-offs, and language. Synthetic research can be a fast way to propose segmentation hypotheses, which can then be tested with fieldwork.
Recommended protocol
Define segment rules → run parallel studies per segment → check within-segment stability → compare differences that persist across runs (see the sketch below).
Good for
Hypothesis generation
Avoid
Claiming real segment sizes
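
A minimal sketch of the "differences that persist across runs" check, assuming one outcome score per segment per run; the segment names, scores, and the MIN_GAP threshold are illustrative assumptions:

```python
# Persistence check for a segment difference: a difference is only reported if it
# keeps the same direction (and a minimum size) in every repeat run.
# Per-run scores are illustrative placeholders for a single outcome measure.

runs = [
    {"price_sensitive": 0.34, "convenience_first": 0.51},   # run 1
    {"price_sensitive": 0.31, "convenience_first": 0.49},   # run 2
    {"price_sensitive": 0.36, "convenience_first": 0.53},   # run 3
]

MIN_GAP = 0.05  # assumption: smallest gap worth reporting; tune to your scales

gaps = [r["convenience_first"] - r["price_sensitive"] for r in runs]
same_direction = all(g > 0 for g in gaps) or all(g < 0 for g in gaps)
big_enough = all(abs(g) >= MIN_GAP for g in gaps)

if same_direction and big_enough:
    print(f"segment difference persists across runs (gaps: {gaps})")
else:
    print("treat the segment difference as unstable; do not report it as a finding")
```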
Scenario simulation
Use to stress-test narratives: competitor moves, economic shifts, channel changes, or policy changes. Most useful for “what could happen” and for identifying sensitivities worth testing in the real world.
Recommended protocol
Explicit scenario definitions → consistent prompts → multiple runs → report variance + key assumptions and constraints.
Good for
Stress tests
Avoid
Predicting exact outcomes
Protocol checklist (repeatable studies)
A minimal checklist that keeps synthetic studies consistent, comparable, and easier to validate.
Before you run
  • Define the population frame and intended use (exploratory vs decision-support).
  • Lock the stimuli (concept card, ad copy, pricing table, etc.).
  • Lock the questionnaire and response scales.
  • Specify run settings (number of runs, sample sizes, controls, any seeds/temperature equivalents).
  • Define evaluation metrics you will report (stability, variance, external benchmark).
During and after
  • Run at least two identical runs and compute variance / rank stability.
  • Run a sensitivity check (small prompt/context changes) and report robustness.
  • Break out results by segment and check for spurious differences.
  • Attach the disclosure label and state limitations.
  • Log enough metadata for a comparable re-run (see the sketch below).
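
One way to capture the run settings and metadata items above is a small, versioned protocol record; the field names and values below are an illustrative sketch, not a required schema:

```python
# Minimal run-settings record so another team can repeat the study.
import json
from dataclasses import dataclass, asdict, field

@dataclass
class StudyProtocol:
    study_id: str
    population_frame: str          # who the panel is meant to represent
    intended_use: str              # "exploratory" or "decision-support"
    stimuli_version: str           # version tag of the locked stimuli
    questionnaire_version: str     # version tag of the locked questionnaire
    n_runs: int                    # identical runs for the stability check
    sample_size_per_run: int
    run_settings: dict = field(default_factory=dict)    # seeds, temperature equivalents, model id
    metrics_reported: list = field(default_factory=list)

protocol = StudyProtocol(
    study_id="concept-test-2024-07",
    population_frame="UK adults 25-44, category buyers",
    intended_use="exploratory",
    stimuli_version="concepts_v3",
    questionnaire_version="cq_v2",
    n_runs=2,
    sample_size_per_run=200,
    run_settings={"seed": 7, "temperature": 0.7, "model": "<model-id>"},
    metrics_reported=["rank stability", "variance", "external benchmark"],
)

# Log this alongside the results so a later re-run is directly comparable.
print(json.dumps(asdict(protocol), indent=2))
```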
Template-friendly approach
If you standardise protocols early, you can run “research sprints” repeatedly and compare results month-to-month. This is where synthetic workflows become genuinely useful operationally.
Reporting norms (what to publish)
Good reporting makes synthetic studies comparable. It also makes bad studies easier to spot.
Always include
Disclosure label + limitations
Include population frame, protocol summary, and a clear statement of uncertainty and failure modes.
Show stability
Variance / rank consistency
Re-run and report how much results change. Do not hide instability.
Benchmark
At least one external check
Pair with published stats, historical outcomes, or limited fieldwork where feasible.
Benchmarking suite (starter set)
A small set of tests you can run today. Expand as the field matures.
Stability (test-retest)
Run the same study twice. Report variance and rank stability for the primary outcomes.
Outputs: variance, correlation, rank flips.
Sensitivity
Change one thing (prompt framing, context, ordering) and measure how much conclusions shift.
Outputs: robustness score, failure triggers.
Known-truth tasks
Include tasks where there is a known answer (or strongly bounded answer) and evaluate performance.
Outputs: accuracy, calibration curve.
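
A minimal sketch of scoring a known-truth task, assuming the panel returns an estimate plus a stated confidence for each task; the values, the 5-point tolerance, and the 0.75 confidence split are illustrative assumptions:

```python
# Known-truth scoring sketch: compare panel estimates to known values and check
# whether stated confidence tracks actual error (a crude calibration check).
# All values below are illustrative placeholders.

tasks = [
    # (panel estimate, known value, panel-stated confidence 0-1)
    (0.42, 0.47, 0.9),
    (0.18, 0.33, 0.8),
    (0.61, 0.58, 0.6),
    (0.09, 0.11, 0.5),
]

TOLERANCE = 0.05  # assumption: "correct" if within 5 points of the known value

correct = [abs(est - truth) <= TOLERANCE for est, truth, _ in tasks]
accuracy = sum(correct) / len(tasks)

# Calibration: high-confidence answers should not miss more often than low-confidence ones.
high_conf = [ok for ok, (_, _, conf) in zip(correct, tasks) if conf >= 0.75]
low_conf = [ok for ok, (_, _, conf) in zip(correct, tasks) if conf < 0.75]

print(f"accuracy within tolerance: {accuracy:.0%}")
print(f"high-confidence hit rate: {sum(high_conf)/len(high_conf):.0%}" if high_conf else "no high-confidence tasks")
print(f"low-confidence hit rate:  {sum(low_conf)/len(low_conf):.0%}" if low_conf else "no low-confidence tasks")
```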
External benchmarks
Compare outputs to published statistics, historical outcomes, or a small real sample where possible.
Outputs: error bands, directionality match.
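
A minimal sketch of an external benchmark comparison, reporting error bands and whether the synthetic results order the metrics the same way as the reference; all values are illustrative placeholders:

```python
# External benchmark sketch: compare synthetic estimates against reference values
# (published stats, historical outcomes, or a small real sample).

pairs = {
    # metric: (synthetic estimate, external reference)
    "aware_of_category": (0.64, 0.58),
    "bought_last_3_months": (0.22, 0.31),
    "prefers_online_channel": (0.47, 0.45),
}

errors = {k: synth - ref for k, (synth, ref) in pairs.items()}
abs_errors = [abs(e) for e in errors.values()]

print(f"mean absolute error: {sum(abs_errors)/len(abs_errors):.2f}")
print(f"worst-case error:    {max(abs_errors):.2f}")

# Directionality: do synthetic and reference agree on which metrics rank higher vs lower?
ordering_synth = sorted(pairs, key=lambda k: pairs[k][0], reverse=True)
ordering_ref = sorted(pairs, key=lambda k: pairs[k][1], reverse=True)
print("same ordering as benchmark:", ordering_synth == ordering_ref)
```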
Segment consistency
Check that segment differences persist across runs and are not artefacts of randomness or prompt bias.
Outputs: segment stability, false positives.
Knowledge boundaries
Ensure panels do not “know” what they should not know. Test for leakage and overconfident claims.
Outputs: leakage rate, constraint violations.
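
A crude leakage scan can be as simple as checking responses for terms the panel should not have access to; the forbidden terms and responses below are illustrative, and a real check would also need to catch paraphrases:

```python
# Crude leakage scan: flag responses that mention facts the panel should not have
# access to (e.g. an unannounced product name or an embargoed result).

forbidden_terms = ["project aurora", "q3 launch date", "internal price floor"]

responses = [
    "I'd compare it with whatever the big brands offer at that price.",
    "Sounds a lot like Project Aurora, which I heard launches in Q3.",
]

violations = []
for i, text in enumerate(responses):
    hits = [t for t in forbidden_terms if t in text.lower()]
    if hits:
        violations.append((i, hits))

leakage_rate = len(violations) / len(responses)
print(f"leakage rate: {leakage_rate:.0%}, violations: {violations}")
```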
Benchmarking rule
Benchmarking is not a one-off. Run a small suite regularly, especially when models, grounding data, or prompting protocols change.
Controls & guardrails
Practical tactics to reduce drift, bias, and overconfident outputs.
Protocol controls
  • Use standard question wording and fixed response scales.
  • Randomise ordering only when it is part of the design; otherwise keep fixed for comparability.
  • Separate “stimuli” from “instructions” to reduce accidental leading.
  • Log all protocol versions and change history.
Interpretation controls
  • Use uncertainty language and avoid false precision.
  • Prefer rank-order conclusions over absolute numbers unless validated.
  • Require a limitations section in every study.
  • Escalate to fieldwork for high-stakes claims or novel behaviours.
A simple internal standard
If a claim cannot survive (a) a second identical run and (b) a small sensitivity test, it should not be presented as a stable conclusion.
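
That standard can be encoded as a trivial gate, assuming your stability and sensitivity checks each return a pass/fail result:

```python
# Minimal gate for the internal standard above: a conclusion is only labelled
# "stable" if it holds in a second identical run and under a small sensitivity test.
# The two boolean inputs are assumed to come from your own checks.

def claim_status(holds_in_rerun: bool, holds_under_sensitivity: bool) -> str:
    if holds_in_rerun and holds_under_sensitivity:
        return "stable conclusion"
    return "exploratory only - do not present as a stable conclusion"

print(claim_status(holds_in_rerun=True, holds_under_sensitivity=False))
```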
FAQ

What is the minimum protocol for a synthetic study?
At minimum: defined population frame, fixed stimuli, fixed questionnaire, two identical runs, a basic stability report, and a clear limitations section. If you cannot do this, label the work as exploratory.

Can I report absolute numbers, such as purchase intent?
Prefer rank ordering and directional comparisons unless you have strong validation against external benchmarks. Absolute purchase intent estimates are easy to overclaim and should be treated cautiously.

How many runs are enough?
Two identical runs are the minimum for a stability check. For high-stakes decisions or when outputs appear volatile, run more repeats and report variance explicitly.

What is a known-truth task?
It's a task where the correct answer (or a tight range) is known - for example, publicly measured distributions, historically observed outcomes, or constrained factual checks. They help you test calibration and leakage.
Want a ready-to-use pack? Download the protocol and benchmark templates.