A/B testing & experiments

How to use it

Create an experiment and choose the surface — storefront, campaign, journey, content, or ad creative. (Analytics → A/B testing)
Let assignment stay consistent so overlapping tests don't invalidate each other.
Read the significance report and roll out the winner.
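One common way to keep assignment consistent is to hash the user ID together with the experiment ID, so the same person always lands in the same variant. The sketch below is illustrative only and is not the platform's implementation; `assign_variant` and its parameters are hypothetical names.

```python
import hashlib


def assign_variant(user_id: str, experiment_id: str, variants=("A", "B")) -> str:
    """Deterministically map a user to one variant of one experiment."""
    # Including the experiment ID in the key means a user's bucket in one
    # experiment does not determine their bucket in another.
    key = f"{experiment_id}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes as an integer and reduce it to a variant index.
    bucket = int.from_bytes(digest[:8], "big")
    return variants[bucket % len(variants)]


if __name__ == "__main__":
    # The same inputs always produce the same variant.
    print(assign_variant("user-123", "welcome-subject-line"))
    print(assign_variant("user-123", "welcome-subject-line"))
```

Because the result depends only on the two IDs, no assignment table has to be stored and a returning visitor never switches variants mid-test.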

⏱ ~5 min · 💳 Starter+ · 🎯 A real experiment program, not one-off tests

Why this matters for your business

This page covers site-wide A/B testing — the surface that manages experiments across all platform features (campaigns, journeys, content, storefront, ad creatives) in one coordinated place. Specific surfaces have their own A/B implementations (see storefront A/B block, sales-engine experiments), but this is where they're orchestrated and reported on.

The strategic value is coordination. Without a single testing surface, teams run conflicting experiments: a subject-line test and a journey-version test on the same audience make both results meaningless. Or they run nothing at all because "it's hard to set up." This surface is the backbone of the experiment program.

What this typically unlocks

Outcome	Result
Tests/quarter (vs. ad-hoc)	+4× typical
Validated insights/year	30-40 with discipline
Conflicting experiments	0 with coordination
Time from setup to live test	~5 minutes

What you actually get

Capability	Description
Test catalog	Every active + completed test in one view
Conflict detection	Won't let you test contradicting variables on overlapping audiences
Sample-size calculator	"How many recipients do I need to detect a 5% lift?"
Live significance	p-value + confidence interval as data accumulates
Auto-stopping	Optional: stop test when significance reached
Hypothesis log	Pre-registered hypotheses; the "expected" before "actual"
Cross-surface tests	Same test running on email + WhatsApp + storefront
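To show the kind of arithmetic behind a sample-size question like the one in the table, here is a minimal sketch using the standard two-proportion formula. It is not the platform's calculator; the function name, the 20% baseline rate, and the defaults (5% significance level, 80% power) are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Recipients needed in EACH variant for a two-sided two-proportion test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    # Critical values of the standard normal distribution.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the binomial variances of the two variants.
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical: 20% baseline open rate, detect a 5% relative lift (20% -> 21%).
    print(sample_size_per_variant(0.20, 0.05))
```

Small relative lifts on modest baseline rates need tens of thousands of recipients per variant, which is why it helps to check the sample size before launching.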

Real merchant scenarios

Scenario A — Brand commits to weekly test

Setup. $5M brand. Goal: 1 test per week × 52 weeks.

Year-1 result: 47 tests run. 31 had clear winners, 8 overlapped on audience and were re-run separately, and 8 were inconclusive. The winning tests produced a compounded conversion lift of +18%.
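To put the compounded figure in perspective, the sketch below works out what average lift per winning test would produce +18% across 31 winners. It assumes the wins multiply together and are all the same size, which the scenario does not state; the function names are hypothetical.

```python
def compound(lifts):
    """Combine individual relative lifts multiplicatively."""
    total = 1.0
    for lift in lifts:
        total *= 1 + lift
    return total - 1


def average_lift_per_winner(compounded_lift, winners):
    """Equal per-test lift that compounds to the given total."""
    return (1 + compounded_lift) ** (1 / winners) - 1


if __name__ == "__main__":
    per_test = average_lift_per_winner(0.18, 31)
    print(f"average lift per winning test: {per_test:.2%}")
    print(f"compounded over 31 winners:    {compound([per_test] * 31):.2%}")
```

Under those assumptions each win is only around half a percent, which illustrates the point of the scenario: the value comes from the cadence, not from any single test.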

Scenario B — Avoiding conflict

Setup. A marketing manager scheduled a subject-line test on the welcome email. The same week, a designer scheduled a content test on step 2 of the welcome-email journey.

Conflict alert: "Both tests target the same audience on the same journey. Suggested: stagger the tests, or coordinate under one multivariate test."

Action: Combined into one multivariate test. Both factors measured cleanly.
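The check behind an alert like this can be sketched as three conditions: same journey, overlapping dates, and shared recipients. The code below is a simplified illustration, not the platform's conflict-detection logic; the `Experiment` fields, the `conflicts` function, and the dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Experiment:
    name: str
    journey: str
    audience: frozenset  # recipient IDs targeted by the test
    start: date
    end: date


def conflicts(a: Experiment, b: Experiment, min_overlap: float = 0.0) -> bool:
    """True if two tests hit the same journey, at the same time, for shared recipients."""
    same_journey = a.journey == b.journey
    dates_overlap = a.start <= b.end and b.start <= a.end
    shared = a.audience & b.audience
    smaller = min(len(a.audience), len(b.audience))
    # Share of the smaller audience that is also in the other test.
    overlap_share = len(shared) / smaller if smaller else 0.0
    return same_journey and dates_overlap and overlap_share > min_overlap


if __name__ == "__main__":
    subject_test = Experiment("subject line", "welcome", frozenset({1, 2, 3, 4}),
                              date(2024, 3, 4), date(2024, 3, 10))
    content_test = Experiment("step 2 content", "welcome", frozenset({3, 4, 5}),
                              date(2024, 3, 6), date(2024, 3, 12))
    print(conflicts(subject_test, content_test))
```

Staggering the tests removes the date overlap; combining them into one multivariate test removes the conflict by assigning both factors together.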

Scenario C — Stopping a clear winner early

Setup. A/B test on a campaign subject line. After 12,000 sends, variant B was up 38% with p < 0.001.

Auto-stop fired: Test concluded; remaining sends went to the winner. Captured an extra ~$8K in conversion compared with running the test to its full duration.
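A p-value and confidence interval like the ones in this scenario can be reproduced with a two-proportion z-test. The sketch below is a fixed-sample test under assumed numbers: the even 6,000/6,000 split and the 20% baseline open rate are hypothetical, and the function is not the platform's significance engine.

```python
import math
from statistics import NormalDist


def two_proportion_test(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Two-sided z-test for B vs. A; returns (p_value, CI for the rate difference)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    diff = p_b - p_a
    # Pooled standard error for the hypothesis test.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se_pooled == 0:
        return 1.0, (0.0, 0.0)  # no variation at all, so nothing to compare
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Unpooled standard error for the confidence interval.
    se_unpooled = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    margin = NormalDist().inv_cdf(0.5 + confidence / 2) * se_unpooled
    return p_value, (diff - margin, diff + margin)


if __name__ == "__main__":
    # Hypothetical: A opens 1,200 of 6,000 (20%); B opens 1,656 of 6,000 (27.6%, up 38%).
    p_value, (low, high) = two_proportion_test(1200, 6000, 1656, 6000)
    print(p_value < 0.001, low, high)
```

A test like this is designed for a single look at the data. Checking it repeatedly as results arrive raises the false-positive rate, so an auto-stop rule normally needs a stricter threshold or a sequential method; which approach the platform uses is not covered here.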

Best practices

✅ Pre-register hypotheses. Writing "I think X will win because Y" before testing makes the result mean something.

✅ Use conflict detection. It saves you from trying to interpret inconclusive results later.

✅ Run holdout-based tests for journeys (sales-engine experiments); they measure causal lift, not just correlation.

❌ Don't run too many concurrent tests without coordination.

❌ Don't act on tests that haven't reached their planned sample size. An underpowered result can look like a winner by chance.
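For the holdout-based tests mentioned above, lift is measured against a group that was deliberately left out of the journey. The sketch below shows that calculation under hypothetical numbers; `incremental_lift` is an illustrative name, not a platform function.

```python
def incremental_lift(treated_conversions, treated_size,
                     holdout_conversions, holdout_size):
    """Return (absolute, relative) lift of the treated group over the holdout."""
    treated_rate = treated_conversions / treated_size
    holdout_rate = holdout_conversions / holdout_size
    absolute = treated_rate - holdout_rate
    # Relative lift is undefined when the holdout never converts.
    relative = absolute / holdout_rate if holdout_rate else float("nan")
    return absolute, relative


if __name__ == "__main__":
    # Hypothetical: 9,000 people entered the journey, 1,000 were held out.
    absolute, relative = incremental_lift(540, 9000, 50, 1000)
    print(f"absolute: {absolute:.2%}, relative: {relative:.0%}")
```

Because the holdout is randomly withheld rather than self-selected, the difference between the two rates can be read as the effect of the journey itself.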

Plan tiers

Capability	Free	Starter	Pro	Agency	Enterprise
Test catalog + management	—	✓	✓	✓	✓
Conflict detection	—	✓	✓	✓	✓
Sample-size calculator	—	✓	✓	✓	✓
Live significance	—	✓	✓	✓	✓
Auto-stop on significance	—	—	✓	✓	✓
Multivariate tests	—	—	✓	✓	✓
Hypothesis log	—	—	✓	✓	✓
