Free Tool

A/B Test Sample Size Calculator

Calculate the sample size your split test needs to reach statistical significance — and convert it into a test duration in days using your actual traffic. Defaults to alpha 0.05 two-tailed and 80% power, the conventional CRO standard.

Your Test Parameters

Inputs update results instantly.

Current rate of the step you are testing — not end-to-end funnel

%

Smallest relative lift you want to detect. p₂ = p₁ × (1 + MDE). Default 10%

%

False positive rate — drives Zα/2

One- or two-tailed

Probability of detecting a real effect — drives Zβ

Visitors per day reaching the page being tested — not part of the n formula

Two-Proportion Sample Size Formula
n = (Zα/2·√(2·p̄·q̄) + Zβ·√(p₁·q₁ + p₂·q₂))² / (p₁ − p₂)²
What each variable means
n
sample size per variant
p₁
baseline conversion rate = 14.00%
p₂
target rate = p₁ × (1 + MDE) = 15.40%
q₁, q₂
1 − p₁ and 1 − p₂ (non-conversion rates)
pooled rate = (p₁ + p₂) / 2 = 14.70%
1 − p̄
Zα/2
z-score from significance level = 1.9600
Zβ
z-score from statistical power = 0.8416
(p₁ − p₂)
absolute difference between rates (also written as δ) = -1.40 pp
Sample Size Per Variant
10,042
20,084 total exposures across control + variant
Recommended Test Duration
14 days
Math says 12 days. Rounded up to full weeks with a 14-day floor to absorb day-of-week effects.
Baseline (p₁)
14.00%
current rate
Target (p₂)
15.40%
at chosen MDE
Significance
α = 0.05 · 2-tailed
false positive threshold
Power
80%
chance to detect a real lift
Sensitivity

How MDE Changes Sample Size and Duration

Halving the MDE roughly quadruples the sample size required — the relationship is squared, not linear. This is the single biggest reason "let us see if anything moves" tests never finish.

Relative MDETarget Rate (p₂)Sample / VariantRecommended Duration
5.0%14.70%39,37549 days
10.0%your input15.40%10,04214 days
20.0%16.80%2,60814 days

Duration assumes a 50/50 traffic split between control and variant, with a 14-day floor and rounding up to full weeks.

What These Inputs Mean

Baseline, MDE, Alpha and Power Explained

Baseline Conversion Rate

The current conversion rate of the step you are testing. The rule: baseline is the rate the change can actually move. Testing a pricing page change against an end-to-end LP-to-purchase rate inflates required sample size by 5 to 20 times for no reason.

Minimum Detectable Effect (MDE)

The smallest relative lift you want the test to detect. Stated as a percent of baseline, not absolute percentage points. 10% relative on a 15.9% baseline means detecting 17.5%. Smaller MDE means much larger sample size — halving it roughly quadruples it.

Alpha (Significance Level)

Your tolerance for false positives — declaring a winner when there is no real difference. 0.05 two-tailed is the standard "statistical significance" threshold. One-tailed cuts sample size by ~20% but assumes you do not care if the variant is worse, which is rarely true in CRO.

Statistical Power (1 − β)

Your tolerance for false negatives — missing a real winner. 80% means that if the true lift equals your MDE, the test will detect it 80% of the time. Below 70% you are running tests that mostly cannot succeed even when the variant is genuinely better.

For a worked scenario using the formula above against a real funnel, see the full guide: A/B Test Sample Size Calculator: Baseline, MDE, Alpha and Power for CRO.

Common Questions

A/B Test Sample Size FAQ

Need help designing your CRO test programme?
Let's audit your conversion funnel

I'll review your funnel, identify the highest-leverage step to test, and help you design a test plan that can actually reach statistical significance with the traffic you have.

Get a Free Audit