How does this A/B test sample size calculator work?

It uses the standard two-proportion z-test formula with your inputs: baseline conversion rate, minimum detectable effect (MDE), alpha (significance level), statistical power, and one-tail vs two-tail. The output is the sample size per variant needed to detect the lift you specified at your chosen significance level and power. We then convert that number into a test duration using your daily traffic to the test surface.

What baseline conversion rate should I enter?

Enter the current conversion rate of the specific step you are testing — not your end-to-end funnel rate. If you are testing a pricing page change, use pricing-to-checkout. If you are testing a landing page hero, use landing-to-next-step. Measure it over the last 28 to 90 days, excluding any anomalies like a sale or tracking outage.

What MDE (Minimum Detectable Effect) should I use?

MDE is the smallest relative lift you want the test to detect. 10% relative is a sensible default for most CRO tests. Below 5% you usually need more traffic than you have. Above 20% you are betting on a large effect — possible for major redesigns, optimistic for small tweaks.

What alpha and power should I use?

Alpha 0.05 two-tailed and power 0.80 are the conventional defaults; alpha 0.05 is the standard "statistical significance" threshold across the industry. Use alpha 0.01 if a false positive would be expensive (a redesign you cannot easily roll back); expect sample size to rise by roughly half. Use power 0.90 if you want a stronger guarantee against missing real winners; sample size goes up by roughly a third (about 34%).
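As a rough check on how alpha and power drive sample size, the required n is approximately proportional to (Z_α/2 + Z_β)². The sketch below assumes that simplification rather than the calculator's full formula, and the function name `z_multiplier` is illustrative only.

```python
from statistics import NormalDist


def z_multiplier(alpha, power, two_tailed=True):
    """(z_alpha + z_beta)^2, the factor sample size is roughly proportional to."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - (alpha / 2 if two_tailed else alpha))
    z_beta = z.inv_cdf(power)
    return (z_alpha + z_beta) ** 2


# Compare stricter settings against the conventional defaults.
baseline = z_multiplier(0.05, 0.80)
for label, alpha, power in [
    ("alpha 0.05, power 0.90", 0.05, 0.90),
    ("alpha 0.01, power 0.80", 0.01, 0.80),
]:
    ratio = z_multiplier(alpha, power) / baseline
    print(f"{label}: {ratio - 1:+.0%} sample size vs the defaults")
```

Under this approximation, raising power from 0.80 to 0.90 adds about a third to the sample size, and tightening alpha from 0.05 to 0.01 adds about half.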

Why does the recommended duration round up to full weeks?

Day-of-week effects are real. Visitor mix, intent, and conversion behaviour on Tuesday differ from Saturday. A test that ran Tuesday to Friday is not measuring the same population as one that ran across two full weeks. The 14-day floor protects you from shipping a result that was a one-week artefact.

What if I want to test more than two variants?

At an unchanged alpha, sample size per variant stays roughly the same, but with three or four variants the chance of a false positive on at least one of them rises sharply. Apply a Bonferroni correction (divide alpha by the number of variants compared against control), which in turn raises the required sample size per variant, or use a sequential testing tool. This calculator assumes one control and one variant.
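To see why the false positive risk climbs with more variants, here is a small sketch of the family-wise error rate with and without a Bonferroni correction. It assumes the comparisons against control are independent, which is only an approximation when they share one control group; the function names are illustrative.

```python
def family_wise_error(alpha, comparisons):
    """Chance of at least one false positive across independent comparisons."""
    return 1 - (1 - alpha) ** comparisons


def bonferroni_alpha(alpha, comparisons):
    """Per-comparison alpha that keeps the family-wise rate at or below alpha."""
    return alpha / comparisons


# One control plus k variants gives k comparisons against control.
for k in (1, 2, 3):
    uncorrected = family_wise_error(0.05, k)
    corrected = family_wise_error(bonferroni_alpha(0.05, k), k)
    print(f"{k} comparison(s): uncorrected {uncorrected:.1%}, Bonferroni {corrected:.1%}")
```

With three comparisons at alpha 0.05 the uncorrected rate is a little over 14%, while the corrected rate stays just under 5%.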

Free Tool

A/B Test Sample Size Calculator

Calculate the sample size your split test needs to reach statistical significance — and convert it into a test duration in days using your actual traffic. Defaults to alpha 0.05 two-tailed and 80% power, the conventional CRO standard.

Your Test Parameters

Inputs update results instantly.

Baseline Conversion Rate (p₁)

Current rate of the step you are testing — not end-to-end funnel

Minimum Detectable Effect — MDE (drives p₂)

Smallest relative lift you want to detect. p₂ = p₁ × (1 + MDE). Default 10%

Significance Level (α)

False positive rate — drives Z_α/2

Test Direction

One- or two-tailed

Statistical Power (1 − β)

Probability of detecting a real effect — drives Z_β

Daily Traffic to Test Surface (used for duration only)

Visitors per day reaching the page being tested — not part of the n formula

Two-Proportion Sample Size Formula

n = (Z_α/2·√(2·p̄·q̄) + Z_β·√(p₁·q₁ + p₂·q₂))² / (p₁ − p₂)²

What each variable means

n: sample size per variant
p₁: baseline conversion rate = 14.00%
p₂: target rate = p₁ × (1 + MDE) = 15.40%
q₁, q₂: 1 − p₁ and 1 − p₂ (non-conversion rates)
p̄: pooled rate = (p₁ + p₂) / 2 = 14.70%
q̄: 1 − p̄
Z_α/2: z-score from significance level = 1.9600
Z_β: z-score from statistical power = 0.8416
(p₁ − p₂): difference between rates in percentage points (also written as δ) = -1.40 pp; the sign does not matter because the term is squared
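The formula and variable definitions above can be sketched in Python as follows. This is a minimal illustration, not the calculator's actual source: the function name and parameters are hypothetical, a 50/50 split is assumed, and the z-scores are rounded to four decimals only to mirror the values displayed above (1.9600 and 0.8416).

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(p1, relative_mde, alpha=0.05, power=0.80, two_tailed=True):
    """Per-variant n for a two-proportion z-test with an even traffic split."""
    p2 = p1 * (1 + relative_mde)      # target rate implied by the relative MDE
    q1, q2 = 1 - p1, 1 - p2           # non-conversion rates
    p_bar = (p1 + p2) / 2             # pooled rate under equal allocation
    q_bar = 1 - p_bar

    z = NormalDist()                  # standard normal distribution
    tail_alpha = alpha / 2 if two_tailed else alpha
    # Assumption: round to 4 decimals to match the z-scores shown on the page.
    z_alpha = round(z.inv_cdf(1 - tail_alpha), 4)
    z_beta = round(z.inv_cdf(power), 4)

    numerator = (z_alpha * sqrt(2 * p_bar * q_bar)
                 + z_beta * sqrt(p1 * q1 + p2 * q2)) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


n = sample_size_per_variant(0.14, 0.10)
print(n, 2 * n)  # per variant, then total across control + variant
```

With the example inputs (14% baseline, 10% relative MDE, alpha 0.05 two-tailed, 80% power) this returns 10,042 per variant. Using unrounded z-scores can shift some results by a single visitor.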

Sample Size Per Variant

10,042

20,084 total exposures across control + variant

Recommended Test Duration

14 days

Math says 12 days. Rounded up to full weeks with a 14-day floor to absorb day-of-week effects.
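The duration rule can be sketched as below. The page does not show the daily traffic figure, so the 1,800 visitors per day used here is a hypothetical value chosen only because it is consistent with the 12-day result; the function name is also illustrative.

```python
from math import ceil


def duration_days(n_per_variant, daily_traffic, variants=2, floor_days=14):
    """Return (raw_days, recommended_days) for a test with an even traffic split."""
    total_needed = n_per_variant * variants        # control + variant exposures
    raw_days = ceil(total_needed / daily_traffic)  # what the math alone says
    full_weeks = ceil(raw_days / 7) * 7            # round up to whole weeks
    return raw_days, max(full_weeks, floor_days)   # never shorter than the floor


# Hypothetical traffic: 1,800 visitors per day reaching the tested page.
raw, recommended = duration_days(10_042, daily_traffic=1_800)
print(raw, recommended)
```

Rounding to whole weeks before applying the floor means a 3-day raw estimate still becomes 14 days, and a 44-day estimate becomes 49.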

Baseline (p₁)

14.00%

current rate

Target (p₂)

15.40%

at chosen MDE

Significance

α = 0.05 · 2-tailed

false positive threshold

Power

80%

chance to detect a real lift

Sensitivity

How MDE Changes Sample Size and Duration

Halving the MDE roughly quadruples the sample size required — the relationship is squared, not linear. This is the single biggest reason "let us see if anything moves" tests never finish.

Relative MDE	Target Rate (p₂)	Sample / Variant	Recommended Duration
5.0%	14.70%	39,375	49 days
10.0% (your input)	15.40%	10,042	14 days
20.0%	16.80%	2,608	14 days

Duration assumes a 50/50 traffic split between control and variant, with a 14-day floor and rounding up to full weeks.
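The sample sizes in the table above can be reproduced with a short sweep over MDE values. This is a sketch under stated assumptions: a two-tailed test, an even split, and z-scores rounded to four decimals to match the displayed 1.9600 and 0.8416 (an assumption about how the calculator works internally). The helper name `n_per_variant` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_variant(p1, relative_mde, alpha=0.05, power=0.80):
    """Two-tailed, 50/50 split, two-proportion sample size per variant."""
    p2 = p1 * (1 + relative_mde)
    p_bar = (p1 + p2) / 2
    # Assumption: z-scores rounded to 4 decimals, as displayed on the page.
    z_alpha = round(NormalDist().inv_cdf(1 - alpha / 2), 4)
    z_beta = round(NormalDist().inv_cdf(power), 4)
    spread = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
              + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return ceil(spread ** 2 / (p2 - p1) ** 2)


# Sweep the same relative MDE values as the sensitivity table.
rows = [(mde, n_per_variant(0.14, mde)) for mde in (0.05, 0.10, 0.20)]
for mde, n in rows:
    print(f"{mde:.1%}  p2={0.14 * (1 + mde):.2%}  n={n:,}")
```

Each halving of the MDE multiplies n by a little under four, because the difference between rates enters the formula squared in the denominator.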

What These Inputs Mean

Baseline, MDE, Alpha and Power Explained

Baseline Conversion Rate

The current conversion rate of the step you are testing. The rule: baseline is the rate the change can actually move. Testing a pricing page change against an end-to-end LP-to-purchase rate inflates required sample size by 5 to 20 times for no reason.

Minimum Detectable Effect (MDE)

The smallest relative lift you want the test to detect. Stated as a percent of baseline, not absolute percentage points. 10% relative on a 15.9% baseline means detecting a move to about 17.5%. A smaller MDE means a much larger sample size: halving the MDE roughly quadruples the sample size.

Alpha (Significance Level)

Your tolerance for false positives — declaring a winner when there is no real difference. 0.05 two-tailed is the standard "statistical significance" threshold. One-tailed cuts sample size by ~20% but assumes you do not care if the variant is worse, which is rarely true in CRO.

Statistical Power (1 − β)

Your tolerance for false negatives, that is, missing a real winner. 80% means that if the true lift equals your MDE, the test will detect it 80% of the time. Below 70%, the test misses a real lift of that size in more than 30% of runs, so many genuinely better variants go undetected.
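To make the power definition concrete, the sample size formula can be inverted to estimate the power a test achieves at a given n. The sketch below is an approximation that ignores the negligible chance of a significant result in the wrong direction; the function name is illustrative and the inputs reuse the 14% baseline and 10% relative MDE example.

```python
from math import sqrt
from statistics import NormalDist


def achieved_power(n_per_variant, p1, relative_mde, alpha=0.05):
    """Approximate power of a two-tailed two-proportion z-test at a given n."""
    p2 = p1 * (1 + relative_mde)
    p_bar = (p1 + p2) / 2
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    null_sd = sqrt(2 * p_bar * (1 - p_bar))          # spread under no difference
    alt_sd = sqrt(p1 * (1 - p1) + p2 * (1 - p2))     # spread under the true lift
    z_beta = (abs(p2 - p1) * sqrt(n_per_variant) - z_alpha * null_sd) / alt_sd
    return z.cdf(z_beta)


# Full sample, about half, and about a quarter of the required n.
for n in (10_042, 5_021, 2_500):
    print(f"n={n:,}: power about {achieved_power(n, 0.14, 0.10):.0%}")
```

Stopping at roughly half the required sample drops power to around 50%, so a real 10% lift would be missed about as often as it is found.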

For a worked scenario using the formula above against a real funnel, see the full guide: A/B Test Sample Size Calculator: Baseline, MDE, Alpha and Power for CRO.

