Sample Size Calculator
How many visitors do you need before your A/B test results are reliable? Calculate the exact sample size with your preferred confidence level and power.
Your control variation's current conversion rate
Smallest relative improvement you want to reliably detect
Fill in your baseline rate and MDE to calculate sample size
Uses the two-proportion z-test formula. Sample size is per variation — double for total traffic needed.
Understanding Sample Size for A/B Tests
Sample size is one of the most misunderstood concepts in A/B testing. Most teams either run tests that are wildly underpowered (stopping after a few hundred visitors), or they run tests for far longer than necessary because they never calculated upfront how much traffic they need.
The required sample size per variation depends on four variables: your baseline conversion rate, the minimum effect size you want to detect, your significance threshold, and your desired statistical power. Changing any one of these dramatically affects how much traffic you need.
The Formula
n = (Zα/2 + Zβ)² × (p₁(1−p₁) + p₂(1−p₂)) / (p₂ − p₁)²
This is the standard two-proportion z-test sample size formula. It calculates the minimum number of observations needed in each group (per variation) to detect a difference of (p₂ − p₁) with the specified significance level (α) and power (1 − β).
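The formula translates directly into code. A minimal sketch in Python (the function name and defaults are ours; `statistics.NormalDist` supplies the z-quantiles):

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variation(p1: float, relative_mde: float,
                              alpha: float = 0.05, power: float = 0.80) -> int:
    """Two-proportion z-test sample size, per variation.

    p1           -- baseline conversion rate (e.g. 0.02 for 2%)
    relative_mde -- minimum relative lift to detect (e.g. 0.10 for +10%)
    """
    p2 = p1 * (1 + relative_mde)                   # target rate implied by the MDE
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of the two binomial variances
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)
```

For a 2% baseline and a 10% relative MDE at the default 95% / 80% settings, this returns roughly 80,700 visitors per variation.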
Significance Level vs. Statistical Power
Significance Level (α)
Controls your false positive rate — the probability of declaring a winner when there's actually no real difference. At 95% significance (α = 0.05), you'll incorrectly declare a winner 5% of the time by pure chance. Lower α = fewer false positives, but requires more traffic.
Statistical Power (1 − β)
Controls your true positive rate — the probability of detecting a real effect when one exists. At 80% power, you'll correctly identify a true winner 80% of the time. The other 20% are false negatives: real improvements you miss. Higher power requires more traffic.
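The two z-values enter the formula only through the squared sum (Zα/2 + Zβ)², so sample size scales linearly with that factor and the traffic cost of stricter settings is easy to quantify. A quick sketch (helper name is ours):

```python
from statistics import NormalDist

norm = NormalDist()

def z_factor(alpha: float, power: float) -> float:
    """(z_{alpha/2} + z_beta)^2: the multiplier that significance and power
    contribute to required sample size. All else equal, n scales with this."""
    return (norm.inv_cdf(1 - alpha / 2) + norm.inv_cdf(power)) ** 2

baseline = z_factor(0.05, 0.80)     # 95% significance, 80% power
strict = z_factor(0.01, 0.90)       # 99% significance, 90% power
print(round(strict / baseline, 2))  # stricter settings need ~1.9x the traffic
```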
How MDE Affects Your Sample Size
Your Minimum Detectable Effect (MDE) has the biggest impact on sample size requirements. Here's how different MDEs compare on a 2% baseline with 95% significance and 80% power:
| MDE | Target Rate | Sample/Variation | Feasibility |
|---|---|---|---|
| 5% | 2.10% | ~315,000 | Very High Traffic |
| 10% | 2.20% | ~80,700 | High Traffic |
| 15% | 2.30% | ~36,700 | Moderate Traffic |
| 20% | 2.40% | ~21,100 | Most Sites |
| 30% | 2.60% | ~9,800 | Low Traffic OK |
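Because n is divided by (p₂ − p₁)², there is a near-inverse-square relationship between MDE and traffic: doubling the MDE cuts the required sample size by roughly 4×. A quick check in Python (helper names ours):

```python
from statistics import NormalDist

# (z_{alpha/2} + z_beta)^2 at 95% significance, 80% power
z2 = (NormalDist().inv_cdf(0.975) + NormalDist().inv_cdf(0.80)) ** 2

def n(p1: float, rel_mde: float) -> float:
    """Sample size per variation for baseline p1 and relative MDE."""
    p2 = p1 * (1 + rel_mde)
    return z2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2

# Doubling the MDE from 10% to 20% cuts traffic by roughly 4x
# (slightly less, because the target-rate variance also grows)
print(round(n(0.02, 0.10) / n(0.02, 0.20), 1))  # -> 3.8
```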
Frequently Asked Questions
Is the sample size per variation or total?
Per variation. For a standard A/B test with two groups (control + one variation), multiply the result by 2 to get total visitors needed. For a three-way test (A/B/C), multiply by 3. Each additional variation requires proportionally more traffic and makes significance harder to reach, since every arm must hit the full per-variation sample size.
What significance level should I use?
95% is the industry standard for most A/B tests. Use 90% for low-stakes decisions where you want shorter test durations and can accept a slightly higher false positive rate. Use 99% for high-impact changes like pricing, payment flow, or major checkout redesigns — where shipping the wrong winner would be very costly. Never go below 90%.
What power should I choose?
80% power is the standard for most CRO programs. It means you'll miss 20% of real effects, which is acceptable in most cases. Choose 90% if you have high traffic and can afford longer tests. 70% is acceptable only for early-stage exploratory tests where you're looking for directional signal rather than definitive decisions.
What if my required sample size is unreachable?
If the required sample size would take 3+ months at your current traffic levels, you have a few options: increase your MDE (only test bolder changes that produce larger effects), reduce the number of variations, focus testing on higher-traffic pages, or consolidate traffic to fewer test pages. Low-traffic sites often benefit more from qualitative CRO research (user sessions, heatmaps, user testing) than from statistically powered A/B testing.
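To turn a sample size into a calendar estimate, divide total required visitors by your weekly test traffic. A sketch with illustrative numbers (the function name and all figures below are hypothetical, not benchmarks):

```python
from math import ceil

def weeks_to_complete(n_per_variation: int, variations: int,
                      weekly_visitors: int) -> float:
    """Test duration in weeks, assuming every eligible visitor
    enters the test and traffic is split evenly."""
    return n_per_variation * variations / weekly_visitors

# Hypothetical site: 21,100 visitors needed per variation,
# a two-arm test, and 4,000 test-eligible visitors per week
weeks = weeks_to_complete(21_100, 2, 4_000)
print(ceil(weeks))  # -> 11 weeks, already near the 3-month threshold
```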
Not Sure What's Worth Testing?
Our CRO audits identify your highest-impact test opportunities from real analytics and user behavior data — so you're not testing random hypotheses, but validated friction points.
Book a CRO Audit
Starting at $2,500 · 5–7 day delivery