FREE TOOL

A/B Test Duration Calculator

Know exactly how many days to run your test — before you start. Assumes 95% confidence and 80% statistical power.

Total visitors per day, control and variation combined

Your control page's current conversion rate

Smallest relative improvement worth detecting (e.g. 15 = detect a 15% relative lift)

Fill in all fields above to calculate your test duration

Assumes 95% confidence level, 80% statistical power, and equal traffic split between control and variation.

How to Use This Calculator

1. Enter Daily Visitors

Use the total number of visitors entering the test per day, control and variation combined. Find this in your analytics under your test page's daily sessions.

2. Set Your Baseline Rate

Enter your control page's current conversion rate as a percentage. For example, if 150 out of 5,000 visitors convert, enter 3.0. Get this from your analytics for the specific goal you're optimizing.

3. Choose Your MDE

The Minimum Detectable Effect (MDE) is the smallest relative improvement worth detecting. A 15% MDE on a 2% baseline means you want to detect when the variation reaches 2.3%. Smaller MDEs need larger samples and therefore more time.

The Formula Behind This Calculator

This calculator uses the industry-standard two-proportion z-test sample size formula. Here's the math:

n = (Zα/2 + Zβ)² × (p₁(1−p₁) + p₂(1−p₂)) / (p₂ − p₁)²

days = ⌈(n × 2) / daily_visitors⌉

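Where: n = required visitors per variation (hence the × 2 in the days formula), Zα/2 = 1.96 (95% confidence, two-sided), Zβ = 0.8416 (80% power), p₁ = baseline conversion rate as a proportion (e.g. 0.02 for 2%), p₂ = p₁ × (1 + MDE/100), with MDE entered as a percentage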
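As a rough illustration, the two formulas above can be written in a few lines of Python. This is a sketch under the stated assumptions (95% confidence, 80% power, equal traffic split), not the calculator's actual source; the function names `sample_size_per_variation` and `duration_days` are hypothetical.

```python
import math

Z_ALPHA_2 = 1.96   # two-sided 95% confidence
Z_BETA = 0.8416    # 80% statistical power


def sample_size_per_variation(baseline_pct, mde_pct):
    """Visitors needed in EACH arm (control and variation)."""
    p1 = baseline_pct / 100              # baseline rate as a proportion
    p2 = p1 * (1 + mde_pct / 100)        # target rate after the relative lift
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return (Z_ALPHA_2 + Z_BETA) ** 2 * variance_sum / (p2 - p1) ** 2


def duration_days(daily_visitors, baseline_pct, mde_pct):
    """Days needed when daily_visitors is split evenly across both arms."""
    n = sample_size_per_variation(baseline_pct, mde_pct)
    return math.ceil(2 * n / daily_visitors)


n = sample_size_per_variation(2.0, 15)
days = duration_days(1000, 2.0, 15)
print(f"{math.ceil(n):,} visitors per variation, {days} days")
```

With a 2% baseline, a 15% MDE, and 1,000 daily visitors, this works out to roughly 36,700 visitors per variation and 74 days.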

Why Test Duration Matters

One of the most damaging mistakes in A/B testing is stopping a test as soon as you see a "winner." This practice, known as peeking, inflates your false positive rate well beyond the nominal 5%. Every extra look at the data is another chance for random noise to cross the significance threshold, so a test stopped at its first "significant" reading is often just a sample that happens to show a difference by chance.
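The effect of peeking can be seen in a small simulation of A/A tests, where both arms share the same true conversion rate and any "winner" is a false positive. This is an illustrative sketch only: the parameters (400 simulated tests, 14 days, 200 visitors per arm per day, a 10% rate) and the function names are assumptions chosen for the example, not values from the calculator.

```python
import math
import random


def z_statistic(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z statistic; returns 0.0 if variance is zero."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0
    return (conv_b / n_b - conv_a / n_a) / se


def simulate_aa_tests(num_tests=400, days=14, visitors_per_arm_per_day=200,
                      rate=0.10, z_crit=1.96, seed=42):
    """Return (peeking_rate, fixed_horizon_rate) of false positives."""
    rng = random.Random(seed)
    peeking_hits = 0
    fixed_hits = 0
    for _ in range(num_tests):
        conv_a = conv_b = n = 0
        crossed_any_day = False
        significant = False
        for _day in range(days):
            # Both arms draw from the SAME true rate, so there is no real effect.
            conv_a += sum(rng.random() < rate for _ in range(visitors_per_arm_per_day))
            conv_b += sum(rng.random() < rate for _ in range(visitors_per_arm_per_day))
            n += visitors_per_arm_per_day
            significant = abs(z_statistic(conv_a, n, conv_b, n)) > z_crit
            crossed_any_day = crossed_any_day or significant
        peeking_hits += crossed_any_day   # stopped at the first "significant" look
        fixed_hits += significant         # judged only on the final day
    return peeking_hits / num_tests, fixed_hits / num_tests


peeking_rate, fixed_rate = simulate_aa_tests()
print(f"False positives with daily peeking: {peeking_rate:.1%}")
print(f"False positives at fixed horizon:   {fixed_rate:.1%}")
```

Because the final look is one of the daily looks, the peeking rate can never be lower than the fixed-horizon rate. In simulations of this kind it typically lands well above the nominal 5%.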

Running tests to their predetermined duration (calculated before the test starts) is called fixed-horizon testing and is the gold standard for reliable A/B test results. It ensures your statistical guarantees are valid and your shipping decisions are based on real signal.

Even if the math suggests you need fewer than 7 days, always run tests for at least one full week to capture day-of-week variation in behavior. Users often behave differently on Mondays than on Saturdays, and a test that runs Monday through Wednesday captures only weekday traffic patterns.
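The one-week floor described above can be applied as a final step after the sample size calculation. A minimal sketch, assuming you already know the total visitors required across both arms; `recommended_days` and its parameters are hypothetical names.

```python
import math

MIN_DAYS = 7  # one full week, to cover every day of the week at least once


def recommended_days(required_total_visitors, daily_visitors, min_days=MIN_DAYS):
    """Days implied by the sample size, but never fewer than min_days."""
    calculated = math.ceil(required_total_visitors / daily_visitors)
    return max(calculated, min_days)


# High-traffic page: the math says 2 days, the floor raises it to 7.
print(recommended_days(20_000, 10_000))
# Low-traffic page: the calculated duration already exceeds a week.
print(recommended_days(73_381, 1_000))
```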

Frequently Asked Questions

How long should an A/B test run?

An A/B test should run until it reaches the required sample size, which depends on your daily traffic, baseline conversion rate, and the minimum effect you want to detect. Most tests need at least one week, and commonly up to four. Never stop a test early just because results look promising; doing so leads to false positives.

What is a Minimum Detectable Effect (MDE)?

The MDE is the smallest relative improvement you want your test to reliably detect. A 15% MDE on a 2% baseline means detecting when the variation reaches 2.3% conversion. Smaller MDEs require larger samples and longer tests. For most tests, an MDE of 10–20% is practical — detecting smaller effects requires very high traffic volumes.
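To see how strongly the MDE drives sample size, the sketch below evaluates the sample size formula for several MDE values at a fixed 2% baseline. The baseline and the list of MDEs are example inputs, and `sample_size_per_variation` is a hypothetical helper name.

```python
def sample_size_per_variation(baseline_pct, mde_pct,
                              z_alpha_2=1.96, z_beta=0.8416):
    """Two-proportion z-test sample size per arm (95% confidence, 80% power)."""
    p1 = baseline_pct / 100
    p2 = p1 * (1 + mde_pct / 100)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha_2 + z_beta) ** 2 * variance_sum / (p2 - p1) ** 2


for mde in (5, 10, 15, 20):
    n = sample_size_per_variation(2.0, mde)
    print(f"MDE {mde:>2}%: about {n:,.0f} visitors per variation")
```

Halving the MDE roughly quadruples the required sample, because the squared difference (p₂ − p₁)² sits in the denominator.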

What does 95% statistical significance mean?

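A 95% significance threshold (p < 0.05) means that if there were truly no difference between control and variation, a result at least as extreme as the one you observed would occur less than 5% of the time. It does not mean there is a 95% probability that the variation is better, that you're 95% confident in the effect size, or that the variation will perform the same way in production. It's a threshold for decision-making, not a guarantee.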

What is 80% statistical power?

Statistical power is your test's ability to correctly detect a true effect when one exists. 80% power means that if your variation really does improve conversions by your MDE, your test has an 80% chance of detecting it as significant. In the remaining 20% of cases the test misses a real effect of that size (a false negative). Higher power means fewer missed opportunities, but it requires a larger sample.
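The link between sample size and power can be checked with the usual normal approximation. This is a sketch, not the calculator's code: `approximate_power` is a hypothetical helper, it ignores the negligible chance of a significant result in the wrong direction, and the sample size of 36,691 per variation is what the sample size formula gives for a 2% baseline and a 15% MDE.

```python
import math
from statistics import NormalDist


def approximate_power(baseline_pct, mde_pct, n_per_variation, z_alpha_2=1.96):
    """Approximate power of a two-sided two-proportion z-test."""
    p1 = baseline_pct / 100
    p2 = p1 * (1 + mde_pct / 100)
    # Standard error of the difference in conversion rates with n per arm.
    se = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_variation)
    return NormalDist().cdf(abs(p2 - p1) / se - z_alpha_2)


full_n = 36_691  # per-variation sample for a 2% baseline and 15% MDE
print(f"Power at the full sample: {approximate_power(2.0, 15, full_n):.0%}")
print(f"Power at half the sample: {approximate_power(2.0, 15, full_n / 2):.0%}")
```

At the full sample the approximation returns about 80%, as designed. Stopping at half the sample leaves roughly a coin-flip chance of detecting a real 15% lift.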

Can I run a test for less than 7 days?

Even if the formula suggests fewer days, always run tests for at least 7 days to account for day-of-week effects. User behavior differs significantly between weekdays and weekends. A test that runs Thursday through Sunday will over-represent weekend visitors and produce biased results that don't hold during the full week.

Need Help Deciding What to Test?

The calculator tells you how long to run a test. A CRO audit tells you what to test in the first place — based on real user behavior data, not guesswork.

Book a CRO Audit

Starting at $2,500 · 5–7 day delivery