Inferential Statistics › Estimation › Hypothesis Testing · 22 min read · April 30, 2026
BY: Statistics Fundamentals Team
Reviewed By: Minsa A (Senior Statistics Editor)

Confidence Intervals: The Complete Guide with Formulas, Examples & Calculator

When a polling firm reports that a candidate leads 52% ± 3%, that margin of error is a confidence interval. When a clinical trial reports a drug reduces blood pressure by 8 mmHg (95% CI: 5–11), that bracketed range is a confidence interval. Confidence intervals are the standard language of uncertainty in science, medicine, and data analysis.

This guide walks through everything: what a confidence interval actually means, the formula and its parts, a five-step calculation walkthrough, the most common misconceptions, and working code in Python and R.

What You'll Learn
  • ✓ The correct definition of a confidence interval (not the common misinterpretation)
  • ✓ The CI formula broken down into every component
  • ✓ When to use z vs. t critical values
  • ✓ Five types of confidence intervals with worked examples
  • ✓ How sample size, confidence level, and variability affect CI width
  • ✓ Python and R code for every scenario

What Is a Confidence Interval?

Definition
A confidence interval is a range of values, calculated from sample data, that is likely to contain a population parameter (such as the mean) at a specified confidence level.
CI = point estimate ± (critical value × standard error)

A confidence interval gives you two pieces of information at once: a best guess for an unknown population value (the point estimate at its center) and a measure of how uncertain that guess is (the width of the interval). A narrower interval means your data are precise; a wider one means you need more data or should accept more uncertainty.

Confidence interval in one sentence

A confidence interval is a range of plausible values for a population parameter, constructed so that the procedure that generates the interval will capture the true parameter a specified percentage of the time across many repeated samples.

Real-world analogy: the election poll

Imagine a poll of 1,200 likely voters finds 54% support Candidate A, with a margin of error of ±3 percentage points at the 95% confidence level. That means the confidence interval for the true level of voter support runs from 51% to 57%. The phrase "95% confidence level" does not mean there is a 95% chance the true support is in that range — it means the methodology used to build that interval is reliable enough that, over many elections polled the same way, 95 out of 100 such intervals would bracket the true number. That distinction matters, and it is the most commonly misunderstood idea in all of applied statistics.

How Confidence Intervals Work

The mechanics behind a confidence interval trace back to the Central Limit Theorem. When you draw a random sample of size n from any population (with finite variance), the sampling distribution of the sample mean x̄ is approximately normal, centered at the true population mean μ, with a standard deviation equal to σ/√n. This predictable behavior is what makes estimation possible.

Confidence level vs. confidence interval

These two terms are related but not the same. The confidence level is the percentage you set in advance — 90%, 95%, or 99% — that describes how reliable your estimation procedure is over many repeated samples. The confidence interval is the specific numeric range your data produce in a single study. You choose the confidence level before collecting data; the confidence interval is computed afterward from the data.

What does 95% confidence actually mean?

⚠️
The #1 Misconception in Statistics

A 95% CI does NOT mean "there is a 95% probability that the true value lies within this interval." Once your sample is collected and the interval is calculated, the true population mean is a fixed (though unknown) number. Your interval either contains it or does not — no probability involved. The 95% refers to the long-run reliability of the procedure across hypothetical repeated sampling, not to any single interval.

The correct reading: if you ran the same study 100 times using the same sample size and method, roughly 95 of those 100 intervals would contain the true population mean. The other 5 would miss it. You cannot know which type your particular interval is — that uncertainty is the price of drawing inferences from a sample.
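This long-run reading can be checked directly with a short simulation. The sketch below draws many samples from a population with an arbitrarily chosen mean and standard deviation (illustrative values, not from any study), builds a 95% t-interval from each, and counts how often the intervals capture the true mean:

```python
import numpy as np
from scipy import stats

# Illustrative population: mu = 100, sigma = 15 (arbitrary choices)
rng = np.random.default_rng(42)
true_mu, sigma, n, trials = 100.0, 15.0, 50, 10_000

hits = 0
for _ in range(trials):
    sample = rng.normal(true_mu, sigma, n)
    sem = stats.sem(sample)  # s / sqrt(n)
    lo, hi = stats.t.interval(0.95, df=n - 1, loc=sample.mean(), scale=sem)
    hits += lo <= true_mu <= hi  # did this interval capture the true mean?

print(f"Coverage over {trials} simulated studies: {hits / trials:.3f}")  # close to 0.95
```

The printed coverage hovers near 0.95: roughly 95% of the intervals bracket the true mean, and the rest miss, exactly as the frequentist interpretation says.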

The role of the Central Limit Theorem

The Central Limit Theorem (CLT) justifies using the normal distribution to build confidence intervals even when the underlying population is not normally distributed. For sample sizes of roughly n ≥ 30, the sampling distribution of x̄ is close enough to normal that z-based intervals are reliable. For smaller samples — particularly when σ is unknown — the t-distribution provides the heavier-tailed critical values needed to maintain the nominal coverage probability. You can review sampling distributions in depth on the sampling distributions page at Statistics Fundamentals.

The Confidence Interval Formula

Every confidence interval for a population mean shares the same structure: a point estimate, surrounded by a margin of error on both sides. The margin of error is the product of a critical value and a standard error.

General Formula — Confidence Interval for a Mean
CI = x̄ ± z*(σ/√n)
When σ is unknown, substitute the sample standard deviation s, and for small samples use t* in place of z*: CI = x̄ ± t*(s/√n)
where x̄ = sample mean (point estimate), z* = critical value, σ = population standard deviation, n = sample size, σ/√n = standard error (SE), and z* × SE = margin of error (ME)

Step 1 — Calculate the sample mean (x̄)

Add up all observed values and divide by n. This is your point estimate — the single best guess for the unknown population mean. Every other part of the interval depends on this value.

Step 2 — Find the standard error (SE = s/√n)

The standard error measures how much the sample mean is expected to vary from one sample to the next. It equals the standard deviation divided by the square root of n. As n increases, the standard error shrinks — more data buys more precision. The descriptive statistics guide covers standard deviation calculation in full.

Step 3 — Find the critical value (z* or t*)

The critical value scales the standard error to match your chosen confidence level. For a 95% CI with known σ or large n, z* = 1.96. For other confidence levels:

Confidence Level | α    | z* (z-distribution) | Common Use
90%              | 0.10 | 1.645               | Exploratory research, surveys
95%              | 0.05 | 1.96                | Standard in most sciences
99%              | 0.01 | 2.576               | Clinical trials, safety-critical decisions

For small samples (n < 30) or when σ is unknown, replace z* with the t* value from the t-distribution table using df = n − 1 degrees of freedom.
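Both kinds of critical values can be pulled from scipy rather than a printed table. For a two-sided interval at confidence level C, the critical value is the (0.5 + C/2) quantile; the small-sample size below (n = 15) is purely illustrative:

```python
from scipy import stats

# z*: the 97.5th percentile of the standard normal gives the two-sided 95% value
z_95 = stats.norm.ppf(0.975)
print(round(z_95, 3))  # → 1.96

# t*: for an illustrative small sample of n = 15, df = n - 1 = 14
t_95 = stats.t.ppf(0.975, df=14)
print(round(t_95, 3))  # → 2.145
```

Note that the t critical value (2.145) exceeds the z value (1.96), which is exactly why small-sample intervals come out wider.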

Step 4 — Compute the margin of error (ME)

Multiply the critical value by the standard error: ME = z* × (s/√n). The margin of error tells you how far the interval extends in each direction from the sample mean. It is the number you see reported as "±" in polls and studies.

Step 5 — Construct the interval [lower, upper]

Subtract the margin of error from the sample mean to get the lower bound; add it to get the upper bound. The final confidence interval is [x̄ − ME, x̄ + ME].

Worked example: mean weight of 50 students

Worked Example — 95% CI for a Mean

A researcher measures the weight of 50 students. The sample mean is x̄ = 68.4 kg and the sample standard deviation is s = 9.2 kg. Construct a 95% confidence interval for the population mean weight.

1. Parameters: n = 50, x̄ = 68.4 kg, s = 9.2 kg. Because σ is unknown and n ≥ 30, use z* = 1.96.
2. Standard error: SE = s/√n = 9.2/√50 = 9.2/7.071 = 1.301 kg
3. Critical value: z* = 1.96 (95% confidence, large sample)
4. Margin of error: ME = 1.96 × 1.301 = 2.55 kg
5. Interval: [68.4 − 2.55, 68.4 + 2.55] = [65.85, 70.95]

95% CI: [65.85 kg, 70.95 kg]. We are 95% confident this interval contains the true mean weight of the student population.
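The five steps above can be verified in a few lines of Python:

```python
import math
from scipy import stats

# Numbers from the worked example
n, x_bar, s = 50, 68.4, 9.2

se = s / math.sqrt(n)                  # Step 2: standard error
z_star = stats.norm.ppf(0.975)         # Step 3: critical value, about 1.96
me = z_star * se                       # Step 4: margin of error
lower, upper = x_bar - me, x_bar + me  # Step 5: the interval

print(round(se, 3), round(me, 2))        # → 1.301 2.55
print(round(lower, 2), round(upper, 2))  # → 65.85 70.95
```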

Z-Distribution vs. T-Distribution

Choosing between z and t is one of the first decisions in any CI calculation. The rule comes down to what you know about the population and how many observations you have.

When to use z (n ≥ 30 or σ known)

Use the z-distribution when the population standard deviation σ is known, or when your sample is large enough (n ≥ 30) that the sample standard deviation s is a reliable substitute. The Z-table gives the exact critical value for any confidence level.

When to use t (n < 30 or σ unknown)

Use the t-distribution when σ is unknown and the sample is small (n < 30). The t-distribution has heavier tails than the normal, producing wider confidence intervals that account for the added uncertainty of estimating σ from the data. The relevant parameter is the degrees of freedom: df = n − 1. Check the t-distribution table for exact critical values.

Decision rule cheat sheet

Condition                          | Distribution     | Critical value        | Degrees of freedom
σ known, any n                     | z                | z* from Z-table       | N/A
σ unknown, n ≥ 30                  | z (approx.) or t | z* = 1.96 (95%)       | N/A
σ unknown, n < 30                  | t                | t* from t-table       | n − 1
σ unknown, n < 30, non-normal pop. | Bootstrap CI     | Empirical percentiles | N/A

Types of Confidence Intervals

The formula adapts depending on what population parameter you are estimating. Each type has its own standard error formula and, in some cases, its own critical value distribution.

CI for a population mean

The most common CI. Use the formula x̄ ± z*(σ/√n) or x̄ ± t*(s/√n). The standard error is s/√n, and the critical value comes from the normal or t-distribution depending on the scenario above.

CI for a proportion (Wald and Wilson score)

For a sample proportion p̂ based on n observations, the simple normal-approximation (Wald) CI is:

CI for a Proportion
p̂ ± z* × √(p̂(1−p̂)/n)
Wilson score interval performs better when p̂ is close to 0 or 1

This formula works well when np̂ ≥ 5 and n(1−p̂) ≥ 5. When those conditions fail — for rare events or small samples — the Wilson score interval is more accurate. The Wilson interval adjusts for the skew in the sampling distribution of p̂ near the boundaries.
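As a sketch of how the Wilson score interval is computed, the helper below implements its standard closed form directly; the rare-event numbers (2 successes in 40 trials, where np̂ = 2 < 5 and the Wald formula is unreliable) are made up for illustration:

```python
import math

def wilson_ci(p_hat, n, z=1.96):
    """Wilson score interval for a proportion (95% by default)."""
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return center - half, center + half

# Illustrative rare event: 2 successes in 40 trials
lo, hi = wilson_ci(2 / 40, 40)
print(round(lo, 3), round(hi, 3))  # → 0.014 0.165
```

Unlike the Wald interval, the Wilson interval stays inside [0, 1] and remains sensible near the boundaries.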

CI for the difference between two means

To compare two independent groups (e.g., treatment vs. control), the CI for (μ₁ − μ₂) uses a pooled or separate standard error. If the interval excludes zero, the two means differ significantly at the corresponding α level — a direct connection to the two-sample t-test covered in the hypothesis testing guide.
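A minimal sketch of this calculation using the separate-variance (Welch) standard error; the treatment and control measurements below are hypothetical, invented only to show the mechanics:

```python
import numpy as np
from scipy import stats

# Hypothetical treatment and control measurements (illustrative data)
treat = np.array([8.1, 9.4, 7.8, 10.2, 8.8, 9.9, 8.5, 9.1])
control = np.array([6.9, 7.2, 8.0, 6.5, 7.7, 7.1, 6.8, 7.4])

m1, m2 = treat.mean(), control.mean()
v1, v2 = treat.var(ddof=1), control.var(ddof=1)
n1, n2 = treat.size, control.size

se = np.sqrt(v1 / n1 + v2 / n2)  # separate-variance (Welch) standard error
# Welch-Satterthwaite degrees of freedom
df = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)
t_star = stats.t.ppf(0.975, df)

diff = m1 - m2
lo, hi = diff - t_star * se, diff + t_star * se
print(f"difference = {diff:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")  # interval excludes 0
```

Because the interval excludes zero, the corresponding two-sided Welch t-test would reject the null of equal means at α = 0.05.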

CI for regression coefficients

In linear regression, every coefficient β has a confidence interval: β̂ ± t*(SE_β). These intervals tell you the plausible range for the true slope. When a 95% CI for a slope includes zero, the predictor is not significant at the 5% level.
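For simple linear regression this can be sketched with scipy.stats.linregress, whose stderr field is the slope's standard error; the data below are simulated with a known true slope of 0.5, purely for illustration:

```python
import numpy as np
from scipy import stats

# Simulated data with a known true slope of 0.5 (illustrative)
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 40)
y = 2.0 + 0.5 * x + rng.normal(0, 1.0, x.size)

fit = stats.linregress(x, y)               # fit.stderr = slope standard error
t_star = stats.t.ppf(0.975, df=x.size - 2)  # df = n - 2 for simple regression
lo = fit.slope - t_star * fit.stderr
hi = fit.slope + t_star * fit.stderr
print(f"slope = {fit.slope:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")  # CI excludes 0
```

Since the interval excludes zero, x is a significant predictor at the 5% level in this simulated dataset.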

Bootstrap confidence intervals

When the underlying distribution is unknown or the sample size is too small for asymptotic theory, bootstrap CIs offer a data-driven alternative. You resample the original data (with replacement) thousands of times, compute the statistic of interest each time, and take the 2.5th and 97.5th percentiles of the resulting distribution as your 95% CI. Bootstrap methods are standard in machine learning and are available in both Python's scipy and R's boot package.
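A minimal percentile-bootstrap sketch using plain numpy resampling; the small, skewed sample below is simulated for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# A small, skewed sample where normal theory is shaky (simulated data)
data = rng.exponential(scale=2.0, size=25)

# Resample with replacement many times, recomputing the mean each time
boot_means = np.array([
    rng.choice(data, size=data.size, replace=True).mean()
    for _ in range(10_000)
])

# The 2.5th and 97.5th percentiles of the bootstrap distribution form the 95% CI
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"95% percentile bootstrap CI for the mean: [{lo:.2f}, {hi:.2f}]")
```

No normality assumption is needed: the interval comes entirely from the empirical resampling distribution.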

What Affects Confidence Interval Width?

Three variables control how wide your interval is. Understanding their relationships helps you design studies that produce the precision you need before collecting a single data point.

Sample size (n) — bigger n = narrower CI

Because n appears under the square root in the standard error, doubling your sample size cuts the interval width by a factor of √2 ≈ 1.41 — not by half. To cut the interval in half, you need four times as many observations. This is why large sample sizes are so valuable in research.

Confidence level — higher % = wider CI

Increasing your confidence level from 95% to 99% requires a larger critical value (1.96 → 2.576), which widens the interval by about 31%. There is an inherent trade-off: more confidence means less precision, and vice versa. Setting your confidence level before data collection — not after — is essential for valid inference.

Population variability (σ or s)

A population with more spread-out values produces more variable sample means, which means wider intervals. This variability is largely outside your control — but you can control it indirectly by using stratified sampling, which can reduce the effective variability in your estimates.
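A quick numeric check of all three effects, assuming a fixed sample standard deviation of 10 purely for illustration:

```python
import math
from scipy import stats

s = 10.0  # assumed fixed sample standard deviation (illustrative)

def ci_width(n, conf):
    """Full width of a z-based CI: 2 * z* * s / sqrt(n)."""
    z_star = stats.norm.ppf(0.5 + conf / 2)
    return 2 * z_star * s / math.sqrt(n)

print(round(ci_width(50, 0.95), 2))   # → 5.54  (baseline)
print(round(ci_width(200, 0.95), 2))  # → 2.77  (4x the data halves the width)
print(round(ci_width(50, 0.99), 2))   # → 7.29  (more confidence, wider interval)
```

Quadrupling n from 50 to 200 halves the width exactly, while moving from 95% to 99% confidence widens it by about 31%, matching the rules of thumb above.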


How to Interpret a Confidence Interval Correctly

Misinterpretation of confidence intervals is so common that the American Statistical Association devoted a dedicated statement to the problem. Getting this right matters for every downstream decision that relies on the interval.

The correct frequentist interpretation

The 95% confidence interval is a statement about the procedure, not about any individual interval. Over many repetitions of the same study — using the same sampling method and the same sample size — 95% of the intervals generated by this procedure would contain the true population parameter. Your one calculated interval is a single realization of that procedure.

Three common interpretations — two wrong, one acceptable

Statement | Verdict
"There is a 95% probability that the true mean lies within this interval." | Wrong. The true mean is fixed; it does not have a probability distribution. The interval either contains it or does not.
"95% of the population values fall within this interval." | Wrong. A CI estimates a population parameter (e.g., the mean), not the distribution of individual data points. That is what a prediction interval is for.
"We are 95% confident this interval contains the true population mean." | Correct shorthand — understood as a statement about the long-run reliability of the procedure, not the probability for this specific interval.

Confidence intervals and statistical significance

There is a direct relationship between confidence intervals and hypothesis testing. If a 95% CI for a difference (or effect) does not include zero (or the null value), the corresponding two-sided test rejects the null at α = 0.05. Confidence intervals carry more information than p-values alone because they show the magnitude and direction of an effect, not just whether it crosses a significance threshold. This is why the American Psychological Association's Publication Manual now requires reporting CIs alongside test statistics.

CI vs. Prediction Interval vs. Credible Interval

These three terms are often confused. Each answers a fundamentally different question.

Prediction interval (for future observations)

A prediction interval estimates where a single new observation will fall, not where the population mean is. Because it must account for both uncertainty in the mean estimate and natural individual variation, a prediction interval is always wider than a confidence interval. The formula for a prediction interval for one new observation is: x̄ ± t*(s × √(1 + 1/n)).
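Using the student-weight numbers from the worked example earlier, the two half-widths can be compared directly:

```python
import math
from scipy import stats

# Same numbers as the student-weight example: n = 50, x_bar = 68.4, s = 9.2
n, x_bar, s = 50, 68.4, 9.2
t_star = stats.t.ppf(0.975, df=n - 1)

ci_half = t_star * s / math.sqrt(n)          # CI half-width: uncertainty in the mean
pi_half = t_star * s * math.sqrt(1 + 1 / n)  # PI half-width: adds individual variation
print(round(ci_half, 2), round(pi_half, 2))  # → 2.61 18.67
```

The prediction interval is several times wider because individual weights vary far more than the sample mean does.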

Bayesian credible interval (posterior probability)

A Bayesian credible interval does make a probability statement about a parameter — but within the Bayesian framework where parameters have prior distributions. A 95% Bayesian credible interval means exactly what frequentist CIs are often mistakenly said to mean: there is a 95% posterior probability that the true parameter value falls within the interval. The trade-off is that the answer depends on the chosen prior distribution.

Side-by-side comparison

Feature | Confidence Interval | Prediction Interval | Bayesian Credible Interval
Estimates | Population parameter (e.g., μ) | A single future observation | Population parameter
Framework | Frequentist | Frequentist | Bayesian
Width relative to CI | Reference | Always wider | Depends on prior
Probability statement | About procedure, not the interval | About procedure, not the interval | Directly about the parameter
Requires prior? | No | No | Yes

How to Calculate Confidence Intervals in Python and R

For any real dataset, you will compute confidence intervals using software rather than by hand. Both Python and R offer one-line solutions for the common cases.

Python: scipy.stats.t.interval()

from scipy import stats
import numpy as np

# Sample data: weights of 50 students
data = np.array([68.4, 72.1, 65.3, ...]) # your actual data

n = len(data)
mean = np.mean(data) # x̄
sem = stats.sem(data) # standard error = s / √n

# 95% confidence interval (t-distribution, df = n-1)
ci = stats.t.interval(0.95, df=n-1, loc=mean, scale=sem)
print(ci) # → (lower_bound, upper_bound)

# 99% confidence interval
ci_99 = stats.t.interval(0.99, df=n-1, loc=mean, scale=sem)

# For proportions (95% CI for p̂ = 0.54, n = 1200)
p_hat = 0.54
n_p = 1200
se_p = np.sqrt(p_hat * (1 - p_hat) / n_p)
ci_p = stats.norm.interval(0.95, loc=p_hat, scale=se_p)
print(ci_p) # → (0.512, 0.568)

R: t.test()$conf.int

# Sample data vector
data <- c(68.4, 72.1, 65.3, ...) # your actual data

# 95% confidence interval (default)
result <- t.test(data)
result$conf.int # → [lower, upper] with attr "conf.level"

# 99% confidence interval
t.test(data, conf.level = 0.99)$conf.int

# CI for a proportion (n=1200, x=648 successes)
prop.test(648, 1200, conf.level = 0.95)$conf.int

# Bootstrap CI using the boot package
library(boot)
boot_mean <- function(d, i) mean(d[i])
b <- boot(data, boot_mean, R = 10000)
boot.ci(b, type = "perc") # percentile bootstrap CI

Interpreting the output

Both functions return the lower and upper bounds of the confidence interval. The key number to report is the interval itself — not just the p-value. A CI of [5.1, 10.2] tells you the effect exists (does not include zero) and its likely magnitude is between 5 and 10 units. A p-value alone tells you only the former.

Real-World Examples of Confidence Intervals

Confidence intervals appear in every field that draws conclusions from sampled data. Here are four domains where they drive real decisions.

🏥 Clinical Trials and Drug Efficacy

Phase III trials report treatment effects as confidence intervals, not just p-values. A drug that reduces HbA1c by 1.2% (95% CI: 0.8–1.6%) tells clinicians both the expected benefit and its uncertainty range. The FDA requires CIs in new drug applications.

🗳️ Political Polling

Every reputable poll publishes its margin of error — the half-width of a 95% confidence interval. A poll of 1,000 voters typically yields ±3.1 percentage points. When two candidates' intervals overlap, the race is a statistical toss-up.

📊 A/B Testing in Product Analytics

When a product team runs an A/B test, they report the lift in conversion rate as a confidence interval. A lift of 2.1% (95% CI: 0.4–3.8%) means the improvement is real but the range of likely effect sizes is wide — a signal to run a longer test before shipping.

📈 Economic Indicators

The US Bureau of Labor Statistics reports monthly unemployment figures with CIs. The headline figure is a point estimate from the Current Population Survey; the published 90% CI accounts for sampling error and is factored into monetary policy decisions.

Real-World Example: COVID-19 Vaccine Efficacy

When Pfizer-BioNTech reported 95% vaccine efficacy in late 2020, the full result was stated as 95% (95% CI: 90.3–97.6%). The confidence interval was as important as the point estimate. It showed that even at the lower bound — 90.3% — the vaccine was highly effective, giving regulators the statistical assurance needed for emergency authorization. A wide or zero-crossing interval would have demanded a larger trial before authorization.

How to Report Confidence Intervals

The American Psychological Association (APA) Publication Manual specifies a standard format that most scientific journals now adopt, regardless of discipline.

APA format

For a single mean:

M = 25.4, 95% CI [22.1, 28.7]

For a difference between two means:

Δ = 4.2, 95% CI [1.8, 6.6], t(58) = 3.71, p = .001

For a proportion:

p̂ = .54, 95% CI [.51, .57]

Reporting in tables vs. in-text

In tables, square-bracket format is standard: [22.1, 28.7]. In-text, the full phrase "95% CI [22.1, 28.7]" is preferred over the abbreviation alone. Always state the confidence level — do not assume "CI" implies 95%. For regression, report each coefficient's interval: b = 0.43, 95% CI [0.21, 0.65].
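If you report many intervals, a tiny helper keeps the formatting consistent. This is a hypothetical convenience function written for this article, not part of any APA tooling:

```python
def apa_ci(mean, lower, upper, level=95):
    """Format a mean and its CI in APA style (hypothetical helper)."""
    return f"M = {mean:.2f}, {level}% CI [{lower:.2f}, {upper:.2f}]"

print(apa_ci(25.4, 22.1, 28.7))  # → M = 25.40, 95% CI [22.10, 28.70]
```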

🔑 Key Takeaways

A confidence interval = point estimate ± margin of error; for a mean, CI = x̄ ± z*(s/√n).

95% confidence means the procedure is reliable 95% of the time across repeated samples — not that this specific interval has a 95% probability of containing the truth.

Use z* when σ is known or n ≥ 30; use t* (df = n−1) otherwise. Common z* values: 1.645 (90%), 1.96 (95%), 2.576 (99%).

Larger n, lower confidence level, or smaller σ all produce narrower CIs. To halve the width, quadruple the sample size.

A CI that excludes zero implies statistical significance at the corresponding α level. CIs carry more information than p-values because they show effect magnitude.

Prediction intervals are wider than CIs because they cover individual observations, not just the mean. Bayesian credible intervals do allow direct probability statements about parameters.

Read More Articles

  • Hypothesis Testing: See how CIs relate to p-values and significance tests in frequentist inference.
  • Normal Distribution: Understand the distribution that underpins z-based confidence intervals.
  • Sampling Distributions: Learn why the sampling distribution of the mean is approximately normal for large n.

Frequently Asked Questions

What is a confidence interval?

A confidence interval is a range of values, calculated from sample data, that is likely to contain a population parameter (such as the mean) at a specified confidence level. For example, a 95% confidence interval means that if the same sampling procedure were repeated 100 times, approximately 95 of the resulting intervals would contain the true population value. It is calculated as: point estimate ± (critical value × standard error).

What does a 95% confidence interval mean?

A 95% confidence interval means the procedure used to construct it will capture the true population parameter in 95 out of 100 repeated samples. The common misinterpretation — "there is a 95% probability that the true value lies in this interval" — is incorrect. Once the interval is computed from your data, the true value is either in it or not. The 95% describes the reliability of the method, not the probability for any single interval.

How do you calculate a 95% confidence interval?

To calculate a 95% confidence interval for a mean: (1) Compute the sample mean x̄. (2) Calculate the standard error SE = s/√n. (3) Find the critical value z* = 1.96 for 95% CI (or t* if n < 30). (4) Calculate the margin of error: ME = z* × SE. (5) The confidence interval is [x̄ − ME, x̄ + ME]. For example, with x̄ = 50, s = 10, n = 50: SE = 1.414, ME = 1.96 × 1.414 = 2.77, CI = [47.23, 52.77].

What is the confidence interval formula?

The general confidence interval formula is CI = x̄ ± z*(s/√n), where x̄ is the sample mean, z* is the critical value (1.96 for 95%), s is the sample standard deviation, and n is the sample size. For proportions: CI = p̂ ± z*√(p̂(1−p̂)/n). For small samples (n < 30) with unknown σ, replace z* with t* from the t-distribution with df = n−1.

What is the difference between a confidence interval and a p-value?

A p-value tells you the probability of observing your results (or more extreme) if the null hypothesis were true — a binary signal about whether an effect exists. A confidence interval tells you the plausible range of the effect's magnitude. They are connected: if the 95% CI for a difference does not include zero, the p-value is below 0.05. But CIs are more informative because they communicate effect size, not just the yes/no decision. Most journals now require both.

What affects the width of a confidence interval?

Three factors control CI width: (1) Sample size — larger n narrows the CI because the standard error SE = s/√n shrinks. Quadrupling n halves the width. (2) Confidence level — higher confidence (99% vs 95%) requires a larger critical value, widening the interval. (3) Population variability — higher standard deviation σ widens the interval, and you cannot always control this factor. When designing a study, choose n to achieve the precision you need at your target confidence level.

How do you interpret a confidence interval correctly?

Correct interpretation: "We used a procedure that would capture the true population mean in 95% of repeated samples; the interval produced by this sample is [lower, upper]." Incorrect interpretations to avoid: (a) "There is a 95% probability the true value is in this interval" — the true value is fixed, not random; (b) "95% of the data falls in this interval" — that would describe a prediction interval, not a CI. The confidence level describes the long-run performance of the estimation procedure.

What is the difference between a confidence interval and a prediction interval?

A confidence interval estimates where the true population mean (or other parameter) lies. A prediction interval estimates where a single new observation will fall. Prediction intervals are always wider because they account for both uncertainty in the mean estimate and the natural variability of individual data points. The prediction interval formula is: x̄ ± t*(s × √(1 + 1/n)), where the extra "1" under the square root accounts for individual observation variance.

When should you use z versus t?

Use z when: (a) the population standard deviation σ is known, or (b) the sample size n ≥ 30 (by the Central Limit Theorem). Use t when: σ is unknown and n < 30. The t-distribution has heavier tails than the normal, giving wider intervals that account for the extra uncertainty of estimating σ from a small sample. As n increases, the t-distribution converges to the normal — by n = 120 the difference is negligible in practice.

How do you calculate a confidence interval in Python or R?

In Python: from scipy import stats; ci = stats.t.interval(0.95, df=n-1, loc=mean, scale=stats.sem(data)). In R: t.test(data)$conf.int for the default 95% interval, or t.test(data, conf.level=0.99)$conf.int for 99%. For proportions in R: prop.test(successes, n)$conf.int. For bootstrap CIs in Python: use scipy.stats.bootstrap() with the statistic of interest and a specified confidence level.