🔑 Key Takeaways
The most important principles about normal distribution that every student and data analyst should know.
Normal distribution is defined by mean (μ) and standard deviation (σ). Changing μ shifts the curve; changing σ changes its width.
Mean = Median = Mode in a perfectly normal distribution — all three measures of center coincide exactly at the peak.
The empirical rule (68-95-99.7) tells you what percentage of data falls within 1, 2, and 3 standard deviations of the mean.
The Central Limit Theorem explains why normal distribution appears everywhere. Sample means approach normality regardless of the underlying population shape.
Z-scores standardize any normal distribution to a mean of 0 and standard deviation of 1, enabling probability lookups via a Z-table.
Not all data is normally distributed. Income, social media engagement, and website traffic are typically skewed, not bell-shaped.
What Is Normal Distribution?
Normal distribution is a continuous probability distribution where values cluster symmetrically around the mean, tapering off in both directions to produce the characteristic bell shape. It is the most widely used distribution in statistics, natural science, social science, and engineering.
Three definitions serve different audiences — from newcomers to statisticians:
Plain English (Beginner)
Most values cluster in the middle, with fewer values at the extremes. Plot them and you get a symmetric bell-shaped curve.
Statistical (Intermediate)
A symmetric, unimodal, continuous probability distribution where mean = median = mode, with tails extending infinitely.
Mathematical (Advanced)
A distribution fully characterized by parameters μ and σ², with PDF f(x; μ, σ) = (1/(σ√(2π)))·exp(−(x−μ)²/(2σ²)), for x ∈ (−∞, +∞).
Abraham de Moivre first described the curve in 1733 as an approximation to the binomial distribution. Carl Friedrich Gauss later applied it extensively to astronomical measurement errors in the early 19th century, which is why it is also called the Gaussian distribution. The term "normal" was coined by Francis Galton and Karl Pearson in the late 1800s.
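The mathematical definition can be checked numerically. A minimal sketch in Python (standard library only), implementing the PDF exactly as written above:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x, computed straight from the formula."""
    coeff = 1.0 / (sigma * math.sqrt(2 * math.pi))
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return coeff * math.exp(exponent)

# Peak height of the standard normal: 1/sqrt(2*pi)
print(round(normal_pdf(0.0), 4))           # → 0.3989

# Density of an IQ-style distribution (mu=100, sigma=15) at x = 115
print(round(normal_pdf(115, 100, 15), 5))  # → 0.01613
```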
What Does "Normally Distributed" Mean?
When data is normally distributed, it means the dataset follows the bell curve pattern: most observations are near the average, and the frequency of observations decreases symmetrically as values move away from that average. A dataset's normality can be checked visually or via statistical tests — see the normality testing section below.
The Normal Distribution Bell Curve — Visual Explanation
The normal distribution curve (bell curve) is the visual representation of a normal distribution. Its distinctive shape tells you everything about how values in the dataset are spread.
Figure 1: Normal distribution bell curve showing the proportion of data within each standard deviation band.
Key Visual Features of the Normal Distribution Graph
- Perfectly symmetric about the mean — the left and right halves are mirror images
- Single peak (unimodal) located exactly at the mean μ
- Tails are asymptotic — they extend infinitely but never touch the x-axis
- Total area under the curve = 1 — representing 100% of all probability
- Points of inflection occur at exactly μ − σ and μ + σ
How μ and σ Shape the Normal Distribution Graph
The mean (μ) and standard deviation (σ) are the two parameters that fully describe any normal distribution:
Effect of Changing μ (Mean)
- Shifts the entire curve left or right along the x-axis
- The shape stays exactly the same — only the location changes
- Higher μ → curve moves right; lower μ → curve moves left
- Mean determines where the peak of the bell sits
Effect of Changing σ (Standard Deviation)
- Larger σ → wider, flatter curve (more spread-out data)
- Smaller σ → narrower, taller curve (tightly clustered data)
- The curve's location stays the same — only the width changes
- σ² (variance) is the squared measure of this spread
Properties of Normal Distribution
The normal distribution has precise mathematical properties that make it predictable and analytically tractable. These properties are what give it so much power in applied statistics.
| Property | Value / Description |
|---|---|
| Mean (μ) | Center of the distribution; can be any real number |
| Variance (σ²) | Measures spread; σ² > 0 |
| Standard Deviation (σ) | Square root of variance; same units as data |
| Skewness | 0 (perfectly symmetric) |
| Kurtosis (excess) | 0 (kurtosis = 3; called mesokurtic) |
| Mean = Median = Mode | All three equal μ exactly |
| Support | All real numbers (−∞, +∞) |
| Total probability | 1 (area under curve = 100%) |
| Moment generating function | M(t) = exp(μt + σ²t²/2) |
A perfect normal distribution has skewness = 0 (neither left- nor right-skewed) and excess kurtosis = 0 (kurtosis = 3, called mesokurtic). When testing your data, values of skewness between −0.5 and +0.5 and excess kurtosis between −1 and +1 generally suggest approximate normality.
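These rules of thumb are easy to apply in practice. A short sketch using SciPy (the simulated sample, seed, and size are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
data = rng.normal(loc=100, scale=15, size=10_000)  # simulated IQ-like sample

skewness = stats.skew(data)
excess_kurtosis = stats.kurtosis(data)  # Fisher definition: 0 for a perfect normal

# Both should fall well inside the rule-of-thumb ranges (-0.5..0.5 and -1..1)
print(f"skewness = {skewness:.3f}, excess kurtosis = {excess_kurtosis:.3f}")
```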
Normal Distribution Formula — PDF Explained
The normal distribution formula (probability density function) gives the height of the bell curve at any value of x. This is the probability density, not the probability itself — you integrate it over an interval to get a probability.
Symbol-by-Symbol Breakdown
| Symbol | Name | Role in Formula |
|---|---|---|
| x | Observed value | The data point you want to evaluate |
| μ (mu) | Mean | Centers the distribution; the peak of the bell curve |
| σ (sigma) | Standard deviation | Controls the width/spread of the curve |
| σ² | Variance | Square of σ; appears in the exponent |
| e | Euler's number | ~2.71828; the base of the natural logarithm |
| π | Pi | ~3.14159; part of the normalization constant |
| 1/σ√(2π) | Normalization constant | Ensures the total area under the curve = 1 |
| (x−μ)²/2σ² | Standardized exponent | Measures distance from mean in units of σ; creates the bell shape |
Why the Formula Creates a Bell Shape
The key is in the exponent: −(x−μ)²/(2σ²). When x = μ, the exponent is 0, so e⁰ = 1, giving the maximum value of the function at the peak. As x moves away from μ in either direction, (x−μ)² grows, making the exponent increasingly negative, so e raised to that increasingly negative power rapidly shrinks toward zero. The squaring of (x−μ) is what creates symmetry on both sides.
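Both claims, the peak at μ and the mirror symmetry, can be verified numerically (standard library only; the grid and test point are illustrative):

```python
import math

mu, sigma = 0.0, 1.0

def pdf(x):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Peak at the mean: the exponent is 0 there, so e^0 = 1 maximizes the density
assert pdf(mu) == max(pdf(i / 10) for i in range(-40, 41))

# Symmetry: squaring (x - mu) gives identical heights either side of the mean
assert math.isclose(pdf(mu - 1.3), pdf(mu + 1.3))
print("peak and symmetry verified")
```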
The Cumulative Distribution Function (CDF)
The normal cumulative distribution function answers: "What is the probability that a value is less than x?" It is the integral of the PDF from −∞ to x:
The CDF produces an S-shaped curve (sigmoid) ranging from 0 to 1. At x = μ, F(μ) = 0.5 exactly — half the distribution lies below the mean. Because this integral has no closed-form solution, Z-tables were developed to look up CDF values without numerical integration.
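Both properties of the CDF can be confirmed directly with SciPy (using the IQ parameters μ = 100, σ = 15 as an example):

```python
from scipy import stats

mu, sigma = 100, 15

# F(mu) = 0.5: exactly half the distribution lies below the mean
print(stats.norm.cdf(mu, loc=mu, scale=sigma))              # → 0.5

# The S-shape: the CDF runs from near 0 (at mu - 3σ) to near 1 (at mu + 3σ)
print(round(stats.norm.cdf(55, loc=mu, scale=sigma), 5))    # → 0.00135
print(round(stats.norm.cdf(145, loc=mu, scale=sigma), 5))   # → 0.99865
```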
Standard Normal Distribution & Z-Scores
The standard normal distribution is a special case of the normal distribution with μ = 0 and σ = 1. It serves as a universal reference — any normal distribution can be converted to it, enabling probability lookups from a single table.
The Z-Score Formula
A Z-score measures how many standard deviations a value (X) is from the mean (μ). A Z of 0 means X equals the mean. A Z of +1.5 means X is 1.5 standard deviations above the mean. A Z of −2 means X is 2 standard deviations below the mean.
Worked Example — IQ Scores
What percentage of people have an IQ above 115?
Given: IQ is normally distributed with μ = 100, σ = 15.
Step 1: Find the Z-score for X = 115: Z = (115 − 100) / 15 = 15/15 = 1.00
Step 2: Look up Z = 1.00 in the standard normal table → P(Z < 1.00) = 0.8413
Step 3: P(X > 115) = 1 − 0.8413 = 0.1587 = 15.87%
About 15.87% of people score above 115 on an IQ test. See the Z-table for full lookup values.
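The same three steps translate directly into SciPy, which computes the CDF in place of the Z-table lookup:

```python
from scipy import stats

mu, sigma = 100, 15
x = 115

z = (x - mu) / sigma          # Step 1: Z-score
p_below = stats.norm.cdf(z)   # Step 2: P(Z < 1.00), the Z-table value
p_above = 1 - p_below         # Step 3: P(X > 115)

print(z)                      # → 1.0
print(round(p_below, 4))      # → 0.8413
print(round(p_above, 4))      # → 0.1587
```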
Positive vs. Negative Z-Scores
- Positive Z-score: The value is above the mean (right side of the bell curve)
- Negative Z-score: The value is below the mean (left side of the bell curve)
- Z = 0: The value equals the mean exactly
- |Z| > 3: The value is unusually far from the mean (less than 0.3% of data)
The standard normal distribution table (Z-table) gives P(Z < z) for any z-value. For P(Z > z), subtract the table value from 1. For P(z₁ < Z < z₂), subtract the smaller table value from the larger one.
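The three lookup rules map one-to-one onto SciPy calls (z = 1.96 is just an illustrative value):

```python
from scipy import stats

# P(Z < z): direct CDF lookup (what a Z-table tabulates)
p_less = stats.norm.cdf(1.96)                              # → 0.9750

# P(Z > z): subtract the table value from 1
p_greater = 1 - stats.norm.cdf(1.96)                       # → 0.0250

# P(z1 < Z < z2): larger table value minus smaller
p_between = stats.norm.cdf(1.96) - stats.norm.cdf(-1.96)   # → 0.9500

print(round(p_less, 4), round(p_greater, 4), round(p_between, 4))
```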
The 68-95-99.7 Empirical Rule of Normal Distribution
The empirical rule (also called the three-sigma rule or 68-95-99.7 rule) is one of the most practical results in statistics. It tells you exactly what percentage of data falls within 1, 2, and 3 standard deviations from the mean in any normal distribution.
| Range | % of Data Included | Z-Score Range | IQ Example (μ=100, σ=15) |
|---|---|---|---|
| μ ± 1σ | 68.27% | −1 to +1 | 85 to 115 |
| μ ± 2σ | 95.45% | −2 to +2 | 70 to 130 |
| μ ± 3σ | 99.73% | −3 to +3 | 55 to 145 |
| μ ± 4σ | 99.994% | −4 to +4 | 40 to 160 |
| μ ± 6σ (Six Sigma) | 99.9999998% | −6 to +6 | Effectively all values |
Theoretically, a normal distribution's tails never reach zero — the curve extends to ±∞. In practice, values beyond ±3σ are so rare (about 0.27% of the data) that we can safely ignore them for most purposes. The Six Sigma quality standard targets ±6σ, leaving only 3.4 defects per million opportunities.
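The percentages in the table come straight out of the standard normal CDF; a quick SciPy check:

```python
from scipy import stats

# Coverage within ±k standard deviations, identical for every normal distribution
for k in (1, 2, 3, 4):
    coverage = stats.norm.cdf(k) - stats.norm.cdf(-k)
    print(f"within ±{k}σ: {coverage * 100:.3f}%")

# within ±1σ: 68.269%
# within ±2σ: 95.450%
# within ±3σ: 99.730%
# within ±4σ: 99.994%
```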
Real-World Examples of Normal Distribution
The normal distribution appears throughout nature, science, engineering, and social research. Here are the most important domains where it applies — along with specific, concrete numbers.
IQ Scores
Wechsler IQ tests are scaled to μ = 100, σ = 15 by design. About 68% score between 85 and 115; only 2.3% score above 130.
Human Height
Adult male height in the US: μ ≈ 70 inches (178 cm), σ ≈ 3 inches. Height is approximately — not perfectly — normally distributed.
Manufacturing (Six Sigma)
Quality control targets ±6σ, meaning 3.4 defects per million parts. Machine tolerances follow a normal distribution curve.
Measurement Error
Repeated measurements of the same physical quantity produce errors that follow a normal distribution — the basis of least squares fitting.
Standardized Tests
SAT scores are scaled to μ = 1010, σ ≈ 210 for the combined test. GRE Quantitative: μ = 150, σ = 8.9.
Financial Returns
Daily stock returns are often modeled as normally distributed for simplicity — though fat tails (leptokurtosis) make this an approximation only.
Birth Weight Example — Full Walkthrough
Full-term newborn birth weights in the United States follow an approximately normal distribution with μ = 3,400 g and σ = 500 g. Using the empirical rule:
- About 68% of babies weigh between 2,900 g and 3,900 g (μ ± 1σ)
- About 95% weigh between 2,400 g and 4,400 g (μ ± 2σ)
- Babies outside the 3σ range (below 1,900 g or above 4,900 g) represent roughly 0.27% of births — statistically very unusual
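The same numbers can be reproduced exactly from the CDF rather than the rounded empirical rule:

```python
from scipy import stats

mu, sigma = 3400, 500  # birth weight in grams

within_1 = stats.norm.cdf(3900, mu, sigma) - stats.norm.cdf(2900, mu, sigma)
within_3 = stats.norm.cdf(4900, mu, sigma) - stats.norm.cdf(1900, mu, sigma)

print(f"{within_1 * 100:.1f}% between 2,900 g and 3,900 g")   # → 68.3%
print(f"{(1 - within_3) * 100:.2f}% outside the 3σ range")    # → 0.27%
```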
The Central Limit Theorem — Why Normal Distribution Appears Everywhere
The Central Limit Theorem (CLT) is the mathematical reason that normal distribution is so pervasive. It explains why the bell curve appears even when the underlying data is not normally distributed.
When you take sufficiently large random samples from any population — regardless of its distribution — and calculate the sample means, those means will be approximately normally distributed. Formally, the sample mean X̄ is approximately distributed as N(μ, σ²/n), with the approximation improving as the sample size n grows.
Practical Implications of the CLT
Justifies parametric statistical tests
The CLT is why t-tests, ANOVA, and regression are robust even when raw data is not perfectly normal — the sampling distributions of the test statistics converge to normal distributions.
Required sample size: n ≥ 30
As a rule of thumb, samples of 30 or more are sufficient for the CLT to produce approximately normal sampling distributions, even from skewed populations. Very skewed populations may need larger samples (n ≥ 100).
Underlies confidence intervals and hypothesis testing
Confidence intervals for means rely on the CLT-guaranteed normality of X̄. The standard error (σ/√n) decreases as sample size grows, narrowing confidence intervals. Learn more about hypothesis testing.
Explains natural normal distributions
Human height is the sum of many small genetic and environmental influences — each individually non-normal. The CLT makes their combined effect approximately normal, which is why biometric measurements so often look bell-shaped.
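The CLT is easy to see in simulation. This sketch draws samples from a strongly skewed exponential population and shows that the sample means are far more symmetric than the raw data (the seed, sample size, and number of samples are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# 10,000 samples of size n = 30 from an exponential population (skewness ≈ 2)
population_draws = rng.exponential(scale=1.0, size=(10_000, 30))
sample_means = population_draws.mean(axis=1)

print(f"raw data skewness:    {stats.skew(population_draws.ravel()):.2f}")  # ≈ 2
print(f"sample-mean skewness: {stats.skew(sample_means):.2f}")              # much closer to 0
```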
How to Test if Data Is Normally Distributed
Knowing whether your data follows a normal distribution is a prerequisite for many statistical tests. Use both visual methods and formal statistical tests for the most reliable assessment.
Visual Methods
- Histogram: Does it look bell-shaped and symmetric? A roughly bell-shaped histogram with no strong skew suggests approximate normality.
- Q-Q Plot (Quantile-Quantile): Plot your data quantiles against theoretical normal quantiles. Points should fall close to a straight diagonal line if normally distributed. Systematic deviations indicate non-normality.
- Box Plot: Check for symmetry around the median and whether whiskers are roughly equal in length. Outliers beyond 1.5×IQR are flagged.
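A Q-Q plot takes two lines with SciPy and matplotlib; a sketch on simulated data (replace `data` with your own sample):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(1)
data = rng.normal(loc=100, scale=15, size=500)  # illustrative normal sample

# Sample quantiles vs theoretical normal quantiles;
# points falling near the fitted line suggest approximate normality
stats.probplot(data, dist="norm", plot=plt)
plt.title("Q-Q Plot against the Normal Distribution")
plt.show()
```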
Statistical Tests for Normality
| Test | Best For | Null Hypothesis | Limitation |
|---|---|---|---|
| Shapiro-Wilk | Small to medium samples (n < 2,000) | Data is normally distributed | Overly sensitive with very large n |
| Kolmogorov-Smirnov | Large samples | Data follows the specified distribution | Less powerful for small samples |
| Anderson-Darling | General use; emphasizes tails | Data is normally distributed | Requires critical value tables |
| D'Agostino-Pearson | Combines skewness + kurtosis | Skewness = 0 and kurtosis = 3 | Requires n ≥ 20 |
| Jarque-Bera | Large samples; econometrics | Skewness = 0 and kurtosis = 3 | Poor in small samples |
With very large samples (n > 5,000), normality tests become so sensitive that they detect trivially small deviations from normality — deviations that have no practical impact on your analysis. In these cases, visual inspection and the CLT are more informative than p-values from formal normality tests.
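The tests above map to one-line SciPy calls. A sketch comparing a normal and a skewed sample (sizes and seed illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
normal_data = rng.normal(loc=0, scale=1, size=200)
skewed_data = rng.exponential(scale=1.0, size=200)

# Shapiro-Wilk: H0 = data is normally distributed
stat_n, p_n = stats.shapiro(normal_data)
stat_s, p_s = stats.shapiro(skewed_data)

print(f"normal sample: p = {p_n:.3f}")   # typically large: no evidence against H0
print(f"skewed sample: p = {p_s:.2e}")   # tiny: reject normality
```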
What to Do When Data Is Not Normally Distributed
- Log transformation: For positively skewed data (income, biological measurements), taking log(x) often produces approximate normality
- Square root or Box-Cox transformation: For count data or moderately skewed distributions
- Non-parametric tests: Mann-Whitney U (instead of t-test), Kruskal-Wallis (instead of ANOVA), Spearman correlation (instead of Pearson)
- Bootstrap methods: Make no distributional assumptions; resample from the observed data
- Rely on CLT: If sample sizes are large (n ≥ 30), parametric tests remain valid regardless of raw data distribution
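The effect of a log transformation is easy to demonstrate on simulated income-like (log-normal) data; the parameters are illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
incomes = rng.lognormal(mean=10, sigma=0.8, size=5_000)  # right-skewed, like real incomes

print(f"skewness before log: {stats.skew(incomes):.2f}")          # strongly positive
print(f"skewness after log:  {stats.skew(np.log(incomes)):.2f}")  # near 0
```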
Normal Distribution vs. Other Distributions
Understanding when to use a normal distribution — and when another distribution fits better — is a core skill in statistics and data analysis.
Normal vs. T-Distribution
The t-distribution looks like the normal distribution but has heavier tails. It is used when the population standard deviation is unknown and must be estimated from data, particularly with small samples.
| Feature | Normal Distribution | T-Distribution |
|---|---|---|
| When to use | Large samples (n > 30); known σ | Small samples; unknown σ |
| Tail weight | Lighter tails | Heavier tails (more extreme values) |
| Shape parameter | μ and σ | Degrees of freedom (df = n − 1) |
| As df → ∞ | — | Converges to standard normal |
| Kurtosis | 3 (mesokurtic) | > 3 (leptokurtic) |
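The heavier tails show up directly in tail probabilities. A quick SciPy comparison at ±3 (df = 5 is an illustrative small-sample case):

```python
from scipy import stats

# Probability of landing beyond ±3 under each distribution
p_normal = 2 * stats.norm.sf(3)       # sf(x) = 1 - cdf(x), the upper-tail probability
p_t5 = 2 * stats.t.sf(3, df=5)        # t with 5 degrees of freedom: fatter tails
p_t1000 = 2 * stats.t.sf(3, df=1000)  # large df: converges to the standard normal

print(round(p_normal, 4))   # → 0.0027
print(round(p_t5, 4))       # noticeably larger than the normal value
print(round(p_t1000, 4))    # nearly identical to the normal value
```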
Distribution Comparison Overview
Normal vs. Log-Normal
When X is skewed right
- Log-normal: values cannot be negative
- log(X) follows a normal distribution
- Use for: income, stock prices, reaction times
- Log-normal is right-skewed; normal is symmetric
Normal vs. Binomial
Approximation
- Normal approximates binomial when np ≥ 5 and n(1−p) ≥ 5
- Binomial is discrete; normal is continuous
- Apply a continuity correction: shift the boundary by ±0.5
- Use for large-sample proportion tests
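A sketch of the normal approximation with the continuity correction, checked against the exact binomial (n = 100, p = 0.5 are illustrative values satisfying both conditions):

```python
from scipy import stats

n, p = 100, 0.5                      # np = 50 and n(1-p) = 50, both ≥ 5
mu = n * p
sigma = (n * p * (1 - p)) ** 0.5

exact = stats.binom.cdf(55, n, p)                   # exact P(X ≤ 55)
approx = stats.norm.cdf(55.5, loc=mu, scale=sigma)  # continuity correction: 55 → 55.5

print(round(exact, 4), round(approx, 4))  # the two agree closely
```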
Normal vs. Poisson
Count data
- Normal approximates Poisson when λ > 10
- Poisson is for count data (non-negative integers)
- Poisson mean = variance; normal has no such constraint
- Use Poisson for rare events, queuing, arrivals
Normal vs. Uniform
Equal probability
- Uniform: all values equally likely
- Normal: values near center much more likely
- Uniform has bounded support; normal is infinite
- Use uniform for random number generation
Normal vs. Skew-Normal
Asymmetric data
- Skew-normal adds a shape parameter α
- When α = 0, it reduces to the normal distribution
- Useful when data is "nearly but not quite" normal
- Used in finance, environmental science
Normal vs. Bivariate/Multivariate Normal
Multiple variables
- Multivariate normal extends to p variables
- Characterized by mean vector μ and covariance matrix Σ
- Used in multivariate regression, factor analysis
- All marginal distributions are also normal
Common Misconceptions About Normal Distribution
Several widely held beliefs about the normal distribution are simply wrong. Knowing these misconceptions will make you a more careful and accurate statistician.
"All data is normally distributed"
Many real datasets are skewed: income, website traffic, social media engagement, and survival times are rarely normal.
✓ Reality: Normal distribution is a mathematical model. Always check your data rather than assuming normality.
"Normal means typical or good"
In statistics, "normal" is a technical term for this specific bell-curve shape — not a value judgment.
✓ Reality: A dataset of disease severity scores can follow a normal distribution. The word describes shape, not desirability.
"The tails of the bell curve reach zero"
Many students draw the tails touching the x-axis. Mathematically, the normal distribution has infinite support.
✓ Reality: Tails extend to ±∞ and never reach zero — they only approach zero asymptotically.
"Large samples are always normally distributed"
The CLT applies to sample means, not raw data. Individual observations from a skewed population stay skewed.
✓ Reality: Averaging many observations produces a normally distributed mean — but that does not change the shape of the underlying data.
"Bell-shaped = normal distribution"
The t-distribution, Cauchy distribution, and others are also bell-shaped but are not normal distributions.
✓ Reality: Normal distribution has a specific mathematical form. Bell-shaped appearance is necessary but not sufficient.
"A significant normality test means you cannot use parametric tests"
With large samples, normality tests detect trivially small deviations that are practically meaningless.
✓ Reality: With n > 30, the CLT makes parametric tests robust to moderate non-normality. Context and visual inspection matter more.
Normal Distribution in Excel, Python & R
Every major data analysis platform has built-in functions for working with the normal distribution. Here are the most useful commands.
Normal Distribution in Excel
=NORM.DIST(x, mean, standard_dev, TRUE) ' CDF: P(X ≤ x)
=NORM.DIST(x, mean, standard_dev, FALSE) ' PDF: density at x
=NORM.INV(probability, mean, standard_dev) ' Inverse CDF: find x from probability
=NORM.S.DIST(z, TRUE) ' Standard normal CDF: P(Z ≤ z)
=NORM.S.INV(probability) ' Standard normal inverse CDF
Example: =NORM.DIST(115, 100, 15, TRUE) returns 0.8413 — the probability that an IQ score is ≤ 115.
Normal Distribution in Python (SciPy)
from scipy import stats
import matplotlib.pyplot as plt
import numpy as np
mu, sigma = 100, 15 # IQ example
# PDF and CDF
x = np.linspace(mu - 4*sigma, mu + 4*sigma, 300)
pdf_vals = stats.norm.pdf(x, loc=mu, scale=sigma)
cdf_vals = stats.norm.cdf(x, loc=mu, scale=sigma)
# Probability that IQ < 115
p = stats.norm.cdf(115, loc=100, scale=15) # → 0.8413
# Inverse CDF (ppf): the Z-score at the 95th percentile
z = stats.norm.ppf(0.95, loc=0, scale=1) # → 1.6449
# Generate random normal samples
samples = stats.norm.rvs(loc=mu, scale=sigma, size=1000)
# Plot the bell curve
plt.plot(x, pdf_vals, color='#4f46e5', linewidth=2)
plt.fill_between(x, pdf_vals, alpha=0.2, color='#4f46e5')
plt.title('Normal Distribution — IQ Scores')
plt.show()
Normal Distribution in R
# PDF: probability density at x
dnorm(x = 115, mean = 100, sd = 15) # → 0.01613
# CDF: P(X ≤ x)
pnorm(q = 115, mean = 100, sd = 15) # → 0.8413
# Inverse CDF: find x given probability
qnorm(p = 0.95, mean = 100, sd = 15) # → 124.67
# Generate random normal samples
rnorm(n = 1000, mean = 100, sd = 15)
# Normality testing
shapiro.test(x = your_data) # Shapiro-Wilk test
ks.test(your_data, "pnorm", mean(your_data), sd(your_data)) # K-S test
# Q-Q plot for visual normality check
qqnorm(your_data); qqline(your_data, col = "#4f46e5")
Key Statistics & Summary Table
A consolidated reference for the most important numerical facts and formulas related to the normal distribution.
| Fact / Formula | Value / Expression |
|---|---|
| PDF formula | f(x) = (1/(σ√(2π))) · exp(−(x−μ)²/(2σ²)) |
| Standard normal Z-score | Z = (X − μ) / σ |
| Mean | μ (any real number) |
| Variance | σ² |
| Skewness | 0 |
| Kurtosis (excess) | 0 (total kurtosis = 3) |
| Within 1σ | 68.27% of data |
| Within 2σ | 95.45% of data |
| Within 3σ | 99.73% of data |
| Within 4σ | 99.994% of data |
| Six Sigma (±6σ) | 3.4 defects per million (99.9999998%) |
| Total area under curve | 1.0 |
| Mode = Median = Mean | All equal μ |
| Moment generating function | M(t) = exp(μt + σ²t²/2) |
| Standard normal at Z=0 | f(0) = 1/√(2π) ≈ 0.3989 |
Summary: Normal Distribution at a Glance
The normal distribution (Gaussian distribution / bell curve) is the cornerstone of applied statistics because of its mathematical elegance, its prevalence in nature, and the power of the Central Limit Theorem. Understanding it unlocks the logic behind the majority of statistical tests used in research, quality control, finance, and data science.
The key ideas to carry forward: any normal distribution is fully defined by its mean μ and standard deviation σ; the 68-95-99.7 rule gives you instant intuition about spread; Z-scores let you standardize and compare across different scales; and formal normality testing should always accompany visual checks.
For deeper study, explore Statistics and Probability on Statistics Fundamentals, including related topics such as hypothesis testing, random variables, and the Z-table.
Related Topics
- Z-Table (Standard Normal Table): Look up cumulative probabilities for any Z-score using the standard normal distribution table.
- T-Distribution Table: Find t-critical values for hypothesis testing when the population standard deviation is unknown.
- Hypothesis Testing: Apply normal distribution to t-tests, Z-tests, and p-value interpretation in hypothesis testing.
- Probability Calculator: Calculate probabilities for various distributions including normal, binomial, and Poisson.
- Random Variables: Understand continuous and discrete random variables, the foundation for all probability distributions.
- Descriptive Statistics: Master mean, median, standard deviation, and variance, the building blocks of normal distribution.