What is a five number summary in statistics?

The five number summary is a set of five descriptive statistics that together describe a dataset's distribution: the minimum value, first quartile (Q1), median, third quartile (Q3), and maximum value. It gives a complete picture of data spread in a compact format and forms the basis of box-and-whisker plots.

What are the five values in a five number summary?

The five values are: (1) Minimum — the smallest observation in the dataset; (2) Q1 — the first quartile or 25th percentile; (3) Median — the middle value or 50th percentile; (4) Q3 — the third quartile or 75th percentile; and (5) Maximum — the largest observation in the dataset.

Can the five number summary detect outliers?

Yes. Using the IQR method, any value below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR is flagged as a potential outlier. In box plots, these outliers appear as individual points beyond the whiskers. This method is more resistant to extreme values than outlier detection based on the mean and standard deviation.

What is the difference between range and five number summary?

The range (Maximum − Minimum) is a single number that tells you the total spread of the data. The five number summary gives much more information — it shows how data is distributed within that range, where the center sits, and how spread out the middle 50% of values are. One extreme value can distort the range, but the five number summary remains interpretable because Q1 and Q3 are resistant to outliers.

Five Number Summary Explained with Quartiles & Box Plots Guide

Q: How do you calculate a five number summary step by step?

Step 1: Sort your data from smallest to largest. Step 2: Identify the minimum (first value) and maximum (last value). Step 3: Find the median — the middle value for odd-count datasets, or the average of the two middle values for even-count datasets. Step 4: Find Q1 as the median of all values below the overall median. Step 5: Find Q3 as the median of all values above the overall median.

Q: How does a box plot use a five number summary?

A box plot (box-and-whisker plot) maps each of the five values directly onto a visual: the left whisker extends to the minimum, the left edge of the box marks Q1, the line inside the box marks the median, the right edge of the box marks Q3, and the right whisker extends to the maximum. The box itself spans the IQR (Q3 − Q1), showing where the middle 50% of data falls.

Q: What do Q1 and Q3 mean in a five number summary?

Q1 (first quartile) marks the 25th percentile — 25% of the data falls at or below this value. Q3 (third quartile) marks the 75th percentile — 75% of the data falls at or below this value. Together, Q1 and Q3 define the interquartile range (IQR = Q3 − Q1), which measures the spread of the middle 50% of observations.

What Is a Five Number Summary?

Definition — Descriptive Statistics Summary

The five number summary is a statistical description of a dataset using five values: minimum, first quartile (Q1), median, third quartile (Q3), and maximum. It shows how data is distributed across its full range and is the foundation of box-and-whisker plots.

{ Min, Q1, Median, Q3, Max }

Think of the five number summary as five boundary markers placed along your data. The minimum and maximum mark the outer edges. The median marks the exact center. Q1 and Q3 mark the boundaries of the middle 50% of observations. Together, those five positions give a complete picture of where data concentrates, where it spreads out, and whether the distribution leans in either direction.

The method was formalized by statistician John Tukey in his 1977 book Exploratory Data Analysis, alongside the box plot he invented to display it visually. Tukey argued that understanding a dataset requires looking at its entire shape — not just its average. The National Institute of Standards and Technology confirms this perspective: the NIST/SEMATECH e-Handbook of Statistical Methods lists the five-number summary as a core tool for initial data exploration. For a broader grounding in the field, Statistics Fundamentals covers every major concept from data types to regression.

⚡ Quick Reference — Five Number Summary Key Facts

Five values: Minimum, Q1 (25th percentile), Median (50th percentile), Q3 (75th percentile), Maximum
First step: Always sort data in ascending order before calculating anything
IQR: Q3 − Q1 measures the spread of the middle 50% and is resistant to outliers
Box plot connection: Each of the five values maps directly onto a specific part of a box-and-whisker plot
Outlier rule: Values below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR are potential outliers
Invented by: John Tukey (1977) as part of Exploratory Data Analysis methodology

The Five Core Values Explained

Each value in the five number summary marks a specific position in your sorted dataset. Position matters more than magnitude here — a value of 45 is only meaningful as "the minimum" or "Q1" once you know where it sits relative to everything else.

Min (Minimum): The smallest value. Marks the start of your data's range.
Q1 (1st Quartile): The 25th percentile. 25% of observations fall at or below this point.
Med (Median): The 50th percentile. Half the data is below, half above.
Q3 (3rd Quartile): The 75th percentile. 75% of observations fall at or below this point.
Max (Maximum): The largest value. Marks the end of your data's range.

Minimum and Maximum: Data Boundaries

The minimum is the smallest observation in your dataset; the maximum is the largest. Together they define the full range (Max − Min). Both are simple to identify once data is sorted — the minimum is the first value, the maximum is the last. Their limitation is sensitivity to outliers: a single extreme observation can make the range misleading. That is why the five number summary pairs them with Q1 and Q3, which are far more resistant to extreme values.

Median: The True Center of Data

The median is the value that divides the sorted dataset exactly in half. For datasets with an odd number of observations, the median is the single middle value. For an even number, it is the arithmetic mean of the two middle values. Because the median depends only on rank order, not on the actual distances between values, making the largest observation far larger has no effect on it. This makes the median a more reliable measure of center than the mean when data contains outliers or is skewed. The median guide on this site covers the calculation in full detail, and the broader descriptive statistics section places it in context alongside other summary measures.
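A minimal sketch of that robustness, using Python's standard `statistics` module. The values are hypothetical and chosen only for illustration: the second list is the first with its largest value made extreme.

```python
from statistics import mean, median

# Hypothetical values, for illustration only.
values = [8, 12, 15, 18, 22]
stretched = [8, 12, 15, 18, 220]  # same data, largest value made extreme

median_before, median_after = median(values), median(stretched)
mean_before, mean_after = mean(values), mean(stretched)

print(median_before, median_after)  # 15 15
print(mean_before, mean_after)      # 15 54.6
```

The median stays at 15 because the middle rank is unchanged, while the mean moves from 15 to 54.6.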

Q1 and Q3: The Quartile Boundaries

The quartiles split sorted data into four groups of roughly equal size. Q1 (the first quartile) is the median of the lower half of the data, that is, all values below the overall median. Q3 (the third quartile) is the median of the upper half, all values above the overall median. This definition is often called the exclusive method: when the number of observations is odd, the overall median itself is left out of both halves before Q1 and Q3 are computed.

💡

The Quartile Splitting Rule

When finding Q1 and Q3, exclude the overall median before splitting. If your dataset has 11 values and the median is the 6th, the lower half for Q1 is values 1–5 and the upper half for Q3 is values 7–11. For an even-count dataset of 12 values with two middle values, the lower half for Q1 is values 1–6 and the upper half for Q3 is values 7–12.

How to Calculate a Five Number Summary: Step by Step

📋

The 5-Step Method — Works for Any Dataset

Step 1: Sort all values in ascending order. Step 2: Identify the minimum (first) and maximum (last) values. Step 3: Find the median (middle value, or average of two middle values). Step 4: Find Q1 as the median of the lower half. Step 5: Find Q3 as the median of the upper half.
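The five steps above can be sketched in Python as follows. This is an illustrative implementation of the median-of-halves rule described in this guide, not a library routine; the function name `five_number_summary` and the fallback for a single-value dataset are assumptions made for the example.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)                     # Step 1: sort ascending
    if not data:
        raise ValueError("need at least one value")
    n = len(data)
    half = n // 2
    lower = data[:half]                       # values below the median position
    # When n is odd, skip the median itself; when n is even, take the top half.
    upper = data[half + 1:] if n % 2 else data[half:]
    # Assumption: with a single value, Q1 and Q3 fall back to that value.
    q1 = median(lower) if lower else data[0]  # Step 4
    q3 = median(upper) if upper else data[-1] # Step 5
    return data[0], q1, median(data), q3, data[-1]  # Steps 2 and 3

exam_scores = [45, 52, 55, 60, 63, 65, 70, 72, 75, 80, 85, 90]
wait_times = [8, 12, 15, 18, 22, 27, 31, 38, 45]

print(five_number_summary(exam_scores))  # (45, 57.5, 67.5, 77.5, 90)
print(five_number_summary(wait_times))   # (8, 13.5, 22, 34.5, 45)
```

The two datasets are the exam scores and patient wait times worked through by hand in this guide, so the printed tuples can be compared with those results.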

Worked Example 1 — Even Dataset (n = 12)

Exam scores: 45, 52, 55, 60, 63, 65, 70, 72, 75, 80, 85, 90. Find the five number summary.

Sort ascending: 45, 52, 55, 60, 63, 65, 70, 72, 75, 80, 85, 90 ✓ already sorted. n = 12.

Minimum and Maximum: Min = 45 | Max = 90

Median (n = 12, even): Middle two values are the 6th and 7th: 65 and 70. Median = (65 + 70) / 2 = 67.5

Q1 — lower half: Values below median: 45, 52, 55, 60, 63, 65. The median of these 6 values = (55 + 60) / 2 = Q1 = 57.5

Q3 — upper half: Values above median: 70, 72, 75, 80, 85, 90. The median of these 6 values = (75 + 80) / 2 = Q3 = 77.5

✓ Five number summary: Min = 45 | Q1 = 57.5 | Median = 67.5 | Q3 = 77.5 | Max = 90 | IQR = 77.5 − 57.5 = 20

Minimum: 45 | Q1 (25th %ile): 57.5 | Median (50th %ile): 67.5 | Q3 (75th %ile): 77.5 | Maximum: 90

The data card below shows how to read this result meaningfully:

📊 Interpretation — Exam Score Dataset

What the five numbers actually reveal

The bottom quarter of students scored between 45 and 57.5. Half scored below 67.5. The middle 50% of scores spanned from 57.5 to 77.5, an IQR of 20 points, so the mid-range scores are fairly tightly grouped relative to the overall range of 45. The gap between the median (67.5) and the maximum (90) is 22.5 points, the same as the gap between the median and the minimum (45), and the median sits 10 points from each quartile. The summary therefore shows no sign of skew: the scores are spread about evenly on both sides of the center.

Worked Example 2 — Odd Dataset (n = 9)

Patient wait times (minutes): 8, 12, 15, 18, 22, 27, 31, 38, 45

Sort: 8, 12, 15, 18, 22, 27, 31, 38, 45 ✓ n = 9

Min = 8 | Max = 45

Median (n = 9, odd): 5th value = 22

Q1 — lower half (exclude median): 8, 12, 15, 18. Median = (12 + 15) / 2 = Q1 = 13.5

Q3 — upper half (exclude median): 27, 31, 38, 45. Median = (31 + 38) / 2 = Q3 = 34.5

✓ Five number summary: Min = 8 | Q1 = 13.5 | Median = 22 | Q3 = 34.5 | Max = 45 | IQR = 21

How the Five Number Summary Connects to Box Plots

A box-and-whisker plot converts the five number summary into a visual. Every element of the box plot corresponds directly to one of the five values:

Box Plot Structure — Five Number Summary Mapping

Box plot built from the exam score example (n = 12). The box spans Q1 to Q3; the line inside marks the median; whiskers extend to the minimum and maximum.

Mapping Each Value to Its Visual Element

Five Number Summary Value	Box Plot Element	What It Shows
Minimum	Left whisker end (tip)	The smallest data point (the smallest non-outlier when outliers are plotted separately)
Q1 (First Quartile)	Left edge of the box	Lower boundary of the middle 50%
Median	Line inside the box	The center of the data
Q3 (Third Quartile)	Right edge of the box	Upper boundary of the middle 50%
Maximum	Right whisker end (tip)	The largest data point (the largest non-outlier when outliers are plotted separately)

When outliers exist, some box plot implementations shorten the whiskers so they extend only to the most extreme data values that lie within 1.5 × IQR of each quartile. In that version, outliers appear as individual dots beyond the whiskers. This is the modified Tukey box plot, and it is the default in most statistics software. The data visualization guide on this site covers the full range of chart types for displaying distributions.
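As a rough sketch of how a modified box plot picks its whisker ends, the function below keeps the values inside the 1.5 × IQR fences and reports the rest as outlier points. The function name, the `k` parameter, and the dataset (the 12 exam scores plus one hypothetical score of 15) are assumptions for illustration; plotting libraries may treat values exactly on a fence differently.

```python
def whiskers_and_outliers(values, q1, q3, k=1.5):
    """Return (low whisker, high whisker, outliers) for a modified box plot."""
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    # Whiskers stop at the most extreme data values still inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if not low_fence <= v <= high_fence)
    return min(inside), max(inside), outliers

# Exam scores plus one hypothetical low score of 15 (13 values in total).
scores = [15, 45, 52, 55, 60, 63, 65, 70, 72, 75, 80, 85, 90]
# Q1 = 53.5 and Q3 = 77.5 come from the median-of-halves rule on these 13 values.
low, high, outliers = whiskers_and_outliers(scores, q1=53.5, q3=77.5)
print(low, high, outliers)  # 45 90 [15]
```

Here the lower fence is 53.5 − 1.5 × 24 = 17.5, so the left whisker stops at 45 and the score of 15 is drawn as a separate point.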

Interquartile Range (IQR) and Why It Matters

📐 IQR Formula

IQR = Q3 − Q1

The IQR measures the spread of the middle 50% of a dataset. Because it relies only on the central half of the data, it is unaffected by extreme values at either end — making it the preferred measure of spread when data contains outliers or is asymmetric.

In the exam score example, IQR = 77.5 − 57.5 = 20. That single number answers the question: "How spread out are typical scores?" A student in the 25th percentile scored roughly 20 points below a student in the 75th percentile — a meaningful gap, but not an extreme one.

Using IQR for Outlier Detection

The 1.5 × IQR rule, also developed by Tukey, defines outlier boundaries directly from the five number summary. The method is described in Penn State's STAT 200 course materials and is the default rule for drawing box plot whiskers in software including R, Python's matplotlib (which pandas uses for plotting), and SPSS.

Outlier Detection — Tukey's 1.5 × IQR Rule

Lower fence = Q1 − 1.5 × IQR

Upper fence = Q3 + 1.5 × IQR

Any observation outside these fences is a potential outlier

IQR = Q3 − Q1

1.5 × IQR = standard multiplier for mild outliers

3 × IQR = used for extreme outliers

For the exam score dataset: Lower fence = 57.5 − 1.5(20) = 57.5 − 30 = 27.5. Upper fence = 77.5 + 1.5(20) = 77.5 + 30 = 107.5. Since all scores fall between 45 and 90, the dataset contains no outliers. If a student had scored 15, that value (below the lower fence of 27.5) would appear as an outlier dot in a box plot.
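The fence arithmetic above can be checked with a short sketch. The function names `tukey_fences` and `classify` are hypothetical; the "mild" and "extreme" labels follow the 1.5 × IQR and 3 × IQR multipliers listed above.

```python
def tukey_fences(q1, q3):
    """Return the inner (1.5 x IQR) and outer (3 x IQR) fences as (low, high) pairs."""
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3 * iqr, q3 + 3 * iqr)
    return inner, outer

def classify(value, q1, q3):
    """Label a value relative to the fences built from q1 and q3."""
    inner, outer = tukey_fences(q1, q3)
    if value < outer[0] or value > outer[1]:
        return "extreme outlier"
    if value < inner[0] or value > inner[1]:
        return "mild outlier"
    return "not an outlier"

# Quartiles from the exam score example: Q1 = 57.5, Q3 = 77.5.
inner, outer = tukey_fences(57.5, 77.5)
print(inner)                     # (27.5, 107.5)
print(outer)                     # (-2.5, 137.5)
print(classify(15, 57.5, 77.5))  # mild outlier
print(classify(45, 57.5, 77.5))  # not an outlier
```

This version judges a new value against fences computed from the original 12 scores; recomputing the quartiles with the new value included would shift the fences slightly.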

Compare this to outlier detection using z-scores, which leans on the mean and standard deviation and therefore works best when data is roughly normal. The IQR method makes no such assumption, so it works on any distribution shape.

How to Read and Interpret the Five Number Summary

Computing the five values is straightforward. Reading what they mean together takes practice. Three questions guide interpretation:

Is the Distribution Symmetric or Skewed?

Check the distance between the median and each quartile. In a symmetric distribution, the median sits at roughly the same distance from Q1 as it does from Q3. If the median is much closer to Q1 than to Q3, the distribution is right-skewed — a long tail stretches toward larger values. If the median is closer to Q3 than to Q1, the distribution is left-skewed.

Case Study: Reading Skewness from the Five Number Summary

Software engineer salaries at a tech company (n = 40)

Five number summary: Min = $72,000 | Q1 = $95,000 | Median = $108,000 | Q3 = $145,000 | Max = $380,000

The median ($108k) is much closer to Q1 ($95k — gap of $13k) than to Q3 ($145k — gap of $37k). The right whisker extends far beyond the box to $380k. This is classic right skew: most engineers cluster in the $95k–$145k range, but a few senior principals or executives earn dramatically more. The mean salary would be pulled upward by those extreme values, which is precisely why the median is a more accurate representation of what a "typical" engineer earns at this company.
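The comparison of the two quartile gaps can be expressed as a small check. The sketch below divides the difference between the gaps by the IQR, a ratio usually called Bowley's (quartile) skewness. The function name `skew_hint` and the 0.10 tolerance are arbitrary assumptions for illustration, not a standard threshold.

```python
def skew_hint(q1, med, q3, tolerance=0.10):
    """Describe skew from the quartiles by comparing the gaps on each side of the median."""
    iqr = q3 - q1
    if iqr == 0:
        return "no spread in the middle 50%"
    lower_gap = med - q1
    upper_gap = q3 - med
    ratio = (upper_gap - lower_gap) / iqr  # ranges from -1 to 1
    if ratio > tolerance:
        return "right-skewed"
    if ratio < -tolerance:
        return "left-skewed"
    return "roughly symmetric"

# Salary case study: Q1 = 95,000, median = 108,000, Q3 = 145,000.
print(skew_hint(95_000, 108_000, 145_000))  # right-skewed
```

For the salary data the gaps are 13,000 and 37,000 against an IQR of 50,000, giving a ratio of 0.48, well to the right-skewed side.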

Assessing Spread with IQR vs. Range

The range (Max − Min) describes total spread but a single extreme value can inflate it dramatically. The IQR describes spread for the central 50% of data and is unaffected by outliers. When those two measures diverge sharply — a large range alongside a modest IQR — that is strong evidence of outliers or a heavy-tailed distribution.

⚠️

When Range Can Mislead

A dataset of 10, 11, 12, 13, 14, 15, 95 has Range = 85 and IQR = 4 (Q1 = 11 and Q3 = 15 by the median-of-halves method). The range of 85 suggests enormous spread, but the IQR of 4 reveals that the middle of the data is tightly packed; six of the seven values sit within a 5-point window. The outlier of 95 has inflated the range without reflecting where the actual data sits. Always check IQR alongside range.
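The same effect can be seen by adding one extreme value to a dataset and watching how each measure reacts. The values below are hypothetical and separate from the example above; the helper `spread` is an illustrative name and uses the median-of-halves rule for the quartiles.

```python
from statistics import median

def spread(values):
    """Return (range, IQR), with quartiles found by the median-of-halves rule."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return data[-1] - data[0], median(upper) - median(lower)

# Hypothetical values, for illustration only.
clean = [20, 21, 22, 23, 24, 25, 26, 27]
with_outlier = clean + [200]

print(spread(clean))         # (7, 4.0)
print(spread(with_outlier))  # (180, 5.0)
```

One added value multiplies the range by more than 25 while the IQR moves from 4 to 5.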

Real-World Case Studies

🎓

Education: Standardized Test Scores

The five number summary reveals whether a school's distribution is driven by a few high achievers or whether competency is broadly distributed. A narrow IQR with a high median indicates consistent, broad achievement; a wide IQR signals polarization between high and low performers.

💰

Finance: Income Distribution

Income data is almost always right-skewed. The five number summary shows this visually — the gap between Q3 and the maximum far exceeds the gap between Q1 and the minimum. Policymakers use IQR-based analysis to measure inequality without letting extreme incomes distort the picture.

🏥

Medicine: Clinical Trial Data

Reporting the five number summary alongside the mean and standard deviation is standard practice in clinical research. The American Journal of Medicine and other peer-reviewed journals require it for skewed clinical variables. Box plots allow immediate comparison of two treatment arms.

🏭

Manufacturing: Quality Control

In production settings, the IQR defines the "typical" variation band. A batch of components with a small IQR is more consistent than one with a large IQR — even if both batches share the same median. Quality engineers use the five number summary to flag batches for rejection before formal hypothesis tests are run.

🏃

Sports Analytics

Player performance metrics are compared using the five number summary to distinguish consistent performers (small IQR, median close to mean) from volatile performers (large IQR, wide whiskers). This helps coaches make decisions based on reliability rather than peak performance alone.

Five Number Summary in Excel, Python, and R

In Microsoft Excel

Excel does not have a single FIVENUMBERSUMMARY function, but each value can be computed individually. Use these formulas on data in column A (A1:A50 in this example):

          =MIN(A1:A50)             // Minimum
          =QUARTILE.INC(A1:A50,1)  // Q1 (25th percentile)
          =MEDIAN(A1:A50)          // Median
          =QUARTILE.INC(A1:A50,3)  // Q3 (75th percentile)
          =MAX(A1:A50)             // Maximum
          =QUARTILE.INC(A1:A50,3)-QUARTILE.INC(A1:A50,1)  // IQR

💡

QUARTILE.INC vs. QUARTILE.EXC

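QUARTILE.INC and QUARTILE.EXC both interpolate between sorted values rather than taking the median of each half. QUARTILE.INC places the pth quantile at position (n − 1)p + 1, and QUARTILE.EXC places it at position (n + 1)p. Neither is guaranteed to reproduce the hand method used in this guide: for the 12 exam scores above, QUARTILE.INC returns Q1 = 58.75 and Q3 = 76.25, QUARTILE.EXC returns Q1 = 56.25 and Q3 = 78.75, and the hand method gives 57.5 and 77.5. The differences are small and shrink as the dataset grows. QUARTILE.INC uses the same interpolation as the defaults in NumPy and R's summary(), so it is a reasonable default; state which function you used when reporting quartiles.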

In Python (pandas + NumPy)

          import numpy as np
          import pandas as pd

          data = [45, 52, 55, 60, 63, 65, 70, 72, 75, 80, 85, 90]

          # Using NumPy. np.percentile interpolates linearly by default, so for this
          # data it gives Q1 = 58.75 and Q3 = 76.25 rather than the 57.5 and 77.5
          # found by the median-of-halves hand method.
          q1 = np.percentile(data, 25)
          median = np.median(data)
          q3 = np.percentile(data, 75)
          iqr = q3 - q1
          print(f"Min: {min(data)} | Q1: {q1} | Med: {median} | Q3: {q3} | Max: {max(data)} | IQR: {iqr}")

          # Using pandas: describe() reports min, 25%, 50%, 75%, and max,
          # plus count, mean, and std
          s = pd.Series(data)
          print(s.describe())
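Quartile conventions differ between tools, so it can help to see several side by side. The sketch below uses only Python's standard `statistics` module: `quantiles(..., method="inclusive")` interpolates at position (n − 1)p, which is understood to be the same rule as NumPy's default, and `method="exclusive"` interpolates at position (n + 1)p. The `by_hand` list applies the median-of-halves rule from this guide; the simple slicing used here assumes an even number of values.

```python
from statistics import median, quantiles

scores = [45, 52, 55, 60, 63, 65, 70, 72, 75, 80, 85, 90]

inclusive = quantiles(scores, n=4, method="inclusive")
exclusive = quantiles(scores, n=4, method="exclusive")

# Median-of-halves for an even-count dataset: split the sorted list in two.
half = len(scores) // 2
by_hand = [median(scores[:half]), median(scores), median(scores[half:])]

print(inclusive)  # [58.75, 67.5, 76.25]
print(exclusive)  # [56.25, 67.5, 78.75]
print(by_hand)    # [57.5, 67.5, 77.5]
```

All three agree on the median; only Q1 and Q3 shift, and by a point or two at most for this dataset.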

In R

          data <- c(45, 52, 55, 60, 63, 65, 70, 72, 75, 80, 85, 90)

          # Five number summary in one command
          fivenum(data)  # returns: Min Q1 Median Q3 Max using Tukey's hinges

          # Or use summary() for additional context
          summary(data)  # adds mean alongside the five number summary

          # Box plot: visualizes the five number summary instantly
          boxplot(data, main="Five Number Summary", horizontal=TRUE)

Five Number Summary vs. Other Descriptive Statistics

Measure	What It Tells You	Use When
Five number summary	Full distribution shape: center, spread, range, quartile positions	Exploring unknown data; comparing two groups; skewed data
Mean + SD	Center and average deviation; assumes roughly normal distribution	Data is roughly symmetric; no major outliers; parametric tests
Range only	Total spread from min to max	Quick scan; not adequate alone — misleading with outliers
IQR alone	Spread of the middle 50%	When robustness to outliers is the priority
Median alone	Typical center value	Central tendency for skewed data; insufficient without spread info

The variance and standard deviation are the appropriate measures of spread when data is symmetric and outlier-free. When data is skewed or contains extreme values, the IQR from the five number summary is the more informative choice. For data that contains outliers, the outlier detection guide explains both the IQR method and the z-score method in detail.

Concept Glossary

Concept	Symbol / Formula	Simple Definition	Common Mistake
Five number summary	{ Min, Q1, Median, Q3, Max }	Five values that describe data distribution and spread	Confusing with mean-based summaries
Minimum	Min	Smallest value in the dataset	Treating as the start of a "normal" range when it may be an outlier
Maximum	Max	Largest value in the dataset	Same as minimum — may be distorted by an outlier
Median	Q2 or M	Middle value; 50th percentile	Confusing with the arithmetic mean
First quartile	Q1	25th percentile — lower boundary of middle 50%	Forgetting to exclude the median before computing Q1
Third quartile	Q3	75th percentile — upper boundary of middle 50%	Same omission as Q1
IQR	Q3 − Q1	Spread of the middle 50% of observations	Confusing IQR with the full range
Box plot	—	Visual chart that maps all five summary values	Misreading whisker ends as absolute minimum/maximum when outliers are plotted separately
Outlier fence	Q1 − 1.5×IQR ; Q3 + 1.5×IQR	Boundaries beyond which values are flagged as outliers	Assuming all outliers are errors — they may be real, meaningful extremes
Percentile	pth percentile	The value below which p% of observations fall	Confusing percentile rank with the actual value

Common Mistakes When Calculating the Five Number Summary

Mistake	What Goes Wrong	Correct Approach
Forgetting to sort the data first	Q1, median, and Q3 all compute incorrectly; the minimum and maximum may also be wrong	Always sort all values from smallest to largest before touching any calculation
Including the median in both halves when finding Q1 and Q3	Q1 and Q3 are shifted toward the median, underestimating the true IQR	Exclude the overall median before splitting into lower and upper halves
Averaging the wrong pair of values for even-count datasets	Median and quartiles land at the wrong positions	For n values, the two middle positions are n/2 and (n/2)+1
Confusing IQR with range	Overstating spread when outliers are present	IQR = Q3 − Q1, not Max − Min. Use IQR for robust spread; Range for total extent
Treating the median as the mean	Wrong interpretation of center, especially in skewed data	The median is the middle rank; the mean is the arithmetic average — they differ in skewed datasets
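The first mistake in the table is easy to reproduce. In this sketch the scores are hypothetical and deliberately unsorted; taking the middle position without sorting returns whatever value happens to sit there.

```python
from statistics import median

# Hypothetical unsorted scores, for illustration only.
unsorted_scores = [72, 45, 90, 60, 55]
middle = len(unsorted_scores) // 2

wrong = unsorted_scores[middle]          # middle position of the raw list
right = sorted(unsorted_scores)[middle]  # middle position after sorting

print(wrong)  # 90
print(right)  # 60
print(right == median(unsorted_scores))  # True
```

Library functions such as `statistics.median` sort internally, so this error mostly appears in hand calculations and hand-rolled code.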

Frequently Asked Questions

Q: What is a five number summary in statistics?

The five number summary describes a dataset using five values: the minimum, first quartile (Q1), median, third quartile (Q3), and maximum. It gives a compact picture of how data is distributed (the center, the spread, and where data concentrates) without requiring any assumptions about the shape of the distribution. It was introduced by John Tukey in 1977 and remains one of the most widely used tools in exploratory data analysis.

Q: How do you calculate a five number summary step by step?

Step 1: Sort all values from smallest to largest. Step 2: Record the first value as the minimum and the last as the maximum. Step 3: Find the median, which is the middle value for odd-count datasets or the average of the two middle values for even-count datasets. Step 4: Find Q1 as the median of all values below the overall median (excluding the median itself). Step 5: Find Q3 as the median of all values above the overall median (also excluding it).

Q: What do Q1 and Q3 mean in a five number summary?

Q1 (first quartile) is the 25th percentile: 25% of the data falls at or below this value. Q3 (third quartile) is the 75th percentile: 75% of the data falls at or below this value. Together, Q1 and Q3 define the interquartile range (IQR = Q3 − Q1), which measures how spread out the central 50% of observations are. A narrow IQR means data is tightly clustered; a wide IQR means substantial variation in the middle of the distribution.

Q: How does a box plot use a five number summary?

A box plot maps all five values onto a single diagram: the left whisker extends to the minimum; the left edge of the box marks Q1; the line inside the box marks the median; the right edge of the box marks Q3; and the right whisker extends to the maximum. The box (from Q1 to Q3) covers the middle 50% of the data, and its width gives an immediate visual impression of the IQR. When outliers are present, whiskers are drawn only to the last non-outlier value, and extreme points appear as individual dots.

Q: Can the five number summary detect outliers?

Yes. Tukey's 1.5 × IQR rule flags potential outliers using values directly from the five number summary. Any observation below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR is a candidate outlier. This method requires no assumption of normality and is more robust than z-score-based outlier detection when data is skewed. A stricter version uses 3 × IQR to identify only the most extreme values.

Q: What is the difference between range and five number summary?

Range = Max − Min, giving one number that describes total spread. The five number summary gives five numbers that describe how data is arranged within that spread. The range can be dramatically inflated by a single extreme observation, while the IQR and quartile positions in the five number summary remain stable. For any dataset with potential outliers or a non-normal distribution shape, the five number summary is far more informative than the range alone.

Q: Why do R's fivenum() and summary() give different quartiles?

R's fivenum() uses Tukey's hinges, which include the overall median in both halves when the number of observations is odd, so its Q1 and Q3 can differ from the median-of-halves method described in this guide. R's summary() uses an interpolating quartile algorithm (Type 7 by default), the same approach as QUARTILE.INC in Excel, and may differ from both. For educational purposes and textbook work, pick one method and name it when reporting results. Differences between methods are typically small and shrink as the dataset grows.

Sources and References:
Tukey, J.W. (1977). Exploratory Data Analysis. Addison-Wesley.
NIST/SEMATECH. (2012). e-Handbook of Statistical Methods: Quantile Plot. itl.nist.gov
Penn State STAT 200. Elementary Statistics: Describing Distributions with Numbers. online.stat.psu.edu
MIT OpenCourseWare. (2016). 18.650 Statistics for Applications. ocw.mit.edu
Agresti, A. & Franklin, C. (2018). Statistics: The Art and Science of Learning from Data (4th ed.). Pearson.
Moore, D.S., McCabe, G.P., & Craig, B.A. (2017). Introduction to the Practice of Statistics (9th ed.). W.H. Freeman.