Box Plot Generator
Paste from Excel, type values manually, or click Load Example to see a demo.
Enter data for each group. Each textarea accepts comma-, space-, or newline-separated values. Use the colour-coded groups to compare distributions.
Describe your box plot in plain English — mention group names, medians, ranges, and outliers. The generator parses your description and draws a polished, publication-ready SVG illustration instantly. No data entry required.
What Is a Box Plot?
A box plot, also called a box-and-whisker plot, is a standardized diagram that displays the distribution of a dataset using its five-number summary: minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum. The rectangular "box" spans from Q1 to Q3, representing the interquartile range (IQR), which covers the middle 50% of the data. A line inside the box marks the median. "Whiskers" extend outward from the box to the smallest and largest values that lie within 1.5 × IQR of Q1 and Q3, respectively. Any data points beyond those limits are plotted individually as outliers.
Box plots were introduced by statistician John Tukey and popularized through his 1977 book Exploratory Data Analysis as a tool for quickly assessing shape, spread, and symmetry in a dataset. According to the NIST Engineering Statistics Handbook, box plots are especially valuable when comparing multiple datasets side by side because they condense each distribution into a compact five-number summary while still showing center, spread, and outliers.
The Five-Number Summary
| Statistic | Formula / Position | Plain Meaning | Box Plot Location |
|---|---|---|---|
| Minimum | Smallest value ≥ lower fence | Start of left whisker (non-outlier floor) | Left/bottom whisker tip |
| Q1 (First Quartile) | Median of lower 50% | 25% of data falls below this value | Left/bottom edge of box |
| Median (Q2) | Middle value when sorted | 50% of data falls below this value | Line inside the box |
| Q3 (Third Quartile) | Median of upper 50% | 75% of data falls below this value | Right/top edge of box |
| Maximum | Largest value ≤ upper fence | End of right whisker (non-outlier ceiling) | Right/top whisker tip |
| IQR | Q3 − Q1 | Spread of middle 50% of data | Width/height of the box |
How to Calculate a Box Plot — Step by Step
1. Sort the data. Arrange every data point from smallest to largest. This is the foundation for all quartile calculations.
2. Find the median (Q2). If n is odd, the median is the middle value. If n is even, it is the average of the two middle values. This splits the dataset into a lower half and an upper half.
3. Find Q1 and Q3. Q1 is the median of the lower half (excluding the median if n is odd). Q3 is the median of the upper half. The box spans from Q1 to Q3.
4. Compute the IQR. IQR = Q3 − Q1. This is the length of the box and measures the spread of the central 50% of data.
5. Compute the fences. Lower fence = Q1 − 1.5 × IQR. Upper fence = Q3 + 1.5 × IQR. These are not drawn on the plot; they define the cutoff for outliers.
6. Draw the whiskers. The lower whisker extends to the smallest data value ≥ lower fence. The upper whisker extends to the largest data value ≤ upper fence.
7. Plot the outliers. Any values below the lower fence or above the upper fence are plotted as individual dots (or circles) beyond the whisker ends.
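The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the generator's actual code: the function name `box_plot_stats` and the sample values are assumptions made for the example, and the quartiles follow the median-of-halves rule described in the steps.

```python
from statistics import median

def box_plot_stats(values):
    """Quartiles, fences, whisker ends and outliers (median-of-halves rule)."""
    data = sorted(values)                      # sort the data
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = data[:half]                        # excludes the median when n is odd
    upper = data[half + (n % 2):]              # skips the middle element when n is odd
    q1, q2, q3 = median(lower), median(data), median(upper)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "q1": q1, "median": q2, "q3": q3, "iqr": iqr,
        "lower_fence": lower_fence, "upper_fence": upper_fence,
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": [x for x in data if x < lower_fence or x > upper_fence],
    }

# Hypothetical sample with one obvious high value.
stats = box_plot_stats([2, 4, 4, 5, 7, 8, 9, 30])
print(stats)
```

For this sample the upper fence is 15.25, so the upper whisker stops at 9 and the value 30 is reported as an outlier.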
Worked Example
Dataset: Exam scores for 15 students: 45, 52, 58, 61, 64, 67, 70, 72, 75, 78, 80, 83, 85, 88, 105
| Calculation | Value | Working |
|---|---|---|
| Sorted data | 45, 52, 58, 61, 64, 67, 70, 72, 75, 78, 80, 83, 85, 88, 105 | n = 15 |
| Median (Q2) | 72 | Position 8 of 15 |
| Q1 | 61 | Median of positions 1–7 (lower half) |
| Q3 | 83 | Median of positions 9–15 (upper half) |
| IQR | 22 | 83 − 61 = 22 |
| Lower fence | 28 | 61 − 1.5 × 22 = 28 |
| Upper fence | 116 | 83 + 1.5 × 22 = 116 |
| Lower whisker | 45 | Smallest value ≥ 28 |
| Upper whisker | 105 | Largest value ≤ 116, so 105 is not an outlier |
| Outliers | None | All values within fences |
Enter this dataset into the generator above (click Load Example) to see the box plot drawn instantly.
Understanding Quartiles and the IQR
The interquartile range (IQR) is the key measure of spread in a box plot. It covers the middle 50% of the data, which makes it far more resistant to outliers than the full range (Max − Min): a single extreme value changes the range but usually leaves Q1 and Q3 untouched. A long box means high variability in the center of the distribution; a short box indicates that the middle half of the values is tightly clustered.
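A small experiment makes the resistance to outliers concrete. The sketch below reuses the 15 exam scores from the worked example and a hypothetical data-entry error (105 typed as 1050); the helper name `spread` is an assumption for illustration, and the quartiles come from Python's standard `statistics.quantiles`.

```python
from statistics import quantiles

scores = [45, 52, 58, 61, 64, 67, 70, 72, 75, 78, 80, 83, 85, 88, 105]
corrupted = scores[:-1] + [1050]   # hypothetical typo: 105 entered as 1050

def spread(values):
    """Return the full range and the IQR of a list of numbers."""
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    return {"range": max(values) - min(values), "iqr": q3 - q1}

clean_spread = spread(scores)
typo_spread = spread(corrupted)
print("clean:    ", clean_spread)
print("corrupted:", typo_spread)
```

The range jumps from 60 to 1005 because of one bad value, while the IQR is identical in both runs.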
Quartile Calculation (Exclusive Method: Median Left Out of Both Halves)
n = 15 data points (sorted)
Q2 = value at position (n + 1)/2 = position 8 = 72
Q1 = median of lower half = positions 1–7 → position 4 = 61
Q3 = median of upper half = positions 9–15 → position 12 = 83
IQR = Q3 − Q1 = 83 − 61 = 22
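Quartile conventions differ between tools, which is a common reason two programs draw slightly different boxes for the same data. As a hedged illustration, Python's standard `statistics.quantiles` offers an "exclusive" and an "inclusive" setting. For this 15-value dataset the exclusive setting lands on positions 4, 8 and 12, matching the figures above, but the two approaches are not guaranteed to agree for every n.

```python
from statistics import quantiles

scores = [45, 52, 58, 61, 64, 67, 70, 72, 75, 78, 80, 83, 85, 88, 105]

# "exclusive" places quartile i at position i * (n + 1) / 4 in the sorted data.
exclusive = quantiles(scores, n=4, method="exclusive")
# "inclusive" treats min and max as the 0th and 100th percentiles and interpolates.
inclusive = quantiles(scores, n=4, method="inclusive")

print("exclusive:", exclusive)  # [61.0, 72.0, 83.0]
print("inclusive:", inclusive)  # [62.5, 72.0, 81.5]
```

When comparing a hand calculation with software output, it helps to check which convention the software uses before assuming an error.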
Fence & Outlier Calculation
IQR = Q3 − Q1
Lower fence = Q1 − 1.5 × IQR
Upper fence = Q3 + 1.5 × IQR
Mild outlier: more than 1.5 × IQR beyond Q1 or Q3
Extreme outlier: more than 3.0 × IQR beyond Q1 or Q3
Whisker = most extreme non-outlier value in dataset
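The mild and extreme cutoffs can be checked with a short helper. This is a sketch under stated assumptions: the function name `classify` and the test values are made up for illustration, while Q1 = 61 and Q3 = 83 are taken from the quartile calculation above.

```python
def classify(value, q1, q3):
    """Label a value 'inside', 'mild' or 'extreme' relative to the box edges."""
    iqr = q3 - q1
    if value < q1 - 3.0 * iqr or value > q3 + 3.0 * iqr:
        return "extreme"            # beyond the outer (3.0 x IQR) fences
    if value < q1 - 1.5 * iqr or value > q3 + 1.5 * iqr:
        return "mild"               # beyond the inner (1.5 x IQR) fences only
    return "inside"

q1, q3 = 61, 83                     # inner fences: 28 and 116, outer fences: -5 and 149
labels = {v: classify(v, q1, q3) for v in (20, 105, 120, 150)}  # hypothetical values
print(labels)
```

With these quartiles, 105 stays inside the fences, 20 and 120 are mild outliers, and 150 is extreme.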
Interpreting Box Plot Shape and Skewness
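The position of the median line within the box, together with the relative lengths of the whiskers, indicates the skewness of the distribution at a glance. Three patterns cover the typical cases:
- Symmetric: the median sits near the middle of the box and the whiskers are about equally long.
- Right-skewed (positive skew): the median sits closer to Q1 and the upper whisker is longer.
- Left-skewed (negative skew): the median sits closer to Q3 and the lower whisker is longer.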
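When a number is wanted to back up the visual impression, one common summary is Bowley's quartile skewness coefficient, (Q3 + Q1 − 2·Q2) / (Q3 − Q1). It is not part of the box plot itself and is introduced here only as an illustration; the function name and the second and third sets of quartiles are hypothetical.

```python
def quartile_skewness(q1, q2, q3):
    """Bowley's coefficient: 0 if the median is centred in the box,
    positive if it sits nearer Q1, negative if it sits nearer Q3."""
    if q3 == q1:
        raise ValueError("IQR is zero; the coefficient is undefined")
    return ((q3 - q2) - (q2 - q1)) / (q3 - q1)

centred = quartile_skewness(61, 72, 83)   # median in the middle of the box
right = quartile_skewness(10, 12, 20)     # median close to Q1
left = quartile_skewness(10, 18, 20)      # median close to Q3
print(centred, right, left)
```

The coefficient only looks at the box, so it should be read alongside the whisker lengths rather than instead of them.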
Box Plot vs. Histogram: When to Use Each
Both charts display distributions, but they serve different purposes. A histogram shows the full shape of a distribution using bars for frequency counts — ideal for understanding detail, detecting multimodality (two peaks), and seeing exact bin frequencies. A box plot condenses the distribution into a compact summary, making it ideal for comparing many groups side by side. Histograms are better for one-group deep analysis; box plots are better for multi-group comparison and outlier spotting.
| Feature | Box Plot | Histogram |
|---|---|---|
| Best for | Comparing multiple groups | Detailed shape of one group |
| Shows outliers | Yes — as individual points | Only indirectly (long tails) |
| Shows multimodality | No | Yes — multiple peaks visible |
| Space required | Compact — fits many groups | Needs full panel per group |
| Five-number summary | Yes — directly readable | No — must compute separately |
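The "Shows multimodality" row can be illustrated with a small made-up dataset that has two clusters. The sketch below prints the numbers a box plot would be built from, followed by a crude text histogram; the data and variable names are assumptions for the example.

```python
from collections import Counter
from statistics import quantiles

# Made-up data with two clusters (around 2 and around 9) and nothing in between.
bimodal = [1, 1, 2, 2, 2, 3, 3, 8, 8, 9, 9, 9, 10, 10]

q1, q2, q3 = quantiles(bimodal, n=4, method="exclusive")
print(f"min={min(bimodal)} Q1={q1} median={q2} Q3={q3} max={max(bimodal)}")

# Text histogram: one row per integer value, one '#' per observation.
counts = Counter(bimodal)
for value in range(min(bimodal), max(bimodal) + 1):
    print(f"{value:>2} | {'#' * counts[value]}")
```

The summary places the median at 5.5, inside a box running from 2 to 9, yet the histogram rows for 4 through 7 are empty. The box plot alone gives no hint of that gap.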
Practice Problems
Q2 = (22+24)/2 = 23 | Q1 = (15+18)/2 = 16.5 | Q3 = (28+30)/2 = 29
IQR = 29 − 16.5 = 12.5
Lower fence = 16.5 − 1.5×12.5 = −2.25 | Upper fence = 29 + 18.75 = 47.75
No outliers — all values between −2.25 and 47.75.
5 is an outlier (below 10) | 95 is an outlier (above 90) | 25 and 75 are within fences.
Sources & further reading:
- NIST Engineering Statistics Handbook — Box Plots
- Tukey, J.W. (1977). Exploratory Data Analysis. Addison-Wesley. [Origin of the box-and-whisker plot]
- Khan Academy — Box Plot Review
Frequently Asked Questions
Box plots are used to visualize the distribution of a dataset in a compact, standardized format. They are especially useful for comparing multiple groups side by side, identifying outliers, assessing skewness, and summarizing the spread and center of data without showing every individual value. Common applications include comparing test scores across classes, treatment outcomes across clinical groups, or performance metrics across business units.
Outliers are identified using the 1.5 × IQR rule. First compute IQR = Q3 − Q1. Then calculate the lower fence (Q1 − 1.5 × IQR) and upper fence (Q3 + 1.5 × IQR). Any data point below the lower fence or above the upper fence is an outlier and is plotted as an individual dot beyond the whisker. Extreme outliers fall beyond 3 × IQR from the box edges.
The length (or height, in a vertical box plot) of the rectangular box equals the IQR, that is, Q3 minus Q1. A longer box means more variability in the middle 50% of the data; a shorter box indicates that the central data points are tightly clustered. Comparing box lengths across groups drawn on the same axis is one of the fastest ways to compare variability visually.
A violin plot combines a box plot with a kernel density estimate: the width of the violin at any point represents the estimated density of data near that value. This reveals multimodal distributions (two or more peaks) that a box plot cannot show. Box plots are simpler and more widely understood; violin plots provide more information about the full shape of the distribution but require more space and explanation. For comparing many groups quickly, box plots are usually preferred.
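To show what the kernel density estimate behind a violin plot does, here is a minimal Gaussian-kernel sketch in plain Python. The helper name `make_kde`, the hand-picked bandwidth, and the two-cluster data are all assumptions for illustration; plotting libraries choose the bandwidth automatically and may use different defaults.

```python
import math

def make_kde(data, bandwidth):
    """Return a function that estimates the density at x with Gaussian kernels."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, centred on that observation.
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)

    return density

# Made-up data with two clusters; bandwidth chosen by hand.
bimodal = [1, 1, 2, 2, 2, 3, 3, 8, 8, 9, 9, 9, 10, 10]
density = make_kde(bimodal, bandwidth=1.0)

for x in (2, 5.5, 9):
    print(x, round(density(x), 3))
```

The estimated density is much lower at 5.5 than at 2 or 9, which is the "waist" a violin plot would draw between the two clusters.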
In Excel 2016 and later: select your data, go to Insert → Insert Statistic Chart → Box and Whisker. Excel will calculate the five-number summary and draw the box plot automatically. In older versions, you need to compute Q1, median, Q3 and whisker lengths manually, then build a stacked bar chart and add error bars to simulate the box. Our generator above is faster and shows the full step-by-step calculation.