Why Use Excel for Statistics?
Microsoft Excel is the most widely used data analysis tool in the world. Despite its reputation as a basic spreadsheet application, Excel has powerful built-in statistical functions that can handle most analyses a business professional or student will need:
Excel includes a powerful Analysis ToolPak add-in that provides point-and-click access to 19 statistical analysis tools including regression, ANOVA, t-tests, and more.
Enable Analysis ToolPak: File → Options → Add-ins → Excel Add-ins → Manage → Go → Check "Analysis ToolPak" → OK
Descriptive Statistics in Excel
Excel has dedicated functions for every key descriptive statistic.
Central Tendency Functions
| Statistic | Excel Formula | Example |
| Mean |
AVERAGE(range) |
=AVERAGE(A1:A20) |
| Median |
MEDIAN(range) |
=MEDIAN(A1:A20) |
| Mode |
MODE(range) |
=MODE(A1:A20) |
| Geometric Mean |
GEOMEAN(range) |
=GEOMEAN(A1:A20) |
Variability Functions
| Statistic | Excel Formula | Note |
| Sample Std Dev |
STDEV(range) or STDEV.S(range) |
Divides by n−1 |
| Population Std Dev |
STDEVP(range) or STDEV.P(range) |
Divides by n |
| Sample Variance |
VAR(range) or VAR.S(range) |
Square of STDEV.S |
| Range |
=MAX(range)-MIN(range) |
No dedicated function |
| Quartile 1 |
QUARTILE(range,1) |
25th percentile |
| Quartile 3 |
QUARTILE(range,3) |
75th percentile |
Hands-On Example: Analyzing Test Scores
Enter these scores in cells A1:A10: 78, 85, 92, 65, 88, 74, 90, 82, 78, 95
=AVERAGE(A1:A10) → 82.7
=MEDIAN(A1:A10) → 83.5
=MODE(A1:A10) → 78
=STDEV.S(A1:A10) → 9.67
=MAX(A1:A10)-MIN(A1:A10) → 30
Descriptive Statistics ToolPak
For a complete summary in one click:
1
Go to Data tab → Data Analysis
2
Select Descriptive Statistics → OK
3
Set Input Range to your data, check Summary Statistics, click OK
Excel will output: Mean, Standard Error, Median, Mode, Std Dev, Variance, Range, Min, Max, Sum, Count, Confidence Level.
Calculating Mean, Median & Mode Step-by-Step
Step 1: Enter Your Data
Enter your data in a column (e.g., A1:A20). Label column A as "Data" in cell A1 for clarity.
Step 2: Set Up Your Summary Table
In a new area (e.g., C1:D5), create this layout:
| Cell C | Cell D |
| Mean | =AVERAGE(A2:A20) |
| Median | =MEDIAN(A2:A20) |
| Mode | =MODE.SNGL(A2:A20) |
| Std Dev | =STDEV.S(A2:A20) |
| Count | =COUNT(A2:A20) |
Tip: Use MODE.MULT() instead of MODE.SNGL() if you expect multiple modes. It returns an array of all modes, entered as an array formula (Ctrl+Shift+Enter in older Excel versions).
Running a t-Test in Excel
Method 1: T.TEST Function
Quick formula approach:
=T.TEST(array1, array2, tails, type)
- array1, array2: Your two datasets
- tails: 1 = one-tailed, 2 = two-tailed
- type: 1 = paired, 2 = two-sample equal variance, 3 = two-sample unequal variance
Example: Two-Sample t-Test
Group A scores (A1:A10): 78, 82, 85, 90, 74, 88, 79, 83, 91, 86
Group B scores (B1:B10): 70, 75, 68, 82, 74, 71, 79, 73, 77, 80
=T.TEST(A1:A10, B1:B10, 2, 3)
Result: p-value = 0.012. Since 0.012 < 0.05, we reject H₀ — the groups have significantly different means.
Method 2: Analysis ToolPak t-Test
1
Data tab → Data Analysis
2
Select "t-Test: Two-Sample Assuming Unequal Variances"
3
Set Variable 1 Range, Variable 2 Range, Alpha (0.05), and Output location
4
Read the output: t-Stat, p-value (one-tail), P(T≤t) two-tail, t Critical values
Correlation & Regression in Excel
Pearson Correlation
=CORREL(array1, array2) ' Returns Pearson's r (-1 to +1)
Example
Hours studied in A1:A10, exam scores in B1:B10:
=CORREL(A1:A10, B1:B10)
Result: r = 0.87 → Strong positive correlation between study hours and scores.
Simple Linear Regression
The LINEST function returns regression statistics as an array:
=LINEST(y_range, x_range, TRUE, TRUE) ' Enter as array formula
Or use the scatter plot approach:
1
Select your X and Y data → Insert → Scatter Chart
2
Click on any data point → Add Trendline
3
Choose Linear → Check "Display Equation" and "Display R² Value"
Analysis ToolPak Regression: Data → Data Analysis → Regression → Set Y Input Range and X Input Range → Check "Residuals" and "Normal Probability" → OK. This gives the most complete regression output including p-values for each coefficient.
Complete Worked Example
Let's analyze sales data for a retail store. We have 10 weeks of data with advertising spend (X) and sales (Y):
| Week | Ad Spend ($) | Sales ($) | Formula (Excel) |
| 1 | 500 | 12,000 |
=AVERAGE(C2:C11)
=STDEV.S(C2:C11)
=CORREL(B2:B11,C2:C11)
=T.TEST(B2:B11,C2:C11,2,3)
|
| 2 | 750 | 15,500 |
| 3 | 1,000 | 18,200 |
| 4 | 800 | 16,800 |
| 5 | 1,200 | 21,500 |
| 6 | 600 | 13,400 |
| 7 | 950 | 17,600 |
| 8 | 1,100 | 20,100 |
| 9 | 700 | 14,200 |
| 10 | 1,300 | 23,000 |
Expected Results:
- Mean Sales: $17,230
- Std Dev Sales: $3,508
- Correlation (r): 0.997 — nearly perfect positive correlation
- Regression: Sales ≈ 4,000 + 14.5 × (Ad Spend)