May 9, 2026
BY: Statistics Fundamentals Team
Reviewed By: Minsa A (Senior Statistics Editor)

ANOVA (Analysis of Variance): The Complete Beginner's Guide

Three teaching methods. Three fertilizer types. Four production lines. When you need to compare more than two groups at once — and do it without inflating your error rate — ANOVA is the tool that handles it. This guide walks through exactly how it works and why.

You'll find the complete formula breakdown, a step-by-step worked example with real numbers, an interactive calculator, and four real-world case studies from clinical research to manufacturing. The goal is to make ANOVA feel logical before the math arrives.

What You'll Learn
  • ✓ What ANOVA is, what "analysis of variance" actually means, and when to use it
  • ✓ The three main types: one-way, two-way, and repeated measures ANOVA
  • ✓ The F-statistic, ANOVA formula, and how to read an ANOVA table
  • ✓ A full worked example with a real dataset — seven steps, every calculation shown
  • ✓ All four ANOVA assumptions and what to do when one fails
  • ✓ Post-hoc tests, effect size, p-value interpretation, and common mistakes
  • ✓ Python, R, and SPSS code for running ANOVA yourself

What Is ANOVA? (Plain-English Definition)

Definition — Analysis of Variance
ANOVA (Analysis of Variance) is a statistical test used to compare the means of three or more groups. It works by analyzing the ratio of variance between groups — differences caused by the factor being studied — to variance within groups — natural scatter inside each group. A large F-statistic means group differences exceed what random chance would produce, so we reject the null hypothesis that all group means are equal.
F = Mean Square Between ÷ Mean Square Within

At its heart, ANOVA answers one question: are these groups genuinely different, or does the variation we see fit what random chance alone would produce? The name "Analysis of Variance" can mislead newcomers who expect a test about variances. ANOVA actually tests means — but it does so by looking at how variance is distributed across and within groups.

The Statistics Fundamentals team at statisticsfundamentals.com covers ANOVA as part of the broader hypothesis testing curriculum — because ANOVA sits squarely inside the inferential statistics toolkit alongside the one-sample t-test, the two-sample t-test, and regression analysis.

⚡ Quick Reference — ANOVA Key Facts
  • Full name: Analysis of Variance — a test that compares 3+ group means using variance ratios
  • Test statistic: F = MSB ÷ MSW (between-group mean square divided by within-group mean square)
  • Decision rule: If p < α (typically 0.05), reject H₀ and conclude that groups differ
  • What it tells you: Whether any group differs — not which specific groups (you need post-hoc tests for that)
  • When to use ANOVA vs t-test: Three or more groups → ANOVA; exactly two groups → t-test
  • Non-parametric equivalent: Kruskal-Wallis test (when normality assumption fails)

Why ANOVA Is Used in Statistics

Here is the problem ANOVA solves. Suppose you want to compare exam scores from four different tutoring programs. The obvious approach is to run pairwise t-tests: program A vs. B, A vs. C, A vs. D, B vs. C, B vs. D, C vs. D. That is six separate tests.

Each t-test carries a 5% chance of a false positive at α = 0.05. If the six tests were independent, the probability of making at least one false positive would climb to roughly 1 − (0.95)⁶ ≈ 26%. You have inflated your experiment-wise Type I error rate from 5% to about 26% without testing anything new. ANOVA runs all the group comparisons in one procedure, keeping the overall error rate at α = 0.05.
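The inflation figure above is easy to check numerically. This short sketch (plain Python, no libraries; the function name `familywise_error` is illustrative) computes the familywise error rate for m independent tests:

```python
def familywise_error(alpha, m):
    # P(at least one false positive in m independent tests at level alpha)
    # = 1 - P(no false positives) = 1 - (1 - alpha)^m
    return 1 - (1 - alpha) ** m

# One test at alpha = 0.05 vs. six pairwise t-tests among four groups
print(round(familywise_error(0.05, 1), 3))   # 0.05
print(round(familywise_error(0.05, 6), 3))   # ~0.265, the ~26% quoted above
```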

ℹ️
The Core ANOVA Logic

ANOVA asks: is the variation between group averages large relative to the variation inside groups? If groups genuinely differ, their means should spread out far more than individual scores vary within any one group. That ratio — between-group signal divided by within-group noise — is the F-statistic.

This framework was developed by Ronald A. Fisher in the 1920s, formalized in his 1925 textbook Statistical Methods for Research Workers. Fisher's insight — that variance can be partitioned into explainable and unexplainable components — remains one of the foundational ideas in experimental statistics. The NIST/SEMATECH e-Handbook of Statistical Methods gives the historical derivation if you want the full theoretical context.

Types of ANOVA

ANOVA is not a single test. Three versions cover most research scenarios, and the choice depends on how many independent variables you have and whether the same subjects appear in multiple conditions.

One-Way ANOVA

One-way ANOVA tests the effect of a single independent variable (the "factor") that has three or more levels (the "groups"). The dependent variable must be continuous. Example: you grow wheat under three different fertilizer conditions and measure yield per plot. One independent variable (fertilizer type), one dependent variable (yield), three groups.

This is the most common form of ANOVA, and the one we work through completely in the step-by-step example below.

Two-Way ANOVA

Two-way ANOVA introduces a second independent variable. Now you can test three things at once: the main effect of factor A, the main effect of factor B, and their interaction — whether the effect of A depends on the level of B.

A concrete example: you measure student test scores grouped by both teaching method (lecture, flipped classroom, online) and class size (small, medium, large). Two-way ANOVA tells you whether each factor independently affects scores, and whether the best teaching method varies by class size. Without the interaction term, two separate one-way ANOVAs would miss that dependency entirely.

Repeated Measures ANOVA

Repeated measures ANOVA is for situations where the same subjects appear in every condition. Think of measuring blood pressure in the same patients at baseline, after four weeks of treatment, and after eight weeks. Because each person's scores are correlated, regular ANOVA would treat consistent individual differences as noise. Repeated measures ANOVA removes that between-subject variability from the error term, giving a more powerful test.

| Type | Independent Variables | Subjects |
|---|---|---|
| One-Way ANOVA | 1 factor, 3+ levels | Different groups |
| Two-Way ANOVA | 2 factors, tests interaction | Different groups |
| Repeated Measures ANOVA | 1+ factors, same subjects | Same subjects in all conditions |
| MANOVA | 1+ factors | Different groups, multiple DVs |

The ANOVA Formula Explained

The ANOVA formula produces one number — the F-statistic — by working through four quantities: two sums of squares, two mean squares. Each step is a ratio, and each ratio has an intuitive meaning.

ANOVA Core Formula — F-Statistic
F = MSB / MSW
Mean Square Between divided by Mean Square Within
MSB = SSB ÷ (k − 1)
MSW = SSW ÷ (N − k)
SSB = Sum of Squares Between groups
SSW = Sum of Squares Within groups
k = number of groups
N = total number of observations

Breaking Down Each Component

The table below maps each ANOVA term to what it measures. Read this before looking at any formula — the labels matter more than the arithmetic at first.

| Term | Symbol | What It Measures | Formula |
|---|---|---|---|
| Sum of Squares Between | SSB | How far group means spread from the grand mean | Σ nᵢ × (x̄ᵢ − x̄)² |
| Sum of Squares Within | SSW | How far individual scores spread from their own group mean | Σ Σ (xᵢⱼ − x̄ᵢ)² |
| Sum of Squares Total | SST | Total variation across all observations | SSB + SSW |
| Degrees of Freedom Between | df_B | Number of groups minus one | k − 1 |
| Degrees of Freedom Within | df_W | Total observations minus number of groups | N − k |
| Mean Square Between | MSB | Average between-group variance per degree of freedom | SSB ÷ df_B |
| Mean Square Within | MSW | Average within-group variance per degree of freedom | SSW ÷ df_W |
| F-statistic | F | Ratio of between-group to within-group variance | MSB ÷ MSW |

Understanding the F-Statistic

The F-statistic is a signal-to-noise ratio. The signal is between-group variance — how much the group averages differ from each other. The noise is within-group variance — how much individual scores vary inside each group regardless of the group assignment.

When F = 1, the signal equals the noise: group differences are no larger than would be expected from random variation. When F is large, the group differences are substantial relative to the background scatter, and we start suspecting something real is happening. The F-table gives the critical value that F must exceed for a given α and degrees of freedom.

F-Distribution Shape — Right-Skewed, Always Positive


The F-distribution is always positive and right-skewed. Values in the red shaded region (beyond F critical) lead to rejecting H₀.

Reading an F-Table vs. Using a P-Value

Two approaches exist for making the ANOVA decision. The classical approach compares your computed F to the critical F-value from a table, indexed by df_between, df_within, and α. The modern approach — used in every statistics package — computes the exact p-value, which is the probability of observing this F-ratio (or larger) if H₀ were true. Both give the same conclusion when used correctly.
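Both routes can be reproduced with `scipy.stats.f` (assuming SciPy is installed). The numbers here use the worked example later in this guide: F = 56.0 with 2 and 12 degrees of freedom.

```python
from scipy import stats

df_between, df_within, alpha = 2, 12, 0.05

# Classical route: look up the critical value (the "F-table" lookup)
f_crit = stats.f.ppf(1 - alpha, df_between, df_within)
print(f"Critical F(2, 12) at alpha = 0.05: {f_crit:.3f}")  # ~3.885

# Modern route: exact p-value for the observed F
# (survival function sf = 1 - CDF = P(F >= f_obs) under H0)
f_obs = 56.0
p = stats.f.sf(f_obs, df_between, df_within)
print(f"p-value for F = {f_obs}: {p:.2e}")  # well below 0.001
```

Both routes agree: F = 56.0 far exceeds the critical value, and the p-value is far below α.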

Null and Alternative Hypotheses in ANOVA

ANOVA is a formal hypothesis test, so it begins with two competing claims about the population.

ANOVA Hypotheses
H₀: μ₁ = μ₂ = μ₃ = ... = μₖ
Null hypothesis: all group population means are equal
H₁: at least one μᵢ differs from the rest
Alternative hypothesis: at least one group mean is different

One common misreading of H₁ is thinking it means "all groups differ." It does not. H₁ says at least one group mean differs from the others. That one difference is enough to reject H₀. This is why a significant ANOVA result needs post-hoc testing — the overall test only tells you that something is different, not where the difference lies.

Between-Group vs Within-Group Variance

Understanding these two components is the key to understanding why ANOVA works.

Between-Group Variance (The Signal)

Between-group variance measures how much the group means vary around the grand mean (the average of all observations combined). If the treatment or factor has a real effect, the groups will have meaningfully different averages and between-group variance will be large.

Within-Group Variance (The Noise)

Within-group variance measures how much individual observations vary within each group, regardless of group membership. This is random variation — individual differences, measurement error, anything that is not caused by the factor we are studying. It represents the noise floor.

The Core ANOVA Decision Rule in Plain English

If group means spread far apart relative to within-group scatter → F is large → p is small → reject H₀. If group means cluster together and within-group scatter is comparable → F is near 1 → p is large → fail to reject H₀.

Image Placeholder

Add a diagram here showing three overlapping bell curves (between-group spread) vs three tight bell curves (within-group variation). Suggested filename: between-within-variance-anova.png

Step-by-Step ANOVA Example (Full Calculation)

Numbers are easier to follow than formulas in isolation. Here is a complete one-way ANOVA worked from raw data to a decision, showing every calculation.

The Research Question

A soil scientist tests three fertilizer formulations (A, B, C) to see whether any produces significantly higher wheat yield. Each formulation gets applied to five separate plots, and yield (in bushels per acre) is recorded after harvest.

The Dataset

| Plot | Fertilizer A | Fertilizer B | Fertilizer C |
|---|---|---|---|
| 1 | 20 | 28 | 18 |
| 2 | 22 | 30 | 20 |
| 3 | 19 | 27 | 17 |
| 4 | 21 | 29 | 19 |
| 5 | 23 | 31 | 21 |
| Group Mean | 21.0 | 29.0 | 19.0 |
One-Way ANOVA — Complete Calculation

Three fertilizer groups (k = 3), five observations per group (n = 5), total N = 15. α = 0.05.

1

State the hypotheses. H₀: μ_A = μ_B = μ_C (all fertilizers produce equal yield). H₁: at least one fertilizer mean differs.

2

Calculate the grand mean. Grand mean x̄ = (sum of all 15 observations) ÷ 15 = (105 + 145 + 95) ÷ 15 = 345 ÷ 15 = 23.0

3

Calculate SSB (Sum of Squares Between).
SSB = n_A×(x̄_A − x̄)² + n_B×(x̄_B − x̄)² + n_C×(x̄_C − x̄)²
SSB = 5×(21 − 23)² + 5×(29 − 23)² + 5×(19 − 23)²
SSB = 5×4 + 5×36 + 5×16 = 20 + 180 + 80 = 280

4

Calculate SSW (Sum of Squares Within).
For group A: (20−21)² + (22−21)² + (19−21)² + (21−21)² + (23−21)² = 1+1+4+0+4 = 10
For group B: (28−29)² + (30−29)² + (27−29)² + (29−29)² + (31−29)² = 1+1+4+0+4 = 10
For group C: (18−19)² + (20−19)² + (17−19)² + (19−19)² + (21−19)² = 1+1+4+0+4 = 10
SSW = 10 + 10 + 10 = 30

5

Compute degrees of freedom.
df_between = k − 1 = 3 − 1 = 2
df_within = N − k = 15 − 3 = 12

6

Compute MSB and MSW, then F.
MSB = 280 ÷ 2 = 140
MSW = 30 ÷ 12 = 2.5
F = 140 ÷ 2.5 = 56.0

7

Compare to the critical value. For F(2, 12) at α = 0.05, the critical value from the F-table is approximately 3.89. Our F = 56.0 exceeds this by a wide margin.

✓ F(2, 12) = 56.0, p < 0.001. We reject H₀. At least one fertilizer produces a significantly different yield. Post-hoc testing (Tukey HSD) is needed to identify which specific pairs differ — in practice, B vs. A and B vs. C are the likely drivers.

The ANOVA Table Explained

Statistical software presents ANOVA results as a table. Knowing what each column represents means you can read any ANOVA output — from SPSS, R, Python, or a published paper — without confusion.

| Source | SS | df | MS | F | p-value |
|---|---|---|---|---|---|
| Between Groups | 280 | 2 | 140.00 | 56.00 | < 0.001 |
| Within Groups | 30 | 12 | 2.50 | | |
| Total | 310 | 14 | | | |

Reading the table: the Source column names where the variation comes from. SS is the sum of squared deviations. df is degrees of freedom. MS (Mean Square) is SS ÷ df. The F column only has a value for Between Groups because F is defined as the ratio MSB ÷ MSW. The p-value directly gives the probability of F ≥ 56.0 under H₀.
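The table above can be reproduced from raw data with nothing but NumPy. This sketch (the function name `anova_table` is illustrative, not from any library) partitions the sums of squares exactly as in the worked example:

```python
import numpy as np

def anova_table(*groups):
    """One-way ANOVA by hand: partition SS, form mean squares, take the ratio."""
    all_obs = np.concatenate(groups)
    grand_mean = all_obs.mean()
    k, N = len(groups), len(all_obs)

    # Between-group SS: group means vs. grand mean, weighted by group size
    ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    # Within-group SS: each observation vs. its own group mean
    ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)

    msb, msw = ssb / (k - 1), ssw / (N - k)
    return {"SSB": ssb, "SSW": ssw, "df_B": k - 1, "df_W": N - k,
            "MSB": msb, "MSW": msw, "F": msb / msw}

A = np.array([20, 22, 19, 21, 23])
B = np.array([28, 30, 27, 29, 31])
C = np.array([18, 20, 17, 19, 21])
tbl = anova_table(A, B, C)
print(tbl)  # SSB=280, SSW=30, MSB=140, MSW=2.5, F=56.0 -- matching the table
```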

🧮 One-Way ANOVA Calculator

Enter comma-separated values for each group. The calculator computes SSB, SSW, F, and a decision at α = 0.05.

How to Interpret ANOVA Results

A significant F-statistic is the beginning, not the end. Three things need attention after seeing p < 0.05.

Step 1: Check the P-Value Against Alpha

If p < α (usually 0.05), reject H₀. If p ≥ α, fail to reject H₀. Note that "fail to reject" is not the same as "the groups are definitely equal" — you simply lack sufficient evidence to declare a difference.

Step 2: Run Post-Hoc Tests

ANOVA only flags that something is different. Post-hoc tests make pairwise comparisons while controlling the experiment-wise error rate. Four common options are shown below.

| Post-Hoc Test | Best For | Conservatism |
|---|---|---|
| Tukey HSD | All pairwise comparisons, equal group sizes | Moderate |
| Bonferroni correction | Planned comparisons, any group sizes | More conservative |
| Scheffé test | Complex contrasts, unequal group sizes | Most conservative |
| Games-Howell | Unequal variances or group sizes | Moderate |
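For the fertilizer data, Tukey's HSD can be run directly with `scipy.stats.tukey_hsd` (available in reasonably recent SciPy releases; a sketch, not the only way to do this):

```python
import numpy as np
from scipy.stats import tukey_hsd

A = np.array([20, 22, 19, 21, 23])   # fertilizer A
B = np.array([28, 30, 27, 29, 31])   # fertilizer B
C = np.array([18, 20, 17, 19, 21])   # fertilizer C

res = tukey_hsd(A, B, C)
print(res)                  # pairwise mean differences with adjusted CIs
print(res.pvalue.round(4))  # p-value matrix: entry [i, j] compares group i vs j
```

For this dataset, B differs significantly from both A and C, while A vs. C (a mean difference of only 2 bushels) does not reach significance — exactly the pattern anticipated in the worked example above.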

Step 3: Compute Effect Size (η²)

A statistically significant result says the groups differ. Effect size says how much they differ. Eta squared (η²) answers this question in practical terms.

Effect Size for ANOVA
η² = SSB ÷ SST
Benchmarks: 0.01 = small effect · 0.06 = medium effect · 0.14 = large effect

For the fertilizer example: η² = 280 ÷ 310 ≈ 0.90. Fertilizer type accounts for roughly 90% of the total variation in yield — a very large practical effect. These benchmarks come from Cohen's (1988) framework and are cited widely in the effect size literature, including the University of Victoria's statistical computing resources.
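The η² arithmetic is a one-liner; this tiny sketch (the helper name `eta_squared` is illustrative) reproduces the fertilizer figure:

```python
def eta_squared(ssb, sst):
    """Proportion of total variance explained by the grouping factor."""
    return ssb / sst

# Fertilizer example: SSB = 280, SST = SSB + SSW = 280 + 30 = 310
eta2 = eta_squared(280, 310)
print(round(eta2, 2))  # 0.9 -- a very large effect by Cohen's benchmarks
```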

Understanding P-Values in ANOVA

The p-value in ANOVA answers: if all group means were truly equal (H₀ true), what is the probability of observing an F-statistic this large or larger purely by chance? A small p-value (below α) means that chance alone is an unlikely explanation, so we reject H₀.

⚠️
Common P-Value Misreading

The p-value is not the probability that H₀ is true. It is the probability of the observed data (or more extreme data) given that H₀ is true. This subtle but critical distinction is documented by the American Statistical Association's 2016 statement on p-values. Always pair p-values with effect sizes for a complete picture.

The p-value and effect size tell different stories. A p-value below 0.05 means the result is statistically significant — but with a very large sample, even a tiny, practically meaningless difference can produce p < 0.001. Effect size (η²) tells you whether the difference matters in the real world. Report both. This guidance is consistent with recommendations from the APA Publication Manual and most major scientific journals.

ANOVA Assumptions

ANOVA produces valid results only when four conditions hold. Researchers at UCLA's Statistical Consulting Group list these assumptions as standard practice for any ANOVA analysis.

1

Independence of Observations

Each data point must come from a different, unrelated subject or unit. If the same subject appears in multiple groups, use repeated measures ANOVA instead. Violation here inflates Type I error seriously.

2

Normality Within Groups

The dependent variable should be approximately normally distributed within each group. With n ≥ 30 per group, the central limit theorem makes ANOVA fairly robust to non-normality. For smaller samples, check with a Shapiro-Wilk test. Severe skewness with small samples warrants the Kruskal-Wallis alternative.

3

Homogeneity of Variance (Homoscedasticity)

All groups should have roughly equal variances. Test this with Levene's test before running ANOVA. If this assumption fails, use Welch's ANOVA, which does not assume equal variances and is available in R (oneway.test()), Python (scipy), and SPSS.

4

Random Sampling

Data should come from a random sample drawn from the population. This assumption is about study design. If subjects were not randomly sampled or randomly assigned to groups, the generalizability of results is limited regardless of statistical significance.

What to Do When an Assumption Fails

| Violated Assumption | Alternative Test | When to Use It |
|---|---|---|
| Normality | Kruskal-Wallis test | Non-normal distributions, ordinal data, small samples |
| Homogeneity of variance | Welch's ANOVA | Unequal group variances (significant Levene's test, p < 0.05) |
| Independence | Repeated Measures ANOVA | Same subjects measured in multiple conditions |

ANOVA vs t-Test: When to Use Each

The t-test and ANOVA both compare group means. The choice between them is mostly a question of how many groups you have — and what happens to your error rate if you pick the wrong one.

| Feature | t-Test | ANOVA |
|---|---|---|
| Number of groups | Exactly 2 | 3 or more |
| Test statistic | t-statistic | F-statistic |
| Type I error control | Fine for 2 groups, inflates with 3+ | Controls experiment-wise error at α |
| Post-hoc tests needed? | No | Yes, if significant |
| Independent variables | 1 | 1 (one-way) or 2+ (two-way) |
| Special case relationship | t² = F when k = 2 | ANOVA with k = 2 gives identical result to t-test |

The last row is worth dwelling on. When ANOVA is run with only two groups, F = t². They are mathematically equivalent. The distinction matters only when you have three or more groups — then ANOVA is the correct choice. You can read more about the t-test variants in the one-sample t-test, two-sample t-test, and paired samples t-test guides.
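The F = t² equivalence is easy to verify empirically. Running both tests on just the A and B fertilizer groups (SciPy assumed):

```python
import numpy as np
from scipy import stats

A = np.array([20, 22, 19, 21, 23])
B = np.array([28, 30, 27, 29, 31])

t, p_t = stats.ttest_ind(A, B)   # independent two-sample t-test
F, p_f = stats.f_oneway(A, B)    # one-way ANOVA on the same two groups

print(round(t ** 2, 6), round(F, 6))   # identical: t^2 = F when k = 2
print(round(p_t, 6), round(p_f, 6))    # identical p-values
```

For these groups t = −8, so t² = 64 = F, and the p-values match exactly.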

Advantages and Limitations of ANOVA

What ANOVA Does Well

  • Tests multiple groups in a single procedure, controlling Type I error at exactly α
  • Flexible across designs — one-way, two-way, repeated measures, MANOVA
  • Detects interaction effects when two factors are studied together (two-way ANOVA)
  • Robust to mild violations of normality when group sizes are large and equal
  • Well-supported in every major statistics package (SPSS, R, Python, SAS, Stata)

Where ANOVA Falls Short

  • Only flags that at least one group differs — post-hoc tests are always required for specifics
  • Sensitive to outliers, particularly in small samples
  • Requires the homogeneity of variance assumption (though Welch's ANOVA relaxes this)
  • Not designed for non-normal, ordinal, or count data without transformation
  • With very large samples, statistically significant results may have negligible practical size

Real-World Applications of ANOVA

ANOVA is not a classroom exercise. These four domains use it routinely.

🧪

Clinical Trials

Three treatment arms — placebo, low-dose, high-dose — are compared on a continuous outcome like blood pressure reduction. ANOVA tests whether any dose produces a different result before the trial moves to regulatory submission.

📚

Education Research

Researchers compare exam scores across three teaching formats: in-person lecture, flipped classroom, and fully online. ANOVA determines whether format affects outcomes across diverse student populations.

📈

Marketing A/B/C Testing

Three ad creative variations run across separate market segments. ANOVA tests whether click-through rate differs across creatives, guiding budget allocation before a full campaign launch.

🏭

Manufacturing Quality Control

Four production lines manufacture the same component. ANOVA tests whether defect rates differ across lines, pointing quality teams toward which lines need process adjustment.

Case Study: Clinical Trials

Real-World Application

Drug Effectiveness: Placebo vs. Drug A vs. Drug B

Sixty patients are randomly assigned to three groups (20 per group): placebo, Drug A, and Drug B. After eight weeks, systolic blood pressure reduction (mmHg) is measured. One-way ANOVA tests H₀: μ_placebo = μ_A = μ_B. A significant result (say, F(2, 57) = 8.4, p = 0.001) indicates that drug assignment affected outcomes. Tukey HSD post-hoc tests would then reveal whether both drugs outperformed placebo, or whether one drug outperformed the other.

Note: The FDA and EMA require ANOVA or equivalent analyses in most Phase II and Phase III clinical trial submissions. The International Council for Harmonisation (ICH) guideline ICH E9(R1) covers the statistical analysis framework for clinical studies.

Running ANOVA in Python, R, and SPSS

Python — scipy.stats and pingouin

```python
from scipy import stats
import numpy as np

# Our three fertilizer groups
A = np.array([20, 22, 19, 21, 23])
B = np.array([28, 30, 27, 29, 31])
C = np.array([18, 20, 17, 19, 21])

# One-way ANOVA
F, p = stats.f_oneway(A, B, C)
print(f"F = {F:.2f}, p = {p:.4f}")
# Output: F = 56.00, p = 0.0000

# Check Levene's test for equal variances first
W, lp = stats.levene(A, B, C)
print(f"Levene's W = {W:.3f}, p = {lp:.3f}")
```

R — aov() and TukeyHSD()

```r
# Build a data frame
df <- data.frame(
  yield = c(20, 22, 19, 21, 23, 28, 30, 27, 29, 31, 18, 20, 17, 19, 21),
  group = factor(rep(c("A", "B", "C"), each = 5))
)

# Run one-way ANOVA
model <- aov(yield ~ group, data = df)
summary(model)

# Tukey HSD post-hoc test
TukeyHSD(model)

# Check assumption: Levene's test (requires car package)
# car::leveneTest(yield ~ group, data = df)
```

For SPSS: Analyze → Compare Means → One-Way ANOVA → move the dependent variable and factor, then click Post Hoc and select Tukey. The ANOVA table, effect size, and post-hoc results all appear in the output viewer.

Common ANOVA Mistakes

| # | Mistake | Consequence | Fix |
|---|---|---|---|
| 1 | Skipping assumption checks | Invalid F-statistic, inflated error rate | Always run Levene's test and check normality before ANOVA |
| 2 | Stopping at significant F | No idea which groups actually differ | Run Tukey HSD or Bonferroni post-hoc tests |
| 3 | Using ANOVA with only 2 groups | Technically valid, but overly complex | Use a t-test for two groups — same result, simpler interpretation |
| 4 | Ignoring effect size | Large-sample studies report trivial differences as significant | Always report η² alongside p-value |
| 5 | Treating non-independent data as independent | Artificially inflated sample size, wrong F | Use repeated measures ANOVA for matched or longitudinal data |
| 6 | Not checking for outliers | Outliers distort both SS calculations | Use boxplots and z-scores to flag outliers before analysis |

ANOVA Entity and Formula Glossary

This reference table covers every major term in ANOVA analysis. It is formatted for direct lookup — use it alongside any ANOVA output you are trying to read.

| Term | Symbol | Definition | Formula / Benchmark |
|---|---|---|---|
| ANOVA | | Statistical method for comparing means of 3+ groups by partitioning total variance | Analysis of Variance |
| F-statistic | F | Ratio of between-group variance to within-group variance; the ANOVA test statistic | F = MSB ÷ MSW |
| Sum of Squares Between | SSB | Total squared deviations of group means from the grand mean, weighted by group size | Σ nᵢ(x̄ᵢ − x̄)² |
| Sum of Squares Within | SSW | Total squared deviations of individual observations from their group mean | Σ Σ (xᵢⱼ − x̄ᵢ)² |
| Sum of Squares Total | SST | Total variation in the data; equals SSB + SSW | SSB + SSW |
| Mean Square Between | MSB | Between-group variance per degree of freedom | SSB ÷ (k − 1) |
| Mean Square Within | MSW | Within-group variance per degree of freedom; also called Mean Square Error | SSW ÷ (N − k) |
| Degrees of Freedom (Between) | df_B | Number of groups minus one | k − 1 |
| Degrees of Freedom (Within) | df_W | Total observations minus number of groups | N − k |
| P-value | p | Probability of observing F this large or larger under H₀; not the probability H₀ is true | p < α → reject H₀ |
| Null Hypothesis | H₀ | Assumption that all k group population means are equal | μ₁ = μ₂ = ... = μₖ |
| Alternative Hypothesis | H₁ | At least one group mean differs from the others | At least one μᵢ ≠ μⱼ |
| Effect Size | η² | Proportion of total variance explained by the group factor (eta squared) | η² = SSB ÷ SST |
| Eta Squared Benchmarks | η² | Magnitude guidelines from Cohen (1988): small, medium, large | 0.01 / 0.06 / 0.14 |
| Homogeneity of Variance | | Assumption that group variances are approximately equal; tested with Levene's test | Levene's p > 0.05 = satisfied |

Frequently Asked Questions About ANOVA

What is ANOVA in simple terms?

ANOVA (Analysis of Variance) is a statistical test that compares the means of three or more groups simultaneously. It divides total variation into between-group variation (caused by the factor being studied) and within-group variation (random scatter). The ratio of these two quantities — the F-statistic — tells you whether group differences are larger than random chance would produce.

Can you use ANOVA with only two groups?

Yes, technically. When ANOVA runs with two groups, F = t², so it gives the same result as an independent samples t-test. But ANOVA's value is specifically in handling three or more groups without inflating the Type I error rate. For two groups, use the two-sample t-test — it is simpler and produces identical conclusions.

What does a significant ANOVA result tell you?

A significant ANOVA (p < α) tells you that at least one group mean is statistically different from the others. It does not tell you which groups differ — that requires post-hoc tests (Tukey HSD, Bonferroni, Scheffé). Think of ANOVA as a smoke alarm: it tells you there is a fire somewhere, but you still need to find the room.

What are the assumptions of ANOVA?

ANOVA requires: (1) independence of observations — each data point is from a separate subject or unit; (2) approximate normality within groups — check with Shapiro-Wilk for small samples; (3) homogeneity of variance — all groups have similar variances, tested with Levene's test; (4) random sampling from the population. Violations of homogeneity can be addressed with Welch's ANOVA, and non-normality with the Kruskal-Wallis test.

What is a good F-value?

There is no universally "good" F-value in isolation. You interpret F by converting it to a p-value using the degrees of freedom for your specific dataset. An F of 3.0 might be significant or not depending on your sample size and number of groups. Focus on p and effect size (η²), not on whether F sounds large or small.

When should you use Kruskal-Wallis instead of ANOVA?

The Kruskal-Wallis test is the non-parametric equivalent of one-way ANOVA. Use it when the normality assumption is clearly violated, especially with small samples (n < 15 per group) or ordinal data. It ranks all observations combined and tests whether the rank distributions differ across groups. For repeated measures, the Friedman test serves a similar role.

What is effect size in ANOVA, and why does it matter?

Eta squared (η² = SSB ÷ SST) measures practical significance: the proportion of total variance that the group factor explains. A study with 500 subjects might detect a statistically significant ANOVA result where η² = 0.01, meaning the factor explains only 1% of variance — probably not practically meaningful. Always report effect size alongside p-values. Benchmarks (small = 0.01, medium = 0.06, large = 0.14) come from Cohen's 1988 framework.

Related Statistical Concepts

ANOVA connects to a cluster of techniques. Understanding them together gives you a complete picture of inferential statistics for group comparisons.

  • Hypothesis Testing — The parent framework that gives ANOVA its structure: null hypothesis, Type I error, significance level
  • One-Sample t-Test — Comparing a single group mean to a known value; the simplest hypothesis test
  • Two-Sample t-Test — Comparing exactly two independent group means; use ANOVA when you have three or more
  • Paired Samples t-Test — For two related measurements on the same subjects; repeated measures ANOVA generalizes this to 3+ time points
  • F-Table (Critical Values) — The reference table for finding the critical F-value given df and α
  • Confidence Intervals — Complements p-values by estimating the range of plausible parameter values
  • Simple Linear Regression — ANOVA is actually a special case of the general linear model; regression can reproduce ANOVA results with dummy-coded predictors
  • Normal Distribution — The theoretical basis for the normality assumption in ANOVA
  • Z-Score — Standardization concept that connects to how ANOVA treats within-group variation
  • Sampling Distributions — Why the F-statistic follows an F-distribution under H₀
📚
Authoritative External Sources

For deeper theoretical treatment, see: the NIST/SEMATECH e-Handbook — ANOVA chapter; UCLA Statistical Consulting ANOVA seminar; and Khan Academy's ANOVA library. For clinical trial applications, refer to ICH E9(R1).