May 12, 2026 · 22 min read
By: Statistics Fundamentals Team
Reviewed by: Minsa A (Senior Statistics Editor)

Chi-Square Test: Complete Guide, Formula, Distribution Table & Worked Examples

A survey reports that 60% of men prefer Brand A while only 35% of women do. Is that gap real, or just sampling noise? A raw percentage comparison cannot answer that. The chi-square test can — it tells you whether the difference between what you observe and what chance would predict is too large to ignore.

This reference covers the chi-square test formula, a complete critical values table, step-by-step calculation walkthroughs, SPSS and R code, three fully worked examples, and an effect size guide. Every section is structured for both exam preparation and active research use.

What You'll Learn
  • ✓ The exact chi-square formula with every variable defined
  • ✓ How to read the chi-square distribution table at α = 0.10, 0.05, 0.025, 0.01, 0.001
  • ✓ Three fully worked examples: medicine, genetics, and market research
  • ✓ Step-by-step calculation with an 8-stage method you can replicate on any dataset
  • ✓ SPSS menu walkthrough and R code for both test types
  • ✓ Effect size (Cramér's V) and APA reporting format
  • ✓ When chi-square fails — and which test to use instead

What Is a Chi-Square Test?

Definition — Chi-Square Test (χ² Test)
A chi-square test is a nonparametric statistical test used to determine whether observed categorical data differ significantly from expected frequencies, or whether two categorical variables are independent of each other. It measures how much the data deviate from what chance alone would predict.
χ² = Σ [(O − E)² / E]

Chi-square tests work on count data — the number of people, items, or observations that fall into each category. They do not require the data to follow a normal distribution, which makes them one of the most broadly applicable tools in statistics. Researchers use chi-square tests in medical trials, ecology, marketing surveys, genetics, machine learning, and any other field where outcomes are measured in categories rather than continuous numbers.

The test was developed by Karl Pearson in 1900 and remains one of the most cited statistical procedures in scientific literature. According to the NIST/SEMATECH Engineering Statistics Handbook, the chi-square test is the standard approach for categorical data inference when expected cell counts are adequate. At Statistics Fundamentals, chi-square sits within the broader framework of hypothesis testing, which establishes the null and alternative hypothesis structure all these tests share.

⚡ Quick Reference — Chi-Square Key Facts
  • Formula: χ² = Σ[(O − E)² / E], where O = observed frequency, E = expected frequency
  • Data requirement: Categorical variables only (counts/frequencies, not means)
  • Two main types: Goodness-of-fit (1 variable) and test of independence (2 variables)
  • Key assumption: Every expected cell frequency must be ≥ 5
  • Most common critical value: df = 1, α = 0.05 → χ² = 3.841
  • Effect size: Use Cramér's V — V = √(χ² / [n × min(r−1, c−1)])

Two Types of Chi-Square Tests

Chi-square tests take two distinct forms. Choosing the wrong one is a common mistake, so it is worth being precise about what each one does.

| Feature | Goodness-of-Fit Test | Test of Independence |
|---|---|---|
| Number of variables | 1 categorical variable | 2 categorical variables |
| Data structure | Single frequency table | Contingency table (rows × columns) |
| Research question | Does this variable follow a specified distribution? | Are these two variables associated? |
| Null hypothesis (H₀) | Observed = expected distribution | The two variables are independent |
| Expected frequency | E = n × p (theoretical proportion) | E = (Row Total × Col Total) / N |
| Degrees of freedom | k − 1 | (r − 1)(c − 1) |
| Classic example | Do the six faces of a die appear equally often? | Is gender associated with product preference? |

Chi-Square Goodness-of-Fit Test

The goodness-of-fit test asks: does a single categorical variable match a theoretical distribution? You collect counts across k categories and compare them to the counts you would expect if a specific null hypothesis were true. A genetics researcher testing whether a cross follows Mendel's 9:3:3:1 ratio, or a quality control engineer testing whether defect types are uniformly distributed, both use this form.

Chi-Square Test of Independence

The test of independence asks: are two categorical variables related? Both variables are measured on the same sample. Their joint counts are arranged in a contingency table — rows for one variable, columns for the other — and the test determines whether the pattern of cell frequencies is consistent with the two variables being unrelated. This is the more commonly encountered form in social science, medical, and marketing research.

Chi-Square Test Formula

All chi-square tests use the same core formula for the test statistic. The distinction between test types lies in how expected frequencies are calculated, not in the formula itself.

The Chi-Square Test Statistic: χ² = Σ[(O − E)² / E]

Chi-Square Test Statistic — Pearson (1900)
χ² = Σ [(O − E)² / E]
Sum across all categories (goodness-of-fit) or all cells (test of independence)
χ² = chi-square test statistic O = observed frequency (actual count) E = expected frequency (under H₀) Σ = sum over all categories or cells

In plain terms: For every cell or category, subtract the expected count from the observed count, square the result (to eliminate negative values), divide by the expected count (to standardize for scale), then add all those values together. A larger χ² means the data deviate more from the null hypothesis.
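The arithmetic in that sentence is compact enough to sketch directly. Below is a minimal pure-Python illustration (the helper name is my own, not from any library); the counts come from the gender × brand example worked later in this guide.

```python
def chi_square_statistic(observed, expected):
    """chi^2 = sum over all cells of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Cells of the gender x brand example: O = 30, 20, 10, 40; E = 20, 30, 20, 30
chi2 = chi_square_statistic([30, 20, 10, 40], [20, 30, 20, 30])
print(round(chi2, 3))  # 16.667
```

When observed counts equal expected counts in every cell, the statistic is exactly zero, matching the intuition that χ² measures total deviation from the null hypothesis.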

Expected Frequency Formulas

Expected Frequency — Goodness-of-Fit
E = n × p
Where n = total sample size, p = theoretical expected proportion for that category
Expected Frequency — Test of Independence
E = (Row Total × Column Total) / Grand Total
Calculate for every cell individually using the row and column marginals
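Both rules reduce to one-line computations. A hedged Python sketch (the function names are my own, for illustration only):

```python
def expected_gof(n, p):
    """Goodness-of-fit: E = n * p for one category."""
    return n * p

def expected_independence(row_total, col_total, grand_total):
    """Test of independence: E = (row total * column total) / grand total."""
    return row_total * col_total / grand_total

# Goodness-of-fit: 160 offspring, Mendel's 9/16 proportion -> E = 90
print(expected_gof(160, 9 / 16))           # 90.0
# Independence: Male row total 50, Brand A column total 40, N = 100 -> E = 20
print(expected_independence(50, 40, 100))  # 20.0
```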

Degrees of Freedom Formulas

Degrees of Freedom (df)
Goodness-of-fit: df = k − 1
Test of independence: df = (r − 1)(c − 1)
k = number of categories; r = rows; c = columns in the contingency table

The degrees of freedom determine which chi-square distribution to use when finding the p-value or critical value. Higher df shifts the distribution to the right — the same χ² value corresponds to a larger p-value when df is larger. This is why the critical value at df = 1 (3.841 at α = 0.05) is much smaller than the critical value at df = 9 (16.919 at α = 0.05).
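When no table is at hand, the p-value for integer df can be computed from the chi-square survival function. The sketch below is my own pure-Python illustration, using the closed forms for df = 1 and df = 2 plus the standard two-step recurrence for higher df:

```python
from math import erfc, exp, gamma, sqrt

def chi2_sf(x, df):
    """P(chi-square with `df` degrees of freedom >= x), for integer df >= 1."""
    if df == 1:
        return erfc(sqrt(x / 2))   # closed form for df = 1
    if df == 2:
        return exp(-x / 2)         # closed form for df = 2
    # Recurrence: S(x; df) = S(x; df-2) + (x/2)^((df-2)/2) * e^(-x/2) / Gamma(df/2)
    return chi2_sf(x, df - 2) + (x / 2) ** ((df - 2) / 2) * exp(-x / 2) / gamma(df / 2)

# The tabled critical values sit exactly at their alpha levels:
print(round(chi2_sf(3.841, 1), 3))   # 0.05  (df = 1, alpha = 0.05)
print(round(chi2_sf(16.919, 9), 3))  # 0.05  (df = 9, alpha = 0.05)
```

Plugging any critical value from the table below into this function returns the corresponding α, which is a handy sanity check when reading the table.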

Assumptions of the Chi-Square Test

Chi-square test results are only valid when these five conditions hold. Penn State's STAT 500 course — a widely cited statistics curriculum — lists adequate expected cell frequency as the most commonly violated assumption in practice (Penn State STAT 500, Lesson 8).

1. Categorical data

Both variables must be measured as categories — nominal (unordered labels like colors or countries) or ordinal (ordered categories like rating scales). Chi-square cannot be applied to continuous measurements directly.

2. Independent observations

Each subject or observation contributes to exactly one cell. Observations must not be paired, matched, or repeated. For paired categorical data, use McNemar's test instead.

3. Adequate expected frequencies (E ≥ 5 per cell)

Every expected cell frequency must be at least 5. If more than 20% of cells have E < 5, the chi-square approximation is unreliable. For 2×2 tables with small expected counts, use Fisher's exact test.

4. Random or representative sampling

Data must come from a random sample or a sampling design that is representative of the population being studied. Non-random convenience samples limit the generalizability of the result.

5. Mutually exclusive categories

Each observation must fall into one and only one category. Overlapping categories (where an observation could be counted twice) violate the independence of cells and invalidate the test.

⚠️ Most Violated Assumption

Expected frequencies below 5 account for the majority of chi-square misapplications in published research. Always check E values before reporting results. The SPSS output footnote and R's chisq.test()$expected both flag this automatically.
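The check itself takes only a few lines. Below is a hypothetical helper (my sketch, not the article's or any package's code) implementing the E ≥ 5 rule and the 20%-of-cells guideline from assumption 3:

```python
def check_expected_counts(expected):
    """Flag violations of assumption 3: any E < 5, and the share of cells below 5."""
    small = [e for e in expected if e < 5]
    return {
        "any_below_5": len(small) > 0,
        "pct_below_5": 100 * len(small) / len(expected),
        "chi_square_ok": len(small) == 0,   # strict rule: every cell must have E >= 5
    }

print(check_expected_counts([20.0, 30.0, 20.0, 30.0]))  # all cells fine
print(check_expected_counts([2.5, 7.5, 12.0, 18.0]))    # one cell below 5 -> consider Fisher's exact test
```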

Chi-Square Distribution Table (Critical Values)

Use this table to find the critical value for your test. Locate your degrees of freedom in the left column, then find the column matching your significance level (α). If your calculated χ² exceeds the critical value, reject the null hypothesis.

| df | α = 0.10 | α = 0.05 | α = 0.025 | α = 0.01 | α = 0.001 |
|---|---|---|---|---|---|
| 1 | 2.706 | 3.841 | 5.024 | 6.635 | 10.828 |
| 2 | 4.605 | 5.991 | 7.378 | 9.210 | 13.816 |
| 3 | 6.251 | 7.815 | 9.348 | 11.345 | 16.266 |
| 4 | 7.779 | 9.488 | 11.143 | 13.277 | 18.467 |
| 5 | 9.236 | 11.070 | 12.833 | 15.086 | 20.515 |
| 6 | 10.645 | 12.592 | 14.449 | 16.812 | 22.458 |
| 7 | 12.017 | 14.067 | 16.013 | 18.475 | 24.322 |
| 8 | 13.362 | 15.507 | 17.535 | 20.090 | 26.125 |
| 9 | 14.684 | 16.919 | 19.023 | 21.666 | 27.877 |
| 10 | 15.987 | 18.307 | 20.483 | 23.209 | 29.588 |
| 12 | 18.549 | 21.026 | 23.337 | 26.217 | 32.910 |
| 15 | 22.307 | 24.996 | 27.488 | 30.578 | 37.697 |
| 20 | 28.412 | 31.410 | 34.170 | 37.566 | 45.315 |
| 25 | 34.382 | 37.652 | 40.646 | 44.314 | 52.620 |
| 30 | 40.256 | 43.773 | 46.979 | 50.892 | 59.703 |
| 40 | 51.805 | 55.758 | 59.342 | 63.691 | 73.402 |
| 50 | 63.167 | 67.505 | 71.420 | 76.154 | 86.661 |
| 60 | 74.397 | 79.082 | 83.298 | 88.379 | 99.607 |
| 80 | 96.578 | 101.879 | 106.629 | 112.329 | 124.839 |
| 100 | 118.498 | 124.342 | 129.561 | 135.807 | 149.449 |

The most commonly referenced entry is χ² = 3.841 at df = 1, α = 0.05. Source: tabulated from the chi-square cumulative distribution function; values match those in the NIST/SEMATECH Engineering Statistics Handbook tables. For the extended downloadable version, visit the Chi-Square Table reference page.

Quick Lookup — Common Scenarios

| Scenario | df | Critical value α = 0.05 | Critical value α = 0.01 |
|---|---|---|---|
| 2-category goodness-of-fit | 1 | 3.841 | 6.635 |
| 3-category goodness-of-fit | 2 | 5.991 | 9.210 |
| 4-category goodness-of-fit | 3 | 7.815 | 11.345 |
| 2×2 contingency table | 1 | 3.841 | 6.635 |
| 2×3 contingency table | 2 | 5.991 | 9.210 |
| 3×3 contingency table | 4 | 9.488 | 13.277 |
| 3×4 contingency table | 6 | 12.592 | 16.812 |
| 4×4 contingency table | 9 | 16.919 | 21.666 |
| Mendel's 9:3:3:1 ratio (4 categories) | 3 | 7.815 | 11.345 |

How to Perform a Chi-Square Test: 8-Step Method

📋 The 8 Steps at a Glance

Step 1: State H₀ and H₁. Step 2: Set α. Step 3: Build the contingency table with observed counts. Step 4: Calculate expected frequencies using E = (Row × Col) / N. Step 5: Compute χ² = Σ[(O − E)² / E]. Step 6: Find df. Step 7: Look up the critical value. Step 8: Compare χ² to the critical value and state the conclusion.

Worked Example — Full 8-Step Walkthrough

Research question: Is there a statistically significant association between gender (Male/Female) and preference (Brand A / Brand B) in a sample of 100 consumers?

Step 1: State the hypotheses:
H₀: Gender and brand preference are independent (no association)
H₁: Gender and brand preference are associated

Step 2: Set the significance level: α = 0.05

Step 3: Build the observed contingency table:

| | Brand A | Brand B | Row Total |
|---|---|---|---|
| Male | 30 | 20 | 50 |
| Female | 10 | 40 | 50 |
| Col Total | 40 | 60 | 100 |
Step 4: Calculate expected frequencies using E = (Row Total × Column Total) / Grand Total:

| Cell | Calculation | Expected (E) |
|---|---|---|
| Male / Brand A | (50 × 40) / 100 | 20.0 |
| Male / Brand B | (50 × 60) / 100 | 30.0 |
| Female / Brand A | (50 × 40) / 100 | 20.0 |
| Female / Brand B | (50 × 60) / 100 | 30.0 |

✓ All expected frequencies ≥ 5. Assumption satisfied.

Step 5: Calculate the chi-square statistic:

| Cell | O | E | (O − E)² | (O − E)² / E |
|---|---|---|---|---|
| Male / Brand A | 30 | 20 | 100 | 5.000 |
| Male / Brand B | 20 | 30 | 100 | 3.333 |
| Female / Brand A | 10 | 20 | 100 | 5.000 |
| Female / Brand B | 40 | 30 | 100 | 3.333 |
| Total | | | | χ² = 16.667 |
Step 6: Degrees of freedom: df = (r − 1)(c − 1) = (2 − 1)(2 − 1) = 1

Step 7: Critical value: at df = 1 and α = 0.05, the critical value is 3.841 (from the distribution table above)

Step 8: Decision: χ² = 16.667 > critical value 3.841 → Reject H₀

✓ Conclusion: There is a statistically significant association between gender and brand preference (χ²(1, N = 100) = 16.67, p < .001). Men and women differ in their brand preferences beyond what chance alone would predict.
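Steps 4 through 8 can be reproduced end to end as a cross-check. Below is a pure-Python sketch (illustrative only; the function name and structure are mine, and the R equivalent appears later in this guide):

```python
def independence_test(table, critical_value):
    """Steps 4-8 of the walkthrough for an r x c table of observed counts (rows as lists)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for row, rt in zip(table, row_totals):
        for o, ct in zip(row, col_totals):
            e = rt * ct / n            # Step 4: expected count for this cell
            chi2 += (o - e) ** 2 / e   # Step 5: accumulate the statistic
    df = (len(table) - 1) * (len(table[0]) - 1)   # Step 6
    return chi2, df, chi2 > critical_value        # Steps 7-8

chi2, df, reject = independence_test([[30, 20], [10, 40]], critical_value=3.841)
print(round(chi2, 3), df, reject)  # 16.667 1 True
```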


Worked Examples Across Three Fields

Example 1 — Medical Research: Smoking and Lung Disease

This example follows the design used in classic epidemiology studies on smoking. The data below are representative of the proportions established in large-sample clinical research, consistent with findings documented by the CDC Tobacco Statistics and Data program.

Worked Example 2 — Test of Independence (Medicine)

Research question: Is smoking status (Smoker / Non-Smoker) associated with lung disease diagnosis (Yes / No) in a sample of 300 patients?

Step 1: Observed contingency table:

| | Lung Disease: Yes | Lung Disease: No | Row Total |
|---|---|---|---|
| Smoker | 90 | 60 | 150 |
| Non-Smoker | 30 | 120 | 150 |
| Col Total | 120 | 180 | 300 |
Step 2: Expected frequencies:
Smoker/Yes: (150 × 120) / 300 = 60  |  Smoker/No: (150 × 180) / 300 = 90
Non-Smoker/Yes: (150 × 120) / 300 = 60  |  Non-Smoker/No: (150 × 180) / 300 = 90

Step 3: Chi-square statistic:
(90−60)²/60 + (60−90)²/90 + (30−60)²/60 + (120−90)²/90
= 900/60 + 900/90 + 900/60 + 900/90
= 15 + 10 + 15 + 10 = χ² = 50.00

Step 4: df = (2−1)(2−1) = 1. Critical value at α = 0.05, df = 1: 3.841. Since 50.00 ≫ 3.841, reject H₀.

✓ Conclusion: Smoking status and lung disease are significantly associated (χ²(1, N = 300) = 50.00, p < .001). Cramér's V = √(50/300) = 0.408, indicating a medium-to-large effect.
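These figures are easy to verify in a few lines. An illustrative Python check (for a 2×2 table, min(r − 1, c − 1) = 1):

```python
from math import sqrt

observed = [90, 60, 30, 120]   # Smoker/Yes, Smoker/No, Non-Smoker/Yes, Non-Smoker/No
expected = [60, 90, 60, 90]    # from (row total x column total) / 300
n = 300

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
cramers_v = sqrt(chi2 / (n * 1))   # min(r - 1, c - 1) = 1 for a 2x2 table

print(chi2)                 # 50.0
print(round(cramers_v, 3))  # 0.408
```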

Example 2 — Genetics: Mendel's Goodness-of-Fit Test

Worked Example 3 — Goodness-of-Fit (Biology / Genetics)

Research question: Does a dihybrid pea plant cross produce phenotype ratios consistent with Mendel's predicted 9:3:3:1 ratio in a sample of 160 offspring?

Step 1: Observed vs. expected counts:

| Phenotype | Observed (O) | Ratio | Expected (E = n × p) | (O−E)²/E |
|---|---|---|---|---|
| Round/Yellow | 90 | 9/16 | 90.0 | 0.000 |
| Round/Green | 28 | 3/16 | 30.0 | 0.133 |
| Wrinkled/Yellow | 32 | 3/16 | 30.0 | 0.133 |
| Wrinkled/Green | 10 | 1/16 | 10.0 | 0.000 |
| Total | 160 | | 160 | χ² = 0.267 |
Step 2: df = k − 1 = 4 − 1 = 3. Critical value at α = 0.05, df = 3: 7.815. Since 0.267 ≪ 7.815, fail to reject H₀.

✓ Conclusion: The observed phenotype ratios are consistent with Mendel's 9:3:3:1 prediction (χ²(3, N = 160) = 0.27, p = .966). The data do not provide evidence against the Mendelian model.
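The goodness-of-fit arithmetic above can be verified the same way. An illustrative Python sketch of the table's calculation (my own code, not from the article's R section):

```python
observed = [90, 28, 32, 10]
ratios = [9 / 16, 3 / 16, 3 / 16, 1 / 16]   # Mendel's 9:3:3:1 proportions
n = sum(observed)                            # 160

expected = [n * p for p in ratios]           # E = n * p -> 90, 30, 30, 10
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

print(round(chi2, 3), df)  # 0.267 3
```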

How to Report Chi-Square Results (APA Format)

When writing up chi-square test results, follow the APA 7th edition format. This format is required by most journals and expected in graduate-level coursework. The Publication Manual of the American Psychological Association (7th ed.) specifies this structure for nonparametric test reporting.

APA Reporting Format
χ²(df, N = sample size) = value, p = p-value, V = Cramér's V

Full sentence examples:

  • For independence: "A chi-square test of independence found a significant relationship between gender and brand preference, χ²(1, N = 100) = 16.67, p < .001, Cramér's V = 0.41."
  • For goodness-of-fit: "A chi-square goodness-of-fit test indicated the observed phenotype distribution was consistent with the 9:3:3:1 Mendelian ratio, χ²(3, N = 160) = 0.27, p = .97."
Always report effect size

Statistical significance alone does not tell the reader how large the association is. With large samples, a trivially small association can produce p < .001. Always accompany a significant chi-square result with Cramér's V or Phi (φ) for 2×2 tables.

Effect Size: Cramér's V

A significant chi-square result tells you the association is real; Cramér's V tells you how strong it is. Cohen (1988) established the benchmark thresholds below, though more recent work by Lakens (2013) at Eindhoven University of Technology notes that these benchmarks should be treated as contextual guides rather than rigid rules.

Cramér's V — Effect Size for Chi-Square
V = √( χ² / [n × min(r − 1, c − 1)] )
Where n = total sample size, r = rows, c = columns. V ranges from 0 (no association) to 1 (perfect association).
| Cramér's V | Effect Size | df = 1 (2×2) | df = 2 (2×3) | df = 3 (2×4) |
|---|---|---|---|---|
| 0.10 | Small | Weak association | Weak association | Weak association |
| 0.30 | Medium | Moderate association | Moderate association | Moderate association |
| 0.50 | Large | Strong association | Strong association | Strong association |

For 2×2 tables specifically, Phi (φ) is equivalent to Cramér's V: φ = √(χ²/n). Both yield the same value when there is one degree of freedom.
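That equivalence is straightforward to confirm numerically. An illustrative Python check, using the χ² value from the gender × brand example:

```python
from math import sqrt

chi2, n = 16.667, 100   # gender x brand example: 2x2 table
phi = sqrt(chi2 / n)
# For a 2x2 table, min(r - 1, c - 1) = min(1, 1) = 1, so V reduces to phi
cramers_v = sqrt(chi2 / (n * min(2 - 1, 2 - 1)))

print(round(phi, 3), round(cramers_v, 3))  # 0.408 0.408
```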

Chi-Square Test in SPSS

Test of Independence in SPSS

SPSS — Step-by-Step Menu Navigation

Running a chi-square test of independence in IBM SPSS Statistics

1. Go to Analyze → Descriptive Statistics → Crosstabs

2. Move your first variable into the Row(s) box and your second variable into the Column(s) box

3. Click Statistics → check Chi-square → also check Phi and Cramer's V for effect size → click Continue

4. Click Cells → under Counts, check Observed and Expected → under Percentages, check Row → click Continue → OK

Reading SPSS output:

  • In the Chi-Square Tests table, read the Pearson Chi-Square row
  • The Asymptotic Significance (2-sided) column gives your p-value
  • Check the footnote for "X cells have expected count less than 5" — if this appears, consider Fisher's exact test
  • The Symmetric Measures table gives Cramér's V

Goodness-of-Fit in SPSS

Navigate to Analyze → Nonparametric Tests → Legacy Dialogs → Chi-Square. Move your variable into the Test Variable List. Under Expected Values, choose "All categories equal" for a uniform distribution, or enter custom expected proportions. Click OK.

Chi-Square Test in R

Test of Independence in R

```r
# Create the observed contingency table (byrow = TRUE so rows match the table above)
data_matrix <- matrix(c(30, 20, 10, 40), nrow = 2, byrow = TRUE,
                      dimnames = list(Gender = c("Male", "Female"),
                                      Preference = c("Brand A", "Brand B")))

# Run the test; correct = FALSE disables Yates' correction to match the hand calculation
result <- chisq.test(data_matrix, correct = FALSE)
print(result)
# Output: X-squared = 16.667, df = 1, p-value = 4.46e-05

# Check expected frequencies (must all be >= 5)
result$expected

# Calculate Cramér's V effect size
library(rstatix)
cramer_v(data_matrix)
```
⚠️ Yates' Continuity Correction

R applies Yates' continuity correction by default for 2×2 tables — this slightly reduces χ². To match hand calculations or SPSS output, disable it: chisq.test(data_matrix, correct = FALSE)
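The correction itself is simple: each |O − E| is reduced by 0.5 before squaring. Here is a Python sketch of the idea (my illustration; R additionally floors |O − E| − 0.5 at zero, which makes no difference for this data):

```python
def chi_square_yates(observed, expected):
    """Yates-corrected statistic: sum of (|O - E| - 0.5)^2 / E over all cells."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))

# Gender x brand example: each |O - E| = 10, so each numerator becomes 9.5^2 = 90.25
print(round(chi_square_yates([30, 20, 10, 40], [20, 30, 20, 30]), 3))  # 15.042 (vs. 16.667 uncorrected)
```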

Goodness-of-Fit in R

```r
# Observed phenotype counts
observed <- c(90, 28, 32, 10)

# Expected proportions from Mendel's 9:3:3:1 ratio
expected_probs <- c(9/16, 3/16, 3/16, 1/16)

# Run the test
chisq.test(observed, p = expected_probs)
# Output: X-squared = 0.267, df = 3, p-value = 0.966

# Interpretation: data consistent with the 9:3:3:1 ratio
```

Chi-Square vs. Other Statistical Tests

Chi-Square vs. Fisher's Exact Test

| Feature | Chi-Square Test | Fisher's Exact Test |
|---|---|---|
| Best for | Large samples (all E ≥ 5) | Small samples (any E < 5) |
| Calculation | Approximate (asymptotic) | Exact probability |
| Table size | Any r × c | Most commonly 2×2 |
| Sample size guidance | n > 40 (with all E ≥ 5) | n < 20, or any E < 5 |
| Software | Default in SPSS, R | Checkbox in SPSS; fisher.test() in R |

Chi-Square vs. t-Test

| Feature | Chi-Square Test | t-Test |
|---|---|---|
| Outcome variable type | Categorical (counts) | Continuous (means) |
| Normality required | No | Yes (or large n) |
| What it tests | Association or distribution | Difference between means |
| Example question | "Is political party related to voting behavior?" | "Is the mean exam score higher in Group A vs B?" |
| Effect size | Cramér's V | Cohen's d |

Chi-Square vs. ANOVA

| Feature | Chi-Square Test | ANOVA |
|---|---|---|
| Outcome variable | Categorical (counts) | Continuous (means) |
| Independent variable | Categorical groups | Categorical groups |
| Tests | Association between categories | Differences in group means |
| Effect size | Cramér's V | η² (eta squared) |

For a structured decision guide covering all major statistical tests, see the Statistical Test Selector tool on this site.

Where Chi-Square Tests Are Used

🏥 Medical Research

Testing whether treatment outcomes (recovered/not recovered) differ by treatment group. Used routinely in randomized controlled trials with binary endpoints.

🧬 Genetics & Biology

Verifying whether observed genotype or phenotype ratios match Mendelian or Hardy-Weinberg predictions in population genetics.

📊 Survey Analysis

Determining whether survey responses differ by demographic groups such as age, gender, or education level. Standard in social science research.

🛒 Market Research

Testing whether brand preferences, product choices, or consumer behaviors differ across customer segments or regions.

🤖 Machine Learning

Feature selection: identifying which categorical features are statistically associated with the target variable before model training.

🏛️ Social Sciences

Analyzing voting patterns, criminal justice outcomes, educational attainment, and social mobility data at the population level.

🌿 Ecology

Comparing species distribution across habitat types, or testing whether species are associated with particular environmental conditions.

🏭 Quality Control

Testing whether defect rates or product categories are uniformly distributed across production lines, batches, or suppliers.

When Chi-Square Fails: Alternatives

| Problem | Recommended Alternative | Why |
|---|---|---|
| Expected frequencies < 5 in any cell (small sample) | Fisher's exact test | Exact computation; no large-sample approximation needed |
| Paired or matched categorical data | McNemar's test | Accounts for the non-independence of matched pairs |
| Ordered categories (ordinal data) | Cochran-Armitage trend test | Detects monotonic trend rather than general association |
| Very large N (χ² inflated trivially) | Report Cramér's V; interpret practically | With huge n, even negligible associations become significant |
| Three or more repeated measures | Cochran's Q test | Extension of McNemar for k > 2 related groups |


Chi-Square Quick Reference Cheat Sheet

The table below consolidates every key term, formula, and decision rule covered in this guide into a one-page study reference.

| Term / Entity | Formula / Value | When to Use | Plain Interpretation |
|---|---|---|---|
| Chi-square statistic (χ²) | χ² = Σ[(O − E)² / E] | All chi-square tests | Total deviation of observed from expected counts |
| Expected frequency — goodness-of-fit | E = n × p | Single-variable test | Count predicted by theoretical proportion p |
| Expected frequency — independence | E = (Row × Col) / N | Two-variable test | Count predicted if variables were unrelated |
| Degrees of freedom (goodness-of-fit) | df = k − 1 | k = number of categories | Number of free cells after constraints applied |
| Degrees of freedom (independence) | df = (r−1)(c−1) | r × c contingency table | Same logic; both row and column margins constrained |
| Critical value (df = 1, α = 0.05) | 3.841 | Most 2×2 tables | Exceed this → reject H₀ at 5% level |
| Critical value (df = 1, α = 0.01) | 6.635 | 2×2 with stricter threshold | Exceed this → reject H₀ at 1% level |
| Cramér's V | V = √(χ²/[n × min(r−1, c−1)]) | After significant χ² | Effect size: 0.10 = small, 0.30 = medium, 0.50 = large |
| p-value | P(χ² ≥ observed \| H₀ true) | Always report | p < 0.05 → reject H₀ (at conventional α) |
| Null hypothesis (independence) | H₀: Variables are independent | Test of independence | No association between the two categorical variables |
| Null hypothesis (goodness-of-fit) | H₀: Observed = expected distribution | Goodness-of-fit | Data match the theoretical distribution |
| Assumption: expected frequency | E ≥ 5 per cell | Every chi-square test | If violated, use Fisher's exact test instead |
| APA reporting format | χ²(df, N = n) = value, p = .xxx | All publications | Include df, sample size, statistic, p-value, V |
