What are the assumptions of linear regression?

Linear regression rests on five core assumptions: linearity, independence of errors, homoscedasticity (constant variance), normality of residuals, and no perfect multicollinearity. The Gauss-Markov theorem needs all of these except normality, plus errors with a mean of zero; normality matters only for exact p-values and confidence intervals.

What are the assumptions of the t-test?

T-test assumptions include: the data is continuous and roughly normally distributed (or n > 30 by the Central Limit Theorem), observations are independent, and for the two-sample t-test, equal variances (homoscedasticity) unless Welch's correction is used.

What are ANOVA assumptions?

ANOVA requires three main assumptions: normality within each group, homogeneity of variance across groups (homoscedasticity, tested with Levene's test), and independence of observations. Violations can be addressed with transformations or non-parametric alternatives like Kruskal-Wallis.

What happens if assumptions are violated?

Violated assumptions can bias estimates, inflate Type I error rates, produce unreliable p-values, and make confidence intervals misleading. The severity depends on which assumption is violated, the degree of violation, and sample size. Remedies include data transformations, robust methods, or non-parametric alternatives.

How do you check the normality assumption?

Check normality visually with a Q-Q plot or histogram of residuals. Formally test with the Shapiro-Wilk test (best for small samples) or Kolmogorov-Smirnov test. For large samples (n > 30), the Central Limit Theorem means normality is less critical for means.

Statistical Assumptions: Complete Guide to Regression, ANOVA & T-Test (2026)

Q: What are statistical assumptions?

Statistical assumptions are conditions that must hold for a test's results to be valid. They define the circumstances under which a model's estimates, p-values, and confidence intervals can be trusted. Each test — t-test, ANOVA, linear regression — has its own set.

What Are Statistical Assumptions?

Definition — Statistical Assumptions

Statistical assumptions are the conditions that must be true for the results of a statistical test or model to be valid. They specify the data-generating structure the method was designed for. When assumptions hold, estimates are unbiased, p-values have their stated meaning, and confidence intervals achieve their stated coverage. When they fail, all three can be wrong.

Valid Inference = Correct Method + Satisfied Assumptions

Every statistical method is a mathematical machine built for a specific type of data. A t-test assumes your observations come from a normally distributed population. A linear regression assumes a straight-line relationship between your variables. A chi-square test assumes expected cell counts are large enough. These aren't arbitrary rules — they're the conditions under which the underlying math is provably correct.

Think of assumptions as the "terms and conditions" of a statistical test. Ignoring them doesn't make the test fail to run: your software will still produce a p-value. It just means that p-value may not mean what you think it does. A significant result obtained under badly violated assumptions can lead you to reject a true null hypothesis far more often than your chosen α would suggest.

The importance of checking assumptions before drawing conclusions is emphasized across all major statistical frameworks, from classical frequentist inference to modern machine learning diagnostics. For a broader grounding in how probability and inference work, the statistics and probability section on Statistics Fundamentals provides the necessary foundation.

⚡ Quick Reference — Why Assumptions Matter

Unbiasedness: Violated assumptions often cause estimates to systematically over- or underestimate the true parameter
Valid p-values: A test run at α = 0.05 has a 5% false-positive rate only when assumptions are met; violations can push the actual rate to 15–20%
Reliable confidence intervals: A "95% CI" may cover the true parameter only 80% of the time when homoscedasticity is violated
Predictive accuracy: A model built on violated assumptions may perform poorly on new data
Interpretability: Regression coefficients only have their intended meaning when the linearity and independence assumptions hold

Types of Statistical Assumptions

Assumptions can be organized into three broad categories. Distributional assumptions concern the shape of the data's probability distribution. Structural assumptions describe the mathematical relationship between variables. Data quality assumptions concern how observations were collected and whether they are independent of one another.

Type 01

Distributional Assumptions

Specify the probability distribution the data or residuals should follow. The most common is normality. These assumptions make it possible to derive exact sampling distributions for test statistics.

Examples: normality, known variance

Type 02

Structural Assumptions

Describe the mathematical form of the relationship between variables — for instance, that the relationship is linear, or that variance is constant across all values of a predictor.

Examples: linearity, homoscedasticity, no multicollinearity

Type 03

Data Quality Assumptions

Concern how observations were obtained. Independence means each data point carries unique information. Random sampling means the sample represents the population without systematic bias.

Examples: independence, random sampling, no measurement error

💡

Parametric vs. Non-Parametric

Parametric tests (t-test, ANOVA, linear regression) make explicit distributional assumptions — usually normality. Non-parametric tests (Mann-Whitney U, Kruskal-Wallis, Spearman correlation) replace distributional assumptions with weaker rank-based ones. The trade-off: non-parametric tests are more flexible but have less statistical power when parametric assumptions actually hold.

Linear Regression Assumptions (OLS / Gauss-Markov)

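Linear regression is one of the most widely used statistical models, and its assumptions are among the most studied. Under the Gauss-Markov conditions (linearity, independent errors, constant error variance, errors with a mean of zero, and no perfect multicollinearity), the Ordinary Least Squares (OLS) estimator is provably the Best Linear Unbiased Estimator (BLUE), meaning it has the lowest variance among all linear unbiased estimators. Normality of errors is not needed for that result; adding it gives exact p-values and confidence intervals as well, which is why it is listed below as Assumption 4.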

Linear Regression Model

Y = β₀ + β₁X₁ + β₂X₂ + … + βₚXₚ + ε

Y = response variable

β = coefficients to estimate

X = predictor variables

ε = error term (unobserved; the residuals are its sample estimates)

Assumption 1: Linearity

The relationship between each predictor and the outcome must be linear — a straight line captures it adequately. This is an assumption about the mean of Y given X, not about the distribution of X or Y individually.

How to check: Plot the residuals against the fitted values (the residuals vs. fitted plot) and, in multiple regression, against each predictor. A random scatter around zero indicates linearity. A U-shape or systematic curve signals a non-linear relationship that the model is missing.

Fix if violated: Add a squared or higher-order term (polynomial regression), apply a log or square-root transformation to the predictor, or use a non-linear model. The full guide to simple linear regression covers polynomial extensions.

Assumption 2: Independence of Errors

Each observation's error term must be independent of every other observation's error term. In practice, this means the residuals should not be correlated with each other. Violations occur most often with time-series data (where yesterday's error predicts today's) or clustered data (students within classrooms, patients within hospitals).

How to check: Plot residuals in collection order and look for runs or oscillations. The Durbin-Watson statistic (range 0–4; values near 2 indicate no autocorrelation) provides a formal test.

Fix if violated: For time-series: include lagged variables, use generalized least squares (GLS), or an ARIMA model. For clustered data: use multilevel (mixed) models or cluster-robust standard errors.
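The Durbin-Watson statistic is simple enough to compute by hand: the sum of squared differences between consecutive residuals, divided by the sum of squared residuals. A minimal sketch, assuming the residuals are already in collection order (the residual values are made up for illustration):

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in collection order."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    # Numerator: squared differences between consecutive residuals.
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    # Denominator: sum of squared residuals.
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals, for illustration only.
long_runs = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]  # neighbors agree: positive autocorrelation
sign_flips = [1.0, -1.0, 1.0, -1.0]            # neighbors disagree: negative autocorrelation

print(durbin_watson(long_runs))   # well below 2
print(durbin_watson(sign_flips))  # well above 2
```

Statistical packages typically report this value with the regression output; the sketch only shows where the 0–4 range and the "near 2" benchmark come from.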

Assumption 3: Homoscedasticity (Constant Variance)

The variance of the errors must be constant across all levels of the predictor variables. When variance changes with the level of a predictor — larger residuals at higher fitted values, for example — the data is heteroscedastic. OLS estimates remain unbiased under heteroscedasticity, but standard errors are wrong, making p-values and confidence intervals unreliable.

How to check: A scale-location plot (square root of |residuals| vs. fitted values) should show a flat horizontal line. Formally, the Breusch-Pagan test or White test detects heteroscedasticity.

Fix if violated: Apply a log or square-root transformation to Y. Alternatively, use heteroscedasticity-consistent (HC) robust standard errors (White's sandwich estimator) or weighted least squares (WLS).

Assumption 4: Normality of Residuals

For exact p-values and confidence intervals, residuals (not raw data) should be normally distributed. This assumption becomes less critical as sample size grows because the Central Limit Theorem ensures the sampling distribution of the coefficients approaches normality regardless. With n > 100, moderate departures rarely matter in practice.

How to check: A Q-Q plot of residuals should follow the diagonal reference line closely. The Shapiro-Wilk test provides a formal normality check (most reliable for n ≤ 50). See the normal distribution guide for background on the normal curve itself.

Fix if violated: Log or Box-Cox transformation of Y. If the distribution is heavily skewed or has extreme outliers, consider a generalized linear model (GLM) with an appropriate distribution family.

Assumption 5: No Perfect Multicollinearity

In multiple regression, no predictor should be a perfect linear combination of others. When two predictors are highly correlated (but not perfectly), the OLS estimates become unstable — small changes in the data produce large swings in coefficients — and standard errors inflate. This is the practical form of the assumption that matters most.

How to check: Compute the Variance Inflation Factor (VIF) for each predictor. VIF > 10 (some use VIF > 5) signals a problem worth investigating. A correlation matrix among predictors provides an informal visual check.

Fix if violated: Remove one of the correlated predictors. Combine them using Principal Component Analysis (PCA). Use ridge regression, which is designed to handle multicollinearity by adding a penalty term. See the guide on multiple linear regression for a full treatment.

Worked Example — Regression Diagnostics

Checking OLS Assumptions: Salary vs. Experience Data

You have salary data for 80 employees regressed on years of experience. Here's the four-step diagnostic sequence (three plots and one statistic):

Residuals vs. Fitted: The scatter shows a slight upward curve. This suggests non-linearity — experience may have diminishing returns. Add a quadratic term: Experience².

Scale-Location Plot: The spread of residuals increases steadily as fitted values rise. Salary variance grows with seniority — classic heteroscedasticity. Apply log(Salary) as the outcome.

Q-Q Plot of Residuals: Points follow the diagonal well in the middle but deviate at the upper tail. With n = 80 and only modest tail deviation, the CLT provides sufficient protection for inference.

Durbin-Watson Statistic: DW = 1.93. This is within the acceptable range (1.5–2.5), so autocorrelation is not a concern in this cross-sectional dataset.

Diagnosis: Transform outcome to log(Salary) and add Experience² to address non-linearity and heteroscedasticity. Re-run and re-check diagnostics after transformation.

T-Test Assumptions

The t-test is one of the most widely used tests in statistics, and it comes in three versions: one-sample (comparing a sample mean to a known value), independent two-sample (comparing means of two groups), and paired (comparing two related measurements). Each has its own assumption set, though they share a common core.

Assumption	One-Sample t	Two-Sample t	Paired t
Normality (or large n)	✓	✓	✓ (differences)
Independence of observations	✓	✓	✓ (between pairs)
Equal variances (homoscedasticity)	—	Student's t only	—
Continuous data	✓	✓	✓
Random sampling	✓	✓	✓

Normality for T-Tests

The t-test assumes the sample was drawn from a normally distributed population. In practice, the test is remarkably robust to non-normality when n > 30, thanks to the Central Limit Theorem — the sampling distribution of the mean approaches normality regardless of the population shape. For small samples (n ≤ 15), normality matters more. Check with a Shapiro-Wilk test or Q-Q plot.

Equal Variance for the Two-Sample T-Test

Student's t-test assumes the two populations have equal variances. Levene's test checks this formally (p > 0.05 suggests equal variances are plausible). When variances are unequal, use Welch's t-test, which adjusts the degrees of freedom and is the default in some statistical software (R's t.test, for example). The guide on the two-sample t-test covers Welch's correction in detail.
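The degrees-of-freedom adjustment in Welch's test is the Welch-Satterthwaite formula. A minimal sketch of the statistic and its degrees of freedom, using made-up samples with clearly different spreads:

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df

# Hypothetical samples, for illustration only.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]

t, df = welch_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df:.2f}")  # t = -1.897, df = 5.88
```

Student's pooled test would use 5 + 5 − 2 = 8 degrees of freedom for the same data; Welch's smaller value reflects the extra uncertainty from the unequal variances.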

Independence for T-Tests

Observations must be independent — each data point should provide unique information not duplicated by another. Mixing paired and independent designs is a common error: if the same subject is measured twice (before/after), a paired t-test is required. Using an independent t-test on paired data wastes power and can produce incorrect p-values. See the dedicated paired samples t-test page for when and how to apply it.

⚠️

Common Mistake: Checking Normality of Raw Data Instead of Residuals

For regression, normality should be checked on the residuals, not on Y or X individually. For t-tests, it applies to the outcome variable within each group (or to the differences, for paired tests). Checking the wrong quantity is one of the most frequent assumption-checking errors in applied research.

ANOVA Assumptions

Analysis of Variance (ANOVA) tests whether the means of three or more groups differ. It extends the t-test logic and shares similar assumptions, but the equal-variance condition is now across all groups rather than just two. The full theoretical treatment of ANOVA — including one-way and two-way designs, post-hoc tests, and effect sizes — is covered on the ANOVA guide.

ANOVA 01

Normality Within Groups

Residuals (observations minus their group mean) should be normally distributed within each group. With balanced, large groups, ANOVA is robust to moderate non-normality.

Test: Shapiro-Wilk per group, Q-Q plots

ANOVA 02

Homogeneity of Variance

The variance of the outcome should be approximately equal across all groups. This is the most critical ANOVA assumption when group sizes differ.

Test: Levene's test, Bartlett's test

ANOVA 03

Independence of Observations

Each observation must come from a different, unrelated subject. Repeated measures on the same subject require Repeated Measures ANOVA or a mixed model.

Check: Study design review, no repeated subjects
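Levene's test, named in the homogeneity-of-variance check above, is a one-way ANOVA run on each observation's absolute deviation from its group center. The sketch below computes the W statistic with deviations from the group mean, as in Levene's original version; some software centers on the median by default (SciPy's `levene` does, for instance), so results can differ slightly. The data are hypothetical, and the p-value step (comparing W with an F distribution on k − 1 and N − k degrees of freedom) is left out.

```python
from statistics import mean

def levene_w(*groups):
    """Levene's W statistic using absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every observation from its own group mean.
    z = []
    for g in groups:
        g_mean = mean(g)
        z.append([abs(v - g_mean) for v in g])
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical groups, for illustration only.
print(levene_w([1, 2, 3], [11, 12, 13]))  # same spread, different means: W is 0
print(levene_w([1, 2, 3], [0, 2, 4]))     # second group is more spread out: W > 0
```

The first call shows that W ignores differences in group means entirely; only differences in spread move the statistic.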

When ANOVA Assumptions Are Violated

When normality is severely violated (especially with small, unequal groups), the Kruskal-Wallis test is the non-parametric alternative. It tests whether distributions have the same central location without requiring normality. When homogeneity of variance fails, Welch's ANOVA (which adjusts degrees of freedom similarly to Welch's t-test) performs better than the standard F-test. A log or square-root transformation of the outcome often simultaneously improves both normality and equal variance.

Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). The homoscedasticity requirement for ANOVA is discussed in Chapter 12, with simulation evidence showing F's sensitivity to unequal variances when group sizes differ by more than 3:1.

Logistic Regression Assumptions

Logistic regression predicts a binary outcome (yes/no, success/failure) and operates on fundamentally different assumptions than linear regression. It does not require normality of residuals or homoscedasticity — these assumptions simply don't apply to a binary outcome. The full framework, including interpretation of odds ratios and model fit statistics, is covered in the logistic regression guide.

Assumption	Required?	How to Check
Binary outcome (ordinal outcomes call for ordinal logistic regression)	Yes	Check outcome variable type
Independence of observations	Yes	Study design review
Linearity of log-odds	Yes	Box-Tidwell test, plots of log-odds vs. predictor
No perfect multicollinearity	Yes	VIF, correlation matrix
Adequate sample size	Yes	≥10–20 events per predictor variable
Normality of residuals	No	Not applicable
Homoscedasticity	No	Not applicable

The most practically important logistic regression assumption is the linearity of log-odds: each continuous predictor should have a linear relationship with the log-odds of the outcome (not with the probability itself). The Box-Tidwell approach tests this formally by adding, for each continuous predictor X, the interaction term X · ln(X) to the model; a significant coefficient on that term signals non-linearity. For categorical predictors, no such assumption applies.

✅

Sample Size Rule for Logistic Regression

A widely cited guideline is at least 10 events per predictor variable (EPP), where "events" means the less common of the two outcomes. With 5 predictors and a 20% event rate, you'd need at least 50 events, which requires a minimum sample of 250. Too few events per predictor inflates coefficients and produces overfit models. Small samples also make complete or quasi-complete separation more likely, where a predictor (or combination of predictors) perfectly splits the two outcomes and the maximum-likelihood estimates fail to converge.
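The arithmetic behind the rule is short enough to wrap in a helper. This is a sketch of the guideline as stated above, not a substitute for a proper sample-size calculation; the function name and parameters are hypothetical.

```python
import math

def min_sample_size(n_predictors, event_rate, events_per_predictor=10):
    """Minimum events and total sample size under an events-per-predictor rule.

    event_rate is the proportion of the less common outcome (0 < rate <= 0.5).
    """
    min_events = n_predictors * events_per_predictor
    # Small tolerance guards against floating-point noise before rounding up.
    min_n = math.ceil(min_events / event_rate - 1e-9)
    return min_events, min_n

print(min_sample_size(5, 0.20))                           # (50, 250)
print(min_sample_size(5, 0.20, events_per_predictor=20))  # (100, 500)
```

The second call uses the stricter end of the 10–20 range from the table above, which doubles the required sample.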

Chi-Square Test Assumptions

The chi-square test of independence examines whether two categorical variables are related. Its assumptions are simpler than those of parametric tests, but the expected frequency condition is frequently overlooked and causes serious errors in applied work. The chi-square test guide covers the full procedure including reading the chi-square table.

Categorical Data

Both variables must be nominal or ordinal categories. The chi-square test counts observations falling into cells of a contingency table. It is not appropriate for continuous measurements without binning, and binning arbitrary continuous data into categories loses information.

Independence of Observations

Each participant or unit should appear only once in the table. Repeated-measures categorical data (the same person measured at two time points) violates this assumption and requires McNemar's test instead.

Expected Frequency ≥ 5

The expected count (not observed count) in every cell of the contingency table should be at least 5. Cells with expected counts below 5 make the chi-square approximation poor. When this condition fails, use Fisher's exact test, which doesn't rely on the large-sample approximation.

Adequate Sample Size

As a practical guide, the total n should be at least 5 times the number of cells in the table. For a 2×3 table (6 cells), this means n ≥ 30. Very small samples make it impossible to satisfy the expected frequency condition.
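Because the rule applies to expected counts rather than observed ones, it helps to compute them explicitly: each expected count is (row total × column total) / n. A minimal sketch with made-up contingency tables:

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def meets_expected_rule(observed, minimum=5):
    """True when every expected count is at least `minimum`."""
    return all(e >= minimum for row in expected_counts(observed) for e in row)

# Hypothetical 2x2 tables, for illustration only.
roomy_table = [[10, 20], [30, 40]]
sparse_table = [[1, 2], [3, 14]]

print(expected_counts(roomy_table))       # [[12.0, 18.0], [28.0, 42.0]]
print(meets_expected_rule(roomy_table))   # True
print(meets_expected_rule(sparse_table))  # False: consider Fisher's exact test
```

In the sparse table one observed cell holds 14 cases, yet three of the four expected counts fall below 5, which is why the check has to be made on expected values.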

How to Check Statistical Assumptions: Diagnostic Guide

A systematic diagnostic routine before finalizing any analysis saves considerable time and prevents misleading conclusions. The tools below cover visual checks (fast, intuitive) and formal statistical tests (objective, sample-size-sensitive).

Visual Diagnostic Tools

📈

Q-Q Plot (Quantile-Quantile)

Plots observed quantiles against theoretical normal quantiles. Points following the diagonal line closely indicate normality. An S-shape, with both tails bending away from the line in opposite directions, indicates heavy or light tails; a consistent bow or arc to one side suggests skewness. Use it for both raw data and residuals.

📊

Residuals vs. Fitted Plot

For regression: plots residuals (y-axis) against fitted values (x-axis). Random scatter around zero means linearity and homoscedasticity are satisfied. A curve means non-linearity; a funnel shape means heteroscedasticity.

📉

Scale-Location Plot

Plots the square root of |residuals| against fitted values. A flat horizontal line confirms constant variance (homoscedasticity). An upward trend confirms variance grows with the fitted value — the most common pattern of heteroscedasticity.

🔢

Histogram of Residuals

A quick check for approximate normality of residuals. Should be roughly bell-shaped and symmetric. Useful for spotting heavy tails (kurtosis) or asymmetry (skewness), and often easier for non-specialists to read than a Q-Q plot.

Formal Statistical Tests for Assumptions

Assumption	Test	Statistic	Decision Rule	Best For
Normality	Shapiro-Wilk	W statistic	p > 0.05 = normality plausible	n ≤ 50 (most powerful)
Normality	Kolmogorov-Smirnov	D statistic	p > 0.05 = normality plausible	n > 50
Normality	Anderson-Darling	A² statistic	p > 0.05 = normality plausible	Sensitive to tail departures
Equal variance	Levene's test	F statistic	p > 0.05 = equal variances plausible	Non-normal data (robust)
Equal variance	Bartlett's test	χ² statistic	p > 0.05 = equal variances plausible	Normal data (more powerful)
Autocorrelation	Durbin-Watson	DW (0–4)	≈ 2 = no autocorrelation	Time-series regression residuals
Heteroscedasticity	Breusch-Pagan	LM statistic	p > 0.05 = no heteroscedasticity detected	Linear regression residuals
Multicollinearity	Variance Inflation Factor	VIF	VIF < 5 acceptable, < 10 tolerable	Multiple regression predictors

⚠️

Important: Don't Over-Rely on Formal Tests

With large samples, formal tests like Shapiro-Wilk have very high power and will flag trivially small departures from normality as "significant" — departures that have no meaningful impact on your analysis. With small samples, the same tests have very low power and may miss genuine violations. Always pair formal tests with visual inspection. The goal is practical, not statistical, significance of the violation.

What to Do When Assumptions Are Violated

A violated assumption is not the end of the analysis — it's a diagnostic signal that guides your next step. The appropriate response depends on which assumption is violated, the degree of the violation, and your sample size.

Violated Assumption	Moderate Violation Fix	Severe Violation Fix
Non-normality of residuals	Log or Box-Cox transformation of Y; rely on CLT if n > 100	Non-parametric test (Mann-Whitney, Kruskal-Wallis); GLM with appropriate family
Heteroscedasticity	Log-transform Y; HC robust standard errors	Weighted least squares (WLS); generalized least squares (GLS)
Autocorrelation (time-series)	Add lagged predictor; Cochrane-Orcutt procedure	ARIMA model; GLS with AR(1) error structure
Multicollinearity	Remove one correlated predictor; center variables (helps mainly when the collinearity comes from interaction or polynomial terms)	Ridge regression; PCA to create uncorrelated components
Non-linearity	Add polynomial term (X²); transform X	Non-parametric regression; generalized additive model (GAM)
Unequal variances (ANOVA)	Welch's ANOVA (adjusted df)	Kruskal-Wallis non-parametric test; log-transform Y
Non-independence	Cluster-robust standard errors	Mixed-effects model; GEE for longitudinal data

Worked Examples: Assumption Checks in Practice

Worked Example 1 — T-Test

Checking Normality Before a Two-Sample T-Test

Scenario: A nutritionist measures daily calorie intake in 25 people on Diet A and 28 on Diet B. She wants to test whether mean intake differs between groups using a two-sample t-test.

Check normality per group: Shapiro-Wilk on Diet A: W = 0.954, p = 0.31. Diet B: W = 0.947, p = 0.16. Both p > 0.05 → normality is plausible. Q-Q plots confirm approximate normality with slight right skewness in both groups.

Check equal variances: Levene's test: F = 2.14, p = 0.15. Since p > 0.05, equal variances are plausible → Student's t-test is appropriate. If p were < 0.05, switch to Welch's t-test.

Check independence: Participants were recruited independently with no repeated measurements and no family clusters in the sample. Independence holds.

Decision: All three assumptions satisfied. Proceed with the independent samples t-test. See the two-sample t-test guide for the calculation steps.

Result: Run the standard independent t-test. If normality had failed, the Mann-Whitney U test would be the non-parametric alternative.

Worked Example 2 — ANOVA

Diagnosing Assumption Violation: Unequal Variance in a Three-Group Design

Scenario: A researcher tests whether three teaching methods (Traditional, Flipped, Hybrid) produce different exam scores. Group sizes are n₁ = 18, n₂ = 22, n₃ = 14. Because the groups are unequal in size, the F-test is more sensitive to any difference in variances.

Normality check: Shapiro-Wilk per group: all p > 0.08. Q-Q plots look acceptable. Normality is not a concern here.

Homogeneity of variance: Levene's test: F = 4.82, p = 0.012. The standard ANOVA assumption of equal variances is violated. Group standard deviations are 8.2, 14.7, and 9.1 — the Flipped classroom group is substantially more variable.

Response: With unequal group sizes and a Levene's test that flags the assumption, use Welch's one-way ANOVA. The Welch F-statistic adjusts degrees of freedom to account for variance heterogeneity. Alternatively, apply a natural log transformation to scores and re-check — if it restores homoscedasticity, standard ANOVA is then appropriate.

Decision: Run Welch's ANOVA instead of standard one-way ANOVA. For pairwise follow-up tests, use Games-Howell rather than Tukey's HSD (which also requires equal variances).

Statistical Assumptions Diagnostic Checklist

Use this checklist before finalizing any parametric analysis.

✅ Pre-Analysis Assumption Checklist

Normality checked — Q-Q plot reviewed and/or Shapiro-Wilk test run on residuals (regression) or outcome per group (t-test, ANOVA)

Independence verified — Study design confirms no repeated measures, clustering, or matched pairs that would require a different model

Equal variance tested — Levene's test (for t-test/ANOVA) or Breusch-Pagan/scale-location plot (for regression)

Linearity confirmed — Residuals vs. fitted plot shows no systematic curve; scatter plots of Y vs. each X reviewed

Multicollinearity checked — VIF calculated for all predictors in multiple regression; no predictor exceeds VIF = 10

Outliers reviewed — Cook's distance (regression) or boxplots (t-test/ANOVA) used to identify high-influence points; decision made about handling

Sample size adequate — Sufficient observations relative to number of predictors; power analysis completed if relevant

Autocorrelation checked — Durbin-Watson statistic computed if data is time-ordered; value near 2.0 is acceptable

Violations documented — Any identified violations noted with the chosen remediation (transformation, robust SE, non-parametric alternative)

Full Assumptions Reference Table by Test

Statistical Test	Normality?	Equal Variance?	Independence?	Other Key Assumption	Non-Parametric Alternative
One-sample t-test	Yes (or n > 30)	N/A	Yes	Continuous data	Wilcoxon signed-rank
Independent two-sample t-test	Yes (or n > 30)	Yes (Student's) / No (Welch's)	Yes	Two independent groups	Mann-Whitney U
Paired t-test	Differences normal	N/A	Yes (between pairs)	Matched/paired observations	Wilcoxon signed-rank
One-way ANOVA	Yes (within groups)	Yes (Levene's)	Yes	Three or more groups	Kruskal-Wallis
Simple linear regression	Residuals normal	Constant (homoscedasticity)	Yes (errors)	Linearity, no outliers	Spearman / quantile regression
Multiple linear regression	Residuals normal	Constant	Yes (errors)	No multicollinearity	Ridge / LASSO for collinearity
Logistic regression	No	No	Yes	Linear log-odds, no separation	Exact logistic regression
Chi-square test	No	No	Yes	Expected freq ≥ 5	Fisher's exact test
Pearson correlation	Yes (bivariate)	N/A	Yes	Linear relationship	Spearman rank correlation
Z-test (one-sample)	Yes or n > 30	N/A	Yes	Population σ known	Sign test (large n)

Where Statistical Assumptions Matter Most

🏥

Clinical Trials

Normality and independence assumptions are scrutinized in RCTs. Violation of independence (patients clustering within hospitals) is handled with mixed-effects models. Drug approval decisions depend on correctly computed p-values, so assumption checks are mandatory per regulatory guidance.

💹

Financial Modeling

Financial returns are notorious for violating normality and constant variance: they have fat tails, and their volatility clusters over time. Classical OLS on return data produces underestimated risk. GARCH models, robust estimation, and copula-based approaches handle these violations in practice.

🔬

Social Science Research

Survey data often violates independence due to clustering (students within schools, employees within firms). Mixed-effects models or cluster-robust standard errors are standard remedies in published research.

🤖

Machine Learning

Linear models in ML make the same OLS assumptions; violations degrade out-of-sample performance. Residual diagnostic plots remain essential for linear and logistic regression deployed in production systems.

🌱

Agricultural / Experimental Design

Field experiments often have spatial correlation violating independence. Block designs, randomization, and mixed models for spatial autocorrelation are standard in agronomic research.

📡

Engineering & Quality Control

Process data collected over time is almost always autocorrelated. Statistical process control (SPC) charts and time-series models account for this. Applying a t-test to non-independent process data produces inflated false-positive rates for out-of-control signals.

Frequently Asked Questions

Q: What are statistical assumptions?

Statistical assumptions are the conditions that must hold for a test's results to be valid. They specify the type of data and data-generating process for which the test was designed. Each method, such as the t-test, ANOVA, linear regression, and chi-square, has its own assumptions. When these assumptions are met, estimates are unbiased, p-values are valid, and confidence intervals have the expected coverage. When assumptions are violated, statistical conclusions may become unreliable.

Q: What are the assumptions of linear regression?

Linear regression relies on five main assumptions: linearity between predictors and the outcome, independence of errors, homoscedasticity (constant error variance), normally distributed residuals for exact inference, and no perfect multicollinearity among predictors. When all of these except normality hold, and the errors average zero, ordinary least squares (OLS) produces the Best Linear Unbiased Estimators (BLUE).

Q: What are ANOVA assumptions?

ANOVA assumes that observations are independent, the outcome variable is approximately normally distributed within each group, and variances are equal across groups (homogeneity of variance). While ANOVA is fairly robust to moderate normality violations with balanced sample sizes, unequal variances combined with unequal group sizes can affect the validity of the F-test. Welch's ANOVA is recommended in such cases.

Q: What are the assumptions of the t-test?

The t-test requires continuous data, independent observations, and approximately normal distributions within each group, especially for small samples. The independent two-sample t-test also assumes equal variances unless Welch's t-test is used. For a paired t-test, the normality assumption applies to the differences between paired observations rather than the original values.

Q: What happens if assumptions are violated?

The impact depends on which assumption is violated. Minor departures from normality are often acceptable in large samples because of the Central Limit Theorem. Violating independence is usually the most serious issue because it can greatly increase false positive rates. Heteroscedasticity mainly affects standard errors and p-values, while nonlinearity can produce biased predictions and misleading conclusions.

Q: How do you check the normality assumption?

Normality is commonly assessed using Q-Q plots, histograms, and formal tests such as the Shapiro-Wilk test. For regression, inspect the residuals rather than the raw data. For t-tests and ANOVA, examine the outcome variable within each group. Visual inspection should always accompany formal statistical tests because large samples can detect trivial departures from normality.

Q: What is homoscedasticity?

Homoscedasticity means the variance of the errors remains constant across all levels of the predictors. When this assumption is violated, a condition known as heteroscedasticity, regression coefficients remain unbiased but standard errors become unreliable, leading to incorrect confidence intervals and hypothesis tests. Common solutions include robust standard errors, weighted least squares, or transforming the response variable.

Q: Can you still run an analysis when assumptions are violated?

Although statistical software allows analyses even when assumptions are violated, the results may no longer be trustworthy. Some violations, such as mild non-normality in large samples, have little practical impact, while others, especially lack of independence, can seriously invalidate results. The best practice is to evaluate assumptions, apply appropriate remedies when necessary, and report the checks and any adjustments made.

Sources & Further Reading

Gauss, C. F. (1809). Theoria Motus Corporum Coelestium. The original formulation of the least-squares method and underlying assumptions.

Markov, A. A. (1900). "Wahrscheinlichkeitsrechnung" (Calculus of Probability). Established the conditions (Gauss-Markov theorem) under which OLS is BLUE.

Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE. — Comprehensive applied treatment of assumption testing across all major parametric procedures. discoveringstatistics.com

Gelman, A., & Hill, J. (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press. — Chapter 3 covers regression assumption diagnostics in depth.

NIST/SEMATECH (2023). e-Handbook of Statistical Methods. U.S. Department of Commerce. itl.nist.gov/div898/handbook — The standard government reference for statistical model validation.

Wilcox, R. R. (2022). Introduction to Robust Estimation and Hypothesis Testing (5th ed.). Academic Press. — Covers robust alternatives when classical assumptions break down.