Effect Size Calculator
What Is Effect Size? (And Why P-Values Are Not Enough)
Effect size is a standardized numerical measure of the magnitude of a difference, relationship, or experimental effect in research data. While a p-value answers the question "is this effect real or due to chance?", effect size answers the separate and equally important question: "how large is this effect in practical terms?"
A study can produce a statistically significant result (p < 0.05) with an effect size so small it has no real-world importance — especially when the sample is large. Conversely, a clinically meaningful effect can fail to reach significance with a small sample. This is why the American Psychological Association's Publication Manual requires effect size reporting in all published research, and why the British Medical Journal mandates confidence intervals and effect sizes alongside every p-value.
At Statistics Fundamentals, we treat effect size as the complement to hypothesis testing, not an afterthought. Used alongside p-values and confidence intervals, effect sizes give a complete picture of what your data actually shows.
Complete Effect Size Interpretation Tables
Effect size benchmarks were established by Jacob Cohen in his 1988 text Statistical Power Analysis for the Behavioral Sciences. Cohen himself noted these are "conventional" guidelines, not absolute thresholds — context always matters.
Cohen's d Interpretation (Two Independent Groups)
Table 1: Complete Effect Size Reference — Four Metrics Compared
| Effect Level | Cohen's d | Pearson's r | Eta Squared (η²) | Cohen's f² |
|---|---|---|---|---|
| Negligible | < 0.20 | < 0.10 | < 0.01 | < 0.02 |
| Small | 0.20 | 0.10 | 0.01 | 0.02 |
| Medium | 0.50 | 0.30 | 0.06 | 0.15 |
| Large | 0.80 | 0.50 | 0.14 | 0.35 |
| Very Large | 1.20+ | 0.70+ | 0.26+ | 0.50+ |
Table 2: Effect Size Reference by Statistical Test
| Statistical Test | Preferred Effect Size | Small | Medium | Large | Formula |
|---|---|---|---|---|---|
| Independent t-test | Cohen's d | 0.20 | 0.50 | 0.80 | (M₁−M₂)/SDpooled |
| Paired t-test | Cohen's dz | 0.20 | 0.50 | 0.80 | Mdiff/SDdiff |
| One-way ANOVA | Eta Squared (η²) | 0.01 | 0.06 | 0.14 | SSbetween/SStotal |
| Factorial ANOVA | Partial η² | 0.01 | 0.06 | 0.14 | SSeffect/(SSeffect+SSerror) |
| Correlation | Pearson's r | 0.10 | 0.30 | 0.50 | ∑(x−x̄)(y−ȳ)/(nσxσy) |
| Multiple Regression | Cohen's f² | 0.02 | 0.15 | 0.35 | R²/(1−R²) |
| Chi-Square | Cramér's V | 0.10 | 0.30 | 0.50 | √(χ²/(N×min(r−1,c−1))) |
| Mann-Whitney U | Rank-biserial r | 0.10 | 0.30 | 0.50 | 1−(2U/(n₁×n₂)) |
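The last two rows of Table 2 are the ones most often computed by hand, so here is a minimal Python sketch of both formulas. The function names and the input values are hypothetical, chosen only for illustration; they do not come from a study on this page.

```python
import math

def cramers_v(chi2: float, n: int, rows: int, cols: int) -> float:
    """Cramér's V = sqrt(chi2 / (N * min(r - 1, c - 1)))."""
    return math.sqrt(chi2 / (n * min(rows - 1, cols - 1)))

def rank_biserial(u: float, n1: int, n2: int) -> float:
    """Rank-biserial r = 1 - 2U / (n1 * n2)."""
    return 1 - (2 * u) / (n1 * n2)

# Hypothetical inputs, for illustration only.
v = cramers_v(chi2=12.0, n=200, rows=3, cols=2)  # sqrt(12 / (200 * 1))
r_rb = rank_biserial(u=120, n1=20, n2=20)        # 1 - 240 / 400
print(round(v, 3), round(r_rb, 3))
```

Note that the whole product N × min(r−1, c−1) sits in the denominator under the square root.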
Effect Size Formula Library
Each effect size metric has a specific formula, assumptions, and use case. The right choice depends on your research design: two-group comparison, ANOVA, regression, or correlation. Here is the complete formula reference with variable definitions.
Cohen's d — Independent Groups
d = (M₁ - M₂) / SD_pooled
SD_pooled = sqrt(
((n₁-1)*SD₁² + (n₂-1)*SD₂²)
/ (n₁ + n₂ - 2)
)
M₁, M₂ = group means
SD₁, SD₂ = group std devs
n₁, n₂ = sample sizes
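As a rough sketch, the two formulas above translate directly into Python. The function names are illustrative, and the inputs are the summary statistics of the reading-intervention example worked through later on this page.

```python
import math

def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled SD: group variances weighted by their degrees of freedom."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def cohens_d(m1: float, sd1: float, n1: int,
             m2: float, sd2: float, n2: int) -> float:
    """Cohen's d for two independent groups."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)

sd_p = pooled_sd(12, 30, 11, 30)
d = cohens_d(78, 12, 30, 65, 11, 30)
print(round(sd_p, 2), round(d, 2))  # 11.51 1.13
```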
Hedges' g — Small Sample Correction
g = d × (1 - 3/(4N - 9))
Where:
d = Cohen's d (calculated above)
N = n₁ + n₂ (total sample size)
Use Hedges' g when:
- n per group < 20
- Conducting meta-analysis
- Unequal group sizes
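A minimal sketch of the correction, using the approximation shown above. The function name is illustrative, and the second call uses a hypothetical n = 8 per group to show how the correction grows as samples shrink.

```python
def hedges_g(d: float, n1: int, n2: int) -> float:
    """Apply the small-sample correction 1 - 3 / (4N - 9), with N = n1 + n2."""
    n_total = n1 + n2
    return d * (1 - 3 / (4 * n_total - 9))

print(round(hedges_g(1.13, 30, 30), 3))  # 1.115 (correction of about 1.3%)
print(round(hedges_g(1.13, 8, 8), 3))    # 1.068 (hypothetical n = 8 per group)
```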
Glass's Delta — Control Group SD
Δ = (M_treatment - M_control)
/ SD_control
Use when:
- Treatment may alter variability
- SD_control is the reference
- One group is clearly "control"
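For comparison with Cohen's d, here is a small sketch of Glass's Delta. The function name is illustrative; the inputs reuse the reading-intervention summary statistics from the worked example on this page (treatment M = 78, control M = 65, control SD = 11).

```python
def glass_delta(m_treatment: float, m_control: float, sd_control: float) -> float:
    """Glass's Delta: mean difference scaled by the control group's SD only."""
    if sd_control <= 0:
        raise ValueError("sd_control must be positive")
    return (m_treatment - m_control) / sd_control

delta = glass_delta(78, 65, 11)
print(round(delta, 2))  # 1.18
```

Because the control SD (11) is smaller than the pooled SD (11.51), Delta comes out slightly larger than the d = 1.13 reported for the same data.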
Eta Squared & Partial Eta Squared
η² = SS_effect / SS_total
Partial η² = SS_effect
/ (SS_effect + SS_error)
From F-statistic:
η² = (F × df_between)
/ (F × df_between + df_within)
f = sqrt(η² / (1 - η²))
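The F-statistic conversion can be sketched as follows. The function names are illustrative, and the inputs are the F(2, 57) = 3.56 result from the one-way ANOVA example on this page. In a one-way design the identity returns η²; in a design with several factors the same expression yields partial η² for the effect being tested.

```python
import math

def eta_squared_from_f(f_stat: float, df_between: int, df_within: int) -> float:
    """Recover eta squared (one-way) or partial eta squared from F and its df."""
    return (f_stat * df_between) / (f_stat * df_between + df_within)

def cohens_f(eta_sq: float) -> float:
    """Cohen's f from a proportion of variance explained."""
    return math.sqrt(eta_sq / (1 - eta_sq))

eta_sq = eta_squared_from_f(3.56, 2, 57)
print(round(eta_sq, 3), round(cohens_f(eta_sq), 3))  # roughly 0.111 and 0.353
```

The small difference from f = 0.354 in the worked example comes from F being rounded to two decimals.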
Omega Squared — Less Biased ANOVA
ω² = (SS_effect - df_between × MS_error)
/ (SS_total + MS_error)
Preferred over η² in:
- Smaller samples
- APA reporting (2019+ guidelines)
- Generalizability to population
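A short sketch of ω² next to η², under the assumption of a one-way design, so that SS_error = SS_total − SS_effect. The function name is illustrative. The sums of squares (180 and 1620) and degrees of freedom (2 and 57) are taken from the one-way ANOVA example on this page.

```python
def omega_squared(ss_effect: float, ss_total: float,
                  df_between: int, df_within: int) -> float:
    """Omega squared for a one-way ANOVA (assumes SS_error = SS_total - SS_effect)."""
    ms_error = (ss_total - ss_effect) / df_within
    return (ss_effect - df_between * ms_error) / (ss_total + ms_error)

eta_sq = 180 / 1620
omega_sq = omega_squared(180, 1620, 2, 57)
print(round(eta_sq, 3), round(omega_sq, 3))  # 0.111 0.079
```

The gap between the two values illustrates how much η² can overstate the population effect when the sample is modest.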
Cohen's f² — Regression / Correlation
f² = R² / (1 - R²)
For single predictor (Pearson's r):
f² = r² / (1 - r²)
Cohen benchmarks:
f² = 0.02 (small)
f² = 0.15 (medium)
f² = 0.35 (large)
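The conversion and benchmarks above can be sketched as below. The labelling helper is an assumption of this sketch: it treats each benchmark as a lower bound, which is one common reading of Cohen's conventions rather than a rule he stated. The input r = .42 is the correlation from the study-hours example on this page.

```python
def cohens_f2(r_squared: float) -> float:
    """Cohen's f squared from R squared (or r squared for a single predictor)."""
    return r_squared / (1 - r_squared)

def label_f2(f2: float) -> str:
    """Map f squared to a benchmark label, treating each value as a lower bound."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "negligible"

f2 = cohens_f2(0.42 ** 2)
print(round(f2, 3), label_f2(f2))  # 0.214 medium
```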
How to Calculate Effect Size Step by Step
To calculate Cohen's d: find both group means and standard deviations, compute the pooled standard deviation, then divide the mean difference by the pooled SD. Here is the complete method with a worked educational example.
Example: A reading intervention study. Treatment group (n₁ = 30): M₁ = 78, SD₁ = 12. Control group (n₂ = 30): M₂ = 65, SD₂ = 11. You want to quantify how large the intervention effect is.
SDpooled = √(((30−1) × 12² + (30−1) × 11²) / (30 + 30 − 2)) = √((29 × 144 + 29 × 121) / 58) = √((4176 + 3509) / 58) = √(7685/58) = √132.5 = 11.51
d = (M₁ − M₂) / SDpooled = (78 − 65) / 11.51 = 13 / 11.51 = 1.13
N = n₁ + n₂ = 60. g = d × (1 − 3/(4 × 60 − 9)) = 1.13 × (1 − 3/231) = 1.13 × 0.987 = 1.115. Hedges' g is slightly smaller, correcting for positive bias in small samples.
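d = 1.13 exceeds Cohen's large effect threshold of 0.8 but falls just short of the 1.20 "very large" benchmark in Table 1, so it is a large effect. In U3 terms: the average student in the treatment group scored higher than approximately 87% of the control group. Assuming normal distributions with equal variances, the overlapping coefficient 2Φ(−d/2) puts the overlap between the two distributions at about 57%.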
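State: "The reading intervention group (M = 78.0, SD = 12.0) scored significantly higher than the control group (M = 65.0, SD = 11.0), t(58) = 4.37, p < .001, d = 1.13, 95% CI [0.58, 1.67], indicating a large effect." The t value follows from t = (M₁ − M₂) / (SDpooled × √(1/n₁ + 1/n₂)) = 13 / (11.51 × 0.2582) ≈ 4.37. The interval is the large-sample approximation d ± 1.96 × SE, with SE = √((n₁+n₂)/(n₁n₂) + d²/(2(n₁+n₂))) ≈ 0.278, computed from the unrounded d = 1.129.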
Result: M₁ = 78, M₂ = 65, SDpooled = 11.51, Cohen's d = 1.13, Hedges' g = 1.115. You can verify this using the Cohen's d tab of the calculator above.
Real-World Worked Examples
The following examples cover four common research designs. Each shows the raw data, the effect size calculation, and the correct way to report the result.
Example 1: Independent t-Test (Cognitive Training)
Calculation: With n₁ = n₂ = 25 (implied by the 24 degrees of freedom per group), SDpooled = √(((24×100) + (24×121))/48) = √(5304/48) = √110.5 = 10.51. Cohen's d = (85−78)/10.51 = 0.67 (medium-to-large effect).
APA report: "The training group (M = 85, SD = 10) outperformed controls (M = 78, SD = 11), d = 0.67, indicating a medium-to-large effect."
Example 2: One-Way ANOVA (Three Teaching Methods)
Calculation: η² = 180/1620 = 0.111. This falls between Cohen's medium (0.06) and large (0.14) thresholds — a medium-large effect. Cohen's f = √(0.111/0.889) = 0.354.
APA report: "Teaching method had a significant effect on test scores, F(2, 57) = 3.56, p = .035, η² = .11, representing a medium-to-large effect."
Example 3: Correlation (Study Hours vs. GPA)
Calculation: r² = 0.42² = 0.176. Cohen's f² = 0.176/(1−0.176) = 0.214 (medium-to-large by Cohen's regression thresholds). Study time explains 17.6% of the variance in GPA.
APA report: "Weekly study hours correlated positively with GPA, r(118) = .42, p < .001, r² = .18, indicating a medium-to-large effect."
Example 4: Paired t-Test (Pre/Post Intervention)
Calculation: Cohen's dz = Mdiff/SDdiff = 14/10.2 = 1.37 (very large effect).
APA report: "Anxiety scores decreased significantly after therapy, t(39) = 8.68, p < .001, dz = 1.37, indicating a very large pre-post change."
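The paired-design numbers in Example 4 can be checked with a short sketch. The function names are illustrative, and n = 40 is inferred from the reported degrees of freedom, t(39), since df = n − 1 for a paired test.

```python
import math

def cohens_dz(mean_diff: float, sd_diff: float) -> float:
    """Cohen's dz: mean of the paired differences over their SD."""
    return mean_diff / sd_diff

def paired_t(mean_diff: float, sd_diff: float, n: int) -> float:
    """Paired t statistic, which equals dz * sqrt(n)."""
    return mean_diff / (sd_diff / math.sqrt(n))

dz = cohens_dz(14, 10.2)
t = paired_t(14, 10.2, 40)
print(round(dz, 2), round(t, 2))  # 1.37 8.68
```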
What Distribution Overlap Looks Like for Each Effect Size
One of the clearest ways to grasp what Cohen's d means is to visualize how much the two group distributions overlap. A larger d means less overlap and a more detectable difference.
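As a rough numerical companion to the visualization, the sketch below computes two overlap measures for Cohen's benchmark values. It assumes both groups are normally distributed with equal variances. U3 is Φ(d), the share of the control group scoring below the treatment-group mean, and the overlapping coefficient is 2Φ(−d/2). The function names are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, SD 1

def u3(d: float) -> float:
    """Cohen's U3: proportion of the control group below the treatment mean."""
    return STD_NORMAL.cdf(d)

def overlap(d: float) -> float:
    """Overlapping coefficient of two unit-variance normals whose means differ by d."""
    return 2 * STD_NORMAL.cdf(-abs(d) / 2)

for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: U3 = {u3(d):.1%}, overlap = {overlap(d):.1%}")
```

Under these assumptions, a small effect (d = 0.2) leaves roughly 92% of the two distributions overlapping, while a large effect (d = 0.8) still leaves about 69%.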
Effect Size: Complete Formula and Entity Reference
The table below covers every key effect size metric, its formula, typical interpretation thresholds, and primary use case.
Table: Effect Size Formula Glossary — 8 Key Metrics
| Metric | Formula | Small / Medium / Large | Primary Use Case |
|---|---|---|---|
| Cohen's d | (M₁−M₂)/SDpooled | 0.2 / 0.5 / 0.8 | Two independent groups (t-test) |
| Hedges' g | d × (1−3/(4N−9)) | 0.2 / 0.5 / 0.8 | Meta-analysis; small samples (n < 20) |
| Glass's Delta | (Mtreat−Mctrl)/SDctrl | 0.2 / 0.5 / 0.8 | When intervention alters variability |
| Eta Squared (η²) | SSeffect/SStotal | 0.01 / 0.06 / 0.14 | One-way ANOVA, overall variance explained |
| Partial η² | SSeffect/(SSeffect+SSerror) | 0.01 / 0.06 / 0.14 | Factorial ANOVA, controlling for other factors |
| Omega Squared (ω²) | (SSeffect−dfbetween×MSerror)/(SStotal+MSerror) | 0.01 / 0.06 / 0.14 | Less biased ANOVA estimate (small samples) |
| Pearson's r | Σ(x−x̄)(y−ȳ)/(nσxσy) | 0.10 / 0.30 / 0.50 | Linear correlation between two variables |
| Cohen's f² | R²/(1−R²) | 0.02 / 0.15 / 0.35 | Multiple regression, power analysis |
APA Reporting Templates for Effect Size
The APA Publication Manual (7th ed., 2020) requires effect size reporting in all quantitative research. Use the fill-in-the-blank templates below directly in your manuscript, thesis, or lab report.
Independent-Samples t-Test
One-Way ANOVA
Correlation
Common Mistakes in Effect Size Calculation
These are the errors that most often invalidate effect size reports in published research. Avoiding them makes your analysis credible and reproducible.
A p-value near zero does not mean a large effect. With n = 50,000, a difference of d = 0.05 (trivial) will reliably yield p < 0.001. Always compute and report effect size separately from the significance test. See the p-values guide for the distinction.
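To see how sample size alone drives the p-value, here is a small sketch that holds d fixed at 0.05 and varies n. It assumes two equal groups and uses a normal approximation to the t distribution, which is adequate at these sample sizes. The function name is illustrative, and reading n = 50,000 as 25,000 per group is an assumption of this sketch.

```python
import math
from statistics import NormalDist

def approx_two_sided_p(d: float, n_per_group: int) -> float:
    """Approximate two-sided p-value for a two-group comparison with equal n."""
    z = d * math.sqrt(n_per_group / 2)  # test statistic implied by d and n
    return 2 * NormalDist().cdf(-abs(z))

for n in (50, 500, 5_000, 25_000):
    print(f"n per group = {n:>6}: p = {approx_two_sided_p(0.05, n):.2g}")
```

The effect size never changes across the rows; only the p-value does.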
Cohen's d uses the pooled SD, averaging both groups. Glass's Delta uses only the control group SD. Using the wrong denominator produces an incorrect d and misrepresents the effect. When group variances are very unequal, prefer Glass's Delta or report the assumption explicitly.
Cohen's d has a positive bias in small samples: it overestimates the true population effect. When n per group is below 20, or in any meta-analysis, Hedges' g (the bias-corrected form) should be used. The correction factor is (1 − 3/(4N−9)).
A d = 0.2 "small" effect in a medical context (e.g., a drug that modestly lowers mortality across a large population) may be enormously important, while a d = 0.8 "large" effect in market research might be commercially irrelevant. Always report the effect size alongside a domain-appropriate interpretation. See the NCI Cancer Bulletin discussion on clinical vs. statistical significance.
Many statistics packages (SPSS, Jamovi) report partial η² by default in ANOVA, but label it simply as "Eta Squared." In factorial ANOVA with multiple factors, partial η² is at least as large as η², and larger whenever the other factors explain any variance, because it excludes those variance sources from the denominator. Check your software output carefully and report which version you used.
Effect Size in Meta-Analysis
Meta-analysis synthesizes results across multiple studies by averaging their effect sizes, weighted by sample size or inverse variance. This is why consistent, accurate effect size reporting in individual studies is essential: it enables the research literature to accumulate in a meaningful way.
The preferred effect size for meta-analysis of two-group comparisons is Hedges' g, not Cohen's d, precisely because Hedges' g corrects for the small-sample bias that would otherwise inflate pooled estimates. For correlational meta-analyses, Fisher's z-transformation is applied to Pearson's r values before pooling (the sampling distribution of r is skewed and its variance depends on the true correlation, whereas Fisher's z is approximately normal with variance 1/(n − 3)), and the pooled z is then converted back to r for interpretation.
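The Fisher's z procedure described above can be sketched as a fixed-effect pooling routine. This is a minimal illustration, not a full meta-analysis: it uses inverse-variance weights of n − 3, ignores between-study heterogeneity, and the three (r, n) pairs are hypothetical.

```python
import math

def pool_correlations(studies: list[tuple[float, int]]) -> float:
    """Fixed-effect pooled r from (r, n) pairs via Fisher's z-transformation."""
    weighted_sum = 0.0
    total_weight = 0.0
    for r, n in studies:
        z = math.atanh(r)   # Fisher's z
        weight = n - 3      # inverse of var(z) = 1 / (n - 3)
        weighted_sum += weight * z
        total_weight += weight
    return math.tanh(weighted_sum / total_weight)  # back-transform to r

# Hypothetical studies, for illustration only: (r, n)
studies = [(0.42, 120), (0.30, 60), (0.55, 45)]
pooled_r = pool_correlations(studies)
print(round(pooled_r, 3))
```

A random-effects model would add a between-study variance term to each weight; that step is omitted here.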
Related Statistical Guides at Statistics Fundamentals
Effect size sits at the intersection of hypothesis testing, power analysis, and statistical interpretation. These resources from Statistics Fundamentals build out the full picture.
Frequently Asked Questions About Effect Size
An effect size calculator is a statistical tool that quantifies the practical magnitude of a difference or relationship in research data. While p-values indicate whether an effect is statistically significant (unlikely to be due to chance), effect size calculators tell you how large that effect is using standardized metrics like Cohen's d, Hedges' g, and Eta Squared. Effect size is what connects statistical results to real-world importance.
Cohen's d is calculated by dividing the mean difference between two groups by the pooled standard deviation: d = (M₁ − M₂) / SDpooled. The pooled SD combines both group SDs weighted by their degrees of freedom: SDpooled = √(((n₁−1)×SD₁² + (n₂−1)×SD₂²) / (n₁+n₂−2)). The result is interpreted using Cohen's benchmarks: 0.2 (small), 0.5 (medium), 0.8 (large).
For Cohen's d, the conventional thresholds are 0.2 (small), 0.5 (medium), and 0.8 (large). For Eta Squared, they are 0.01, 0.06, and 0.14. For Pearson's r, they are 0.10, 0.30, and 0.50. However, what is "good" depends heavily on the field: medical interventions may be practically important at d = 0.2, while in education John Hattie's large-scale synthesis of learning research treats d = 0.40, roughly the average effect across the interventions he reviewed, as the "hinge point" above which an intervention merits attention.
A p-value answers: "If there were no real effect, how often would I see results this extreme by chance?" Effect size answers: "How large is the actual difference or relationship?" A study with n = 100,000 (50,000 per group) can produce p < 0.001 for a difference so small it has no practical meaning (d = 0.03, which gives t ≈ 4.7). Conversely, a meaningful clinical effect might not reach significance with a small sample. Always report both. See our p-values guide for the full explanation.
Hedges' g is a bias-corrected version of Cohen's d. Use Hedges' g when: (1) sample sizes are small (n < 20 per group), because Cohen's d overestimates the true population effect in small samples; (2) you are conducting a meta-analysis, where pooling slightly inflated d values would compound the bias; or (3) group sizes are very unequal. For large samples, d and g converge to nearly identical values.
Eta Squared (η²) measures the proportion of total variance in the dependent variable that is explained by the independent variable in an ANOVA. η² = SSbetween/SStotal. It is the ANOVA equivalent of R² in regression. Use η² for one-way ANOVA; use partial η² for factorial ANOVA to control for other factors. For small samples, Omega Squared (ω²) is a less biased alternative, as η² slightly overestimates the population effect.
If you only have the t-statistic and degrees of freedom (not raw means and SDs), you can convert: d = t × √((n₁+n₂)/(n₁×n₂)). For a one-sample or paired t-test: d = t / √n. The Pearson r equivalent is: r = t / √(t² + df). These conversions are standard and widely used when reporting effect sizes from published studies that only report test statistics.
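The three conversions above can be sketched as small helper functions. The names are illustrative. The paired inputs (t = 8.68, n = 40) come from Example 4 on this page, while the independent-groups inputs (t = 2.5, n = 20 per group) are hypothetical.

```python
import math

def d_from_t_independent(t: float, n1: int, n2: int) -> float:
    """Cohen's d from an independent-samples t statistic."""
    return t * math.sqrt((n1 + n2) / (n1 * n2))

def d_from_t_paired(t: float, n: int) -> float:
    """Cohen's dz from a one-sample or paired t statistic."""
    return t / math.sqrt(n)

def r_from_t(t: float, df: int) -> float:
    """Correlation-equivalent effect size from t and its degrees of freedom."""
    return t / math.sqrt(t**2 + df)

print(round(d_from_t_paired(8.68, 40), 2))           # 1.37, matching Example 4
print(round(r_from_t(8.68, 39), 2))                  # 0.81
print(round(d_from_t_independent(2.5, 20, 20), 2))   # 0.79 (hypothetical inputs)
```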
Effect sizes, like all sample statistics, are point estimates with sampling error. A 95% confidence interval around Cohen's d tells you the range of plausible population values. The APA recommends reporting CIs for all effect sizes. For example: d = 0.65, 95% CI [0.31, 0.99]. A CI that excludes zero indicates that the effect differs from zero at the corresponding significance level; a very wide CI signals that the estimate is imprecise (usually a small sample problem). See our confidence intervals guide for the underlying methodology.
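One common way to obtain such an interval is the large-sample normal approximation, sketched below. It is an approximation; exact intervals are based on the noncentral t distribution. The function name is illustrative, and n = 70 per group is a hypothetical sample size chosen because it reproduces the example interval.

```python
import math
from statistics import NormalDist

def d_confidence_interval(d: float, n1: int, n2: int,
                          level: float = 0.95) -> tuple[float, float]:
    """Approximate CI for Cohen's d using its large-sample standard error."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return d - z * se, d + z * se

low, high = d_confidence_interval(0.65, 70, 70)  # hypothetical n = 70 per group
print(f"[{low:.2f}, {high:.2f}]")  # [0.31, 0.99]
```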