What is the Pearson correlation formula?

r = Σ(xi − x̄)(yi − ȳ) / √[Σ(xi − x̄)² · Σ(yi − ȳ)²]. This divides the covariance of X and Y by the product of their standard deviations, producing a dimensionless coefficient between -1 and +1.

What is a strong Pearson correlation?

Values of r between 0.60 and 0.79 are generally considered strong, and values from 0.80 to 1.00 are very strong. The interpretation depends on context — in psychology r = 0.30 can be practically meaningful, while in physics a correlation below 0.99 might be considered weak.

What are the assumptions of Pearson correlation?

Pearson correlation requires: (1) both variables are continuous, (2) the relationship is linear, (3) observations are independent, (4) no extreme outliers, and (5) approximate bivariate normality. When these are not met, Spearman rank correlation is usually a better choice.

Does correlation imply causation?

No. A high Pearson r only confirms a linear association exists in your sample. It cannot establish that one variable causes changes in the other. A confounding third variable, reverse causation, or coincidence could all produce the same coefficient.

What is the difference between Pearson and Spearman correlation?

Pearson measures linear association between continuous variables and assumes approximate normality. Spearman measures monotonic association using ranked data and makes no distributional assumptions, making it more appropriate for ordinal data or when outliers are present.

Pearson Correlation: Coefficient Guide, Formula & Interpretation (2026)

What Is Pearson Correlation?

Definition — Pearson Correlation Coefficient

Pearson correlation is a standardized measure of the strength and direction of a linear relationship between two continuous variables. The coefficient, written r, ranges from −1 to +1. A value near +1 means both variables tend to increase together; near −1 means one rises as the other falls; near 0 means no detectable linear pattern.

−1 ≤ r ≤ +1

The full name is the Pearson product-moment correlation coefficient, introduced by Karl Pearson in 1896 based on earlier work by Francis Galton on regression toward the mean. The "product-moment" refers to the fact that r is computed from the products of mean-centered values — what statisticians call cross-products or moments about the mean.

What r measures specifically is linear association. Two variables can be strongly related in a curved or U-shaped way and still produce r ≈ 0 if the relationship is not well-described by a straight line. Always plot your data in a scatter plot before interpreting r — the number alone does not tell the full story.

Correlation ≠ Causation

A high r only confirms a linear pattern in your sample data. It does not mean one variable causes the other to change. A confounding third variable, coincidence, or reverse causation can all produce the same coefficient. Establishing causation requires controlled experiments or causal inference methods.

+1.0: Perfect positive linear relationship

0: No linear association

−1.0: Perfect negative linear relationship

r²: Proportion of variance explained (R²)

Karl Pearson (1896). "Mathematical Contributions to the Theory of Evolution." Philosophical Transactions of the Royal Society A, 187, 253–318. The modern formula and notation are documented in the NIST Engineering Statistics Handbook §6.3.5.

Pearson Correlation Formula

The formula comes directly from the definition of covariance. Covariance measures how two variables vary together, but its magnitude depends on the measurement units of each variable. Dividing by both standard deviations removes the units and constrains the result to [−1, +1].

Pearson Correlation Coefficient — Sample Formula

r = Σ(xᵢ − x̄)(yᵢ − ȳ) / √[Σ(xᵢ − x̄)² · Σ(yᵢ − ȳ)²]

where r = Pearson correlation coefficient; xᵢ = each observed X value; yᵢ = each observed Y value; x̄ = mean of X; ȳ = mean of Y; n = number of pairs

The numerator, Σ(xᵢ − x̄)(yᵢ − ȳ), is the sum of cross-products of deviations from the mean — this is n−1 times the sample covariance. The denominator scales it by the product of both standard deviations, making r unitless and bounded.
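To make the unit-removal argument concrete, here is a minimal Python sketch that uses only the standard library and assumes plain lists of paired values. The function names `covariance` and `pearson_r` are illustrative. The data are the study-hours values from the worked example in this guide, plus a hypothetical conversion of hours to minutes.

```python
import math

def covariance(xs, ys):
    """Sample covariance: sum of cross-products divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def pearson_r(xs, ys):
    """Covariance divided by the product of the two sample standard deviations."""
    sd_x = math.sqrt(covariance(xs, xs))  # cov(X, X) is the variance of X
    sd_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)

hours = [4, 6, 8, 10, 12, 14]
scores = [65, 70, 75, 82, 88, 93]
minutes = [h * 60 for h in hours]  # same quantity in a different unit (assumed for illustration)

# Covariance scales with the measurement unit...
print(covariance(hours, scores), covariance(minutes, scores))
# ...while r is unchanged by the rescaling.
print(pearson_r(hours, scores), pearson_r(minutes, scores))
```

Changing hours to minutes multiplies the covariance by 60 but leaves r the same, which is the practical meaning of "unitless" here.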

Equivalent Computational Form

For hand calculation with a data table, the equivalent raw-score formula avoids computing means first:

Raw-Score Computational Formula

r = [nΣxy − ΣxΣy] / √{[nΣx² − (Σx)²][nΣy² − (Σy)²]}

where n = number of data pairs; Σxy = sum of each xᵢyᵢ product; Σx² = sum of squared x values
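As a rough check that the raw-score form matches the deviation form, the sketch below computes both on the study-hours data from this guide. The function names are illustrative and not taken from any library.

```python
import math

def pearson_r_raw(xs, ys):
    """Raw-score formula: needs only running sums, no means."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def pearson_r_deviation(xs, ys):
    """Deviation formula: cross-products of mean-centered values."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sum_cross / math.sqrt(ss_x * ss_y)

hours = [4, 6, 8, 10, 12, 14]
scores = [65, 70, 75, 82, 88, 93]

r_raw = pearson_r_raw(hours, scores)
r_dev = pearson_r_deviation(hours, scores)
print(f"raw-score: {r_raw:.6f}, deviation: {r_dev:.6f}")
```

The two forms are algebraically identical. In floating-point code the deviation form is usually the safer choice, because the raw-score form subtracts two large, nearly equal numbers when values are large relative to their spread.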

Population vs Sample Notation

The formula above gives the sample coefficient r, computed from n observed pairs. It estimates the true population correlation, denoted ρ (the Greek letter rho). For the population, replace the sums with expected values: ρ = Cov(X,Y) / (σₓ · σᵧ). In practice you almost always work with sample data, so r is what you calculate.

Coefficient of Determination (R²)

Squaring the Pearson r gives R², the proportion of variance in one variable that is statistically accounted for by linear association with the other. If r = 0.80, then R² = 0.64, meaning 64% of the variance in Y is explained by the linear relationship with X. R² appears again in simple linear regression, where it measures overall model fit.

Coefficient of Determination

R² = r²

Ranges from 0 (no explanation) to 1 (perfect explanation)

How to Interpret Pearson r

The sign tells you direction; the absolute value tells you strength. The bands below are a widely used rule of thumb in behavioral and social science research. Jacob Cohen's (1988) own benchmarks are more lenient: |r| of about 0.10 is small, 0.30 is medium, and 0.50 is large. In physics or engineering, tolerances are often much tighter.

Pearson r Scale — Direction and Strength

−1.0 Perfect Negative | −0.6 Strong Negative | −0.3 Weak Negative | 0 None | +0.3 Weak Positive | +0.6 Strong Positive | +1.0 Perfect Positive

r Value	Interpretation	R² (Variance Explained)	Example Context
−1.0	Perfect negative linear relationship	100%	Theoretical only
−0.80 to −1.0	Very strong negative	64–100%	Price vs demand (economics)
−0.60 to −0.79	Strong negative	36–62%	Exercise frequency vs resting heart rate
−0.40 to −0.59	Moderate negative	16–35%	Stress vs sleep quality
−0.20 to −0.39	Weak negative	4–15%	Commute time vs job satisfaction
−0.19 to +0.19	Very weak or no linear relationship	0–4%	Shoe size vs IQ
+0.20 to +0.39	Weak positive	4–15%	Height vs shoe size
+0.40 to +0.59	Moderate positive	16–35%	Study hours vs exam score
+0.60 to +0.79	Strong positive	36–62%	SAT math vs SAT verbal
+0.80 to +1.0	Very strong positive	64–100%	Same variable measured twice (test-retest)
+1.0	Perfect positive linear relationship	100%	Height (cm) vs height (inches)

Statistical vs Practical Significance

With a large sample (n = 1,000), r = 0.08 can be statistically significant (p < 0.05) even though it explains less than 1% of the variance. Always report both r and R², and consider whether the effect size is meaningful for your specific context — not just whether p is below 0.05.

How to Calculate Pearson Correlation (7 Steps)

The following worked calculation uses a small illustrative dataset: hours studied per week and final exam percentage for 6 students. Each step maps directly to one part of the formula.

Manual Calculation — Study Hours vs Exam Score

Dataset: n = 6 students. X = weekly study hours, Y = exam percentage.

Student	X (hours)	Y (score %)
A	4	65
B	6	70
C	8	75
D	10	82
E	12	88
F	14	93

Calculate the means:
x̄ = (4+6+8+10+12+14)/6 = 54/6 = 9.0
ȳ = (65+70+75+82+88+93)/6 = 473/6 = 78.83

Compute deviations from the mean (xᵢ − x̄) and (yᵢ − ȳ):
A: (4−9)=−5, (65−78.83)=−13.83 | B: (6−9)=−3, (70−78.83)=−8.83
C: (8−9)=−1, (75−78.83)=−3.83 | D: (10−9)=+1, (82−78.83)=+3.17
E: (12−9)=+3, (88−78.83)=+9.17 | F: (14−9)=+5, (93−78.83)=+14.17

Compute cross-products (xᵢ − x̄)(yᵢ − ȳ):
A: (−5)(−13.83)=69.15 | B: (−3)(−8.83)=26.49 | C: (−1)(−3.83)=3.83
D: (1)(3.17)=3.17 | E: (3)(9.17)=27.51 | F: (5)(14.17)=70.85
Σ(xᵢ − x̄)(yᵢ − ȳ) = 201.00

Compute Σ(xᵢ − x̄)²:
(−5)²+(−3)²+(−1)²+(1)²+(3)²+(5)² = 25+9+1+1+9+25 = 70

Compute Σ(yᵢ − ȳ)²:
(−13.83)²+(−8.83)²+(−3.83)²+(3.17)²+(9.17)²+(14.17)² ≈ 191.27+77.97+14.67+10.05+84.09+200.79 = 578.84

Apply the formula:
r = 201.00 / √(70 × 578.84) = 201.00 / √40,518.8 = 201.00 / 201.29 ≈ 0.999

Interpret: r = 0.999 indicates a near-perfect positive linear relationship. R² ≈ 0.997 (computed from the unrounded r ≈ 0.9986), so roughly 99.7% of the variance in exam scores is explained by the linear relationship with weekly study hours in this sample.

✅ r = 0.999. Students who study more hours score higher on exams in an almost perfectly linear pattern. Causation would require a controlled experiment — other variables (ability, prior knowledge) are not held constant here.
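The seven steps above can be sketched in Python as follows. This is a plain transcription of the hand calculation using only the standard library, and the variable names are illustrative. Because it keeps unrounded intermediate values, the last digits can differ slightly from the hand-rounded figures.

```python
import math

hours = [4, 6, 8, 10, 12, 14]       # X: weekly study hours
scores = [65, 70, 75, 82, 88, 93]   # Y: exam percentage
n = len(hours)

# Step 1: means
mean_x = sum(hours) / n
mean_y = sum(scores) / n

# Step 2: deviations from the mean
dev_x = [x - mean_x for x in hours]
dev_y = [y - mean_y for y in scores]

# Step 3: sum of cross-products
sum_cross = sum(dx * dy for dx, dy in zip(dev_x, dev_y))

# Steps 4 and 5: sums of squared deviations
ss_x = sum(dx ** 2 for dx in dev_x)
ss_y = sum(dy ** 2 for dy in dev_y)

# Step 6: apply the formula
r = sum_cross / math.sqrt(ss_x * ss_y)

# Step 7: report r and R squared for interpretation
print(f"means: {mean_x:.2f}, {mean_y:.2f}")
print(f"sum of cross-products: {sum_cross:.2f}")
print(f"SS_x = {ss_x:.2f}, SS_y = {ss_y:.2f}")
print(f"r = {r:.4f}, R^2 = {r ** 2:.4f}")
```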

Calculation methodology follows NIST Engineering Statistics Handbook Chapter 6. For software implementation, see the SciPy pearsonr documentation.

Pearson Correlation Assumptions

Pearson r is only a valid and interpretable measure when the following five conditions hold. Violating them does not always make r numerically impossible to compute — it just means the number does not mean what you think it means.

Assumption 1

Continuous Variables

Both X and Y must be measured on a continuous interval or ratio scale. Pearson r is not appropriate for ordinal data (ranked categories); use Spearman instead. When one variable is binary, the coefficient is called the point-biserial correlation, which is the Pearson formula applied to a 0/1 variable.

Assumption 2

Linear Relationship

The underlying relationship between X and Y must be approximately linear. A scatter plot will reveal curves, U-shapes, or other non-linear patterns that Pearson r will underestimate. Spearman captures monotonic (but not necessarily linear) relationships.

Assumption 3

Independence of Observations

Each pair (xᵢ, yᵢ) must come from a different, independent unit. Repeated measures from the same individual, time-series data with autocorrelation, or clustered samples all violate this assumption and can inflate r.

Assumption 4

No Extreme Outliers

A single outlier can shift r by 0.3 or more in a small sample. Check scatter plots for leverage points before reporting r. Robust alternatives include Spearman's ρ or the winsorized correlation for datasets with outliers.
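The sketch below shows how much a single point can move r. The eight clean pairs and the added outlier are hypothetical values chosen for demonstration, and `pearson_r` is an illustrative helper rather than a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from mean-centered cross-products."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sum_cross / math.sqrt(ss_x * ss_y)

# Hypothetical data: a clear upward trend across eight pairs
x_clean = [1, 2, 3, 4, 5, 6, 7, 8]
y_clean = [2, 4, 5, 4, 6, 7, 8, 9]

# The same data plus one high-leverage point far below the trend
x_outlier = x_clean + [9]
y_outlier = y_clean + [0]

r_clean = pearson_r(x_clean, y_clean)
r_with_outlier = pearson_r(x_outlier, y_outlier)
print(f"without outlier: r = {r_clean:.3f}")
print(f"with outlier:    r = {r_with_outlier:.3f}")
```

Running the same comparison on your own data, with and without a suspect point, is a quick sensitivity check before reporting r.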

Assumption 5

Approximate Bivariate Normality

For significance testing (the t-test below), the pair (X, Y) should follow an approximate bivariate normal distribution. With large samples (n > 30) this matters less due to the central limit theorem — see the central limit theorem guide. For small n, check histograms and Q-Q plots.

Quick Assumption Check

Before reporting r: (1) confirm both variables are continuous, (2) inspect a scatter plot for linearity and outliers, (3) verify observations are independent. These three steps catch the most common errors. Full normality testing matters mainly when n < 30.

Hypothesis Testing for Pearson Correlation

Computing r tells you the sample correlation. To decide whether the result reflects a true population correlation — or could plausibly be produced by chance from data where ρ = 0 — you run a significance test using a t-statistic. This connects directly to hypothesis testing principles.

Setting Up the Hypotheses

Hypotheses for Pearson Correlation Test

H₀: ρ = 0 — The population correlation is zero; any sample r is due to chance
H₁: ρ ≠ 0 — Two-tailed test: the population has a nonzero correlation (either direction)
H₁: ρ > 0 — One-tailed test: the population correlation is positive
H₁: ρ < 0 — One-tailed test: the population correlation is negative

The t-Statistic

t-Test for Pearson Correlation

t = r · √(n − 2) / √(1 − r²)

where r = sample Pearson coefficient; n = number of pairs; df = n − 2 degrees of freedom

This t-statistic follows a t-distribution with df = n − 2 degrees of freedom under H₀. Compare it to the critical value from a t-distribution table for your α and number of tails, or read the p-value directly from software. The n−2 comes from estimating two parameters (the intercept and slope of the regression line that connects correlation to regression).

Significance Test — Is r = 0.72 significant at α = 0.05?

Given: r = 0.72, n = 18 pairs, two-tailed test, α = 0.05

Hypotheses: H₀: ρ = 0 | H₁: ρ ≠ 0 (two-tailed)

Degrees of freedom: df = 18 − 2 = 16. From the Pearson correlation table, the critical r at df=16, α=0.05 two-tailed is 0.468.

Calculate t: t = 0.72 × √(16) / √(1 − 0.72²) = 0.72 × 4 / √(1 − 0.518) = 2.88 / √0.482 = 2.88 / 0.694 = 4.15

Critical value: For df=16, two-tailed, α=0.05: t* = ±2.120. Our |t| = 4.15 > 2.120.

p-value: p ≈ 0.0008 (well below 0.05)

✅ Reject H₀. With r = 0.72, t(16) = 4.15, p ≈ 0.0008, the correlation is statistically significant at α = 0.05. There is evidence of a positive linear relationship in the population.
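The test above can be reproduced with a short standard-library sketch. Since the Python standard library has no t-distribution, the p-value here is approximated by integrating the t density with Simpson's rule. That is a stand-in chosen for this illustration, not how statistical packages compute it, and the function names are hypothetical.

```python
import math

def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Approximate two-tailed p-value: 1 - 2 * P(0 < T < |t|), via Simpson's rule.

    `steps` must be even. math.gamma overflows for very large df, so this
    sketch is only meant for small and moderate samples.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3
    return max(0.0, 1 - 2 * central_area)

r, n = 0.72, 18
df = n - 2
t = t_statistic(r, n)
p = two_tailed_p(t, df)
print(f"t({df}) = {t:.2f}, two-tailed p = {p:.4f}")
print("reject H0 at alpha = 0.05" if p < 0.05 else "fail to reject H0")
```

For real analyses a tested routine such as SciPy's `pearsonr` is the better choice; the point of the sketch is only to show where t and p come from.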

APA-Style Reporting

In research papers, report r with its degrees of freedom in parentheses, followed by the p-value: r(16) = .72, p < .001. APA style reports p-values below .001 as p < .001 rather than as an exact value. Some journals also require the 95% confidence interval for r, computed using Fisher's z-transformation.

Pearson Correlation Examples

Example 1 — Health Research: Blood Pressure and Age

Worked Example — Clinical Research

A researcher records age (years) and systolic blood pressure (mmHg) for 8 adults to see whether age predicts blood pressure.

Person	Age (X)	SBP mmHg (Y)
1	25	115
2	31	122
3	38	127
4	45	134
5	52	140
6	58	148
7	65	155
8	72	162

x̄ = 48.25, ȳ = 137.875

Σ(xᵢ−x̄)(yᵢ−ȳ) = 1,896.25 | Σ(xᵢ−x̄)² = 1,907.5 | Σ(yᵢ−ȳ)² = 1,890.875

r = 1,896.25 / √(1,907.5 × 1,890.875) = 1,896.25 / √3,606,844.1 ≈ 1,896.25 / 1,899.17 ≈ 0.998

✅ r ≈ 0.998. Systolic blood pressure rises in a near-perfectly linear pattern with age in this sample. R² ≈ 0.997. Note: this sample is small, and the relationship would need to be confirmed in a larger study before drawing medical conclusions.

Example 2 — Marketing: Advertising Spend vs Revenue

Worked Example — Business Analytics

Monthly ad spend ($000s) and revenue ($000s) across 6 months. Does advertising drive revenue in this dataset?

Month	Ad Spend X ($k)	Revenue Y ($k)
Jan	10	82
Feb	14	97
Mar	18	103
Apr	20	114
May	25	125
Jun	30	141

x̄ = 19.5, ȳ = 110.33

Σ(xᵢ−x̄)(yᵢ−ȳ) = 758 | Σ(xᵢ−x̄)² = 263.5 | Σ(yᵢ−ȳ)² ≈ 2,203.3

r = 758 / √(263.5 × 2,203.3) ≈ 758 / √580,570 ≈ 758 / 761.95 ≈ 0.995

✅ r ≈ 0.995 (very strong positive). R² ≈ 0.990. Nearly all variance in monthly revenue is explained by the linear relationship with advertising spend. To model this formally, use simple linear regression.

Pearson vs Spearman vs Kendall

Three correlation coefficients are used routinely in statistics. Choosing the wrong one gives a number that does not answer your actual question. The decision depends on your data type, whether you expect a linear or just monotonic relationship, and how sensitive you need to be to outliers.

Feature	Pearson r	Spearman ρ	Kendall τ
Relationship type measured	Linear only	Monotonic (any direction)	Monotonic (concordance-based)
Data type required	Continuous (interval/ratio)	Ordinal or continuous	Ordinal or continuous
Distributional assumption	Approximate bivariate normality	None (non-parametric)	None (non-parametric)
Sensitivity to outliers	High	Low (ranks reduce influence)	Low
Effect of tied ranks	N/A	Requires tie correction	Handles ties naturally
Preferred sample size	n ≥ 10, larger better	n ≥ 10	Better with small n
Common use cases	Physical measurements, finance, psychological scales	Survey Likert data, non-normal variables	Small samples, heavy ties

A simple rule: use Pearson when your data is continuous and you have checked the scatter plot for linearity and absence of extreme outliers. Switch to Spearman when the relationship might be monotonic-but-curved, when data is ordinal, or when outliers are present. Kendall tau is preferred for very small samples or datasets with many tied values.
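One way to see the linear versus monotonic distinction is to compute Spearman's ρ as Pearson r applied to ranks. The sketch below does this with the standard library on a hypothetical monotonic-but-curved dataset (y = x³). The helper names are illustrative, and tied values receive average ranks.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from mean-centered cross-products."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sum_cross / math.sqrt(ss_x * ss_y)

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [v ** 3 for v in x]  # strictly increasing but curved (hypothetical data)

print(f"Pearson r    = {pearson_r(x, y):.3f}")
print(f"Spearman rho = {spearman_rho(x, y):.3f}")
```

Spearman reaches 1 because the ordering is perfectly preserved, while Pearson falls short of 1 because the points do not lie on a straight line.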

Pearson vs Regression

Pearson r and simple linear regression answer related but different questions. r measures the strength of the linear association (symmetric — it does not matter which variable is X or Y). Regression estimates the predicted change in Y for each one-unit change in X (directional). When you need to predict or control for multiple variables, move from correlation to multiple linear regression.

Real-World Applications

Pearson correlation appears in virtually every quantitative field. Below are eight domains where it is routinely used as a first-pass analytical tool before more complex modelling.

Medical Research

Correlating biomarkers, such as cholesterol levels vs cardiovascular risk or age vs bone density, to identify variables worth investigating in clinical trials.

Finance

Measuring portfolio diversification: a low r between two assets means holding both reduces total risk. Pairs trading identifies stocks with high historical r.

Education Research

Relating study time, attendance, or prior grades to exam outcomes. Helps curriculum designers identify which inputs predict achievement.

Marketing Analytics

Connecting advertising spend to conversion rates, or customer satisfaction scores to retention. Guides budget allocation decisions.

Machine Learning

Feature selection: high r between a feature and the target suggests predictive value. High r between two features (multicollinearity) can harm regression models.

Psychology

Relating test scores across cognitive domains, validating psychometric instruments, and studying personality trait associations in survey research.

Economics

GDP growth vs unemployment, inflation vs interest rates, trade volume vs currency strength: correlations that inform macroeconomic forecasting.

Environmental Science

Linking temperature changes to species distribution shifts, or precipitation levels to crop yields across geographic regions.

Pearson Correlation Matrix

When you have more than two variables, computing r for every pair produces a correlation matrix. Each cell shows the Pearson r between that row variable and that column variable. The diagonal is always 1.0 (a variable is perfectly correlated with itself). The matrix is symmetric: r(X,Y) = r(Y,X).

Variable	Age	Income	Education (yrs)	Health Score
Age	1.00	0.24	−0.08	−0.41
Income	0.24	1.00	0.61	0.38
Education (yrs)	−0.08	0.61	1.00	0.29
Health Score	−0.41	0.38	0.29	1.00

Reading this example matrix: Income and Education have the strongest correlation (r = 0.61). Age has a moderate negative relationship with Health Score (r = −0.41), meaning older individuals in this sample tend to have lower health scores. Before building a regression model with multiple predictors, scan the matrix for high pairwise correlations (|r| > 0.70) between predictor variables, which would signal multicollinearity — an issue addressed in the multiple linear regression guide.
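A correlation matrix and the |r| > 0.70 screening step can be sketched in a few lines. The column names and values below are hypothetical and unrelated to the example matrix above. The helpers `correlation_matrix` and `high_pairs` are illustrative names, not library functions.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from mean-centered cross-products."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sum_cross / math.sqrt(ss_x * ss_y)

def correlation_matrix(columns):
    """Nested dict of r for every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

def high_pairs(matrix, threshold=0.70):
    """Distinct variable pairs whose |r| exceeds the threshold."""
    names = list(matrix)
    return [
        (a, b, matrix[a][b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(matrix[a][b]) > threshold
    ]

# Hypothetical values for illustration only
data = {
    "age": [23, 35, 41, 52, 60, 68],
    "income": [31, 48, 55, 61, 58, 44],
    "education": [12, 16, 16, 18, 14, 12],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name.ljust(10), "  ".join(f"{value:6.2f}" for value in row.values()))
print("pairs above threshold:", high_pairs(matrix))
```

With a data frame library the matrix is usually one method call; the explicit loops here just show that each cell is an ordinary pairwise r.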

Common Mistakes and Misconceptions

Mistake	What People Think	What Is Actually True
Confusing r with R²	r = 0.70 means 70% of variance explained	R² = 0.70² = 0.49, so only 49% is explained
Inferring causation	High r means X causes Y	Correlation only confirms a linear pattern; causation needs experimental design
Ignoring the scatter plot	r tells the whole story	Anscombe's quartet shows four datasets with identical r but completely different patterns
Non-linear data	r = 0 means no relationship	A perfect U-shaped relationship produces r ≈ 0; Pearson misses non-linear patterns
Using r with ordinal data	Likert scale data is "basically continuous"	Ordinal variables require Spearman; Pearson assumes equal intervals between values
Truncated range	r reflects the true population relationship	Sampling only a restricted range of X can dramatically reduce r (range restriction bias)
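The "non-linear data" row in the table above is easy to verify. In this minimal sketch, the hypothetical values of y are fully determined by x through y = x², yet the linear correlation is zero because the left and right halves of the U cancel out.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from mean-centered cross-products."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sum_cross / math.sqrt(ss_x * ss_y)

x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]  # perfect U-shape, symmetric around x = 0

r = pearson_r(x, y)
print(f"r = {r:.3f}")
```

A scatter plot of these seven points would show the pattern immediately, which is why plotting comes before interpreting r.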

Frequently Asked Questions

Can Pearson r be negative?

Yes. A negative r means the two variables move in opposite directions: as X increases, Y tends to decrease. For example, r between hours of sleep deprivation and cognitive performance is negative — more deprivation, lower performance. The strength interpretation uses the absolute value; r = −0.75 and r = +0.75 describe equally strong relationships, just in opposite directions.

What sample size does Pearson correlation need?

There is no single rule, but r becomes unstable in very small samples (n < 10). For n = 5, a sample r of 0.80 is not statistically significant at α = 0.05 (two-tailed), because the critical r at df = 3 is 0.878. A common practical minimum is n ≥ 30, at which point p-values are less sensitive to departures from normality. To detect r = 0.30 with 80% power at α = 0.05 (two-tailed), you need roughly n = 84 pairs; use a dedicated power calculator to confirm. The sample size calculator can help with planning.

What is Anscombe's Quartet?

Francis Anscombe constructed four datasets in 1973, each with nearly identical means, variances, and Pearson r ≈ 0.816, but completely different scatter plots — one linear, one curved, one with a single outlier driving the correlation, one with a vertical cluster. The quartet is a classic demonstration that r must always be paired with a scatter plot. It is why the first step in any correlation analysis is visualizing the data.

How does Pearson r relate to linear regression slope?

In simple linear regression, the standardized slope (beta coefficient) equals the Pearson r when both variables are z-scored. More concretely, the slope b in Y = a + bX relates to r via: b = r · (sᵧ / sₓ), where sᵧ and sₓ are the standard deviations of Y and X. So r and the regression slope carry the same directional information, but r is dimensionless while b has the units of Y per unit of X.
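The relation b = r · (sᵧ / sₓ) can be checked numerically. The sketch below uses the study-hours data from this guide and computes the least-squares slope directly from the sums of squares. The variable names are illustrative.

```python
import math

hours = [4, 6, 8, 10, 12, 14]       # X
scores = [65, 70, 75, 82, 88, 93]   # Y
n = len(hours)

mean_x, mean_y = sum(hours) / n, sum(scores) / n
sum_cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))
ss_x = sum((x - mean_x) ** 2 for x in hours)
ss_y = sum((y - mean_y) ** 2 for y in scores)

r = sum_cross / math.sqrt(ss_x * ss_y)
sd_x = math.sqrt(ss_x / (n - 1))
sd_y = math.sqrt(ss_y / (n - 1))

slope_y_on_x = sum_cross / ss_x   # least-squares slope of Y regressed on X
slope_x_on_y = sum_cross / ss_y   # least-squares slope of X regressed on Y

print(f"r = {r:.4f}")
print(f"slope (Y on X) = {slope_y_on_x:.4f}, r * sd_y / sd_x = {r * sd_y / sd_x:.4f}")
print(f"slope (X on Y) = {slope_x_on_y:.4f}")
print(f"product of the two slopes = {slope_y_on_x * slope_x_on_y:.4f}, r^2 = {r ** 2:.4f}")
```

The two regression slopes differ depending on which variable is treated as the outcome, while r is the same either way. Their product equals r².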

What is Fisher's z-transformation?

Because the sampling distribution of r is not normal (it is skewed, especially near ±1), comparing two correlations or computing confidence intervals requires converting r to Fisher's z: z = 0.5 · ln[(1+r)/(1−r)]. The transformed value is approximately normally distributed with standard error 1/√(n−3), which makes it suitable for building confidence intervals or testing whether two independent r values differ significantly.
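A confidence interval built this way can be sketched with the standard library. The function name `fisher_ci` is illustrative, and the inputs r = 0.72 and n = 18 are the values from the significance-test example in this guide.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, confidence=0.95):
    """Approximate confidence interval for a correlation via Fisher's z."""
    z = math.atanh(r)                    # identical to 0.5 * ln((1 + r) / (1 - r))
    se = 1 / math.sqrt(n - 3)            # standard error on the z scale
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    lower_z, upper_z = z - z_crit * se, z + z_crit * se
    # Back-transform the endpoints from the z scale to the r scale
    return math.tanh(lower_z), math.tanh(upper_z)

low, high = fisher_ci(0.72, 18)
print(f"95% CI for r: [{low:.2f}, {high:.2f}]")
```

The interval is not symmetric around r. It extends further toward zero than toward 1 because the back-transformation compresses values near the boundary.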

When should I use the Pearson correlation table?

The Pearson correlation table gives critical r values for specific degrees of freedom (df = n−2) and significance levels. If your calculated |r| exceeds the critical value in the table, the correlation is statistically significant. It is the manual alternative to computing a t-statistic when doing by-hand work or checking software output.

Pearson Correlation in Software

Python (SciPy)

    from scipy import stats
    import numpy as np

    x = np.array([4, 6, 8, 10, 12, 14])
    y = np.array([65, 70, 75, 82, 88, 93])

    r, p_value = stats.pearsonr(x, y)
    print(f"r = {r:.4f}, p = {p_value:.4f}")

R

    x <- c(4, 6, 8, 10, 12, 14)
    y <- c(65, 70, 75, 82, 88, 93)

    # Returns r, t, df, p-value, and 95% CI
    cor.test(x, y, method = "pearson")

Excel

Use the built-in function =CORREL(array1, array2) to compute r directly from two columns of data. For the p-value, you need to compute the t-statistic manually: =r*SQRT(n-2)/SQRT(1-r^2), then use =T.DIST.2T(ABS(t), n-2) for a two-tailed p-value. The online correlation calculator handles this automatically.

Pearson Correlation Cheat Sheet

Item	Formula / Value	Notes
Sample Pearson r	Σ(xᵢ−x̄)(yᵢ−ȳ) / √[Σ(xᵢ−x̄)² · Σ(yᵢ−ȳ)²]	Ranges from −1 to +1
Raw-score formula	[nΣxy − ΣxΣy] / √{[nΣx²−(Σx)²][nΣy²−(Σy)²]}	Easier for hand computation
Population correlation	ρ = Cov(X,Y) / (σₓ · σᵧ)	Estimated by sample r
Coefficient of determination	R² = r²	Proportion of variance explained
t-test statistic	t = r√(n−2) / √(1−r²)	df = n − 2
Fisher's z-transform	z = 0.5 · ln[(1+r)/(1−r)]	Used for CIs and comparing r values
Interpretation: weak	|r| = 0.10 to 0.29	Cohen's (1988) benchmarks
Interpretation: moderate	|r| = 0.30 to 0.49	Cohen's (1988) benchmarks
Interpretation: strong	|r| = 0.50 and above	Context-dependent
Null hypothesis	H₀: ρ = 0	No population linear association
Decision rule	Reject H₀ if |t| > t*(df, α)	Or if p < α
Key assumption	Linear relationship + continuous data	Check with scatter plot first