What is Linear Regression?
Simple linear regression models the relationship between two variables by fitting a straight line to the data. One variable is the independent variable (X) and the other is the dependent variable (Y).
The goal is to find the line that minimizes the sum of squared residuals, where a residual is the difference between an observed Y value and the Y value the line predicts. This is called the Ordinary Least Squares (OLS) method.
The fitted line has the equation:
ŷ = b₀ + b₁x
Where:
- ŷ = predicted Y value
- b₀ = y-intercept (value of Y when X = 0)
- b₁ = slope (change in Y for each 1-unit increase in X)
- x = independent variable value
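To make the least-squares objective concrete, here is a minimal Python sketch of the prediction rule and the sum of squared residuals. The function names and the four data points are made up for illustration; the points lie exactly on y = 1 + 2x, so that line scores zero and any other line scores higher.

```python
def predict(b0, b1, x):
    """Predicted Y for a given x: y-hat = b0 + b1 * x."""
    return b0 + b1 * x


def sum_squared_residuals(b0, b1, xs, ys):
    """Sum of (observed - predicted)^2 over all data points."""
    return sum((y - predict(b0, b1, x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data lying exactly on the line y = 1 + 2x.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]

ssr_exact = sum_squared_residuals(1, 2, xs, ys)  # the true line
ssr_off = sum_squared_residuals(0, 2, xs, ys)    # intercept off by 1

print(ssr_exact, ssr_off)  # prints: 0 4
```

OLS picks the pair (b₀, b₁) that makes this sum as small as possible.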
How to Calculate Slope & Intercept
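b₁ = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²
b₀ = ȳ − b₁x̄
Here x̄ and ȳ are the means of the X and Y values. Compute the slope b₁ first, because the intercept b₀ depends on it.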
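The calculation can be sketched in a few lines of Python. This assumes the standard OLS closed form: the slope is the sum of products of deviations from the means divided by the sum of squared X deviations, and the intercept follows from the means. The name `fit_line` and the sample data are illustrative only.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Sum of products of deviations, and sum of squared X deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all X values are identical; slope is undefined")

    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1


# Hypothetical data lying exactly on y = 1 + 2x.
b0, b1 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1)  # prints: 1.0 2.0
```

The guard on `sxx` matters in practice: if every X value is the same, the line is vertical and no slope exists.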
Interpreting the Results
| Output | Symbol | What It Means |
|---|---|---|
| Slope | b₁ | For each 1-unit increase in X, predicted Y changes by this amount. Positive = upward trend; negative = downward trend. |
| Intercept | b₀ | Predicted value of Y when X = 0. May not always be meaningful in context. |
| R² (R-squared) | R² | Proportion of variance in Y explained by X. For example, R² = 0.85 means 85% of Y's variation is explained by X. |
| Pearson's r | r | Correlation coefficient (−1 to 1). Values near ±1 indicate strong linear relationships. |
R² Interpretation Guide:
0.00–0.29: Weak | 0.30–0.49: Moderate | 0.50–0.69: Moderate–Strong | 0.70–0.89: Strong | 0.90–1.00: Very Strong
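Both measures can be computed from the same deviation sums. The sketch below assumes simple (one-predictor) linear regression, where R² equals the square of Pearson's r; the function name and data are hypothetical.

```python
import math


def pearson_r_and_r_squared(xs, ys):
    """Return (r, R^2) for paired data, assuming simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)

    r = sxy / math.sqrt(sxx * syy)  # sign matches the sign of the slope
    return r, r * r


# Hypothetical data on a perfectly straight downward line.
r, r_squared = pearson_r_and_r_squared([1, 2, 3, 4], [8, 6, 4, 2])
print(r, r_squared)  # prints: -1.0 1.0
```

Note that r keeps the direction of the relationship while R² does not, which is why a perfect downward trend gives r = −1 but R² = 1.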
Example: Advertising vs. Sales
A company tracked weekly advertising spend ($1000s) and resulting sales ($1000s):
| Week | Ad Spend (X) | Sales (Y) |
|---|---|---|
| 1 | 1 | 14 |
| 2 | 2 | 17 |
| 3 | 3 | 21 |
| 4 | 5 | 27 |
| 5 | 8 | 36 |
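With x̄ = 3.8 and ȳ = 23, the deviation sums are Σ(xᵢ − x̄)(yᵢ − ȳ) = 97 and Σ(xᵢ − x̄)² = 30.8, so b₁ = 97 / 30.8 ≈ 3.15 and b₀ = 23 − 3.15 × 3.8 ≈ 11.03.
Result: ŷ ≈ 11.03 + 3.15x, R² ≈ 0.998, a near-perfect linear relationship. For every additional $1,000 in ad spend, predicted sales increase by approximately $3,150.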
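Recomputing the fit from the five rows in the table is a useful check on the reported figures. The following Python sketch uses only the data shown above; the variable names are illustrative.

```python
# Data from the table: weekly ad spend and sales, both in $1000s.
ad_spend = [1, 2, 3, 5, 8]
sales = [14, 17, 21, 27, 36]

n = len(ad_spend)
x_bar = sum(ad_spend) / n   # 3.8
y_bar = sum(sales) / n      # 23.0

# Deviation sums used by the OLS formulas.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ad_spend, sales))
sxx = sum((x - x_bar) ** 2 for x in ad_spend)
syy = sum((y - y_bar) ** 2 for y in sales)

slope = sxy / sxx
intercept = y_bar - slope * x_bar
r_squared = sxy ** 2 / (sxx * syy)  # equals r^2 in simple regression

print(f"y-hat = {intercept:.2f} + {slope:.2f}x, R^2 = {r_squared:.3f}")
# prints: y-hat = 11.03 + 3.15x, R^2 = 0.998
```

A quick sanity check on any fitted line: it must pass through the point of means, so plugging x̄ = 3.8 into the equation should return ȳ = 23.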