What Is Significance Level? (α Definition)
In plain language: α is the line in the sand. Set α = 0.05 and you're saying, "If the null hypothesis were true, I'd be willing to wrongly reject it no more than 5% of the time." That 5% is not a mistake; it's a deliberate risk budget. Every time you run a test at α = 0.05 in a situation where no real effect exists, there is a 1-in-20 chance of a false positive.
The symbol α is the Greek letter alpha. You will see it written as the significance level, the alpha level, or occasionally the significance threshold. All three phrases mean the same thing. The phrase level of significance is also common in textbooks and journal methods sections.
The concept was introduced by Ronald Fisher in his 1925 book Statistical Methods for Research Workers, and refined by Neyman and Pearson into the formal decision-theoretic framework used today. The full context of how α fits into statistical inference is covered in the hypothesis testing guide on Statistics Fundamentals.
The significance level (α) is the probability threshold set before a hypothesis test that determines when to reject the null hypothesis; if the p-value is less than α, the result is statistically significant.
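The "1-in-20" reading of α = 0.05 can be checked with a small simulation. The sketch below is illustrative only: it assumes a two-tailed z-test with known σ = 1, samples drawn from a standard normal so that H₀: μ = 0 is true by construction, and a hypothetical helper name `false_positive_rate`.

```python
import random
from statistics import NormalDist

def false_positive_rate(alpha=0.05, n=30, trials=20_000, seed=0):
    """Share of two-tailed z-tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        # Sample from N(0, 1), so H0: mu = 0 holds by construction.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean / (1.0 / n ** 0.5)      # known sigma = 1
        p = 2 * std_normal.cdf(-abs(z))         # two-tailed p-value
        if p < alpha:
            rejections += 1
    return rejections / trials

rate = false_positive_rate()
print(f"Observed false-positive rate: {rate:.3f}")
```

With enough trials the observed rate should settle close to the chosen α, which is the long-run guarantee the definition describes.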
Significance Level Formula and Symbol
The formula for the significance level states it as the conditional probability of rejecting a true null hypothesis:
α = P(reject H₀ | H₀ is true)
where:
α = significance level (alpha)
H₀ = null hypothesis
P(A | B) = probability of A given B is true
The relationship between significance level and confidence level follows directly from this definition:
Confidence level = 1 − α
The three standard significance levels and their corresponding critical values are:
| Significance Level (α) | Confidence Level | z* (two-tailed) | z* (one-tailed) | Typical Field |
|---|---|---|---|---|
| α = 0.10 | 90% | ±1.645 | ±1.282 | Exploratory research |
| α = 0.05 | 95% | ±1.960 | ±1.645 | Social science, business |
| α = 0.01 | 99% | ±2.576 | ±2.326 | Medicine, safety testing |
Critical values come from the standard normal distribution at the α/2 tail (two-tailed) or α tail (one-tailed). You can look them up using the z-table or the t-distribution table for t-tests.
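As a rough alternative to a printed z-table, the critical values above can be reproduced with Python's standard library. This is a minimal sketch; the helper name `z_critical` is an assumption made for illustration, and it covers only the normal (z) case, not t-tests.

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed=True):
    """Positive critical value z* of the standard normal for a given alpha."""
    # Two-tailed tests put alpha/2 in each tail; one-tailed tests put alpha in one tail.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.10, 0.05, 0.01):
    two = z_critical(alpha)
    one = z_critical(alpha, two_tailed=False)
    print(f"alpha={alpha:.2f}  two-tailed z*={two:.3f}  one-tailed z*={one:.3f}")
```

For a left-tailed test, the critical value is the negative of the one-tailed figure returned here.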
How Significance Level Works in Hypothesis Testing
The significance level enters hypothesis testing at step 2 of the standard procedure — after you state your hypotheses but before you touch any data. This ordering is not arbitrary: α must be fixed before analysis to maintain the probabilistic guarantee it provides.
Step 1: State H₀ and H₁
Write the null hypothesis (default claim, usually "no effect") and the alternative hypothesis (what you're trying to show). Example: H₀: μ = 50 vs H₁: μ ≠ 50.
Step 2: Set the Significance Level (α) Before Collecting Data
Choose α based on your field, the cost of a false positive, and your sample size. The most common default is α = 0.05. Setting α after seeing data invalidates the test. See the How to Choose α section below for field-specific guidance.
Step 3: Collect Data and Compute the Test Statistic
Run the appropriate test (z, t, F, χ²) to convert your data into a single number measuring how far your sample is from H₀. See the statistical test selector to choose the right test.
Step 4: Calculate the p-value
The p-value is the probability of observing a test statistic at least as extreme as yours, assuming H₀ is true. A small p-value means your data would be unlikely under H₀. See the null hypothesis guide for detailed p-value interpretation.
Step 5: Compare the p-value to α and Make a Decision
This is where α and the p-value meet. The decision rule is exactly: if p < α, reject H₀; if p ≥ α, fail to reject H₀. The result is "statistically significant" only when p < α.
[Figure: Significance Level Decision Rule flowchart. If p < α, reject H₀; if p ≥ α, fail to reject H₀.]
Choosing or adjusting α after seeing the data — or running multiple tests and only reporting the significant ones — inflates your actual Type I error rate far above α. Always set α before analysis. The American Statistical Association's 2016 statement on p-values addresses this directly.
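To see how quickly multiple testing inflates the error rate, the chance of at least one false positive across m tests can be computed directly. The sketch below assumes the m tests are independent and that every null hypothesis is true; the function name `familywise_error_rate` is chosen here for illustration.

```python
def familywise_error_rate(alpha, m):
    """P(at least one false positive) across m independent tests of true nulls."""
    # Each test avoids a false positive with probability (1 - alpha).
    return 1 - (1 - alpha) ** m

for m in (1, 5, 20):
    print(f"{m:>2} tests at alpha=0.05 -> {familywise_error_rate(0.05, m):.3f}")
```

Under these assumptions, 5 tests give roughly a 23% chance of at least one false positive and 20 tests give roughly 64%, even though each individual test uses α = 0.05.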
How to Choose a Significance Level
The right α depends on two things: the cost of a false positive (Type I error) in your field, and the cost of a false negative (Type II error). There is no universally correct answer, but there are strong conventions by discipline.
Standard Significance Levels by Field
Why Is 0.05 the Standard Significance Level?
The α = 0.05 convention traces back to Fisher, who wrote in 1926 that a result at the 5% level — a result that would occur by chance less than once in twenty — was "barely significant." He framed it as a convenient rule of thumb, not a universal law. Over the following decades it became entrenched in scientific publishing, particularly in psychology and medicine, to the point that journals routinely required p < 0.05 for publication.
This entrenchment has been the subject of sustained criticism. The journal Nature published calls to abandon the 0.05 threshold, and many researchers now report exact p-values alongside effect sizes rather than relying on a binary reject/fail-to-reject decision. Still, for most introductory statistics courses and general research, α = 0.05 remains the starting point.
When to Use α = 0.01
Use a smaller significance level when the cost of a false positive is high. Drug trials, medical device approvals, safety regulations, and criminal forensics all involve decisions with serious consequences. A drug approved based on a false positive causes real harm. In these settings, demanding p < 0.01 (or even p < 0.001) before rejecting H₀ reduces that risk.
When to Use α = 0.10
In exploratory research — pilot studies, preliminary surveys, hypothesis generation — the cost of missing a real effect (Type II error) often outweighs the cost of a false lead that will be tested again in a larger study. Using α = 0.10 makes the test more sensitive to potential effects. This is common in early-phase trials and preliminary market research.
- High stakes (medicine, safety, policy): Use α = 0.01 or smaller. False positives have serious consequences.
- Standard research (psychology, social science, business): Use α = 0.05. The conventional default.
- Exploratory work (pilot studies, hypothesis generation): Use α = 0.10. Catching potential effects matters more than strict control.
- Multiple comparisons: Use Bonferroni correction — divide α by the number of tests. For 5 tests at α = 0.05, use α = 0.01 per test.
- Always set α before collecting data. Post-hoc adjustment of α based on the p-value invalidates the test.
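The Bonferroni rule in the list above can be sketched in a few lines. The p-values below are hypothetical, made up purely to show the mechanics, and `bonferroni` is an illustrative helper rather than a library function.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-test alpha and a reject flag for each p-value."""
    per_test_alpha = alpha / len(p_values)   # divide alpha by the number of tests
    return per_test_alpha, [p < per_test_alpha for p in p_values]

# Hypothetical p-values from five tests (illustrative only).
p_values = [0.003, 0.012, 0.030, 0.049, 0.200]
per_test_alpha, decisions = bonferroni(p_values)

print(f"Per-test alpha: {per_test_alpha:.3f}")
for p, reject in zip(p_values, decisions):
    print(f"p = {p:.3f} -> {'reject H0' if reject else 'fail to reject H0'}")
```

In this made-up set, four of the five p-values fall below 0.05, but only the first clears the corrected threshold of 0.01.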
Significance Level vs P-Value
This is one of the most searched comparisons in statistics — and one of the most commonly confused. The significance level and the p-value are related but completely different things.
| Aspect | Significance Level (α) | P-Value |
|---|---|---|
| What it is | A threshold you set in advance | A probability calculated from your data |
| When determined | Before data collection | After running the test |
| Chosen by | The researcher | Derived from the test statistic |
| What it measures | Maximum acceptable Type I error risk | Probability of data this extreme under H₀ |
| Typical values | 0.05, 0.01, 0.10 | Any value between 0 and 1 |
| Decision role | Sets the bar | Is compared against the bar |
| Decision rule | If p < α → reject H₀ | If p ≥ α → fail to reject H₀ |
A useful analogy, with one twist: the significance level is like a posted speed limit, and the p-value is like your measured speed. The limit is posted in advance; the speed is what you measure afterward. The twist is that the direction is reversed: you reject the null hypothesis when the p-value falls below the α limit, because a small p-value means the data would be unlikely under H₀.
When p < α, the result is statistically significant. You reject H₀. The data is unlikely enough under H₀ that you conclude there is sufficient evidence for H₁. This does not mean the effect is practically large or that H₁ is proven — only that the evidence clears your pre-set threshold.
When p ≥ α, you fail to reject H₀. This does not mean H₀ is true — it means the data does not provide enough evidence to reject it at your chosen α level. The result is not statistically significant. Never write "we accept H₀."
Significance Level vs Confidence Level
The significance level and the confidence level are two sides of the same coin, connected by a direct formula:
Confidence level = 1 − α
When you construct a 95% confidence interval for a parameter, you are using the same α = 0.05 standard as a two-tailed hypothesis test at that level. If the 95% confidence interval does not contain the null hypothesis value, the p-value for that hypothesis test is less than 0.05. The two procedures give identical decisions when both are two-tailed at the same α. You can explore this further in the confidence intervals guide.
| Aspect | Significance Level (α) | Confidence Level (1 − α) |
|---|---|---|
| Definition | Probability of false rejection | Long-run coverage probability |
| Value at standard setting | 0.05 | 0.95 (95%) |
| Used in | Hypothesis testing | Interval estimation |
| Interpretation | Risk of Type I error | Proportion of intervals capturing true parameter |
| Relationship | α + Confidence Level = 1 (always) | |
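The agreement between a two-tailed test and the matching confidence interval can be checked numerically. This sketch reuses the bottling figures from Example 1 in the worked examples (x̄ = 497.5, σ = 8, n = 40, μ₀ = 500) and assumes a z-based interval with known σ; the function names are illustrative.

```python
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, alpha=0.05):
    """(1 - alpha) confidence interval for the mean with known sigma."""
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_star * sigma / n ** 0.5
    return xbar - margin, xbar + margin

def two_tailed_p(xbar, mu0, sigma, n):
    """Two-tailed p-value of a one-sample z-test."""
    z = (xbar - mu0) / (sigma / n ** 0.5)
    return 2 * NormalDist().cdf(-abs(z))

xbar, mu0, sigma, n, alpha = 497.5, 500.0, 8.0, 40, 0.05
low, high = z_confidence_interval(xbar, sigma, n, alpha)
p = two_tailed_p(xbar, mu0, sigma, n)
null_outside_interval = not (low <= mu0 <= high)

print(f"95% CI: ({low:.2f}, {high:.2f}), p = {p:.4f}")
print("Same decision from both procedures:", null_outside_interval == (p < alpha))
```

The interval excludes 500 mL exactly when the test rejects H₀ at the same α, which is the equivalence described above.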
Significance Level, Type I Error, and Type II Error
The significance level directly sets the Type I error rate. But every decision about α involves a trade-off with Type II error (β), and understanding both is necessary to choose α wisely.
| Decision | H₀ is True | H₀ is False |
|---|---|---|
| Reject H₀ | Type I Error (False Positive), probability = α | Correct Decision (True Positive), probability = 1 − β (Power) |
| Fail to Reject H₀ | Correct Decision (True Negative), probability = 1 − α | Type II Error (False Negative), probability = β |
The critical trade-off: for a fixed sample size and effect size, lowering α (e.g., from 0.05 to 0.01) to reduce false positives increases the Type II error rate β, making it harder to detect real effects. For a given design, the usual way to reduce both at once is to increase the sample size. This relationship is the core of statistical power analysis and study design.
| Error Type | Description | Probability | Reduced by |
|---|---|---|---|
| Type I (False Positive) | Reject H₀ when it is actually true | = α | Lowering α (0.01 instead of 0.05) |
| Type II (False Negative) | Fail to reject H₀ when it is actually false | = β | Raising α, increasing sample size |
| Statistical Power | Correctly rejecting a false H₀ | = 1 − β | Larger n, larger effect size, larger α |
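The trade-off in the table can be made concrete with a power calculation. The sketch below assumes a one-tailed z-test with known σ and a hypothetical true shift of 0.5σ; both the scenario and the function name `power_one_tailed_z` are illustrative assumptions, not values from the examples in this guide.

```python
from statistics import NormalDist

def power_one_tailed_z(effect, sigma, n, alpha):
    """Power (1 - beta) of a one-tailed z-test when the true mean is shifted by `effect`."""
    std_normal = NormalDist()
    z_star = std_normal.inv_cdf(1 - alpha)       # critical value for the chosen alpha
    shift = abs(effect) * n ** 0.5 / sigma       # true shift in standard-error units
    return 1 - std_normal.cdf(z_star - shift)

# Hypothetical scenario: the true mean differs from H0 by half a standard deviation.
for n in (25, 50):
    for alpha in (0.05, 0.01):
        power = power_one_tailed_z(effect=0.5, sigma=1.0, n=n, alpha=alpha)
        print(f"n={n:>2}  alpha={alpha:.2f}  power={power:.3f}  beta={1 - power:.3f}")
```

At a fixed n, the smaller α yields lower power (higher β); raising n recovers power at either α.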
Worked Examples — Significance Level in Practice
Each example below shows how to apply the significance level in a complete hypothesis test. All three follow the same structure: set α first, then run the test. For additional fully solved tests, see the hypothesis testing examples guide.
Example 1 — Setting α = 0.05 (Two-Tailed Z-Test)
A bottling plant claims each bottle contains exactly 500 mL. Quality control samples 40 bottles and finds x̄ = 497.5 mL with known σ = 8 mL. Test at α = 0.05.
Hypotheses: H₀: μ = 500 mL | H₁: μ ≠ 500 mL (two-tailed)
Significance level: α = 0.05. For a two-tailed test, the critical values are z = ±1.960 (each tail holds α/2 = 0.025). See the z-table.
Test statistic: SE = 8/√40 = 1.265 | z = (497.5 − 500) / 1.265 = −1.98
P-value: p = 2 × P(Z < −1.98) ≈ 2 × 0.024 = 0.048
Decision: p = 0.048 < α = 0.05 → Reject H₀. Also confirmed: |z| = 1.98 > 1.96.
✅ Conclusion: At the α = 0.05 significance level, there is sufficient evidence that the mean fill is not 500 mL. Had we used α = 0.01, the critical values would be ±2.576 and we would fail to reject H₀ (p = 0.048 > 0.01) — showing how the choice of α changes the conclusion.
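The arithmetic in Example 1 can be reproduced as follows. This is a sketch using only the figures given in the example; it also loops over the three standard α values to show where the decision flips.

```python
from statistics import NormalDist

# Figures from Example 1: bottling plant, known sigma.
xbar, mu0, sigma, n = 497.5, 500.0, 8.0, 40

se = sigma / n ** 0.5                    # standard error of the mean
z = (xbar - mu0) / se                    # test statistic
p = 2 * NormalDist().cdf(-abs(z))        # two-tailed p-value

print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")

decisions = {}
for alpha in (0.10, 0.05, 0.01):
    decisions[alpha] = "reject H0" if p < alpha else "fail to reject H0"
    print(f"alpha = {alpha:.2f}: {decisions[alpha]}")
```

The same data lead to rejection at α = 0.10 and α = 0.05 but not at α = 0.01, matching the conclusion above.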
Example 2 — Using α = 0.01 (Medical Research T-Test)
A clinical researcher tests whether a new drug reduces systolic blood pressure below the known baseline of 140 mmHg. A sample of 25 patients yields x̄ = 134 mmHg, s = 12 mmHg. Test at α = 0.01 (one-tailed, left).
df = n − 1 = 24
Hypotheses: H₀: μ = 140 | H₁: μ < 140 (one-tailed, left) — testing whether the drug lowers blood pressure
Significance level: α = 0.01 (medical context; high cost of false positive). Critical value from the t-table at df = 24, one-tailed: t* = −2.492.
Test statistic: SE = 12/√25 = 2.4 | t = (134 − 140) / 2.4 = −2.50
P-value: P(t₂₄ < −2.50) ≈ 0.0098
Decision: p = 0.0098 < α = 0.01 → Reject H₀. Also: |t| = 2.50 > 2.492 (critical value).
✅ Conclusion: At the 1% significance level, there is sufficient evidence that the drug lowers mean systolic blood pressure below 140 mmHg. Using the stricter α = 0.01 was appropriate given the clinical stakes. See the one-sample t-test guide for the full methodology.
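Example 2 can be checked without a t-table. The sketch below is an assumption-laden stand-in for a statistics library: it approximates the Student's t CDF by integrating the density numerically with Simpson's rule, using only Python's `math` module. The helper names `t_pdf` and `t_cdf` are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """Approximate CDF via Simpson's rule on [0, |x|] plus symmetry (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                 # integral of the density from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area

# Figures from Example 2: blood pressure study.
xbar, mu0, s, n, alpha = 134.0, 140.0, 12.0, 25, 0.01
df = n - 1
t_stat = (xbar - mu0) / (s / math.sqrt(n))
p = t_cdf(t_stat, df)                    # left-tailed p-value

print(f"t = {t_stat:.2f}, df = {df}, p = {p:.4f}")
print("reject H0" if p < alpha else "fail to reject H0")
```

The result sits just under the 1% threshold, so the decision here is sensitive to rounding in t and to the exact critical value used.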
Example 3 — 2% Significance Level, Left-Tailed Test
A factory's machine is calibrated to produce components with a mean diameter of 25 mm. An engineer suspects the machine is producing undersized parts. A sample of 36 components has x̄ = 24.3 mm, σ = 1.8 mm (population known). Test at α = 0.02.
Hypotheses: H₀: μ = 25 mm | H₁: μ < 25 mm (left-tailed — testing for undersized parts)
Significance level: α = 0.02 (left-tailed). Critical value: z* = −2.054 (from z-table at 2% left tail area).
Test statistic: SE = 1.8/√36 = 0.3 | z = (24.3 − 25) / 0.3 = −2.333
P-value: P(Z < −2.333) ≈ 0.0098
Decision: p = 0.0098 < α = 0.02 → Reject H₀. Also: z = −2.333 < −2.054 (falls in rejection region).
✅ Conclusion: At the 2% significance level (left-tailed), the evidence supports the engineer's concern — the machine produces parts below the target diameter. This example shows that any α value (not just the three "standard" ones) can be used when the situation warrants it.
Significance Level Calculator
Enter your sample data below to compute the test statistic and p-value, then compare against your chosen α. The calculator handles one-sample z-tests and t-tests.
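For readers working offline, the z-test half of such a calculator might look like the sketch below. It is not the calculator's actual implementation: the function name, parameters, and return format are assumptions, and it covers only the one-sample z-test with known σ. The usage lines plug in the figures from Example 3.

```python
from statistics import NormalDist

def one_sample_z_test(xbar, mu0, sigma, n, alpha=0.05, tail="two"):
    """One-sample z-test. `tail` is 'two', 'left', or 'right'."""
    z = (xbar - mu0) / (sigma / n ** 0.5)
    cdf = NormalDist().cdf
    if tail == "left":
        p = cdf(z)
    elif tail == "right":
        p = 1 - cdf(z)
    elif tail == "two":
        p = 2 * cdf(-abs(z))
    else:
        raise ValueError("tail must be 'two', 'left', or 'right'")
    return {"z": z, "p": p, "reject": p < alpha}

# Example 3: undersized components, left-tailed test at alpha = 0.02.
result = one_sample_z_test(xbar=24.3, mu0=25.0, sigma=1.8, n=36, alpha=0.02, tail="left")
print(f"z = {result['z']:.3f}, p = {result['p']:.4f}, reject H0: {result['reject']}")
```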
Real-World Applications of Significance Level
The significance level appears in every field that uses data to make decisions. Here is how α is chosen in practice across different domains:
Clinical Trials
Phase III drug trials typically require α = 0.05 for the primary endpoint, with some regulatory agencies (FDA, EMA) requiring pre-registration of α. Trials with multiple primary endpoints often apply a multiplicity adjustment, such as Bonferroni-corrected α values.
Scientific Research
Psychology and social science conventionally use α = 0.05. After the replication crisis, many journals now require reporting exact p-values and effect sizes rather than binary significant/not-significant decisions.
A/B Testing (Marketing)
Online experiments often use α = 0.05 with 80–95% power targets. Large tech companies run hundreds of tests simultaneously and adjust α to control the family-wise error rate.
Manufacturing Quality Control
Statistical process control charts use α to set control limits. A 3-sigma control chart corresponds to α ≈ 0.0027 per sample point (both tails combined).
Particle Physics
The 5σ standard for claiming discovery of a new particle corresponds to a one-tailed α ≈ 0.0000003. This is far stricter than the conventions in the fields above, reflecting the cost of announcing a false discovery.
Finance & Economics
Regression coefficients in economic models are typically reported with α = 0.05 and 0.01 thresholds. Trading strategies may use stricter thresholds given the risk of acting on false signals.
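The sigma-based thresholds quoted for control charts and particle physics can be converted to α values under a normal model. In this sketch the 3-sigma figure is treated as two-sided and the 5σ figure as one-sided, which is the convention that reproduces the numbers above; `alpha_from_sigma` is an illustrative helper name.

```python
from statistics import NormalDist

def alpha_from_sigma(k, two_sided):
    """Tail probability of a standard normal beyond k standard deviations."""
    tail = NormalDist().cdf(-k)
    return 2 * tail if two_sided else tail

alpha_3_sigma = alpha_from_sigma(3, two_sided=True)    # control-chart limits
alpha_5_sigma = alpha_from_sigma(5, two_sided=False)   # discovery threshold

print(f"3-sigma (two-sided): alpha = {alpha_3_sigma:.4f}")
print(f"5-sigma (one-sided): alpha = {alpha_5_sigma:.2e}")
```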
Quick Reference Table — Significance Level Values
| α (Significance Level) | 1 − α (Confidence Level) | z* Two-Tailed | z* One-Tailed | Type I Error Risk |
|---|---|---|---|---|
| 0.001 | 99.9% | ±3.291 | ±3.090 | 0.1% |
| 0.01 | 99% | ±2.576 | ±2.326 | 1% |
| 0.02 | 98% | ±2.326 | ±2.054 | 2% |
| 0.05 | 95% | ±1.960 | ±1.645 | 5% |
| 0.10 | 90% | ±1.645 | ±1.282 | 10% |
For t-test critical values (which depend on degrees of freedom), use the t-distribution table. For chi-square tests, see the chi-square table. The z-score guide covers how z-statistics relate to areas under the normal curve.
Frequently Asked Questions
What is the significance level?
The significance level (α) is the probability threshold set before a hypothesis test. It defines the maximum acceptable risk of rejecting a true null hypothesis (a Type I error). When the p-value falls below α, the result is called statistically significant and H₀ is rejected.
What does a significance level of 0.05 mean?
A significance level of 0.05 means you accept a 5% probability of falsely rejecting a true null hypothesis. For every 100 tests under a true H₀, about 5 may appear significant purely by chance. This is why replication is important.
Is alpha the same as the significance level?
Yes. Alpha (α) is the significance level. It represents the probability of making a Type I error. It is also referred to as the alpha level or level of significance in statistical testing.
How do you choose the significance level?
The significance level is chosen before analysis, not calculated from data. Common choices are 0.10 for exploratory work, 0.05 for general research, and 0.01 for high-stakes fields like medicine. It depends on the cost of false positives.
What is the symbol for the significance level?
The symbol for significance level is α (alpha). It represents P(rejecting H₀ when H₀ is true). It is a fixed threshold used to decide whether results are statistically significant.
How are the significance level and the confidence level related?
They are complements: confidence level = 1 − α. For α = 0.05, the confidence level is 95%. Both approaches lead to the same decision in two-tailed tests: rejecting H₀ when p < α is equivalent to the null value lying outside the 95% confidence interval.
What happens when the p-value is less than α?
If p < α, you reject the null hypothesis. The result is statistically significant, meaning the observed data is unlikely under H₀. However, statistical significance does not imply practical importance, so effect size should always be considered.
What is the rejection region for a left-tailed test at α = 0.05?
In a left-tailed test with α = 0.05, the rejection region is in the left tail only. The critical value for a z-test is approximately −1.645. If the test statistic is below this value, or p < 0.05, you reject H₀.