22 min read | June 12, 2026
BY: Statistics Fundamentals Team
Reviewed By: Minsa A (Senior Statistics Editor)

The Decision Rule in Hypothesis Testing: Complete Guide

A clinical researcher sets up a trial. Before a single patient is enrolled, before any data is collected, she writes down one sentence: "I will reject the null hypothesis if p ≤ 0.05." That sentence is the decision rule — and writing it in advance is what separates objective statistical inference from post-hoc rationalization.

This guide covers both methods for stating a decision rule (the p-value approach and the critical value approach), shows how they produce identical conclusions, explains rejection regions for one-tailed and two-tailed tests, and walks through three fully solved worked examples.

What You'll Learn
  • ✓ What a decision rule is and why it must be stated before data collection
  • ✓ The p-value method: reject H₀ if p ≤ α
  • ✓ The critical value method: reject H₀ if the test statistic enters the rejection region
  • ✓ One-tailed vs. two-tailed rejection regions with boundary conditions
  • ✓ Three fully solved examples: z-test, t-test, and A/B testing
  • ✓ Why "fail to reject" is not the same as "accept" H₀
  • ✓ How α controls Type I error and shapes the rejection boundary

What Is a Decision Rule? Definition and Core Purpose

Definition — Decision Rule in Statistics
A decision rule is a formal, predefined criterion that dictates whether to reject or fail to reject the null hypothesis (H₀) based on the results of a statistical test. It establishes an objective evidence threshold before data collection, ensuring that conclusions follow statistical logic rather than subjective interpretation after the results are known.
Reject H₀ if p-value ≤ α  |  or if |test statistic| ≥ critical value

The decision rule answers one specific question: given this test statistic (or this p-value), what do I conclude? It converts a continuous numerical result into a binary decision: reject or do not reject H₀. By committing to the rule before seeing data, a researcher eliminates the temptation to move goalposts — a form of bias called p-hacking or data dredging that inflates false positive rates.

The rule works in two equivalent formulations. The p-value approach operates in probability space: compare the computed p-value to the pre-set significance level α. The critical value approach operates in data space: compare the test statistic to a cutoff on the sampling distribution. Both frameworks test the same underlying null hypothesis and always reach the same decision — the choice between them is one of convenience, not logic.

The framework traces back to Ronald Fisher's significance testing in the 1920s and the Neyman–Pearson decision-theoretic extension of the 1930s. Today it underpins frequentist hypothesis testing across applied fields, from clinical trials to A/B tests.

  • α (significance level): the pre-set probability threshold
  • p (p-value): the probability, assuming H₀ is true, of data at least as extreme as what was observed
  • H₀ (null hypothesis): the default claim being tested
  • Zcrit (critical value): the boundary of the rejection region
⚡ Quick Reference — Decision Rule Summary
  • P-value method: Reject H₀ if p ≤ α; fail to reject H₀ if p > α
  • Critical value method: Reject H₀ if |test statistic| ≥ critical value
  • α = 0.05 means you accept a 5% chance of rejecting H₀ when it is actually true (Type I error)
  • Two-tailed test at α = 0.05: critical values are z = ±1.96
  • One-tailed test at α = 0.05: critical value is z = 1.645 (right) or −1.645 (left)
  • Fail to reject ≠ accept: insufficient evidence against H₀ is not proof that H₀ is true

The Two Methods for Applying a Decision Rule

Every decision rule has two equivalent formulations. Understanding both is worth the time — different textbooks, software packages, and fields default to different presentations, and being fluent in both prevents confusion when switching contexts.

The P-Value Method (Probability Space)

The p-value is the probability of observing a test statistic at least as extreme as the one calculated, under the assumption that H₀ is true. It is a tail probability — a small p-value means the observed data is unlikely if the null hypothesis were correct.

P-Value Decision Rule
Reject H₀ if p ≤ α   |   Fail to Reject H₀ if p > α
where p = computed p-value from the test, and α = pre-set significance level (e.g., 0.05)

The logic is straightforward: if the probability of getting data at least this extreme by random chance alone (assuming H₀) is no larger than your tolerance for false alarms (α), the data is "too surprising" to be consistent with H₀ and you reject it. The p-value itself carries no information about effect size or practical importance: a p of 0.001 in a study of ten million people may correspond to a trivially small difference.
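
The p-value rule can be sketched in a few lines of Python. This is a minimal illustration for a z statistic, using only the standard library; the function names `z_p_value` and `decide_by_p_value` and the input z = 2.50 are hypothetical choices made for this sketch, not part of any particular package.

```python
from statistics import NormalDist


def z_p_value(z: float, tail: str = "two") -> float:
    """Tail probability of a z statistic, assuming H0 is true (standard normal)."""
    cdf = NormalDist().cdf
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))
    if tail == "right":
        return 1 - cdf(z)
    if tail == "left":
        return cdf(z)
    raise ValueError("tail must be 'two', 'right', or 'left'")


def decide_by_p_value(p: float, alpha: float = 0.05) -> str:
    """Apply the rule: reject H0 if p <= alpha, otherwise fail to reject."""
    return "reject H0" if p <= alpha else "fail to reject H0"


# Illustrative input: a two-tailed test with z = 2.50 at alpha = 0.05.
p = z_p_value(2.50, tail="two")
print(round(p, 4), decide_by_p_value(p, alpha=0.05))
```

Note that the comparison is `p <= alpha`, so a p-value exactly equal to α leads to rejection, matching the rule as stated above.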

⚠️
Common Misreading

The p-value is not the probability that H₀ is true. It is the probability of your observed data (or more extreme data) given that H₀ is true. These are fundamentally different quantities. See the p-values guide for a full treatment.

The Critical Value Method (Data Space)

The critical value is the threshold on the sampling distribution that corresponds to α. Test statistics beyond this threshold — in the tail(s) of the distribution — constitute the rejection region. Because critical values are derived from the same distribution as p-values, the two methods always agree on the final decision.

Critical Value Decision Rule
Reject H₀ if |Zcalc| ≥ Zcrit
where Zcalc = calculated test statistic, Zcrit = critical value from the distribution table, and |·| = absolute value (used for two-tailed tests; one-tailed tests compare the signed statistic to a signed cutoff)

Critical values come from the sampling distribution relevant to the test — the standard normal distribution (z-tests), Student's t-distribution (t-tests), the chi-square distribution, or the F-distribution. They depend on α and, for t-tests, on degrees of freedom.
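
For the standard normal case, critical values do not have to come from a printed table: they are quantiles of the distribution. The sketch below assumes a z-test and uses Python's standard-library `NormalDist`; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(alpha: float, tail: str = "two") -> float:
    """Positive z cutoff such that the rejection region has total area alpha."""
    # A two-tailed test splits alpha across both tails; a one-tailed test does not.
    tail_area = alpha / 2 if tail == "two" else alpha
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.10, 0.05, 0.01):
    two = round(z_critical(alpha, "two"), 3)
    one = round(z_critical(alpha, "one"), 3)
    print(f"alpha={alpha}: two-tailed ±{two}, one-tailed {one}")
```

For a left-tailed test the cutoff is the negative of the one-tailed value. Critical values for t, chi-square, or F statistics need the corresponding distribution's quantile function, which this sketch does not cover.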

P-Value Method

  • Works in probability space (0 to 1)
  • Exact probability computed from test statistic
  • Default output of most statistical software
  • Allows comparison across different tests
  • Requires no distribution table lookup

Critical Value Method

  • Works in data space (z-scores, t-values)
  • Geometric: maps directly to rejection region
  • Useful for visualizing the decision boundary
  • Standard in textbooks and hand calculation
  • Requires table of critical values (e.g., z-table)
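
The claim that the two methods always agree can be checked numerically. The following sketch, assuming a two-tailed z-test at α = 0.05, applies both rules to a grid of z values and collects any case where they disagree. The function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def reject_by_p_value(z: float, alpha: float) -> bool:
    p = 2 * (1 - STD_NORMAL.cdf(abs(z)))  # two-tailed p-value
    return p <= alpha


def reject_by_critical_value(z: float, alpha: float) -> bool:
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)  # two-tailed cutoff
    return abs(z) >= z_crit


alpha = 0.05
grid = [i / 100 for i in range(-400, 401)]  # z from -4.00 to 4.00 in steps of 0.01
disagreements = [
    z for z in grid
    if reject_by_p_value(z, alpha) != reject_by_critical_value(z, alpha)
]
print("number of disagreements:", len(disagreements))
```

The agreement is not a coincidence: the critical value is defined as the z whose tail probability equals α, so "p ≤ α" and "|z| ≥ z_crit" describe the same set of outcomes.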

Rejection Regions: One-Tailed and Two-Tailed Tests

The rejection region is the set of test statistic values for which the decision rule produces "reject H₀." Its shape depends on whether the alternative hypothesis is directional (one-tailed) or non-directional (two-tailed). Getting this right before computing anything is part of setting the decision rule correctly.

Rejection Region Configurations at α = 0.05

[Figure: rejection regions at α = 0.05. Two-tailed panel (H₁: μ ≠ μ₀): reject below z = −1.96 or above z = +1.96, with α/2 = 0.025 in each tail; fail to reject in between. Right-tailed panel (H₁: μ > μ₀): reject above z = +1.645, with α = 0.05 in the right tail; fail to reject below.]

The table below lists the rejection conditions for each tail configuration at common significance levels.

Test Direction   Alternative Hypothesis   Reject H₀ if (z-test)   α = 0.05 boundary   α = 0.01 boundary
Two-tailed       μ ≠ μ₀                   |z| ≥ zα/2              |z| ≥ 1.96          |z| ≥ 2.576
Right-tailed     μ > μ₀                   z ≥ zα                  z ≥ 1.645           z ≥ 2.326
Left-tailed      μ < μ₀                   z ≤ −zα                 z ≤ −1.645          z ≤ −2.326

For t-tests, replace the z critical values with t* values from the t-distribution table, using degrees of freedom df = n − 1. The shape of the decision boundary changes with df — with very small samples, t* is considerably larger than the corresponding z value, reflecting greater uncertainty when the population standard deviation is unknown.
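
The three tail configurations in the table above translate directly into code. This is a minimal sketch for a z-test; the function name `in_rejection_region` and the test value z = 1.80 are illustrative assumptions.

```python
from statistics import NormalDist


def in_rejection_region(z: float, alpha: float, tail: str) -> bool:
    """Return True when z falls in the rejection region for the given tail."""
    std_normal = NormalDist()
    if tail == "two":
        return abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":
        return z >= std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return z <= std_normal.inv_cdf(alpha)  # negative cutoff
    raise ValueError("tail must be 'two', 'right', or 'left'")


# The same statistic can land on different sides of the boundary
# depending on how the alternative hypothesis was framed.
print(in_rejection_region(1.80, 0.05, "right"))  # cutoff 1.645
print(in_rejection_region(1.80, 0.05, "two"))    # cutoff 1.960
```

This is why the direction of H₁ has to be fixed before the data are seen: choosing the tail afterwards would let a borderline statistic such as 1.80 be declared significant or not at will.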

How to State the Decision Rule (Step-by-Step)

A well-stated decision rule comes before any calculation. Here is the procedure for writing one, in order.

1. State H₀ and H₁ First

The decision rule depends on H₁, specifically on whether it is directional. H₀: μ = μ₀ vs. H₁: μ ≠ μ₀ requires a two-tailed rule; H₀: μ = μ₀ vs. H₁: μ > μ₀ requires a right-tailed rule. Write out both hypotheses before doing anything else. See the null and alternative hypothesis guide.

2. Choose the Significance Level α

α sets the false positive rate you can tolerate. Common choices: α = 0.05 (most fields), α = 0.01 (medical and safety research), α = 0.10 (exploratory work). Choosing a smaller α tightens the rejection region and lowers Type I error at the cost of increased Type II error (missing a real effect). This connection is covered in significance level and Type I and Type II errors.

3. Identify the Appropriate Test Statistic

The right test depends on what is known and how the data is structured. Use a z-statistic when population σ is known; use a t-statistic when σ is estimated from the sample. The test statistic formula determines which sampling distribution the critical value comes from, and therefore where the rejection region boundary falls. See the statistical test selector.

4. State the Decision Rule Explicitly

Write it out in full before collecting data. For the p-value method: "Reject H₀ if p ≤ 0.05." For the critical value method: "Reject H₀ if |z| ≥ 1.96." Both are complete, unambiguous decision rules. A rule stated after seeing the data is not a legitimate decision rule; it is rationalization.

5. Apply the Rule and State the Conclusion

Compute the test statistic, then either compute its p-value or compare the statistic to the critical value. Apply the rule mechanically. Then translate the binary output into a plain-language conclusion: "At the 5% significance level, there is sufficient evidence to conclude that μ differs from μ₀," or the reverse. Never write "we prove H₁" or "we accept H₀."
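
One way to make "state the rule first, apply it mechanically later" concrete is to store the rule in an immutable object before any data exist. The sketch below assumes a one-sample z-test with known σ; the class name `DecisionRule` and all numbers (μ₀ = 100, σ = 15, n = 50, x̄ = 103) are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from statistics import NormalDist


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class DecisionRule:
    mu0: float   # hypothesized mean under H0
    alpha: float  # significance level
    tail: str    # "two", "right", or "left"

    def p_value(self, z: float) -> float:
        cdf = NormalDist().cdf
        if self.tail == "two":
            return 2 * (1 - cdf(abs(z)))
        return 1 - cdf(z) if self.tail == "right" else cdf(z)

    def apply(self, sample_mean: float, sigma: float, n: int):
        """Compute z and p from the sample, then apply 'reject if p <= alpha'."""
        z = (sample_mean - self.mu0) / (sigma / n ** 0.5)
        p = self.p_value(z)
        decision = "reject H0" if p <= self.alpha else "fail to reject H0"
        return z, p, decision


# Steps 1 to 4: hypotheses, alpha, test statistic, and rule are fixed up front.
rule = DecisionRule(mu0=100.0, alpha=0.05, tail="two")

# Step 5: the data arrive and the rule is applied without modification.
z, p, decision = rule.apply(sample_mean=103.0, sigma=15.0, n=50)
print(round(z, 3), round(p, 3), decision)
```

The frozen dataclass is only a programming convenience. It does not replace a written protocol or pre-registration, but it mirrors the idea that α and the tail direction are not open to revision once results are in.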

How α Shapes the Decision Rule

The significance level α is the primary lever controlling the decision rule's sensitivity. Lowering α moves the critical value further into the tail, making the rejection region smaller and harder to enter. This reduces false positives (Type I errors) but increases false negatives (Type II errors, β).

Significance Level (α)   Type I Error Rate   Two-Tailed zcrit   Right-Tailed zcrit   Practical Use
0.10                     10%                 ±1.645             1.282                Exploratory research
0.05                     5%                  ±1.960             1.645                Default in most fields
0.01                     1%                  ±2.576             2.326                Clinical trials, safety
0.001                    0.1%                ±3.291             3.090                Very stringent testing (still far looser than the 5σ physics convention, p ≈ 2.87 × 10⁻⁷)

The choice of α carries consequences that extend beyond the individual test. In large-scale multiple testing scenarios (genomics studies examining thousands of genetic markers, for example), the expected number of false positives at α = 0.05 across 10,000 tests is 500 if every null hypothesis is true. Corrections tighten the per-test decision rule: the Bonferroni adjustment controls the family-wise error rate, while false discovery rate (FDR) procedures control the expected proportion of false positives among the rejected hypotheses.
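
The arithmetic behind the multiple-testing problem, and two common corrections, can be sketched as follows. The Benjamini–Hochberg step-up procedure is used here as one example of an FDR method; the ten p-values are made up purely for illustration.

```python
m_tests = 10_000
alpha = 0.05

# If every null hypothesis is true, each test has probability alpha of a false rejection.
expected_false_positives = m_tests * alpha
bonferroni_alpha = alpha / m_tests  # per-test threshold under Bonferroni
print(expected_false_positives, bonferroni_alpha)


def bonferroni(p_values, alpha=0.05):
    """Indices rejected when each p-value is compared to alpha / m."""
    m = len(p_values)
    return [i for i, p in enumerate(p_values) if p <= alpha / m]


def benjamini_hochberg(p_values, q=0.05):
    """Indices rejected by the Benjamini-Hochberg step-up procedure at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank  # largest rank whose p-value clears its threshold
    return sorted(order[:k_max])


# Hypothetical p-values from ten tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
print("Bonferroni rejects:", bonferroni(p_values))
print("Benjamini-Hochberg rejects:", benjamini_hochberg(p_values))
```

On these made-up numbers Bonferroni rejects fewer hypotheses than Benjamini–Hochberg, which reflects the different error rates the two procedures are designed to control.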

ℹ️
α and Statistical Power

Statistical power (1 − β) is the probability of correctly rejecting a false H₀. Reducing α lowers power. To maintain both low Type I and low Type II error simultaneously, a researcher must increase the sample size. The Cohen's d and effect size guide explains how to calculate required sample size for a target power.
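
The trade-off between α, power, and sample size can be made concrete for the simplest case. The sketch below assumes a right-tailed one-sample z-test with known σ, where power = Φ(δ√n/σ − z_α) and δ is the true shift in the mean. The function names and the numbers (δ = 2, σ = 8) are illustrative assumptions; other tests need different formulas.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_right_tailed_z(delta: float, sigma: float, n: int, alpha: float) -> float:
    """Probability of rejecting H0 when the true mean exceeds mu0 by delta."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(delta / (sigma / math.sqrt(n)) - z_alpha)


def required_n(delta: float, sigma: float, alpha: float, power: float) -> int:
    """Smallest n that reaches the target power for a right-tailed z-test."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) * sigma / delta) ** 2)


# Same study, two significance levels: the stricter alpha has lower power.
for alpha in (0.05, 0.01):
    pw = power_right_tailed_z(delta=2.0, sigma=8.0, n=64, alpha=alpha)
    n_needed = required_n(delta=2.0, sigma=8.0, alpha=alpha, power=0.80)
    print(f"alpha={alpha}: power at n=64 is {pw:.3f}, n for 80% power is {n_needed}")
```

Holding the effect and σ fixed, moving from α = 0.05 to α = 0.01 lowers power at a given n and raises the sample size needed to reach 80% power.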

Worked Examples: Applying the Decision Rule

Each example below states the decision rule explicitly before computing anything, then applies it mechanically. Critical values use the standard normal and t-distributions as documented by the NIST Engineering Statistics Handbook.

Example 1 — One-Sample Z-Test (Corporate Operations)

Worked Example 1 — One-Sample Z-Test

Problem: A logistics company claims mean parcel delivery time is 48 hours. An operations analyst samples 64 deliveries and records x̄ = 50.5 hours. Known population SD σ = 8 hours. At α = 0.05, does the data contradict the company's claim?

Z-Test Statistic
z = (x̄ − μ₀) / (σ / √n)
x̄ = 50.5 hrs, μ₀ = 48 hrs, σ = 8 hrs, n = 64

1. Hypotheses: H₀: μ = 48 hours  |  H₁: μ ≠ 48 hours (two-tailed: testing for any departure from the claimed value)

2. Significance level: α = 0.05 (two-tailed)

3. Decision rule (stated before calculation):
P-value method: Reject H₀ if p ≤ 0.05
Critical value method: Reject H₀ if |z| ≥ 1.96

4. Test statistic:
SE = σ/√n = 8/√64 = 8/8 = 1.00
z = (50.5 − 48) / 1.00 = 2.50

5. Apply the decision rule:
Critical value method: |z| = 2.50 > 1.96 → test statistic is in the rejection region
P-value method: p = 2 × P(Z > 2.50) = 2 × 0.0062 = 0.0124 < 0.05

✅ Decision: Reject H₀. At α = 0.05, there is sufficient evidence to conclude that mean delivery time differs from 48 hours. Both methods agree: the test statistic (z = 2.50) exceeds the critical value (1.96), and p = 0.0124 < 0.05.

Critical value z = 1.96 from the z-table. Z-test methodology: Fisher, R.A. (1925). Statistical Methods for Research Workers.
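
The arithmetic in Example 1 can be reproduced with a short script. This is a sketch using Python's standard library, with the inputs taken from the problem statement; variable names are illustrative.

```python
import math
from statistics import NormalDist

# Inputs from Example 1.
x_bar, mu0, sigma, n, alpha = 50.5, 48.0, 8.0, 64, 0.05

std_normal = NormalDist()

# Decision rule, fixed before looking at the statistic (two-tailed).
z_crit = std_normal.inv_cdf(1 - alpha / 2)

# Test statistic.
se = sigma / math.sqrt(n)
z = (x_bar - mu0) / se

# Apply both formulations of the rule.
p = 2 * (1 - std_normal.cdf(abs(z)))
reject_by_critical_value = abs(z) >= z_crit
reject_by_p_value = p <= alpha

print(f"z = {z:.2f}, z_crit = {z_crit:.3f}, p = {p:.4f}")
print("reject H0:", reject_by_critical_value, reject_by_p_value)
```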

Example 2 — One-Sample T-Test (Unknown Population SD)

Worked Example 2 — One-Sample T-Test

Problem: A manufacturer claims its batteries last 300 hours. A quality engineer tests a sample of 16 batteries and finds x̄ = 291 hours with s = 20 hours. At α = 0.05, is there evidence that batteries are lasting less than claimed?

T-Test Statistic
t = (x̄ − μ₀) / (s / √n)
x̄ = 291 hrs, μ₀ = 300 hrs, s = 20 hrs, n = 16, df = 15

1. Hypotheses: H₀: μ = 300 hours  |  H₁: μ < 300 hours (left-tailed: testing specifically for shorter life)

2. Significance level: α = 0.05 (left-tailed)

3. Decision rule (stated before calculation):
P-value method: Reject H₀ if p ≤ 0.05
Critical value method: Reject H₀ if t ≤ −t*(df=15, α=0.05) = −1.753 (from t-distribution table)

4. Test statistic:
SE = s/√n = 20/√16 = 20/4 = 5.00
t = (291 − 300) / 5.00 = −9/5 = −1.80

5. Apply the decision rule:
Critical value method: t = −1.80 < −1.753 → test statistic is in the left rejection region
P-value method: p = P(T ≤ −1.80) ≈ 0.046 < 0.05

✅ Decision: Reject H₀. At α = 0.05, there is sufficient evidence that the batteries' mean life is less than 300 hours. The test statistic (t = −1.80) falls just past the critical boundary (−1.753), and p ≈ 0.046. See the full one-sample t-test guide for more examples.
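
Example 2 can be checked without a statistics package by integrating the Student's t density numerically. The sketch below uses Simpson's rule from the standard library only; the helpers `t_pdf` and `t_cdf` are hypothetical names written for this illustration, and the critical value 1.753 is taken from the table as in the example. In practice a statistics library would normally supply the t distribution directly.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t: float, df: int, steps: int = 2000) -> float:
    """P(T <= t), using Simpson's rule on [0, |t|] and the symmetry of the density."""
    h = abs(t) / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


# Inputs from Example 2.
x_bar, mu0, s, n, alpha = 291.0, 300.0, 20.0, 16, 0.05
df = n - 1
t_crit = 1.753  # one-tailed critical value for df = 15, from the t table

t_stat = (x_bar - mu0) / (s / math.sqrt(n))
p_value = t_cdf(t_stat, df)  # left-tailed: P(T <= t)

print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
print("reject H0:", t_stat <= -t_crit, p_value <= alpha)
```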

Example 3 — Two-Sample T-Test (A/B Testing)

Worked Example 3 — A/B Testing Decision Rule

Problem: A product team runs an A/B test on a checkout flow. Version A (n₁ = 100) averages $52 per order (s₁ = $12); Version B (n₂ = 100) averages $56 per order (s₂ = $14). At α = 0.05, did Version B generate significantly higher revenue per order?

Two-Sample T-Test Statistic (Welch's)
t = (x̄₂ − x̄₁) / √(s₁²/n₁ + s₂²/n₂)
x̄₁ = 52 (Version A), x̄₂ = 56 (Version B), s₁ = 12, s₂ = 14, n₁ = n₂ = 100

1. Hypotheses: H₀: μ_A = μ_B (no difference)  |  H₁: μ_B > μ_A (right-tailed: testing that B exceeds A)

2. Decision rule (before computing):
P-value method: Reject H₀ if p ≤ 0.05
Critical value method: Reject H₀ if t ≥ 1.645 (large-sample approximation; use the two-sample t-test guide for exact df via Welch–Satterthwaite)

3. Test statistic:
SE = √(144/100 + 196/100) = √(1.44 + 1.96) = √3.40 ≈ 1.844
t = (56 − 52) / 1.844 = 4 / 1.844 ≈ 2.17

4. Apply the decision rule:
Critical value method: t = 2.17 > 1.645 → test statistic is in the right rejection region
P-value method: p ≈ 0.015 < 0.05
Note on sign: the difference is taken as B − A so that it matches H₁: μ_B > μ_A. Computing A − B instead gives t ≈ −2.17, which would have to be compared with a left-tail boundary of −1.645; comparing it with +1.645 would wrongly fail to reject.

✅ Decision: Reject H₀. Version B produced statistically significantly higher average revenue per order at α = 0.05 (t = 2.17 > 1.645, p ≈ 0.015). Sign convention matters: always frame the test statistic to match H₁'s direction.
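
The A/B example can be reproduced as follows. The sketch computes the Welch statistic with the difference taken as B minus A, the Welch–Satterthwaite degrees of freedom, and a p-value based on the same large-sample normal approximation the example uses for its critical value. Variable names are illustrative.

```python
import math
from statistics import NormalDist

# Inputs from Example 3.
mean_a, sd_a, n_a = 52.0, 12.0, 100
mean_b, sd_b, n_b = 56.0, 14.0, 100
alpha = 0.05

var_term_a = sd_a ** 2 / n_a
var_term_b = sd_b ** 2 / n_b
se = math.sqrt(var_term_a + var_term_b)

# Difference taken as B - A so that a positive t supports H1: mu_B > mu_A.
t_stat = (mean_b - mean_a) / se

# Welch-Satterthwaite approximation to the degrees of freedom.
df = (var_term_a + var_term_b) ** 2 / (
    var_term_a ** 2 / (n_a - 1) + var_term_b ** 2 / (n_b - 1)
)

# Large-sample approximation: treat t as standard normal (df is close to 200 here).
std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha)
p_approx = 1 - std_normal.cdf(t_stat)

print(f"SE = {se:.3f}, t = {t_stat:.2f}, df = {df:.1f}, p = {p_approx:.3f}")
print("reject H0:", t_stat >= z_crit, p_approx <= alpha)
```

With df near 193, the exact t critical value is only slightly larger than 1.645, so the normal approximation does not change the decision in this example. That would need rechecking for small samples.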

Fail to Reject Is Not Accept: A Critical Distinction

The most common error in reporting statistical conclusions is writing "we accept H₀" when the decision rule does not lead to rejection. This is wrong, and the distinction matters practically.

Situation | Incorrect Phrasing | Correct Phrasing
p = 0.23, α = 0.05 | We accept H₀: the mean equals 50. | We fail to reject H₀. There is insufficient evidence at α = 0.05 to conclude that the mean differs from 50.
p = 0.08, α = 0.05 | The drug has no effect (H₀ is true). | The data did not produce a statistically significant result at α = 0.05. This does not rule out a real effect.
p = 0.049, α = 0.05 | We barely proved H₁ is true. | We reject H₀ at α = 0.05. The result is statistically significant; H₁ is supported by the data, not proven.

Failing to reject H₀ means your sample did not provide enough evidence against the null claim at the chosen significance level. Four things can produce this outcome: the null hypothesis is genuinely correct; the sample size was too small to detect the effect; the true effect size is smaller than the study was powered to detect; or the test was misspecified. A non-significant result warrants investigation, not the conclusion that "nothing is there."

🚫
Never Write "We Accept H₀"

A single test cannot prove a null hypothesis. Absence of evidence (p > α) is not evidence of absence. The correct framing is always: "There is insufficient statistical evidence to reject H₀ at the α = 0.05 significance level." For more on what p-values do and do not say, see the p-values explainer.

Decision Rule Reference Tables

Primary Decision Rule Framework

Method | Condition for Rejection (Reject H₀) | Condition for Retention (Fail to Reject H₀)
P-Value Approach | p ≤ α | p > α
Critical Value Approach | Test statistic falls in rejection region | Test statistic falls in retention region

Tail Configurations and Boundary Conditions (z-test)

Test Type      Alternative Hypothesis (H₁)   Rejection Region Condition   α = 0.05 boundary   α = 0.01 boundary
Left-Tailed    μ < μ₀                        z ≤ −Zα                      z ≤ −1.645          z ≤ −2.326
Right-Tailed   μ > μ₀                        z ≥ Zα                       z ≥ 1.645           z ≥ 2.326
Two-Tailed     μ ≠ μ₀                        |z| ≥ Zα/2                   |z| ≥ 1.960         |z| ≥ 2.576

Common Critical Values for the T-Distribution

Degrees of Freedom (df)   α = 0.10 (two-tailed)   α = 0.05 (two-tailed)   α = 0.01 (two-tailed)   α = 0.05 (one-tailed)
5                         2.015                   2.571                   4.032                   2.015
10                        1.812                   2.228                   3.169                   1.812
15                        1.753                   2.131                   2.947                   1.753
20                        1.725                   2.086                   2.845                   1.725
30                        1.697                   2.042                   2.750                   1.697
60                        1.671                   2.000                   2.660                   1.671
∞ (z)                     1.645                   1.960                   2.576                   1.645

Full tables: t-distribution table  |  z-table  |  chi-square table
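
The t critical values in the table above can be regenerated numerically. The sketch below is one possible approach using only the standard library: it integrates the t density with Simpson's rule and then finds the quantile by bisection. The helper names are hypothetical, and the bisection bracket of 0 to 100 assumes df ≥ 2; a statistics library with a built-in t quantile function would normally be the simpler choice.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t: float, df: int, steps: int = 400) -> float:
    """P(T <= t) via Simpson's rule on [0, |t|]; steps must be even."""
    h = abs(t) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


def t_critical(alpha: float, df: int, tail: str = "two") -> float:
    """Positive critical value found by bisection on the CDF (assumes df >= 2)."""
    target = 1 - (alpha / 2 if tail == "two" else alpha)
    lo, hi = 0.0, 100.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for df in (5, 10, 15, 20, 30, 60):
    row = [t_critical(a, df, "two") for a in (0.10, 0.05, 0.01)]
    row.append(t_critical(0.05, df, "one"))
    print(df, [round(v, 3) for v in row])
```

The α = 0.10 two-tailed and α = 0.05 one-tailed columns coincide because both leave 0.05 in a single tail.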

Key Entities and Formulas

Entity | Notation | Definition / Formula
Null Hypothesis | H₀ | The default claim being tested; assumes no effect or no difference (e.g., μ = μ₀)
Alternative Hypothesis | H₁ (or Hₐ) | The research claim; states an effect exists (e.g., μ ≠ μ₀, μ > μ₀, or μ < μ₀)
Significance Level | α | Pre-set Type I error rate; probability of rejecting H₀ when it is true
P-Value | p | P(data as extreme or more extreme | H₀ true); small values indicate evidence against H₀
Z Statistic | z = (x̄ − μ₀)/(σ/√n) | Standardized distance from sample mean to hypothesized mean; used when σ is known
T Statistic | t = (x̄ − μ₀)/(s/√n) | Standardized distance using sample SD; follows t-distribution with df = n − 1
Critical Value | Zcrit or t* | The threshold test statistic value at which the decision switches from "fail to reject" to "reject"
Rejection Region | RR | The set of test statistic values for which H₀ is rejected; located in the tail(s) of the distribution
Type I Error | α (false positive) | Rejecting H₀ when it is actually true; its probability is at most α by construction (exactly α for a continuous test statistic and a point null)
Type II Error | β (false negative) | Failing to reject H₀ when it is false; its complement (1 − β) is statistical power

Frequently Asked Questions

Q: What is the decision rule in a hypothesis test?

A decision rule is a predefined, explicit criterion for choosing between two outcomes — reject H₀ or fail to reject H₀ — based on sample data. It is stated before data collection and takes one of two forms: "reject if p ≤ α" (p-value method) or "reject if the test statistic falls in the rejection region" (critical value method). Both are mathematically equivalent.

Q: How do you make a decision in hypothesis testing?

  • State H₀, H₁, and α before collecting data
  • Compute the test statistic from the sample
  • Get the p-value, or compare the statistic to the critical value
  • If p ≤ α: reject H₀ (the result is statistically significant)
  • If p > α: fail to reject H₀ (insufficient evidence)

Q: Can I state the decision rule after seeing the data?

No. A decision rule stated after observing the data is not a legitimate decision rule — it is rationalization. The entire purpose of stating the rule in advance is to control the Type I error rate. When α is chosen after seeing whether p < 0.05 or p < 0.01 produced a nicer conclusion, the actual false positive rate is no longer controlled at the claimed level. This practice is known as p-hacking. The American Statistical Association's 2016 statement on p-values explicitly addresses this problem, available at The American Statistician.

Q: What is the difference between the p-value method and the critical value method?

They operate in different spaces but always reach the same conclusion. The p-value method computes the exact tail probability and compares it to α; it works in probability space (0 to 1). The critical value method maps the test statistic to a boundary on the sampling distribution and checks whether the statistic crosses that boundary; it works in data space (z-scores, t-values). Most statistical software reports the test statistic and its p-value by default, while critical values usually have to be looked up or computed separately. The p-value method is more common in practice because it gives an exact measure of evidence; the critical value method is more common in textbook hand-calculation exercises because it connects visually to the rejection region diagram.

Q: How does the decision rule apply in A/B testing?

An A/B test is a two-sample hypothesis test. The decision rule is typically: "reject H₀ (no difference between variants) if p ≤ 0.05." In practice, product teams pre-register the required sample size using power analysis, run the test until that size is reached, then apply the decision rule mechanically. Early stopping — checking the result before the planned end and stopping if p < 0.05 — inflates the false positive rate and violates the spirit of the decision rule. See the hypothesis testing examples page for a fully worked A/B testing example.
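
The effect of early stopping can be illustrated with a small simulation. The sketch below is deliberately simplified and rests on stated assumptions: H₀ is true, observations are standard normal with known σ = 1, a one-sample z-test stands in for the two-sample comparison, and "peeking" means checking every 40 observations and stopping at the first significant result. The function name and all settings are hypothetical.

```python
import math
import random
from statistics import NormalDist


def false_positive_rate(peek: bool, trials: int = 2000, n_max: int = 400,
                        look_every: int = 40, alpha: float = 0.05,
                        seed: int = 0) -> float:
    """Share of simulated experiments that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        running_sum = 0.0
        for n in range(1, n_max + 1):
            running_sum += rng.gauss(0.0, 1.0)  # data generated under H0
            interim_look = peek and n % look_every == 0
            if interim_look or n == n_max:
                z = running_sum / math.sqrt(n)  # z statistic with sigma = 1
                if abs(z) >= z_crit:
                    rejections += 1
                    break  # stop the experiment at the first "significant" look
    return rejections / trials


fixed_rate = false_positive_rate(peek=False)   # one look, at the planned sample size
peeking_rate = false_positive_rate(peek=True)  # ten looks, stop at first rejection
print(f"fixed-sample rule: {fixed_rate:.3f}")
print(f"with peeking:      {peeking_rate:.3f}")
```

With a single look at the planned sample size, the rejection rate should sit near the nominal 5%. With repeated looks it climbs well above that, because each look is another chance for noise to cross the boundary. Sequential testing methods exist that adjust the boundary to allow interim looks, but they are outside the scope of this sketch.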

The Decision Rule in Practice

The decision rule shows up under different names and in different software outputs across every field that uses inferential statistics, but the underlying logic is always the same.

🏥

Clinical Research

Trial protocols registered with ClinicalTrials.gov must pre-specify α and the primary endpoint decision rule. Regulators treat post-hoc α adjustment as a form of protocol deviation.

📊

A/B Testing

Growth teams at technology companies use a two-sample decision rule to determine whether a product variant is a significant improvement. The rule gates shipping decisions. Bayesian alternatives also exist.

🏭

Quality Control

Control charts in manufacturing implement a continuous decision rule: flag a process as out-of-control when the test statistic (sample mean or range) breaches a ±3σ control limit — effectively α ≈ 0.0027.

🔬

Physics

Particle physics uses a 5-sigma (5σ) standard for discovery claims: a one-tailed z-test decision rule with p < 2.87 × 10⁻⁷. The 2012 Higgs boson announcement met this threshold.

📈

Finance & Economics

Regression coefficients in econometric models are tested with a t-test decision rule. A coefficient is reported as "statistically significant" when |t| ≥ t*(df, α), which is roughly 2.0 (1.96 in the limit) for large samples at α = 0.05.

🧬

Genomics

Genome-wide association studies test hundreds of thousands to millions of SNPs simultaneously. The per-test decision rule uses α = 5 × 10⁻⁸ (0.05 Bonferroni-adjusted for roughly one million independent tests) rather than 0.05, to control the family-wise error rate.
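
Several of the thresholds quoted in this section are tail areas of the standard normal distribution and can be checked directly. In the sketch below, the figure of one million tests behind the genomics threshold is an assumption made for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Quality control: probability of falling outside +/- 3 sigma (both tails).
alpha_3sigma_two_sided = 2 * std_normal.cdf(-3)

# Physics: probability of exceeding 5 sigma in one tail.
p_5sigma_one_sided = std_normal.cdf(-5)

# Genomics: Bonferroni adjustment of 0.05, assuming about one million tests.
gwas_alpha = 0.05 / 1_000_000

print(f"3-sigma, two-sided: {alpha_3sigma_two_sided:.4f}")
print(f"5-sigma, one-sided: {p_5sigma_one_sided:.3e}")
print(f"GWAS per-test alpha: {gwas_alpha:.1e}")
```

The lower tail `cdf(-k)` is used instead of `1 - cdf(k)` because subtracting two numbers that are both very close to 1 loses precision for tail areas this small.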

The decision rule sits at the centre of hypothesis testing.

External references: NIST/SEMATECH e-Handbook of Statistical Methods — Critical Region and Hypothesis Testing. Wasserstein & Lazar (2016) — ASA Statement on P-Values. Penn State STAT 415 — Introduction to Mathematical Statistics.