What is a null hypothesis?

The null hypothesis (H₀) is the default assumption that there is no effect, no difference, or no relationship in the population being studied. It represents the status quo and is the hypothesis that statistical tests are designed to challenge. For example, H₀: μ = 100 states that the population mean equals 100.

What is an alternative hypothesis?

The alternative hypothesis (H₁ or Hₐ) is the claim the researcher wants to test — that a real effect, difference, or relationship exists. It directly contradicts the null hypothesis. For example, H₁: μ ≠ 100 states that the population mean is not equal to 100.

What is the difference between null and alternative hypothesis?

The null hypothesis (H₀) assumes no effect or difference and is the statement being tested. The alternative hypothesis (H₁) represents the claim that an effect or difference exists. You never prove H₀; instead, you either reject it in favor of H₁ (when evidence is strong enough) or fail to reject it (when evidence is insufficient).

When do you reject the null hypothesis?

You reject the null hypothesis when the p-value is less than or equal to the significance level α (commonly 0.05). At α = 0.05, this means that data at least as extreme as what you observed would occur no more than 5% of the time if H₀ were true, which is treated as enough evidence to reject H₀ in favor of H₁.

What is the difference between a one-tailed and two-tailed test?

A two-tailed test checks for differences in either direction (H₁: μ ≠ μ₀) and splits α across both tails of the distribution. A one-tailed test checks for a difference in only one direction — either greater than (H₁: μ > μ₀) or less than (H₁: μ < μ₀) — and places all of α in a single tail. Use a one-tailed test only when your research question specifies a directional prediction.

What is a Type I error?

A Type I error (false positive) occurs when you reject the null hypothesis even though it is actually true. The probability of committing a Type I error is equal to the significance level α. Setting α = 0.05 means you accept a 5% chance of incorrectly rejecting a true null hypothesis.

What is a Type II error?

A Type II error (false negative) occurs when you fail to reject the null hypothesis even though it is actually false. The probability of a Type II error is called β. Statistical power (1 − β) is the probability of correctly rejecting a false null hypothesis, and researchers typically aim for power of 0.80 or higher.

What does failing to reject the null hypothesis mean?

Failing to reject the null hypothesis means the data did not provide sufficient evidence to support the alternative hypothesis. It does not mean H₀ is proven true — only that the test was inconclusive. The result could stem from a small sample size, high variability in the data, or a genuinely absent effect.

Null and Alternative Hypothesis Explained with Examples (2026)

Null and Alternative Hypothesis: Core Definitions

Definition — Null Hypothesis (H₀)

The null hypothesis is the default assumption that there is no effect, no difference, or no relationship in the population. It is the statement that a statistical test is designed to challenge. Researchers never prove the null hypothesis true; they either reject it or fail to reject it based on data.

H₀: μ = μ₀ | H₀: p = p₀ | H₀: μ₁ = μ₂

Definition — Alternative Hypothesis (H₁ or Hₐ)

The alternative hypothesis is the claim the researcher is trying to support — that a real effect, difference, or relationship exists in the population. It is what the test is built to detect. If evidence is strong enough, the researcher rejects H₀ in favor of H₁.

H₁: μ ≠ μ₀ | H₁: μ > μ₀ | H₁: μ < μ₀

Hypothesis tests in science, medicine, business analytics, and social research use this same structure. The null and alternative hypotheses together form a mutually exclusive and exhaustive pair: exactly one of them describes the true state of the population, and the data help you judge whether there is enough evidence to reject H₀.

⚡ Quick Reference — H₀ vs H₁ at a Glance

Null hypothesis (H₀): No effect, no difference. Always contains an equality sign (=, ≤, ≥)
Alternative hypothesis (H₁): Effect exists. Contains ≠, >, or <
You test H₀: The goal is to find evidence strong enough to reject it
Rejection rule: Reject H₀ when p-value ≤ α (significance level)
"Fail to reject": Not the same as proving H₀ true — the data was simply inconclusive
Common α values: 0.05 (most fields), 0.01 (medicine), 0.10 (exploratory research)

The Court Trial Analogy: How to Think About Hypothesis Testing

The easiest way to understand the null vs alternative hypothesis structure is through a court trial. The logic is identical.

Court Trial	Hypothesis Testing	What It Means
Defendant is innocent until proven guilty	H₀ is assumed true until evidence says otherwise	Default assumption requires no evidence to accept
Prosecution presents evidence	Researcher collects data and computes test statistic	Evidence evaluated against the default assumption
"Beyond reasonable doubt"	p-value ≤ α (significance level)	Evidence must cross a defined threshold
Verdict: Guilty	Reject H₀, support H₁	Evidence is strong enough to overturn the default
Verdict: Not Guilty	Fail to reject H₀	Insufficient evidence — not proof of innocence

Notice that "not guilty" is not the same as "innocent." The jury does not conclude the defendant is innocent; they conclude the evidence did not meet the required standard. Hypothesis testing works exactly the same way. Failing to reject H₀ never means you have proven H₀ is true. It means your data, at the chosen significance level, was not strong enough to reject it.

The Most Common Misconception

Many students write "we accept H₀" when a test is not significant. This is statistically incorrect. The correct language is "we fail to reject H₀." You can only reject or fail to reject a null hypothesis — you never accept or prove it. This distinction matters in published research and on statistics exams.

How to Write Null and Alternative Hypotheses

Good hypotheses share four properties: they are clear, testable, mutually exclusive, and stated in terms of population parameters (not sample statistics). The parameter is the true value in the whole population; the statistic is what you calculate from your sample.

H₀ Rule

Null always uses equality

H₀: μ = μ₀

The null hypothesis always contains =, ≤, or ≥. It represents the status quo. Write the value you are comparing against as μ₀ or p₀.

H₁ Rule

Alternative defines the test direction

H₁: μ ≠, >, or < μ₀

Use ≠ for a two-tailed test (either direction). Use > or < for a directional one-tailed test when theory predicts a specific direction.

Parameter Rule

Test population parameters

μ, p, σ, μ₁−μ₂

Hypotheses are about population parameters (μ, p), not sample statistics (x̄, p̂). You collect sample data to make inferences about the population.

Exclusivity Rule

H₀ and H₁ cover all cases

H₀ ∪ H₁ = all outcomes

Together, H₀ and H₁ must account for every possible value of the parameter. No value should fall into neither hypothesis.

Writing Hypotheses: Six Real Scenarios

Research Scenario	H₀	H₁	Test Type
Does a new drug lower mean blood pressure below 120 mmHg?	H₀: μ ≥ 120	H₁: μ < 120	One-tailed (left)
Has a website's conversion rate changed from the historical 3.5%?	H₀: p = 0.035	H₁: p ≠ 0.035	Two-tailed
Does a training program improve test scores above the national mean of 75?	H₀: μ ≤ 75	H₁: μ > 75	One-tailed (right)
Do two manufacturing machines produce parts with equal mean diameters?	H₀: μ₁ = μ₂	H₁: μ₁ ≠ μ₂	Two-tailed
Is the defect rate in a production line above the acceptable 2%?	H₀: p ≤ 0.02	H₁: p > 0.02	One-tailed (right)
Is mean customer satisfaction different between two service designs?	H₀: μ₁ − μ₂ = 0	H₁: μ₁ − μ₂ ≠ 0	Two-tailed

The 6-Step Hypothesis Testing Framework

Every hypothesis test follows the same logical sequence. Learn this order once and you can apply it to any statistical test — z-test, t-test, chi-square, ANOVA, or regression coefficient test.

Framework

The Universal 6-Step Process

1. State the hypotheses (H₀ and H₁). Write both hypotheses using formal parameter notation. Decide at this step whether the test is one-tailed or two-tailed based on the research question. Never adjust the direction of the test after seeing the data.

2. Set the significance level (α). Choose α before collecting data. Common choices are 0.05 (general research), 0.01 (clinical/medical studies requiring stricter evidence), and 0.10 (exploratory pilot studies). α defines the risk of a Type I error you are willing to accept.

3. Choose the test and verify assumptions. Select the appropriate test based on data type, sample size, and whether population parameters are known. A statistical test selector can help. Check normality, independence, and homogeneity of variance as needed.

4. Collect data and calculate the test statistic. Compute the standardized test statistic (z-score or t-score) from your sample data. This single number summarizes how far your sample result is from what H₀ predicts, measured in standard errors.

5. Find the p-value and compare to α. The p-value is the probability of observing a test statistic as extreme as yours (or more extreme) if H₀ were true. If p ≤ α, the evidence is strong enough to reject H₀. If p > α, fail to reject H₀.

6. State the conclusion in context. Write a plain-language conclusion that references the original research question. Never just write "reject H₀"; explain what that means for the specific problem. For example: "There is sufficient evidence at α = 0.05 to conclude that the new drug lowers mean blood pressure below 120 mmHg."
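The rejection rule in step 5 can be sketched as a small Python helper. This is only an illustration: the function name `decide` and its return strings are assumptions made for this example, not part of any library.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the rejection rule from step 5: reject H0 when p <= alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    # A non-significant result is worded "fail to reject", never "accept".
    return "reject H0" if p_value <= alpha else "fail to reject H0"


print(decide(0.032))        # same p-value, alpha = 0.05
print(decide(0.032, 0.01))  # same p-value, stricter alpha = 0.01
```

The same p-value leads to different decisions under different α, which is why α has to be fixed in step 2, before the data are seen.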

Test Statistic Formulas

The test statistic converts your sample result into a standardized number that can be looked up in a distribution table or converted to a p-value. The two most common tests for means are the z-test (when population standard deviation σ is known) and the t-test (when σ is unknown and estimated from the sample).

Z-Test Statistic — Population SD Known

z = (x̄ − μ₀) / (σ / √n)

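Use when: population σ is known. When σ is unknown but n is large (roughly n ≥ 30), substituting s for σ gives a close approximation, because the t-distribution is then nearly normal.

x̄ = sample mean | μ₀ = hypothesized population mean (from H₀) | σ = population standard deviation | n = sample size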

One-Sample T-Test Statistic — Population SD Unknown

t = (x̄ − μ₀) / (s / √n)

Use when: population σ is unknown, sample SD s is used as an estimate, df = n − 1

x̄ = sample mean | μ₀ = hypothesized population mean | s = sample standard deviation | n = sample size

One-Sample Proportion Z-Test

z = (p̂ − p₀) / √[p₀(1 − p₀) / n]

Use when: testing a population proportion, np₀ ≥ 10 and n(1 − p₀) ≥ 10

p̂ = sample proportion | p₀ = hypothesized population proportion (from H₀) | n = sample size

Z-Test vs T-Test: Which Should You Use?

Use a z-test when the population standard deviation (σ) is known. Use a t-test when σ is unknown and must be estimated by the sample standard deviation s, whatever the sample size. For large samples (roughly n ≥ 30) the t-distribution is so close to the standard normal that the two tests give nearly the same answer, which is why the z-test is often used as a large-sample shortcut. In practice, σ is almost never known for real-world problems, so the t-test is more commonly used.

Step-by-Step Worked Examples

Example 1: Z-Test (Two-Tailed)

Worked Example — Z-Test

A quality control engineer samples 50 bolts from a production line. The historical mean bolt diameter is 10 mm with a known population standard deviation of 0.5 mm. The sample mean is 10.14 mm. At α = 0.05, is there evidence that the mean diameter has changed?

State hypotheses: H₀: μ = 10 mm | H₁: μ ≠ 10 mm. This is a two-tailed test — the engineer wants to detect any change, not just an increase or decrease.

Set significance level: α = 0.05 (given). For a two-tailed test, the critical region is split between both tails: α/2 = 0.025 per tail. Critical z-values are ±1.96.

Calculate test statistic: z = (x̄ − μ₀) / (σ / √n) = (10.14 − 10) / (0.5 / √50) = 0.14 / 0.0707 = 1.98

Find p-value: z = 1.98 corresponds to a one-tail area of 0.0239. For a two-tailed test: p-value = 2 × 0.0239 = 0.0478.

Decision: p-value (0.0478) ≤ α (0.05) → Reject H₀. The test statistic (1.98) also falls in the rejection region (|z| > 1.96).

✓ Conclusion: At α = 0.05, there is sufficient statistical evidence to conclude that the mean bolt diameter has changed from 10 mm. The production line may need recalibration.
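The arithmetic in this example can be checked with a short Python sketch that uses only the standard library. The helper name `z_test` and its `tail` argument are assumptions made for illustration; the inputs are the bolt figures from the example.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, tail="two"):
    """One-sample z-test; returns (z, p_value).

    tail: "two", "right" (H1: mu > mu0) or "left" (H1: mu < mu0).
    """
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf(z)           # area to the left of z
    if tail == "two":
        p = 2 * min(cdf, 1 - cdf)       # double the smaller tail
    elif tail == "right":
        p = 1 - cdf
    elif tail == "left":
        p = cdf
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return z, p


z, p = z_test(10.14, 10, 0.5, 50, tail="two")
print(f"z = {z:.2f}, p = {p:.4f}")
```

The computed z and p-value agree with the hand calculation up to rounding, and the p-value falls just under α = 0.05.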

Example 2: One-Sample T-Test (One-Tailed)

Worked Example — T-Test

A school claims its students score above the national average of 72 on a standardized test. A sample of 16 students has a mean score of 76 with a sample standard deviation of 8. At α = 0.05, does the evidence support the school's claim?

State hypotheses: H₀: μ ≤ 72 | H₁: μ > 72. This is a right-tailed test — the school claims scores are above the national average, a directional claim.

Set significance level: α = 0.05. Degrees of freedom: df = n − 1 = 15. For a right-tailed t-test at α = 0.05 with df = 15, the critical value is t* = 1.753.

Calculate test statistic: t = (x̄ − μ₀) / (s / √n) = (76 − 72) / (8 / √16) = 4 / 2 = 2.00

Find p-value: Using the t-distribution with df = 15, t = 2.00 gives a one-tail p-value ≈ 0.032.

Decision: p-value (0.032) ≤ α (0.05) → Reject H₀. The test statistic (2.00) also exceeds the critical value (1.753).

✓ Conclusion: At α = 0.05, there is sufficient evidence to support the school's claim that students score above the national average of 72. The sample mean of 76 is statistically significantly higher.
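Python's standard library has no t-distribution, so the sketch below obtains the one-tailed p-value by numerically integrating the t density with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` are assumptions made for this example; in practice a statistics package such as SciPy would normally supply this tail probability.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


def t_upper_tail(t, df, steps=2000):
    """P(T >= t), using Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3      # P(0 <= T <= |t|)
    upper = 0.5 - area        # P(T >= |t|), by symmetry around 0
    return upper if t >= 0 else 1 - upper


# Figures from the school example: n = 16, mean 76, s = 8, mu0 = 72.
n, sample_mean, s, mu0 = 16, 76, 8, 72
t_stat = (sample_mean - mu0) / (s / sqrt(n))
p_right = t_upper_tail(t_stat, df=n - 1)
print(f"t = {t_stat:.2f}, one-tailed p = {p_right:.3f}")
```

Because H₁ is right-tailed, only the upper tail area is used; a two-tailed test on the same data would double it.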

Example 3: Proportion Z-Test

Worked Example — Proportion Test

A tech company claims its product has a 95% customer satisfaction rate. In a random survey of 200 customers, 184 report being satisfied (p̂ = 0.92). At α = 0.05, is there evidence the true satisfaction rate is below 95%?

State hypotheses: H₀: p ≥ 0.95 | H₁: p < 0.95. Left-tailed test — the question is whether satisfaction has fallen below the claimed rate.

Verify conditions: np₀ = 200 × 0.95 = 190 ≥ 10. n(1 − p₀) = 200 × 0.05 = 10 ≥ 10. ✓ Conditions met. Critical value at α = 0.05 (left-tail): z* = −1.645.

Calculate test statistic: z = (0.92 − 0.95) / √[0.95 × 0.05 / 200] = −0.03 / √(0.0002375) = −0.03 / 0.01541 ≈ −1.95

Find p-value: z = −1.95 gives a left-tail area ≈ 0.026.

Decision: p-value (0.026) ≤ α (0.05) → Reject H₀.

✓ Conclusion: At α = 0.05, there is sufficient evidence to conclude the true satisfaction rate is below the claimed 95%. The company should investigate the drop in customer satisfaction.
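The condition check and the test statistic for this example can be reproduced with a short standard-library sketch. The function name `proportion_z_test` is an assumption made for illustration, and it covers only the left-tailed case used here.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0):
    """Left-tailed one-sample proportion z-test; returns (z, p_value)."""
    # Large-sample conditions from the text: n*p0 >= 10 and n*(1 - p0) >= 10.
    # The small tolerance avoids rejecting borderline cases due to float round-off.
    if n * p0 < 10 - 1e-9 or n * (1 - p0) < 10 - 1e-9:
        raise ValueError("normal approximation conditions not met")
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return z, NormalDist().cdf(z)   # left-tail area is the p-value


z, p = proportion_z_test(184, 200, 0.95)
print(f"z = {z:.2f}, left-tailed p = {p:.3f}")
```

Note that the standard error uses p₀ from H₀, not the sample proportion p̂, because the test statistic is computed under the assumption that H₀ is true.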

What the P-Value Actually Means

The p-value is one of the most misunderstood concepts in statistics. Here is the precise definition:

Definition — P-Value

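The p-value is the probability of observing a test statistic at least as extreme as the one calculated from the sample data, assuming the null hypothesis is true. A small p-value means the observed result would be unlikely if H₀ were true, which gives reason to doubt H₀.

Right-tailed: p-value = P(test statistic ≥ observed | H₀ is true). Left-tailed: p-value = P(test statistic ≤ observed | H₀ is true). Two-tailed: p-value = P(|test statistic| ≥ |observed| | H₀ is true).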

Three Things the P-Value Is NOT

1. The p-value is NOT the probability that H₀ is true.
2. The p-value is NOT the probability that you made an error.
3. A small p-value does NOT mean the effect is large or practically important: a trivially small difference can produce a tiny p-value with a large sample. Always report effect size alongside p-values.
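The third point can be illustrated with a small sketch. All numbers here are hypothetical (a 0.5-point difference on a scale with σ = 15), and the helper `two_tailed_p` is an assumption made for this example. The standardized difference is the same for every sample size, yet the p-value shrinks as n grows.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_p(sample_mean, mu0, sigma, n):
    """Two-tailed p-value of a one-sample z-test."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical values: a tiny observed difference of 0.5 with sigma = 15.
mu0, sample_mean, sigma = 100.0, 100.5, 15.0
effect_size = (sample_mean - mu0) / sigma   # standardized difference, independent of n

for n in (100, 1_000, 10_000, 100_000):
    p = two_tailed_p(sample_mean, mu0, sigma, n)
    print(f"n = {n:>7}: d = {effect_size:.3f}, p = {p:.4f}")
```

With these assumed values the standardized difference stays at about 0.03, while the p-value moves from well above 0.05 at n = 100 to below 0.001 at n = 10,000.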

p < 0.01: Very strong evidence against H₀
p < 0.05: Standard threshold for significance
p < 0.10: Marginal evidence (exploratory work)
p > 0.10: Weak/no evidence against H₀

One-Tailed vs Two-Tailed Tests

The tail direction is determined entirely by the alternative hypothesis, and it must be chosen before data collection. Switching from a two-tailed to a one-tailed test after seeing your results in order to reach significance is a form of p-hacking and invalidates the test.

Feature	Two-Tailed Test	One-Tailed Test
Alternative hypothesis	H₁: μ ≠ μ₀	H₁: μ > μ₀ or H₁: μ < μ₀
Where rejection region lies	Split between both tails (α/2 each)	Entirely in one tail (all of α)
Critical z at α = 0.05	±1.96	+1.645 (right) or −1.645 (left)
When to use	When you want to detect change in either direction	When theory predicts a specific direction before data collection
More conservative?	Yes — harder to reject H₀	No — easier to reject in the predicted direction
Common example	"Has the mean changed from 50?"	"Is the mean greater than 50?"

Figure: Rejection Regions, Two-Tailed vs One-Tailed (α = 0.05). Red shading marks the rejection region for the two-tailed test; green shading marks the rejection region for the right-tailed test. Same α = 0.05, different critical values.
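The critical z-values in the comparison table can be recovered from the standard normal quantile function. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()   # mean 0, standard deviation 1

two_tailed = std_normal.inv_cdf(1 - alpha / 2)   # reject when |z| exceeds this
right_tailed = std_normal.inv_cdf(1 - alpha)     # reject when z exceeds this
left_tailed = std_normal.inv_cdf(alpha)          # reject when z falls below this

print(f"two-tailed:   +/-{two_tailed:.3f}")
print(f"right-tailed: {right_tailed:+.3f}")
print(f"left-tailed:  {left_tailed:+.3f}")
```

The two-tailed cutoff is farther from zero because each tail holds only α/2, which is what makes the two-tailed test the more conservative choice.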

Type I and Type II Errors

Every hypothesis test carries two types of possible mistakes. Understanding them prevents misinterpreting results and helps researchers design studies with enough statistical power to detect real effects.

Decision \ Reality	H₀ is Actually TRUE	H₀ is Actually FALSE
Reject H₀	❌ Type I Error (False Positive) Probability = α	✓ Correct Decision Probability = 1 − β (Power)
Fail to Reject H₀	✓ Correct Decision Probability = 1 − α	⚠️ Type II Error (False Negative) Probability = β

Type I Error

False Positive

P(Type I) = α

Rejecting H₀ when it is actually true. Controlled directly by your choice of α. A drug trial commits a Type I error when it concludes a useless drug is effective. Lowering α reduces Type I error risk but increases Type II error risk.
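The claim that P(Type I) = α can be checked by simulation. The sketch below draws many samples from a population where H₀ is true by construction and counts how often a two-tailed z-test rejects anyway. The settings (μ₀ = 100, σ = 15, n = 20, 10,000 trials) are hypothetical values chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(0)   # fixed seed so the run is repeatable
mu0, sigma, n, alpha = 100.0, 15.0, 20, 0.05   # hypothetical settings
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
trials = 10_000

false_positives = 0
for _ in range(trials):
    # H0 is true by construction: every sample is drawn with mean mu0.
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z = (fmean(sample) - mu0) / (sigma / sqrt(n))
    if abs(z) > z_crit:
        false_positives += 1   # rejecting a true H0 is a Type I error

type_i_rate = false_positives / trials
print(f"observed Type I error rate: {type_i_rate:.3f}")
```

The observed rate should land close to α, with some simulation noise around 0.05.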

Type II Error

False Negative

P(Type II) = β

Failing to reject H₀ when it is actually false. Reduced by increasing sample size, increasing effect size, or raising α. A drug trial commits a Type II error when it misses a genuinely effective drug.

Statistical Power

Probability of Correct Rejection

Power = 1 − β

The probability of correctly rejecting a false H₀. Researchers typically aim for power ≥ 0.80, meaning an 80% chance of detecting a true effect. Power increases with larger sample size and larger effect size.
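How power grows with sample size can be sketched for a right-tailed one-sample z-test. The scenario is hypothetical (μ₀ = 72, true mean 76, σ = 8 treated as known), and the function name `power_right_tailed_z` is an assumption made for this example.

```python
from math import sqrt
from statistics import NormalDist


def power_right_tailed_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of a right-tailed one-sample z-test when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under the true mean, the z statistic is normal with this mean and SD 1.
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    return 1 - std_normal.cdf(z_crit - shift)


# Hypothetical scenario: true mean 4 points above mu0 = 72, sigma = 8.
for n in (8, 16, 32, 64):
    print(f"n = {n:>2}: power = {power_right_tailed_z(72, 76, 8, n):.3f}")
```

With these assumed numbers, power passes the conventional 0.80 target somewhere between n = 16 and n = 32. When the true mean equals μ₀, the same formula returns α, which is the Type I error rate.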

Academic Source

Type I and Type II Errors in Medical Research

The distinction between Type I and Type II errors was formalized by Jerzy Neyman and Egon Pearson in their foundational 1933 paper on hypothesis testing. In clinical trial design, a Type I error rate of α = 0.05 combined with a target power of 1 − β = 0.80 is the standard that determines minimum required sample sizes. The U.S. Food and Drug Administration's guidance on clinical trials requires explicit pre-specification of α and power for drug approval studies. See Banerjee et al. (2009) in the Industrial Psychiatry Journal for a clear clinical overview.

Real-World Case Studies

Case Study 1: A/B Testing in Digital Marketing

Real-World Application

Conversion Rate Optimization

An e-commerce company runs two versions of a product page simultaneously. Version A (control) shows a historical conversion rate of 4.2%. Version B (new design) converts 4.8% of visitors out of a sample of 2,500 per group. The team wants to know if the new design genuinely outperforms the control or if the difference is within normal random variation.

Hypotheses: H₀: p_B = p_A (no difference) | H₁: p_B > p_A (Version B converts better)

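Result: With the figures as stated, a pooled two-proportion z-test gives a pooled rate of 4.5%, a standard error of √[0.045 × 0.955 × (2 / 2,500)] ≈ 0.00586, and z = 0.006 / 0.00586 ≈ 1.02, for a one-tailed p-value ≈ 0.15. At α = 0.05, fail to reject H₀. A 0.6 percentage-point lift on 2,500 visitors per group is within normal random variation, so the test does not yet justify rolling out Version B. Reaching z ≈ 2.21 (p ≈ 0.014) at these conversion rates would take roughly 11,700 visitors per group, which is why small lifts in A/B tests need large samples.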

Case Study 2: Clinical Drug Trial

Real-World Application

Blood Pressure Medication Efficacy

A clinical trial tests whether a new antihypertensive drug reduces mean systolic blood pressure below the placebo group's mean of 145 mmHg. After 12 weeks, 120 patients on the drug show a mean of 138 mmHg with s = 15 mmHg.

Hypotheses: H₀: μ ≥ 145 mmHg | H₁: μ < 145 mmHg (one-tailed, left)

Result: t = (138 − 145) / (15 / √120) = −7 / 1.369 ≈ −5.11, p < 0.0001. Extremely strong evidence to reject H₀. The drug reduces blood pressure. The study proceeds to Phase III trials — but the research team also reports effect size (Cohen's d ≈ 0.47, a medium effect) to show clinical meaningfulness, not just statistical significance.

Case Study 3: Manufacturing Quality Control

Real-World Application

Production Line Defect Rate

A manufacturer's specification requires a defect rate no higher than 2%. Quality control inspects 500 units and finds 15 defective (p̂ = 0.03, or 3%). Is this evidence the process is out of control?

Hypotheses: H₀: p ≤ 0.02 | H₁: p > 0.02 (one-tailed, right)

Result: z = (0.03 − 0.02) / √(0.02 × 0.98 / 500) = 0.01 / 0.00626 ≈ 1.60. p-value ≈ 0.055. At α = 0.05, fail to reject H₀. At α = 0.10, reject H₀. This borderline result prompts the quality team to increase sampling to 1,000 units — a correct application of the "insufficient power" reasoning for inconclusive results.

Null vs Alternative Hypothesis: Full Comparison

Concept	Null Hypothesis (H₀)	Alternative Hypothesis (H₁)
Symbol	H₀	H₁ or Hₐ
What it claims	No effect, no difference, no relationship	An effect, difference, or relationship exists
Mathematical sign	Always includes = (equals, ≤, or ≥)	Always uses ≠, >, or <
Role in the test	Default assumption being tested (skeptical position)	Research claim trying to gain support
What data does	Either rejects or fails to reject H₀	Is supported when H₀ is rejected
Can you prove it?	No — only reject or fail to reject	Supported (not proven) when H₀ is rejected
Error if wrong decision	Falsely rejecting = Type I error (α)	Failing to support when true = Type II error (β)
Example (mean)	H₀: μ = 100	H₁: μ ≠ 100 (two-tailed)
Example (proportion)	H₀: p = 0.50	H₁: p > 0.50 (right-tailed)

Key Terms and Formulas Glossary

Term	Formula / Notation	Definition
Null Hypothesis	H₀: μ = μ₀	Default assumption of no effect or difference. Contains an equality sign. Never proven, only rejected or retained.
Alternative Hypothesis	H₁: μ ≠ μ₀	The research claim being tested. Contains an inequality. Supported when H₀ is rejected.
Significance Level	α (commonly 0.05)	The threshold probability for rejecting H₀. Represents the maximum acceptable Type I error rate.
P-Value	P(result at least as extreme | H₀ true)	Probability of observing results at least as extreme as the sample data if H₀ were true. Reject H₀ when p ≤ α.
Z-Test Statistic	z = (x̄ − μ₀) / (σ/√n)	Standardized test statistic used when population σ is known (or as a large-sample approximation when n ≥ 30). Compared to z-distribution critical values.
T-Test Statistic	t = (x̄ − μ₀) / (s/√n)	Test statistic used when population σ is unknown. Uses sample SD s. Degrees of freedom = n − 1.
Critical Value	z* or t*	The boundary between the rejection region and non-rejection region. For a two-tailed test, reject H₀ if |test statistic| > critical value.
Type I Error	P = α	Rejecting a true H₀ (false positive). Probability equals α. In medicine: concluding a useless drug works.
Type II Error	P = β	Failing to reject a false H₀ (false negative). Reduced by increasing sample size. In medicine: missing a real drug effect.
Statistical Power	1 − β	Probability of correctly rejecting a false H₀. Target ≥ 0.80. Increases with larger n and larger true effect size.
Standard Error	SE = σ/√n or s/√n	Standard deviation of the sampling distribution of x̄. Measures precision of the sample mean estimate.
Confidence Interval	x̄ ± z* × SE	Range of plausible values for the population parameter. A 95% CI corresponds to a two-tailed α = 0.05 test.
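The correspondence between a 95% confidence interval and a two-tailed α = 0.05 test, noted in the last glossary row, can be checked numerically. The sketch below reuses the bolt figures from Example 1 (x̄ = 10.14, μ₀ = 10, σ = 0.5, n = 50); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sample_mean, mu0, sigma, n, alpha = 10.14, 10.0, 0.5, 50, 0.05

se = sigma / sqrt(n)                              # standard error of the mean
z_star = NormalDist().inv_cdf(1 - alpha / 2)      # two-tailed critical value
ci_low = sample_mean - z_star * se
ci_high = sample_mean + z_star * se

z = (sample_mean - mu0) / se
rejects = abs(z) > z_star                         # two-tailed test decision
mu0_outside_ci = not (ci_low <= mu0 <= ci_high)   # CI-based decision

print(f"95% CI: ({ci_low:.4f}, {ci_high:.4f})")
print(f"test rejects H0: {rejects}; mu0 outside CI: {mu0_outside_ci}")
```

The two decisions always agree for this pairing: the two-tailed test rejects H₀ exactly when μ₀ falls outside the interval.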

Practice Problems

Work through these problems before checking the answers. Each one uses the 6-step framework described above.

Beginner Level

A coffee shop claims its medium drinks contain exactly 16 oz. A customer measures 10 drinks and finds a mean of 15.6 oz with s = 0.8 oz. At α = 0.05, is there evidence the mean differs from 16 oz?

Answer:

H₀: μ = 16 oz | H₁: μ ≠ 16 oz (two-tailed)
t = (15.6 − 16) / (0.8 / √10) = −0.4 / 0.253 = −1.58
df = 9. Critical value at α = 0.05, two-tailed: t* = ±2.262
|−1.58| < 2.262 → Fail to reject H₀.
Conclusion: At α = 0.05, there is insufficient evidence to conclude the mean fill differs from 16 oz. The sample of 10 is small and the result is inconclusive — a larger sample would be needed.

A coin is flipped 100 times and lands heads 58 times. At α = 0.05, is there evidence the coin is biased (not fair)?

Answer:

H₀: p = 0.50 | H₁: p ≠ 0.50 (two-tailed)
p̂ = 58/100 = 0.58
z = (0.58 − 0.50) / √(0.50 × 0.50 / 100) = 0.08 / 0.05 = 1.60
Critical value: z* = ±1.96. |1.60| < 1.96 → Fail to reject H₀.
p-value ≈ 0.110 > 0.05.
Conclusion: At α = 0.05, the 100-flip sample does not provide sufficient evidence to conclude the coin is biased. 58 heads in 100 flips is within the expected range of variation for a fair coin.

Intermediate Level

A factory claims its batteries last on average at least 500 hours. Quality control tests 36 batteries and finds x̄ = 492 hours with σ = 24 hours. At α = 0.01, does the data support the factory's claim?

Answer:

H₀: μ ≥ 500 | H₁: μ < 500 (left-tailed; we test whether evidence contradicts the claim)
z = (492 − 500) / (24 / √36) = −8 / 4 = −2.00
Critical value at α = 0.01 (left-tail): z* = −2.326
−2.00 > −2.326 → Fail to reject H₀ at α = 0.01.
p-value = 0.023. At α = 0.05 we would reject H₀; at α = 0.01 we do not.
Conclusion: At the stricter α = 0.01 significance level, there is insufficient evidence to reject the factory's claim. At α = 0.05, the evidence would be significant — illustrating how the choice of α changes the conclusion.

A sample of 25 students from a new teaching method scores x̄ = 82 with s = 10. The national mean is μ₀ = 78. At α = 0.05, is there evidence the new method improves scores?

Answer:

H₀: μ ≤ 78 | H₁: μ > 78 (right-tailed)
t = (82 − 78) / (10 / √25) = 4 / 2 = 2.00
df = 24. Critical value at α = 0.05 (right-tail): t* = 1.711
2.00 > 1.711 → Reject H₀. p-value ≈ 0.028.
Conclusion: At α = 0.05, there is sufficient evidence to conclude the new teaching method produces higher scores than the national mean of 78. The difference of 4 points is statistically significant with this sample.

Advanced Level

Researchers test whether a new website layout increases click-through rates from the historical 8%. In an A/B test, 1,200 visitors see the new layout and 120 click through (p̂ = 0.10). At α = 0.05, does the new layout perform better? Also identify: what type of error would occur if we reject H₀ when the true rate is actually still 8%?

Answer:

H₀: p ≤ 0.08 | H₁: p > 0.08 (right-tailed)
Verify: n × p₀ = 1200 × 0.08 = 96 ≥ 10 ✓ and n × (1 − p₀) = 1200 × 0.92 = 1104 ≥ 10 ✓
z = (0.10 − 0.08) / √(0.08 × 0.92 / 1200) = 0.02 / √(0.0000613) = 0.02 / 0.00783 ≈ 2.55
Critical value: z* = 1.645. 2.55 > 1.645 → Reject H₀. p-value ≈ 0.0054.
Conclusion: At α = 0.05, the new layout significantly increases click-through rate above 8%.
Error type: Rejecting H₀ when the true rate is still 8% would be a Type I error (false positive) — concluding the new design works when it actually does not. The probability of this error is α = 0.05.
