Hypothesis Testing Inferential Statistics AP Statistics / Research 25 min read June 5, 2026
BY: Statistics Fundamentals Team
Reviewed By: Minsa A (Senior Statistics Editor)

Null and Alternative Hypothesis Explained with Examples

A drug trial enrolls 200 patients. Researchers need a structured, objective way to ask: does this drug actually lower blood pressure, or could the results we are seeing just be random chance? The answer runs through two statements called hypotheses — H₀ (the null) and H₁ (the alternative) — and data decides which one the evidence supports.

This guide builds hypothesis testing from the ground up. It covers how to write H₀ and H₁, what p-values and significance levels actually mean, how to run a z-test and a t-test step by step, and how to avoid the classic errors that trip up students and researchers alike. The interactive calculator at the end lets you test your own data instantly.

What You'll Learn
  • ✓ The exact definitions of H₀ and H₁ and how they relate to each other
  • ✓ A repeatable 6-step framework: from research question to statistical decision
  • ✓ How to write correct null and alternative hypotheses for any scenario
  • ✓ Step-by-step worked examples using z-tests and t-tests with real numbers
  • ✓ One-tailed vs two-tailed tests — when each applies and why it matters
  • ✓ Type I and Type II errors, with a decision matrix you can reuse
  • ✓ Six real-world case studies: A/B testing, medicine, manufacturing, and more
  • ✓ An interactive hypothesis testing calculator for z-tests and t-tests

Null and Alternative Hypothesis: Core Definitions

Definition — Null Hypothesis (H₀)
The null hypothesis is the default assumption that there is no effect, no difference, and no relationship in the population. It is the statement that a statistical test is designed to challenge. Researchers never prove the null hypothesis true — they either reject it or fail to reject it based on data.
H₀: μ = μ₀  |  H₀: p = p₀  |  H₀: μ₁ = μ₂
Definition — Alternative Hypothesis (H₁ or Hₐ)
The alternative hypothesis is the claim the researcher is trying to support — that a real effect, difference, or relationship exists in the population. It is what the test is built to detect. If evidence is strong enough, the researcher rejects H₀ in favor of H₁.
H₁: μ ≠ μ₀  |  H₁: μ > μ₀  |  H₁: μ < μ₀

Every statistical test in science, medicine, business analytics, and social research uses this same structure. The null and alternative hypothesis together form a mutually exclusive and exhaustive pair — exactly one of them describes the true state of the population, and the data helps you determine which one the evidence favors.

⚡ Quick Reference — H₀ vs H₁ at a Glance
  • Null hypothesis (H₀): No effect, no difference. Always contains an equality sign (=, ≤, ≥)
  • Alternative hypothesis (H₁): Effect exists. Contains ≠, >, or <
  • You test H₀: The goal is to find evidence strong enough to reject it
  • Rejection rule: Reject H₀ when p-value ≤ α (significance level)
  • "Fail to reject": Not the same as proving H₀ true — the data was simply inconclusive
  • Common α values: 0.05 (most fields), 0.01 (medicine), 0.10 (exploratory research)

The Court Trial Analogy: How to Think About Hypothesis Testing

The easiest way to understand the null vs alternative hypothesis structure is through a court trial. The logic is identical.

Court Trial | Hypothesis Testing | What It Means
Defendant is innocent until proven guilty | H₀ is assumed true until evidence says otherwise | Default assumption requires no evidence to accept
Prosecution presents evidence | Researcher collects data and computes test statistic | Evidence evaluated against the default assumption
"Beyond reasonable doubt" | p-value ≤ α (significance level) | Evidence must cross a defined threshold
Verdict: Guilty | Reject H₀, support H₁ | Evidence is strong enough to overturn the default
Verdict: Not Guilty | Fail to reject H₀ | Insufficient evidence, not proof of innocence

Notice that "not guilty" is not the same as "innocent." The jury does not conclude the defendant is innocent; they conclude the evidence did not meet the required standard. Hypothesis testing works exactly the same way. Failing to reject H₀ never means you have proven H₀ is true. It means your data, at the chosen significance level, was not strong enough to reject it.

⚠️ The Most Common Misconception

Many students write "we accept H₀" when a test is not significant. This is statistically incorrect. The correct language is "we fail to reject H₀." You can only reject or fail to reject a null hypothesis — you never accept or prove it. This distinction matters in published research and on statistics exams.

How to Write Null and Alternative Hypotheses

Good hypotheses share four properties: they are clear, testable, mutually exclusive, and stated in terms of population parameters (not sample statistics). The parameter is the true value in the whole population; the statistic is what you calculate from your sample.

H₀ Rule

Null always uses equality

H₀: μ = μ₀

The null hypothesis always contains =, ≤, or ≥. It represents the status quo. Write the value you are comparing against as μ₀ or p₀.

H₁ Rule

Alternative defines the test direction

H₁: μ ≠, >, or < μ₀

Use ≠ for a two-tailed test (either direction). Use > or < for a directional one-tailed test when theory predicts a specific direction.

Parameter Rule

Test population parameters

μ, p, σ, μ₁−μ₂

Hypotheses are about population parameters (μ, p), not sample statistics (x̄, p̂). You collect sample data to make inferences about the population.

Exclusivity Rule

H₀ and H₁ cover all cases

H₀ ∪ H₁ = all outcomes

Together, H₀ and H₁ must account for every possible value of the parameter. No value should fall into neither hypothesis.

Writing Hypotheses: Six Real Scenarios

Research Scenario | H₀ | H₁ | Test Type
Does a new drug lower mean blood pressure below 120 mmHg? | H₀: μ ≥ 120 | H₁: μ < 120 | One-tailed (left)
Has a website's conversion rate changed from the historical 3.5%? | H₀: p = 0.035 | H₁: p ≠ 0.035 | Two-tailed
Does a training program improve test scores above the national mean of 75? | H₀: μ ≤ 75 | H₁: μ > 75 | One-tailed (right)
Do two manufacturing machines produce parts with equal mean diameters? | H₀: μ₁ = μ₂ | H₁: μ₁ ≠ μ₂ | Two-tailed
Is the defect rate in a production line above the acceptable 2%? | H₀: p ≤ 0.02 | H₁: p > 0.02 | One-tailed (right)
Is mean customer satisfaction different between two service designs? | H₀: μ₁ − μ₂ = 0 | H₁: μ₁ − μ₂ ≠ 0 | Two-tailed

The 6-Step Hypothesis Testing Framework

Every hypothesis test follows the same logical sequence. Learn this order once and you can apply it to any statistical test — z-test, t-test, chi-square, ANOVA, or regression coefficient test.

Framework

The Universal 6-Step Process

1. State the hypotheses (H₀ and H₁). Write both hypotheses using formal parameter notation. Decide at this step whether the test is one-tailed or two-tailed based on the research question; never adjust the direction of the test after seeing the data.

2. Set the significance level (α). Choose α before collecting data. Common choices are 0.05 (general research), 0.01 (clinical/medical studies requiring stricter evidence), and 0.10 (exploratory pilot studies). α defines the risk of a Type I error you are willing to accept.

3. Choose the test and verify assumptions. Select the appropriate test based on data type, sample size, and whether population parameters are known. A statistical test selector can help. Check normality, independence, and homogeneity of variance as needed.

4. Collect data and calculate the test statistic. Compute the standardized test statistic (z-score or t-score) from your sample data. This single number summarizes how far your sample result is from what H₀ predicts, measured in standard errors.

5. Find the p-value and compare to α. The p-value is the probability of observing a test statistic as extreme as yours (or more extreme) if H₀ were true. If p ≤ α, the evidence is strong enough to reject H₀. If p > α, fail to reject H₀.

6. State the conclusion in context. Write a plain-language conclusion that references the original research question. Never just write "reject H₀"; explain what that means for the specific problem. For example: "There is sufficient evidence at α = 0.05 to conclude that the new drug lowers mean blood pressure below 120 mmHg."

Test Statistic Formulas

The test statistic converts your sample result into a standardized number that can be looked up in a distribution table or converted to a p-value. The two most common tests for means are the z-test (when population standard deviation σ is known) and the t-test (when σ is unknown and estimated from the sample).

Z-Test Statistic — Population SD Known
z = (x̄ − μ₀) / (σ / √n)
Use when: population σ is known. With σ unknown and n ≥ 30, computing z with s in place of σ is a common large-sample approximation.
x̄ = sample mean, μ₀ = hypothesized population mean (from H₀), σ = population standard deviation, n = sample size
One-Sample T-Test Statistic — Population SD Unknown
t = (x̄ − μ₀) / (s / √n)
Use when: population σ is unknown, sample SD s is used as an estimate, df = n − 1
x̄ = sample mean, μ₀ = hypothesized population mean, s = sample standard deviation, n = sample size
One-Sample Proportion Z-Test
z = (p̂ − p₀) / √[p₀(1 − p₀) / n]
Use when: testing a population proportion, np₀ ≥ 10 and n(1 − p₀) ≥ 10
p̂ = sample proportion, p₀ = hypothesized population proportion (from H₀), n = sample size
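
The three formulas above translate directly into code. The sketch below is a minimal illustration rather than a reference implementation: the function names are hypothetical, and the inputs are the figures used in the worked examples later in this guide.

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """One-sample z statistic: (sample mean - mu0) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

def t_statistic(sample_mean, mu0, s, n):
    """One-sample t statistic and its degrees of freedom (n - 1)."""
    return (sample_mean - mu0) / (s / math.sqrt(n)), n - 1

def proportion_z_statistic(p_hat, p0, n):
    """One-sample proportion z statistic; the standard error uses p0, not p_hat."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

z = z_statistic(10.14, 10, 0.5, 50)              # bolt diameters
t, df = t_statistic(76, 72, 8, 16)               # school test scores
zp = proportion_z_statistic(0.92, 0.95, 200)     # customer satisfaction
print(f"z = {z:.2f}, t = {t:.2f} (df = {df}), proportion z = {zp:.2f}")
```

Each function returns only the test statistic; turning it into a p-value requires the matching distribution (standard normal for z, Student's t with n − 1 degrees of freedom for t).
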
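💡 Z-Test vs T-Test: Which Should You Use?

Use a z-test when the population standard deviation (σ) is known. Use a t-test when σ is unknown and has to be estimated by the sample standard deviation s; this is the appropriate choice at any sample size, and it matters most when the sample is small (n < 30). For large samples (roughly n ≥ 30) the t-distribution is very close to the standard normal, and the Central Limit Theorem makes the sampling distribution of x̄ approximately normal regardless of the population's shape, which is why z critical values are often used as an approximation. In practice, σ is almost never known for real-world problems, so the t-test is more commonly used.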

Step-by-Step Worked Examples

Example 1: Z-Test (Two-Tailed)

Worked Example — Z-Test

A quality control engineer samples 50 bolts from a production line. The historical mean bolt diameter is 10 mm with a known population standard deviation of 0.5 mm. The sample mean is 10.14 mm. At α = 0.05, is there evidence that the mean diameter has changed?

1. State hypotheses: H₀: μ = 10 mm  |  H₁: μ ≠ 10 mm. This is a two-tailed test: the engineer wants to detect any change, not just an increase or decrease.

2. Set significance level: α = 0.05 (given). For a two-tailed test, the critical region is split between both tails: α/2 = 0.025 per tail. Critical z-values are ±1.96.

3. Calculate test statistic: z = (x̄ − μ₀) / (σ / √n) = (10.14 − 10) / (0.5 / √50) = 0.14 / 0.0707 = 1.98

4. Find p-value: z = 1.98 corresponds to a one-tail area of 0.0239. For a two-tailed test: p-value = 2 × 0.0239 = 0.0478.

5. Decision: p-value (0.0478) ≤ α (0.05) → Reject H₀. The test statistic (1.98) also falls in the rejection region (|z| > 1.96).

✓ Conclusion: At α = 0.05, there is sufficient statistical evidence to conclude that the mean bolt diameter has changed from 10 mm. The production line may need recalibration.
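
The arithmetic in this example can be checked with a few lines of Python. This is a sketch that uses only the standard library's `statistics.NormalDist`; the function name `two_tailed_z_test` is illustrative. Because it keeps z unrounded (about 1.9799 rather than 1.98), the p-value it reports can differ from the table-based 0.0478 in the fourth decimal place.

```python
import math
from statistics import NormalDist

def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (z, p_value, critical_value, reject) for H0: mu = mu0 vs H1: mu != mu0."""
    std_normal = NormalDist()                       # mean 0, standard deviation 1
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - std_normal.cdf(abs(z)))      # area in both tails
    critical = std_normal.inv_cdf(1 - alpha / 2)    # about 1.96 when alpha = 0.05
    return z, p_value, critical, p_value <= alpha

z, p, crit, reject = two_tailed_z_test(10.14, 10, 0.5, 50)
print(f"z = {z:.2f}, p = {p:.4f}, critical = ±{crit:.2f}, reject H0: {reject}")
```

The p-value rule and the critical-value rule always agree: p ≤ α exactly when |z| is at or beyond the critical value.
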

Example 2: One-Sample T-Test (One-Tailed)

Worked Example — T-Test

A school claims its students score above the national average of 72 on a standardized test. A sample of 16 students has a mean score of 76 with a sample standard deviation of 8. At α = 0.05, does the evidence support the school's claim?

1. State hypotheses: H₀: μ ≤ 72  |  H₁: μ > 72. This is a right-tailed test: the school claims scores are above the national average, a directional claim.

2. Set significance level: α = 0.05. Degrees of freedom: df = n − 1 = 15. For a right-tailed t-test at α = 0.05 with df = 15, the critical value is t* = 1.753.

3. Calculate test statistic: t = (x̄ − μ₀) / (s / √n) = (76 − 72) / (8 / √16) = 4 / 2 = 2.00

4. Find p-value: Using the t-distribution with df = 15, t = 2.00 gives a one-tail p-value ≈ 0.032.

5. Decision: p-value (0.032) ≤ α (0.05) → Reject H₀. The test statistic (2.00) also exceeds the critical value (1.753).

✓ Conclusion: At α = 0.05, there is sufficient evidence to support the school's claim that students score above the national average of 72. The sample mean of 76 is statistically significantly higher.
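
Python's standard library has no t-distribution, so the sketch below approximates the upper-tail area by integrating the t density numerically with Simpson's rule. The helper names (`t_pdf`, `t_upper_tail`) and the step count are assumptions made for illustration; in practice a statistics package would normally supply this value.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) with Simpson's rule on [0, |t|]; steps must be even."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)   # odd nodes 4, even nodes 2
    area_0_to_b = total * h / 3
    upper = 0.5 - area_0_to_b          # the density is symmetric about 0
    return upper if t >= 0 else 1 - upper

# Example 2: n = 16, sample mean 76, s = 8, H0: mu <= 72, H1: mu > 72
n, x_bar, s, mu0, alpha = 16, 76, 8, 72, 0.05
t_stat = (x_bar - mu0) / (s / math.sqrt(n))
p_value = t_upper_tail(t_stat, df=n - 1)
print(f"t = {t_stat:.2f}, df = {n - 1}, p = {p_value:.3f}, reject H0: {p_value <= alpha}")
```

For a left-tailed test the p-value would be `1 - t_upper_tail(t_stat, df)`, and for a two-tailed test `2 * t_upper_tail(abs(t_stat), df)`.
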

Example 3: Proportion Z-Test

Worked Example — Proportion Test

A tech company claims its product has a 95% customer satisfaction rate. In a random survey of 200 customers, 184 report being satisfied (p̂ = 0.92). At α = 0.05, is there evidence the true satisfaction rate is below 95%?

1. State hypotheses: H₀: p ≥ 0.95  |  H₁: p < 0.95. Left-tailed test: the question is whether satisfaction has fallen below the claimed rate.

2. Verify conditions: np₀ = 200 × 0.95 = 190 ≥ 10. n(1 − p₀) = 200 × 0.05 = 10 ≥ 10. ✓ Conditions met. Critical value at α = 0.05 (left-tail): z* = −1.645.

3. Calculate test statistic: z = (0.92 − 0.95) / √[0.95 × 0.05 / 200] = −0.03 / √(0.0002375) = −0.03 / 0.01541 ≈ −1.95

4. Find p-value: z = −1.95 gives a left-tail area ≈ 0.026.

5. Decision: p-value (0.026) ≤ α (0.05) → Reject H₀.

✓ Conclusion: At α = 0.05, there is sufficient evidence to conclude the true satisfaction rate is below the claimed 95%. The company should investigate the drop in customer satisfaction.
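
The same steps, including the conditions check from step 2, can be sketched in Python. The function name and the small floating-point tolerance are assumptions added for illustration; the example sits exactly on the n(1 − p₀) = 10 boundary, which is why the tolerance is there.

```python
import math
from statistics import NormalDist

def left_tailed_proportion_test(successes, n, p0, alpha=0.05):
    """One-sample proportion z-test for H1: p < p0, with a conditions check."""
    # Large-sample conditions: n*p0 >= 10 and n*(1 - p0) >= 10.
    # A tiny tolerance guards against floating-point noise at the boundary.
    tol = 1e-9
    if n * p0 < 10 - tol or n * (1 - p0) < 10 - tol:
        raise ValueError("normal approximation conditions not met")
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = NormalDist().cdf(z)      # left-tail area
    return p_hat, z, p_value, p_value <= alpha

p_hat, z, p_value, reject = left_tailed_proportion_test(184, 200, 0.95)
print(f"p_hat = {p_hat:.2f}, z = {z:.2f}, p = {p_value:.3f}, reject H0: {reject}")
```

With a smaller sample, say 100 customers, n(1 − p₀) would be 5 and the function would raise an error instead of returning an unreliable p-value.
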

What the P-Value Actually Means

The p-value is one of the most misunderstood concepts in statistics. Here is the precise definition:

Definition — P-Value
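The p-value is the probability of observing a test statistic at least as extreme as the one calculated from the sample data, assuming the null hypothesis is true. A small p-value means the observed result would be unlikely if H₀ were true, which gives reason to doubt H₀.
Right-tailed: p-value = P(T ≥ t_obs | H₀ is true); left-tailed: p-value = P(T ≤ t_obs | H₀ is true); two-tailed: p-value = P(|T| ≥ |t_obs| | H₀ is true), where T is the test statistic and t_obs is its observed value.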
Three Things the P-Value Is NOT

1. The p-value is NOT the probability that H₀ is true.
2. The p-value is NOT the probability that you made an error.
3. A small p-value does NOT mean the effect is large or practically important: a trivially small difference can produce a tiny p-value with a large sample. Always report effect size alongside p-values.
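
The third point is easy to see numerically. The sketch below holds a small difference fixed and only changes the sample size; all numbers are hypothetical (a 0.5-point difference on a scale with σ = 15, treated as known), and the function name is illustrative.

```python
import math
from statistics import NormalDist

def two_tailed_p(sample_mean, mu0, sigma, n):
    """Two-tailed p-value for a one-sample z-test."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical numbers: a 0.5-point difference with sigma = 15 is a
# standardized effect of about 0.03, which is negligible in practice.
mu0, sample_mean, sigma = 100, 100.5, 15
for n in (100, 10_000, 1_000_000):
    p = two_tailed_p(sample_mean, mu0, sigma, n)
    print(f"n = {n:>9,}: p = {p:.4g}")
```

The effect size never changes, yet the p-value moves from clearly non-significant at n = 100 to far below 0.01 at n = 10,000. The p-value measures evidence against H₀, not the size of the effect.
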

  • p < 0.01: Very strong evidence against H₀
  • p < 0.05: Standard threshold for significance
  • p < 0.10: Marginal evidence (exploratory work)
  • p > 0.10: Weak or no evidence against H₀

One-Tailed vs Two-Tailed Tests

The tail direction is determined entirely by the alternative hypothesis, and it must be chosen before data collection. Switching from a two-tailed to a one-tailed test after seeing your results in order to reach significance is a form of p-hacking and invalidates the test.

Feature | Two-Tailed Test | One-Tailed Test
Alternative hypothesis | H₁: μ ≠ μ₀ | H₁: μ > μ₀ or H₁: μ < μ₀
Where rejection region lies | Split between both tails (α/2 each) | Entirely in one tail (all of α)
Critical z at α = 0.05 | ±1.96 | +1.645 (right) or −1.645 (left)
When to use | When you want to detect change in either direction | When theory predicts a specific direction before data collection
More conservative? | Yes: harder to reject H₀ | No: easier to reject in the predicted direction
Common example | "Has the mean changed from 50?" | "Is the mean greater than 50?"

Rejection Regions: Two-Tailed vs One-Tailed (α = 0.05)

[Figure: two standard normal curves. Two-tailed test (H₁: μ ≠ μ₀): rejection regions of α/2 = 0.025 in each tail, beyond z = −1.96 and z = +1.96. Right-tailed test (H₁: μ > μ₀): a single rejection region of α = 0.05 beyond z = +1.645. Same α = 0.05, different critical values.]
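
The critical values quoted in this section come from the inverse of the standard normal CDF. The following sketch reproduces them with the standard library; the function names and the string labels for the tails are illustrative choices, and the z = 1.80 input is a hypothetical value.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def critical_z(alpha, tail):
    """Critical z for a 'two', 'right', or 'left' tailed test."""
    if tail == "two":
        return STD_NORMAL.inv_cdf(1 - alpha / 2)   # reject when |z| exceeds this
    if tail == "right":
        return STD_NORMAL.inv_cdf(1 - alpha)       # reject when z exceeds this
    if tail == "left":
        return STD_NORMAL.inv_cdf(alpha)           # reject when z is below this
    raise ValueError("tail must be 'two', 'right', or 'left'")

def p_value(z, tail):
    """p-value for an observed z under the chosen alternative."""
    if tail == "two":
        return 2 * (1 - STD_NORMAL.cdf(abs(z)))
    if tail == "right":
        return 1 - STD_NORMAL.cdf(z)
    if tail == "left":
        return STD_NORMAL.cdf(z)
    raise ValueError("tail must be 'two', 'right', or 'left'")

for tail in ("two", "right", "left"):
    print(tail, round(critical_z(0.05, tail), 3))

# A hypothetical z = 1.80 clears the right-tailed cutoff but not the two-tailed one.
print(round(p_value(1.80, "right"), 3), round(p_value(1.80, "two"), 3))
```

The last line shows why the tail must be fixed in advance: the same data can be significant or not at α = 0.05 depending only on which alternative was declared.
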

Type I and Type II Errors

Every hypothesis test carries two types of possible mistakes. Understanding them prevents misinterpreting results and helps researchers design studies with enough statistical power to detect real effects.

Decision \ Reality | H₀ is Actually TRUE | H₀ is Actually FALSE
Reject H₀ | ❌ Type I Error (False Positive), probability = α | ✓ Correct Decision, probability = 1 − β (Power)
Fail to Reject H₀ | ✓ Correct Decision, probability = 1 − α | ⚠️ Type II Error (False Negative), probability = β

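The top-left cell of the matrix can be checked by simulation: when H₀ really is true, a test run at α = 0.05 should reject about 5% of the time. The sketch below is a hypothetical setup (μ₀ = 50, σ = 10, n = 30, fixed random seed) that draws each sample mean directly from its sampling distribution under H₀.

```python
import math
import random
from statistics import NormalDist

def simulated_type_i_rate(alpha=0.05, n=30, trials=20_000, seed=1):
    """Fraction of two-tailed z-tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    mu0, sigma = 50.0, 10.0                         # hypothetical population under H0
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample_mean = rng.gauss(mu0, se)            # sampling distribution of the mean
        z = (sample_mean - mu0) / se
        if abs(z) > crit:
            rejections += 1                         # a false positive
    return rejections / trials

rate = simulated_type_i_rate()
print(f"simulated Type I error rate: {rate:.3f}")
```

The simulated rate should land close to α, with some run-to-run noise; lowering `alpha` to 0.01 lowers it accordingly.
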
Type I Error

False Positive

P(Type I) = α

Rejecting H₀ when it is actually true. Controlled directly by your choice of α. A drug trial commits a Type I error when it concludes a useless drug is effective. Lowering α reduces Type I error risk but increases Type II error risk.

Type II Error

False Negative

P(Type II) = β

Failing to reject H₀ when it is actually false. Reduced by increasing sample size, increasing effect size, or raising α. A drug trial commits a Type II error when it misses a genuinely effective drug.

Statistical Power

Probability of Correct Rejection

Power = 1 − β

The probability of correctly rejecting a false H₀. Researchers typically aim for power ≥ 0.80, meaning an 80% chance of detecting a true effect. Power increases with larger sample size and larger effect size.
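
For a right-tailed z-test with known σ, power has a closed form: 1 − Φ(z₁₋α − (μ_true − μ₀) / (σ/√n)). The sketch below evaluates it for a hypothetical scenario (H₀: μ ≤ 72, true mean 76, σ = 8 treated as known); the function name and these numbers are assumptions for illustration, not results from a study.

```python
import math
from statistics import NormalDist

def power_right_tailed_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of a right-tailed one-sample z-test when the true mean is mu_true."""
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha)                    # rejection cutoff under H0
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))   # true effect in standard errors
    return 1 - std.cdf(z_crit - shift)                 # P(reject H0 | mu = mu_true)

# Hypothetical scenario: H0: mu <= 72, true mean 76, sigma = 8 treated as known.
for n in (16, 25, 36):
    print(f"n = {n}: power = {power_right_tailed_z(72, 76, 8, n):.2f}")
print(f"n = 16, alpha = 0.10: power = {power_right_tailed_z(72, 76, 8, 16, alpha=0.10):.2f}")
```

In this scenario power rises with n and reaches roughly 0.80 at n = 25. Raising α to 0.10 also raises power, at the cost of a higher Type I error rate, which is the trade-off described in the cards above.
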

Academic Source

Type I and Type II Errors in Medical Research

The distinction between Type I and Type II errors was formalized by Jerzy Neyman and Egon Pearson in their foundational 1933 paper on hypothesis testing. In clinical trial design, a Type I error rate of α = 0.05 combined with a target power of 1 − β = 0.80 is the standard that determines minimum required sample sizes. The U.S. Food and Drug Administration's guidance on clinical trials requires explicit pre-specification of α and power for drug approval studies. See Banerjee et al. (2009) in the Industrial Psychiatry Journal for a clear clinical overview.

Real-World Case Studies

Case Study 1: A/B Testing in Digital Marketing

Real-World Application

Conversion Rate Optimization

An e-commerce company runs two versions of a product page simultaneously. Version A (control) shows a historical conversion rate of 4.2%. Version B (new design) converts 4.8% of visitors out of a sample of 2,500 per group. The team wants to know if the new design genuinely outperforms the control or if the difference is within normal random variation.

Hypotheses: H₀: p_B = p_A (no difference)  |  H₁: p_B > p_A (Version B converts better)

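Result: A pooled two-proportion z-test on these figures (105 of 2,500 conversions for Version A, 120 of 2,500 for Version B) gives z ≈ 1.02 and a one-tailed p-value ≈ 0.15. At α = 0.05, fail to reject H₀. The 0.6 percentage-point lift is not statistically significant at this sample size and is consistent with normal random variation. If the same rates persisted, roughly 6,500 visitors per group would be needed to reach significance, so the team should extend the test before deciding whether to roll out Version B.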

Case Study 2: Clinical Drug Trial

Real-World Application

Blood Pressure Medication Efficacy

A clinical trial tests whether a new antihypertensive drug reduces mean systolic blood pressure below the placebo group's mean of 145 mmHg. After 12 weeks, 120 patients on the drug show a mean of 138 mmHg with s = 15 mmHg.

Hypotheses: H₀: μ ≥ 145 mmHg  |  H₁: μ < 145 mmHg (one-tailed, left)

Result: t = (138 − 145) / (15 / √120) = −7 / 1.369 ≈ −5.11, p < 0.0001. Extremely strong evidence to reject H₀. The drug reduces blood pressure. The study proceeds to Phase III trials — but the research team also reports effect size (Cohen's d ≈ 0.47, a medium effect) to show clinical meaningfulness, not just statistical significance.
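
The t statistic and the effect size in this case study are two views of the same difference: t divides it by the standard error, Cohen's d divides it by the standard deviation. A minimal sketch with a hypothetical function name:

```python
import math

def one_sample_t_and_d(sample_mean, mu0, s, n):
    """Return the one-sample t statistic and Cohen's d for the same data."""
    diff = sample_mean - mu0
    t = diff / (s / math.sqrt(n))   # grows with sample size
    d = diff / s                    # does not depend on sample size
    return t, d

t, d = one_sample_t_and_d(138, 145, 15, 120)
print(f"t = {t:.2f}, Cohen's d = {d:.2f}")
```

Since t = d × √n, a large sample can push t far into the rejection region even when d is only moderate, which is why the trial reports both.
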

Case Study 3: Manufacturing Quality Control

Real-World Application

Production Line Defect Rate

A manufacturer's specification requires a defect rate no higher than 2%. Quality control inspects 500 units and finds 15 defective (p̂ = 0.03, or 3%). Is this evidence the process is out of control?

Hypotheses: H₀: p ≤ 0.02  |  H₁: p > 0.02 (one-tailed, right)

Result: z = (0.03 − 0.02) / √(0.02 × 0.98 / 500) = 0.01 / 0.00626 ≈ 1.60. p-value ≈ 0.055. At α = 0.05, fail to reject H₀. At α = 0.10, reject H₀. This borderline result prompts the quality team to increase sampling to 1,000 units — a correct application of the "insufficient power" reasoning for inconclusive results.
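
To see why a larger sample helps with a borderline result, the sketch below reruns the test at n = 500 and at n = 1,000. The second run is purely hypothetical: it assumes the observed 3% defect rate would persist in the larger sample, which the real data may not bear out.

```python
import math
from statistics import NormalDist

def right_tailed_proportion_test(p_hat, p0, n):
    """Return (z, p_value) for H0: p <= p0 vs H1: p > p0."""
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    return z, 1 - NormalDist().cdf(z)

for n in (500, 1000):
    z, p = right_tailed_proportion_test(0.03, 0.02, n)
    print(f"n = {n}: z = {z:.2f}, p = {p:.3f}")
```

The standard error shrinks with √n, so the same one-point gap that is inconclusive at n = 500 would be significant at α = 0.05 with n = 1,000.
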

Interactive Hypothesis Testing Calculator

Enter your sample data below to compute the test statistic, p-value, and a plain-language decision. The calculator handles one-sample z-tests and one-sample t-tests for means.


Null vs Alternative Hypothesis: Full Comparison

Concept | Null Hypothesis (H₀) | Alternative Hypothesis (H₁)
Symbol | H₀ | H₁ or Hₐ
What it claims | No effect, no difference, no relationship | An effect, difference, or relationship exists
Mathematical sign | Always includes equality (=, ≤, or ≥) | Always uses ≠, >, or <
Role in the test | Default assumption being tested (skeptical position) | Research claim trying to gain support
What data does | Either rejects or fails to reject H₀ | Is supported when H₀ is rejected
Can you prove it? | No: only reject or fail to reject | Supported (not proven) when H₀ is rejected
Error if wrong decision | Falsely rejecting = Type I error (α) | Failing to support when true = Type II error (β)
Example (mean) | H₀: μ = 100 | H₁: μ ≠ 100 (two-tailed)
Example (proportion) | H₀: p = 0.50 | H₁: p > 0.50 (right-tailed)

Key Terms and Formulas Glossary

Term | Formula / Notation | Definition
Null Hypothesis | H₀: μ = μ₀ | Default assumption of no effect or difference. Contains an equality sign. Never proven, only rejected or retained.
Alternative Hypothesis | H₁: μ ≠ μ₀ | The research claim being tested. Contains an inequality. Supported when H₀ is rejected.
Significance Level | α (commonly 0.05) | The threshold probability for rejecting H₀. Represents the maximum acceptable Type I error rate.
P-Value | P(result at least as extreme, given H₀ true) | Probability of observing results at least as extreme as the sample data if H₀ were true. Reject H₀ when p ≤ α.
Z-Test Statistic | z = (x̄ − μ₀) / (σ/√n) | Standardized test statistic used when population σ is known (or, as a large-sample approximation, when n ≥ 30). Compared to z-distribution critical values.
T-Test Statistic | t = (x̄ − μ₀) / (s/√n) | Test statistic used when population σ is unknown. Uses sample SD s. Degrees of freedom = n − 1.
Critical Value | z* or t* | The boundary between the rejection region and non-rejection region. For a two-tailed test, reject H₀ if the absolute value of the test statistic exceeds the critical value.
Type I Error | P = α | Rejecting a true H₀ (false positive). Probability equals α. In medicine: concluding a useless drug works.
Type II Error | P = β | Failing to reject a false H₀ (false negative). Reduced by increasing sample size. In medicine: missing a real drug effect.
Statistical Power | 1 − β | Probability of correctly rejecting a false H₀. Target ≥ 0.80. Increases with larger n and larger true effect size.
Standard Error | SE = σ/√n or s/√n | Standard deviation of the sampling distribution of x̄. Measures precision of the sample mean estimate.
Confidence Interval | x̄ ± z* × SE | Range of plausible values for the population parameter. A 95% CI corresponds to a two-tailed α = 0.05 test.
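
The link between confidence intervals and two-tailed tests in the last glossary entry can be checked on the bolt data from Example 1 (x̄ = 10.14, σ = 0.5, n = 50). This is a sketch with an illustrative function name, using the known-σ z interval.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided z confidence interval for a mean with known sigma."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

mu0 = 10
low, high = z_confidence_interval(10.14, 0.5, 50)
print(f"95% CI: ({low:.3f}, {high:.3f}); contains mu0 = {mu0}: {low <= mu0 <= high}")
```

The hypothesized value of 10 mm falls just outside the interval, which matches the decision to reject H₀ at α = 0.05 in Example 1.
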

Practice Problems

Work through these problems before checking the answers. Each one uses the 6-step framework described earlier in this guide.

Beginner Level

1. A coffee shop claims its medium drinks contain exactly 16 oz. A customer measures 10 drinks and finds a mean of 15.6 oz with s = 0.8 oz. At α = 0.05, is there evidence the mean differs from 16 oz?
Answer:

H₀: μ = 16 oz  |  H₁: μ ≠ 16 oz (two-tailed)
t = (15.6 − 16) / (0.8 / √10) = −0.4 / 0.253 = −1.58
df = 9. Critical value at α = 0.05, two-tailed: t* = ±2.262
|−1.58| < 2.262 → Fail to reject H₀.
Conclusion: At α = 0.05, there is insufficient evidence to conclude the mean fill differs from 16 oz. The sample of 10 is small and the result is inconclusive — a larger sample would be needed.

2. A coin is flipped 100 times and lands heads 58 times. At α = 0.05, is there evidence the coin is biased (not fair)?
Answer:

H₀: p = 0.50  |  H₁: p ≠ 0.50 (two-tailed)
p̂ = 58/100 = 0.58
z = (0.58 − 0.50) / √(0.50 × 0.50 / 100) = 0.08 / 0.05 = 1.60
Critical value: z* = ±1.96.  |1.60| < 1.96 → Fail to reject H₀.
p-value ≈ 0.110 > 0.05.
Conclusion: At α = 0.05, the 100-flip sample does not provide sufficient evidence to conclude the coin is biased. 58 heads in 100 flips is within the expected range of variation for a fair coin.

Intermediate Level

3. A factory claims its batteries last on average at least 500 hours. Quality control tests 36 batteries and finds x̄ = 492 hours with σ = 24 hours. At α = 0.01, does the data support the factory's claim?
Answer:

H₀: μ ≥ 500  |  H₁: μ < 500 (left-tailed; we test whether evidence contradicts the claim)
z = (492 − 500) / (24 / √36) = −8 / 4 = −2.00
Critical value at α = 0.01 (left-tail): z* = −2.326
−2.00 > −2.326 → Fail to reject H₀ at α = 0.01.
p-value = 0.023. At α = 0.05 we would reject H₀; at α = 0.01 we do not.
Conclusion: At the stricter α = 0.01 significance level, there is insufficient evidence to reject the factory's claim. At α = 0.05, the evidence would be significant — illustrating how the choice of α changes the conclusion.

4. A sample of 25 students from a new teaching method scores x̄ = 82 with s = 10. The national mean is μ₀ = 78. At α = 0.05, is there evidence the new method improves scores?
Answer:

H₀: μ ≤ 78  |  H₁: μ > 78 (right-tailed)
t = (82 − 78) / (10 / √25) = 4 / 2 = 2.00
df = 24. Critical value at α = 0.05 (right-tail): t* = 1.711
2.00 > 1.711 → Reject H₀. p-value ≈ 0.028.
Conclusion: At α = 0.05, there is sufficient evidence to conclude the new teaching method produces higher scores than the national mean of 78. The difference of 4 points is statistically significant with this sample.

Advanced Level

5. Researchers test whether a new website layout increases click-through rates from the historical 8%. In an A/B test, 1,200 visitors see the new layout and 120 click through (p̂ = 0.10). At α = 0.05, does the new layout perform better? Also identify: what type of error would occur if we reject H₀ when the true rate is actually still 8%?
Answer:

H₀: p ≤ 0.08  |  H₁: p > 0.08 (right-tailed)
Verify: n × p₀ = 1200 × 0.08 = 96 ≥ 10 ✓ and n × (1 − p₀) = 1200 × 0.92 = 1104 ≥ 10 ✓
z = (0.10 − 0.08) / √(0.08 × 0.92 / 1200) = 0.02 / √(0.0000613) = 0.02 / 0.00783 ≈ 2.55
Critical value: z* = 1.645.   2.55 > 1.645 → Reject H₀. p-value ≈ 0.0054.
Conclusion: At α = 0.05, the new layout significantly increases click-through rate above 8%.
Error type: Rejecting H₀ when the true rate is still 8% would be a Type I error (false positive) — concluding the new design works when it actually does not. The probability of this error is α = 0.05.

Hypothesis testing connects deeply to several other areas of statistics fundamentals. The guides below build directly on what you have learned here.

Next Step

One-Sample T-Test

Deep dive into the t-test for a single mean: assumptions, degrees of freedom, and interpreting output from statistical software. The natural follow-on from this guide.

Next Step

Two-Sample T-Test

Compare means from two independent groups. Used when you have two separate samples and want to know if they came from populations with the same mean.

Related

Confidence Intervals

The interval estimation counterpart to hypothesis testing. A 95% confidence interval corresponds exactly to a two-tailed α = 0.05 test — if the null value falls outside the CI, you reject H₀.

Related

Z-Score and the Normal Distribution

Z-scores underpin z-test critical values and p-value lookups. Understanding the standard normal distribution makes everything in this guide click more clearly.
