Bayes' Theorem Calculator: Posterior Probability Step-by-Step

Q: How do you calculate Bayes' theorem?

To calculate Bayes' theorem: multiply the likelihood P(B|A) by the prior P(A), then divide by the total probability of the evidence P(B). When P(B) is not given directly, compute it using the law of total probability: P(B) = P(B|A)×P(A) + P(B|¬A)×P(¬A).

Key Formulas

Bayes' Theorem (Standard) P(A|B) = P(B|A)P(A) / P(B)

Law of Total Probability P(B) = P(B|A)P(A) + P(B|¬A)P(¬A)

Bayes Factor BF = P(B|A) / P(B|¬A)

Odds Form Ω(A|B) = Ω(A) × BF

Reading the Inputs

P(A): the prior — base rate or prevalence before evidence.
P(B|A): likelihood — how often the evidence shows up when A is true (e.g. test sensitivity).
P(B|¬A): the false-positive rate — how often the evidence shows up when A is false.

What Is Bayes' Theorem?

Bayes' theorem is a formula that calculates the probability of a hypothesis after accounting for new evidence. It connects four quantities: the prior probability P(A), the likelihood of the evidence P(B|A), the total probability of the evidence P(B), and the result, the posterior probability P(A|B). The formula is P(A|B) = P(B|A) × P(A) / P(B).

The theorem is named after Thomas Bayes, an 18th-century English statistician and minister whose work on inverse probability was published posthumously in 1763. Pierre-Simon Laplace independently developed and generalized the same result a few decades later, and Laplace's version is closer to the form used today. The Stanford Encyclopedia of Philosophy traces the theorem's role in epistemology, where it underlies formal models of belief revision.

In practice, the theorem answers a specific question: given that a condition is rare, how much should a positive test result raise your confidence that it is actually present? A diagnostic test that is 99% accurate sounds conclusive, but if the condition it detects affects only 1 in 1,000 people, a positive result is still wrong more often than most people expect. The calculator above runs this exact arithmetic.

Bayes' Theorem Formula Library

There are four standard forms of Bayes' theorem: the basic two-event form, the partitioned form for more than two hypotheses, the odds form using the Bayes factor, and the diagnostic-testing form built from sensitivity and specificity. Each form answers the same question with notation suited to a different field.

Standard Form

P(A|B) = P(B|A) × P(A) / P(B)

Where:
P(A) = prior probability
P(B|A) = likelihood
P(B) = total evidence probability
P(A|B) = posterior probability

Partitioned Form (Law of Total Probability)

P(A_i|B) = P(B|A_i)P(A_i) / Σ P(B|A_j)P(A_j)

Used when there are more than two competing hypotheses A_1...A_n
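
The partitioned form can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `posterior_over_hypotheses` and the three-supplier numbers are hypothetical and do not come from this guide.

```python
def posterior_over_hypotheses(priors, likelihoods):
    """Return P(A_i | B) for every hypothesis A_i in a partition.

    priors:      list of P(A_i); must sum to 1
    likelihoods: list of P(B | A_i), in the same order
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1")
    # Joint probability P(B and A_i) for each hypothesis
    joints = [p * l for p, l in zip(priors, likelihoods)]
    # Denominator: the sum over every hypothesis (law of total probability)
    evidence = sum(joints)
    if evidence == 0:
        raise ValueError("P(B) is zero; the posterior is undefined")
    return [j / evidence for j in joints]


# Hypothetical example: three suppliers with invented defect rates
priors = [0.5, 0.3, 0.2]            # share of parts from each supplier
likelihoods = [0.01, 0.02, 0.05]    # P(defect | supplier)
posteriors = posterior_over_hypotheses(priors, likelihoods)
for i, p in enumerate(posteriors, start=1):
    print(f"P(supplier {i} | defect) = {p:.4f}")
```

With two hypotheses (A and ¬A) this reduces to the standard form above.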

Odds Form & Bayes Factor

Ω(A|B) = Ω(A) × BF

Where:
Ω(A) = prior odds = P(A)/P(¬A)
BF = P(B|A) / P(B|¬A)
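
As a rough sketch, the odds form can be checked against the standard form in Python. The helper names (`prob_to_odds`, `odds_to_prob`, `posterior_via_odds`) are illustrative choices, and the inputs are the Example 1 screening values used later in this guide.

```python
def prob_to_odds(p):
    """Convert a probability to odds, Ω = p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("probability must be in [0, 1) to have finite odds")
    return p / (1 - p)


def odds_to_prob(odds):
    """Convert odds back to a probability, p = Ω / (1 + Ω)."""
    return odds / (1 + odds)


def posterior_via_odds(prior, likelihood, complement_likelihood):
    """Posterior P(A|B) computed as prior odds times the Bayes factor."""
    if complement_likelihood == 0:
        raise ValueError("P(B|not A) is zero; the Bayes factor is unbounded")
    bayes_factor = likelihood / complement_likelihood
    posterior_odds = prob_to_odds(prior) * bayes_factor
    return odds_to_prob(posterior_odds)


# Example 1 inputs: prior 1%, sensitivity 99%, false-positive rate 5%
p = posterior_via_odds(prior=0.01, likelihood=0.99, complement_likelihood=0.05)
print(f"Posterior via odds form: {p:.4f}")
```

Here the Bayes factor is 0.99 / 0.05 = 19.8, and the result agrees with the standard form because both are rearrangements of the same identity.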

Diagnostic Testing Form

PPV = (Sens × Prev) / [(Sens × Prev) + (1−Spec)(1−Prev)]

Sens = sensitivity, Spec = specificity
Prev = prevalence (the prior)

All four forms reduce to the same arithmetic. The diagnostic form simply renames the inputs: sensitivity is P(B|A), one minus specificity is P(B|¬A), and prevalence is P(A). Khan Academy's conditional probability unit works through this substitution in detail for students moving from the general formula to applied problems.
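
The renaming described above can be sketched as a small Python function. The name `positive_predictive_value` is an illustrative choice, and the sample inputs are the outbreak figures from Example 2 in this guide.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV = P(condition | positive test), i.e. the posterior P(A|B)."""
    true_positive = sensitivity * prevalence                # P(B|A) * P(A)
    false_positive = (1 - specificity) * (1 - prevalence)   # P(B|not A) * P(not A)
    total_positive = true_positive + false_positive         # P(B)
    if total_positive == 0:
        raise ValueError("no positive results are possible with these inputs")
    return true_positive / total_positive


# Example 2 inputs: 95% sensitivity, 90% specificity, 10% prevalence
ppv = positive_predictive_value(sensitivity=0.95, specificity=0.90, prevalence=0.10)
print(f"PPV: {ppv:.4f}")
```

The only difference from the standard form is that the caller supplies specificity and the function derives the false-positive rate as one minus that value.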

Prior, Likelihood, Evidence, and Posterior — What Each Term Means

Bayes' theorem has four components: the prior is what you believed before the evidence, the likelihood is how probable the evidence is under that belief, the evidence probability is a normalizing constant, and the posterior is the updated belief. The table below maps each symbol to its plain-English meaning.

Table: The Four Components of Bayes' Theorem

Component	Symbol	Plain meaning	Typical source
Prior	P(A)	Probability of the hypothesis before new evidence	Base rate, prevalence, history
Likelihood	P(B|A)	Probability of the evidence if the hypothesis is true	Test sensitivity, model fit
Complement likelihood	P(B|¬A)	Probability of the evidence if the hypothesis is false	1 − specificity, false-positive rate
Evidence (marginal)	P(B)	Overall probability of observing the evidence at all	Computed via law of total probability
Posterior	P(A|B)	Updated probability of the hypothesis after the evidence	Calculator output

P(B) rarely arrives as a single given number. It is built from two paths to the same evidence: the path where A is true and the evidence appears, plus the path where A is false and the evidence still appears. That is the law of total probability, P(B) = P(B|A)P(A) + P(B|¬A)P(¬A), and it is the step most worked examples skip past too quickly.

How to Use the Bayes' Theorem Calculator

Enter the prior probability, the likelihood, and the complement likelihood as percentages, and the calculator returns the posterior probability instantly along with the full formula substitution. Here is what each step does and why it matters.

Enter the prior probability P(A)

This is your starting belief before any evidence, usually a base rate. For a disease screening, it is prevalence in the relevant population — for example, 1% for a rare condition. Get this number wrong and the entire calculation shifts, which is why the base-rate fallacy is the single most common Bayesian reasoning error.

Enter the likelihood P(B|A)

This is the probability of seeing the evidence when the hypothesis is true. For a medical test, this is sensitivity — the true positive rate. For a spam filter, it is how often a given word appears in spam email.

Enter the complement likelihood P(B|¬A)

This is the probability of seeing the same evidence when the hypothesis is false, that is, the false-positive rate. It is the input most often left out of casual Bayesian reasoning, and without it there is no way to weigh false positives against true positives, which is the comparison at the heart of the base-rate fallacy.

Read the posterior probability and the tree diagram

The calculator computes P(B) using the law of total probability, then divides the joint probability P(A∩B) by P(B) to get the posterior. The probability tree diagram shows both paths to the evidence side by side, so you can see why a rare condition keeps the posterior lower than intuition suggests.

Tip: Switch to the Odds & Bayes Factor tab to see the same calculation expressed as a likelihood ratio — useful when comparing the strength of two different pieces of evidence rather than computing a single posterior.

📊 Worked Examples: Eight Applications of Bayes' Theorem

Each example follows the same structure: the scenario, the three inputs, the arithmetic, and a plain-English read of the result. You can re-create any of them in the calculator above.

Example 1 — Medical Disease Screening and the Base-Rate Fallacy

Scenario: A screening test for a disease that affects 1% of the population is 99% accurate at detecting the disease when present (sensitivity) and produces a false positive 5% of the time. A patient tests positive. What is the probability they actually have the disease?

P(A) = 0.01, P(B|A) = 0.99, P(B|¬A) = 0.05

Joint probability (true positive path)

P(A∩B) = 0.99 × 0.01 = 0.0099

Joint probability (false positive path)

P(¬A∩B) = 0.05 × 0.99 = 0.0495

Total evidence probability

P(B) = 0.0099 + 0.0495 = 0.0594

Posterior probability

P(A|B) = 0.0099 / 0.0594 = 0.1667 (16.7%)

Interpretation: Despite the test's 99% sensitivity, the true probability of disease after one positive result is only 16.7%, because the disease is rare and false positives from the healthy 99% of the population outnumber true positives from the sick 1%. This is the textbook demonstration of the base-rate fallacy.
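
One way to make the result concrete is to restate it as head counts. The sketch below does this for an assumed population of 100,000 people; the population size and the function name `natural_frequencies` are illustrative choices, not part of the example.

```python
def natural_frequencies(population, prior, likelihood, complement_likelihood):
    """Express a Bayesian update as expected head counts in a population."""
    with_condition = population * prior
    without_condition = population - with_condition
    true_positives = with_condition * likelihood
    false_positives = without_condition * complement_likelihood
    all_positives = true_positives + false_positives
    return {
        "with_condition": with_condition,
        "true_positives": true_positives,
        "false_positives": false_positives,
        # Same quantity as the posterior P(A|B)
        "share_of_positives_with_condition": true_positives / all_positives,
    }


# Example 1 inputs applied to an assumed population of 100,000
counts = natural_frequencies(100_000, prior=0.01, likelihood=0.99,
                             complement_likelihood=0.05)
print(f"People with the disease:  {counts['with_condition']:.0f}")
print(f"True positives:           {counts['true_positives']:.0f}")
print(f"False positives:          {counts['false_positives']:.0f}")
print(f"Share of positives sick:  {counts['share_of_positives_with_condition']:.4f}")
```

Of 1,000 people with the disease, 990 test positive, while 4,950 of the 99,000 healthy people also test positive, so only 990 of 5,940 positive results are true.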

Example 2 — Rapid Diagnostic Testing During a Disease Outbreak

Scenario: During an outbreak, prevalence in a tested population rises to 10%. A rapid test has 95% sensitivity and 90% specificity (so a 10% false-positive rate). A person tests positive.

P(A∩B) = 0.95 × 0.10 = 0.095. P(¬A∩B) = 0.10 × 0.90 = 0.090. P(B) = 0.095 + 0.090 = 0.185. P(A|B) = 0.095 / 0.185 = 0.5135 (51.4%).

Interpretation: The same test now yields a 51.4% posterior, whereas at 1% prevalence it would yield only about 8.8%. The test did not change; the prior did. This is why testing strategy and result interpretation shift as an outbreak progresses and prevalence climbs.
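
The dependence on the prior can be sketched by holding the test fixed and sweeping prevalence. The list of prevalence values below is an arbitrary illustration; only the 10% case and the test characteristics come from this example.

```python
def posterior(prior, likelihood, complement_likelihood):
    """Standard-form Bayes update: P(A|B)."""
    evidence = likelihood * prior + complement_likelihood * (1 - prior)
    if evidence == 0:
        raise ValueError("P(B) is zero; the posterior is undefined")
    return likelihood * prior / evidence


SENSITIVITY = 0.95          # P(B|A) from Example 2
FALSE_POSITIVE_RATE = 0.10  # P(B|not A) from Example 2

# Illustrative prevalence values; only 0.10 appears in the example itself
for prevalence in (0.001, 0.01, 0.05, 0.10, 0.30):
    p = posterior(prevalence, SENSITIVITY, FALSE_POSITIVE_RATE)
    print(f"prevalence {prevalence:6.1%} -> posterior {p:6.1%}")
```

The posterior rises steadily with prevalence even though the sensitivity and false-positive rate never change.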

Example 3 — Mammography Screening Accuracy

Scenario: Breast cancer prevalence in a screening population is roughly 0.8%. Mammography sensitivity is approximately 90%, and the false-positive rate is around 7%. A patient receives a positive mammogram.

P(A∩B) = 0.90 × 0.008 = 0.0072. P(¬A∩B) = 0.07 × 0.992 = 0.06944. P(B) = 0.0072 + 0.06944 = 0.07664. P(A|B) = 0.0072 / 0.07664 = 0.0939 (9.4%).

Interpretation: A positive mammogram in an average-risk population corresponds to roughly a 9% chance of cancer, not 90%. This specific calculation, with similar input figures, has been used in published research on physician statistical literacy, including work by Gerd Gigerenzer documented in PubMed Central, where many practicing clinicians substantially overestimated the figure.

Example 4 — Spam Email Filtering With a Single Keyword

Scenario: 20% of incoming email is spam. The word "free" appears in 60% of spam emails and in 5% of legitimate emails. An incoming message contains the word "free."

P(A∩B) = 0.60 × 0.20 = 0.12. P(¬A∩B) = 0.05 × 0.80 = 0.04. P(B) = 0.12 + 0.04 = 0.16. P(A|B) = 0.12 / 0.16 = 0.75 (75%).

Interpretation: One keyword raises the spam probability from a 20% prior to a 75% posterior. Production spam filters repeat this update across every word in a message and multiply the resulting likelihood ratios together, which is the core mechanism behind the Naive Bayes classifier.
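
A rough sketch of that multi-word update follows, working in log-odds so that many small ratios do not underflow. The figures for "free" are from this example; the second word, "winner", and its probabilities are invented for illustration. The sketch also ignores absent words and smoothing, both of which a production filter would typically handle.

```python
import math


def spam_posterior(prior, word_likelihoods):
    """Combine per-word likelihood ratios under the naive independence assumption.

    prior:            P(spam) before reading the message
    word_likelihoods: list of (P(word | spam), P(word | legitimate)) pairs,
                      one per word found in the message; all values must be > 0
    """
    log_odds = math.log(prior / (1 - prior))
    for p_word_spam, p_word_legit in word_likelihoods:
        # Adding log ratios is the same as multiplying likelihood ratios
        log_odds += math.log(p_word_spam / p_word_legit)
    return 1 / (1 + math.exp(-log_odds))


free = (0.60, 0.05)    # from Example 4
winner = (0.30, 0.01)  # hypothetical second word, numbers invented

print(f"'free' only:         {spam_posterior(0.20, [free]):.4f}")
print(f"'free' and 'winner': {spam_posterior(0.20, [free, winner]):.4f}")
```

With only "free" in the list the function reproduces the 75% posterior above. Adding the hypothetical second word multiplies the odds again, giving posterior odds of 0.25 × 12 × 30 = 90.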

Example 5 — Naive Bayes Classification in Machine Learning

Scenario: A document classifier sorts support tickets into "billing" or "technical." Historically 30% of tickets are billing-related. The word "refund" appears in 70% of billing tickets and 4% of technical tickets. A new ticket contains the word "refund."

P(A∩B) = 0.70 × 0.30 = 0.21. P(¬A∩B) = 0.04 × 0.70 = 0.028. P(B) = 0.21 + 0.028 = 0.238. P(A|B) = 0.21 / 0.238 = 0.8824 (88.2%).

Interpretation: The classifier routes this ticket to billing with 88.2% confidence based on one feature. Real Naive Bayes classifiers combine dozens or hundreds of word-level likelihoods this way, assuming conditional independence between features — the "naive" part of the name.

Example 6 — Financial Fraud Detection

Scenario: A bank estimates that 0.5% of transactions are fraudulent. A risk model flags 80% of fraudulent transactions (sensitivity) and incorrectly flags 3% of legitimate transactions. A transaction is flagged.

P(A∩B) = 0.80 × 0.005 = 0.004. P(¬A∩B) = 0.03 × 0.995 = 0.02985. P(B) = 0.004 + 0.02985 = 0.03385. P(A|B) = 0.004 / 0.03385 = 0.1182 (11.8%).

Interpretation: Only about 1 in 8 flagged transactions is actually fraudulent, even with an 80%-sensitive model, because fraud is rare relative to the flag rate. This is why fraud teams use flagged transactions as a triage signal for review rather than an automatic denial, and why reducing the false-positive rate matters more than raising sensitivity once sensitivity is already reasonably high.
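
The trade-off between sensitivity and false-positive rate can be sketched numerically. The two alternative models below (90% sensitivity, or a 1.5% false-positive rate) are hypothetical variations chosen for illustration; only the baseline figures come from this example.

```python
def posterior(prior, likelihood, complement_likelihood):
    """Standard-form Bayes update: P(fraud | flagged)."""
    evidence = likelihood * prior + complement_likelihood * (1 - prior)
    if evidence == 0:
        raise ValueError("P(B) is zero; the posterior is undefined")
    return likelihood * prior / evidence


FRAUD_RATE = 0.005  # prior from Example 6

baseline = posterior(FRAUD_RATE, likelihood=0.80, complement_likelihood=0.03)
# Hypothetical model A: sensitivity raised from 80% to 90%
higher_sensitivity = posterior(FRAUD_RATE, likelihood=0.90, complement_likelihood=0.03)
# Hypothetical model B: false-positive rate halved from 3% to 1.5%
lower_false_positives = posterior(FRAUD_RATE, likelihood=0.80, complement_likelihood=0.015)

print(f"Baseline:               {baseline:.4f}")
print(f"Higher sensitivity:     {higher_sensitivity:.4f}")
print(f"Lower false positives:  {lower_false_positives:.4f}")
```

Under these assumed variations, halving the false-positive rate lifts the posterior to roughly 21%, while raising sensitivity lifts it only to roughly 13%.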

Example 7 — Weather Forecast Updating

Scenario: The seasonal base rate of rain on a given day is 20%. A barometric pressure drop occurs on 65% of rainy days and on 15% of dry days. This morning, pressure dropped.

P(A∩B) = 0.65 × 0.20 = 0.13. P(¬A∩B) = 0.15 × 0.80 = 0.12. P(B) = 0.13 + 0.12 = 0.25. P(A|B) = 0.13 / 0.25 = 0.52 (52%).

Interpretation: The pressure drop raises the rain probability from a 20% seasonal average to 52%. Numerical weather prediction systems use a more complex, continuously updated version of this same Bayesian updating principle, incorporating many simultaneous observations rather than a single binary signal.

Example 8 — Manufacturing Quality Control Fault Detection

Scenario: A production line has a 2% historical defect rate. An automated vision inspection system flags 96% of true defects and incorrectly flags 4% of good units. A unit is flagged.

P(A∩B) = 0.96 × 0.02 = 0.0192. P(¬A∩B) = 0.04 × 0.98 = 0.0392. P(B) = 0.0192 + 0.0392 = 0.0584. P(A|B) = 0.0192 / 0.0584 = 0.3288 (32.9%).

Interpretation: Roughly one in three flagged units is actually defective. Manufacturers use this figure to size manual re-inspection stations and to decide whether the inspection threshold needs adjustment, balancing the cost of false rejects against the cost of shipped defects.

Visualizing Bayesian Updating

A probability tree diagram is the clearest way to see why Bayes' theorem produces the result it does: it splits the population into the hypothesis branch and its complement, then splits each branch again by whether the evidence appears. The calculator above generates this tree automatically from your three inputs.

Reading the tree: The first split is P(A) versus P(¬A) — this is the prior. The second split, within each branch, is P(B|A) and P(B|¬A) — these are the likelihoods. Multiplying along any path gives a joint probability. The posterior P(A|B) is the "A and B" branch divided by the sum of every branch that ends in B, regardless of which side of the first split it came from.
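
The tree can also be sketched as a small data structure with four leaves, one per path. The function names are illustrative, and the inputs are the weather figures from Example 7.

```python
def probability_tree(prior, likelihood, complement_likelihood):
    """Return the joint probability at each of the four leaves of the tree."""
    return {
        ("A", "B"): prior * likelihood,
        ("A", "not B"): prior * (1 - likelihood),
        ("not A", "B"): (1 - prior) * complement_likelihood,
        ("not A", "not B"): (1 - prior) * (1 - complement_likelihood),
    }


def posterior_from_tree(leaves):
    """P(A|B): the 'A and B' leaf divided by every leaf that ends in B."""
    evidence = sum(p for (_, outcome), p in leaves.items() if outcome == "B")
    if evidence == 0:
        raise ValueError("no path ends in B; the posterior is undefined")
    return leaves[("A", "B")] / evidence


# Example 7 inputs: 20% base rate of rain, pressure drop on 65% of rainy
# days and 15% of dry days
leaves = probability_tree(prior=0.20, likelihood=0.65, complement_likelihood=0.15)
for path, p in leaves.items():
    print(f"{' -> '.join(path):18s} {p:.4f}")
print(f"P(A|B) = {posterior_from_tree(leaves):.4f}")
```

The four leaves always sum to 1, which is a quick check that the inputs were entered consistently.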

The same update can be drawn as a shift in a probability distribution: a narrow spike at the prior value moves and reshapes after the evidence is incorporated, landing at the posterior value. With a single binary prior like the ones in this calculator, that shift is a single number moving along a 0–100% line, which is exactly what the posterior gauge bar shows. In full Bayesian statistics, where A is a continuous parameter rather than a yes/no hypothesis, the same idea applies to entire prior and posterior probability distributions rather than single points — a topic covered in our Bayesian vs. frequentist guide.

Bayesian vs. Frequentist Statistics

The Bayesian approach treats probability as a degree of belief that updates with evidence, while the frequentist approach treats probability as the long-run frequency of an event across repeated trials. Bayes' theorem itself is just a probability identity that both schools accept; the disagreement is about how probability should be interpreted and whether a prior should formally enter the analysis.

Table: Bayesian vs. Frequentist Paradigms

Feature	Bayesian	Frequentist
Probability means	Degree of belief, updatable	Long-run frequency over repeated trials
Parameters are	Random variables with a distribution	Fixed, unknown constants
Prior information	Formally included via P(A)	Not used; inference relies on sample data alone
Typical output	Posterior probability or distribution	P-value, confidence interval
Common tools	Bayes' theorem, MCMC, credible intervals	Hypothesis tests, confidence intervals

In applied work the two approaches often converge on similar numerical answers, particularly with large samples and uninformative priors. The practical difference shows up most clearly in small-sample or rare-event problems — exactly the situations in the worked examples above — where the prior materially changes the conclusion. Stanford's entry on interpretations of probability covers the philosophical roots of this split in more depth.

Prior Probability vs. Posterior Probability

Table: Prior vs. Posterior, Side by Side

Property	Prior P(A)	Posterior P(A|B)
When it applies	Before observing evidence B	After observing evidence B
Typical source	Base rate, historical data, expert estimate	Calculated from prior × likelihood ÷ evidence
Role in next update	Input to Bayes' theorem	Becomes the new prior if more evidence arrives
Example value	1% disease prevalence	16.7% after one positive test

Sequential Bayesian updating chains these together: today's posterior becomes tomorrow's prior once a second, independent piece of evidence arrives. This is how spam filters tighten their estimate across many words in a single message, and how diagnostic protocols sometimes use a second, different test to refine the result of a first.
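
A minimal sketch of this chaining, assuming a second positive result from a test with the same characteristics as Example 1 and assuming the two results are independent given the patient's true status (an assumption that often fails when the same test is simply repeated):

```python
def update(prior, likelihood, complement_likelihood):
    """One Bayesian update: returns P(A|B)."""
    evidence = likelihood * prior + complement_likelihood * (1 - prior)
    if evidence == 0:
        raise ValueError("P(B) is zero; the posterior is undefined")
    return likelihood * prior / evidence


def sequential_update(prior, evidence_items):
    """Apply several updates in turn; each posterior becomes the next prior.

    evidence_items: list of (P(B|A), P(B|not A)) pairs, one per observation
    Returns the belief after each step, starting with the original prior.
    """
    belief = prior
    history = [belief]
    for likelihood, complement_likelihood in evidence_items:
        belief = update(belief, likelihood, complement_likelihood)
        history.append(belief)
    return history


# Two positive results, each with 99% sensitivity and a 5% false-positive rate
history = sequential_update(0.01, [(0.99, 0.05), (0.99, 0.05)])
for step, belief in enumerate(history):
    print(f"after {step} positive result(s): {belief:.4f}")
```

Under these assumptions the belief moves from 1% to 16.7% after the first positive and to about 79.8% after the second.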

Common Mistakes in Bayesian Reasoning

The two most common errors are ignoring the base rate (the prior) entirely, and confusing P(A|B) with P(B|A) — treating the two as interchangeable when they are usually very different numbers.

Mistake 1 — Base-rate neglect: Reading a test's accuracy (P(B|A), e.g. 99% sensitivity) as if it were the answer to "what is the probability I have the condition?" (P(A|B)). These are different quantities, and skipping the prior is what produces the gap between the intuitive 99% and the actual 16.7% in Example 1 above.

Mistake 2 — The prosecutor's fallacy: Treating P(evidence | innocent) as if it were P(innocent | evidence). A DNA match with a 1-in-a-million random match probability does not mean there is a 1-in-a-million chance the defendant is innocent — that conclusion also depends on the prior probability of guilt from non-DNA evidence and the size of the population that could have been tested. Courts and statisticians have documented this confusion as a recurring source of wrongful convictions.
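
The size of the gap can be sketched with a deliberately simplified model. Everything below except the 1-in-a-million match probability is an assumption made for illustration: a pool of 10 million possible sources, a uniform prior over that pool with no other evidence, and a test that always matches the true source.

```python
def probability_of_guilt_given_match(random_match_probability, pool_size):
    """P(guilty | match) under a uniform prior over a pool of possible sources.

    Assumes the true source always matches (P(match | guilty) = 1) and that
    no evidence other than the match is available. Both are simplifications.
    """
    prior_guilt = 1 / pool_size
    p_match_given_guilty = 1.0
    joint_guilty = p_match_given_guilty * prior_guilt
    joint_innocent = random_match_probability * (1 - prior_guilt)
    return joint_guilty / (joint_guilty + joint_innocent)


# 1-in-a-million random match probability, assumed pool of 10 million people
p_guilty = probability_of_guilt_given_match(1e-6, 10_000_000)
print(f"P(guilty | match)   = {p_guilty:.4f}")
print(f"P(innocent | match) = {1 - p_guilty:.4f}")
```

Under these assumptions about ten innocent people in the pool would be expected to match by chance, so the match alone leaves roughly a 9% probability of guilt, nowhere near the 99.9999% that the fallacy suggests.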

Both errors share a root cause: P(A|B) and P(B|A) look interchangeable in casual language but are computed from different denominators, P(B) for the first and P(A) for the second. The calculator above requires all three inputs, which is a practical way to avoid skipping the prior by accident.

Bayes' Theorem in Code: Python, R, and Excel

The same formula implemented in three common environments, using the Example 1 medical-screening values.

Python

def bayes_theorem(prior, likelihood, complement_likelihood):
    """
    prior: P(A)
    likelihood: P(B|A)
    complement_likelihood: P(B|not A)
    """
    not_prior = 1 - prior
    evidence = (likelihood * prior) + (complement_likelihood * not_prior)
    if evidence == 0:
        raise ValueError("P(B) is zero — check your inputs")
    posterior = (likelihood * prior) / evidence
    return posterior

# Example 1: medical screening
result = bayes_theorem(prior=0.01, likelihood=0.99, complement_likelihood=0.05)
print(f"Posterior probability: {result:.4f}")  # 0.1667

R


bayes_theorem <- function(prior, likelihood, complement_likelihood) {
  # prior: P(A), likelihood: P(B|A), complement_likelihood: P(B|not A)
  not_prior <- 1 - prior
  evidence <- (likelihood * prior) + (complement_likelihood * not_prior)
  posterior <- (likelihood * prior) / evidence
  return(posterior)
}

# Example 1: medical screening
result <- bayes_theorem(prior = 0.01, likelihood = 0.99, complement_likelihood = 0.05)
round(result, 4)  # 0.1667

Excel / Google Sheets

' A2 = Prior P(A), B2 = Likelihood P(B|A), C2 = Complement P(B|notA)
D2: =B2*A2                          ' Joint P(A and B)
E2: =C2*(1-A2)                      ' Joint P(notA and B)
F2: =D2+E2                          ' Total evidence P(B)
G2: =D2/F2                          ' Posterior P(A|B)

All three snippets implement the same four-step calculation: compute the joint probability for the true branch, compute the joint probability for the false branch, sum them for P(B), then divide. The zero-division guard in the Python version matters in production code, since P(B) = 0 is a possible input combination (for example, when both likelihood inputs are zero), and the posterior is then undefined.

Bayes' Theorem: Complete Formula and Entity Reference

The table below covers the key formulas and concepts associated with Bayes' theorem, structured for quick reference.

Table: Bayes' Theorem Formula Glossary — 12 Key Entities

Term	Symbol / Formula	Plain-English definition	Primary use
Bayes' Theorem	P(A|B) = P(B|A)P(A) / P(B)	Formula for updating a probability given new evidence	Diagnostics, classification, risk updates
Prior Probability	P(A)	Belief in the hypothesis before evidence is observed	Base rate, prevalence, starting estimate
Posterior Probability	P(A|B)	Updated belief after the evidence is observed	Final answer of a Bayesian update
Likelihood	P(B|A)	Probability of the evidence given the hypothesis is true	Test sensitivity, feature probability
Evidence (Marginal Probability)	P(B)	Overall probability of the evidence across all hypotheses	Normalizing constant in the formula
Bayes Factor	BF = P(B|A) / P(B|¬A)	Ratio measuring how much the evidence favors A over ¬A	Comparing strength of evidence
Prior Odds	Ω(A) = P(A) / P(¬A)	Prior probability expressed as a ratio rather than a fraction	Odds-form Bayesian updates
Posterior Odds	Ω(A|B) = Ω(A) × BF	Updated odds after multiplying by the Bayes factor	Sequential evidence combination
Sensitivity	P(Test+|Condition+)	True positive rate of a diagnostic test	Medical testing, equivalent to P(B|A)
Specificity	P(Test−|Condition−)	True negative rate of a diagnostic test	Medical testing; 1−Spec = P(B|¬A)
Positive Predictive Value	PPV = P(Condition+|Test+)	Probability of the condition given a positive result	Same quantity as the posterior P(A|B)
Naive Bayes Classifier	P(C|x) ∝ P(C) ∏ P(x_i|C)	Probabilistic classifier assuming features are conditionally independent	Spam filtering, text classification

Sources and Further Reading

Authority sources cited in this guide:

Stanford Encyclopedia of Philosophy. Bayes' Theorem. plato.stanford.edu
Stanford Encyclopedia of Philosophy. Interpretations of Probability. plato.stanford.edu
Khan Academy. Bayes' Theorem and Conditional Probability. khanacademy.org
Gigerenzer, G. et al. Statistical literacy among physicians interpreting mammography results. PubMed Central
OpenStax. Introductory Statistics, Chapter 3: Probability Topics. openstax.org
MIT OpenCourseWare. 18.05 Introduction to Probability and Statistics. ocw.mit.edu
Devore, J.L. Probability and Statistics for Engineering and the Sciences, 9th ed. Cengage Learning, 2016.

Frequently Asked Questions

What is Bayes' theorem?

Bayes' theorem is a formula in probability theory that describes how to update the probability of a hypothesis based on new evidence. It relates the posterior probability P(A|B) to the prior probability P(A), the likelihood P(B|A), and the total probability of the evidence P(B), using the equation P(A|B) = P(B|A)P(A) / P(B).

How do you calculate Bayes' theorem?

Multiply the likelihood P(B|A) by the prior P(A) to get the joint probability of both happening together, then divide by the total probability of the evidence P(B). When P(B) is not given directly, compute it using the law of total probability: P(B) = P(B|A)×P(A) + P(B|¬A)×P(¬A). Use the calculator above to run this automatically.

What is a Bayes' theorem calculator?

A Bayes' theorem calculator is a tool that computes the posterior probability of an event based on a prior probability, a conditional likelihood, and the total probability of observed evidence. It automates the arithmetic of Bayesian updating and is commonly used for medical test interpretation, spam filtering, and machine learning classification problems.

What is prior probability?

Prior probability, P(A), is the probability assigned to a hypothesis before any new evidence is taken into account. It is typically based on historical data, known base rates, or a reasonable starting estimate. In a medical context, the prior is usually disease prevalence in the population being tested.

What is posterior probability?

Posterior probability, P(A|B), is the updated probability of a hypothesis after accounting for new evidence. It is the output of Bayes' theorem and combines the prior probability with the strength of the new evidence. In sequential analysis, the posterior from one update becomes the prior for the next.

What is the difference between prior and posterior probability?

Prior probability is the initial estimate of a hypothesis before observing new evidence, often based on historical data or base rates. Posterior probability is the updated estimate after incorporating that evidence using Bayes' theorem. The posterior from one update can serve as the prior for the next round of evidence in sequential Bayesian updating.

How do false positives affect the posterior probability?

False positives lower the posterior probability of a true result, especially when the prior probability (base rate) is low. This is the basis of the base-rate fallacy: even a highly accurate test can produce more false positives than true positives in absolute terms when the condition being tested for is rare in the population, as shown in Example 1 above.

How is Bayes' theorem used in medicine?

In medicine, Bayes' theorem converts a diagnostic test's sensitivity and specificity into a positive predictive value: the actual probability a patient has a condition given a positive test result. This calculation accounts for disease prevalence (the prior), which is why the same test performs differently across populations with different base rates, as in the mammography example above.

How is Bayes' theorem used in spam filtering?

Naive Bayes spam filters and classifiers treat each word or feature as evidence and use Bayes' theorem to compute the posterior probability of a class given that evidence. The filter starts with a prior probability (e.g., the historical spam rate) and updates it as each word's likelihood is incorporated, assuming features are conditionally independent.

What is the Bayes factor?

The Bayes factor (BF) is the ratio of the likelihood under the hypothesis to the likelihood under its complement: BF = P(B|A) / P(B|¬A). It measures how strongly the evidence favors one hypothesis over another, independent of the prior. The posterior odds combine the Bayes factor with the prior odds: posterior odds = prior odds × BF. A BF far from 1 indicates strong evidence; a BF near 1 indicates the evidence does not discriminate between hypotheses.
