BY: Statistics Fundamentals Team
Reviewed By: Minsa A (Senior Statistics Editor)

Bayes' Theorem Calculator

Compute posterior probability instantly with this Bayes' theorem calculator. Enter a prior probability, a likelihood, and a complement likelihood to get the updated probability, the full formula substitution, a probability tree diagram, and a step-by-step solution, all in your browser with no signup required. The calculator is built on the same statistical foundations as our broader Statistics Fundamentals calculator library.


What Is Bayes' Theorem?

Bayes' theorem is a formula that calculates the probability of a hypothesis after accounting for new evidence. It connects four quantities: the prior probability P(A), the likelihood of the evidence P(B|A), the total probability of the evidence P(B), and the result, the posterior probability P(A|B). The formula is P(A|B) = P(B|A) × P(A) / P(B).

The theorem is named after Thomas Bayes, an 18th-century English statistician and minister whose work on inverse probability was published posthumously in 1763. Pierre-Simon Laplace independently developed and generalized the same result a few decades later, and Laplace's version is closer to the form used today. The Stanford Encyclopedia of Philosophy traces the theorem's role in epistemology, where it underlies formal models of belief revision.

In practice, the theorem answers a specific question: when a condition is rare, how much should a positive test result raise your confidence that the condition is actually present? A diagnostic test that is 99% accurate sounds conclusive, but if the condition it detects affects only 1 in 1,000 people, a positive result is still wrong far more often than most people expect. The calculator above runs this exact arithmetic.
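To make that arithmetic concrete, here is a minimal Python sketch. It assumes that "99% accurate" means 99% sensitivity together with a 1% false-positive rate. That reading is an assumption made for illustration, since the paragraph gives only a single accuracy figure.

```python
# Assumption: "99% accurate" is read as 99% sensitivity and a 1% false-positive rate.
prior = 1 / 1000            # P(A): the condition affects 1 in 1,000 people
sensitivity = 0.99          # P(B|A): positive test when the condition is present
false_positive_rate = 0.01  # P(B|not A): positive test when the condition is absent

# Two ways to get a positive test: a true positive or a false positive.
true_positive_path = sensitivity * prior
false_positive_path = false_positive_rate * (1 - prior)

posterior = true_positive_path / (true_positive_path + false_positive_path)
print(f"P(condition | positive test) = {posterior:.3f}")
```

Under these assumed error rates the posterior comes out at about 9%, so roughly 10 of every 11 positive results would be false positives.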

Bayes' Theorem Formula Library

There are four standard forms of Bayes' theorem: the basic two-event form, the partitioned form for more than two hypotheses, the odds form using the Bayes factor, and the diagnostic-testing form built from sensitivity and specificity. Each form answers the same question with notation suited to a different field.

Standard Form

P(A|B) = P(B|A) × P(A) / P(B)

Where:
  P(A) = prior probability
  P(B|A) = likelihood
  P(B) = total evidence probability
  P(A|B) = posterior probability

Partitioned Form (Law of Total Probability)

P(A_i|B) = P(B|A_i)P(A_i) / Σ_j P(B|A_j)P(A_j)

Used when there are more than two competing hypotheses A_1...A_n
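The partitioned form can be sketched in Python as follows. The function name `posterior_over_hypotheses` and the three-supplier numbers are hypothetical and chosen only for illustration. The sketch assumes the hypotheses are mutually exclusive and exhaustive, so the priors sum to 1.

```python
def posterior_over_hypotheses(priors, likelihoods):
    """Return P(A_i|B) for every hypothesis A_i in a partition."""
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1 across the partition")
    # Joint probability P(B|A_i) * P(A_i) for each hypothesis.
    joints = [lik * p for lik, p in zip(likelihoods, priors)]
    evidence = sum(joints)  # law of total probability: P(B)
    if evidence == 0:
        raise ValueError("P(B) is zero; the evidence is impossible under every hypothesis")
    return [joint / evidence for joint in joints]


# Hypothetical illustration: three suppliers' market shares and defect rates.
priors = [0.5, 0.3, 0.2]          # P(A_1), P(A_2), P(A_3)
likelihoods = [0.01, 0.02, 0.05]  # P(defect | supplier i)
posteriors = posterior_over_hypotheses(priors, likelihoods)
print([round(p, 3) for p in posteriors])  # [0.238, 0.286, 0.476]
```

With two hypotheses, A and ¬A, this reduces to the standard form above.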

Odds Form & Bayes Factor

Ω(A|B) = Ω(A) × BF

Where:
  Ω(A) = prior odds = P(A)/P(¬A)
  BF = P(B|A) / P(B|¬A)
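As a rough sketch of the odds form, the Python below multiplies prior odds by the Bayes factor and converts the result back to a probability. The function names are illustrative, and the inputs are the Example 1 screening values used elsewhere in this guide.

```python
def posterior_odds(prior, likelihood, complement_likelihood):
    """Return posterior odds = prior odds * Bayes factor."""
    if not 0 <= prior < 1:
        raise ValueError("prior must be in [0, 1) for the odds to be finite")
    if complement_likelihood <= 0:
        raise ValueError("P(B|not A) must be positive for a finite Bayes factor")
    prior_odds = prior / (1 - prior)
    bayes_factor = likelihood / complement_likelihood
    return prior_odds * bayes_factor


def odds_to_probability(odds):
    """Convert odds back to a probability."""
    return odds / (1 + odds)


odds = posterior_odds(prior=0.01, likelihood=0.99, complement_likelihood=0.05)
print(round(odds, 4), round(odds_to_probability(odds), 4))  # 0.2 0.1667
```

Posterior odds of 0.2 (1 to 5) correspond to the same 16.7% posterior that the standard form gives for these inputs.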

Diagnostic Testing Form

PPV = (Sens × Prev) / [(Sens × Prev) + (1−Spec)(1−Prev)]

Where:
  Sens = sensitivity
  Spec = specificity
  Prev = prevalence (the prior)

All four forms reduce to the same arithmetic. The diagnostic form simply renames the inputs: sensitivity is P(B|A), one minus specificity is P(B|¬A), and prevalence is P(A). Khan Academy's conditional probability unit works through this substitution in detail for students moving from the general formula to applied problems.
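The renaming can be checked numerically. This short Python sketch computes the diagnostic form and the standard form side by side. The function names `ppv` and `bayes_posterior` are illustrative, and the inputs are the rapid-test figures from Example 2.

```python
def ppv(sensitivity, specificity, prevalence):
    """Diagnostic form: positive predictive value."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


def bayes_posterior(prior, likelihood, complement_likelihood):
    """Standard form, with P(B) expanded by the law of total probability."""
    evidence = likelihood * prior + complement_likelihood * (1 - prior)
    return likelihood * prior / evidence


# Example 2 inputs: 95% sensitivity, 90% specificity, 10% prevalence.
diagnostic = ppv(sensitivity=0.95, specificity=0.90, prevalence=0.10)
standard = bayes_posterior(prior=0.10, likelihood=0.95, complement_likelihood=1 - 0.90)
print(round(diagnostic, 4), round(standard, 4))  # 0.5135 0.5135
```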

Prior, Likelihood, Evidence, and Posterior — What Each Term Means

Bayes' theorem has four components: the prior is what you believed before the evidence, the likelihood is how probable the evidence is under that belief, the evidence probability is a normalizing constant, and the posterior is the updated belief. The table below maps each symbol to its plain-English meaning.

Table: The Four Components of Bayes' Theorem

Component | Symbol | Plain meaning | Typical source
Prior | P(A) | Probability of the hypothesis before new evidence | Base rate, prevalence, history
Likelihood | P(B|A) | Probability of the evidence if the hypothesis is true | Test sensitivity, model fit
Complement likelihood | P(B|¬A) | Probability of the evidence if the hypothesis is false | 1 − specificity, false-positive rate
Evidence (marginal) | P(B) | Overall probability of observing the evidence at all | Computed via law of total probability
Posterior | P(A|B) | Updated probability of the hypothesis after the evidence | Calculator output

P(B) rarely arrives as a single given number. It is built from two paths to the same evidence: the path where A is true and the evidence appears, plus the path where A is false and the evidence still appears. That is the law of total probability, P(B) = P(B|A)P(A) + P(B|¬A)P(¬A), and it is the step most worked examples skip past too quickly.

How to Use the Bayes' Theorem Calculator

Enter the prior probability, the likelihood, and the complement likelihood as percentages, and the calculator returns the posterior probability instantly along with the full formula substitution. Here is what each step does and why it matters.

1. Enter the prior probability P(A)

This is your starting belief before any evidence, usually a base rate. For a disease screening, it is prevalence in the relevant population, for example 1% for a rare condition. Get this number wrong and the entire calculation shifts; ignoring it altogether is the base-rate fallacy, one of the most common Bayesian reasoning errors.

2. Enter the likelihood P(B|A)

This is the probability of seeing the evidence when the hypothesis is true. For a medical test, this is sensitivity, the true positive rate. For a spam filter, it is how often a given word appears in spam email.

3. Enter the complement likelihood P(B|¬A)

This is the probability of seeing the same evidence when the hypothesis is false, that is, the false-positive rate. It is the input most often left out of casual Bayesian reasoning, and leaving it out makes a positive result look far more conclusive than it is.

4. Read the posterior probability and the tree diagram

The calculator computes P(B) using the law of total probability, then divides the joint probability P(A∩B) by P(B) to get the posterior. The probability tree diagram shows both paths to the evidence side by side, so you can see why a rare condition keeps the posterior lower than intuition suggests.

Tip: Switch to the Odds & Bayes Factor tab to see the same calculation expressed as a likelihood ratio — useful when comparing the strength of two different pieces of evidence rather than computing a single posterior.

Worked Examples: Eight Applications of Bayes' Theorem

Each example follows the same structure: the scenario, the three inputs, the arithmetic, and a plain-English read of the result. You can re-create any of them in the calculator above.

Example 1 — Medical Disease Screening and the Base-Rate Fallacy

Scenario: A screening test for a disease that affects 1% of the population is 99% accurate at detecting the disease when present (sensitivity) and produces a false positive 5% of the time. A patient tests positive. What is the probability they actually have the disease?
Inputs: P(A) = 0.01, P(B|A) = 0.99, P(B|¬A) = 0.05
Joint probability (true positive path)

P(A∩B) = 0.99 × 0.01 = 0.0099

Joint probability (false positive path)

P(¬A∩B) = 0.05 × 0.99 = 0.0495

Total evidence probability

P(B) = 0.0099 + 0.0495 = 0.0594

Posterior probability

P(A|B) = 0.0099 / 0.0594 = 0.1667 (16.7%)

Interpretation: Despite the test's 99% sensitivity, the probability of disease after one positive result is only 16.7%, because the disease is rare and false positives from the healthy 99% of the population outnumber true positives from the sick 1%. This is the textbook demonstration of the base-rate fallacy.
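One way to see why false positives dominate is to count people instead of multiplying probabilities. The Python sketch below applies the Example 1 rates to a population of 10,000. The population size is an assumption chosen only to give round numbers.

```python
population = 10_000  # assumed size, chosen only for readable counts
prior, sensitivity, false_positive_rate = 0.01, 0.99, 0.05

sick = population * prior                        # people with the disease
healthy = population - sick                      # people without it
true_positives = sick * sensitivity              # sick people who test positive
false_positives = healthy * false_positive_rate  # healthy people who test positive

share_true = true_positives / (true_positives + false_positives)
print(f"{true_positives:.0f} true positives, {false_positives:.0f} false positives")
print(f"Share of positive results that are true: {share_true:.1%}")
```

Of the 594 positive results in this hypothetical population, 99 are true positives and 495 are false positives, which is the same 16.7% as the posterior above.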

Example 2 — Rapid Diagnostic Testing During a Disease Outbreak

Scenario: During an outbreak, prevalence in a tested population rises to 10%. A rapid test has 95% sensitivity and 90% specificity (so a 10% false-positive rate). A person tests positive.

P(A∩B) = 0.95 × 0.10 = 0.095. P(¬A∩B) = 0.10 × 0.90 = 0.090. P(B) = 0.095 + 0.090 = 0.185. P(A|B) = 0.095 / 0.185 = 0.5135 (51.4%).

Interpretation: The test now yields a 51.4% posterior. At 1% prevalence, the same test would yield only about 8.8%. The test did not change; the prior did. This is why testing strategy and result interpretation shift as an outbreak progresses and prevalence climbs.
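The effect of prevalence can be tabulated with a short loop. The sketch below holds the Example 2 test fixed at 95% sensitivity and 90% specificity and varies only the prior. The prevalence values other than 10% are hypothetical points chosen to show the trend.

```python
def posterior(prevalence, sensitivity=0.95, specificity=0.90):
    """Posterior P(A|B) for a positive result at a given prevalence."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Same test, different priors (1%, 5%, and 30% are illustrative values).
for prevalence in (0.01, 0.05, 0.10, 0.30):
    print(f"prevalence {prevalence:.0%}: posterior {posterior(prevalence):.1%}")
```

The posterior climbs from roughly 9% at 1% prevalence to roughly 80% at 30% prevalence, with no change to the test itself.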

Example 3 — Mammography Screening Accuracy

Scenario: Breast cancer prevalence in a screening population is roughly 0.8%. Mammography sensitivity is approximately 90%, and the false-positive rate is around 7%. A patient receives a positive mammogram.

P(A∩B) = 0.90 × 0.008 = 0.0072. P(¬A∩B) = 0.07 × 0.992 = 0.06944. P(B) = 0.0072 + 0.06944 = 0.07664. P(A|B) = 0.0072 / 0.07664 = 0.0939 (9.4%).

Interpretation: A positive mammogram in an average-risk population corresponds to roughly a 9% chance of cancer, not 90%. This specific calculation, with similar input figures, has been used in published research on physician statistical literacy, including work by Gerd Gigerenzer documented in PubMed Central, where many practicing clinicians substantially overestimated the figure.

Example 4 — Spam Email Filtering With a Single Keyword

Scenario: 20% of incoming email is spam. The word "free" appears in 60% of spam emails and in 5% of legitimate emails. An incoming message contains the word "free."

P(A∩B) = 0.60 × 0.20 = 0.12. P(¬A∩B) = 0.05 × 0.80 = 0.04. P(B) = 0.12 + 0.04 = 0.16. P(A|B) = 0.12 / 0.16 = 0.75 (75%).

Interpretation: One keyword raises the spam probability from a 20% prior to a 75% posterior. Production spam filters repeat this update across every word in a message and multiply the resulting likelihood ratios together, which is the core mechanism behind the Naive Bayes classifier.
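A minimal sketch of that repeated update, in odds form, is shown below. The "free" rates come from Example 4. The second keyword, "winner", and its rates are hypothetical. The sketch also assumes the words occur independently of each other within each class, which is the Naive Bayes assumption.

```python
def spam_probability(prior, likelihood_ratios):
    """Multiply prior odds by one likelihood ratio per observed word."""
    odds = prior / (1 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1 + odds)


free_ratio = 0.60 / 0.05    # P("free" | spam) / P("free" | legitimate), from Example 4
winner_ratio = 0.30 / 0.02  # hypothetical rates for a second keyword

one_word = spam_probability(0.20, [free_ratio])
two_words = spam_probability(0.20, [free_ratio, winner_ratio])
print(round(one_word, 4))   # 0.75
print(round(two_words, 4))  # 0.9783
```

Working in odds keeps each update to a single multiplication. Real filters usually sum log-ratios instead, to avoid numerical underflow across long messages.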

Example 5 — Naive Bayes Classification in Machine Learning

Scenario: A document classifier sorts support tickets into "billing" or "technical." Historically 30% of tickets are billing-related. The phrase "refund" appears in 70% of billing tickets and 4% of technical tickets. A new ticket contains the word "refund."

P(A∩B) = 0.70 × 0.30 = 0.21. P(¬A∩B) = 0.04 × 0.70 = 0.028. P(B) = 0.21 + 0.028 = 0.238. P(A|B) = 0.21 / 0.238 = 0.8824 (88.2%).

Interpretation: The classifier routes this ticket to billing with 88.2% confidence based on one feature. Real Naive Bayes classifiers combine dozens or hundreds of word-level likelihoods this way, assuming conditional independence between features — the "naive" part of the name.

Example 6 — Financial Fraud Detection

Scenario: A bank estimates that 0.5% of transactions are fraudulent. A risk model flags 80% of fraudulent transactions (sensitivity) and incorrectly flags 3% of legitimate transactions. A transaction is flagged.

P(A∩B) = 0.80 × 0.005 = 0.004. P(¬A∩B) = 0.03 × 0.995 = 0.02985. P(B) = 0.004 + 0.02985 = 0.03385. P(A|B) = 0.004 / 0.03385 = 0.1182 (11.8%).

Interpretation: Only about 1 in 8 flagged transactions is actually fraudulent, even with an 80%-sensitive model, because fraud is rare relative to the flag rate. This is why fraud teams use flagged transactions as a triage signal for review rather than an automatic denial, and why reducing the false-positive rate matters more than raising sensitivity once sensitivity is already reasonably high.
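The trade-off between sensitivity and false-positive rate can be checked directly. The sketch below compares the Example 6 baseline with two hypothetical model improvements: raising sensitivity to 95%, and halving the false-positive rate to 1.5%. Both alternative figures are assumptions for illustration.

```python
def flag_precision(prior, sensitivity, false_positive_rate):
    """Probability that a flagged transaction is actually fraudulent."""
    true_flags = sensitivity * prior
    false_flags = false_positive_rate * (1 - prior)
    return true_flags / (true_flags + false_flags)


prior = 0.005  # 0.5% of transactions are fraudulent (Example 6)
scenarios = {
    "baseline (80% sensitivity, 3% FPR)": (0.80, 0.03),
    "higher sensitivity (95%, 3% FPR)": (0.95, 0.03),      # hypothetical
    "lower false-positive rate (80%, 1.5% FPR)": (0.80, 0.015),  # hypothetical
}
results = {
    label: flag_precision(prior, sens, fpr)
    for label, (sens, fpr) in scenarios.items()
}
for label, value in results.items():
    print(f"{label}: {value:.1%}")
```

With these inputs, raising sensitivity moves the posterior from about 11.8% to about 13.7%, while halving the false-positive rate moves it to about 21.1%.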

Example 7 — Weather Forecast Updating

Scenario: The seasonal base rate of rain on a given day is 20%. A barometric pressure drop occurs on 65% of rainy days and on 15% of dry days. This morning, pressure dropped.

P(A∩B) = 0.65 × 0.20 = 0.13. P(¬A∩B) = 0.15 × 0.80 = 0.12. P(B) = 0.13 + 0.12 = 0.25. P(A|B) = 0.13 / 0.25 = 0.52 (52%).

Interpretation: The pressure drop raises the rain probability from a 20% seasonal average to 52%. Numerical weather prediction systems use a more complex, continuously updated version of this same Bayesian updating principle, incorporating many simultaneous observations rather than a single binary signal.

Example 8 — Manufacturing Quality Control Fault Detection

Scenario: A production line has a 2% historical defect rate. An automated vision inspection system flags 96% of true defects and incorrectly flags 4% of good units. A unit is flagged.

P(A∩B) = 0.96 × 0.02 = 0.0192. P(¬A∩B) = 0.04 × 0.98 = 0.0392. P(B) = 0.0192 + 0.0392 = 0.0584. P(A|B) = 0.0192 / 0.0584 = 0.3288 (32.9%).

Interpretation: Roughly one in three flagged units is actually defective. Manufacturers use this figure to size manual re-inspection stations and to decide whether the inspection threshold needs adjustment, balancing the cost of false rejects against the cost of shipped defects.

Visualizing Bayesian Updating

A probability tree diagram is the clearest way to see why Bayes' theorem produces the result it does: it splits the population into the hypothesis branch and its complement, then splits each branch again by whether the evidence appears. The calculator above generates this tree automatically from your three inputs.

Reading the tree: The first split is P(A) versus P(¬A) — this is the prior. The second split, within each branch, is P(B|A) and P(B|¬A) — these are the likelihoods. Multiplying along any path gives a joint probability. The posterior P(A|B) is the "A and B" branch divided by the sum of every branch that ends in B, regardless of which side of the first split it came from.
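The tree reading described above can be sketched as a small Python dictionary of leaf probabilities. The function name `tree_leaves` is illustrative, and the inputs are the weather values from Example 7.

```python
def tree_leaves(prior, likelihood, complement_likelihood):
    """Joint probability of each of the four root-to-leaf paths."""
    return {
        ("A", "B"): prior * likelihood,
        ("A", "not B"): prior * (1 - likelihood),
        ("not A", "B"): (1 - prior) * complement_likelihood,
        ("not A", "not B"): (1 - prior) * (1 - complement_likelihood),
    }


leaves = tree_leaves(prior=0.20, likelihood=0.65, complement_likelihood=0.15)

# P(B): sum every leaf that ends in B, whichever side of the first split it came from.
evidence = sum(p for (_, outcome), p in leaves.items() if outcome == "B")
posterior = leaves[("A", "B")] / evidence

print(round(sum(leaves.values()), 4))  # 1.0, the four leaves cover every case
print(round(posterior, 4))             # 0.52
```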

The same update can be drawn as a shift in a probability distribution: a narrow spike at the prior value moves and reshapes after the evidence is incorporated, landing at the posterior value. With a single binary prior like the ones in this calculator, that shift is a single number moving along a 0–100% line, which is exactly what the posterior gauge bar shows. In full Bayesian statistics, where A is a continuous parameter rather than a yes/no hypothesis, the same idea applies to entire prior and posterior probability distributions rather than single points — a topic covered in our Bayesian vs. frequentist guide.

Bayesian vs. Frequentist Statistics

The Bayesian approach treats probability as a degree of belief that updates with evidence, while the frequentist approach treats probability as the long-run frequency of an event across repeated trials. Bayes' theorem itself is just a probability identity that both schools accept; the disagreement is about how probability should be interpreted and whether a prior should formally enter the analysis.

Table: Bayesian vs. Frequentist Paradigms

Feature | Bayesian | Frequentist
Probability means | Degree of belief, updatable | Long-run frequency over repeated trials
Parameters are | Random variables with a distribution | Fixed, unknown constants
Prior information | Formally included via P(A) | Not used; inference relies on sample data alone
Typical output | Posterior probability or distribution | P-value, confidence interval
Common tools | Bayes' theorem, MCMC, credible intervals | Hypothesis tests, confidence intervals

In applied work the two approaches often converge on similar numerical answers, particularly with large samples and uninformative priors. The practical difference shows up most clearly in small-sample or rare-event problems — exactly the situations in the worked examples above — where the prior materially changes the conclusion. Stanford's entry on interpretations of probability covers the philosophical roots of this split in more depth.

Prior Probability vs. Posterior Probability

Table: Prior vs. Posterior, Side by Side

Property | Prior P(A) | Posterior P(A|B)
When it applies | Before observing evidence B | After observing evidence B
Typical source | Base rate, historical data, expert estimate | Calculated from prior × likelihood ÷ evidence
Role in next update | Input to Bayes' theorem | Becomes the new prior if more evidence arrives
Example value | 1% disease prevalence | 16.7% after one positive test

Sequential Bayesian updating chains these together: today's posterior becomes tomorrow's prior once a second, independent piece of evidence arrives. This is how spam filters tighten their estimate across many words in a single message, and how diagnostic protocols sometimes use a second, different test to refine the result of a first.
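A minimal sketch of sequential updating follows. It starts from the Example 1 prior and applies two positive results in a row. It assumes the second test has the same error rates as the first and is independent of it given the patient's true status. Both are assumptions made for illustration, not claims about any real test.

```python
def update(prior, likelihood, complement_likelihood):
    """One Bayesian update: returns the posterior P(A|B)."""
    evidence = likelihood * prior + complement_likelihood * (1 - prior)
    return likelihood * prior / evidence


belief = 0.01  # Example 1 prior: 1% prevalence
history = []
for _ in range(2):  # two positive results, assumed conditionally independent
    belief = update(belief, likelihood=0.99, complement_likelihood=0.05)
    history.append(belief)

print([round(b, 4) for b in history])  # [0.1667, 0.7984]
```

The first positive result moves the estimate from 1% to 16.7%. The second, using that 16.7% as its prior, moves it to roughly 80%.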

Common Mistakes in Bayesian Reasoning

The two most common errors are ignoring the base rate (the prior) entirely, and confusing P(A|B) with P(B|A) — treating the two as interchangeable when they are usually very different numbers.

Mistake 1 (base-rate neglect): Reading a test's sensitivity (P(B|A), e.g. 99%) as if it were the answer to "what is the probability I have the condition?" (P(A|B)). These are different quantities, and skipping the prior is what produces the gap between the 99% people expect and the 16.7% actual posterior in Example 1 above.
Mistake 2 (the prosecutor's fallacy): Treating P(evidence | innocent) as if it were P(innocent | evidence). A DNA match with a 1-in-a-million random match probability does not mean there is a 1-in-a-million chance the defendant is innocent. That conclusion also depends on the prior probability of guilt from non-DNA evidence and on the size of the population that could have been tested. Courts and statisticians have documented this confusion as a recurring source of wrongful convictions.

Both errors share a root cause: P(A|B) and P(B|A) look interchangeable in casual language but are computed from different denominators. The calculator above forces all three required inputs, which is a practical way to avoid skipping the prior by accident.
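The prosecutor's fallacy can be illustrated with a deliberately simplified Python sketch. It assumes that every person in a pool of possible sources is equally likely a priori to be the source, that the true source always matches, and that there is no other evidence. The pool sizes are hypothetical and none of this models a real case.

```python
def probability_source(pool_size, random_match_probability):
    """P(defendant is the source | DNA match) under a uniform prior over the pool."""
    prior = 1 / pool_size  # assumption: no non-DNA evidence
    match_if_source = 1.0  # assumption: the true source always matches
    evidence = match_if_source * prior + random_match_probability * (1 - prior)
    return match_if_source * prior / evidence


# Same 1-in-a-million random match probability, different hypothetical pool sizes.
for pool_size in (10_000, 1_000_000, 10_000_000):
    p = probability_source(pool_size, random_match_probability=1e-6)
    print(f"pool of {pool_size:,}: P(source | match) = {p:.3f}")
```

Under these assumptions the same match supports very different conclusions: about 99% with a pool of ten thousand, about 50% with a pool of one million, and about 9% with a pool of ten million.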

Bayes' Theorem in Code: Python, R, and Excel

The same formula implemented in three common environments, using the Example 1 medical-screening values.

Python

def bayes_theorem(prior, likelihood, complement_likelihood):
    """
    prior: P(A)
    likelihood: P(B|A)
    complement_likelihood: P(B|not A)
    """
    not_prior = 1 - prior
    evidence = (likelihood * prior) + (complement_likelihood * not_prior)
    if evidence == 0:
        raise ValueError("P(B) is zero; check your inputs")
    posterior = (likelihood * prior) / evidence
    return posterior

# Example 1: medical screening
result = bayes_theorem(prior=0.01, likelihood=0.99, complement_likelihood=0.05)
print(f"Posterior probability: {result:.4f}")  # 0.1667

R

bayes_theorem <- function(prior, likelihood, complement_likelihood) {
  not_prior <- 1 - prior
  evidence <- (likelihood * prior) +
              (complement_likelihood * not_prior)
  posterior <- (likelihood * prior) / evidence
  return(posterior)
}

# Example 1: Medical screening
result <- bayes_theorem(
  prior = 0.01,
  likelihood = 0.99,
  complement_likelihood = 0.05
)

round(result, 4)  # 0.1667

Excel / Google Sheets

' A2 = Prior P(A), B2 = Likelihood P(B|A), C2 = Complement P(B|notA)
D2: =B2*A2        ' Joint P(A and B)
E2: =C2*(1-A2)    ' Joint P(notA and B)
F2: =D2+E2        ' Total evidence P(B)
G2: =D2/F2        ' Posterior P(A|B)

All three snippets implement the same four-step calculation: compute the joint probability for the true branch, compute the joint probability for the false branch, sum them for P(B), then divide. The zero-division guard in the Python version matters in production code, since P(B) = 0 can occur (for example, when both likelihood inputs are zero), and the posterior is then undefined. The R and spreadsheet versions omit this guard.

Bayes' Theorem: Complete Formula and Entity Reference

The table below lists the key formulas and concepts associated with Bayes' theorem, arranged for quick reference.

Table: Bayes' Theorem Formula Glossary — 12 Key Entities

Term | Symbol / Formula | Plain-English definition | Primary use
Bayes' Theorem | P(A|B) = P(B|A)P(A) / P(B) | Formula for updating a probability given new evidence | Diagnostics, classification, risk updates
Prior Probability | P(A) | Belief in the hypothesis before evidence is observed | Base rate, prevalence, starting estimate
Posterior Probability | P(A|B) | Updated belief after the evidence is observed | Final answer of a Bayesian update
Likelihood | P(B|A) | Probability of the evidence given the hypothesis is true | Test sensitivity, feature probability
Evidence (Marginal Probability) | P(B) | Overall probability of the evidence across all hypotheses | Normalizing constant in the formula
Bayes Factor | BF = P(B|A) / P(B|¬A) | Ratio measuring how much the evidence favors A over ¬A | Comparing strength of evidence
Prior Odds | Ω(A) = P(A) / P(¬A) | Prior probability expressed as a ratio rather than a fraction | Odds-form Bayesian updates
Posterior Odds | Ω(A|B) = Ω(A) × BF | Updated odds after multiplying by the Bayes factor | Sequential evidence combination
Sensitivity | P(Test+|Condition+) | True positive rate of a diagnostic test | Medical testing, equivalent to P(B|A)
Specificity | P(Test−|Condition−) | True negative rate of a diagnostic test | Medical testing; 1−Spec = P(B|¬A)
Positive Predictive Value | PPV = P(Condition+|Test+) | Probability of the condition given a positive result | Same quantity as the posterior P(A|B)
Naive Bayes Classifier | P(C|x) ∝ P(C) ∏ P(x_i|C) | Probabilistic classifier assuming features are conditionally independent | Spam filtering, text classification

Sources and Further Reading

Authority sources cited in this guide:

  • Stanford Encyclopedia of Philosophy. Bayes' Theorem. plato.stanford.edu
  • Stanford Encyclopedia of Philosophy. Interpretations of Probability. plato.stanford.edu
  • Khan Academy. Bayes' Theorem and Conditional Probability. khanacademy.org
  • Gigerenzer, G. et al. Statistical literacy among physicians interpreting mammography results. PubMed Central
  • OpenStax. Introductory Statistics, Chapter 3: Probability Topics. openstax.org
  • MIT OpenCourseWare. 18.05 Introduction to Probability and Statistics. ocw.mit.edu
  • Devore, J.L. Probability and Statistics for Engineering and the Sciences, 9th ed. Cengage Learning, 2016.

Frequently Asked Questions

What is Bayes' theorem?

Bayes' theorem is a formula in probability theory that describes how to update the probability of a hypothesis based on new evidence. It relates the posterior probability P(A|B) to the prior probability P(A), the likelihood P(B|A), and the total probability of the evidence P(B), using the equation P(A|B) = P(B|A)P(A) / P(B).

How do you calculate a posterior probability with Bayes' theorem?

Multiply the likelihood P(B|A) by the prior P(A) to get the joint probability of both happening together, then divide by the total probability of the evidence P(B). When P(B) is not given directly, compute it using the law of total probability: P(B) = P(B|A)×P(A) + P(B|¬A)×P(¬A). Use the calculator above to run this automatically.

What is a Bayes' theorem calculator?

A Bayes' theorem calculator is a tool that computes the posterior probability of an event based on a prior probability, a conditional likelihood, and the total probability of observed evidence. It automates the arithmetic of Bayesian updating and is commonly used for medical test interpretation, spam filtering, and machine learning classification problems.

What is prior probability?

Prior probability, P(A), is the probability assigned to a hypothesis before any new evidence is taken into account. It is typically based on historical data, known base rates, or a reasonable starting estimate. In a medical context, the prior is usually disease prevalence in the population being tested.

What is posterior probability?

Posterior probability, P(A|B), is the updated probability of a hypothesis after accounting for new evidence. It is the output of Bayes' theorem and combines the prior probability with the strength of the new evidence. In sequential analysis, the posterior from one update becomes the prior for the next.

What is the difference between prior and posterior probability?

Prior probability is the initial estimate of a hypothesis before observing new evidence, often based on historical data or base rates. Posterior probability is the updated estimate after incorporating that evidence using Bayes' theorem. The posterior from one update can serve as the prior for the next round of evidence in sequential Bayesian updating.

How do false positives affect the posterior probability?

False positives lower the posterior probability of a true result, especially when the prior probability (base rate) is low. This is the basis of the base-rate fallacy: even a highly accurate test can produce more false positives than true positives in absolute terms when the condition being tested for is rare in the population, as shown in Example 1 above.

How is Bayes' theorem used in medicine?

In medicine, Bayes' theorem converts a diagnostic test's sensitivity and specificity into a positive predictive value, which is the actual probability a patient has a condition given a positive test result. This calculation accounts for disease prevalence (the prior), which is why the same test performs differently across populations with different base rates, as in the mammography example above.

How is Bayes' theorem used in spam filtering and machine learning?

Naive Bayes spam filters and classifiers treat each word or feature as evidence and use Bayes' theorem to compute the posterior probability of a class given that evidence. The filter starts with a prior probability (e.g., the historical spam rate) and updates it as each word's likelihood is incorporated, assuming features are conditionally independent.

What is the Bayes factor?

The Bayes factor (BF) is the ratio of the likelihood under the hypothesis to the likelihood under its complement: BF = P(B|A) / P(B|¬A). It measures how strongly the evidence favors one hypothesis over another, independent of the prior. The posterior odds combine the Bayes factor with the prior odds: posterior odds = prior odds × BF. A BF far from 1 indicates strong evidence; a BF near 1 indicates the evidence does not discriminate between hypotheses.