What Is Conditional Probability? (Definition + Formula)
Intuitively, conditional probability answers the question: once we know B happened, how does that change what we expect about A? Think of looking out your window on a cloudy morning. The unconditional probability of rain today might be 20%. But the probability of rain given that you see dark clouds might jump to 70%. The clouds are your condition — they shrink your mental sample space and update your estimate.
The formal mechanism is sample space reduction. Before you know anything about B, the universe of possible outcomes is the full sample space S. Once you learn B occurred, only the outcomes inside B are still in play. The conditional probability P(A|B) asks: of the outcomes inside B, what fraction also fall inside A?
- Formula: P(A|B) = P(A ∩ B) / P(B), where P(B) > 0
- Reading the notation: P(A|B) = "probability of A given B" — B is the condition, A is the event of interest
- Sample space effect: Knowing B occurred reduces the effective sample space from S to B
- Key assumption: The formula requires P(B) > 0 — you cannot condition on an impossible event
- Independent events: If A and B are independent, P(A|B) = P(A) — knowing B tells you nothing new about A
- Connection to multiplication rule: Rearranging gives P(A ∩ B) = P(A|B) × P(B)
The Conditional Probability Formula Explained
Breaking Down P(A|B) = P(A ∩ B) / P(B)
- P(A|B): probability of A given that B has occurred
- P(A ∩ B): joint probability that both A and B occur
- P(B): probability of the condition event B (the denominator, which acts as the new sample space)
- |: the "given that" operator (vertical bar)
Each component does a specific job. The numerator P(A ∩ B) is the joint probability, the fraction of all outcomes where both A and B happen simultaneously. The denominator P(B) is the probability of the condition event B. Dividing them rescales the joint probability relative to the new, restricted sample space where B is guaranteed.
Here is a concrete walkthrough before any symbols. Suppose 1,000 students took a test. 400 studied (event B). Of those 400 who studied, 320 passed (event A ∩ B). What is the probability a student passed, given that they studied? Intuitively: 320 out of 400, which is 0.80. The formula confirms it: P(pass | studied) = P(pass ∩ studied) / P(studied) = (320/1000) / (400/1000) = 0.32 / 0.40 = 0.80. The 1,000 denominators cancel, leaving exactly the intuitive answer.
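The student example above can be checked with a few lines of Python. This is a minimal sketch: the function name `conditional_probability` is an illustrative choice, and exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction


def conditional_probability(p_joint, p_condition):
    """Return P(A|B) = P(A and B) / P(B). Undefined when P(B) = 0."""
    if p_condition == 0:
        raise ValueError("P(B) must be greater than 0 to condition on B")
    return p_joint / p_condition


total_students = 1000
studied = 400             # event B
studied_and_passed = 320  # event A and B

p_studied = Fraction(studied, total_students)
p_pass_and_studied = Fraction(studied_and_passed, total_students)
p_pass_given_studied = conditional_probability(p_pass_and_studied, p_studied)

print(p_pass_given_studied)         # 4/5
print(float(p_pass_given_studied))  # 0.8
```

The guard on `p_condition` mirrors the requirement P(B) > 0 stated with the formula.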
P(A|B) is read "the probability of A given B" or "the probability of A given that B has occurred." The vertical bar | is not division — it is a conditional operator. The event to the right of | is the condition (what you already know happened). The event to the left is what you are estimating. Order matters enormously: P(A|B) and P(B|A) are generally not equal.
The Multiplication Rule for Dependent Events
Rearranging the conditional probability formula gives the multiplication rule for dependent events, one of the most used formulas in all of probability:
P(A ∩ B) = P(B) × P(A|B)
This formula reads as: "the probability that both A and B occur equals the probability that B occurs, multiplied by the probability that A occurs given B has occurred." For independent events, P(A|B) = P(A), so the formula collapses to the familiar P(A) × P(B).
If events are dependent (e.g., drawing cards without replacement), you cannot simply multiply P(A) × P(B). You must use the conditional form: P(A ∩ B) = P(A) × P(B|A). Using unconditional probabilities for dependent events is one of the most frequent errors in introductory statistics. The test: ask yourself whether the outcome of the first event changes the sample space for the second.
Sample Space Reduction: The Core Intuition
The geometric picture of conditional probability is straightforward and worth internalizing before any calculation. Draw the full sample space S as a large rectangle. Inside it, draw two overlapping circles: circle B (all outcomes where B occurs) and circle A. The overlap — the lens-shaped region where the circles intersect — is A ∩ B.
Without any condition, the denominator is all of S. Once you know B occurred, you discard everything outside circle B. The denominator shrinks to just circle B. The numerator stays the same: the intersection A ∩ B. P(A|B) is therefore the proportion of the B-circle that overlaps with A.
This geometric view makes one thing immediately clear: conditioning on B can only change the probability of A if B provides information about A — that is, if A and B overlap in a non-proportional way. If A and B are independent events, the overlap is perfectly proportional to B's size, and conditioning on B leaves P(A) unchanged.
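Sample space reduction can also be carried out literally, by listing outcomes and discarding those outside B. The sketch below uses two fair dice as an illustration; the dice and the two events are assumptions chosen for this example, not part of the text above, and every outcome is assumed equally likely.

```python
from fractions import Fraction
from itertools import product

# Full sample space S: all 36 ordered outcomes of two fair dice.
sample_space = list(product(range(1, 7), repeat=2))


def sum_is_eight(outcome):
    """Event A: the two dice sum to 8."""
    return sum(outcome) == 8


def first_is_even(outcome):
    """Event B: the first die shows an even number."""
    return outcome[0] % 2 == 0


def conditional(event, condition, space):
    """P(event | condition), computed by restricting the space to the condition."""
    reduced = [o for o in space if condition(o)]  # keep only outcomes inside B
    if not reduced:
        raise ValueError("cannot condition on an impossible event")
    favorable = [o for o in reduced if event(o)]  # outcomes in A and B
    return Fraction(len(favorable), len(reduced))


p_a = Fraction(sum(1 for o in sample_space if sum_is_eight(o)), len(sample_space))
p_a_given_b = conditional(sum_is_eight, first_is_even, sample_space)

print(p_a)          # 5/36
print(p_a_given_b)  # 1/6
```

Here the denominator drops from 36 outcomes to the 18 inside B, and only the 3 outcomes in the overlap count toward the numerator.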
- Conditional Probability: P(A|B) = P(A ∩ B) / P(B). Restricts the sample space to B, then asks what fraction of B also satisfies A.
- Independent Events: P(A|B) = P(A). Knowing B occurred gives no information about A. The condition has no effect.
- Multiplication Rule: P(A ∩ B) = P(A|B) × P(B). The joint probability equals the conditional probability times the probability of the condition.
- Chain Rule (Extended): P(A ∩ B ∩ C) = P(A) × P(B|A) × P(C|A ∩ B). For three or more events, each term conditions on all prior events in the chain.
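As a rough illustration of the chain rule, the sketch below multiplies one conditional probability per draw when drawing aces from a standard deck without replacement. The helper name `prob_all_aces` and the three-ace case are assumptions added for illustration.

```python
from fractions import Fraction


def prob_all_aces(draws, aces=4, deck_size=52):
    """P(the first `draws` cards are all aces), drawing without replacement.

    Chain rule: each factor is the probability of another ace given
    that every earlier draw was also an ace.
    """
    if draws > deck_size:
        raise ValueError("cannot draw more cards than the deck holds")
    probability = Fraction(1)
    for k in range(draws):
        # After k aces are removed, aces - k aces remain among deck_size - k cards.
        probability *= Fraction(aces - k, deck_size - k)
    return probability


print(prob_all_aces(2))  # 1/221
print(prob_all_aces(3))  # 1/5525
```

With two draws this reduces to the ordinary multiplication rule; each extra draw adds one more conditional factor.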
Three Worked Examples
Example 1: Drawing Cards Without Replacement
What is the probability of drawing two aces in a row from a standard 52-card deck without replacement?
This is the classic dependent-event scenario. The second draw's sample space is smaller because one card was removed after the first draw.
Define events. Let A = "first card is an ace." Let B = "second card is an ace." We want P(A ∩ B).
Find P(A). There are 4 aces in 52 cards. P(A) = 4/52 = 1/13 ≈ 0.0769.
Find P(B|A) — the conditional probability. Given the first card was an ace, only 51 cards remain, of which 3 are aces. P(B|A) = 3/51 = 1/17 ≈ 0.0588. Notice the sample space shrank from 52 to 51 and the favorable outcomes shrank from 4 to 3.
Apply the multiplication rule for dependent events. P(A ∩ B) = P(A) × P(B|A) = (4/52) × (3/51) = 12/2652 = 1/221.
✓ Answer: P(two aces in a row without replacement) = 1/221 ≈ 0.00452 ≈ 0.45%
If the card were replaced and the deck reshuffled after each draw (with replacement), the events become independent and P = (4/52)² = 16/2704 ≈ 0.59%. Replacement versus no replacement is the key dependent-vs-independent distinction.
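One way to double-check both results is to enumerate every ordered pair of cards and count the pairs with two aces. The sketch below does that by brute force; the card representation as (rank, suit) tuples is an arbitrary choice for illustration.

```python
from fractions import Fraction
from itertools import permutations, product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]


def both_aces(pair):
    first, second = pair
    return first[0] == "A" and second[0] == "A"


# Without replacement: ordered pairs of two distinct cards (52 * 51 pairs).
pairs_without = list(permutations(deck, 2))
p_without = Fraction(sum(1 for p in pairs_without if both_aces(p)), len(pairs_without))

# With replacement: the same card may appear twice (52 * 52 pairs).
pairs_with = list(product(deck, repeat=2))
p_with = Fraction(sum(1 for p in pairs_with if both_aces(p)), len(pairs_with))

print(len(pairs_without), p_without)  # 2652 1/221
print(len(pairs_with), p_with)        # 2704 1/169
```

The counts 2652 and 2704 are the same denominators that appear in the hand calculation.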
Example 2: Medical Test — False Positives & Bayes' Theorem
This is one of the most important conditional probability applications in data science and medicine. A test that is 99% sensitive and 95% specific can still produce mostly false positives when the tested condition is rare. The reason is the base rate: the prior probability of the disease in the population. This is where Bayes' theorem is essential.
A disease affects 1% of the population. A test for it is 99% sensitive (P(positive | disease) = 0.99) and 95% specific (P(negative | no disease) = 0.95). You test positive. What is the probability you actually have the disease?
Identify prior probabilities. P(disease) = 0.01 and P(no disease) = 0.99.
Identify likelihoods. P(positive | disease) = 0.99 (sensitivity). P(positive | no disease) = 1 − 0.95 = 0.05 (false positive rate).
Calculate the total probability of testing positive — P(positive). Using the law of total probability: P(positive) = P(positive|disease)×P(disease) + P(positive|no disease)×P(no disease) = (0.99)(0.01) + (0.05)(0.99) = 0.0099 + 0.0495 = 0.0594.
Apply Bayes' Theorem. P(disease | positive) = P(positive | disease) × P(disease) / P(positive) = (0.99 × 0.01) / 0.0594 = 0.0099 / 0.0594 ≈ 0.1667.
✓ Answer: Even with a test that is 99% sensitive and 95% specific, a positive result means only a ~16.7% chance you have the disease. The low disease base rate (1%) dominates.
⚠ Why this matters: Ignoring the base rate and concluding "the test is 99% accurate, so I almost certainly have the disease" is a common and dangerous reasoning error. Out of 10,000 people tested: 100 have the disease (99 test positive), 9,900 don't have it (495 test positive falsely). Of 594 total positives, only 99 are true: 99/594 = 16.7%. The contingency table below makes this concrete.
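The four steps of this example can be sketched as a small function. The name `posterior_given_positive` and its parameter names are illustrative assumptions; the inputs are the prevalence, sensitivity, and specificity from the problem statement, held as exact fractions.

```python
from fractions import Fraction


def posterior_given_positive(prevalence, sensitivity, specificity):
    """Return (P(positive), P(disease | positive)) for a two-outcome test."""
    false_positive_rate = 1 - specificity
    # Law of total probability over the two groups: diseased and healthy.
    p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
    # Bayes' theorem.
    return p_positive, sensitivity * prevalence / p_positive


p_positive, p_disease = posterior_given_positive(
    prevalence=Fraction(1, 100),
    sensitivity=Fraction(99, 100),
    specificity=Fraction(95, 100),
)

print(float(p_positive))           # 0.0594
print(p_disease)                   # 1/6
print(round(float(p_disease), 4))  # 0.1667
```

The exact posterior is 1/6, which is where the rounded 16.7% figure comes from.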
Contingency Table: The Medical Test in Numbers
A contingency table (also called a confusion matrix in machine learning) is a 2×2 grid that organizes joint and conditional probabilities visually. It is often the clearest way to work through a Bayes' theorem problem, because it replaces abstract fractions with concrete counts.
Using the medical test above with a population of 10,000 people:
| | Has Disease | No Disease | Row Total |
|---|---|---|---|
| Test Positive | 99 (True Positive) | 495 (False Positive) | 594 |
| Test Negative | 1 (False Negative) | 9,405 (True Negative) | 9,406 |
| Column Total | 100 | 9,900 | 10,000 |
Reading off P(disease | positive): look only at the "Test Positive" row (594 people). Of these, 99 truly have the disease. P(disease | positive) = 99/594 ≈ 16.7%. The contingency table makes the sample space reduction immediate: conditioning on "positive test" restricts attention to just the top row.
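The table's counts can be rebuilt from the three rates with integer arithmetic, which keeps every cell a whole number of people. This is a sketch under the same assumptions as the example (10,000 people, 1% prevalence, 99% sensitivity, 95% specificity); the variable names are illustrative.

```python
from fractions import Fraction

population = 10_000

with_disease = population * 1 // 100           # 1% prevalence
without_disease = population - with_disease

true_positive = with_disease * 99 // 100       # 99% sensitivity
false_negative = with_disease - true_positive
true_negative = without_disease * 95 // 100    # 95% specificity
false_positive = without_disease - true_negative

positives = true_positive + false_positive
negatives = false_negative + true_negative

print("Test Positive", true_positive, false_positive, positives)
print("Test Negative", false_negative, true_negative, negatives)

# Conditioning on a positive test restricts attention to the top row.
p_disease_given_positive = Fraction(true_positive, positives)
print(p_disease_given_positive)  # 1/6
```

Reading a different conditional probability only changes which row or column supplies the denominator, for example `Fraction(true_positive, with_disease)` for sensitivity.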
Example 3: The Monty Hall Problem
You are on a game show. Three doors hide: a car (prize) behind one, goats behind the other two. You pick door 1. The host, who knows what is behind every door, opens door 3 to reveal a goat. He then offers you a switch to door 2. Should you switch?
The host's action is not random — it is conditional on both where the car is and which door you picked. That is what makes this a conditional probability problem.
Initial probabilities. P(car behind door 1) = P(car behind door 2) = P(car behind door 3) = 1/3.
The host opens door 3 (a goat). The host's action is conditional on the car's location. If the car is behind door 2, the host must open door 3. If the car is behind door 1, the host could open door 2 or door 3 (assume equally likely, probability 1/2 each).
Apply Bayes' theorem. Let H₃ = "host opens door 3." P(H₃ | car at door 1) = 1/2. P(H₃ | car at door 2) = 1. P(H₃ | car at door 3) = 0.
P(H₃) = (1/3)(1/2) + (1/3)(1) + (1/3)(0) = 1/6 + 1/3 = 1/2.
Update probabilities via Bayes' theorem.
P(car at door 1 | H₃) = P(H₃ | door 1) × P(door 1) / P(H₃) = (1/2)(1/3) / (1/2) = 1/3.
P(car at door 2 | H₃) = P(H₃ | door 2) × P(door 2) / P(H₃) = (1)(1/3) / (1/2) = 2/3.
✓ Answer: YES — always switch. Staying with door 1 gives a 1/3 chance of winning. Switching to door 2 gives a 2/3 chance. Switching doubles your winning probability.
Why it feels wrong: Most people intuit "two doors left, 50-50 chance." That would be true if the host opened a door at random. But the host's knowledge-driven, conditional action transfers probability from door 1 to door 2. The host's behavior is information — and conditional probability captures that information precisely.
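Both the Bayes calculation and the intuition can be checked in code. The sketch below first repeats the exact update for "you pick door 1, host opens door 3", then runs a seeded simulation. The simulation is an added illustration, not part of the derivation above; it measures the overall win rate of each strategy, which by symmetry matches the door-specific result.

```python
import random
from fractions import Fraction

# Exact Bayes update: player picked door 1, host opened door 3.
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}
likelihood_h3 = {1: Fraction(1, 2), 2: Fraction(1), 3: Fraction(0)}

p_h3 = sum(likelihood_h3[door] * prior[door] for door in prior)
posterior = {door: likelihood_h3[door] * prior[door] / p_h3 for door in prior}

print(p_h3)          # 1/2
print(posterior[1])  # 1/3
print(posterior[2])  # 2/3


def simulate(trials, switch, seed=0):
    """Estimate the win rate of always switching or always staying."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randint(1, 3)
        pick = 1
        # The host opens a goat door that is not the player's pick.
        options = [d for d in (1, 2, 3) if d != pick and d != car]
        opened = rng.choice(options)
        if switch:
            pick = next(d for d in (1, 2, 3) if d != pick and d != opened)
        wins += pick == car
    return wins / trials


print(simulate(20_000, switch=True))   # close to 2/3
print(simulate(20_000, switch=False))  # close to 1/3
```

Changing the host so that he opens one of the two other doors at random (and sometimes reveals the car) would change the likelihoods, which is the case where the 50-50 intuition applies.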
Bayes' Theorem: Formula, Derivation & Application
The Bayes' Theorem Formula
Bayes' theorem — published by the Reverend Thomas Bayes in a posthumous 1763 paper and later formalized by Pierre-Simon Laplace — answers a specific question: "I know P(B|A). How do I find P(A|B)?" It links the forward conditional probability to the reverse.
P(A|B) = P(B|A) × P(A) / P(B)
- P(A|B): posterior probability of A given B (what you want to find)
- P(B|A): likelihood, the probability of observing B if A is true
- P(A): prior probability of A (before seeing evidence B)
- P(B): marginal probability of B (normalizing constant)
Where Bayes' Theorem Comes From
Bayes' theorem is not a separate law — it is a direct consequence of the conditional probability formula applied twice. The derivation takes three lines:
- P(A∩B) = P(A|B) × P(B) [definition of conditional probability]
- P(A∩B) = P(B|A) × P(A) [same formula, reversed: A and B are symmetric in the intersection]
- ∴ P(A|B) × P(B) = P(B|A) × P(A) → P(A|B) = P(B|A)·P(A) / P(B)
The denominator P(B) is computed using the law of total probability: if A and A' (not-A) partition the sample space, then P(B) = P(B|A)·P(A) + P(B|A')·P(A'). This is the sum of all weighted paths through the tree diagram that result in B.
Bayes' Theorem in Plain English: Updating Beliefs
The most powerful way to read Bayes' theorem is as a belief-updating machine. You start with a prior belief P(A) — your estimate of some hypothesis A before seeing any evidence. You then observe evidence B. The likelihood P(B|A) tells you how probable that evidence would be if A were true. Bayes' theorem combines these to give your posterior belief P(A|B) — your updated, rational estimate of A given the evidence.
Posterior ∝ Likelihood × Prior. In words: your updated belief is proportional to how likely the evidence is under your hypothesis, multiplied by how plausible the hypothesis was before you saw the evidence. High prior probability + high likelihood → very high posterior. Low prior probability (rare disease, rare event) can overwhelm even a high likelihood — exactly the false-positive paradox in Example 2.
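To see the updating idea in motion, the sketch below feeds the posterior from one positive test back in as the prior for a second positive test, using the numbers from Example 2. It assumes the two test results are conditionally independent given the patient's true disease status; that assumption is added here for illustration and is not something the example states.

```python
from fractions import Fraction


def update(prior, sensitivity, false_positive_rate):
    """One Bayes update after a positive result: returns P(disease | positive)."""
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence


sensitivity = Fraction(99, 100)
false_positive_rate = Fraction(5, 100)

belief = Fraction(1, 100)  # prior: 1% base rate
history = [belief]
for _ in range(2):         # two positive results in a row
    belief = update(belief, sensitivity, false_positive_rate)
    history.append(belief)

print([round(float(b), 4) for b in history])  # [0.01, 0.1667, 0.7984]
```

Each posterior becomes the next prior, so the low base rate matters less with every additional piece of evidence.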
Tree Diagrams for Conditional Probability
A probability tree diagram is a branching structure where every fork represents a conditional event. The probability along each branch is a conditional probability given the path taken to reach that fork. The probability of any terminal outcome (leaf) equals the product of all branch probabilities along that path — a direct application of the multiplication rule.
Tree Diagram: Medical Test Example (Population of 10,000)
Reading the tree: multiply probabilities along each path. The sum of all four leaf probabilities = 0.0099 + 0.0001 + 0.0495 + 0.9405 = 1.0 ✓
- Step 1: Draw the first event's branches from a single starting node. Label each branch with its probability.
- Step 2: From each branch endpoint, draw the next conditional event's branches, labeling each with the conditional probability given the path to that point.
- Step 3: To find the probability of any complete path (leaf), multiply all branch probabilities along the path.
- Step 4: Verify that all leaf probabilities sum to 1.0.
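The tree for the medical test can be represented as nested dictionaries, with each leaf computed as the product of the branch probabilities on its path. This is a minimal sketch; the dictionary layout is an illustrative choice.

```python
from fractions import Fraction

# First fork: disease status. Second fork: test result given that status.
first_level = {
    "disease": Fraction(1, 100),
    "no disease": Fraction(99, 100),
}
second_level = {
    "disease": {"positive": Fraction(99, 100), "negative": Fraction(1, 100)},
    "no disease": {"positive": Fraction(5, 100), "negative": Fraction(95, 100)},
}

# Multiply along each path to get the leaf (joint) probabilities.
leaves = {
    (status, result): first_level[status] * p_result
    for status, branches in second_level.items()
    for result, p_result in branches.items()
}

for path, probability in leaves.items():
    print(path, float(probability))

print(sum(leaves.values()))  # 1
```

Summing the two leaves that end in "positive" gives P(positive) = 0.0594, the denominator used in Example 2.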
P(A|B) vs. P(B|A): The Prosecutor's Fallacy
One of the most consequential errors in applied conditional probability is treating P(A|B) and P(B|A) as interchangeable. They are emphatically not. This error is so common — and so dangerous in legal and medical reasoning — that it has its own name: the Prosecutor's Fallacy.
The error: concluding that P(innocent | evidence) is small because P(evidence | innocent) is small. These are not the same. P(matching DNA | innocent) might be 1 in 1,000,000 — but P(innocent | matching DNA) depends on how many people share that DNA profile, the prior probability of guilt, and all other evidence. Confusing the two has contributed to wrongful convictions. Correct analysis always requires Bayes' theorem.
| Concept | P(A|B) — A given B | P(B|A) — B given A |
|---|---|---|
| What it means | Probability of A, given B has occurred | Probability of B, given A has occurred |
| Medical test | P(disease | positive test) = 16.7% (from Example 2) | P(positive test | disease) = 99% (sensitivity) |
| Legal context | P(guilty | DNA match) — what we care about | P(DNA match | guilty) — not directly usable |
| Weather | P(rain | dark clouds) — updated forecast | P(dark clouds | rain) — how often rain comes with clouds |
| Relationship | P(A|B) = P(B|A) × P(A) / P(B) [Bayes' Theorem] | |
| Equal when? | Only when P(A) = P(B), or in the trivial case P(A∩B) = 0, where both are zero | |
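A small numeric sketch shows how far apart the two conditionals can be in the DNA setting. Only the 1-in-1,000,000 match probability comes from the text above. The pool of 10,000,000 possible sources, the single guilty person, the uniform prior over the pool, and the certain match for the guilty person are all hypothetical assumptions chosen for illustration.

```python
from fractions import Fraction


def posterior_guilt(pool_size, p_match_if_innocent, p_match_if_guilty=Fraction(1)):
    """P(guilty | match) for one person from a pool with exactly one guilty member."""
    prior_guilty = Fraction(1, pool_size)  # hypothetical uniform prior
    prior_innocent = 1 - prior_guilty
    p_match = (p_match_if_guilty * prior_guilty
               + p_match_if_innocent * prior_innocent)
    return p_match_if_guilty * prior_guilty / p_match


p_match_if_innocent = Fraction(1, 1_000_000)
p_guilty = posterior_guilt(10_000_000, p_match_if_innocent)
p_innocent = 1 - p_guilty

print(float(p_match_if_innocent))   # 1e-06
print(round(float(p_innocent), 3))  # 0.909
```

Under these assumptions roughly ten innocent people in the pool would also match, so P(innocent | match) is about 0.91 even though P(match | innocent) is one in a million. Other evidence would change the prior and therefore the posterior.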
Interactive Conditional Probability Calculator
Calculate P(A|B) when you know the joint probability P(A∩B) and the condition probability P(B).
Apply Bayes' Theorem: P(A|B) = P(B|A) × P(A) / P(B). Enter the prior, likelihood, and marginal probability.
Calculate the joint probability P(A∩B) for dependent events using P(A∩B) = P(A) × P(B|A).
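The three calculations described above can be sketched as plain Python functions. The function names and the input checks are assumptions made for this sketch and do not describe how any particular calculator is implemented.

```python
def _check_probability(name, value):
    """Reject inputs that are not valid probabilities."""
    if not 0 <= value <= 1:
        raise ValueError(f"{name} must be between 0 and 1, got {value}")


def conditional_from_joint(p_joint, p_b):
    """P(A|B) = P(A and B) / P(B)."""
    _check_probability("P(A and B)", p_joint)
    _check_probability("P(B)", p_b)
    if p_b == 0:
        raise ValueError("P(B) must be greater than 0")
    if p_joint > p_b:
        raise ValueError("P(A and B) cannot exceed P(B)")
    return p_joint / p_b


def bayes(p_b_given_a, p_a, p_b):
    """P(A|B) = P(B|A) * P(A) / P(B)."""
    for name, value in (("P(B|A)", p_b_given_a), ("P(A)", p_a), ("P(B)", p_b)):
        _check_probability(name, value)
    if p_b == 0:
        raise ValueError("P(B) must be greater than 0")
    result = p_b_given_a * p_a / p_b
    if result > 1 + 1e-12:
        raise ValueError("inputs are inconsistent: the result exceeds 1")
    return result


def joint_from_conditional(p_a, p_b_given_a):
    """P(A and B) = P(A) * P(B|A)."""
    _check_probability("P(A)", p_a)
    _check_probability("P(B|A)", p_b_given_a)
    return p_a * p_b_given_a


print(round(conditional_from_joint(0.32, 0.40), 4))     # 0.8
print(round(bayes(0.99, 0.01, 0.0594), 4))              # 0.1667
print(round(joint_from_conditional(4 / 52, 3 / 51), 5))  # 0.00452
```

The three calls reuse the numbers from the student, medical test, and card examples, so the outputs can be compared against the worked answers.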
Conditional Probability & Bayes' Theorem Cheat Sheet
The table below collects the core formulas for conditional probability. It is designed to be printable and copy-pasteable: each formula is written both in standard notation and in plain English.
| Formula Name | Notation | Plain English | When to Use |
|---|---|---|---|
| Conditional Probability | P(A|B) = P(A∩B) / P(B) | Probability of A given B = joint probability of A and B divided by probability of B | Any time you know a condition has occurred and want to update a probability |
| Multiplication Rule (Dependent) | P(A∩B) = P(A) × P(B|A) | Joint probability of A and B = probability of A times probability of B given A | Drawing without replacement; sequential dependent trials |
| Multiplication Rule (Independent) | P(A∩B) = P(A) × P(B) | When A and B don't affect each other, multiply their individual probabilities | Coin flips, dice rolls, drawing with replacement |
| Independence Test | P(A|B) = P(A) | If knowing B occurred doesn't change P(A), the events are independent | Verifying whether two events influence each other |
| Bayes' Theorem | P(A|B) = P(B|A)·P(A) / P(B) | Posterior = likelihood × prior divided by marginal probability of evidence | Reversing a known conditional probability; updating beliefs with evidence |
| Law of Total Probability | P(B) = P(B|A)·P(A) + P(B|A')·P(A') | The total probability of B = the sum over every path that leads to B (in Example 2: with the disease and without it) | Computing the denominator in Bayes' theorem when P(B) is not directly known |
| Complement Rule | P(A'|B) = 1 − P(A|B) | Probability of A NOT occurring given B = 1 minus probability of A given B | Finding probability of non-event within a conditional context |
Independent vs. Dependent Events
How to Test for Independence
Two events A and B are independent if and only if P(A ∩ B) = P(A) × P(B). When P(B) > 0, this is equivalent to P(A|B) = P(A). If the product rule holds exactly, knowledge of B provides zero information about A: restricting the sample space to B leaves A's probability unchanged.
Events are dependent when P(A|B) ≠ P(A). Drawing cards without replacement is the canonical example: after drawing an ace, the probability of drawing another ace changes because the sample space has changed physically.
| Feature | Independent Events | Dependent Events |
|---|---|---|
| Definition | P(A|B) = P(A) — knowing B gives no info about A | P(A|B) ≠ P(A) — knowing B changes the probability of A |
| Joint probability | P(A∩B) = P(A) × P(B) | P(A∩B) = P(A) × P(B|A) |
| Typical examples | Coin flips, dice rolls, drawing with replacement | Drawing without replacement, disease & test, weather & rain |
| Physical reason | Outcomes do not share a physical mechanism | First outcome changes the sample space or state for subsequent outcomes |
| Test formula | P(A∩B) = P(A)·P(B)? If YES → independent | P(A∩B) ≠ P(A)·P(B)? If YES → dependent |
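The test formula in the last row can be applied mechanically to any finite sample space with equally likely outcomes. The sketch below uses two fair dice; the dice and the three events are illustrative assumptions, and exact fractions are used so the equality check is not disturbed by rounding.

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes


def probability(event):
    return Fraction(sum(1 for o in space if event(o)), len(space))


def are_independent(event_a, event_b):
    """True when P(A and B) equals P(A) * P(B) exactly."""
    p_joint = probability(lambda o: event_a(o) and event_b(o))
    return p_joint == probability(event_a) * probability(event_b)


def first_is_even(outcome):
    return outcome[0] % 2 == 0


def sum_is_seven(outcome):
    return sum(outcome) == 7


def sum_is_eight(outcome):
    return sum(outcome) == 8


print(are_independent(first_is_even, sum_is_seven))  # True
print(are_independent(first_is_even, sum_is_eight))  # False
```

The pair of results shows that independence is a property of the numbers, not of how related two events look: a sum of 7 is equally likely whatever the first die shows, while a sum of 8 is not.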
Entity & Formula Glossary
The table below defines the key terms in conditional probability using both formal notation and plain English, with an example value for each.
| Term | Notation | Definition | Example Value |
|---|---|---|---|
| Conditional Probability | P(A|B) | The probability that event A occurs given that event B has already occurred. Read: "probability of A given B." The vertical bar | means "given that." | P(Ace on 2nd draw | Ace on 1st) = 3/51 ≈ 0.059 |
| Joint Probability | P(A∩B) | The probability that both events A and B occur simultaneously. The intersection (∩) symbol means "and both occur." | P(two aces without replacement) = 4/52 × 3/51 = 1/221 |
| Prior Probability | P(A) | In Bayes' theorem, the prior is your belief about the probability of hypothesis A before observing any new evidence. | P(disease) = 0.01 (1% base rate in population) |
| Posterior Probability | P(A|B) | In Bayes' theorem, the posterior is your updated belief about A after incorporating evidence B. Posterior = (likelihood × prior) / evidence. | P(disease | positive test) = 0.167 (updated after seeing evidence) |
| Likelihood | P(B|A) | The probability of observing the evidence B if hypothesis A is true. The likelihood feeds into Bayes' theorem as the numerator factor. | P(positive test | disease) = 0.99 (test sensitivity) |
| Marginal Probability | P(B) | The total probability of the evidence B across all hypotheses, computed via the law of total probability. It normalizes the Bayes' theorem calculation. | P(positive test) = 0.0594 (from Example 2) |
| Intersection Symbol | ∩ | The "and" operator in set notation. A ∩ B is the set of outcomes where both A and B occur. P(A∩B) is the joint probability. | P(A∩B) = P(A|B) × P(B) |
| Given Symbol | | | The conditional operator, read "given that." In P(A|B), the event to the right of | is the known condition; the event to the left is what we estimate. | P(rain | clouds) = "probability of rain given clouds are present" |
| Law of Total Probability | P(B) = Σ P(B|Aᵢ)·P(Aᵢ) | The total probability of event B equals the sum of the conditional probability of B given each partition Aᵢ, multiplied by the probability of each Aᵢ. | P(positive) = P(+|D)·P(D) + P(+|D')·P(D') = 0.0594 |
| Sensitivity | P(positive | disease) | In medical testing, sensitivity is the probability a test correctly identifies a true case. Also called the true positive rate. | Sensitivity = 0.99 in Example 2 |
| False Positive Rate | P(positive | no disease) | The probability a test incorrectly flags a healthy person as positive. Equal to 1 − specificity. | False positive rate = 0.05 in Example 2 |
Common Mistakes & How to Avoid Them
| The Mistake | Wrong Reasoning | Correct Approach |
|---|---|---|
| Reversing the condition (Prosecutor's Fallacy) | P(evidence | innocent) is tiny → therefore P(innocent | evidence) is tiny | Use Bayes' theorem: P(innocent | evidence) = P(evidence | innocent) × P(innocent) / P(evidence) |
| Ignoring the base rate | Test is 99% accurate → positive test means 99% chance of disease | Incorporate the prior P(disease). When the disease is rare (1%), the low base rate outweighs even a highly sensitive test |
| Multiplying independent probabilities for dependent events | P(two aces) = (4/52) × (4/52) for drawing without replacement | Use P(A∩B) = P(A) × P(B|A) = (4/52) × (3/51). After one ace is drawn, only 3 remain in 51 cards |
| Dividing by zero | Computing P(A|B) when B is impossible (P(B) = 0) | P(A|B) is undefined when P(B) = 0. You cannot condition on an event that cannot happen |
| Confusing conditional with joint probability | P(A|B) = P(A∩B) without dividing by P(B) | P(A|B) = P(A∩B) / P(B). The joint probability must be rescaled by the condition's probability |
Next Topics After Conditional Probability
Conditional probability is the gateway to several advanced areas of statistics and machine learning. The next logical topics from Statistics Fundamentals are:
Random Variables
Conditional probability extends naturally to conditional expectation E[X|Y] and conditional distributions — the foundation of regression analysis.
Naive Bayes Classifier
The workhorse machine learning algorithm for text classification. Applies Bayes' theorem under the "naive" assumption of conditional independence among features.
Binomial Distribution
Models the number of successes in n independent Bernoulli trials with a fixed success probability. Independence is what lets the multiplication rule reduce each outcome's probability to a simple product in the binomial PMF.
Hypothesis Testing
P-values are conditional probabilities: P(observed data or more extreme | H₀ is true). Understanding conditional probability is essential for interpreting statistical tests.
Frequently Asked Questions
What is conditional probability in simple terms?
Conditional probability is the updated probability of an event once you have new information. It answers: "given that I already know B happened, what is the probability A also happens (or will happen)?" The formula P(A|B) = P(A∩B)/P(B) formalizes the sample space reduction — you throw out all outcomes where B didn't occur and recalculate A's probability within the remaining outcomes.
Why does P(A|B) not equal P(B|A)?
Because the condition (denominator) is different. P(A|B) divides by P(B), while P(B|A) divides by P(A). Both share the numerator P(A∩B), so unless P(A) = P(B), or the events never occur together and both values are 0, the two expressions give different values. Example: P(test positive | disease) = 0.99, but P(disease | test positive) = 0.167. The quantities are related by Bayes' theorem: P(A|B) = P(B|A) × P(A) / P(B).
What happens to conditional probability when events are independent?
When A and B are independent, P(A|B) = P(A). Knowing B occurred provides no new information about A, so the condition has no effect. Mathematically: P(A∩B) = P(A)·P(B) for independent events, and substituting into P(A|B) = P(A∩B)/P(B) gives P(A|B) = P(A)·P(B)/P(B) = P(A). This is the formal definition and test for independence.
How is conditional probability used in machine learning?
Conditional probability is foundational to machine learning. The Naive Bayes classifier computes P(class | features) using Bayes' theorem. Logistic regression models P(Y=1 | X). Hidden Markov models use chains of conditional probabilities P(state | previous state). In deep learning, autoregressive language models predict P(next word | all previous words). Many generative models can be viewed as learned conditional probability distributions.
What is the law of total probability and when is it used?
The law of total probability states that if events A₁, A₂, …, Aₙ partition the sample space (they are mutually exclusive and collectively exhaustive), then P(B) = P(B|A₁)·P(A₁) + P(B|A₂)·P(A₂) + … + P(B|Aₙ)·P(Aₙ). It is used primarily to compute the marginal probability P(B) that serves as the denominator in Bayes' theorem, whenever P(B) is not directly known but the conditional probabilities given each partition are.
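For a partition with any number of parts, the law is a single weighted sum. The sketch below assumes the priors and likelihoods are passed as dictionaries keyed by hypothesis name and held as exact fractions; the function name and that layout are illustrative choices.

```python
from fractions import Fraction


def total_probability(priors, likelihoods):
    """P(B) = sum of P(B|A_i) * P(A_i) over a partition A_1..A_n.

    priors:      {name: P(A_i)}, must sum to exactly 1 (use Fraction inputs)
    likelihoods: {name: P(B|A_i)}, same keys as priors
    """
    if set(priors) != set(likelihoods):
        raise ValueError("priors and likelihoods must cover the same partition")
    if sum(priors.values()) != 1:
        raise ValueError("the partition's probabilities must sum to 1")
    return sum(likelihoods[name] * priors[name] for name in priors)


# Two-part partition: the medical test from Example 2.
p_positive = total_probability(
    priors={"disease": Fraction(1, 100), "no disease": Fraction(99, 100)},
    likelihoods={"disease": Fraction(99, 100), "no disease": Fraction(5, 100)},
)

# Three-part partition: where the car is in the Monty Hall problem.
p_host_opens_3 = total_probability(
    priors={"door 1": Fraction(1, 3), "door 2": Fraction(1, 3), "door 3": Fraction(1, 3)},
    likelihoods={"door 1": Fraction(1, 2), "door 2": Fraction(1), "door 3": Fraction(0)},
)

print(float(p_positive))  # 0.0594
print(p_host_opens_3)     # 1/2
```

The exact check that the priors sum to 1 relies on `Fraction` inputs; with floats a tolerance-based comparison would be the safer choice.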