What is the prior probability formula?

In Bayes' Theorem, P(H|D) = [P(D|H) × P(H)] / P(D), the prior probability is simply P(H) — the probability of the hypothesis H before observing data D. It is the mathematical starting point of every Bayesian analysis.

What are informative and noninformative priors?

An informative prior encodes specific knowledge about a parameter (e.g., a tight normal distribution around a historical average). A noninformative (flat or diffuse) prior assigns roughly equal probability across all parameter values, letting the data drive the result. A weakly informative prior sits between these, ruling out implausible values while remaining open to a wide range.

How is prior probability used in Bayes' theorem?

Bayes' Theorem multiplies the prior probability P(H) by the likelihood P(D|H) — the probability of seeing the observed data given the hypothesis — then normalizes by the total probability of the data P(D). The result is the posterior probability P(H|D): updated belief after seeing the evidence.

Prior Probability: Bayesian Statistics Formula, Guide & Examples (2026)

Q: What is prior probability?

Prior probability is the initial probability assigned to a hypothesis or event before any new data is gathered. In Bayesian inference, it represents existing knowledge, historical data, or subjective beliefs about how likely an outcome is before updating with evidence.

Q: What is the difference between prior and posterior probability?

Prior probability P(H) represents belief about a hypothesis before seeing data. Posterior probability P(H|D) represents updated belief after combining the prior with the observed data's likelihood via Bayes' Theorem. The posterior from one analysis becomes the prior for the next.

What Is Prior Probability? (Definition)

Definition — Prior Probability

Prior probability is the probability assigned to a hypothesis or event before observing new data. Written P(H), it captures what is already known — or assumed — about how likely a particular outcome is, based on existing evidence, theory, or subjective judgment, before any experiment takes place.

P(H) — probability of hypothesis H before observing data

The word "prior" simply means "before." It is the mathematical counterpart to the question you ask yourself at the start of any investigation: What do I already know, and how confident am I? That answer, translated into a number between 0 and 1, is the prior probability.

Prior probability is the foundational input of Bayes' theorem. Without a prior, Bayesian updating cannot begin. The prior does not need to be perfect or free of subjectivity — it just needs to represent the honest state of knowledge before the data arrives. When that data is collected, the prior is mathematically combined with the likelihood of the data to produce a posterior probability: an updated, data-informed belief.

This framework dates to Thomas Bayes, an eighteenth-century English minister and mathematician whose unpublished work was presented posthumously in 1763. Pierre-Simon Laplace independently developed and generalized the same ideas, beginning with a memoir published in 1774. Today, Bayesian reasoning underlies everything from medical diagnostic tests and spam filters to modern neural network training and conditional probability models in finance.

⚡ Quick Reference — Prior Probability Key Facts

Symbol: P(H) — the probability of hypothesis H before seeing data
Range: 0 to 1, where 0 = impossible and 1 = certain
Role: The mathematical starting point of every Bayesian analysis
Sources: Historical data, expert knowledge, theory, or a flat assumption of ignorance
After updating: Prior × Likelihood → (normalized) → Posterior probability P(H|D)
Iterative use: Today's posterior becomes tomorrow's prior as new data arrives

Prior Probability at a Glance

🔎

What It Represents

Initial degree of belief before new evidence. The mathematical launchpad of Bayesian analysis.

🛠️

Core Relationship

Prior × Likelihood (normalized by evidence) = Posterior probability

🔄

Iterative Nature

Each posterior becomes the next prior when fresh data arrives — Bayesian updating is continuous.

🎯

Why It Matters

Integrates domain expertise into statistical models and stabilizes estimates with small samples.

The Prior Probability Formula in Bayes' Theorem

Prior probability does not stand alone — it is one of four interlocking quantities in Bayes' theorem. Understanding the full equation shows exactly how a prior interacts with data to produce an updated belief.

Bayes' Theorem — Full Form

P(H|D) = [ P(D|H) × P(H) ] / P(D)

P(H) = Prior probability | P(D|H) = Likelihood | P(D) = Marginal likelihood (evidence) | P(H|D) = Posterior probability

Each component has a specific interpretation:

Term	Notation	Plain-English Meaning	Role in the Equation
Prior Probability	P(H)	How likely is the hypothesis before we see any data?	Starting point: what you know going in
Likelihood	P(D|H)	If the hypothesis is true, how probable is this specific data?	Evidence weight: how well the data fits the hypothesis
Marginal Likelihood	P(D)	What is the total probability of seeing this data across all hypotheses?	Normalizing constant: keeps the posterior between 0 and 1
Posterior Probability	P(H|D)	After seeing the data, how likely is the hypothesis now?	Output: the updated, data-informed belief

Calculating the Marginal Likelihood P(D)

When there are two mutually exclusive hypotheses (H and its complement ¬H), the denominator expands as:

Marginal Likelihood — Two-Hypothesis Case

P(D) = P(D|H) × P(H) + P(D|¬H) × P(¬H)

P(¬H) = 1 − P(H) | P(D|¬H) = false-positive rate

This denominator takes the same form whether you are updating beliefs about a medical diagnosis, a manufacturing defect, or a classification label in a machine learning model; only the numerical inputs change. It normalizes the numerator so that the posterior probabilities sum to one across all hypotheses.
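
As a rough illustration, the two-hypothesis expansion generalizes to any number of mutually exclusive, exhaustive hypotheses by summing over all of them. The sketch below is a minimal version of that sum; the function name `marginal_likelihood` and the three-hypothesis numbers are hypothetical and chosen only for illustration.

```python
def marginal_likelihood(priors, likelihoods):
    """Return P(D) = sum over hypotheses of P(D|H_i) * P(H_i)."""
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1 over the hypothesis space")
    return sum(p * l for p, l in zip(priors, likelihoods))


# Hypothetical three-hypothesis case (illustrative numbers only)
priors = [0.5, 0.3, 0.2]          # P(H1), P(H2), P(H3)
likelihoods = [0.10, 0.40, 0.80]  # P(D|H1), P(D|H2), P(D|H3)

p_d = marginal_likelihood(priors, likelihoods)

# Dividing each numerator by P(D) yields posteriors that sum to one
posteriors = [p * l / p_d for p, l in zip(priors, likelihoods)]
print(round(p_d, 4))
print([round(x, 4) for x in posteriors])
```

Note that the hypothesis with the smallest prior (H3) ends up with the largest posterior here, because its likelihood is much higher than the others.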

📋

Featured Snippet — Prior Probability Formula

In Bayes' theorem P(H|D) = [P(D|H) × P(H)] / P(D), the prior probability is P(H) — the probability of the hypothesis before any data is observed. It is multiplied by the likelihood P(D|H) and divided by the total probability of the data P(D) to yield the posterior probability P(H|D).

The 6-Step Bayesian Updating Framework

Bayesian updating follows a consistent sequence regardless of the domain. These steps apply whether you are running a clinical trial, training a text classifier, or adjusting a financial risk model. For a broader look at the underlying probability rules this process depends on, see the probability rules guide on Statistics Fundamentals.

Define the Hypothesis Space

Write out all mutually exclusive, collectively exhaustive hypotheses about the parameter or event. For a binary case: H (disease present) and ¬H (disease absent). For a continuous parameter, the "hypothesis space" is a full prior distribution over possible values.

Assign the Prior Probability P(H)

Set P(H) using historical base rates, domain expertise, or a noninformative assumption. This is the most consequential step in Bayesian analysis — a poorly chosen prior can distort the posterior, especially with small samples. When in doubt, use a weakly informative prior rather than a flat uniform.

Collect Empirical Data D

Run an experiment, take a measurement, or query a dataset. The data is the evidence that will move the prior toward a posterior. More data generally means less sensitivity to the choice of prior — a useful property called "washing out the prior."

Calculate the Likelihood P(D|H)

For each hypothesis, determine how probable the observed data would be if that hypothesis were true. In a diagnostic test, this is the sensitivity (true positive rate). In a coin flip problem, it is the binomial probability of getting the observed number of heads given an assumed bias.

Compute the Marginal Likelihood P(D)

Calculate the total probability of observing data D across all hypotheses: P(D) = P(D|H)×P(H) + P(D|¬H)×P(¬H). This normalizes the result so the posterior is a valid probability between 0 and 1. For more than two hypotheses, sum over all of them.

Apply Bayes' Theorem to Get the Posterior

Divide the numerator P(D|H)×P(H) by the marginal likelihood P(D). The result is P(H|D) — the posterior probability. This posterior then serves as the prior for the next round of data collection, creating the iterative cycle that characterizes Bayesian inference. See the detailed guide to Bayes' theorem for the full mathematical treatment.
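
The six steps above can be sketched as a single function applied repeatedly, which also shows how each posterior is reused as the next prior. This is a minimal sketch for a discrete hypothesis space; the function name `bayes_update`, the hypothesis labels, and all numbers are assumptions made for illustration.

```python
def bayes_update(prior, likelihood):
    """One round of Bayesian updating over a discrete hypothesis space.

    prior:      dict mapping hypothesis -> P(H)
    likelihood: dict mapping hypothesis -> P(D|H) for the observed data
    Returns a dict mapping hypothesis -> P(H|D).
    """
    # Step 5: marginal likelihood P(D), summed over all hypotheses
    evidence = sum(likelihood[h] * prior[h] for h in prior)
    if evidence == 0:
        raise ValueError("observed data has zero probability under every hypothesis")
    # Step 6: posterior = likelihood * prior / evidence
    return {h: likelihood[h] * prior[h] / evidence for h in prior}


# Steps 1-2: hypothesis space and prior (hypothetical numbers)
belief = {"H": 0.30, "not_H": 0.70}

# Steps 3-4: each observation arrives with its likelihood under each hypothesis
observations = [
    {"H": 0.9, "not_H": 0.2},
    {"H": 0.6, "not_H": 0.5},
]

for lik in observations:
    belief = bayes_update(belief, lik)  # the posterior becomes the next prior
    print({h: round(p, 4) for h, p in belief.items()})
```

Updating one observation at a time gives the same final posterior as a single update with the product of the likelihoods, provided the observations are conditionally independent given each hypothesis.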

Types of Prior Probability Distributions

Choosing a prior is one of the most consequential decisions in Bayesian analysis. The three main categories are defined by how much domain knowledge they encode.

Informative Prior

Contains specific knowledge about a parameter. Used when historical data or expert consensus gives a reliable estimate of where the parameter should fall.

θ ~ N(μ₀, σ₀²) with low σ₀

Noninformative (Flat) Prior

Assigns equal probability across the parameter space, expressing maximum uncertainty. Lets the data drive the posterior almost entirely. Also called a uniform or diffuse prior.

p(θ) ∝ constant

Weakly Informative Prior

Rules out implausible parameter values without strongly constraining the result. A good default choice when you have some domain knowledge but want the data to matter. Wide Cauchy or half-normal distributions are common choices.

θ ~ Cauchy(0, 2.5)
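
To make the three categories concrete, the sketch below applies one prior of each kind to the same data. It swaps in Beta priors on a probability (rather than the Normal and Cauchy forms shown above) because the Beta family has a closed-form update; the specific parameter values and the data are assumptions chosen for illustration.

```python
def beta_posterior_mean(alpha, beta, successes, failures):
    """Posterior mean of a Beta(alpha, beta) prior after binomial data."""
    return (alpha + successes) / (alpha + beta + successes + failures)


# Hypothetical data: 7 successes in 10 trials (sample proportion 0.70)
successes, failures = 7, 3

# Illustrative priors on a probability; the parameter choices are assumptions
priors = {
    "informative Beta(50, 50)": (50, 50),     # tightly centered on 0.5
    "flat Beta(1, 1)": (1, 1),                # uniform on [0, 1]
    "weakly informative Beta(2, 2)": (2, 2),  # mild pull toward 0.5
}

posterior_means = {
    name: beta_posterior_mean(a, b, successes, failures)
    for name, (a, b) in priors.items()
}
for name, mean in posterior_means.items():
    print(f"{name}: posterior mean = {mean:.3f}")
```

With only ten observations, the informative prior keeps the posterior mean near 0.5, while the flat and weakly informative priors let it move most of the way toward the sample proportion.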

Subjective vs. Objective Priors

A long-standing debate in Bayesian statistics concerns whether priors should reflect personal beliefs or be derived mechanically from the structure of the problem.

Dimension	Subjective Prior	Objective Prior
Source	Personal or expert judgment, quantified as a probability	Derived from mathematical invariance principles (e.g., Jeffreys' prior)
Strength	Can be tightly informative if expert knowledge is strong	Typically diffuse; designed to minimize influence on the posterior
Criticism	Two analysts may choose different priors and reach different posteriors	No single "objective" prior exists for all situations; invariance criteria differ
Best use	Clinical trials with established historical rates, industrial quality control	Exploratory analyses where the researcher wants the data to speak freely
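
One motivation for invariance-based priors is that a prior that looks flat in one parameterization is not flat in another. The sketch below illustrates this for θ ~ Uniform(0, 1) viewed on the log-odds scale; the helper names are hypothetical, and the calculation relies only on the fact that P(θ ≤ t) = t for a uniform variable.

```python
import math


def sigmoid(x):
    """Inverse of the log-odds (logit) transform."""
    return 1.0 / (1.0 + math.exp(-x))


def mass_in_logit_interval(low, high):
    """Probability that logit(theta) falls in [low, high] when theta ~ Uniform(0, 1).

    For uniform theta, P(theta <= t) = t, so the mass is sigmoid(high) - sigmoid(low).
    """
    return sigmoid(high) - sigmoid(low)


# Two intervals of equal width on the log-odds scale
central = mass_in_logit_interval(-1.0, 1.0)
offset = mass_in_logit_interval(1.0, 3.0)
print(f"logit in [-1, 1]: {central:.3f}")
print(f"logit in [ 1, 3]: {offset:.3f}")
```

The two intervals have the same width on the log-odds scale but receive very different prior mass, so a prior that is uniform on θ is informative about log-odds.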

Conjugate Priors

When the prior distribution and the likelihood function belong to families that produce a posterior in the same family as the prior, the prior is called conjugate. This mathematical convenience — the posterior has a known, closed-form distribution — was critical before the advent of modern computational methods like Markov Chain Monte Carlo (MCMC).

Likelihood Distribution	Conjugate Prior	Resulting Posterior	Typical Application
Binomial	Beta(α, β)	Beta(α + successes, β + failures)	Coin bias estimation, A/B testing
Poisson	Gamma(α, β)	Gamma(α + counts, β + n)	Event rate modeling, queueing
Normal (known variance)	Normal(μ₀, σ₀²)	Normal (updated mean and variance)	Height, measurement error, regression coefficients
Exponential	Gamma(α, β)	Gamma(α + n, β + Σxᵢ)	Survival analysis, failure-time modeling
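
The update rules in the table reduce to simple parameter arithmetic. The sketch below encodes three of them, assuming the Gamma distribution is parameterized by shape α and rate β (so its mean is α/β); the function names and data values are hypothetical.

```python
def update_beta_binomial(alpha, beta, successes, failures):
    """Beta(alpha, beta) prior + binomial data -> Beta posterior parameters."""
    return alpha + successes, beta + failures


def update_gamma_poisson(alpha, beta, counts):
    """Gamma(alpha, rate=beta) prior + Poisson counts -> Gamma posterior parameters."""
    return alpha + sum(counts), beta + len(counts)


def update_gamma_exponential(alpha, beta, waits):
    """Gamma(alpha, rate=beta) prior + exponential waiting times -> Gamma posterior."""
    return alpha + len(waits), beta + sum(waits)


# Hypothetical data, for illustration only
print(update_beta_binomial(2, 2, successes=7, failures=3))    # (9, 5)
print(update_gamma_poisson(3, 1, counts=[4, 2, 5]))           # (14, 4)
print(update_gamma_exponential(2, 1, waits=[0.5, 1.5, 2.0]))  # (5, 5.0)
```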

Prior Probability vs. Posterior Probability

The prior and posterior are two snapshots of the same belief — one before the evidence and one after. Understanding how they differ clarifies why the choice of prior matters, and when it matters most.

Characteristic	Prior Probability P(H)	Posterior Probability P(H|D)
Timing	Before data is observed	After data is observed
Notation	P(H)	P(H|D)
Information basis	Historical data, theory, or expert judgment	Prior + likelihood of the observed data
Sensitivity to sample size	Independent of the current sample	More data = posterior increasingly dominated by the likelihood
Use in next analysis	Stands alone as the starting point	Becomes the prior in the next round of Bayesian updating
Effect of a flat prior	Equal weight on all parameter values	Posterior ≈ normalized likelihood (data-driven)
Effect of a strong prior	Concentrated on a narrow range	Posterior pulled toward prior, especially with small n

A key insight: as the sample size grows, posteriors computed from different priors converge toward one another and concentrate around the true parameter value, as long as each prior assigns nonzero probability to a neighborhood of that value. This property makes Bayesian methods robust to reasonable prior misspecification when data is plentiful. For related background, see the guide to the law of large numbers.
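
The effect of sample size on prior sensitivity can be checked numerically. The sketch below compares posterior means under two deliberately different Beta priors while the sample proportion is held at 0.70; the priors, the proportion, and the sample sizes are all assumptions for illustration.

```python
def posterior_mean(alpha, beta, n, proportion):
    """Posterior mean for a Beta(alpha, beta) prior after n Bernoulli trials."""
    successes = proportion * n
    return (alpha + successes) / (alpha + beta + n)


# Two deliberately different priors (hypothetical choices)
skeptical = (20, 20)  # strong belief that theta is near 0.5
flat = (1, 1)         # uniform on [0, 1]

gaps = []
for n in (10, 100, 1000, 10000):
    m1 = posterior_mean(*skeptical, n, 0.7)
    m2 = posterior_mean(*flat, n, 0.7)
    gaps.append(abs(m1 - m2))
    print(f"n={n:>5}: skeptical={m1:.4f}  flat={m2:.4f}  gap={gaps[-1]:.4f}")
```

The gap between the two posterior means shrinks steadily as n grows, which is the behavior usually described as the data overwhelming the prior.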

Worked Examples: Prior Probability in Practice

Each example below follows the 6-step Bayesian updating framework described above. All arithmetic is shown in full, with numerical inputs clearly identified so the procedure can be reproduced with different values. For additional probability calculation practice, the probability calculator handles many of these computations directly.

Example 1 — Medical Diagnostics (Rare Disease Paradox)

Worked Example 1 — Medical Diagnostics

Problem: A patient tests positive for a disease that affects 0.1% of the general population. The test has 99% sensitivity (true positive rate) and a 5% false-positive rate. What is the probability the patient actually has the disease?

Inputs

P(Disease) = 0.001 | P(+|Disease) = 0.99 | P(+|Healthy) = 0.05

P(H) = 0.001 (prior: base rate) | P(¬H) = 0.999

Hypothesis space: H = "patient has the disease" | ¬H = "patient is healthy"

Assign prior: P(H) = 0.001 — the disease base rate in the population before the test result is known

Data: The test result is positive (+)

Likelihoods: P(+|Disease) = 0.99 | P(+|Healthy) = 0.05

Marginal likelihood:
P(+) = (0.99 × 0.001) + (0.05 × 0.999)
P(+) = 0.00099 + 0.04995 = 0.05094

Posterior probability:
P(Disease|+) = (0.99 × 0.001) / 0.05094
P(Disease|+) = 0.00099 / 0.05094 = 0.0194 ≈ 1.94%

✅ Interpretation: Despite the test's 99% sensitivity, the patient's probability of actually having the disease is only about 1.94%. The extremely low prior (0.1% base rate) dominates the result: among everyone who tests positive, false positives from the large healthy population far outnumber true positives. This is why screening programs typically require a confirmatory second test when diseases are rare.

This example illustrates the base rate fallacy, a well-documented reasoning error. See: Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux. For the mathematical foundation, see the conditional probability guide.
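
The arithmetic in Example 1 can be reproduced in a few lines, and the same function shows what a confirmatory test would do. The second update below is a sketch under a stated assumption: a second test with the same error rates whose result is independent of the first given the patient's true status. The function name `posterior` is hypothetical.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(H | positive test) from Bayes' theorem for a binary hypothesis."""
    numerator = sensitivity * prior
    evidence = numerator + false_positive_rate * (1 - prior)
    return numerator / evidence


# Inputs from Example 1
p_first = posterior(prior=0.001, sensitivity=0.99, false_positive_rate=0.05)
print(f"After one positive test:  {p_first:.4f}")

# Assumption: a second, conditionally independent test with the same
# error rates also comes back positive. The first posterior is the new prior.
p_second = posterior(prior=p_first, sensitivity=0.99, false_positive_rate=0.05)
print(f"After two positive tests: {p_second:.4f}")
```

Under these assumptions, a second positive result raises the probability from about 2% to roughly 28%, still far from certain because the starting prior was so low.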

Example 2 — Spam Email Detection (Naive Bayes Classifier)

Worked Example 2 — Spam Detection

Problem: A spam filter knows that 40% of all incoming emails are spam. An email arrives containing the phrase "wire transfer." In the training data, 80% of spam emails contained this phrase, versus 5% of legitimate emails. What is the posterior probability the email is spam?

Inputs

P(Spam) = 0.40 | P(phrase|Spam) = 0.80 | P(phrase|Ham) = 0.05

P(H) = 0.40 (prior: spam base rate) | P(¬H) = 0.60

Hypothesis space: H = "email is spam" | ¬H = "email is legitimate (ham)"

Prior: P(Spam) = 0.40 — learned from the historical distribution of messages in the training corpus

Data: The email contains "wire transfer"

Likelihoods: P(phrase|Spam) = 0.80 | P(phrase|Ham) = 0.05

Marginal likelihood:
P(phrase) = (0.80 × 0.40) + (0.05 × 0.60)
P(phrase) = 0.320 + 0.030 = 0.350

Posterior probability:
P(Spam|phrase) = (0.80 × 0.40) / 0.350
P(Spam|phrase) = 0.320 / 0.350 = 0.914 ≈ 91.4%

✅ Interpretation: The prior (40% spam rate) combined with a very diagnostic keyword (80% vs. 5%, a likelihood ratio of 16) pushes the posterior to 91.4%. Real-world spam filters apply this logic simultaneously across dozens or hundreds of features, each contributing a Bayesian update. The expected value of each classification can be computed using the tools in the expected value guide.
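
Extending Example 2 to several features is usually done in log space to avoid numerical underflow. The sketch below assumes the features are conditionally independent given the class (the "naive" assumption); the function name and the second feature's likelihoods are hypothetical and not taken from any real filter.

```python
import math


def naive_bayes_posterior(prior_spam, feature_likelihoods):
    """Posterior P(spam | features), assuming conditionally independent features.

    feature_likelihoods: list of (P(feature|spam), P(feature|ham)) pairs
    for the features observed in the email.
    """
    # Work in log space so products of many small probabilities do not underflow
    log_spam = math.log(prior_spam)
    log_ham = math.log(1 - prior_spam)
    for p_spam, p_ham in feature_likelihoods:
        log_spam += math.log(p_spam)
        log_ham += math.log(p_ham)
    # Normalize after subtracting the larger log value for numerical stability
    top = max(log_spam, log_ham)
    spam, ham = math.exp(log_spam - top), math.exp(log_ham - top)
    return spam / (spam + ham)


# Single feature from Example 2: "wire transfer"
one = naive_bayes_posterior(0.40, [(0.80, 0.05)])
print(f"{one:.3f}")

# Adding a second, hypothetical feature that is more common in legitimate mail
two = naive_bayes_posterior(0.40, [(0.80, 0.05), (0.10, 0.30)])
print(f"{two:.3f}")
```

The single-feature call reproduces the 91.4% figure from the worked example; the hypothetical second feature pulls the posterior back down because it is three times more likely in legitimate mail.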

Example 3 — Coin Bias Estimation (Beta–Binomial Conjugate)

Worked Example 3 — Parameter Estimation

Problem: You suspect a coin is fair (θ = 0.5). You encode this belief as a Beta(10, 10) prior — centered at 0.5, with moderate confidence. You then flip the coin 20 times and observe 14 heads. What does the posterior distribution say about the true bias θ?

This example uses the Beta–Binomial conjugate pair. Because the Beta prior is conjugate to the Binomial likelihood, the posterior is also a Beta distribution — no integration required.

Prior: θ ~ Beta(α=10, β=10). Prior mean = α/(α+β) = 10/20 = 0.50. This encodes the belief that the coin is roughly fair, with moderate certainty (equivalent to having seen 10 heads and 10 tails previously).

Data: 20 flips, 14 heads (successes = 14, failures = 6)

Likelihood: Binomial with parameters n=20, k=14, and probability θ. For the conjugate update, we need only the sufficient statistics: 14 successes, 6 failures.

Posterior update (conjugate formula):
Posterior = Beta(α + successes, β + failures)
Posterior = Beta(10 + 14, 10 + 6) = Beta(24, 16)

Posterior mean and credible interval:
Posterior mean = 24/(24+16) = 24/40 = 0.60
The 95% credible interval for Beta(24,16) is approximately [0.44, 0.74]

✅ Interpretation: The prior belief of θ = 0.50 (fair coin) is updated toward θ = 0.60 after 14 heads in 20 flips. The posterior mean does not jump all the way to the sample proportion of 0.70; the moderately confident prior pulls it back. A weaker prior such as the flat Beta(1, 1) would yield a Beta(15, 7) posterior with mean 15/22 ≈ 0.68, much closer to the raw data. For a full treatment of credible intervals, see the credible intervals guide.

Conjugate prior formulas follow Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A., & Rubin, D.B. (2013). Bayesian Data Analysis (3rd ed.). Chapman and Hall/CRC. Available at stat.columbia.edu/~gelman/book.
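
When no conjugate shortcut is available, the same posterior can be approximated on a grid by multiplying the prior and likelihood pointwise and normalizing. The sketch below applies that idea to Example 3 as a check on the closed-form result, using only the standard library; the function names and the grid size are assumptions.

```python
def beta_grid_posterior(alpha, beta, heads, tails, points=10001):
    """Grid approximation of the posterior for a Beta prior and binomial data."""
    step = 1.0 / (points - 1)
    grid = [i * step for i in range(points)]
    # Unnormalized posterior: Beta prior kernel times binomial likelihood kernel
    a, b = alpha + heads - 1, beta + tails - 1
    weights = [t**a * (1 - t)**b for t in grid]
    total = sum(weights)
    probs = [w / total for w in weights]
    return grid, probs


def summarize(grid, probs, level=0.95):
    """Posterior mean and equal-tailed credible interval from grid weights."""
    mean = sum(t * p for t, p in zip(grid, probs))
    lo_target, hi_target = (1 - level) / 2, 1 - (1 - level) / 2
    cumulative, lower, upper = 0.0, None, None
    for t, p in zip(grid, probs):
        cumulative += p
        if lower is None and cumulative >= lo_target:
            lower = t
        if upper is None and cumulative >= hi_target:
            upper = t
    return mean, lower, upper


grid, probs = beta_grid_posterior(10, 10, heads=14, tails=6)
mean, lower, upper = summarize(grid, probs)
print(f"mean={mean:.3f}, 95% interval=({lower:.3f}, {upper:.3f})")
```

The grid mean should agree closely with the analytic value of 0.60, and the interval endpoints should land near the credible interval quoted in the example. Grid methods scale poorly beyond a few parameters, which is one reason sampling methods such as MCMC are used for larger models.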

Real-World Applications of Prior Probability

Prior probability is not a theoretical abstraction — it is the practical bridge between what is already known and what new data can tell us. Here are four domains where setting the right prior has direct, measurable consequences.

🏥

Healthcare Diagnostics

Disease prevalence in a population sets the prior before any test is administered. Without accounting for this base rate, positive tests for rare conditions will overwhelmingly be false positives — exactly as Example 1 shows.

🤖

Machine Learning

Naive Bayes classifiers use prior class probabilities learned from training data. In deep learning, common regularizers such as weight decay can be read as a Gaussian prior over model weights, which helps prevent overfitting on limited training sets.

📈

Financial Risk Modeling

Credit risk models set prior default probabilities from historical loan performance. These priors are then updated with borrower-specific evidence (credit score, income, debt-to-income ratio) to yield posterior default risk estimates.

🧪

A/B Testing & Optimization

Bayesian A/B tests initialize with a prior based on historical conversion rates. Each observation updates the posterior. This allows decisions to be made continuously as data arrives, rather than waiting for a fixed sample size. Pair this with significance level concepts for context.

Prior Probability and the Frequentist vs. Bayesian Divide

The prior probability is the feature that most clearly separates Bayesian statistics from the classical frequentist approach taught in most introductory courses.

Dimension	Frequentist Approach	Bayesian Approach
Nature of parameters	Fixed, unknown constants	Random variables with probability distributions
Prior knowledge	Not formally incorporated	Explicitly encoded in the prior P(H)
Output	Point estimate + confidence interval	Full posterior distribution + credible interval
Probability meaning	Long-run frequency of events	Degree of belief, updated with evidence
Hypothesis testing	p-value + reject/fail-to-reject H₀	Posterior probability of each hypothesis; Bayes factors
Small-sample behavior	Often relies on asymptotic approximations	Prior stabilizes estimates when n is small

Neither approach is categorically superior. For well-understood problems with large samples, frequentist methods from hypothesis testing to confidence intervals are efficient and widely understood. Bayesian methods with explicit priors earn their keep when data is scarce, domain knowledge is strong, or when the goal is to update beliefs continuously as data streams in.

Common Misconceptions About Prior Probability

Misconception	What's Wrong	Correct Understanding
"Any prior will give the same result"	False	With small samples, different priors can produce very different posteriors. Agreement requires large samples.
"A flat prior is always objective"	False	A uniform prior on a parameter is not uniform on a transformed scale (e.g., log scale). Jeffreys' prior is invariant to reparameterization but not flat.
"Prior probability is just a guess"	Misleading	A prior encodes existing knowledge — it may draw on decades of historical data, meta-analyses, or validated theoretical models. The word "subjective" does not mean arbitrary.
"The posterior probability is more accurate than the prior"	Oversimplified	The posterior is only as reliable as the prior and the likelihood model. A badly mis-specified prior or likelihood will produce a confidently wrong posterior.
"Prior probability is the same as base rate"	Partially false	A base rate can serve as a prior, but prior probability is a broader concept. It may come from a full probability distribution over a parameter, not just a single population frequency.

Implementing Prior Probability in Python and R

Modern Bayesian computation handles intractable posterior distributions through sampling algorithms. The code below demonstrates Bayesian updating using PyMC (Python) and a manual implementation in R.

Python — Beta Prior with Binomial Likelihood (PyMC)

Python — PyMC

import pymc as pm

# Coin flip estimation: 14 heads in 20 flips
# Prior belief: coin is roughly fair — Beta(10, 10)

with pm.Model() as coin_model:

    # 1. Prior probability distribution
    theta = pm.Beta("theta", alpha=10, beta=10)

    # 2. Likelihood — binomial with 20 flips, 14 observed heads
    obs = pm.Binomial("obs", n=20, p=theta, observed=14)

    # 3. Sample from posterior using MCMC (NUTS sampler)
    trace = pm.sample(draws=4000, tune=1000, return_inferencedata=True)

# Posterior mean — analytically Beta(24, 16) → mean = 24/40 = 0.60
posterior_mean = trace.posterior["theta"].values.mean()
print(f"Posterior mean: {posterior_mean:.4f}")   # ≈ 0.60

R — Manual Bayesian Update (Beta–Binomial)

R — Base Implementation

# Prior parameters: Beta(alpha_prior, beta_prior)
alpha_prior <- 10
beta_prior  <- 10

# Data: 14 heads in 20 flips
successes <- 14
failures  <- 6

# Posterior parameters (conjugate Beta update)
alpha_post <- alpha_prior + successes   # 10 + 14 = 24
beta_post  <- beta_prior  + failures    # 10 +  6 = 16

# Posterior mean and 95% credible interval
post_mean <- alpha_post / (alpha_post + beta_post)
ci_95     <- qbeta(c(0.025, 0.975), alpha_post, beta_post)

cat("Prior mean:     ", alpha_prior / (alpha_prior + beta_prior), "\n")  # 0.50
cat("Posterior mean: ", post_mean, "\n")                               # 0.60
cat("95% Credible Interval: [", ci_95[1], ",", ci_95[2], "]\n")    # [0.44, 0.74]

For more complex models involving multiple parameters or non-conjugate priors, Stan is a widely used option. Documentation is available at mc-stan.org. For Python users, the full PyMC documentation and tutorials are at pymc.io.

Entity and Formula Glossary

Term	Notation	Definition
Prior Probability	P(H)	Probability of a hypothesis before observing new data
Prior Distribution	π(θ)	Full probability distribution encoding initial uncertainty over a parameter
Likelihood Function	P(D|H)	Probability of the observed data given a hypothesis or parameter value
Marginal Likelihood	P(D)	Total probability of the data, summed across all hypotheses; normalizing constant
Posterior Probability	P(H|D)	Updated probability of a hypothesis after observing data
Bayes' Theorem	P(H|D) = P(D|H)P(H)/P(D)	The equation that connects prior, likelihood, and posterior
Prior Odds	O(H) = P(H) / P(¬H)	Ratio of prior probability of H to prior probability of ¬H
Posterior Odds	O(H|D) = O(H) × [P(D|H)/P(D|¬H)]	Prior odds multiplied by the Bayes factor (likelihood ratio)
Conjugate Prior	p(θ) ∈ F → p(θ|D) ∈ F	A prior that yields a posterior in the same distributional family
Bayesian Updating	Priorₙ → Dataₙ → Posteriorₙ ≡ Priorₙ₊₁	The iterative cycle of treating each posterior as the next prior
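
The odds form in the glossary gives the same answer as the probability form and avoids computing P(D) explicitly. A minimal sketch, reusing the inputs from Example 1; the helper names are hypothetical.

```python
def to_odds(p):
    """Convert a probability to odds in favor."""
    return p / (1 - p)


def to_probability(odds):
    """Convert odds in favor back to a probability."""
    return odds / (1 + odds)


def posterior_odds(prior_prob, p_data_given_h, p_data_given_not_h):
    """Posterior odds = prior odds * Bayes factor (likelihood ratio)."""
    bayes_factor = p_data_given_h / p_data_given_not_h
    return to_odds(prior_prob) * bayes_factor


# Inputs from Example 1: prior 0.001, sensitivity 0.99, false-positive rate 0.05
odds = posterior_odds(0.001, 0.99, 0.05)
print(f"posterior odds = {odds:.5f}")
print(f"posterior probability = {to_probability(odds):.4f}")
```

Converting the posterior odds back to a probability recovers the 1.94% figure from the medical example.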

Frequently Asked Questions

What is prior probability in simple terms?

Prior probability is your initial estimate of how likely something is before you gather any new data. If you know a coin is fair before flipping it, your prior probability of heads is 0.5. If you know a disease affects 1 in 1,000 people, your prior that a random person has it is 0.001. It is the mathematical starting point of Bayesian reasoning.

How do you calculate prior probability?

For a discrete hypothesis: prior probability is simply the base rate or relative frequency of the event in a relevant reference population (e.g., 3% of patients presenting with symptom X have disease Y → P(Y) = 0.03). For a continuous parameter: the prior is a full probability distribution (e.g., Beta, Normal, Gamma) whose shape and parameters encode existing knowledge about where the true value is likely to fall.

What is the difference between prior and posterior probability?

Prior probability P(H) is computed before seeing the data. Posterior probability P(H|D) is computed after combining the prior with the likelihood of the observed data via Bayes' theorem. The posterior is the updated belief. In simple conjugate models such as the Beta–Binomial, the posterior mean lies between the prior mean and the sample estimate, weighted by the relative strength of each.

Does the choice of prior matter?

Yes — especially with small samples. With limited data, the posterior is strongly influenced by the prior. With large samples, the data "washes out" the prior, and the posterior converges to similar values regardless of the initial choice (as long as the prior does not assign zero probability to the true parameter value). This is why Bayesian analysts report their prior choice and perform sensitivity analyses with alternative priors.

What is a conjugate prior and why does it matter?

A conjugate prior is a prior distribution that, when combined with a particular likelihood family, produces a posterior in the same family. For example, a Beta prior combined with a Binomial likelihood gives a Beta posterior. This matters because the update formula is algebraically simple — you just increment the distribution's parameters — avoiding numerical integration entirely. This was the primary tool for tractable Bayesian computation before MCMC methods became widely accessible.

How is prior probability used in machine learning?

In machine learning, prior probability appears in several forms. Naive Bayes classifiers use prior class frequencies learned from training data. Bayesian neural networks place prior distributions over model weights, with the posterior (learned during training) representing the model's uncertainty. Regularization techniques like L2 regularization (ridge regression) are equivalent to maximum a posteriori (MAP) estimation with a zero-mean Gaussian prior on model coefficients. Bayesian optimization uses a prior over the performance landscape to choose the next hyperparameter configuration to evaluate.
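
The link between L2 regularization and a Gaussian prior can be checked numerically in the simplest case, a one-coefficient model y = w·x with Gaussian noise. The sketch below compares the closed-form ridge estimate with a brute-force maximum of the log posterior; the data, the variances, and the function names are all hypothetical.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form ridge estimate for a one-coefficient model y = w * x."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def log_posterior(w, xs, ys, noise_var, prior_var):
    """Log posterior up to a constant: Gaussian likelihood plus N(0, prior_var) prior on w."""
    log_lik = -sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / (2 * noise_var)
    log_prior = -(w ** 2) / (2 * prior_var)
    return log_lik + log_prior


# Hypothetical data and variances, for illustration only
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.1, 1.9, 3.2, 3.9]
noise_var, prior_var = 1.0, 0.5

# Ridge penalty implied by the prior: lambda = noise_var / prior_var
w_ridge = ridge_slope(xs, ys, lam=noise_var / prior_var)

# Brute-force MAP estimate: maximize the log posterior over a fine grid of w values
candidates = [i / 10000 for i in range(0, 20001)]
w_map = max(candidates, key=lambda w: log_posterior(w, xs, ys, noise_var, prior_var))

print(f"ridge: {w_ridge:.4f}, MAP: {w_map:.4f}")
```

The two estimates agree to within the grid resolution. A tighter prior (smaller `prior_var`) corresponds to a larger penalty and shrinks the coefficient further toward zero.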

📚

Continue Learning

Prior probability is one piece of the Bayesian framework. For the complete picture, explore the Bayes' theorem guide, the conditional probability guide, and the Bayes factor guide on Statistics Fundamentals.