Bayes' Theorem

The mathematics of uncertainty

Often you know one direction of a conditional but want the other. A medical test tells you P(positive | disease), but the patient wants P(disease | positive). Bayes' theorem is the bridge that flips a conditional probability around.

It falls straight out of the last lesson. The multiplication rule gives P(A∩B) two ways, as P(A|B)·P(B) and as P(B|A)·P(A). Set them equal and divide by P(B), assuming P(B) > 0:

P(A|B) = P(B|A)·P(A) / P(B)

The three pieces have names you'll meet everywhere in ML: P(A) is the prior (belief before evidence), P(B|A) is the likelihood (how well A explains the evidence), and P(A|B) is the posterior (updated belief).

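The denominator P(B) is usually computed by splitting across all the ways B can occur, the law of total probability:

P(B) = Σᵢ P(B|Aᵢ)·P(Aᵢ)

where the Aᵢ are mutually exclusive cases that together cover every possibility. With just two cases, A and not-A, this is P(B) = P(B|A)·P(A) + P(B|¬A)·P(¬A).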
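To make the medical-test example concrete, here is a minimal Python sketch that flips P(positive | disease) into P(disease | positive). The function name `posterior` and the numbers (1% prevalence, 95% sensitivity, 5% false-positive rate) are assumptions chosen for illustration, not figures from any real test.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(disease | positive) using Bayes' theorem.

    prior               -- P(disease), belief before the test
    sensitivity         -- P(positive | disease), the likelihood
    false_positive_rate -- P(positive | no disease)
    """
    # Law of total probability: a positive result can come from a
    # sick patient or from a healthy one.
    p_positive = sensitivity * prior + false_positive_rate * (1.0 - prior)
    # Bayes' theorem: likelihood times prior, divided by the evidence.
    return sensitivity * prior / p_positive


# Illustrative numbers only (assumed, not from real data).
p = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(f"P(disease | positive) = {p:.3f}")
```

With these assumed numbers the posterior comes out to roughly 16%, far below the 95% sensitivity, because healthy patients vastly outnumber sick ones and contribute most of the positive results.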

Where this lives in ML

Bayes' theorem is the engine of probabilistic ML. Bayesian inference updates a prior over parameters into a posterior given data: P(θ | data) ∝ P(data | θ)·P(θ). Maximum-likelihood training is the special case of maximizing the posterior under a flat prior, and L2 regularization corresponds to one particular prior: a zero-mean Gaussian on the weights. The whole "posterior predictive" of a Bayesian neural net is this…
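The proportionality P(θ | data) ∝ P(data | θ)·P(θ) can be sketched on a small grid. The example below is an assumption-laden toy, not a method from this lesson: θ is taken to be a coin's probability of heads, the data are 7 heads and 3 tails, and the helper `grid_posterior` and both prior functions are hypothetical names introduced here.

```python
def grid_posterior(heads, tails, prior, grid):
    """Approximate P(theta | data) on a grid of theta values.

    Multiplies the likelihood by the prior at each grid point, then
    normalizes so the values sum to 1.
    """
    unnormalized = [
        (t ** heads) * ((1.0 - t) ** tails) * prior(t)  # likelihood * prior
        for t in grid
    ]
    total = sum(unnormalized)  # plays the role of P(data)
    return [u / total for u in unnormalized]


def flat_prior(theta):
    # Every theta equally plausible before seeing data.
    return 1.0


def centered_prior(theta):
    # Hypothetical prior peaked at 0.5: a belief that the coin is
    # probably close to fair.
    return (theta ** 10) * ((1.0 - theta) ** 10)


grid = [i / 100 for i in range(1, 100)]  # theta = 0.01, 0.02, ..., 0.99
heads, tails = 7, 3                       # assumed toy data

post_flat = grid_posterior(heads, tails, flat_prior, grid)
post_centered = grid_posterior(heads, tails, centered_prior, grid)

# The most probable theta under each posterior (the MAP estimate).
map_flat = grid[post_flat.index(max(post_flat))]
map_centered = grid[post_centered.index(max(post_centered))]

print("MAP with flat prior:    ", map_flat)
print("MAP with centered prior:", map_centered)
```

Under the flat prior the posterior peaks at the maximum-likelihood estimate, 7/10. The centered prior pulls the peak back toward 0.5, which is the same shrinkage effect a Gaussian prior on the weights produces as L2 regularization.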
