Independence

The mathematics of uncertainty

Two events are independent when knowing one tells you nothing about the other. Learning the first coin landed heads doesn't shift the odds for the second. Formally, independence means the conditional probability equals the plain one, P(A|B) = P(A). Since P(A|B) = P(A ∩ B) / P(B) whenever P(B) > 0, this rearranges into a clean test:

P(A ∩ B) = P(A) · P(B)

So for independent events, the probability that both happen is just the product. This is why n fair coin flips all landing heads has probability (1/2)ⁿ: the flips don't talk to each other.
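The product rule can be checked exactly for small cases by listing every equally likely sequence of flips and counting. The sketch below is only an illustration: the helper name `prob` and the choice of n = 5 are assumptions made for the example, not part of the text above.

```python
from fractions import Fraction
from itertools import product


def prob(event, n_flips):
    """Exact probability of `event` over all equally likely sequences of n fair flips."""
    outcomes = list(product("HT", repeat=n_flips))
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))


# A: the first flip is heads. B: the second flip is heads.
p_a = prob(lambda o: o[0] == "H", 2)
p_b = prob(lambda o: o[1] == "H", 2)
p_both = prob(lambda o: o[0] == "H" and o[1] == "H", 2)

print(p_a, p_b, p_both)     # 1/2 1/2 1/4
print(p_both == p_a * p_b)  # True: the joint probability is the product

# All n flips landing heads, compared against (1/2)^n.
n = 5  # illustrative choice
p_all_heads = prob(lambda o: all(f == "H" for f in o), n)
print(p_all_heads == Fraction(1, 2) ** n)  # True
```

Using `Fraction` keeps the arithmetic exact, so the equality checks are not affected by floating-point rounding.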

A fair coin has no memory: after five heads in a row, the next flip is still an even 50/50, because the coin cannot remember what it just did. That "no memory" is exactly independence, where the chance of both flips together is the product P(A ∩ B) = P(A) · P(B). It is also why a streak of n heads carries probability (1/2)ⁿ.
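The "no memory" claim can also be probed empirically. The following is a rough simulation sketch, assuming Python's pseudo-random generator is a good enough stand-in for a fair coin; the seed, the number of flips, and the variable names are arbitrary choices for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
flips = [rng.random() < 0.5 for _ in range(200_000)]  # True means heads

streak = 0         # length of the current run of heads before this flip
after_streak = []  # flips that immediately follow five or more heads in a row
for flip in flips:
    if streak >= 5:
        after_streak.append(flip)
    streak = streak + 1 if flip else 0

freq = sum(after_streak) / len(after_streak)
print(f"{len(after_streak)} flips followed a five-head streak; {freq:.3f} were heads")
```

If the flips are independent, the printed frequency should sit close to 0.5, with some sampling noise that shrinks as the number of flips grows.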

Where this lives in ML

When you train on a labelled dataset, you almost always assume the examples are i.i.d. (independent and identically distributed). That assumption lets the joint likelihood over the dataset factor into a product, P(data) = Π P(xᵢ). Taking the logarithm turns that product into a sum of per-example log-terms, whose negative serves as the loss. Naive Bayes classifiers go further and assume features are conditionally independent given the class, turning an impossible…
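To make the i.i.d. factorisation concrete, here is a minimal sketch for a Bernoulli (coin-flip) model. The data, the parameter value `p = 0.7`, and the helper `prob_of` are all hypothetical, chosen only to show that the log of the product equals the sum of the logs.

```python
import math

# Hypothetical observations: 1 = heads, 0 = tails, assumed i.i.d. Bernoulli(p).
data = [1, 0, 1, 1, 0, 1, 1, 1]
p = 0.7  # assumed model parameter, for illustration only


def prob_of(x, p):
    """Probability the model assigns to a single observation x."""
    return p if x == 1 else 1.0 - p


# Independence lets the joint likelihood factor into a product over examples.
likelihood = math.prod(prob_of(x, p) for x in data)

# Taking logs turns that product into a sum of per-example terms.
log_likelihood = sum(math.log(prob_of(x, p)) for x in data)

# The negative log-likelihood is the quantity usually minimised as a loss.
nll = -log_likelihood

print(math.isclose(math.log(likelihood), log_likelihood))  # True
print(f"negative log-likelihood: {nll:.4f}")
```

Working with the sum rather than the product is also the numerically safer choice, since a product of many small probabilities can underflow to zero in floating point.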