Concentration Inequalities (brief)

Inference, estimation, and decision-making from data

Statistics so far has been mostly about averages and asymptotics. Concentration inequalities ask a sharper, finite-sample question: how likely is a random quantity to land far from its mean? Their answers are the mathematical backbone of why machine learning can offer guarantees at all.

The most basic, requiring only a non-negative variable and its mean, is Markov's inequality:

It says a non-negative variable can't often be many times its average. If the mean is small, large values must be rare. Crude, but it needs almost nothing.

Where this lives in MLHoeffding's bound is the heart of generalization theory: it's why a model's measured error on a finite test set is provably close to its true error, with high probability, the formal justification for trusting a test score. This is the engine of PAC learning ("Probably Approximately Correct"): with enough samples, the gap between training and true performance is small with high probability.…
▶ Concentration Inequalities (brief)
← Expectation-Maximization (EM)