Expectation & Variance (continuous)

The mathematics of uncertainty

Everything you learned about expectation and variance carries over to continuous variables. You just swap the sum for an integral. The PMF weight p(x) becomes the probability element f(x) dx, and "add over all values" becomes "integrate over the line." So E[X] = ∫ x f(x) dx and Var(X) = ∫ (x − E[X])² f(x) dx.
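To make the sum-to-integral swap concrete, here is a minimal sketch that approximates both integrals numerically. The exponential density, its rate of 2, the truncation point, and the helper name `integrate` are all assumptions chosen for illustration; any density with a negligible tail would do.

```python
import math

def integrate(h, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    width = (b - a) / n
    return width * sum(h(a + (i + 0.5) * width) for i in range(n))

RATE = 2.0    # assumed rate parameter of an exponential density
UPPER = 40.0  # truncation point; the tail mass beyond it is negligible

def f(x):
    """Density f(x) = RATE * exp(-RATE * x) for x >= 0."""
    return RATE * math.exp(-RATE * x)

# "Add over all values" becomes "integrate over the line".
total_mass = integrate(f, 0.0, UPPER)                       # should be close to 1
mean = integrate(lambda x: x * f(x), 0.0, UPPER)            # E[X]
variance = integrate(lambda x: (x - mean) ** 2 * f(x), 0.0, UPPER)  # Var(X)

print(total_mass, mean, variance)
```

For an exponential density with rate λ the exact answers are E[X] = 1/λ and Var(X) = 1/λ², so the printed mean and variance should land very close to 0.5 and 0.25.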

The intuition is identical: E[X] is still the balance point of the density's mass, and variance is still the average squared distance from that point. Linearity of expectation and the scaling rule Var(aX+b) = a²Var(X) both survive unchanged.
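Both facts follow in a couple of lines from the linearity of the integral. In the sketch below, μ is shorthand (introduced here) for E[X], a and b are constants, and f integrates to 1:

```latex
\begin{align*}
E[aX+b] &= \int_{-\infty}^{\infty} (ax+b)\, f(x)\, dx
         = a\int_{-\infty}^{\infty} x\, f(x)\, dx + b\int_{-\infty}^{\infty} f(x)\, dx
         = a\mu + b, \\
\operatorname{Var}(aX+b) &= E\big[(aX+b-(a\mu+b))^2\big]
         = E\big[a^2 (X-\mu)^2\big]
         = a^2 \operatorname{Var}(X).
\end{align*}
```

The shift b cancels inside the square, which is why it moves the balance point but leaves the spread untouched.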

Think of a seesaw with weight smeared unevenly along the plank instead of sitting at one point. The single spot where it balances is E[X], the mean of the density. How far the weight is flung out from that pivot, measured as average squared distance, is Var(X): weight bunched near the centre means small variance, weight pushed to the far ends means large variance.

Where this lives in ML

Continuous expectations are integrals, and integrals over high-dimensional spaces are usually intractable. So ML leans on Monte Carlo estimation: approximate E[g(X)] = ∫ g(x) f(x) dx by the average (1/n) Σ g(xᵢ) over n samples xᵢ drawn from f. Every "expected reward" in RL and every ELBO term in a VAE is an expectation of this kind, usually estimated by sampling.
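A minimal sketch of that estimator, using only the standard library. The choice of a standard normal for f, g(x) = x², the seed, the sample size, and the function names are all assumptions made for illustration; the exact value of E[X²] under a standard normal is 1, which gives something to compare against.

```python
import random

def monte_carlo_expectation(g, sampler, n):
    """Estimate E[g(X)] as the average of g over n independent draws from sampler()."""
    return sum(g(sampler()) for _ in range(n)) / n

rng = random.Random(0)  # fixed seed so the run is repeatable

def draw_standard_normal():
    """One sample from the assumed density f: a normal with mean 0, std 1."""
    return rng.gauss(0.0, 1.0)

# Approximates the integral of x^2 * f(x) dx without ever evaluating f.
estimate = monte_carlo_expectation(lambda x: x * x, draw_standard_normal, 100_000)
print(estimate)
```

Note that the code never evaluates the density itself; it only needs to draw samples from it. When g(X) has finite variance, the error of the average shrinks roughly like 1/√n, regardless of the dimension of x.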