Multivariate Gaussian

The mathematics of uncertainty

Real data is rarely one number. It's a vector. The multivariate Gaussian N(μ, Σ) extends the bell curve to many dimensions. The mean becomes a vector μ ∈ ℝⁿ (the centre of the cloud) and the variance becomes a covariance matrix Σ (the shape and tilt of the cloud).

The exponent generalizes the z-score: (x−μ)ᵀΣ⁻¹(x−μ) is the squared Mahalanobis distance, distance from the mean measured in units of the data's own spread. Points of equal density form ellipses (ellipsoids in higher dimensions); the covariance matrix sets their size, stretch, and tilt.

The diagonal of Σ holds the per-coordinate variances; the off-diagonals hold covariances, telling you whether coordinates rise together. A diagonal Σ gives axis-aligned ellipses (independent coordinates); off-diagonal terms tilt them. Σ must be positive semidefinite, since there's no such thing as negative variance in any direction.

Where this lives in MLWhen a Gaussian process does regression with built-in error bars, it places a multivariate Gaussian over functions. A VAE's latent prior is a standard multivariate normal N(0, I). Gaussian latent-variable models and the noise schedules of diffusion models all rely on the fact that linear maps and conditionals of Gaussians stay Gaussian.

▶ Multivariate Gaussian

← Key Continuous Distributions Joint Distributions →