The mathematics of uncertainty
Real data is rarely one number. It's a vector. The multivariate Gaussian N(μ, Σ) extends the bell curve to many dimensions. The mean becomes a vector μ ∈ ℝⁿ (the centre of the cloud) and the variance becomes a covariance matrix Σ (the shape and tilt of the cloud).
The exponent generalizes the z-score: (x−μ)ᵀΣ⁻¹(x−μ) is the squared Mahalanobis distance, distance from the mean measured in units of the data's own spread. Points of equal density form ellipses (ellipsoids in higher dimensions); the covariance matrix sets their size, stretch, and tilt.
The diagonal of Σ holds the per-coordinate variances; the off-diagonals hold covariances, telling you whether coordinates rise together. A diagonal Σ gives axis-aligned ellipses (independent coordinates); off-diagonal terms tilt them. Σ must be positive semidefinite, since there's no such thing as negative variance in any direction.