SVD

Geometry and algebra of linear maps, vectors, and matrices

The singular value decomposition does something no other factorization manages: every matrix, square or rectangular, full rank or not, splits into three clean geometric pieces.

Read the factorization A = UΣVᵀ right-to-left, and any linear map is the same three-step motion: Vᵀ rotates (or reflects) the input so that the right singular vectors line up with the coordinate axes, Σ (diagonal, with the non-negative singular values σ₁ ≥ σ₂ ≥ …) scales each axis, and U rotates (or reflects) the result into the output space. In two dimensions, the unit circle always maps to an ellipse (flattened to a segment or a point when some σ is zero), and the singular values are the lengths of that ellipse's semi-axes.

In the figure, watch the unit circle become an ellipse whose semi-axes are exactly the singular values.
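The same picture can be checked numerically. The sketch below is a minimal illustration, assuming NumPy and a hypothetical 2×2 matrix `A` chosen only for the example: it rebuilds `A` from its three factors and compares the longest and shortest radii of the transformed unit circle with the singular values.

```python
import numpy as np

# Hypothetical 2x2 example matrix, chosen only for illustration.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

# np.linalg.svd returns U, the singular values (descending), and V transposed.
U, s, Vt = np.linalg.svd(A)

# Rebuild A from the three factors: rotate/reflect, scale, rotate/reflect.
A_rebuilt = U @ np.diag(s) @ Vt

# Sample the unit circle and push every point through A.
theta = np.linspace(0.0, 2.0 * np.pi, 2001)
circle = np.vstack([np.cos(theta), np.sin(theta)])  # shape (2, N)
image = A @ circle
radii = np.linalg.norm(image, axis=0)

print("singular values:        ", s)
print("longest/shortest radius:", radii.max(), radii.min())
```

Up to sampling error, the longest radius should match σ₁ and the shortest σ₂. The first right singular vector (row `Vt[0]`) is the input direction that gets stretched the most, landing on `s[0] * U[:, 0]`.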

Where this lives in ML

SVD is the math behind low-rank model compression. LoRA approximates a weight update with a low-rank product, on the assumption that the useful update lives in a few high-σ directions. PCA is the SVD of mean-centered data. Truncated SVD compresses embedding tables and images by keeping only the dominant singular directions. It is the same "keep the big σ's" move every time.
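As a rough sketch of the "keep the big σ's" move, the snippet below truncates a matrix to rank k and then checks the PCA connection. It assumes NumPy; the helper `truncated_svd`, the random matrix `W`, and the random data `X` are hypothetical stand-ins for a real weight matrix or dataset, not anything taken from an actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

def truncated_svd(M, k):
    """Return the rank-k approximation of M from its top-k singular triplets,
    together with the full list of singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :], s

# Hypothetical "weight matrix": random values, used only to exercise the code.
W = rng.standard_normal((64, 32))
k = 8
W_k, s = truncated_svd(W, k)

# Spectral-norm error of the truncation equals the first discarded singular value.
err = np.linalg.norm(W - W_k, ord=2)

# Parameter count: dense matrix versus two rank-k factors (sigma folded into one).
dense_params = W.size
factored_params = k * (W.shape[0] + W.shape[1])

# PCA as SVD of centered data: rows are samples, columns are features.
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 5))
Xc = X - X.mean(axis=0)
_, s_x, Vt_x = np.linalg.svd(Xc, full_matrices=False)
explained_var = s_x**2 / (Xc.shape[0] - 1)

# The same variances come out of the covariance matrix's eigenvalues.
cov_eigs = np.sort(np.linalg.eigvalsh(np.cov(Xc, rowvar=False)))[::-1]

print("truncation error:", err, " first discarded sigma:", s[k])
print("dense vs factored parameters:", dense_params, factored_params)
```

Note that this computes the low-rank factors directly from an existing matrix. LoRA instead learns its two low-rank factors during fine-tuning, so the snippet illustrates the shared low-rank idea rather than LoRA itself.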