Special Matrices

Geometry and algebra of linear maps, vectors, and matrices

A few matrices show up so often, with such clean geometry, that they earn names. Knowing them on sight saves enormous effort.

The identity matrix I has 1s on the diagonal and 0s elsewhere. It is the "do nothing" map: Ix = x for every vector. A diagonal matrix is zero everywhere off the diagonal (the diagonal entries themselves may be any numbers, including zero); it scales each axis independently, with entry dᵢ scaling the i-th coordinate and no mixing between coordinates.

Think of a sound mixing board. The identity matrix I is every slider parked at 1: the signal passes through untouched, exactly "do nothing." A diagonal matrix is a set of independent volume sliders — each one boosts or cuts a single channel on its own, with no channel ever bleeding into another.
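To make the two definitions concrete, here is a minimal sketch in plain Python. The helper names `matvec`, `identity`, and `diagonal` are illustrative choices, not from any library, and matrices are stored as lists of rows for simplicity.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x (a list of numbers)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def identity(n):
    """n-by-n matrix with 1s on the diagonal and 0s elsewhere."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def diagonal(d):
    """Square matrix with the entries of d on the diagonal and 0s elsewhere."""
    n = len(d)
    return [[d[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

x = [3.0, -2.0, 5.0]
d = [2.0, 0.5, -1.0]   # one "slider" per coordinate

ix = matvec(identity(3), x)   # identity: x comes back unchanged
dx = matvec(diagonal(d), x)   # diagonal: coordinate i is multiplied by d[i]

print(ix)
print(dx)
```

Because the off-diagonal entries are all zero, multiplying by a diagonal matrix reduces to the elementwise product of `d` and `x`, which is why no coordinate ever mixes with another.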

Where this lives in ML

Orthogonal maps keep signals well-scaled. Orthogonal weight initialization starts a layer as a length-preserving map so activations and gradients neither explode nor vanish as they pass through many layers. Diagonal matrices appear as per-feature scales in batch norm, and the identity is the backbone of a residual connection x + f(x), the "do nothing" path that lets gradients flow straight…
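The two claims above can be checked on a toy example. The sketch below is plain Python and is not how an ML framework implements either idea: it uses a 2D rotation as a simple instance of an orthogonal matrix, and a hypothetical `residual` helper with a branch that outputs zeros to show the identity path. The angle and the input vector are arbitrary.

```python
import math

def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x (a list of numbers)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def norm(x):
    """Euclidean length of x."""
    return math.sqrt(sum(v * v for v in x))

# A rotation by angle theta is an orthogonal matrix.
theta = 0.7  # arbitrary angle, chosen only for illustration
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

x = [3.0, 4.0]
qx = matvec(Q, x)
print(norm(x), norm(qx))   # the two lengths agree up to rounding

def residual(f, x):
    """Residual connection: add the branch output f(x) back onto x."""
    return [xi + fi for xi, fi in zip(x, f(x))]

def zero_branch(v):
    """A branch that contributes nothing, standing in for f at its 'off' setting."""
    return [0.0 for _ in v]

out = residual(zero_branch, x)
print(out)                 # with a zero branch, the block is the identity map
```

The rotation changes the direction of `x` but not its length, which is the property orthogonal initialization relies on. The residual block with a zero branch returns `x` itself, so the identity path is always available no matter what the branch learns.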
