Hessian Geometry

Multivariate calculus from first principles

The Hessian's eigenvalues turn the murky question 'what kind of critical point is this?' into a clean checklist. At a point where the gradient is zero, the signs of the Hessian's eigenvalues tell you whether you're sitting in a bowl, on a dome, or at a saddle.

This is the multivariable second-derivative test, and it's a direct generalization of 1-D: there, f″ > 0 meant a min and f″ < 0 a max. The Hessian's eigenvalues play the role of that single number, one for each principal direction.
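Written out, the checklist looks roughly like this. This is a sketch of the standard statement, assuming f is twice continuously differentiable; the symbols x* (the critical point) and λ₁ ≤ … ≤ λₙ (the sorted eigenvalues) are notation introduced here for illustration.

```latex
\[
\nabla f(x^*) = 0, \qquad H = \nabla^2 f(x^*) \ \text{with eigenvalues } \lambda_1 \le \cdots \le \lambda_n
\]
\[
x^* \text{ is }
\begin{cases}
\text{a strict local minimum} & \text{if } \lambda_1 > 0 \\
\text{a strict local maximum} & \text{if } \lambda_n < 0 \\
\text{a saddle point} & \text{if } \lambda_1 < 0 < \lambda_n \\
\text{undetermined by this test} & \text{otherwise (some } \lambda_i = 0\text{)}
\end{cases}
\]
```

In 1-D the Hessian is the 1×1 matrix [f″(x*)], so λ₁ = λₙ = f″(x*) and the first two cases collapse to the familiar rule.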

Picture three snacks. A bowl of soup curves up no matter which way you tip it, a dome of ice cream curves down everywhere, and a Pringle chip bends up along its length but down across its width. The Hessian's eigenvalues are just the curvatures along those special directions: all positive means a bowl, all negative means a dome, and opposite signs (like 2 and −2) mean the chip, a saddle.
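To make the three snacks concrete, here is a minimal sketch in Python with NumPy. The function name `classify_critical_point` and the tolerance `tol` are illustrative choices, not part of the lesson, and the three example matrices are assumed to be the Hessians of x² + y², −x² − y², and x² − y² at the origin.

```python
import numpy as np

def classify_critical_point(hessian, tol=1e-9):
    """Classify a critical point from the signs of its Hessian's eigenvalues.

    Assumes `hessian` is symmetric and was evaluated where the gradient is zero.
    `tol` (an arbitrary illustrative threshold) decides what counts as zero.
    """
    H = np.asarray(hessian, dtype=float)
    eigenvalues = np.linalg.eigvalsh(H)  # real eigenvalues of a symmetric matrix

    has_positive = np.any(eigenvalues > tol)
    has_negative = np.any(eigenvalues < -tol)
    has_zero = np.any(np.abs(eigenvalues) <= tol)

    if has_positive and has_negative:
        return "saddle"            # curves up one way, down another
    if has_zero:
        return "inconclusive"      # the test cannot decide on its own
    if has_positive:
        return "local minimum"     # bowl: every direction curves up
    return "local maximum"         # dome: every direction curves down

bowl = [[2.0, 0.0], [0.0, 2.0]]    # Hessian of x^2 + y^2
dome = [[-2.0, 0.0], [0.0, -2.0]]  # Hessian of -x^2 - y^2
chip = [[2.0, 0.0], [0.0, -2.0]]   # Hessian of x^2 - y^2, eigenvalues 2 and -2

for name, H in [("bowl", bowl), ("dome", dome), ("chip", chip)]:
    print(name, "->", classify_critical_point(H))
```

`eigvalsh` is used rather than `eigvals` because a Hessian of a smooth function is symmetric, so its eigenvalues are guaranteed real.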

Where this lives in ML

In high dimensions, saddle points vastly outnumber local minima. For a random critical point in n dimensions, all n eigenvalues would have to share a sign for it to be a true min or max, which is exponentially unlikely. So training a deep net is mostly about escaping saddles, places where the gradient is small but you're nowhere near the bottom, rather than getting trapped in bad local minima.…
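A back-of-the-envelope calculation shows how fast "exponentially unlikely" kicks in. The sketch below assumes a deliberately crude model that is not claimed by the lesson: each eigenvalue's sign is an independent fair coin flip. Real Hessians have correlated eigenvalues, so treat the numbers as intuition only. The function name `same_sign_probability` is hypothetical.

```python
def same_sign_probability(n):
    """Chance that n eigenvalues all share a sign.

    Assumption (toy model): each sign is an independent fair coin flip.
    All positive has probability 0.5**n, all negative likewise, so the total is 2 * 0.5**n.
    """
    return 2 * 0.5 ** n

for n in (1, 2, 10, 100):
    print(n, same_sign_probability(n))
```

Under this toy model, a 2-D critical point is a min or max half the time, while at n = 10 the chance has already dropped below 0.2%.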