Vectors & Geometry of Rⁿ

Multivariate calculus from first principles

Single-variable calculus lived on a line. Machine learning does not. A neural network's weights, an embedding, a gradient: each is a point in high-dimensional space, Rⁿ. The good news is that the geometry you know from the flat plane R² carries over almost word-for-word. A vector is still an arrow from the origin; length, angle, and "shadow onto another vector" all still make sense. We just stop being able to draw it.

A vector v = (v₁, v₂, …, vₙ) is an ordered list of numbers. You can read it two ways at once: as a location (the point you land on) and as a direction with a length (the arrow that gets you there). Both readings matter constantly in ML.

The norm (length) of a vector comes straight from Pythagoras, just with more terms:

Where this lives in MLWhen a transformer decides how much one token should attend to another, it takes the dot product of a query and a key, q·k. That is the same operation as ranking nearest neighbours in an embedding space by cosine similarity, and the same one a linear classifier uses to ask which side of w·x + b = 0 a point lands on. Most of what gets called 'similarity' in ML is this single number a·b.

▶ Vectors & Geometry of Rⁿ

← Quadratic Forms Functions f: Rⁿ → R →