Multivariate calculus from first principles
A function f: Rⁿ → R takes a vector in and returns a single number. The example that drives machine learning is the loss: feed in every weight of the network, get back one number that says how badly it is doing. The whole of training is a hunt for the lowest point of this function.
For two inputs you can actually picture it: z = f(x, y) is a surface, a landscape of hills and valleys floating above the xy-plane. The height at each (x, y) is the function's value.
Imagine the air in a room: stand at any spot and a thermometer reads exactly one temperature. That is a function f: R² → R in disguise: a position (x, y) goes in, and a single number (the warmth there) comes out. The whole room becomes a landscape of warm and cool patches, higher near the radiator, lower by the window.