Parameters & Estimators

Inference, estimation, and decision-making from data

Almost every statistical question has the same shape. There is some true number out in the world that you can't see: the parameter θ (a true mean, a true success probability). You only have a finite sample of data. From that data you compute a guess, θ̂. The rule that turns data into the guess is the estimator, and the number it produces on a given sample is the estimate. Estimation is the art of building good guesses and knowing how much to trust them.

Because the data is random, θ̂ is itself a random quantity: run the experiment again, get a different θ̂. We judge an estimator by two things: its bias (does it land on θ on average?) and its variance (how much does it bounce around from sample to sample?).
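To make bias and variance concrete, here is a minimal simulation sketch in Python. It assumes normally distributed data with a true mean of 5 and a true standard deviation of 2 (illustrative values, not taken from any real dataset), reruns the experiment many times, and measures how each estimator behaves. All names in it are made up for this example.

```python
import random

# Assumed "truth" for the simulation; in real life you never get to see these.
TRUE_MEAN = 5.0
TRUE_SD = 2.0          # so the true variance is 4.0
SAMPLE_SIZE = 10


def sample_mean(xs):
    return sum(xs) / len(xs)


def variance_over_n(xs):
    """Divides by n: biased low as an estimator of the true variance."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def variance_over_n_minus_1(xs):
    """Divides by n - 1: unbiased for the true variance."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def bias_and_variance(estimator, theta, trials=10_000, seed=0):
    """Rerun the experiment `trials` times and summarise the estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
        estimates.append(estimator(sample))
    bias = sample_mean(estimates) - theta      # average miss
    variance = variance_over_n(estimates)      # spread across reruns
    return bias, variance


experiments = [
    ("sample mean", sample_mean, TRUE_MEAN),
    ("variance, divide by n", variance_over_n, TRUE_SD ** 2),
    ("variance, divide by n - 1", variance_over_n_minus_1, TRUE_SD ** 2),
]
for name, estimator, theta in experiments:
    bias, variance = bias_and_variance(estimator, theta)
    print(f"{name:26s} bias={bias:+.3f}  variance={variance:.3f}")
```

In theory the sample mean has zero bias and variance σ²/n = 0.4 here, while dividing by n underestimates the true variance by σ²/n = 0.4 on average. The simulated numbers should land close to those values, up to simulation noise.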

You can't drink the whole pot of soup to judge the seasoning, so you stir well and taste one spoonful. The true saltiness of the entire pot is the parameter θ you can't directly see; the saltiness of your spoonful is the estimate θ̂. Stir thoroughly first and a single spoonful estimates the whole pot remarkably well; that stirring is what makes the sample representative.

Where this lives in ML

Underfitting versus overfitting is the same bias-variance tradeoff. A model's parameters are the θ̂, fit from finite training data. Underfitting = high bias: the model is too simple to capture the truth. Overfitting = high variance: the model is so flexible it memorizes the particular training sample, and a new sample would give wildly different parameters. Choosing model complexity is choosing a point on…
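The two failure modes can be sketched with a toy regression problem. The example below is an illustration under stated assumptions, not a recipe: the true curve sin(2πx), the noise level, the eight evenly spaced training points, and every function name are invented for the demo. It compares a constant model (too simple) with a polynomial that passes through every training point (too flexible), and tracks each model's prediction at one test point across many fresh training samples rather than the raw parameters.

```python
import math
import random


def true_f(x):
    # Assumed ground truth; hidden from the models.
    return math.sin(2 * math.pi * x)


def fit_constant(xs, ys):
    """Underfit: always predict the average training label."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_interpolating_polynomial(xs, ys):
    """Overfit: the degree n-1 Lagrange polynomial through every point."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return predict


def bias_and_variance_at(fit, x0, n=8, noise_sd=0.3, trials=2000, seed=0):
    """Refit on `trials` noisy training sets; summarise predictions at x0."""
    rng = random.Random(seed)
    xs = [i / (n - 1) for i in range(n)]   # fixed inputs, fresh noise each time
    preds = []
    for _ in range(trials):
        ys = [true_f(x) + rng.gauss(0, noise_sd) for x in xs]
        preds.append(fit(xs, ys)(x0))
    mean_pred = sum(preds) / trials
    bias = mean_pred - true_f(x0)
    variance = sum((p - mean_pred) ** 2 for p in preds) / trials
    return bias, variance


for name, fit in [("constant", fit_constant),
                  ("interpolating polynomial", fit_interpolating_polynomial)]:
    bias, variance = bias_and_variance_at(fit, x0=0.25)
    print(f"{name:26s} bias={bias:+.3f}  variance={variance:.4f}")
```

Under these assumptions the constant model should miss by about 1 on average at x0 = 0.25 while barely moving between training sets, and the interpolating polynomial should be nearly right on average but with a variance roughly ten times larger.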
