Cross-Validation

Inference, estimation, and decision-making from data

You can't judge a model by its training error; it has already seen that data, so it can cheat by memorizing. You need its error on data it has never seen. But holding out a single test set keeps that data away from training, and one split gives a noisy estimate. Cross-validation addresses both problems.

In k-fold cross-validation, split the data into k folds of (nearly) equal size. Train on k−1 of them, validate on the held-out one, and rotate so every fold serves as the validation set exactly once. Average the k validation errors for a more stable estimate of how the model generalizes than any single split would give.
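The rotation described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the helper names (`k_fold_indices`, `cross_validate`), the toy "predict the training mean" model, and the squared-error loss are all hypothetical choices made for the example.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle 0..n_samples-1 and deal them into k folds (sizes differ by at most one)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(xs, ys, k, fit, predict, loss, seed=0):
    """Return (average validation error, list of per-fold validation errors)."""
    fold_errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(xs)) if i not in held]
        # Train only on the k-1 folds that are not held out.
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        # Score only on the held-out fold.
        errors = [loss(predict(model, xs[i]), ys[i]) for i in held_out]
        fold_errors.append(mean(errors))
    return mean(fold_errors), fold_errors


# Toy model (an assumption for illustration): always predict the training mean.
def fit_mean(train_x, train_y):
    return mean(train_y)


def predict_mean(model, x):
    return model


def squared_error(prediction, target):
    return (prediction - target) ** 2


xs = list(range(10))
ys = [2.0 * x + 1.0 for x in xs]
cv_error, per_fold = cross_validate(
    xs, ys, k=5, fit=fit_mean, predict=predict_mean, loss=squared_error
)
```

Passing `fit`, `predict`, and `loss` as functions keeps the rotation logic separate from any particular model, so the same loop can score different candidates on identical folds.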

Cross-validation is like sitting several practice exams to predict your real-exam score. If you only graded yourself on questions you'd already memorized the answers to, you'd overestimate wildly, so you set aside a fresh batch of questions each time, score yourself on those, and rotate which batch is held back. Averaging your scores across all the practice sittings gives a far steadier forecast of how you'll do on the day than any single mock exam would.

Where this lives in ML

Cross-validation is how ML practitioners select models and hyperparameters without fooling themselves. It estimates generalization error (the quantity the bias–variance decomposition is about) using all the data efficiently. And it's the front line against data leakage: the silent bug where information from the held-out data sneaks into training and produces gorgeous, completely fake…
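One common way leakage creeps into cross-validation is preprocessing that is fitted on the full dataset before splitting. The sketch below shows the safer pattern for a simple standardization step: the scaling statistics are computed from the training folds only and then applied to the held-out fold. The helper names (`fit_scaler`, `transform`, `scaled_folds`), the data, and the fold assignment are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Compute (mean, std) from the given values; fall back to std=1.0 for constant input."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0
    return mu, sigma


def transform(values, scaler):
    mu, sigma = scaler
    return [(v - mu) / sigma for v in values]


def scaled_folds(values, folds):
    """Yield (train_scaled, valid_scaled) per fold, fitting the scaler on training rows only."""
    for held_out in folds:
        held = set(held_out)
        train = [values[i] for i in range(len(values)) if i not in held]
        valid = [values[i] for i in held_out]
        scaler = fit_scaler(train)  # the held-out rows never influence these statistics
        yield transform(train, scaler), transform(valid, scaler)


# Illustrative data with one extreme value sitting in the last fold.
values = [1.0, 2.0, 3.0, 4.0, 100.0, 6.0]
folds = [[0, 1], [2, 3], [4, 5]]
results = list(scaled_folds(values, folds))
```

Had the scaler been fitted once on all six values, the extreme point in the last fold would have shifted the mean and standard deviation seen during training, which is a small instance of held-out information reaching the model.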
