Non-parametric Tests

Inference, estimation, and decision-making from data

The t-test leans on an assumption: the data are roughly normal. When that is in doubt (small samples, obvious skew, heavy tails, ordinal data), non-parametric tests step in. They make far weaker assumptions about the distribution's shape, usually by working with ranks instead of raw values.

Two staples. The Wilcoxon signed-rank test is the non-parametric counterpart to the paired t-test (matched pairs). The Mann–Whitney U test is the counterpart to the two-sample t-test (two independent groups). Both ask "do the values in one condition tend to be larger than in the other?" without assuming normality.
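To make the two statistics concrete, here is a minimal sketch that computes them by hand in plain Python. The function names and the sample numbers are made up for illustration, and the code stops at the test statistics rather than p-values. In practice you would more likely call a library routine such as `scipy.stats.mannwhitneyu` or `scipy.stats.wilcoxon`.

```python
def mann_whitney_u(a, b):
    """U for group a: count of (x, y) pairs with x > y; a tie counts as 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank(before, after):
    """Return (W_plus, W_minus): rank sums of positive and negative differences."""
    diffs = [y - x for x, y in zip(before, after) if y != x]  # drop zero differences
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical data: two independent groups, then one set of matched pairs.
group_a = [1.1, 2.3, 3.8, 4.0]
group_b = [0.9, 1.5, 2.0]
u_a = mann_whitney_u(group_a, group_b)

before = [10, 12, 9, 14, 11]
after = [12, 15, 8, 18, 11]
w_plus, w_minus = wilcoxon_signed_rank(before, after)

print(u_a, w_plus, w_minus)
```

A U close to the total number of pairs (here 4 × 3 = 12) means the first group tends to be larger. Likewise, a W_plus far above W_minus means the paired differences lean positive.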

Imagine judging a foot race when the stopwatch is broken. You cannot read exact finish times, but you can still see who crossed the line first, second, and third. That finish order, the ranks, is enough to declare a winner, and it does not care whether the times were 10 seconds or 10 minutes apart. Non-parametric tests work the same way: they replace raw values with ranks, so a few wild outliers or a lopsided distribution cannot distort the verdict.
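The foot-race idea can be checked in a few lines. This is a small sketch with invented finish times and a hypothetical `ranks` helper that assumes no ties. It shows that one wildly slow runner moves the mean a lot but leaves the finish order untouched.

```python
from statistics import mean


def ranks(values):
    """Rank 1 = smallest value. Assumes no ties, to keep the sketch short."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


# Invented finish times in seconds; the second list has one extreme outlier.
times = [9.8, 10.1, 10.4, 11.0]
times_outlier = [9.8, 10.1, 10.4, 600.0]

print(mean(times), mean(times_outlier))    # the mean jumps with the outlier
print(ranks(times), ranks(times_outlier))  # the ranks are identical
```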

Where this lives in ML

When comparing model accuracies, the scores are often a handful of non-normal numbers, perfect for non-parametric tests. Permutation tests in particular are a favorite for ML because they make essentially no assumptions and adapt to any test statistic you care about, including weird custom metrics. They're robust exactly where the t-test gets nervous.
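A permutation test is short enough to sketch directly. The version below is an illustration under stated assumptions, not a reference implementation. It enumerates every way of splitting the pooled scores into two groups, which is only feasible for small samples, and it treats the scores as exchangeable under the null hypothesis. The function names and accuracy values are made up. The `statistic` argument is where a custom metric would plug in.

```python
from itertools import combinations
from statistics import mean


def mean_diff(x, y):
    """Default test statistic: difference in group means."""
    return mean(x) - mean(y)


def exact_permutation_test(a, b, statistic=mean_diff):
    """Two-sided p-value: share of regroupings at least as extreme as observed."""
    pooled = list(a) + list(b)
    observed = abs(statistic(a, b))
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        x = [pooled[i] for i in idx]
        y = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point noise in custom statistics.
        if abs(statistic(x, y)) >= observed - 1e-12:
            hits += 1
        total += 1
    return hits / total


# Invented accuracies for two models across three runs each.
scores_a = [0.90, 0.91, 0.92]
scores_b = [0.80, 0.81, 0.82]

p_value = exact_permutation_test(scores_a, scores_b)
print(p_value)
```

With three runs per model there are only 20 possible regroupings, so the smallest achievable two-sided p-value is 2/20 = 0.1. That is a useful reminder that a handful of runs limits how strong the evidence can be. For larger samples, a common alternative is to sample random shuffles instead of enumerating all of them.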
