Inference, estimation, and decision-making from data
Simple linear regression is the bridge from statistics to machine learning: it's the simplest model that predicts. You assume the relationship between an input x and an output y is a line plus random noise, and you find the best-fitting line.
β₀ is the intercept, β₁ the slope, and ε the noise. "Best-fitting" means the line that minimizes the total squared residuals (the vertical gaps between points and line), the method of ordinary least squares (OLS).
Drag the slope and intercept in the figure and watch the sum of squared errors (SSE) change. The OLS line is the unique one that drives the coral residual-sticks' total squared length to its minimum.