Inference, estimation, and decision-making from data
The MLE recipe is almost always the same: write the log-likelihood, take its derivative with respect to the parameter, set it to zero, and solve. For the two distributions you'll meet most, the normal and the Bernoulli, the answer is beautifully simple: it's just a sample average.
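For data drawn from a normal distribution, maximizing the log-likelihood gives the most intuitive estimators possible: the estimate of the mean is the sample mean x̄, and the estimate of the variance is the average squared deviation from x̄ (dividing by n, not n − 1).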
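The normal case can be worked through the recipe in a few lines. What follows is a sketch of that derivation, assuming n independent observations x_1, ..., x_n and treating σ² (rather than σ) as the second parameter; the symbols ℓ, μ, and σ² are notation introduced here for illustration.

```latex
\begin{aligned}
\ell(\mu, \sigma^2)
  &= -\frac{n}{2}\log\!\left(2\pi\sigma^2\right)
     - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2 \\[6pt]
\frac{\partial \ell}{\partial \mu}
  &= \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = 0
  \quad\Longrightarrow\quad
  \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x} \\[6pt]
\frac{\partial \ell}{\partial \sigma^2}
  &= -\frac{n}{2\sigma^2}
     + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i - \mu)^2 = 0
  \quad\Longrightarrow\quad
  \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2
\end{aligned}
```

Both estimators are averages: one of the observations themselves, the other of their squared deviations from x̄.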
Imagine you flip a bent coin a bunch of times to guess how biased it is. Maximum likelihood doesn't agonize over it: the single best guess for the chance of heads is just the fraction of heads you actually saw. Score each head as 1 and each tail as 0, and the estimate p̂ is nothing more than the tally of heads divided by the number of flips, the same plain sample mean x̄ in disguise.
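To see the coin result concretely, here is a minimal Python sketch. It computes p̂ as the fraction of heads and then checks it against a brute-force search over candidate values of p. The flip data are made up for illustration, and the function names `bernoulli_log_likelihood` and `bernoulli_mle` are hypothetical helpers, not part of any library.

```python
import math


def bernoulli_log_likelihood(p, flips):
    """Log-likelihood of heads-probability p for flips coded 1 = heads, 0 = tails."""
    heads = sum(flips)
    tails = len(flips) - heads
    return heads * math.log(p) + tails * math.log(1 - p)


def bernoulli_mle(flips):
    """Closed-form MLE: the fraction of heads, i.e. the sample mean of the 0/1 data."""
    return sum(flips) / len(flips)


# Illustrative data (an assumption, not from the text): 7 heads in 10 flips.
flips = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

p_hat = bernoulli_mle(flips)

# Brute-force check: evaluate the log-likelihood on a grid strictly inside (0, 1)
# and keep the candidate with the highest value.
grid = [k / 1000 for k in range(1, 1000)]
p_grid = max(grid, key=lambda p: bernoulli_log_likelihood(p, flips))

print(p_hat, p_grid)
```

The grid search lands on the same value as the closed-form estimate, which is the point: setting the derivative to zero and solving gives the fraction of heads, so no numerical search is needed.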