1. Hypothesis testing and maximum likelihood estimation (MLE) are introduced as two methods for choosing the explanation that best fits observed data and for predicting similar data.
2. MLE selects the parameter value that maximizes the likelihood function, that is, the probability of the observed data viewed as a function of the parameter.
3. For language models, MLE estimates the probability of the next word given the preceding words: each n-gram probability is the count of that n-gram in the training data divided by the count of its (n-1)-word context.
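
The n-gram estimate in point 3 can be illustrated with a minimal sketch for the bigram case (n = 2). Everything here is an assumption made for illustration: the function name `bigram_mle`, the `<s>` and `</s>` boundary markers, whitespace tokenization, and the three-sentence toy corpus. No smoothing is applied, so unseen bigrams simply have no entry.

```python
from collections import Counter, defaultdict


def bigram_mle(sentences):
    """Estimate P(next | prev) by relative frequency (the bigram MLE).

    Hypothetical helper for illustration; sentences are split on whitespace
    and padded with assumed <s> / </s> boundary markers.
    """
    # pair_counts[prev][nxt] = number of times `nxt` follows `prev`
    pair_counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            pair_counts[prev][nxt] += 1

    # Normalize each context's counts so they sum to 1.
    probs = {}
    for prev, followers in pair_counts.items():
        total = sum(followers.values())
        probs[prev] = {word: count / total for word, count in followers.items()}
    return probs


# Toy corpus, invented for this example.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = bigram_mle(corpus)
print(model["the"])
print(model["cat"])
```

In this toy corpus "the" is followed by "cat" twice and "dog" once, so the estimate of P(cat | the) is 2/3. Dividing by the context count is what makes this the maximum likelihood estimate: among all conditional distributions over the next word, the relative frequencies assign the highest probability to the training data.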