To download the presentation, please go to Harlan's or Jared's websites:
Although traditional multiple regression is an extremely powerful tool for prediction, it can be inadequate when the goal is to predict relationships that differ among groups. For example, the relationship between income and political affiliation varies among American states, and the relationship between income level and calorie intake varies among countries of the world. Traditional multiple regression will either estimate these relationships independently for each group, which can be very problematic if there is not enough data, or lump the groups together, throwing away potentially valuable differences among groups. A more powerful approach is to assume that the group-level effects come from a statistical distribution of their own, just as the error among individual observations is assumed to come from a distribution (often normal). Then the data in each group is "partially pooled" with all of the other data, appropriately splitting the difference between the two extremes. In the general case, this is Bayesian model estimation, which can be very complex and difficult to do well. But in more common cases, simpler statistical techniques, variously called "multilevel regression," "hierarchical regression," or "mixed-effects modeling," can be used to improve the quality of predictions. In this talk, we will motivate and explain the basics of practical multilevel regression, and will demonstrate how it works using R.
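To make "partial pooling" concrete, here is a minimal sketch, written in Python only for illustration (the talk itself uses R). It handles the simplest case, estimating one mean per group, and assumes the within-group and between-group variances are already known. The function name `partial_pool`, the parameter names, and the toy data are all hypothetical. In practice a mixed-effects fitting routine estimates the variances from the data.

```python
from statistics import mean


def partial_pool(groups, sigma_y2, sigma_a2):
    """Return a partially pooled mean estimate for each group.

    groups   : dict mapping group name -> list of observations
    sigma_y2 : assumed within-group (observation-level) variance
    sigma_a2 : assumed between-group variance
    """
    all_obs = [y for ys in groups.values() for y in ys]
    grand_mean = mean(all_obs)  # "complete pooling": ignore the groups

    estimates = {}
    for name, ys in groups.items():
        group_mean = mean(ys)  # "no pooling": use only this group's data
        # Precision weights: groups with more data trust their own mean more.
        w_group = len(ys) / sigma_y2
        w_grand = 1.0 / sigma_a2
        estimates[name] = (
            w_group * group_mean + w_grand * grand_mean
        ) / (w_group + w_grand)
    return estimates


# Toy data (made up): one well-sampled group and one sparse group.
groups = {
    "large": [10.0, 12.0, 11.0, 9.0, 10.0, 11.0, 12.0, 10.0, 11.0, 10.0],
    "small": [20.0, 22.0],
}
pooled = partial_pool(groups, sigma_y2=4.0, sigma_a2=4.0)
for name, estimate in pooled.items():
    print(name, round(estimate, 2))
```

Each estimate falls between the group's own mean and the overall mean. The sparse group is pulled much further toward the overall mean than the well-sampled group, which is the "splitting the difference" behavior described above.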
Harlan D. Harris, PhD, works as a statistical data scientist for Kaplan Test Prep and Admissions in New York City. He previously worked as a cognitive psychology researcher at NYU, UConn and Columbia University, and studied machine learning and cognitive science at the University of Illinois at Urbana-Champaign.
Jared Lander is a statistical consultant based in New York City. With a master's from Columbia University in statistics and a bachelor's from Muhlenberg College in mathematics, he has experience in both academic research and industry. His work for both large and small organizations ranges from music and fundraising to finance and humanitarian relief efforts. He specializes in data management, multilevel models, machine learning, generalized linear models, data management and