John Maxwell, a data scientist at Nordstrom, did his graduate work in international development economics, focusing on field experiments. He has since led microenterprise research projects in Indonesia and Ethiopia, developed large mathematical simulation models used by WSDOT for investment decisions, built dynamic pricing algorithms at Thriftbooks.com, and led the development of Elwin, Nordstrom’s open-source A/B testing service. He currently focuses on contextual multi-armed bandit problems and machine learning infrastructure at Nordstrom.
Solving the Contextual Multi-Armed Bandit Problem at Nordstrom
The contextual multi-armed bandit problem, also known as associative reinforcement learning or bandits with side information, is a formulation of the multi-armed bandit problem that takes into account information about arms and users when deciding which arm to pull. The barrier to entry for both understanding contextual multi-armed bandits and implementing them in production is high. The literature in this field draws on disparate sources including (but not limited to) classical statistics, reinforcement learning, and information theory. As a result, it is hard to find material that fills the gap between very basic explanations and academic journal articles. The goal of this talk is to provide that missing intermediate material, along with an example implementation. Specifically, I will explain key findings from some of the more frequently cited papers in the contextual bandit literature, discuss the minimum requirements for implementation, and give an overview of a production system for solving contextual multi-armed bandit problems.