Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.
Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.
Published on
Spotify uses a range of Machine Learning models to power its music recommendation features including the Discover page, Radio, and Related Artists. Due to the iterative nature of these models they are a natural fit to the Spark computation paradigm and suffer from the IO overhead incurred by Hadoop. In this talk, I review the ALS algorithm for Matrix Factorization with implicit feedback data and how we’ve scaled it up to handle 100s of Billions of data points using Scala, Breeze, and Spark.