Spotify’s Music Recommendations
Lambda Architecture
Esh Kumar @eshvk
Emily Samuels @emilymsa
Overview
‣ Why Lambda?
‣ Use Case: Discover Recommendations
• Batch Architecture
• Real-time Architecture
• Challenges
‣ Future Work
Why Lambda?
• 1 new user every 3 seconds.
• Contextual, time-based recs are becoming more & more important
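The two bullets above can be read as the case for pairing a batch layer with a real-time layer. Below is a minimal sketch of that serving-side merge, assuming (this is not in the slides) that each layer exposes a per-user list of track IDs. The name `serve_recs`, the dictionary shapes, and the "real-time first" ordering are all hypothetical choices made for illustration.

```python
def serve_recs(batch_view, realtime_view, user_id, limit=5):
    """Merge recs for one user: real-time results first, then batch, no duplicates."""
    merged = []
    seen = set()
    # Hypothetical policy: fresher real-time recs take priority over batch recs.
    for track in realtime_view.get(user_id, []) + batch_view.get(user_id, []):
        if track not in seen:
            seen.add(track)
            merged.append(track)
    return merged[:limit]


# Toy data: "bob" is a brand-new user the last batch run has not seen yet.
batch_view = {"alice": ["t1", "t2", "t3"]}
realtime_view = {"alice": ["t9", "t2"], "bob": ["t7"]}

print(serve_recs(batch_view, realtime_view, "alice"))
print(serve_recs(batch_view, realtime_view, "bob"))
```

In this sketch a new user still gets recommendations from the real-time view alone, while established users get both.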
Discover Recs
The Discover Page
Algorithmically generated fresh recs for users.
The Discover Batch Pipeline
Machine Learning Deep Dive
Word2Vec
Words with similar contexts have similar meaning
Word2Vec
King – Man + Woman = Queen
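The analogy above is vector arithmetic followed by a nearest-neighbour lookup. Here is a toy sketch with hand-made 4-dimensional vectors. The vectors are invented for illustration and are not trained embeddings; the helper names `cosine` and `analogy` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Invented axes: (royalty, male, female, fruit). Real embeddings are learned.
vectors = {
    "king":  [1.0, 1.0, 0.0, 0.0],
    "queen": [1.0, 0.0, 1.0, 0.0],
    "man":   [0.0, 1.0, 0.0, 0.0],
    "woman": [0.0, 0.0, 1.0, 0.0],
    "apple": [0.0, 0.0, 0.0, 1.0],
}


def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


print(analogy("king", "man", "woman"))
```

Excluding the three input words from the candidates is the usual convention for this kind of query, since the inputs themselves tend to sit close to the target vector.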
Annoy
• Approximate Nearest Neighbors Oh Yeah!
• https://github.com/spotify/annoy
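To give a feel for approximate nearest-neighbour search, here is a pure-Python sketch of the general idea Annoy is built on: trees that split the points with random hyperplanes, searched only in the leaves the query falls into. This is a simplified stand-in and not Annoy's implementation or API; every name below (`build_tree`, `leaf_for`, `nearest`) is hypothetical, and the leaf size and tree count are arbitrary.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def on_left(x, normal, offset):
    """True if x lies on the non-negative side of the hyperplane."""
    return sum(n * v for n, v in zip(normal, x)) - offset >= 0


def build_tree(points, ids, rng, leaf_size=8):
    """Recursively split ids by the hyperplane equidistant from two random points."""
    if len(ids) <= leaf_size:
        return ("leaf", ids)
    p, q = (points[i] for i in rng.sample(ids, 2))
    normal = [a - b for a, b in zip(p, q)]
    offset = sum(n * (a + b) / 2 for n, a, b in zip(normal, p, q))
    left = [i for i in ids if on_left(points[i], normal, offset)]
    right = [i for i in ids if not on_left(points[i], normal, offset)]
    if not left or not right:  # degenerate split, e.g. duplicate points
        return ("leaf", ids)
    return ("node", normal, offset,
            build_tree(points, left, rng, leaf_size),
            build_tree(points, right, rng, leaf_size))


def leaf_for(tree, x):
    """Walk one tree down to the leaf whose region contains x."""
    while tree[0] == "node":
        _, normal, offset, left, right = tree
        tree = left if on_left(x, normal, offset) else right
    return tree[1]


def nearest(forest, points, x, k=1):
    """Rank only the candidates found in x's leaf of each tree (approximate)."""
    candidates = set()
    for tree in forest:
        candidates.update(leaf_for(tree, x))
    return sorted(candidates, key=lambda i: dist2(points[i], x))[:k]


rng = random.Random(0)
points = [[rng.random() for _ in range(8)] for _ in range(200)]
forest = [build_tree(points, list(range(len(points))), rng) for _ in range(5)]
print(nearest(forest, points, points[17], k=3))
```

The trade-off is visible in `nearest`: it inspects a handful of leaves instead of all 200 points, so it is fast but may miss the true nearest neighbour. Using more trees widens the candidate set and improves recall at the cost of more work per query.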
Batch Architecture
Strengths
Intro to Storm
Storm
• Distributed real-time computation system
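Storm structures a computation as a topology of spouts (stream sources) and bolts (processing steps). The sketch below imitates that shape with plain Python generators on a single machine. It is not Storm's API, and the event format and the names `listen_spout` and `count_bolt` are hypothetical.

```python
from collections import Counter


def listen_spout(events):
    """Spout stand-in: emits one (user, track) tuple at a time."""
    for event in events:
        yield event


def count_bolt(stream):
    """Bolt stand-in: keeps a running play count per track and emits each update."""
    counts = Counter()
    for _user, track in stream:
        counts[track] += 1
        yield track, counts[track]


# Wire the "topology": spout -> bolt.
events = [("alice", "t1"), ("bob", "t1"), ("alice", "t2")]
updates = list(count_bolt(listen_spout(events)))
print(updates)
```

In a real deployment the spout would read from a live event stream and the bolts would run in parallel across workers; the generator chain only shows how tuples flow from one stage to the next.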
Storm @ Spotify
Real-time Architecture
Challenges
• Workers die -> cascading JVM process death
• Memcache flakiness
• Cassandra JVM problems due to write/overwrite pattern
Future/Ongoing Work
• Simplify the topology
• Keep listens for 24 hours
• Ongoing work on other real-time personalization features.
Questions
Esh Kumar eshvk@spotify.com
Emily Samuels esamuels@spotify.com

Mastering Windows 7 A Comprehensive Guide for Power Users .pdf
 

Spotify's Music Recommendations Lambda Architecture

Editor's Notes

  1. Emily: Introduce Emily and Esh. Give background on what we do and who we are: Discover Page, Google Now, Playlist Recommendations, Discover Weekly, personalization features.
  2. Emily
  3. Emily: Scalding
  4. Esh
  5. Emily: Scalding
  6. Emily: Scalding
  7. Esh: Talk about high intent vs. low intent when discussing how user vectors are built.
  8. Esh: A single machine = 100 billion words a day. Word2vec works on the basis of the distributional hypothesis: words that appear in the same contexts have similar meanings. One model in the word2vec framework that we use is the skip-gram model. Essentially, we go through the documents and, for each word in a document, try to predict what the previous and following words should be. Mathematically, this can be shown to be like factorizing a word-context matrix. For us, playlists are documents, and the words are the songs we would like to learn vectors for. The advantage of something like word2vec is that you end up with a geometry defined on top of the vectors, so you could add the vectors of an artist's tracks to get a vector representation of that artist.
  9. Esh: Static indices that can be shipped around. The core principle is LSH (locality-sensitive hashing).
  10. Esh: Reflective of their whole music taste.
  11. Emily: We are the first team to build a production-ready personalization feature using Storm at Spotify. The Kafka queues were optimized for Hadoop ingestion and localized close to the Hadoop cluster in London.
  12. Emily: Spouts, bolts, tuples. Topology to stitch together the bolts.
  13. Emily: First team at Spotify to do real-time recommendations. The Kafka queues were optimized for Hadoop ingestion. The Kafka cluster was localized close to the Hadoop cluster, both in our data center in London.
  14. Emily: Write into the LON Cassandra cluster. Use Sparkey files to store vector info. Splash to ship Sparkey files. Not writing user vectors, only writing out the recs.
  15. Esh: Despite the challenges, we had a successful A/B test and are running this in production.
  16. Emily: Write out the vectors, not just the recs. Service for vectors. Aggregation service on top of vectors to compute recs. Use real-time data to improve recs for all users.
  17. Emily: Recap: Why Lambda, the batch architecture, the real-time architecture, challenges, future work.
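
Notes 8 and 9 mention two ideas that a small example may make more concrete: adding track vectors to get an artist vector, and LSH as the principle behind nearest-neighbor indices. The following is only a rough sketch under stated assumptions, not the pipeline from the talk. The track names, the 3-dimensional vectors, and every function name are hypothetical; real word2vec vectors would be learned from playlists, and a library such as Annoy would replace the brute-force ranking shown here.

```python
import math
import random

# Hypothetical toy vectors; real track vectors would be learned by word2vec
# from playlists and would have far more dimensions.
TRACK_VECTORS = {
    "track_a": [1.0, 0.0, 0.0],
    "track_b": [0.8, 0.2, 0.0],
    "track_c": [0.0, 1.0, 0.0],
    "track_d": [0.0, 0.0, 1.0],
}


def add_vectors(vectors):
    """Sum vectors component-wise, e.g. an artist's tracks -> artist vector."""
    return [sum(components) for components in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_tracks(query, track_vectors, k=2):
    """Exact (brute-force) ranking of tracks by cosine similarity to the query."""
    ranked = sorted(
        track_vectors,
        key=lambda name: cosine(query, track_vectors[name]),
        reverse=True,
    )
    return ranked[:k]


def make_hyperplanes(count, dim, seed=0):
    """Draw `count` random hyperplane normals for a random-hyperplane LSH."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


def lsh_signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the hyperplane the vector lies on.

    Vectors separated by a small angle tend to share many bits, which is what
    lets an index bucket candidates instead of scanning every item.
    """
    return tuple(
        int(sum(a * b for a, b in zip(vector, plane)) >= 0.0)
        for plane in hyperplanes
    )


# Artist vector as the sum of that (hypothetical) artist's track vectors.
artist_vector = add_vectors([TRACK_VECTORS["track_a"], TRACK_VECTORS["track_b"]])
closest = nearest_tracks(artist_vector, TRACK_VECTORS, k=2)

# Signatures depend only on direction, so scaling a vector leaves them unchanged.
planes = make_hyperplanes(count=8, dim=3)
signature = lsh_signature(artist_vector, planes)
```

Brute-force ranking is fine for four tracks. At catalog scale, the bit signatures (or a tree-based index built on the same idea) would narrow the search to a few candidate buckets, trading exactness for speed.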