Artist-driven layering and user’s behaviour impact on
recommendations in a playlist continuation scenario
Creamy Fireflies
RecSys Challenge Workshop 2018
The Creamy Fireflies Team
We are a team of six MSc students from Politecnico di Milano:
• Sebastiano Antenucci
• Simone Boglio
• Emanuele Chioso
• Ervin Dervishaj
• Shuwen Kang
• Tommaso Scarlatti
and one PhD candidate:
• Maurizio Ferrari Dacrema
Spotify RecSys Challenge 2018
• Music recommendation, automatic playlist continuation
• Recommend 500 tracks for each of 10K playlists, divided into 10 categories
Tracks
• Main: only data provided by Spotify through the MPD (Million Playlist Dataset)
• Creative: external, public and freely available data allowed
Metrics
• R-precision
• NDCG
• Recommended Songs clicks
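The three metrics can be sketched as follows. This is a simplified illustration based on textbook definitions, not the official evaluation code: the function names are hypothetical, the official challenge rules may differ in details, and the fallback value of 51 clicks (one more than the 50 pages of 10 tracks in a 500-track list) is an assumption.

```python
import math

def r_precision(recommended, relevant):
    """Share of the relevant tracks found in the top-|relevant| recommendations."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    top = recommended[:len(relevant)]
    return len(set(top) & relevant) / len(relevant)

def ndcg(recommended, relevant):
    """Binary-relevance NDCG with a log2 position discount."""
    relevant = set(relevant)
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, track in enumerate(recommended) if track in relevant)
    ideal_hits = min(len(relevant), len(recommended))
    ideal = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / ideal if ideal > 0 else 0.0

def recommended_songs_clicks(recommended, relevant, page_size=10, not_found=51):
    """Number of 10-track pages to skip before the first relevant track appears."""
    relevant = set(relevant)
    for rank, track in enumerate(recommended):
        if track in relevant:
            return rank // page_size
    return not_found  # assumed penalty when no relevant track is recommended
```

For R-precision and NDCG higher is better, while for clicks lower is better.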
Preprocessing
The cold-start problem
For playlists with no interactions, we built a feature space starting
from the playlist titles:
1. Remove spaces from titles made only of separated single letters
(e.g. "w o r k o u t" -> "workout")
2. Remove uncommon characters
3. Extract and reconcile dates
4. Apply Lancaster and Porter stemming to generate tokens
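The first two cleaning steps can be sketched with regular expressions. This is a minimal illustration, not the team's code: `clean_title` is a hypothetical name, and treating everything outside lowercase letters, digits and spaces as "uncommon" is an assumption. Date reconciliation and stemming are left out.

```python
import re

def clean_title(title):
    """Return the whitespace-separated tokens of a normalized playlist title."""
    text = title.lower().strip()
    # Step 1: a title made only of single characters separated by single spaces
    # ("w o r k o u t") is collapsed into one word.
    if re.fullmatch(r"(?:\w )+\w", text):
        text = text.replace(" ", "")
    # Step 2: replace every run of characters that is not a lowercase letter,
    # a digit or a space with a single space (assumed definition of "uncommon").
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    return text.split()
```

The resulting tokens would then be passed to a stemmer, for example the Porter and Lancaster stemmers available in NLTK.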
Preprocessing
Artist Heterogeneity
• Playlists sometimes exhibit a common underlying structure due to the
way a user fills them:
- Adding tracks from the same album
- Adding tracks from the same artist (including featured appearances)
- Starting a playlist with many different artists and adding
tracks by the same artists later on
Algorithms
• Personalized Top Popular
- Track based
- Album based
• Collaborative Filtering - Track based
• Collaborative Filtering - Playlist based
• Content Based Filtering - Track based
• Content Based Filtering - Playlist based
- Track features
- Playlist names
Personalized Top Popular
• For playlists with just one track, we applied a personalized top
popular algorithm at two levels:
- Track-based: compute top popular over all the playlists that
contain that track
- Album-based: given the album of the track, compute top
popular over all the playlists that contain the tracks of the album
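The track-based variant can be sketched as a co-occurrence count. This is an illustrative sketch under stated assumptions: `track_based_top_popular` is a hypothetical name, playlists are assumed to be a dict mapping a playlist id to its list of track ids, and each playlist counts a track at most once.

```python
from collections import Counter

def track_based_top_popular(playlists, seed_track, n=10):
    """Rank tracks by how many playlists containing seed_track also contain them."""
    counts = Counter()
    for tracks in playlists.values():
        if seed_track in tracks:
            # Count each co-occurring track once per playlist, skipping the seed.
            counts.update(t for t in set(tracks) if t != seed_track)
    return [track for track, _ in counts.most_common(n)]
```

The album-based variant would follow the same pattern, selecting the playlists that contain any track of the seed track's album.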
Collaborative Filtering
• Track based: BM25 normalization, tracks similarity, score prediction
• Playlist based: playlists similarity (Tversky), score prediction
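The Tversky similarity between two playlists, seen as sets of tracks, can be sketched as below. The formula is the standard Tversky index; adding the shrink term to the denominator is an assumption about how it is applied, and the function name and defaults are illustrative.

```python
def tversky(x, y, alpha=1.0, beta=1.0, shrink=0.0):
    """Tversky index between two collections of track ids.

    alpha weighs the tracks only in x, beta the tracks only in y.
    alpha = beta = 1 gives Jaccard, alpha = beta = 0.5 gives Dice.
    """
    x, y = set(x), set(y)
    common = len(x & y)
    denominator = common + alpha * len(x - y) + beta * len(y - x) + shrink
    return common / denominator if denominator > 0 else 0.0
```

Unequal alpha and beta make the similarity asymmetric, which can help when a short playlist is compared against a much longer one.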
Content Based Filtering - Track based
• Pipeline: BM25 normalization, tracks similarity, score prediction
• ICM (item content matrix): TRACKS × FEATURES, with FEATURES = ALBUMS + ARTISTS
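A small ICM with albums and artists as features, and the track-track similarity derived from it, can be sketched with NumPy. This is a simplified illustration: the function name and input layout are hypothetical, cosine similarity is an assumption (the slide only says "tracks similarity"), the BM25 normalization step is omitted, and a dense matrix is used for readability.

```python
import numpy as np

def track_similarity(tracks):
    """tracks: list of (track_id, album_id, artist_id) tuples.

    Returns the track x track cosine similarity computed on a binary ICM
    whose columns are the albums followed by the artists.
    """
    albums = sorted({album for _, album, _ in tracks})
    artists = sorted({artist for _, _, artist in tracks})
    album_col = {a: i for i, a in enumerate(albums)}
    artist_col = {a: len(albums) + i for i, a in enumerate(artists)}

    icm = np.zeros((len(tracks), len(albums) + len(artists)))
    for row, (_, album, artist) in enumerate(tracks):
        icm[row, album_col[album]] = 1.0
        icm[row, artist_col[artist]] = 1.0

    # Normalize the rows, then take dot products to get cosine similarity.
    unit = icm / np.linalg.norm(icm, axis=1, keepdims=True)
    similarity = unit @ unit.T
    np.fill_diagonal(similarity, 0.0)  # a track should not recommend itself
    return similarity
```

Two tracks sharing both album and artist get similarity 1, while tracks sharing only the artist get 0.5.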
Content Based Filtering - Playlist based
• Two different approaches starting from the playlist titles:
1. CBF based on the tokens extracted in the preprocessing phase (PLAYLISTS × TOKENS matrix)
2. CBF based on an exact title match (PLAYLISTS × TITLES matrix)
Ensemble
• Different algorithms are better suited for subsets of playlists with
specific characteristics
- Content-based: short playlists with similar features
- Collaborative filtering: long and heterogeneous playlists
• Weighted sum of the predictions of each algorithm, with a separate set of weights for each category
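A weighted-sum ensemble for a single playlist can be sketched as follows. The exact formula from the slide is not available here, so this is only an assumed form: the function name, the dict-based inputs, and the example weights are all hypothetical. In practice one weight dict would be kept per playlist category.

```python
def ensemble_scores(predictions, weights):
    """Combine per-algorithm track scores into one score per track.

    predictions: {algorithm_name: {track_id: score}}
    weights:     {algorithm_name: weight} for the playlist's category
    """
    combined = {}
    for algorithm, scores in predictions.items():
        weight = weights.get(algorithm, 0.0)  # unknown algorithms contribute nothing
        for track, score in scores.items():
            combined[track] = combined.get(track, 0.0) + weight * score
    return combined

# Usage with made-up scores and weights.
predictions = {"cf": {"a": 1.0, "b": 0.5}, "cbf": {"b": 1.0, "c": 0.2}}
weights = {"cf": 0.7, "cbf": 0.3}
combined = ensemble_scores(predictions, weights)
ranking = sorted(combined, key=combined.get, reverse=True)
```

The scores of the individual algorithms should be on comparable scales before they are summed, otherwise the weights also have to absorb the scale differences.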
Parameter tuning
• For each algorithm and each category:
- k-nearest neighbours
- power p for similarity values
- Tversky coefficients
- shrink term h
• Ensemble:
- Bayesian optimization
- NDCG
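Two of the listed parameters, the number of neighbours k and the power p, can be illustrated on a similarity matrix. This is a sketch under assumptions: the function name is hypothetical, similarities are assumed to be non-negative, and the power is assumed to be applied element-wise to the similarities that survive the k-nearest-neighbour pruning.

```python
import numpy as np

def knn_power(similarity, k, p):
    """Keep the k largest similarities in each row and raise them to the power p."""
    sim = np.asarray(similarity, dtype=float)
    pruned = np.zeros_like(sim)
    for i, row in enumerate(sim):
        top = np.argsort(row)[::-1][:k]  # indices of the k largest values in the row
        pruned[i, top] = row[top] ** p
    return pruned
```

With similarities in [0, 1], a power p > 1 widens the gap between strong and weak neighbours, while p < 1 flattens it.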
Creative track
External datasets
• We tried several external datasets to enrich the MPD
• We used the Spotify API to retrieve track popularity and audio
features such as loudness, danceability, energy, tempo…
Creative track
• A CBF that adjusts the artist-based track recommendation
using 10 additional features
• Track-track similarity computed using only artists as features
cannot distinguish tracks belonging to the same artist
[Figure: binary TRACKS × ARTISTS matrix]
Creative track - Artist layering
1. For each feature, split the tracks into 4 clusters with an equal number of elements
2. Considering the feature clusters as a 3rd dimension, split the dense ICM into
4 sparse layers
3. Concatenate the 4 sparse layers horizontally to create the
final sparse ICM, then apply CBF
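The three steps can be sketched for a single feature as below. This is one possible reading of the slide, not the team's implementation: the function name is hypothetical, clusters are assumed to be equal-sized quantile bins of the feature value, and dense NumPy arrays are used for readability where the slide uses sparse matrices.

```python
import numpy as np

def artist_layers(track_artist, feature_values, n_clusters=4):
    """Split a TRACKS x ARTISTS matrix into n_clusters layers and stack them.

    track_artist:   (n_tracks, n_artists) binary matrix
    feature_values: one value per track (e.g. energy or tempo)
    Returns an (n_tracks, n_clusters * n_artists) matrix.
    """
    track_artist = np.asarray(track_artist, dtype=float)
    feature_values = np.asarray(feature_values, dtype=float)
    n_tracks = track_artist.shape[0]

    # Step 1: rank the tracks by feature value and cut the ranking into
    # n_clusters groups of (nearly) equal size.
    order = np.argsort(feature_values, kind="stable")
    cluster = np.empty(n_tracks, dtype=int)
    cluster[order] = np.arange(n_tracks) * n_clusters // n_tracks

    # Step 2: one layer per cluster, keeping only the rows of that cluster.
    layers = []
    for c in range(n_clusters):
        layer = np.zeros_like(track_artist)
        mask = cluster == c
        layer[mask] = track_artist[mask]
        layers.append(layer)

    # Step 3: concatenate the layers horizontally.
    return np.hstack(layers)
```

In the stacked matrix, two tracks by the same artist share a column only when they fall in the same feature cluster, so an artist-only similarity can now tell them apart.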
Postprocessing
• We improve our score by leveraging domain-specific patterns of
the dataset
• Re-ranking with boosts that share a common workflow:
1. Start from a list of K predicted tracks for a playlist p
2. Normalize the scores
3. Boost the precomputed score
Postprocessing
Gap Boost
• Heuristic for playlists where the known tracks are not given in order
• Re-rank the final prediction, giving more weight to tracks that
seem to better "fit" between all gaps
Computational requirements
• To run our model we used an AWS memory-optimized cr1.8xlarge
VM with 32 vCPUs and 244 GiB of RAM
• Parameter tuning for the ensemble takes up to 16h but is
computed only once
Results and conclusions
• Simple, modular architecture
• Extensible with no impact on the pre-existing workflow
• Implementation in Cython of the most computationally
intensive tasks
SimilariPy - Fast Python KNN-Similarity algorithms for
Collaborative Filtering models
Thank you!
Questions?
creamy.fireflies@gmail.com
github.com/maurizioFD/spotify-recsys-challenge
github.com/bogliosimone/similaripy
