Personalizing a Stream of Content
Saving a Legacy Broadcaster from the Graying of Radio
May 21, 2016
NB: Our Team signed an NDA and will refer to our
partner organization as the “Broadcaster” within
published materials.
Understanding the problem
Company
The Broadcaster is a legacy
media organization that
produces and distributes
audio content to radio
stations around the
country.
Context
Younger Americans have
different media
consumption habits,
expectations, and aesthetic
inclinations than the
millions of loyal listeners
our partner has served
since it was founded.
Problem
A drop in younger
listeners has made a dent
in the Broadcaster’s
audience. Without a large
younger audience to
replace older Americans
who die, the future of the
organization is in jeopardy.
The median age of our
partner’s radio audience
has steadily climbed in the
last two decades.
Nora Smith
● Her parents listen to the
Broadcaster, and she hears it when
the station is on at home.
● She doesn’t own a radio at home.
● She doesn’t use the car radio.
● She listens to Spotify or local
music files on her phone.
● She gets her news and audio
stories from podcasts.
For any given user, for any
given hour, should we serve
them a
Podcast or News?
Hypothesis:
Listening Sessions
Everyone wants news, unless
interactions with the app in previous
listening sessions show the user
prefers podcasts at this time of day.
We hypothesize that user preference
can be inferred from users’
interactions with the app during
previous listening sessions.
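The hypothesis can be read as a simple default-to-news rule. Here is a minimal sketch of that rule, assuming each past session is reduced to a (user, hour, content type) tuple. The function names, the tuple layout, and the majority-count threshold are illustrative assumptions, not the team's actual model.

```python
from collections import Counter, defaultdict


def build_hourly_history(past_sessions):
    """Count, per (user, hour), how often each content type was played."""
    history = defaultdict(Counter)
    for user_id, hour, content_type in past_sessions:
        history[(user_id, hour)][content_type] += 1
    return history


def recommend(history, user_id, hour):
    """Serve news unless past sessions at this hour favored podcasts."""
    counts = history.get((user_id, hour))
    if counts and counts["podcast"] > counts["news"]:
        return "podcast"
    return "news"  # default for new users and for ties


# Hypothetical listening history for one user.
past_sessions = [
    ("nora", 8, "news"),
    ("nora", 20, "podcast"),
    ("nora", 20, "podcast"),
    ("nora", 20, "news"),
]
history = build_hourly_history(past_sessions)
print(recommend(history, "nora", 20))
print(recommend(history, "nora", 8))
```

A user with no history falls through to the news default, which matches the "everyone wants news" starting point.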
Ingestion and Wrangling
Data deep-dive
Raw Data: Dated 8/2014-2/2016, 614,000+ Unique Users, 98,000,000+ Records
Implicit and Explicit User Signals: Start time, Completion, Skipped,
Search Begin, Search Complete, Shared, Thumbs Up
Ingesting the Data
App Interactions over Time
User Trends: Interactions by Day, Interactions by Hour
Defining Listening Sessions
Axes: Time, User interaction
Types of Interaction: Complete, Start, Skip, Search Begin,
Search Complete, Thumbs Up, Share
Calculate Story Duration
Measure the Gap Length Between Content
If gap is ≤ 10 seconds,
assume next story is part
of same listening session
Example gaps between stories: 8, 3, 120, 1, 1, 80
Define Sessions
Session One, Session Two, Session Three
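The gap rule can be sketched in a few lines. This is a minimal illustration, assuming each story is a (start, end) pair in seconds and the list is sorted by start time. The function name, the data layout, and the 60-second story length are assumptions made for the example. Only the 10-second threshold and the sample gaps come from the slides.

```python
def split_into_sessions(stories, max_gap_seconds=10):
    """Group (start, end) stories into sessions using the gap rule.

    A story joins the current session when the gap between the previous
    story's end and its own start is at most max_gap_seconds.
    """
    sessions = []
    for start, end in stories:
        if sessions and start - sessions[-1][-1][1] <= max_gap_seconds:
            sessions[-1].append((start, end))
        else:
            sessions.append([(start, end)])
    return sessions


# Build seven stories separated by the gaps shown on the slide.
# The 60-second story length is a made-up value for illustration.
gaps = [8, 3, 120, 1, 1, 80]
stories, clock = [], 0
for gap in [0] + gaps:
    start = clock + gap
    end = start + 60
    stories.append((start, end))
    clock = end

sessions = split_into_sessions(stories)
print([len(s) for s in sessions])  # [3, 3, 1]
```

The gaps of 120 and 80 seconds exceed the threshold, so the seven stories split into three sessions.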
User Trends: Users over Total Listening Time in Seconds, Total Actions by Type
Data for Machine Learning
Twelve Features
prev_duration
prev_num_ratings
prev_avg_rating_news
prev_avg_rating_podcast
prev_shift
prev_num_news
prev_num_podcast
prev_num_complete
prev_num_thumbup
prev_num_skip
prev_num_searchcomplete
time_diff_hr
Four Sample Sets
Sample One: All sessions related to
randomly selected 10K users
Sample Two: Randomly chosen
20K sessions
Sample Three: A set of 20K sessions
that reflected the total population’s
behavior
Sample Four: A set of 50K sessions
that reflected the total population’s
behavior
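Sample One and Sample Two differ in what is drawn at random: users in the first case, sessions in the second. Here is a rough sketch of that difference, assuming each session is a dict with a `user_id` key. The function names, the dict layout, and the toy data are hypothetical. How Samples Three and Four were made to reflect the population is not described in the slides, so they are not sketched here.

```python
import random


def sample_by_user(sessions, n_users, seed=0):
    """Pick users at random, then keep every session of each picked user."""
    rng = random.Random(seed)
    users = sorted({s["user_id"] for s in sessions})
    chosen = set(rng.sample(users, min(n_users, len(users))))
    return [s for s in sessions if s["user_id"] in chosen]


def sample_sessions(sessions, n_sessions, seed=0):
    """Pick sessions at random, regardless of which user they belong to."""
    rng = random.Random(seed)
    return rng.sample(sessions, min(n_sessions, len(sessions)))


# Toy data: user u has u + 1 sessions, 15 sessions in total.
sessions = [
    {"user_id": u, "session_id": i} for u in range(5) for i in range(u + 1)
]
by_user = sample_by_user(sessions, n_users=2)
by_session = sample_sessions(sessions, n_sessions=4)
```

Sampling by user keeps each selected user's full history intact, which matters when features are computed from a user's previous session.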
Feature Analysis and Modeling
Classifier        Metric     20K Randomly       20K Reflecting  10K Users with  50K Reflecting
                             Selected Sessions  Total Pop.      All Sessions    Total Pop.
Extra Trees Cl    f1         0.191344           0.190164        0.250842        0.221719
                  precision  0.291667           0.305263        0.330313        0.316129
                  recall     0.142373           0.138095        0.202195        0.170732
Random Forest Cl  f1         0.170213           0.170648        0.253954        0.204947
                  precision  0.271186           0.308642        0.369869        0.390135
                  recall     0.124031           0.117925        0.193357        0.138978
GaussianNB        f1         0.293173           0.257757        0.311551        0.288052
                  precision  0.287402           0.248848        0.305672        0.294807
                  recall     0.299180           0.267327        0.317661        0.281600
BernoulliNB       f1         0.285156           0.234987        0.300412        0.280107
                  precision  0.295547           0.258621        0.313837        0.317172
                  recall     0.275472           0.215311        0.288088        0.250799
SVM               f1         0.030651           0.000000        Cannot Compute  0.015723
                  precision  0.666667           0.000000        Cannot Compute  0.555556
                  recall     0.015686           0.000000        Cannot Compute  0.007974
LR                f1         0.040268           0.010101        0.050543        0.032949
                  precision  0.400000           0.142857        0.441729        0.714286
                  recall     0.021201           0.005236        0.026805        0.016863
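The f1 values in these results are the harmonic mean of precision and recall, which is why a model with high precision but almost no recall (SVM, LR) still scores near zero. A small sketch of that arithmetic, using two precision/recall pairs reported for the 20K randomly selected sessions; the function name is our own.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the 20K randomly selected sessions.
svm_f1 = f1_score(0.666667, 0.015686)
nb_f1 = f1_score(0.287402, 0.299180)
print(round(svm_f1, 4), round(nb_f1, 4))
```

Both results agree with the reported f1 figures (0.030651 for SVM and 0.293173 for GaussianNB) to within rounding.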
Machine Learning Results of Random Forest Classifier
Feature 0 prev_duration
Feature 1 prev_num_ratings
Feature 2 prev_avg_rating_news
Feature 3 prev_avg_rating_podcast
Feature 4 prev_shift
Feature 5 prev_num_news
Feature 6 prev_num_podcast
Feature 7 prev_num_complete
Feature 8 prev_num_thumbup
Feature 9 prev_num_skip
Feature 10 prev_num_searchcomplete
Feature 11 time_diff_hr
Model Refinement and Tuning
Classifier  Metric     20K Randomly       20K Reflecting  10K Users with  50K Reflecting
                       Selected Sessions  Total Pop.      All Sessions    Total Pop.
GaussianNB  f1         0.308000           0.359091        0.292221        0.269928
            precision  0.331897           0.389163        0.313606        0.290448
            recall     0.287313           0.333333        0.273565        0.252115
LR          f1         0.084507           0.090535        0.046577        0.037037
            precision  0.461538           0.354839        0.423762        0.354839
            recall     0.046512           0.051887        0.024643        0.019538
Broadcaster Data without Sessions
Classifier  f1        precision  recall
GaussianNB  0.967882  0.937957   0.999780
LR          0.968729  0.940750   0.998425
Conclusion: Session Data v Non Session Data
Validation of extracted data with 10K sample and GaussianNB model
         Precision  Recall  F1-score  Support
News     0.88       0.90    0.89      8497
Podcast  0.32       0.28    0.30      1435
Total    0.80       0.81    0.80      9932
Validation of provided data with the Broadcaster validation set and Broadcaster model
         Precision  Recall  F1-score  Support
News     0.99       0.99    0.99      3656335
Podcast  0.92       0.85    0.89      323856
Total    0.98       0.98    0.98      3980191
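The Total row in the smaller report (support 9,932) is consistent with a support-weighted average of the per-class scores. The short check below assumes that is how it was computed; the slides do not say so, and the helper name is our own.

```python
def weighted_average(values, supports):
    """Average per-class scores, weighting each class by its support."""
    return sum(v * s for v, s in zip(values, supports)) / sum(supports)


# Per-class scores from the report: News first, then Podcast.
supports = [8497, 1435]
precision = weighted_average([0.88, 0.32], supports)
recall = weighted_average([0.90, 0.28], supports)
f1 = weighted_average([0.89, 0.30], supports)
news_share = supports[0] / sum(supports)

print(round(precision, 2), round(recall, 2), round(f1, 2))  # 0.8 0.81 0.8
print(round(news_share, 2))  # 0.86
```

Because News makes up about 86% of the rows, the Total row mostly reflects the News scores. The Podcast row is the better indicator of how well the minority class is predicted.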
Conclusions and
Future Investigations
Some of our Troubles and Lessons Learned
● Hosting on Dreamhost.
● We used up a lot of time trying SVM.
● Feature weighting can be performed with tree classifiers.
● Staying flexible at the beginning of the project.
● User segmentation.
● Don’t be afraid to network to find real world problems with real world data.
● When in doubt, Google it!
The Team
Anthea Watson Strong
At age 8, thought quicksand
was going to be a much
bigger problem than it’s
turned out to be.
Nicole Donnelly
Recovering consultant
Sujit Ray
Knows things about mail
The End
