Replication in Data Science - A Dance Between Data Science & Machine Learning Strata 2016

3,235 views

Published on

We use Iterative Supervised Clustering as a simple building block for exploring Pinterest's content. Simplicity can unlock great power, and with this building block we show a shocking result: how hard it is to replicate data science conclusions. That leads us to a question about the future: when is data science a house of cards?

Published in: Data & Analytics


  1. Pinterest
  2. Iterative supervised clustering: A dance between data science and machine learning. Dr June Andrews, September 2016
  3. Agenda: (1) Explore Pinterest’s content, (2) Question our understanding, (3) Inspire the future
  5. Clothing, Cooking, Decorating, Beauty, Teaching, Carpentry, Cars, Animated GIFs, Electronics, Stereos, Fashion, Sewing, Articles, Painting, Photography, Nature, Cute cats, Tattoos, Hair, Microscopy, TV shows, Apps, Self help, Motorcycles
  6. Chairs
  7. Fashion, Travel, Garden, Chairs, Food
  8. Links are behind every Pin. How are users engaging with link domains?
  9. Tools, with pros and cons:
      • Cluster algorithms (SVM, K-Means, Spectral). Pros: considers all users; accurate. Cons: tough to communicate; definitions change over time.
      • User experience studies. Pros: deep knowledge; captures the immeasurable. Cons: costly; considers few users.
      • Domain expert hypothesis. Pros: human interpretable. Cons: inaccurate.
  11. Current cluster analysis: (1) Clean and load data into favorite clustering algorithm, (2) Build visualizations on top of clusters, (3) Fiddle with parameters in clustering algorithm, (4) Add human labels to each cluster, (5) Share human interpretation of clusters
  12. Current cluster analysis: (1) Clean and load data into favorite clustering algorithm, (2) Build visualizations on top of clusters, (3) Fiddle with parameters in clustering algorithm, (4) Add human labels to each cluster, (5) Share human interpretation of clusters. Fatal flaw
  13. Human in the loop computing: Community membership identification from small seed sets (Kloumann & Kleinberg). [Diagram: Domain Expert, Favorite Clustering Algorithm]
  14. Human in the loop computing: When machine confidence dips, engage with domain expert. [Diagram: Domain Expert, Favorite Clustering Algorithm; labels: Unsure, Confident]
  16. Human in the loop computing: Domain expert determines when labeling is done. [Diagram: Domain Expert, Favorite Clustering Algorithm; "That's all!"]
  17. Current analysis methodology: (1) Clean and load data into favorite clustering algorithm, (2) Build visualizations on top of clusters, (3) Fiddle with parameters in clustering algorithm, (4) Add human labels to each cluster, (5) Share human interpretation of clusters
  18. Human in the loop computing. Stage 1: Machine clusters data. [Diagram: Favorite Clustering Algorithm]
  19. Human in the loop computing. Stage 2: Domain expert creates 1 human interpretable cluster. [Diagram: Domain Expert]
  20. Human in the loop computing. Stage 3: Remove human labeled clusters and iterate. [Diagram: Favorite Clustering Algorithm, Domain Expert]
  21. How are users engaging with link domains? Links are behind every Pin. For a sample set of link domains we’re interested in:
      • All Pin creates in their first year on Pinterest
      • All repins in their first year on Pinterest
      • 100k link domains sampled total
  22. Python Notebook
  23. Python Notebook: Provides guided iteration
  24. Python Notebook: Sample visualization for each cluster. [Plot: Pin creates vs. Repins, each axis running from Few to Many]
  25. Iteration 1. Title: Dark content. Description: Fewer than 2 Pins a week on average. Examples: Noisy low quality content
  26. Iteration 2: 42% of domains left. [Figure: three panels, Cluster 1, Cluster 2, Cluster 3; axes: Pin creates, Repins]
  27. Iteration 2: Pinterest specials
      Description: Domains with few Pins, but these Pins thrive in the Pinterest ecosystem
      Calculation:
          def detect_pinterest_specials(domain_engagement):
              ratio = domain_engagement.n_repins / max(1.0, float(domain_engagement.n_pin_creates))
              return domain_engagement.n_pin_creates <= X and ratio >= Y
      Examples: Fashion and impulse sites
      [Plot: Pinterest specials; axes: Pin creates, Repins]
  28. Iteration 3: 33% of domains left. [Figure: three panels, Cluster 1, Cluster 2, Cluster 3; axes: Pin creates, Repins]
  29. Iteration 3: Steady growth
      Description: Active Pin creates and steady growth throughout the year
      Calculation:
          def detect_steady_growth(domain_engagement):
              (growth_rate, intercept) = np.polyfit(
                  range(len(domain_engagement.monthly_repins)),
                  domain_engagement.monthly_repins, 1)
              return months_pins_created >= X and growth_rate >= Y
      Examples: Recipe and DIY sites
      [Plot: Steady growth; axes: Pin creates, Repins]
  30. Iteration 4: 25% of domains left. [Figure: three panels, Cluster 1, Cluster 2, Cluster 3; axes: Pin creates, Repins]
  31. Iteration 4: Slow growth
      Description: Similar to steady growth, but not as fast
      Calculation:
          def detect_steady_growth(domain_engagement):
              (growth_rate, intercept) = np.polyfit(
                  range(len(domain_engagement.monthly_repins)),
                  domain_engagement.monthly_repins, 1)
              return months_pins_created >= X and growth_rate >= Y
      Examples: Little lower quality recipe and DIY sites
      [Plot: Slow growth; axes: Pin creates, Repins]
  32. Iteration 5: Churning
      Description: Slowly fade through the year
      Calculation:
          def detect_churning(domain_engagement):
              (repin_growth, intercept) = np.polyfit(
                  range(len(domain_engagement.monthly_repins) - 2),
                  domain_engagement.monthly_repins[2:], 1)
              (pin_create_growth, intercept) = np.polyfit(
                  range(len(domain_engagement.monthly_repins) - 2),
                  domain_engagement.monthly_pin_creates[2:], 1)
              return repin_growth < 0 and pin_create_growth < 0
      Examples: Fashion sale and click bait sites
      [Plot: Churning; axes: Pin creates, Repins]
  33. Iteration 6: Yearly
      Description: Slowly fade through the year
      Calculation:
          def detect_churning(domain_engagement):
              (repin_growth, intercept) = np.polyfit(
                  range(len(domain_engagement.monthly_repins) - 2),
                  domain_engagement.monthly_repins[2:], 1)
              (pin_create_growth, intercept) = np.polyfit(
                  range(len(domain_engagement.monthly_repins) - 2),
                  domain_engagement.monthly_pin_creates[2:], 1)
              return repin_growth < 0 and pin_create_growth < 0
      Examples: Seasonal fashion, such as snow boots
      [Plot: Yearly; axes: Pin creates, Repins]
  34. Iteration 7: Late bloomer
      Description: Peak mid year
      Calculation:
          def detect_late_bloomer(domain_engagement):
              (concavity, pin_growth, intercept) = np.polyfit(
                  range(len(domain_engagement.monthly_repins) - 2),
                  [r + p for (r, p) in zip(domain_engagement.monthly_repins[2:],
                                           domain_engagement.monthly_pin_creates[2:])],
                  2)
              return concavity < 0
      Examples: Blogs that get off to a slow start
      [Plot: Pinterest late bloomer; axes: Pin creates, Repins]
  35. Clusters:
      • Dark content
      • Pinterest specials
      • Steady growth
      • Slow growth
      • Churning
      • Yearly
      • Late bloomer
  36. Agenda: (1) Explore Pinterest’s content, (2) Question our understanding, (3) Inspire the future
  37. Does asking twice yield the same answer? Should we cluster again?
  38. Data science is expensive. Cost of replicating analysis is leaving other business opportunities on the table
  39. Would it make a difference? Unknown
  40. Replication Crisis in Psychology. Silberzahn & Uhlmann; Crowdsourced research: Many hands make tight work. Nature, August 2015
  41. Crowdsourced study on red cards in soccer. Silberzahn & Uhlmann; Crowdsourced research: Many hands make tight work. Nature, October 2015
  42. The New York Times on predicting the presidency, September 2016. Cohn; We Gave Four Good Pollsters the Same Raw Data. They Had Four Different Results.
  43. Data science is expensive …but we’ve lowered the cost!
  44. …so we did it again: 9 data scientists and machine learning engineers. Same data, same UI, same day. Everyone finished in ~1 hour.
  45. Models a real world situation with limited resources. 9 is huge!
  46. Everything was the same. Were the results the same?
  47. Existing clusters as our baseline. Table columns: Baseline clusters, Results e, Results l, Results d, Results m, Results z, Results b, Results k. Baseline clusters: Dark content, Pinterest specials, Steady growth, Slow growth, Churning, Yearly, Late bloomer
  48. 90% matches (same table columns; analyst clusters listed per baseline cluster):
      • Dark content: Unpopular (95%), Trailing (90%)
      • Pinterest specials: Trailing (100%), Viral on Pinterest (98%), Pin creates drop off (97%)
      • Steady growth: Increasing repins (94%), Continuous growth (94%)
      • Slow growth, Churning, Yearly, Late bloomer: no entries
  49. 50% matches (same table columns; analyst clusters listed per baseline cluster):
      • Dark content: Unpopular (95%), Trailing (90%), Original Pinny (84%)
      • Pinterest specials: Trailing (100%), Minimal original Pins (66%), Viral on Pinterest (98%), Pin creates drop off (97%)
      • Steady growth: Pinterest viral content (62%), Other (53%), Original Pinny (51%), Viral on the internet (69%), Increasing repins (94%), Continuous growth (94%), Suspected Save button high Pin creates (73%)
      • Slow growth: Pinterest viral content (55%), Original Pinny (82%), Viral on the internet (65%), Increasing repins (65%), Continuous growth (86%), Suspected Save button high Pin creates (51%)
      • Churning: Original Pinny (68%), Viral on the internet (53%)
      • Yearly: Original Pinny (71%)
      • Late bloomer: Original Pinny (71%), Continuous growth (55%), Suspected Save button high Pin creates (59%)
  53. Conceptually similar clusters, but not related in implementation (same table columns; analyst clusters listed per baseline cluster):
      • Yearly: Seasonal, Throwback, Seasonal, Annual
      • Steady growth: Gaining popularity, Increasing repins, Continuous growth, High engagement
      • Pinterest specials: Initial flurry, Minimal original Pins, Viral on Pinterest, Pin create drop off, Unpopular domains with good content
  54. Two roots of variations: …Good vs. bad; Differences in perspective
  55. Signs of suboptimal clustering:
      • Leading with biases
      • Cherry-picking: responding to a limited subset of the data
      [Plot: Seasonal; axes: Pin creates, Repins]
  56. Differences of perspective:
      • Results m - Viral growth centric: Viral on Pinterest, Viral on the internet, Lame
      • Results d - Original content centric: Persistent original Pins, Minimal original Pins, Original Pinny
      • Results l - Return on investment centric: Underserved, Draught, Trailing
  57. Impact implications: 9 data scientists, 9 answers
      • Products depending on cluster used: Viral mechanisms, Speeding Pin demotion, Promoting underserved Pins
      • For same product, domains impacted differ for: Seasonality, Steady growth, Pinterest specials
  58. Bottom line: It matters which data scientist does an analysis
  59. Agenda: (1) Explore Pinterest’s content, (2) Question our understanding, (3) Inspire the future
  60. Let’s ask the hard question and brave the answer together: When is data science a house of cards?
  61. Avalanche of Resources. Measuring data science impact:
      • Experimental systems are now standard
      • Data scientists are more available
      • Reproducible analysis
      • [Now] Fast replicable analysis
  62. Utilize Resources: Experiment
      • Record end to end from analysis to impact
      • Innovate on processes
      • Borrow ideas on replication from science
      • Tailor our techniques for replication
  63. Concrete experiments: Break down the problem and build up
      • Narrow Difference in Perception through Priming analysts
      • Develop a rubric of excellence
      • Train analysts on generated data
      • Add process stabilizers
  64. Pinterest is interested. Reach out! pin.it/Data. Dr June Andrews: june@pinterest.com / DrAndrews / DrJuneAndrews
  65. Let’s data science, data science! Let’s crack the code to systematic innovation
  66. Thank you! We are hiring! https://engineering.pinterest.com/ pin.it/Data
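
The three stages on slides 18 to 20 and the detector functions on slides 25 to 32 could be combined roughly as follows. This is a minimal sketch, not the code used in the talk: the `DomainEngagement` record, the `iterate_labels` helper, and the default thresholds (standing in for the slides' unspecified `X` and `Y`) are all assumptions made for illustration. The machine clustering of stage 1 is left out, so only the expert-rule and removal stages are shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

import numpy as np


@dataclass
class DomainEngagement:
    """Hypothetical record of one link domain's first year, month by month."""
    name: str
    monthly_pin_creates: Sequence[float]
    monthly_repins: Sequence[float]

    @property
    def n_pin_creates(self) -> float:
        return float(sum(self.monthly_pin_creates))

    @property
    def n_repins(self) -> float:
        return float(sum(self.monthly_repins))


Rule = Callable[[DomainEngagement], bool]


def detect_dark_content(d: DomainEngagement, max_weekly_pins: float = 2.0) -> bool:
    # Slide 25: fewer than 2 Pins a week on average (a 52-week year is assumed).
    return d.n_pin_creates / 52.0 < max_weekly_pins


def detect_pinterest_specials(d: DomainEngagement,
                              max_pin_creates: float = 500.0,
                              min_ratio: float = 5.0) -> bool:
    # Slide 27; both thresholds are assumed stand-ins for X and Y.
    ratio = d.n_repins / max(1.0, d.n_pin_creates)
    return d.n_pin_creates <= max_pin_creates and ratio >= min_ratio


def detect_churning(d: DomainEngagement) -> bool:
    # Slide 32: linear trend of both series, skipping the first two months.
    months = np.arange(len(d.monthly_repins) - 2)
    repin_growth, _ = np.polyfit(months, d.monthly_repins[2:], 1)
    pin_create_growth, _ = np.polyfit(months, d.monthly_pin_creates[2:], 1)
    return bool(repin_growth < 0 and pin_create_growth < 0)


def iterate_labels(domains: Sequence[DomainEngagement],
                   rules: Sequence[Tuple[str, Rule]]) -> Dict[str, List[str]]:
    """Apply one expert-written rule per iteration, then remove what it labeled."""
    remaining = list(domains)
    labeled: Dict[str, List[str]] = {}
    for label, rule in rules:
        matched, rest = [], []
        for d in remaining:
            (matched if rule(d) else rest).append(d)
        labeled[label] = [d.name for d in matched]
        # Stage 3: only unlabeled domains go into the next iteration.
        remaining = rest
    labeled["unlabeled"] = [d.name for d in remaining]
    return labeled


if __name__ == "__main__":
    falling = [1200 - 100 * i for i in range(12)]
    sample = [
        DomainEngagement("dark.example", [1] * 12, [0] * 12),
        DomainEngagement("churn.example", falling, [2 * v for v in falling]),
    ]
    print(iterate_labels(sample, [("Dark content", detect_dark_content),
                                  ("Churning", detect_churning)]))
```

Because each rule only sees the domains that earlier rules left behind, the order of the rules affects the final labels, which is one plausible way two analysts working on the same data could end up with different clusters.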
