VSCO→CONFIDENTIAL→DONOTDISTRIBUTE
Data Update - 01/27/2016 vsco.co/blevishkin
Data Update - 03/17/17 vsco.co/prazakj
07 DEC 2017
RUBEN KOGEL ( VSCO )
RUBEN@VSCO.CO
@CHILICONDATA on Twitter
Data-based User Segmentation
What is VSCO?
→ Community and tools for creators
→ 45M monthly audience (web + mobile)
→ 12B images served monthly
→ 70% of daily audience create
Why segment?
→ Marketing / Design
• where do we position our product?
• how do we message our target audience?
• what usage do we design for?
• how do we make our UI more intuitive?
→ Growth / Biz Ops
• are our users engaged?
• how are they using our app in practice?
vsco.co/evanhundelt
the theory
[Figure: usage frequency plotted against miles driven, with example segments: commuters, taxi drivers, weekenders, greenies]
the practice
where do you draw the line?
[Figure: three histograms of number of people (in thousands) against number of actions (0 to 100), one each for editing usage, sessions, and publishing usage]
step 1: choose the right inputs
→ k-means finds the dimensions with the most separation and uses that information to form “clusters”
• each additional dimension will change the output - but does it add information?
→ eliminate unnecessary input variables
• use intuition and data exploration
→ segment only on the things that matter:
• age on the platform
• sum of past behavior
• current behavior - what we want to model
→ this is an iterative process: redo this step after running the clustering algorithm
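One possible data-exploration check for step 1 is to look for input variables that move almost in lockstep, since the second one adds little information. The sketch below is an illustration only: the deck does not prescribe a correlation test, and the feature names, the sample values, and the 0.95 threshold are all hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def redundant_pairs(features, threshold=0.95):
    """Pairs of inputs whose absolute correlation is at or above the threshold."""
    names = sorted(features)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(features[a], features[b])
            if abs(r) >= threshold:
                pairs.append((a, b, round(r, 3)))
    return pairs

# Hypothetical per-user counts (one value per user, same user order in each list).
features = {
    "edits": [1, 4, 2, 30, 12, 0],
    "edit_sessions": [1, 3, 2, 25, 10, 0],
    "publishes": [0, 5, 0, 1, 9, 2],
}

pairs = redundant_pairs(features)
for a, b, r in pairs:
    print(f"{a} and {b} are nearly redundant (r = {r})")
```

A flagged pair is only a candidate for removal; intuition about what each input means still decides which one to keep.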
step 2: log transform (almost) everything
[Figure: two histograms, one with an x-axis running 0 to 100 and one with an x-axis running 0 to 4]
→ log transform so that the gap between small numbers of actions gets blown up and the gap between large numbers gets shrunk
→ otherwise your model assumes the gap between people editing 1 and 2 photos counts the same as the gap between people editing 101 and 102 photos
• log(2) - log(1) = 0.69
• log(102) - log(101) = 0.01
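The two log gaps on this slide can be reproduced with the natural logarithm. The sketch below also shows `log1p`, that is log(1 + x), as one way to handle users with zero actions; that choice is an assumption here, since the slide only mentions a plain log.

```python
import math

def log_gap(a, b):
    """Difference between two counts after a natural-log transform."""
    return math.log(b) - math.log(a)

def log_transform(counts):
    """log(1 + x) keeps zero counts finite (assumption: the slide uses plain log)."""
    return [math.log1p(c) for c in counts]

print(round(log_gap(1, 2), 2))      # 0.69
print(round(log_gap(101, 102), 2))  # 0.01
print([round(v, 2) for v in log_transform([0, 1, 2, 101, 102])])
```

After the transform, going from 1 to 2 edits is a much bigger step than going from 101 to 102, which matches the intuition on the slide.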
step 3: choose the number of clusters that make sense
balance:
→ sparseness
→ interpretability
• does it match intuition?
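A minimal sketch of how the balance in step 3 could be explored: run k-means for several values of k and report, for each, how tight the clusters are (inertia) and how small the smallest cluster is. Reading "sparseness" as "clusters with very few users" is an assumption, as are the synthetic data and the hand-written k-means below (a library implementation would normally be used instead).

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: dist2(p, centers[j])) for p in points]

def kmeans(points, k, seed=0, iters=100):
    """Plain Lloyd's algorithm; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        labels = assign(points, centers)
        updated = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            # An empty cluster keeps its previous center.
            updated.append(tuple(sum(c) / len(members) for c in zip(*members))
                           if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)

def summarize(points, k_values):
    """For each k: (k, inertia, share of users in the smallest cluster)."""
    rows = []
    for k in k_values:
        centers, labels = kmeans(points, k)
        inertia = sum(dist2(p, centers[label]) for p, label in zip(points, labels))
        sizes = [labels.count(j) for j in range(k)]
        rows.append((k, inertia, min(sizes) / len(points)))
    return rows

# Hypothetical log-scaled inputs: three loose groups of 100 users each.
rng = random.Random(1)
points = [(rng.gauss(cx, 0.3), rng.gauss(cy, 0.3))
          for cx, cy in [(0.5, 0.5), (2.5, 1.0), (3.5, 3.5)] for _ in range(100)]

rows = summarize(points, range(1, 7))
for k, inertia, smallest in rows:
    print(f"k={k}  inertia={inertia:8.1f}  smallest cluster={smallest:.1%}")
```

The numbers only narrow the choice. Whether a given k matches intuition still has to be judged by looking at what each cluster contains.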
step 4: deliver the insights in an intuitive way
              1     2     3     4     5     6
dimension   0.0   0.0   0.0   0.9   2.8   0.5
dimension   0.0   0.0   0.0   0.6   1.9   0.3
dimension   0.0   0.0   0.0   0.5   1.5   0.3
dimension   0.2   0.1   0.1   8.5  18.4   2.5
dimension   0.2   0.1   0.1   3.1   3.9   1.4
dimension   0.3   4.8  27.1   2.1  20.5  22.7
dimension   0.3   2.5   7.6   1.3   7.7   6.9
dimension   0.3   1.9   3.3   1.1   3.4   3.3
dimension   0.2   3.6  21.4   0.3   3.4   7.3
dimension   0.1   0.2   0.1   2.7  13.0  10.5
dimension   0.1   0.1   0.1   1.6   6.5   4.1
dimension   0.1   0.1   0.1   1.3   3.2   2.5
dimension   0.0   0.0   0.0   0.5   6.4   0.1
dimension   0.0   0.0   0.0   0.4   4.2   0.1
dimension   0.0   0.0   0.0   0.4   2.5   0.1
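A table like the one above could be produced by averaging each input, in its original untransformed units, within each cluster. This is a sketch under that assumption: the slide does not say how its values were computed, and the dimension names, rows, and labels below are hypothetical.

```python
from collections import defaultdict

def cluster_profile(rows, labels):
    """Mean of each raw (untransformed) dimension per cluster label."""
    n_dims = len(rows[0])
    sums = defaultdict(lambda: [0.0] * n_dims)
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        counts[label] += 1
        for i, value in enumerate(row):
            sums[label][i] += value
    return {label: [s / counts[label] for s in sums[label]] for label in counts}

def format_profile(profile, dims):
    """One row per dimension, one column per cluster, as plain text."""
    clusters = sorted(profile)
    header = " " * 12 + "".join(f"{c:>8}" for c in clusters)
    body = [f"{dim:<12}" + "".join(f"{profile[c][i]:>8.1f}" for c in clusters)
            for i, dim in enumerate(dims)]
    return "\n".join([header] + body)

# Hypothetical data: (edits, publishes) per user and the cluster each user landed in.
dims = ["edits", "publishes"]
rows = [(0, 1), (2, 3), (10, 0), (30, 2)]
labels = [1, 1, 2, 2]

profile = cluster_profile(rows, labels)
print(format_profile(profile, dims))
```

Reporting averages in raw units rather than log units keeps the table readable for people who were not part of the analysis.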
step 5: use programmatic rules to track segments
→ what happens if we re-compute the clusters every month?
• k-means will define different-looking clusters for every different dataset
• a user classified “super editor” one period might be classified “casual editor” the next period with the exact same behavior
→ instead infer the segment boundaries from the cluster analysis and use these fixed boundaries to classify users on an ongoing basis
• more stable
• easier to explain
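A minimal sketch of step 5 for a single input, assuming the clustering was run on log(1 + x) values of one dimension. In one dimension, the border between two adjacent k-means clusters is the midpoint of their centers, so fixed boundaries can be read off once and then frozen. The center values and the "non-editor" label are hypothetical; "casual editor" and "super editor" are the names used on the slide.

```python
import math
from bisect import bisect_right

def boundaries_from_centers(log_centers):
    """Midpoints between adjacent 1-D centers in log1p space, mapped back to
    raw counts and rounded so the rule is easy to state."""
    ordered = sorted(log_centers)
    return [round(math.expm1((a + b) / 2)) for a, b in zip(ordered, ordered[1:])]

def classify(count, boundaries, names):
    """Fixed-rule classification; a count equal to a boundary goes to the upper segment."""
    return names[bisect_right(boundaries, count)]

# Hypothetical centers from a one-off cluster analysis (raw counts 0, 3 and 40).
log_centers = [math.log1p(0), math.log1p(3), math.log1p(40)]
BOUNDARIES = boundaries_from_centers(log_centers)  # computed once, then frozen
NAMES = ["non-editor", "casual editor", "super editor"]

for edits in (0, 5, 50):
    print(edits, "->", classify(edits, BOUNDARIES, NAMES))
```

Because the boundaries no longer move between periods, the same behavior always gets the same label, and the rule can be explained in one sentence.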
step 6: track on-going classification on a dashboard
[Figure: two dashboard charts: segmentation over time; source of the “green” segment in each month]
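The two dashboard views could be fed by aggregates like the ones sketched below: segment sizes per month, and, for one segment, where its current members were classified the month before. The data layout, the user ids, the "blue" segment, and the "new user" label are hypothetical; only the “green” segment name comes from the slide.

```python
from collections import Counter

def segment_counts(assignments):
    """{month: {user_id: segment}} -> {month: Counter of users per segment}."""
    return {month: Counter(users.values()) for month, users in assignments.items()}

def segment_sources(previous, current, segment):
    """Last month's segment for each user who is in `segment` this month."""
    return Counter(previous.get(user, "new user")
                   for user, seg in current.items() if seg == segment)

# Hypothetical monthly classifications produced by the fixed rules.
assignments = {
    "2017-10": {"u1": "green", "u2": "blue", "u3": "blue"},
    "2017-11": {"u1": "green", "u2": "green", "u3": "blue", "u4": "green"},
}

counts = segment_counts(assignments)
sources = segment_sources(assignments["2017-10"], assignments["2017-11"], "green")
print(counts["2017-11"])
print(sources)
```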
Summary
→ marketers, designers, and analysts use different but complementary segmentation approaches
→ data-based segmentation is useful to track usage; it should be based on behavioral data only
→ most usage data is exponentially distributed, so you need a log transform and machine learning algorithms to identify cluster boundaries
6 steps to doing a clustering analysis
1. choose the right inputs
2. log transform (almost) everything
3. choose the number of clusters that make sense
4. deliver the insights in an intuitive way
5. use programmatic rules to track cohorts
6. deliver dashboard or on-going classification
vsco.co/sannalinn
Questions?

Data based user segmentation - a practical guide for data analysts
