SlideShare a Scribd company logo
1 of 42
Download to read offline
Data-­‐Driven	
  Product	
  Innova1on	
  
Xin	
  Fu	
  
LinkedIn	
  
Hernán	
  Asorey	
  
Salesforce	
  
WHY?	
  
2	
  
Data	
  Science	
  
Fuels	
  Product	
  
Innova1on	
  	
  
3	
  
Example	
  of	
  Data	
  Guiding	
  Product	
  Dev	
  
Queries	
  
4	
  
Unique	
  
Searchers	
   Unique	
  
Searchers	
  
Sessions	
   Queries	
  
Session	
  
*	
   *	
  
=	
  
How	
  Did	
  Data	
  Science	
  Help?	
  
§  Metric	
  to	
  define	
  product	
  “true	
  north”	
  
§  Tools	
  to	
  help	
  understanding	
  
§  A/B	
  experiments	
  to	
  learn	
  and	
  iterate	
  
5	
  
Data	
  Science’s	
  Role	
  in	
  a	
  Product	
  Org	
  
§  Model	
  1:	
  Data	
  Science	
  as	
  an	
  Owner	
  
§  Model	
  2:	
  Data	
  Science	
  as	
  a	
  Service	
  
§  Model	
  3:	
  Data	
  Science	
  as	
  a	
  Partner	
  
6	
  
Data	
  Science	
  as	
  an	
  Owner	
  
§  Operates	
  in	
  a	
  “hacky	
  way”	
  (early	
  stage	
  
companies)	
  
§  Key	
  steps	
  in	
  business	
  rely	
  heavily	
  on	
  data	
  
– Recommender	
  
– Relevance	
  
– Matching	
  
– Scoring	
  
§  Mostly	
  back-­‐end,	
  rela1vely	
  more	
  stand-­‐alone	
  
features	
  
7	
  
Data	
  Science	
  as	
  a	
  Service	
  
§  Engagement	
  is	
  “on-­‐demand”,	
  project-­‐based	
  
§  Examples:	
  
– some	
  of	
  the	
  BI	
  roles	
  
– strategy	
  roles	
  (consultancy)	
  
– data	
  API	
  to	
  product	
  
– modeling	
  for	
  specific	
  purposes:	
  propensity	
  
to	
  {x},	
  where	
  x:	
  {buy,	
  ahrite,	
  convert,	
  etc.}	
  	
  
8	
  
Data	
  Science	
  as	
  a	
  Partner	
  
§  Plays	
  ac1ve	
  role	
  in	
  every	
  stage	
  of	
  Product	
  Life	
  
Cycle	
  
§  Shares	
  the	
  ul1mate	
  goal	
  of	
  product	
  success	
  
§  Oken	
  requires	
  an	
  embedded	
  engagement	
  
model	
  
9	
  
Product	
  Life	
  Cycle	
  
1.	
  Idea1on	
  
2.	
  Design	
  &	
  Specs	
  
3.	
  Development	
  
4.	
  Test	
  &	
  Iterate	
  
5.	
  Release	
  
10	
  
1.	
  Idea1on	
  
2.	
  Design	
  &	
  Specs	
  
3.	
  Development	
  
4.	
  Test	
  &	
  Iterate	
  
5.	
  Release	
  
Data-­‐Driven	
  Product	
  
Innova1on	
  Framework	
  
1.	
  Hypothesis	
  &	
  
Ques1ons	
  
2.	
  Holis1c	
  Evalua1on	
  
Criteria	
  
3.	
  Tracking	
  &	
  
Valida1on	
  
4.	
  Experimenta1on	
  &	
  
Analysis	
  	
  
5.	
  Monitoring	
  &	
  
Learning	
  
11	
  
5.	
  Monitoring	
  &	
  Learning	
  
§  Start	
  from	
  good	
  
reliable	
  repor1ng	
  of	
  
key	
  product	
  metrics	
  
§  Par1cularly	
  important	
  
for	
  new	
  partnership	
  
(“credibility	
  
projects”)	
  
12	
  
Partner	
  with	
  Data	
  Infrastructure	
  
Team	
  to	
  Build	
  Great	
  Tools	
  
13	
  
4.	
  Experimenta1on	
  &	
  Analysis	
  
Ronny Kohavi’s key note “Online Controlled Experiments:
Lessons from Running A/B/n Tests for 12 Years”
Ya Xu et al.’s talk “From Infrastructure to Culture: A/B
Testing Challenges in Large Scale Social Networks”
§  A	
  company-­‐wide	
  
plaqorm	
  for	
  A/B	
  tes1ng,	
  
ramping,	
  and	
  advanced	
  
targe1ng	
  needs	
  
§  Automated	
  repor1ng	
  
and	
  analysis	
  capabili1es	
  
14	
  
Streams	
  of	
  Innova1on	
  around	
  	
  
A/B	
  Tes1ng	
  
15	
  
3.	
  Tracking	
  and	
  Valida1on	
  
§  Joint	
  ownership	
  between	
  
data	
  scien1st,	
  engineer,	
  
QA	
  and	
  product	
  manager	
  
§  Quarterly	
  sign-­‐off	
  process	
  
from	
  product	
  owners	
  
§  Created	
  a	
  tool	
  for	
  
ongoing	
  monitoring	
  
16	
  
2.	
  Holis1c	
  Evalua1on	
  Criteria	
  
Opportunity for strong thought leadership from
data scientists
Past: focus on product-specific metrics
–  Relevance change CTR
–  Profile redesign # Profile Views
Now: standardized, tiered metric
system
–  Site-wide Tier 1 metrics
–  Product-specific Tier 2 / Tier 3
metrics
17	
  
Tier	
  1	
   Tier	
  2	
   Tier	
  3	
  
1.	
  Hypotheses	
  &	
  Ques1ons	
  
Components	
  of	
  a	
  good	
  analysis	
  
What	
  we	
  would	
  like	
  to	
  see	
  
18	
  
INSIGHTS RECOMMENDATIONSDATA
Effec1ve	
  Ways	
  to	
  Drive	
  Ac1ons	
  
Secure	
  a	
  sponsor	
  and	
  
build	
  buy	
  in	
  	
  
Consider	
  the	
  scale	
  
of	
  target	
  audience	
  
and	
  technology	
  
Verify	
  by	
  tes1ng	
  with	
  
low-­‐cost	
  prototypes	
   19	
  
Outline	
  expected	
  
benefits	
  
Data	
  Science	
  in	
  Ac1on!	
  
20	
  
OrgDNA	
  
Abstract:	
  
Develop	
  prescrip1ve	
  tool	
  to	
  help	
  admins	
  configure	
  orgs.	
  Use	
  machine	
  learning	
  to	
  iden1fy	
  
the	
  rela1onships	
  between	
  perms,	
  prefs,	
  user	
  behavior	
  and	
  adop1on	
  metrics.	
  
	
  
Value	
  Crea;on:	
  
§  Increased	
  product	
  adop1on	
  
§  Increased	
  customer	
  engagement	
  
	
  
Accomplishments:	
  
§  Decoding	
  of	
  user	
  permissions	
  and	
  preferences	
  into	
  flat	
  binary	
  table	
  for	
  analysis	
  
§  Joined	
  in	
  ini1al	
  test	
  target	
  metrics	
  (red	
  accounts,	
  tenure)	
  and	
  behavioral	
  variables	
  
§  Data	
  QA	
  and	
  sanity	
  checks	
  complete	
  (explore	
  distribu1ons	
  and	
  ini1al	
  correla1ons)	
  
	
  
Sponsor:	
  SVP	
  of	
  Mobile	
  
	
  
Benefi;ng	
  Audience:	
  Salesforce	
  Administrators	
  at	
  Customer	
  Loca1ons	
  
	
   21	
  
Just	
  the	
  Beginning…	
  
22	
  
Permission	
  Frequency	
  By	
  Edi1on	
   Preference	
  Frequency	
  By	
  Edi1on	
  
Just	
  the	
  Beginning…	
  
23	
  
Super	
  High	
  Reten;on	
  
Group	
  (cluster	
  2)	
  
-­‐-­‐Days_S1	
  >	
  16	
  
-­‐-­‐CHURN	
  1%	
  
High	
  Reten;on	
  Group	
  
(cluster	
  1)	
  
-­‐-­‐Days_S1	
  between	
  8	
  
and	
  16	
  
-­‐-­‐CHURN	
  5%	
  
Lower	
  Reten;on	
  
Group	
  (cluster	
  4)	
  
-­‐-­‐Days_S1	
  <	
  8	
  
-­‐-­‐CHURN	
  34%	
  
High	
  Ac;vity,	
  Low	
  
Reten;on	
  Group	
  
(cluster	
  3)	
  
-­‐-­‐Days_S1	
  <	
  4	
  
-­‐-­‐Ac1vi1es	
  Per	
  Day	
  >	
  6	
  
-­‐-­‐CHURN	
  46%	
  Cluster	
  Averages	
  
NMI	
  Score	
  =	
  0.082	
  
Data	
  Science	
  as	
  a	
  Partner	
  
Quality &
Speed of
Product
Decisions
24	
  
Data	
  Science	
  as	
  a	
  Partner	
  
§  Focus	
  on	
  Product	
  and	
  balance	
  between	
  	
  
– Product	
  innova1on	
  and	
  opera1on	
  
– Velocity	
  and	
  scale	
  
– Theore1cal	
  research	
  and	
  prac1cal	
  impact	
  
§  Leverage	
  technology	
  for	
  reliability	
  and	
  
sustainability	
  
– Reliability	
  is	
  key	
  
– Reduce	
  “one-­‐offs”	
  
– Speed	
  mahers	
   25	
  
Case	
  Study	
  
26	
  
Data-­‐Driven	
  Product	
  In	
  Ac1on	
  
In	
  order	
  to	
  increase	
  the	
  connec1on	
  
density	
  of	
  new	
  users,	
  we	
  will	
  
lookup	
  the	
  exis1ng	
  members	
  
whose	
  address	
  books	
  contain	
  these	
  
new	
  users	
  as	
  contacts	
  
	
  
The	
  exis1ng	
  members	
  will	
  be	
  
no1fied	
  that	
  their	
  contacts	
  just	
  
joined	
  LinkedIn	
  and	
  will	
  be	
  
prompted	
  to	
  connect	
  with	
  them	
  
27	
  
Anatomy	
  of	
  a	
  Data-­‐Driven	
  Product	
  
1.  Problem	
  Statement	
  
2.  Opportunity	
  Analysis	
  
–  Iden1fy	
  target	
  audience	
  
–  Verify	
  feasibility	
  
3.  Holis1c	
  Evalua1on	
  Criteria	
  
–  Including	
  impact	
  es1mate	
  
4.  Low-­‐cost	
  Tes1ng	
  with	
  Tracking	
  
5.  Analysis	
  of	
  Test	
  Results	
  
6.  Hand	
  off	
  for	
  Scalable	
  Implementa1on	
  
28	
  
Step	
  1.	
  Problem	
  Statement	
  
Goal:	
  Increase	
  the	
  connec1on	
  density	
  of	
  new	
  
LinkedIn	
  members.	
  	
  Why?	
  
§  X	
  percent	
  of	
  new	
  users	
  have	
  Y	
  or	
  few	
  
connec1ons	
  at	
  30	
  days	
  aker	
  registra1on	
  
§  High	
  connec1on	
  density	
  =>	
  Beher	
  engagement	
  
29	
  
Step	
  2.	
  Opportunity	
  Analysis	
  
30	
  
Target	
  Audience:	
  5	
  (50%)	
  
Of	
  the	
  10	
  new	
  members,	
  5	
  are	
  eligible	
  
	
  
Feasibility	
  Analysis:	
  
Poten1al	
  Upside:	
  	
  X	
  new	
  connec1ons	
  for	
  
each	
  new	
  member	
  
For	
  the	
  5	
  eligible	
  New	
  Members,	
  median	
  
number	
  of	
  poten1al	
  ‘welcomer’	
  is	
  2	
  
X	
  =	
  	
  Median	
  (1,2,2,1,3)	
  	
  
	
  
No1fica1on	
  System	
  Load:	
  Of	
  the	
  10	
  exis1ng	
  
members,	
  4	
  are	
  poten1al	
  ‘welcomers’	
  	
  
	
  
Member	
  Overload:	
  The	
  volume	
  of	
  
no1fica1ons	
  to	
  receive	
  by	
  poten1al	
  
welcomers	
  has	
  a	
  distribu1on	
  with	
  Median=2	
  
and	
  Max=3	
  
E1	
  
N7	
  
E10	
  
E2	
  
E9	
  
E8	
  
N8	
  
N4	
  
N1	
  
N6	
  
E5	
  
N3	
  
E6	
  
E4	
  
E3	
  
E7	
  
N2	
  
N9	
  
N5	
  
N10	
  
AàB	
  :	
  A	
  appears	
  in	
  B’s	
  address	
  book	
  	
  	
  	
  	
  	
  	
  	
  	
  N:	
  New	
  Member	
  	
  	
  	
  	
  	
  	
  E:	
  Exis1ng	
  Member	
  
1	
  
2	
  
2	
  
3	
  
1	
  
Step	
  3.	
  Holis1c	
  Evalua1on	
  Criteria	
  
§  Primary	
  Metrics	
  (New	
  Members)	
  
– connec1on	
  density	
  
– reten1on	
  rate	
  
§  Secondary	
  Metrics	
  (Exis;ng	
  Members)	
  
– Connec1on	
  invita1ons	
  sent	
  	
  
– Connec1on	
  density	
  	
  
– Engagement	
  	
  
•  Impact	
  Es;mate	
  
– Need	
  to	
  make	
  assump1ons	
  about	
  conversion	
  rate	
  
31	
  
Step	
  4.	
  Low-­‐Cost	
  Tes1ng	
  
§  Email	
  Test	
  
§  Daily	
  offline	
  Hadoop	
  job	
  to	
  generate	
  the	
  list	
  of	
  
members	
  eligible	
  to	
  receive	
  the	
  email	
  (with	
  eng.	
  
review)	
  
§  Randomize	
  for	
  the	
  recipients	
  
§  Partner	
  Involvement	
  
§  Work	
  with	
  Engineering	
  and	
  QA	
  to	
  set	
  up	
  tracking	
  
§  Work	
  with	
  Ops,	
  SRE	
  and	
  Customer	
  Service	
  teams	
  
to	
  set	
  up	
  monitoring	
  
32	
  
Step	
  5.	
  Analysis	
  of	
  Test	
  Results	
  
§  Email	
  open	
  rate,	
  invite	
  rate,	
  invite	
  acceptance	
  rate	
  
§  Es1mate	
  impact	
  on	
  exis1ng	
  members	
  
§  Es1mate	
  impact	
  on	
  new	
  members	
  
	
  
33	
  
Step	
  6.	
  Scalable	
  Implementa1on	
  
Handed	
  off	
  to	
  Engineering	
  to	
  scale	
  up	
  the	
  impact:	
  
§  Expand	
  target	
  audience,	
  e.g.	
  to	
  dormant	
  
members	
  
§  Offline	
  email	
  to	
  online	
  and	
  mobile	
  no1fica1ons	
  
§  Introduce	
  relevance	
  rules	
  
34	
  
Exercise	
  
35	
  
Design	
  Your	
  Data-­‐Driven	
  Product	
  
Please	
  refer	
  to	
  the	
  hand-­‐out	
  
§  1.	
  Pick	
  an	
  idea	
  for	
  a	
  data-­‐driven	
  product	
  either	
  for	
  
your	
  ins1tu1on,	
  or	
  for	
  your	
  favorite	
  Web	
  product	
  
§  2.	
  Outline	
  your	
  plan	
  to	
  evaluate	
  the	
  idea	
  
– Problem	
  statement	
  
– Opportunity	
  analysis	
  	
  (target	
  audience,	
  feasibility)	
  
– Holis1c	
  Evalua1on	
  Criteria	
  and	
  poten1al	
  impact	
  
– Low-­‐cost	
  tes1ng	
  plan	
  
§  3.	
  Iden1fy	
  who	
  you	
  can	
  partner	
  with	
  
	
  
36	
  
Data	
  Science’s	
  Role	
  in	
  a	
  Product	
  Org	
  
Data	
  
Science	
  
Team	
  
Structure	
  
Importance	
  of	
  
Priori@za@on	
  
Velocity	
   Sustainability	
   Proac@ve	
  
Approach	
  
As	
  an	
  
Owner	
  
Single	
  
As	
  a	
  
Service	
  
Center	
  of	
  
Excellence	
  
As	
  a	
  
Partner	
  
Hub	
  &	
  
Spoke	
  
37	
  
Low	
   High	
  
Data	
  Science	
  as	
  Product	
  Partner	
  
PARTNERSHIP	
  
TECHNOLOGY	
  
TALENTS	
  
§  Start	
  w/	
  credibility	
  projects	
  
§  Context	
  and	
  ownership	
  
§  Scalable	
  path	
  to	
  innova1on	
  
§  Leverage	
  technology	
  to	
  
improve	
  quality	
  and	
  speed	
  
§  Reliability	
  is	
  key	
  	
  
§  Automate,	
  Automate!	
  
?	
   38	
  
Data	
  Science	
  as	
  Product	
  Partner	
  
§  We	
  put	
  great	
  emphasis	
  on	
  
understanding	
  and	
  solving	
  real	
  
business	
  problems	
  
– Data	
  sense:	
  what	
  is	
  possible	
  
– Product	
  sense:	
  what	
  is	
  valuable	
  
§  We	
  value	
  people	
  who	
  think	
  
holis1cally	
  and	
  in	
  scale	
  
– Cross-­‐product	
  impact	
  
– Viral	
  effect	
  
39	
  
Talents	
  
Tutorial	
  Takeaway	
  
PARTNERSHIP	
  
TECHNOLOGY	
  
TALENTS	
  
§  Start	
  w/	
  credibility	
  projects	
  
§  Context	
  and	
  ownership	
  
§  Scalable	
  path	
  to	
  innova1on	
  
§  Leverage	
  technology	
  to	
  
improve	
  quality	
  and	
  speed	
  
§  Reliability	
  is	
  key	
  	
  
§  Automate,	
  Automate!	
  
§  Passion	
  for	
  product	
  innova1on	
  
§  Proac;ve	
  in	
  partnership	
  
§  Impact	
  oriented	
  
Data	
  Science	
  as	
  a	
  Partner	
  
40	
  
41	
  
THANK	
  YOU	
  
Data-­‐Driven	
  Product	
  Innova1on	
  
Xin	
  Fu	
  
LinkedIn	
  
Hernán	
  Asorey	
  
Salesforce	
  

More Related Content

What's hot

Leveraging Information Steward
Leveraging Information StewardLeveraging Information Steward
Leveraging Information StewardMethod360
 
Knowledge Graphs: Changing How We Think About Data
Knowledge Graphs: Changing How We Think About DataKnowledge Graphs: Changing How We Think About Data
Knowledge Graphs: Changing How We Think About DataTim Williams
 
The Outlook for Data 2017: A Snapshot Into the Evolving Role of Audience Insight
The Outlook for Data 2017: A Snapshot Into the Evolving Role of Audience InsightThe Outlook for Data 2017: A Snapshot Into the Evolving Role of Audience Insight
The Outlook for Data 2017: A Snapshot Into the Evolving Role of Audience InsightFilipp Paster
 
OberservePoint - The Digital Data Quality Playbook
OberservePoint - The Digital Data Quality  PlaybookOberservePoint - The Digital Data Quality  Playbook
OberservePoint - The Digital Data Quality PlaybookObservePoint
 
Doing qualitative data analysis
Doing qualitative data analysisDoing qualitative data analysis
Doing qualitative data analysisIrene Torres
 
How to build a data analytics strategy in a digital world
How to build a data analytics strategy in a digital worldHow to build a data analytics strategy in a digital world
How to build a data analytics strategy in a digital worldCaseWare IDEA
 
How To Improve Profitability & Outperform Your Competition: the Guide to Data...
How To Improve Profitability & Outperform Your Competition: the Guide to Data...How To Improve Profitability & Outperform Your Competition: the Guide to Data...
How To Improve Profitability & Outperform Your Competition: the Guide to Data...A.J. Riedel
 
BABOK v3 KA Task Summary v0.15
BABOK v3 KA Task Summary v0.15BABOK v3 KA Task Summary v0.15
BABOK v3 KA Task Summary v0.15Alan Maxwell, CBAP
 
IRJET- Predicting Review Ratings for Product Marketing
IRJET- Predicting Review Ratings for Product MarketingIRJET- Predicting Review Ratings for Product Marketing
IRJET- Predicting Review Ratings for Product MarketingIRJET Journal
 
Relational Databases are Evolving To Support New Data Capabilities
Relational Databases are Evolving To Support New Data CapabilitiesRelational Databases are Evolving To Support New Data Capabilities
Relational Databases are Evolving To Support New Data CapabilitiesEDB
 
Analytic Roadmap Customer Overview - 2015 TUG Final-drs
Analytic Roadmap Customer Overview - 2015 TUG Final-drsAnalytic Roadmap Customer Overview - 2015 TUG Final-drs
Analytic Roadmap Customer Overview - 2015 TUG Final-drsDavid Schiller
 
Presentation to Analytics Network of the OR Society Nov 2020
Presentation to Analytics Network of the OR Society Nov 2020Presentation to Analytics Network of the OR Society Nov 2020
Presentation to Analytics Network of the OR Society Nov 2020Paul Laughlin
 
CMMI Institute Conference Seattle_Final +.PPT
CMMI Institute Conference Seattle_Final +.PPTCMMI Institute Conference Seattle_Final +.PPT
CMMI Institute Conference Seattle_Final +.PPTMargaret Tanner Glover
 
Lean Manufacturing and DITA (Gnostyx at DITA Europe 2014)
Lean Manufacturing and DITA (Gnostyx at DITA Europe 2014)Lean Manufacturing and DITA (Gnostyx at DITA Europe 2014)
Lean Manufacturing and DITA (Gnostyx at DITA Europe 2014)Joe Gollner
 
Tips for Effective Data Science in the Enterprise
Tips for Effective Data Science in the EnterpriseTips for Effective Data Science in the Enterprise
Tips for Effective Data Science in the EnterpriseLisa Cohen
 
Data-Ed: Unlock Business Value through Data Quality Engineering
Data-Ed: Unlock Business Value through Data Quality EngineeringData-Ed: Unlock Business Value through Data Quality Engineering
Data-Ed: Unlock Business Value through Data Quality EngineeringDATAVERSITY
 

What's hot (19)

Leveraging Information Steward
Leveraging Information StewardLeveraging Information Steward
Leveraging Information Steward
 
Knowledge Graphs: Changing How We Think About Data
Knowledge Graphs: Changing How We Think About DataKnowledge Graphs: Changing How We Think About Data
Knowledge Graphs: Changing How We Think About Data
 
The Outlook for Data 2017: A Snapshot Into the Evolving Role of Audience Insight
The Outlook for Data 2017: A Snapshot Into the Evolving Role of Audience InsightThe Outlook for Data 2017: A Snapshot Into the Evolving Role of Audience Insight
The Outlook for Data 2017: A Snapshot Into the Evolving Role of Audience Insight
 
OberservePoint - The Digital Data Quality Playbook
OberservePoint - The Digital Data Quality  PlaybookOberservePoint - The Digital Data Quality  Playbook
OberservePoint - The Digital Data Quality Playbook
 
Doing qualitative data analysis
Doing qualitative data analysisDoing qualitative data analysis
Doing qualitative data analysis
 
How to build a data analytics strategy in a digital world
How to build a data analytics strategy in a digital worldHow to build a data analytics strategy in a digital world
How to build a data analytics strategy in a digital world
 
How To Improve Profitability & Outperform Your Competition: the Guide to Data...
How To Improve Profitability & Outperform Your Competition: the Guide to Data...How To Improve Profitability & Outperform Your Competition: the Guide to Data...
How To Improve Profitability & Outperform Your Competition: the Guide to Data...
 
BABOK v3 KA Task Summary v0.15
BABOK v3 KA Task Summary v0.15BABOK v3 KA Task Summary v0.15
BABOK v3 KA Task Summary v0.15
 
Data Quality Control
Data Quality ControlData Quality Control
Data Quality Control
 
BA Techniques BABOK
BA Techniques BABOKBA Techniques BABOK
BA Techniques BABOK
 
IRJET- Predicting Review Ratings for Product Marketing
IRJET- Predicting Review Ratings for Product MarketingIRJET- Predicting Review Ratings for Product Marketing
IRJET- Predicting Review Ratings for Product Marketing
 
Data driven decision making
Data driven decision makingData driven decision making
Data driven decision making
 
Relational Databases are Evolving To Support New Data Capabilities
Relational Databases are Evolving To Support New Data CapabilitiesRelational Databases are Evolving To Support New Data Capabilities
Relational Databases are Evolving To Support New Data Capabilities
 
Analytic Roadmap Customer Overview - 2015 TUG Final-drs
Analytic Roadmap Customer Overview - 2015 TUG Final-drsAnalytic Roadmap Customer Overview - 2015 TUG Final-drs
Analytic Roadmap Customer Overview - 2015 TUG Final-drs
 
Presentation to Analytics Network of the OR Society Nov 2020
Presentation to Analytics Network of the OR Society Nov 2020Presentation to Analytics Network of the OR Society Nov 2020
Presentation to Analytics Network of the OR Society Nov 2020
 
CMMI Institute Conference Seattle_Final +.PPT
CMMI Institute Conference Seattle_Final +.PPTCMMI Institute Conference Seattle_Final +.PPT
CMMI Institute Conference Seattle_Final +.PPT
 
Lean Manufacturing and DITA (Gnostyx at DITA Europe 2014)
Lean Manufacturing and DITA (Gnostyx at DITA Europe 2014)Lean Manufacturing and DITA (Gnostyx at DITA Europe 2014)
Lean Manufacturing and DITA (Gnostyx at DITA Europe 2014)
 
Tips for Effective Data Science in the Enterprise
Tips for Effective Data Science in the EnterpriseTips for Effective Data Science in the Enterprise
Tips for Effective Data Science in the Enterprise
 
Data-Ed: Unlock Business Value through Data Quality Engineering
Data-Ed: Unlock Business Value through Data Quality EngineeringData-Ed: Unlock Business Value through Data Quality Engineering
Data-Ed: Unlock Business Value through Data Quality Engineering
 

Similar to 2015kddtutorial

Crafting a Compelling Data Science Resume
Crafting a Compelling Data Science ResumeCrafting a Compelling Data Science Resume
Crafting a Compelling Data Science ResumeArushi Prakash, Ph.D.
 
Advanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project DeliveryAdvanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project DeliveryMark Constable
 
DCSUG - Finding Lean in Agile
DCSUG - Finding Lean in AgileDCSUG - Finding Lean in Agile
DCSUG - Finding Lean in AgileExcella
 
Innovative Data Leveraging for Procurement Analytics
Innovative Data Leveraging for Procurement AnalyticsInnovative Data Leveraging for Procurement Analytics
Innovative Data Leveraging for Procurement AnalyticsTejari
 
APRA_Contact Reports_2016_Turner_Hrubik_IJM
APRA_Contact Reports_2016_Turner_Hrubik_IJMAPRA_Contact Reports_2016_Turner_Hrubik_IJM
APRA_Contact Reports_2016_Turner_Hrubik_IJMThomas Turner
 
Inside good summary 2012 v1
Inside good summary 2012 v1Inside good summary 2012 v1
Inside good summary 2012 v1KatieWeiss
 
Enriching the Value of Clinical Data with Oracle Data Management Workbench
Enriching the Value of Clinical Data with Oracle Data Management WorkbenchEnriching the Value of Clinical Data with Oracle Data Management Workbench
Enriching the Value of Clinical Data with Oracle Data Management WorkbenchPerficient, Inc.
 
Optimize Project Intake Approval and Prioritization
Optimize Project Intake Approval and PrioritizationOptimize Project Intake Approval and Prioritization
Optimize Project Intake Approval and PrioritizationInfo-Tech Research Group
 
Borys Pratsiuk "How to be NVidia partner"
Borys Pratsiuk "How to be NVidia partner"Borys Pratsiuk "How to be NVidia partner"
Borys Pratsiuk "How to be NVidia partner"Lviv Startup Club
 
Product Era is Here
Product Era is HereProduct Era is Here
Product Era is HereAmplitude
 
Clover Rings Up Digital Growth to Drive Experimentation
Clover Rings Up Digital Growth to Drive ExperimentationClover Rings Up Digital Growth to Drive Experimentation
Clover Rings Up Digital Growth to Drive ExperimentationOptimizely
 
Change Management: The Secret to a Successful SAS® Implementation
Change Management:  The Secret to a Successful SAS® ImplementationChange Management:  The Secret to a Successful SAS® Implementation
Change Management: The Secret to a Successful SAS® ImplementationThotWave
 
DataSpryng Overview
DataSpryng OverviewDataSpryng Overview
DataSpryng Overviewjkvr
 

Similar to 2015kddtutorial (20)

Crafting a Compelling Data Science Resume
Crafting a Compelling Data Science ResumeCrafting a Compelling Data Science Resume
Crafting a Compelling Data Science Resume
 
Fairbanks
FairbanksFairbanks
Fairbanks
 
Advanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project DeliveryAdvanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project Delivery
 
IdeaScreen 2013
IdeaScreen 2013IdeaScreen 2013
IdeaScreen 2013
 
Agile Data
Agile DataAgile Data
Agile Data
 
Finding Lean in Agile by Adam Parker
Finding Lean in Agile by Adam ParkerFinding Lean in Agile by Adam Parker
Finding Lean in Agile by Adam Parker
 
DCSUG - Finding Lean in Agile
DCSUG - Finding Lean in AgileDCSUG - Finding Lean in Agile
DCSUG - Finding Lean in Agile
 
Innovative Data Leveraging for Procurement Analytics
Innovative Data Leveraging for Procurement AnalyticsInnovative Data Leveraging for Procurement Analytics
Innovative Data Leveraging for Procurement Analytics
 
APRA_Contact Reports_2016_Turner_Hrubik_IJM
APRA_Contact Reports_2016_Turner_Hrubik_IJMAPRA_Contact Reports_2016_Turner_Hrubik_IJM
APRA_Contact Reports_2016_Turner_Hrubik_IJM
 
Inside good summary 2012 v1
Inside good summary 2012 v1Inside good summary 2012 v1
Inside good summary 2012 v1
 
Enriching the Value of Clinical Data with Oracle Data Management Workbench
Enriching the Value of Clinical Data with Oracle Data Management WorkbenchEnriching the Value of Clinical Data with Oracle Data Management Workbench
Enriching the Value of Clinical Data with Oracle Data Management Workbench
 
Optimize Project Intake Approval and Prioritization
Optimize Project Intake Approval and PrioritizationOptimize Project Intake Approval and Prioritization
Optimize Project Intake Approval and Prioritization
 
Borys Pratsiuk "How to be NVidia partner"
Borys Pratsiuk "How to be NVidia partner"Borys Pratsiuk "How to be NVidia partner"
Borys Pratsiuk "How to be NVidia partner"
 
The Data Overhaul
The Data OverhaulThe Data Overhaul
The Data Overhaul
 
Product Era is Here
Product Era is HereProduct Era is Here
Product Era is Here
 
Clover Rings Up Digital Growth to Drive Experimentation
Clover Rings Up Digital Growth to Drive ExperimentationClover Rings Up Digital Growth to Drive Experimentation
Clover Rings Up Digital Growth to Drive Experimentation
 
Change Management: The Secret to a Successful SAS® Implementation
Change Management:  The Secret to a Successful SAS® ImplementationChange Management:  The Secret to a Successful SAS® Implementation
Change Management: The Secret to a Successful SAS® Implementation
 
Benchmarking
BenchmarkingBenchmarking
Benchmarking
 
DataSpryng Overview
DataSpryng OverviewDataSpryng Overview
DataSpryng Overview
 
Data Science and Analytics
Data Science and Analytics Data Science and Analytics
Data Science and Analytics
 

2015kddtutorial

  • 1. Data-­‐Driven  Product  Innova1on   Xin  Fu   LinkedIn   Hernán  Asorey   Salesforce  
  • 3. Data  Science   Fuels  Product   Innova1on     3  
  • 4. Example  of  Data  Guiding  Product  Dev   Queries   4   Unique   Searchers   Unique   Searchers   Sessions   Queries   Session   *   *   =  
  • 5. How  Did  Data  Science  Help?   §  Metric  to  define  product  “true  north”   §  Tools  to  help  understanding   §  A/B  experiments  to  learn  and  iterate   5  
  • 6. Data  Science’s  Role  in  a  Product  Org   §  Model  1:  Data  Science  as  an  Owner   §  Model  2:  Data  Science  as  a  Service   §  Model  3:  Data  Science  as  a  Partner   6  
  • 7. Data  Science  as  an  Owner   §  Operates  in  a  “hacky  way”  (early  stage   companies)   §  Key  steps  in  business  rely  heavily  on  data   – Recommender   – Relevance   – Matching   – Scoring   §  Mostly  back-­‐end,  rela1vely  more  stand-­‐alone   features   7  
  • 8. Data  Science  as  a  Service   §  Engagement  is  “on-­‐demand”,  project-­‐based   §  Examples:   – some  of  the  BI  roles   – strategy  roles  (consultancy)   – data  API  to  product   – modeling  for  specific  purposes:  propensity   to  {x},  where  x:  {buy,  ahrite,  convert,  etc.}     8  
  • 9. Data  Science  as  a  Partner   §  Plays  ac1ve  role  in  every  stage  of  Product  Life   Cycle   §  Shares  the  ul1mate  goal  of  product  success   §  Oken  requires  an  embedded  engagement   model   9  
  • 10. Product  Life  Cycle   1.  Idea1on   2.  Design  &  Specs   3.  Development   4.  Test  &  Iterate   5.  Release   10  
  • 11. 1.  Idea1on   2.  Design  &  Specs   3.  Development   4.  Test  &  Iterate   5.  Release   Data-­‐Driven  Product   Innova1on  Framework   1.  Hypothesis  &   Ques1ons   2.  Holis1c  Evalua1on   Criteria   3.  Tracking  &   Valida1on   4.  Experimenta1on  &   Analysis     5.  Monitoring  &   Learning   11  
  • 12. 5.  Monitoring  &  Learning   §  Start  from  good   reliable  repor1ng  of   key  product  metrics   §  Par1cularly  important   for  new  partnership   (“credibility   projects”)   12  
  • 13. Partner  with  Data  Infrastructure   Team  to  Build  Great  Tools   13  
  • 14. 4.  Experimenta1on  &  Analysis   Ronny Kohavi’s key note “Online Controlled Experiments: Lessons from Running A/B/n Tests for 12 Years” Ya Xu et al.’s talk “From Infrastructure to Culture: A/B Testing Challenges in Large Scale Social Networks” §  A  company-­‐wide   plaqorm  for  A/B  tes1ng,   ramping,  and  advanced   targe1ng  needs   §  Automated  repor1ng   and  analysis  capabili1es   14  
  • 15. Streams  of  Innova1on  around     A/B  Tes1ng   15  
  • 16. 3.  Tracking  and  Valida1on   §  Joint  ownership  between   data  scien1st,  engineer,   QA  and  product  manager   §  Quarterly  sign-­‐off  process   from  product  owners   §  Created  a  tool  for   ongoing  monitoring   16  
  • 17. 2.  Holis1c  Evalua1on  Criteria   Opportunity for strong thought leadership from data scientists Past: focus on product-specific metrics –  Relevance change CTR –  Profile redesign # Profile Views Now: standardized, tiered metric system –  Site-wide Tier 1 metrics –  Product-specific Tier 2 / Tier 3 metrics 17   Tier  1   Tier  2   Tier  3  
  • 18. 1.  Hypotheses  &  Ques1ons   Components  of  a  good  analysis   What  we  would  like  to  see   18   INSIGHTS RECOMMENDATIONSDATA
  • 19. Effec1ve  Ways  to  Drive  Ac1ons   Secure  a  sponsor  and   build  buy  in     Consider  the  scale   of  target  audience   and  technology   Verify  by  tes1ng  with   low-­‐cost  prototypes   19   Outline  expected   benefits  
  • 20. Data  Science  in  Ac1on!   20  
  • 21. OrgDNA   Abstract:   Develop  prescrip1ve  tool  to  help  admins  configure  orgs.  Use  machine  learning  to  iden1fy   the  rela1onships  between  perms,  prefs,  user  behavior  and  adop1on  metrics.     Value  Crea;on:   §  Increased  product  adop1on   §  Increased  customer  engagement     Accomplishments:   §  Decoding  of  user  permissions  and  preferences  into  flat  binary  table  for  analysis   §  Joined  in  ini1al  test  target  metrics  (red  accounts,  tenure)  and  behavioral  variables   §  Data  QA  and  sanity  checks  complete  (explore  distribu1ons  and  ini1al  correla1ons)     Sponsor:  SVP  of  Mobile     Benefi;ng  Audience:  Salesforce  Administrators  at  Customer  Loca1ons     21  
  • 22. Just  the  Beginning…   22   Permission  Frequency  By  Edi1on   Preference  Frequency  By  Edi1on  
  • 23. Just  the  Beginning…   23   Super  High  Reten;on   Group  (cluster  2)   -­‐-­‐Days_S1  >  16   -­‐-­‐CHURN  1%   High  Reten;on  Group   (cluster  1)   -­‐-­‐Days_S1  between  8   and  16   -­‐-­‐CHURN  5%   Lower  Reten;on   Group  (cluster  4)   -­‐-­‐Days_S1  <  8   -­‐-­‐CHURN  34%   High  Ac;vity,  Low   Reten;on  Group   (cluster  3)   -­‐-­‐Days_S1  <  4   -­‐-­‐Ac1vi1es  Per  Day  >  6   -­‐-­‐CHURN  46%  Cluster  Averages   NMI  Score  =  0.082  
  • 24. Data  Science  as  a  Partner   Quality & Speed of Product Decisions 24  
  • 25. Data  Science  as  a  Partner   §  Focus  on  Product  and  balance  between     – Product  innova1on  and  opera1on   – Velocity  and  scale   – Theore1cal  research  and  prac1cal  impact   §  Leverage  technology  for  reliability  and   sustainability   – Reliability  is  key   – Reduce  “one-­‐offs”   – Speed  mahers   25  
  • 27. Data-­‐Driven  Product  In  Ac1on   In  order  to  increase  the  connec1on   density  of  new  users,  we  will   lookup  the  exis1ng  members   whose  address  books  contain  these   new  users  as  contacts     The  exis1ng  members  will  be   no1fied  that  their  contacts  just   joined  LinkedIn  and  will  be   prompted  to  connect  with  them   27  
  • 28. Anatomy  of  a  Data-­‐Driven  Product   1.  Problem  Statement   2.  Opportunity  Analysis   –  Iden1fy  target  audience   –  Verify  feasibility   3.  Holis1c  Evalua1on  Criteria   –  Including  impact  es1mate   4.  Low-­‐cost  Tes1ng  with  Tracking   5.  Analysis  of  Test  Results   6.  Hand  off  for  Scalable  Implementa1on   28  
  • 29. Step  1.  Problem  Statement   Goal:  Increase  the  connec1on  density  of  new   LinkedIn  members.    Why?   §  X  percent  of  new  users  have  Y  or  few   connec1ons  at  30  days  aker  registra1on   §  High  connec1on  density  =>  Beher  engagement   29  
  • 30. Step  2.  Opportunity  Analysis   30   Target  Audience:  5  (50%)   Of  the  10  new  members,  5  are  eligible     Feasibility  Analysis:   Poten1al  Upside:    X  new  connec1ons  for   each  new  member   For  the  5  eligible  New  Members,  median   number  of  poten1al  ‘welcomer’  is  2   X  =    Median  (1,2,2,1,3)       No1fica1on  System  Load:  Of  the  10  exis1ng   members,  4  are  poten1al  ‘welcomers’       Member  Overload:  The  volume  of   no1fica1ons  to  receive  by  poten1al   welcomers  has  a  distribu1on  with  Median=2   and  Max=3   E1   N7   E10   E2   E9   E8   N8   N4   N1   N6   E5   N3   E6   E4   E3   E7   N2   N9   N5   N10   AàB  :  A  appears  in  B’s  address  book                  N:  New  Member              E:  Exis1ng  Member   1   2   2   3   1  
  • 31. Step  3.  Holis1c  Evalua1on  Criteria   §  Primary  Metrics  (New  Members)   – connec1on  density   – reten1on  rate   §  Secondary  Metrics  (Exis;ng  Members)   – Connec1on  invita1ons  sent     – Connec1on  density     – Engagement     •  Impact  Es;mate   – Need  to  make  assump1ons  about  conversion  rate   31  
  • 32. Step  4.  Low-­‐Cost  Tes1ng   §  Email  Test   §  Daily  offline  Hadoop  job  to  generate  the  list  of   members  eligible  to  receive  the  email  (with  eng.   review)   §  Randomize  for  the  recipients   §  Partner  Involvement   §  Work  with  Engineering  and  QA  to  set  up  tracking   §  Work  with  Ops,  SRE  and  Customer  Service  teams   to  set  up  monitoring   32  
  • 33. Step  5.  Analysis  of  Test  Results   §  Email  open  rate,  invite  rate,  invite  acceptance  rate   §  Es1mate  impact  on  exis1ng  members   §  Es1mate  impact  on  new  members     33  
  • 34. Step  6.  Scalable  Implementa1on   Handed  off  to  Engineering  to  scale  up  the  impact:   §  Expand  target  audience,  e.g.  to  dormant   members   §  Offline  email  to  online  and  mobile  no1fica1ons   §  Introduce  relevance  rules   34  
  • 36. Design  Your  Data-­‐Driven  Product   Please  refer  to  the  hand-­‐out   §  1.  Pick  an  idea  for  a  data-­‐driven  product  either  for   your  ins1tu1on,  or  for  your  favorite  Web  product   §  2.  Outline  your  plan  to  evaluate  the  idea   – Problem  statement   – Opportunity  analysis    (target  audience,  feasibility)   – Holis1c  Evalua1on  Criteria  and  poten1al  impact   – Low-­‐cost  tes1ng  plan   §  3.  Iden1fy  who  you  can  partner  with     36  
  • 37. Data  Science’s  Role  in  a  Product  Org   Data   Science   Team   Structure   Importance  of   Priori@za@on   Velocity   Sustainability   Proac@ve   Approach   As  an   Owner   Single   As  a   Service   Center  of   Excellence   As  a   Partner   Hub  &   Spoke   37   Low   High  
  • 38. Data  Science  as  Product  Partner   PARTNERSHIP   TECHNOLOGY   TALENTS   §  Start  w/  credibility  projects   §  Context  and  ownership   §  Scalable  path  to  innova1on   §  Leverage  technology  to   improve  quality  and  speed   §  Reliability  is  key     §  Automate,  Automate!   ?   38  
  • 39. Data  Science  as  Product  Partner   §  We  put  great  emphasis  on   understanding  and  solving  real   business  problems   – Data  sense:  what  is  possible   – Product  sense:  what  is  valuable   §  We  value  people  who  think   holis1cally  and  in  scale   – Cross-­‐product  impact   – Viral  effect   39   Talents  
  • 40. Tutorial  Takeaway   PARTNERSHIP   TECHNOLOGY   TALENTS   §  Start  w/  credibility  projects   §  Context  and  ownership   §  Scalable  path  to  innova1on   §  Leverage  technology  to   improve  quality  and  speed   §  Reliability  is  key     §  Automate,  Automate!   §  Passion  for  product  innova1on   §  Proac;ve  in  partnership   §  Impact  oriented   Data  Science  as  a  Partner   40  
  • 42. Data-­‐Driven  Product  Innova1on   Xin  Fu   LinkedIn   Hernán  Asorey   Salesforce