Building Personalized
Data Products with Dato
Trey Causey
trey@dato.com
Questions?
• Now: We are monitoring the chat window
• Later: Email me at trey@dato.com
• dato.com
What are data products?
• Products that produce and consume data.
• Products that improve as they produce and
consume data.
• Products that use data to provide a personalized
experience.
• Personalized experiences increase engagement
and retention.
What data?
• You probably already have this data
• Usage logs, transaction data, etc.
• Need a way to turn this existing data into
an intelligent application
Recommender systems
• Personalized experiences through
recommendations
• Recommend products, social network
connections, events, songs, and more
• Implicitly and explicitly drive many of
the experiences you’re familiar with
Recommender uses
• Netflix, Spotify, LinkedIn, Facebook are the most
visible examples
• “You May Also Like”
“People You May Know”
“People to Follow”
• Also silently power many other experiences
• Product listings, up-sell options, add-ons
• Netflix offered a $1MM prize for 10% better recommendations
What data do you need?
• Required for implicit data
• User identifier
• Product identifier
• That’s it!
• Further customization
• Ratings (explicit data), counts
• Side data
Implicit data
• User x product
interactions
• Consumed / used /
clicked / etc.
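A minimal sketch of turning a usage log into a user x product interaction matrix. The log entries, user names, and product names are made up for illustration; only a user identifier and a product identifier are assumed.

```python
import numpy as np

# Hypothetical usage log: one (user_id, product_id) pair per interaction.
log = [
    ("alice", "song_a"),
    ("alice", "song_b"),
    ("bob", "song_b"),
    ("bob", "song_c"),
    ("carol", "song_a"),
    ("carol", "song_b"),
    ("carol", "song_b"),  # a repeated interaction
]

# Map identifiers to row and column positions.
users = sorted({u for u, _ in log})
items = sorted({i for _, i in log})
user_index = {u: k for k, u in enumerate(users)}
item_index = {i: k for k, i in enumerate(items)}

# counts[u, i] = number of times user u interacted with item i.
counts = np.zeros((len(users), len(items)), dtype=int)
for u, i in log:
    counts[user_index[u], item_index[i]] += 1

# Binary implicit matrix: 1 if the user ever interacted with the item, else 0.
interactions = (counts > 0).astype(int)
print(interactions)
```

The counts matrix keeps how often each interaction happened; the binary matrix keeps only whether it happened.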
How do recommenders work?
• Most basic: item similarity
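One way to sketch item similarity on implicit data is Jaccard similarity between the sets of users who consumed each item. The choice of Jaccard, the toy matrix, and the function names below are assumptions for illustration, not the method used in the talk.

```python
import numpy as np

# Hypothetical binary interaction matrix: rows are users, columns are items.
interactions = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
])

def jaccard_item_similarity(matrix):
    """Item-by-item Jaccard similarity: |users of both| / |users of either|."""
    both = matrix.T @ matrix                   # co-occurrence counts
    per_item = matrix.sum(axis=0)              # number of users per item
    either = per_item[:, None] + per_item[None, :] - both
    sim = np.divide(both, either,
                    out=np.zeros(both.shape, dtype=float),
                    where=either > 0)
    np.fill_diagonal(sim, 0.0)                 # an item should not recommend itself
    return sim

def recommend(user_row, sim, k=2):
    """Score items by summed similarity to the user's items; return top-k unseen."""
    scores = sim @ user_row
    scores = np.where(user_row > 0, -np.inf, scores)  # hide items already seen
    ranked = np.argsort(-scores, kind="stable")
    return [int(i) for i in ranked[:k] if np.isfinite(scores[i])]

sim = jaccard_item_similarity(interactions)
print(recommend(interactions[0], sim))
```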
Matrix factorization
• Treat users and products as a giant matrix
with (very) many missing values
• Users have latent factors that describe
how much they like various genres
• Items have latent factors that describe
how strongly they belong to each genre
Matrix factorization
• Turn this into a fill-in-the-missing-value
exercise by learning the latent factors
• Implicit or explicit data
• Part of the winning formula for the Netflix
Prize
• Predict ratings or rankings
Matrix factorization
Fill in the blanks
• Learn the latent factors that minimize
prediction error on the observed values
• Fill in the missing values
• Sort the list by predicted rating &
recommend the unseen items
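The fill-in-the-blanks idea can be sketched with a small NumPy factorization trained by stochastic gradient descent. The ratings matrix, the hyperparameter values, and the function names are assumptions for illustration; production recommenders typically add bias terms and more careful tuning.

```python
import numpy as np

def factorize(ratings, n_factors=2, n_epochs=500, lr=0.01, reg=0.02, seed=0):
    """Learn user and item latent factors by SGD on the observed (non-NaN) entries."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    P = rng.normal(scale=0.1, size=(n_users, n_factors))  # user factors
    Q = rng.normal(scale=0.1, size=(n_items, n_factors))  # item factors
    observed = [(u, i) for u in range(n_users) for i in range(n_items)
                if not np.isnan(ratings[u, i])]
    for _ in range(n_epochs):
        for idx in rng.permutation(len(observed)):
            u, i = observed[idx]
            err = ratings[u, i] - P[u] @ Q[i]   # prediction error on one observed value
            p_old = P[u].copy()
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q

def recommend_unseen(ratings, P, Q, user, k=3):
    """Fill in the user's missing values and return unseen items, best first."""
    predicted = P[user] @ Q.T
    unseen = np.flatnonzero(np.isnan(ratings[user]))
    ranked = unseen[np.argsort(-predicted[unseen])]
    return [int(i) for i in ranked[:k]]

# Hypothetical explicit ratings; NaN marks a missing value.
ratings = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [4.0, np.nan, 1.0, 1.0],
    [1.0, 1.0, np.nan, 5.0],
    [np.nan, 1.0, 5.0, 4.0],
])

P, Q = factorize(ratings)
print(np.round(P @ Q.T, 2))                 # the completed matrix
print(recommend_unseen(ratings, P, Q, user=0))
```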
Rankings?
• Often less concerned with predicting
precise scores
• Just want to get the first few items right
• Screen real estate is precious
• Ranking factorization recommender
Side features
• Include information about users
• Geographic, demographic, time of day,
etc.
• Include information about products
• Product subtypes, geographic
availability, etc.
• Help with the cold start problem
How to choose which model?
• Select the appropriate model for your data
(implicit or explicit), decide whether you
want side features, then select and tune
hyperparameters…
• … or let GraphLab Create choose the model and
automatically tune hyperparameters for you
Evaluation
• Train on a portion of your data
• Test on a held-out portion
• Ratings: RMSE
• Ranking: Precision, recall
• Business metrics
• Evaluate against a popularity baseline
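The metrics above can be sketched in plain Python. The function names and the toy data are assumptions for illustration; the popularity function recommends the same most-consumed items to every user, which gives a simple reference point for a personalized model.

```python
import math
from collections import Counter

def rmse(actual, predicted):
    """Root mean squared error between held-out ratings and predictions."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def precision_recall_at_k(recommended, relevant, k):
    """Precision and recall of the top-k recommendations against held-out items."""
    top_k = recommended[:k]
    hits = len(set(top_k) & set(relevant))
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def popularity_baseline(train_pairs, k):
    """Recommend the k most frequently consumed items to everyone."""
    counts = Counter(item for _, item in train_pairs)
    return [item for item, _ in counts.most_common(k)]

# Hypothetical held-out ratings and model predictions.
print(rmse([4, 3, 5], [3.5, 3, 4]))

# Hypothetical ranked recommendations and the items the user actually consumed.
print(precision_recall_at_k(["a", "b", "c", "d"], {"b", "d", "e"}, k=3))

# Hypothetical training interactions as (user, item) pairs.
train_pairs = [("u1", "a"), ("u2", "a"), ("u3", "a"),
               ("u1", "b"), ("u2", "b"), ("u3", "c")]
print(popularity_baseline(train_pairs, k=2))
```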
Live demo
• Building and deploying a recommender
system with GraphLab Create and Dato
Predictive Services
Thank you!
• dato.com
• @datoinc
• trey@dato.com

Fwdays
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 

Recently uploaded (20)

Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 

Building Personalized Data Products with Dato

  • 1. Building Personalized Data Products with Dato Trey Causey trey@dato.com
  • 2. Questions? • Now: We are monitoring the chat window • Later: Email me at trey@dato.com • dato.com
  • 3. What are data products? • Products that produce and consume data. • Products that improve as they produce and consume data. • Products that use data to provide a personalized experience. • Personalized experiences increase engagement and retention.
  • 4. What data? • You probably already have this data • Usage logs, transaction data, etc. • Need a way to turn this existing data into an intelligent application
  • 5. Recommender systems • Personalized experiences through recommendations • Recommend products, social network connections, events, songs, and more • Implicitly and explicitly drive many of the experiences you’re familiar with
  • 6. Recommender uses • Netflix, Spotify, LinkedIn, and Facebook are the most visible examples • “You May Also Like” “People You May Know” “People to Follow” • Also silently power many other experiences • Product listings, up-sell options, add-ons • Netflix: $1MM prize for a 10% better recommender
  • 7. What data do you need? • Required for implicit data • User identifier • Product identifier • That’s it! • Further customization • Ratings (explicit data), counts • Side data
  • 8. Implicit data • User x product interactions • Consumed / used / clicked / etc.
  • 9. How do recommenders work? • Most basic: item similarity
  • 10. Matrix factorization • Treat users and products as a giant matrix with (very) many missing values • Users have latent factors that describe how much they like various genres • Items have latent factors that describe how strongly they belong to each genre
  • 11. Matrix factorization • Turn this into a fill-in-the-missing-value exercise by learning the latent factors • Implicit or explicit data • Part of the winning formula for the Netflix Prize • Predict ratings or rankings
  • 17. Fill in the blanks • Learn the latent factors that minimize prediction error on the observed values • Fill in the missing values • Sort the list by predicted rating & recommend the unseen items
  • 18. Rankings? • Often less concerned with predicting precise scores • Just want to get the first few items right • Screen real estate is precious • Ranking factorization recommender
  • 19. Side features • Include information about users • Geographic, demographic, time of day, etc. • Include information about products • Product subtypes, geographic availability, etc. • Help with the cold start problem
  • 20. How to choose which model? • Select the appropriate model for your data (implicit/explicit), decide whether you want side features, select hyperparameters, tune them… • … or let GraphLab Create do it for you and automatically tune hyperparameters
  • 21. Evaluation • Train on a portion of your data • Test on a held-out portion • Ratings: RMSE • Ranking: Precision, recall • Business metrics • Evaluate against a popularity baseline
  • 22. Live demo • Building and deploying a recommender system with GraphLab Create and Dato Predictive Services
  • 23. Thank you! • dato.com • @datoinc • trey@dato.com
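
The matrix factorization slides (10, 11, and 17) and the RMSE evaluation on slide 21 can be illustrated with a small sketch. This is plain Python, not GraphLab Create's API or implementation. The toy ratings, the function names (`train_mf`, `predict`, `rmse`, `recommend`), and the hyperparameter values are all assumptions made for illustration. It learns latent factors by stochastic gradient descent on the observed ratings only, then ranks a user's unseen items by predicted rating.

```python
import math
import random


def train_mf(ratings, n_users, n_items, n_factors=2, lr=0.02, reg=0.02,
             epochs=500, seed=0):
    """Learn user and item latent factors by SGD on the observed ratings only."""
    rng = random.Random(seed)
    # Small random starting values for the user (P) and item (Q) factors.
    P = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(pf * qf for pf, qf in zip(P[u], Q[i]))
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item latent factors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def rmse(P, Q, ratings):
    """Root mean squared error over a list of (user, item, rating) triples."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(total / len(ratings))


def recommend(P, Q, u, seen, k=2):
    """Rank the items user u has not interacted with by predicted rating."""
    unseen = [i for i in range(len(Q)) if i not in seen]
    return sorted(unseen, key=lambda i: predict(P, Q, u, i), reverse=True)[:k]


# Hypothetical toy data: (user, item, rating) triples, most cells missing.
observed = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1),
            (2, 1, 5), (2, 2, 2), (3, 0, 1), (3, 2, 5)]
held_out = [(1, 1, 4)]  # a rating kept aside for testing, also made up

P, Q = train_mf(observed, n_users=4, n_items=3)
train_error = rmse(P, Q, observed)
held_out_error = rmse(P, Q, held_out)

seen_by_user_0 = {i for u, i, _ in observed if u == 0}
top = recommend(P, Q, 0, seen_by_user_0, k=1)
```

With real data the held-out error is the number to watch, since a model can fit the observed ratings closely and still generalize poorly. A ranking-oriented evaluation would instead compare the top-k list from `recommend` against held-out interactions using precision and recall.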