Intelligent Applications with Machine Learning Toolkits
1. Intelligent Applications with Machine Learning Toolkits
Shawn Scully - VP of Customer Success & Applications
scully@dato.com @backwoodsbrains
2. Within 5 years, every innovative application
will be intelligent.
3. Intelligent applications create tremendous value
…but take a lot of time & specialized skills to build.
Recommenders
Lead Scoring
Churn Prediction
Multi-channel Targeting
Auto-Summarization
Fraud detection
Intrusion Detection
Demand Forecasting
Data Matching
Failure Prediction
4. Our mission is to
accelerate innovators to create intelligent
applications with agile machine learning.
5. Needs of an Agile ML Platform
GraphLab Create: rapid development
Dato Predictive Services: deploy as microservice; live serving, monitoring, & model management
Iterate w/feedback
7. Algorithms vs. toolkits
Algorithms (e.g. SVD++ w/SGD vs. SVD)
• PhD students care a lot about these!
• Many papers focused on “my curve is better than your curve”
• Not always the most practical…
Toolkits
• Grouped by a common task
• Focused on meaningful differences in data & problem
• Practical implementations
Example toolkit: Recommender
• item similarity
• SVD++
• iALS
• factorization machine
• many more!
8. Easily create a live machine learning service
Create a Recommender: 5 lines of code, toolkit w/auto selection
    import graphlab as gl
    # Load (user, movie, rating) observations from CSV
    data = gl.SFrame.read_csv('my_data.csv')
    # The toolkit selects a recommender model for the data
    model = gl.recommender.create(
        data,
        user_id='user',
        item_id='movie',
        target='rating')
    # Top 5 recommendations per user
    recommendations = model.recommend(k=5)
Deploy in minutes
    cluster = gl.deploy.load('s3://path')
    cluster.add('servicename', model)
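To give a feel for what one of the simpler models behind a recommender toolkit does, here is a minimal item-similarity sketch in plain Python. It is not GraphLab Create's implementation: the toy data, the choice of Jaccard similarity, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Toy (user, item) interactions; hypothetical data for illustration only.
interactions = [
    ("ann", "Alien"), ("ann", "Brazil"),
    ("bob", "Alien"), ("bob", "Brazil"), ("bob", "Casablanca"),
    ("cat", "Brazil"), ("cat", "Casablanca"),
]

def item_similarity(pairs):
    """Jaccard similarity between items, based on the users they share."""
    users_by_item = defaultdict(set)
    for user, item in pairs:
        users_by_item[item].add(user)
    sim = defaultdict(dict)
    for a, b in combinations(sorted(users_by_item), 2):
        shared = len(users_by_item[a] & users_by_item[b])
        total = len(users_by_item[a] | users_by_item[b])
        if shared:
            sim[a][b] = sim[b][a] = shared / total
    return sim

def recommend(user, pairs, k=5):
    """Rank items the user has not seen by summed similarity to seen items."""
    sim = item_similarity(pairs)
    seen = {item for u, item in pairs if u == user}
    scores = defaultdict(float)
    for item in seen:
        for other, s in sim[item].items():
            if other not in seen:
                scores[other] += s
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]

print(recommend("ann", interactions, k=1))
```

A toolkit adds the parts this sketch leaves out: choosing among several such models, handling explicit ratings, and scaling past what fits in memory.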
12. Recommend
Value:
• Increase user engagement
• Sell more/increase clickthrough
• Create better user experiences
Goal: Find or recommend similar or related items.
16. Sentiment Analysis & Product Sentiment
Value:
• Quantitative measures from unstructured text
• Eliminate the need to read everything
• Summarize on aspects you care about
Goal: Score sentiment of a sentence, document, or aspect.
17. Sentiment scoring - Data + Toolkit
sentiment_analysis
graphlab.sentiment_analysis.create
graphlab.product_sentiment.create
23. Churn Prediction
Value:
• Keep your customers
• Optimize marketing/customer success spend
• Identify issues with product or business
Goal: Identify users that are likely to stop doing something
(e.g. paying for your service, using a product feature, etc.)
24. Problem setup
Period 1: Features
Period 2: Target
Period 3: Hold out set
Goal: model that predicts if a user does not appear in Period 2
Evaluation: score for (app, user) pairs absent in Period 3
[Diagram: Features + Target → Machine learning model → Evaluation]
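The period-based setup can be sketched in plain Python: pairs are taken from Period 1 activity, and the label records whether each (app, user) pair is absent in Period 2. The event tuples, period boundaries, and function name are hypothetical; Period 3 would be held out the same way for evaluation.

```python
from datetime import date

# Hypothetical (app, user, day) usage events.
events = [
    ("mail", "u1", date(2015, 1, 5)),
    ("mail", "u1", date(2015, 2, 10)),
    ("mail", "u2", date(2015, 1, 20)),
    ("chat", "u2", date(2015, 1, 25)),
    ("chat", "u2", date(2015, 2, 3)),
]

def churn_labels(events, feature_period, target_period):
    """Label each (app, user) pair that was active in the feature period.

    A pair is labeled True (churned) when it has no event in the
    target period. Periods are (start, end) with end exclusive.
    """
    def pairs_in(period):
        start, end = period
        return {(app, user) for app, user, day in events if start <= day < end}

    active_before = pairs_in(feature_period)
    active_after = pairs_in(target_period)
    return {pair: pair not in active_after for pair in sorted(active_before)}

period_1 = (date(2015, 1, 1), date(2015, 2, 1))  # features come from here
period_2 = (date(2015, 2, 1), date(2015, 3, 1))  # target comes from here
labels = churn_labels(events, period_1, period_2)
print(labels)
```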
25. Data Transformations
Aggregate to generate predictive features
[Diagram: a time-ordered event table (app, user, time, etc.) is aggregated into one row per unique (app, user) pair, with columns (app, user, feature 1, feature 2)]
Features:
● time since last use
● time since first use
● # unique days user has used app
● # times user used app in last delta days
● Rolling aggregates
● etc.
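As a rough illustration of this aggregation step, the sketch below turns one (app, user) pair's usage dates into the first four features listed. The function name, the feature names, and the sample dates are assumptions; rolling aggregates are left out to keep it short.

```python
from datetime import date, timedelta

def usage_features(days, as_of, delta_days=30):
    """Aggregate one (app, user) pair's usage dates into features.

    `days` lists the dates on which the user used the app, all on or
    before `as_of`; repeats mean several uses on the same day.
    """
    window_start = as_of - timedelta(days=delta_days)
    return {
        "days_since_last_use": (as_of - max(days)).days,
        "days_since_first_use": (as_of - min(days)).days,
        "unique_days_used": len(set(days)),
        # Counts every use (not unique days) inside the trailing window.
        "uses_in_last_delta_days": sum(1 for d in days if d > window_start),
    }

# Hypothetical usage log for a single (app, user) pair.
log = [date(2015, 1, 1), date(2015, 3, 21), date(2015, 3, 21), date(2015, 4, 2)]
features = usage_features(log, as_of=date(2015, 4, 10), delta_days=30)
print(features)
```

Running this once per unique (app, user) pair yields the one-row-per-pair feature table that a model can train on.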
26. Predict Churn - Data + Toolkit
user_id  event     datetimestamp
103      play      '01-01-15'
102      click     '02-05-15'
102      visit     '03-06-15'
102      visit     '03-09-15'
103      purchase  '03-21-15'
103      click     '03-22-15'
102      click     '03-23-15'
103      click     '04-02-15'
103      play      '04-01-15'
103      purchase  '05-02-15'
103      play      '05-01-15'
103      play      '05-15-15'
churn_predictor
graphlab.churn_predictor.create
28. Examples of data matching
record = {'SSN': None,
          'Name': 'Smith, Will',
          'Sex': 'Male',
          'ZIP': 94701}
29. Data Matching
Value:
• Deduplicate contacts/records
• “360 view” of customer across multiple properties
• Improve data quality
Goal: Identify entities & appropriately link records.
30. Data matching – Data + Toolkit
data_matching
graphlab.deduplication.create
graphlab.record_linker.create
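For intuition about what deduplication involves, here is a small standard-library sketch that flags likely duplicate records. It does not use or reproduce the GraphLab Create toolkit: the records, the ZIP-based blocking rule, the 0.8 threshold, and the function names are all assumptions for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical contact records with the same fields as the example record.
records = [
    {"SSN": None, "Name": "Smith, Will", "Sex": "Male", "ZIP": 94701},
    {"SSN": None, "Name": "Smith, William", "Sex": "Male", "ZIP": 94701},
    {"SSN": None, "Name": "Jones, Ann", "Sex": "Female", "ZIP": 10001},
]

def name_similarity(a, b):
    """Similarity ratio in [0, 1] between two case-normalized names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def likely_duplicates(records, threshold=0.8):
    """Return index pairs whose ZIP matches and whose names are similar."""
    matches = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            a, b = records[i], records[j]
            if a["ZIP"] != b["ZIP"]:
                continue  # cheap blocking step: only compare within a ZIP
            if name_similarity(a["Name"], b["Name"]) >= threshold:
                matches.append((i, j))
    return matches

duplicates = likely_duplicates(records)
print(duplicates)
```

Comparing every pair is quadratic in the number of records, which is one reason real data matching tools lean on blocking and nearest-neighbor search.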
33. Tools built for innovators
The Agile Machine Learning Platform
34. Agility to create machine learning services
GraphLab Create
Application Toolkits:
• Auto-select the best algorithm
• Auto-prepare the data for ML
• Task-oriented methods
Data Layer for ML
• Manipulate all relevant data types
• Out-of-core design eliminates scale pains
Robust Enterprise-Grade Algorithms
• 50+ best-practice & novel algorithms
• Robust to real-world data
35. Agility to deploy – Microservices on AWS, on premises, YARN
Dato Predictive Services
Real-time Recommendations
Online Ad Scoring & Serving
Transactional Fraud detection
37. Thanks!
get the software!: https://www.dato.com/download/
platform overview: https://dato.com/products/
talk about ML at your company: scully@dato.com
Toolkits:
overview: https://dato.com/products/create/docs/graphlab.toolkits.html
recommender: https://dato.com/products/create/docs/graphlab.toolkits.recommender.html
churn_predictor: https://dato.com/products/create/docs/graphlab.toolkits.churn_predictor.html
similarity_search: https://dato.com/products/create/docs/graphlab.toolkits.data_matching.html#similarity-search-model
sentiment_analysis: https://dato.com/products/create/docs/graphlab.toolkits.sentiment_analysis.html
data_matching: https://dato.com/products/create/docs/graphlab.toolkits.data_matching.html
Editor's Notes
Empower businesses not about create, stay competitive, destroy,
Innovators want…
Have I convinced you that we are right for you?
Why not?