It’s All About Me
From Big Data Models to Personalized
Experience
Yao Morin, Ph.D.
Go from this…
… to this …
• 30 million users filed their taxes with TurboTax
• 5 million used desktop
• 25 million used online
• TurboTax is 25 years old
• Roots as a desktop app (and an old one)
SERVICES
Business Logic and TurboTax
• Hard-coded business logic
• Fixed UI flow
• Domain knowledge embedded
Experience A
Experience B
We know what you PREFER
We serve up what’s RELEVANT to you
We know when you need HELP
How can we tailor the experience just for YOU?
Marriage between Data Science and Dynamic and Responsive Frontend
What is Data Science?
• It is a multidisciplinary field that incorporates techniques and theories from many fields, such as statistics, mathematics, artificial intelligence, and data engineering
• It answers questions based on data instead of assumptions
  - Extract meaning from data and explain phenomena
  - Uncover patterns from data and develop predictive models
From business problems to models
• E2E goals definition
• Model KPI, input/output definition
• Model creation and offline evaluation
• Online model coding & validation
• Integration/experience QA
• Online evaluation
• Result analysis
Model building steps:
• Training/test set preprocessing
• Algorithm & method selection
• Model training/parameter selection
• KPI measurement/accuracy assessment
Data model building cycle
• Training/test set preprocessing
• Algorithm & method selection
• Model training/parameter selection
• KPI measurement/accuracy assessment
Identify data
• Features: what information you have
  - From data inventory and/or domain experts
  - Examples: demographic, behavioral, or geographic data, etc.
• Labels (for supervised learning): what you want to predict
  - What kind of products to recommend
  - Whether a customer buys a product
  - How a customer reacts to an experience
Pre-processing data
• “Encoding” categorical data
  - ZIP code, feelings, occupations
  - Dummy coding, bucketing, and others
• Imputation: “filling in” missing data
  - ML estimation, stochastic regression, multiple imputation
• Other cleaning
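The encoding and imputation steps above can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the talk: the records, field names, and helper functions are hypothetical, and plain mean imputation stands in for the more sophisticated techniques the slide lists.

```python
from statistics import mean

# Hypothetical raw records; None marks a missing value.
records = [
    {"occupation": "teacher", "zip": "94043", "income": 52000.0},
    {"occupation": "engineer", "zip": "10001", "income": None},
    {"occupation": "teacher", "zip": "94105", "income": 61000.0},
]


def dummy_code(values):
    """Dummy (one-hot) encode a list of categorical values."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


def bucket_zip(zip_code):
    """Bucket a ZIP code by its first digit, a coarse regional grouping."""
    return zip_code[0]


def impute_mean(values):
    """Fill missing entries with the mean of the observed ones.

    Assumption: mean imputation is swapped in here for simplicity.
    """
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


occupation_dummies, occupation_categories = dummy_code(
    [r["occupation"] for r in records]
)
zip_buckets = [bucket_zip(r["zip"]) for r in records]
incomes = impute_mean([r["income"] for r in records])
```

Bucketing trades detail for fewer categories, which matters for fields like ZIP code where dummy coding every distinct value would produce a very wide, sparse feature set.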
Model training
Learning the relationship between features and labels through data
Not this kind of relationship, but this kind: Labels = f(Features), where f is a regressor, classifier, etc.
Model evaluation
• Evaluate model performance against model-specific performance metrics with hold-out data, and iterate on
  - Model type
  - Hyperparameters
  - Features
  - …
Example: Training a model
• Inputs: user data and labels
• Preprocessing
• Separate into training and validation sets
• Training set: model training (Random Forest)
• Validation set: model validation (FP/FN)
• Output: model metric
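The flow in this example can be sketched end to end with the standard library alone. To keep the sketch dependency-free, a one-feature threshold classifier is swapped in for the Random Forest named on the slide, the user data is synthetic, and preprocessing is skipped; all function names are hypothetical.

```python
import random


def make_users(n, seed=0):
    """Generate hypothetical (feature, label) pairs with a noisy relationship."""
    rng = random.Random(seed)
    users = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        label = 1 if x + rng.gauss(0.0, 1.0) > 5.0 else 0
        users.append((x, label))
    return users


def split(data, validation_fraction=0.25, seed=1):
    """Shuffle a copy of the data, then separate training and validation sets."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]


def train_threshold(training_set):
    """Pick the threshold that makes the fewest mistakes on the training set.

    Assumption: this simple model stands in for Random Forest training.
    """
    def errors(t):
        return sum((x > t) != bool(y) for x, y in training_set)

    candidates = sorted(x for x, _ in training_set)
    return min(candidates, key=errors)


def validate(threshold, validation_set):
    """Count false positives and false negatives on the held-out data."""
    fp = sum(1 for x, y in validation_set if x > threshold and y == 0)
    fn = sum(1 for x, y in validation_set if x <= threshold and y == 1)
    error_rate = (fp + fn) / len(validation_set)
    return {"fp": fp, "fn": fn, "error_rate": error_rate}


training_set, validation_set = split(make_users(200))
threshold = train_threshold(training_set)
metrics = validate(threshold, validation_set)
print(threshold, metrics)
```

The validation set is never seen during training, so the FP/FN counts estimate how the model would behave on new users rather than how well it memorized the training data.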
Advantages of data models
To deliver a dynamic, personalized experience, we need to decide algorithmically what to show out of a large variety of possible experiences.
Data models solve this:
- Connect user data to user preferences
- Machine learning is automated and handles the complexity
Limitations of data models
• Uncertainties
  - May not be suitable when applications require 100% accuracy
  - May need built-in safeguards for applications that require high accuracy
• Vulnerable to inaccurate, missing, or insufficient data
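One possible form of safeguard is a confidence threshold with a fixed fallback. The sketch below is only an illustration of that idea: the function name, the experience names, and the 0.8 cutoff are assumptions, not details from the talk.

```python
def choose_experience(probabilities, default="standard_flow", min_confidence=0.8):
    """Return the model's top choice only when it is confident enough.

    `probabilities` maps hypothetical experience names to model scores.
    Missing or empty model output falls back to the default experience.
    """
    if not probabilities:
        return default
    best, score = max(probabilities.items(), key=lambda item: item[1])
    return best if score >= min_confidence else default


confident = choose_experience({"guided_flow": 0.9, "expert_flow": 0.1})
uncertain = choose_experience({"guided_flow": 0.55, "expert_flow": 0.45})
no_output = choose_experience(None)
```

Here `confident` is the model's pick, while `uncertain` and `no_output` both fall back to the default, so an uncertain or unavailable model degrades to the fixed experience instead of guessing.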
Traditional process flow
User Requests
• Send information about the user
Logic
• Dispatcher
• If… else… logic blocks
Pages
• Static flow
• Static pages
• Hide/show DOM elements
Dynamic process flow
User Requests
• Send information about the user
Model Service Platform
• Hosts models
• Processes user requests based on user data received
Player
• Consumes the received decision and generates the final user experience
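The contrast between the two flows can be sketched as follows. Everything here is hypothetical: the class and function names, the user fields, and the page and template names are illustrative, and the hosted "model" is a placeholder function rather than a trained model.

```python
from typing import Callable, Dict

User = Dict[str, object]
Decision = Dict[str, str]


def traditional_dispatch(user: User) -> str:
    """Traditional flow: hard-coded if/else logic picks a static page."""
    if user.get("has_dependents"):
        return "dependents_page"
    elif user.get("self_employed"):
        return "business_income_page"
    else:
        return "standard_page"


class ModelService:
    """Hypothetical stand-in for a model service platform that hosts models."""

    def __init__(self) -> None:
        self._models: Dict[str, Callable[[User], Decision]] = {}

    def host(self, name: str, model: Callable[[User], Decision]) -> None:
        self._models[name] = model

    def decide(self, name: str, user: User) -> Decision:
        """Process a user request with the named model and return its decision."""
        return self._models[name](user)


def player(decision: Decision) -> str:
    """Consume the decision and generate the final experience (a string here)."""
    return f"<{decision['template']} topic={decision['topic']!r}>"


def next_topic_model(user: User) -> Decision:
    # Placeholder for a trained model; a real one would score many candidates.
    topic = "deductions" if user.get("owns_home") else "income"
    return {"template": "question_card", "topic": topic}


service = ModelService()
service.host("next_topic", next_topic_model)
rendered = player(service.decide("next_topic", {"owns_home": True}))
```

In the dynamic version the player knows nothing about how the decision was made, so a model can be retrained or replaced on the service without touching the rendering code.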
Design With Data Science Mindset
Not Static
• Data science and static do not mix
• Do not hardcode paths/pages
Configurable
• Data science works well with configurable components
• Use templates
Scalability
• Experiences should support large amounts of variability
• Use templates (again!)
Maintainability
• A refresh of design should not break underlying logic
• Build experiences with separation of logic and design
How do we apply Data Science to the TurboTax UI?
Dynamic Views
Traditional Dynamic UI: Dynamic Data + Static Templates = Dynamic Site
Truly Dynamic UI: Dynamic Data + Dynamic Semantic Templates = Dynamic Site
{type: template }
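One way to read "dynamic semantic templates" is that the data itself names the template to use through a `type` field, and a renderer interprets it. The sketch below assumes that reading; the payload shape, node types, and `render_text` function are hypothetical and not taken from the talk.

```python
import json

# Hypothetical semantic description, as a model-backed service might return it.
payload = json.loads("""
{"type": "page", "children": [
    {"type": "heading", "text": "Your deductions"},
    {"type": "choice", "prompt": "Do you own a home?", "options": ["Yes", "No"]}
]}
""")


def render_text(node):
    """Render a semantic node for a plain-text device."""
    kind = node["type"]
    if kind == "page":
        return "\n".join(render_text(child) for child in node["children"])
    if kind == "heading":
        return node["text"].upper()
    if kind == "choice":
        options = " / ".join(node["options"])
        return f"{node['prompt']} [{options}]"
    raise ValueError(f"unknown node type: {kind}")


text_view = render_text(payload)
print(text_view)
```

Because the payload describes meaning (a heading, a choice) rather than markup, the same description could be handed to a different renderer for another device without changing the data.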
Dynamic Flow
Statically Defined Routes/States
• Relationships between pages are pre-determined
• Entry points into the app are pre-determined
• All flow and variation in the application is hard coded
Dynamic Finite State Machine
• Relationships among data are pre-determined
• Entry points are determined dynamically
• Flow through the application is completely data driven
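A data-driven finite state machine can be sketched as a loop that asks a decision function for the next state instead of following hard-coded routes. This is a minimal sketch under assumptions: the `DynamicFlow` class, the topic names, and the rule-based `decide_next` (standing in for a data science model) are all hypothetical.

```python
class DynamicFlow:
    """Finite state machine whose next state comes from a decision function."""

    def __init__(self, decide_next):
        self.decide_next = decide_next
        self.state = None
        self.history = []

    def step(self, user_data):
        """Ask the decision function for the next state; None ends the flow."""
        self.state = self.decide_next(self.state, user_data, self.history)
        if self.state is not None:
            self.history.append(self.state)
        return self.state


def decide_next(state, user_data, history):
    """Hypothetical stand-in for a model: visit each relevant topic once."""
    relevant = [t for t in ("income", "dependents", "home") if user_data.get(t)]
    remaining = [t for t in relevant if t not in history]
    return remaining[0] if remaining else None


flow = DynamicFlow(decide_next)
user = {"income": True, "home": True}
while flow.step(user) is not None:
    pass
print(flow.history)
```

Both the entry point and the path depend only on the user data: a user with just `home` set would enter the flow at the `home` state, with no route table to update.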
FUEGO
• Data science model enabled
• Semantically defined dynamic experiences
• Dynamic application flow
• Device agnostic representation of the UI
• Device specific applications to render the UI
