Keep it Simple
And it works!
Keep it simple, stupid (KISS Principle)
Occam's principle of parsimony
Among competing hypotheses that explain the data equally well, prefer the one that makes the fewest assumptions
Simple data science - but why?
Complexity depletes robustness
Mathematical reason
1. Curse of dimensionality
2. Sparse matrix
3. Tough to optimize
4. Overfitting
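A minimal sketch of the first point, the curse of dimensionality, using only the Python standard library. It is an illustration rather than anything from the talk: the function name `relative_contrast`, the uniform random data, and the sample sizes are all assumptions. The idea is that as the number of dimensions grows, the nearest and farthest neighbours of a query point end up at almost the same distance, so distance-based reasoning loses its discriminating power.

```python
import math
import random


def relative_contrast(n_points, n_dims, seed=0):
    """Return (farthest - nearest) / nearest for one random query point.

    Points are drawn uniformly from the unit hypercube (an assumption made
    purely for illustration). A small value means all points look roughly
    equally far away from the query.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        distances.append(math.dist(query, point))  # Euclidean distance
    nearest, farthest = min(distances), max(distances)
    return (farthest - nearest) / nearest
```

Calling `relative_contrast(200, dims)` for `dims` in, say, 2, 10, 100 and 1000 should show the contrast shrinking as features are added, which is one way to see why piling on dimensions can hurt rather than help.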
Pragmatic reason
1. Danger of unstable data pipelines
2. Compute time
Simple solutions sell faster
Humans need not explain how they think - but machines need to!
Spend more of the time analyzing the data
Explore the data
Create features
You get a high return on effort
Defining the right business problem
Choosing the right metric
Choosing the right dependent variable, timeframe & data
Feature creation
Algorithm
Hyperparameter tuning
ML optimization funnel
Time in hand for adequate validations
1. Validations on business accuracy
2. Out-of-time validations
3. Thinking about any “leakage” variables
4. Tossing to business stakeholders
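Two of the validations above, out-of-time validation and the leakage check, can be sketched in a few lines. This is a hedged illustration, not the speaker's procedure: the row layout, the `observed_on` key, and the idea of recording the date on which each feature becomes available are assumptions made for the example.

```python
from datetime import date


def out_of_time_split(rows, cutoff):
    """Train on rows observed strictly before `cutoff`, validate on the rest.

    `rows` is a list of dicts with an `observed_on` date (hypothetical schema).
    """
    train = [r for r in rows if r["observed_on"] < cutoff]
    holdout = [r for r in rows if r["observed_on"] >= cutoff]
    return train, holdout


def leaky_features(feature_available_on, prediction_date):
    """Return names of features that only become known after the prediction date.

    Such features cannot exist at scoring time, so using them in training
    leaks information about the outcome.
    """
    return sorted(
        name
        for name, available_on in feature_available_on.items()
        if available_on > prediction_date
    )
```

Splitting by time rather than at random keeps the holdout set in the model's "future", which is closer to how the model will actually be used.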
Simple data science - and how?
Understand status quo
1. What does the current system do today? How is success measured?
Does the problem warrant an ML solution?
1. Some problems can be solved with business rules alone
a. For example, a shampoo company wants to send offers to all customers who have bought the product earlier
b. The company is not averse to spending on everyone who qualifies, nor is it concerned about the response rate
2. Sometimes even a frequency distribution can solve the problem
a. We want to predict the top 3 reasons for calling out of 10 available reasons
b. The top 3 reasons cover, let's say, 95% of the use cases
c. The ordering of the reasons within the top 3 does not matter
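The call-reason example can be checked with a plain frequency count before any model is built. The sketch below assumes the historical call reasons are available as a simple list of labels; the function name `top_k_coverage` is hypothetical.

```python
from collections import Counter


def top_k_coverage(reasons, k=3):
    """Return the k most frequent reasons and the share of calls they cover."""
    if not reasons:
        raise ValueError("need at least one observed reason")
    counts = Counter(reasons)
    top = counts.most_common(k)  # list of (reason, count), most frequent first
    covered = sum(count for _, count in top)
    return [reason for reason, _ in top], covered / len(reasons)
```

If the returned coverage is already close to what the business needs (the slide's illustrative figure is 95%), always predicting those top reasons is a reasonable baseline, and a model has to beat it to justify its cost.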
Prefer supervised over unsupervised
1. Label data whenever possible
2. Instrument dependent variables if possible
3. Accuracy trade-off vs. cost of collecting samples
Prefer batch models over real-time models

Keep it simple and it works - Simplicity and sticking to fundamentals in the world of big data, by Mathangi Sri, Head of Data Science at PhonePe



 
Econ3060_Screen Time and Success_ final_GroupProject.pdf
Econ3060_Screen Time and Success_ final_GroupProject.pdfEcon3060_Screen Time and Success_ final_GroupProject.pdf
Econ3060_Screen Time and Success_ final_GroupProject.pdf
 
Digital Marketing Performance Marketing Sample .pdf
Digital Marketing Performance Marketing  Sample .pdfDigital Marketing Performance Marketing  Sample .pdf
Digital Marketing Performance Marketing Sample .pdf
 

Keep it simple and it works: Simplicity and sticking to fundamentals in the world of big data, by Mathangi Sri, Head of Data Science at PhonePe

  • 1. Keep it Simple And it works!
  • 2. Keep it simple, stupid (KISS Principle)
  • 3. Occam’s principle of parsimony: the hypothesis that is more probable is likely the right one to explain the observation.
  • 4. Simple data science: but why?
  • 5. Complexity depletes robustness. Mathematical reasons: (1) curse of dimensionality, (2) sparse matrices, (3) tough to optimize, (4) overfitting. Pragmatic reasons: (1) danger of unstable data pipelines, (2) compute time.
  • 6. Simple solutions sell faster. Humans need not explain how they think, but machines need to!
  • 7. Use more of the time to analyze the data: explore the data and create features.
  • 8. You get a high return on effort. ML optimization funnel: defining the right business problem; choosing the right metric; choosing the right dependent variable, timeframe, and data; feature creation; algorithm; hyperparameter tuning.
  • 9. Time in hand for adequate validations: (1) validations on business accuracy, (2) out-of-time validations, (3) checking for any “leakage” variables, (4) tossing to business stakeholders.
  • 10. Simple data science: and how?
  • 11. Understand the status quo. What does the current system do today? How is success measured?
  • 12. Does the problem warrant an ML solution? 1. Some problems can be solved by business rules alone. a. For example, a hair shampoo company wants to send offers to all customers who have bought the product earlier. b. The company is not averse to spending on everyone who qualifies, nor is it concerned about the response rate. 2. Sometimes even a frequency distribution can solve the problem. a. We want to predict the top 3 reasons for calling out of 10 available reasons. b. The top 3 reasons cover, let's say, 95% of the use cases. c. The ordering of the reasons within the top 3 does not matter.
  • 13. Prefer supervised over unsupervised. 1. Label data whenever possible. 2. Instrument dependent variables if possible. 3. Weigh the accuracy trade-off against the cost of collecting samples.
  • 14. Prefer batch models over real-time models.
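
The frequency-distribution idea from slide 12 can be sketched in a few lines of Python. This is only an illustration: the reason codes and counts below are made up, and `top_k_reasons` and `coverage` are hypothetical helper names, not anything from the talk.

```python
from collections import Counter


def top_k_reasons(history, k=3):
    """Return the k most frequent call reasons seen in the historical log."""
    return [reason for reason, _ in Counter(history).most_common(k)]


def coverage(history, top):
    """Fraction of calls whose reason falls inside the predicted top set."""
    top_set = set(top)
    return sum(reason in top_set for reason in history) / len(history)


# Hypothetical call log with invented counts per reason code.
calls = ["r1"] * 50 + ["r2"] * 30 + ["r3"] * 15 + ["r4"] * 3 + ["r5"] * 2

top3 = top_k_reasons(calls)
print(top3, coverage(calls, top3))
```

If the coverage of such a baseline is already close to what the business needs, and ordering within the top 3 does not matter, a trained model may add little.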

Editor's Notes

  1. Kelly Johnson
  2. Henny-Penny the sky is falling
  3. Prof from Georgia Tech
  4. Stakeholders find highly complex black-box models far more difficult to digest. For NLP and CV, people don't question the model. For personalization, targeting, or risk-based selection problems, people do need it explained. Even if the mathematical model cannot be explained, at least the input features need to be explainable.
  5. Fewer labels. A good number of samples per label. Historic data is representative.
  6. Most companies have systems well oiled for batch. A lot of targeting, personalization, fraud, risk, etc. can be solved well by batch. Trade-off analysis: accuracy vs. complexity. Keep an offset period in models.
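
One possible reading of the out-of-time validation (slide 9) combined with the offset period (note 6) is a time-based split that leaves a gap between the training window and the test window. The sketch below assumes that reading; `out_of_time_split`, the dates, and the two-day offset are all hypothetical choices for illustration.

```python
from datetime import date, timedelta


def out_of_time_split(rows, cutoff, offset_days=0):
    """Split (day, record) pairs by time.

    Train on rows strictly before `cutoff`; test on rows on or after
    `cutoff + offset_days`. Rows inside the offset gap are dropped.
    """
    test_start = cutoff + timedelta(days=offset_days)
    train = [row for row in rows if row[0] < cutoff]
    test = [row for row in rows if row[0] >= test_start]
    return train, test


# Hypothetical data: one record per day for ten consecutive days.
rows = [(date(2024, 1, 1) + timedelta(days=i), i) for i in range(10)]

train, test = out_of_time_split(rows, cutoff=date(2024, 1, 6), offset_days=2)
print(len(train), len(test))
```

The gap mimics the delay a batch system has between when features are computed and when predictions are used, so the validation does not flatter the model with data it would not have had in production.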