How Data Science Can Grow Your Business?
Le-Wagon talk, Tel-Aviv, 2018
Hi!
I am Noam Cohen
Lead Data-Scientist
Talk agenda
◎ What is data science?
◎ How is it used in the industry?
◎ DS methodology and life cycle
◎ Who are the data-team members?
◎ Limitations and caveats
1. What is Data Science?
It is...
It’s not!
Data Science (Wikipedia):
“An interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured”
Data Science vs. Statistics
◎ The term “data scientist” was coined by a statistician who proposed rebranding statisticians (Chien-Fu Jeff Wu, 1998)
◎ Statistics vs. DS - data modeling vs. algorithmic modeling (Leo Breiman, 2001)
◎ Data Science = Aggr(`stats`, `advanced computing`, `hacking`, `business logic`, `math`, `domain knowledge`, `data analysis`)
Demystifying data science
◎ DS purpose - achieving data-driven decision making (basing decisions on data with a certain level of confidence)
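One way to make "a certain level of confidence" concrete is a confidence interval on the difference between two conversion rates. The sketch below uses a normal approximation; the function name `conversion_diff_interval` and all the counts are hypothetical and chosen only for illustration, they do not come from the talk.

```python
from statistics import NormalDist


def conversion_diff_interval(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Normal-approximation confidence interval for rate_b - rate_a."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    # Standard error of the difference between two independent proportions.
    std_err = (rate_a * (1 - rate_a) / n_a + rate_b * (1 - rate_b) / n_b) ** 0.5
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = rate_b - rate_a
    return diff - z * std_err, diff + z * std_err


# Hypothetical experiment: 100/1000 conversions for A, 150/1000 for B.
low, high = conversion_diff_interval(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
# Decide for B only if the whole interval lies above zero.
ship_variant_b = low > 0
```

The approximation assumes reasonably large samples; with small counts an exact or Bayesian method would be a safer choice.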
Buzzwords terminology
Data Science (DS)
The science of recognizing and utilizing patterns in data in order to develop actionable insight and confidence for decisions.
Artificial Intelligence (AI)
Any technique which enables computers to mimic human behaviour.
Machine Learning (ML)
Subset of AI techniques which use statistical methods to enable machines to improve at tasks with experience.
Deep Learning (DL)
Subset of ML which, under certain conditions, can model the data with less human intervention.
2. How is it used in the industry?
Typical use cases and market overview
Why should I use DS in my business?
◎ Derive insights on business challenges
○ Sales
○ Pricing
○ Marketing
○ Churn
◎ Improve user experience
○ Faster
○ Personalized
○ Accurate
◎ Automate cumbersome routines involving human labor
Data science business - where and how
Marketing
◎ Advertisement targeting
◎ User profiling
◎ Targeted direct marketing
◎ Churn
◎ Causal modeling
◎ Optimized viral marketing
Sales
◎ Discount offering
◎ Demand forecasting
◎ Dynamic pricing
◎ Product bundling
◎ Sales monitoring and investigation
◎ Leads discovery and prioritization
◎ Upselling
Transportation
◎ Customer wait time estimation
◎ Recommending driver location via heatmap
◎ Surge pricing (Geosurge)
◎ Traffic and demand visualization
◎ Drive duration estimation
Israel companies overview
3. How should I use it?
Methodology and DS lifecycle
Methodology - preliminaries
◎ Digital service with growing user community
◎ Basic analytics (no need for AI/ML)
◎ Database instrumentation and data structuring
** If needed, create a rule-based system of expert-defined thresholds as the ‘AI’ backend and continue to gather data
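A rule-based backend of the kind mentioned on this slide can be very small. The sketch below is only an illustration: the signal names (`days_since_last_login`, `support_tickets_30d`) and the threshold values are hypothetical placeholders for whatever a domain expert would define.

```python
from dataclasses import dataclass


@dataclass
class UserActivity:
    # Hypothetical signals; replace with whatever the product actually logs.
    days_since_last_login: int
    support_tickets_30d: int


# Expert-defined thresholds (assumed values, for illustration only).
INACTIVE_DAYS = 14
MANY_TICKETS = 3


def churn_risk(user: UserActivity) -> str:
    """Return a coarse risk label from hand-written rules, no ML involved."""
    inactive = user.days_since_last_login >= INACTIVE_DAYS
    unhappy = user.support_tickets_30d >= MANY_TICKETS
    if inactive and unhappy:
        return "high"
    if inactive or unhappy:
        return "medium"
    return "low"
```

Logging each input together with the label it received is what lets the gathered data train a real model later.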
Methodology - CRISP-DM
Cross Industry Standard Process for Data Mining (CRISP-DM)
Methodology - Business understanding
Problem definition & business understanding
◎ Define business targets and qualitative success metrics
◎ Assess risks, costs, benefits and data resources
◎ Map the project to data science subtasks and identify the class of each problem
◎ Plan the project - estimate requirements, timeline and budget
Methodology - Data understanding
◎ Refine the initial data and enrich it if needed
◎ Match data to business problem
◎ Describe and explore the data
○ Spot anomalies and outliers
○ Basic amounts and value types
◎ Verify data quality
○ Missing data
○ Collection errors/biases
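The describe-and-verify steps above can be sketched for a single column as follows. This is a minimal illustration using only the standard library; the helper name `profile_column`, the sample `ages` values and the 1.5 x IQR outlier rule are assumptions, not something the talk prescribes.

```python
import statistics


def profile_column(values):
    """Summarize one column: size, missing count, value types, IQR outliers."""
    present = [v for v in values if v is not None]
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),
        "types": sorted({type(v).__name__ for v in present}),
        "outliers": [],
    }
    if len(numeric) >= 4:
        # Quartiles of the numeric values; points far outside them are flagged.
        q1, _, q3 = statistics.quantiles(numeric, n=4)
        iqr = q3 - q1
        low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        summary["outliers"] = [v for v in numeric if v < low or v > high]
    return summary


# Hypothetical column with one missing value and one suspicious entry.
ages = [34, 29, None, 41, 38, 30, 420, 36]
report = profile_column(ages)
```

A flagged value is only a candidate: whether 420 is a typo or a real measurement is a question for whoever collected the data.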
Methodology - Data Preparation
◎ Clean data
○ Correct errors
○ Fill missing data
◎ Select the right data
○ Representative
○ Data partitioning - train/test/hold-out
◎ Format data
◎ Beware of “leaks”
Figure source: KDnuggets Poll 2003
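The train/test/hold-out partitioning and one basic leak check can be sketched as below. The split ratios, the seed and the function names `partition` and `has_leak` are assumptions made for the example. Splitting on entity ids only guards against the same entity appearing on both sides; leaks through features that encode the target need a separate review.

```python
import random


def partition(ids, train=0.7, test=0.15, seed=42):
    """Shuffle unique ids and split them into train / test / hold-out lists."""
    unique = sorted(set(ids))          # de-duplicate so no id lands in two splits
    random.Random(seed).shuffle(unique)
    n_train = round(len(unique) * train)
    n_test = round(len(unique) * test)
    return (
        unique[:n_train],
        unique[n_train:n_train + n_test],
        unique[n_train + n_test:],     # the remainder is the hold-out set
    )


def has_leak(*splits):
    """True if any id appears in more than one split."""
    seen = set()
    for split in splits:
        if seen & set(split):
            return True
        seen |= set(split)
    return False


train_ids, test_ids, holdout_ids = partition(range(100))
```

The hold-out set is meant to stay untouched until the final evaluation, so that tuning decisions made on the test set do not inflate the reported performance.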
Methodology - Modeling
◎ Build cost/risk target to optimize
◎ Understand the models' assumptions and check data compatibility
◎ Build model and optimize parameters
◎ Generate test design
◎ Assess model on provided data
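A cost target of the kind named in the first bullet can be sketched by pricing each type of error and picking the decision threshold that minimizes the total. Everything here is hypothetical: the scores, the labels and the assumption that a missed positive costs five times as much as a false alarm.

```python
def total_cost(scores, labels, threshold, fp_cost, fn_cost):
    """Business cost of acting on every score at or above the threshold."""
    cost = 0.0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and not label:
            cost += fp_cost        # e.g. a wasted retention offer
        elif not predicted and label:
            cost += fn_cost        # e.g. a churner nobody contacted
    return cost


def best_threshold(scores, labels, fp_cost, fn_cost):
    """Try each observed score as a cut-off and keep the cheapest one."""
    candidates = sorted(set(scores)) + [float("inf")]  # inf means never act
    return min(
        candidates,
        key=lambda t: total_cost(scores, labels, t, fp_cost, fn_cost),
    )


# Hypothetical model scores and true outcomes.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
threshold = best_threshold(scores, labels, fp_cost=1.0, fn_cost=5.0)
```

The threshold should be chosen on validation data rather than on the final test set, otherwise the reported cost is optimistic.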
Methodology - Evaluation
◎ Analyze model performance and summarize results
○ New insights
○ A/B testing
○ Validation cases
◎ Error analysis
◎ Prediction interpretability
◎ Robustness and maintainability of model
◎ Business-related performance - cumulative response and lift curves
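Cumulative response and lift can be computed directly from ranked scores. The sketch below assumes binary labels with at least one positive; the function name `cumulative_response`, the chosen fractions and the sample data are hypothetical. Lift is taken here as the share of positives captured divided by the share of customers targeted, so random targeting has a lift of 1.

```python
def cumulative_response(scores, labels, fractions=(0.1, 0.2, 0.5, 1.0)):
    """For each top fraction of customers ranked by score, return
    (fraction targeted, share of all positives captured, lift over random)."""
    ranked = [
        label
        for _, label in sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    ]
    total_positives = sum(ranked)
    if total_positives == 0:
        raise ValueError("need at least one positive label")
    rows = []
    for fraction in fractions:
        k = max(1, round(len(ranked) * fraction))   # customers targeted
        captured = sum(ranked[:k]) / total_positives
        rows.append((fraction, captured, captured / (k / len(ranked))))
    return rows


# Hypothetical scores for ten customers, three of whom actually responded.
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
rows = cumulative_response(scores, labels)
```

Read this way, the curve answers a budget question: how many of the responders are reached if only the top-scored share of customers can be contacted.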
Methodology - Deployment
◎ Integrate prototype into the production system
◎ Implement software features inspired by the data-mining process
◎ Plan model maintenance and support
4. Who should I recruit?
Building your data team
Unicorn fairytale
Data science is actually comprised of multiple disciplines. Typically, a single creature cannot manage the engineering process, lead modeling efforts, coordinate the product roadmap, and articulate results to stakeholders.
The magnificent data warriors
AI Analyst
★ Descriptive and conditional statistics
★ Error analysis
★ Finding sense in results and monitoring production model performance
★ Feature engineering and formalization of prior knowledge
★ Domain expertise
★ Validation
★ Excel, SQL, DB, R (Scripting), Statistics
The magnificent data warriors
Data Scientist
★ Machine learning and statistical analysis
★ Experiment design and research
★ Familiar with Big Data technologies
★ Dev foundations - pipelines, testing, performance optimization
★ Storytelling and visualization
★ Feature engineering
★ Bias and leakage discovery
★ Generalization and overfitting
★ Python, R, Matlab, SQL, OOP, Spark, Pig, Hive
The magnificent data warriors
Data Engineer
★ Data orchestration and system architecture
★ Scaling with Big Data technologies
★ Database maintenance and data storage
★ Production processes - code deployment, optimization and testing
★ OOP and functional programming, Python/Java/Scala/Ruby/Clojure, Spark, Hadoop, Pig, Hive, DB & SQL, Jenkins, Luigi/Airflow
The magnificent data warriors
Manager
★ Setting goals
★ Tracking progress
★ Coordinating between team members
★ Strong understanding of data mining, evaluation metrics and statistics
★ Delivering results to stakeholders
★ Leader
Who & where?
Other options
◎ Let your developers carefully integrate licensed data science APIs for predictive modeling into the product.
◎ Outsource DS tasks to a consulting company.
5. Limitations and caveats
Setting realistic expectations
Limitations
◎ No magic - nothing can be learned when there is no predictive information in the data
◎ No 100% accuracy
◎ No hidden golden feature
◎ Not by tomorrow - results cannot be delivered overnight
◎ Tasks with a subjective nature are hard
◎ Outdated data and outdated models
◎ Train and test data discrepancies
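The last two limitations, outdated data and train/test discrepancies, can at least be monitored. The talk does not name a method; the sketch below uses the population stability index (PSI), one common choice, computed over equal-width bins of a single numeric feature. The bin count, the `1e-6` floor for empty bins and the sample data are assumptions.

```python
import math


def population_stability_index(expected, actual, bins=5):
    """PSI between a reference sample (e.g. training data) and a newer one."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0    # fall back to 1.0 for a constant feature

    def shares(sample):
        counts = [0] * bins
        for x in sample:
            # Clamp so values outside the reference range fall in the edge bins.
            index = min(bins - 1, max(0, int((x - lo) / width)))
            counts[index] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(sample), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical feature values at training time and in production.
training = list(range(100))
production = [x + 50 for x in training]
drift = population_stability_index(training, production)
```

A commonly quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the cut-off is a convention and worth calibrating per feature.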
Thanks!
Any questions?
You can find me at:
noambox@gmail.com
References
◎ Foster Provost and Tom Fawcett. 2013. Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking (1st ed.). O'Reilly Media, Inc.
◎ https://www.salesforce.com/quotable/articles/why-AI-will-be-your-new-best-friend-in-sales/
◎ https://hbr.org/2018/07/how-ai-is-changing-sales
◎ https://neilpatel.com/blog/how-uber-uses-data/
◎ https://www.marketingaiinstitute.com/blog/7-top-marketing-and-sales-companies-using-artificial-intelligence-and-machine-learning
◎ https://www.vccafe.com/2017/09/11/israels-machine-intelligence-startup-landscape-2017/
◎ https://www.slideshare.net/kuonen/a-statisticians-view-on-big-data-and-data-science
◎ https://www.datasciencecentral.com/profiles/blogs/10-most-popular-data-science-presentations-on-slideshare
◎ https://medium.com/high-alpha/how-to-build-a-great-data-science-team-d921fb41b5b1
◎ https://towardsdatascience.com/what-is-the-most-effective-way-to-structure-a-data-science-team-498041b88dae
◎ https://towardsdatascience.com/the-limits-of-data-science-b4e5faad20f4

How Data Science Can Grow Your Business?

  • 1. How Data Science Can Grow Your Business? Le-Wagon talk ,Tel-Aviv 2018
  • 2. Hi! I am Noam Cohen Lead Data-Scientist, 2
  • 3. Talk agenda 3 ◎What is data science? ◎How is it used in the industry? ◎DS methodology and life cycle ◎Who are the Data-team members? ◎Limitations and caveats
  • 7. “ Data Science (Wikipedia): An interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured 7
  • 8. Data Science vs. Statistics ◎ The term data scientist was originally coined by a statistician, trying to rebrand statisticians (Chien-Fu 1998) ◎ Statistics vs. DS - Data models vs. Algorithmic modeling (Leo Breiman 2001) ◎ Data Science = Aggr(`stats`,`advanced computing`,`hacking`,`business logic`,`math`,`domain knowledge`,`data analysis`) 8
  • 9. Demystifying data science ◎ DS Purpose - achieving ‘Data Driven Decision making’ (basing decisions on data with certain confidence) 9
  • 10. Buzzwords terminology 10 Data Science (DS) The science of recognizing and utilizing patterns in data in order to develop actionable insight and confidence for decisions. Artificial Intelligence (AI) Any technique which enables computers to mimic human behaviour Machine Learning (ML) Subset of AI techniques which use statistical methods to enable machine -tasks to improve with experience Deep learning (DL) Subset of ML which allows in certain conditions to model the data with less human intervention
  • 11. 2. How is it used in the industry? Typical use cases and market overview
  • 12. Why should I use DS in my business? ◎ Derive insights on business challenges ○ Sales ○ Pricing ○ Marketing ○ Churn ◎ Improve user experience ○ Faster ○ Personalized ○ Accurate ◎ Automate cumbersome routines involving human labor
  • 13. Data science business - where and how
  • 14. Marketing ◎ Advertisement targeting ◎ User profiling ◎ Targeting direct marketing ◎ Churn ◎ Causal modeling ◎ Optimized viral marketing
  • 15. Sales ◎ Discount offering ◎ Demand forecasting ◎ Dynamic pricing ◎ Product bundling ◎ Sales monitoring and investigation ◎ Leads discovery and prioritization ◎ Upselling
  • 16. Transportation ◎ Customer wait time estimation ◎ Recommending driver location via heatmap ◎ Surge pricing (Geosurge) ◎ Traffic and demand visualization ◎ Drive duration estimation
  • 20. 3. How should I use it? Methodology and DS lifecycle
  • 21. Methodology - preliminaries ◎ Digital service with growing user community
  • 22. Methodology - preliminaries ◎ Basic analytics (no need for AI/ML) ◎ Database instrumentation and data structuring ◎ If needed, create a rule-based system of expert-defined thresholds as the ‘AI’ backend and continue to gather data
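The rule-based ‘AI’ backend from the slide above could look roughly like the sketch below. It assumes a hypothetical churn-risk use case: the field names (`days_since_last_login`, `support_tickets`) and both thresholds are invented for illustration and would be set by a domain expert in practice. Every decision is logged, so data keeps accumulating for a later ML model.

```python
from dataclasses import dataclass

# Hypothetical expert-defined thresholds (illustrative values, not from the talk).
MAX_INACTIVE_DAYS = 30
MAX_OPEN_TICKETS = 3


@dataclass
class UserActivity:
    user_id: str
    days_since_last_login: int
    support_tickets: int


def churn_risk(user: UserActivity) -> str:
    """Return a coarse risk label from fixed rules instead of a trained model."""
    inactive = user.days_since_last_login > MAX_INACTIVE_DAYS
    unhappy = user.support_tickets > MAX_OPEN_TICKETS
    if inactive and unhappy:
        return "high"
    if inactive or unhappy:
        return "medium"
    return "low"


decision_log = []  # logged decisions can become training data later on


def score_and_log(user: UserActivity) -> str:
    """Score a user with the rules and keep the input and the decision."""
    label = churn_risk(user)
    decision_log.append((user, label))
    return label
```

Once enough logged decisions and real outcomes exist, the rules can be compared against, or replaced by, a learned model.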
  • 23. Methodology - CRISP-DM: Cross-Industry Standard Process for Data Mining
  • 24. Methodology - Business understanding (problem definition & business understanding) ◎ Define business targets and qualitative success metrics ◎ Assess risks, costs, benefits and data resources ◎ Project the problem onto data science subtasks and identify the class of each problem ◎ Plan the project - estimate requirements, timeline and budget
  • 25. Methodology - Data understanding ◎ Refine the initial data and enrich it if needed ◎ Match data to the business problem ◎ Describe and explore the data ○ Spot anomalies ○ Basic amounts and value types ◎ Verify data quality ○ Missing data ○ Collection errors/biases ○ Anomalies? Outliers?
  • 26. Methodology - Data Preparation ◎ Clean data ○ Correct errors ○ Fill missing data ◎ Select the right data ○ Representative ○ Data partitioning - train/test/hold-out ◎ Format data ◎ Beware of “leaks” (Source: KDNuggets Poll 2003)
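A minimal sketch of the train/test/hold-out partitioning mentioned above, with one common “leak” in mind: records from the same user ending up in more than one partition. The record shape (`user_id`, `event`) and the 70/20/10 proportions are assumptions made for the example. Hashing the user id gives a stable assignment that does not change when new data arrives.

```python
import hashlib
from collections import defaultdict


def assign_partition(user_id: str, test_pct: int = 20, holdout_pct: int = 10) -> str:
    """Deterministically map a user to train/test/hold-out by hashing the id."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in [0, 100)
    if bucket < holdout_pct:
        return "holdout"
    if bucket < holdout_pct + test_pct:
        return "test"
    return "train"


def partition(records):
    """Split records (dicts with a 'user_id' key) into the three partitions."""
    parts = defaultdict(list)
    for record in records:
        parts[assign_partition(record["user_id"])].append(record)
    return parts


# Toy data: 200 users with 3 events each (assumed shape, for illustration only).
events = [{"user_id": f"user-{i}", "event": j} for i in range(200) for j in range(3)]
parts = partition(events)

# All events of one user share a partition, so user sets never overlap.
users = {name: {r["user_id"] for r in rows} for name, rows in parts.items()}
```

Splitting by entity rather than by row is only one guard against leakage; features computed with knowledge of the target or of future data need separate checks.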
  • 27. Methodology - Modeling ◎ Build cost/risk target to optimize ◎ Understand the model’s assumptions and check data compatibility ◎ Build model and optimize parameters ◎ Generate test design ◎ Assess model on provided data
  • 28. Methodology - Evaluation ◎ Analyze model performance and summarize results ○ New insights ○ A/B testing ○ Validation cases ◎ Error analysis ◎ Prediction interpretability ◎ Robustness and maintainability of model ◎ Business related performance - cumulative response and lift curves
  • 29. Methodology - Deployment ◎ Integrate the prototype into the production system ◎ Implement software features inspired by the data-mining process ◎ Plan model maintenance and support
  • 32. Unicorn fairytale: Data science actually comprises multiple disciplines. Typically, a single creature cannot manage the engineering process, lead modeling efforts, coordinate the product roadmap, and articulate results to stakeholders.
  • 33. The magnificent data warriors - AI Analyst ★ Descriptive and conditional statistics ★ Error analysis ★ Finding sense in results and monitoring production model performance ★ Feature engineering and formalization of prior knowledge ★ Domain expertise ★ Validation ★ Excel, SQL, DB, R (scripting), statistics
  • 34. The magnificent data warriors - Data Scientist ★ Machine learning and statistical analysis ★ Experiment design and research ★ Familiar with Big Data technologies ★ Dev foundations - pipelines, testing, performance optimization ★ Storytelling and visualization ★ Feature engineering ★ Bias and leakage discovery ★ Generalization and overfitting ★ Python, R, Matlab, SQL, OOP, Spark, Pig, Hive
  • 35. The magnificent data warriors - Data Engineer ★ Data orchestration and system architecture ★ Scaling with Big Data technologies ★ Database maintenance and data storage ★ Production processes - code deployment, optimization and testing ★ OOP and functional programming, Python/Java/Scala/Ruby/Clojure, Spark, Hadoop, Pig, Hive, DB & SQL, Jenkins, Luigi/Airflow
  • 36. The magnificent data warriors - Manager ★ Setting goals ★ Tracking progress ★ Coordinating between team members ★ Strong understanding of data mining, evaluation metrics and statistics ★ Delivering results to stakeholders ★ Leader
  • 38. Other options ◎ Let your developers carefully integrate licensed data science APIs for predictive modeling into the product ◎ Outsource DS tasks to a consulting company
  • 40. Limitations ◎ No magic - when there is no predictive information in data ◎ No 100% ◎ No hidden golden feature ◎ For tomorrow, it is impossible ◎ Tasks with subjective nature are hard ◎ Outdated data and outdated models ◎ Train and test data discrepancies
  • 41. Thanks! Any questions? You can find me at: noambox@gmail.com
  • 42. References ◎ Foster Provost and Tom Fawcett. 2013. Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking (1st ed.). O'Reilly Media, Inc. ◎ https://www.salesforce.com/quotable/articles/why-AI-will-be-your-new-best-friend-in-sales/ ◎ https://hbr.org/2018/07/how-ai-is-changing-sales ◎ https://neilpatel.com/blog/how-uber-uses-data/ ◎ https://www.marketingaiinstitute.com/blog/7-top-marketing-and-sales-companies-using-artificial-intelligence-and-machine-learning ◎ https://www.vccafe.com/2017/09/11/israels-machine-intelligence-startup-landscape-2017/ ◎ https://www.slideshare.net/kuonen/a-statisticians-view-on-big-data-and-data-science ◎ https://www.datasciencecentral.com/profiles/blogs/10-most-popular-data-science-presentations-on-slideshare ◎ https://medium.com/high-alpha/how-to-build-a-great-data-science-team-d921fb41b5b1 ◎ https://towardsdatascience.com/what-is-the-most-effective-way-to-structure-a-data-science-team-498041b88dae ◎ https://towardsdatascience.com/the-limits-of-data-science-b4e5faad20f4

Editor's Notes

  1. In data modeling you assume that you know, in some sense, the form of the prediction function and the types of interaction between the predictor variables. You then only need to find the optimal parameter settings for the model to fit the data. In algorithmic modeling you treat the function as an unknown black box and let an algorithm and the data find the prediction function and the relevant variables.
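The contrast in the note above can be illustrated with a small sketch. Both techniques are stand-ins chosen for the example, not taken from the talk: a least-squares line plays the role of the data model (the form `y = a*x + b` is assumed up front), and a k-nearest-neighbours average plays the role of the algorithmic model (no form is assumed). The toy data is quadratic, so the assumed linear form is wrong on purpose.

```python
def fit_line(xs, ys):
    """Data modeling: assume y = a*x + b and estimate a, b by least squares."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def knn_predict(xs, ys, query, k=3):
    """Algorithmic modeling: no assumed form; average the k nearest examples."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - query))[:k]
    return sum(y for _, y in nearest) / k


xs = [0, 1, 2, 3, 4, 5, 6]
ys = [x * x for x in xs]  # the true relation is quadratic, not linear

slope, intercept = fit_line(xs, ys)
line_at_3 = slope * 3 + intercept   # prediction of the assumed linear form
knn_at_3 = knn_predict(xs, ys, 3)   # prediction driven only by nearby data
```

On this data the nearest-neighbour estimate at `x = 3` lands closer to the true value of 9 than the line does. The line would win if the relation really were linear and the data were scarce or noisy.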
  2. Optimove’s Customer Marketing Cloud automatically schedules, executes and evaluates highly individualized marketing campaigns. helps marketers retarget ads only to website visitors most likely to make a purchase on the site. Datorama: mapping new sources of marketing information to generate enhanced insights for decision-makers. Predictive advertisement targeting - What’s predicted: which female customers will have a baby in the coming months; which ad each customer is most likely to click. What’s done about it: suggest relevant offers to soon-to-be parents; display the best ad. Targeting direct marketing - What’s predicted: which customers will respond to marketing contact. What’s done about it: contact the customers more likely to respond. Churn - What’s predicted: which customers will leave. What’s done about it: retention efforts targeting at-risk customers. Causal modeling - Suppose predictive modeling is used to target advertisements to consumers, and the targeted consumers go on to purchase at a higher rate. Was this because the advertisements influenced the consumers to purchase? Or did the predictive models simply do a good job of identifying consumers who would have purchased anyway? Viral marketing - Recognize influencers and seed them with free products; they will increase the likelihood that the people they know will purchase the product.
  3. Gong: “shining the light on their sales conversations.” It automatically records, transcribes and analyzes all “sales calls, demos, and meetings so sales teams can scale the effectiveness of their sales conversations.” Conversica uses AI to automate “routine business conversations in a human way.” They sell an automated sales assistant that “engages, qualifies and follows-up with sales leads via human-like, two-way email conversations.” The idea is that salespeople can talk to the right people at the right time, while AI does the heavy lifting the rest of the time. Demand forecasting (strawberry Pop-Tarts and beer before a hurricane; New York Times, 2004) - What’s predicted: products to be consumed before an event (such as a hurricane). What’s done about it: pricing, supply. Upselling and cross-selling - What’s predicted: which of your existing clients are more likely to buy a better version of what they currently own (up-sell). The net effect is an increase in revenue and a drop in marketing costs. Leads - Predicting which leads are most likely to be converted into a deal, considering everything from geography, company size and job titles to engagement such as signing up for a trial or downloading a white paper.
  4. Then - Uber originally started as a black-car hailing service, UberCab, in San Francisco. Now - it closely monitors which features of the service are used most in order to analyze usage patterns, and predicts everything from the customer’s wait time to where drivers should place themselves (recommended via heatmap) in order to take advantage of the best fares and the most passengers. Dynamic pricing is similar to the pricing strategy used by hotels and airlines for their weekend or holiday fares and rates, except that Uber leverages predictive modeling in real time based on traffic patterns, supply and demand.
  5. AI - brain-inspired programming. ML - data-driven optimization.
  6. Simple business questions - user profile (age, gender, background etc.). Who pays more, and for which product?
  7. An iterative and very difficult step. Be able to tell what is unrealistic or ill-defined. If the data is good, be patient with vaguely defined problems.
  8. Do not economize on this phase. The earlier you discover issues with your data, the better (yes, your data will have issues!). Data understanding leads to domain understanding; it will pay off in the modelling phase. Do not trust data quality estimates provided by your customer. Verify, as far as you can, that your data is correct, complete, coherent, deduplicated, representative, independent, up-to-date and stationary. Investigate what sort of processing was applied to the raw data. Understand anomalies and outliers.
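A few of the checks listed above (completeness, deduplication, outliers) can be sketched with the standard library alone. The record layout (`id`, `amount`) and the choice of Tukey's 1.5 × IQR rule for outliers are assumptions made for the example; real data usually needs domain-specific checks on top.

```python
import statistics


def quality_report(rows, numeric_field, key_field):
    """Summarize missing values, duplicate keys and IQR outliers for one field."""
    values = [r.get(numeric_field) for r in rows]
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)

    keys = [r[key_field] for r in rows]
    duplicates = len(keys) - len(set(keys))

    # Tukey's rule: flag points more than 1.5 * IQR outside the quartiles.
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in present if v < low or v > high]

    return {"missing": missing, "duplicates": duplicates, "outliers": outliers}


# Toy records with one missing amount, one repeated id and one extreme value.
rows = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": 12.0},
    {"id": 3, "amount": 11.0},
    {"id": 4, "amount": None},
    {"id": 5, "amount": 13.0},
    {"id": 5, "amount": 9.0},
    {"id": 6, "amount": 500.0},
]
report = quality_report(rows, "amount", "id")
```

A flagged value is a prompt to investigate, not proof of an error: the extreme amount here could be a collection bug or a genuine large purchase.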
  9. Data understanding and preparation will usually consume half or more of your project time! Examples: converting data to tabular format; removing or inferring missing values; converting data to different types; scaling and normalizing. Some data mining techniques are designed for symbolic and categorical data, while others handle only numeric values.
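Three of the preparation steps named above can be sketched as small functions: inferring missing values (here with the median), scaling to a common range (min-max), and converting a categorical column to numeric indicator columns (one-hot). The column names and values are invented for the example, and these particular techniques are just common choices, not ones prescribed by the talk.

```python
import statistics


def impute_median(values):
    """Replace None with the median of the observed values."""
    median = statistics.median(v for v in values if v is not None)
    return [median if v is None else v for v in values]


def min_max_scale(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def one_hot(categories):
    """Turn a categorical column into 0/1 indicator columns, one per category."""
    levels = sorted(set(categories))
    return [[1 if c == level else 0 for level in levels] for c in categories]


ages = [20, None, 40, 30]                 # hypothetical numeric column
plans = ["free", "pro", "free", "team"]   # hypothetical categorical column

ages_scaled = min_max_scale(impute_median(ages))
plans_encoded = one_hot(plans)
```

In a real project the median and the min/max bounds should be computed on the training partition only and then reused on test data, otherwise the preparation step itself leaks information.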
  10. Whenever possible, peek inside your model and consult a domain expert about it. Assess feature importance. Run your model on simulated data.
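One model-agnostic way to assess feature importance is permutation importance: shuffle one feature column and measure how much accuracy drops. The sketch below assumes a model that is simply a callable on a row dict; `toy_model`, `visits` and `shoe_size` are hypothetical names, and the data is simulated, in the spirit of running the model on simulated data.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction equals the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, feature, seed=0):
    """Drop in accuracy after shuffling one feature column across rows."""
    rng = random.Random(seed)
    column = [r[feature] for r in rows]
    rng.shuffle(column)
    shuffled = [{**r, feature: v} for r, v in zip(rows, column)]
    return accuracy(model, rows, labels) - accuracy(model, shuffled, labels)


# Hypothetical stand-in for a trained model: it only looks at 'visits'.
def toy_model(row):
    return 1 if row["visits"] > 5 else 0


# Simulated data: 'shoe_size' is deliberately irrelevant to the label.
rows = [{"visits": v, "shoe_size": 36 + (v * 7) % 10} for v in range(12)]
labels = [toy_model(r) for r in rows]

visits_importance = permutation_importance(toy_model, rows, labels, "visits")
shoe_importance = permutation_importance(toy_model, rows, labels, "shoe_size")
```

A feature the model ignores gets an importance of exactly zero. A surprisingly high importance for a feature that a domain expert considers irrelevant is a hint to look for leakage.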
  11. Cumulative response curves plot the hit rate (true positive rate, i.e. the percentage of positives correctly classified; y axis) as a function of the percentage of the population that is targeted (x axis). You return a list ranked by your model and check how the hit rate changes as the size of the list grows. Conceptually, as we move down the list of instances ranked by the model, we target increasingly larger proportions of all the instances. Intuitively, the lift of a classifier represents the advantage it provides over random guessing: the degree to which it “pushes up” the positive instances in the list above the negative instances.
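The two curves described above can be computed in a few lines. This is a minimal sketch assuming binary labels (1 = positive) with at least one positive, and model scores where higher means more likely positive; the scores and labels are made up for the example.

```python
def cumulative_response(scores, labels):
    """Return (fraction_targeted, fraction_of_positives_captured) points."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(labels)
    points, captured = [], 0
    for rank, (_, label) in enumerate(ranked, start=1):
        captured += label
        points.append((rank / len(ranked), captured / total_positives))
    return points


def lift_curve(scores, labels):
    """Lift = share of positives captured divided by share targeted.

    Random targeting captures positives in proportion to the share targeted,
    so a lift of 1.0 means no advantage over random guessing.
    """
    return [(x, y / x) for x, y in cumulative_response(scores, labels)]


scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]

curve = cumulative_response(scores, labels)
lift = lift_curve(scores, labels)
```

In this toy example, targeting the top 25% of the ranked list captures 50% of the positives, a lift of 2.0; at 100% targeted the lift is 1.0 by construction.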
  12. \
  13. Analysts monitor processes, evaluate data quality, and monitor production model performance. These steps seem relatively routine, but once you realize that a model is never “complete” and will always require some oversight, appointing an analyst to manage the process makes sense. This allows your more senior assets to focus on innovation instead of maintenance.
  14. The Data Scientist then owns the modeling process. Generally, they take input parameters from product or other team leads in order to understand the model’s business objective. They then work to articulate requirements to the engineers and other stakeholders. Once these criteria have been defined, the process of building tests, models, and evaluating performance begins.
  15. Data Engineers are responsible for building and maintaining the technical infrastructure required in order to do modeling, predictions, and analysis. The engineers create and maintain databases, machine learning pipelines, and production processes. Without properly stored data, modeling processes, and the ability to serve predictions in production, a Data Scientist is essentially useless.
  16. As the data team and the number of models grow, the need for a Data Science Manager appears. This person coordinates the quants, devs, and analysts and manages external demands on the data science team. The Data Science Manager essentially guides the process, allocates resources, and occasionally shields the team from ad hoc requests so they are able to achieve their primary objectives.
  17. Ignoring methodology and overlooking phases lead to fragile insights and unreliable products