Data Science
applications in business
Case Study
December 7, RaTSiF-2018
Agenda for today
Part 1:
- What is Data Science?
- Data Scientist’s work
- Project approach
Part 2:
- Case study # 1
- Case study # 2
- Q/A
PART 1
Data Science as it is
What is Data Science?
Data science is an interdisciplinary field that uses scientific methods,
processes, algorithms and systems to extract knowledge and
insights from data in various forms, both structured and
unstructured
Wikipedia
Data Science is:
- About understanding what data says
- Analysis and Storytelling
- Based on math and statistics
Data Science is NOT:
- Big Data
- Robots
- Magic
Data Science in numbers
• The Data Science platform market is growing steadily and is projected to reach
$195.7 billion by 2023
• 61% of businesses said they implemented AI in 2017, up from just
38% in 2016
• 20% of all 4-year institutions in the U.S. have at least one analytics
program
• 2.35 million Data Science and Analytics job listings in 2015,
projected to grow to ~2.72 million by 2020
• Data scientist job openings on Glassdoor in 2016 had an average
salary of $116k.
Sources: Prescient & Strategic Intelligence, Narrative Science, Tableau, IBM, Glassdoor
What we do
…not really
Instead
• Listen carefully to our clients / stakeholders
• Explain where data science can and cannot
be applied
• Define business metrics to influence
• Propose type of solution
Then
• Understand data and processes
• Translate the business problem into a data science
task
• Develop solution code (googling and
stackoverflowing mostly)
• Productize the solution (god bless data
engineers)
Expectations
• Present the solution to client / stakeholder
• Deploy it
• Measure results
• Complete project
Reality: Iterate!
How do data scientists spend
their work time?
Data Preparation: 80%
Data Analysis: 20%
…not exactly… but:
Adapting the business problem (communication, research,
brainstorming): 80%
Data Analysis: 20%
Project Lifecycle
CRISP-DM (Cross-industry standard process for data mining)
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Data Modeling
5. Model Evaluation
6. Deployment
[Diagram: a cycle of these phases around the Data]
PART 2
Data Science case study
Case study #1
Problem: A large sports retail company lacks proper
stock planning.
[Figure: DEMAND; PRODUCT; STOCK]
Proactive inventory management
Use case: The company launches a brand-new model of sports
shoes. The supply team wants to know the precise distribution of
shoe sizes that needs to be delivered from the factory to
stock.
[Figure: FACTORY, STOCK, SHOP, with example quantities 100 x, 150 x, 160 x, 140 x, 80 x]
Data Science Solution: A prediction model that recommends
the number of items needed to meet
customer demand.
[Figure: PREDICTION MODEL with example outputs 98 x, 154 x, 81 x, …]
HOW?
1. Collect historical sales data
2. Identify key descriptors / features of the product
3. Train a clustering algorithm that separates all products into
sub-categories
4. Apply the algorithm to a new product to determine which cluster it belongs to
5. Make a prediction based on the internal characteristics of that cluster
Hierarchical clustering
Type   Material  Season  Min Size  Max Size  …
Shoes  Leather   Demi    32        46        …
Find cluster: A, B, C, D
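The clustering and "find cluster" steps could be sketched roughly as follows. This is a minimal illustration, not the model from the case study: it assumes the product descriptors have already been encoded as numeric vectors, uses average linkage with Euclidean distance, and assigns a new product to the nearest cluster centroid. The function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import dist


def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise Euclidean distance between two clusters.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find and merge the two closest clusters (a < b, so deleting b is safe).
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def centroid(points, members):
    """Coordinate-wise mean of the member points."""
    dims = len(points[0])
    return [sum(points[i][d] for i in members) / len(members) for d in range(dims)]


def assign(points, clusters, new_point):
    """Index of the cluster whose centroid is closest to new_point."""
    return min(
        range(len(clusters)),
        key=lambda c: dist(centroid(points, clusters[c]), new_point),
    )


# Hypothetical, already-encoded product descriptors (two numeric features each).
points = [(1.0, 0.0), (1.1, 0.1), (0.9, 0.2), (5.0, 5.0), (5.2, 4.9), (9.0, 0.5)]
clusters = agglomerative(points, n_clusters=3)
new_product_cluster = clusters[assign(points, clusters, (5.1, 5.1))]
```

In practice the categorical descriptors (type, material, season) would need an encoding and the numeric ones a common scale before distances are meaningful; both choices are left open here.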
Once the cluster is detected,
make a prediction based on the
weighted average of
existing cluster members
[Chart: Shoes quantity distribution per size; x-axis: sizes 36 to 46; y-axis: quantity, 0 to 5000; series: Member 1, Member 2, Member 3, New member (prediction); cluster A]
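A weighted-average prediction of this kind might look like the sketch below. The slides do not say how the weights are chosen or whether absolute quantities or shares are averaged, so both are assumptions here: each member contributes its share of sales per size, the weights are supplied by the caller, and the result is scaled to a planned total. All names and numbers are hypothetical.

```python
def predict_size_distribution(member_sales, weights, total_units):
    """Weighted average of per-size sales shares, scaled to total_units.

    member_sales: list of dicts mapping size -> units sold, one per cluster member.
    weights: one non-negative weight per member (how they are chosen is an assumption).
    """
    sizes = sorted(set().union(*member_sales))
    weight_sum = sum(weights)
    prediction = {}
    for size in sizes:
        # Each member's share of its own total sales for this size.
        share = sum(
            w * m.get(size, 0) / sum(m.values())
            for m, w in zip(member_sales, weights)
        ) / weight_sum
        prediction[size] = round(total_units * share)
    return prediction


# Two hypothetical cluster members with equal weights.
members = [{40: 50, 41: 50}, {40: 100, 41: 300}]
predicted = predict_size_distribution(members, weights=[1, 1], total_units=80)
```

Averaging shares rather than raw quantities keeps a best-selling member from dominating the size profile; whether that matches the original model is not stated in the slides.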
Results
1. Model integrated into production
2. Improved stock planning (must be measured!)
3. Average prediction error is 2% (percentage difference
between predicted values and real ones for all sizes)
4. Clustering algorithm re-trained every month and tuned based
on communication with business team
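One way to read the "average prediction error" above is as a mean absolute percentage error over sizes. The sketch below follows that reading, which is an assumption; the slides do not give the exact formula, and the numbers are made up for illustration.

```python
def mean_abs_pct_error(predicted, actual):
    """Mean absolute percentage error over all sizes with non-zero actual sales."""
    errors = [
        abs(predicted[size] - actual[size]) / actual[size]
        for size in actual
        if actual[size]
    ]
    return 100 * sum(errors) / len(errors)


# Hypothetical predicted and actual quantities per shoe size.
error_pct = mean_abs_pct_error({40: 98, 41: 153}, {40: 100, 41: 150})
```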
Case study # 2
Flight ticket price predictor
Problem: During project engagements, consultants frequently
travel to client sites to deliver the best quality of service. The
most convenient transport is by plane, but flight ticket
costs consume a significant part of the project budget.
Aim: Reduce spending on flight tickets by choosing the optimal
offer
Simple Solution
Can we?
What if
• We could accurately forecast ticket price fluctuations up to the departure
date
• Choose the date with the lowest price and buy the ticket then
[Chart: Riga - Paris ticket price by date, 12-Jan to 24-Jan; y-axis 0 to 200; values: 80, 81, 81, 72, 81, 81, 85, 85, 90, 90, 120, 120, 180]
Why might it work?
[Chart: Riga - Paris ticket price by date, 12-Jan to 24-Jan]
• Tickets tend to cost more as you get closer to the departure date, BUT!
• Prices still fluctuate
• Proven insights like “book seven weeks in advance”, “the cheapest
day to buy a flight is Tuesday”
• Low demand
• Promotions
How?
1. Collect (web-scrape) historical data about price fluctuations on the
usual flight routes of our consultants:
Date of collection  From  To         Departure Date  Carrier     Price
Dec 1               Riga  Paris      Jan 25          Air Baltic  80
Dec 1               Riga  Frankfurt  Jan 25          Lufthansa   92
Dec 2               Riga  Paris      Jan 24          Air Baltic  80
Dec 2               Riga  Frankfurt  Jan 24          Lufthansa   91
…                   …     …          …               …           …
How?
2. Generate additional features that may be good predictors:
• Days till flight
• Flight distance
• Weekday/Month of departure
• Departure/Arrival airport
• Segment of airline (low cost, premium)
• Number of days till public holiday from departure day
Date of collection  Days till flight  From  Dep Airport  To         Arr Airport  Flight dist  Dep Date  Dep weekday  Till public holiday  Carrier     Price
Dec 1               55                Riga  RIX          Paris      CDG          2241         Jan 25    Fri          84                   Air Baltic  80
Dec 1               55                Riga  RIX          Frankfurt  FRA          1205         Jan 25    Fri          84                   Lufthansa   92
Dec 2               54                Riga  RIX          Paris      CDG          2241         Jan 24    Thu          85                   Air Baltic  80
Dec 2               54                Riga  RIX          Frankfurt  FRA          1205         Jan 24    Thu          85                   Lufthansa   91
…                   …                 …     …            …          …            …            …         …            …                    …           …
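The date-derived features in this table could be generated along these lines. This is only a sketch: the function name, the dictionary keys and the holiday list are hypothetical. The single holiday used, Good Friday on April 19, 2019, is an assumption chosen because it reproduces the "84" and "85" in the table; the slides do not name the holiday calendar.

```python
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Hypothetical holiday calendar (assumption: Good Friday 2019).
PUBLIC_HOLIDAYS = [date(2019, 4, 19)]


def make_features(collected, departure, holidays=PUBLIC_HOLIDAYS):
    """Derive date-based predictors from the collection and departure dates."""
    upcoming = [h for h in holidays if h >= departure]
    return {
        "days_till_flight": (departure - collected).days,
        "dep_weekday": WEEKDAYS[departure.weekday()],
        "dep_month": departure.month,
        # Days from the departure day to the next public holiday, if any is known.
        "days_till_public_holiday": (min(upcoming) - departure).days if upcoming else None,
    }


# First table row: collected on Dec 1, 2018 for a Jan 25, 2019 departure.
features = make_features(date(2018, 12, 1), date(2019, 1, 25))
```

Flight distance, airports and airline segment would come from lookup tables keyed by route and carrier, which are not shown here.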
How?
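3. Build regression models that predict the ticket price, one per collection day, starting from the
day data collection began:
Oct 1, 2017: $\beta_1 \cdot \text{Days\_till\_flight} + \beta_2 \cdot \text{From} + \dots + \beta_n \cdot \text{Carrier} = \text{price}$
Oct 2, 2017: $\beta_1 \cdot \text{Days\_till\_flight} + \beta_2 \cdot \text{From} + \dots + \beta_n \cdot \text{Carrier} = \text{price}$
…
Today, Dec 7, 2018: $\beta_1 \cdot \text{Days\_till\_flight} + \beta_2 \cdot \text{From} + \dots + \beta_n \cdot \text{Carrier} = \text{price}$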
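Fitting one such regression per collection day could be sketched as below, using ordinary least squares solved through the normal equations. Several things here are assumptions rather than the author's method: the intercept term (the slide's formula shows none), the purely numeric features (categorical ones such as carrier would need encoding first), and all names and data.

```python
def fit_ols(rows, prices):
    """Least-squares coefficients [b0, b1, ...] for price = b0 + b1*x1 + ..."""
    X = [[1.0, *row] for row in rows]  # prepend the intercept column (assumption)
    n = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, prices))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


# Hypothetical observations per collection day: (days_till_flight, is_premium) -> price.
data_by_day = {
    "Dec 1": ([(55, 0), (54, 0), (55, 1), (40, 1)], [65.0, 64.0, 77.0, 62.0]),
}
# One coefficient vector per collection day.
coefficients_by_day = {day: fit_ols(rows, prices) for day, (rows, prices) in data_by_day.items()}
```

Collecting the coefficient vectors by day yields one time series per coefficient, which is what a later forecasting step would operate on.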
How?
4. Forecast the regression coefficients (using ARIMA, for example) up to the
departure date of the desired tickets
[Figure: timeline 1-Oct, 2-Oct, 3-Oct, 4-Oct, …, 7-Dec, 8-Dec, …, 24-Jan, 25-Jan with coefficient sets Beta1, Beta2, Beta3, BetaK, BetaN. Caption: coefficients that were calculated based on data collected today; the remainder is marked FORECAST]
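The slides mention ARIMA only as an example and do not give the model order. As a stand-in, the sketch below fits an AR(1) model to the first differences of one coefficient series by least squares, which corresponds to an ARIMA(1,1,0) without a constant term. It is a hand-rolled illustration with hypothetical names; a real project would more likely rely on a statistics library.

```python
def forecast_ar1_on_differences(series, steps):
    """Forecast `steps` future values; requires at least three observations."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    # Least-squares AR(1) coefficient on the differenced series (no constant).
    numerator = sum(x * y for x, y in zip(diffs, diffs[1:]))
    denominator = sum(x * x for x in diffs[:-1])
    phi = numerator / denominator if denominator else 0.0

    last_value, last_diff = series[-1], diffs[-1]
    forecast = []
    for _ in range(steps):
        last_diff = phi * last_diff   # predicted next difference
        last_value += last_diff       # integrate back to the original scale
        forecast.append(last_value)
    return forecast


# Hypothetical daily values of one regression coefficient.
beta_forecast = forecast_ar1_on_differences([1, 2, 3, 4, 5], steps=3)
```

The same function would be applied to each coefficient's series separately, up to the departure date.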
How?
5. Forecast prices based on the forecasted coefficients and choose the day
with the lowest price
[Chart: ticket price by date, 12-Jan to 24-Jan, marked FORECAST; values: 80, 81, 81, 72, 81, 81, 85, 85, 90, 90, 120, 120, 180]
Jan 15, 2019 (forecast): $\beta_1 \cdot \text{Days\_till\_flight} + \beta_2 \cdot \text{From} + \dots + \beta_n \cdot \text{Carrier} = \text{price}$
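The final step can be sketched as a dot product per day followed by a minimum. The helper names are hypothetical, and pairing the chart's data labels with the dates from left to right is an assumption about how the chart reads.

```python
def predict_price(coefficients, features):
    """Price implied by one day's (forecasted) coefficients and feature values."""
    return sum(b * x for b, x in zip(coefficients, features))


def cheapest_day(forecast):
    """Return (day, price) with the lowest forecasted price."""
    return min(forecast.items(), key=lambda item: item[1])


# Chart values from the slide, paired with dates left to right (assumption).
days = [f"{d}-Jan" for d in range(12, 25)]
prices = [80, 81, 81, 72, 81, 81, 85, 85, 90, 90, 120, 120, 180]
best_day, best_price = cheapest_day(dict(zip(days, prices)))
```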
Results
1. Built a PoC in R Shiny based on 6 months of historical data
2. Forecast period: 3 weeks
3. Expected savings: 13% (on test set of ticket orders)
EXAMPLE:
[Chart: ticket price by date, 12-Jan to 24-Jan, marked FORECAST; values: 80, 81, 81, 72, 81, 81, 85, 85, 90, 90, 120, 120, 180]
Saving 10.5%
Questions?
About me
• Vladyslav Yakovenko
• Senior Data Science consultant
at Deloitte
• Former Data Scientist at
Accenture
• Graduated with Master’s
degree in Statistics from Taras
Shevchenko National University
of Kyiv, Ukraine

Data Science applications in business

  • 1. Data Science applications in business Case Study December 7, RaTSiF-2018
  • 2. Agenda for today Part 1: - What is Data Science? - Data Scientist’s work - Project approach Part 2: - Case study # 1 - Case study # 2 - Q/A
  • 4. What is Data Science? Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured Wikipedia Data Science is: - About understanding what data says - Analysis and Storytelling - Based on math and statistics Data Science is NOT: - Big Data - Robots - Magic
  • 5. Data Science in numbers • Data Science platform market is constantly growing and will reach $195.7 billion by 2023 • 61% of businesses said they implemented AI in 2017, up from just 38% in 2016 • 20% of all 4-year institutions in the U.S. have at least one analytics program • 2.35 million of Data Science and Analytics job listings in 2015, projected to grow to ~ 2.720 by 2020 • Data scientist job openings on Glassdoor in 2016 had an average salary of $116k. Sources: Prescient & Strategic Intelligence, Narrative Science, Tableau, IBM, Glassdor
  • 7. Instead • Listen carefully our clients / stakeholders • Explain where data science can and can not be applied • Define business metrics to influence • Propose type of solution
  • 8. Then • Understand data and processes • Transform business problem to data science task • Develop solution code (googling and stackoveflowing mostly) • Productize the solution (god bless data engineers)
  • 9. Expectations • Present the solution to client / stakeholder • Deploy it • Measure results • Complete project
  • 11. How data scientists spend work time? Data Preparation Data Analysis 80 % 20% …not exactly… Adopting business problem: communication, research, brainstorm Data Analysis 80 % 20% but:
  • 13. PART 2 Data Science case study
  • 14. Case study #1 Problem: Large sports retail company experiences lack of proper stock planning. DEMAND DEMAND DEMAND PRODUCT: STOCK: Proactive inventory management
  • 15. Proactive inventory management Use case: Company launches brand new model of sport shoes. Supply team wants to know precise distribution of shoes sizes that need to be delivered from factory to the stock. FACTORY STOCK SHOP 100 x 150 x 160 x 140 x 80 x
  • 16. Proactive inventory management Data Science Solution: Prediction model that provides recommendations in regards to number of items to meet customers demand. PREDICTION MODEL 98 x 154 x 81 x …
  • 17. HOW? 1. Collect historical sales data 2. Identify key descriptors / features of the product 3. Train clustering algorithm that separates all products into sub-categories 4. Apply algorithm to new product to define cluster it belongs to 5. Make a prediction based on internal cluster characteristics
  • 18. Hierarchical clustering Type Material Season Min Size Max Size … Shoes Leather Demi 32 46 … Find cluster A B C D
  • 19. Once cluster is detected, make prediction based on weighted average of existing cluster members 0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000 36 37 38 39 40 41 42 43 44 45 46 Shoes quantity distribution per size Member 1 Member 2 Member 3 New member (prediction) A Prediction
  • 20. Results 1. Model integrated into production 2. Improved stock planning (must be measured!) 3. Average prediction error is 2% (percentage difference between predicted and actual values across all sizes) 4. Clustering algorithm is re-trained every month and tuned based on communication with the business team
  • 21. Case study #2: Flight ticket price predictor. Problem: During project engagements, consultants frequently travel to client sites to deliver the best quality of service. Flying is the most convenient way to travel, but ticket costs consume a significant part of the project budget. Aim: Reduce spending on flight tickets by choosing the optimal offer.
  • 24. What if • We could accurately forecast ticket price fluctuations up to the departure date • Choose the date with the lowest price and buy the ticket then
      (Chart: Riga - Paris ticket price by date, 12-Jan to 24-Jan: 80, 81, 81, 72, 81, 81, 85, 85, 90, 90, 120, 120, 180; y-axis: 0 to 200.)
  • 25. Why might it work? • Tickets tend to cost more as you get closer to the departure date, BUT! • Prices still fluctuate • Proven insights like “book seven weeks in advance”, “the cheapest day to buy a flight is Tuesday” • Low demand • Promotions
      (Chart: Riga - Paris ticket price by date, 12-Jan to 24-Jan.)
  • 26. How? 1. Collect (web-scrape) historical data about price fluctuations on the usual flight routes of our consultants:
      Date of collection | From | To        | Departure Date | Carrier    | Price
      Dec 1              | Riga | Paris     | Jan 25         | Air Baltic | 80
      Dec 1              | Riga | Frankfurt | Jan 25         | Lufthansa  | 92
      Dec 2              | Riga | Paris     | Jan 24         | Air Baltic | 80
      Dec 2              | Riga | Frankfurt | Jan 24         | Lufthansa  | 91
      …                  | …    | …         | …              | …          | …
  • 27. How? 2. Generate additional features that may be good predictors: • Days till flight • Flight distance • Weekday / month of departure • Departure / arrival airport • Airline segment (low cost, premium) • Number of days from the departure day to the next public holiday
      Date of collection | Days till flight | From | Dep Airport | To        | Arr Airport | Flight dist | Dep Date | Dep weekday | Till public holiday | Carrier    | Price
      Dec 1              | 55               | Riga | RIX         | Paris     | CDG         | 2241        | Jan 25   | Fri         | 84                  | Air Baltic | 80
      Dec 1              | 55               | Riga | RIX         | Frankfurt | FRA         | 1205        | Jan 25   | Fri         | 84                  | Lufthansa  | 92
      Dec 2              | 53               | Riga | RIX         | Paris     | CDG         | 2241        | Jan 24   | Thu         | 85                  | Air Baltic | 80
      Dec 2              | 53               | Riga | RIX         | Frankfurt | FRA         | 1205        | Jan 24   | Thu         | 85                  | Lufthansa  | 91
      …                  | …                | …    | …           | …         | …           | …           | …        | …           | …                   | …          | …
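  • 28. How? 3. Build regression models that predict ticket prices, one model for each day since data collection started:
      Oct 1, 2017: $\beta_1 \cdot \mathit{Days\_till\_flight} + \beta_2 \cdot \mathit{From} + \dots + \beta_n \cdot \mathit{Carrier} = \mathit{price}$
      Oct 2, 2017: $\beta_1 \cdot \mathit{Days\_till\_flight} + \beta_2 \cdot \mathit{From} + \dots + \beta_n \cdot \mathit{Carrier} = \mathit{price}$
      …
      Today, Dec 7, 2018: $\beta_1 \cdot \mathit{Days\_till\_flight} + \beta_2 \cdot \mathit{From} + \dots + \beta_n \cdot \mathit{Carrier} = \mathit{price}$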
  • 29. How? 4. Forecast the regression coefficients (using ARIMA, for example) up to the departure date of the desired tickets. (Diagram: coefficients Beta1, Beta2, Beta3, …, BetaK, …, BetaN in rows, one column per day from 1-Oct to 25-Jan; the 7-Dec column holds the coefficients calculated from data collected today, and the columns from 8-Dec to 25-Jan are the forecast.)
  • 30. How? 5. Forecast prices from the forecasted coefficients and choose the day with the lowest price
      Jan 15, 2019 (forecast): $\beta_1 \cdot \mathit{Days\_till\_flight} + \beta_2 \cdot \mathit{From} + \dots + \beta_n \cdot \mathit{Carrier} = \mathit{price}$
      (Chart: forecast ticket prices by date, 12-Jan to 24-Jan; the lowest price, 72, falls on 15-Jan.)
  • 31. Results 1. Built a PoC in R Shiny based on 6 months of historical data 2. Forecast period: 3 weeks 3. Expected savings: 13% (on a test set of ticket orders)
      (Chart: ticket prices by date, 12-Jan to 24-Jan, with the forecast period marked. Example: saving 10.5%.)
  • 33. About me • Vladyslav Yakovenko • Senior Data Science consultant at Deloitte • Former Data Scientist at Accenture • Graduated with Master’s degree in Statistics from Taras Shevchenko National University of Kyiv, Ukraine
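The five steps on slide 17 and the weighted average on slide 19 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the model from the case study: the catalogue, the sales figures, the distance metric, the average-linkage rule and the `1 / (1 + distance)` weights are all hypothetical choices made for the example. Only the new product's feature row comes from slide 18.

```python
from itertools import combinations

# Hypothetical catalogue: (type, material, season, min size, max size).
catalogue = {
    "leather_a": ("Shoes", "Leather", "Demi", 36, 46),
    "leather_b": ("Shoes", "Leather", "Demi", 36, 45),
    "textile_a": ("Shoes", "Textile", "Summer", 36, 46),
    "textile_b": ("Shoes", "Textile", "Summer", 35, 46),
}
# Hypothetical historical sales per shoe size for each product.
sales = {
    "leather_a": {40: 100, 41: 200, 42: 100},
    "leather_b": {40: 150, 41: 150, 42: 200},
    "textile_a": {40: 300, 41: 100, 42: 50},
    "textile_b": {40: 280, 41: 120, 42: 60},
}

def distance(a, b):
    """Assumed metric: 1 per categorical mismatch, size gaps divided by 10."""
    return sum((x != y) if isinstance(x, str) else abs(x - y) / 10
               for x, y in zip(a, b))

def linkage(c1, c2):
    """Average-linkage distance between two clusters of product names."""
    pairs = [(p, q) for p in c1 for q in c2]
    return sum(distance(catalogue[p], catalogue[q]) for p, q in pairs) / len(pairs)

def cluster(names, n_clusters):
    """Step 3: agglomerative (bottom-up hierarchical) clustering."""
    clusters = [[n] for n in names]
    while len(clusters) > n_clusters:
        # Merge the two closest clusters; i < j, so popping j keeps i valid.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters.pop(j)
        clusters[i].extend(merged)
    return clusters

def predict(new_features, n_clusters=2):
    clusters = cluster(list(catalogue), n_clusters)
    # Step 4: the new product joins the cluster with the smallest mean distance.
    home = min(clusters, key=lambda c: sum(
        distance(new_features, catalogue[n]) for n in c) / len(c))
    # Step 5: weighted average of member sales; closer members weigh more.
    weights = {n: 1 / (1 + distance(new_features, catalogue[n])) for n in home}
    total = sum(weights.values())
    sizes = sales[home[0]].keys()
    forecast = {s: sum(weights[n] * sales[n][s] for n in home) / total
                for s in sizes}
    return home, forecast

home, forecast = predict(("Shoes", "Leather", "Demi", 32, 46))
print(home, {size: round(qty) for size, qty in forecast.items()})
```

Because the prediction is a weighted average, each forecast quantity always lies between the smallest and largest quantity sold by the cluster members for that size.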
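The feature generation on slide 27 could look roughly like the sketch below. It assumes the collection dates fall in December 2018 and the departures in January 2019 (the slides give no year), and the lookup tables are illustrative: the holiday date is picked only so that the result reproduces the "84" shown on the slide, the airline segments are hypothetical labels, and the distances are copied from the slide's table. The function name `build_features` is also an assumption.

```python
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Assumed lookup tables, for illustration only.
PUBLIC_HOLIDAYS = [date(2019, 4, 19)]  # chosen to reproduce the slide's "84"
AIRLINE_SEGMENT = {"Air Baltic": "low cost", "Lufthansa": "premium"}  # hypothetical
FLIGHT_DISTANCE = {("RIX", "CDG"): 2241, ("RIX", "FRA"): 1205}  # from the slide

def build_features(collected, departure, dep_airport, arr_airport, carrier, price):
    """Turn one scraped price record into a feature row for the regression."""
    # Raises ValueError if no holiday on or after the departure date is listed.
    next_holiday = min(h for h in PUBLIC_HOLIDAYS if h >= departure)
    return {
        "days_till_flight": (departure - collected).days,
        "flight_distance": FLIGHT_DISTANCE[(dep_airport, arr_airport)],
        "dep_weekday": WEEKDAYS[departure.weekday()],
        "dep_month": departure.month,
        "dep_airport": dep_airport,
        "arr_airport": arr_airport,
        "segment": AIRLINE_SEGMENT[carrier],
        "till_public_holiday": (next_holiday - departure).days,
        "price": price,
    }

row = build_features(date(2018, 12, 1), date(2019, 1, 25),
                     "RIX", "CDG", "Air Baltic", 80)
print(row)
```

Categorical fields such as the airports, the weekday and the segment would still need to be encoded (for example as dummy variables) before they can enter a linear regression.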
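Steps 3 to 5 of case study #2 (slides 28 to 30) can be illustrated with a deliberately simplified sketch. It keeps a single feature, days till flight, instead of the full feature set, uses synthetic noise-free prices, and replaces ARIMA with a random walk with drift, which is the simplest member of that family (ARIMA(0,1,0) with a constant). All names and numbers are assumptions made for the example, not the PoC's implementation.

```python
# Synthetic, noise-free history (an assumption for illustration): on collection
# day t the price of a flight x days away is intercept_t + slope_t * x.
DAYS_AHEAD = [10, 20, 30, 40, 50]
history = {t: [(x, (168 - 2 * t) + (-0.1 - 0.1 * t) * x) for x in DAYS_AHEAD]
           for t in range(10)}  # t = 9 plays the role of "today"

def fit_line(points):
    """Ordinary least squares for price = intercept + slope * days_till_flight."""
    xs, ys = zip(*points)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in points)
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

def drift_forecast(series, steps):
    """Random walk with drift: last value plus the average historical step."""
    drift = (series[-1] - series[0]) / (len(series) - 1)
    return [series[-1] + drift * h for h in range(1, steps + 1)]

# Step 3: one regression per collection day gives a time series per coefficient.
intercepts, slopes = zip(*(fit_line(history[t]) for t in sorted(history)))

# Step 4: forecast each coefficient up to the departure date.
DAYS_TO_DEPARTURE = 20  # hypothetical flight leaving 20 days after "today"
future_intercepts = [intercepts[-1]] + drift_forecast(intercepts, DAYS_TO_DEPARTURE)
future_slopes = [slopes[-1]] + drift_forecast(slopes, DAYS_TO_DEPARTURE)

# Step 5: price if the ticket is bought h days from now, then pick the cheapest day.
prices = [a + b * (DAYS_TO_DEPARTURE - h)
          for h, (a, b) in enumerate(zip(future_intercepts, future_slopes))]
best_h = min(range(len(prices)), key=prices.__getitem__)
print(f"Buy in {best_h} days at a forecast price of {prices[best_h]:.1f}")
```

In this toy setup the overall price level falls over time while the last-minute premium grows, so the cheapest purchase day lands somewhere between today and the departure date. With real data the coefficient series are noisy, which is where a proper time-series model such as ARIMA would replace `drift_forecast`.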

Editor's Notes

  1. https://www.pexels.com/search/meeting/