Predictive Vehicle Inspection
Matous Havlena
matous@havlena.net
Tim Ojo
timmyojo@gmail.com
Akin Alao
alaoraufu@yahoo.co.uk
Project Charter
Evaluate the feasibility of using Big Data analytics solutions for
Manufacturing to solve the problem of Predictive Vehicle
Inspection:
● Analyzing vehicle production history to predict car inspection
failures from the production line.
● Factors considered: production shifts, specific employees, and others
The two Big Data Analytics solutions to be evaluated:
● IBM BigInsights
● Datameer 2.1
Approach & Proposed Solution
● Recognized the problem as a classification problem
similar to credit scoring or fraud detection.
● Classification is the problem of identifying to which of a
set of categories a new observation belongs, on the basis
of a training set of data containing observations whose
category membership is known.
● Build a predictive model based on machine learning
classification (supervised learning) to identify whether a
vehicle can be classified as good (passes quality check
on 1st try) or bad (fails quality check on 1st try)
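The supervised-learning framing above can be sketched in a few lines. This is a minimal illustration, not the project's model: the field names (`shift`, `line`), the records, and the one-rule ("1R") learner are all assumptions chosen to keep the example small.

```python
from collections import Counter, defaultdict

# Hypothetical training set: (features, known label). Field names and values
# are illustrative assumptions, not taken from the project's data.
TRAINING = [
    ({"shift": "night", "line": "A"}, "bad"),
    ({"shift": "night", "line": "B"}, "bad"),
    ({"shift": "day", "line": "A"}, "good"),
    ({"shift": "day", "line": "B"}, "good"),
    ({"shift": "day", "line": "A"}, "good"),
    ({"shift": "night", "line": "A"}, "good"),
]


def train_one_rule(records, field):
    """Learn the majority label for each observed value of a single field."""
    counts = defaultdict(Counter)
    for features, label in records:
        counts[features[field]][label] += 1
    rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    # Fall back to the overall majority label for values never seen in training.
    default = Counter(label for _, label in records).most_common(1)[0][0]
    return rule, default


def predict(model, features, field):
    """Classify a new observation using the learned rule."""
    rule, default = model
    return rule.get(features[field], default)


model = train_one_rule(TRAINING, "shift")
new_vehicle = {"shift": "night", "line": "B"}
print(predict(model, new_vehicle, "shift"))
```

The training records carry known labels (good or bad on the first quality check); the learned rule is then applied to a vehicle whose label is not yet known.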
Proposed Solutions - Tools
● BigInsights + SPSS Modeler
○ Hadoop is used to store big data and execute data
processing jobs in an efficient and distributed
fashion. IBM provides BigInsights as a management
and operational interface to simplify working with
Hadoop without doing much coding.
○ SPSS Modeler is a data analytics workbench that
allows the user to build predictive models by
leveraging built-in algorithms and functions without
the need for programming.
Proposed Solutions - Tools
● Datameer
○ Like BigInsights, Datameer Analytics Solution (DAS) runs on
top of a Hadoop cluster. It presents a web-based spreadsheet
interface and provides analytics functions and
visualizations out of the box without the need for writing
code.
○ DAS also has a Smart Analytics suite. One of the tools
available in that suite is a decision tree model which is a
descriptive model that can identify important factors that
affect quality.
○ Datameer can also be extended to run predictive models
created in R, SAS, SPSS, etc.
IBM Solution Architecture
Components (from the architecture diagram):
● SPSS Modeler Client (Windows only)
● SPSS Modeler Server (multiplatform)
● SPSS Analytic Server (multiplatform)
● SPSS Analytic Catalyst
● Hadoop (BigInsights)
SPSS Analytic Server
● allows analysts to do predictive analytics over big
data
● data-centric architecture ensures scalability and
performance
SPSS Analytic Catalyst
● automatically discovers statistically interesting
relationships in data
● closes the analytic specialist gap
● useful in the early dataset discovery stage (helps to
focus on important parts)
● automates some parts of CRISP-DM
Prediction in SPSS Modeler
● 425 predictors
● 85.4% accuracy (on the training dataset)
Model Outcome (columns: Original value | Predicted value | Confidence)
Predictor Importance
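Accuracy measured on the training dataset can overstate how well a model generalizes, so a held-out check is a common complement. The sketch below shows how accuracy is computed and how a simple hold-out split could be made; the function names, the split fraction, and the sample labels are assumptions for illustration and are not part of the SPSS Modeler workflow.

```python
import random


def accuracy(actual, predicted):
    """Fraction of records whose predicted label matches the actual label."""
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)


def holdout_split(records, test_fraction, seed=0):
    """Shuffle a copy of the records and split it into (train, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical labels: two of the four predictions match the actual outcome.
actual = ["good", "bad", "good", "good"]
predicted = ["good", "good", "good", "bad"]
print(accuracy(actual, predicted))

# Train on one part, then report accuracy on the part the model never saw.
train, test = holdout_split(range(10), test_fraction=0.3)
print(len(train), len(test))
```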
C5.0 Algorithm
● C5.0 is an algorithm used to generate a decision tree
that can be used for classification; for this reason it is
often referred to as a statistical classifier
● A C5.0 model works by splitting the sample based on the
field that provides the maximum information gain. Each
subsample defined by the first split is then split again,
usually based on a different field, and the process
repeats until the subsamples cannot be split any further.
Finally, the lowest-level splits are reexamined, and those
that do not contribute significantly to the value of the
model are removed or pruned.
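The split-selection step described above can be illustrated with a plain information-gain calculation. This is a sketch under stated assumptions: the fields (`shift`, `line`) and records are hypothetical, and the actual C5.0 implementation may differ in its details (this code does no recursion or pruning).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, field):
    """Entropy reduction obtained by splitting the sample on one field."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[field], []).append(label)
    total = len(labels)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


def best_split(rows, labels, fields):
    """Pick the field that provides the maximum information gain."""
    return max(fields, key=lambda f: information_gain(rows, labels, f))


# Hypothetical sample: field names and values are illustrative only.
ROWS = [
    {"shift": "night", "line": "A"},
    {"shift": "night", "line": "B"},
    {"shift": "day", "line": "A"},
    {"shift": "day", "line": "B"},
    {"shift": "day", "line": "A"},
    {"shift": "night", "line": "A"},
]
LABELS = ["bad", "bad", "good", "good", "good", "good"]

print(best_split(ROWS, LABELS, ["shift", "line"]))
```

A full tree builder would apply `best_split` again to each resulting subsample until no further split is possible, then prune splits that add little value.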
C5.0 Algorithm
● C5.0 models are quite robust in the presence of
problems such as missing data and large numbers of
input fields.
● They usually do not require long training times to
create. Because of the algorithm’s recursive nature, it can
benefit from parallel processing.
● C5.0 offers boosting to increase classification
accuracy
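To illustrate the idea of boosting, here is a small AdaBoost-style sketch over one-threshold decision stumps. AdaBoost is used as a well-known stand-in; the source does not describe how C5.0 implements boosting internally. The data is a toy example with a single hypothetical numeric measurement, where +1 stands for good and -1 for bad.

```python
import math


def stump_predict(x, threshold, polarity):
    """Predict `polarity` below the threshold and `-polarity` otherwise."""
    return polarity if x < threshold else -polarity


def best_stump(xs, ys, weights):
    """Return (weighted error, threshold, polarity) of the best stump."""
    values = sorted(set(xs))
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for threshold in thresholds:
        for polarity in (1, -1):
            error = sum(w for x, y, w in zip(xs, ys, weights)
                        if stump_predict(x, threshold, polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def boost(xs, ys, rounds):
    """Train a weighted ensemble of stumps, reweighting records each round."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Misclassified records gain weight; correctly classified ones lose it.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Combine the stumps by a weighted vote."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1


def accuracy(ensemble, xs, ys):
    return sum(predict(ensemble, x) == y for x, y in zip(xs, ys)) / len(xs)


# Toy data: no single threshold separates the two classes.
XS = list(range(10))
YS = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

print(accuracy(boost(XS, YS, rounds=1), XS, YS))
print(accuracy(boost(XS, YS, rounds=3), XS, YS))
```

On this toy data a single stump cannot classify every record, while three boosted stumps can, which is the effect boosting aims for. Both figures are training-set accuracy.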
Datameer Analysis
● As previously mentioned, Datameer has some built-in
advanced analytics tools, but most of them are in the
descriptive analytics area. The sole predictive analytics
tool it has is a specialized recommendation engine.
● Datameer can be extended to include predictive models
generated in tools like R, SAS, SPSS, etc. These take the
form of functions in DAS, similar to the concept of
functions in Excel.
○ The disadvantage of this approach is that the hard work
of building the model is done without the support of big
data
○ Another disadvantage is the lack of the tight integration
present in the IBM solution; however, you do get the
freedom to use any tool
Project Challenges & Opportunities
● Data understanding and formatting
● Time constraints
● More interaction with people on the ground
● More predictor data (a diverse dataset is key!)
○ Plant environment (temperature, humidity,
pressure)
○ Specific employees
○ Supplier & parts data
○ Warranty data
Questions?
Matous Havlena
matous@havlena.net
Tim Ojo
timmyojo@gmail.com
Akin Alao
alaoraufu@yahoo.co.uk

Predictive Analytics Project in Automotive Industry

  • 1. Predictive Vehicle Inspection Matous Havlena matous@havlena.net Tim Ojo timmyojo@gmail.com Akin Alao alaoraufu@yahoo.co.uk
  • 2. Project Charter Evaluate the feasibility of using Big Data analytics solutions for Manufacturing to solve the problem of Predictive Vehicle Inspection: ● Analyzing vehicle production history to predict car inspection failures from the production line. ● Production shifts, specific employee, and other factors The two Big Data Analytics solutions to be evaluated: ● IBM BigInsights ● Datameer 2.1
  • 3. Approach & Proposed Solution ● Recognized the problem as a classification problem similar to credit scoring or fraud detection. ● Classification is the problem of identifying to which of a set of categories a new observation belongs, on the basis of a training set of data containing observations whose category membership is known. ● Build a predictive model based on machine learning classification (supervised learning) to identify whether a vehicle can be classified as good (passes quality check on 1st try) or bad (fails quality check on 1st try)
  • 4. Proposed Solutions - Tools ● BigInsights + SPSS Modeler ○ Hadoop is used to store big data and execute data processing jobs in an efficient and distributed fashion. IBM provides BigInsights as a management and operational interface to simplify working with Hadoop without doing much coding. ○ SPSS Modeler is a data analytics workbench that allows the user to build predictive models by leveraging built-in algorithms and functions without the need for programming.
  • 5. Proposed Solutions - Tools ● Datameer ○ Like BigInsights, Datameer Analytics Solution (DAS) presents a web-based spreadsheet interface on top of a Hadoop cluster and provides analytics functions and visualizations out of the box without the need for writing code. ○ DAS also has a Smart Analytics suite. One of the tools available in that suite is a decision tree model, a descriptive model that can identify important factors that affect quality. ○ Datameer can also be extended to run predictive models created in R, SAS, SPSS, etc.
  • 6. IBM Solution Architecture ● Components: SPSS Modeler Client (Windows only), SPSS Modeler Server (multiplatform), SPSS Analytic Server (multiplatform), SPSS Analytic Catalyst, Hadoop (BigInsights) ● SPSS Analytic Server ○ allows analysts to do predictive analytics over big data ○ data-centric architecture ensures scalability and performance ● SPSS Analytic Catalyst ○ automatically discovers statistically interesting relationships in data ○ closes the analytic specialist gap ○ good in the early dataset discovery stage (helps to focus on important parts) ○ automates some parts of CRISP-DM
  • 7. Prediction in SPSS Modeler ● 425 predictors ● 85.4% accuracy (on the training dataset)
  • 8. Model Outcome Original value | Predicted value | Confidence
  • 10. C5.0 Algorithm ● C5.0 is an algorithm used to generate a decision tree that can be used for classification; it is therefore often referred to as a statistical classifier. ● A C5.0 model works by splitting the sample based on the field that provides the maximum information gain. Each subsample defined by the first split is then split again, usually based on a different field, and the process repeats until the subsamples cannot be split any further. Finally, the lowest-level splits are reexamined, and those that do not contribute significantly to the value of the model are removed or pruned.
  • 11. C5.0 Algorithm ● C5.0 models are quite robust in the presence of problems such as missing data and large numbers of input fields. ● They usually do not require long training times to create. Because of the algorithm’s recursive nature, it can benefit from parallel processing. ● C5.0 offers a boosting method to increase the accuracy of classification.
  • 12. Datameer Analysis ● As previously mentioned, Datameer has some built-in advanced analytics tools, but most of them are in the descriptive analytics area. The sole predictive analytics tool it has is a specialized recommendation engine. ● Datameer can be extended to include predictive models generated in tools like R, SAS, SPSS, etc. These take the form of functions in DAS, similar to the concept of functions in Excel. ○ The disadvantage of this approach is that the hard work of building the model is done without the support of big data. ○ Another disadvantage is the lack of the tight integration that is present in the IBM solution; however, you do get the freedom to use any tool.
  • 13. Project Challenges & Opportunities ● Data understanding and formatting ● Time constraints ● More interaction with people on the ground ● More predictor data (a diverse dataset is key!) ○ Plant environment (temperature, humidity, pressure) ○ Specific employees ○ Supplier & parts data ○ Warranty data
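
The split criterion described on the C5.0 slides (choose the field that provides the maximum information gain) can be illustrated with a minimal Python sketch. This is not the C5.0 implementation used by SPSS Modeler: it only computes plain information gain for categorical fields and omits pruning, boosting, and missing-value handling. The field names (`shift`, `line`) and the four sample vehicles are hypothetical and were invented for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(rows, labels, field):
    """Entropy reduction obtained by splitting the sample on one categorical field."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[field], []).append(label)
    # Weighted average entropy of the subsamples created by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_split(rows, labels, fields):
    """Return the field with the maximum information gain."""
    return max(fields, key=lambda f: information_gain(rows, labels, f))


# Hypothetical toy data: each row is one vehicle, each label its first inspection result.
rows = [
    {"shift": "day", "line": "A"},
    {"shift": "day", "line": "B"},
    {"shift": "night", "line": "A"},
    {"shift": "night", "line": "B"},
]
labels = ["good", "good", "bad", "bad"]

chosen = best_split(rows, labels, ["shift", "line"])
print(chosen)
```

In this toy sample the production shift separates good from bad vehicles perfectly while the line carries no information, so the shift is chosen for the first split. A full decision tree learner would then repeat the same selection inside each subsample.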