Data Analytics in Manufacturing 
Gian Antonio Susto 
Statwolf LTD 
gianantonio.susto@statwolf.com 
1
Outline 
1. The Data Analytics Environment 
2. Principles of Manufacturing Informatics 
3. Machine (Statistical) Learning 
4. Machine Learning in Manufacturing 
a) Virtual Metrology 
b) Root Cause Analysis 
c) Predictive Maintenance 
d) Fault Detection 
2
(1) The Data Analytics Environment 
3
The (Big) Data Era 
• Data Explosion 
– Increased storage capability (Moore’s Law) 
– Internet of Things 
Gartner: 26 billion IoT objects by 2020 
4 
Techradar.com 
Data aggregated by Gongos Research
The (Big) Data Opportunity 
5
6 
The (Big) Data Deluge
We are drowning in information and starving for 
knowledge 
- Rutherford D. Rogers 
• Insights/learning 
• Predictions 
• Decision making suggestions 
• ... 
7 
The (Big) Data Deluge
8 
Data Analytics: an Interdisciplinary Field
[Figure: Data Analytics draws on Statistical Learning and Software Engineering; application domains include Finance, Manufacturing, Biology, Robotics, ...]
9 
(2) Principles of Manufacturing Informatics
(Big) Data in Manufacturing 
10 
• Manufacturing companies record an enormous amount of process 
data 
• Example [1] - a Consumer Packaged Goods company that produces 
a personal care product generates: 
[1] The Rise of Industrial Big Data 
- General Electric
(Big) Data in Manufacturing 
11 
• ‘Leveraging big data is imperative as information is at the heart of competition 
and growth for industrial businesses. Data-driven strategies based on real-time 
and historical process information will help companies optimize performance’ [1] 
• Possible improvements: 
- Proving quality to trading partners/customers 
- Maximizing yield 
- Reducing downtime 
- Recovering capacity 
[1] The Rise of Industrial Big Data 
- General Electric
The Manufacturing Data Analysis Process 
12 
Problem → Collection → Cleaning → Modelling → Roll-out 
• Problem 
- Definition 
- Expected impact 
- Evaluation metric 
• Collection 
- Conversion 
- Parsing 
- Aggregation 
- Alignment 
• Cleaning 
- Quality 
- Reconciliation 
- Missing data handling 
- Denoising 
- Outlier detection 
• Modelling 
- Feature extraction 
- Building 
- Evaluation/Comparison 
• Roll-out 
- On-line implementation 
- Business outcome 
- Improvement
The Manufacturing Data Analysis Process 
13 
Problem → Collection → Cleaning → Modelling → Roll-out 
• Modelling: Machine Learning modeling based on a historical dataset Z of 
- n observations (samples) 
- p variables (features)
(3) Machine Learning 
14
Machine Learning Problems 
• Two classes of modeling problems, depending on 
the type of data 
– Supervised if the data are labeled (Z = [X Y], with X input and Y output) 
– Unsupervised if the data are unlabeled (Z = X) 
15 
[Diagram: Modeling Problem → Supervised (Regression, Classification) or Unsupervised]
Machine Learning Problems 
• Two categories in case of supervised learning, 
depending on the output type 
– Regression if Y is continuous 
– Classification if Y is discrete/categorical 
16 
[Diagram: Modeling Problem → Supervised (Regression, Classification) or Unsupervised]
Supervised Learning: a Regression example 
• Example: house pricing for real estate market [2] 
• Historical dataset of n house transactions with 
information regarding 
– House price (output - Y) 
– Land square footage (input - X) 
– Living square feet (input - X) 
– Effective year built (input - X) 
– Mailing address (input - X) 
[2] Machine Learning and the Spatial Structure of House Prices and 
Housing Returns – A. Caplin et al. 
17
Supervised Learning: a Classification example 
• Example: Shazam 
• A ‘digital fingerprint’ (X) is extracted from a 
song sample and compared with a database 
of 11 million songs (classes - Y) 
Tip 1 - Defining good features is generally half of the battle
Unsupervised Learning 
• Unlabeled data: quest for 
hidden structure in the data 
– Market Basket/Affinity 
Analysis 
• Patterns in purchases: what is 
bought together? 
• Amazon 2009 revenue $24.5B, 
$5B from recommended 
products 
– Clustering 
• Grouping of a set of ‘similar’ 
objects 
• ‘You may also like’
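As a minimal sketch of the market basket idea above, the snippet below counts how often pairs of items appear in the same purchase. The baskets and the helper name `pair_support` are hypothetical and only illustrate the "what is bought together?" question; real affinity analysis typically uses dedicated association-rule algorithms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets (illustrative data only).
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
    {"milk", "bread"},
]

def pair_support(baskets):
    """Return the fraction of baskets that contain each unordered item pair."""
    counts = Counter()
    for basket in baskets:
        # Sorting makes (a, b) and (b, a) map to the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    n = len(baskets)
    return {pair: c / n for pair, c in counts.items()}

support = pair_support(baskets)
best_pair = max(support, key=support.get)
print(best_pair, support[best_pair])
```

The pair with the highest support is the natural candidate for a "frequently bought together" suggestion.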
(4) Machine Learning in Manufacturing 
20
Manufacturing Data Analytics Examples 
Four examples of manufacturing data analytics 
problems: 
A. Regression – Virtual Metrology (Semiconductor) 
B. Regression – Root Cause Analysis (Pharmaceutical) 
C. Classification – Predictive Maintenance 
(Semiconductor) 
D. Unsupervised Learning – Fault/Novelty Detection 
(Semiconductor/HVAC) 
21
[A] Regression – Virtual Metrology (VM) 
22 
• Semiconductor Manufacturing 
• Production based on wafers 
• Organization in lots (25 wafers) 
• Hundreds (thousands!) of processes: 
- Etching 
- Lithography 
- Chemical Vapor Deposition (CVD) 
- ... 
• Process quality is assessed by measuring one or more parameters (Y) 
on the wafer (for CVD, the thickness of the deposited layer) 
• Unfortunately, measuring is costly and time-consuming
[A] Regression – Virtual Metrology (VM) 
23 
Wafer with metrology data Wafer without metrology data 
• Common practice to save money/time: measuring just 1 wafer per lot 
• Drawbacks: 
- Delays in detecting drifts in production 
- No quality check for unmeasured wafers 
- Any controller is updated only once every 25 process iterations
[A] Regression – Virtual Metrology (VM) 
24 
• Tool data X available for every iteration 
(temperatures, pressures, flows, …) 
• Exploit tool/logistic/production data to 
estimate Y 
• Each wafer now has at least an 
estimate for quality/control purposes, 
i.e. from Lot-to-Lot to Run-to-Run control [3] 
[3] ‘Virtual Metrology and Feedback Control for 
Semiconductor Manufacturing Processes using Recursive 
Partial Least Squares’ - Khan, Moyne and Tilbury, 
Journal of Process Control
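A minimal sketch of the virtual metrology idea, under stated assumptions: the data are synthetic, only the first wafer of each 25-wafer lot is "measured", and plain least squares stands in for the recursive partial least squares used in [3]. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for tool data: 20 lots x 25 wafers, 3 tool variables.
n_lots, lot_size, p = 20, 25, 3
X = rng.normal(size=(n_lots * lot_size, p))
true_w = np.array([2.0, -1.0, 0.5])   # hypothetical process sensitivities
y = 100.0 + X @ true_w + rng.normal(scale=0.1, size=X.shape[0])  # e.g. thickness

# Only the first wafer of each lot is physically measured.
measured = np.arange(0, n_lots * lot_size, lot_size)

# Fit y ~ intercept + X w using the measured wafers only.
A = np.column_stack([np.ones(len(measured)), X[measured]])
coef, *_ = np.linalg.lstsq(A, y[measured], rcond=None)

# Virtual metrology: estimate the output for every wafer from its tool data.
y_hat = np.column_stack([np.ones(X.shape[0]), X]) @ coef

unmeasured = np.setdiff1d(np.arange(X.shape[0]), measured)
rmse = np.sqrt(np.mean((y_hat[unmeasured] - y[unmeasured]) ** 2))
print(f"RMSE on unmeasured wafers: {rmse:.3f}")
```

With an estimate available for every wafer, a controller could in principle be updated at each run rather than once per lot.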
[A] Regression – Virtual Metrology (VM) 
25 
• Modeling difficulties 
1. Data fragmentation: several multi-chamber 
machines, multiple 
products/recipes 
2. High dimensionality: thousands of variables 
3. ‘Skinny problem’ (p >> n): numerical 
problems for model estimation 
• Example: prediction of thickness for CVD on a tool 
with 3 chambers with 2 sub-chambers 
- Exploiting clustering for subset modeling 
Tip 2 – ‘Visualize’/examine data before 
modeling
Dealing with high-dimensionality: Regularization 
methods 
26 
• Not all regression techniques are suitable for high-dimensional problems 
• Simplest approach: Ordinary Least Squares (OLS) regression 
• Objective: minimization of the prediction error on the training data 
• OLS solutions with high-dimensional datasets are often ill-conditioned: the 
predicted output can change drastically with small perturbations of the input, 
causing poor prediction performance
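The ill-conditioning mentioned above can be illustrated with a small synthetic experiment. This is only a sketch under assumed settings (two nearly collinear predictors, n = 50, p = 40): it shows how a tiny change in the training data can move the OLS coefficients noticeably, which is the mechanism behind unstable predictions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 40

# Two nearly collinear columns make X^T X almost singular.
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 1e-6 * rng.normal(size=n)
y = X[:, 0] + rng.normal(scale=0.1, size=n)

cond = np.linalg.cond(X.T @ X)

def ols(X, y):
    """Least-squares coefficients minimizing ||y - X w||^2."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

w = ols(X, y)
w_perturbed = ols(X, y + 1e-3 * rng.normal(size=n))  # tiny change in the data

print(f"condition number of X^T X: {cond:.2e}")
print(f"max coefficient change:    {np.max(np.abs(w - w_perturbed)):.2e}")
```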
Dealing with high-dimensionality: Regularization 
methods 
27 
• Regularization methods overcome the issue 
• Ridge Regression (RR) [L2]: stable (“easier”) solutions are 
encouraged by penalizing coefficients (ill-posed problems or 
over-fitting issues are generally resolved) 
• Least Absolute Shrinkage and Selection Operator (LASSO) [L1]:
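The formulas on these slides did not survive extraction. For reference, the standard textbook forms of the three objectives (see e.g. [4]) are sketched below. The notation is an assumption and may differ from the original slides: β denotes the coefficient vector and λ ≥ 0 the penalty weight.

```latex
\begin{align}
\hat{\beta}^{\mathrm{OLS}}   &= \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 \\
\hat{\beta}^{\mathrm{RR}}    &= \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2 \\
\hat{\beta}^{\mathrm{LASSO}} &= \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1
\end{align}
```

The only difference between RR and LASSO is the norm used in the penalty: the squared L2 norm shrinks coefficients smoothly, while the L1 norm can set some of them exactly to zero.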
Dealing with high-dimensionality: Regularization 
methods 
28 
• A penalty on model complexity generally improves performance 
• Different behaviour: LASSO provides sparse results! 
• E.g. diabetes data: p = 10, n = 367 [4] 
• Sparsity provides interpretable models 
Essentially, all models are wrong, 
but some are useful 
- George E.P. Box 
[4] ‘The Elements of Statistical Learning: 
Data Mining, Inference, and Prediction’ – 
Hastie, Tibshirani, Friedman 2009
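The different behaviour of the two penalties can be seen on synthetic data. The sketch below is illustrative only: it uses a closed-form ridge solution and a basic coordinate-descent LASSO written for this example (not a library routine), with assumed penalty weights and a dataset where only 3 of 10 features matter.

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge solution: (X^T X + lam I)^-1 X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def soft_threshold(z, t):
    """Shrink z towards zero by t, returning exactly zero if |z| <= t."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso(X, y, lam, n_iter=200):
    """Coordinate descent for 0.5 * ||y - X w||^2 + lam * ||w||_1."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Residual with feature j's current contribution removed.
            r_j = y - X @ w + X[:, j] * w[j]
            w[j] = soft_threshold(X[:, j] @ r_j, lam) / col_sq[j]
    return w

rng = np.random.default_rng(2)
n, p = 100, 10
X = rng.normal(size=(n, p))
w_true = np.zeros(p)
w_true[:3] = [3.0, -2.0, 1.5]          # only 3 informative features
y = X @ w_true + rng.normal(scale=0.5, size=n)

w_ridge = ridge(X, y, lam=10.0)
w_lasso = lasso(X, y, lam=30.0)

print("ridge non-zero coefficients:", np.count_nonzero(w_ridge))
print("lasso non-zero coefficients:", np.count_nonzero(w_lasso))
```

Ridge shrinks all coefficients but keeps them non-zero, whereas the soft-thresholding step lets LASSO discard uninformative features entirely, which is what makes the resulting model easier to interpret.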
Regularization methods: guidelines 
29 
• RR & LASSO: no a priori guarantee of best prediction accuracy (cross-validation 
is always a necessary step to evaluate how well results generalize) 
• LASSO is generally outperformed by RR when: 
– p > n 
– there are high correlations between predictors 
• Elastic Nets combine the 2 techniques 
• Kernel Methods 
– non-linear solutions 
embedded in a linear framework 
(augmented space) 
From Chris Thornton, U. Sussex
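Since cross-validation is named above as a necessary step, here is a minimal sketch of k-fold cross-validation used to choose the ridge penalty. The data are synthetic, and the candidate penalty values, the fold count and the helper names are assumptions made for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def kfold_mse(X, y, lam, k=5, seed=0):
    """Mean squared prediction error of ridge, averaged over k held-out folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    errors = []
    for fold in folds:
        train = np.setdiff1d(idx, fold)        # all samples not in this fold
        w = ridge_fit(X[train], y[train], lam)
        errors.append(np.mean((X[fold] @ w - y[fold]) ** 2))
    return float(np.mean(errors))

rng = np.random.default_rng(3)
n, p = 40, 30                                   # few samples, many features
X = rng.normal(size=(n, p))
y = X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=n)

lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]
cv_scores = {lam: kfold_mse(X, y, lam) for lam in lambdas}
best_lam = min(cv_scores, key=cv_scores.get)
print("cross-validated MSE per lambda:", cv_scores)
print("selected lambda:", best_lam)
```

The same loop can compare RR, LASSO and Elastic Net on equal terms, because every model is scored only on data it was not trained on.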
Non-linear Regression: Neural Networks (NNs) 
30 
• NNs mimic the structure of the brain and how it learns from experience 
• Example architecture: 
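Variables are associated with 
nodes and functions with arcs 
[Figure: single neuron with inputs u1, u2, ..., un, weights w1, w2, ..., wn and bias b, summed (Σ) into x; the activation a(x) gives the output y]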
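The single node in the example architecture can be written in a few lines. This is a sketch using the slide's symbols (inputs u, weights w, bias b, activation a(x), output y); the choice of tanh as the activation and the numeric values are assumptions.

```python
import math

def neuron(u, w, b, activation=math.tanh):
    """Single node: weighted sum of the inputs plus bias, passed through a(x)."""
    x = sum(wi * ui for wi, ui in zip(w, u)) + b
    return activation(x)

u = [0.5, -1.0, 2.0]     # inputs u1..un (hypothetical values)
w = [0.4, 0.3, -0.1]     # weights w1..wn (hypothetical values)
b = 0.1                  # bias
y = neuron(u, w, b)
print(y)
```

A network is obtained by feeding the outputs of such nodes into further nodes; training adjusts the weights and biases to reduce the prediction error.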
Non-linear Regression: Neural Networks (NNs) 
31 
• PROS: 
- Great prediction accuracy 
- Flexibility in modeling non-linearities 
• CONS: 
- Time-consuming tuning 
- Not suitable for high-dimensional problems 
• In case of high dimensionality, a 2-step procedure is applied: 
1. Dimensionality reduction (correlation, PCA, etc.) 
2. Modeling 
Tip 3 - The choice between linear vs non-linear approaches should be 
tailored to the problem at hand
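The 2-step procedure can be sketched as follows. The data are synthetic (a few latent factors driving 500 variables), the number of retained components k is assumed, and plain least squares stands in for the non-linear model in step 2 to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, k = 60, 500, 5                  # p >> n; k retained components (assumed)

# Synthetic data driven by a few latent factors.
latent = rng.normal(size=(n, k))
loadings = rng.normal(size=(k, p))
X = latent @ loadings + 0.1 * rng.normal(size=(n, p))
y = latent[:, 0] - 2.0 * latent[:, 1] + 0.1 * rng.normal(size=n)

# Step 1: dimensionality reduction with PCA (via SVD of the centered data).
X_mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - X_mean, full_matrices=False)
components = Vt[:k]                        # top-k principal directions
reduced = (X - X_mean) @ components.T      # n x k reduced representation

# Step 2: modeling on the reduced features (here plain least squares).
A = np.column_stack([np.ones(n), reduced])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
r2 = 1.0 - np.sum((y - A @ coef) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"reduced from {p} to {k} features, training R^2 = {r2:.3f}")
```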
[B] Regression – Root Cause Analysis (RCA) 
32 
• Pharmaceutical 
Manufacturing 
• Slow-Release (Time 
Release) technologies: 
capsules that dissolve over 
time for a controlled 
release of drug into the 
bloodstream 
• Dissolution profiles (y1, y2, y3, y4) over different time intervals (T1, T2, T3, T4) 
are required to fall within specified ranges 
• Variability in the production: where does it come from? Root Cause 
Analysis
[B] Regression – Root Cause Analysis (RCA) 
33 
• Production consists of several steps and can 
be influenced by many factors (e.g. 
raw material quality) 
• All the available data sources are 
exploited for modeling the 
dissolution curves (y1, y2, y3, y4) 
• Modeling with sparse approaches 
to pinpoint the parameters most 
influential on variability 
[Figure: x0 → Process #1 (x1) → Process #2 (x2) → Process #3 (x3) → Process #4 (x4) → y1, y2, y3, y4] 
[Figure: RCA bar chart over X1, X2, X3, X4; vertical axis from 0 to 16]
[C] Classification – Predictive Maintenance 
(PdM) 
34 
• Data analytics enables more sophisticated approaches to maintenance 
management 
• 3 groups of approaches to maintenance in manufacturing 
(R2F, PvM, PdM): 
1. Run-to-Failure (R2F) 
• Repair or restoration actions 
performed only after the 
occurrence of a failure 
• ‘If it’s not broken, don’t fix it’
[C] Classification – Predictive Maintenance 
(PdM) 
35 
2. Preventive Maintenance (PvM) 
• Maintenance performed on a planned schedule 
with the aim of anticipating 
failures 
• Failures are generally averted 
• Unnecessary maintenance is often 
performed
[C] Classification – Predictive Maintenance 
(PdM) 
36 
3. Predictive Maintenance (PdM) 
• Maintenance actions based on 
suggestions provided by a data 
analytics module 
• PdM module based on data 
available on the tool/production
[C] Classification – Predictive Maintenance 
(PdM) 
37 
• Semiconductor 
Manufacturing 
• Forecast of integral-type 
faults (caused by machine 
usage) 
• Use case: breakage of 
tungsten filaments in ion implanters 
• Goal: define an indicator (y), the health factor, of the current component 
status from process parameters (X)
[C] Classification – Predictive Maintenance 
(PdM) 
38 
• The health factor is a quantitative index; however, we treat this as a 
classification problem 
• Observations divided into: 
o ‘Non-Faulty’ (data of process iterations with working component) 
o ‘Faulty’ (data of process iterations with broken component) 
• Use of Support Vector Machines: the distance from the decision 
boundary is exploited as ‘distance to failure’ 
[Figure: SVM decision boundary between the two classes; adapted from [4]]
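The 'distance from the decision boundary' idea can be sketched for a linear SVM. The weight vector w and offset b below are hypothetical values standing in for an already-trained classifier, and the sign convention (positive means the 'Non-Faulty' side) is an assumption of this example.

```python
import numpy as np

# Hypothetical parameters of an already-trained linear SVM (not from the slides).
w = np.array([0.8, -0.6])
b = -0.5

def health_factor(X):
    """Signed distance to the hyperplane w.x + b = 0 (positive = 'Non-Faulty' side)."""
    return (X @ w + b) / np.linalg.norm(w)

X_new = np.array([
    [3.0, 0.0],    # far from the boundary
    [1.5, 0.5],    # closer
    [0.75, 0.0],   # almost at the boundary
    [0.0, 1.0],    # on the 'Faulty' side
])
hf = health_factor(X_new)
print(hf)
```

As process iterations drift towards the boundary the value shrinks, so it can serve as a continuous indicator even though the underlying model is a two-class classifier.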
[C] Classification – Predictive Maintenance 
(PdM) 
43 
• Trigger of maintenance action 
• Maintenance management 
performance indicators: 
- Unexpected Breaks N_UB 
(associated cost C_UB) 
- Unexploited Lifetime N_UL 
(associated cost C_UL) 
[Figure: health factor against a threshold, illustrating Unexploited Lifetime and Unexpected Breaks]
[C] Classification – Predictive Maintenance 
(PdM) 
44 
• Minimization of the overall costs 
• Decision Support System: 
from process data and production/maintenance 
costs, the PdM module suggests when actions should 
be taken to minimize costs
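One way to read the cost minimization above is as a choice of the health factor threshold that minimizes N_UB · C_UB + N_UL · C_UL on historical data. The sketch below follows that reading; the traces, the cost values and the rule "alarm when the health factor drops below the threshold" are all assumptions made for illustration.

```python
# Hypothetical health-factor traces, one per component life. The last value of
# each trace corresponds to the iteration at which the component actually broke.
runs = [
    [0.9, 0.8, 0.6, 0.4, 0.2, 0.05],
    [0.95, 0.7, 0.5, 0.3, 0.1],
    [0.9, 0.85, 0.6, 0.35, 0.25, 0.15, 0.02],
]
C_UB = 100.0   # assumed cost of one unexpected break
C_UL = 5.0     # assumed cost of one unexploited process iteration

def total_cost(threshold):
    """Cost of triggering maintenance when the health factor drops below threshold."""
    n_ub = n_ul = 0
    for run in runs:
        # First iteration before the failure itself at which the alarm fires.
        alarm = next((i for i, h in enumerate(run[:-1]) if h < threshold), None)
        if alarm is None:
            n_ub += 1                        # failure was not anticipated
        else:
            n_ul += len(run) - 1 - alarm     # iterations left unused
    return n_ub * C_UB + n_ul * C_UL

candidates = [0.1, 0.2, 0.3, 0.4, 0.5]
costs = {t: total_cost(t) for t in candidates}
best = min(costs, key=costs.get)
print(costs, "best threshold:", best)
```

A low threshold saves lifetime but risks unexpected breaks, while a high threshold does the opposite; the relative size of the two costs decides where the best trade-off lies.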
[D] Unsupervised Learning – Fault Detection 
45 
• Two classes of failure-related problems 
1) Prediction (breakages in the future) 
2) Detection (breakages that have already happened) 
• With thousands of variables, the 
detection of a breakage is not 
always a trivial task 
• Univariate monitoring can be 
misleading 
Tip 4 - Multivariate systems need 
multivariate approaches
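A small synthetic example shows why univariate monitoring can mislead. Two process variables are strongly correlated in normal operation; the new observation stays inside each variable's own 3-sigma band but breaks the correlation. The Mahalanobis distance used here is one common multivariate statistic, chosen for this sketch rather than taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(5)

# Normal operation: two strongly correlated process variables (synthetic).
cov = np.array([[1.0, 0.95], [0.95, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=2000)

mu = X.mean(axis=0)
sigma = X.std(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))

def univariate_alarm(x, k=3.0):
    """Alarm if any single variable leaves its own +/- k sigma band."""
    return bool(np.any(np.abs(x - mu) > k * sigma))

def mahalanobis_sq(x):
    """Squared Mahalanobis distance, which accounts for the correlation."""
    d = x - mu
    return float(d @ cov_inv @ d)

# Each variable is individually plausible, but one is high while the other is low.
x_new = np.array([2.0, -2.0])
print("univariate alarm:", univariate_alarm(x_new))
print("squared Mahalanobis distance:", round(mahalanobis_sq(x_new), 1))
```

The per-variable check stays silent, while the multivariate distance is far larger than the values seen in normal operation (about 2 on average for two variables).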
[D] Unsupervised Learning – Fault Detection 
46 
• Usage 
1. Issue recognized by the system 
2. Drill-down to the ‘guilty’ parameter(s) 
3. Original data inspection

More Related Content

What's hot

Big data ppt
Big data pptBig data ppt
Big data ppt
Deepika ParthaSarathy
 
introduction to data science
introduction to data scienceintroduction to data science
introduction to data science
bhavesh lande
 
Use of data in manufacturing
Use of data in manufacturingUse of data in manufacturing
Use of data in manufacturing
Plastindustrien
 
Smart manufacturing and a iot
Smart manufacturing and a iotSmart manufacturing and a iot
Smart manufacturing and a iot
Daniel Li
 
Machine learning life cycle
Machine learning life cycleMachine learning life cycle
Machine learning life cycle
Ramjee Ganti
 
Digital twins: the power of a virtual visual copy - Unite Copenhagen 2019
Digital twins: the power of a virtual visual copy - Unite Copenhagen 2019Digital twins: the power of a virtual visual copy - Unite Copenhagen 2019
Digital twins: the power of a virtual visual copy - Unite Copenhagen 2019
Unity Technologies
 
IoT Architectures for a Digital Twin with Apache Kafka, IoT Platforms and Mac...
IoT Architectures for a Digital Twin with Apache Kafka, IoT Platforms and Mac...IoT Architectures for a Digital Twin with Apache Kafka, IoT Platforms and Mac...
IoT Architectures for a Digital Twin with Apache Kafka, IoT Platforms and Mac...
Kai Wähner
 
Data lake benefits
Data lake benefitsData lake benefits
Data lake benefits
Ricky Barron
 
Data Quality: A Raising Data Warehousing Concern
Data Quality: A Raising Data Warehousing ConcernData Quality: A Raising Data Warehousing Concern
Data Quality: A Raising Data Warehousing Concern
Amin Chowdhury
 
Big Data PPT by Rohit Dubey
Big Data PPT by Rohit DubeyBig Data PPT by Rohit Dubey
Big Data PPT by Rohit DubeyRohit Dubey
 
Introduction to Data Science and Analytics
Introduction to Data Science and AnalyticsIntroduction to Data Science and Analytics
Introduction to Data Science and Analytics
Srinath Perera
 
Presentation on Big Data
Presentation on Big DataPresentation on Big Data
Presentation on Big Data
Maruf Abdullah (Rion)
 
Big Data
Big DataBig Data
Big Data
Seminar Links
 
Why an AI-Powered Data Catalog Tool is Critical to Business Success
Why an AI-Powered Data Catalog Tool is Critical to Business SuccessWhy an AI-Powered Data Catalog Tool is Critical to Business Success
Why an AI-Powered Data Catalog Tool is Critical to Business Success
Informatica
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
Ghulam Imaduddin
 
Digital twin - Internet of Things
Digital twin - Internet of ThingsDigital twin - Internet of Things
Digital twin - Internet of Things
Ahmed Sayed
 
Introduction to Data Science.pptx
Introduction to Data Science.pptxIntroduction to Data Science.pptx
Introduction to Data Science.pptx
Vrishit Saraswat
 
Application of predictive analytics
Application of predictive analyticsApplication of predictive analytics
Application of predictive analytics
Prasad Narasimhan
 
Framework for understanding quantum computing use cases from a multidisciplin...
Framework for understanding quantum computing use cases from a multidisciplin...Framework for understanding quantum computing use cases from a multidisciplin...
Framework for understanding quantum computing use cases from a multidisciplin...
Anastasija Nikiforova
 
Industry 4.0 and Smart Factory
Industry 4.0 and Smart FactoryIndustry 4.0 and Smart Factory
Industry 4.0 and Smart Factory
Alaa Khamis, PhD, SMIEEE
 

What's hot (20)

Big data ppt
Big data pptBig data ppt
Big data ppt
 
introduction to data science
introduction to data scienceintroduction to data science
introduction to data science
 
Use of data in manufacturing
Use of data in manufacturingUse of data in manufacturing
Use of data in manufacturing
 
Smart manufacturing and a iot
Smart manufacturing and a iotSmart manufacturing and a iot
Smart manufacturing and a iot
 
Machine learning life cycle
Machine learning life cycleMachine learning life cycle
Machine learning life cycle
 
Digital twins: the power of a virtual visual copy - Unite Copenhagen 2019
Digital twins: the power of a virtual visual copy - Unite Copenhagen 2019Digital twins: the power of a virtual visual copy - Unite Copenhagen 2019
Digital twins: the power of a virtual visual copy - Unite Copenhagen 2019
 
IoT Architectures for a Digital Twin with Apache Kafka, IoT Platforms and Mac...
IoT Architectures for a Digital Twin with Apache Kafka, IoT Platforms and Mac...IoT Architectures for a Digital Twin with Apache Kafka, IoT Platforms and Mac...
IoT Architectures for a Digital Twin with Apache Kafka, IoT Platforms and Mac...
 
Data lake benefits
Data lake benefitsData lake benefits
Data lake benefits
 
Data Quality: A Raising Data Warehousing Concern
Data Quality: A Raising Data Warehousing ConcernData Quality: A Raising Data Warehousing Concern
Data Quality: A Raising Data Warehousing Concern
 
Big Data PPT by Rohit Dubey
Big Data PPT by Rohit DubeyBig Data PPT by Rohit Dubey
Big Data PPT by Rohit Dubey
 
Introduction to Data Science and Analytics
Introduction to Data Science and AnalyticsIntroduction to Data Science and Analytics
Introduction to Data Science and Analytics
 
Presentation on Big Data
Presentation on Big DataPresentation on Big Data
Presentation on Big Data
 
Big Data
Big DataBig Data
Big Data
 
Why an AI-Powered Data Catalog Tool is Critical to Business Success
Why an AI-Powered Data Catalog Tool is Critical to Business SuccessWhy an AI-Powered Data Catalog Tool is Critical to Business Success
Why an AI-Powered Data Catalog Tool is Critical to Business Success
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
Digital twin - Internet of Things
Digital twin - Internet of ThingsDigital twin - Internet of Things
Digital twin - Internet of Things
 
Introduction to Data Science.pptx
Introduction to Data Science.pptxIntroduction to Data Science.pptx
Introduction to Data Science.pptx
 
Application of predictive analytics
Application of predictive analyticsApplication of predictive analytics
Application of predictive analytics
 
Framework for understanding quantum computing use cases from a multidisciplin...
Framework for understanding quantum computing use cases from a multidisciplin...Framework for understanding quantum computing use cases from a multidisciplin...
Framework for understanding quantum computing use cases from a multidisciplin...
 
Industry 4.0 and Smart Factory
Industry 4.0 and Smart FactoryIndustry 4.0 and Smart Factory
Industry 4.0 and Smart Factory
 

Viewers also liked

Big Data & Analytics in the Manufacturing Industry: The Vaasan Group
Big Data & Analytics in the Manufacturing Industry: The Vaasan GroupBig Data & Analytics in the Manufacturing Industry: The Vaasan Group
Big Data & Analytics in the Manufacturing Industry: The Vaasan Group
IBM Analytics
 
Real-time Predictive Analytics in Manufacturing - Impetus Webinar
Real-time Predictive Analytics in Manufacturing - Impetus WebinarReal-time Predictive Analytics in Manufacturing - Impetus Webinar
Real-time Predictive Analytics in Manufacturing - Impetus Webinar
Impetus Technologies
 
MES, Operational Excellence, Data Analytics and Manufacturing Intelligence
MES, Operational Excellence, Data Analytics and Manufacturing IntelligenceMES, Operational Excellence, Data Analytics and Manufacturing Intelligence
MES, Operational Excellence, Data Analytics and Manufacturing Intelligence
Bora Susmaz
 
Innovating Enterprise Innovation
Innovating Enterprise InnovationInnovating Enterprise Innovation
Innovating Enterprise Innovation
Förderverein Technische Fakultät
 
Jonathan Abir and James Norman - Cranfield University
Jonathan Abir and James Norman - Cranfield UniversityJonathan Abir and James Norman - Cranfield University
Jonathan Abir and James Norman - Cranfield UniversityJonathan Abir
 
Implementing a QbD program to make Process Validation a Lifestyle
Implementing a QbD program to make Process Validation a LifestyleImplementing a QbD program to make Process Validation a Lifestyle
Implementing a QbD program to make Process Validation a Lifestyle
Institute of Validation Technology
 
Honeywell User's Group Almirall's MES case study
Honeywell User's Group Almirall's MES case studyHoneywell User's Group Almirall's MES case study
Honeywell User's Group Almirall's MES case study
David Badia
 
DS Smith Meeting
DS Smith MeetingDS Smith Meeting
DS Smith MeetingStellaITEC
 
World Class Manufacturing - Analysis on Hayes and Wheelwright foundation
World Class Manufacturing - Analysis on Hayes and Wheelwright foundation World Class Manufacturing - Analysis on Hayes and Wheelwright foundation
World Class Manufacturing - Analysis on Hayes and Wheelwright foundation
mukesh00007
 
Pistoia alliance debates analytics 15-09-2015 16.00
Pistoia alliance debates   analytics 15-09-2015 16.00Pistoia alliance debates   analytics 15-09-2015 16.00
Pistoia alliance debates analytics 15-09-2015 16.00
Pistoia Alliance
 
Plant Integration and MES Solution for Industry
Plant Integration and MES Solution for IndustryPlant Integration and MES Solution for Industry
Plant Integration and MES Solution for Industry
Sunil Wadhwa -MIE, EPLM (IIMC)
 
Predictive Data Analytics to Help Your Customers
Predictive Data Analytics to Help Your CustomersPredictive Data Analytics to Help Your Customers
Predictive Data Analytics to Help Your Customers
Experian_US
 
Practical Advanced Process Control for Engineers and Technicians
Practical Advanced Process Control for Engineers and TechniciansPractical Advanced Process Control for Engineers and Technicians
Practical Advanced Process Control for Engineers and Technicians
Living Online
 
Cloud computing for manufacturing
Cloud computing for manufacturingCloud computing for manufacturing
Cloud computing for manufacturing
Jeff Chu
 

Viewers also liked (14)

Big Data & Analytics in the Manufacturing Industry: The Vaasan Group
Big Data & Analytics in the Manufacturing Industry: The Vaasan GroupBig Data & Analytics in the Manufacturing Industry: The Vaasan Group
Big Data & Analytics in the Manufacturing Industry: The Vaasan Group
 
Real-time Predictive Analytics in Manufacturing - Impetus Webinar
Real-time Predictive Analytics in Manufacturing - Impetus WebinarReal-time Predictive Analytics in Manufacturing - Impetus Webinar
Real-time Predictive Analytics in Manufacturing - Impetus Webinar
 
MES, Operational Excellence, Data Analytics and Manufacturing Intelligence
MES, Operational Excellence, Data Analytics and Manufacturing IntelligenceMES, Operational Excellence, Data Analytics and Manufacturing Intelligence
MES, Operational Excellence, Data Analytics and Manufacturing Intelligence
 
Innovating Enterprise Innovation
Innovating Enterprise InnovationInnovating Enterprise Innovation
Innovating Enterprise Innovation
 
Jonathan Abir and James Norman - Cranfield University
Jonathan Abir and James Norman - Cranfield UniversityJonathan Abir and James Norman - Cranfield University
Jonathan Abir and James Norman - Cranfield University
 
Implementing a QbD program to make Process Validation a Lifestyle
Implementing a QbD program to make Process Validation a LifestyleImplementing a QbD program to make Process Validation a Lifestyle
Implementing a QbD program to make Process Validation a Lifestyle
 
Honeywell User's Group Almirall's MES case study
Honeywell User's Group Almirall's MES case studyHoneywell User's Group Almirall's MES case study
Honeywell User's Group Almirall's MES case study
 
DS Smith Meeting
DS Smith MeetingDS Smith Meeting
DS Smith Meeting
 
World Class Manufacturing - Analysis on Hayes and Wheelwright foundation
World Class Manufacturing - Analysis on Hayes and Wheelwright foundation World Class Manufacturing - Analysis on Hayes and Wheelwright foundation
World Class Manufacturing - Analysis on Hayes and Wheelwright foundation
 
Pistoia alliance debates analytics 15-09-2015 16.00
Pistoia alliance debates   analytics 15-09-2015 16.00Pistoia alliance debates   analytics 15-09-2015 16.00
Pistoia alliance debates analytics 15-09-2015 16.00
 
Plant Integration and MES Solution for Industry
Plant Integration and MES Solution for IndustryPlant Integration and MES Solution for Industry
Plant Integration and MES Solution for Industry
 
Predictive Data Analytics to Help Your Customers
Predictive Data Analytics to Help Your CustomersPredictive Data Analytics to Help Your Customers
Predictive Data Analytics to Help Your Customers
 
Practical Advanced Process Control for Engineers and Technicians
Practical Advanced Process Control for Engineers and TechniciansPractical Advanced Process Control for Engineers and Technicians
Practical Advanced Process Control for Engineers and Technicians
 
Cloud computing for manufacturing
Cloud computing for manufacturingCloud computing for manufacturing
Cloud computing for manufacturing
 

Similar to Manufacturing Data Analytics

A data science observatory based on RAMP - rapid analytics and model prototyping
A data science observatory based on RAMP - rapid analytics and model prototypingA data science observatory based on RAMP - rapid analytics and model prototyping
A data science observatory based on RAMP - rapid analytics and model prototyping
Akin Osman Kazakci
 
Data quality in decision making - Dr. Philip Woodall, University of Cambridge
Data quality in decision making - Dr. Philip Woodall, University of CambridgeData quality in decision making - Dr. Philip Woodall, University of Cambridge
Data quality in decision making - Dr. Philip Woodall, University of Cambridge
BCS Data Management Specialist Group
 
2016 QDB VLDB Workshop - Towards Rigorous Evaluation of Data Integration Syst...
2016 QDB VLDB Workshop - Towards Rigorous Evaluation of Data Integration Syst...2016 QDB VLDB Workshop - Towards Rigorous Evaluation of Data Integration Syst...
2016 QDB VLDB Workshop - Towards Rigorous Evaluation of Data Integration Syst...
Boris Glavic
 
6 data envelopment_analysis
6 data envelopment_analysis6 data envelopment_analysis
6 data envelopment_analysis
FEG
 
230208 MLOps Getting from Good to Great.pptx
230208 MLOps Getting from Good to Great.pptx230208 MLOps Getting from Good to Great.pptx
230208 MLOps Getting from Good to Great.pptx
Arthur240715
 
EMOS 2018 Big Data methods and techniques
EMOS 2018 Big Data methods and techniquesEMOS 2018 Big Data methods and techniques
EMOS 2018 Big Data methods and techniques
Piet J.H. Daas
 
Automated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance SystemsAutomated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance Systems
Lionel Briand
 
Rutgers Governor School - Six Sigma
Rutgers Governor School - Six Sigma  Rutgers Governor School - Six Sigma
Rutgers Governor School - Six Sigma
Brandon Theiss, PE
 
Big data-analytics-for-smart-manufacturing-systems-report
Big data-analytics-for-smart-manufacturing-systems-reportBig data-analytics-for-smart-manufacturing-systems-report
Big data-analytics-for-smart-manufacturing-systems-report
Aravindharamanan S
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning
Gopal Sakarkar
 
Six_Sigma.pptx
Six_Sigma.pptxSix_Sigma.pptx
Six_Sigma.pptx
ABHISHEKGAUTAM856791
 
Marketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success RatesMarketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success Rates
Revolution Analytics
 
Big data
Big dataBig data
Big data
Big dataBig data
Big data
Harshit Namdev
 
Hadoop PDF
Hadoop PDFHadoop PDF
Hadoop PDF
1904saikrishna
 
Quality engineering
Quality engineeringQuality engineering
When Should I Use Simulation?
When Should I Use Simulation?When Should I Use Simulation?
When Should I Use Simulation?
SIMUL8 Corporation
 
Decision Sciences Management
Decision Sciences ManagementDecision Sciences Management
Decision Sciences Management
Piyush Vijay
 
Simulacion luis garciaguzman-21012011
Simulacion luis garciaguzman-21012011Simulacion luis garciaguzman-21012011
Simulacion luis garciaguzman-21012011lideresacademicos
 
Software metrics by Dr. B. J. Mohite
Software metrics by Dr. B. J. MohiteSoftware metrics by Dr. B. J. Mohite
Software metrics by Dr. B. J. Mohite
Zeal Education Society, Pune
 

Manufacturing Data Analytics

  • 1. Data Analytics in Manufacturing. Gian Antonio Susto, Statwolf LTD, gianantonio.susto@statwolf.com
  • 2. Outline: 1. The Data Analytics Environment 2. Principles of Manufacturing Informatics 3. Machine (Statistical) Learning 4. Machine Learning in Manufacturing: a) Virtual Metrology b) Root Cause Analysis c) Predictive Maintenance d) Fault Detection
  • 3. (1) The Data Analytics Environment
  • 4. The (Big) Data Era • Data Explosion – Increased storage capability (Moore’s Law) – Internet of Things (Gartner: 26 billion IoT objects by 2020) [Figure sources: Techradar.com; data aggregated by Gongos Research]
  • 5. The (Big) Data Opportunity
  • 6. The (Big) Data Deluge
  • 7. The (Big) Data Deluge • ‘We are drowning in information and starving for knowledge’ - Rutherford D. Rogers • Insights/learning • Predictions • Decision making suggestions • ...
  • 8. Data Analytics: an Interdisciplinary Field [Diagram labels: Statistical Learning; Software Engineering; Data Analytics; Finance; Manufacturing; Biology; Robotics; ...]
  • 9. (2) Principles of Manufacturing Informatics
  • 10. (Big) Data in Manufacturing • Manufacturing companies record enormous amounts of process data • Example [1] - A Consumer Packaged Goods company that produces a personal care product generates: [1] The Rise of Industrial Big Data - General Electric
  • 11. (Big) Data in Manufacturing • ‘Leveraging big data is imperative as information is at the heart of competition and growth for industrial businesses. Data-driven strategies based on real-time and historical process information will help companies optimize performance’ [1] • Possible improvements: - Proving quality to trading partners/customers - Maximizing yield - Reducing downtime - Recovering capacity [1] The Rise of Industrial Big Data - General Electric
  • 12. The Manufacturing Data Analysis Process: Problem (Definition, Expected Impact, Evaluation metric) → Collection (Conversion, Parsing, Aggregation, Alignment) → Cleaning (Quality, Reconciliation, Missing data handling, Denoising, Outlier detection) → Modelling (Feature Extraction, Building, Evaluation/Comparison) → Roll-out (On-line implementation, Business outcome, Improvement)
  • 13. The Manufacturing Data Analysis Process: Problem (Definition, Expected Impact, Evaluation metric) → Collection (Conversion, Parsing, Aggregation, Alignment) → Cleaning (Quality, Reconciliation, Missing data handling, Denoising, Outlier detection) → Modelling (Feature Extraction, Building, Evaluation/Comparison) → Roll-out (On-line implementation, Business outcome, Improvement) • Modelling: Machine Learning modeling based on a historical dataset Z of n observations (samples) and p variables (features)
  • 15. Machine Learning Problems • Two classes of modeling problems, depending on the type of data – Supervised if the data are labeled (Z = [X Y], with X the input and Y the output) – Unsupervised if the data are unlabeled (Z = X) [Diagram: Modeling Problem → Supervised (Regression, Classification) | Unsupervised]
  • 16. Machine Learning Problems • Two categories of supervised learning, depending on the output type – Regression if Y is continuous – Classification if Y is discrete/categorical [Diagram: Modeling Problem → Supervised (Regression, Classification) | Unsupervised]
  • 17. Supervised Learning: a Regression example • Example: house pricing for the real estate market [2] • Historical dataset of n house transactions with information regarding – House price (output - Y) – Land square footage (input - X) – Living square footage (input - X) – Effective year built (input - X) – Mailing address (input - X) [2] ‘Machine Learning and the Spatial Structure of House Prices and Housing Returns’ – A. Caplin et al.
  • 18. Supervised Learning: a Classification example • Example: Shazam • A ‘digital fingerprint’ (X) is extracted from a song sample and compared with a database of 11 million songs (classes - Y) Tip 1 - Defining good features is generally half of the battle
  • 19. Unsupervised Learning • Unlabeled data: quest for hidden structure in the data – Market Basket/Affinity Analysis • Patterns in the purchases: what is bought together? • Amazon 2009 revenue $24.5B, $5B from recommended products – Clustering • Grouping of a set of ‘similar’ objects • ‘You may also like’
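
The 'what is bought together?' question on the slide above can be sketched as a simple pair count. This is a minimal illustration and not the method of any particular retailer; the baskets and item names are invented for the example.

```python
from collections import Counter
from itertools import combinations

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) drops repeated items and gives each pair one canonical order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

# Hypothetical purchase records, invented for illustration
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "coffee"],
    ["bread", "butter", "coffee"],
]

counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
support = top_count / len(baskets)  # fraction of baskets containing the pair
print(top_pair, top_count, support)
```

Here the pair (bread, butter) appears in 3 of the 4 baskets, so its support is 0.75. Real affinity analysis usually adds further measures (confidence, lift) on top of such counts.
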
  • 20. (4) Machine Learning in Manufacturing
  • 21. Manufacturing Data Analytics Examples • Four examples of manufacturing data analytics problems: A. Regression – Virtual Metrology (Semiconductor) B. Regression – Root Cause Analysis (Pharmaceutical) C. Classification – Predictive Maintenance (Semiconductor) D. Unsupervised Learning – Fault/Novelty Detection (Semiconductor/HVAC)
  • 22. [A] Regression – Virtual Metrology (VM) • Semiconductor Manufacturing • Production based on wafers • Organization in lots (25 wafers) • Hundreds (thousands!) of processes: - Etching - Lithography - Chemical Vapor Deposition (CVD) - ... • The quality of a process is assessed by measuring one or more parameters (Y) on the wafer (for CVD, the thickness of the deposited layer) • Unfortunately, measuring is costly and time-consuming
  • 23. [A] Regression – Virtual Metrology (VM) [Figure legend: wafer with metrology data; wafer without metrology data] • Common practice to save money/time: measuring just 1 wafer per lot • Drawbacks: - Delays in detecting drifts in production - No quality check for unmeasured wafers - The controller, if present, is updated only once every 25 process iterations
  • 24. [A] Regression – Virtual Metrology (VM) • Tool data X available for every iteration (temperatures, pressures, flows, …) • Exploit tool/logistic/production data to estimate Y • Each wafer now has at least an estimate for quality/control purposes, i.e. a move from Lot-to-Lot to Run-to-Run control [3] [3] ‘Virtual Metrology and Feedback Control for Semiconductor Manufacturing Processes using Recursive Partial Least Squares’ - Journal of Process Control, Khan, Moyne and Tilbury
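
The Virtual Metrology idea above (learn Y from tool data on the few measured wafers, then estimate Y for every other wafer) can be sketched with a one-variable least-squares fit. This is only a toy under strong assumptions: a single tool reading, a linear relation, and invented numbers. It is not the Recursive Partial Least Squares approach of [3].

```python
def fit_line(x, y):
    """Ordinary least squares for y ≈ a + b*x with a single input variable."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b

# Hypothetical data: one tool reading per wafer (say, a chamber temperature),
# with thickness measured only on one wafer per lot. All values are invented.
measured_temp = [400.0, 402.0, 404.0, 406.0]
measured_thickness = [100.0, 101.0, 102.0, 103.0]

a, b = fit_line(measured_temp, measured_thickness)

# Wafers that never went through metrology still get an estimate
unmeasured_temp = [401.0, 405.0]
estimated_thickness = [a + b * t for t in unmeasured_temp]
print(estimated_thickness)
```

In a real setting X has many variables per wafer, which is what motivates the regularized methods discussed on the following slides.
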
  • 25. [A] Regression – Virtual Metrology (VM) • Modeling difficulties 1. Data fragmentation: several multi-chamber machines, multiple products/recipes 2. High dimensionality: thousands of variables 3. ‘Skinny problem’ (p >> n): numerical problems for model estimation • Example: prediction of thickness for CVD on a tool with 3 chambers with 2 sub-chambers - clustering exploited for subset modeling • Tip 2 – ‘Visualize’/examine data before modeling
  • 26. Dealing with high-dimensionality: Regularization methods • Not all regression techniques are suitable for high-dimensional problems • Simplest approach: Ordinary Least Squares (OLS) regression • Objective: minimization of the prediction error on the training data • OLS solutions on high-dimensional datasets are often ill-conditioned: the predicted output can change drastically with small perturbations of the input, causing poor prediction performance
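
The ill-conditioning described above can be reproduced on a tiny example. The sketch below assumes just two nearly collinear inputs and invented data. It solves the 2x2 normal equations directly, which is fine for illustration but not how a production solver would do it.

```python
def ols_two_inputs(X, y):
    """Least squares fit of y ≈ b1*x1 + b2*x2 via the 2x2 normal equations."""
    s11 = sum(x1 * x1 for x1, _ in X)
    s12 = sum(x1 * x2 for x1, x2 in X)
    s22 = sum(x2 * x2 for _, x2 in X)
    t1 = sum(x1 * yi for (x1, _), yi in zip(X, y))
    t2 = sum(x2 * yi for (_, x2), yi in zip(X, y))
    det = s11 * s22 - s12 * s12  # close to zero when x1 and x2 are nearly collinear
    b1 = (s22 * t1 - s12 * t2) / det
    b2 = (s11 * t2 - s12 * t1) / det
    return b1, b2

def predict(b, x):
    return b[0] * x[0] + b[1] * x[1]

# Invented data: the second input is almost a copy of the first
X = [(1.0, 1.00), (2.0, 2.01), (3.0, 3.00)]
y_clean = [2.00, 4.01, 6.00]  # exactly x1 + x2
y_noisy = [2.00, 4.11, 6.00]  # one training output shifted by 0.1

b_clean = ols_two_inputs(X, y_clean)  # approximately (1, 1)
b_noisy = ols_two_inputs(X, y_noisy)  # approximately (-9, 11)

# With the noisy fit, moving x2 by 0.1 moves the prediction by about 1.1
jump = predict(b_noisy, (2.0, 2.1)) - predict(b_noisy, (2.0, 2.0))
print(b_clean, b_noisy, jump)
```

A shift of 0.1 in a single training output turns coefficients of about (1, 1) into about (-9, 11), and the resulting model amplifies small input changes roughly elevenfold.
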
  • 27. Dealing with high-dimensionality: Regularization methods • Regularization methods overcome the issue • Ridge Regression (RR) [L2]: stable (“easier”) solutions are encouraged by penalizing coefficients (ill-posed problems or over-fitting issues are generally resolved) • Least Absolute Shrinkage and Selection Operator (LASSO) [L1]:
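
To illustrate the L2 penalty of Ridge Regression named on this slide, the sketch below adds a penalty weight `lam` to the diagonal of the 2x2 normal equations. The data are invented (two nearly collinear inputs, one noisy output), and the value `lam = 0.1` is an arbitrary choice for the example, not a recommendation.

```python
def ridge_two_inputs(X, y, lam):
    """Minimize sum((y - b1*x1 - b2*x2)^2) + lam*(b1^2 + b2^2) in closed form."""
    s11 = sum(x1 * x1 for x1, _ in X) + lam  # penalty enters on the diagonal
    s22 = sum(x2 * x2 for _, x2 in X) + lam
    s12 = sum(x1 * x2 for x1, x2 in X)
    t1 = sum(x1 * yi for (x1, _), yi in zip(X, y))
    t2 = sum(x2 * yi for (_, x2), yi in zip(X, y))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * t1 - s12 * t2) / det
    b2 = (s11 * t2 - s12 * t1) / det
    return b1, b2

# Invented data: x2 is almost a copy of x1, and one output carries 0.1 of noise
X = [(1.0, 1.00), (2.0, 2.01), (3.0, 3.00)]
y = [2.00, 4.11, 6.00]

b_ols = ridge_two_inputs(X, y, lam=0.0)    # no penalty: roughly (-9, 11)
b_ridge = ridge_two_inputs(X, y, lam=0.1)  # small penalty: both close to 1
print(b_ols, b_ridge)
```

With `lam = 0` the function reduces to plain least squares and returns large coefficients of opposite sign; a small penalty brings both back near 1, the values that generated the noise-free data.
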
  • 28. Dealing with high-dimensionality: Regularization methods • A penalty on model complexity generally enhances performance • Different behaviour: LASSO provides sparse results! • E.g. Diabetes data: p = 10, n = 367 [4] • Sparsity provides interpretable models • ‘Essentially, all models are wrong, but some are useful’ - George E.P. Box [4] ‘The Elements of Statistical Learning: Data Mining, Inference, and Prediction’ – Hastie, Tibshirani, Friedman 2009
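
The sparsity claim above can be seen in a small coordinate-descent sketch. It assumes the objective (1/(2n))·Σ(y − Xβ)² + λ·Σ|β_j| (one common scaling among several), uses an invented design with orthogonal columns instead of the Diabetes data, and picks `lam = 0.5` arbitrarily.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for (1/(2n))*sum((y - X b)^2) + lam*sum(|b_j|)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for row, yi in zip(X, y):
                partial = sum(row[k] * beta[k] for k in range(p) if k != j)
                rho += row[j] * (yi - partial)
                z += row[j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta

# Invented data: y = 2*x1 + 0.1*x2 + 0*x3 on a design with orthogonal columns
X = [
    [1.0, 1.0, 1.0],
    [1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
y = [2.1, 1.9, -1.9, -2.1]

beta = lasso_cd(X, y, lam=0.5)
print(beta)  # the first coefficient is shrunk to about 1.5, the other two are exactly 0.0
```

The weak and the irrelevant inputs are set exactly to zero, while Ridge Regression would only shrink them. That exact zeroing is what makes LASSO models easier to interpret.
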
  • 29. Regularization methods: guidelines • RR & LASSO: no a-priori guarantee on best prediction accuracy (cross-validation is always a necessary step to evaluate how well the results generalize) • LASSO is generally outperformed by RR when: – p > n – there are high correlations between predictors • Elastic Nets combine the 2 techniques • Kernel Methods – non-linear solutions embedded in a linear framework (augmented space) [Figure from Chris Thornton, U. Sussex]
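
For reference, the estimators named on the regularization slides can be written in their standard textbook form, with X the n×p input matrix and Y the output as defined earlier. The symbols β (coefficients) and λ, λ₁, λ₂ ≥ 0 (penalty weights) are notation introduced here; scaling conventions differ between textbooks and libraries.

```latex
\begin{align*}
\hat{\beta}_{\mathrm{OLS}}   &= \arg\min_{\beta} \; \lVert Y - X\beta \rVert_2^2 \\
\hat{\beta}_{\mathrm{RR}}    &= \arg\min_{\beta} \; \lVert Y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2 \\
\hat{\beta}_{\mathrm{LASSO}} &= \arg\min_{\beta} \; \lVert Y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 \\
\hat{\beta}_{\mathrm{EN}}    &= \arg\min_{\beta} \; \lVert Y - X\beta \rVert_2^2 + \lambda_1 \lVert \beta \rVert_1 + \lambda_2 \lVert \beta \rVert_2^2
\end{align*}
```

The penalty weights are typically chosen by cross-validation, consistent with the guideline above.
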
  • 30. Non-linear Regression: Neural Networks (NNs) • NNs mimic the structure of the brain and how it learns from experience • Example architecture: variables are associated with nodes and functions with arcs [Figure: a single neuron with inputs u1, u2, ..., un, weights w1, w2, ..., wn, bias b, a summation node producing x, activation a(x) and output y]
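
The single-neuron figure on this slide (inputs u1..un, weights w1..wn, bias b, activation a) corresponds to y = a(Σ wi·ui + b). A minimal sketch follows; the use of `tanh` as the activation is an assumption made for the example, since the slide does not name one.

```python
import math

def neuron(u, w, b, activation=math.tanh):
    """Output of one neuron: activation applied to the weighted sum plus bias."""
    x = sum(wi * ui for wi, ui in zip(w, u)) + b
    return activation(x)

# Invented inputs and weights
y_tanh = neuron(u=[1.0, 2.0], w=[0.5, -0.25], b=0.0)  # weighted sum is 0
y_linear = neuron(u=[1.0, 2.0, 3.0], w=[1.0, 1.0, 1.0], b=-1.0,
                  activation=lambda x: x)             # identity activation
print(y_tanh, y_linear)
```

A network stacks many such units in layers; training adjusts the weights and biases to reduce the prediction error.
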
  • 31. Non-linear Regression: Neural Networks (NNs) • PROS: - Great prediction accuracy - Flexibility in modeling non-linearities • CONS: - Time-consuming tuning - Not suitable for high-dimensional problems • In case of high dimensionality, a 2-step procedure is applied: 1. Dimensionality reduction (correlation, PCA, etc.) 2. Modeling • Tip 3 - The choice between linear and non-linear approaches should be tailored to the problem at hand
  • 32. [B] Regression – Root Cause Analysis (RCA) • Pharmaceutical Manufacturing • Slow-Release (Time Release) technologies: capsules that dissolve over time for a controlled release of the drug into the bloodstream • Dissolution profiles (y1,2,3,4) over different time intervals (T1, T2, T3, T4) are required to fall within given intervals • Variability in the production: where does it come from? Root Cause Analysis
  • 33. [B] Regression – Root Cause Analysis (RCA) • Production involves several steps and can be influenced by many factors (e.g. raw materials quality) • All the available data sources are exploited for modeling the dissolution curves (y1,2,3,4) • Modeling with sparse approaches to pinpoint the parameters most influential for variability [Figure: Process #1 to Process #4 with inputs x0, x1, x2, x3, x4 and output y1,2,3,4; bar chart ‘RCA’ over X1, X2, X3, X4]
  • 34. [C] Classification – Predictive Maintenance (PdM) • Data analytics enables more sophisticated approaches to maintenance management • 3 groups of approaches in manufacturing for dealing with maintenance (R2F, PvM, PdM): 1. Run-to-Failure (R2F) • Repair or restore actions are performed only after the occurrence of a failure • ‘If it’s not broken, don’t fix it’
  • 35. [C] Classification – Predictive Maintenance (PdM) • 2. Preventive Maintenance (PvM) • Planned schedule of maintenance actions with the aim of anticipating failures • Failures are generally warded off • Unnecessary maintenance actions are performed
  • 36. [C] Classification – Predictive Maintenance (PdM) • 3. Predictive Maintenance (PdM) • Maintenance actions based on suggestions provided by a data analytics module • PdM module based on data available on the tool/production
  • 37. [C] Classification – Predictive Maintenance (PdM) • Semiconductor Manufacturing • Forecast of integral-type faults (caused by machine usage) • Use case: breaking of the tungsten filament in ion implanters • Goal: define an indicator (y) – health factor – of the current component status from process parameters (X)
  • 38. [C] Classification – Predictive Maintenance (PdM) • The health factor indicator is a quantitative index; however, we treat this as a Classification problem • Observations divided into: o ‘Non-Faulty’ (data of process iterations with working component) o ‘Faulty’ (data of process iterations with broken component) • Use of Support Vector Machines: the distance from the decision boundary is exploited as ‘distance to fail’ [Figure: decision boundary, adapted from [4]]
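
The 'distance to fail' idea above can be sketched for a linear decision boundary w·x + b = 0, where the signed distance of an observation is (w·x + b)/‖w‖. In practice w and b would come from a trained Support Vector Machine; here they are hypothetical values chosen for the example, as are the process readings.

```python
import math

def signed_distance(x, w, b):
    """Signed distance of point x from the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

# Hypothetical boundary (not learned from data): the positive side is 'Non-Faulty'
w = [3.0, 4.0]
b = -10.0

# Invented process parameters from three successive iterations of the same component
runs = [[6.0, 8.0], [4.0, 3.0], [2.0, 1.0]]
health = [signed_distance(x, w, b) for x in runs]
print(health)  # decreasing values: the component drifts toward the 'Faulty' side
```

A shrinking positive distance means the observations are approaching the boundary, which is what lets a classifier output be read as a continuous health factor.
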
  • 43. [C] Classification – Predictive Maintenance (PdM) • Trigger of maintenance action • Maintenance management performance indicators: - Unexpected Breaks N_UB (associated cost C_UB) - Unexploited Lifetime N_UL (associated cost C_UL) [Figure: health factor over time against a threshold, with Unexploited Lifetime and Unexpected Breaks marked]
  • 44. [C] Classification – Predictive Maintenance (PdM) • Minimization of the overall costs • Decision Support System: from process data and production/maintenance costs, the PdM module suggests when actions should be taken to minimize costs
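
The cost trade-off above can be sketched as a search over candidate thresholds that minimizes N_UB·C_UB + N_UL·C_UL on historical run-to-failure data. The trajectories, costs and candidate thresholds below are invented, and the triggering rule (maintenance right after the first iteration at or below the threshold) is an assumption made for the example.

```python
def count_events(threshold, trajectories):
    """Return (n_ub, n_ul) for one threshold over historical run-to-failure data.

    Each trajectory lists the health factor at every process iteration up to
    the failure. Maintenance is assumed to be triggered right after the first
    iteration whose health factor is at or below the threshold.
    """
    n_ub = 0  # unexpected breaks: threshold never reached before the failure
    n_ul = 0  # unexploited lifetime: iterations that could still have been run
    for traj in trajectories:
        for i, h in enumerate(traj):
            if h <= threshold:
                n_ul += len(traj) - 1 - i
                break
        else:
            n_ub += 1
    return n_ub, n_ul

def total_cost(threshold, trajectories, c_ub, c_ul):
    n_ub, n_ul = count_events(threshold, trajectories)
    return n_ub * c_ub + n_ul * c_ul

# Invented health-factor histories of three components, each ending at failure
trajectories = [
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [5.0, 3.0, 2.5, 0.5],
    [4.0, 4.0, 1.5],
]
C_UB, C_UL = 10.0, 1.0  # hypothetical costs per break and per unused iteration
candidates = [0.4, 1.0, 2.0, 3.0]

best_threshold = min(candidates, key=lambda t: total_cost(t, trajectories, C_UB, C_UL))
best_cost = total_cost(best_threshold, trajectories, C_UB, C_UL)
print(best_threshold, best_cost)
```

A low threshold misses failures and a high one wastes component lifetime; with these numbers the minimum cost is reached at a threshold of 2.0.
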
  • 45. [D] Unsupervised Learning – Fault Detection • Two classes of failure-related problems 1) Prediction (breakages in the future) 2) Detection (breakages that have already happened) • With thousands of variables, detecting a breakage is not always a trivial task • Univariate monitoring can be misleading • Tip 4 - Multivariate systems need multivariate approaches
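
Why univariate monitoring can mislead is easy to show with two correlated variables. The sketch below uses the squared Mahalanobis distance, a standard multivariate measure that the slides do not name, so its use here is an assumption. All readings are invented.

```python
import math

def mean(values):
    return sum(values) / len(values)

def fit_2d(x1, x2):
    """Sample means and covariance entries (s11, s12, s22) of two variables."""
    m1, m2 = mean(x1), mean(x2)
    n = len(x1)
    s11 = sum((a - m1) ** 2 for a in x1) / (n - 1)
    s22 = sum((c - m2) ** 2 for c in x2) / (n - 1)
    s12 = sum((a - m1) * (c - m2) for a, c in zip(x1, x2)) / (n - 1)
    return (m1, m2), (s11, s12, s22)

def mahalanobis_sq(point, center, cov):
    """Squared Mahalanobis distance in 2D, using the explicit 2x2 inverse."""
    d1, d2 = point[0] - center[0], point[1] - center[1]
    s11, s12, s22 = cov
    det = s11 * s22 - s12 * s12
    return (d1 * d1 * s22 - 2.0 * d1 * d2 * s12 + d2 * d2 * s11) / det

# Invented normal-operation data: the two variables move together
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.8, 5.0]
center, cov = fit_2d(x1, x2)

suspect = (2.0, 4.0)  # each value is ordinary on its own, the combination is not
normal = (4.0, 4.0)

# Univariate view: both z-scores of the suspect point are small
z_scores = [(suspect[0] - center[0]) / math.sqrt(cov[0]),
            (suspect[1] - center[1]) / math.sqrt(cov[2])]

d2_suspect = mahalanobis_sq(suspect, center, cov)
d2_normal = mahalanobis_sq(normal, center, cov)
print(z_scores, d2_suspect, d2_normal)
```

Both z-scores of the suspect point stay below 1, so per-variable control limits would not react, while its multivariate distance is several hundred times larger than that of the normal point because it breaks the correlation between the two variables.
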
  • 46. [D] Unsupervised Learning – Fault Detection • Usage 1. Issue recognized by the system 2. Drill-down to the ‘guilty’ parameter(s) 3. Inspection of the original data