How to analyse multivariate data to monitor a production


Steering production processes is a delicate discipline that affects not only the quality of the production but also its efficiency. In many processes, numerous factors interact to reach the objective, and their interactions are hard to understand when the parameters are examined one at a time. This talk proposes an original approach that takes the multivariate nature of the problem into account ...


DATA FRAMEWORKS IN METROLOGY
How to analyse multivariate data to monitor a
production
PEGGY COURTOIS
Data-scientist, Deltamu
France

HOW TO ANALYSE
MULTIVARIATE DATA TO
MONITOR A PRODUCTION
P. Courtois, C. Dubois, J.M. Pou

INDUSTRIAL DATA
New 21st century resource

Industrial geotextile
• Types:
• Woven fabric, non-woven fabric, knitted fabric
• Functions:
• Filtration, separation, drainage, waterproofing, etc.
• Different uses:
• Roadworks, railway work, agriculture, drainage, coastal, fluvial …
Source: https://book4yours.blogspot.com/

INDUSTRIAL DATA
Control Charts

CORRELATED DATA
Covariance Matrix

CORRELATED DATA
2D Example
Classical control chart:
→ Analyses one parameter at a time
→ Defines a large monitoring area
→ Takes time to identify an anomaly
⇒ Underperforming

CORRELATED DATA
2D Example
Classical control chart:
→ Analyses one parameter at a time
→ Defines a small monitoring area
→ Too many false alarms
⇒ Not appropriate
⇒ Forces production to exceed the required quality

CORRELATED DATA
2D Example
Multivariate control chart:
→ Analyses all the parameters at once
→ Defines an appropriate monitoring area
⇒ Appropriate

MULTIVARIATE SPC
Mahalanobis Distance
Source: Woźniak et al, 2019
Based on a representative dataset:
→ Distance between a point and a distribution
→ Points 1 and 2 have the same distance
⇒ Gives a multidimensional distance
⇒ Alarm if the distance is too large
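
The distance between a point and a distribution can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy is available; the function name `mahalanobis_distance` and the simulated reference data are hypothetical and do not come from the talk.

```python
import numpy as np

def mahalanobis_distance(point, reference):
    """Distance between one point and the distribution of a reference dataset.

    `reference` has shape (n_samples, n_variables). The name and signature
    are illustrative assumptions, not taken from the presentation.
    """
    reference = np.asarray(reference, dtype=float)
    mean = reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)      # sample covariance matrix
    diff = np.asarray(point, dtype=float) - mean
    # Solve cov @ x = diff instead of forming the explicit inverse.
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))

# Hypothetical in-control data: two positively correlated variables.
rng = np.random.default_rng(0)
reference = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=500)

along = mahalanobis_distance([2.0, 2.0], reference)     # follows the correlation
against = mahalanobis_distance([2.0, -2.0], reference)  # goes against it
print(along, against)
```

Both test points are equally far from the centre in the Euclidean sense, but the one that goes against the correlation gets a much larger Mahalanobis distance, which is what would trigger the alarm.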

MULTIVARIATE SPC
Principal Component Analysis
Source: Woźniak et al, 2019
Based on a representative dataset:
→ Expresses the data along the axes with the most variation (Principal Components)
→ Reduces the number of dimensions
⇒ Gives the axes with the most variation
⇒ Identifies the axis of the shift
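
One common way to obtain the principal components is an eigendecomposition of the covariance matrix. The sketch below assumes NumPy and simulated two-variable data; the helper name `principal_components` is hypothetical and the talk does not state which implementation was used.

```python
import numpy as np

def principal_components(reference):
    """Return (mean, variances, axes), sorted by decreasing variance.

    The columns of `axes` are the principal directions. Hypothetical helper,
    not taken from the presentation.
    """
    reference = np.asarray(reference, dtype=float)
    mean = reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)
    variances, axes = np.linalg.eigh(cov)       # eigh returns ascending order
    order = np.argsort(variances)[::-1]
    return mean, variances[order], axes[:, order]

# Simulated in-control data: two positively correlated variables.
rng = np.random.default_rng(1)
reference = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=500)

mean, variances, axes = principal_components(reference)
scores = (reference - mean) @ axes              # data expressed along the new axes
explained = variances / variances.sum()         # share of variation per axis
print(explained)
```

In the new coordinate system the scores are uncorrelated, and `explained` tells how many axes are worth keeping when the number of dimensions has to be reduced.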

MULTIVARIATE SPC
Mahalanobis Distance: 2D Example

MULTIVARIATE SPC
Principal Component Analysis: 2D Example

MULTIVARIATE SPC
Geotextile Measurement: Mahalanobis Distance, 9 Dimensions

MULTIVARIATE SPC
Geotextile Measurement

CONCLUSION
Multivariate SPC
Advantages of multivariate control charts:
+ Can be applied to any complex field
+ Take into account all the characteristics of the measurement
+ Give control charts that reflect reality

FUTURE WORK
Uncertainties
• Include the uncertainty in the multivariate calculations
• Bayesian Measurement Refinement
→ Based on conditional probabilities
→ JCGM 106:2012 (or ISO/IEC Guide 98-4)

References
• Woźniak, M., Gałązka-Friedman, J., Duda, P., Jakubowska, M., Rzepecka, P. and Karwowski, Ł. (2019). Application of Mössbauer spectroscopy, multidimensional discriminant analysis, and Mahalanobis distance for classification of equilibrated ordinary chondrites. Meteorit Planet Sci, 54: 1828-1839. https://doi.org/10.1111/maps.13314
• JCGM 106:2012. Evaluation of measurement data – The role of measurement uncertainty in conformity assessment.
• Saporta, G. (2011). Probabilités, analyse des données et Statistique.

THANK YOU
Questions?
Peggy Courtois, Christophe Dubois, Jean-Michel Pou



Editor's Notes

SPEAKER INTRODUCTION SLIDE - INTRO
Hello everyone, and thank you very much for listening. My presentation is about analysing multivariate data to monitor a production.
Before going further, I would like to give you a bit of context. We all know that the 21st century is highly influenced by data. Multibillion-dollar companies such as Google or Amazon have used data to analyse our needs and create new ones. The GAFAs are not the only ones using data: insurance companies also use large atmospheric and oceanographic datasets to evaluate the risk of floods and storms and to calculate their clients' premiums. Analysing data is crucial to understanding any behaviour or phenomenon. Industrial companies are starting to acknowledge this new resource. For an industrial company, analysing data is useful to characterise an instrument, to monitor a production, and to predict a shift before it happens.
So here at Deltamu we have been working with a geotextile manufacturer to analyse their production. Geotextiles come in different types: woven, non-woven, or even knitted. They are used for different purposes, such as separation, filtration or drainage, and in various fields such as roadworks, agriculture and so on. To characterise a geotextile, we need a set of measurements which are more or less correlated with one another.
In this picture we see the set of measurements (9 in total). Usually, these data are monitored independently. Here you have an example of monitoring the tensile strength with a control chart based on the average and the standard deviation. Unfortunately, this approach does not consider the measurements as a complex set of correlated variables. In this presentation, we will tackle this issue and present different mathematical tools which can be used. These tools are not new and are heavily used in other fields (psychology, finance, computing, …), but they are not yet widely used in industry.
For those who are not familiar with correlated data, here you have an example of two correlated measurements: the thickness of the geotextile and the NF pyramidal punching resistance. The punching test assesses the robustness of the material during its use. It makes sense that the thicker the geotextile is, the more robust the material will be. This correlation between the two measurements is captured in what we call a covariance matrix. To keep it simple, a covariance matrix is just a table showing how strongly each pair of measurements is linked. Once normalised, it gives the correlation: if two measurements are independent, the correlation will be 0; if the correlation is close to 1 or -1, the two measurements are highly correlated.
Pyramidal punch, NF G 38 019: determination of the puncture resistance. Objective: assess the stresses the geotextile undergoes during installation or in service. Method: determine the force needed for a pyramidal punch to pass through a geotextile specimen, perpendicular to the plane of the product. The method is the same as for the standard "NF EN ISO 12236: static puncture test (CBR test)". The unit of measurement is the kN.
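To make the covariance matrix concrete, here is a small Python sketch assuming NumPy. The thickness and punching values are made up for illustration and are not data from the talk.

```python
import numpy as np

# Made-up measurements for illustration only (not data from the talk):
# geotextile thickness in mm and pyramidal punching resistance in kN.
thickness = np.array([2.1, 2.4, 2.2, 2.8, 3.0, 2.6, 2.3, 2.9])
punching = np.array([1.0, 1.3, 1.1, 1.6, 1.8, 1.4, 1.2, 1.7])

data = np.column_stack([thickness, punching])

covariance = np.cov(data, rowvar=False)        # 2x2, in the variables' units
correlation = np.corrcoef(data, rowvar=False)  # 2x2, unitless, between -1 and 1

print(covariance)
print(correlation)
```

The diagonal of the covariance matrix holds the variance of each measurement, and the off-diagonal term holds their covariance. The correlation matrix is the same table rescaled so that the off-diagonal term lies between -1 and 1.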
Now let's see what happens when we use basic control charts. What follows is just a 2D example to illustrate the problem; the real benefit comes with more than two variables. (Presentation of the plot, showing the correlation.) By monitoring these two variables independently, we define a lower and an upper limit for each variable, resulting in the area in blue, which is far too large compared to the real data. Because of the correlation, it is unlikely that we will have data in the top left corner, and the current control charts will not be able to detect such an anomaly.
Conversely, we may be tempted to shrink this area in order to catch any anomaly in our production. This is again not appropriate, as it gives a large number of false alarms and we can no longer distinguish anomalies from normal data. We also over-perform, as many measurements will be incorrectly declared non-compliant.
A way to tackle this issue is to consider the covariance of the data. By studying the density of the data, we can identify an area where the data are well represented, as shown by the ellipse on the graph. Any data point falling outside this ellipse will be considered an anomaly. We have two different tools to characterise this anomaly. First, we use the Mahalanobis distance to raise an alert when a measurement is outside the ellipse. Second, we use PCA to identify which variable is affected by the anomaly.
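The difference between the rectangular area and the ellipse can be checked numerically. The sketch below assumes NumPy, simulated data with a correlation of 0.9, and roughly Gaussian behaviour; the helper names and the 0.0027 alarm rate (mirroring a 3-sigma limit) are illustrative choices, not values from the talk.

```python
import math
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical in-control data: two strongly correlated variables.
reference = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.9], [0.9, 1.0]], size=1000)

mean = reference.mean(axis=0)
std = reference.std(axis=0, ddof=1)
cov = np.cov(reference, rowvar=False)

def inside_box(point, k=3.0):
    """Classical check: each variable within mean +/- k standard deviations."""
    return bool(np.all(np.abs(point - mean) <= k * std))

def inside_ellipse(point, alpha=0.0027):
    """Multivariate check on the squared Mahalanobis distance.

    For 2 variables and roughly Gaussian data, the chi-square quantile has
    the closed form -2 * ln(alpha).
    """
    diff = point - mean
    d2 = float(diff @ np.linalg.solve(cov, diff))
    return d2 <= -2.0 * math.log(alpha)

corner = np.array([-2.0, 2.0])   # "top left corner": low first variable, high second
print(inside_box(corner), inside_ellipse(corner))
```

The corner point passes both univariate limits but falls outside the ellipse, so only the multivariate check flags it.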
The Mahalanobis distance is just like any other distance, except that it takes into account the correlation between the data. For example, points 1 and 2 have the same distance because they lie on the same density contour, which is not the case for point 3. If the data were independent with equal variances, we would have a circle instead of an ellipse, and the Mahalanobis distance would be the ordinary Euclidean distance (up to a scale factor).
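Written out, with notation that is assumed here rather than taken from the slides (x is a measurement vector, μ the mean of the reference dataset, Σ its covariance matrix), the definition and its special case for independent variables of equal variance σ² read:

```latex
D_M(x) = \sqrt{(x - \mu)^{\top} \Sigma^{-1} (x - \mu)}
\qquad\text{and, if } \Sigma = \sigma^{2} I:\qquad
D_M(x) = \frac{\lVert x - \mu \rVert}{\sigma}
```

In the special case the contours of equal distance are circles, and the distance is the Euclidean one divided by σ.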
As I said earlier, we use PCA to tell us which variables are affected by the anomaly. In a few words, PCA is a way to identify the axes along which the data vary the most. Once the axes are identified, we express the data in this new coordinate system.
As an example, to present this mathematical approach, we analysed the two parameters introduced earlier and simulated wear of the pyramidal punch, which gives us these data. (Description of the plot.)
Now that we have been alerted to an anomaly, we would like to know which parameters are affected by it. PCA is able to identify the axis along which the anomaly takes place. In 2D, this analysis is of little interest.
As I said earlier, this 2D example was only meant to give a clearer picture of the approach. What is more relevant for us is to work with the entire set of variables, in our case the 9 variables I showed you previously. The Mahalanobis distance is more difficult to visualise in 9 dimensions. The best way is to use two types of figures: one showing the evolution of the distance, and one showing the histogram, that is, the distribution of the squared Mahalanobis distance, which follows a chi-square distribution with 9 degrees of freedom if the data are Gaussian. At that stage, we are informed that an anomaly is under way. What we now want to know is the direction of this anomaly: which variables are affected by it.
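An alarm limit for the 9-dimensional case can be derived from the chi-square distribution. The sketch below assumes NumPy and simulated stand-in data (not the geotextile measurements). To stay free of extra dependencies it uses the Wilson-Hilferty approximation of the chi-square quantile, a technique swapped in here for illustration; a statistics library would give the exact value. The 0.9973 level is likewise an assumed choice.

```python
from statistics import NormalDist
import numpy as np

def chi2_quantile_approx(p, dof):
    """Wilson-Hilferty approximation of the chi-square quantile.

    An approximation chosen to stay dependency-free; a statistics library
    would give the exact value.
    """
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * c ** 0.5) ** 3

n_vars = 9
rng = np.random.default_rng(3)
# Hypothetical stand-in for the 9 measurements (independent standard
# normal variables here, purely for simplicity).
reference = rng.standard_normal((2000, n_vars))

mean = reference.mean(axis=0)
cov = np.cov(reference, rowvar=False)

def squared_distance(samples):
    """Squared Mahalanobis distance of each row of `samples`."""
    diff = np.atleast_2d(samples) - mean
    return np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)

limit = chi2_quantile_approx(0.9973, n_vars)     # alarm limit on the squared distance
d2 = squared_distance(reference)
false_alarm_rate = float(np.mean(d2 > limit))

shifted = np.full(n_vars, 2.5)                   # simulated drift on every variable
print(limit, false_alarm_rate, squared_distance(shifted)[0] > limit)
```

Plotting `d2` in sequence gives the first type of figure, and its histogram gives the second; points above `limit` are the alarms.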
Once again, the representation is more difficult in 9 dimensions. What we do is look at the principal components, which are the major axes of variation. Here we can see 9 principal components, and let's say that 3 or 4 major axes represent 75% of the total variation. By looking at the projection of the anomaly onto the principal components, we can identify the variables affected.
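One possible way to go from the principal components back to the affected variables is sketched below. It assumes NumPy and simulated 9-variable data driven by 3 hidden factors; the per-variable decomposition shown is an illustrative choice, not necessarily the method used by the authors.

```python
import numpy as np

n_vars = 9
rng = np.random.default_rng(4)

# Hypothetical reference data: 9 variables driven by 3 hidden factors plus
# noise, so that a few principal components carry most of the variation.
mixing = rng.standard_normal((3, n_vars))
reference = (rng.standard_normal((2000, 3)) @ mixing
             + 0.3 * rng.standard_normal((2000, n_vars)))

mean = reference.mean(axis=0)
variances, axes = np.linalg.eigh(np.cov(reference, rowvar=False))
order = np.argsort(variances)[::-1]
variances, axes = variances[order], axes[:, order]

cumulative = np.cumsum(variances) / variances.sum()
n_major = int(np.searchsorted(cumulative, 0.75) + 1)   # components needed for 75 %

# Simulated anomaly: only the variable at index 4 drifts.
anomaly = mean.copy()
anomaly[4] += 5.0 * reference[:, 4].std(ddof=1)
diff = anomaly - mean

scores = diff @ axes                             # projection on the principal components
standardised = scores / np.sqrt(variances)       # squares sum to the squared distance

# Map the scores back to the original variables: these per-variable terms
# also sum to the squared Mahalanobis distance.
per_variable = (axes @ (scores / variances)) * diff
print(n_major, int(np.argmax(np.abs(per_variable))))
```

Large entries in `standardised` show which principal components the anomaly hits, and the largest term in `per_variable` points to the variable that drifted.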
