Taris alessandra presentazione engl
1. University of Cagliari
Master of Science in Chemical and Process Engineering
Statistical control of FTIR measurements in commercial detergent production
Supervisor:
Ing. Massimiliano GROSSO
Co-supervisor:
Ing. Vincenzo GUIDA
Student:
Alessandra TARIS
Scientific committee:
Prof. Ing. Roberto BARATTI
in collaboration with Procter & Gamble
2011-2012
2. Focus on surface detergents
Complex formulations containing:
• potassium hydroxide
• surfactants (anionic, amphoteric, nonionic)
• chelating agents
• sodium carbonate
• perfume
• ethanol
• fatty acids
• polymers
• etc.
Aim: ensure a standard quality of detergents
3. Steps in liquid detergent production:
• ingredient mixing
• packaging
• quality control
Problems:
• Interpretation and manipulation of the collected process variables may be difficult
• Online quality control is not always feasible
• Analytical techniques are slow (e.g. concentration measurements)
4. Experimental campaign (P&G, Brussels)
FTIR spectroscopy: fast analytical technique, can be used online
Process deviations are due to variations in detergent composition
Deviations reproduced using a set of 142 detergent samples
Joint variation of 11 experimental conditions (compound concentrations)
Sample FTIR spectra
[Figure: sample FTIR spectra, absorbance vs. wavenumber (cm^-1)]
$$Y_{N \times P} = \begin{pmatrix} y_{11} & y_{12} & \cdots & y_{1P} \\ y_{21} & y_{22} & \cdots & y_{2P} \\ \vdots & \vdots & & \vdots \\ y_{N1} & y_{N2} & \cdots & y_{NP} \end{pmatrix}, \qquad N = 142,\ P = 1738$$
5. Sample FTIR spectra
142 spectra
1738 absorbances for each spectrum
Deviations are reflected in the spectra
[Figure: sample FTIR spectra, absorbance vs. wavenumber (cm^-1)]
Problem:
How can we identify differences between samples using spectral analysis?
Thesis aims:
1. Development of methods for statistical control of experimental measurements (spectra) using multivariate statistical techniques (to be implemented online in the future)
2. Detection of the compounds that significantly affect the spectra
6. PCA goals: data compression, information extraction
Original variables: high-dimensional, extremely correlated
Principal components (PC): fewer, independent
Example: bidimensional case study (x1-x2 set)
• PC1: greatest variance
• PC2: residual variance
• PC1 and PC2 independent (orthogonal)
[Figure: data cloud in the x1-x2 plane with the PC1 and PC2 axes]
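The bidimensional example can be sketched numerically. The block below is only an illustration on synthetic data (the data set, seed and variable names are assumptions, not taken from the thesis): it extracts the principal directions from the covariance matrix and checks that PC1 carries most of the variance and that PC1 and PC2 are orthogonal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, strongly correlated 2-D data set (x1, x2); not thesis data.
x1 = rng.normal(size=200)
x2 = 0.8 * x1 + 0.2 * rng.normal(size=200)
X = np.column_stack([x1, x2])
Xc = X - X.mean(axis=0)                      # mean-centre each variable

# Principal directions = eigenvectors of the covariance matrix.
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]            # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pc1, pc2 = eigvecs[:, 0], eigvecs[:, 1]
explained = eigvals / eigvals.sum()

print("PC1 direction:", pc1)
print("explained variance ratio:", explained)
print("PC1 . PC2 =", pc1 @ pc2)              # close to zero: orthogonal axes
```

With data this strongly correlated, almost all of the variance falls on PC1, which is why a one-component model is enough in the 2-D example.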
7. Data coordinates in the new space: scores (T)
Score 1 (t1): projections on the PC1 direction
Score 2 (t2): projections on the PC2 direction
[Figure: data in the x1-x2 plane projected on the PC1 and PC2 axes]
Scores variance: $S_{score1} \gg S_{score2}$
PCA model: only one principal component (PC1)
8. Identification of out-of-control samples using the Q and T2 statistics
Bidimensional case study: 2 samples supposed to be out-of-control
Q and T2 geometric interpretation
[Figure: x1-x2 plane with the PC1 line and its origin O′; one sample with low T2 and high Q, one sample with high T2 and low Q]
Q statistic: measures the distance of the sample from the PCA model (that is, from its orthogonal projection on the PC1 line)
Hotelling T2: measures the distance from O′ within the PCA model
If $T^2 > T^2_{lim}$ or $Q > Q_{lim}$, the sample is out-of-control
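The geometric meaning of the two statistics can be checked with a short sketch. It assumes synthetic 2-D data and a one-component PCA model; the helper names `fit_pca` and `t2_and_q` are hypothetical and are not from the thesis. One test point lies far along PC1 (high T2, low Q), the other lies off the PC1 line near the centre (low T2, high Q).

```python
import numpy as np

def fit_pca(X, n_components):
    """Return mean, loadings (J x A) and score variances of a PCA model."""
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    loadings = Vt[:n_components].T
    score_var = s[:n_components] ** 2 / (X.shape[0] - 1)
    return mean, loadings, score_var

def t2_and_q(x, mean, loadings, score_var):
    """Hotelling T2 (distance within the model) and Q (distance from it)."""
    xc = x - mean
    t = xc @ loadings                      # scores of the sample
    residual = xc - loadings @ t           # part not explained by the model
    return float(np.sum(t**2 / score_var)), float(residual @ residual)

# Hypothetical training data lying close to the line x2 = x1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
X = np.column_stack([x1, x1 + 0.1 * rng.normal(size=200)])
model = fit_pca(X, n_components=1)

t2_a, q_a = t2_and_q(np.array([5.0, 5.0]), *model)    # far along PC1
t2_b, q_b = t2_and_q(np.array([1.0, -1.0]), *model)   # off the PC1 line

print(f"sample a: T2 = {t2_a:.2f}, Q = {q_a:.4f}")
print(f"sample b: T2 = {t2_b:.2f}, Q = {q_b:.4f}")
```

Neither statistic alone flags both samples, which is why the chart combines the two limits.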
9. Multivariate data: number of variables >> 2
1) Components decomposition
$Y_{N \times J} = T_{N \times J}\, P_{J \times J}^{\mathsf T}$, with N = 142, J = 1738
Y: original experimental measurements
T: scores matrix (new coordinates)
P: loadings matrix (space rotation)
2) PCA model
$\hat{Y} = T_A P_A^{\mathsf T} \approx Y$
How many principal components A?
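As a rough illustration of the two steps, the sketch below decomposes a small synthetic matrix through the singular value decomposition and then truncates it to A components. The sizes, the seed and the three-factor data are assumptions chosen to keep the example small; they stand in for N = 142 and J = 1738.

```python
import numpy as np

rng = np.random.default_rng(1)
N, J, A = 20, 50, 3        # small stand-ins for N = 142, J = 1738

# Hypothetical data: 3 latent factors plus a little noise.
Y = rng.normal(size=(N, A)) @ rng.normal(size=(A, J))
Y = Y + 0.01 * rng.normal(size=(N, J))
Yc = Y - Y.mean(axis=0)    # PCA works on mean-centred data

# 1) Full decomposition Yc = T P^T (SVD gives orthonormal loadings).
#    With N < J the economy SVD keeps only the N directions that can
#    carry variance; the remaining loadings would have zero scores.
_, s, Vt = np.linalg.svd(Yc, full_matrices=False)
P = Vt.T                   # loadings, one column per component
T = Yc @ P                 # scores = coordinates in the rotated space
full = T @ P.T

# 2) PCA model: keep only the first A components.
Y_hat = T[:, :A] @ P[:, :A].T
rel_err = np.linalg.norm(Yc - Y_hat) / np.linalg.norm(Yc)

print("exact reconstruction:", np.allclose(full, Yc))
print("relative error with A components:", rel_err)
```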
10. Cumulative variance explained by principal components
[Figure: cumulative explained variance (%) vs. number of principal components; 95% explained variance is reached with 16 components]
1738 original variables (absorbances) → 16 principal components
Spectra can be well characterized using 16 PCs
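The choice of the number of components from the cumulative explained variance can be sketched as follows. The function name `components_for_variance` and the synthetic data are assumptions for illustration; only the 95% target comes from the slide.

```python
import numpy as np

def components_for_variance(Y, target=0.95):
    """Smallest number of PCs whose cumulative explained variance >= target."""
    Yc = Y - Y.mean(axis=0)
    s = np.linalg.svd(Yc, compute_uv=False)      # singular values only
    cumulative = np.cumsum(s**2) / np.sum(s**2)  # s**2 is proportional to variance
    n = int(np.searchsorted(cumulative, target)) + 1
    return n, cumulative

# Hypothetical data with known variances 60, 25, 12, 2, 1 (sum = 100).
rng = np.random.default_rng(2)
variances = np.array([60.0, 25.0, 12.0, 2.0, 1.0])
Y = rng.normal(size=(2000, 5)) * np.sqrt(variances)

n, cumulative = components_for_variance(Y)
print("components needed:", n)
print("cumulative explained variance:", np.round(cumulative, 3))
```

Applied to the 142 x 1738 spectra matrix, the same rule is what yields the 16 components reported on the slide.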
11. Synthetic chart, easy interpretation
Determination of the rectangular region in which in-control samples have to fall
T2 and Q limits (95% confidence level):
• $T^2_{lim} = 31.13$ (MacGregor, 1995)
• $Q_{lim} = 109.6$ (Jackson, 1979)
Determination of the T2 and Q statistics for each spectrum (Jackson, 1991)
[Figure: Q vs. T2 control chart with the T2lim and Qlim limits]
Auto-validation: false-positive samples
New definition of the normal operating region
Ellipse: joint region of a multivariate Gaussian distribution (limits more selective for outliers)
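12. 1. Gaussian test for T2 and Q:
• Q can be approximated as Gaussian
• T2 is not Gaussian
2. Nonlinear transformation: T2 → T2bx
3. Confidence ellipse equation:
$$\begin{pmatrix} x - \bar{x} & y - \bar{y} \end{pmatrix} V^{-1} \begin{pmatrix} x - \bar{x} \\ y - \bar{y} \end{pmatrix} = \text{const}$$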
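A possible reading of these three steps is sketched below. Several points are assumptions rather than statements from the slides: that "bx" denotes a Box-Cox transformation, the value of the exponent `lam`, the synthetic T2 and Q values, and the use of the chi-square quantile with 2 degrees of freedom (which has the closed form -2 ln(1 - p)) as the ellipse constant.

```python
import numpy as np

def box_cox(x, lam):
    """Box-Cox transform; assumed here to be the 'bx' transformation."""
    x = np.asarray(x, dtype=float)
    return np.log(x) if lam == 0 else (x**lam - 1.0) / lam

def ellipse_limit(p):
    """Chi-square quantile with 2 degrees of freedom (closed form)."""
    return -2.0 * np.log(1.0 - p)

def inside_ellipse(points, center, V, p=0.95):
    """True where (z - center)^T V^-1 (z - center) <= limit."""
    d = np.atleast_2d(points) - center
    d2 = np.einsum("ij,jk,ik->i", d, np.linalg.inv(V), d)
    return d2 <= ellipse_limit(p)

# Hypothetical training statistics (not thesis data).
rng = np.random.default_rng(4)
t2 = rng.chisquare(16, size=142)             # skewed, positive, like T2
q = rng.normal(80.0, 15.0, size=142)         # roughly Gaussian, like Q

Z = np.column_stack([box_cox(t2, lam=0.3), q])   # lam = 0.3 is illustrative
center, V = Z.mean(axis=0), np.cov(Z, rowvar=False)
inside = inside_ellipse(Z, center, V, p=0.95)

print("ellipse constant at 95%:", ellipse_limit(0.95))
print("fraction of training points inside:", inside.mean())
```

In practice the exponent would be chosen so that the transformed T2 passes the Gaussian test; the fixed value here only keeps the sketch short.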
13. New control region limits
Confidence limits:
• 95% and 99% (red)
• 100th percentile (green)
[Figure: Q vs. T2bx with the confidence ellipses]
14. Statistical control simulation: identification of out-of-control spectra
1. Load the FTIR spectrum
2. Project it on the PCA model (developed on the training set)
3. Calculate the T2bx and Q statistics
4. Apply statistical control using the joint confidence region calculated on the training set
[Figure: joint confidence region in the Q vs. T2bx plane]
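The monitoring steps can be strung together in a small class. This is a sketch under stated assumptions, not the thesis implementation: `SpectrumMonitor` is a hypothetical name, the "spectra" are synthetic Gaussian bands, the T2 transformation is assumed to be Box-Cox with an illustrative exponent, and the ellipse constant is the chi-square quantile with 2 degrees of freedom.

```python
import numpy as np

class SpectrumMonitor:
    """PCA-based monitor using a joint (T2bx, Q) confidence ellipse."""

    def __init__(self, n_components, lam=0.5, p=0.99):
        self.A, self.lam, self.p = n_components, lam, p

    def _stats(self, Y):
        Yc = np.atleast_2d(Y) - self.mean
        T = Yc @ self.P                          # projection on the PCA model
        E = Yc - T @ self.P.T                    # residual spectrum
        t2 = np.sum(T**2 / self.var, axis=1)
        t2bx = (t2**self.lam - 1.0) / self.lam   # Box-Cox, assumes lam != 0
        q = np.sum(E**2, axis=1)
        return np.column_stack([t2bx, q])

    def fit(self, Y_train):
        self.mean = Y_train.mean(axis=0)
        _, s, Vt = np.linalg.svd(Y_train - self.mean, full_matrices=False)
        self.P = Vt[: self.A].T
        self.var = s[: self.A] ** 2 / (Y_train.shape[0] - 1)
        Z = self._stats(Y_train)                 # training statistics
        self.center = Z.mean(axis=0)
        self.Vinv = np.linalg.inv(np.cov(Z, rowvar=False))
        return self

    def in_control(self, Y):
        d = self._stats(Y) - self.center
        d2 = np.einsum("ij,jk,ik->i", d, self.Vinv, d)
        return d2 <= -2.0 * np.log(1.0 - self.p)  # chi-square(2) quantile

# Hypothetical training "spectra": three Gaussian bands plus noise.
rng = np.random.default_rng(3)
N, J, A = 142, 60, 3
grid = np.linspace(0.0, 1.0, J)
bands = np.vstack([np.exp(-((grid - c) / 0.1) ** 2) for c in (0.25, 0.5, 0.75)])
train = rng.normal(size=(N, A)) @ bands + 0.01 * rng.normal(size=(N, J))

monitor = SpectrumMonitor(A).fit(train)

faulty = train[0].copy()
faulty[10] += 1.0                                # spike the model cannot explain

print("training fraction in control:", monitor.in_control(train).mean())
print("faulty spectrum in control:", bool(monitor.in_control(faulty)[0]))
```

The spike leaves the scores almost unchanged but inflates the residual, so the faulty spectrum is rejected through its Q value.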
15. • Goal: define the relationship between Y and X
$T_A = f(X), \quad Y \approx T_A P_A^{\mathsf T} \;\Rightarrow\; Y = f(X)$
• Linear model
N = 142 samples
M = 11 experimental conditions (concentrations)
A = 16 scores (16 regression models)
• Choice of significant variables
Stepwise methods (Draper and Smith, 1998)
Identification of the most significant variables
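To make the variable-selection step concrete, here is a simplified forward-selection sketch for one score. It is not the full stepwise procedure of Draper and Smith (there is no backward removal step), and the function name, the F-to-enter threshold of 4.0 and the synthetic data are all assumptions for illustration.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def forward_selection(X, y, f_to_enter=4.0):
    """Add, one at a time, the variable with the largest partial F statistic."""
    n, m = X.shape
    selected = []
    design = np.ones((n, 1))                 # start from the intercept only
    current = rss(design, y)
    while len(selected) < m:
        best = None
        for j in range(m):
            if j in selected:
                continue
            cand = np.column_stack([design, X[:, j]])
            r = rss(cand, y)
            f = (current - r) / (r / (n - cand.shape[1]))
            if best is None or f > best[0]:
                best = (f, j, r, cand)
        if best[0] < f_to_enter:             # no candidate is significant enough
            break
        _, j, current, design = best
        selected.append(j)
    return selected

# Hypothetical data: 142 samples, 11 conditions; the score depends on 2 of them.
rng = np.random.default_rng(5)
X = rng.normal(size=(142, 11))
t1 = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 4] + 0.1 * rng.normal(size=142)

selected = forward_selection(X, t1)
print("variables entered, in order:", selected)
```

Running the same selection once per score gives one linear model for each of the A retained components.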
16. • Model examples:
$t_1 = a_1 + b_{11}\,\text{soda} + b_{12}\,\text{nonionic surfactant}$
$t_2 = a_2 + b_{21}\,\text{soda} + b_{22}\,\text{surfactants} + b_{23}\,\text{pH buffer} + b_{24}\,\mathrm{Na_2CO_3} + b_{25}\,\text{perfume}$
• Qualitative influence of compounds on spectra:
[Figure: influence of the compounds on the spectra; labels: sodium carbonate, nonionic surfactant, amphoteric surfactant, anionic surfactant, sodium hydroxide]
Influential variables: sodium hydroxide and surfactants
Non-influential variables: co-solvent (ethanol)
17. Development of general methods for statistical control:
• Spectra analysis and compression using PCA
• Variable reduction from 1738 to 16
• T2-Q control chart definition
• New joint confidence region T2bx-Q
Qualitative relationship between experimental conditions (X) and scores (TA):
• Solvent does not influence spectra
• Spectra depend on soda, surfactants and sodium carbonate
18. This work was carried out in cooperation with the Procter & Gamble Research Centre in Pomezia (RM)