Quality Control and Improvement in Manufacturing

4th International Summer School
Achievements and Applications of Contemporary
Informatics, Mathematics and Physics
National University of Technology of the Ukraine
Kiev, Ukraine, August 5-16, 2009

Gülser Köksal, Sinan Kayalıgil
Department of Industrial Engineering, METU, Ankara, Turkey

Gerhard-Wilhelm Weber, Başak Akteke-Öztürk
IAM, METU, Ankara, Turkey

Project Team
Gülser Köksal (IE)
Nur Evin Özdemirel (IE)
Sinan Kayalıgil (IE)
Bülent Karasözen (MATH, IAM)
Gerhard Wilhelm Weber (IAM)
İnci Batmaz (STAT)
Murat Caner Testik (IE)
İlker Arif İpekçi (IE)
Berna Bakır (IS)
Fatma Güntürkün (STAT)
Başak Öztürk (IAM)
Fatma Yerlikaya (IAM)

Other Collaborators:
Esra Karasakal (IE)
Zeev Volkovich (CS - Israel)
Adil Bagirov (AOpt - Australia)
Özge Uncu (IE- Canada)
Pakize Taylan (IAM)
Süreyya Özöğür (IAM)
Elçin Kartal (STAT)
Selcan Cansız (STAT&IE)

OUTLINE
Project Objectives
Quality Improvement (QI)
Data Mining (DM)
DM Applications in QI in Literature
DM Applications in the Project
Casting QI Problem (Decision Trees, Neural Nets,
Clustering)
Driver Seat Design Problem (Decision Trees)
PCB QI Problem (Association)
Other approaches
Nonlinear/Robust Regression
Conclusion

Project Objectives
Determine which DM approaches can
effectively be used in QI
Test performance of DM approaches on
selected quality design and improvement
problems with especially voluminous data
and multiple input and quality characteristics
Develop more effective approaches to solve
such problems

Project Scope

Manufacturing industries keeping records
of various input and quality characteristics
QI problems for which traditional analysis
and solution approaches are ineffective
due to too many variables and complicated
relationships
“Parameter design optimization” and
“quality analysis” type of quality problems

The Approach
Collect appropriate data from different industries for
different quality problems
Apply appropriate DM techniques in solving those
problems
Compare performances of DM techniques
Determine which DM techniques can effectively be
used for which type of QI problems
Develop new / improved algorithms

Quality Control and Improvement Activities

Product development stage: Quality control and improvement activities

Product design: Concept design; Parameter design (design optimization); Tolerance design

Manufacturing process design: Concept design; Parameter design (design optimization); Tolerance design

Manufacturing: Quality monitoring; Process control; Inspection / Screening; Quality analysis

Customer usage: Warranty and repair / replacement

Parameter Design Optimization
Static problem:
Find settings of manipulated input for fixed output target and minimum variability
Dynamic problem:
Find settings of manipulated input for changing output targets and minimum variability

[Diagram: PRODUCT/PROCESS block with INPUT (manipulated, measured, unmeasured), Disturbance INPUT (measured, unmeasured) and OUTPUT]
Dynamic Manufacturing Environment
Goal: to have process output within target specifications with the smallest amount of variation around the target
Disturbance (assignable causes, noise)
Statistical process control to detect assignable causes (quality monitoring)
Engineering process control

[Diagram: PROCESS block with INPUT (manipulated, measured, unmeasured), Disturbance INPUT (measured, unmeasured) and OUTPUT]
Static Manufacturing Environment

Goal: to have process output within target specifications with the smallest amount of variation around the target
Disturbance (assignable causes, noise)
Quality analysis: measured / manipulated input → output

[Diagram: PROCESS block with INPUT (manipulated, measured, unmeasured), Disturbance INPUT (measured, unmeasured) and OUTPUT]

Quality Control and Improvement Activities:
Quality Analysis
Quality Analysis consists of

- Finding characteristics critical-to-quality (CTQ)
- Finding input variables that significantly affect quality output

- Predicting quality
- quality output is a real valued variable
- finding empirical models that relate input characteristics of quality to output
ones
- using such models to predict what the resulting quality characteristics will
be for a given set of input parameters

- Classification of quality
- For nominal, binary or ordinal outputs
- For a given set of input parameters, predicting the class of the quality
output

Data Mining

Data mining (knowledge discovery in
databases) :
Extraction of interesting (non-trivial, implicit, previously unknown
and potentially useful) information or patterns in large databases

What is not data mining?
(Deductive) query processing
Expert systems or small ML/statistical programs

Data mining – A KDD Process
Data mining is the core of the KDD process.

[Diagram: Databases → Data Cleaning and Data Integration → Data Warehouse → Data Selection and Data Preprocessing → Task-relevant Data → Data Mining → Pattern Evaluation]
Data Mining Techniques
Supervised Learning
Classification and regression
Decision trees
Neural networks
Support vector machines
Bayesian belief networks
Non-linear robust regression
Rule induction
Association rules
Rough set theory

Data Mining Techniques

Unsupervised Learning
Clustering
K-means, Fuzzy C-means, Hierarchical, Mixture of
Gaussians

Neural Networks (Self Organizing Maps)

Outlier and deviation detection

Trend analysis and change detection

Some Applications

Market research and customer
relationship management
Risk analysis and management
Fraud detection
Text and web analysis
Intelligent inquiry
Process modelling
Supply chain management

Supply Chain Management Applications

Reducing risk of accepting bad credit cards in
payments through e-commerce
Controlling inventory by analyzing past
business, monitoring present transactions, and
predicting future sales
Controlling inventory by predicting customer’s
behavior patterns (e-commerce)
CRM (clustering customers, understanding their
needs and behaviors, etc.)
Source: Kusiak, A. “Data Mining in Design of Products and Production Systems”, Proceedings in INCOM 2006,
Vol.1, 49-53.

SOME DM APPLICATIONS on QI PROBLEMS

Predicting quality for given process parameter levels
Finding optimal process parameter levels for quality
Determining effects of equipment on quality
Determining factors / parameters effects on quality
Tolerancing
Identifying relationships among several quality
characteristics
Determining, in a timely manner, the assignable causes
that make a process out of control (unstable)

Some Applications in Literature
Integrated circuit manufacturing
Fountain et al. (2000), Kusiak (2000)

Packaging manufacturing
Abajo et al. (2004)

Semiconductor wafer manufacturing
Gardner (2000), Kusiak (2000), Bae (2005), Chen (2004), Braha (2002),
Hu (2004), Dabbas (2001), Fan (2001), Mieno (1999), Skinner (2002)

Sheet metal assembly
Lian et al. (2002)

Some Applications in Literature
Steel production
Cser et al. (2001)

Chemical manufacturing
Shi et al. (2004), Gillblad (2001), Sun (2003)

Ultra-precision manufacturing
Huang & Wu (2005)

Conveyor belts manufacturing
Hou et al. (2003), Hou (2004)

Plastic manufacturing
Ribeiro (2005)

LITERATURE SURVEY
(DM Applications on Selected QI Problems)
[Figure: number of papers by year (1997-2007) and by QI problem type: finding CTQs, predicting quality, classification of quality, parameter optimization]

Literature Survey (cont.d)
[Figure: pie charts of the number of papers per DM technique, one for "Finding CTQs" and one for "Classification of quality". Techniques in the legends: ANN, ANN-BN, ANN-SOM, RBF-NN, DT, RST, FST, GA, SVM, BN, BA, CC, AHC, KW, RSM, ANOVA, R]

Literature Survey (cont.d)
[Figure: pie charts of the number of papers per DM technique, one for "Predicting quality" and one for "Parameter optimization". Techniques in the legends: ANN, ANN-BN, ANN-RBF, DT, FST, GA, TM, R]

QI Problems – Examples from the Project

Casting manufacturing
Driver seat design
Circuit board manufacturing

CASTING QUALITY IMPROVEMENT PROBLEM
– The Company

RKN is a casting company having two
factories located in Ankara
It manufactures intermediate goods for the
automotive, agricultural tractor and motor
industries
RKN applies 6σ methodologies in
improving its processes

– Some Products

Transmission Cases
Engine Block

Oil pan

Gearbox

– Some Research Questions

Is there any relation between defect types
and process parameters?
Do the important factors for different
defect types interact?
Which process parameter levels are better
in reducing the defects?

DRIVER SEAT DESIGN OPTIMIZATION PROBLEM
– The Company

TFD is one of the largest automobile
manufacturers in Turkey located in Bursa.
They would like to improve the design of
the driver seat of a commercial vehicle to
increase customer satisfaction.
The driver seat is a critical part of an
automobile that affects the buying
decision.

– The Product


– Some Research Questions

Which customer features affect overall
satisfaction with the seat?
What are the characteristics of customers who are
highly satisfied / dissatisfied with the
seat?
Which features of the seat affect overall
satisfaction with the seat?

CIRCUIT BOARD QUALITY IMPROVEMENT PROBLEM
– The Company

VPC is one of the largest electronic
equipment manufacturers in Turkey.
They produce approximately 35-40
thousand PCBs per day, and 1.5-2 million
PCBs per month.
70-80 thousand PCBs are scrapped every
month.
They would like to minimize PCB failures.

– The Products

Final products:
DVD player/recorder, DivX player, AV receiver, digital
satellite receiver, digital TV receiver, digital media
adapter
Component of interest:
Various PCBs (Printed Circuit Boards) =
Board + Integrated Circuits + Resistors + Capacitors + Diodes


– Some Research Questions

Which defect types occur together?
What are the root causes of the defects?
Do suppliers affect the defects?
Do defects occur at certain locations on
the board?

Data Mining Software Used in the Project

SPSS Clementine
Matlab
Statistica QC Miner
MARS

Casting Process

[Diagram: casting process with CORE SHOP, MOLDING LINE, MELTING and FETTLING SHOP]
METU-IE and TU/e-OPAC Workshop

RKN’s Quality Objectives
Decrease the percentage of defective items by
choice of process parameters
Priorities:
products suffering from a high percentage of
defects
products with a larger share in the total tonnage,
even if their percentage of defectives is lower
Decrease the percentage of product returns
due to defects found by customers

Objectives
Decrease the proportion of defective items (to a certain
target value)
Identify the most important process parameters affecting
quality
Find the operating ranges of these parameters
(future direction)
Optimize the proportion of defective items (future
consideration)

Perkins 021 Cylinder Head

Perkins 021 cylinder head is
one of the two products
chosen for the analysis from
the second casting plant
Reason:
Having problems with Perkins Cylinder Head

Availability of the data
Volume of the data

Data Collection
Data in RKN come from several processes
and different time periods.
Weekly
Daily
Hourly
Most of the data come from
Core shop
Molding
Melting

Data Collection (Cont...)
Lot: total production in a day (one or more shifts)
Daily records consist of the total volume of
production, total count of defective products and
the distribution of defect types
Response variables recorded are:
total number of defective products
number of defective products for 19 defect types
number of defective products returned by the customer
(newly added)

Data of Core Shop
Cores are produced according to a
weekly production plan
Cores used for a product are ready one
or two days before use
Specific core usage in a shift cannot be
identified accurately
Production may stop for a while and even
the cores from 3 or more days in the past
can be put to use arbitrarily

The Data
5 months’ production data
Number of records : 95 (averages of 95 days)
Input : real (47)
Output : discrete (8)
Can be transformed to binary, nominal or ordinal variables if
needed
Some missing data
AFTER PREPROCESSING
6 real uncorrelated response variables (proportions of
defect types) + 1 total response (proportion of defective
items)
36 real feature (predictor) variables
92 observations

Problem Settings
Layout: observations (j) in rows; features (k) and responses in columns.

    x1        x2      y1   y2
  126.00    135.00     1    0
  120.00    140.00     1    0
  110.00    120.00     1    0
  102.00    131.00     1    0
  130.00    125.00     1    0
  285.00    115.00     0    0
  296.00    140.00     0    0
  275.00    129.00     0    0
  260.00    128.00     0    0
  280.00    105.00     0    0
  106.00    306.00     0    1
  113.00    308.00     0    1
  122.00    306.00     0    1
  128.00    329.00     0    1
  145.00    334.00     0    1
  287.00    329.00     1    1
  279.00    324.00     1    1
  291.00    335.00     1    1
  260.00    340.00     1    1
  270.00    321.00     1    1

Univariate Modeling vs Multivariate Modeling

Univariate Decision Tree Methodology –
CART (Continuous data)
DECISION TREE MODEL

(LEAST) SQUARE DEVIATION
\[ R(t) = \frac{1}{N(t)} \sum_{i \in t} \left( y_i - \bar{y}(t) \right)^2 \]

IMPURITY MEASURE
\[ \Phi(s,t) = R(t) - p_L R(t_L) - p_R R(t_R) \]

A TYPICAL RULE GENERATED
IF X22 > 13.275 AND X9 > 3.095
THEN %Y6 = 0.006 (Support = 48/92)
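
As an illustration of the two formulas above, the following Python sketch computes the least-squares deviation R(t) of a node and the improvement Φ(s,t) of a candidate split on one numeric feature, then scans midpoints between sorted feature values for the best threshold. The function names and the toy data are assumptions made for illustration only; this is not the software used in the project, and a full CART implementation would also handle multiple features, recursion and pruning.

```python
import numpy as np


def lsd(y):
    """R(t): mean squared deviation of the responses in a node from the node mean."""
    y = np.asarray(y, dtype=float)
    return float(np.mean((y - y.mean()) ** 2))


def split_improvement(x, y, threshold):
    """Phi(s, t) = R(t) - p_L * R(t_L) - p_R * R(t_R) for the split x <= threshold."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    left = x <= threshold
    right = ~left
    if left.all() or right.all():
        return 0.0  # the split does not separate anything
    p_left = left.mean()
    p_right = 1.0 - p_left
    return lsd(y) - p_left * lsd(y[left]) - p_right * lsd(y[right])


def best_split(x, y):
    """Try midpoints between consecutive distinct x values; return (threshold, improvement)."""
    values = np.unique(np.asarray(x, dtype=float))
    if len(values) < 2:
        raise ValueError("need at least two distinct feature values to split")
    candidates = (values[:-1] + values[1:]) / 2.0
    scores = [split_improvement(x, y, c) for c in candidates]
    best = int(np.argmax(scores))
    return float(candidates[best]), float(scores[best])


# Hypothetical toy data: defect proportion jumps when the feature exceeds about 6.5.
x_toy = [1, 2, 3, 10, 11, 12]
y_toy = [0.1, 0.1, 0.1, 0.9, 0.9, 0.9]
threshold, improvement = best_split(x_toy, y_toy)
```

A rule such as "IF X22 > 13.275" corresponds to one such threshold; applying the search recursively inside each child node yields conjunctions of conditions.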

Research Questions
Can we reduce problem dimension by extracting
important features only?
Is there any relation between defect types and process
parameters?
Do the important factors for different defect types
interact?
Are there significant changes in process parameters when
a defect rate is high or low?
Which process parameter levels are better in reducing
the defects?
Is there any specific period when high defect rates
occur?
Is there any pattern in the sequence of defect type
occurrences?

Feature Reduction

Feature selection
Decision trees
PCA

Univariate Decision Tree Methodology – Nominal data
Number of records: 748
Analysis accuracy: 93.45%
Inputs: x32, x12, x22, x13, x2, x19, x10, x9, x36, x8, x28
Tree depth: 9

Results for output field y
Comparing $C-y with y

Partition    1_Training          2_Testing
Correct      699    93.45%       294    92.74%
Wrong         49     6.55%        23     7.26%
Total        748                 317

Coincidence matrix for $C-y (rows show actuals)

Partition = 1_Training
Actual      0      1      2    % correct
0          49      0      3    94.2%
1           0    224     19    92.1%
2           0     27    426    94%

Partition = 2_Testing
Actual      0      1      2
0          18      0      2
1           0    115      4
2           0     17    161
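
The accuracy figures reported above follow directly from the coincidence matrices. The short Python sketch below recomputes them; the matrix entries are copied from the slide, while the function names are illustrative assumptions.

```python
import numpy as np

# Coincidence matrices from the slide (rows: actual class, columns: predicted class).
training = np.array([[49, 0, 3],
                     [0, 224, 19],
                     [0, 27, 426]])
testing = np.array([[18, 0, 2],
                    [0, 115, 4],
                    [0, 17, 161]])


def overall_accuracy(matrix):
    """Share of records on the diagonal (correctly classified)."""
    return np.trace(matrix) / matrix.sum()


def per_class_accuracy(matrix):
    """Share of each actual class (row) that was predicted correctly."""
    return np.diag(matrix) / matrix.sum(axis=1)


train_acc = overall_accuracy(training)   # 699 / 748
test_acc = overall_accuracy(testing)     # 294 / 317
train_by_class = per_class_accuracy(training)
```

Comparing the per-class values for training and testing shows whether the tree generalizes equally well for each defect class, which the single overall percentage hides.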

Conclusion of the Casting Work

DT induced rules were instrumental in
planning new controlled experiments
Process optimization may be sought based
upon these field experiments
DT induced rules may also be used to set
tolerance levels for the uncontrollable
features (variables)

Suggested Factor Levels
Factor | Controllable? | Adjusted Setting | Observed Range | Suggested Trial Range | Pertinent Defect Types | Suggested Mean Setting
x2 | No | [15, 30] | [20, 28] | [23, 28] | (y2), (y3), (y6), (y8) | if possible, [23, 28]
x3 | No | [15, 30] | [30, 40] | [31, 37.5] | y1, y3 | if possible, [31, 37.5]
x4 | Yes | [13, 15] | [12.171, 13.678] | [12.295, 13.678] | y1 | fixed, [12.295, 13.678]
x5 | Yes | [14, 16] | [12.27, 13.66] | [12.27, 13.165] | y8 | fixed, [12.27, 13.165]
x6 | Yes | [7.5, 9.5] | [7.585, 8.25] | [7.917, 8.25] | y8 | fixed, [7.917, 8.25]
x8 | Yes | [35, 42] | [21.75, 42] | [21.75, 35] | y3, (y2) | fixed, [21.75, 35]
x9 | Yes | [3, 3.5] | [2.98, 3.387] | none | y2, y3, y6, y8 | 3 levels: [3.183, 3.216], [3.216, 3.26], [3.26, 3.387]
x11 | Yes | [18, 23] | [19.8, 22.9] | [20.339, 22.9] | y3 | fixed, [20.339, 22.9]
x12 | Yes | [250, 400] | [290, 360] | [350, 360] | y2 | fixed, [350, 360]; otherwise [305, 360]
x14 | Yes | [3.5, 5.5] | [4.7, 5.2] | [4.724, 5.2] | y2 | fixed, [4.724, 5.2]
x16 | No | [11, 23] | [13.2, 30] | [15.86, 30] | y1, (y2) | if possible, [15.86, 30]
x17 | No | [11, 23] | [15.9, 31.5] | [26.55, 31.5] | y1 | if possible, [26.55, 31.5]
x19 | No | [11, 23] | [14.1, 24.9] | none | y2 | to be left to run its own course
x20 | Yes | 40 | [38.992, 42.85] | [38.992, 41.32] | y3 | fixed, [38.992, 41.32]
x21 | Yes | 50 | [48.68, 52.71] | [49.181, 52.71] | y9 | fixed, [49.181, 52.71]
x22 | Yes | until March 28: 12; after March 31: 22 | until March 28: [10.85, 14.35]; after March 31: [20.05, 33.428] | none | y1, y2, y3, y6 | 4 levels: [10.85, 13.125], [12.275, 14.35], [14.35, 17.2], [17.2, 33.42]
x25 | No | no range | [2.5, 6.9] | [2.5, 6.533] | y8 | if possible, [2.5, 6.533]
x26 | Yes | [1420, 1430] | [1367.59, 1428.23] | [1367.59, 1425.98] | y8, y9 | fixed, [1367.59, 1425.98]
x27 | No | no range | [2.259, 4.95] | [2.259, 4.2] | y2, (y3) | if possible, [2.259, 4.2]
x28 | No | no range | [11.7, 16.9] | none | y3, y6 | to be left to run its own course
x29 | Yes | [3.2, 3.35] | [3.208, 3.41] | not available | y1, y3, y6, y8 | 3 levels: [3.208, 3.304], [3.304, 3.325], [3.355, 3.41]
x30 | Yes | [1.85, 2] | [1.823, 2] | none | y1, y2, y3 | 2 levels: [1.823, 1.88], [1.88, 2]
x32 | Yes | [0.2, 0.3] | [0.171, 0.283] | none | y1, y2 | 2 levels: [0.171, 0.184], [0.184, 0.283]
x33 | Yes | maximum 0.3 | [0.0767, 0.552] | [0.174, 0.552] | y2 | fixed, [0.174, 0.552]
x35 | Yes | [0.08, 0.12] | [0.0762, 0.1122] | [0.088, 0.1122] | y1 | fixed, [0.088, 0.1122]

Questionnaire data
80 observations/subjects
28-88 input variables (age, sex, distance
travelled, anthropometric measures, ease of use,
attractiveness, etc.)
1-53 output variables (back comfort, thigh comfort,
overall satisfaction, ease of use, attractiveness,
etc.)

Rules for customer satisfaction
Rule for 7 / 7 (very satisfied) (support = 4; confidence = 1.0)
If
Lumbar ache after driving for a long time = 0 and
Video gray as a seat cover design = 1 and
Accept to pay more for the seat belt sensor = 0 and
Adequate support by the seat cushion = 1 then
7.0 (very satisfied)

Rule for 6 / 7 (satisfied) (support = 10; confidence = 1.0)
If
Video gray as a seat cover design = 1 and
Accept to pay more for the seat belt sensor = 0 then
6.0 (satisfied)

Rule for 4 / 7 (normal) (support = 8; confidence = 0.75)
If
Easy reach to the lumbar support adjustment = 0 then
4.0 (normal)

Neural Network Modeling - General

A neural network (NN) is an interconnected group of artificial neurons that uses
a mathematical or computational model for information processing based on a
connectionist approach to computation.

Incorporates learning rather than programming and parallel rather than
sequential processing.

Neural networks resemble the human brain in two respects:
The network acquires knowledge from its environment using a learning process
(algorithm)

Synaptic weights, which are inter-neuron connection strengths, are used to store the
learned information.

General Topology

Hidden layers
Output layer
Input layer

Inside the Node
Components:
Weights
Base function (summing unit)
Activation function
Bias

A node:
Receives n inputs
Computes the net input according to the base function
Applies the activation function to the net input
Outputs the result

[Diagram: node i receives input values x1, x2, ..., xm with weights w1, w2, ..., wm and bias b; the base function ∑ computes net; the activation function gives the output y = f(net)]
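
The node described above can be written in a few lines of Python. This is a minimal sketch: the weighted sum as base function comes from the slide, while the sigmoid activation and the function names are assumptions chosen for illustration.

```python
import math


def sigmoid(net):
    """One common activation function (assumed here; the slide does not name one)."""
    return 1.0 / (1.0 + math.exp(-net))


def node_output(inputs, weights, bias, activation=sigmoid):
    """Compute y = f(net), where net = sum_i w_i * x_i + b (summing-unit base function)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(net)


# Two inputs whose weighted contributions cancel: net = 0.5 - 0.5 + 0 = 0.
y = node_output([1.0, 2.0], [0.5, -0.25], bias=0.0)
```

A layer is a set of such nodes sharing the same inputs, and training adjusts the weights and biases.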

Properties

Capabilities
Fault tolerance
Robustness
Non-linear mapping
Learning and generalization
Optimization
Issues
Number of source nodes
Number of hidden layers
Number of hidden nodes per hidden layer
Training data (too much: overfitting; too little: inaccurate
classification)
Number of classes (sink)
Interconnections
Activation function
Learning technique
Stopping criteria

Application 1:
Classification of quality in Casting
Data:
36 input variables (continuous)
1 output variable (categorical with 3 levels – 1: first defect type exists, 2:
second defect type exists, 0: none of these two defect types exist)
Partition: Training -> 70%, Testing -> 30%
Learning rule: Back-propagation
Network Topology
Input layer (36 neurons)
Hidden layer (6 neurons)
Output layer (1 neuron)
To prevent overfitting, the training set was divided again into a training and a testing set
(partitioning the partition); the network was trained on the training set, and the error was
evaluated on the test set at each cycle
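
Evaluating the held-out error at each cycle makes it possible to stop training before the network overfits. The Python sketch below shows one common way to turn that sequence of errors into a stopping decision. The patience-based rule, the function name and the error values are assumptions for illustration; the slide does not state which stopping rule was used.

```python
def early_stopping_cycle(validation_errors, patience=3):
    """Return (best_cycle, best_error), stopping once the held-out error has not
    improved for `patience` consecutive cycles (assumed stopping rule)."""
    if not validation_errors:
        raise ValueError("at least one validation error is required")
    best_cycle, best_error = 0, float("inf")
    for cycle, error in enumerate(validation_errors):
        if error < best_error:
            best_cycle, best_error = cycle, error
        elif cycle - best_cycle >= patience:
            break  # no improvement for `patience` cycles: stop training
    return best_cycle, best_error


# Hypothetical held-out errors per training cycle.
errors = [0.9, 0.7, 0.6, 0.65, 0.66, 0.5]
stop_at = early_stopping_cycle(errors, patience=2)
```

With a patience of 2 the sketch stops after cycle 4 and keeps the weights from cycle 2, so the later value 0.5 is never reached; a larger patience trades extra training time for a chance to escape such plateaus.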

Results
COINCIDENCE MATRIX FOR PREDICTED CATEGORIES

Overall predicted accuracy
Training: 92.56%
Testing: 87.01%

Training     0      1      2
0           33      0      3
1            0    158     13
2            0     27    344

Testing      0      1      2
0           18      0      0
1            0     51     11
2            0     19    132

GAIN CHART

Application 2: Prediction of quality in
Casting
Data:
36 input variables (continuous)
1 output variable (percentage of defectives for a certain defect type)
Partition: Training -> 70%, Testing -> 30%
Learning rule: Back-propagation
Method: Exhaustive prune (finds the best topology)
Final Network Topology
Input layer (36 neurons)
First hidden layer (25 neurons)
Second hidden layer (17 neurons)
Output layer (1 neuron)

Results

Estimated accuracy: 99.95%
Training results are slightly better than
testing results (overfitting)

Statistics

Conclusion

Neural networks can be used for both
classification and prediction
Unlike decision trees, neural networks are
black-box models
To decide on best production regions,
further study may be needed (simulation,
DOE, etc).

CLUSTERING - General
Clustering of data is a method by which large sets of data
are grouped into clusters of smaller sets of similar data.

[Figure: example demonstrating the clustering of balls]

Clustering means grouping data, or dividing a large data set
into smaller data sets whose members share some similarity.

Clustering Algorithms
A clustering algorithm attempts to find natural groups of
components (or data) based on some similarity

Clustering algorithms find k clusters so that the objects of
one cluster are similar to each other whereas objects of
different clusters are dissimilar.

Taxonomy of Clustering Approaches

Hierarchical vs. Partitional

A hierarchical algorithm partitions the data set in a nested
manner into clusters which are either disjoint or included
one in another. These algorithms are either
agglomerative or divisive, according to their algorithmic
structure and the operations they carry out.

A partitional method assumes that the number of clusters
to be found is already given and then it looks for the
optimal partition based on the objective function.

Nonsmooth Optimization

Most clustering problems reduce to solving
nonsmooth optimization problems.
Nonsmooth Optimization Problem:
minimize an objective function subject to the given constraints,
where the objective function is nonsmooth at many points of interest,
i.e., it does not have a conventional derivative at these points.
A less restrictive class of assumptions on the objective function than
smoothness: convexity and Lipschitzness.

Cluster Analysis via Nonsmooth Opt.

Given a finite set of instances, the problem is to partition them into k clusters.

This is a clustering problem in the partitioning setting. We will
reformulate it as a nonsmooth optimization problem.

Cluster Analysis via Nonsmooth Opt. Cont’d

k is the number of clusters (given),
m is the number of instances (given),
the center of the j-th cluster is to be found,
the association weight of each instance with cluster j is to be
found; these weights form an m × k matrix,
the objective function has many local minima.


If k is not given a priori:
Start from a small enough number of clusters k and
gradually increase the number of clusters for the
analysis until a certain stopping criterion is met.
This means: if the solution of the corresponding
optimization problem is not satisfactory, the decision
maker needs to consider a problem with k + 1 clusters,
etc.
This implies: one needs to repeatedly solve
optimization problems with different values of k, an
even more challenging task.
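
The incremental procedure described above can be sketched as a simple loop. In this Python sketch, `objective_for_k` stands for any routine that solves the clustering problem for a given k and returns its objective value; that callable, the relative-improvement stopping criterion and the tolerance are all assumptions for illustration, since the slide leaves the stopping criterion open.

```python
def choose_k(objective_for_k, k_start=1, k_max=10, tol=0.1):
    """Increase k until adding one more cluster improves the objective
    by less than `tol` (relative), or until k_max is reached."""
    k = k_start
    previous = objective_for_k(k)
    while k < k_max:
        current = objective_for_k(k + 1)
        if previous <= 0 or (previous - current) / previous < tol:
            break  # k + 1 clusters is not a satisfactory improvement
        k += 1
        previous = current
    return k


# Hypothetical objective values (e.g., sum of squared distances) for k = 1..4.
toy_objective = {1: 100.0, 2: 40.0, 3: 38.0, 4: 37.0}
chosen_k = choose_k(toy_objective.__getitem__, k_start=1, k_max=4, tol=0.1)
```

Each call to `objective_for_k` is itself a nonconvex optimization problem, which is why the repeated solves are the expensive part of the procedure.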


Reformulated Problem:

• A complicated objective function: nonsmooth and nonconvex.
• The reformulated nonsmooth optimization problem has k×n variables;
the original problem had (m+n)×k.
• This problem can be solved by related nonsmooth methods
(e.g., Semidefinite Programming, discrete gradient method).
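
The description above is consistent with the standard minimum sum-of-squares clustering formulation. As a sketch under that assumption, the two problems can be written as follows; the symbols a^i (instances in R^n), x^j (cluster centers) and w_ij (association weights) are notational assumptions and are not taken from the slides.

```latex
% Assumed notation: a^i \in \mathbb{R}^n instances, x^j \in \mathbb{R}^n centers, w_{ij} weights
\begin{align*}
\text{Partitioning problem:}\quad
 & \min_{x,\,w}\ \frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{k} w_{ij}\,\lVert x^{j}-a^{i}\rVert^{2} \\
 & \text{s.t.}\ \sum_{j=1}^{k} w_{ij}=1,\quad w_{ij}\in\{0,1\},\quad i=1,\dots,m,\ j=1,\dots,k. \\[4pt]
\text{Reformulated problem:}\quad
 & \min_{x^{1},\dots,x^{k}\in\mathbb{R}^{n}}\ \frac{1}{m}\sum_{i=1}^{m}\ \min_{1\le j\le k}\ \lVert x^{j}-a^{i}\rVert^{2}.
\end{align*}
```

In this form the first problem has k×n center variables plus m×k weights, i.e. (m+n)×k in total, while the second keeps only the k×n center variables; the inner minimum over j is what makes the objective nonsmooth.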

Clustering Analysis on RKN Casting Data

We used k-means, PAM (Partitioning Around Medoids) and k-means
improved by Nonsmooth Optimization to identify
homogeneous groups in the data.
k-Means: The grouping is done by minimizing the sum of squares
of distances between the data and the corresponding cluster centroids.
PAM: Clusters are built around medoids; a medoid is the object of a
cluster whose average distance to all the objects in the cluster is minimal.
k-Means improved by Nonsmooth Optimization: a k-means
algorithm that solves a nonsmooth optimization subproblem to
calculate the starting point for the k-th cluster center.
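
To make the definitions above concrete, the Python sketch below implements plain k-means (Lloyd's iterations with random starting centers), the sum-of-squares objective it minimizes, and the medoid of a group of objects. It is an illustrative sketch on made-up data: it does not reproduce PAM's swap search or the nonsmooth-optimization starting points used in the project, and all names are assumptions.

```python
import numpy as np


def assign(data, centers):
    """Label each object with the index of its nearest center (squared Euclidean)."""
    dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def sum_of_squares(data, centers, labels):
    """k-means objective: sum of squared distances to the assigned centroid."""
    return float(((data - centers[labels]) ** 2).sum())


def kmeans(data, k, n_iter=100, seed=0):
    """Plain Lloyd iterations starting from k randomly chosen objects."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(data, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, assign(data, centers)


def medoid(points):
    """Object whose average distance to all objects in the group is minimal."""
    dists = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=2))
    return points[dists.mean(axis=1).argmin()]


# Hypothetical toy data: two well-separated groups of three objects each.
data = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
centers, labels = kmeans(data, k=2)
objective = sum_of_squares(data, centers, labels)
first_group_medoid = medoid(data[:3])
```

Because a centroid is a mean, it need not coincide with any object, whereas a medoid is always one of the objects; this makes medoid-based clustering less sensitive to outliers.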

Results
k-Means:
k=2, cluster 1: 70 obj., cluster 2: 22 obj.
k=3, cluster 1: 68 obj., cluster 2: 22 obj., cluster 3: 2 obj.
k=4, cluster 1: 68 obj., cluster 2: 16 obj., cluster 3: 6 obj., cluster 4: 2 obj.

PAM:
k=4, cluster 1: 20 obj., cluster 2: 34 obj., cluster 3: 25 obj., cluster 4: 13 obj.

k-means improved by Nonsmooth Optimization:
k=4, cluster 1: 45 obj., cluster 2: 24 obj., cluster 3: 2 obj., cluster 4: 21 obj.

Results
k-Means clusters (rows) vs. PAM clusters (columns)

                 1      2      3      4    Total
k-Means 1       20     12     25     13     70
k-Means 2        0     22      0      0     22
Total           20     34     25     13     92

k-Means clusters (rows) vs. clusters of k-means improved by Nonsmooth Optimization (columns)

                 1      2    Total
k-Means 1       61      9     70
k-Means 2        0     22     22
Total           61     31     92

Results

The tables above show the relations between the
different clustering results. The optimal partitioning with PAM is
obtained for k=4; for the other methods, k=2 gives the best
results. For k=3 and k=4 with k-means, the clusters of 2
and 6 objects are artificial.

These results agree with our preprocessing studies
(Catherine Sugar’s “jump method” and PCA), which
suggested that k is 2 or 4 for our data.

Jump Method and PCA
[Figure: jump method plot of transformed distortion against cluster number]

Association Analysis
Association rule mining searches for interesting
relationships among the features in a given data
set.
A typical example of association rule mining is
“market basket analysis”.
This process analyzes customer buying habits by
finding associations between the different items
that customers place in their “shopping baskets”

Support and Confidence
• Association rules are statements in the form of
IF antecedent(s) THEN consequent(s)
where antecedent(s) and consequent(s) are disjoint
conjunctions of feature-value pairs.
• Two common measures, support and confidence, are used
to evaluate extracted rules
• For a rule defined as X=>Y
• The support of the rule is the joint probability of X and Y,
Pr(X and Y).
• The confidence of the rule is the conditional probability of Y given
X, Pr(Y|X)
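
The two measures defined above can be computed directly from transactional data. In the Python sketch below each transaction is the set of items observed together (for example, the failures recorded on one board); the function names and the toy transactions are hypothetical and serve only to illustrate the definitions.

```python
def support(transactions, items):
    """Pr(X and Y): share of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Pr(Y | X): support of X and Y together divided by the support of X."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule is undefined
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base


# Hypothetical transactions: failures observed on four boards.
boards = [
    {"AUX1 error", "audio error"},
    {"AUX1 error"},
    {"AUX1 error", "audio error", "display error"},
    {"display error"},
]
rule_support = support(boards, {"AUX1 error", "audio error"})
rule_confidence = confidence(boards, {"AUX1 error"}, {"audio error"})
```

Here the rule "AUX1 error" => "audio error" holds on 2 of 4 boards (support 0.5) and on 2 of the 3 boards that show an AUX1 error (confidence about 0.67). Rule mining algorithms such as Apriori search for all rules whose support and confidence exceed chosen thresholds.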

PCB Manufacturing Data in
Transactional Format
In this format, a single board can appear in more than one
row, each of which represents a different operation performed
on this product
The serial number can be used as the transaction ID, which
distinguishes different products
Attributes (variables) of the boards:
Product type
Description of the failure (failure observed during the final
electrical test)
Root cause (cause of the failure identified during the repair)
Location of the root cause
Board type
Supplier
Operation line where the failure is detected
Date and time

Attributes
11 types of PCB
38 possible failures (e.g., display error, software
error, no audio, etc.)
13 possible root causes (e.g., chip without solder,
resistance is upright, short circuit, etc.)
Location of the root cause on the board
9 board types
6 different suppliers

Application: PCB Manufacturing
Sample records from PCB manufacturing data

Board Type   Serial   Supplier              Failure           Reason of failure   Location
1            2459     GOODBOARD             display error     no solder           U45 6.PIN
1            736      TATCHUN-GIA TZOONG    AUX1 error        short circuit       U8 2.PIN
4            990      GIA TZOONG            device-not-work   sw                  L71
3            700      TATCHUN-GIA TZOONG    display error     short circuit       R407
6            712      ÜNAL ELEKTRONİK       rgb-cvbs error    flash error         R412
2            1411     GOODBOARD             sw error          upright             K23
2            663      GOODBOARD-TATCHUN     AUX1 error        no solder           C130
7            627      UNIWELL ELECTRONIC    audio error       upside-down         B353
4            1169     GOODBOARD             sw error          sw                  U6

Possible Applications of Association Analysis

Identifying failure types that occur together on the
same board.
Association of failures with root causes.
Association of failures with suppliers.
Identifying failures occurring in sequence.
Association of failures with the location of
the root cause on the board.

Identifying failure types that occurred on the
same board together

“device-not-functioning” => “flash-not-loading” (25%, 73%)

“flash-not-loading” => “display error” (36%, 86%)

“AUX1 error” AND “feed error” => “audio error” (32%, 61%)

Association of failures with root causes

“upright” AND “Location” = Chip => “audio error” (46%, 82%)

“no solder” => “device-not-functioning” (18%, 100%)

Association of failures with suppliers

“GOODBOARD” => “display error” (23%, 57%)

“UNIWELL” AND “GOODBOARD” => “feed error” (18%, 53%)

Identifying failures dependent on the
sequence of operations

Line 1 = “AUX1 error” => Line 5 = “feed error” (22%, 48%)

Association of failures with the location of the
root cause on the board

“device-not-functioning” => Location = “resistance” (56%, 76%)

“flash-not-loading” => Location = “U8 2.PIN” (43%, 66%)

Regression Approaches

MULTIPLE LINEAR REGRESSION (MLR)

NONLINEAR REGRESSION (NLR)

GENERAL LINEAR MODELS (GLM)

GENERALIZED LINEAR MODELS (GLZ)

ADDITIVE MODELS

GENERALIZED ADDITIVE MODELS (GAM)

ROBUST REGRESSION

CONCLUSION
Tough QI problems with several input and output
variables can be handled effectively with DM
approaches.
Observational or experimental data, preferably
voluminous, are needed.
Online data collection systems might need to be
installed
Data quality and pre-processing are crucial
Many tools seem to be difficult to apply in practice for
industry people (advanced training might be necessary)
Results in the form of rules are found useful and
interesting by the industry

FUTURE WORK
Continue collecting different data sets for different
QI problems, and applications on them
Also apply other DM approaches such as linear /
robust regression, fuzzy clustering / regression and
rough set theory.
Compare performances.
Develop new / improved DM algorithms for solving
the QI problems.
Multi-response decision tree modeling
Non-smooth optimization for categorical quality
responses
Improved MARS with Tikhonov regularization

PAPERS AND PRESENTATIONS
FROM THE PROJECT
Bakır, B., Batmaz, İ., Güntürkün, F.A., İpekçi, İ.A., Köksal, G., and Özdemirel,
N.E., Defect Cause Modeling with Decision Tree and Regression Analysis,
Proceedings of XVII. International Conference on Computer and
Information Science and Engineering, Cairo, Egypt, December 08-10,
2006, Volume 17, pp. 266-269, ISBN 975-00803-7-8.

İpekçi, A.İ., Bakır, B., Batmaz, İ., Testik, M.C., and Özdemirel, N.E., Defect
Cause Modeling with Data Mining: Decision Trees and Neural Networks, to
appear in Proceedings of 56th Session of the 1st International Statistical
Institute, Lisbon, Portugal, August 22-29, 2007.

Akteke-Öztürk, B. and Weber, G. W., "A Survey and Results on Semidefinite
and Nonsmooth Optimization for Minimum Sum of Squared Distances
Problem", Technical Report, 2007.

Öztürk-Akteke, B., Weber, G.W., Kayalıgil, S., Kalite İyileştirmede Veri
Kümeleme: Döküm Endüstrisinde Bir Uygulama, Yöneylem Araştırması ve
Endüstri Mühendisliği 27. Ulusal Kongresi (YA/EM 2007), İzmir, Turkey,
July 2-4, 2007.

FROM THE PROJECT (cont.d)
Session TC-38: Tutorial Session: Data Mining
Applications in Quality Improvement
22nd European Conference on Operational
Research, Prague, July 7-11, 2007

Köksal, G., Testik, M.C., Güntürkün, F.A., Batmaz, İ.,
Data Mining Applications in Quality Improvement: A
Tutorial and a Literature Review
İpekçi, A.İ., Köksal, G., Karasakal, E., Özdemirel, N.E.,
Testik, M.C., Multi Response Decision Tree Approach
Applied To A Discrete Manufacturing Quality
Improvement Problem

FROM THE PROJECT (cont.d)
Köksal, G., Testik, M.C., Güntürkün, F.A., Batmaz, İ.,
Kalite İyileştirmede Veri Madenciliği Yaklaşımları ve Bir
Uygulama, 16th National Quality Congress, November
12, 2007, İstanbul.
