School of Information and Mechatronics 
Signal and Image Processing Laboratory 
Wook-Jin Choi
• Introduction 
• Lung Segmentation 
• Nodule Candidates Detection 
• Optimal Fuzzy Rule-based Pruning 
• Experimental Results 
• Conclusions 
2
• Lung cancer is the leading cause of cancer 
deaths. 
• Most patients diagnosed with lung cancer 
already have advanced disease 
– 40% are stage IV and 30% are stage III 
– The current five-year survival rate is only 16% 
• If defective nodules are detected at an early 
stage 
– The survival rate can be increased 
3
• Early detection of lung nodules is 
extremely important for the diagnosis and 
clinical management of lung cancer 
• Lung cancer has commonly been detected 
and diagnosed on chest radiography 
• Since the early 1990s CT has been 
reported to improve detection and 
characterization of pulmonary nodules 
4
• CT was introduced in 1971 
– Sir Godfrey Hounsfield, United Kingdom 
• CT utilizes computer-processed X-rays 
– to produce tomographic images or 'slices' of specific 
areas of the body 
• The Hounsfield unit (HU) scale is a linear 
transformation of the original linear attenuation 
coefficient measurement into one in which the 
radiodensity of distilled water is defined as 0 HU 
$$HU = 1000 \times \frac{\mu_x - \mu_{water}}{\mu_{water}}$$
5 
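The HU transformation can be illustrated with a minimal sketch. The function name `to_hounsfield` and the attenuation value used for water are illustrative assumptions, not values from the slides.

```python
def to_hounsfield(mu_x: float, mu_water: float) -> float:
    """Map a linear attenuation coefficient to Hounsfield units."""
    return 1000.0 * ((mu_x - mu_water) / mu_water)


# Illustrative attenuation coefficient for water (assumed value, in 1/cm).
MU_WATER = 0.19

# Water itself maps to 0 HU; a medium with no attenuation maps to -1000 HU,
# which matches the value listed for air in the HU table.
print(to_hounsfield(MU_WATER, MU_WATER))
print(to_hounsfield(0.0, MU_WATER))
```

Because the scale is linear in the attenuation coefficient, any material twice as attenuating as water would map to +1000 HU under this definition.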
6 
Substance                      HU
Air                            −1000
Lung                           −500
Fat                            −84
Water                          0
Cerebrospinal fluid            15
Blood                          +30 to +45
Muscle                         +40
Soft tissue, contrast agent    +100 to +300
Bone                           +700 (cancellous bone) to +3000 (dense bone)
The HU of common substances 
Nodule
• Lung cancer screening is currently implemented 
using low-dose CT examinations 
• Advances in CT technology 
– Rapid image acquisition with thinner image sections 
– Reduced motion artifacts and improved spatial 
resolution 
• The typical examination generates large-volume 
data sets 
• These large data sets must be evaluated by a 
radiologist 
– A fatiguing process 
7
• The use of a pulmonary nodule detection CAD 
system can provide an effective solution 
• A CAD system can assist radiologists by increasing 
efficiency and potentially improving nodule 
detection 
8 
General structure of pulmonary nodule detection system
CAD system | Lung segmentation | Nodule candidate detection | False positive reduction
Suzuki et al. (2003) [26] | Thresholding | Multiple thresholding | MTANN
Rubin et al. (2005) [27] | Thresholding | Surface normal overlap | Lantern transform and rule-based classifier
Dehmeshki et al. (2007) [28] | Adaptive thresholding | Shape-based GATM | Rule-based filtering
Suarez-Cuenca et al. (2009) [29] | Thresholding and 3-D connected component labeling | 3-D iris filtering | Multiple rule-based LDA classifier
Golosio et al. (2009) [30] | Isosurface-triangulation | Multiple thresholding | Neural network
Ye et al. (2009) [31] | 3-D adaptive fuzzy segmentation | Shape based detection | Rule-based filtering and weighted SVM classifier
Sousa et al. (2010) [32] | Region growing | Structure extraction | SVM classifier
Messay et al. (2010) [33] | Thresholding and 3-D connected component labeling | Multiple thresholding and morphological opening | Fisher linear discriminant and quadratic classifier
Riccardi et al. (2011) [34] | Iterative thresholding | 3-D fast radial filtering and scale space analysis | Zernike MIP classification based on SVM
Cascio et al. (2012) [35] | Region growing | Mass-spring model | Double-threshold cut and neural network
9
• Thresholding 
– Fixed threshold 
– Optimal threshold 
– 3-D adaptive fuzzy thresholding 
• Lung region extraction 
– 3-D connectivity with seed point 
– 3-D connected component 
labeling 
• Contour correction 
– Morphological dilation 
– Rolling ball algorithm 
– Chain code representation 
10
• A fixed threshold can be applied to segment the lung 
area 
– However, the intensity ranges of images vary with the 
acquisition protocol 
• To obtain the optimal threshold 
– An iterative approach continues until the threshold 
converges 
– The initial threshold: $T^{(0)} = -500\ \mathrm{HU}$ 
– $T^{(i)}$ is the $i$-th threshold, and the new threshold is 
$$T^{(i+1)} = \frac{\mu_o + \mu_b}{2}$$
11 
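The iterative threshold selection can be sketched as follows. This is a minimal sketch that assumes the two means in the update rule are the mean intensities of the voxels below and at or above the current threshold; the function name, the convergence tolerance `tol`, and the iteration cap are hypothetical choices, not values from the slides.

```python
def optimal_threshold(voxels, t0=-500.0, tol=0.5, max_iter=100):
    """Iterate T <- (mean_low + mean_high) / 2 until the threshold settles."""
    t = t0
    for _ in range(max_iter):
        low = [v for v in voxels if v < t]    # darker class (air, lung cavity)
        high = [v for v in voxels if v >= t]  # brighter class (body tissue)
        if not low or not high:
            break  # one class is empty, so the update rule is undefined
        t_new = (sum(low) / len(low) + sum(high) / len(high)) / 2.0
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
    return t


# Toy intensities in HU: two dark clusters and two bright clusters.
sample = [-1000] * 50 + [-900] * 50 + [0] * 50 + [100] * 50
print(optimal_threshold(sample))
```

Starting from −500 HU, the toy data converge after two updates because the class membership stops changing.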
12 
Input CT images, their intensity histograms, and thresholded images
• White areas 
– non-body voxels 
– including lung cavity 
• Black areas 
– body voxels 
– excluding lung region 
• Lung regions are 
extracted from the non-body 
voxels by using 3-D connected 
component labeling 
13 
18-connectivity voxels
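A minimal pure-Python sketch of 3-D connected component labeling with 18-connectivity is given below. It assumes the thresholded volume is a nested list indexed as `mask[z][y][x]`; the function name and the breadth-first strategy are illustrative choices rather than the method used in the presented system.

```python
from collections import deque
from itertools import product

# 18-connectivity: neighbours that share a face or an edge with the voxel.
# Offsets with one or two non-zero components give 6 + 12 = 18 neighbours.
OFFSETS_18 = [d for d in product((-1, 0, 1), repeat=3)
              if 1 <= sum(abs(c) for c in d) <= 2]


def label_components(mask):
    """Return a dict mapping each foreground voxel (z, y, x) to a label >= 1."""
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])
    labels = {}
    next_label = 0
    for seed in product(range(nz), range(ny), range(nx)):
        if not mask[seed[0]][seed[1]][seed[2]] or seed in labels:
            continue
        next_label += 1
        labels[seed] = next_label
        queue = deque([seed])
        while queue:  # breadth-first flood fill from the seed voxel
            cz, cy, cx = queue.popleft()
            for dz, dy, dx in OFFSETS_18:
                n = (cz + dz, cy + dy, cx + dx)
                inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                if inside and mask[n[0]][n[1]][n[2]] and n not in labels:
                    labels[n] = next_label
                    queue.append(n)
    return labels
```

Two voxels that touch only at a corner receive different labels under 18-connectivity, whereas voxels sharing an edge are merged into one component.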
14 
Labeled images after applying 3-D connected component labeling
• To extract lung volume 
– Remove rim attached to boundaries of image 
– The first and the second largest volumes are 
selected as the lung region 
• The lung region contains small holes 
– To remove these holes 
– Morphological hole filling operations are applied 
15 
$S_{lung} = l_{first} \,|\, l_{second}$
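The selection of the lung volume can be sketched as below. The sketch assumes a labeling given as a dict from voxel coordinates `(z, y, x)` to component labels, and it treats any component touching the in-plane image border as the rim; both are assumptions made for illustration. Hole filling is not covered here.

```python
from collections import Counter


def select_lung_voxels(labels, shape):
    """Keep the two largest components that do not touch the image border.

    labels: dict mapping (z, y, x) -> component label
    shape:  (nz, ny, nx) of the volume
    """
    _, ny, nx = shape
    # Components with a voxel on the in-plane border are treated as the rim.
    rim = {lab for (z, y, x), lab in labels.items()
           if y in (0, ny - 1) or x in (0, nx - 1)}
    sizes = Counter(lab for lab in labels.values() if lab not in rim)
    keep = {lab for lab, _ in sizes.most_common(2)}
    # Union of the first and second largest remaining components.
    return {voxel for voxel, lab in labels.items() if lab in keep}


toy = {(0, 0, 0): 1, (0, 0, 1): 1, (0, 0, 2): 1, (0, 0, 3): 1,
       (0, 1, 1): 2, (0, 2, 1): 2, (0, 3, 1): 2,
       (0, 1, 3): 3, (0, 2, 3): 3,
       (0, 3, 3): 4}
print(sorted(select_lung_voxels(toy, (1, 5, 5))))
```

In the toy labeling, component 1 is the largest but lies on the border, so it is discarded and components 2 and 3 form the lung mask.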
16 
Binary images of the selected lung region 
Lung mask images after hole filling
• The contour of the lung volume needs to be 
corrected 
– To include wall-side nodules (juxta-pleural nodules) 
17 
Extracted lung region using 3D connected component labeling and contour 
corrected lung region (containing wall side nodule)
18 
Contour correction using chain-code representation
19
20
• Detection of nodule 
candidates is important 
• The performance of a nodule 
detection system relies on 
the accuracy of candidate 
detection 
• ROI extraction 
– Optimal multi-thresholding 
• Nodule candidates 
detection and segmentation 
– Rule-based pruning 
21
• The traditional multi-thresholding method 
needs many steps of grey levels 
• An iterative approach is applied to select 
the threshold value 
$$T^{(i+1)} = \frac{\mu_o + \mu_b}{2}$$
• The optimal threshold value is calculated 
on the median slice of the lung CT scan 
22 

• The optimal threshold value 
– A base threshold for multi-thresholding 
• Six additional threshold values are obtained 
– Base threshold +400, +300, +200, +100, −100, 
and −200 
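The seven threshold levels follow directly from the base threshold and the six offsets listed above. A minimal sketch, with a hypothetical function name and an arbitrary example base value:

```python
# Offsets relative to the base threshold, as listed on the slide.
OFFSETS = (400, 300, 200, 100, -100, -200)


def threshold_levels(base):
    """Return the base threshold together with the six offset thresholds."""
    return sorted([base] + [base + offset for offset in OFFSETS])


# Example with an arbitrary base threshold of -450.
print(threshold_levels(-450))
```

Each level yields one binary ROI map, which gives the 7-stepped ROI mentioned in the presenter notes.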
23
24 
ROIs → Optimal fuzzy rule based on GA → Nodule candidates
• A fuzzy rule-based classifier removes vessels and noise 
• Vessel removal 
– Volume is much larger than that of a nodule 
– Elongated object 
• Noise removal 
– Radius of ROI is smaller than 3 mm 
– Or bigger than 30 mm 
• Remaining ROIs are nodule candidates 
25 
Index  Feature
1      Area
2      Diameter
3      Circularity
4      Volume
5-8    Bounding box dimensions
9      Elongation
26 
Rule  Description
R1    Small noise
R2    Vessel
R3    Large noise
R4    Nodule
Not precise 
Not optimal 
Pruning rules for nodule candidate detection
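The conventional crisp rules R1 to R4 can be sketched as a simple decision function. The 3 mm and 30 mm radius limits come from the slides; the elongation cut-off `max_elongation` is a hypothetical value chosen only for illustration, and the volume criterion for vessels is left out because no threshold is stated.

```python
def classify_roi(radius_mm, elongation, max_elongation=4.0):
    """Assign an ROI to one of the four pruning rules (crisp version)."""
    if radius_mm < 3.0:
        return "small noise"       # R1
    if elongation > max_elongation:
        return "vessel"            # R2 (hypothetical elongation cut-off)
    if radius_mm > 30.0:
        return "large noise"       # R3
    return "nodule candidate"      # R4


for roi in [(2.0, 1.0), (5.0, 6.0), (35.0, 1.0), (5.0, 1.5)]:
    print(roi, classify_roi(*roi))
```

Hard cut-offs like these are what the slide calls not precise and not optimal, which motivates replacing them with fuzzy rules tuned by a GA.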
27 
[Figure: network with an input layer (X1 to X5), a fuzzy layer (F1 to F5), a rule layer (R1 to R3), and a summation node Σ producing the output Y, together with a GA-based fuzzy rule inducer]
Optimal fuzzy rules are induced by using GA-Fuzzy Inference System
• A fuzzy inference system (FIS) is a system 
that uses fuzzy set theory to map inputs 
(features in the case of fuzzy classification) 
to outputs (classes in the case of fuzzy 
classification). 
• Two FISs are discussed here: the 
Mamdani and the Sugeno. 
28
29
30
(a) A fuzzy inference system and (b) a fuzzy inference system as neural network. 
31
• Input 
– Features extracted from ROIs 
• Fuzzy layer 
– Input features are fuzzified 
– The fuzzy membership function is optimized by GA 
• Rule layer 
– Fuzzified features are combined into an optimal fuzzy 
rule 
– The weight matrix for the linear combination is optimized by 
GA 
• Output 
– Defuzzification of optimal fuzzy rules 
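The four layers can be read as a small feed-forward computation. The sketch below is an interpretation under stated assumptions: membership functions are passed in as plain callables, the rule layer is a weighted sum per rule, and the output is the sum of the rule activations compared against a hypothetical cut-off `cut`. None of these names or the cut-off value come from the slides.

```python
def forward(features, membership_fns, weights, cut=0.5):
    """Propagate one ROI feature vector through the fuzzy network.

    features:       list of raw feature values (input layer)
    membership_fns: one callable per feature (fuzzy layer)
    weights:        one row of weights per rule (rule layer)
    """
    fuzzified = [mf(x) for mf, x in zip(membership_fns, features)]
    rules = [sum(w * f for w, f in zip(row, fuzzified)) for row in weights]
    y = sum(rules)       # summation node (assumed defuzzification step)
    return y, y >= cut   # score and keep/prune decision


# Toy example with two features and two rules.
score, keep = forward(
    [0.0, 2.0],
    [lambda v: 0.5, lambda v: v / 4.0],
    [[1.0, 0.0], [0.0, 1.0]],
)
print(score, keep)
```

In this reading, the GA would search over the membership function parameters and the entries of `weights`.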
32
• Chromosome 
– Fuzzy membership function selection 
• Sigmoidal membership function 
• Negative sigmoidal membership function 
• Product of two sigmoidal membership functions 
• Gaussian membership function 
– Parameters of the selected fuzzy membership function 
• Fitness function 
– Subtraction between average membership degree of 
true and false data 
33 
$d = \mu_t - \mu_f$ (average membership degree of true data minus that of false data)
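The candidate membership functions and the separation fitness can be sketched as follows. The parametrizations (slope and centre for the sigmoids, width and centre for the Gaussian) are common textbook forms and are assumptions here; the slides do not state which forms were used.

```python
import math


def sigmoid_mf(x, a, c):
    """Sigmoidal membership with slope a and centre c."""
    return 1.0 / (1.0 + math.exp(-a * (x - c)))


def neg_sigmoid_mf(x, a, c):
    """Negative sigmoidal membership: the same curve with the slope reversed."""
    return sigmoid_mf(x, -a, c)


def psigmoid_mf(x, a1, c1, a2, c2):
    """Product of two sigmoidal membership functions."""
    return sigmoid_mf(x, a1, c1) * sigmoid_mf(x, a2, c2)


def gaussian_mf(x, sigma, c):
    """Gaussian membership with width sigma and centre c."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def separation_fitness(mf, true_values, false_values):
    """Average membership of true data minus average membership of false data."""
    mean_true = sum(mf(v) for v in true_values) / len(true_values)
    mean_false = sum(mf(v) for v in false_values) / len(false_values)
    return mean_true - mean_false


# A sigmoid centred between two well-separated groups scores close to 1.
mf = lambda v: sigmoid_mf(v, 1.0, 0.0)
print(separation_fitness(mf, [10.0], [-10.0]))
```

A fitness near 1 means the membership function assigns high degrees to true nodules and low degrees to false ROIs, which is the behaviour the GA is asked to maximize.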
• Chromosome 
– Weight matrix for linear combinations of 
fuzzified features 
• Fitness function 
– Balanced accuracy of classification results 
34 
$$BACC = \frac{TPR + (1 - FPR)}{2}$$
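Balanced accuracy is the mean of the true positive rate and the true negative rate. A minimal sketch follows; the function name is an illustrative choice, and the example inputs are the sensitivity and false positive rate reported for conventional rule-based pruning in the experimental results.

```python
def balanced_accuracy(tpr, fpr):
    """Mean of the true positive rate and the true negative rate (1 - FPR)."""
    return (tpr + (1.0 - fpr)) / 2.0


# Sensitivity 0.9800 and false positive rate 0.6068 (conventional pruning).
print(round(balanced_accuracy(0.9800, 0.6068), 4))
```

Unlike plain accuracy, this measure is not dominated by the far larger number of false ROIs, which is why it is a reasonable fitness for a heavily imbalanced candidate set.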
  

• To evaluate the performance of the proposed method, the Lung Image 
Database Consortium (LIDC) database is used 
• LIDC database, National Cancer Institute (NCI), United States 
– The LIDC is developing a publicly available database of thoracic 
computed tomography (CT) scans as a medical imaging research 
resource to promote the development of computer-aided 
detection or characterization of pulmonary nodules 
• The database consists of 84 CT scans (up to 2009) 
– 100-400 Digital Imaging and Communications in Medicine (DICOM) images per scan 
– An XML data file containing the physician annotations of nodules 
– 148 nodules 
– The pixel size in the database ranged from 0.5 to 0.76 mm 
– The reconstruction interval ranged from 1 to 3 mm 
35
36 
False positives in ROIs: 72466 

Sensitivity  False positive rate  Accuracy  Balanced accuracy
0.9800       0.6068               0.3965    0.6866
Performance of conventional rule-based pruning 

False positives: 43970 
False positives per scan: 523.4524
37 
Elongation Circularity
38 
AUROC = 0.9711
39 
Run   Fitness  Sensitivity  False positive rate  Accuracy  Balanced accuracy  AUROC
1     0.8883   0.9825       0.2302               0.7709    0.8761             0.9711
2     0.8892   0.9775       0.2148               0.7862    0.8813             0.9708
3     0.8863   0.9900       0.2732               0.7282    0.8584             0.9699
4     0.8874   0.9875       0.2515               0.7498    0.8680             0.9692
5     0.8865   0.9900       0.2676               0.7338    0.8612             0.9737
6     0.8871   0.9875       0.2562               0.7452    0.8657             0.9711
7     0.8871   0.9900       0.2565               0.7449    0.8668             0.9745
8     0.8882   0.9800       0.2332               0.7679    0.8734             0.9692
9     0.8882   0.9875       0.2341               0.7672    0.8767             0.9708
10    0.8885   0.9875       0.2291               0.7720    0.8792             0.9709
mean  0.8877   0.9860       0.2446               0.7566    0.8707             0.9711
std   0.0009   0.0044       0.0190               0.0189    0.0078             0.0017
Performance of optimal fuzzy rule-based pruning 

False positives: 17728 
False positives per scan: 211.04
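The reported per-scan figures can be cross-checked from the counts given in the results. The sketch below uses only numbers stated in this deck (84 scans, 72466 false ROIs, 43970 and 17728 remaining false positives); the function and variable names are illustrative.

```python
SCANS = 84                     # CT scans in the evaluated LIDC subset
ROI_FALSE_POSITIVES = 72466    # false positives among the extracted ROIs


def summarize(false_positives):
    """Return (false positives per scan, false positive rate)."""
    return false_positives / SCANS, false_positives / ROI_FALSE_POSITIVES


conventional = summarize(43970)
fuzzy = summarize(17728)
reduction = 1.0 - 17728 / 43970  # fraction of false positives removed

print(conventional, fuzzy, reduction)
```

Under these counts, the optimal fuzzy rule leaves roughly 40% of the false positives that remain after conventional pruning, and the derived false positive rates agree with the tabulated values of 0.6068 and 0.2446.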
• An automated pulmonary nodule detection 
system was studied 
• A pulmonary nodule detection CAD system 
is an effective solution for early detection 
of lung cancer 
• The proposed method is based on an 
optimal fuzzy rule 
• The optimal fuzzy rule pruned unwanted 
ROIs with higher sensitivity than conventional 
rule-based pruning (0.9860 vs. 0.9800) 
40
41
42

Optimal fuzzy rule based pulmonary nodule detection


Editor's Notes

  • #2 Good afternoon everyone. My name is Wook-Jin Choi. It is my honor to present to you. Today, I would like to talk about Automatic Detection of Pulmonary Nodules in Lung CT Images
  • #3 Here you see the outline of my presentation
  • #4 Lung cancer is the primary cause of cancer-related death in the world. Most patients diagnosed with lung cancer today already have advanced disease (40% are stage IV, 30% are stage III), and the current five-year survival rate is only 16%. However, if defective nodules are detected at an early stage, the survival rate can be increased.
  • #8 Multi-detector scanners. However, each scan contains hundreds of images that must be evaluated by a radiologist, which is a fatiguing process.
  • #9 Flow chart of pulmonary nodule detection. In this thesis, an automated pulmonary nodule detection system is studied. In this regard, nodule detection CAD systems that use a genetic programming (GP)-based classifier, hierarchical block-image analysis, and a shape-based feature descriptor are proposed. The nodule detection system generally consists of three steps: lung segmentation, nodule candidate detection, and false positive reduction.
  • #10 Ten selected recent CAD systems. The performance of these systems will be compared with that of the proposed systems.
  • #11 Lung volume segmentation is an essential preprocessing step. The main purpose is to separate the voxels corresponding to the lung cavity. The accuracy of lung segmentation largely influences the nodule detection results, so the lung region extraction should be performed before any other part of nodule detection. To extract the lung region, I propose a segmentation method based on adaptive thresholding and voxel labelling. Because the lung region is dark, I convert the image to a binary image in which voxels below the selected threshold are foreground.
  • #13 After thresholding, there are many noisy parts, like gas in the intestine. End of thresholding
  • #15 End of labeling
  • #17 End of extraction
  • #18 In the end, I correct the contour of the lung volume because there may be some nodules on the wall side of the lung.
  • #21 The lung volume is correctly extracted from lung CT images by using the proposed segmentation method
  • #22 To detect nodule candidates, I need to extract ROIs. So, the optimal threshold and six additional thresholds are obtained, and I can get a 7-stepped ROI.
  • #26 The extracted ROIs have useless parts like blood vessels and small noise, so I have to remove them. A vessel is a long object distributed throughout the whole lung like a tree. After removal, the remaining ROIs are nodule candidates.
  • #31 One of the large problems with the Sugeno FIS is that there is no good intuitive method for determining the coefficients, p, q, and r. Also, the Sugeno has only crisp outputs which may not be what is desired in a given HCI application. Why then would you use a Sugeno FIS rather than a Mamdani FIS? The reason is that there are algorithms which can be used to automatically optimize the Sugeno FIS.