A two-stage SVM-based mammographic CBIR for CADx
L. TSOCHATZIDIS (1), A. KARAHALIOU (2), K. ZAGORIS (1), S. SKIADOPOULOS (2), N. ARIKIDIS (2), L. COSTARIDOU (2), I. PRATIKAKIS (1)
(1) Democritus University of Thrace, Department of Electrical and Computer Engineering, Visual Computing Group
(2) University of Patras, School of Medicine, Department of Medical Physics
TSOCHATZIDIS ET AL., A TWO-STAGE SVM-BASED MAMMOGRAPHIC CBIR FOR CADX, MICCAI-BIA 2015, 19/10/2015

CADx in Mammography
Mammography is a dominant imaging modality for early detection of breast cancer
Often, diagnosis leads to unnecessary biopsies
Two types of CADx:
• Single-stage: Classification schemes for benign-malignant discrimination
• Two-stage: Content-Based Image Retrieval (CBIR) that feeds a diagnosis step, which discriminates between benign and malignant cases

Proposed CBIR-CAD System
CAD system that incorporates a CBIR step and a decision step
Retrieves similar images based on low-level image features
Margin-specific CBIR
Diagnosis is based on the ranked lists produced by CBIR
Provides visual aid and enables consultation of previous cases, leading to increased confidence in incorporating CAD-cued results

Margin-type classes
Circumscribed
Spiculated
Microlobulated
Ill-defined (+ Obscured)

CBIR-CAD’s pipeline
[Figure: pipeline diagram ending in a BENIGN / MALIGNANT decision]

Semi-automatic Segmentation
Dijkstra’s shortest-path algorithm is used to obtain the optimal path between sequential pairs of landmark points on the mass boundary
A new cost function is proposed to avoid background correction techniques, which may deform the mass contour and introduce additional adjustment parameters
$$\mathrm{cost}(P(x, y)) = I(P) - g_{AB}(P), \qquad g_{AB}(P) = I_A + (I_B - I_A)\,\frac{d(A, P)}{d(A, B)}$$
Arikidis, N., Skiadopoulos, S., Karahaliou, A., Kazantzi, A., Vassiou, K., Tsochatzidis, L., Pratikakis, I.,
Costaridou, L.: Shortest paths of mass contour estimates in mammography. In: MICCAI-BIA 2015
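
The interpolation term of the cost function can be illustrated with a minimal sketch. Here A and B are two consecutive landmark points, P is a candidate pixel, and d is taken to be the Euclidean distance. The function names are hypothetical, and the absolute value is an assumption added so that costs stay non-negative, as Dijkstra's algorithm requires; the cited paper should be consulted for the exact definition.

```python
import math

def interpolated_intensity(a, b, p, intensity_a, intensity_b):
    """g_AB(P): intensity linearly interpolated between landmarks A and B,
    weighted by how far P lies from A relative to the A-B distance."""
    return intensity_a + (intensity_b - intensity_a) * math.dist(a, p) / math.dist(a, b)

def pixel_cost(intensity_p, a, b, p, intensity_a, intensity_b):
    """Deviation of the pixel intensity I(P) from the interpolated value.
    Assumption: the absolute value is taken to keep edge costs non-negative."""
    return abs(intensity_p - interpolated_intensity(a, b, p, intensity_a, intensity_b))

# A pixel halfway between the landmarks is compared against the mean intensity.
cost = pixel_cost(170.0, (0, 0), (10, 0), (5, 0), 100.0, 200.0)
```

A pixel whose intensity matches the interpolated boundary intensity gets zero cost, so the shortest path tends to follow pixels that look like the two landmarks it connects.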

CBIR Architecture

Feature Extraction – Global Shape
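Solidity factor: The degree to which the shape fills its convex hull (1 for a convex shape)
$$\mathrm{Solidity} = \frac{\text{Area of mass } (A)}{\text{Area of its convex hull } (H)}$$
Compactness factor: The degree to which a shape deviates from a perfect circle (0 for a circle), where P is the perimeter of the mass
$$\mathrm{Compactness} = 1 - \frac{4\pi A}{P^2}$$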
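
Both shape factors can be computed directly from a polygonal contour. The sketch below is only an illustration under stated assumptions: the contour is a list of (x, y) vertices, the area uses the shoelace formula, the hull uses Andrew's monotone chain, and compactness uses the standard form 1 - 4πA/P², which is 0 for a perfect circle. All function names are hypothetical.

```python
import math

def polygon_area(pts):
    """Shoelace formula for a closed polygon given as (x, y) vertices."""
    n = len(pts)
    s = sum(pts[i][0] * pts[(i + 1) % n][1] - pts[(i + 1) % n][0] * pts[i][1]
            for i in range(n))
    return abs(s) / 2.0

def polygon_perimeter(pts):
    """Sum of the edge lengths of the closed polygon."""
    n = len(pts)
    return sum(math.dist(pts[i], pts[(i + 1) % n]) for i in range(n))

def convex_hull(pts):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(points):
        chain = []
        for p in points:
            # Drop points that would make a clockwise or straight turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half_hull(pts) + half_hull(pts[::-1])

def solidity(contour):
    """Area of the mass divided by the area of its convex hull."""
    return polygon_area(contour) / polygon_area(convex_hull(contour))

def compactness(contour):
    """1 - 4*pi*A / P^2: 0 for a circle, approaching 1 for irregular outlines."""
    area = polygon_area(contour)
    perimeter = polygon_perimeter(contour)
    return 1.0 - 4.0 * math.pi * area / (perimeter * perimeter)

# An L-shaped contour has a concavity, so its solidity drops below 1.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
```

For the L-shaped example the mass area is 3 and the hull area is 3.5, so the solidity is about 0.857.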

Feature Extraction – Global Shape
Class            Compactness   Solidity
Circumscribed    0.008         0.99
Microlobulated   0.20          0.92
Spiculated       0.83          0.32
Ill-defined      0.17          0.93

Feature Extraction – DFT of NRL
Normalized Radial Length (NRL) Function
1. The distance of each contour point to the shape’s center of gravity
2. Normalized by the average radial length
3. Computation of Discrete Fourier Transform (DFT) coefficients
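
The three steps above can be sketched in a few lines. This is an illustration only: the center of gravity is approximated by the mean of the contour points, the DFT is evaluated directly rather than with an FFT, and returning coefficient magnitudes scaled by the signal length is an assumption, since the slides do not say how the coefficients are turned into features. Function names are hypothetical.

```python
import cmath
import math

def normalized_radial_length(contour):
    """Steps 1 and 2: distance of each contour point to the centroid,
    divided by the average radial length."""
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    radial = [math.hypot(x - cx, y - cy) for x, y in contour]
    mean_radial = sum(radial) / len(radial)
    return [r / mean_radial for r in radial]

def dft_magnitudes(signal, n_coeffs):
    """Step 3: magnitudes of the first n_coeffs DFT coefficients,
    scaled by the signal length (an assumed normalization)."""
    n = len(signal)
    magnitudes = []
    for k in range(n_coeffs):
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        magnitudes.append(abs(coeff) / n)
    return magnitudes

# A circle has a constant NRL, so only the zero-frequency coefficient is non-zero.
circle = [(3 + 2 * math.cos(2 * math.pi * t / 64), -1 + 2 * math.sin(2 * math.pi * t / 64))
          for t in range(64)]
features = dft_magnitudes(normalized_radial_length(circle), 4)
```

Irregular margins, such as spiculated ones, would add energy to the higher-frequency coefficients, which is what makes the descriptor useful for separating margin types.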

Feature Extraction - Texture-based
Rubber Band Straightening Transform (RBST)
• Unfolding the ribbon around the contour as a flat image
• RBST column → line segment normal to the contour
• RBST row → path iso-distant to the contour
• Intensity profiles at every contour point along a line segment normal to the contour

• RBST image
• Sobel gradient magnitude operator
• Detected edge points

Extracted Features
• Distance between edge points of consecutive columns
• Distance between edge points and the middle row of the RBST image
• Magnitude of the gradient on the y-axis
• Divergence of the gradient orientation from the vertical direction
• Acutance (the sum of the differences in gray-level values between pixels that are equidistant from either side of the contour)
The mean and standard deviation (SD) of each of the above functions are used as features
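
As a rough illustration of the first two features, the sketch below assumes one detected edge row per RBST column, measures the distance between consecutive edge points as their row offset, and uses the population standard deviation. These choices, the function name, and the dictionary keys are assumptions; the slides do not specify them, and no normalization is applied here.

```python
import statistics

def edge_profile_features(edge_rows, n_rows):
    """Mean and SD of two edge-point functions of an RBST image.

    edge_rows: detected edge row index for each RBST column (assumed input).
    n_rows: height of the RBST image, used to locate the middle row.
    """
    middle_row = (n_rows - 1) / 2
    # Row offset between edge points of consecutive columns.
    consecutive = [abs(b - a) for a, b in zip(edge_rows, edge_rows[1:])]
    # Offset of each edge point from the middle row (the original contour).
    to_middle = [abs(r - middle_row) for r in edge_rows]
    return {
        "mean_dist_edge_points": statistics.mean(consecutive),
        "sd_dist_edge_points": statistics.pstdev(consecutive),
        "mean_dist_center_row": statistics.mean(to_middle),
        "sd_dist_center_row": statistics.pstdev(to_middle),
    }

feats = edge_profile_features([5, 5, 6, 4], n_rows=11)
```

A sharp, circumscribed margin should yield edge points that stay close to the middle row and to each other, giving small values for all four features.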

Feature Name              Circumscribed   Ill-defined
Avg. dist. edge points    0.036           0.747
Avg. dist. center row     0.323           0.865
SD dist. edge points      0.105           0.879
SD dist. center row       0.077           0.952

CBIR Architecture

The SVM Layer – Support Vector Machines
Binary linear classifiers
For non-linear problems: samples are projected to a higher-dimensional space
Finds the hyperplane that optimally separates the two classes
Decision function:
$$f(x) = \mathrm{sign}(w \cdot x + b)$$

The SVM Layer – Structure
An ensemble of binary SVM classifiers is employed
One SVM for each class – Four SVMs in total
Each SVM outputs the participation level of a sample in the corresponding class
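
A minimal sketch of how such an ensemble could turn a feature vector into a four-value descriptor is given below. It assumes linear SVMs that are already trained, represents each as a hypothetical (w, b) pair, and takes the raw decision value w · x + b as the participation level; the slides do not state which SVM output is used, and a kernel SVM would replace the dot product.

```python
CLASSES = ("circumscribed", "microlobulated", "spiculated", "ill-defined")

def decision_value(w, b, x):
    """Signed score w . x + b of sample x for one linear SVM."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def svm_sign(w, b, x):
    """Binary decision f(x) = sign(w . x + b), with 0 mapped to +1."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def participation_vector(models, x):
    """One score per margin class; `models` maps class name to (w, b)."""
    return [decision_value(*models[c], x) for c in CLASSES]

# Toy weights for illustration only, not trained values.
toy_models = {
    "circumscribed": ([1.0, 0.0], 0.0),
    "microlobulated": ([0.0, 1.0], 0.0),
    "spiculated": ([1.0, 1.0], -1.0),
    "ill-defined": ([-1.0, 0.0], 0.5),
}
descriptor = participation_vector(toy_models, [2.0, -1.0])
```

The resulting four-dimensional descriptor is what a retrieval step could compare between a query and the stored cases.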

CBIR Architecture

The Diagnosis Stage
GOAL: Provide the likelihood of malignancy for a query case.
• Based on the K most similar ROIs retrieved
• Similarity between the query and an item, where d(q, x_i) is the Euclidean distance:
$$S(q, x_i) = \frac{1}{d(q, x_i) + 1}$$
• Two decision indices investigated:
$$D_1(q) = \frac{\sum_{i=1}^{N} S(q, x_i^{TP})}{\sum_{i=1}^{N} S(q, x_i^{TP}) + \sum_{j=1}^{M} S(q, x_j^{FP})}$$
$$D_2(q) = \frac{1}{K} \sum_{i=1}^{N} S(q, x_i^{TP}) - \frac{1}{K} \sum_{j=1}^{M} S(q, x_j^{FP})$$
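
The two decision indices can be sketched as follows. The sketch assumes that the TP items are the N retrieved ROIs labelled malignant and the FP items are the M retrieved ROIs labelled benign, with N + M = K; the slides do not define these sets explicitly. Descriptors are plain tuples and the function names are hypothetical.

```python
import math

def similarity(q, x):
    """S(q, x) = 1 / (d(q, x) + 1), with d the Euclidean distance."""
    return 1.0 / (math.dist(q, x) + 1.0)

def decision_indices(q, retrieved):
    """Return (D1, D2) for query q.

    retrieved: list of (descriptor, is_malignant) pairs for the K top-ranked ROIs.
    """
    k = len(retrieved)
    s_tp = sum(similarity(q, x) for x, malignant in retrieved if malignant)
    s_fp = sum(similarity(q, x) for x, malignant in retrieved if not malignant)
    d1 = s_tp / (s_tp + s_fp)   # share of similarity mass on malignant cases
    d2 = s_tp / k - s_fp / k    # difference of the two sums, each divided by K
    return d1, d2

query = (0.0, 0.0)
top_k = [((0.0, 0.0), True), ((1.0, 0.0), False), ((3.0, 4.0), True)]
d1, d2 = decision_indices(query, top_k)
```

In this toy case the similarities are 1, 1/2 and 1/6, giving D1 = 0.7 and D2 = 2/9. D1 lies in [0, 1], while D2 lies in [-1, 1] and is positive when the malignant neighbours dominate.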

Experimental Results
Experiments on a dataset of 400 mammograms in total from DDSM
Precise contour delineation by an expert radiologist
CC and MLO views are treated independently
5-fold cross validation
Grid search for SVM and kernel parameters

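Experimental Results – Evaluation metrics
Precision at R (P@R): The percentage of correct images in the top-R places of the ranked list
Mean Average Precision (MAP): The mean, over all N queries, of the Average Precision (AP), which measures the overall performance of a single query
$$AP = \frac{\sum_{k=1}^{n} P@k \cdot rel(k)}{\text{number of relevant documents}}$$
$$rel(k) = \begin{cases} 1, & \text{if the } k\text{-th image is correct} \\ 0, & \text{otherwise} \end{cases}$$
$$MAP = \frac{1}{N} \sum_{q=1}^{N} AP_q$$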
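
These retrieval metrics are short enough to compute by hand. The sketch below represents a ranked list as 0/1 relevance flags, where 1 means the retrieved image belongs to the query's margin class; the function names are hypothetical.

```python
def precision_at(rels, r):
    """P@R: fraction of correct images among the top-R ranked items."""
    return sum(rels[:r]) / r

def average_precision(rels, n_relevant):
    """AP: sum of P@k over the ranks k holding a correct image,
    divided by the number of relevant documents."""
    hits = 0
    total = 0.0
    for k, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / k  # P@k at a rank where rel(k) = 1
    return total / n_relevant if n_relevant else 0.0

def mean_average_precision(queries):
    """MAP: mean of AP over all queries; each query is (rels, n_relevant)."""
    return sum(average_precision(r, n) for r, n in queries) / len(queries)

ranked = [1, 0, 1, 0]  # correct images at ranks 1 and 3
ap = average_precision(ranked, n_relevant=2)
```

For this ranked list, P@1 = 1 and P@3 = 2/3, so AP = (1 + 2/3) / 2 = 5/6.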

Experimental Results - CBIR
Classes          P@R             MAP
Circumscribed    0.659 ± 0.061   0.732 ± 0.048
Microlobulated   0.805 ± 0.119   0.859 ± 0.092
Spiculated       0.823 ± 0.073   0.881 ± 0.049
Ill-defined      0.493 ± 0.036   0.586 ± 0.044
Average          0.697 ± 0.049   0.765 ± 0.037

Area under ROC curve (AUC)
The Receiver Operating Characteristic (ROC) curve illustrates the performance of a binary
classifier as its discrimination threshold is varied
The curve is created by plotting the true positive rate (sensitivity) against the false positive rate
(1-specificity) at various threshold settings
The AUC of a classifier is equivalent to the probability that the classifier will rank a randomly
chosen positive instance higher than a randomly chosen negative instance
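
The ranking interpretation of the AUC leads to a direct, if quadratic, way of computing it. The sketch below counts, over all positive-negative pairs, how often the positive case receives the higher score, counting ties as half; the function name and the sample scores are illustrative only.

```python
def auc_by_ranking(scores_pos, scores_neg):
    """Probability that a random positive scores higher than a random negative."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(scores_pos) * len(scores_neg))

# Decision-index values for malignant (positive) and benign (negative) queries.
auc = auc_by_ranking([0.9, 0.6, 0.4], [0.5, 0.3])
```

Here five of the six pairs are ranked correctly, so the AUC is 5/6.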

Experimental Results - Decision
[Figure: Classification performance of D1 and D2 in terms of the Az index (y-axis: 0.74 to 0.82) for 3 to 15 ranked items (x-axis)]
Maximum $A_z = 0.815$ using D2 for K = 13 ranked items

Conclusions
Two-stage CBIR-CAD:
• Margin-specific CBIR stage
• Diagnosis stage
Incorporation of training into the feature extraction (SVM ensemble)
High performance for spiculated and microlobulated masses
Lack of standard datasets makes comparison between methods difficult
Future efforts:
• Performance improvement for ill-defined masses
• Feature selection for each SVM independently
• Introduction of weights in the decision calculation to adjust the significance of each retrieved ROI
• Use of a relevance feedback mechanism to improve performance
