Michael Biehl
Mathematics and Computing Science
University of Groningen / NL
Tutorial as satellite event of CAIP 2015
Saint Martin’s Institute of Higher Education
Malta, August 31, 2015
Distance based classifiers: Basic concepts,
recent developments, and application examples
www.cs.rug.nl/~biehl
St. Martin’s Institute, August 2015
1) Distance based classifiers, Learning Vector Quantization
classification problems
distance based classifiers, from KNN to prototypes
the basic scheme: LVQ1
cost function based training: GLVQ
Application: classification of adrenal tumors (I)
Receiver Operating Characteristics
performance evaluation by (cross-) validation
2) GLVQ implementation
stochastic gradient descent, learning rate schedule
batch gradient descent, step size control
Demo: GLVQ with the no-nonsense GMLVQ toolbox
Overview
3) Alternative distance measures and Relevance Learning
Fixed distance measures:
Minkowski measures, Kernelized distances, Divergences
Application example: detection of Cassava Mosaic Disease
Adaptive distance measures
Matrix Relevance Learning Vector Quantization
Application example: Adrenal Tumors cont'd
Demos: GMLVQ with the no-nonsense GMLVQ toolbox
Application example: Early diagnosis of Rheumatoid Arthritis
Uniqueness, regularization and singularity control
Challenges in bio-medical data analysis
Concluding remarks, references
Overview
1) Distance based classifiers,
Learning Vector Quantization
classification problems
- character/digit/speech recognition
- medical diagnoses
- pixel-wise segmentation in image processing
- object recognition/scene analysis
- fault detection in technical systems
- ...
machine learning approach:
extract information from example data
parameterized in a learning system (neural network, LVQ, SVM...)
working phase: application to novel data
here only: supervised learning, classification
distance based classification
assignment of data (objects, observations,...)
to one or several classes (crisp/soft) (categories, labels)
based on comparison with reference data (samples, prototypes)
in terms of a distance measure (dis-similarity, metric)
representation of data (a key step!)
- collection of qualitative/quantitative descriptors
- vectors of numerical features
- sequences, graphs, functional data
- relational data, e.g. in terms of pairwise (dis-) similarities
K-NN classifier
a simple distance-based classifier
- store a set of labeled examples
- classify a query according to the
label of the Nearest Neighbor
(or the majority of K NN)
- local decision boundary according
to (e.g.) Euclidean distances
- piece-wise linear class borders
parameterized by all examples
[figure: labeled examples and a query point “?” in feature space]
+ conceptually simple, no training required, one parameter (K)
- expensive storage and computation, sensitivity to “outliers”
can result in overly complex decision boundaries
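The K-NN scheme above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the toy data, and the use of the squared Euclidean distance are not from the slides.

```python
from collections import Counter

def squared_euclidean(a, b):
    # squared Euclidean distance between two feature vectors
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def knn_classify(query, examples, labels, k=1):
    """Return the majority label among the k stored examples closest to the query."""
    order = sorted(range(len(examples)),
                   key=lambda i: squared_euclidean(query, examples[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# hypothetical toy data: five labeled examples in a 2-dim. feature space
examples = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 1.2), (0.1, 0.3)]
labels = ["A", "A", "B", "B", "A"]
print(knn_classify((0.8, 0.9), examples, labels, k=3))
```

Note that every query requires distances to all stored examples, which is the storage and computation cost mentioned above.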
prototype based classification
a prototype based classifier [Kohonen 1990, 1997]
- represent the data by one or
several prototypes per class
- classify a query according to the
label of the nearest prototype
(or alternative schemes)
- local decision boundaries according
to (e.g.) Euclidean distances
- piece-wise linear class borders
parameterized by prototypes
[figure: class prototypes and a query point “?” in feature space]
+ less sensitive to outliers, lower storage needs, little computational
effort in the working phase
- training phase required in order to place prototypes,
model selection problem: number of prototypes per class, etc.
What about the curse of dimensionality ?
concentration of norms/distances for large N
“distance based methods are bound to fail in high dimensions”?
LVQ:
- prototypes are not just random data points
- carefully selected representatives of the data
- distances of a given data point to prototypes are compared
projection to non-trivial
low-dimensional subspace!
see also [Ghosh et al., 2007; Witoelar et al., 2010]:
models of LVQ training, analytical treatment in the limit of large N,
successful training needs training examples
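Nearest Prototype Classifier (NPC)
set of prototypes $\{ w^j \}_{j=1}^{M}$
carrying class-labels $c(w^j)$
based on dissimilarity / distance measure $d(w, x)$
nearest prototype classifier (NPC):
given $x$
- determine the winner $w^*$ with $d(w^*, x) \le d(w^j, x)$ for all $j$
- assign $x$ to class $c(w^*)$
minimal requirements: $d(w, x) \ge 0$
standard example:
squared Euclidean $d(w, x) = (w - x)^\top (w - x)$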
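A minimal Python sketch of the nearest prototype classifier, assuming prototypes and labels are stored as plain lists. The names `nearest_prototype`, `prototypes` and `proto_labels` are hypothetical; the distance is passed in as a function so that other measures can be substituted.

```python
def squared_euclidean(w, x):
    # standard example: squared Euclidean distance
    return sum((wi - xi) ** 2 for wi, xi in zip(w, x))

def nearest_prototype(x, prototypes, proto_labels, dist=squared_euclidean):
    """Return (index, label) of the winner, i.e. the prototype closest to x."""
    winner = min(range(len(prototypes)), key=lambda j: dist(prototypes[j], x))
    return winner, proto_labels[winner]

# hypothetical example: two prototypes for class 0, one for class 1
prototypes = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
proto_labels = [0, 1, 0]
print(nearest_prototype((0.9, 0.8), prototypes, proto_labels))
```

In the working phase only the prototypes are needed, not the training examples.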
∙ identification of prototype vectors from labeled example data
∙ distance based classification (e.g. Euclidean)
Learning Vector Quantization
N-dimensional data, feature vectors
competitive learning: LVQ1 [Kohonen, 1990]
• initialize prototype vectors
for different classes
• present a single example
• identify the winner
(closest prototype)
• move the winner
- closer towards the data (same class)
- away from the data (different class)
∙ tessellation of feature space
[piece-wise linear]
∙ distance-based classification
[here: Euclidean distances]
∙ generalization ability
correct classification of new data
∙ aim: discrimination of classes
(≠ vector quantization
or density estimation)
LVQ1
iterative training procedure:
randomized initial $w^j$, e.g. close to the class-conditional means
sequential presentation of labelled examples $\{ x^\mu, y^\mu \}$
… the winner takes it all:
LVQ1 update step: $w^* \leftarrow w^* + \eta_w \, \Psi(c(w^*), y^\mu) \, (x^\mu - w^*)$
with $\Psi = +1$ if the classes coincide and $\Psi = -1$ otherwise
learning rate $\eta_w$
many heuristic variants/modifications: [Kohonen, 1990, 1997]
- learning rate schedules $\eta_w(t)$ [Darken & Moody, 1992]
- update more than one prototype per step
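LVQ1
LVQ1 update step: $w^* \leftarrow w^* + \eta_w \, \Psi(c(w^*), y^\mu) \, (x^\mu - w^*)$
with $\Psi = +1$ if the classes coincide and $\Psi = -1$ otherwise
LVQ1-like update for
generalized distance: $w^* \leftarrow w^* - \eta_w \, \Psi(c(w^*), y^\mu) \, \tfrac{1}{2} \, \partial d(w^*, x^\mu) / \partial w^*$
requirement:
update decreases (increases) distance if classes coincide (are different)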
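As a rough illustration, one LVQ1 step for the squared Euclidean distance could look as follows in Python. The function name `lvq1_step`, the in-place list update, and the default learning rate are assumptions made for this sketch.

```python
def lvq1_step(x, y, prototypes, proto_labels, eta=0.1):
    """One LVQ1 step (in place): the winner moves towards x if the labels
    coincide and away from x otherwise. Returns the index of the winner."""
    dists = [sum((wi - xi) ** 2 for wi, xi in zip(w, x)) for w in prototypes]
    j = dists.index(min(dists))                    # winner takes it all
    psi = 1.0 if proto_labels[j] == y else -1.0    # +1 same class, -1 different
    prototypes[j] = [wi + eta * psi * (xi - wi)
                     for wi, xi in zip(prototypes[j], x)]
    return j

# hypothetical example: one prototype per class
protos = [[0.0, 0.0], [1.0, 1.0]]
winner = lvq1_step([0.2, 0.0], 0, protos, [0, 1], eta=0.1)
print(winner, protos)
```

A full training run would loop over a randomized sequence of examples, typically with a decreasing learning rate.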
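cost function based LVQ
one example: Generalized LVQ [Sato & Yamada, 1995]
minimize $E = \sum_{\mu=1}^{P} \Phi(e^\mu)$ with $e^\mu = \frac{d_J^\mu - d_K^\mu}{d_J^\mu + d_K^\mu} \in [-1, 1]$
two winning prototypes: $w^J$, the closest prototype with the correct label, $d_J^\mu = d(w^J, x^\mu)$,
and $w^K$, the closest prototype with a different label, $d_K^\mu = d(w^K, x^\mu)$
$\Phi$ sigmoidal (linear for small arguments), e.g. $\Phi(e) = 1 / (1 + \exp(-\gamma e))$:
E approximates number of misclassifications
$\Phi$ linear, $\Phi(e) = e$:
E favors large margin separation of classes
small $d_J^\mu$, large $d_K^\mu$:
E favors class-typical prototypes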
cost function based LVQ
There is nothing objective about objective functions
James L. McClelland
GLVQ
training = optimization with respect to prototype position,
e.g. single example presentation, stochastic sequence of examples
(stochastic gradient descent),
update of two prototypes per step:
$w^J \leftarrow w^J - \eta_w \, \Phi'(e^\mu) \, \frac{2 d_K^\mu}{(d_J^\mu + d_K^\mu)^2} \, \frac{\partial d_J^\mu}{\partial w^J}$
$w^K \leftarrow w^K + \eta_w \, \Phi'(e^\mu) \, \frac{2 d_J^\mu}{(d_J^\mu + d_K^\mu)^2} \, \frac{\partial d_K^\mu}{\partial w^K}$
where $d_J^\mu$ ($d_K^\mu$) is the distance of $x^\mu$ from the closest correct (incorrect) prototype $w^J$ ($w^K$)
and $e^\mu = (d_J^\mu - d_K^\mu) / (d_J^\mu + d_K^\mu)$
based on non-negative, differentiable distance
requirement: update decreases $d_J^\mu$, increases $d_K^\mu$
based on Euclidean distance, $d(w, x) = (x - w)^2$:
moves prototypes towards / away from
sample with prefactors
$w^J \leftarrow w^J + \eta_w \, \Phi'(e^\mu) \, \frac{4 d_K^\mu}{(d_J^\mu + d_K^\mu)^2} \, (x^\mu - w^J)$
$w^K \leftarrow w^K - \eta_w \, \Phi'(e^\mu) \, \frac{4 d_J^\mu}{(d_J^\mu + d_K^\mu)^2} \, (x^\mu - w^K)$
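A hedged Python sketch of one stochastic GLVQ step with squared Euclidean distance. It assumes the linear choice Φ(e) = e (so Φ' = 1), at least one prototype per class, and d_J + d_K > 0; the names `glvq_step` and `sq_dist` are made up for this illustration.

```python
def sq_dist(w, x):
    return sum((wi - xi) ** 2 for wi, xi in zip(w, x))

def glvq_step(x, y, prototypes, proto_labels, eta=0.05):
    """One stochastic gradient step on e = (dJ - dK) / (dJ + dK), in place.
    Returns the value of e before the update."""
    d = [sq_dist(w, x) for w in prototypes]
    correct = [j for j, c in enumerate(proto_labels) if c == y]
    wrong = [j for j, c in enumerate(proto_labels) if c != y]
    J = min(correct, key=lambda j: d[j])   # closest prototype with the correct label
    K = min(wrong, key=lambda j: d[j])     # closest prototype with a wrong label
    dJ, dK = d[J], d[K]
    norm = (dJ + dK) ** 2
    e_before = (dJ - dK) / (dJ + dK)
    # prefactors 4 dK / (dJ + dK)^2 and 4 dJ / (dJ + dK)^2 follow from the chain rule
    prototypes[J] = [w + eta * 4.0 * dK / norm * (xi - w)
                     for w, xi in zip(prototypes[J], x)]
    prototypes[K] = [w - eta * 4.0 * dJ / norm * (xi - w)
                     for w, xi in zip(prototypes[K], x)]
    return e_before

# hypothetical example: the correct prototype is attracted, the wrong one repelled
protos = [[0.0, 0.0], [1.0, 0.0]]
print(glvq_step([0.4, 0.0], 0, protos, [0, 1]), protos)
```

For a sigmoidal Φ the two updates would additionally be scaled by Φ'(e), which emphasizes examples close to the decision boundary.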
related schemes
Many variants of LVQ
intuitive schemes: LVQ2, LVQ2.1, LVQ3, OLVQ, ...
cost function based: RSLVQ (likelihood ratios) ...
Supervised Neural Gas (NG)
many prototypes, rank based update
Supervised Self-Organizing Maps (SOM)
neighborhood relations, topology preserving mapping
Radial Basis Function Networks (RBF)
hidden units = centroids (prototypes) with Gaussian activation
An example problem:
classification of adrenal tumors
Wiebke Arlt, Angela Taylor, Dave J. Smith, Peter Nightingale,
P.M. Stewart, C.H.L. Shackleton, et al.
School of Medicine, Queen Elizabeth Hospital,
University of Birmingham/UK (+ several centers in Europe)
Petra Schneider, Han Stiekema, Michael Biehl
Johann Bernoulli Institute for Mathematics and Computer Science,
University of Groningen
tumor classification
[Arlt et al., J. Clin. Endocrinology & Metabolism, 2011]
∙ adrenal tumors are common (1-2%)
and mostly found incidentally
∙ adrenocortical carcinomas (ACC) account
for 2-11% of adrenal incidentalomas
( ACA: adrenocortical adenomas )
∙ conventional diagnostic tools lack sensitivity
and are labor and cost intensive (CT, MRI)
[figure: adrenal gland, source: www.ensat.org]
∙ idea: tumor classification based on steroid excretion profile
tumor classification
- urinary steroid excretion (24 hours)
- 32 potential biomarkers
- biochemistry imposes correlations, grouping of steroids
tumor classification
data set:
102 patients with benign ACA
45 patients with malignant ACC
[figure: color coded excretion values (logarithmic scale, relative to healthy controls);
axes: steroid marker # vs. patient # (ACA, ACC)]
tumor classification
Generalized LVQ, training and performance evaluation
∙ data divided in 90% training and 10% test set
∙ determine prototypes by (stochastic) gradient descent:
typical profiles (1 per class)
∙ employ Euclidean distance measure
in the 32-dim. feature space
∙ apply classifier to test data,
evaluate performance (error rates)
∙ repeat and average over many random splits
tumor classification
[figure: prototypes, steroid excretion in ACA/ACC]
tumor classification
tumor classification
∙ Receiver Operating Characteristics (ROC) [Fawcett, 2000]
obtained by introducing a biased NPC with threshold θ
(θ = 0: unbiased NPC)
axes: false positive rate (1-specificity) vs. true positive rate (sensitivity)
performance measure: Area under Curve (AUC)
extreme working points:
all tumors classified as ACA
- no false alarms
- no true positives detected
all tumors classified as ACC
- all true positives detected
- max. number of false alarms
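The threshold sweep behind the ROC can be sketched as follows. The sketch assumes that each test example has a real-valued score, for instance the difference of its distances to the two class prototypes, and that an example is flagged as positive when its score reaches the threshold θ. Function names and example scores are hypothetical.

```python
def roc_curve(scores, is_positive):
    """ROC points (false positive rate, true positive rate) obtained by
    sweeping a threshold theta over all observed score values."""
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    points = [(0.0, 0.0)]   # theta above all scores: nothing is flagged
    for theta in sorted(set(scores), reverse=True):
        tp = sum(1 for s, p in zip(scores, is_positive) if s >= theta and p)
        fp = sum(1 for s, p in zip(scores, is_positive) if s >= theta and not p)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# hypothetical scores for three positive and two negative examples
scores = [0.9, 0.8, 0.4, 0.3, 0.1]
is_positive = [True, True, False, True, False]
print(auc(roc_curve(scores, is_positive)))
```

The two end points of the curve correspond to the extreme working points listed above: nothing flagged, and everything flagged.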
GLVQ performance:
ROC characteristics (averaged over splits of the data set)
AUC = 0.87
tumor classification
2) GLVQ implementation
brief excursion: gradient descent
stochastic gradient descent: convergence requires
decreasing learning rate with ‘time’ (number of steps t),
e.g. as $\eta(t) \propto 1/t$ for large $t$
condition [Robbins and Monro, 1951]: $\sum_t \eta(t) = \infty$, $\sum_t \eta(t)^2 < \infty$
alternatives:
- more general optimization schemes
(conjugate gradient, line search, second order derivatives…)
- adaptive learning rates
- …
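batch gradient descent
batch gradient-based descent w.r.t. GLVQ costs E
concatenated prototype vector $\vec{W} = (w^1, w^2, \ldots, w^M)$
update in the direction of
the negative (full) gradient: $\vec{W} \leftarrow \vec{W} - \alpha_w \, \nabla_W E / |\nabla_W E|$
step size $\alpha_w$ for normalized gradient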
batch gradient descent: step size control
step size $\alpha_w$
too small:
slow convergence
too large:
over-shooting, zig-zagging, oscillatory behavior, divergence
Waypoint averaging
[Papari, Biehl, Bunte, 2011]
(here: modified default step)
default: increase $\alpha_w$ by a factor, e.g. 1.1
if E(mean over the k last iterates) < E(next iterate)
    replace next iterate by the mean
    reduce $\alpha_w$ by a factor, e.g. 2/3
end
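The step size control can be tried out on a toy problem. The Python sketch below is not the toolbox implementation: it assumes a simple quadratic cost in place of E, k = 3 waypoints, and it increases the step size only when the waypoint average is not used. All names are hypothetical.

```python
import math

def cost(w):
    # toy quadratic cost standing in for E (assumption, not the GLVQ cost)
    return 0.5 * (w[0] ** 2 + 25.0 * w[1] ** 2)

def grad(w):
    return [w[0], 25.0 * w[1]]

def descend(w, alpha=0.5, k=3, steps=60, up=1.1, down=2.0 / 3.0):
    """Normalized gradient descent with waypoint averaging and step size control."""
    history = [list(w)]
    for _ in range(steps):
        g = grad(w)
        norm = math.sqrt(sum(gi * gi for gi in g)) or 1.0
        nxt = [wi - alpha * gi / norm for wi, gi in zip(w, g)]  # normalized step
        recent = history[-k:]
        mean = [sum(col) / len(recent) for col in zip(*recent)]
        if cost(mean) < cost(nxt):
            nxt = mean        # replace the next iterate by the waypoint average
            alpha *= down     # and reduce the step size
        else:
            alpha *= up       # default: increase the step size
        w = nxt
        history.append(list(w))
    return w, alpha

w_final, alpha_final = descend([4.0, 1.0])
print(cost(w_final), alpha_final)
```

The averaging step damps the zig-zagging that appears when the step size has grown too large for the narrow direction of the cost surface.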
A no-nonsense beginners’ tool for G(M)LVQ:
http://www.cs.rug.nl/~biehl/No-Nonsense-GMLVQ.zip
“no nonsense” GMLVQ code collection
- collection of Matlab code (no toolboxes required),
includes example data sets and limited documentation
- mainly for demo-purposes (do not use for critical applications);
efficiency, programming style, etc. were not in the focus
provides: single runs, visualization of the data set,
leave-one-out, subset validation procedure
variants/options: GLVQ, [GRLVQ], GMLVQ,
null-space projection,
singularity-control
example demo
>> load twoclass-difficult.mat   % 98 examples, 34-dim. feature vectors, binary labels
>> [gmlvq_system,curves_single,param_set]=run_single(fvec,lbl,100)
[figures: learning curves and step sizes; prototypes]
[figures: training set ROC; visualization (features 33, 34)]
example demo
>> [gmlvq_mean,roc_val,lcurves_mean,lcurves_std,param_set]=...
run_validation(fvec,lbl,50);
GLVQ without relevances
…
learning curves, averages over 5 validation runs
with 10 % of examples left out for testing
[figures: avg. learning curves; avg. validation set ROC; avg. prototypes]
more links
More sophisticated Matlab code: [K. Bunte]
(more options, training by non-linear optimization etc.)
Relevance and Matrix adaptation in Learning Vector
Quantization (GLVQ, GRLVQ, GMLVQ and LiRaM LVQ):
http://matlabserver.cs.rug.nl/gmlvqweb/web/
Pre- and re-prints etc.:
http://www.cs.rug.nl/~biehl/
3) Alternative distance measures
and relevance learning
Alternative distance measures
fixed, pre-defined distance measures:
GLVQ (or more general cost function based LVQ)
can be based on general, differentiable distances,
e.g. Minkowski measures $d_p(w, x) = \left( \sum_{j=1}^{N} |x_j - w_j|^p \right)^{1/p}$
examples: Kernelized distances,
Divergences (statistics)
possible work-flow
- select several distance measures according to prior knowledge
or a data-driven choice in a preprocessing step
- compare performance of various measures
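Kernelized distances
rewrite squared Euclidean
distance in terms of dot-product:
$d(w, x) = (x - w)^2 = x \cdot x - 2 \, x \cdot w + w \cdot w$
distance measure associated with general inner product or
kernel function:
$d_\kappa(w, x) = \kappa(x, x) - 2 \, \kappa(x, w) + \kappa(w, w)$
e.g. Gaussian Kernel $\kappa(x, w) = \exp\left( -(x - w)^2 / (2 \sigma^2) \right)$
implicit mapping to high-dimensional space for
better separability of classes, similar: Support Vector Machine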
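A small Python sketch of a kernelized distance, written as κ(x,x) − 2κ(x,w) + κ(w,w). The Gaussian kernel with width `sigma` is one possible choice; the function names are assumptions for this illustration. With the plain dot product as kernel, the squared Euclidean distance is recovered.

```python
import math

def dot(a, b):
    # plain dot product: the kernel that reproduces the Euclidean case
    return sum(ai * bi for ai, bi in zip(a, b))

def gaussian_kernel(a, b, sigma=1.0):
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))

def kernel_distance(w, x, kernel=gaussian_kernel):
    """Squared distance in the implicit feature space of the kernel."""
    return kernel(x, x) - 2.0 * kernel(x, w) + kernel(w, w)

w, x = (1.0, 2.0), (3.0, 5.0)
print(kernel_distance(w, x, kernel=dot))   # squared Euclidean distance of w and x
print(kernel_distance(w, x))               # Gaussian kernel version
```

For the Gaussian kernel the distance is bounded, since κ(x,x) = κ(w,w) = 1 and 0 < κ(x,w) ≤ 1.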
Divergence Based LVQ:
Detection of Cassava Mosaic Disease
Ernest Mwebaze, John Quinn, Jennifer Aduwo
Department of Computer Science, Makerere University, Kampala
Namulonge Crop Research Center, Uganda
Petra Schneider, Michael Biehl
Johann Bernoulli Institute, University of Groningen
Thomas Villmann, Sven Haase, Frank-Michael Schleif
University of Applied Sciences, Mittweida
University Bielefeld, Germany
divergence based LVQ
[Neurocomputing, 2011]
divergence based LVQ
Example: detection of Mosaic disease in Cassava (manioc) plants
Makerere University and Namulonge Crop Research Center, Uganda
[figure: leaf images, healthy vs. Mosaic]
LVQ classifiers based on histogram specific distance measures:
divergences (statistics) for non-negative, possibly normalized data
(densities, spectral data, more general functional data)
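Squared Euclidean distance: $d(w, x) = \sum_j (x_j - w_j)^2$
Cauchy-Schwarz divergence: $D_{CS}(x, w) = \tfrac{1}{2} \log\left[ (x \cdot x)(w \cdot w) \right] - \log(x \cdot w)$
[figure: panels (a), (b), (c)]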
divergence based LVQ
example family: γ-divergences
- non-symmetric (in general)
- violates triangle inequality
includes: Kullback-Leibler, Cauchy-Schwarz, Euclidean
divergence based LVQ
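Two of the divergences named above can be written down directly for histograms. The sketch assumes strictly positive, normalized histograms and uses the Cauchy-Schwarz form ½ log[(p·p)(q·q)] − log(p·q); the function names and example histograms are hypothetical.

```python
import math

def kullback_leibler(p, q):
    """D_KL(p || q) for strictly positive, normalized histograms."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def cauchy_schwarz(p, q):
    """D_CS(p, q) = 1/2 log[(p.p)(q.q)] - log(p.q), symmetric in p and q."""
    pp = sum(pi * pi for pi in p)
    qq = sum(qi * qi for qi in q)
    pq = sum(pi * qi for pi, qi in zip(p, q))
    return 0.5 * math.log(pp * qq) - math.log(pq)

# hypothetical 3-bin histograms
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]
print(kullback_leibler(p, q), kullback_leibler(q, p))   # differ: non-symmetric
print(cauchy_schwarz(p, q), cauchy_schwarz(q, p))       # equal: symmetric
```

For a non-symmetric divergence, the order of arguments (data first or prototype first) is a modelling choice that has to be fixed before training.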
http://www.air.ug/mcrops/
Adaptive distance measures
relevance learning:
- employ a parameterized distance measure
with only the mathematical form fixed in advance
- update its parameters in the training process
together with prototype training
- adaptive, data driven dissimilarity
example: Matrix Relevance LVQ
data-driven optimization of prototypes
and relevance matrix
in the same training process (≠ pre-processing)
Relevance Learning
Quadratic distance measure
generalized quadratic distance:
$d_\Lambda(w, x) = (x - w)^\top \Lambda \, (x - w) = \left[ \Omega (x - w) \right]^2$ with $\Lambda = \Omega^\top \Omega$
scaling of features, general linear transformation of feature space
potential normalization: $\sum_i \Lambda_{ii} = \sum_{i,j} \Omega_{ij}^2 = 1$
variants:
one global, several local, class-wise relevance matrices $\Lambda^{(j)}$
→ piecewise quadratic decision boundaries
rectangular $\Omega$: discriminative low-dim. representation,
e.g. for visualization [Bunte et al., 2012]
diagonal matrices: single feature weights [Bojer et al., 2001],
[Hammer et al., 2002]
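The generalized quadratic distance can be sketched in plain Python with Ω stored as a list of rows. The helper names are hypothetical; a rectangular Ω (fewer rows than features) is handled by the same code and yields a low-dimensional projection.

```python
def matvec(omega, v):
    # matrix-vector product, omega given as a list of rows
    return [sum(o * vi for o, vi in zip(row, v)) for row in omega]

def quadratic_distance(w, x, omega):
    """d(w, x) = |Omega (x - w)|^2 = (x - w)^T Lambda (x - w), Lambda = Omega^T Omega."""
    z = matvec(omega, [xi - wi for xi, wi in zip(x, w)])
    return sum(zi * zi for zi in z)

def relevance_matrix(omega):
    """Lambda = Omega^T Omega."""
    n = len(omega[0])
    return [[sum(row[i] * row[j] for row in omega) for j in range(n)]
            for i in range(n)]

def normalize(omega):
    """Rescale Omega such that trace(Lambda) = sum_ij Omega_ij^2 = 1."""
    s = sum(o * o for row in omega for o in row) ** 0.5
    return [[o / s for o in row] for row in omega]

identity = [[1.0, 0.0], [0.0, 1.0]]
print(quadratic_distance((0.0, 0.0), (1.0, 2.0), identity))      # squared Euclidean
print(quadratic_distance((0.0, 0.0), (1.0, 2.0), [[1.0, 1.0]]))  # rectangular Omega
```

Parameterizing the distance by Ω rather than by Λ directly keeps Λ positive semi-definite by construction, so the distance cannot become negative during training.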
But this is just Mahalanobis distance…
[Mahalanobis, 1936]
$d_M(x, y) = \sqrt{(x - y)^\top S^{-1} (x - y)}$ (‘two point version’)
S: covariance matrix of random vectors
(calculated once from the data, fixed definition, not adaptive)
if you insist…
So it is a generalized Mahalanobis distance?
No.
[figures: “a generalized broccoli”; “a generalization of Ohm’s Law”]
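Relevance Matrix LVQ
optimization of prototypes and distance measure $d_\Lambda(w, x) = (x - w)^\top \Omega^\top \Omega \, (x - w)$
Matrix-LVQ1 (WTA): LVQ1-like updates of the winner $w^*$ and of $\Omega$,
$\Delta w^* \propto \Psi \, \Lambda \, (x^\mu - w^*)$, $\Delta \Omega \propto - \Psi \, \Omega \, (x^\mu - w^*)(x^\mu - w^*)^\top$
with $\Psi = +1$ if the classes coincide and $\Psi = -1$ otherwise
Generalized Matrix LVQ (GMLVQ): gradient descent on the GLVQ cost function E
with respect to prototypes and $\Omega$,
$\Delta w \propto - \partial E / \partial w$, $\Delta \Omega \propto - \partial E / \partial \Omega$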
heuristic interpretation
standard Euclidean distance for
linearly transformed features $\Omega x$
$\Lambda_{jj} = \sum_i \Omega_{ij}^2$ summarizes
- the contribution of the original dimension $j$
- the relevance of original features for the classification
interpretation assumes implicitly:
features have equal order of magnitude
e.g. after z-score-transformation → $\langle x_j \rangle = 0$, $\langle x_j^2 \rangle = 1$
(averages over data set)
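The z-score transformation mentioned above can be sketched as follows, assuming the data set is a list of feature vectors and no feature is constant (a constant feature would give zero standard deviation). The function name is hypothetical.

```python
import math

def z_score(data):
    """Column-wise z-score transformation: zero mean and unit variance per feature."""
    n = len(data)
    cols = list(zip(*data))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n)
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in data]

# hypothetical data: two features of very different magnitude
data = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]]
print(z_score(data))
```

In a validation setting, means and standard deviations would be computed on the training set and then applied unchanged to the test data.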
Relevance Matrix LVQ
optimization of
prototype positions
distance measure(s)
in one training process
(≠ pre-processing)
motivation:
improved performance
- weighting of features and pairs of features
simplified classification schemes
- elimination of non-informative, noisy features
- discriminative low-dimensional representation
insight into the data / classification problem
- identification of most discriminative features
- incorporation of prior knowledge (e.g. structure of Ω)
related schemes
Relevance LVQ variants
local, rectangular, structured, restricted... relevance matrices
for visualization, functional data, texture recognition, etc.
relevance learning in Robust Soft LVQ, Supervised NG, etc.
combination of distances for mixed data ...
Relevance Learning related schemes in supervised learning ...
RBF Networks [Backhaus et al., 2012]
Neighborhood Component Analysis [Goldberger et al., 2005]
Large Margin Nearest Neighbor [Weinberger et al., 2006, 2010]
and many more!
Linear Discriminant Analysis (LDA)
one prototype per class + global matrix,
different objective function!
Classification of adrenal tumors (cont‘d)
[Arlt et al., J. Clin. Endocrinology & Metabolism, 2011]
[Biehl et al., Europ. Symp. Artificial Neural Networks (ESANN), 2012]
∙ adrenocortical tumors, difficult differential diagnosis:
ACC: adrenocortical carcinomas
ACA: adrenocortical adenomas
∙ idea: steroid metabolomics
tumor classification based on urinary steroid excretion
32 candidate steroid markers:
adrenocortical tumors
Generalized Matrix LVQ, ACC vs. ACA classification
data set: 24 hrs. urinary steroid excretion
102 patients with benign ACA
45 patients with malignant ACC
∙ data divided in 90% training, 10% test set (z-score transformed)
∙ determine prototypes:
typical profiles (1 per class)
∙ adaptive generalized quadratic distance measure
parameterized by $\Omega$ ($\Lambda = \Omega^\top \Omega$)
∙ apply classifier to test data,
evaluate performance (error rates, ROC)
∙ repeat and average over many random splits
tumor classification (cont’d)
[Arlt et al., 2011]
[Biehl et al., 2012]
tumor classification
Relevance matrix
[figures: off-diagonal and diagonal elements of the relevance matrix]
fraction of runs (random splits) in which a
steroid is rated among 9 most relevant markers
subset of 9 selected steroids ↔ technical realization (patented, University
of Birmingham/UK)
[figures: off-diagonal and diagonal elements; excretion of steroid 19 in ACA vs. ACC]
discriminative,
e.g. steroid 19
tumor classification
[figures: off-diagonal and diagonal elements; excretion of steroid 8 in ACA vs. ACC]
non-trivial role:
steroid 8 among the most relevant!
tumor classification
weakly discriminative markers (steroids 8 and 12):
highly discriminative
combination of markers!
[figure: steroid 12 vs. steroid 8]
tumor classification
ROC characteristics
clear improvement due to
adaptive distances
[figure: ROC curves, (1-specificity) vs. (sensitivity)]
AUC by distance measure:
Euclidean: 0.87
diagonal rel. (GRLVQ): 0.93
full matrix (GMLVQ): 0.97
tumor classification
tumor classification
observation / theory:
low rank of resulting relevance matrix,
often: single relevant eigendirection
[figure: eigenvalues in ACA/ACC classification]
intrinsic regularization:
nominally ~ N×N adaptive parameters in Matrix LVQ
reduce to ~ N effective degrees of freedom
low-dimensional representation
facilitates, e.g., visualization of labeled data sets
theory: stationarity of Matrix RLVQ
Biehl et al., Stationarity of Matrix Relevance LVQ,
Proc. IJCNN 2015
tumor classification
[figure: visualization of the data set (ACA, ACC)]
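modified batch gradient descent
batch gradient-based descent w.r.t. costs E
concatenated prototype vector $\vec{W}$,
elements of $\Omega$
updates in the direction of
the normalized gradients:
$\vec{W} \leftarrow \vec{W} - \alpha_w \, \nabla_W E / |\nabla_W E|$, $\Omega \leftarrow \Omega - \alpha_\Omega \, \nabla_\Omega E / |\nabla_\Omega E|$
waypoint averaging and step size control
separately for $\vec{W}$ and $\Omega$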
example demo
>> load twoclass-difficult.mat   % 98 34-dim. feature vectors, binary classification
>> [gmlvq_system,curves_single,param_set]=run_single(fvec,lbl,100)
[figures: prototypes and relevance matrix; learning curves and step sizes]
[figures: training set ROC; visualization of the data set]
example demo
>> [gmlvq_mean,roc_val,lcurves_mean,lcurves_std,param_set]=...
run_validation(fvec,lbl,50);
GMLVQ
…
learning curves, averages over 5 validation runs
with 10 % of examples left out for testing
[figures: avg. validation set ROC; avg. prototypes and relevance matrix]
a multi-class problem
>> load uci-segmenation-sampled
>> [gmlvq_system, curves_single,param_set]=run_single(fvec,lbl,50)
[figures: visualization of 18-dim. data set; avg. prototypes and rel. matrix]
Singularity control
Note:
singularity of relevance matrix can lead to numerical instabilities
and over-simplification effects
singularity control: penalty term
derivative
-> modified matrix update
(implemented in the no-nonsense gmlvq code collection)
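Uniqueness
(I) uniqueness of $\Omega$, given $\Lambda = \Omega^\top \Omega$
matrix square root is not unique:
irrelevant rotations, reflections, symmetries…
canonical representation in terms of eigen-decomposition of Λ:
$\Lambda = \sum_j \lambda_j \, v_j v_j^\top$, $\hat{\Omega} = \sum_j \sqrt{\lambda_j} \, v_j v_j^\top = \Lambda^{1/2}$
- pos. semi-definite
symmetric solution
(Matlab: “sqrtm”)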
Uniqueness
(II) uniqueness of relevance matrix for given data set?
simple example:
consider two identical, entirely
irrelevant features, e.g. $x_1 = x_2$ in all examples
contributions cancel exactly if $\Omega_{i1} = -\Omega_{i2}$ for all rows $i$
-> disregarded in the classification
of the training data
but naïve interpretation of diagonal
suggests high relevance, could cause
non-trivial effect for novel data
(II) uniqueness
given transformation:
are in the null-space of
is possible if the rows of
→ identical mapping of examples, different for
possible to extend by prototypes
is singular if
features are correlated, dependent
Uniqueness
regularization
training process yields
determine with eigenvectors and eigenvalues
regularization:
(K<J ) retain the eigenspace corresponding to largest eigenvalues
removes also span of small non-zero eigenvalues
(K=J ) removes all null-space contributions, unique solution
with minimal Euclidean norm of row vectors
equivalent: (Moore-Penrose-Inverse X+ )
(implemented in the no-nonsense gmlvq code collection)
regularization
pre-processing of data (PCA):
- mapped feature space
- fixed K
- prototypes yet unknown
- example: diagnosis of rheumatoid arthritis
regularized mapping after/during training:
- retains original features
- flexible K
- may include prototypes
- example: Wine data set
Strickert, Hammer, Villmann, Biehl,
Regularization and improved interpretation of linear data mappings
and adaptive distance measures, IEEE SSCI 2013
illustrative example
infra-red spectral data: 124 wine samples,
256 wavelengths, 30 training data,
94 test spectra
alcohol content: low, medium, high
GMLVQ classification
[UCI ML repository]
GMLVQ
[figure annotations: over-fitting effect; null-space correction, P=30 dimensions;
best performance, 7 dimensions remaining]
regularization
- enhances generalization
- smoothens relevance profile/matrix
- removes ‘false relevances’
- improves interpretability of Λ
[figures: original vs. regularized; raw relevance matrix vs. posterior regularization]
Early diagnosis of Rheumatoid Arthritis
Expression of chemokines CXCL4 and CXCL7 by synovial
macrophages defines an early stage of rheumatoid arthritis
Ann. of the Rheumatic Diseases, 2015 (available online)
L. Yeo, N. Adlard, M. Biehl, M. Juarez, M. Snow
C.D. Buckley, A. Filer, K. Raza, D. Scheel-Toellner
rheumatoid arthritis (RA)
[figure: uninflamed control; established RA; early inflammation: resolving vs. early RA]
cytokine based diagnosis of RA
at earliest possible stage?
ultimate goals:
understand pathogenesis and
mechanism of progression
[figure: synovium, tissue section, mRNA extraction, real-time PCR]
synovial tissue cytokine expression
IL1A IL17F FASL CXCL4 CCL15 TGFB1 KITLG
IL1B IL18 CD70 CXCL5 CCL16 TGFB2 MST1
IL1RN IL19 CD30L CXCL6 CCL17 TGFB3 SPP1
IL2 IL20 4-1BB-L CXCL7 CCL18 EGF SFRP1
IL3 IL21 TRAIL CXCL9 CCL19 FGF2 ANXA1
IL4 IL22 RANKL CXCL10 CCL20 TGFA TNFRSF13B
IL5 IL23A TWEAK CXCL11 CCL21 IGF2 IL6R
IL6 IL24 APRIL CXCL12 CCL22 VEGFA NAMPT
IL7 IL25 BAFF CXCL13 CCL23 VEGFB C1QTNF3
IL8 IL26 LIGHT CXCL14 CCL24 MIF VCAM1
IL9 IL27 TL1A CXCL16 CCL25 LIF LGALS1
IL10 IL28A GITRL CCL1 CCL26 OSM LGALS9
IL11 IL29 FASLG CCL2 CCL27 ADIPOQ LGALS3
IL12A IL32 IFNA1 CCL3 CCL28 LEP LGALS12
IL12B IL33 IFNA2 CCL4 XCL1 GHRL
IL13 LTA IFNB1 CCL5 XCL2 RETN
IL14 TNF IFNG CCL7 CX3CL1 CTLA4
IL15 LTB CXCL1 CCL8 CSF1 EPO
IL16 OX40L CXCL2 CCL11 CSF2 TPO
IL17A CD40L CXCL3 CCL13 CSF3 FLT3LG
panel of 117 cytokines
• cell signaling proteins
• regulate immune response
• produced by, e.g.
T-cells, macrophages,
lymphocytes, fibroblasts, etc.
GMLVQ analysis
pre-processing:
• log-transformed expression values (117 dim. data, 47 samples in total)
• 21 leading principal components explain ca. 90% of the total variation
Two two-class problems:
(A) established RA vs. uninflamed controls
(B) early RA vs. resolving inflammation
• 1 prototype per class, global relevance matrix, distance measure: $d_\Lambda(w, x) = (x - w)^\top \Lambda \, (x - w)$
• leave-one-out validation,
evaluation in terms of Receiver Operating Characteristics
Matrix Relevance LVQ
[figures, for established RA vs. uninflamed control and for early RA vs. resolving inflammation:
leave-one-out ROC (true positive rate vs. false positive rate);
diagonal relevances Λii vs. cytokine index i]
initialization
of LVQ system
CXCL4 chemokine (C-X-C motif) ligand 4
CXCL7 chemokine (C-X-C motif) ligand 7
direct study on protein level, staining / imaging of synovial tissue:
macrophages: predominant source of CXCL4/7 expression
protein level studies
• high levels of CXCL4 and CXCL7
in the first 12 weeks of synovitis
in early RA
• expression on macrophages
outside of blood vessels
discriminates
early RA / resolving cases
(2 PhD thesis projects)
[figures, for established RA vs. uninflamed control and for early RA vs. resolving inflammation:
leave-one-out ROC (true positive rate vs. false positive rate);
diagonal relevances Λii vs. cytokine index i]
relevant cytokines, annotated:
macrophage stimulating 1
four class problem
one prototype per class
and one global matrix
trained in one go
low-rank relevance
matrix (rank ≈ 2)
visualization of data
set in terms of
eigenvectors of Λ
Niels Kluiter
research internship
at JBI Groningen
four class problem
- extract binary classifiers (healthy vs. est. RA, resolving vs. early RA)
by restricting the system to the corresponding prototypes
for varying number K of PCs used as feature vectors
- determine corresponding ROC performances
robust in a range
of 14 < K < 20
healthy vs. est. RA
K=16: AUC = 0.92
early vs. resolving RA
K=16: AUC = 0.79
to do: nested L1O-val.
four class problem
read off problem-
specific relevances
from eigenvectors
of Λ
[figure panels: control vs. est. RA; resolving vs. early RA]
Some challenges in biomedical data,
further application examples
challenges in bio-medical data
A. Filer, A. Clark, M. Juarez, J. Falconer et al.
- micro-array gene expression data
high-dimensional (~50000 probes)
PCA + GMLVQ
(work in progress)
early Arthritis vs. resolving inflammations
- preliminary result:
better than random classification
close inspection of high relevance genes:
system discriminates male/female patients
prediction reflects higher prevalence of RA in female patients
leave-one-out
“accuracy is not enough”
interpretability
- important: understand the basis of decisions
- white-box approaches for classification/regression etc.
- insights into the data and problem at hand
- e.g. selection of most discriminative bio-markers
challenges
relevance of steroid markers
www.ensat.org
adrenocortical tumors
adenomas (ACA)
carcinomas (ACC)
W. Arlt, M. Biehl et al.
Urine steroid metabolomics as a biomarker tool for detecting
malignancy in adrenal tumors J. of Clin. Endocrinology &
Metabolism 96: 3775-3784 (2011).
large amounts of data , e.g. image data bases
life lines (longitudinal patient data)
prescription data bases [E. Hak, K. Taxis]
challenges
A
B
C
D
query images
retrieval:
√ - same class
× - different class
UMCG data base of skin lesion images
K. Bunte, M. Biehl, M.F. Jonkman, N. Petkov Learning Effective Color
Features for Content Based Image Retrieval in Dermatology. Pattern
Recognition 44 (2011) 1892-1902.
high-dimensional data, e.g. medical images (CT, MRI, PET …)
gene expression, DNA sequences, …
challenges
[figure: projection on first eigenvector of Λ]
M. Biehl, K. Bunte, P. Schneider Analysis
of Flow Cytometry Data by Matrix
Relevance Learning Vector Quantization
PLOS One 8: e59401 (2013)
- low-dim. representation
- feature selection
- visualization
high-throughput flow cytometry
~ 10k cells × 30 markers/sample
→ derive 186 features
→ GMLVQ, low-dim. projection
incomplete data
challenges
- missing values, noise, uncertain labels…
imputation, semi-supervised learning
- complementary data sets…
learning from privileged information, transfer learning
mixed data
- combination of different sources / technical platforms
suitable adaptive & integrative (dis-) similarity measures
E. Mwebaze, G. Bearda, M. Biehl, D. Zühlke Combining dissimilarity
measures for prototype-based classification Proc. of the 23rd European
Symposium on Artificial Neural Networks ESANN 2015, d-side publishing,
31-36 (2015)
distances combined
N-dim. vector: Euclidean
M-bin histogram: divergence
temporal sequence: (mis-)alignment
...
combined distance measure, e.g.
+source-specific prototypes
relevance learning!
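One possible form of such a combined measure is a weighted sum of source-specific distances. The sketch below assumes that form; the function names, the choice of a symmetrized Kullback-Leibler divergence for histograms, and the fixed weights are illustrative assumptions. In relevance learning the weights would be adapted during training, and an alignment-based distance for temporal sequences is omitted here.

```python
import math

def sq_euclidean(x, w):
    """Squared Euclidean distance between two N-dim. vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, w))

def sym_kl(p, q, eps=1e-12):
    """Symmetrized Kullback-Leibler divergence between two normalized histograms."""
    return sum((a - b) * math.log((a + eps) / (b + eps)) for a, b in zip(p, q))

def combined_distance(sample, prototype, measures, weights):
    """Weighted sum of source-specific distances (weights assumed non-negative)."""
    return sum(wt * d(sample[k], prototype[k])
               for k, (d, wt) in enumerate(zip(measures, weights)))

# Each item consists of one part per data source: (vector, histogram).
sample = ([1.0, 2.0], [0.5, 0.5])
prototype = ([0.0, 2.0], [0.5, 0.5])      # source-specific prototype parts
measures = [sq_euclidean, sym_kl]
weights = [0.7, 0.3]                      # fixed here; adaptive in relevance learning
d = combined_distance(sample, prototype, measures, weights)
```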
challenges
imbalanced data sets
- prevalence of diseases (screening vs. differential diagnosis)
- role of false positive / false negatives
T. Villmann, M. Kaden, W. Herrmann, M. Biehl
Learning Vector Quantization for ROC-optimization
possible working points
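Working points can be made concrete by sweeping a threshold over a real-valued classifier output. The sketch below assumes such a score is available (for LVQ it could be, for instance, a difference of distances to the closest prototypes of the two classes); scores, labels and function names are made up for illustration.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    labels: 1 = positive class, 0 = negative class; a higher score means "more positive".
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]                         # threshold above all scores
    for theta in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= theta and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= theta and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]   # illustrative classifier outputs
labels = [1, 1, 0, 1, 0, 0]
pts = roc_points(scores, labels)          # each entry is one possible working point
area = auc(pts)
```

Which working point is appropriate depends on prevalence and on the relative cost of false positives and false negatives, e.g. screening vs. differential diagnosis.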
causal relations vs. correlation
challenges
- predictive power vs. causal dependence?
www.causality.inf.ethz.ch/data/LUCAS.html
E. Mwebaze, J. Quinn, M. Biehl
Causal Relevance Learning for Robust Classification under Interventions
Proc. 19th Europ. Symp. on Artificial Neural Networks ESANN 2011
challenges
data not given as vectors in a Euclidean space,
e.g. symbolic sequences of different length
known: pairwise dis-similarities, e.g. edit-distance
‘relational data’ given as matrix
loooooooongword
shrtwrd
pseudo-Euclidean embedding
prototypes expressed as
Non-vectorial data:
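As an illustration of relational data, the following sketch computes the edit (Levenshtein) distance between strings of different length and collects the pairwise values in a matrix. The third word and the function names are added for illustration only.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimal number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))            # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute, or match at no cost
        prev = curr
    return prev[-1]

def dissimilarity_matrix(items, dist):
    """Matrix of pairwise dissimilarities, the only information used by relational methods."""
    return [[dist(a, b) for b in items] for a in items]

words = ["loooooooongword", "shrtwrd", "shortword"]
D = dissimilarity_matrix(words, edit_distance)
```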
non-vectorial data
distances
Training: updates w.r.t. prototype coefficients, e.g. LVQ1-like or GLVQ
Working phase: WTA classification of novel data:
distance from known example data
distance from prototypes
[Hammer, Schleif, Zhu, 2011] [Hammer & Hasenfuss, 2010]
prototypes
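The sketch below assumes the convex-combination form commonly used in relational prototype methods (cf. the cited work of Hammer et al.): a prototype is written as w = Σ_i α_i x_i with Σ_i α_i = 1, and D holds squared pairwise dissimilarities of the known examples. Under that assumption, d(x_i, w) = [Dα]_i − ½ αᵀDα. The toy data, labels and function names are illustrative, and training of the coefficients is not shown.

```python
import numpy as np

def proto_self_term(D, alpha):
    """The term 1/2 * alpha^T D alpha, which depends on the prototype only."""
    return 0.5 * alpha @ D @ alpha

def dist_to_training(D, alpha):
    """Distances of all known examples x_i to the prototype: [D alpha]_i - 1/2 alpha^T D alpha."""
    return D @ alpha - proto_self_term(D, alpha)

def dist_novel(d_new, D, alpha):
    """Distance of a novel item, given its dissimilarities d_new to the known examples."""
    return d_new @ alpha - proto_self_term(D, alpha)

def classify(d_new, D, alphas, proto_labels):
    """Winner-takes-all: return the label of the closest prototype."""
    dists = [dist_novel(d_new, D, a) for a in alphas]
    return proto_labels[int(np.argmin(dists))]

# Toy check with points 0, 1, 3 on a line, where the embedding is truly Euclidean.
pts = np.array([0.0, 1.0, 3.0])
D = (pts[:, None] - pts[None, :]) ** 2          # squared pairwise distances
alphas = [np.array([0.5, 0.5, 0.0]),            # prototype located at 0.5
          np.array([0.0, 0.0, 1.0])]            # prototype located at 3
proto_labels = ["A", "B"]
d_new = (2.0 - pts) ** 2                        # novel item at position 2
label = classify(d_new, D, alphas, proto_labels)
```

In the Euclidean toy case the values agree with the ordinary squared distances, e.g. 2.25 from the novel item to the first prototype.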
CAIP contributions
Gert-Jan de Vries, Steffen Pauws and Michael Biehl.
Facial Expression Recognition using Learning Vector Quantization
Thomas Villmann, Marika Kaden, David Nebel and Michael Biehl.
Learning Vector Quantization with Adaptive Cost-based
Outlier-Rejection
a review article
For a recent review and further references see:
M. Biehl, B. Hammer, T. Villmann Distance measures for prototype
based classification
In: BrainComp, Proc. of the International Workshop on Brain-Inspired Computing.
Cetraro/Italy, July 2013
L. Grandinetti, T. Lippert, N. Petkov (editors)
Springer Lecture Notes in Computer Science Vol 8603
pp. 100-116 (2014)
check
www.cs.rug.nl/~biehl
for more references and application examples

 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
 
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
 
ESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptxESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptx
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
 

2015: Distance based classifiers: Basic concepts, recent developments and application examples

  • 1. Michael Biehl, Mathematics and Computing Science, University of Groningen / NL. Tutorial as satellite event of CAIP 2015, Saint Martin’s Institute of Higher Education, Malta, August 31, 2015. Distance based classifiers: Basic concepts, recent developments, and application examples. www.cs.rug.nl/~biehl
  • 2. St. Martin’s Institute, August 2015. Overview. 1) Distance based classifiers, Learning Vector Quantization: classification problems; distance based classifiers, from KNN to prototypes; the basic scheme: LVQ1; cost function based training: GLVQ; Application: classification of adrenal tumors (I); Receiver Operating Characteristics; performance evaluation by (cross-) validation. 2) GLVQ implementation: stochastic gradient descent, learning rate schedule; batch gradient descent, step size control; Demo: GLVQ with the no-nonsense GMLVQ toolbox.
  • 3. Overview. 3) Alternative distance measures and Relevance Learning. Fixed distance measures: Minkowski measures, Kernelized distances, Divergences; Application example: detection of Cassava Mosaic Disease. Adaptive distance measures: Matrix Relevance Learning Vector Quantization; Application example: Adrenal Tumors (cont’d); Demos: GMLVQ with the no-nonsense GMLVQ toolbox; Application example: Early diagnosis of Rheumatoid Arthritis; Uniqueness, regularization and singularity control. Challenges in bio-medical data analysis. Concluding remarks, references.
  • 4. 1) Distance based classifiers, Learning Vector Quantization
  • 5. classification problems: character/digit/speech recognition; medical diagnoses; pixel-wise segmentation in image processing; object recognition/scene analysis; fault detection in technical systems; ... Machine learning approach: extract information from example data, parameterized in a learning system (neural network, LVQ, SVM, ...); working phase: application to novel data. Here only: supervised learning, classification.
  • 6. distance based classification: assignment of data (objects, observations, ...) to one or several classes (crisp/soft) (categories, labels), based on comparison with reference data (samples, prototypes) in terms of a distance measure (dis-similarity, metric). Representation of data (a key step!): collection of qualitative/quantitative descriptors; vectors of numerical features; sequences, graphs, functional data; relational data, e.g. in terms of pairwise (dis-)similarities.
  • 7. K-NN classifier, a simple distance-based classifier: store a set of labeled examples; classify a query according to the label of the Nearest Neighbor (or the majority of K NN); local decision boundary according to (e.g.) Euclidean distances; piece-wise linear class borders parameterized by all examples. [Figure: labeled examples and a query point “?” in feature space] (+) conceptually simple, no training required, one parameter (K). (-) expensive storage and computation; sensitivity to “outliers” can result in overly complex decision boundaries.
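The K-NN rule described on this slide can be sketched in a few lines. This is a minimal illustration, not code from the tutorial; the function names and the toy data are made up for the example, and squared Euclidean distance is assumed.

```python
from collections import Counter

def squared_euclidean(x, w):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))

def knn_classify(query, examples, labels, k=1):
    """Return the majority label among the k stored examples closest to query."""
    # Sort example indices by distance to the query (all examples are kept in memory).
    order = sorted(range(len(examples)),
                   key=lambda i: squared_euclidean(query, examples[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): two classes in a 2-dim. feature space.
examples = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 1.2), (1.1, 0.8)]
labels = ["A", "A", "B", "B", "B"]
print(knn_classify((0.1, 0.2), examples, labels, k=1))
print(knn_classify((0.8, 0.9), examples, labels, k=3))
```

Every query touches all stored examples, which is the storage and computation cost the slide lists as a drawback.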
  • 8. prototype based classification, a prototype based classifier [Kohonen 1990, 1997]: represent the data by one or several prototypes per class; classify a query according to the label of the nearest prototype (or alternative schemes); local decision boundaries according to (e.g.) Euclidean distances; piece-wise linear class borders parameterized by prototypes. [Figure: prototypes and a query point “?” in feature space] (+) less sensitive to outliers, lower storage needs, little computational effort in the working phase. (-) training phase required in order to place prototypes; model selection problem: number of prototypes per class, etc.
  • 9. What about the curse of dimensionality? Concentration of norms/distances for large N: “distance based methods are bound to fail in high dimensions”? LVQ: prototypes are not just random data points but carefully selected representatives of the data; distances of a given data point to prototypes are compared; projection to a non-trivial low-dimensional subspace! See also [Ghosh et al., 2007; Witoelar et al., 2010]: models of LVQ training, analytical treatment in the limit [formula]; successful training needs [formula] training examples.
  • 10. Nearest Prototype Classifier (NPC): set of prototypes carrying class-labels; based on a dissimilarity / distance measure. Given a feature vector: determine the winner, i.e. the prototype with minimal distance; assign the vector to the class of the winner. Minimal requirements on the distance measure: [formula]. Standard example: squared Euclidean distance.
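The formulas of this slide did not survive extraction. In notation that is common for LVQ (an assumption here, not recovered from the slide), with prototypes $w^j$ carrying labels $c(w^j)$ and a feature vector $\xi \in \mathbb{R}^N$, the nearest prototype rule and the standard squared Euclidean example can be written as:

```latex
% winner: the prototype closest to \xi
w^{L} \ \text{ with } \ d(w^{L}, \xi) \le d(w^{j}, \xi) \ \text{ for all } j,
\qquad \xi \ \mapsto \ \text{class } c(w^{L})

% standard example: squared Euclidean distance
d(w, \xi) = (w - \xi)^2 = \sum_{n=1}^{N} (w_n - \xi_n)^2
```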
  • 11. Learning Vector Quantization: N-dimensional data, feature vectors; identification of prototype vectors from labeled example data; distance based classification (e.g. Euclidean). Competitive learning, LVQ1 [Kohonen, 1990]: initialize prototype vectors for different classes; present a single example; identify the winner (closest prototype); move the winner closer towards the data (same class) or away from the data (different class).
  • 12. Learning Vector Quantization: N-dimensional data, feature vectors; identification of prototype vectors from labeled example data; distance based classification (e.g. Euclidean). Tessellation of feature space [piece-wise linear]; distance-based classification [here: Euclidean distances]; generalization ability: correct classification of new data; aim: discrimination of classes (≠ vector quantization or density estimation).
  • 13. LVQ1, iterative training procedure: randomized initial prototypes, e.g. close to the class-conditional means; sequential presentation of labelled examples; LVQ1 update step with learning rate ηw: “the winner takes it all” [formula]. Many heuristic variants/modifications [Kohonen, 1990, 1997]: learning rate schedules ηw(t) [Darken & Moody, 1992]; update more than one prototype per step.
  • 14. LVQ1: LVQ1 update step [formula]; LVQ1-like update for a generalized distance [formula]. Requirement: the update decreases (increases) the distance if the classes coincide (are different).
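A single LVQ1 step as described above (present one example, find the winner, attract or repel it) might look as follows. This is a sketch under the assumption of squared Euclidean distance and a constant learning rate `eta`; all names are illustrative and not taken from the tutorial's toolbox.

```python
def squared_euclidean(x, w):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))

def lvq1_step(x, y, prototypes, proto_labels, eta=0.1):
    """Perform one LVQ1 update in place and return the index of the winner.

    The winner (closest prototype) moves towards x if its label equals y,
    and away from x otherwise; all other prototypes stay untouched.
    """
    winner = min(range(len(prototypes)),
                 key=lambda j: squared_euclidean(x, prototypes[j]))
    sign = 1.0 if proto_labels[winner] == y else -1.0
    prototypes[winner] = [w + sign * eta * (xi - w)
                          for w, xi in zip(prototypes[winner], x)]
    return winner

# Toy usage (hypothetical data): one prototype per class.
prototypes = [[0.0, 0.0], [1.0, 1.0]]
proto_labels = ["A", "B"]
lvq1_step((0.2, 0.0), "A", prototypes, proto_labels)  # same class: attraction
lvq1_step((0.2, 0.0), "B", prototypes, proto_labels)  # different class: repulsion
print(prototypes)
```

For the Euclidean case the attraction shrinks the distance to the example and the repulsion enlarges it, which is the requirement stated on the slide.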
  • 15. cost function based LVQ, one example: Generalized LVQ [Sato & Yamada, 1995]. Minimize a cost function E [formula] defined in terms of two winning prototypes. Sigmoidal function (linear for small arguments), e.g. [formula]: E approximates the number of misclassifications. Linear function, e.g. [formula]: E favors large margin separation of classes. Small [symbol], large [symbol]: E favors class-typical prototypes.
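The cost function itself was lost in extraction. In the form in which GLVQ is usually stated in the literature (assumed here, with notation that may differ from the slide), $w^J$ denotes the closest prototype with the correct label and $w^K$ the closest prototype with a wrong label for example $\xi^\mu$:

```latex
E = \sum_{\mu=1}^{P} \Phi\!\left( e^{\mu} \right), \qquad
e^{\mu} = \frac{d(w^{J}, \xi^{\mu}) - d(w^{K}, \xi^{\mu})}
               {d(w^{J}, \xi^{\mu}) + d(w^{K}, \xi^{\mu})} \ \in [-1, 1]

% typical choices of the monotonic function \Phi
\Phi(e) = e \quad \text{(linear)}, \qquad
\Phi(e) = \frac{1}{1 + \exp(-\gamma\, e)} \quad \text{(sigmoidal, steepness } \gamma \text{)}
```

An example is classified correctly exactly when $e^{\mu} < 0$, which is why a steep sigmoidal $\Phi$ makes E approximate the number of misclassifications.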
  • 16. cost function based LVQ: “There is nothing objective about objective functions” (James L. McClelland).
  • 17. GLVQ training = optimization with respect to prototype position, e.g. single example presentation, stochastic gradient descent, update of two prototypes per step, based on a non-negative, differentiable distance. Requirement: [formula].
  • 18. GLVQ training = optimization with respect to prototype position, e.g. single example presentation, stochastic sequence of examples, update of two prototypes per step, based on a non-negative, differentiable distance.
  • 19. GLVQ training = optimization with respect to prototype position, e.g. single example presentation, stochastic sequence of examples, update of two prototypes per step, based on Euclidean distance: moves prototypes towards / away from the sample with prefactors [formula].
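One stochastic gradient step of GLVQ with squared Euclidean distance can be sketched as below. The sketch assumes the commonly used relative distance difference e = (dJ - dK) / (dJ + dK) as the per-example cost with a linear (identity) cost function; the prefactors follow from the chain rule under that assumption and are not copied from the slide. All names are illustrative.

```python
def squared_euclidean(x, w):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))

def glvq_step(x, y, prototypes, proto_labels, eta=0.05):
    """One GLVQ gradient step in place; returns e = (dJ - dK)/(dJ + dK) before the update.

    J: closest prototype with the correct label, K: closest prototype with a wrong label.
    Assumes at least one prototype of each kind and dJ + dK > 0.
    """
    dists = [squared_euclidean(x, w) for w in prototypes]
    J = min((j for j, c in enumerate(proto_labels) if c == y), key=lambda j: dists[j])
    K = min((j for j, c in enumerate(proto_labels) if c != y), key=lambda j: dists[j])
    dJ, dK = dists[J], dists[K]
    denom = (dJ + dK) ** 2
    # de/ddJ = 2 dK / denom, de/ddK = -2 dJ / denom, dd/dw = -2 (x - w)
    pref_J = 4.0 * dK / denom   # attraction strength for the correct winner
    pref_K = 4.0 * dJ / denom   # repulsion strength for the wrong winner
    prototypes[J] = [w + eta * pref_J * (xi - w) for w, xi in zip(prototypes[J], x)]
    prototypes[K] = [w - eta * pref_K * (xi - w) for w, xi in zip(prototypes[K], x)]
    return (dJ - dK) / (dJ + dK)

# Toy usage (hypothetical data).
prototypes = [[0.0, 0.0], [1.0, 0.0]]
proto_labels = ["A", "B"]
print(glvq_step((0.25, 0.0), "A", prototypes, proto_labels))
print(prototypes)
```

Both winners are updated in every step, in contrast to LVQ1, where only the overall closest prototype moves.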
  • 20. related schemes: many variants of LVQ. Intuitive schemes: LVQ2, LVQ2.1, LVQ3, OLVQ, ... Cost function based: RSLVQ (likelihood ratios), ... Supervised Neural Gas (NG): many prototypes, rank based update. Supervised Self-Organizing Maps (SOM): neighborhood relations, topology preserving mapping. Radial Basis Function Networks (RBF): hidden units = centroids (prototypes) with Gaussian activation.
  • 21. An example problem: classification of adrenal tumors. Wiebke Arlt, Angela Taylor, Dave J. Smith, Peter Nightingale, P.M. Stewart, C.H.L. Shackleton et al.; Petra Schneider, Han Stiekema, Michael Biehl. Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen; School of Medicine, Queen Elizabeth Hospital, University of Birmingham/UK (+ several centers in Europe). Tumor classification [Arlt et al., J. Clin. Endocrinology & Metabolism, 2011].
  • 22. tumor classification: adrenal tumors are common (1-2%) and mostly found incidentally; adrenocortical carcinomas (ACC) account for 2-11% of adrenal incidentalomas (ACA: adrenocortical adenomas); conventional diagnostic tools (CT, MRI) lack sensitivity and are labor and cost intensive; idea: tumor classification based on steroid excretion profile. [Figure: adrenal gland, www.ensat.org]
  • 23. tumor classification: urinary steroid excretion (24 hours); 32 potential biomarkers; biochemistry imposes correlations, grouping of steroids.
  • 24. tumor classification, data set: 102 patients with benign ACA, 45 patients with malignant ACC. [Figure: color coded excretion values (logarithmic scale, relative to healthy controls); axes: ACA patient #, ACC patient # vs. steroid marker #]
  • 25. tumor classification: Generalized LVQ, training and performance evaluation. Data divided into 90% training and 10% test set; determine prototypes (typical profiles, 1 per class) by (stochastic) gradient descent; employ Euclidean distance measure in the 32-dim. feature space; apply classifier to test data, evaluate performance (error rates); repeat and average over many random splits.
  • 26. tumor classification, prototypes: steroid excretion in ACA/ACC. [Figure: ACA and ACC prototype profiles]
  • 27. tumor classification (NPC): Receiver Operating Characteristics (ROC) [Fawcett, 2000], obtained by introducing a biased NPC with threshold θ [formula]. [Figure: ROC curve, true positive rate (sensitivity) vs. false positive rate (1-specificity), with θ = 0 marked and the Area under Curve indicated] One extreme: all tumors classified as ACA: no false alarms, no true positives detected. Other extreme: all tumors classified as ACC: all true positives detected, max. number of false alarms.
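The threshold sweep behind such a ROC curve can be sketched as follows. The exact form of the biased NPC on the slide was lost; the sketch assumes that a sample is called positive (e.g. ACC) when the margin m = d(positive prototype) - d(negative prototype) falls below θ, so that θ = 0 recovers the unbiased NPC. Function names and data are hypothetical.

```python
def roc_curve(margins, is_positive):
    """ROC points (false positive rate, true positive rate) for all thresholds theta.

    margins[i] = d_pos - d_neg for sample i; the sample is called positive if
    margins[i] < theta. Assumes at least one positive and one negative sample.
    """
    n_pos = sum(1 for p in is_positive if p)
    n_neg = len(is_positive) - n_pos
    points = [(0.0, 0.0)]          # theta -> -infinity: nothing is called positive
    tp = fp = 0
    pairs = sorted(zip(margins, is_positive))
    i = 0
    while i < len(pairs):
        m = pairs[i][0]
        # Samples with identical margins switch to "positive" together.
        while i < len(pairs) and pairs[i][0] == m:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points                  # last point (1, 1): everything is called positive

def area_under_curve(points):
    """Trapezoidal area under a list of (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Toy usage (made-up margins, not the tumor data).
margins = [-0.9, -0.4, -0.1, 0.2, 0.3, 0.8]
is_positive = [True, True, False, True, False, False]
curve = roc_curve(margins, is_positive)
print(curve)
print(area_under_curve(curve))
```

The two end points of the curve correspond to the two extremes named on the slide: no sample or every sample assigned to the positive class.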
  • 28. tumor classification, GLVQ performance: ROC characteristics (averaged over splits of the data set), AUC = 0.87.
  • 30. brief excursion: gradient descent. Stochastic gradient descent: convergence requires a learning rate that decreases with ‘time’ (number of steps t), e.g. as [formula]; condition [Robbins and Monro, 1951]: [formula]. Alternatives: more general optimization schemes (conjugate gradient, line search, second order derivatives, ...); adaptive learning rates; ...
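The schedule and the condition were lost in extraction. A common textbook form, given here as an assumption rather than as the slide's exact content, is a learning rate that decays like 1/t together with the classical stochastic approximation conditions:

```latex
\eta(t) = \frac{a}{b + t} \quad (a, b > 0), \qquad
\sum_{t=1}^{\infty} \eta(t) = \infty, \qquad
\sum_{t=1}^{\infty} \eta(t)^2 < \infty
```

The first sum guarantees that the steps can reach any point in parameter space; the second makes the accumulated noise of the stochastic updates vanish.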
  • 31. batch gradient descent: batch gradient-based descent w.r.t. GLVQ costs; concatenated prototype vector [formula]; update in the direction of the negative (full) gradient [formula], with step size αw for the normalized gradient.
  • 32. batch gradient descent, step size: too small: slow convergence; too large: over-shooting, zig-zagging, oscillatory behavior, divergence. Waypoint averaging [Papari, Biehl, Bunte, 2011] (here: modified): by default, perform the default step and increase αw by a factor, e.g. 1.1; if E(mean over the k last waypoints) < E(next), replace next by the mean and reduce αw by a factor, e.g. 2/3.
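The step size control outlined above can be sketched for a generic cost function. This is a simplified reading of the slide, not the toolbox implementation: the mean is taken over the last k accepted waypoints, and the function name, the defaults and the toy cost are assumptions made for the example.

```python
import math

def descend_with_waypoint_averaging(cost, grad, w0, steps=50, alpha=0.5,
                                    k=3, grow=1.1, shrink=2.0 / 3.0):
    """Batch descent along the normalized gradient with waypoint averaging.

    Default: accept the gradient step and increase alpha by `grow`.
    If the mean of the last k waypoints has lower cost than the proposed step,
    jump to that mean instead and reduce alpha by `shrink`.
    """
    w = list(w0)
    history = [list(w)]
    for _ in range(steps):
        g = grad(w)
        norm = math.sqrt(sum(gi * gi for gi in g))
        if norm == 0.0:            # stationary point reached
            break
        candidate = [wi - alpha * gi / norm for wi, gi in zip(w, g)]
        recent = history[-k:]
        mean = [sum(col) / len(recent) for col in zip(*recent)]
        if len(history) >= k and cost(mean) < cost(candidate):
            w, alpha = mean, alpha * shrink     # oscillation detected: average and slow down
        else:
            w, alpha = candidate, alpha * grow  # default step: speed up
        history.append(list(w))
    return w, alpha

# Toy usage: elongated quadratic valley (hypothetical cost, not GLVQ).
cost = lambda w: w[0] ** 2 + 10.0 * w[1] ** 2
grad = lambda w: [2.0 * w[0], 20.0 * w[1]]
w_final, alpha_final = descend_with_waypoint_averaging(cost, grad, [2.0, 1.0])
print(w_final, alpha_final, cost(w_final))
```

Because the gradient is normalized, alpha directly sets the length of each step, so zig-zagging across a narrow valley shows up as a mean waypoint that beats the next step.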
  • 33. A no-nonsense beginners’ tool for G(M)LVQ: http://www.cs.rug.nl/~biehl/No-Nonsense-GMLVQ.zip “No nonsense” GMLVQ code collection: collection of Matlab code (no toolboxes required), includes example data sets and limited documentation; mainly for demo-purposes (do not use for critical applications); efficiency, programming style, etc. were not in the focus. Provides: single runs, visualization of the data set; leave-one-out, subset validation procedure. Variants/options: GLVQ, [GRLVQ], GMLVQ; null-space projection; singularity-control.
  • 34. example demo: >> load twoclass-difficult.mat (98 examples, 34-dim. feature vectors, binary labels) >> [gmlvq_system,curves_single,param_set]=run_single(fvec,lbl,100) [Figure panels: learning curves and step sizes; prototypes]
  • 35. example demo: >> load twoclass-difficult.mat >> [gmlvq_system,curves_single,param_set]=run_single(fvec,lbl,100) [Figure panels: training set ROC; visualization (features 33, 34)]
  • 36. example demo, GLVQ without relevances: >> [gmlvq_mean,roc_val,lcurves_mean,lcurves_std,param_set]=… run_validation(fvec,lbl,50); learning curves, averages over 5 validation runs with 10 % of examples left out for testing. [Figure panels: avg. learning curves; avg. validation set ROC; avg. prototypes]
  • 37. More sophisticated Matlab code [K. Bunte] (more options, training by non-linear optimization etc.): Relevance and Matrix adaptation in Learning Vector Quantization (GLVQ, GRLVQ, GMLVQ and LiRaM LVQ): http://matlabserver.cs.rug.nl/gmlvqweb/web/ Pre- and re-prints, more links etc.: http://www.cs.rug.nl/~biehl/
  • 38. 3) Alternative distance measures and relevance learning
  • 39. Alternative distance measures: fixed, pre-defined distance measures. GLVQ (or more general cost function based LVQ) can be based on general, differentiable distances, e.g. Minkowski measures [formula]. Possible work-flow: select several distance measures according to prior knowledge or a data-driven choice in a preprocessing step; compare performance of various measures. Examples: Kernelized distances, Divergences (statistics).
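The Minkowski formula is missing from the extracted text. Its standard definition for a prototype $w$ and a feature vector $\xi$ (notation assumed) reads:

```latex
d_p(w, \xi) = \Big( \sum_{n=1}^{N} \left| \xi_n - w_n \right|^{p} \Big)^{1/p},
\qquad p = 2: \ \text{Euclidean}, \qquad p = 1: \ \text{Manhattan}
```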
  • 40. Kernelized distances: rewrite squared Euclidean distance in terms of dot-products [formula]; distance measure associated with a general inner product or kernel function [formula], e.g. Gaussian kernel [formula]; implicit mapping to a high-dimensional space for better separability of classes, similar to the Support Vector Machine.
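The rewriting step can be made concrete: expanding the squared Euclidean distance gives x·x - 2 x·w + w·w, and replacing each dot-product by a kernel yields a distance in the implicit feature space. The sketch below uses the usual Gaussian kernel with a width parameter `sigma`; the parameter name and the helper functions are assumptions for illustration.

```python
import math

def dot(x, w):
    """Ordinary dot-product; using it as kernel recovers the squared Euclidean distance."""
    return sum(xi * wi for xi, wi in zip(x, w))

def gaussian_kernel(x, w, sigma=1.0):
    """Gaussian kernel exp(-|x - w|^2 / (2 sigma^2))."""
    sq = sum((xi - wi) ** 2 for xi, wi in zip(x, w))
    return math.exp(-sq / (2.0 * sigma ** 2))

def kernel_distance(x, w, kernel=gaussian_kernel):
    """Squared distance between the images of x and w in the kernel's feature space."""
    return kernel(x, x) - 2.0 * kernel(x, w) + kernel(w, w)

print(kernel_distance((1.0, 2.0), (3.0, 1.0), kernel=dot))   # plain squared Euclidean
print(kernel_distance((0.0, 0.0), (1.0, 0.0)))               # Gaussian kernel, sigma = 1
```

For the Gaussian kernel the distance simplifies to 2 - 2 k(x, w), so it saturates at 2 for points that are far apart relative to sigma.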
  • 41. Divergence Based LVQ: Detection of Cassava Mosaic Disease. Ernest Mwebaze, John Quinn, Jennifer Aduwo, Petra Schneider, Michael Biehl, Thomas Villmann, Sven Haase, Frank-Michael Schleif. Johann Bernoulli Institute, University of Groningen; Department of Computer Science, Makerere University, Kampala; Namulonge Crop Research Center, Uganda; University of Applied Sciences, Mittweida; University Bielefeld, Germany. Divergence based LVQ [Neurocomputing, 2011].
  • 42. divergence based LVQ. Example: detection of Mosaic disease in Cassava (manioc) plants, Makerere University and Namulonge Crop Research Center, Uganda. [Figure: leaf images, healthy vs. Mosaic] LVQ classifiers based on histogram specific distance measures: divergences (statistics) for non-negative, possibly normalized data (densities, spectral data, more general functional data).
  • 43. divergence based LVQ: squared Euclidean distance [formula]; Cauchy-Schwarz divergence [formula]. [Figure: panels (a), (b), (c)]
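The Cauchy-Schwarz divergence for non-negative vectors such as histograms has the standard form D(x, w) = ½ log[(Σ x²)(Σ w²)] - log(Σ x·w). The sketch below implements that textbook definition, which is assumed to match the slide's lost formula; the function name and the example histograms are made up.

```python
import math

def cauchy_schwarz_divergence(x, w):
    """Cauchy-Schwarz divergence between two non-negative vectors.

    Equals -log of the cosine between x and w: zero if they are proportional,
    positive otherwise. Requires a positive overlap sum(x_i * w_i).
    """
    xx = sum(v * v for v in x)
    ww = sum(v * v for v in w)
    xw = sum(a * b for a, b in zip(x, w))
    return 0.5 * math.log(xx * ww) - math.log(xw)

# Toy histograms (hypothetical), each normalized to sum 1.
h1 = [0.2, 0.3, 0.5]
h2 = [0.5, 0.3, 0.2]
print(cauchy_schwarz_divergence(h1, h2))
print(cauchy_schwarz_divergence(h1, [2.0 * v for v in h1]))
```

Unlike the Euclidean distance, the measure ignores overall scale, which can be convenient for histograms that are not normalized consistently.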
  • 44. divergence based LVQ, example family: γ-divergences [formula]; non-symmetric (in general); violates triangle inequality; includes: Kullback-Leibler, Cauchy-Schwarz, Euclidean.
  • 45. http://www.air.ug/mcrops/
  • 46. Adaptive distance measures
  • 47. Relevance Learning: employ a parameterized distance measure with only the mathematical form fixed in advance; update its parameters in the training process together with prototype training; adaptive, data driven dissimilarity. Example: Matrix Relevance LVQ: data-driven optimization of prototypes and relevance matrix in the same training process (≠ pre-processing).
  • 48. Quadratic distance measure: generalized quadratic distance [formula]; scaling of features, general linear transformation of feature space; potential normalization: [formula]. Variants: one global, several local, class-wise relevance matrices Λ(j) → piecewise quadratic decision boundaries; rectangular Ω: discriminative low-dim. representation, e.g. for visualization [Bunte et al., 2012]; diagonal matrices: single feature weights [Bojer et al., 2001] [Hammer et al., 2002].
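In the matrix relevance learning literature the generalized quadratic distance is usually written as d(w, x) = (x - w)ᵀ Λ (x - w) with Λ = ΩᵀΩ, normalized such that the trace of Λ equals 1. The sketch below assumes that form (the slide's formulas were lost) and represents Ω as a list of rows; the function names are illustrative.

```python
import math

def omega_distance(x, w, omega):
    """d(w, x) = |Omega (x - w)|^2 = (x - w)^T Lambda (x - w) with Lambda = Omega^T Omega.

    omega is a list of rows; it may be rectangular (fewer rows than features).
    """
    diff = [xi - wi for xi, wi in zip(x, w)]
    projected = [sum(o * d for o, d in zip(row, diff)) for row in omega]
    return sum(p * p for p in projected)

def relevance_matrix(omega):
    """Lambda = Omega^T Omega as a nested list."""
    n = len(omega[0])
    return [[sum(row[i] * row[j] for row in omega) for j in range(n)]
            for i in range(n)]

def normalize_omega(omega):
    """Rescale Omega so that trace(Lambda) = sum of all squared elements = 1."""
    scale = math.sqrt(sum(v * v for row in omega for v in row))
    return [[v / scale for v in row] for row in omega]

omega = normalize_omega([[2.0, 0.0], [1.0, 1.0]])
lam = relevance_matrix(omega)
print(lam)
print(sum(lam[i][i] for i in range(len(lam))))          # trace of Lambda
print(omega_distance((1.0, 2.0), (0.0, 0.0), omega))
```

Writing the distance through Ω keeps Λ positive semi-definite by construction, so the measure can never become negative during training.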
  • 49. But this is just Mahalanobis distance… [Mahalanobis, 1936]: [formula], S: covariance matrix of random vectors (calculated once from the data, fixed definition, not adaptive); if you insist… (‘two point version’) [formula]. So it is a generalized Mahalanobis distance? No. [Figure captions: “a generalized broccoli”, “a generalization of Ohm’s Law”]
  • 50. Relevance Matrix LVQ: optimization of prototypes and distance measure; WTA Matrix-LVQ1 [formula].
  • 51. Relevance Matrix LVQ: optimization of prototypes and distance measure; Generalized Matrix LVQ (GMLVQ) [formula].
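The update formulas are missing from the extracted text. Assuming the usual parameterization of the adaptive distance, the derivatives that enter any gradient based update of prototypes and matrix are:

```latex
d_\Lambda(w, \xi) = (\xi - w)^\top \Lambda\, (\xi - w), \qquad \Lambda = \Omega^\top \Omega

\frac{\partial d_\Lambda(w, \xi)}{\partial w} = -2\, \Lambda\, (\xi - w), \qquad
\frac{\partial d_\Lambda(w, \xi)}{\partial \Omega} = 2\, \Omega\, (\xi - w)(\xi - w)^\top
```

In GMLVQ these derivatives are multiplied by the derivative of the cost with respect to the two winning distances, in the same way as in GLVQ with a fixed distance.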
  • 52. heuristic interpretation: [formula] is the standard Euclidean distance for linearly transformed features; the diagonal of Λ [formula] summarizes the contribution of the original dimension and the relevance of original features for the classification. The interpretation implicitly assumes that features have equal order of magnitude, e.g. after z-score-transformation → [formula] (averages over data set).
  • 53. Relevance Matrix LVQ: optimization of prototype positions and distance measure(s) in one training process (≠ pre-processing). Motivation: improved performance: weighting of features and pairs of features; simplified classification schemes: elimination of non-informative, noisy features, discriminative low-dimensional representation; insight into the data / classification problem: identification of most discriminative features, incorporation of prior knowledge (e.g. structure of Ω).
  • 54. related schemes. Relevance LVQ variants: local, rectangular, structured, restricted, ... relevance matrices for visualization, functional data, texture recognition, etc.; relevance learning in Robust Soft LVQ, Supervised NG, etc.; combination of distances for mixed data; ... Relevance Learning related schemes in supervised learning: RBF Networks [Backhaus et al., 2012]; Neighborhood Component Analysis [Goldberger et al., 2005]; Large Margin Nearest Neighbor [Weinberger et al., 2006, 2010]; Linear Discriminant Analysis (LDA): one prototype per class + global matrix, different objective function! ... and many more!
  • 55. Classification of adrenal tumors (cont’d). Wiebke Arlt, Angela Taylor, Dave J. Smith, Peter Nightingale, P.M. Stewart, C.H.L. Shackleton et al.; Petra Schneider, Han Stiekema, Michael Biehl. Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen; School of Medicine, Queen Elizabeth Hospital, University of Birmingham/UK (+ several centers in Europe). [Arlt et al., J. Clin. Endocrinology & Metabolism, 2011] [Biehl et al., Europ. Symp. Artificial Neural Networks (ESANN), 2012]
  • 56. adrenocortical tumors, difficult differential diagnosis: ACC: adrenocortical carcinomas; ACA: adrenocortical adenomas. Idea: steroid metabolomics, tumor classification based on urinary steroid excretion; 32 candidate steroid markers [figure].
  • 57. adrenocortical tumors: Generalized Matrix LVQ, ACC vs. ACA classification. Data set: 24 hrs. urinary steroid excretion, 102 patients with benign ACA, 45 patients with malignant ACC. Data divided into 90% training, 10% test set; determine prototypes (typical profiles, 1 per class); adaptive generalized quadratic distance measure parameterized by [symbol]; apply classifier to test data, evaluate performance (error rates, ROC); repeat and average over many random splits.
  • 58. tumor classification (cont’d) [Arlt et al., 2011] [Biehl et al., 2012]: Generalized Matrix LVQ, ACC vs. ACA classification. Data divided into 90% training, 10% test set (z-score transformed); determine prototypes (typical profiles, 1 per class); adaptive generalized quadratic distance measure parameterized by [symbol]; apply classifier to test data, evaluate performance (error rates, ROC); repeat and average over many random splits.
  • 59. tumor classification, relevance matrix. [Figure: diagonal and off-diagonal elements; fraction of runs (random splits) in which a steroid is rated among the 9 most relevant markers] Subset of 9 selected steroids ↔ technical realization (patented, University of Birmingham/UK).
  • 60. tumor classification. [Figure: diagonal and off-diagonal elements; ACA and ACC values of steroid 19] Discriminative, e.g. steroid 19.
  • 61. tumor classification. [Figure: diagonal and off-diagonal elements; ACA and ACC values of steroid 8] Non-trivial role: steroid 8 is among the most relevant!
  • 62. tumor classification: highly discriminative combination of markers! [Figure: weakly discriminative markers 8 and 12]
  • 63. tumor classification, ROC characteristics: clear improvement due to adaptive distances. [Figure: ROC curves, sensitivity vs. 1-specificity] AUC: Euclidean 0.87; diagonal rel. (GRLVQ) 0.93; full matrix (GMLVQ) 0.97.
  • 64. tumor classification, observation / theory: low rank of resulting relevance matrix, often a single relevant eigendirection. [Figure: eigenvalues in ACA/ACC classification] Intrinsic regularization: nominally ~ NxN adaptive parameters in Matrix LVQ reduce to ~ N effective degrees of freedom; low-dimensional representation facilitates, e.g., visualization of labeled data sets. Theory: stationarity of Matrix RLVQ: Biehl et al., Stationarity of Matrix Relevance LVQ, Proc. IJCNN 2015.
  • 65. tumor classification: visualization of the data set. [Figure: ACA and ACC samples]
  • 66. modified batch gradient descent: batch gradient-based descent w.r.t. costs [formula]; concatenated prototype vector [formula], elements of Ω [formula]; updates in the direction of the normalized gradients; waypoint averaging and step size control separately for prototypes and Ω.
  • 67. example demo: >> load twoclass-difficult.mat (98 34-dim. feature vectors, binary classification) >> [gmlvq_system,curves_single,param_set]=run_single(fvec,lbl,100) [Figure panels: prototypes and relevance matrix; learning curves and step sizes]
  • 68. example demo: >> load twoclass-difficult.mat >> [gmlvq_system,curves_single,param_set]=run_single(fvec,lbl,100) [Figure panels: training set ROC; visualization of the data set]
  • 69. example demo, GMLVQ: >> [gmlvq_mean,roc_val,lcurves_mean,lcurves_std,param_set]=… run_validation(fvec,lbl,50); learning curves, averages over 5 validation runs with 10 % of examples left out for testing. [Figure panels: avg. validation set ROC; avg. prototypes and relevance matrix]
  • 70. a multi-class problem: >> load uci-segmenation-sampled >> [gmlvq_system, curves_single,param_set]=run_single(fvec,lbl,50) [Figure panels: visualization of 18-dim. data set; avg. prototypes and rel. matrix]
  • 71. Singularity control. Note: singularity of the relevance matrix can lead to numerical instabilities and over-simplification effects. Singularity control: penalty term [formula], derivative [formula] → modified matrix update (implemented in the no-nonsense gmlvq code collection).
  • 72. Uniqueness (I): uniqueness of Ω, given Λ. The matrix square root is not unique: irrelevant rotations, reflections, symmetries, ... Canonical representation in terms of the eigen-decomposition of Λ [formula]: pos. semi-definite symmetric solution [formula] (Matlab: “sqrtm”).
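The canonical square root can be spelled out as follows; the notation is assumed, since the slide's formulas were lost, but the construction itself is the standard symmetric square root of a positive semi-definite matrix:

```latex
\Lambda = \sum_{i=1}^{N} \lambda_i\, v_i v_i^\top \quad (\lambda_i \ge 0), \qquad
\hat{\Omega} = \Lambda^{1/2} = \sum_{i=1}^{N} \sqrt{\lambda_i}\; v_i v_i^\top, \qquad
\hat{\Omega}^\top \hat{\Omega} = \Lambda

% non-uniqueness: for any orthogonal matrix R
(R\,\Omega)^\top (R\,\Omega) = \Omega^\top R^\top R\, \Omega = \Omega^\top \Omega = \Lambda
```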
  • 73. Uniqueness (II): uniqueness of the relevance matrix for a given data set? Simple example: consider two identical, entirely irrelevant features, e.g. [formula]; contributions cancel exactly if [formula] → disregarded in the classification of the training data, but a naïve interpretation of the diagonal suggests high relevance, and this could cause a non-trivial effect for novel data.
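A small numerical illustration of this effect, under the assumption of a distance of the form |Ω(x - w)|² with a single-row Ω: opposite weights on two identical features cancel on the training data, yet the diagonal of Λ = ΩᵀΩ reports both features as relevant. All numbers and names are made up for the example.

```python
def omega_distance(x, w, omega):
    """d(w, x) = |Omega (x - w)|^2 for Omega given as a list of rows."""
    diff = [xi - wi for xi, wi in zip(x, w)]
    projected = [sum(o * d for o, d in zip(row, diff)) for row in omega]
    return sum(p * p for p in projected)

def relevance_diagonal(omega):
    """Diagonal of Lambda = Omega^T Omega: Lambda_ii = sum over rows of Omega_ji^2."""
    return [sum(row[i] ** 2 for row in omega) for i in range(len(omega[0]))]

omega_full = [[0.7, -0.7, 0.1]]     # opposite weights on features 0 and 1
omega_reduced = [[0.0, 0.0, 0.1]]   # same mapping once features 0 and 1 are dropped

w = (1.0, 1.0, 0.0)
x_train = (3.0, 3.0, 1.0)           # features 0 and 1 identical, as in the training data
x_novel = (3.0, 1.0, 1.0)           # novel sample where they differ

print(omega_distance(x_train, w, omega_full), omega_distance(x_train, w, omega_reduced))
print(relevance_diagonal(omega_full), relevance_diagonal(omega_reduced))
print(omega_distance(x_novel, w, omega_full), omega_distance(x_novel, w, omega_reduced))
```

Both matrices give the same distance on the training sample, while the first one suggests two highly relevant features and reacts strongly once the two features stop being identical.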
  • 74. Uniqueness (II): given transformation [formula]; [formula] is possible if the rows of [formula] are in the null-space of [formula] → identical mapping of examples, different for [formula]; possible to extend by prototypes; [formula] is singular if features are correlated, dependent.
  • 75. regularization: training process yields [formula]; determine [formula] with eigenvectors and eigenvalues [formula]; regularization: [formula]. (K<J): retain the eigenspace corresponding to the largest eigenvalues; also removes the span of small non-zero eigenvalues. (K=J): removes all null-space contributions; unique solution with minimal Euclidean norm of row vectors; equivalent: [formula] (Moore-Penrose-Inverse X+) (implemented in the no-nonsense gmlvq code collection).
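One way to write the projection described above, with conventions that are assumptions rather than the slide's lost notation: the P examples are stacked as rows of $X \in \mathbb{R}^{P \times N}$, and $v_1, \dots, v_J$ are the eigenvectors of $C = X^\top X / P$ with non-zero eigenvalues, sorted by decreasing eigenvalue.

```latex
\tilde{\Omega} = \Omega \sum_{k=1}^{K} v_k v_k^\top \qquad (K \le J)

% K = J: the sum is the orthogonal projector onto the span of the data
\sum_{k=1}^{J} v_k v_k^\top = X^{+} X
\quad \Rightarrow \quad
\tilde{\Omega}\, \xi^{\mu} = \Omega\, \xi^{\mu} \ \text{ for all training examples } \xi^{\mu}
```

For K = J the mapping of the training data is unchanged while every row of Ω loses its null-space component, which gives the minimal-norm solution mentioned on the slide.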
  • 76. regularization. Pre-processing of data (PCA): mapped feature space; fixed K; prototypes yet unknown; example: diagnosis of rheumatoid arthritis. Regularized mapping after/during training: retains original features; flexible K; may include prototypes; example: Wine data set. Strickert, Hammer, Villmann, Biehl: Regularization and improved interpretation of linear data mappings and adaptive distance measures, IEEE SSCI 2013.
  • 77. illustrative example: infra-red spectral data [UCI ML repository]: 124 wine samples, 256 wavelengths; 30 training data, 94 test spectra; GMLVQ classification of alcohol content: high / medium / low.
  • 78. GMLVQ. [Figure: over-fitting effect; null-space correction, P=30 dimensions; best performance with 7 dimensions remaining]
  • 79. regularization: enhances generalization; smoothens relevance profile/matrix; removes ‘false relevances’; improves interpretability of Λ. [Figure: original vs. regularized: raw relevance matrix, posterior regularization]
  • 80. Early diagnosis of Rheumatoid Arthritis. L. Yeo, N. Adlard, M. Biehl, M. Juarez, M. Snow, C.D. Buckley, A. Filer, K. Raza, D. Scheel-Toellner: Expression of chemokines CXCL4 and CXCL7 by synovial macrophages defines an early stage of rheumatoid arthritis. Ann. of the Rheumatic Diseases, 2015 (available online).
  • 81. rheumatoid arthritis (RA). [Figure: uninflamed control; established RA; early inflammation: resolving or early RA] Cytokine based diagnosis of RA at the earliest possible stage? Ultimate goals: understand pathogenesis and mechanism of progression.
  • 82. synovial tissue cytokine expression. [Figure: synovium, tissue section, mRNA extraction, real-time PCR] Panel of 117 cytokines: IL1A IL17F FASL CXCL4 CCL15 TGFB1 KITLG IL1B IL18 CD70 CXCL5 CCL16 TGFB2 MST1 IL1RN IL19 CD30L CXCL6 CCL17 TGFB3 SPP1 IL2 IL20 4-1BB-L CXCL7 CCL18 EGF SFRP1 IL3 IL21 TRAIL CXCL9 CCL19 FGF2 ANXA1 IL4 IL22 RANKL CXCL10 CCL20 TGFA TNFRSF13B IL5 IL23A TWEAK CXCL11 CCL21 IGF2 IL6R IL6 IL24 APRIL CXCL12 CCL22 VEGFA NAMPT IL7 IL25 BAFF CXCL13 CCL23 VEGFB C1QTNF3 IL8 IL26 LIGHT CXCL14 CCL24 MIF VCAM1 IL9 IL27 TL1A CXCL16 CCL25 LIF LGALS1 IL10 IL28A GITRL CCL1 CCL26 OSM LGALS9 IL11 IL29 FASLG CCL2 CCL27 ADIPOQ LGALS3 IL12A IL32 IFNA1 CCL3 CCL28 LEP LGALS12 IL12B IL33 IFNA2 CCL4 XCL1 GHRL IL13 LTA IFNB1 CCL5 XCL2 RETN IL14 TNF IFNG CCL7 CX3CL1 CTLA4 IL15 LTB CXCL1 CCL8 CSF1 EPO IL16 OX40L CXCL2 CCL11 CSF2 TPO IL17A CD40L CXCL3 CCL13 CSF3 FLT3LG. Cytokines: cell signaling proteins; regulate immune response; produced by, e.g. T-cells, macrophages, lymphocytes, fibroblasts, etc.
  • 83. St. Martin’s Institute, August 2015 GMLVQ analysis pre-processing: • log-transformed expression values (117 dim. data, 47 samples in total) • 21 leading principal components explain ca. 90% of the total variation Two two-class problems: (A) established RA vs. uninflamed controls (B) early RA vs. resolving inflammation • 1 prototype per class, global relevance matrix, distance measure: • leave-one-out validation evaluation in terms of Receiver Operating Characteristics
  • 84. St. Martin’s Institute, August 2015 false positive rate truepositiveratetruepositiverate diagonal Λii vs. cytokine index i established RA vs. uninflamed control early RA vs. resolving inflammation Matrix Relevance LVQ diagonal relevancesleave-one-out intialization of LVQ system
  • 85. St. Martin’s Institute, August 2015 CXCL4 chemokine (C-X-C motif) ligand 4 CXCL7 chemokine (C-X-C motif) ligand 7 direct study on protein level, staining / imaging of sinovial tissue: macrophages : predominant source of CXCL4/7 expression protein level studies • high levels of CXCL4 and CXLC7 in the first 12 weeks of synovitis in early RA • expression on macrophages outside of blood vessels discriminates early RA / resolving cases (2 PhD thesis projects)
  • 86. [Figure: leave-one-out ROC curves (true positive rate vs. false positive rate) and diagonal relevances Λii vs. cytokine index i, for established RA vs. uninflamed control and for early RA vs. resolving inflammation; labels: relevant cytokines, macrophage stimulating 1]
  • 87. four class problem: one prototype per class and one global matrix, trained in one go. Low-rank relevance matrix (rank ≈ 2). Visualization of data set in terms of eigenvectors of Λ. Niels Kluiter, research internship at JBI Groningen
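A minimal sketch of the visualization idea, assuming the usual GMLVQ parameterization Λ = ΩᵀΩ with distance d(w, x) = (x − w)ᵀ Λ (x − w), which is not spelled out in this transcript. The matrix Ω, its rank-2 shape, the trace normalization and the data below are hypothetical placeholders. Projecting onto the leading eigenvectors of Λ gives two-dimensional coordinates in which, for a rank-2 Λ, the eigenvalue-weighted squared differences reproduce the learned distance exactly.

```python
import numpy as np

def project_on_relevance_eigenvectors(X, Lambda, n_dims=2):
    """Project rows of X onto the n_dims leading eigenvectors of Lambda."""
    evals, evecs = np.linalg.eigh(Lambda)        # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_dims]     # indices of the largest ones
    return X @ evecs[:, order], evals[order]

# Hypothetical rank-2 relevance matrix for 16-dimensional feature vectors.
rng = np.random.default_rng(1)
Omega = rng.normal(size=(2, 16))
Lambda = Omega.T @ Omega
Lambda /= np.trace(Lambda)                       # assumed normalization: trace 1

X = rng.normal(size=(47, 16))                    # placeholder data
Y, top = project_on_relevance_eigenvectors(X, Lambda)

# Check: distance under Lambda equals the weighted distance in the projection.
diff = X[0] - X[1]
d_full = diff @ Lambda @ diff
d_proj = np.sum(top * (Y[0] - Y[1]) ** 2)
print(Y.shape, d_full, d_proj)
```

Prototypes can be projected with the same function, so data and prototypes appear in one scatter plot.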
  • 88. four class problem - extract binary classifiers (healthy vs. est. RA, resolving vs. early RA) by restricting the system to the corresponding prototypes - for varying number K of PCs used as feature vectors, determine corresponding ROC - performances robust in a range of 14 < K < 20. Healthy vs. est. RA, K=16: AUC = 0.92. Early vs. resolving RA, K=16: AUC = 0.79. To do: nested leave-one-out validation
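The leave-one-out ROC evaluation could be sketched as below. To keep the example short, a nearest-class-mean scorer stands in for the trained GMLVQ system; that substitution, the function names and the synthetic data are assumptions for illustration only. The AUC is computed as the fraction of (positive, negative) pairs that the score ranks correctly, with ties counted as one half.

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking of positives vs. negatives."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def loo_scores(X, y):
    """Leave-one-out scores from a nearest-class-mean stand-in classifier.

    Each sample is scored by a model fitted without it:
    score = d(x, mean of class 0) - d(x, mean of class 1).
    """
    scores = np.empty(len(y))
    idx = np.arange(len(y))
    for i in idx:
        train = idx != i
        m0 = X[train & (y == 0)].mean(axis=0)
        m1 = X[train & (y == 1)].mean(axis=0)
        scores[i] = np.sum((X[i] - m0) ** 2) - np.sum((X[i] - m1) ** 2)
    return scores

# Synthetic two-class data with K = 16 features (placeholder, not study data).
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 16)) + 1.5 * y[:, None]
auc_demo = auc(loo_scores(X, y), y)
print(auc_demo)
```

If K is chosen by comparing such leave-one-out results, the reported AUC is optimistic. A nested scheme would repeat the selection of K inside each outer leave-one-out split, which is the "to do" noted on the slide.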
  • 89. four class problem: read off problem-specific relevances from eigenvectors of Λ. [Figure panels: control vs. est. RA; resolving vs. early RA]
  • 90. Some challenges in biomedical data, further application examples
  • 91. challenges in bio-medical data. A. Filer, A. Clark, M. Juarez, J. Falconer et al. - micro-array gene expression data, high-dimensional (~50000 probes) - PCA + GMLVQ (work in progress), leave-one-out - early Arthritis vs. resolving inflammations - preliminary result: better than random classification - close inspection of high relevance genes: system discriminates male/female patients - prediction reflects higher prevalence of RA in female patients. “accuracy is not enough”
  • 92. challenges: interpretability - important: understand the basis of decisions - white-box approaches for classification/regression etc. - insights into the data and problem at hand - e.g. selection of most discriminative bio-markers. Example: relevance of steroid markers, adrenocortical tumors, adenomas (ACA), carcinomas (ACC), www.ensat.org. W. Arlt, M. Biehl et al. Urine steroid metabolomics as a biomarker tool for detecting malignancy in adrenal tumors. J. of Clin. Endocrinology & Metabolism 96: 3775-3784 (2011).
  • 93. challenges: large amounts of data, e.g. image data bases, life lines (longitudinal patient data), prescription data bases [E. Hak, K. Taxis]. Example: UMCG data base of skin lesion images. [Figure: query images A B C D with retrieval results; √ - same class, × - different class] K. Bunte, M. Biehl, M.F. Jonkman, N. Petkov. Learning Effective Color Features for Content Based Image Retrieval in Dermatology. Pattern Recognition 44 (2011) 1892-1902.
  • 94. challenges: high-dimensional data, e.g. medical images (CT, MRI, PET …), gene expression, DNA sequences, … - low-dim. representation - feature selection - visualization. Example: high-throughput flow cytometry, ~10k cells × 30 markers/sample → derive 186 features → GMLVQ, low-dim. projection. [Figure: projection on first eigenvector of Λ] M. Biehl, K. Bunte, P. Schneider. Analysis of Flow Cytometry Data by Matrix Relevance Learning Vector Quantization. PLOS One 8: e59401 (2013)
  • 95. challenges: incomplete data - missing values, noise, uncertain labels…: imputation, semi-supervised learning - complementary data sets…: learning from privileged information, transfer learning. Mixed data - combination of different sources / technical platforms: suitable adaptive & integrative (dis-)similarity measures. E. Mwebaze, G. Bearda, M. Biehl, D. Zühlke. Combining dissimilarity measures for prototype-based classification. Proc. of the 23rd European Symposium on Artificial Neural Networks ESANN 2015, d-side publishing, 31-36 (2015)
  • 96. combined distances: N-dim. vector (Euclidean), M-bin histogram (divergence), temporal sequence ((mis-)alignment), ... Combined distance measure with source-specific prototypes: relevance learning! E. Mwebaze, G. Bearda, M. Biehl, D. Zühlke. Combining dissimilarity measures for prototype-based classification. Proc. of the 23rd European Symposium on Artificial Neural Networks ESANN 2015, d-side publishing, 31-36 (2015)
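One way to picture a combined distance over heterogeneous sources is a normalized weighted sum of source-specific dissimilarities, where the weights play the role of relevances. The sketch below is an assumption-laden illustration, not the measure from the cited paper: the specific choices (squared Euclidean distance, symmetrized Kullback-Leibler divergence, dynamic time warping as alignment cost), the dictionary layout and all function names are hypothetical, and the weights are fixed here rather than learned.

```python
import numpy as np

def sq_euclidean(x, w):
    """Squared Euclidean distance between two N-dim. vectors."""
    return float(np.sum((np.asarray(x, float) - np.asarray(w, float)) ** 2))

def sym_kl(p, q, eps=1e-12):
    """Symmetrized Kullback-Leibler divergence between two histograms."""
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum((p - q) * np.log(p / q)))

def dtw(a, b):
    """Dynamic time warping cost between two scalar sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

def combined_distance(sample, prototype, weights):
    """Weighted sum of source-specific distances; weights are normalized to sum 1."""
    parts = np.array([
        sq_euclidean(sample["vec"], prototype["vec"]),
        sym_kl(sample["hist"], prototype["hist"]),
        dtw(sample["seq"], prototype["seq"]),
    ])
    lam = np.asarray(weights, float)
    lam = lam / lam.sum()
    return float(lam @ parts), parts

# Hypothetical sample and prototype, each with one component per data source.
sample = {"vec": [1.0, 2.0], "hist": [0.2, 0.3, 0.5], "seq": [1.0, 2.0, 3.0]}
prototype = {"vec": [1.0, 0.0], "hist": [0.3, 0.3, 0.4], "seq": [1.0, 2.0, 2.0, 3.0]}
d, parts = combined_distance(sample, prototype, [0.5, 0.3, 0.2])
print(d, parts)
```

In a relevance learning setting the weights would be adapted during training together with the source-specific prototype components.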
  • 97. challenges: imbalanced data sets - prevalence of diseases (screening vs. differential diagnosis) - role of false positives / false negatives. [Figure: ROC with possible working points] T. Villmann, M. Kaden, W. Herrmann, M. Biehl. Learning Vector Quantization for ROC-optimization
  • 98. challenges: causal relations vs. correlation - predictive power vs. causal dependence? www.causality.inf.ethz.ch/data/LUCAS.html E. Mwebaze, J. Quinn, M. Biehl. Causal Relevance Learning for Robust Classification under Interventions. Proc. 19th Europ. Symp. on Artificial Neural Networks ESANN 2011
  • 99. challenges: non-vectorial data. Data not given as vectors in a Euclidean space, e.g. symbolic sequences of different length (loooooooongword, shrtwrd). Known: pairwise dis-similarities, e.g. edit-distance; ‘relational data’ given as matrix. Pseudo-Euclidean embedding; prototypes expressed as linear combinations of the example data
  • 100. non-vectorial data: distances. Training: updates w.r.t. prototype coefficients, e.g. LVQ1-like or GLVQ. Working phase: WTA classification of novel data; distance from prototypes computed from distance from known example data. [Hammer & Hasenfuss, 2010] [Hammer, Schleif, Zhu, 2011]
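The distance computation for relational data can be sketched as follows, assuming the standard relational formulation in which a prototype is an implicit combination w = Σ_j α_j x_j of the examples with Σ_j α_j = 1, and D holds pairwise squared dissimilarities. Under these assumptions d(x_i, w) = [Dα]_i − ½ αᵀDα, so only D and the coefficients α are needed. The formulas were not preserved in this transcript, so the notation here is an assumption; the example uses Euclidean points purely to verify the identity against an explicit computation.

```python
import numpy as np

def relational_distances(D_rows, D, alpha):
    """Distances to an implicit prototype w = sum_j alpha_j x_j (sum alpha = 1).

    D_rows: dissimilarities from the query item(s) to the known examples,
            shape (n,) for one item or (m, n) for several.
    D:      (n, n) matrix of pairwise squared dissimilarities of the examples.
    alpha:  (n,) prototype coefficients.
    """
    alpha = np.asarray(alpha, float)
    return D_rows @ alpha - 0.5 * (alpha @ D @ alpha)

# Verification on Euclidean toy data, where the prototype can be formed explicitly.
rng = np.random.default_rng(2)
X = rng.normal(size=(6, 3))
D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)   # squared distances
alpha = rng.random(6)
alpha /= alpha.sum()                                      # coefficients sum to 1

d_rel = relational_distances(D, D, alpha)
w = alpha @ X                                             # explicit prototype
d_explicit = ((X - w) ** 2).sum(axis=1)

# Working phase: a novel item is described only by its distances to the examples.
x_new = rng.normal(size=3)
d_new_rel = relational_distances(((X - x_new) ** 2).sum(axis=1), D, alpha)
d_new_explicit = np.sum((x_new - w) ** 2)
print(np.allclose(d_rel, d_explicit), d_new_rel, d_new_explicit)
```

With one coefficient vector per prototype, winner-takes-all classification assigns the label of the prototype with the smallest such distance, and training adapts the coefficients α instead of explicit prototype vectors.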
  • 101. CAIP contributions: Gert-Jan de Vries, Steffen Pauws and Michael Biehl. Facial Expression Recognition using Learning Vector Quantization. Thomas Villmann, Marika Kaden, David Nebel and Michael Biehl. Learning Vector Quantization with Adaptive Cost-based Outlier-Rejection.
  • 102. a review article. For a recent review and further references see: M. Biehl, B. Hammer, T. Villmann. Distance measures for prototype based classification. In: BrainComp, Proc. of the International Workshop on Brain-Inspired Computing, Cetraro/Italy, July 2013. L. Grandinetti, T. Lippert, N. Petkov (editors). Springer Lecture Notes in Computer Science Vol. 8603, pp. 100-116 (2014). Check www.cs.rug.nl/~biehl for more references and application examples