Michael Biehl, Aleke Nolte
Johann Bernoulli Institute for
Mathematics and Computer Science
University of Groningen, NL
SUNDIAL H2020 Network
www.cs.rug.nl/~biehl
www.astro.rug.nl/~sundial/
preprints, reprints, available code
Prototype-based models in unsupervised
and supervised machine learning
Lingyu Wang
Kapteyn Astronomical Inst.
and SRON Groningen
Astrophysics Science Group
Groningen, NL
Astroinformatics, Cape Town, November 2017
Overview
Introduction / Motivation
prototypes, exemplars
neural activation / learning
Supervised Learning
Learning Vector Quantization (LVQ)
Adaptive distances and relevance learning
Unsupervised Learning
Vector Quantization (VQ), competitive learning
Kohonen’s Self-Organizing Map (SOM)
Illustration:
SOM-clustering of galaxy data, post-labelling
Supervised classification, LVQ+relevance learning
Introduction
prototypes, exemplars:
representation of information in terms of
typical representatives (e.g. of a class of objects),
much debated concept in cognitive psychology
machine learning: prototype- (and distance-) based systems
- easy to implement, highly flexible, online training
- white box: parameterization in the space of observed data
- yield interpretable classifiers/regression systems
- help to detect bias in training data, other artifacts
- provide insights into data set / problem at hand
Accuracy is not enough! [Paulo Lisboa]
Introduction
neural interpretation: activation and learning in a shallow network
external stimulus to a network of neurons
response according to weights (= expected inputs)
activation: BMU - best matching unit (and neighbors)
learning -> even stronger response to the same stimulus in future
weights represent different expected stimuli (prototypes)
Vector Quantization (VQ)
Vector Quantization: identify typical representatives of data
which capture essential features
VQ system: set of prototypes w_k
data: set of feature vectors x^μ
based on dissimilarity/distance measure d(w, x)
assignment to prototypes: e.g. Nearest Prototype Scheme
given vector x^μ, determine the winner (BMU) w* with d(w*, x^μ) ≤ d(w_k, x^μ) for all k
→ assign x^μ to prototype w*
most popular example: (squared) Euclidean distance d(w, x) = (w − x)^2 = Σ_j (w_j − x_j)^2
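The Nearest Prototype Scheme can be sketched in a few lines. This is a minimal illustration, not code from the talk; the function names `sq_euclidean` and `winner` and the toy numbers are made up for the example.

```python
def sq_euclidean(w, x):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((wj - xj) ** 2 for wj, xj in zip(w, x))


def winner(prototypes, x, dist=sq_euclidean):
    """Index of the best matching unit (closest prototype) for x."""
    return min(range(len(prototypes)), key=lambda k: dist(prototypes[k], x))


# Toy prototypes and feature vectors (illustrative values only).
prototypes = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
data = [[0.2, 0.1], [0.9, 1.2], [3.5, 0.3]]

# Each feature vector is assigned to the index of its closest prototype.
assignment = [winner(prototypes, x) for x in data]
```

Passing a different `dist` function swaps the dissimilarity measure without touching the assignment logic.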
Competitive Learning
initially: randomized w_k, e.g. in randomly selected data points
random sequential (repeated) presentation of data
… the Winner Takes it All (WTA): w* ← w* + η (x^μ − w*)
η (<1): learning rate, step size of update
competitive VQ: competition without neighborhood cooperativeness
stochastic gradient descent minimization of the Quantization Error
(here: sq. Euclidean) H_VQ = Σ_μ 1/2 (x^μ − w*)^2, with w* the winner for x^μ
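A minimal sketch of winner-takes-all competitive learning, assuming the standard update in which only the winner moves a fraction η towards the presented example. The function names, default parameters, and toy data are illustrative choices and do not come from the slides.

```python
import random


def sq_euclidean(w, x):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((wj - xj) ** 2 for wj, xj in zip(w, x))


def quantization_error(prototypes, data):
    """Sum over all data points of the distance to the closest prototype."""
    return sum(min(sq_euclidean(w, x) for w in prototypes) for x in data)


def competitive_vq(data, n_prototypes, eta=0.05, epochs=50, seed=0):
    """Winner-takes-all vector quantization by stochastic gradient descent."""
    rng = random.Random(seed)
    # initialization: prototypes placed in randomly selected data points
    prototypes = [list(x) for x in rng.sample(data, n_prototypes)]
    for _epoch in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)  # random sequential presentation of the data
        for mu in order:
            x = data[mu]
            k = min(range(n_prototypes),
                    key=lambda j: sq_euclidean(prototypes[j], x))
            # only the winner is updated; there is no neighborhood cooperation
            prototypes[k] = [wj + eta * (xj - wj)
                             for wj, xj in zip(prototypes[k], x)]
    return prototypes


# Toy data (made up): two small square clusters centred at (0, 0) and (5, 5).
offsets = [-0.2, -0.1, 0.0, 0.1, 0.2]
cluster_a = [[dx, dy] for dx in offsets for dy in offsets]
cluster_b = [[5.0 + dx, 5.0 + dy] for dx in offsets for dy in offsets]
data = cluster_a + cluster_b

prototypes = competitive_vq(data, n_prototypes=2)
```

On this toy set each prototype should end up inside one of the two clusters; `quantization_error(prototypes, data)` can be used to monitor training.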
Self-Organizing Map (SOM)
T. Kohonen. Self-Organizing Maps (Springer 1995, 1997, 2001)
neighborhood cooperativeness on a pre-defined low-dim. lattice
d-dim. lattice A of neurons (prototypes)
upon presentation of x^μ:
- determine the Best Matching Unit at position s in the lattice
- update BMU and lattice neighborhood: w_r ← w_r + η h_ρ(r, s) (x^μ − w_r)
where, e.g., h_ρ(r, s) = exp(−(r − s)^2 / (2 ρ^2)), with range ρ w.r.t. distances in lattice A
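One SOM training step can be sketched as follows, assuming a Gaussian neighborhood function of range ρ measured in the lattice. The dictionary layout, the name `som_step`, and the toy numbers are choices made for this illustration only.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance, used in data space and in the lattice."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def som_step(weights, x, eta, rho):
    """One SOM update; weights maps a lattice position to a prototype."""
    # best matching unit: lattice position s of the closest prototype
    s = min(weights, key=lambda r: sq_dist(weights[r], x))
    for r, w in list(weights.items()):
        # Gaussian neighborhood, measured in the lattice (not in data space)
        h = math.exp(-sq_dist(r, s) / (2.0 * rho ** 2))
        weights[r] = [wj + eta * h * (xj - wj) for wj, xj in zip(w, x)]
    return s


# Tiny 1 x 2 lattice with made-up prototypes.
weights = {(0, 0): [0.0, 0.0], (0, 1): [1.0, 0.0]}
bmu = som_step(weights, [0.0, 1.0], eta=0.5, rho=1.0)
```

The BMU moves with the full step size η, while its lattice neighbor moves by the smaller amount η·h. In practice η and ρ are usually decreased during training, which this sketch leaves out.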
Self-Organizing Map
prototype lattice deforms, reflecting the density of observations (illustration © Wikipedia)
SOM: provides topology/neighborhood preserving
low-dimensional representation
e.g. for inspection and visualization of structured datasets
Frequently:
unsupervised analysis, post-hoc comparison with classes of data
Hubble’s galaxy classification scheme
http://astro.physics.uiowa.edu/ITU/labs/general-astronomy/counting-galaxies/part-1-counting-galaxies.html
Illustration: Galaxy Characteristics
Illustration: Galaxy Characteristics
Numerical features describing a catalogue of galaxies
work in progress - details not (yet) disclosed
GAMA: Galaxy and Mass Assembly Survey, www.gama-survey.org
full set of 41 features
reduced set of 10 selected features
(semi-major), (semi-minor)
logistic normalization:
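The formula shown on the slide is not preserved in this transcript. As a hedged sketch, one common definition of logistic normalization first standardizes a feature and then applies the logistic function, which maps all values into (0, 1). Whether the talk used exactly this form, and the use of the sample standard deviation, are assumptions.

```python
import math


def logistic_normalize(values):
    """Standardize one feature, then squash it into (0, 1).

    Assumed form: x' = 1 / (1 + exp(-(x - mean) / std)), with the sample
    standard deviation. The feature must not be constant (std > 0).
    """
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    std = math.sqrt(var)
    return [1.0 / (1.0 + math.exp(-(v - mean) / std)) for v in values]


# Made-up feature values: the mean maps to 0.5, outliers saturate smoothly.
normalized = logistic_normalize([1.0, 2.0, 3.0, 50.0])
```

The transformation is roughly linear around the mean and compresses extreme values, so single outliers do not dominate distance computations.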
Illustration: Galaxy Classification
1 - elliptical E0-E6
2 - Little Blue Spheroids (LBS)
3 - “early type spirals”
4 - “early type barred spirals”
5 - “intermediate type spirals”
6 - “intermediate type, barred”
7 - “late type spirals & irregulars”
8,9 - artefacts, stars
Kelvin et al., MNRAS 439: 1245-1269, 2014.
SOM: (rectangular grid, ‘medium size’)
unsupervised clustering
based on 10 manually selected features
data set of ~ 5000 samples
post-labelling of prototypes
(majority of represented samples)
according to human classification
note: map with periodic boundary conditions (p.b.c., toroidal)
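Post-labelling by majority vote can be sketched as below. It assumes that every sample has already been assigned to its best matching unit; the function name `post_label` and the toy labels are illustrative.

```python
from collections import Counter


def post_label(assignments, labels, n_units):
    """Label each unit by the majority class of the samples it represents."""
    votes = [Counter() for _ in range(n_units)]
    for unit, label in zip(assignments, labels):
        votes[unit][label] += 1
    # units that represent no sample stay unlabelled (None)
    return [v.most_common(1)[0][0] if v else None for v in votes]


# Toy example: unit index per sample and the human class label per sample.
assignments = [0, 0, 0, 1, 1, 2]
labels = [1, 1, 3, 5, 5, 7]
unit_labels = post_label(assignments, labels, n_units=4)
```

The `votes` counters also hold the per-unit class fractions, which is the information a pie-chart view of the map displays.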
Self-Organizing Map
SOM toolbox:
http://www.cis.hut.fi/somtoolbox/
init: lininit, training: long, shape: toroid, mapsize: regular, lattice: rect
[Figure: SOM lattice; each unit shows the class label (1, 2, 3, 5, 6 or 7) assigned by post-labelling]
datadim: 10, normalization: logistic, size: regular, features: ExcelVarslct
SOM (rectangular grid, ‘medium size’)
unsupervised clustering
pie-charts:
percentage at which classes
are assigned to a particular unit
Self-Organizing Map
observations / suggestions:
- LBS appear well separated
- overlap of 1 / 3 and 5 / 7
with smooth transitions
- 6 and 5 mix/overlap
- “small classes” 4,8,9 hardly
represented
to do: inspect prototypes, U-matrix, ...
meta-clustering
Learning Vector Quantization
N-dimensional data, feature vectors
∙ identification of prototype vectors from labeled example data
∙ distance based classification (e.g. Euclidean)
Supervised Competitive Learning, here: heuristic LVQ1 [Kohonen, 1990]
• initialize prototype vectors for different classes
• present a single example
• identify the winner (closest prototype)
• move the winner
- closer towards the data (same class)
- away from the data (different class)
Alternatives: cost function based training
e.g. Generalized LVQ [GLVQ: Sato and Yamada, 1995]
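The LVQ1 heuristic can be sketched as a single update step. This is a minimal illustration using squared Euclidean distance; the name `lvq1_step`, the step size, and the toy prototypes are assumptions made for the example.

```python
def sq_euclidean(w, x):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((wj - xj) ** 2 for wj, xj in zip(w, x))


def lvq1_step(prototypes, proto_labels, x, y, eta):
    """Present one labelled example (x, y) and update the winner in place."""
    k = min(range(len(prototypes)),
            key=lambda j: sq_euclidean(prototypes[j], x))
    # attract the winner if its class matches the example, else repel it
    sign = 1.0 if proto_labels[k] == y else -1.0
    prototypes[k] = [wj + sign * eta * (xj - wj)
                     for wj, xj in zip(prototypes[k], x)]
    return k


# One prototype per class (toy values), then one example of each kind.
prototypes = [[0.0, 0.0], [2.0, 2.0]]
proto_labels = [1, 2]
lvq1_step(prototypes, proto_labels, [0.5, 0.5], 1, eta=0.1)  # attraction
lvq1_step(prototypes, proto_labels, [1.8, 1.8], 1, eta=0.1)  # repulsion
```

Training repeats this step over randomly ordered examples. Cost function based variants such as GLVQ replace the heuristic by a gradient step and update both the closest correct and the closest incorrect prototype.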
Learning Vector Quantization
N-dimensional data, feature vectors
∙ identification of prototype vectors from labeled example data
∙ distance based classification (e.g. Euclidean)
Nearest Prototype Classifier
∙ distance-based classification
[here: Euclidean distances]
∙ generalization ability
correct classification of new data
∙ aim: discrimination of classes
( ≠ vector quantization
or density estimation )
Distance Measures
fixed distance measures:
- select distance measures (prior knowledge, pre-processing)
- compare performance of various measures
relevance learning: adaptive distance measures
- fix only parametric form of distance measure
- data driven adaptation:
determine prototypes and distance parameters
in the same training process (e.g. cost function based GLVQ)
Example: Generalized Matrix Relevance LVQ
(Adaptive)
[Schneider, Biehl, Hammer, 2009]
Generalized Matrix Relevance LVQ (GMLVQ)
adaptive quadratic distance in LVQ: d(w, x) = (x − w)^T Λ (x − w), with Λ = Ω^T Ω
normalization: Σ_j Λ_jj = 1
standard (squared) Euclidean distance for
linearly transformed features Ω x
diagonal element Λ_jj summarizes
- the contribution of the original dimension j
- relevance of original features for the classification
off-diagonal element Λ_ij: relevance of pairs (i,j) of features
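The adaptive quadratic distance can be sketched with plain lists, assuming the usual parameterization d(w, x) = (x − w)^T Λ (x − w) with Λ = Ω^T Ω and the diagonal of Λ normalized to sum to 1. The function names and the toy matrix are illustrative; training of Ω is not shown.

```python
import math


def normalize_omega(omega):
    """Rescale Omega so that the diagonal of Omega^T Omega sums to 1."""
    norm = math.sqrt(sum(o ** 2 for row in omega for o in row))
    return [[o / norm for o in row] for row in omega]


def relevance_matrix(omega):
    """Lambda = Omega^T Omega (symmetric, positive semi-definite)."""
    n = len(omega[0])
    return [[sum(row[i] * row[j] for row in omega) for j in range(n)]
            for i in range(n)]


def gmlvq_distance(omega, w, x):
    """Squared Euclidean distance of the linearly transformed difference."""
    diff = [xj - wj for xj, wj in zip(x, w)]
    transformed = [sum(o * d for o, d in zip(row, diff)) for row in omega]
    return sum(t ** 2 for t in transformed)


# Toy transformation that weights feature 0 more strongly than feature 1.
omega = normalize_omega([[2.0, 0.0], [0.0, 1.0]])
lam = relevance_matrix(omega)
d = gmlvq_distance(omega, [0.0, 0.0], [1.0, 1.0])
```

Here `lam[0][0]` and `lam[1][1]` play the role of the feature relevances, and off-diagonal entries describe pairs of features.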
- restriction to classes with significant number of samples
- sub-sampling in order to achieve balanced training sets (5×743)
- use of all 41 features
- averages over random splits into 90% training, 10% test set
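The sub-sampling and splitting steps listed above can be sketched as follows. The helper names, the seed, and the toy class sizes are assumptions; in the setting of the slide, `per_class` would be 743 for each of the 5 classes.

```python
import random


def balanced_subsample(labels, per_class, rng):
    """Pick per_class sample indices from every class, at random."""
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    chosen = []
    for y in sorted(by_class):
        chosen.extend(rng.sample(by_class[y], per_class))
    return chosen


def random_split(indices, train_fraction, rng):
    """Shuffle the indices and cut them into a training and a test part."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


# Toy labels with unequal class sizes (30, 10 and 20 samples).
labels = [1] * 30 + [2] * 10 + [3] * 20
rng = random.Random(0)
subset = balanced_subsample(labels, per_class=10, rng=rng)
train_idx, test_idx = random_split(subset, train_fraction=0.9, rng=rng)
```

Repeating `random_split` with fresh shuffles and averaging the test results gives the averaged performance estimate. Note that this simple split does not stratify by class.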
GMLVQ analysis
one prototype per class (classes 1, 2, 3, 5, 7)
confusion matrix of the NPC (in %; rows: true class, columns: predicted class)
true \ predicted      1      2      3      5      7
1                  61.3   10.4   20.1    7.5    0.7
2                   3.1   90.5    0      1.9    4.5
3                  16.5    1.7   68.0   13.6    0.2
5                   1.6    7.8   10.0   73.6    7.0
7                   1.3   13.0    0.3   13.8   71.6
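The confusion matrix can be summarized with a few lines of arithmetic. The numbers are copied from the slide; reading the rows as true classes in the order 1, 2, 3, 5, 7 is an assumption, supported by the fact that each row sums to 100%.

```python
classes = [1, 2, 3, 5, 7]  # assumed row/column order
confusion = [
    [61.3, 10.4, 20.1, 7.5, 0.7],
    [3.1, 90.5, 0.0, 1.9, 4.5],
    [16.5, 1.7, 68.0, 13.6, 0.2],
    [1.6, 7.8, 10.0, 73.6, 7.0],
    [1.3, 13.0, 0.3, 13.8, 71.6],
]

# class-wise accuracies are the diagonal entries (in %)
class_accuracy = {c: confusion[i][i] for i, c in enumerate(classes)}

# with balanced classes the overall accuracy is the mean of the diagonal
mean_accuracy = sum(class_accuracy.values()) / len(classes)


def most_confused_with(i):
    """Class that true class number i is most often mistaken for."""
    row = confusion[i]
    j = max((k for k in range(len(row)) if k != i), key=lambda k: row[k])
    return classes[j]


confusions = {c: most_confused_with(i) for i, c in enumerate(classes)}
```

Under these assumptions the mean class-wise accuracy comes out at about 73%, and class 1 is most often mistaken for class 3, which matches the overlap of ellipticals and early type spirals seen in the SOM.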
GMLVQ analysis
diagonal of the relevance matrix:
continuous weights
- agrees only partially
with hand-crafted set
- alternative set of features?
- correlations between features?
projection of the data set on leading
eigenvectors of Λ: discriminative
low-dim. representation:
e.g. strong overlap of classes 1 / 3
(elliptical / early type spirals)
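Projecting data on the leading eigenvectors of Λ can be sketched without external libraries by power iteration with deflation. This is an illustrative stand-in for a proper eigen-solver and not the toolbox implementation; the matrix and data values are made up.

```python
import math


def matvec(m, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def leading_eigenvectors(lam, k=2, iters=500):
    """Leading k eigenvectors of a symmetric positive semi-definite matrix."""
    n = len(lam)
    m = [row[:] for row in lam]
    vectors = []
    for _component in range(k):
        # generic start vector; assumed not orthogonal to the eigenvector
        v = [1.0 + 0.1 * i for i in range(n)]
        for _it in range(iters):
            u = matvec(m, v)
            norm = math.sqrt(sum(c * c for c in u))
            if norm == 0.0:
                break
            v = [c / norm for c in u]
        value = sum(a * b for a, b in zip(v, matvec(m, v)))
        vectors.append(v)
        # deflation: remove the component that was just found
        m = [[m[i][j] - value * v[i] * v[j] for j in range(n)]
             for i in range(n)]
    return vectors


def project(x, vectors):
    """Coordinates of x along the given eigenvectors."""
    return [sum(vj * xj for vj, xj in zip(v, x)) for v in vectors]


# Toy relevance matrix (symmetric, trace 1) and two toy feature vectors.
lam = [[0.5, 0.2, 0.0], [0.2, 0.4, 0.0], [0.0, 0.0, 0.1]]
vectors = leading_eigenvectors(lam, k=2)
low_dim = [project(x, vectors) for x in [[1.0, 0.0, 2.0], [0.5, 0.5, 0.5]]]
```

Plotting the two coordinates in `low_dim` per sample, colored by class, gives the kind of discriminative two-dimensional view the slide refers to.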
Summary
Prototype-based systems in machine learning:
represent data in terms of exemplars, white box
parameterization of clustering / classification / regression
Unsupervised Learning
data reduction, vector quantization, clustering
low-dimensional representation, topology preserving SOM
Supervised Learning
example: LVQ for classification with adaptive distance
Generalized Matrix Relevance LVQ (GMLVQ) *
white box, transparent, intuitive, powerful
accuracy is not enough: insight into problem / data set
e.g. with respect to feature selection / weighting
* GMLVQ (matlab) toolboxes: www.cs.rug.nl/~biehl
Unsupervised Learning
Neural Gas (NG)
Generative Topographic Map (GTM)
Relevance learning in dimension reduction
Regression
Ordinal Regression in GMLVQ
Radial Basis Function networks (RBF)
Probabilistic classification
likelihood-based classifiers (Robust Soft LVQ)
Distances / Similarities
unconventional, problem-specific similarity measures
e.g. functional data (time series, spectra, histograms...)
non-vectorial data, relational data
relevances: weak/strong, bounds
...
there is a lot more...
review: WIRES Cognitive Science (2016)

More Related Content

What's hot

About functional SIR
About functional SIRAbout functional SIR
About functional SIR
tuxette
 
Similarity Features, and their Role in Concept Alignment Learning
Similarity Features, and their Role in Concept Alignment Learning Similarity Features, and their Role in Concept Alignment Learning
Similarity Features, and their Role in Concept Alignment Learning Shenghui Wang
 
Kernel methods and variable selection for exploratory analysis and multi-omic...
Kernel methods and variable selection for exploratory analysis and multi-omic...Kernel methods and variable selection for exploratory analysis and multi-omic...
Kernel methods and variable selection for exploratory analysis and multi-omic...
tuxette
 
A review on structure learning in GNN
A review on structure learning in GNNA review on structure learning in GNN
A review on structure learning in GNN
tuxette
 
The statistical physics of learning - revisited
The statistical physics of learning - revisitedThe statistical physics of learning - revisited
The statistical physics of learning - revisited
University of Groningen
 
About functional SIR
About functional SIRAbout functional SIR
About functional SIR
tuxette
 
Graph Neural Network in practice
Graph Neural Network in practiceGraph Neural Network in practice
Graph Neural Network in practice
tuxette
 
Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biology Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biology
tuxette
 
Differential analyses of structures in HiC data
Differential analyses of structures in HiC dataDifferential analyses of structures in HiC data
Differential analyses of structures in HiC data
tuxette
 
4 avrachenkov
4 avrachenkov4 avrachenkov
4 avrachenkov
Yandex
 
Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biologyKernel methods for data integration in systems biology
Kernel methods for data integration in systems biology
tuxette
 
How to Accelerate Molecular Simulations with Data? by Žofia Trsťanová, Machin...
How to Accelerate Molecular Simulations with Data? by Žofia Trsťanová, Machin...How to Accelerate Molecular Simulations with Data? by Žofia Trsťanová, Machin...
How to Accelerate Molecular Simulations with Data? by Žofia Trsťanová, Machin...
Paris Women in Machine Learning and Data Science
 
cvpr2009: class specific hough forest for object detection
cvpr2009: class specific hough forest for object detectioncvpr2009: class specific hough forest for object detection
cvpr2009: class specific hough forest for object detectionzukun
 
Tutorial of topological data analysis part 3(Mapper algorithm)
Tutorial of topological data analysis part 3(Mapper algorithm)Tutorial of topological data analysis part 3(Mapper algorithm)
Tutorial of topological data analysis part 3(Mapper algorithm)
Ha Phuong
 
An Importance Sampling Approach to Integrate Expert Knowledge When Learning B...
An Importance Sampling Approach to Integrate Expert Knowledge When Learning B...An Importance Sampling Approach to Integrate Expert Knowledge When Learning B...
An Importance Sampling Approach to Integrate Expert Knowledge When Learning B...
NTNU
 
Random Matrix Theory in Array Signal Processing: Application Examples
Random Matrix Theory in Array Signal Processing: Application ExamplesRandom Matrix Theory in Array Signal Processing: Application Examples
Random Matrix Theory in Array Signal Processing: Application Examples
Förderverein Technische Fakultät
 
MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...
MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...
MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...
The Statistical and Applied Mathematical Sciences Institute
 
Introduction to search and optimisation for the design theorist
Introduction to search and optimisation for the design theoristIntroduction to search and optimisation for the design theorist
Introduction to search and optimisation for the design theorist
Akin Osman Kazakci
 
Unsupervised Learning and Image Classification in High Performance Computing ...
Unsupervised Learning and Image Classification in High Performance Computing ...Unsupervised Learning and Image Classification in High Performance Computing ...
Unsupervised Learning and Image Classification in High Performance Computing ...
HPCC Systems
 
Learning to discover monte carlo algorithm on spin ice manifold
Learning to discover monte carlo algorithm on spin ice manifoldLearning to discover monte carlo algorithm on spin ice manifold
Learning to discover monte carlo algorithm on spin ice manifold
Kai-Wen Zhao
 

What's hot (20)

About functional SIR
About functional SIRAbout functional SIR
About functional SIR
 
Similarity Features, and their Role in Concept Alignment Learning
Similarity Features, and their Role in Concept Alignment Learning Similarity Features, and their Role in Concept Alignment Learning
Similarity Features, and their Role in Concept Alignment Learning
 
Kernel methods and variable selection for exploratory analysis and multi-omic...
Kernel methods and variable selection for exploratory analysis and multi-omic...Kernel methods and variable selection for exploratory analysis and multi-omic...
Kernel methods and variable selection for exploratory analysis and multi-omic...
 
A review on structure learning in GNN
A review on structure learning in GNNA review on structure learning in GNN
A review on structure learning in GNN
 
The statistical physics of learning - revisited
The statistical physics of learning - revisitedThe statistical physics of learning - revisited
The statistical physics of learning - revisited
 
About functional SIR
About functional SIRAbout functional SIR
About functional SIR
 
Graph Neural Network in practice
Graph Neural Network in practiceGraph Neural Network in practice
Graph Neural Network in practice
 
Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biology Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biology
 
Differential analyses of structures in HiC data
Differential analyses of structures in HiC dataDifferential analyses of structures in HiC data
Differential analyses of structures in HiC data
 
4 avrachenkov
4 avrachenkov4 avrachenkov
4 avrachenkov
 
Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biologyKernel methods for data integration in systems biology
Kernel methods for data integration in systems biology
 
How to Accelerate Molecular Simulations with Data? by Žofia Trsťanová, Machin...
How to Accelerate Molecular Simulations with Data? by Žofia Trsťanová, Machin...How to Accelerate Molecular Simulations with Data? by Žofia Trsťanová, Machin...
How to Accelerate Molecular Simulations with Data? by Žofia Trsťanová, Machin...
 
cvpr2009: class specific hough forest for object detection
cvpr2009: class specific hough forest for object detectioncvpr2009: class specific hough forest for object detection
cvpr2009: class specific hough forest for object detection
 
Tutorial of topological data analysis part 3(Mapper algorithm)
Tutorial of topological data analysis part 3(Mapper algorithm)Tutorial of topological data analysis part 3(Mapper algorithm)
Tutorial of topological data analysis part 3(Mapper algorithm)
 
An Importance Sampling Approach to Integrate Expert Knowledge When Learning B...
An Importance Sampling Approach to Integrate Expert Knowledge When Learning B...An Importance Sampling Approach to Integrate Expert Knowledge When Learning B...
An Importance Sampling Approach to Integrate Expert Knowledge When Learning B...
 
Random Matrix Theory in Array Signal Processing: Application Examples
Random Matrix Theory in Array Signal Processing: Application ExamplesRandom Matrix Theory in Array Signal Processing: Application Examples
Random Matrix Theory in Array Signal Processing: Application Examples
 
MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...
MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...
MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...
 
Introduction to search and optimisation for the design theorist
Introduction to search and optimisation for the design theoristIntroduction to search and optimisation for the design theorist
Introduction to search and optimisation for the design theorist
 
Unsupervised Learning and Image Classification in High Performance Computing ...
Unsupervised Learning and Image Classification in High Performance Computing ...Unsupervised Learning and Image Classification in High Performance Computing ...
Unsupervised Learning and Image Classification in High Performance Computing ...
 
Learning to discover monte carlo algorithm on spin ice manifold
Learning to discover monte carlo algorithm on spin ice manifoldLearning to discover monte carlo algorithm on spin ice manifold
Learning to discover monte carlo algorithm on spin ice manifold
 

Similar to 2017: Prototype-based models in unsupervised and supervised machine learning

Biehl hanze-2021
Biehl hanze-2021Biehl hanze-2021
Biehl hanze-2021
University of Groningen
 
L4 cluster analysis NWU 4.3 Graphics Course
L4 cluster analysis NWU 4.3 Graphics CourseL4 cluster analysis NWU 4.3 Graphics Course
L4 cluster analysis NWU 4.3 Graphics Course
Mohaiminur Rahman
 
2013: Prototype-based learning and adaptive distances for classification
2013: Prototype-based learning and adaptive distances for classification2013: Prototype-based learning and adaptive distances for classification
2013: Prototype-based learning and adaptive distances for classification
University of Groningen
 
prototypes-AMALEA.pdf
prototypes-AMALEA.pdfprototypes-AMALEA.pdf
prototypes-AMALEA.pdf
University of Groningen
 
3680-NoCA.pptx
3680-NoCA.pptx3680-NoCA.pptx
3680-NoCA.pptxgrssieee
 
Classification accuracy of sar images for various land
Classification accuracy of sar images for various landClassification accuracy of sar images for various land
Classification accuracy of sar images for various land
eSAT Publishing House
 
Object class recognition by unsupervide scale invariant learning - kunal
Object class recognition by unsupervide scale invariant learning - kunalObject class recognition by unsupervide scale invariant learning - kunal
Object class recognition by unsupervide scale invariant learning - kunal
Kunal Kishor Nirala
 
Imecs2012 pp440 445
Imecs2012 pp440 445Imecs2012 pp440 445
Imecs2012 pp440 445Rasha Orban
 
Cs501 cluster analysis
Cs501 cluster analysisCs501 cluster analysis
Cs501 cluster analysis
Kamal Singh Lodhi
 
3D Scene Analysis via Sequenced Predictions over Points and Regions
3D Scene Analysis via Sequenced Predictions over Points and Regions3D Scene Analysis via Sequenced Predictions over Points and Regions
3D Scene Analysis via Sequenced Predictions over Points and Regions
Flavia Grosan
 
. An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic .... An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic ...butest
 
Health-e-Child CaseReasoner
Health-e-Child CaseReasonerHealth-e-Child CaseReasoner
Health-e-Child CaseReasoner
GaborRendes
 
Clustering techniques final
Clustering techniques finalClustering techniques final
Clustering techniques final
Benard Maina
 
Irrera gold2010
Irrera gold2010Irrera gold2010
Irrera gold2010grssieee
 
SVM - Functional Verification
SVM - Functional VerificationSVM - Functional Verification
SVM - Functional Verification
Sai Kiran Kadam
 
Conceptual models of real world geographical phenomena (epm107_2007)
Conceptual models of real world geographical phenomena (epm107_2007)Conceptual models of real world geographical phenomena (epm107_2007)
Conceptual models of real world geographical phenomena (epm107_2007)esambale
 
Data mining concepts and techniques Chapter 10
Data mining concepts and techniques Chapter 10Data mining concepts and techniques Chapter 10
Data mining concepts and techniques Chapter 10
mqasimsheikh5
 
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural N...
Classification of Iris Data using Kernel Radial Basis Probabilistic  Neural N...Classification of Iris Data using Kernel Radial Basis Probabilistic  Neural N...
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural N...
Scientific Review SR
 

Similar to 2017: Prototype-based models in unsupervised and supervised machine learning (20)

Biehl hanze-2021
Biehl hanze-2021Biehl hanze-2021
Biehl hanze-2021
 
L4 cluster analysis NWU 4.3 Graphics Course
L4 cluster analysis NWU 4.3 Graphics CourseL4 cluster analysis NWU 4.3 Graphics Course
L4 cluster analysis NWU 4.3 Graphics Course
 
2013: Prototype-based learning and adaptive distances for classification
2013: Prototype-based learning and adaptive distances for classification2013: Prototype-based learning and adaptive distances for classification
2013: Prototype-based learning and adaptive distances for classification
 
prototypes-AMALEA.pdf
prototypes-AMALEA.pdfprototypes-AMALEA.pdf
prototypes-AMALEA.pdf
 
3680-NoCA.pptx
3680-NoCA.pptx3680-NoCA.pptx
3680-NoCA.pptx
 
Classification accuracy of sar images for various land
Classification accuracy of sar images for various landClassification accuracy of sar images for various land
Classification accuracy of sar images for various land
 
Object class recognition by unsupervide scale invariant learning - kunal
Object class recognition by unsupervide scale invariant learning - kunalObject class recognition by unsupervide scale invariant learning - kunal
Object class recognition by unsupervide scale invariant learning - kunal
 
Imecs2012 pp440 445
Imecs2012 pp440 445Imecs2012 pp440 445
Imecs2012 pp440 445
 
Cs501 cluster analysis
Cs501 cluster analysisCs501 cluster analysis
Cs501 cluster analysis
 
SOM-VIS
SOM-VISSOM-VIS
SOM-VIS
 
3D Scene Analysis via Sequenced Predictions over Points and Regions
3D Scene Analysis via Sequenced Predictions over Points and Regions3D Scene Analysis via Sequenced Predictions over Points and Regions
3D Scene Analysis via Sequenced Predictions over Points and Regions
 
. An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic .... An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic ...
 
Health-e-Child CaseReasoner
Health-e-Child CaseReasonerHealth-e-Child CaseReasoner
Health-e-Child CaseReasoner
 
Clustering techniques final
Clustering techniques finalClustering techniques final
Clustering techniques final
 
ppt
pptppt
ppt
 
Irrera gold2010
Irrera gold2010Irrera gold2010
Irrera gold2010
 
SVM - Functional Verification
SVM - Functional VerificationSVM - Functional Verification
SVM - Functional Verification
 
Conceptual models of real world geographical phenomena (epm107_2007)
Conceptual models of real world geographical phenomena (epm107_2007)Conceptual models of real world geographical phenomena (epm107_2007)
Conceptual models of real world geographical phenomena (epm107_2007)
 
Data mining concepts and techniques Chapter 10
Data mining concepts and techniques Chapter 10Data mining concepts and techniques Chapter 10
Data mining concepts and techniques Chapter 10
 
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural N...
Classification of Iris Data using Kernel Radial Basis Probabilistic  Neural N...Classification of Iris Data using Kernel Radial Basis Probabilistic  Neural N...
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural N...
 

More from University of Groningen

Interpretable machine learning in endocrinology, M. Biehl, APPIS 2024
Interpretable machine learning in endocrinology, M. Biehl, APPIS 2024Interpretable machine learning in endocrinology, M. Biehl, APPIS 2024
Interpretable machine learning in endocrinology, M. Biehl, APPIS 2024
University of Groningen
 
ESE-Eyes-2023.pdf
ESE-Eyes-2023.pdfESE-Eyes-2023.pdf
ESE-Eyes-2023.pdf
University of Groningen
 
APPIS-FDGPET.pdf
APPIS-FDGPET.pdfAPPIS-FDGPET.pdf
APPIS-FDGPET.pdf
University of Groningen
 
stat-phys-appis-reduced.pdf
stat-phys-appis-reduced.pdfstat-phys-appis-reduced.pdf
stat-phys-appis-reduced.pdf
University of Groningen
 
stat-phys-AMALEA.pdf
stat-phys-AMALEA.pdfstat-phys-AMALEA.pdf
stat-phys-AMALEA.pdf
University of Groningen
 
Evidence for tissue and stage-specific composition of the ribosome: machine l...
Evidence for tissue and stage-specific composition of the ribosome: machine l...Evidence for tissue and stage-specific composition of the ribosome: machine l...
Evidence for tissue and stage-specific composition of the ribosome: machine l...
University of Groningen
 
The statistical physics of learning revisted: Phase transitions in layered ne...
The statistical physics of learning revisted: Phase transitions in layered ne...The statistical physics of learning revisted: Phase transitions in layered ne...
The statistical physics of learning revisted: Phase transitions in layered ne...
University of Groningen
 
Interpretable machine-learning (in endocrinology and beyond)
Interpretable machine-learning (in endocrinology and beyond)Interpretable machine-learning (in endocrinology and beyond)
Interpretable machine-learning (in endocrinology and beyond)
University of Groningen
 
2020: Prototype-based classifiers and relevance learning: medical application...
2020: Prototype-based classifiers and relevance learning: medical application...2020: Prototype-based classifiers and relevance learning: medical application...
2020: Prototype-based classifiers and relevance learning: medical application...
University of Groningen
 
2020: Phase transitions in layered neural networks: ReLU vs. sigmoidal activa...
2020: Phase transitions in layered neural networks: ReLU vs. sigmoidal activa...2020: Phase transitions in layered neural networks: ReLU vs. sigmoidal activa...
2020: Phase transitions in layered neural networks: ReLU vs. sigmoidal activa...
University of Groningen
 
2020: So you thought the ribosome was constant and conserved ...
2020: So you thought the ribosome was constant and conserved ... 2020: So you thought the ribosome was constant and conserved ...
2020: So you thought the ribosome was constant and conserved ...
University of Groningen
 
Prototype-based classifiers and their applications in the life sciences
Prototype-based classifiers and their applications in the life sciencesPrototype-based classifiers and their applications in the life sciences
Prototype-based classifiers and their applications in the life sciences
University of Groningen
 
2013: Sometimes you can trust a rat - The sbv improver species translation ch...
2013: Sometimes you can trust a rat - The sbv improver species translation ch...2013: Sometimes you can trust a rat - The sbv improver species translation ch...
2013: Sometimes you can trust a rat - The sbv improver species translation ch...
University of Groningen
 
2015: Distance based classifiers: Basic concepts, recent developments and app...
2015: Distance based classifiers: Basic concepts, recent developments and app...2015: Distance based classifiers: Basic concepts, recent developments and app...
2015: Distance based classifiers: Basic concepts, recent developments and app...
University of Groningen
 
2016: Classification of FDG-PET Brain Data
2016: Classification of FDG-PET Brain Data2016: Classification of FDG-PET Brain Data
2016: Classification of FDG-PET Brain Data
University of Groningen
 
2016: Predicting Recurrence in Clear Cell Renal Cell Carcinoma
2016: Predicting Recurrence in Clear Cell Renal Cell Carcinoma2016: Predicting Recurrence in Clear Cell Renal Cell Carcinoma
2016: Predicting Recurrence in Clear Cell Renal Cell Carcinoma
University of Groningen
 
June 2017: Biomedical applications of prototype-based classifiers and relevan...
June 2017: Biomedical applications of prototype-based classifiers and relevan...June 2017: Biomedical applications of prototype-based classifiers and relevan...
June 2017: Biomedical applications of prototype-based classifiers and relevan...
University of Groningen
 

2017: Prototype-based models in unsupervised and supervised machine learning

  • 1. Prototype-based models in unsupervised and supervised machine learning
      Michael Biehl, Aleke Nolte: Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen, NL
      Lingyu Wang: Kapteyn Astronomical Inst. and SRON Groningen, Astrophysics Science Group, Groningen, NL
      SUNDIAL H2020 Network
      www.cs.rug.nl/~biehl, www.astro.rug.nl/~sundial/ (pre-/reprints, available code)
  • 2. Overview (Astroinformatics, Cape Town, November 2017)
      Introduction / Motivation: prototypes, exemplars; neural activation / learning
      Supervised Learning: Learning Vector Quantization (LVQ); adaptive distances and relevance learning
      Unsupervised Learning: Vector Quantization (VQ), competitive learning; Kohonen’s Self-Organizing Map (SOM)
      Illustration: SOM-clustering of galaxy data, post-labelling; supervised classification, LVQ + relevance learning
  • 3. Introduction
      prototypes, exemplars: representation of information in terms of typical representatives (e.g. of a class of objects); a much-debated concept in cognitive psychology
      machine learning: prototype- (and distance-) based systems
      - easy to implement, highly flexible, online training
      - white box: parameterization in the space of observed data
      - yield interpretable classifiers/regression systems
      - help to detect bias in training data, other artifacts
      - provide insights into data set / problem at hand
      Accuracy is not enough! [Paulo Lisboa]
  • 4. Introduction
      neural interpretation: activation and learning in a shallow network
      - external stimulus to a network of neurons
      - response according to weights (= expected inputs)
      - activation: BMU - best matching unit (and neighbors)
      - learning -> even stronger response to the same stimulus in future
      - weights represent different expected stimuli (prototypes)
  • 5. Vector Quantization (VQ)
      Vector Quantization: identify typical representatives of data which capture essential features
      VQ system: set of prototypes; data: set of feature vectors
      based on dis-similarity/distance measure; most popular example: (squared) Euclidean distance
      assignment to prototypes, e.g. Nearest Prototype Scheme: given vector xμ, determine winner (BMU) → assign xμ to prototype w*
  • 6. Competitive Learning
      initially: randomized wk, e.g. in randomly selected data points
      random sequential (repeated) presentation of data
      update rule: the Winner Takes it All (WTA), with learning rate η (<1) as the step size of the update
      stochastic gradient descent minimization of the Quantization Error (here: sq. Euclidean)
      competitive VQ: competition without neighborhood cooperativeness
  • 7. Self-Organizing Map (SOM)
      T. Kohonen. Self-Organizing Maps (Springer 1995, 1997, 2001)
      neighborhood cooperativeness on a pre-defined low-dim. lattice: d-dim. lattice A of neurons (prototypes)
      upon presentation of xμ:
      - determine the Best Matching Unit at position s in the lattice
      - update BMU and lattice neighborhood, where the range ρ is defined w.r.t. distances in lattice A
  • 8. Self-Organizing Map
      prototype lattice deforms, reflecting the density of observations (figure © Wikipedia)
      SOM: provides topology/neighborhood preserving low-dimensional representation, e.g. for inspection and visualization of structured datasets
      Frequently: unsupervised analysis, post-hoc comparison with classes of data
  • 9. Illustration: Galaxy Characteristics
      Hubble’s galaxy classification scheme
      http://astro.physics.uiowa.edu/ITU/labs/general-astronomy/counting-galaxies/part-1-counting-galaxies.html
  • 10. Illustration: Galaxy Characteristics
      Numerical features describing a catalogue of galaxies; work in progress - details not (yet) disclosed
      GAMA: Galaxy and Mass Assembly Survey, www.gama-survey.org
      full set of 41 features; reduced set of 10 selected features (among them: semi-major, semi-minor)
      logistic normalization of the features
  • 11. Illustration: Galaxy Classification
      1 - elliptical E0-E6
      2 - Little Blue Spheroids (LBS)
      3 - “early type spirals”
      4 - “early type barred spirals”
      5 - “intermediate type spirals”
      6 - “intermediate type, barred”
      7 - “late type spirals & irregulars”
      8, 9 - artefacts, stars
      Kelvin et al., MNRAS 439: 1245-1269, 2014.
  • 12. Self-Organizing Map
      SOM (rectangular grid, ‘medium size’): unsupervised clustering based on 10 manually selected features; data set of ~5000 samples
      post-labelling of prototypes (majority of represented samples) according to human classification
      note: map with periodic boundary conditions (toroidal)
      SOM toolbox: http://www.cis.hut.fi/somtoolbox/
      settings: init: lininit, training: long, shape: toroid, mapsize: regular, lattice: rect, datadim: 10, normalization: logistic, size: regular, features: ExcelVarslct
      [Figure: the trained map, each unit labelled with the majority class of the samples it represents; the labels 1, 2, 3, 5, 6 and 7 appear on the map]
  • 13. Self-Organizing Map
      SOM (rectangular grid, ‘medium size’), unsupervised clustering
      pie-charts: percentage at which classes are assigned to a particular unit
      observations / suggestions:
      - LBS appear well separated
      - overlap of 1 / 3 and 5 / 7 with smooth transitions
      - 6 and 5 mix/overlap
      - “small classes” 4, 8, 9 hardly represented
      to do: inspect prototypes, U-matrix, ... meta-clustering
  • 14. Learning Vector Quantization
      N-dimensional data, feature vectors
      ∙ identification of prototype vectors from labeled example data
      ∙ distance based classification (e.g. Euclidean)
      Supervised Competitive Learning, here: heuristic LVQ1 [Kohonen, 1990]
      - initialize prototype vectors for different classes
      - present a single example
      - identify the winner (closest prototype)
      - move the winner closer towards the data (same class) or away from the data (different class)
      Alternatives: cost function based training, e.g. Generalized LVQ [GLVQ: Sato and Yamada, 1995]
  • 15. Learning Vector Quantization
      N-dimensional data, feature vectors
      ∙ identification of prototype vectors from labeled example data
      ∙ distance based classification (e.g. Euclidean)
      Nearest Prototype Classifier
      ∙ distance-based classification [here: Euclidean distances]
      ∙ generalization ability: correct classification of new data
      ∙ aim: discrimination of classes ( ≠ vector quantization or density estimation )
  • 16. (Adaptive) Distance Measures
      fixed distance measures:
      - select distance measures (prior knowledge, pre-processing)
      - compare performance of various measures
      relevance learning: adaptive distance measures
      - fix only parametric form of distance measure
      - data driven adaptation: determine prototypes and distance parameters in the same training process (e.g. cost function based GLVQ)
      Example: Generalized Matrix Relevance LVQ [Schneider, Biehl, Hammer, 2009]
  • 17. Generalized Matrix Relevance LVQ (GMLVQ)
      adaptive quadratic distance in LVQ, parameterized by a relevance matrix Λ; it corresponds to the standard (squared) Euclidean distance for linearly transformed features
      normalization of the relevance matrix
      diagonal elements of Λ: summarize the contribution of the original dimension j, i.e. the relevance of original features for the classification
      off-diagonal elements of Λ: relevance of pairs (i,j) of features
  • 18. GMLVQ analysis
      - restriction to classes with significant number of samples
      - sub-sampling in order to achieve balanced training sets (5×743)
      - use of all 41 features
      - one prototype per class
      - averages over random splits in 90% training, 10% test set
      confusion matrix of the NPC (rows: true class, columns: predicted class, entries in %):
      true \ predicted      1      2      3      5      7
      1                  61.3   10.4   20.1    7.5    0.7
      2                   3.1   90.5    0      1.9    4.5
      3                  16.5    1.7   68.0   13.6    0.2
      5                   1.6    7.8   10.0   73.6    7.0
      7                   1.3   13.0    0.3   13.8   71.6
  • 19. GMLVQ analysis
      diagonal of the relevance matrix: continuous weights
      - alternative set of features?
      - agrees only partially with hand-crafted set
      - correlations between features?
      projection of the data set on leading eigenvectors of Λ: discriminative low-dim. representation, e.g. strong overlap of classes 1 / 3 (elliptical / early type spirals)
  • 20. Summary
      Prototype-based systems in machine learning: represent data in terms of exemplars; white box parameterization of clustering / classification / regression
      Unsupervised Learning: data reduction, vector quantization, clustering; low-dimensional representation, topology preserving SOM
      Supervised Learning: example: LVQ for classification with adaptive distance, Generalized Matrix Relevance LVQ (GMLVQ) *
      white box, transparent, intuitive, powerful
      accuracy is not enough: insight into problem / data set, e.g. with respect to feature selection / weighting
      * GMLVQ (matlab) toolboxes: www.cs.rug.nl/~biehl
  • 21. ... there is a lot more ...
      Unsupervised Learning: Neural Gas (NG); Generative Topographic Map (GTM); relevance learning in dimension reduction
      Regression: Ordinal Regression in GMLVQ; Radial Basis Function networks (RBF)
      Probabilistic classification: likelihood-based classifiers (Robust Soft LVQ)
      Distances / Similarities: unconventional, problem-specific similarity measures, e.g. functional data (time series, spectra, histograms ...), non-vectorial data, relational data; relevances: weak/strong, bounds
  • 22. review: WIRES Cognitive Science (2016)
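
The nearest prototype scheme of slide 5 and the Winner-Takes-All training of slide 6 can be sketched in plain Python. The update formula itself is not preserved in this transcript, so the sketch assumes the usual form w* ← w* + η (x − w*), applied to the winner only. The function names (`sq_euclid`, `winner`, `competitive_vq`, `quantization_error`) and the default values of `eta` and `epochs` are illustrative choices, not taken from the slides.

```python
import random


def sq_euclid(w, x):
    """Squared Euclidean distance between prototype w and data point x."""
    return sum((wi - xi) ** 2 for wi, xi in zip(w, x))


def winner(prototypes, x):
    """Index of the best matching unit (closest prototype) for x."""
    return min(range(len(prototypes)), key=lambda k: sq_euclid(prototypes[k], x))


def competitive_vq(data, n_prototypes, eta=0.05, epochs=20, seed=0):
    """Winner-Takes-All competitive learning (assumed standard update)."""
    rng = random.Random(seed)
    # Initialize prototypes in randomly selected data points.
    prototypes = [list(x) for x in rng.sample(data, n_prototypes)]
    for _ in range(epochs):
        # Random sequential presentation of the data.
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            k = winner(prototypes, x)
            # Only the winner moves, a fraction eta of the way towards x.
            prototypes[k] = [w + eta * (xi - w) for w, xi in zip(prototypes[k], x)]
    return prototypes


def quantization_error(data, prototypes):
    """Sum of squared distances of all data points to their closest prototype."""
    return sum(min(sq_euclid(w, x) for w in prototypes) for x in data)
```

Calling `competitive_vq(data, 2)` on a list of feature vectors returns two prototypes; `quantization_error` gives the quantity that the stochastic gradient descent of slide 6 reduces on average.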
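
A single SOM training step as outlined on slide 7 might look as follows. The neighborhood function is not preserved in the transcript; this sketch assumes a Gaussian in the lattice distance with range ρ (`rho`), which is a common choice but not necessarily the one used in the talk. The name `som_step` and its arguments are hypothetical, and periodic (toroidal) boundary conditions such as those of the galaxy map on slide 12 are not handled.

```python
import math


def som_step(weights, positions, x, eta=0.1, rho=1.0):
    """One SOM update: find the BMU, then move it and its lattice neighbors.

    weights   -- list of prototype vectors, one per lattice unit
    positions -- list of lattice coordinates (tuples), one per unit
    x         -- the presented feature vector
    Returns the index s of the best matching unit.
    """
    # Best matching unit: smallest squared Euclidean distance in data space.
    s = min(
        range(len(weights)),
        key=lambda r: sum((w - xi) ** 2 for w, xi in zip(weights[r], x)),
    )
    for r in range(len(weights)):
        # Distance between unit r and the BMU, measured in the lattice.
        lattice_d2 = sum((a - b) ** 2 for a, b in zip(positions[r], positions[s]))
        # Assumed Gaussian neighborhood function with range rho.
        h = math.exp(-lattice_d2 / (2.0 * rho ** 2))
        weights[r] = [w + eta * h * (xi - w) for w, xi in zip(weights[r], x)]
    return s
```

Units close to the BMU in the lattice are pulled strongly towards the data point, distant units hardly move; shrinking `rho` towards zero recovers the competitive learning of slide 6.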
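
The heuristic LVQ1 procedure listed on slide 14 can be illustrated with a minimal sketch. It assumes squared Euclidean distance and the usual update w ← w ± η (x − w), with the sign chosen by whether the winner carries the same class label as the example; the names `nearest`, `lvq1_step` and `classify` are illustrative and not taken from the slides or from the GMLVQ toolbox.

```python
def sq_euclid(w, x):
    """Squared Euclidean distance between prototype w and data point x."""
    return sum((wi - xi) ** 2 for wi, xi in zip(w, x))


def nearest(prototypes, x):
    """Index of the closest prototype (the winner)."""
    return min(range(len(prototypes)), key=lambda k: sq_euclid(prototypes[k], x))


def lvq1_step(prototypes, labels, x, y, eta=0.05):
    """Present one labeled example (x, y) and update the winner in place."""
    k = nearest(prototypes, x)
    # Same class: attract the winner; different class: repel it.
    sign = 1.0 if labels[k] == y else -1.0
    prototypes[k] = [w + sign * eta * (xi - w) for w, xi in zip(prototypes[k], x)]
    return k


def classify(prototypes, labels, x):
    """Nearest Prototype Classifier: return the label of the closest prototype."""
    return labels[nearest(prototypes, x)]
```

Training repeats `lvq1_step` over randomly ordered labeled examples; `classify` is the Nearest Prototype Classifier of slide 15.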
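
The adaptive distance of slide 17 is not spelled out in the transcript. The sketch below assumes the form commonly used for matrix relevance LVQ: d(w, x) = (x − w)ᵀ Λ (x − w) with Λ = ΩᵀΩ and the normalization Σᵢ Λᵢᵢ = 1. All function names are hypothetical, and the training of Ω itself (e.g. by a GLVQ cost function) is not shown.

```python
def gmlvq_distance(omega, w, x):
    """Squared Euclidean distance of the linearly transformed difference Omega (x - w)."""
    diff = [xi - wi for xi, wi in zip(x, w)]
    projected = [sum(o * d for o, d in zip(row, diff)) for row in omega]
    return sum(p * p for p in projected)


def relevance_matrix(omega):
    """Lambda = Omega^T Omega; diagonal entries weight single features,
    off-diagonal entries weight pairs (i, j) of features."""
    n = len(omega[0])
    return [
        [sum(row[i] * row[j] for row in omega) for j in range(n)]
        for i in range(n)
    ]


def normalize_omega(omega):
    """Rescale Omega so that the trace of Lambda equals 1."""
    norm = sum(o * o for row in omega for o in row) ** 0.5
    return [[o / norm for o in row] for row in omega]
```

With Ω proportional to the identity this reduces to a scaled squared Euclidean distance, so the fixed-distance LVQ of slide 15 is a special case of the assumed form.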

Editor's Notes

  1. 1 – elliptical; 2 – Little Blue Spheroids; 3 – early type spirals; 4 – early type barred spirals; 5 – intermediate type spirals; 6 – intermediate type barred; 7 – irregular