07 dimensionality reduction
Stefano Cavuoti
INAF – Capodimonte Astronomical Observatory – Napoli
By definition, machine learning models are based on learning and self-adaptive techniques.
Real-world data are intrinsically carriers of embedded information, hidden by noise.
In almost all cases the signal-to-noise (S/N) ratio is low and the sheer amount of data prevents human exploration. We do not know a priori which features of a pattern carry the most useful information, nor how and where the correlations among features best constrain the solution.
If we want to classify the cultivar from which a bottle
of wine is made, we could analyze many parameters:
• Alcohol
• Malic acid
• Ash
• Alcalinity of ash
• Magnesium
• Total phenols
• Flavanoids
• Nonflavanoid phenols
• Proanthocyanins
• Color intensity
• Hue
• OD280/OD315 of diluted wines
• Proline
• …
• Date of production
The date of production is not related to the cultivar, so for this task the parameter is useless, or could even be harmful!
The following example is extracted from:
http://www.visiondummy.com/2014/04/curse-dimensionality-affect-classification/
Given a set of images, each depicting either a cat or a dog, we would like
to create a classifier able to distinguish dogs from cats automatically.
To do so, we first need to think about a descriptor for each object class that
can be expressed by numbers, such that a mathematical algorithm can use
those numbers to recognize objects. We could for instance argue that cats and
dogs generally differ in color. A possible descriptor discriminating these two
classes could then consist of three numbers: the average red, green and blue
colors of the images.
A simple linear classifier, for instance, could linearly combine these features
to decide on the class label:

if 0.5 * red + 0.3 * green + 0.2 * blue > 0.6:
    return "cat"
else:
    return "dog"
However, these three color-describing numbers, called features, will obviously not
suffice to obtain a perfect classification. Therefore, we could decide to add some
features that describe the texture of the image, for instance by calculating the average
edge or gradient intensity in both the X and Y direction. We now have 5 features that,
in combination, could possibly be used by a classification algorithm to distinguish cats
from dogs.
To obtain an even more accurate classification, we could add more features, based on
color or texture histograms, statistical moments, etc.
Maybe we can obtain a perfect classification by carefully defining a few hundred of
these features?
The answer might sound a bit counter-intuitive: no, we cannot!
In fact, after a certain point, increasing the dimensionality of the problem by adding
new features actually degrades the performance of our classifier.
This is illustrated by the figure below and is often referred
to as ‘the curse of dimensionality’.
If we kept adding features, the dimensionality of the
feature space would grow and the data would become sparser and sparser.
Due to this sparsity, it becomes much easier to find a
separating hyperplane, because the likelihood that a training
sample lies on the wrong side of the best hyperplane becomes
infinitely small as the number of features grows to infinity.
However, if we project the high-dimensional classification
result back onto a lower-dimensional space, we find this:
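A small illustration of the first half of this argument (a sketch, assuming Python with scikit-learn and NumPy; the sample and feature counts are arbitrary): with few samples and many meaningless features, a linear classifier can separate the training set perfectly yet performs at chance on new data.

import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 500))      # 20 samples, 500 meaningless features
y_train = rng.integers(0, 2, size=20)     # random labels

clf = LinearSVC(C=1.0, max_iter=10000).fit(X_train, y_train)
print("training accuracy:", clf.score(X_train, y_train))   # typically 1.0

X_test = rng.normal(size=(20, 500))
y_test = rng.integers(0, 2, size=20)
print("test accuracy    :", clf.score(X_test, y_test))     # around 0.5 (chance)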
Imagine a unit square that represents the 2D feature space.
The average of the feature space is the center of this unit
square, and all points within a distance of 0.5 from this center lie
inside the circle inscribed in the unit square.
The training samples that do not fall within this circle are
closer to the corners of the feature space than to its center.
These samples are difficult to classify because their feature
values differ greatly (e.g. samples in opposite corners of the
unit square). Therefore, classification is easier if most samples
fall inside the inscribed circle.
The volume of the inscribed hypersphere of dimension d and
with radius 0.5 can be calculated as:
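For reference, the standard expression for the volume of a d-dimensional ball of radius 1/2 (the sphere inscribed in the unit hypercube) is:

V(d) = \frac{\pi^{d/2}}{\Gamma\left(\frac{d}{2}+1\right)} \left(\frac{1}{2}\right)^{d}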
The figure shows how the volume of this hypersphere changes
as the dimensionality increases:
The volume of the hypersphere tends to zero as the dimensionality tends to
infinity, whereas the volume of the surrounding hypercube remains
constant. This surprising and rather counter-intuitive observation partially
explains the problems associated with the curse of dimensionality in
classification: In high dimensional spaces, most of the training data resides
in the corners of the hypercube defining the feature space. As mentioned
before, instances in the corners of the feature space are much more difficult
to classify than instances around the centroid of the hypersphere. This is
illustrated by figure 11, which shows a 2D unit square, a 3D unit cube, and a
creative visualization of an 8D hypercube, which has 2^8 = 256 corners.
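The trend can be checked numerically in a few lines (a sketch, assuming Python with NumPy and SciPy):

import numpy as np
from scipy.special import gamma

def hypersphere_volume(d, r=0.5):
    # Volume of a d-dimensional ball of radius r
    return np.pi ** (d / 2) / gamma(d / 2 + 1) * r ** d

for d in (1, 2, 3, 5, 10, 20):
    print(d, hypersphere_volume(d))
# The ball's volume quickly approaches zero, while the unit hypercube keeps volume 1.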
[Figure: two classes that are not separated in one parameter are separated in another parameter.]
Pruning of data consists of a heuristic evaluation of the performance of a machine
learning technique, based on the wrapper model of feature extraction combined with
statistical indicators. It is basically used to optimize the parameter space in
classification and regression problems.
All permutations of the data features are evaluated, and each experiment is scored with statistical indicators (a minimal sketch follows below):
• Classification: confusion matrix, completeness, purity, contamination
• Regression: bias, standard deviation, MAD, RMS, outlier percentages
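A minimal sketch of such a wrapper-style search (assuming Python with scikit-learn and itertools; the k-nearest-neighbours classifier, the subset size of 3, and the use of the wine dataset are illustrative choices, not the method used in the slides):

from itertools import combinations
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

wine = load_wine()
X, y, names = wine.data, wine.target, wine.feature_names

best_score, best_subset = 0.0, None
# Score every feature subset of a fixed size (a full search over all subsets scales as 2**d)
for subset in combinations(range(X.shape[1]), 3):
    score = cross_val_score(KNeighborsClassifier(), X[:, list(subset)], y).mean()
    if score > best_score:
        best_score, best_subset = score, [names[i] for i in subset]
print(best_score, best_subset)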
[Figure: scatter plot of two features, var A vs. var B.]
PCA finds linear principal components, based on the Euclidean distance. It
is well suited for clustering problems.
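A minimal PCA sketch (assuming Python with scikit-learn; the two components and the standardization step are common choices, not prescribed by the slides):

from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)   # PCA is sensitive to feature scales

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)             # linear projection onto the first two components
print("explained variance ratio:", pca.explained_variance_ratio_)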
In order to overcome the limits of this linear approach, several methods have been
developed, based on different principles:
• Combination of PCA (segments)
• Principal curves,
• Principal surfaces
• Principal manifolds
Feature Selection:
• May improve the performance of the classification algorithm
• The classification algorithm may not scale up to the size of the full feature set, either in sample size or in time
• Allows us to better understand the domain
• Cheaper to collect a reduced set of predictors
• Safer to collect a reduced set of predictors
N-N-N-NO TIME, NO TIME, NO TIME!
HELLO, GOOD BYE,
I AM LATE, I AM LATE….
JUST TIME FOR A FEW QUESTIONS!
Filter model
Wrapper model
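The two models can be contrasted in a few lines (a sketch, assuming Python with scikit-learn; the ANOVA F-score, recursive feature elimination, and logistic regression are illustrative choices): a filter ranks features with a statistic computed independently of any classifier, while a wrapper scores feature subsets by the performance of the classifier itself.

from sklearn.datasets import load_wine
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

X, y = load_wine(return_X_y=True)

# Filter model: rank features by a univariate ANOVA F-score, no classifier involved
filt = SelectKBest(f_classif, k=5).fit(X, y)

# Wrapper model: recursive feature elimination driven by the classifier itself
wrap = RFE(LogisticRegression(max_iter=5000), n_features_to_select=5).fit(X, y)

print("filter keeps :", filt.get_support())
print("wrapper keeps:", wrap.get_support())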
The goal here is to have the network "discover" structure in the data by finding how the
data are clustered. The results can be used for data encoding and compression. One such
method for doing this is called vector quantization.
In vector quantization, we assume there is a codebook defined by a set of M
prototype vectors (M is chosen by the user and the initial prototype vectors are
chosen arbitrarily).
An input belongs to cluster i if i is the index of the closest prototype (closest in the
sense of the ordinary Euclidean distance). This has the effect of dividing up the input
space into a Voronoi tessellation.
1. Choose the number of clusters M.
2. Initialize the prototypes w_1, ..., w_M (one simple method for doing this is to randomly
choose M vectors from the input data).
3. Repeat until the stopping criterion is satisfied:
a) Randomly pick an input x.
b) Determine the "winning" node k by finding the prototype vector that satisfies
|w_k − x| <= |w_i − x| for all i (note: if the prototypes are normalized, this is
equivalent to maximizing w_i · x).
c) Update only the winning prototype's weights according to
w_k(new) = w_k(old) + μ (x − w_k(old)), where μ is the learning rate.
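A direct transcription of these steps (a sketch, assuming Python with NumPy; the learning rate and the fixed iteration count used as stopping criterion are arbitrary choices):

import numpy as np

def vector_quantization(X, M, lr=0.1, n_iter=1000, seed=0):
    rng = np.random.default_rng(seed)
    # Steps 1-2: choose M prototypes at random from the input data
    prototypes = X[rng.choice(len(X), size=M, replace=False)].copy()
    # Step 3: repeat until the stopping criterion is met (here a fixed number of iterations)
    for _ in range(n_iter):
        x = X[rng.integers(len(X))]                              # a) random input
        k = np.argmin(np.linalg.norm(prototypes - x, axis=1))    # b) winning node
        prototypes[k] += lr * (x - prototypes[k])                # c) update the winner only
    return prototypes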
LVQ (Learning Vector Quantization) is a
supervised version of vector quantization.
Classes are predefined and we have a set of
labeled data. The goal is to determine a set
of prototypes that best represent each class.
Each output neuron represents a prototype,
independent of the others.
The weight update is not extended to the neighbours.
It does not provide a map.
LVQ network: a hybrid supervised network where the clustered space (obtained through a SOM)
is used to take a classification decision, in order to compress (categorize) the input
information.
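A sketch of the standard LVQ1 update that implements this idea (assuming Python with NumPy; prototype initialization and the learning-rate schedule are simplified):

import numpy as np

def lvq1(X, y, prototypes, proto_labels, lr=0.05, n_iter=1000, seed=0):
    rng = np.random.default_rng(seed)
    prototypes = prototypes.copy()
    for _ in range(n_iter):
        i = rng.integers(len(X))
        x, label = X[i], y[i]
        k = np.argmin(np.linalg.norm(prototypes - x, axis=1))   # winning prototype
        if proto_labels[k] == label:
            prototypes[k] += lr * (x - prototypes[k])            # attract: same class
        else:
            prototypes[k] -= lr * (x - prototypes[k])            # repel: different class
    return prototypes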