Dimensionality Reduction
By Saad Elbeleidy
Agenda
● Curse Of Dimensionality
● Why Reduce Dimensions
● Types of Dimensionality Reduction
○ Feature Selection
○ Feature Extraction
● Application Specific Methods
Curse Of Dimensionality
The collection of issues that arise when dealing with high-dimensional data.
More data is good.
More detailed data (dimensions) might not be.
Data Representation
Example table with columns Index, Label, Shape, Size, Color, Feature 4 and rows 0 to 4.
More rows: more data.
More columns (features): more detail.
Just one more feature
Adding just one more feature can multiply the amount of data you need: covering d features at k values each takes k^d samples, so each new feature raises the requirement by another power.
Even in the age of Big Data, we still do not have enough data to overcome the curse of dimensionality.
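The growth can be made concrete with a small sketch. It assumes, purely for illustration, that every feature takes 10 distinct values and that we want at least one sample per combination of values; the function name is hypothetical.

```python
def cells_to_cover(values_per_feature: int, n_features: int) -> int:
    """Number of distinct feature combinations (grid cells) in the feature space."""
    return values_per_feature ** n_features


# Assumption for illustration: 10 distinct values per feature.
for d in range(1, 6):
    print(d, "features ->", cells_to_cover(10, d), "combinations")
```

Each extra feature multiplies the number of combinations by 10 under this assumption.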
Data Examples
● HD Images
○ Should we represent all pixels? Are they all useful?
● Video
○ What changes is what’s useful
● Text
○ Keywords vs all words
● Time series
○ Does every second matter?
Why Reduce Dimensions
1. Computation
Fewer dimensions allow us to train models more efficiently.
2. Visualization
It is difficult to visualize more than 3 dimensions.
3. Remove Useless Information
Dimensionality Reduction
Dimensionality reduction aims to map the data from the original high-dimensional space to a lower-dimensional space while minimizing the loss of (relevant) information.
Types of Dimensionality Reduction
● Feature Selection
Select a subset of the available features.
● Feature Extraction
Generate synthetic features that represent the available features.
Feature Selection
Select some features from the originally available
set.
1. Filter
2. Wrapper
3. Embedded
Process
1. Start with all features
2. Select a subset
3. Learn on the subset (train model)
4. Measure performance
Filter
[Diagram: All Features -> (filter to best subset) -> Subset -> Learning -> Performance]
Wrapper
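[Diagram: All Features -> Subset -> Learning -> Performance]
The subset is chosen by training the model on candidate subsets and comparing their performance.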
Embedded
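[Diagram: All Features -> Subset -> Learning -> Performance]
Feature selection happens inside the learning algorithm itself, as part of training.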
Comparison
1. Filter
a. Pros: Low computation time and robust to overfitting
b. Cons: May select redundant features, results usually not as good as the other two approaches, greedy
c. Used in pre-processing
2. Wrapper
a. Pros: Takes the learning algorithm into account, so the selected features better fit the data
b. Cons: Potentially high computation time, prone to overfitting, greedy
3. Embedded
a. Pros: Combines advantages of both Filter and Wrapper
b. Cons: Computation time
Example Methods
1. Filter
a. Mutual Information
2. Wrapper
a. Recursive Feature Elimination
3. Embedded
a. LASSO
b. LARS
Mutual Information
Score each feature by its mutual information with the target class, then keep the highest-scoring features.
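A minimal sketch of this scoring for discrete features, using only the standard library. The toy "shape" and "color" columns and the function name are hypothetical, chosen so that one feature determines the label and the other is unrelated to it.

```python
from collections import Counter
from math import log


def mutual_information(feature, target):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(feature)
    joint = Counter(zip(feature, target))
    p_x = Counter(feature)
    p_y = Counter(target)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written with raw counts.
        mi += p_xy * log(count * n / (p_x[x] * p_y[y]))
    return mi


# Hypothetical toy data: "shape" determines the label, "color" is unrelated.
label = [0, 0, 1, 1, 0, 0, 1, 1]
features = {
    "shape": [0, 0, 1, 1, 0, 0, 1, 1],
    "color": [0, 1, 0, 1, 0, 1, 0, 1],
}
scores = {name: mutual_information(col, label) for name, col in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

A filter method would keep the top entries of `ranked` without ever training the downstream model.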
Recursive Feature Elimination
Select features by recursively considering smaller and smaller sets
of features.
1. Train the estimator and obtain the importance of each feature
2. Prune the k least important features
3. Repeat until only n features are left
Parameters: n (features to keep), k (features to drop per iteration)
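The loop above can be sketched with NumPy. This is an illustration under stated assumptions, not a reference implementation: the estimator is ordinary least squares, importance is taken to be the absolute coefficient (which presumes features on comparable scales), and the names `n_keep` and `k_drop` stand in for n and k.

```python
import numpy as np


def recursive_feature_elimination(X, y, n_keep, k_drop=1):
    """Return the column indices kept after recursively pruning features."""
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        # 1. Train the estimator on the remaining features.
        coef, *_ = np.linalg.lstsq(X[:, remaining], y, rcond=None)
        importance = np.abs(coef)
        # 2. Prune the k least important features (never going below n_keep).
        n_drop = min(k_drop, len(remaining) - n_keep)
        drop = set(np.argsort(importance)[:n_drop].tolist())
        remaining = [f for i, f in enumerate(remaining) if i not in drop]
        # 3. The loop repeats until only n_keep features are left.
    return remaining


# Hypothetical data: only features 0 and 2 influence the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=200)
selected = recursive_feature_elimination(X, y, n_keep=2, k_drop=1)
print(selected)
```

A larger `k_drop` needs fewer training rounds but makes each pruning decision coarser.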
Norms Review
l1 norm: Manhattan distance (sum of absolute values)
l2 norm: Euclidean distance
lp norm: (sum of |x_i|^p)^(1/p), for p >= 1
l0 "norm": number of non-zero entries
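A short worked example of these norms on one vector. The function names are illustrative; the vector [3, -4, 0] is chosen so the results are easy to check by hand.

```python
def lp_norm(v, p):
    """l_p norm of a vector for p >= 1."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def l0_norm(v):
    """Number of non-zero entries (not a true norm, despite the name)."""
    return sum(1 for x in v if x != 0)


v = [3, -4, 0]
print(lp_norm(v, 1))  # |3| + |-4| + |0| = 7
print(lp_norm(v, 2))  # sqrt(9 + 16) = 5
print(l0_norm(v))     # two non-zero entries
```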
Least Absolute Shrinkage & Selection Operator (LASSO)
A linear model that estimates sparse coefficients.
Uses l1 norm regularization.
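To see why the l1 penalty yields sparse coefficients, here is a minimal coordinate-descent sketch in NumPy. It is one common way to fit such a model, not necessarily the solver any particular library uses; the objective scaling (1 / (2n)), the penalty weight `alpha`, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become 0."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def lasso_coordinate_descent(X, y, alpha, n_iters=100):
    """Minimize (1 / (2n)) * ||y - Xw||_2^2 + alpha * ||w||_1."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iters):
        for j in range(d):
            # Residual with feature j's current contribution added back.
            residual = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ residual / n
            scale = X[:, j] @ X[:, j] / n
            w[j] = soft_threshold(rho, alpha) / scale
    return w


# Hypothetical data: only feature 0 drives the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] + 0.01 * rng.normal(size=200)
w = lasso_coordinate_descent(X, y, alpha=0.1)
print(np.round(w, 3))
```

The soft-threshold step sets weakly correlated coefficients exactly to zero, which is what makes the model act as an embedded feature selector.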
Elastic Net
A linear regression model trained with both l1 and l2 norm regularization.
Elastic-net is useful when there are multiple features which are
correlated with one another. Lasso is likely to pick one of these at
random, while elastic-net is likely to pick both.
Regression Comparison
[Figure: comparison of OLS, LASSO, Ridge, and Elastic Net regression]
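For reference, the four models differ only in the penalty added to the least-squares loss. The symbols below (design matrix X, response y, coefficients w, penalty weights λ) are notation assumed here, and libraries often scale the terms by different constants.

```latex
\begin{align*}
\text{OLS:} \quad & \min_{w} \; \lVert y - Xw \rVert_2^2 \\
\text{Ridge:} \quad & \min_{w} \; \lVert y - Xw \rVert_2^2 + \lambda \lVert w \rVert_2^2 \\
\text{LASSO:} \quad & \min_{w} \; \lVert y - Xw \rVert_2^2 + \lambda \lVert w \rVert_1 \\
\text{Elastic Net:} \quad & \min_{w} \; \lVert y - Xw \rVert_2^2 + \lambda_1 \lVert w \rVert_1 + \lambda_2 \lVert w \rVert_2^2
\end{align*}
```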
Least Angle Regression (LARS)
A regression algorithm for high-dimensional data.
1. Start with all coefficients equal to zero
2. Find the predictor most correlated with the response, say x_j1.
3. Take the largest step possible in the direction of this predictor
4. When some other predictor, say x_j2, has as much correlation with the current residual, proceed in a direction equiangular between the two predictors.
Feature Extraction
Generate synthetic (read: made up) features that
represent the available ones.
Methods to cover:
1. PCA
2. t-SNE
3. Spectral Embedding
4. LDA
Principal Component Analysis
Decomposes a dataset into a set of successive orthogonal
components that explain a maximum amount of variance.
Demo: http://setosa.io/ev/principal-component-analysis
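The decomposition described above can be sketched with NumPy by centering the data and taking an SVD. The function name and the synthetic data (points lying almost on the line y = 2x) are assumptions for illustration.

```python
import numpy as np


def pca(X, n_components):
    """Project X onto its top principal components using an SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are orthonormal directions, ordered by decreasing variance.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance_ratio = (S ** 2 / np.sum(S ** 2))[:n_components]
    return X_centered @ components.T, components, explained_variance_ratio


# Hypothetical data: 2-D points lying almost exactly on the line y = 2x.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
X = np.column_stack([x, 2 * x + 0.01 * rng.normal(size=100)])
projected, components, ratio = pca(X, n_components=1)
print(projected.shape, np.round(ratio, 4))
```

Because the points are nearly collinear, a single component should capture almost all of the variance, so the 2-D data can be stored as 1-D with little loss.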
Singular Value Decomposition
Factorization of a matrix A into the product of three matrices, A = U D V^T.
The columns of U and V are orthonormal and the matrix D is diagonal with non-negative real entries (the singular values).
Truncating the singular value decomposition reduces a rank d matrix to a rank r matrix (r < d) by keeping only the r largest singular values.
We can take a list of d unique vectors, and approximate them as linear combinations of r unique vectors.
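A minimal NumPy sketch of this rank reduction. The helper name and the small example matrix (built to have rank 2) are assumptions for illustration.

```python
import numpy as np


def low_rank_approximation(A, r):
    """Rank-r approximation of A that keeps the r largest singular values."""
    U, d, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the first r singular values and their singular vectors.
    return U[:, :r] @ np.diag(d[:r]) @ Vt[:r]


# Hypothetical example: a 4 x 3 matrix built from two independent directions,
# so its rank is 2.
A = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 0.0, 1.0]) + np.outer(
    [1.0, -1.0, 1.0, -1.0], [0.0, 1.0, 0.0]
)
A2 = low_rank_approximation(A, 2)  # keeps everything: A has rank 2
A1 = low_rank_approximation(A, 1)  # drops the second singular direction
print(np.allclose(A, A2))
print(np.linalg.matrix_rank(A1))
```

The approximation error of `A1`, measured in the Frobenius norm, equals the singular value that was dropped.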
t-Distributed Stochastic Neighbor Embedding (t-SNE)
1. Computes probabilities that are proportional to the similarity of objects.
2. Uses the probabilities to learn a d-dimensional map (d is typically 2 or 3) that reflects the similarities as well as possible.
3. Minimizes the Kullback–Leibler divergence between the similarities in the original space and those in the map.
Demo: https://distill.pub/2016/misread-tsne/
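t-Distributed Stochastic Neighbor Embedding (continued)
1. Conditional probabilities: similarity of point j to point i in the original space
2. Joint probabilities: symmetrized version of the conditional probabilities
3. Minimization: reduce the Kullback–Leibler divergence between the original-space and map probabilities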
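The three steps above can be sketched in NumPy as the quantities t-SNE works with. This is a simplified illustration: it uses one fixed bandwidth `sigma` for all points (the published method tunes a bandwidth per point from a perplexity setting) and only evaluates the cost for a random map instead of running the gradient-descent minimization. All function names are hypothetical.

```python
import numpy as np


def squared_distances(Z):
    """Pairwise squared Euclidean distances between the rows of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def joint_probabilities(X, sigma=1.0):
    """Steps 1 and 2: conditional p(j|i), then symmetrized joint p_ij."""
    affinities = np.exp(-squared_distances(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(affinities, 0.0)  # a point is not its own neighbor
    conditional = affinities / affinities.sum(axis=1, keepdims=True)
    return (conditional + conditional.T) / (2.0 * len(X))


def map_probabilities(Y):
    """Similarities in the low-dimensional map, using a Student-t kernel."""
    kernel = 1.0 / (1.0 + squared_distances(Y))
    np.fill_diagonal(kernel, 0.0)
    return kernel / kernel.sum()


def kl_divergence(P, Q):
    """Step 3: the cost that is minimized over the map coordinates."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / Q[mask])))


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))  # hypothetical high-dimensional data
Y = rng.normal(size=(20, 2))  # a random (not yet optimized) 2-D map
P = joint_probabilities(X)
Q = map_probabilities(Y)
print(kl_divergence(P, Q))
```

An optimizer would move the rows of `Y` to lower this cost, pulling similar points together in the map.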
Spectral Embedding
Example of nonlinear dimensionality reduction.
1. Weighted Graph Construction
Transform the raw input data into a graph representation using an affinity (adjacency) matrix.
2. Graph Laplacian Construction
3. Partial Eigenvalue Decomposition
Eigenvalue decomposition is done on the graph Laplacian.
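A minimal NumPy sketch of these three steps. Several choices here are assumptions for illustration: an RBF affinity with width parameter `gamma`, the unnormalized Laplacian L = D - W (libraries often use a normalized variant), and a full eigendecomposition where a partial one would be used on large data.

```python
import numpy as np


def spectral_embedding(X, n_components=1, gamma=1.0):
    """Embed the rows of X using eigenvectors of the graph Laplacian."""
    # 1. Weighted graph: RBF affinity between every pair of points.
    diff = X[:, None, :] - X[None, :, :]
    W = np.exp(-gamma * np.sum(diff ** 2, axis=-1))
    np.fill_diagonal(W, 0.0)
    # 2. Unnormalized graph Laplacian L = D - W, with D the degree matrix.
    L = np.diag(W.sum(axis=1)) - W
    # 3. Eigendecomposition; eigh returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(L)
    # Skip the first (constant) eigenvector, keep the next n_components.
    return eigenvectors[:, 1 : n_components + 1]


# Hypothetical data: two groups of points on a line.
X = np.array([[0.0], [0.1], [0.2], [3.0], [3.1], [3.2]])
embedding = spectral_embedding(X, n_components=1)
print(np.sign(embedding[:, 0]))
```

In this example the sign of the single embedding coordinate separates the two groups.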
Linear Discriminant Analysis
Use classifier coefficients to calculate a projection of the data in k dimensions.
Ensure that the value of k is at most C - 1. (C = number of classes)
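For the two-class case (C = 2, so k = 1), the projection can be sketched with NumPy as Fisher's discriminant direction. The function name and the synthetic classes are assumptions for illustration; the multi-class case needs an eigendecomposition that is not shown here.

```python
import numpy as np


def fisher_direction(X0, X1):
    """Direction maximizing between-class over within-class scatter (C = 2)."""
    mean0, mean1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: summed scatter of each class around its own mean.
    scatter = (X0 - mean0).T @ (X0 - mean0) + (X1 - mean1).T @ (X1 - mean1)
    w = np.linalg.solve(scatter, mean1 - mean0)
    return w / np.linalg.norm(w)


# Hypothetical data: two well-separated classes in 2-D.
rng = np.random.default_rng(0)
X0 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
X1 = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(50, 2))
w = fisher_direction(X0, X1)
z0, z1 = X0 @ w, X1 @ w  # each 2-D point becomes a single number
print(z0.max() < z1.min())
```

Unlike PCA, the direction is chosen using the class labels, so the reduced data stays easy to classify.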
Application Specific Methods
Simple yet effective methods of dimensionality
reduction in:
● Computer Vision
● Natural Language Processing
● Time Series
Computer Vision
Applications:
● Facial Recognition
● Self Driving Cars
● Optical Character Recognition (OCR)
Data: matrix of pixels’ (RGB or Grayscale) values
Pooling
Pooling algorithm: max, average, other statistical metrics
Filter: patch to apply the pooling algorithm to
Stride: the step distance to the next patch
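The three ingredients above fit in a few lines of NumPy. This is a plain-loop sketch for a single-channel image with a square filter; the function name and the 4 x 4 example are hypothetical.

```python
import numpy as np


def pool2d(image, filter_size, stride, reduce=np.max):
    """Slide a square patch over image and reduce each patch to one value."""
    rows = (image.shape[0] - filter_size) // stride + 1
    cols = (image.shape[1] - filter_size) // stride + 1
    pooled = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            r, c = i * stride, j * stride
            pooled[i, j] = reduce(image[r : r + filter_size, c : c + filter_size])
    return pooled


image = np.array([
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
])
print(pool2d(image, filter_size=2, stride=2))                  # max pooling
print(pool2d(image, filter_size=2, stride=2, reduce=np.mean))  # average pooling
```

With a 2 x 2 filter and a stride of 2, the 16 pixel values are reduced to 4.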
Natural Language Processing
Applications:
● Question Answering
● Summarization
● Spam Filtering
Data: vector representations of words in a string
Removing Words
Stopwords are common terms that add little meaning for the machine learning application at hand.
Examples: the, a, an, he, she, because, not
Stopword removal is very common in NLP applications, but selecting the right stopwords can be difficult.
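A small sketch of stopword removal, using the example words above as the stopword list. The function name and the sentences are made up; the second sentence hints at why choosing the list is hard.

```python
# Illustrative stopword list taken from the examples above; real applications
# usually start from a larger, curated list.
STOPWORDS = {"the", "a", "an", "he", "she", "because", "not"}


def remove_stopwords(text, stopwords=STOPWORDS):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [word for word in text.lower().split() if word not in stopwords]


print(remove_stopwords("She liked the movie because the plot was clever"))
print(remove_stopwords("The movie was not good"))
```

Dropping "not" turns the second sentence into "movie was good", which would mislead a sentiment model, so the stopword list has to match the application.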
Combining Words
Stemming
Reducing words to their stem/root word.
Examples:
● presumably -> presum
● multiply -> multipli
● crying -> cri
Lemmatization
Similar to stemming but includes parts of speech.
Examples:
● hike_verb
● hike_noun
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.
Time Series
Applications:
● Financial Forecasting
● Medical Diagnosis
● Server Log Analysis
Data: a table where most features correspond to a unit of time.
Time Scale Reduction
If modeling annual performance, there is no need to use per-second data.
Reduction algorithms:
● Start (first value of each period)
● Average
● Min/Max
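These reductions can be sketched with the standard library by grouping consecutive readings into fixed-size windows. The function name, the window size, and the readings are hypothetical.

```python
from statistics import mean


def downsample(values, window, how="average"):
    """Reduce each block of `window` consecutive readings to a single value."""
    reducers = {
        "start": lambda block: block[0],
        "average": mean,
        "min": min,
        "max": max,
    }
    reducer = reducers[how]
    return [reducer(values[i : i + window]) for i in range(0, len(values), window)]


# Hypothetical readings: six samples reduced to two values (window of three).
readings = [4, 8, 6, 1, 3, 2]
print(downsample(readings, 3, "start"))
print(downsample(readings, 3, "average"))
print(downsample(readings, 3, "min"))
print(downsample(readings, 3, "max"))
```

Which reducer fits depends on the question: averages suit trends, while min/max keep the extremes that an average would hide.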
Other Methods
Granger Causality
Forecastable Component Analysis (FCA)
Questions?

Dimensionality Reduction

  • 2. Agenda ● Curse Of Dimensionality ● Why Reduce Dimensions ● Types of Dimensionality Reduction ○ Feature Selection ○ Feature Extraction ● Application Specific Methods 2
  • 3. Curse Of Dimensionality 3 The collection of issues that arise when dealing with high dimensional data. More data is good. More detailed data (dimensions) might not be. Curse Of Dimensionality Why Reduce Dimensions Types of Dimensionality Reduction ● Feature Selection ● Feature Extraction Application Specific Methods
  • 4. Data Representation 4 Index Label Shape Size Color Feature 4 0 1 2 3 4 More Data More Detail
  • 5. Just one more feature Adding just one more feature could increase the amount of data you need by an additional power. In the age of Big Data, we still do not have enough data to account for the curse of dimensionality. 5
  • 6. Data Examples ● HD Images ○ Should we represent all pixels? Are they all useful? ● Video ○ What changes is what’s useful ● Text ○ Keywords vs all words ● Time series ○ Does every second matter? 6
  • 7. Why Reduce Dimensions 7 1. Computation Less dimensions allow us to compute models more efficiently. 2. Visualization Difficult to visualize more than 3 dimensions. 3. Remove Useless Information Curse Of Dimensionality Why Reduce Dimensions Types of Dimensionality Reduction ● Feature Selection ● Feature Extraction Application Specific Methods
  • 8. Dimensionality Reduction 8 Dimensionality reduction aims to map the data from the original dimension space to a lower dimension space while minimizing (relevant) information loss.
  • 9. Types of Dimensionality Reduction 9 ● Feature Selection Select features from the available features ● Feature Extraction Generate synthetic features that represent the available features. Curse Of Dimensionality Why Reduce Dimensions Types of Dimensionality Reduction ● Feature Selection ● Feature Extraction Application Specific Methods
  • 10. Feature Selection 10 Select some features from the originally available set. 1. Filter 2. Wrapper 3. Embedded Curse Of Dimensionality Why Reduce Dimensions Types of Dimensionality Reduction ● Feature Selection ● Feature Extraction Application Specific Methods
  • 11. Process 1. Start with all features 2. Select a subset 3. Learn on the subset (train model) 4. Measure performance
  • 12. Filter [Diagram with stages: All Features, Subset, Learning, Performance. Caption: Filter to best subset]
  • 13. Wrapper [Diagram with stages: All Features, Subset, Learning, Performance]
  • 14. Embedded [Diagram with stages: All Features, Subset, Learning, Performance]
  • 15. Comparison 1. Filter a. Pros: low computation time and robust to overfitting b. Cons: may select redundant features, results not as good, greedy c. Used in pre-processing 2. Wrapper a. Pros: takes the learned model into account, so the selected features fit the data better b. Cons: potentially high computation time, prone to overfitting, greedy 3. Embedded a. Pros: combines advantages of both Filter and Wrapper b. Cons: computation time
  • 16. Example Methods 1. Filter a. Mutual Information 2. Wrapper a. Recursive Feature Elimination 3. Embedded a. LASSO b. LARS
  • 17. Mutual Information Score each feature by its mutual information with the target class, then keep the highest-scoring features.
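The scoring step can be sketched for discrete features as below. This is a minimal illustration, not the deck's implementation: the function name, the toy feature names, and the toy data are all assumptions, and continuous features would need a different estimator.

```python
from collections import Counter
from math import log2


def mutual_information(feature, target):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(feature)
    joint = Counter(zip(feature, target))
    count_x = Counter(feature)
    count_y = Counter(target)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * log2(p_xy / ((count_x[x] / n) * (count_y[y] / n)))
    return mi


# Toy data (hypothetical): one feature mirrors the target, one is independent.
target = [0, 0, 1, 1]
features = {
    "copies_target": [0, 0, 1, 1],
    "unrelated": [0, 1, 0, 1],
}
scores = {name: mutual_information(col, target) for name, col in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)  # best feature first
```

A filter method would keep the top entries of `ranked` before any model is trained.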
  • 18. Recursive Feature Elimination Select features by recursively considering smaller and smaller sets of features. 1. Train an estimator and obtain the importance of each feature 2. Prune the k least important features 3. Repeat until only n features remain. Parameters: n features to keep, k features to drop per iteration
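The loop in the steps above can be sketched as follows. The `importance_fn` callback is a hypothetical stand-in for "train an estimator and read its feature importances", and the fixed toy scores exist only to make the sketch runnable; a real run would refit a model on the remaining features at every iteration.

```python
def recursive_feature_elimination(features, importance_fn, n_keep, k_drop=1):
    """Repeatedly drop the k_drop least important features until n_keep remain.

    importance_fn takes the current feature names and returns {name: score}.
    """
    remaining = list(features)
    while len(remaining) > n_keep:
        importance = importance_fn(remaining)            # step 1: fit and score
        remaining.sort(key=lambda name: importance[name])
        n_drop = min(k_drop, len(remaining) - n_keep)    # never go below n_keep
        remaining = remaining[n_drop:]                   # step 2: prune the weakest
    return remaining


# Hypothetical fixed scores, used only for illustration.
toy_scores = {"shape": 0.9, "size": 0.4, "color": 0.1, "index": 0.0}


def toy_importance(names):
    return {name: toy_scores[name] for name in names}


selected = recursive_feature_elimination(toy_scores, toy_importance, n_keep=2, k_drop=1)
```

A larger `k_drop` means fewer model fits but a coarser search, which is the trade-off behind the two parameters on the slide.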
  • 19. Norms Review l1 norm: sum of absolute values (Manhattan distance). l2 norm: square root of the sum of squares (Euclidean distance). lp norm: (sum over i of |x_i|^p)^(1/p), for p >= 1. l0 "norm": number of non-zero entries (not a true norm).
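The norms reviewed above can be checked on a small vector. The helper names below are illustrative and not taken from the slides.

```python
def lp_norm(x, p):
    """l_p norm for p >= 1: (sum_i |x_i|^p)^(1/p)."""
    if p < 1:
        raise ValueError("p must be >= 1")
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def l0_norm(x):
    """Count of non-zero entries (not a true norm, despite the name)."""
    return sum(1 for v in x if v != 0)


x = [3, 0, -4]
l1 = lp_norm(x, 1)  # |3| + |0| + |-4|
l2 = lp_norm(x, 2)  # sqrt(9 + 0 + 16)
l0 = l0_norm(x)     # two non-zero entries
```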
  • 20. Least Absolute Shrinkage & Selection Operator (LASSO) A linear model that estimates sparse coefficients. Uses l1 norm regularization.
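To see why l1 regularization yields sparse coefficients, here is a minimal sketch of LASSO fitted by cyclic coordinate descent, which is one common solver and not necessarily the one the deck had in mind. The objective scaling, the function names, and the toy data are assumptions for illustration. The soft-threshold step is what sets weak coefficients to exactly zero.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the residual that ignores j
            z = 0.0    # squared length of feature j
            for i in range(n):
                pred_without_j = sum(w[k] * X[i][k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n)
    return w


# Toy data (hypothetical): y depends strongly on the first feature and only
# weakly on the second.
X = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
y = [3 * a + 0.05 * b for a, b in X]
w = lasso_coordinate_descent(X, y, alpha=0.1)
```

On this toy data the weak second coefficient is driven to exactly zero, while the strong first coefficient is only shrunk, which is the feature-selection behavior that makes LASSO an embedded method.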
  • 21. Elastic Net A linear regression model trained with both l1 and l2 norm regularization. Elastic-net is useful when there are multiple features that are correlated with one another: Lasso is likely to pick one of these at random, while elastic-net is likely to pick both.
  • 23. Least Angle RegreSsion A regression algorithm for high-dimensional data. 1. Start with all coefficients equal to zero. 2. Find the predictor most correlated with the response, say x_j1. 3. Take the largest step possible in the direction of this predictor, until some other predictor, say x_j2, has as much correlation with the current residual. 4. Proceed in a direction equiangular between the two predictors.
  • 24. Feature Extraction Generate synthetic (read: made up) features that represent the available ones. Methods to cover: 1. PCA 2. t-SNE 3. Spectral Embedding 4. LDA
  • 25. Principal Component Analysis Decomposes a dataset into a set of successive orthogonal components that explain a maximum amount of variance. Demo: http://setosa.io/ev/principal-component-analysis
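A minimal sketch of finding the first principal component of 2-D data, assuming power iteration on the sample covariance matrix (one of several ways to compute it; library implementations typically use SVD instead). The function name, starting vector, and toy points are assumptions for illustration. Power iteration can fail if the starting vector happens to be orthogonal to the leading component or if the data has no variance.

```python
import math


def first_principal_component(points, n_iter=200):
    """Direction of maximum variance for 2-D points, via power iteration."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(p[0] - mean_x, p[1] - mean_y) for p in points]

    # Entries of the 2x2 sample covariance matrix [[cxx, cxy], [cxy, cyy]].
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)

    # Repeatedly multiply by the covariance matrix and renormalize; this
    # converges to the eigenvector with the largest eigenvalue.
    vx, vy = 1.0, 0.5
    for _ in range(n_iter):
        vx, vy = cxx * vx + cxy * vy, cxy * vx + cyy * vy
        length = math.hypot(vx, vy)
        vx, vy = vx / length, vy / length
    return vx, vy


points = [(0, 0), (1, 1), (2, 2), (3, 3)]
direction = first_principal_component(points)
# One coordinate per point along the component: 2 dimensions reduced to 1.
projected = [p[0] * direction[0] + p[1] * direction[1] for p in points]
```

Because the toy points lie on a line, a single component captures all of their variance.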
  • 26. Singular Value Decomposition Factorization of a matrix A into the product of three matrices, A = U D V^T. The columns of U and V are orthonormal and the matrix D is diagonal with positive real entries (the singular values).
  • 28. Singular Value Decomposition Singular value decomposition can be used to reduce a rank-d matrix to a rank-r matrix: we can take a list of d unique vectors and approximate them as linear combinations of r unique vectors.
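The rank reduction described above can be written out as follows. The notation (singular values sigma_i, columns u_i of U and v_i of V) is the standard one and may differ from the original slide, whose formula did not survive in this transcript.

```latex
\begin{align}
A   &= U D V^{\top} = \sum_{i=1}^{d} \sigma_i \, u_i v_i^{\top},
      \qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_d > 0 \\
A_r &= \sum_{i=1}^{r} \sigma_i \, u_i v_i^{\top}, \qquad r < d
\end{align}
```

Keeping only the r largest singular values gives the best rank-r approximation of A in the least-squares (Frobenius) sense.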
  • 29. t-distributed Stochastic Neighbor Embedding (t-SNE) 1. Compute probabilities that are proportional to the similarity of objects. 2. Use the probabilities to learn a d-dimensional map that reflects the similarities as well as possible. 3. Do so by minimizing the Kullback–Leibler divergence between the similarities in the original space and those in the map. Demo: https://distill.pub/2016/misread-tsne/
  • 30. t-distributed Stochastic Neighbor Embedding (t-SNE) 1. Conditional Probabilities 2. Joint Probabilities 3. Minimization
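The formulas for the three steps did not survive in this transcript. The standard definitions from the t-SNE paper listed in the references are reproduced here as an assumption about what the slide showed: x_i are the N input points, y_i their positions in the low-dimensional map, and sigma_i a per-point bandwidth chosen from the perplexity setting.

```latex
\begin{align}
p_{j|i} &= \frac{\exp\!\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
                {\sum_{k \neq i} \exp\!\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)} \\
p_{ij}  &= \frac{p_{j|i} + p_{i|j}}{2N} \\
q_{ij}  &= \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
                {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}} \\
C       &= \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
\end{align}
```

The cost C is minimized over the map positions y_i by gradient descent.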
  • 31. Spectral Embedding An example of nonlinear dimensionality reduction. 1. Weighted Graph Construction: transform the raw input data into a graph representation using an affinity (adjacency) matrix. 2. Graph Laplacian Construction 3. Partial Eigenvalue Decomposition: eigenvalue decomposition is done on the graph Laplacian.
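Steps 1 and 2 can be sketched as below. The Gaussian affinity and the unnormalized Laplacian L = D - W are assumptions chosen for simplicity; libraries often use a nearest-neighbor graph and a normalized Laplacian instead. Step 3 (the eigenvalue decomposition) is left to a linear algebra library and is not shown.

```python
import math


def gaussian_affinity(points, sigma=1.0):
    """Step 1: affinity matrix W with W[i][j] = exp(-||xi - xj||^2 / (2 sigma^2))."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                W[i][j] = math.exp(-sq_dist / (2 * sigma ** 2))
    return W


def graph_laplacian(W):
    """Step 2: unnormalized Laplacian L = D - W, where D holds the row sums of W."""
    n = len(W)
    degrees = [sum(row) for row in W]
    return [[(degrees[i] if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]


# Toy points (hypothetical): two close neighbors and one distant point.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
W = gaussian_affinity(points)
L = graph_laplacian(W)
```

The embedding coordinates would then come from the eigenvectors of `L` with the smallest non-zero eigenvalues.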
  • 32. Linear Discriminant Analysis Use classifier coefficients to calculate a projection of the data into k dimensions. Ensure that k is at most C - 1 (C = number of classes).
  • 33. Application Specific Methods Simple yet effective methods of dimensionality reduction in: ● Computer Vision ● Natural Language Processing ● Time Series
  • 34. Computer Vision Applications: ● Facial Recognition ● Self Driving Cars ● Optical Character Recognition (OCR) Data: a matrix of pixel values (RGB or grayscale)
  • 35. Pooling Pooling algorithm: max, average, or another statistical metric. Filter: the patch to which the pooling algorithm is applied. Stride: the step distance to the next patch.
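The three ingredients above (pooling algorithm, filter, stride) can be sketched for a single-channel image stored as a list of rows. The function name and the toy image are assumptions for illustration; patches that would run past the image edge are simply skipped.

```python
def pool2d(image, size, stride, reduce_fn=max):
    """Apply reduce_fn to each size x size patch, moving stride pixels between patches."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            patch = [image[r + dr][c + dc] for dr in range(size) for dc in range(size)]
            pooled_row.append(reduce_fn(patch))
        pooled.append(pooled_row)
    return pooled


image = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]
max_pooled = pool2d(image, size=2, stride=2)
avg_pooled = pool2d(image, size=2, stride=2, reduce_fn=lambda p: sum(p) / len(p))
```

With a 2x2 filter and a stride of 2, the 4x4 image shrinks to 2x2, a fourfold reduction in the number of values.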
  • 36. Natural Language Processing Applications: ● Question Answering ● Summarization ● Spam Filtering Data: vector representations of words in a string
  • 37. Removing Words Stopwords are common terms that add little meaning for the machine learning application at hand. Examples: the, a, an, he, she, because, not. Stopword removal is very common in NLP applications, but selecting the stopwords may be difficult.
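A minimal sketch of stopword removal, using only the example words from the slide as the stopword list (real applications usually need a longer, task-specific list). The function name and the sample sentence are assumptions for illustration.

```python
# Stopword list taken from the slide's examples.
STOPWORDS = {"the", "a", "an", "he", "she", "because", "not"}


def remove_stopwords(text, stopwords=STOPWORDS):
    """Lowercase, split on whitespace, and drop tokens that are stopwords."""
    return [token for token in text.lower().split() if token not in stopwords]


tokens = remove_stopwords("She did not like the movie because it was long")
```

Dropping "not" here turns "did not like" into "did like", which shows why choosing the stopword list can be difficult: for sentiment analysis, "not" carries meaning.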
  • 38. Combining Words Stemming: reducing words to their stem/root word. Examples: ● presumably -> presum ● multiply -> multipli ● crying -> cri Lemmatization: similar to stemming but takes parts of speech into account. Examples: ● hike_verb ● hike_noun ● Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.
  • 39. Time Series Applications: ● Financial Forecasting ● Medical Diagnosis ● Server Log Analysis Data: a table where most features are a time unit.
  • 40. Time Scale Reduction If modeling annual performance, there is no need to use per-second data. Reduction algorithms: ● Start ● Average ● Min/Max
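The reduction algorithms listed above can be sketched as a windowed downsampler. The function name, the `how` parameter, and the sample readings are assumptions for illustration; a trailing partial window is reduced like any other.

```python
def downsample(values, window, how="average"):
    """Collapse each block of `window` consecutive readings into one value."""
    reducers = {
        "start": lambda block: block[0],              # first reading in the block
        "average": lambda block: sum(block) / len(block),
        "min": min,
        "max": max,
    }
    reduce_fn = reducers[how]
    return [reduce_fn(values[i:i + window]) for i in range(0, len(values), window)]


per_second = [10, 12, 11, 15, 9, 14, 13]
starts = downsample(per_second, window=3, how="start")
averages = downsample(per_second, window=3, how="average")
peaks = downsample(per_second, window=3, how="max")
```

Which reducer fits depends on the question: averages smooth out noise, while min/max preserve extremes that an average would hide.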
  • 41. Other Methods ● Granger Causality ● Forecastable Component Analysis (FCA)
  • 43. References & Resources Feature Selection: http://scikit-learn.org/stable/modules/feature_selection.html http://scikit-learn.org/stable/modules/linear_model.html http://www-stat.stanford.edu/~hastie/Papers/LARS/LeastAngle_2002.pdf Feature Extraction: https://www.cs.cmu.edu/~venkatg/teaching/CStheory-infoage/book-chapter-4.pdf http://scikit-learn.org/stable/modules/decomposition.html https://www.quora.com/What-is-an-intuitive-explanation-of-singular-value-decomposition-SVD http://scikit-learn.org/stable/modules/manifold.html#t-sne http://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf http://www.kyb.mpg.de/fileadmin/user_upload/files/publications/attachments/luxburg06_TR_v2_4139%5b1%5d.pdf
  • 44. References & Resources Computer Vision: http://cs231n.github.io/convolutional-networks/ Natural Language Processing: http://textminingonline.com/dive-into-nltk-part-iv-stemming-and-lemmatization Time Series: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6244856
  • 45. Relevant Wikipedia Pages ● Norms ● Curse of Dimensionality ● Dimensionality Reduction ● Feature Selection ● Feature Extraction ● Feature Engineering ● Mutual Information ● LASSO ● Ridge Regression ● Elastic Nets ● LARS ● SVD ● PCA ● t-SNE ● LDA
  • 46. Relevant Wikipedia Pages ● Computer Vision ● Natural Language Processing ● Time Series ● Pooling Layers ● Stopwords ● Stemming ● Lemmatization ● Granger Causality