CS676A Project
Report
Learning with Relative Attributes
Submitted in partial fulfillment of
the requirements for the course CS676A
April 15, 2016
Submitted by
13683 Shubham Jain
13788 Vikas Jain
Under the guidance of
Prof Vinay P. Namboodiri
Department of Computer Science and
Engineering
Indian Institute of Technology Kanpur
Kanpur, U.P. , India – 208 016
July 2015
Contents
1 Introduction
2 Methodology
2.1 Feature representation
2.2 Ranking Methods
2.2.1 RankSVM
2.2.2 RankNet [2]
2.3 Zero Shot Learning
3 Dataset
4 Code Used
5 Results
5.1 Correctly ordered pairs
5.2 Class Prediction
6 Conclusions
7 Scope and Future Work
1 Introduction
Human-nameable attributes, e.g. "open natural scene" or "white
young face", are intuitive to understand, and because of this
simplicity they are widely used for recognition tasks. However,
it is often difficult to assign categorical labels to images,
i.e. binary labels stating whether an attribute is present (yes)
or absent (no). Relative attributes [1] are more natural in such
cases: for example, a person in one image is taller than the
person in a second image and shorter than the person in a third.
In this project, we aim to learn relative attributes associated
with face images, namely masculine, white, young, chubby,
visible-forehead, bushy-eyebrows, narrow-eyes, pointy-nose,
big-lips and round-face. We used the PubFig dataset [5] for the
task. We extracted features using a Convolutional Neural Network
(the VGG16 model) [4]. We implemented and trained a RankNet model
to predict the absolute ranking score of each attribute given
the image features. We also explored zero-shot learning for
unseen classes by building a probabilistic model of every class:
seen classes are modelled from the attribute rankings of their
images, and unseen classes from their relative descriptions with
respect to the seen classes.
Figure 1: Relative Attribute [1]
2 Methodology
The main idea, in simplified form, is as follows. First, find a
feature representation of the image that captures the strength
of features and not just their presence. Then, using these
features, learn a model that, given the feature representation
of a new image, predicts a ranking score for that image with
respect to a particular attribute. We now describe the feature
representation and the ranking models that we used.
2.1 Feature representation
Parikh et al. [1] used GIST features of the images for the
learning task. Instead of handcrafted features, we used features
extracted from a Convolutional Neural Network, the VGG16 model
[4]. We used a VGG16 model pre-trained on the ImageNet dataset
[6]. The last (softmax) layer of the network is removed and the
output of the layer before it (FC-4096) is used as the feature
vector of each image. Each feature vector therefore has
dimension 4096.
2.2 Ranking Methods
For ranking we compared two methods. The first solves an
approximation of the ranking problem using Support Vector
Machines (SVM). The second is based on neural networks.
2.2.1 RankSVM
The idea originates in retrieval: given a query image, user
preferences are incorporated so that more relevant images appear
in the search results. Here the model is trained on a set of
comparisons between pairs of images with respect to an
attribute. The exact problem is NP-hard, and an approximate
solution is obtained by introducing non-negative slack
variables. The mathematical description is as follows:
Rank SVM formalization[1]
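For reference, the optimization problem in [1] can be written as
below. This is a transcription from our reading of [1], not a new
derivation: $O_m$ and $S_m$ denote the sets of ordered and similar
image pairs for attribute $a_m$, $x_i$ is the feature vector of
image $i$, $w_m$ is the ranking direction to be learnt, and $C$
trades off the margin against constraint violations.

```latex
\begin{aligned}
\min_{w_m,\,\xi,\,\gamma}\quad
  & \tfrac{1}{2}\lVert w_m\rVert_2^2
    + C\Big(\sum_{(i,j)\in O_m}\xi_{ij}^2
          + \sum_{(i,j)\in S_m}\gamma_{ij}^2\Big) \\
\text{s.t.}\quad
  & w_m^{\top}(x_i - x_j) \ge 1 - \xi_{ij},
    && \forall (i,j)\in O_m, \\
  & \lvert w_m^{\top}(x_i - x_j)\rvert \le \gamma_{ij},
    && \forall (i,j)\in S_m, \\
  & \xi_{ij}\ge 0,\quad \gamma_{ij}\ge 0 .
\end{aligned}
```

The learnt ranking function for attribute $a_m$ is then
$r_m(x) = w_m^{\top} x$.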
2.2.2 RankNet [2]
RankNet is a neural network, here with two hidden layers, for
which back-propagation is extended to train on pairs of
examples. The first hidden layer had 2048 neurons and the second
hidden layer had 512 neurons. The model trains on pairs of
examples to learn a ranking function that maps each input to a
real number. It uses a natural probabilistic cost function on
pairs of examples. The network, as described in the paper [2],
is as follows:
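For the $i$th training sample, denote the outputs of the net by
$o_i$ and the targets by $t_i$, let the transfer function of each
node in the $j$th layer of nodes be $g^j$, and let the cost
function be $\sum_{i=1}^{q} f(o_i, t_i)$. If $\alpha_k$ are the
parameters of the model, then a gradient descent step amounts to
$\delta\alpha_k = -\eta_k \frac{\partial f}{\partial \alpha_k}$,
where the $\eta_k$ are positive learning rates. The net embodies
the function

$$o_i = g^3\Big(\sum_j w^{32}_{ij}\, g^2\Big(\sum_k w^{21}_{jk} x_k + b^2_j\Big) + b^3_i\Big) \equiv g^3_i \qquad (1)$$

where for the weights $w$ and offsets $b$, the upper indices index
the node layer, and the lower indices index the nodes within each
corresponding layer. The cost function becomes a function of the
difference of the outputs of two consecutive training samples:
$f(o_2 - o_1)$. The update equations (back propagation) are as
follows, where $f' \equiv f'(o_2 - o_1)$, the subscripts 1 and 2
denote the two samples of the pair, and the index on the single
output node is dropped:

$$\frac{\partial f}{\partial b^3} = f'\,(g'^{3}_{2} - g'^{3}_{1}) \equiv \Delta^3_2 - \Delta^3_1 \qquad (2)$$

$$\frac{\partial f}{\partial w^{32}_{m}} = \Delta^3_2\, g^2_{2m} - \Delta^3_1\, g^2_{1m} \qquad (3)$$

$$\frac{\partial f}{\partial b^2_{m}} = \Delta^3_2\, w^{32}_{m}\, g'^{2}_{2m} - \Delta^3_1\, w^{32}_{m}\, g'^{2}_{1m} \equiv \Delta^2_{2m} - \Delta^2_{1m} \qquad (4)$$

$$\frac{\partial f}{\partial w^{21}_{mn}} = \Delta^2_{2m}\, g^1_{2n} - \Delta^2_{1m}\, g^1_{1n} \qquad (5)$$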
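The pairwise cost f is not spelled out above. A minimal sketch,
assuming the cross-entropy cost on the sigmoid of the score
difference proposed in [2]; the function names `ranknet_cost` and
`ranknet_cost_grad`, and the convention that the first argument is
the sample expected to rank higher, are ours and only illustrative:

```python
import math


def ranknet_cost(o_hi, o_lo, target=1.0):
    """Cross-entropy cost on a pair of network outputs.

    target is the desired probability that the first sample ranks
    above the second (1.0 = certain, 0.5 = the two are similar).
    """
    d = o_hi - o_lo
    # cost = -target * d + log(1 + exp(d)); the log term is
    # evaluated in a numerically stable form
    softplus = max(d, 0.0) + math.log1p(math.exp(-abs(d)))
    return softplus - target * d


def ranknet_cost_grad(o_hi, o_lo, target=1.0):
    """Derivative of the cost with respect to d = o_hi - o_lo."""
    d = o_hi - o_lo
    # sigmoid(d), written so that exp never overflows
    if d >= 0:
        p = 1.0 / (1.0 + math.exp(-d))
    else:
        e = math.exp(d)
        p = e / (1.0 + e)
    return p - target
```

With the difference taken as o_hi - o_lo, the value returned by
`ranknet_cost_grad` plays the role of f' in the update equations:
it is multiplied by the difference of the two samples' gradients
with respect to each parameter.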
2.3 Zero Shot Learning
Once the ranking function of each attribute has been learnt,
each image of each class is represented by its vector of
attribute ranking scores, i.e. as a point in the attribute
space. Using these vectors for all images of a particular class,
the parameters of a normal distribution (mean and covariance)
are estimated to obtain a probabilistic model of the class.
The probabilistic model of an unseen class is found using the
relative description of the unseen class with respect to the
seen classes. The method used to find the normal distribution
parameters is the same as given in [1].
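As an illustration of how one component of an unseen class mean
might be set from a relative description, here is a minimal
sketch following our reading of [1]: a class described as lying
between two seen classes gets the midpoint of their means for
that attribute, and a class described only as weaker or stronger
than a seen class is shifted by the average gap d between the
sorted seen-class means. The function name and arguments are
hypothetical, and the fallback for an undescribed attribute (the
mean of the seen-class means) is a simplification assumed here.

```python
def unseen_mean_component(seen_means, lower=None, upper=None):
    """Mean of one attribute for an unseen class.

    seen_means: means of this attribute over all seen classes.
    lower: mean of the seen class the unseen class ranks above,
           or None if no such class is given.
    upper: mean of the seen class the unseen class ranks below,
           or None if no such class is given.
    """
    ordered = sorted(seen_means)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    # average distance between neighbouring seen-class means
    d = sum(gaps) / len(gaps) if gaps else 0.0

    if lower is not None and upper is not None:
        return 0.5 * (lower + upper)   # between two seen classes
    if upper is not None:
        return upper - d               # weaker than a seen class
    if lower is not None:
        return lower + d               # stronger than a seen class
    # attribute not described: assumed fallback, see lead-in
    return sum(seen_means) / len(seen_means)
```

The covariance of the unseen class still has to be chosen
separately; this sketch covers the mean only.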
Once the probabilistic models of the seen and unseen classes are
obtained, for a query image we first compute the ranking score
of each attribute, which gives a vector $\bar{x}$ in the
attribute space. The image is labelled with the class of maximum
probability among all $N$ seen and unseen classes:

$$c^* = \arg\max_{j \in \{1,\dots,N\}} P(\bar{x} \mid \mu_j, \Sigma_j) \qquad (6)$$
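The classification rule in equation (6) can be sketched as below.
This is a minimal pure-Python sketch that assumes a diagonal
covariance for each class, a simplification made here for brevity
(the covariance in equation (6) is in general a full matrix). The
names `fit_gaussian`, `log_density` and `classify`, and the toy
data, are illustrative only.

```python
import math


def fit_gaussian(points, var_floor=1e-6):
    """Per-dimension mean and variance of attribute-space vectors."""
    n = len(points)
    dims = len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(dims)]
    # maximum-likelihood variance, floored to avoid division by zero
    var = [max(sum((p[k] - mean[k]) ** 2 for p in points) / n, var_floor)
           for k in range(dims)]
    return mean, var


def log_density(x, mean, var):
    """Log of a diagonal-covariance normal density evaluated at x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xk - m) ** 2 / v)
               for xk, m, v in zip(x, mean, var))


def classify(x, class_models):
    """Return the label whose model gives x the highest density."""
    return max(class_models,
               key=lambda label: log_density(x, *class_models[label]))


# Toy usage with made-up 2-D attribute scores for two classes
models = {
    "A": fit_gaussian([[0.0, 0.1], [0.2, -0.1], [-0.2, 0.0]]),
    "B": fit_gaussian([[3.0, 3.1], [3.2, 2.9], [2.8, 3.0]]),
}
label = classify([2.9, 3.0], models)
```

Working with log densities avoids underflow when the attribute
space has many dimensions, and leaves the arg max unchanged.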
3 Dataset
We used the PubFig dataset [5] for this project. The dataset
consists of face images of celebrities. Images from 8 classes
were taken for the learning tasks. Around 100 images from each
class were used, of which 85 were used for training and the rest
for testing. For zero-shot learning, we took images from 3
different classes. Around 30 images of each unseen class were
used for predicting the class.
4 Code Used
• Feature Extraction - From [4] https://gist.github.com/baraldilorenzo/07d7802847aaad0a35d3
• RankNet - Adapted from https://github.com/shiba24/learning2rank
• RankSVM - From [1]
5 Results
5.1 Correctly ordered pairs:
The accuracy of the learned relative ranking for each attribute
over the 8 classes is shown in Table 1.
Attribute        Acc. (%)    Attribute         Acc. (%)
Masculine        80.35       Bushy-eyebrows    80.81
Young            79.14       Pointy-nose       79.25
Chubby           69.68       Big-lips          88.25
Forehead         66.48       Round-face        83.66
Table 1: Accuracy of learned attributes
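Going by the heading of this subsection, the accuracy in Table 1
is the share of test pairs that the learned ranking function
orders correctly. A minimal sketch of that computation, assuming
each pair (i, j) states that image i shows the attribute more
strongly than image j; the function name and the toy scores are
made up for illustration:

```python
def pairwise_accuracy(scores, ordered_pairs):
    """Percentage of pairs (i, j) for which scores[i] > scores[j]."""
    if not ordered_pairs:
        raise ValueError("need at least one ordered pair")
    correct = sum(1 for i, j in ordered_pairs if scores[i] > scores[j])
    return 100.0 * correct / len(ordered_pairs)


# Toy usage: predicted ranking scores for four images and four
# ground-truth comparisons between them
toy_scores = [0.9, 0.2, 0.5, 0.1]
toy_pairs = [(0, 1), (2, 1), (3, 2), (0, 3)]
accuracy = pairwise_accuracy(toy_scores, toy_pairs)  # 3 of 4 pairs correct
```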
5.2 Class Prediction:
The class prediction accuracy obtained by fitting a normal
distribution to each class is shown in Table 2.
Class          RankNet Acc.    RankSVM Acc.
1              0.80            0.67
2              0.91            0.36
3              1.00            0.75
4              0.89            0.56
5              0.75            0.67
6              0.83            0.58
7              0.85            0.77
8              0.92            0.85
9 (Unseen)     0.03            –
10 (Unseen)    0.00            –
11 (Unseen)    0.10            –
Table 2: Class prediction accuracy per class
6 Conclusions
The results suggest that feature vectors extracted from a
Convolutional Neural Network, combined with the RankNet model
for learning relative attributes, perform better than the GIST
features and RankSVM model used in [1].
However, for the zero-shot learning task, while the prediction
of seen classes is better for RankNet than for RankSVM, RankNet
performs very poorly on unseen classes (an accuracy of at most
0.10 in Table 2).
7 Scope and Future Work
For zero-shot learning, we could instead use an unsupervised
step such as clustering rather than the comparison of
attributes. However, the attribute comparison is precisely what
the paper [1] set out to incorporate, so that the unseen-class
models arise naturally from relative descriptions. We also need
a better way to find the parameters of the unseen classes, since
the method used in the paper did not work well in our
experiments.
References
[1] Parikh, Devi, and Kristen Grauman. "Relative attributes."
Computer Vision (ICCV), 2011 IEEE International Conference on.
IEEE, 2011.
[2] Burges, Chris, et al. "Learning to rank using gradient
descent." Proceedings of the 22nd international conference on
Machine learning. ACM, 2005.
[3] Joachims, Thorsten. "Optimizing search engines using
clickthrough data." Proceedings of the eighth ACM SIGKDD
international conference on Knowledge discovery and data
mining. ACM, 2002.
[4] Simonyan, Karen, and Andrew Zisserman. "Very deep
convolutional networks for large-scale image recognition."
arXiv preprint arXiv:1409.1556 (2014).
[5] Kumar, Neeraj, Alexander C. Berg, Peter N. Belhumeur, and
Shree K. Nayar. "Attribute and simile classifiers for face
verification." International Conference on Computer Vision
(ICCV), 2009.
[6] Deng, Jia, et al. "Imagenet: A large-scale hierarchical
image database." Computer Vision and Pattern Recognition,
2009. CVPR 2009. IEEE Conference on. IEEE, 2009.
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdf
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDF
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#
 

Learning with Relative Attributes

  • 1. CS676A Project Report
    Learning with Relative Attributes
    Submitted in partial fulfillment of the requirements for the course CS676A
    April 15, 2016
    Submitted by
    13683 Shubham Jain
    13788 Vikas Jain
    Under the guidance of Prof Vinay P. Namboodiri
    Department of Computer Science and Engineering
    Indian Institute of Technology Kanpur
    Kanpur, U.P., India – 208 016
    July 2015
  • 2. Contents
    1 Introduction
    2 Methodology
    2.1 Feature representation
    2.2 Ranking Methods
    2.2.1 RankSVM
    2.2.2 RankNet [2]
    2.3 Zero Shot Learning
    3 Dataset
    4 Code Used
    5 Results
    5.1 Correctly ordered pairs
    5.2 Class Prediction
    6 Conclusions
    7 Scope and Future Work
  • 3. 1 Introduction

    Human-nameable attributes are intuitive to understand, e.g. an open natural scene or a white, young face. Because of their simplicity, human-nameable attributes are used for recognition tasks. However, it is sometimes difficult to assign categorical labels to images, i.e. binary labels of yes or no depending on whether the attribute is present. Relative attributes [1] are more understandable and intuitive: for example, the person in one image is taller than the person in a second image and shorter than the person in a third. In this project, we aim to learn relative attributes associated with face images, namely masculine, white, young, chubby, visible-forehead, bushy-eyebrows, narrow-eyes, pointy-nose, big-lips and round-face. We used the PubFig dataset [5] for the task. We extracted features using a convolutional neural network (the VGG16 model) [4]. We implemented and trained a RankNet model to predict the absolute ranking of attributes given the image features. We also explored zero-shot learning for unseen classes by building a probabilistic model of each seen and unseen class, using the attribute rankings of the images of each seen class and the relative descriptions of the unseen classes with respect to the seen classes, respectively.

    Figure 1: Relative Attribute [1]
  • 4. 2 Methodology

    The main idea, in simplified form, is as follows. First, find a good feature representation of the image that captures the strength of features, not just their presence. Then, using these features, learn a model that, given the feature representation of a new image, predicts a ranking of the image for a particular attribute. We now describe the feature representation that we used and the ranking models.

    2.1 Feature representation

    Parikh et al. [1] used GIST features of the images for the learning task. Instead of handcrafted features, we used features extracted from a convolutional neural network, namely the VGG16 model [4]. We used a VGG16 model pre-trained on the ImageNet dataset [6]. The last layer (softmax) of the network is removed and the output of the layer before it (FC-4096) is used as the feature vector of each image. The dimension of each feature vector is 4096.

    2.2 Ranking Methods

    For ranking we compared two methods. The first is based on an approximation algorithm that solves the problem via Support Vector Machines (SVM). The second is based on neural networks.

    2.2.1 RankSVM

    The idea is to learn, given a query image, from user preferences, which are then incorporated to retrieve more relevant images in the search results. The method uses a set of comparisons between certain images with respect to an attribute. The exact problem is NP-hard, and an approximate solution is obtained by using non-negative
  • 5. slack variables. The mathematical description is as follows:

    [Figure: RankSVM formalization [1]]

    2.2.2 RankNet [2]

    RankNet is a neural network with two hidden layers; back-propagation can be extended to train it on pairs of examples. The first hidden layer had 2048 neurons and the second hidden layer had 512 neurons. The model trains on pairs of examples to learn a ranking function that maps to the real numbers. It uses a natural probabilistic cost function on pairs of examples. The network, as described in the paper, is as follows. For the $i$th training sample, denote the outputs of the net by $o_i$ and the targets by $t_i$, let the transfer function of each node in the $j$th layer of nodes be $g^j$, and let the cost function be $\sum_{i=1}^{q} f(o_i, t_i)$. If $\alpha_k$ are the parameters of the model, then a gradient descent step amounts to $\delta\alpha_k = -\eta_k \, \partial f / \partial \alpha_k$, where the $\eta_k$ are positive learning rates. The net embodies the function

    \[ o_i = g^3\Big( \sum_j w^{32}_{ij} \, g^2\Big( \sum_k w^{21}_{jk} x_k + b^2_j \Big) + b^3_i \Big) \equiv g^3_i \tag{1} \]

    where, for the weights $w$ and offsets $b$, the upper indices index the node layer and the lower indices index the nodes within each corresponding layer. The cost function becomes a function of the
  • 6. difference of the outputs of two consecutive training samples: $f(o_2 - o_1)$. The update equations (back-propagation) are as follows, writing $f' \equiv f'(o_2 - o_1)$, with the subscript 1 or 2 denoting the training sample and a prime denoting a derivative:

    \begin{align}
    \frac{\partial f}{\partial b^3} &= f' \, (g^{\prime 3}_2 - g^{\prime 3}_1) \equiv \Delta^3_2 - \Delta^3_1 \tag{2} \\
    \frac{\partial f}{\partial w^{32}_m} &= \Delta^3_2 \, g^2_{2m} - \Delta^3_1 \, g^2_{1m} \tag{3} \\
    \frac{\partial f}{\partial b^2_m} &= \Delta^3_2 \, w^{32}_m \, g^{\prime 2}_{2m} - \Delta^3_1 \, w^{32}_m \, g^{\prime 2}_{1m} \equiv \Delta^2_{2m} - \Delta^2_{1m} \tag{4} \\
    \frac{\partial f}{\partial w^{21}_{mn}} &= \Delta^2_{2m} \, g^1_{2n} - \Delta^2_{1m} \, g^1_{1n} \tag{5}
    \end{align}

    2.3 Zero Shot Learning

    Once the ranking function of each attribute has been learnt, each image of each class is represented in the attribute space. Using the attribute vectors of the images of a particular class, the parameters of a normal distribution are estimated to obtain a probabilistic model of that class. The probabilistic model of an unseen class is obtained from the relative description of the unseen class with respect to the seen classes. The method used to find the normal distribution parameters is the same as in [1]. Once the probabilistic models of the seen and unseen classes are obtained, for a query image we first compute the rank of each attribute, which gives a vector $\bar{x}$ in the attribute space. The image is labelled with the class of maximum probability among all seen and unseen classes:

    \[ c^* = \arg\max_{j \in \{1, \dots, N\}} P(\bar{x} \mid \mu_j, \Sigma_j) \tag{6} \]
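The classification rule in equation (6) can be illustrated with a minimal sketch. It assumes each class model is a multivariate normal fitted directly to attribute-space vectors. The helper names (`fit_gaussian`, `log_density`, `predict_class`), the small ridge term added to the covariance, and the toy two-dimensional data are all assumptions made for illustration and are not taken from the report. The sketch covers only classes fitted from their own samples; the construction of unseen-class parameters from relative descriptions in [1] is not reproduced here.

```python
import numpy as np

def fit_gaussian(points, ridge=1e-6):
    """Estimate the mean and covariance of attribute-space vectors (one per row)."""
    points = np.asarray(points, dtype=float)
    mean = points.mean(axis=0)
    # A small ridge keeps the covariance invertible (an assumption of this sketch).
    cov = np.cov(points, rowvar=False) + ridge * np.eye(points.shape[1])
    return mean, cov

def log_density(x, mean, cov):
    """Log of the multivariate normal density N(x | mean, cov)."""
    diff = np.asarray(x, dtype=float) - mean
    _, logdet = np.linalg.slogdet(cov)
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (len(mean) * np.log(2.0 * np.pi) + logdet + mahalanobis)

def predict_class(x, models):
    """Return the label whose Gaussian assigns x the highest density (equation 6)."""
    return max(models, key=lambda label: log_density(x, *models[label]))

# Toy attribute-space data for two hypothetical classes.
rng = np.random.default_rng(0)
class_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
class_b = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(50, 2))
models = {"A": fit_gaussian(class_a), "B": fit_gaussian(class_b)}

print(predict_class([0.1, -0.2], models))
print(predict_class([2.8, 3.1], models))
```

Working with log densities rather than raw probabilities avoids numerical underflow and does not change which class attains the maximum.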
  • 7. 3 Dataset

    We used the PubFig dataset [5] for this project. The dataset consists of face images of celebrities. Images from 8 classes were taken for the learning tasks. Around 100 images from each class were used, of which 85 were used for training and the rest for testing. For zero-shot learning, we took images from 3 different classes. Around 30 images of each unseen class were used for predicting the class.

    4 Code Used

    • Feature Extraction - From [4] https://gist.github.com/baraldilorenzo/07d7802847aaad0a35d3
    • RankNet - Adapted from https://github.com/shiba24/learning2rank
    • RankSVM - From [1]

    5 Results

    5.1 Correctly ordered pairs:

    The results obtained for the learned relative ranking per attribute over the 8 classes are shown in Table 1.

    Attribute    Acc. (%)    Attribute         Acc. (%)
    Masculine    80.35       Bushy-eyebrows    80.81
    Young        79.14       Pointy-nose       79.25
    Chubby       69.68       Big-lips          88.25
    Forehead     66.48       Round-face        83.66

    Table 1: Accuracy of learned attributes
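A measure of correctly ordered pairs can be computed along the following lines. This is a sketch under the assumption that accuracy means the fraction of annotated test pairs whose predicted order agrees with the annotation. The function name `pairwise_accuracy` and the toy scores and pairs are hypothetical and do not come from the report's code.

```python
import numpy as np

def pairwise_accuracy(scores, ordered_pairs):
    """Fraction of pairs (i, j), with image i annotated as showing the
    attribute more strongly than image j, for which scores[i] > scores[j]."""
    scores = np.asarray(scores, dtype=float)
    pairs = np.asarray(ordered_pairs, dtype=int)
    correct = scores[pairs[:, 0]] > scores[pairs[:, 1]]
    return float(correct.mean())

# Hypothetical ranker outputs for four images and four annotated pairs.
scores = [0.9, 0.2, 0.5, 0.7]
pairs = [(0, 1), (0, 2), (3, 2), (1, 2)]

print(pairwise_accuracy(scores, pairs))  # 0.75
```

In this toy case three of the four pairs are ordered correctly; the pair (1, 2) is not, because image 1 receives a lower score than image 2.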
  • 8. 5.2 Class Prediction:

    The class prediction results, obtained by fitting a normal distribution to each class, are shown in Table 2.

    Class          RankNet Acc.    RankSVM Acc.
    1              0.80            0.67
    2              0.91            0.36
    3              1.00            0.75
    4              0.89            0.56
    5              0.75            0.67
    6              0.83            0.58
    7              0.85            0.77
    8              0.92            0.85
    9 (Unseen)     0.03            –
    10 (Unseen)    0.00            –
    11 (Unseen)    0.10            –

    Table 2: Class prediction accuracy

    6 Conclusions

    The results suggest that feature vectors extracted from a convolutional neural network, combined with the RankNet model for learning relative attributes, perform better than the GIST features and RankSVM model used in [1]. However, for the zero-shot learning task, while the predictions for seen classes are better with RankNet than with RankSVM, RankNet performs very poorly on the unseen classes (at most 0.10 accuracy).

    7 Scope and Future Work

    For zero-shot learning, an unsupervised step such as clustering could be used instead of the comparison of attributes, but that comparison is one thing the paper wanted to incorporate so that it
  • 9. comes out in a natural way. Also, we need a better way to find the parameters of the unseen classes, as the method used in the paper did not work well in our experiments.

    References

    [1] Parikh, Devi, and Kristen Grauman. "Relative attributes." Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011.
    [2] Burges, Chris, et al. "Learning to rank using gradient descent." Proceedings of the 22nd International Conference on Machine Learning. ACM, 2005.
    [3] Joachims, Thorsten. "Optimizing search engines using clickthrough data." Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2002.
    [4] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
    [5] Kumar, Neeraj, Alexander C. Berg, Peter N. Belhumeur, and Shree K. Nayar. "Attribute and simile classifiers for face verification." International Conference on Computer Vision (ICCV), 2009.
    [6] Deng, Jia, et al. "ImageNet: A large-scale hierarchical image database." Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. IEEE, 2009.