Modifications to a Minimizing Expected Risk-based Multi-class Image Classification Algorithm on Value-of-Information (VoI)
Zhuo Li, zhuol1@andrew.cmu.edu
Abstract: Real-world image classification constantly faces the problem that images are easy to obtain through many kinds of technology, yet few of them are correctly labeled, because manual labeling at scale requires too much human work. By applying a suitable active learning algorithm, however, a computer can complete the labeling process starting from a small number of human-labeled images, interactively querying an oracle (a human annotator) for the true labels of the images most informative for the task. In my project, I made some modifications to an existing active learning algorithm [1] based on VoI (value of information) to perform multi-class image classification.
Keywords: Machine Learning; Active Learning; Uncertainty Sampling
Introduction to the algorithm I used
My algorithm is an adaptation of the active learning algorithm proposed by Joshi, Porikli, and Papanikolopoulos [1], whose query selection strategy is Minimizing Expected Risk; from it I derived an uncertainty sampling algorithm for this multi-class bio-image classification project.
The adapted algorithm evaluates the misclassification risk of every image in the active (unlabeled) pool during the query selection phase and uses a support vector machine (SVM) as the base learner.
Specifically, I randomly choose 300 samples, about 1/10 of the query limit, as the "seed" for both the active and random learners, and use batch-mode query selection with a batch size of 50.
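The seed-and-batch procedure above can be sketched as follows. The function and variable names are my own illustration, not the project's actual code; the selection step shown here picks a random batch (i.e., the random learner), which the active learner replaces with a risk-based ranking of the pool.

```python
import numpy as np
from sklearn.svm import SVC

def run_learner(X, y, seed_size=300, batch_size=50, query_limit=3000, rng_seed=0):
    """Seed-then-batch query loop: start from a random human-labeled seed,
    then repeatedly query batches until the query limit is reached."""
    rng = np.random.default_rng(rng_seed)
    labeled = list(rng.choice(len(X), size=seed_size, replace=False))
    unlabeled = sorted(set(range(len(X))) - set(labeled))
    model = SVC(C=1.0, probability=True).fit(X[labeled], y[labeled])
    queried = seed_size
    while queried < query_limit and unlabeled:
        # Query selection: a random batch here (the random learner);
        # the active learner instead ranks the pool by misclassification risk.
        batch = list(rng.choice(unlabeled, size=min(batch_size, len(unlabeled)),
                                replace=False))
        labeled += batch
        unlabeled = sorted(set(unlabeled) - set(batch))
        queried += len(batch)
        model = SVC(C=1.0, probability=True).fit(X[labeled], y[labeled])
    return model
```

In the real project the labels of queried images would come from the oracle; here `y` stands in for those answers.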
Modifications:
In this project I modified the existing algorithm [1] within the framework of VoI, whose query selection strategy weighs two quantities: the misclassification risk and the cost of user annotation. I chose to consider only the misclassification risk, because in this project the cost of querying any image in the training set is the same; the only constraint is a limit on the number of queries, not a cost that varies per query. Hence, misclassification risk alone serves as the metric for selecting images to query in active learning.
The second modification I made is introduced in the first part of "Why this algorithm is suitable", under "Note".
Why this algorithm is suitable
The algorithm I used is suitable for this multi-class bio-image classification task for the following two reasons:
1. The misclassification-risk strategy used in query selection:
In the query selection phase, the original algorithm computes the overall risk [1] of the whole system after learning each candidate image from the unlabeled pool, and compares each of these risks to the overall risk of the system before learning any of them. The algorithm then queries the image that yields the largest risk reduction, in other words, the one that reduces the overall risk the most.
The computation of the overall risk involves a risk matrix M, which encodes the weight of misclassifying each label as each other label; each weight can be set according to the real-world cost of that particular confusion. For example, if the algorithm is used to recognize genes that cause different diseases, the weight for misclassifying a tumor-causing gene as a color-blindness-causing gene can be set very high, since that mistake is expensive, while the weight in the reverse direction could be low.
	
NOTE:
However, computing the posterior risk of every image under every newly learned model (as a Minimizing Expected Risk algorithm requires) would mean training thousands of new models in a single iteration. Because of this prohibitive time complexity, I made a second modification to the query selection phase: I instead compute the risk of misclassifying each image in the unlabeled pool under the current model, move the 50 images with the largest risks (batch mode) to the labeled pool, and retrain the active learning model on the new labeled pool. This changes the algorithm from a time-consuming Minimizing Expected Risk method into a time-complexity-friendly one.
	
The risk of misclassifying one image x from the unlabeled pool is

    R_L(x) = Σ_{i=1}^{k} Σ_{j=1}^{k} M_ij · p_x^i(L) · p_x^j(L)

where L is the labeled pool at each iteration of query selection, k is the number of labels, M is the risk matrix mentioned above, and p_x^i(L) is the posterior probability of classifying image x as label i given the labeled pool. This quantity can be computed without training thousands of new models.
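Under these definitions, the per-image risk can be computed directly from the base learner's class probabilities. The sketch below is my own illustration: it assumes a probability matrix such as the one `predict_proba` returns, and uses a uniform 0/1 risk matrix (every misclassification weighted equally) as an example.

```python
import numpy as np

def misclassification_risk(probs, M):
    """R_L(x) = sum_{i,j} M[i, j] * p_x^i(L) * p_x^j(L), computed for every
    row of `probs` (an n_samples x k matrix of class probabilities) at once."""
    return np.einsum('ni,ij,nj->n', probs, M, probs)

def select_batch(probs, M, batch_size=50):
    """Indices of the batch_size images with the largest risks (batch mode)."""
    return np.argsort(misclassification_risk(probs, M))[::-1][:batch_size]
```

A maximally uncertain image (uniform probabilities) gets the highest risk under a uniform M, while a confidently classified one gets a risk near zero, which is exactly the ordering the query selection needs.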
	
2. The support vector machine used as base learner for multi-class classification
Since the training data has eight labels altogether, a single binary classifier would not suffice. SVMs handle multi-class classification in mainly two ways, one-versus-rest and one-versus-one. I used the SVC class from Python's scikit-learn API (sklearn.svm) to implement multi-class classification through the one-versus-one method [2]. With SVC it is possible to train a multi-class model and obtain a probability for every label.
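As a minimal sketch (with toy three-class clusters standing in for the eight-label bio-image features), SVC trains its one-versus-one classifiers internally and, with `probability=True`, returns one probability per label:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Toy clusters standing in for the bio-image features.
X = np.vstack([rng.normal(c, 0.3, (20, 2)) for c in ((0, 0), (3, 3), (6, 0))])
y = np.repeat([0, 1, 2], 20)

clf = SVC(C=1.0, probability=True)  # multi-class is handled one-versus-one internally
clf.fit(X, y)
probs = clf.predict_proba(X[:1])    # one row per sample, one column per label
```

Each row of `probs` sums to 1, so it can be fed directly into the risk computation above.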
Performance of this algorithm

Besides the test error as a function of the amount of labeled data, required by the project specification, I used another metric, success rate, to evaluate the active learner against a random learner. Success rate is essentially a mirror of the test error, recording the fraction of correct predictions a model makes on the test set, but it depicts prediction accuracy more directly. Both kinds of plots are provided here for evaluation.
In addition, I ran the EASY and MODERATE datasets 10 times each and averaged the success rate and test error, to smooth out the randomness of the seed set and make the evaluation more comprehensive. (The seed set is picked randomly from the training data before the active learning process begins.)
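The averaging itself is straightforward; the sketch below assumes a hypothetical `run_fn` that, given a seed for the random seed set, returns one test-error (or success-rate) curve.

```python
import numpy as np

def average_curve(run_fn, n_runs=10):
    """Average the per-run curves (e.g. test error vs. amount of labeled
    points) over n_runs different random seed sets."""
    return np.mean([run_fn(seed) for seed in range(n_runs)], axis=0)
```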
	
Note that the figures include a parameter C, the penalty parameter of the SVM, which is explained later in the Findings section.
EASY DATASET:

Figure 1: One-time Test Errors and Success Rate versus Amount of Labeled Points for the EASY dataset
Figure 2: Average Test Errors and Success Rate versus Amount of Labeled Points for the EASY dataset
	
MODERATE DATASET:

Figure 3: One-time Test Errors and Success Rate versus Amount of Labeled Points for the MODERATE dataset
Figure 4: Average Test Errors and Success Rate versus Amount of Labeled Points for the MODERATE dataset
DIFFICULT DATASET:

Figure 5: One-time Test Errors and Success Rate versus Amount of Labeled Points for the DIFFICULT dataset
Findings & Explanations of Figures
For Parameter C

I set C to 1.0 for the EASY and DIFFICULT datasets and to 0.9 for the MODERATE dataset, for the following reasons:
C is the penalty parameter of the base-learner SVM, controlling how strongly misclassifications influence the objective function [3]. In other words, C determines the model's "faith" in the training data: if C is too large, the SVM "trusts" the training data too much, which can cause overfitting; if C is too small, the SVM does not "trust" the training data enough, which can cause underfitting. Choosing a good C therefore matters.
In SVC, C defaults to 1.0, a trade-off between bias and variance. For the low-noise EASY and DIFFICULT datasets I kept the default C = 1.0. For the MODERATE set, whose training data contains some noise whose influence I wanted to minimize, I set C to 0.9, so that the model "trusts" the training data a little less and thus avoids overfitting.
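One way to sanity-check such a choice is to compare cross-validated accuracy for a few values of C around the default. The data below is a synthetic, noisy stand-in for the MODERATE set, not the project's actual data.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
# Labels driven by one feature plus noise, mimicking a noisy training set.
y = (X[:, 0] + 0.6 * rng.normal(size=150) > 0).astype(int)

# Mean 5-fold cross-validation accuracy for each candidate penalty value.
scores = {C: cross_val_score(SVC(C=C), X, y, cv=5).mean()
          for C in (0.5, 0.9, 1.0, 2.0)}
```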
Four "success rate" graphs with different values of C are provided in Fig. 6 below. The difference in prediction accuracy between the active and random learners across these graphs accounts for choosing C = 0.9, rather than other values, for the MODERATE set.
Figure 6: Average Test Errors for the MODERATE dataset with different values of C
For Easy Dataset
For the EASY dataset, the active learner outperforms the random learner both in prediction accuracy and in the speed with which it reaches its best performance.
The active learner makes about 77 errors per 1000 predictions, while the random learner makes about 92 per 1000. These are the final performances of both learners: in Fig. 2, the average over 10 runs, the curves for both learners flatten out at the end.
From the averaged curves we can see that early on, the active learner may perform slightly worse than the random learner. This is because the active learner queries the most informative images, that is, the riskiest ones, which are most likely to lie near decision boundaries, while the random learner picks images uniformly at random; this can temporarily cause the active learner to underperform. As the amount of labeled points increases, however, the active learner clearly outperforms the random one.
Moreover, the active learner reaches its peak success rate of 92.4% well before the random learner reaches its peak of 91.1%, which shows that the active learner learns faster.
For Moderate Dataset

For the MODERATE dataset, I set the penalty parameter C to 0.9 to limit the influence of noise. Fig. 4 shows that the active learner outperforms the random learner both in prediction accuracy and in learning speed.
The active learner makes about 150 errors per 1000 predictions, while the random learner makes about 166 per 1000.
As in the EASY set, the averaged curves show the active learner slightly behind the random learner at the start, but overall the active learner comes out ahead. Performance on the MODERATE set is worse than on the EASY set because the MODERATE training set contains a certain amount of noise.
In addition, the active learner reaches a peak success rate of 85%, versus 83.6% for the random learner, and the active learner's curve has already flattened by the time both learners approach their peaks, which again shows that the active learner learns faster.
	
	
For Difficult Dataset

For the DIFFICULT dataset, I performed feature selection both before and during each iteration of active learning.
A tree-based feature selection method from Python's scikit-learn API (sklearn.feature_selection) [4] is applied here; feature selection is done before training the active and random learners to exclude the negative influence of unrelated features.
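A minimal sketch of that tree-based selection, assuming the scikit-learn workflow from [4] (fit a tree ensemble, keep the features whose importance exceeds the mean):

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel

def tree_feature_selection(X, y, random_state=0):
    """Keep features whose tree-ensemble importance exceeds the mean
    importance (SelectFromModel's default threshold)."""
    forest = ExtraTreesClassifier(n_estimators=50, random_state=random_state)
    forest.fit(X, y)
    selector = SelectFromModel(forest, prefit=True)
    return selector.transform(X), selector.get_support()
```

The function and its name are illustrative; the returned boolean mask shows which columns survived, which is how a per-iteration feature count like Fig. 7's can be recorded.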
From Fig. 7 below, the number of relevant features in the DIFFICULT training data is around 23 to 26, meaning nearly half of the features are unrelated; these are successfully excluded by the feature selection process.
Figure 7: Numbers of selected features for the active and random learners at the end
In addition, Fig. 5 shows that the active learner makes about 130 errors per 1000 predictions, while the random learner makes about 153 per 1000, and the active learner outperforms the random learner over nearly the whole run, with a peak accuracy of 87% versus 84.7% for the random learner.
References
1. Joshi, Ajay J., Fatih Porikli, and Nikolaos P. Papanikolopoulos. "Scalable active learning for multiclass image classification." IEEE Transactions on Pattern Analysis and Machine Intelligence 34.11 (2012): 2259-2273.
2. http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn-svm-svc
3. http://stats.stackexchange.com/questions/31066/what-is-the-influence-of-c-in-svms-with-linear-kernel
4. http://scikit-learn.org/stable/modules/feature_selection.html
More Related Content

What's hot

Facial Emotion Recognition: A Deep Learning approach
Facial Emotion Recognition: A Deep Learning approachFacial Emotion Recognition: A Deep Learning approach
Facial Emotion Recognition: A Deep Learning approachAshwinRachha
 
Building_a_Readmission_Model_Using_WEKA
Building_a_Readmission_Model_Using_WEKABuilding_a_Readmission_Model_Using_WEKA
Building_a_Readmission_Model_Using_WEKASunil Kakade
 
Facial emotion detection on babies' emotional face using Deep Learning.
Facial emotion detection on babies' emotional face using Deep Learning.Facial emotion detection on babies' emotional face using Deep Learning.
Facial emotion detection on babies' emotional face using Deep Learning.Takrim Ul Islam Laskar
 
A new architecture of internet of things and big data ecosystem for
A new architecture of internet of things and big data ecosystem forA new architecture of internet of things and big data ecosystem for
A new architecture of internet of things and big data ecosystem forVenkat Projects
 
Facial expression recognition
Facial expression recognitionFacial expression recognition
Facial expression recognitionElyesMiri
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
Nss power point_machine_learning
Nss power point_machine_learningNss power point_machine_learning
Nss power point_machine_learningGauravsd2014
 
IMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYOND
IMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYONDIMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYOND
IMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYONDRabi Das
 
IRJET- Analysis of Brand Value Prediction based on Social Media Data
IRJET-  	  Analysis of Brand Value Prediction based on Social Media DataIRJET-  	  Analysis of Brand Value Prediction based on Social Media Data
IRJET- Analysis of Brand Value Prediction based on Social Media DataIRJET Journal
 
A Defect Prediction Model for Software Product based on ANFIS
A Defect Prediction Model for Software Product based on ANFISA Defect Prediction Model for Software Product based on ANFIS
A Defect Prediction Model for Software Product based on ANFISIJSRD
 
Feature extraction for classifying students based on theirac ademic performance
Feature extraction for classifying students based on theirac ademic performanceFeature extraction for classifying students based on theirac ademic performance
Feature extraction for classifying students based on theirac ademic performanceVenkat Projects
 
smartwatch-user-identification
smartwatch-user-identificationsmartwatch-user-identification
smartwatch-user-identificationSebastian W. Cheah
 
A deep learning facial expression recognition based scoring system for restau...
A deep learning facial expression recognition based scoring system for restau...A deep learning facial expression recognition based scoring system for restau...
A deep learning facial expression recognition based scoring system for restau...CloudTechnologies
 
Crocodile Physics
Crocodile PhysicsCrocodile Physics
Crocodile Physicsu082930
 
Recommender system
Recommender systemRecommender system
Recommender systemSaiguru P.v
 
IRJET- Survey on Face Recognition using Biometrics
IRJET-  	  Survey on Face Recognition using BiometricsIRJET-  	  Survey on Face Recognition using Biometrics
IRJET- Survey on Face Recognition using BiometricsIRJET Journal
 
Facial expression recongnition Techniques, Database and Classifiers
Facial expression recongnition Techniques, Database and Classifiers Facial expression recongnition Techniques, Database and Classifiers
Facial expression recongnition Techniques, Database and Classifiers Rupinder Saini
 
Supervised Unsupervised and Reinforcement Learning
Supervised Unsupervised and Reinforcement Learning Supervised Unsupervised and Reinforcement Learning
Supervised Unsupervised and Reinforcement Learning Aakash Chotrani
 

What's hot (20)

Facial Emotion Recognition: A Deep Learning approach
Facial Emotion Recognition: A Deep Learning approachFacial Emotion Recognition: A Deep Learning approach
Facial Emotion Recognition: A Deep Learning approach
 
Building_a_Readmission_Model_Using_WEKA
Building_a_Readmission_Model_Using_WEKABuilding_a_Readmission_Model_Using_WEKA
Building_a_Readmission_Model_Using_WEKA
 
Facial emotion detection on babies' emotional face using Deep Learning.
Facial emotion detection on babies' emotional face using Deep Learning.Facial emotion detection on babies' emotional face using Deep Learning.
Facial emotion detection on babies' emotional face using Deep Learning.
 
Soumya
SoumyaSoumya
Soumya
 
A new architecture of internet of things and big data ecosystem for
A new architecture of internet of things and big data ecosystem forA new architecture of internet of things and big data ecosystem for
A new architecture of internet of things and big data ecosystem for
 
Facial expression recognition
Facial expression recognitionFacial expression recognition
Facial expression recognition
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
Nss power point_machine_learning
Nss power point_machine_learningNss power point_machine_learning
Nss power point_machine_learning
 
IMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYOND
IMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYONDIMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYOND
IMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYOND
 
IRJET- Analysis of Brand Value Prediction based on Social Media Data
IRJET-  	  Analysis of Brand Value Prediction based on Social Media DataIRJET-  	  Analysis of Brand Value Prediction based on Social Media Data
IRJET- Analysis of Brand Value Prediction based on Social Media Data
 
A Defect Prediction Model for Software Product based on ANFIS
A Defect Prediction Model for Software Product based on ANFISA Defect Prediction Model for Software Product based on ANFIS
A Defect Prediction Model for Software Product based on ANFIS
 
Feature extraction for classifying students based on theirac ademic performance
Feature extraction for classifying students based on theirac ademic performanceFeature extraction for classifying students based on theirac ademic performance
Feature extraction for classifying students based on theirac ademic performance
 
smartwatch-user-identification
smartwatch-user-identificationsmartwatch-user-identification
smartwatch-user-identification
 
Crocodile Physics
Crocodile PhysicsCrocodile Physics
Crocodile Physics
 
A deep learning facial expression recognition based scoring system for restau...
A deep learning facial expression recognition based scoring system for restau...A deep learning facial expression recognition based scoring system for restau...
A deep learning facial expression recognition based scoring system for restau...
 
Crocodile Physics
Crocodile PhysicsCrocodile Physics
Crocodile Physics
 
Recommender system
Recommender systemRecommender system
Recommender system
 
IRJET- Survey on Face Recognition using Biometrics
IRJET-  	  Survey on Face Recognition using BiometricsIRJET-  	  Survey on Face Recognition using Biometrics
IRJET- Survey on Face Recognition using Biometrics
 
Facial expression recongnition Techniques, Database and Classifiers
Facial expression recongnition Techniques, Database and Classifiers Facial expression recongnition Techniques, Database and Classifiers
Facial expression recongnition Techniques, Database and Classifiers
 
Supervised Unsupervised and Reinforcement Learning
Supervised Unsupervised and Reinforcement Learning Supervised Unsupervised and Reinforcement Learning
Supervised Unsupervised and Reinforcement Learning
 

Viewers also liked

zhou_resume_june2016
zhou_resume_june2016zhou_resume_june2016
zhou_resume_june2016Haonan Zhou
 
Vadim Korolik Resume
Vadim Korolik ResumeVadim Korolik Resume
Vadim Korolik ResumeVadim Korolik
 
Israel Vicars Resume
Israel Vicars ResumeIsrael Vicars Resume
Israel Vicars ResumeIsrael Vicars
 
Jalisa Israel resume 2016
Jalisa Israel resume 2016Jalisa Israel resume 2016
Jalisa Israel resume 2016Jalisa Israel
 
Omar Israel Wright_Resume
Omar Israel Wright_ResumeOmar Israel Wright_Resume
Omar Israel Wright_ResumeOmar Wright
 
Resume Yi (Joey) Zhou
Resume Yi (Joey) ZhouResume Yi (Joey) Zhou
Resume Yi (Joey) ZhouYi Zhou
 
Israel Izzy Riemer resume
Israel Izzy Riemer resumeIsrael Izzy Riemer resume
Israel Izzy Riemer resumeIsrael Riemer
 
Israel Olaore Resume 1
Israel Olaore Resume 1Israel Olaore Resume 1
Israel Olaore Resume 1Israel Olaore
 
Anyu Zhou-Resume
Anyu Zhou-ResumeAnyu Zhou-Resume
Anyu Zhou-ResumeAnyu Zhou
 
Xiaochen Cai-Resume July 2016
Xiaochen Cai-Resume July 2016Xiaochen Cai-Resume July 2016
Xiaochen Cai-Resume July 2016笑辰 蔡
 
Rachel Rutti Resume 8 23-2010
Rachel Rutti Resume 8 23-2010Rachel Rutti Resume 8 23-2010
Rachel Rutti Resume 8 23-2010rachelrutti
 
Sicheng Zhou Resume
Sicheng Zhou ResumeSicheng Zhou Resume
Sicheng Zhou ResumeSicheng ZHOU
 
Israel Irazarry new resume 10-19-2016
Israel Irazarry new resume 10-19-2016Israel Irazarry new resume 10-19-2016
Israel Irazarry new resume 10-19-2016ISRAEL IRIZARRY
 
Lewis_A_Ecker_RESUME_2016_COLORphoto (2)
Lewis_A_Ecker_RESUME_2016_COLORphoto (2)Lewis_A_Ecker_RESUME_2016_COLORphoto (2)
Lewis_A_Ecker_RESUME_2016_COLORphoto (2)Lewis Ecker
 

Viewers also liked (20)

zhou_resume_june2016
zhou_resume_june2016zhou_resume_june2016
zhou_resume_june2016
 
Vadim Korolik Resume
Vadim Korolik ResumeVadim Korolik Resume
Vadim Korolik Resume
 
Israel Vicars Resume
Israel Vicars ResumeIsrael Vicars Resume
Israel Vicars Resume
 
Jalisa Israel resume 2016
Jalisa Israel resume 2016Jalisa Israel resume 2016
Jalisa Israel resume 2016
 
Omar Israel Wright_Resume
Omar Israel Wright_ResumeOmar Israel Wright_Resume
Omar Israel Wright_Resume
 
Resume Yi (Joey) Zhou
Resume Yi (Joey) ZhouResume Yi (Joey) Zhou
Resume Yi (Joey) Zhou
 
KAI RESUME
KAI RESUMEKAI RESUME
KAI RESUME
 
Israel Izzy Riemer resume
Israel Izzy Riemer resumeIsrael Izzy Riemer resume
Israel Izzy Riemer resume
 
Israel Olaore Resume 1
Israel Olaore Resume 1Israel Olaore Resume 1
Israel Olaore Resume 1
 
Roy_Resume_2016
Roy_Resume_2016Roy_Resume_2016
Roy_Resume_2016
 
Anyu Zhou-Resume
Anyu Zhou-ResumeAnyu Zhou-Resume
Anyu Zhou-Resume
 
Xiaochen Cai-Resume July 2016
Xiaochen Cai-Resume July 2016Xiaochen Cai-Resume July 2016
Xiaochen Cai-Resume July 2016
 
Zoey Hopkins Cv
Zoey Hopkins CvZoey Hopkins Cv
Zoey Hopkins Cv
 
Rachel Rutti Resume 8 23-2010
Rachel Rutti Resume 8 23-2010Rachel Rutti Resume 8 23-2010
Rachel Rutti Resume 8 23-2010
 
Sicheng Zhou Resume
Sicheng Zhou ResumeSicheng Zhou Resume
Sicheng Zhou Resume
 
YUANXIN ZHOU resume copy 3
YUANXIN ZHOU resume copy 3YUANXIN ZHOU resume copy 3
YUANXIN ZHOU resume copy 3
 
Israel Irazarry new resume 10-19-2016
Israel Irazarry new resume 10-19-2016Israel Irazarry new resume 10-19-2016
Israel Irazarry new resume 10-19-2016
 
Lewis_A_Ecker_RESUME_2016_COLORphoto (2)
Lewis_A_Ecker_RESUME_2016_COLORphoto (2)Lewis_A_Ecker_RESUME_2016_COLORphoto (2)
Lewis_A_Ecker_RESUME_2016_COLORphoto (2)
 
Monika, resume
Monika, resumeMonika, resume
Monika, resume
 
Israel Maynard resume
Israel Maynard resumeIsrael Maynard resume
Israel Maynard resume
 

Similar to Multi-class Bio-images Classification

Using machine learning in anti money laundering part 2
Using machine learning in anti money laundering   part 2Using machine learning in anti money laundering   part 2
Using machine learning in anti money laundering part 2Naveen Grover
 
A Comparative Study on Identical Face Classification using Machine Learning
A Comparative Study on Identical Face Classification using Machine LearningA Comparative Study on Identical Face Classification using Machine Learning
A Comparative Study on Identical Face Classification using Machine LearningIRJET Journal
 
Machine Learning Interview Questions
Machine Learning Interview QuestionsMachine Learning Interview Questions
Machine Learning Interview QuestionsRock Interview
 
direct marketing in banking using data mining
direct marketing in banking using data miningdirect marketing in banking using data mining
direct marketing in banking using data miningHossein Malekinezhad
 
Machine learning interview questions and answers
Machine learning interview questions and answersMachine learning interview questions and answers
Machine learning interview questions and answerskavinilavuG
 
Post Graduate Admission Prediction System
Post Graduate Admission Prediction SystemPost Graduate Admission Prediction System
Post Graduate Admission Prediction SystemIRJET Journal
 
Nature-Based Prediction Model of Bug Reports Based on Ensemble Machine Learni...
Nature-Based Prediction Model of Bug Reports Based on Ensemble Machine Learni...Nature-Based Prediction Model of Bug Reports Based on Ensemble Machine Learni...
Nature-Based Prediction Model of Bug Reports Based on Ensemble Machine Learni...Shakas Technologies
 
Implementation of Spam Classifier using Naïve Bayes Algorithm
Implementation of Spam Classifier using Naïve Bayes AlgorithmImplementation of Spam Classifier using Naïve Bayes Algorithm
Implementation of Spam Classifier using Naïve Bayes AlgorithmIRJET Journal
 
How to Implement the Digital Medicine in the Future
How to Implement the Digital Medicine in the FutureHow to Implement the Digital Medicine in the Future
How to Implement the Digital Medicine in the FutureYoon Sup Choi
 
IRJET- Automated Student’s Attendance Management using Convolutional Neural N...
IRJET- Automated Student’s Attendance Management using Convolutional Neural N...IRJET- Automated Student’s Attendance Management using Convolutional Neural N...
IRJET- Automated Student’s Attendance Management using Convolutional Neural N...IRJET Journal
 
SVM-KNN Hybrid Method for MR Image
SVM-KNN Hybrid Method for MR ImageSVM-KNN Hybrid Method for MR Image
SVM-KNN Hybrid Method for MR ImageIRJET Journal
 
machine learning
machine learningmachine learning
machine learningMounisha A
 
House Price Estimation as a Function Fitting Problem with using ANN Approach
House Price Estimation as a Function Fitting Problem with using ANN ApproachHouse Price Estimation as a Function Fitting Problem with using ANN Approach
House Price Estimation as a Function Fitting Problem with using ANN ApproachYusuf Uzun
 
Presentation on supervised learning
Presentation on supervised learningPresentation on supervised learning
Presentation on supervised learningTonmoy Bhagawati
 
Supervised learning techniques and applications
Supervised learning techniques and applicationsSupervised learning techniques and applications
Supervised learning techniques and applicationsBenjaminlapid1
 
Using the Machine to predict Testability
Using the Machine to predict TestabilityUsing the Machine to predict Testability
Using the Machine to predict TestabilityMiguel Lopez
 

Similar to Multi-class Bio-images Classification (20)

RSI_ReportPDF
RSI_ReportPDFRSI_ReportPDF
RSI_ReportPDF
 
Ijetr042148
Ijetr042148Ijetr042148
Ijetr042148
 
Using machine learning in anti money laundering part 2
Using machine learning in anti money laundering   part 2Using machine learning in anti money laundering   part 2
Using machine learning in anti money laundering part 2
 
A Comparative Study on Identical Face Classification using Machine Learning
A Comparative Study on Identical Face Classification using Machine LearningA Comparative Study on Identical Face Classification using Machine Learning
A Comparative Study on Identical Face Classification using Machine Learning
 
Machine Learning Interview Questions
Machine Learning Interview QuestionsMachine Learning Interview Questions
Machine Learning Interview Questions
 
Machine Learning.pptx
Machine Learning.pptxMachine Learning.pptx
Machine Learning.pptx
 
direct marketing in banking using data mining
direct marketing in banking using data miningdirect marketing in banking using data mining
direct marketing in banking using data mining
 
Machine learning interview questions and answers
Machine learning interview questions and answersMachine learning interview questions and answers
Machine learning interview questions and answers
 
Post Graduate Admission Prediction System
Post Graduate Admission Prediction SystemPost Graduate Admission Prediction System
Post Graduate Admission Prediction System
 
Nature-Based Prediction Model of Bug Reports Based on Ensemble Machine Learni...
Nature-Based Prediction Model of Bug Reports Based on Ensemble Machine Learni...Nature-Based Prediction Model of Bug Reports Based on Ensemble Machine Learni...
Nature-Based Prediction Model of Bug Reports Based on Ensemble Machine Learni...
 
Implementation of Spam Classifier using Naïve Bayes Algorithm
Implementation of Spam Classifier using Naïve Bayes AlgorithmImplementation of Spam Classifier using Naïve Bayes Algorithm
Implementation of Spam Classifier using Naïve Bayes Algorithm
 
How to Implement the Digital Medicine in the Future
How to Implement the Digital Medicine in the FutureHow to Implement the Digital Medicine in the Future
How to Implement the Digital Medicine in the Future
 
IRJET- Automated Student’s Attendance Management using Convolutional Neural N...
IRJET- Automated Student’s Attendance Management using Convolutional Neural N...IRJET- Automated Student’s Attendance Management using Convolutional Neural N...
IRJET- Automated Student’s Attendance Management using Convolutional Neural N...
 
SVM-KNN Hybrid Method for MR Image
SVM-KNN Hybrid Method for MR ImageSVM-KNN Hybrid Method for MR Image
SVM-KNN Hybrid Method for MR Image
 
machine learning
machine learningmachine learning
machine learning
 
House Price Estimation as a Function Fitting Problem with using ANN Approach
House Price Estimation as a Function Fitting Problem with using ANN ApproachHouse Price Estimation as a Function Fitting Problem with using ANN Approach
House Price Estimation as a Function Fitting Problem with using ANN Approach
 
Presentation on supervised learning
Presentation on supervised learningPresentation on supervised learning
Presentation on supervised learning
 
Supervised learning techniques and applications
Supervised learning techniques and applicationsSupervised learning techniques and applications
Supervised learning techniques and applications
 
Using the Machine to predict Testability
Using the Machine to predict TestabilityUsing the Machine to predict Testability
Using the Machine to predict Testability
 
Machine Learning by Rj
Machine Learning by RjMachine Learning by Rj
Machine Learning by Rj
 

Multi-class Bio-images Classification

  • 1. Modifications to a Minimizing Expected Risk-based Multi-class Image Classification Algorithm on Value-of-Information (VoI) Zhuo Li —— zhuol1@andrew.cmu.edu Abstract—— Real-world image classification always meets with the problem that there are so many images that could be easily obtained through different kinds of technologies, while little of them are correctly labeled manually, for a huge amount of human-labeling requires too much work. However, by applying proper active learning algorithms, computers can complete the labeling process with a small number of human-labeled images as start and interactively querying the oracle or human to get the true labels for some informative images useful in the labeling. In my project, I made some modifications to the existing active learning algorithm [1] on VoI (value-of-information) to perform the task of multi-class image classification. Keywords: Machine Learning; Active Learning; Uncertainty Sampling Introduction to the algorithm I used My algorithm is an adaption to the existing active learning algorithm raised by Joshi, Ajay J., Fatih Porikli, and Nikolaos P. [1] whose Query Selection Strategy is Minimizing Expected Risk, from which I modified an Uncertainty Sampling algorithm to implement this multi-class bio- images’ classification project. My adapted algorithm cares about the misclassification risk for every image in the active pool (unlabeled pool) in the query selection phase and uses a support vector machine (SVM) as the base learner. Specifically, I randomly choose 300 samples, which is around 1/10 of the number of query limits, as the “seed” for both the active and random learners, and use a batch mode for Query Selection at the size of 50. Modifications: In this project, I chose to make modifications to the existing algorithm [1] on the framework of VoI which takes care of two things in the query selection strategy, the misclassification risk and the cost of user annotation. 
I chose to consider only the metric of misclassification risk, rather than the cost of user annotation, because in this project the cost of querying any image in the training set is the same: there is only a limit on the number of queries, and no difference in cost between queries. Hence, misclassification risk alone serves as the metric for selecting images to query in active learning. The second modification I made is introduced in the first part of "Why this algorithm is suitable", in the "NOTE".
Why this algorithm is suitable

The algorithm I used is suitable for this multi-class bio-image classification for the following two reasons:

1. The misclassification risk strategy used in query selection:
In the query selection phase, the original algorithm computes the overall risk [1] of the whole system after learning each candidate image from the unlabeled pool, and compares each of these risks to the overall risk before learning any of them. The algorithm then queries the image that causes the largest risk reduction, in other words, reduces the overall risk the most. The computation of the overall risk involves a risk matrix M, whose entries weight the risk of misclassifying each label as each other label, according to the cost such a mistake would incur in the real world. For example, if this algorithm were used to recognize genes that cause different diseases, the weight for misclassifying a tumor-causing gene as a color-blindness-causing gene could be very high, since that mistake would be expensive, while the weight in the opposite direction could be low.

NOTE: However, because of the great time complexity of computing the posterior risk of every image under every newly learned model (a Minimizing Expected Risk algorithm), which requires training thousands of new models in a single iteration, I made a second modification to the query selection phase: I compute the risk of misclassifying every image in the unlabeled pool instead, move the 50 images with the largest risks (batch mode) to the labeled pool, and retrain the active learning model on the new labeled pool. This changes the algorithm from a time-consuming Minimizing Expected Risk method into a time-complexity-friendly one. The risk of misclassifying one image x from the unlabeled pool is

    ℛ_ℒ(x) = Σ_{j=1}^{k} Σ_{i=1}^{k} M_{ij} · p_x(i | ℒ) · p_x(j | ℒ)

where ℒ is the labeled pool in each iteration of query selection, k is the number of labels, M is the risk matrix mentioned above, and p_x(i | ℒ) is the posterior probability of classifying image x as label i given the labeled pool, which does not require training thousands of new models to compute.

2. The support vector machine used as base learner for multi-class classification:
Since the training data has 8 labels in total, a single binary classifier would not suffice. SVMs implement multi-class classification in mainly two ways, one-versus-rest and one-versus-one. I used "SVC" from Python's "sklearn.svm" to implement multi-class classification via the one-versus-one [2] method. With SVC it is possible to train a multi-class model and return the probabilities for every label.
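The modified batch query-selection step follows directly from this risk formula. A sketch (illustrative only, not the original code; the `einsum`-based vectorization and the function names are my own, and the risk matrix M is supplied by the caller):

```python
import numpy as np

def misclassification_risk(proba, M):
    """
    proba : (n_unlabeled, k) posterior probabilities p_x(i | L)
            from the current model, e.g. SVC.predict_proba
    M     : (k, k) risk matrix; M[i, j] weights confusing label i with label j
    Returns R_L(x) = sum_j sum_i M[i, j] * p_x(i|L) * p_x(j|L) for each x.
    """
    # einsum evaluates the double sum over labels for every row at once
    return np.einsum('ni,ij,nj->n', proba, M, proba)

def select_batch(proba, M, batch_size=50):
    """Pick the batch_size unlabeled images with the largest risk;
    these are moved to the labeled pool and the model is retrained."""
    risks = misclassification_risk(proba, M)
    return np.argsort(risks)[-batch_size:]
```

Note that with a zero diagonal in M, a confident prediction (one probability near 1) yields near-zero risk, while a uniform posterior yields the largest risk, which is what makes this a risk-weighted uncertainty sampling rule.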
Performance of this algorithm

Besides the test error as a function of the amount of labeled data required by the project specification, I used a second metric, success rate, to evaluate the active learner against a random learner. Success rate is simply a projection of test error, documenting how often a model predicts the test set correctly, but it depicts prediction accuracy more explicitly. Two kinds of plots are provided for evaluation. In addition, I ran the EASY and MODERATE datasets 10 times each and averaged the success rate and test error, to smooth out the randomness of the "seed set" and make the evaluation more comprehensive. (The seed set is picked randomly from the training data before the active learning process begins.) Note the parameter C in the figures, which is the penalty parameter of the SVM and is explained later in the Findings section.

EASY DATASET:
Figure 1: One-time test errors and success rate versus amount of labeled points for the EASY dataset
Findings & Explanations of Figures

For parameter C

I set C to 1.0 for the EASY and DIFFICULT datasets and to 0.9 for the MODERATE dataset, for the following reasons: C is the penalty parameter of the base learner SVM, controlling the influence of misclassification on the objective function [3]. In other words, C determines the model's "faith" in the training data: if C is too large, the SVM "trusts" the training data too much, which can cause overfitting; if C is too small, the SVM does not "trust" the training data enough, which can cause underfitting. Choosing a good C therefore matters. In SVC, C defaults to 1.0, a trade-off between bias and variance. For the low-noise EASY and DIFFICULT datasets, I kept the default C = 1.0. For the MODERATE set, which contains some noise whose influence I need to minimize, I chose C = 0.9, so the model does not "trust" the training data as much, avoiding overfitting. Four "success rate" plots with different values of C are provided in Fig. 6 below; the gap in prediction accuracy between the active and random learners across them shows why C = 0.9, rather than other values, suits the MODERATE set.
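The two settings map onto sklearn's `SVC` as in the sketch below (the variable names are mine; only the C values come from the report):

```python
from sklearn.svm import SVC

# Large C -> trust the training data more (risk of overfitting);
# small C -> stronger regularization, wider margin (risk of underfitting).
clf_default = SVC(C=1.0, probability=True)  # EASY / DIFFICULT: low noise, keep default
clf_noisy   = SVC(C=0.9, probability=True)  # MODERATE: some label noise, trust data less
```

Probability estimates are enabled on both so that the risk-based query selection can use `predict_proba`.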
Figure 6: Average test errors for the MODERATE dataset with different values of C

For the EASY dataset

For the EASY dataset, the active learner outperforms the random learner both in prediction accuracy and in the speed at which it reaches its best performance. The active learner makes around 77 errors out of 1000 predictions, while the random learner makes around 92 out of 1000. These are the final performances of both learners: in Fig. 2, the average over 10 runs, the curves of both learners flatten toward the end. The averages also show that at the beginning the active learner may perform slightly worse than the random learner. This is because the active learner queries the most informative images, in other words the riskiest ones, which tend to lie near the decision boundaries, while the random learner picks images to query at random, which can temporarily put the active learner behind. However, as the amount of labeled points increases, it becomes clear that the active learner outperforms the random one.
What's more, the success rate of the active learner reaches its peak of 92.4% well before the random learner reaches its peak of 91.1%, showing that the active learner learns faster than the random one.

For the MODERATE dataset

For the MODERATE dataset, I set the penalty parameter C to 0.9 to reduce the influence of noise. Fig. 4 shows that the active learner outperforms the random learner both in prediction accuracy and in learning speed. The active learner makes around 150 errors out of 1000 predictions, while the random learner makes around 166 out of 1000. As in the EASY set, the averages show the active learner performing slightly worse at the beginning, but overall it outperforms the random one. Performance on the MODERATE set is not as good as on the EASY set because the MODERATE training set contains a certain amount of noise. In addition, the active learner's success rate peaks at 85% while the random learner's peaks at 83.6%, and the active learner's curve is already smooth by the time both approach their peaks, again showing that the active learner learns faster.

For the DIFFICULT dataset

For the DIFFICULT dataset, I performed feature selection both before and within each iteration of active learning. I applied a tree-based feature selection method from Python's "sklearn.feature_selection" [4]; each selection is done before training the active and random learners, to exclude the negative influence of unrelated features. From Fig. 7 below, the number of related features in the DIFFICULT training set is around 23 to 26, meaning nearly half of the features are unrelated; these are successfully excluded by my feature selection process.
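A minimal sketch of such a tree-based selection step using sklearn (the estimator choice, `ExtraTreesClassifier`, and the default thresholding at the mean importance are my assumptions; the report only states that a tree-based method from `sklearn.feature_selection` was used):

```python
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel

def select_features(X, y):
    """Fit an ensemble of randomized trees and keep only the features
    whose importance exceeds the mean importance (SelectFromModel's default)."""
    trees = ExtraTreesClassifier(n_estimators=50, random_state=0).fit(X, y)
    selector = SelectFromModel(trees, prefit=True)
    return selector.transform(X), selector.get_support()
```

Running the selection before each retraining, as described above, lets the kept feature set adapt as the labeled pool grows.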
Figure 7: Numbers of selected features for the active and random learners at the end

In addition, Fig. 5 shows that the active learner makes around 130 errors out of 1000 predictions, while the random learner makes around 153 out of 1000. The active learner outperforms the random learner for nearly the whole run, with a peak accuracy of 87% versus the random learner's 84.7%.
References

1. Joshi, Ajay J., Fatih Porikli, and Nikolaos P. Papanikolopoulos. "Scalable active learning for multiclass image classification." IEEE Transactions on Pattern Analysis and Machine Intelligence 34.11 (2012): 2259-2273.
2. http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn-svm-svc
3. http://stats.stackexchange.com/questions/31066/what-is-the-influence-of-c-in-svms-with-linear-kernel
4. http://scikit-learn.org/stable/modules/feature_selection.html