İstanbul Kültür University – Department of Computer Engineering
Final Project Proposal and Appointment Form
Advisor : Uğur Ayan
Academic Year : 2009-2010 Semester: Fall Spring Summer
Student No: Name & Surname: E-mail Address: Signature:
1. Student:
2. Student:
PROJECT TITLE : Semi-Supervised Machine Learning Toolbox
DESCRIPTION : In computer science, semi-supervised learning is a class of machine learning
techniques that make use of both labeled and unlabeled data for training - typically a small amount of
labeled data with a large amount of unlabeled data. Semi-supervised learning falls between
unsupervised learning (without any labeled training data) and supervised learning (with completely
labeled training data). Many machine-learning researchers have found that unlabeled data, when
used in conjunction with a small amount of labeled data, can produce considerable improvement in
learning accuracy. The acquisition of labeled data for a learning problem often requires a skilled
human agent to manually classify training examples. The cost associated with the labeling process
thus may render a fully labeled training set infeasible, whereas acquisition of unlabeled data is
relatively inexpensive. In such situations, semi-supervised learning can be of great practical value.
One example of a semi-supervised learning technique is co-training, in which two or possibly more
learners are each trained on a set of examples, but with each learner using a different, and ideally
independent, set of features for each example.
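
The co-training loop described above can be sketched as follows. This is a minimal illustration, not the procedure from any of the recommended resources: the nearest-centroid base learner, the two one-dimensional views, and the names fit_centroids, predict_with_margin and co_train are all assumptions made for the example.

```python
def fit_centroids(values, labels):
    """Return {label: mean of the 1-D feature values carrying that label}."""
    sums, counts = {}, {}
    for v, y in zip(values, labels):
        sums[y] = sums.get(y, 0.0) + v
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_with_margin(centroids, v):
    """Return (nearest label, confidence margin); assumes at least two classes."""
    ranked = sorted(centroids, key=lambda y: abs(v - centroids[y]))
    best, runner_up = ranked[0], ranked[1]
    margin = abs(v - centroids[runner_up]) - abs(v - centroids[best])
    return best, margin


def co_train(labeled, unlabeled, rounds=10, per_round=1):
    """Co-training with two views (hypothetical setup).

    labeled:   list of ((view_a, view_b), label)
    unlabeled: list of (view_a, view_b)
    Each learner sees only its own view; in turn, each one labels the
    unlabeled examples it is most confident about and adds them to the
    shared labeled set.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not pool:
                break
            centroids = fit_centroids(
                [x[view] for x, _ in labeled], [y for _, y in labeled]
            )
            scored = [(predict_with_margin(centroids, x[view]), x) for x in pool]
            scored.sort(key=lambda item: item[0][1], reverse=True)
            for (label, _), x in scored[:per_round]:
                labeled.append((x, label))
                pool.remove(x)
    # Final learners, one per view, trained on the enlarged labeled set.
    models = [
        fit_centroids([x[view] for x, _ in labeled], [y for _, y in labeled])
        for view in (0, 1)
    ]
    return models, labeled
```

A call such as co_train([((0.0, 0.1), "neg"), ((1.0, 0.9), "pos")], [(0.1, 0.0), (0.9, 1.0)]) returns the two per-view models together with the enlarged labeled set. In practice the base learner and the confidence measure would be chosen to suit the data.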
An alternative approach is to model the joint probability distribution of the features and the labels. For
the unlabeled data, the labels can then be treated as 'missing data'. It is common to use the
expectation-maximization (EM) algorithm to maximize the likelihood of the model.
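
The joint-model approach can be illustrated with a small sketch, assuming one-dimensional features and one Gaussian per class (both are assumptions for the example, as are the names gaussian_pdf and semi_supervised_em). Labeled points keep fixed class memberships, while the memberships of unlabeled points are re-estimated in every E-step.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def semi_supervised_em(labeled, unlabeled, n_classes=2, iterations=50, min_var=1e-6):
    """EM for a class-conditional Gaussian model with missing labels.

    labeled:   list of (x, class_index); every class needs at least one example
    unlabeled: list of x
    Returns (priors, means, variances), one entry per class.
    """
    xs_l = [x for x, _ in labeled]
    ys_l = [y for _, y in labeled]
    all_x = xs_l + list(unlabeled)

    # Initialisation: class means from the labeled data, shared pooled variance.
    grand_mean = sum(all_x) / len(all_x)
    pooled_var = max(sum((x - grand_mean) ** 2 for x in all_x) / len(all_x), min_var)
    priors = [ys_l.count(k) / len(ys_l) for k in range(n_classes)]
    means = [sum(x for x, y in labeled if y == k) / ys_l.count(k) for k in range(n_classes)]
    variances = [pooled_var] * n_classes

    # Labeled points have known (one-hot) responsibilities that never change.
    fixed = [[1.0 if y == k else 0.0 for k in range(n_classes)] for y in ys_l]

    for _ in range(iterations):
        # E-step: posterior class probabilities for the unlabeled points only.
        soft = []
        for x in unlabeled:
            joint = [priors[k] * gaussian_pdf(x, means[k], variances[k])
                     for k in range(n_classes)]
            total = sum(joint)
            if total > 0.0:
                soft.append([j / total for j in joint])
            else:  # numerical underflow: fall back to a uniform posterior
                soft.append([1.0 / n_classes] * n_classes)
        resp = fixed + soft

        # M-step: weighted maximum-likelihood updates over all points.
        for k in range(n_classes):
            weight = sum(r[k] for r in resp)
            priors[k] = weight / len(all_x)
            means[k] = sum(r[k] * x for r, x in zip(resp, all_x)) / weight
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, all_x)) / weight,
                min_var,
            )
    return priors, means, variances
```

With one labeled point per class and a handful of unlabeled points clustered around each, the estimated means move from the labeled points toward the cluster centres. The variance floor min_var is a safeguard added for the sketch and is not part of the EM algorithm itself.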
Before applying for this final project, you must read the papers listed below. Minimum requirements:
applicants must be able to write C, C# or Matlab code for Gaussian Random Fields, the Consistency
Method, Low Density Separation and Active Learning.
RECOMMENDED RESOURCES :
Abney, S., Semisupervised Learning for Computational Linguistics. Chapman & Hall/CRC, 2008.
Blum, A., Mitchell, T. Combining labeled and unlabeled data with co-training. COLT: Proceedings of
the Workshop on Computational Learning Theory, Morgan Kaufmann, 1998, p. 92-100.
Chapelle, O., B. Schölkopf and A. Zien: Semi-Supervised Learning. MIT Press, Cambridge, MA
(2006).
Huang T-M., Kecman V., Kopriva I., "Kernel Based Algorithms for Mining Huge Data Sets,
Supervised, Semisupervised and Unsupervised Learning", Springer-Verlag, Berlin, Heidelberg, 260
pp. 96 illus., Hardcover, ISBN 3-540-31681-7, 2006.
O'Neill, T. J. (1978) Normal discrimination with unclassified observations. Journal of the American
Statistical Association, 73, 821–826.
Zhu, X. Semi-supervised learning literature survey.
Zhu, X., Goldberg, A. Introduction to Semi-Supervised Learning. Morgan & Claypool Publishers, 2009.
Advisor Head of Department
.......................................... ....................................................
ikuBM-formNo : 28.2, Prepared by: Res. Asst. F. Canan Pembe, Approved by: Prof. Dr. Ümit Karakaş