Given $(x_j, y_j)$: a set of $N$ training examples ($j = 1, \dots, N$).
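The error rate of a base classifier $C_i$:
$$\epsilon_i = \sum_{j=1}^{N} w_j \, I\big(C_i(x_j) \neq y_j\big)$$
where $I(p) = 1$ if $p$ is true, and 0 otherwise, and the example weights $w_j$ sum to 1.
The importance of a classifier $C_i$:
$$\alpha_i = \frac{1}{2} \ln\left(\frac{1 - \epsilon_i}{\epsilon_i}\right)$$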
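As a minimal sketch of these two quantities, the helpers below compute a weighted error rate and the corresponding classifier importance. The function names `weighted_error` and `importance` are illustrative, and the code assumes the usual AdaBoost definitions: the error is the weight mass of misclassified examples, and the importance is half the log-odds of being correct.

```python
import math

def weighted_error(weights, predictions, labels):
    """Share of the total weight that falls on misclassified examples."""
    wrong = sum(w for w, p, y in zip(weights, predictions, labels) if p != y)
    return wrong / sum(weights)  # dividing by the total keeps this valid for unnormalized weights

def importance(error):
    """Classifier importance: 0.5 * ln((1 - error) / error), defined for 0 < error < 1."""
    if not 0.0 < error < 1.0:
        raise ValueError("error must lie strictly between 0 and 1")
    return 0.5 * math.log((1.0 - error) / error)
```

Under this definition a classifier with an error near 0 gets a large positive importance, one with an error of exactly 0.5 gets importance 0, and one that does worse than chance gets a negative importance.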
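AdaBoost: the weight update mechanism
$$w_i^{(j+1)} = \frac{w_i^{(j)}}{Z_j} \times \begin{cases} e^{-\alpha_j} & \text{if } C_j(x_i) = y_i \\ e^{\alpha_j} & \text{if } C_j(x_i) \neq y_i \end{cases}$$
where $Z_j$ is the normalization factor, chosen so that $\sum_i w_i^{(j+1)} = 1$, and $w_i^{(j)}$ is the weight for example $(x_i, y_i)$ during the $j$-th round. In this equation $i$ indexes examples and $j$ indexes boosting rounds.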
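The update can be sketched in a few lines, assuming the standard AdaBoost rule: the weight of a correctly classified example is multiplied by exp(-alpha), the weight of a misclassified example by exp(alpha), and the result is divided by the normalization factor so the weights sum to 1. The name `update_weights` is illustrative.

```python
import math

def update_weights(weights, predictions, labels, alpha):
    """Shrink weights of correctly classified examples, grow the rest, then normalize."""
    scaled = [
        w * math.exp(-alpha if p == y else alpha)
        for w, p, y in zip(weights, predictions, labels)
    ]
    z = sum(scaled)  # normalization factor Z
    return [w / z for w in scaled]
```

With four equally weighted examples, one of them misclassified, and alpha = 0.5 * ln(3), the misclassified example ends up holding half of the total weight, so the next round concentrates on it.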
AdaBoost: the algorithm
Let k be the number of boosting rounds and D the set of all N examples.
Initialize the weights for all N examples: $w_j = 1/N$.
For i = 1 to k do
    Create training set D_i by sampling (with replacement) from D according to the weights W.
    Train a base classifier C_i on D_i.
    Apply C_i to all examples in the original set D.
    Update the weight of each example according to the weight update equation.
End for
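The loop above might look as follows in Python. This is a sketch under several assumptions that are not in the slides: the base classifier is a one-dimensional decision stump (`train_stump`), labels are +1 and -1, a round whose weighted error reaches 0.5 is discarded and the weights are reset to uniform, and the final prediction is the importance-weighted vote of the base classifiers (`predict`).

```python
import math
import random

def train_stump(sample):
    """Fit a 1-D decision stump (threshold, polarity) with the fewest errors on `sample`."""
    xs = sorted({x for x, _ in sample})
    # Candidate thresholds: below the minimum, between neighbours, above the maximum.
    thresholds = [xs[0] - 1.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1.0]
    best = None
    for t in thresholds:
        for polarity in (1, -1):
            errors = sum(1 for x, y in sample if (polarity if x > t else -polarity) != y)
            if best is None or errors < best[0]:
                best = (errors, t, polarity)
    _, t, polarity = best
    return lambda x: polarity if x > t else -polarity

def adaboost(data, k, seed=0):
    """Run k boosting rounds on `data`, a list of (x, y) pairs with y in {+1, -1}."""
    rng = random.Random(seed)
    n = len(data)
    weights = [1.0 / n] * n                                # initialize the weights
    ensemble = []
    for _ in range(k):
        sample = rng.choices(data, weights=weights, k=n)   # D_i sampled according to W
        clf = train_stump(sample)                          # train C_i on D_i
        preds = [clf(x) for x, _ in data]                  # apply C_i to all of D
        error = sum(w for w, p, (_, y) in zip(weights, preds, data) if p != y)
        if error >= 0.5:
            weights = [1.0 / n] * n                        # assumption: discard the round, reset weights
            continue
        error = max(error, 1e-10)                          # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1.0 - error) / error)
        ensemble.append((alpha, clf))
        scaled = [w * math.exp(-alpha if p == y else alpha)
                  for w, p, (_, y) in zip(weights, preds, data)]
        z = sum(scaled)
        weights = [w / z for w in scaled]                  # update and normalize the weights
    return ensemble

def predict(ensemble, x):
    """Importance-weighted vote of the base classifiers."""
    score = sum(alpha * clf(x) for alpha, clf in ensemble)
    return 1 if score >= 0 else -1
```

Sampling a fresh training set each round follows the algorithm as stated; a common alternative passes the weights directly to a base learner that supports weighted examples.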
Two “independent” views: split the features into two sets.
Train a classifier on each view.
Each classifier labels unlabeled data, which is then used to train the other classifier.
[Figure: Co-Training Approach. The feature set X = (X1, X2) is split into subset X1 and subset X2. Classification Model One is trained on X1 and Classification Model Two on X2, both starting from the labeled example set L. Each model classifies the unlabeled data, producing new labeled data set 1 and new labeled data set 2, which feed back into training.]
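The co-training loop can be sketched as below. Everything concrete here is an assumption made for illustration: each view is a single numeric feature, the per-view model is a nearest-centroid classifier, the labeled set contains exactly two classes, and in each round a model hands only its single most confident prediction to the other model's training set.

```python
def fit_centroids(rows, view):
    """Nearest-centroid model on one view: mean feature value per class."""
    sums, counts = {}, {}
    for features, label in rows:
        sums[label] = sums.get(label, 0.0) + features[view]
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict_with_margin(centroids, value):
    """Return (label, margin), where margin is the distance gap between the two nearest centroids."""
    ranked = sorted(centroids, key=lambda label: abs(value - centroids[label]))
    best, runner_up = ranked[0], ranked[1]
    margin = abs(value - centroids[runner_up]) - abs(value - centroids[best])
    return best, margin

def co_train(labeled, unlabeled, rounds=5):
    """labeled: list of ((x1, x2), label); unlabeled: list of (x1, x2)."""
    train = [list(labeled), list(labeled)]   # training sets for view 0 and view 1
    pool = list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not pool:
                break
            model = fit_centroids(train[view], view)
            # Pick the unlabeled example this view's model is most confident about.
            scored = [(predict_with_margin(model, x[view]), x) for x in pool]
            (label, _), x = max(scored, key=lambda item: item[0][1])
            pool.remove(x)
            train[1 - view].append((x, label))   # the new label trains the other classifier
    return fit_centroids(train[0], 0), fit_centroids(train[1], 1)
```

Handing labels to the other view is what distinguishes this from self-training, where a model would retrain on its own predictions.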
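Existing 2-step strategy
Step 1: From the positive set P and the unlabeled set U, which contains both positive and negative examples, identify a set of Reliable Negatives (RN). The remainder is Q = U - RN.
Step 2: Use P, RN and Q to build the final classifier iteratively, or use only P and RN to build a classifier.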
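A minimal sketch of the two steps follows. The slides do not say how reliable negatives are chosen or which classifier is used, so both choices here are assumptions for illustration: step 1 treats the unlabeled points farthest from the positive centroid as RN (controlled by the hypothetical parameter `rn_fraction`), and step 2 uses a nearest-centroid classifier that repeatedly moves points of Q classified as negative into RN.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))

def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_step_pu(P, U, rn_fraction=0.3, max_iter=10):
    """Return a classifier (point -> +1 or -1) built from positives P and unlabeled U."""
    pos_c = centroid(P)
    # Step 1 (assumed heuristic): points of U farthest from the positives are reliable negatives.
    ranked = sorted(U, key=lambda u: sq_dist(u, pos_c), reverse=True)
    n_rn = max(1, int(len(U) * rn_fraction))
    RN, Q = ranked[:n_rn], ranked[n_rn:]          # Q = U - RN
    # Step 2: build a classifier from P and RN, then iteratively absorb negatives from Q.
    for _ in range(max_iter):
        neg_c = centroid(RN)
        new_neg = [q for q in Q if sq_dist(q, neg_c) < sq_dist(q, pos_c)]
        if not new_neg:
            break
        RN = RN + new_neg
        Q = [q for q in Q if q not in new_neg]
    neg_c = centroid(RN)
    return lambda x: 1 if sq_dist(x, pos_c) <= sq_dist(x, neg_c) else -1
```

Setting `max_iter=0` gives the second variant named above, a classifier built from P and RN only.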