1. Venkat Java Projects
Mobile:+91 9966499110 Visit:www.venkatjavaprojects.com
Email:venkatjavaprojects@gmail.com
Active Learning From Imbalanced Data: A Solution of Online Weighted Extreme
Learning Machine
In this paper the author introduces a new algorithm to solve the problem of
imbalanced datasets. Imbalance means that, for example, 80% or more of the records
belong to one class and only 20% or less belong to the second class. Because most
records fall in a single class, a classifier trained on such data can give invalid
results: it may predict only the dominant class. Using the technique from this paper
we can compute class labels automatically.
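The 80% and 20% situation described above can be checked with a short sketch. The label list and the helper name `class_shares` are hypothetical and are not part of the paper or the project; the sketch only shows why a classifier that always answers with the largest class looks accurate on imbalanced data.

```python
from collections import Counter

def class_shares(labels):
    """Return the fraction of records that belongs to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}

# Hypothetical label column: 8 records of class "A" and 2 of class "B".
labels = ["A"] * 8 + ["B"] * 2
shares = class_shares(labels)
print(shares)

# A classifier that always predicts the largest class is right on every
# "A" record and wrong on every "B" record.
majority_label = max(shares, key=shares.get)
baseline_accuracy = sum(1 for y in labels if y == majority_label) / len(labels)
print(majority_label, baseline_accuracy)
```

Here the always-"A" classifier is correct on 8 of 10 records even though it never finds class "B", which is the invalid result the paragraph above warns about.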
The example below shows how a class label is identified from a dataset. Suppose the
dataset holds student mark details:
Sub1, Sub2, Sub3
90, 80, 70
if total > 90 then class = A, else if total > 70 && total < 90 then class = B
In this way every record in the dataset gets a class label.
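The rule above can be written as a small function. This is only a sketch of the rule as stated; the function name `label_from_marks` is hypothetical, and because the text gives no class for other totals, the sketch returns `None` in those cases.

```python
def label_from_marks(sub1, sub2, sub3):
    """Apply the rule from the text to the total of three subject marks."""
    total = sub1 + sub2 + sub3
    if total > 90:
        return "A"
    if 70 < total < 90:
        return "B"
    # Assumption: the text defines no class for any other total.
    return None

# The record from the example: 90 + 80 + 70 = 240, which is above 90.
print(label_from_marks(90, 80, 70))
```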
Some datasets may have no class labels for classification and may also be
imbalanced. Using the algorithm from this paper we can calculate labels for all
records so that the classes fall in an acceptable range instead of the 80% and 20%
split described above.
The author of the paper used many datasets, but I am using the seed and yeast
datasets. Below are some records from the seed dataset:
Area, Perimeter, Compactness, kernel_length, kernel_width, asymmetry_coeff, groove, class
15.26, 14.84, 0.871, 5.763, 3.312, 2.221, 5.22, 1
14.88, 14.57, 0.8811, 5.554, 3.333, 1.018, 4.956, 1
21.18, 17.21, 0.8989, 6.573, 4.033, 5.78, 6.231, 2
20.88, 17.05, 0.9031, 6.45, 4.032, 5.016, 6.321, 2
The last value is the class label. Of the four records above, two belong to class 1
and two belong to class 2. Class labels are already present in this dataset, but we
can also compute them efficiently using the algorithm from this paper, called
‘Active Online Weighted Extreme Learning Machine’ (AOW-ELM).
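To make the record layout concrete, the four seed records listed above can be split into feature values and a class label as follows. This is a minimal sketch; the names `SEED_SAMPLE` and `parse_records` are hypothetical and the project's own loading code may differ.

```python
import csv
import io
from collections import Counter

# The four seed records quoted in the text (7 features, then the class label).
SEED_SAMPLE = """\
15.26, 14.84, 0.871, 5.763, 3.312, 2.221, 5.22, 1
14.88, 14.57, 0.8811, 5.554, 3.333, 1.018, 4.956, 1
21.18, 17.21, 0.8989, 6.573, 4.033, 5.78, 6.231, 2
20.88, 17.05, 0.9031, 6.45, 4.032, 5.016, 6.321, 2
"""

def parse_records(text):
    """Split each row into a list of float features and an integer label."""
    features, labels = [], []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        values = [cell.strip() for cell in row]
        features.append([float(v) for v in values[:-1]])
        labels.append(int(values[-1]))
    return features, labels

features, labels = parse_records(SEED_SAMPLE)
print(len(features[0]))   # number of feature columns per record
print(Counter(labels))    # how many records each class has
```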
Many techniques are available to overcome data imbalance, but they have a high
computation cost and yield lower performance, and some existing techniques rely on
humans to assign labels (class names) to the dataset.
To solve the above problems, the algorithm performs the steps below.
The algorithm first chooses a few random records (centroids) whose class labels
are assigned by humans.
Clusters are then formed by measuring the similarity between every record and the
chosen centroids; each record is assigned to the centroid with which it has the
maximum similarity.
In the next step each record is treated as minority (the record matches a
positive-label centroid with maximum similarity) or majority (the record does not
match the centroid and becomes negative). The ratio between majority and minority
gives the weight for that record.
A rank is assigned to each record based on its similarity to the centroids, and the
record then receives the label of its best-matching centroid.
The algorithm continues this process until all records have labels, and it calls
the stop function when no match with the centroids is found.
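The steps above can be sketched roughly as follows. This is a simplified illustration and not the AOW-ELM implementation from the paper: the similarity measure (based on Euclidean distance), the weight formula (largest class count divided by each class count) and the names `similarity`, `label_records` and `class_weights` are all assumptions made for the sketch, and the stop function is not modelled.

```python
import math
from collections import Counter

def similarity(a, b):
    """Assumed similarity: 1 for identical records, smaller as distance grows."""
    return 1.0 / (1.0 + math.dist(a, b))

def label_records(records, seeds):
    """Give each record the label of its most similar human-labelled seed.

    seeds is a list of (feature_vector, label) pairs.
    Returns one (label, similarity) pair per record.
    """
    assigned = []
    for record in records:
        # Rank the seeds for this record, best match first.
        ranked = sorted(
            ((similarity(record, vector), label) for vector, label in seeds),
            key=lambda pair: pair[0],
            reverse=True,
        )
        best_score, best_label = ranked[0]
        assigned.append((best_label, best_score))
    return assigned

def class_weights(labels):
    """Assumed weighting: majority class count divided by each class count."""
    counts = Counter(labels)
    majority = max(counts.values())
    return {label: majority / count for label, count in counts.items()}

# Two seed records from the text play the role of human-labelled centroids.
seeds = [
    ([15.26, 14.84, 0.871, 5.763, 3.312, 2.221, 5.22], 1),
    ([21.18, 17.21, 0.8989, 6.573, 4.033, 5.78, 6.231], 2),
]
# The other two records are treated as unlabelled.
unlabelled = [
    [14.88, 14.57, 0.8811, 5.554, 3.333, 1.018, 4.956],
    [20.88, 17.05, 0.9031, 6.45, 4.032, 5.016, 6.321],
]
assigned = label_records(unlabelled, seeds)
predicted = [label for label, _ in assigned]
print(predicted)

# Hypothetical label list with three records of class 1 and one of class 2.
print(class_weights([1, 1, 1, 2]))
```

With this weighting the smaller class receives the larger weight, which is one common way to keep a minority class from being ignored; the paper's exact weighting may differ.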
Screen shots
Click on the ‘Upload Dataset’ button to upload either the seed or the yeast dataset.
In the above screen I am uploading the ‘seed’ dataset. Then click on the
‘Compute SVM-ACS Label’ button to get the record count of each class.
In the above screen we can see that all three classes 1, 2 and 3 have an equal
number of records, 70 each, and this problem can be solved by executing the
‘AOW-ELM’ algorithm. Now click on the third button, ‘Upload Dataset’, and upload
the data.
In the above screen we can see that there are no class labels 1, 2 and 3; we can
calculate them efficiently by clicking on the ‘Compute AOW-ELM Label’ button.
In the above screen we can see that the three classes have varied numbers of
records, and the classifier can easily predict the label for a given record. Now
click on the ‘View Matrix’ button to see the majority and minority details.
Now upload the yeast dataset and see the results for both algorithms.
In the above screen we can see that two classes, NUC and CYT, have more than 400
records each; we distribute them in an acceptable range using the proposed
algorithm. Now upload the dataset by clicking on the third button.
Now click on the ‘Compute AOW-ELM Label’ button to assign labels.
In the above screen we can see that only one class has more than 400 records. From
the above output we can say that every class contains less than 50% of the records.
Now click on the ‘Label Computation Time Chart’ button to see the time taken to
compute labels for both datasets.
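A computation time like the one shown in the chart can be measured with a small helper. This is only a sketch: `timed` and `dummy_labeler` are hypothetical names, and the dummy function stands in for whichever labelling routine is being measured.

```python
import time

def timed(func, *args):
    """Run func(*args) and return its result with the elapsed seconds."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

def dummy_labeler(records):
    """Placeholder for a real labelling routine: gives every record label 0."""
    return [0 for _ in records]

result, seconds = timed(dummy_labeler, [[1.0, 2.0], [3.0, 4.0]])
print(result, seconds)
```

Calling `timed` once per dataset gives the two values that such a chart would plot side by side.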