5. Dataset Example
• How to create an ARFF file
• How to open a file
• How to see the number of instances and attributes
• How to see the features and their information
Weather_data.arff
4/30/18 5
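The bullets above can be illustrated outside the WEKA GUI with a minimal sketch in plain Python. It is not WEKA's own loader: `parse_arff` is a hypothetical helper that handles only simple, comma-separated ARFF files, and the four data rows are illustrative values written in the style of the weather dataset, not the contents of Weather_data.arff.

```python
import os
import tempfile

# A small ARFF document: header (@relation, @attribute) followed by @data rows.
# The rows are illustrative only.
ARFF_TEXT = """\
@relation weather

@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute humidity numeric
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}

@data
sunny,85,85,FALSE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,65,70,TRUE,no
"""


def parse_arff(text):
    """Return (relation, attributes, rows) for a simple ARFF document."""
    relation = None
    attributes = []  # list of (name, type) pairs
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        lower = line.lower()
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, attr_type = line.split(None, 2)
            attributes.append((name, attr_type))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


# Create the file, then open it again.
path = os.path.join(tempfile.mkdtemp(), "weather_sample.arff")
with open(path, "w") as handle:
    handle.write(ARFF_TEXT)
with open(path) as handle:
    relation, attributes, rows = parse_arff(handle.read())

print("relation:", relation)
print("attributes:", len(attributes), "instances:", len(rows))
for name, attr_type in attributes:
    print(" ", name, "->", attr_type)
```

Quoted values, sparse rows and date attributes are deliberately left out of this sketch.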
6. Creating Training, Validation and Test Sets
Training set: 60%
Validation set: 20%
Test set: 20%
weather.arff
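A 60/20/20 split like the one above can be sketched in plain Python as follows. This is an assumption-laden illustration, not the WEKA procedure: `split_60_20_20` is a hypothetical helper, the instances are stand-in integers, and the seed is arbitrary.

```python
import random


def split_60_20_20(instances, seed=42):
    """Shuffle a copy of the instances and cut it into 60% / 20% / 20%."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    train_end = n * 60 // 100  # integer arithmetic avoids float rounding
    valid_end = n * 80 // 100
    return (shuffled[:train_end],
            shuffled[train_end:valid_end],
            shuffled[valid_end:])


instances = list(range(100))  # stand-in for 100 dataset rows
train, valid, test = split_60_20_20(instances)
print(len(train), len(valid), len(test))
```

Shuffling before cutting matters when the file is sorted by class; without it, one of the three sets could miss a class entirely.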
7. Generating Non-stratified Folds
When I am using k-fold cross-validation, can I get each of the folds
from WEKA?
Stratified folds keep every class of the dataset in every fold and
maintain the class ratio; non-stratified folds are drawn without regard
to class.
supermarket.arff
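What a non-stratified split does can be sketched in plain Python. This is only an illustration of the idea, not WEKA's fold generator; `kfold_indices` is a hypothetical helper and the seed is arbitrary.

```python
import random


def kfold_indices(n_instances, k, seed=1):
    """Shuffle instance indices and deal them into k folds, ignoring class."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


folds = kfold_indices(10, k=5)
for i, test_idx in enumerate(folds):
    # Every fold serves once as the test set; the rest form the training set.
    train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
    print("fold", i, "test:", sorted(test_idx), "train size:", len(train_idx))
```

Because class labels are never consulted, a fold may contain very few instances of a rare class, or none at all.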
8. Generating Stratified Folds
When I am using k-fold cross-validation, can I get each of the folds
from WEKA?
Stratified folds keep every class of the dataset in every fold and
maintain the class ratio.
supermarket.arff
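The stratification idea can be sketched in plain Python by dealing the instances of each class across the folds in turn. This is a simplified illustration, not WEKA's implementation; `stratified_folds` is a hypothetical helper and the labels are made up.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k, seed=1):
    """Assign instance indices to k folds so each fold keeps the class ratio."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Deal this class round-robin so every fold gets an equal share.
        for position, idx in enumerate(members):
            folds[position % k].append(idx)
    return folds


labels = ["yes"] * 12 + ["no"] * 8  # made-up labels with a 60:40 class ratio
folds = stratified_folds(labels, k=4)
for i, fold in enumerate(folds):
    print("fold", i, dict(Counter(labels[idx] for idx in fold)))
```

With 12 "yes" and 8 "no" labels and 4 folds, each fold receives 3 "yes" and 2 "no" instances, which is the same 60:40 ratio as the whole set.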
10. Numeric Transform
Useful when your algorithm works well with integers but not with real numbers.
Diabetes.arff
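The effect of such a transform can be sketched in plain Python. This is an assumption-based illustration rather than the WEKA filter: `transform_numeric` is a hypothetical helper, `math.floor` is just one possible real-to-integer function, and the rows are made-up values, not taken from Diabetes.arff.

```python
import math


def transform_numeric(rows, columns, func=math.floor):
    """Apply func to the selected column indices of every row."""
    return [
        [func(value) if i in columns else value for i, value in enumerate(row)]
        for row in rows
    ]


rows = [[2.7, 10.2], [3.1, 9.9]]  # made-up real-valued data
result = transform_numeric(rows, columns={0})
print(result)  # column 0 is now integer-valued, column 1 is unchanged
```

Passing a different `func`, such as `math.ceil` or `abs`, changes the transform without touching the rest of the code.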
11. Outliers and Extreme Values
Is it possible to find outliers and extreme values that are hidden in a
dataset?
InterQuartileRange
Find outliers and extreme values.
Remove them.
NSL-KDD dataset (ARFF)
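The find-then-remove steps can be sketched in plain Python with an interquartile-range rule. This is a one-attribute illustration, not the WEKA filter itself: `flag_by_iqr` is a hypothetical helper, the factors 3.0 (outlier) and 6.0 (extreme value) are assumptions chosen for this sketch, and the values are made up rather than taken from NSL-KDD.

```python
import statistics


def flag_by_iqr(values, outlier_factor=3.0, extreme_factor=6.0):
    """Split values into (outliers, extremes, kept) using the IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    outliers, extremes, kept = [], [], []
    for value in values:
        # Distance outside the [q1, q3] box; negative when inside it.
        distance = max(q1 - value, value - q3)
        if distance > extreme_factor * iqr:
            extremes.append(value)
        elif distance > outlier_factor * iqr:
            outliers.append(value)
        else:
            kept.append(value)
    return outliers, extremes, kept


values = list(range(10, 21)) + [50, 200]  # made-up attribute values
outliers, extremes, kept = flag_by_iqr(values)
print("outliers:", outliers)
print("extreme values:", extremes)
print("kept after removal:", kept)
```

Flagging and removal are kept as separate results so the flagged values can be inspected before anything is discarded.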