3. Data

Input data frame
Names (# 590, # 2): Class, Timestamp, Unlabeled features
Class (# 1): -1 or 1
Timestamp: 19/07/2008 11:55:00
Timestamp features (# 6): Year, Month, Day of Month, Day of Week, Hour, Minute
Features (# 590): V1 to V590
NaNs: NAs 4.54%
Today's focus
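
The six timestamp features listed on this slide can be derived from a raw stamp such as 19/07/2008 11:55:00. The sketch below is a minimal illustration, not the presenter's code: the function name `split_timestamp` is hypothetical, the day/month/year order is inferred from the example value, and numbering the day of week from 1 (Monday) to 7 (Sunday) is an assumption.

```python
from datetime import datetime


def split_timestamp(stamp: str) -> dict:
    """Split a 'dd/mm/YYYY HH:MM:SS' string into six calendar features."""
    t = datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S")
    return {
        "Year": t.year,
        "Month": t.month,
        "Day of Month": t.day,
        # Assumption: ISO numbering, 1 = Monday ... 7 = Sunday.
        "Day of Week": t.isoweekday(),
        "Hour": t.hour,
        "Minute": t.minute,
    }


features = split_timestamp("19/07/2008 11:55:00")
print(features)
```

Seconds are dropped here because the slide lists only six features down to the minute.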
10.
Up to 1%: #324
2 - 3%: #46
17%: #20
46%: #4
Continuous variables: 5% trimmed mean, #308
Categorical variables: Mode, #16
kNN imputation*
Clean
#52
#48
#4
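
The two simple fills named on this slide, a 5% trimmed mean for continuous variables and the mode for categorical ones, can be sketched as follows. This is an illustration under stated assumptions rather than the presenter's implementation: the helper names are hypothetical, missing values are assumed to be `None` or NaN, and the trimmed mean follows the convention of dropping floor(0.05 * n) observations from each end of the sorted values.

```python
import math
from collections import Counter


def is_missing(v):
    """Treat None and float NaN as missing (an assumption about the encoding)."""
    return v is None or (isinstance(v, float) and math.isnan(v))


def trimmed_mean(values, trim=0.05):
    """Mean of the observed values after dropping floor(trim * n) from each end."""
    observed = sorted(v for v in values if not is_missing(v))
    if not observed:
        raise ValueError("no observed values to average")
    cut = int(len(observed) * trim)
    kept = observed[cut:len(observed) - cut]
    return sum(kept) / len(kept)


def mode(values):
    """Most frequent observed value; ties go to the value seen first."""
    observed = [v for v in values if not is_missing(v)]
    if not observed:
        raise ValueError("no observed values to count")
    return Counter(observed).most_common(1)[0][0]


def impute(values, fill):
    """Replace every missing entry with the given fill value."""
    return [fill if is_missing(v) else v for v in values]


# Continuous column: 20 observed values (one outlier) and one NaN.
continuous = [float(i) for i in range(1, 20)] + [1000.0, float("nan")]
continuous_filled = impute(continuous, trimmed_mean(continuous))

# Categorical column with one missing entry.
categorical = ["a", "b", "a", None, "c"]
categorical_filled = impute(categorical, mode(categorical))
```

With 20 observed values the trim removes one value from each end, so the outlier 1000.0 does not affect the fill value.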
* For each record, identify the missing features. For each missing feature, find the k nearest neighbors that have that feature. Impute the missing value by applying the imputation function to the k-length vector of values found from the neighbors. (Source: R package imputation v2.0.3 by Jeffrey Wong)
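
The kNN procedure described above can be sketched in a few lines. This is not the code of the R package; it is a minimal illustration in which the function names, the distance measure (root mean squared difference over features observed in both records), and the default imputation function (the mean) are all assumptions.

```python
import math


def is_missing(v):
    """Treat None and float NaN as missing (an assumption about the encoding)."""
    return v is None or (isinstance(v, float) and math.isnan(v))


def distance(a, b):
    """Root mean squared difference over features observed in both records."""
    shared = [(x, y) for x, y in zip(a, b) if not is_missing(x) and not is_missing(y)]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def mean(values):
    return sum(values) / len(values)


def knn_impute(rows, k=3, impute_fn=mean):
    """Fill each missing cell from the k nearest records that have that feature."""
    result = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if not is_missing(value):
                continue
            # Candidate neighbors: other records where feature j is observed.
            candidates = []
            for m, other in enumerate(rows):
                if m == i or is_missing(other[j]):
                    continue
                d = distance(row, other)
                if math.isfinite(d):
                    candidates.append((d, other[j]))
            candidates.sort(key=lambda c: c[0])
            neighbor_values = [v for _, v in candidates[:k]]
            if neighbor_values:
                # Apply the imputation function to the (at most) k-length vector.
                result[i][j] = impute_fn(neighbor_values)
    return result


rows = [
    [1.0, 2.0, float("nan")],
    [1.0, 2.0, 10.0],
    [1.1, 2.1, 12.0],
    [9.0, 9.0, 100.0],
]
filled = knn_impute(rows, k=2)
```

Distances are always computed from the original records, so a value imputed for one cell never influences the neighbors chosen for another. A cell stays missing if no other record has that feature.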