Data

Published in: Technology

  1. DATA FOR DATA MINING
  2. Types of variables
     Variables can be divided into two main types, plus an optional third category:
       • Categorical attributes: nominal, binary and ordinal variables
       • Continuous attributes: integer, interval-scaled and ratio-scaled variables
       • Ignore attribute (optional): variables that are of no significance
  3. Data Cleaning
     Erroneous values can be divided into:
       • Noisy values: valid for the dataset, but incorrectly recorded
       • Invalid values: can be easily detected and removed or corrected
     Noise detection:
       • Peaks in the dataset
       • Values outside the normal range; such values may be genuine, in which case they are called outliers
  4. Missing Values
     Reasons for occurrence:
       • Equipment malfunction
       • Additional fields were added later
       • Non-availability of information
     Strategies to deal with missing values:
       • Discard instances
       • Replace by the most frequent or average value
  5. Visit more self help tutorials
     Pick a tutorial of your choice and browse through it at your own pace.
     The tutorials section is free, self-guiding and will not involve any additional support.
     Visit us at www.dataminingtools.net
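
The cleaning steps from the Data Cleaning and Missing Values slides can be sketched in a few lines of Python. This is only an illustration, not something taken from the slides: the function names, the use of `None` to mark a missing value, and the `low`/`high` limits of the "normal range" are all assumptions. Out-of-range values are only flagged, not deleted, since they may be genuine outliers.

```python
from collections import Counter
from statistics import mean


def flag_out_of_range(values, low, high):
    """Return the indices of values outside [low, high].

    low and high are assumed, domain-specific limits of the normal range.
    Missing entries (None) are skipped here and handled separately.
    """
    return [
        i for i, v in enumerate(values)
        if v is not None and not (low <= v <= high)
    ]


def discard_incomplete(rows):
    """Strategy 1: discard every instance (row) with a missing field."""
    return [row for row in rows if None not in row]


def fill_missing(column, categorical):
    """Strategy 2: replace missing entries in one attribute.

    Uses the most frequent observed value for a categorical attribute
    and the average of the observed values for a continuous one.
    """
    observed = [v for v in column if v is not None]
    if not observed:
        # Nothing to estimate from; return the column unchanged.
        return list(column)
    if categorical:
        fill = Counter(observed).most_common(1)[0][0]
    else:
        fill = mean(observed)
    return [fill if v is None else v for v in column]
```

For example, with a hypothetical age attribute `[23, 31, None, 27, 410]` and an assumed normal range of 0 to 120, `flag_out_of_range` points at the last entry for manual review, while `fill_missing([23, 31, None, 27], categorical=False)` replaces the gap with the average of the three observed ages. Discarding instances is simpler but loses data, so replacement is usually preferable when many rows have only one or two gaps.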
