
# Data Mining Techniques

### Data Mining Techniques

1. Data Mining Techniques
2. STATISTICAL PERSPECTIVE ON DATA MINING
   - Point estimation:
     - The process of estimating a population parameter by computing an estimate of it from sample data.
     - It can be used to estimate the mean, variance, standard deviation, etc.
     - Estimation techniques may also be used to estimate values for missing data.
     - Two measures of the effectiveness of an estimate are the mean squared error (MSE) and the root mean square (RMS) error.
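As a minimal sketch of point estimation, the Python below estimates the mean, variance, and standard deviation from a sample, scores a set of estimates with MSE and RMS error, and fills a missing value with the sample mean. The sample values and the helper names (`mse`, `rms_error`, `fill_missing_with_mean`) are illustrative assumptions, not part of the slides.

```python
import math
from statistics import mean, stdev, variance


def mse(estimates, true_value):
    """Mean squared error of several estimates against a known true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def rms_error(estimates, true_value):
    """Root mean square error: the square root of the MSE."""
    return math.sqrt(mse(estimates, true_value))


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    estimate = mean(observed)
    return [estimate if v is None else v for v in values]


# Hypothetical sample, used only for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

print("mean estimate:", mean(sample))
print("variance estimate (n - 1 denominator):", variance(sample))
print("standard deviation estimate:", stdev(sample))
print("MSE of estimates [4, 6] for a true value of 5:", mse([4, 6], 5))
print("RMS error of the same estimates:", rms_error([4, 6], 5))
print("missing value filled:", fill_missing_with_mean([2, None, 4]))
```

Using the mean to fill a missing value is only one possible estimator. The median or a model-based estimate may suit skewed data better.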
3. STATISTICAL PERSPECTIVE ON DATA MINING
   - Models based on summarization:
     - Several basic, well-known statistical concepts summarize data: mean, variance, standard deviation, median, and mode.
     - Fitting a population to a specific frequency distribution provides an even better model of the data.
     - Well-known techniques for displaying or summarizing data include the histogram, box plot, range, quartiles, etc.
4. STATISTICAL PERSPECTIVE ON DATA MINING
   - Bayes' theorem
     - A technique for estimating the likelihood of a property given a set of data as evidence or input.
   - Hypothesis testing
     - Attempts to find a model that explains the observed data by first creating a hypothesis and then testing the hypothesis against the data.
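To make Bayes' theorem concrete, here is a small sketch that computes P(h | x) = P(x | h) P(h) / P(x) over a set of competing hypotheses. The function name `posterior` and all probabilities in the example are hypothetical values chosen for illustration.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' theorem to a set of mutually exclusive hypotheses.

    priors:      {hypothesis: P(h)}
    likelihoods: {hypothesis: P(evidence | h)}
    Returns      {hypothesis: P(h | evidence)}
    """
    # P(evidence) by the law of total probability.
    evidence = sum(priors[h] * likelihoods[h] for h in priors)
    if evidence == 0:
        raise ValueError("the evidence has zero probability under every hypothesis")
    return {h: priors[h] * likelihoods[h] / evidence for h in priors}


# Hypothetical numbers: prior share of each credit rating, and the chance of
# observing a high income within each rating.
priors = {"GOOD": 0.7, "BAD": 0.3}
likelihood_high_income = {"GOOD": 0.4, "BAD": 0.1}

result = posterior(priors, likelihood_high_income)
for rating, probability in result.items():
    print(f"P({rating} | high income) = {probability:.3f}")
```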
5. CLASSIFICATION (I). Azizi Bin Ab Aziz, Artificial Intelligence Unit, Dept. of Computer Science, Faculty of Information Technology, Universiti Utara Malaysia. E-mail: aziziaziz@uum.edu.my. URL: www.aisig.uum.edu.my/aziziabaziz
6. CLASSIFICATION
   - Classification: “to categorize data into respective and well known classes/groups”.
   - Prediction: a type of classification technique that deals with continuous data (classification deals with discrete data).
   - Class: a predetermined group (equivalence class).
   - Dataset: training data (used to build the classification model) and testing data (unseen data used to evaluate the generated model).
   - Classification process:
     - 1) Model construction from training data.
     - 2) Model evaluation using testing data.
7. CLASSIFICATION PROCESS
   - Training data (input to the classification algorithm):

     | Name | Income | Age | Credit Rating |
     |------|--------|-----|---------------|
     | Mike | low | 28 | BAD |
     | Mary | medium | 35 | GOOD |
     | Jim | high | 27 | GOOD |
     | Dave | medium | 35 | GOOD |
     | Anne | medium | 26 | BAD |

   - Classification model (output of the classification algorithm): IF Income = “HIGH” AND Age < 30 THEN Credit Rating = “GOOD” ..
   - Testing data: (Jeff, medium, 35). Result: Credit Rating = “GOOD”
8. CLASSIFICATION ALGORITHM
   - What will we learn? Basic classification algorithms:
     - Statistical (linear regression, logistic regression, time-series analysis)
     - Decision tree
     - Neural networks
9. Statistical: Linear Regression
   - Regression analysis is used primarily for prediction in bivariate relationships.
   - Main components:
     - Dependent variable: the response variable.
     - Independent variable: the explanatory variable.
     - Correlation analysis: measures the strength of the relationship between the measured variables.
     - Visualization: using a scatter diagram.
     - Basis: the least-squares method.
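10. Linear Regression (Example)

    | Set | x | y | x*x | xy | y*y |
    |-----|---|---|-----|----|-----|
    | 1 | 2 | 2 | 4 | 4 | 4 |
    | 2 | 3 | 5 | 9 | 15 | 25 |
    | 3 | 4 | 7 | 16 | 28 | 49 |
    | 4 | 5 | 10 | 25 | 50 | 100 |
    | 5 | 6 | 11 | 36 | 66 | 121 |
    | Total | 20 | 35 | 90 | 163 | 299 |

    Fitted line: Y = -2.2 + 2.3X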
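11. Linear Regression (Example). Plot of the line Y = -2.2 + 2.3X with X on the horizontal axis and Y on the vertical axis; the values -2.2 and 0.96 are marked on the plot. Prediction: if X = 9, then Y = -2.2 + 2.3(9) = 18.5.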
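The fitted line in this example can be reproduced with the usual least-squares formulas, slope = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) and intercept = (Σy − slope·Σx) / n. The sketch below applies them to the five (x, y) pairs from the example table; the function name `fit_line` is an illustrative choice.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for paired data."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


# The five data sets from the example table.
xs = [2, 3, 4, 5, 6]
ys = [2, 5, 7, 10, 11]

intercept, slope = fit_line(xs, ys)
prediction = intercept + slope * 9  # predict Y for X = 9

print(f"Y = {intercept:.1f} + {slope:.1f}X")
print(f"Prediction for X = 9: {prediction:.1f}")
```

With the totals from the table (Σx = 20, Σy = 35, Σx² = 90, Σxy = 163), the slope is 115/50 = 2.3 and the intercept is (35 − 46)/5 = −2.2, which agrees with the line on the slide.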
12. Statistical: Regression
    - Example: OLAP example
      - Life insurance promotion is the attribute whose value is determined by a linear combination of the attributes credit card insurance and sex.
      - Life insurance promotion = 0.5909(credit card insurance) - 0.5455(sex) + 0.7727
      - Suppose we wish to determine whether a female who does not have credit card insurance is a likely candidate for the life insurance promotion.
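The regression equation above can be evaluated directly once the attributes are coded as numbers. The slides do not state the coding, so the sketch below assumes credit card insurance is 1 for yes and 0 for no, and sex is 1 for male and 0 for female. Both codings, and the function name, are assumptions made for illustration.

```python
def life_insurance_promotion_score(credit_card_insurance, sex):
    """Evaluate the regression equation from the slide.

    Assumed coding (not given in the slides):
      credit_card_insurance: 1 = yes, 0 = no
      sex:                   1 = male, 0 = female
    """
    return 0.5909 * credit_card_insurance - 0.5455 * sex + 0.7727


# A female without credit card insurance, under the assumed coding.
score = life_insurance_promotion_score(credit_card_insurance=0, sex=0)
print(f"score = {score:.4f}")
```

Under this assumed coding both terms vanish and the score equals the constant 0.7727. Whether that marks a likely candidate depends on the cut-off chosen for the score, which the slide does not give.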
13. Statistical: Time Series Analysis
    - Time series: a collection of data obtained by observing a response variable at periodic points in time.
    - Variable: $Y_t$ refers to the value of the variable at time $t$.
    - Time series components:
      - Secular trend: the tendency of the series to increase or decrease over a long period of time (long-term trend).
      - Cyclical fluctuation: a wavelike or oscillating pattern attributable to business or economic conditions (business cycle).
      - Seasonal variation: fluctuations that recur during specific times of the year (e.g., a festive season).
14. Secular Trend: the number of employees at Home Depot Inc. from 1993 to 2002 shows an increasing trend.
15. Cyclical Fluctuation: the number of batteries sold by National Battery Retailers from 1984 to 2003 shows cyclical fluctuation.
16. Secular Trend: the number of manufactured home shipments in the US from 1990 to 2002 shows an increasing trend from 1990 to 1996, then remains almost constant until 1999, and then declines until 2002.
17. Seasonal Variation: quarterly sales of baseball and softball equipment by Hercher Sporting Goods from 2001 to 2003 show a seasonal variation each year.
18. Time Series Data & Graph (Secular Trend). Data source: U.S. Prime Interest Rate (Bureau of Economic Analysis)

    | Year | Rate |
    |------|------|
    | 1975 | 7.86 |
    | 1976 | 6.84 |
    | 1977 | 6.82 |
    | 1978 | 9.06 |
    | 1979 | 12.67 |
    | 1980 | 15.27 |
    | 1981 | 6.9 |
    | 1982 | 10.67 |
    | 1983 | 5.89 |
    | 1984 | 6.77 |
    | 1985 | 15.8 |
19. Time Series Analysis: Smoothing Method
    - Smoothing method: a method for removing the rapid fluctuations that sometimes occur in a time series.
    - Techniques: the most widely used are the moving average method and the exponential smoothing method.
    - Moving average: identifies the long-term trend of a time series (averaging neutralizes the effect of rapid fluctuations), using an N-point moving average.
    - Exponential smoothing: similar to the moving average, but with an emphasis on reducing random effects (using an exponential smoothing constant).
20. Smoothing: Moving Average Method
    - N-point moving average: the average of the time series values over N adjacent time periods.
    - Plot the “smooth” time series curve that clearly depicts the long-term trend.
    - Algorithm:
      - Select N, the number of adjacent time series values to average (the values must be equally spaced in time).
      - Calculate the N-point moving total (the sum of N adjacent values).
      - Compute the N-point moving average, $M_t$, by dividing the moving total by N.
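21. Moving Average Method: moving average for N = 2 and N = 3

    | Time | Value | 2-point total | 3-point total | 2-point average | 3-point average |
    |------|-------|---------------|---------------|-----------------|-----------------|
    | 1 | 2 | | | | |
    | 2 | 10 | 12 | | 6 | |
    | 3 | 12 | 22 | 24 | 11 | 8 |
    | 4 | 8 | 20 | 30 | 10 | 10 |
    | 5 | 15 | 23 | 35 | 11.5 | 11.67 |

    - 2-point totals: 10+2 = 12, 10+12 = 22, 12+8 = 20, 8+15 = 23
    - 3-point totals: 2+10+12 = 24, 10+12+8 = 30, 12+8+15 = 35
    - 2-point averages: 12/2 = 6, 22/2 = 11, 20/2 = 10, 23/2 = 11.5
    - 3-point averages: 24/3 = 8, 30/3 = 10, 35/3 = 11.67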
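The N-point moving average can be sketched in a few lines of Python. The version below uses the trailing alignment seen in the example, where each average is reported at the last of its N time periods. The function name `moving_average` is an illustrative choice.

```python
def moving_average(values, n):
    """Trailing N-point moving average of an equally spaced series.

    The first result covers values[0:n], so the output has
    len(values) - n + 1 entries.
    """
    if n < 1 or n > len(values):
        raise ValueError("n must be between 1 and the length of the series")
    averages = []
    for end in range(n, len(values) + 1):
        moving_total = sum(values[end - n:end])  # N-point moving total
        averages.append(moving_total / n)        # divide the total by N
    return averages


# Values at times 1 to 5 from the example.
series = [2, 10, 12, 8, 15]

two_point = moving_average(series, 2)
three_point = moving_average(series, 3)

print("2-point:", two_point)
print("3-point:", [round(m, 2) for m in three_point])
```

A larger N gives a smoother curve but leaves fewer points, since the first N − 1 periods have no average.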
22. Example of a Moving Average Plot
23. Smoothing: Exponential Smoothing Method
    - A smoothing technique that uses weighted values and takes all observed data into account.
    - More recently observed values receive higher weight.
    - Equation (standard form): $E_1 = Y_1$ and $E_t = W Y_t + (1 - W) E_{t-1}$ for $t > 1$.
    - $E$: smoothed series; $W$: smoothing constant (weight), with $0 \le W \le 1$.
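A minimal sketch of exponential smoothing, assuming the standard recurrence E_1 = Y_1 and E_t = W·Y_t + (1 − W)·E_(t−1), where E is the smoothed series and W the smoothing constant. The function name, the sample series, and the choice W = 0.5 are illustrative assumptions.

```python
def exponential_smoothing(values, w):
    """Exponentially smoothed series with smoothing constant w (0 <= w <= 1).

    Assumes the standard recurrence:
      E_1 = Y_1
      E_t = w * Y_t + (1 - w) * E_(t-1)
    """
    if not values:
        raise ValueError("the series must not be empty")
    if not 0 <= w <= 1:
        raise ValueError("w must lie between 0 and 1")
    smoothed = [values[0]]
    for y in values[1:]:
        smoothed.append(w * y + (1 - w) * smoothed[-1])
    return smoothed


# Hypothetical series and smoothing constant, for illustration only.
series = [2, 10, 12, 8, 15]
smoothed = exponential_smoothing(series, w=0.5)
print(smoothed)
```

Because each smoothed value folds in the previous one, every past observation contributes, with weights that shrink geometrically with age. A W close to 1 follows the raw series closely, while a W close to 0 smooths more heavily.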
24. Example of an Exponential Smoothing Plot. Note: $W_n = 0.75$, $W_{n-1} = 0.25$, using a two-year weighted moving average.
25. Comparison of Smoothing Methods
26. Comparing Classification / Prediction Methods
    - Accuracy: the ability of a classifier to correctly predict the class label of new or previously unseen data.
    - Speed: the computational cost of generating and using the given classifier.
    - Robustness: the ability of the classifier or predictor to make correct predictions given noisy data or data with missing values.
    - Scalability: the ability to construct the classifier efficiently given a large amount of data.
    - Interpretability: the level of understanding and insight provided by the classifier or predictor.