Introduction to Data mining

- 1. Introduction to Data Mining
- 2. What is Data Mining
  - Non-trivial extraction of implicit, previously unknown, and potentially useful information from data
  - Exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns
  - The process of automatically discovering useful information in large data repositories
- 3. Simple Examples for Data Mining
  - Predicting whether a newly arrived customer will spend more than $100 at a department store.
  - Grouping together similar documents returned by a search engine according to their context (e.g. Amazon rainforest, Amazon.com).
- 4. Why Data Mining
  - Credit ratings/targeted marketing:
    - Given a database of 100,000 names, which persons are the least likely to default on their credit cards?
    - Identify likely responders to sales promotions.
  - Fraud detection:
    - Which types of transactions are likely to be fraudulent, given the demographics and transactional history of a particular customer?
- 5. Origins of Data Mining
  - Draws ideas from machine learning/AI, pattern recognition, statistics, and database systems
  - Traditional techniques may be unsuitable due to:
    - Enormity of data
    - High dimensionality of data
    - Heterogeneous, distributed nature of data
- 6. Data Mining Tasks
  - Prediction methods: use some variables to predict unknown or future values of other variables.
  - Description methods: find human-interpretable patterns that describe the data.
- 7. Data Mining Tasks
  - Classification [Predictive]
  - Clustering [Descriptive]
  - Association Rule Discovery [Descriptive]
  - Sequential Pattern Discovery [Descriptive]
  - Regression [Predictive]
  - Deviation Detection [Predictive]
- 8. Classification: Definition
  - Classification is used for discrete target variables.
  - Ex: predicting whether a Web user will make a purchase at an online store is a classification task because the target variable is binary-valued.
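
As a rough illustration of the purchase example above, the following sketch classifies a visitor with a k-nearest-neighbours vote. The slides do not name a classification algorithm, so the choice of k-NN, the two features (`pages_viewed`, `minutes_on_site`), and the training data are all assumptions made up for this example.

```python
import math
from collections import Counter

# Hypothetical training data: (pages_viewed, minutes_on_site) -> purchased?
TRAIN = [
    ((1, 0.5), False),
    ((2, 1.0), False),
    ((3, 2.0), False),
    ((8, 12.0), True),
    ((10, 15.0), True),
    ((12, 9.0), True),
]

def predict_purchase(visitor, k=3):
    """Predict the binary target by majority vote among the k nearest training points."""
    nearest = sorted(TRAIN, key=lambda item: math.dist(item[0], visitor))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(predict_purchase((9, 11.0)))  # a visitor resembling past buyers
print(predict_purchase((2, 1.5)))   # a visitor resembling past non-buyers
```

The target is binary (`True`/`False`), which is what makes this a classification task rather than a regression task.
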
- 9. Clustering: Definition
  - Cluster analysis seeks to find groups of closely related observations, so that observations that belong to the same cluster are more similar to each other than to observations that belong to other clusters.
  - Ex: finding areas of the ocean that have a significant impact on the Earth's climate.
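
One common way to form such groups is k-means, sketched below. The slides do not prescribe a clustering algorithm; k-means, the naive "first k points" initialisation, and the sample points are assumptions chosen only to keep the example short and deterministic.

```python
import math

def kmeans(points, k, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    centroids = list(points[:k])  # naive initialisation (an assumption for this sketch)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = mean of the cluster; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical 2-D observations forming two visibly separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centroids, clusters = kmeans(points, k=2)
print(clusters)
```

Points in the same cluster end up closer to their own centroid than to the other one, which is the "more similar to each other" property in the definition.
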
- 10. Association Rule Discovery: Definition
  - Given a set of records, each of which contains some number of items from a given collection, produce dependency rules that predict the occurrence of an item based on the occurrences of other items.
- 11. Contd…
  - Rules discovered:
    - {Milk} --> {Coke}
    - {Diaper, Milk} --> {Beer}
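
Rules like these are usually judged by their support and confidence. The sketch below computes both for the two rules on the slide. The transaction list is hypothetical (the slides do not show the underlying records), and support/confidence are standard measures that the slides themselves do not mention.

```python
# Hypothetical market-basket records; the slide does not list the underlying data.
TRANSACTIONS = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent): support of both over support of the antecedent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

for lhs, rhs in [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]:
    s = support(lhs | rhs, TRANSACTIONS)
    c = confidence(lhs, rhs, TRANSACTIONS)
    print(sorted(lhs), "-->", sorted(rhs), f"support={s:.2f} confidence={c:.2f}")
```

A rule-discovery algorithm would enumerate candidate itemsets and keep only rules above chosen support and confidence thresholds; this sketch only scores rules that are already given.
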
- 12. Sequential Pattern Discovery: Definition
  - Given a set of objects, each associated with its own timeline of events, find rules that predict strong sequential dependencies among different events.
  - (A B) (C) ---> (D E)
- 13. Contd…
  - Rules are formed by first discovering patterns. Event occurrences in the patterns are governed by timing constraints.
  - Pattern: (A B) (C) (D E)
  - Timing constraints shown on the slide: <= xg, > ng, <= ws, <= ms
- 14. Sequential Pattern Discovery: Example
  - In telecommunications alarm logs:
  - (Inverter_Problem Excessive_Line_Current) (Rectifier_Alarm) --> (Fire_Alarm)
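
The sketch below checks whether an ordered pattern of event sets occurs in one object's timeline under timing constraints. It is a simplified illustration: the slides do not define `xg`, `ng`, `ws`, and `ms`, so reading them as maximum gap, minimum gap, window size, and maximum span is an assumption, and the window-size constraint is not modelled at all (each pattern element must appear within a single timestamp). The timestamps in the alarm log are invented.

```python
def matches(timeline, pattern, min_gap=0, max_gap=float("inf"), max_span=float("inf")):
    """Return True if `pattern` occurs, in order, in a time-sorted `timeline`.

    timeline: list of (timestamp, set_of_events), sorted by timestamp.
    pattern:  list of event sets, e.g. [{"A", "B"}, {"C"}].
    Consecutive matched elements must satisfy min_gap < gap <= max_gap, and the
    whole match must fit within max_span.
    """
    def search(elem_idx, start_pos, first_time, prev_time):
        if elem_idx == len(pattern):
            return True
        for pos in range(start_pos, len(timeline)):
            time, events = timeline[pos]
            if not pattern[elem_idx] <= events:
                continue  # this timestamp lacks some required event
            if prev_time is not None:
                gap = time - prev_time
                if gap <= min_gap or gap > max_gap or time - first_time > max_span:
                    continue
            first = time if first_time is None else first_time
            if search(elem_idx + 1, pos + 1, first, time):
                return True  # backtracking: try later positions if this one fails
        return False

    return search(0, 0, None, None)

# Hypothetical alarm log for one device; timestamps are made up.
log = [
    (0, {"Inverter_Problem", "Excessive_Line_Current"}),
    (5, {"Rectifier_Alarm"}),
    (7, {"Fire_Alarm"}),
]
pattern = [{"Inverter_Problem", "Excessive_Line_Current"}, {"Rectifier_Alarm"}, {"Fire_Alarm"}]

print(matches(log, pattern, max_gap=10))  # gaps of 5 and 2 are within the limit
print(matches(log, pattern, max_gap=3))   # the first gap of 5 exceeds the limit
```
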
- 15. Regression
  - Predict the value of a given continuous-valued variable based on the values of other variables, assuming a linear or nonlinear model of dependency.
  - Studied extensively in statistics and in the neural network field.
- 16. Regression: Examples
  - Predicting sales amounts of a new product based on advertising expenditure.
  - Predicting wind velocities as a function of temperature, humidity, air pressure, etc.
  - Time series prediction of stock market indices.
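
For the linear case, the advertising example can be sketched with an ordinary least-squares line fit. The figures below are invented for illustration, and a single-predictor linear model is an assumption; the slides also allow nonlinear models.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept with one predictor."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend and resulting sales, in the same currency unit.
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)
predicted = slope * 5.0 + intercept  # predicted sales for a spend of 5.0
print(slope, intercept, predicted)
```

Unlike the classification sketch, the predicted quantity here is continuous-valued.
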
- 17. Deviation/Anomaly Detection
  - Detect significant deviations from normal behavior.
  - Applications:
    - Credit card fraud detection
    - Network intrusion detection
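
A very simple way to flag "significant deviations" is a z-score test, sketched below on card transaction amounts. The amounts and the threshold of 2 standard deviations are assumptions for illustration; the slides do not specify a detection method, and practical fraud detection uses far richer features.

```python
import statistics

def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds `threshold`."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, so nothing deviates
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Hypothetical transaction amounts for one customer.
amounts = [20, 25, 22, 19, 24, 21, 23, 500]
print(find_anomalies(amounts))
```

Because the outlier itself inflates the mean and standard deviation, robust variants (for example, based on the median) are often preferred; this sketch keeps the plain z-score for brevity.
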
- 18. Visit more self-help tutorials
  - Pick a tutorial of your choice and browse through it at your own pace.
  - The tutorials section is free, self-guiding, and does not involve any additional support.
  - Visit us at www.dataminingtools.net
