
MS SQL Server: Introduction to Data Mining Using SQL Server



Published in: Technology

  2. 2. What is Data Mining?<br />Data mining is the process of analyzing a data set to find patterns.<br />Data mining can also be defined as deriving knowledge from raw data.<br />
  3. 3. Aliases<br />Data mining is also known by the following terms:<br />
  4. 4. Importance of Data Mining<br />The amount of data in the contemporary world is enormous. By studying this data and understanding its trends and patterns, one can understand the underlying system better. Data mining can lead to conclusions that are profitable for an organization, or to decisions that help a librarian manage books better.<br />Pervasiveness of data:<br />CRM (Customer Relationship Management)<br />ERP (Enterprise Resource Planning)<br />Database servers<br />Data Pool<br />Web Server Logs<br />
  5. 5. Data Mining<br />The traditional SQL queries covered so far follow the method of querying the system and then, based on the response, exploring it further.<br />Query and Exploration Method<br />Data Mining Method<br />The data mining methodology therefore takes the opposite direction to that of query methods.<br />Here, the important attribute on which the analysis is based is the ‘name’. Hence, it is called the class.<br />
  6. 6. Applications<br />The applications of data mining cover a wide domain. Data mining can be applied anywhere data is involved. Some real-world applications of data mining are as follows:<br />
  7. 7. Algorithms for Data Mining<br />Data mining systems use a wide variety of algorithms. The four common algorithm types are:<br />
  8. 8. Tasks involved in Data Mining<br />Data mining is divided into the following tasks:<br /><ul><li>Classification</li><li>Clustering</li><li>Association</li><li>Regression</li><li>Forecasting</li></ul>Let us look at each of them.<br />
  13. 13. Classification<br />Classification is the process of sorting items into meaningful groups. Each group is later treated as a single element, and the relations between the groups are analyzed. Simply put, it is the task of assigning a group to each case.<br />Example:<br />Data Set<br />
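To make "assigning a group to each case" concrete, here is a minimal Python sketch of one simple classifier, the 1-nearest-neighbour rule. This is an illustration only, not the algorithm used by SQL Server; the function name `classify`, the `training` data, and the height/weight attributes are all assumptions made up for the example.

```python
import math

def classify(case, labelled_cases):
    """Return the class of the labelled case closest to `case` (1-nearest neighbour)."""
    _, nearest_class = min(
        labelled_cases,
        key=lambda item: math.dist(case, item[0]),  # Euclidean distance to each known case
    )
    return nearest_class

# Hypothetical training set: (height_cm, weight_kg) -> class label.
training = [
    ((150.0, 50.0), "small"),
    ((155.0, 55.0), "small"),
    ((180.0, 85.0), "large"),
    ((185.0, 90.0), "large"),
]

print(classify((152.0, 52.0), training))
print(classify((182.0, 88.0), training))
```

Each new case receives the class of the most similar case that already has a label, which is the simplest way to see classification as learning from labelled examples.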
  14. 14. Clustering<br />Clustering is the process of grouping data items based on some of their attributes.<br />Example:<br />Data Set<br />Clustered based on nearness<br />
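Grouping "based on nearness" can be sketched with a plain k-means loop in Python. This is one possible clustering technique chosen for illustration, not necessarily the one the slides had in mind; the function `k_means`, the sample `points`, and the choice to seed the centroids with the first `k` points are assumptions.

```python
import math

def k_means(points, k, iterations=10):
    """Group 2-D points into k clusters by nearness (plain k-means)."""
    centroids = list(points[:k])  # assumption: seed with the first k points
    clusters = []
    for _ in range(iterations):
        # Assignment step: put every point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return clusters

# Hypothetical data set with two visibly separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
clusters = k_means(points, k=2)
for number, members in enumerate(clusters):
    print(number, members)
```

Unlike the classification case, no labels are supplied here: the groups emerge from the distances between the items alone.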
  15. 15. Data mining algorithms<br />Data mining is a complex methodology that needs advanced algorithms operating on useful data.<br />Data mining algorithms are mainly divided into 2 types:<br />Supervised algorithms<br />Unsupervised algorithms<br />A supervised algorithm needs a target (which may be a set of attributes) to learn against,<br />whereas an unsupervised algorithm has no target and iterates until the boundaries of the problem are reached.<br />
  16. 16. Regression and Forecasting<br />REGRESSION:<br />In some problems, instead of looking for patterns that describe prime attributes (classes), we look for patterns in numerical values.<br />There are 2 types of regression: 1. Linear regression 2. Logistic regression<br />Regression is used to solve many problems, such as predicting sea-wave patterns, temperature, air pressure, and humidity.<br />FORECASTING:<br />As the name suggests, it is the prediction of future data from the data that currently exists.<br />E.g.: Election results forecast<br />
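As a rough illustration of both ideas, the Python sketch below fits a straight line to numerical values by ordinary least squares (linear regression) and then extends that line one step ahead (a very naive forecast). The function `fit_line` and the temperature readings are hypothetical and exist only for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical readings: temperature recorded on five consecutive days.
days = [1, 2, 3, 4, 5]
temps = [20.0, 22.0, 24.0, 26.0, 28.0]

slope, intercept = fit_line(days, temps)
forecast_day_6 = slope * 6 + intercept  # extend the fitted line to the next day
print(slope, intercept, forecast_day_6)
```

Real forecasting models account for seasonality and noise; extending a fitted line is only the simplest case of predicting future data from existing data.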
  17. 17. Steps to take<br />The process of data mining consists of the steps listed below:<br />Data Collection: Collect the data.<br />Data Cleaning: Eliminate unwanted, irrelevant and wrong data.<br />Data Transformation: Change the data into a form that can be used for data mining. The types of data transformation are:<br />Numerical Transformation<br />Grouping<br />Aggregation: Form groups of small data items and handle them as aggregates, which makes the process much easier.<br />Missing Value Handling: Predict missing values, or eliminate the cases that contain them.<br />Removing Outliers: Remove invalid data.<br />Model Building: Build the data mining model.<br />Model Assessment: Test the model with a large amount of data. If the model needs changes, make them immediately.<br />
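The cleaning and transformation steps above can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `raw` purchase records, the rule that a negative amount counts as wrong data, the use of the median to fill missing values, and the outlier limit of 1000.

```python
from statistics import median

# Hypothetical raw records: (customer, purchase_amount); None marks a missing value.
raw = [
    ("ann", 20.0), ("bob", None), ("ann", 30.0), ("bob", 40.0),
    ("cat", -5.0), ("cat", 5000.0), ("cat", 10.0),
]

def clean(records):
    """Data cleaning: drop records whose amount is wrong (negative)."""
    return [(c, a) for c, a in records if a is None or a >= 0]

def fill_missing(records):
    """Missing value handling: replace None with the median of the known amounts."""
    fill = median(a for _, a in records if a is not None)
    return [(c, fill if a is None else a) for c, a in records]

def remove_outliers(records, limit):
    """Removing outliers: drop amounts above an assumed upper limit."""
    return [(c, a) for c, a in records if a <= limit]

def aggregate(records):
    """Aggregation: handle many small items as one total per customer."""
    totals = {}
    for customer, amount in records:
        totals[customer] = totals.get(customer, 0.0) + amount
    return totals

prepared = remove_outliers(fill_missing(clean(raw)), limit=1000.0)
totals = aggregate(prepared)
print(totals)
```

The median is used instead of the mean so that the single extreme amount does not distort the filled-in value; other fill strategies are equally valid.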
  18. 18. What to do next?<br />Microsoft Office 2007 supports a wide variety of data mining tools.<br />Visit the site and download the MS Access 2007 add-on for data mining, then install it.<br />Working with the Access 2007 data mining tools will be covered in the next set of presentations.<br />Summary<br /><ul><li>Data mining</li><li>Applications</li><li>Classification</li><li>Clustering</li><li>Algorithms</li><li>Regression</li><li>Steps involved</li></ul>Visit more self-help tutorials<br />Pick a tutorial of your choice and browse through it at your own pace.<br />The tutorials section is free and self-guided, and does not include any additional support.<br />Visit us at<br />