MS SQL Server: Data Mining Using Office 2007



Published in: Technology

  1. Data Mining Using Office 2007
  2. Overview
     • The Data Mining Client
     • Importing data
     • Exploring data
     • Preparing data
     • The data modeling chunk
     • Usage of models
     • Data mining cell functions
  3. Data Mining Client Introduction
     The Data Mining Add-Ins for Office 2007 comprise three add-ins: the Table Analysis Tools for Excel, the Data Mining Client for Excel, and the Data Mining Templates for Visio.
     The Data Mining Add-Ins package is available as a free download from Microsoft.
     The Data Mining Client is designed to walk you through the data mining process.
  4. Data mining process
  5. Data Mining Client Ribbon
  6. Importing Data
     Data can be imported directly from Access, SQL Server, text files, and XML files.
     Web pages can also be scraped and turned into raw data.
     The Data Preparation chunk of the Data Mining Client contains the Sample Data tool, which can sample external data.
     The tool draws either a percentage or a fixed number of rows at random from a database table or query accessed through Analysis Services.
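The two sampling modes described on this slide (a percentage of rows or a fixed number of rows, drawn at random) can be sketched in Python. This is only an illustration, not the Sample Data tool's implementation: the function name `sample_rows` and its parameters are hypothetical, and the rows are assumed to be already loaded in memory.

```python
import random


def sample_rows(rows, *, percent=None, count=None, seed=None):
    """Return a random sample of rows, sized by percentage or by fixed count.

    Hypothetical helper for illustration; exactly one of `percent`
    (0-100) or `count` (non-negative integer) must be given.
    """
    rows = list(rows)
    if (percent is None) == (count is None):
        raise ValueError("specify exactly one of percent or count")
    if percent is not None:
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        count = round(len(rows) * percent / 100)
    if count < 0:
        raise ValueError("count must be non-negative")
    # Never ask for more rows than exist.
    count = min(count, len(rows))
    rng = random.Random(seed)  # a seed makes the sample repeatable
    return rng.sample(rows, count)  # sampling without replacement
```

Passing a seed makes the draw repeatable, which helps when the same sample has to be rebuilt later for comparison.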
  7. Import data options in Excel
  8. Prepare data
     When preparing data, you start from your hypothesis about the problem you are trying to solve.
     This step involves understanding, shaping, and selecting your data in a way that you believe is pertinent to that problem.
  9. Explore data
     The Explore Data tool shows histograms for discrete and continuous columns. It can also materialize a continuous histogram into a table column.
     For example, instead of treating Age as a continuous number across the range of ages in your data, you could break the ages into discrete buckets that are easier to understand.
  10. The Explore Data tool
      Figure: the Explore Data tool displaying a histogram for the Age column divided into six buckets.
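Breaking a continuous column such as Age into buckets can be sketched as follows. The sketch assumes equal-width buckets, which may differ from the method the Explore Data tool actually uses; the function name `bucketize` is hypothetical.

```python
def bucketize(values, n_buckets):
    """Assign each value to one of n_buckets equal-width buckets.

    Returns (labels, edges): labels[i] is the 0-based bucket index of
    values[i], and edges holds the n_buckets + 1 bucket boundaries.
    Equal-width bucketing is an assumption made for this illustration.
    """
    if n_buckets < 1:
        raise ValueError("n_buckets must be at least 1")
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_buckets
    edges = [lo + i * width for i in range(n_buckets + 1)]
    if width == 0:
        # All values are identical, so everything falls in the first bucket.
        return [0] * len(values), edges
    labels = []
    for v in values:
        # The maximum value would land one past the end; clamp it
        # into the last bucket.
        labels.append(min(int((v - lo) / width), n_buckets - 1))
    return labels, edges


ages = [20, 30, 40, 50, 60, 70, 80]
labels, edges = bucketize(ages, 6)
```

The resulting `labels` list plays the role of the materialized table column: a discrete value per row that stands in for the continuous age.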
  11. The data modeling chunk
      The data modeling chunk provides an environment for building models on your prepared data sets.
  12. Data Modeling Tasks
      Data modeling tasks and the algorithms used for each task.
  13. Modeling task wizard flow
  14. Modeling task wizard flow
      The Introduction page shows helpful text describing the purpose and use of the task wizard.
      The Select Data page is identical to the select data pages of the data exploration and preparation tools. All of the tasks operate on both data inside Excel and data in external databases.
      The Select Columns and Options page is where you specify the columns used for modeling and the options for each task.
  15. Modeling task wizard flow
      The Split Data page is shown for the Classify and Estimate tasks.
      Specifying an amount of data to set aside for testing your model simplifies the entire data mining process.
      The Finish page in each task wizard lets you name the objects that are created and set additional options.
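Setting aside data for testing, as the Split Data page does, amounts to a random holdout split. The Python sketch below illustrates the idea under stated assumptions: the function name `split_holdout` and the 30 percent default are hypothetical and are not taken from the wizard.

```python
import random


def split_holdout(rows, test_percent=30, seed=None):
    """Randomly split rows into (training, testing) lists.

    `test_percent` is the share of rows held out for testing; the
    default of 30 is an assumed value for illustration only.
    """
    if not 0 <= test_percent <= 100:
        raise ValueError("test_percent must be between 0 and 100")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # shuffle a copy, not the input
    n_test = round(len(shuffled) * test_percent / 100)
    # The first n_test shuffled rows become the test set; the rest train.
    return shuffled[n_test:], shuffled[:n_test]


training, testing = split_holdout(range(10), test_percent=30, seed=0)
```

Every row ends up in exactly one of the two sets, so the model is never tested on rows it was trained on.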
  16. Usage of Models
      The Data Mining Client for Excel 2007 add-in provides tools to view, document, and query models, as well as cell functions that let you create interactive predictive workbooks.
      The Data Mining Templates for Visio add-in provides renderers that let you create annotated diagrams from models and save them to web formats.
  17. Data Mining Cell Functions
      Interactive predictive spreadsheets can be created using the three data mining cell functions provided with the Data Mining Client.
      • DMPREDICT returns any predicted result from a model. The function takes a connection, a model, the prediction function, and up to 32 name/value pairs for the input.
  18. Data Mining Cell Functions
      • DMPREDICTTABLEROW is analogous to DMPREDICT, except that it operates on a table row instead of an arbitrary collection of cells. As such, the function takes a range and a list of ordered mappings.
      • DMCONTENTQUERY fetches an arbitrary piece of content from a mining model. It is usually used in conjunction with a cell containing a DMPREDICT or DMPREDICTTABLEROW call that returns PredictNodeID, allowing you to return the reason for a particular prediction. The function takes the model name, the piece of content to be returned, and the filter clause used to specify the content.