MS SQL SERVER: Using the data mining tools
Using the Data Mining Tools
Overview
Introduction to BI Dev Studio
Creating Data Mining Objects
Steps to Create and Edit the Models
Using the Models
Using SQL Server Management Studio
BI Development Studio Business Intelligence Development Studio is the primary environment that you will use to develop business solutions that include Analysis Services, Integration Services, and Reporting Services projects. This environment is integrated into the Microsoft Visual Studio (VS) shell to provide a complete development experience for business intelligence operations. Each project type supplies templates for creating the objects required for business intelligence solutions, and provides a variety of designers, tools, and wizards to work with the objects.
This is where you manage your solution and projects. All objects are created and managed here. To add objects to your project, you right-click the project name and select Add New Item.
These tabs allow you to quickly switch between designer windows. A tab will be displayed for each object or file that is currently open.
BI User Interface
This is where you edit and analyze your objects. Creating a new object or double-clicking an object in Solution Explorer will open that object’s specific designer, allowing you to modify and interact with the object.
Many objects have different aspects that you can edit or interact with. These aspects are indicated by tabs within the Designer window.
This is a context-sensitive window that displays properties for the currently selected item, which is a general concept in VS and applies to any type of operation performed within the studio.
The area on the main menu bar between the Debug menu and the Tools menu is where you will find context-sensitive menus specific to Analysis Services objects.
The Output window displays messages when you build and deploy projects. If there are errors in your project, this is where you will find their descriptions.
Data Mining Objects To perform data mining:
Open your database or project.
Indicate and describe your source data.
Then create mining structures and models.
How to set up data sources? Two objects in Analysis Services act as interfaces to your data:
The data source:
This is essentially a connection string indicating the data location.
The data source view (DSV):
The DSV is an abstraction layer that enables you to modify the way you look at data sources, or even define a schema and switch the actual source at a later time.
What is a Data Source? A data source is a rather simple object. It consists of nothing more than a connection string, plus some additional information indicating how to connect. Two important things to note about data sources are:
Data location:
When you set up your data sources, the data must be accessible not only to the client where you use the tools to build the model, but also to the server where the model will be processed. One way to ensure this is to move the data to a SQL Server database using SQL Server Integration Services (SSIS) before building your models in BI Dev Studio.
User credentials:
The user credentials that are used to access data from Analysis Services play an essential role. Microsoft recommends always using integrated security if it is supported by the source database.
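To make the idea of a data source concrete, here is a minimal sketch that assembles an OLE DB style connection string with integrated security. The helper `build_connection_string`, the server name `dbserver01`, and the database name `SalesDW` are hypothetical, and the provider name is an assumption that depends on what is installed on your machines.

```python
def build_connection_string(server, database, provider="SQLNCLI11"):
    """Assemble an OLE DB style connection string using integrated security.

    All argument values are illustrative; the provider name is an assumption.
    """
    parts = {
        "Provider": provider,
        "Data Source": server,          # must resolve from the Analysis Services server too
        "Initial Catalog": database,
        "Integrated Security": "SSPI",  # Windows authentication, no password in the string
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


conn_str = build_connection_string("dbserver01", "SalesDW")
print(conn_str)
```

Note that a server name such as localhost would point at different machines on the client and on the processing server, which is exactly the data location problem described above.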
Using the data source view The DSV is an abstract client-side view of your data. This is where your modeling begins. The DSV is where you select, organize, explore, and in a sense, manipulate the data in the source. While creating a DSV for data mining purposes, the most important table to identify is your case table. This is the table that contains the cases you want to analyze. You must also bring in any related tables that provide additional information about your cases.
Using the data source view The DSV Designer initially displays a diagram of the tables in your data source and the relationships between them. You can use the DSV Designer later to explore the data and alter it to the shape you need for your models.
Named Calculations in DSV These are additional virtual columns on the tables in your DSV, which enable you to mine derived information in your data without having to change your source data. A named calculation consists of a name, a SQL expression containing the calculation, and an optional description.
Named Calculations in DSV The calculation can be any valid SQL expression, such as the following:
Arithmetic operations (+, -, *, /, and %)
Mathematical functions (ABS, LOG, SIGN, and SQRT)
Composite expressions: The hypothesis you want to test may depend on a variable that is a combination of two of the variables you already have. For example, it may not be interesting that a person is married or has children, but the combination of the two may provide valuable information. A composite expression for this situation could look like this: [Marital Status] + ' ' + [Has Children]
CASE expressions: CASE expressions are an extremely flexible way to create meaningful variables for data mining.
The CASE expression allows you to assign results based on the evaluation of one or more conditions.
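The composite and CASE expressions above can be tried out with a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server, so the dialect is an assumption: SQLite concatenates strings with || where the expression above uses +. The Customers table, its rows, and the age bands are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers ([Marital Status] TEXT, [Has Children] TEXT, Age INTEGER)"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [("Married", "Yes", 42), ("Single", "No", 23), ("Married", "No", 67)],
)

# Two derived columns, computed without changing the source table:
# a composite of two existing columns and a CASE-based age band.
rows = conn.execute(
    """
    SELECT
        [Marital Status] || ' ' || [Has Children] AS FamilyStatus,
        CASE
            WHEN Age < 30 THEN 'Under 30'
            WHEN Age < 60 THEN '30 to 59'
            ELSE '60 and over'
        END AS AgeGroup
    FROM Customers
    ORDER BY rowid
    """
).fetchall()

for family_status, age_group in rows:
    print(family_status, "|", age_group)
```

In a DSV, only the expression itself (the part before AS) would be entered as the named calculation; the surrounding SELECT is here just to make the sketch runnable.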
Exploring Data Part of any data mining project is learning about and understanding the nature of your data. By leveraging controls from Office Web Components (OWC), the DSV Designer provides the functionality to explore your data in four different views. By right-clicking a DSV table and selecting Explore Data, you can view your data as a table, PivotTable, simple charts, and a PivotChart.
Exploring data with pivot table
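For readers without the DSV Designer at hand, the kind of summary a pivot table gives can be approximated in a few lines. This is only a sketch of the idea, not what the designer does internally; the `pivot_counts` helper, the column names, and the sample cases are all hypothetical.

```python
from collections import Counter

# Hypothetical cases; in practice these would come from your case table.
cases = [
    {"MaritalStatus": "Married", "BikeBuyer": "Yes"},
    {"MaritalStatus": "Married", "BikeBuyer": "Yes"},
    {"MaritalStatus": "Married", "BikeBuyer": "No"},
    {"MaritalStatus": "Single", "BikeBuyer": "No"},
    {"MaritalStatus": "Single", "BikeBuyer": "Yes"},
    {"MaritalStatus": "Single", "BikeBuyer": "No"},
]


def pivot_counts(records, row_key, col_key):
    """Count records for each (row value, column value) combination."""
    counts = Counter((r[row_key], r[col_key]) for r in records)
    row_labels = sorted({row for row, _ in counts})
    col_labels = sorted({col for _, col in counts})
    # Counter returns 0 for combinations that never occur.
    return {row: {col: counts[(row, col)] for col in col_labels} for row in row_labels}


pivot = pivot_counts(cases, "MaritalStatus", "BikeBuyer")
for row_label, cells in pivot.items():
    print(row_label, cells)
```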
Steps to Create and Edit the Models After you have organized, modified, and selected the data you want to analyze, and you understand it, you can start to create data mining objects. This involves the following steps:
Running the Data Mining Wizard.
Refining the results in Data Mining Designer.
SQL Server Analysis Services SQL Server Analysis Services has two major objects that deal with data mining:
A mining structure defines the domain of a mining problem. A mining structure contains a list of structure columns that have data and content types, bindings to the data source, and some optional flags that control how the data is modeled.
A mining model is the application of a mining algorithm to the data in a mining structure. The definition of a mining model contains an algorithm with its associated parameters, plus a list of columns from the mining structure.
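The relationship between the two objects can be sketched as plain data classes. This is a conceptual model only, not the Analysis Services object model: the class and field names are hypothetical, and the algorithm and parameter strings are used here simply as labels.

```python
from dataclasses import dataclass


@dataclass
class StructureColumn:
    name: str
    data_type: str     # e.g. "Long" or "Text"
    content_type: str  # e.g. "Key", "Discrete", or "Continuous"


@dataclass
class MiningStructure:
    """The problem domain: which columns exist and how they are typed."""
    name: str
    columns: list


@dataclass
class MiningModel:
    """One algorithm applied to a subset of a structure's columns."""
    name: str
    structure: MiningStructure
    algorithm: str
    parameters: dict
    column_usage: dict  # column name -> "Key", "Input", or "Predict"

    def __post_init__(self):
        # A model may only use columns that its structure defines.
        known = {column.name for column in self.structure.columns}
        unknown = set(self.column_usage) - known
        if unknown:
            raise ValueError(f"columns not in structure: {sorted(unknown)}")


customers = MiningStructure(
    name="Customers",
    columns=[
        StructureColumn("CustomerKey", "Long", "Key"),
        StructureColumn("Age", "Long", "Continuous"),
        StructureColumn("BikeBuyer", "Text", "Discrete"),
    ],
)

# Two models built on the same structure, each with its own algorithm.
tree_model = MiningModel(
    name="Customers_Trees",
    structure=customers,
    algorithm="Microsoft_Decision_Trees",
    parameters={"COMPLEXITY_PENALTY": 0.5},
    column_usage={"CustomerKey": "Key", "Age": "Input", "BikeBuyer": "Predict"},
)
cluster_model = MiningModel(
    name="Customers_Clusters",
    structure=customers,
    algorithm="Microsoft_Clustering",
    parameters={"CLUSTER_COUNT": 5},
    column_usage={"CustomerKey": "Key", "Age": "Input"},
)
```

The point of the sketch is that one structure can back several models, and each model picks its own algorithm, parameters, and column subset.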
The Data Mining Wizard The Data Mining Wizard creates the mining structure that describes the columns and training data you will use for mining, and optionally a mining model, which takes those columns, applies an algorithm, and defines the usage of each column for that algorithm. The steps of the wizard are:
1. Select your algorithm or choose only a structure.
2. Select the source tables and specify how they are used.
3. Select the columns from those tables and specify how they are used.
4. Finally, specify holdout data and name the structure and model.
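Holdout data in the last step means setting aside part of the cases for testing rather than training. The sketch below shows the general idea with a seeded random split; it is not how Analysis Services selects holdout cases, and the `split_holdout` helper, the seed, and the 30 percent default are assumptions of this example.

```python
import random


def split_holdout(cases, holdout_percent=30, seed=42):
    """Randomly reserve a percentage of cases for testing.

    Returns (training_cases, holdout_cases). A fixed seed keeps the
    split repeatable between runs.
    """
    rng = random.Random(seed)
    shuffled = list(cases)
    rng.shuffle(shuffled)
    holdout_count = len(shuffled) * holdout_percent // 100
    return shuffled[holdout_count:], shuffled[:holdout_count]


training, testing = split_holdout(range(100))
print(len(training), "training cases,", len(testing), "holdout cases")
```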
The Data Mining Wizard Specifying the training data using the Data Mining Wizard.
The Data Mining Designer Data Mining Designer is where most of the work with your models will take place. It contains the following five panes for editing, browsing, querying, and comparing models:
The Mining Structure pane
The Mining Models pane
The Mining Model Viewer pane
The Mining Accuracy pane
The Mining Model Prediction pane
You must use the Mining Structure Editor to perform modeling operations that are not possible in the Data Mining Wizard.
The Mining Structure Editor
Data Mining Reports SQL Server Reporting Services is used to access data mining query results and to distribute those results. Reporting Services has options to run reports periodically and cache the results to expedite report retrieval, and you can even specify queries to control report distribution.
Reporting using the Data Mining Query Designer
SQL Server Management Studio Microsoft SQL Server Management Studio is an integrated environment for accessing, configuring, managing, administering, and developing all components of SQL Server. SQL Server Management Studio combines a broad group of graphical tools with a number of rich script editors to provide access to SQL Server to developers and administrators of all skill levels.
SQL Server Management Studio SQL Server Management Studio combines the features of Enterprise Manager, Query Analyzer, and Analysis Manager, included in previous releases of SQL Server, into a single environment. In addition, SQL Server Management Studio works with all components of SQL Server such as Reporting Services, Integration Services, SQL Server 2005 Compact Edition, and Notification Services. Developers get a familiar experience, and database administrators get a single comprehensive utility that combines easy-to-use graphical tools with rich scripting capabilities.
Management Studio UI
Management Studio Features Management Studio lets you perform the following tasks:
Create and maintain databases
Build queries using the prediction builder
Build queries using the query editor
Process models and structures
Assign object permissions
Back up and restore databases
Visit more self help tutorials Pick a tutorial of your choice and browse through it at your own pace. The tutorials section is free, self-guiding and will not involve any additional support. Visit us at www.dataminingtools.net