MS SQL SERVER: SSIS and data mining
 

Usage Rights

© All Rights Reserved

    MS SQL SERVER: SSIS and data mining Presentation Transcript

    • SQL Server Integration Services and Data Mining
    • Overview
      Standard Tasks in SSIS
      SSIS Packages
      Data Flow
      Working with SSIS in Data Mining
      Data Mining Transformations
      Text Mining Transformations
      Summary
    • Overview of SSIS
      SQL Server Integration Services (SSIS) is a component of the Microsoft SQL Server database software which can be used to perform a broad range of data migration tasks.
      SSIS is a platform for data integration and workflow applications. It features a fast and flexible data warehousing tool used for data extraction, transformation, and loading (ETL). The tool may also be used to automate maintenance of SQL Server databases and updates to multidimensional cube data.
    • SQL Server Integration Services
    • SSIS Designer
    • SSIS Packages
      A package is the basic deployment and execution unit of an SSIS project.
      An SSIS package is the container for SSIS flows.
      To create an SSIS package, right-click the SSIS Packages folder in the Integration Services project and select the New SSIS Package menu item.
      An SSIS project may contain multiple packages. A package contains only one control flow, which may contain one or more data flows.
      In addition to control flow and data flow, a package contains SSIS connections and package variables.
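      The containment described above (a project holds packages, each package holds exactly one control flow, and that control flow may hold data flows, alongside connections and variables) can be pictured with a small Python model. This is only an illustrative sketch: the class names, the package name, and the connection string are hypothetical and are not part of any SSIS API.

```python
from dataclasses import dataclass, field


@dataclass
class DataFlow:
    name: str


@dataclass
class ControlFlow:
    # A control flow may contain one or more data flows.
    data_flows: list[DataFlow] = field(default_factory=list)


@dataclass
class Package:
    name: str
    # Exactly one control flow per package.
    control_flow: ControlFlow = field(default_factory=ControlFlow)
    # Connections and package variables live next to the flows.
    connections: dict[str, str] = field(default_factory=dict)
    variables: dict[str, object] = field(default_factory=dict)


@dataclass
class Project:
    name: str
    # A project may contain multiple packages.
    packages: list[Package] = field(default_factory=list)


# Hypothetical example values, for illustration only.
project = Project("CustomerMining")
load = Package("LoadCustomers")
load.connections["Source"] = "hypothetical connection string"
load.variables["BatchSize"] = 1000
load.control_flow.data_flows.append(DataFlow("Copy customers"))
project.packages.append(load)
```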
    • Task Flow and Containers
      Tasks are listed in the SSIS Toolbox.
      You can add a task to the package by dragging it from the Toolbox and dropping it into the package designer.
      A package usually contains multiple tasks in a task flow.
      Multiple tasks are organized in sequential order with precedence constraints.
      Containers are SSIS objects that provide structure to a package.
      Each package has a container, which stores the flows of a package.
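      To illustrate how precedence constraints impose an order on tasks, the sketch below treats each constraint as "this task runs after those tasks" and derives a valid execution order with Python's standard graphlib module. The task names are hypothetical, and the sketch ignores the success, failure, and completion conditions that real SSIS precedence constraints can carry.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks that must finish first.
precedence = {
    "Extract source data": set(),
    "Clean records": {"Extract source data"},
    "Train mining model": {"Clean records"},
}


def execution_order(constraints):
    """Return one task order that respects every precedence constraint."""
    return list(TopologicalSorter(constraints).static_order())


order = execution_order(precedence)
for step, task in enumerate(order, start=1):
    print(step, task)
```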
    • Data Flow Example
    • How to Set the Properties of a Task or Container?
      To set the properties of a task or container by using the Properties window:
      • In Business Intelligence Development Studio, open the Integration Services project that contains the package you want.
      • In Solution Explorer, double-click the package to open it.
      • Click the Control Flow tab.
      • On the design surface of the Control Flow tab, right-click the task or container, and then click Properties.
      • In the Properties window, update the property value.
      • Optionally, create property expressions to dynamically update the properties of the task or container.
      • To save the updated package, click Save Selected Items on the File menu.
    • How to Set the Properties of a Task or Container?
      To set the properties of a task or container by using a task or container editor:
      • In Business Intelligence Development Studio, open the Integration Services project that contains the package you want.
      • In Solution Explorer, double-click the package to open it.
      • Click the Control Flow tab.
      • On the design surface of the Control Flow tab, right-click the task or container, and then click Edit to open the corresponding task or container editor.
      • If the task or container editor has multiple nodes, click the node that contains the property that you want to set.
      • Optionally, click Expressions and, on the Expressions page, create property expressions to dynamically update the properties of the task or container.
      • Update the property value.
      • To save the updated package, click Save Selected Items on the File menu.
    • Working with SSIS in Data Mining
      In a data mining project, SSIS is used to load data from various sources, combine these data sources, normalize column values, remove dirty records, replace missing values, split data into training and testing data sets, and so on.
      SSIS is more than an ETL tool for data mining: it also provides a few built-in data mining components in both the control flow and the data flow environments.
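      Three of the preparation steps listed above (removing dirty records, replacing missing values, and splitting into training and testing sets) can be sketched in plain Python. This is not SSIS code: the column names, the default value, and the 30% test fraction are assumptions chosen only for illustration.

```python
import random


def clean_rows(rows, default_age=0):
    """Drop rows without a key and fill in missing ages."""
    cleaned = []
    for row in rows:
        if row.get("customer_id") is None:
            continue  # dirty record: no key, so it is removed
        fixed = dict(row)
        if fixed.get("age") is None:
            fixed["age"] = default_age  # replace the missing value
        cleaned.append(fixed)
    return cleaned


def split_rows(rows, test_fraction=0.3, seed=42):
    """Shuffle reproducibly, then split into (training, testing)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical input: ten customers, one missing age, one dirty record.
raw = [{"customer_id": i, "age": 20 + i} for i in range(10)]
raw[3]["age"] = None
raw.append({"customer_id": None, "age": 51})

cleaned = clean_rows(raw, default_age=30)
training, testing = split_rows(cleaned)
```

      Fixing the seed keeps the split reproducible, so a model can be retrained and evaluated on the same partitions.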
    • Data Mining Transformations
      The data flow components can be categorized in three large groups, depending on their position in the data flow: sources, transformations, and destinations.
    • Text Mining Transformations
      To perform text mining with SQL Server Data Mining, you must first bring the text into a form that the algorithms can consume.
      The solution included in the product is to represent each piece of text as a collection of words and phrases.
    • Text Mining Transformations
      After each document is represented as a collection of key phrases, you can perform data mining using one of the following model types:
      • Classification models that use the key words and phrases nested table as input to predict the class of a document
      • Clustering models that find similar documents based on common occurrences
      • Association models that detect cross-correlations between key words and phrases
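      As a rough illustration of "similar documents based on common occurrences", the sketch below represents each document as a set of key phrases and compares documents by Jaccard similarity. Jaccard similarity is a stand-in chosen here for simplicity; it is not the algorithm that SQL Server clustering models use, and the documents and phrases are hypothetical.

```python
def jaccard(phrases_a, phrases_b):
    """Share of key phrases two documents have in common (0.0 to 1.0)."""
    a, b = set(phrases_a), set(phrases_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical documents, each already reduced to its key phrases.
docs = {
    "doc1": {"data mining", "sql server", "clustering"},
    "doc2": {"data mining", "sql server", "etl"},
    "doc3": {"baking", "bread"},
}

similar = jaccard(docs["doc1"], docs["doc2"])    # two shared phrases out of four
unrelated = jaccard(docs["doc1"], docs["doc3"])  # no shared phrases
```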
    • Text Mining Transformations
      The process of text mining usually consists of at least the following three phases:
      1. Extraction transformation: Build a dictionary of key words and phrases over a collection of representative documents.
      2. Lookup transformation: Based on the dictionary, extract the list of significant key words and phrases for each document to be analyzed.
      3. Train mining models on top of the transformed data.
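      The first two phases can be sketched with a simple word count in Python. The SSIS transformations rely on more sophisticated linguistic processing than this, so treat it as a simplified model: the tokenizer, the min_count threshold, and the sample corpus are all assumptions. The per-document term counts produced in phase 2 are the kind of rows that phase 3 would feed into a mining model.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_dictionary(documents, min_count=2):
    """Phase 1: keep terms that occur in at least min_count documents."""
    doc_freq = Counter()
    for text in documents:
        doc_freq.update(set(tokenize(text)))
    return {term for term, n in doc_freq.items() if n >= min_count}


def lookup_terms(text, dictionary):
    """Phase 2: count the dictionary terms found in one document."""
    return Counter(t for t in tokenize(text) if t in dictionary)


# Hypothetical representative documents.
corpus = [
    "SSIS loads data for mining",
    "Mining models need clean data",
    "SSIS packages move data",
]
dictionary = build_dictionary(corpus)
terms = lookup_terms("SSIS and data mining with more data", dictionary)
```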
    • Visit more self-help tutorials
      Pick a tutorial of your choice and browse through it at your own pace.
      The tutorials section is free, self-guiding and will not involve any additional support.
      Visit us at www.dataminingtools.net