1. SQL Server Integration Services and Data Mining<br />
2. Overview<br />Standard Tasks in SSIS<br />SSIS Packages<br />Data Flow<br />Working with SSIS in Data Mining<br />Data Mining Transformations<br />Text Mining Transformations<br />Summary<br />
3. Overview of SSIS<br />SQL Server Integration Services (SSIS) is a component of the Microsoft SQL Server database software which can be used to perform a broad range of data migration tasks.<br />SSIS is a platform for data integration and workflow applications. It features a fast and flexible data warehousing tool used for data extraction, transformation, and loading (ETL). The tool may also be used to automate maintenance of SQL Server databases and updates to multidimensional cube data. <br />
4. SQL Server Integration Services<br />
5. SSIS Designer<br />
6. SSIS Packages<br />A package is the basic deployment and execution unit of an SSIS project.<br />An SSIS package is the container for SSIS flows.<br />You can create an SSIS package by right-clicking the SSIS Packages folder in the Integration Services project folder and selecting the New SSIS Package menu item.<br />An SSIS project may contain multiple packages. A package contains only one control flow, which may contain one or more data flows.<br />In addition to control flow and data flow, a package contains SSIS connections and package variables.<br />
7. Task Flow and Containers<br />Tasks are listed in the SSIS Toolbox. <br />You can add a task to the package by dragging it from the Toolbox and dropping it into the package designer.<br />A package usually contains multiple tasks in a task flow. <br />Multiple tasks are organized in sequential order with precedence constraints.<br />Containers are SSIS objects that provide structure to a package.<br /> Each package has a container, which stores the flows of a package.<br />
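The way precedence constraints order tasks can be illustrated outside SSIS. The following is a minimal Python sketch, not SSIS code: the task names and the `execution_order` helper are hypothetical, and a topological sort stands in for the ordering that precedence constraints impose.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks. Each key maps to the set of tasks that must
# complete before it may start (its precedence constraints).
precedence = {
    "Extract source data": set(),
    "Clean and transform": {"Extract source data"},
    "Train mining model": {"Clean and transform"},
    "Send notification": {"Train mining model"},
}

def execution_order(constraints):
    """Return one task order that satisfies every constraint."""
    return list(TopologicalSorter(constraints).static_order())

order = execution_order(precedence)
for step, task in enumerate(order, start=1):
    print(step, task)
```

Because each task here depends only on the one before it, there is a single valid order. With branching constraints, several orders could satisfy the same graph.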
8. Data Flow Example<br />
9. How to Set the Properties of a Task or Container?<br />To set the properties of a task or container by using the Properties window:<br /><ul><li>In Business Intelligence Development Studio, open the Integration Services project that contains the package you want.
10. In Solution Explorer, double-click the package to open it.
11. Click the Control Flow tab.
12. On the design surface of the Control Flow tab, right-click the task or container, and then click Properties.
13. In the Properties window, update the property value.
14. Optionally, create property expressions to dynamically update the properties of the task or container.
15. To save the updated package, click Save Selected Items on the File menu.</li></li></ul><li>How to Set the Properties of a Task or Container?<br />To set the properties of a task or container by using a task or container editor:<br /><ul><li>In Business Intelligence Development Studio, open the Integration Services project that contains the package you want.
16. In Solution Explorer, double-click the package to open it.
17. Click the Control Flow tab.
18. On the design surface of the Control Flow tab, right-click the task or container, and then click Edit to open the corresponding task or container editor.
19. If the task or container editor has multiple nodes, click the node that contains the property that you want to set.
20. Optionally, click Expressions and, on the Expressions page, create property expressions to dynamically update the properties of the task or container.
21. Update the property value.
22. To save the updated package, click Save Selected Items on the File menu.</li></li></ul><li>Working with SSIS in Data Mining<br />SSIS is used to load data from various sources, combine those sources, normalize column values, remove dirty records, replace missing values, split data into training and testing data sets, and so on.<br />SSIS is more than an ETL tool for data mining: it also provides a few built-in data mining components in the control flow and data flow environments.<br />
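To make the preparation steps concrete, here is a minimal Python sketch of three of them: removing dirty records, replacing missing values, and splitting into training and testing sets. It is not SSIS code. The `prepare` function, the `age` and `label` column names, and the 70/30 split are all assumptions chosen for illustration.

```python
import random

def prepare(rows, test_fraction=0.3, seed=42):
    """Clean the rows, fill missing ages, and split into train/test lists."""
    # Remove dirty records: in this sketch, rows without a label.
    clean = [dict(r) for r in rows if r.get("label") is not None]

    # Replace missing values: fill a missing age with the mean known age.
    known = [r["age"] for r in clean if r["age"] is not None]
    mean_age = sum(known) / len(known) if known else 0.0
    for r in clean:
        if r["age"] is None:
            r["age"] = mean_age

    # Split into training and testing sets after a seeded shuffle.
    rng = random.Random(seed)
    rng.shuffle(clean)
    cut = int(len(clean) * (1 - test_fraction))
    return clean[:cut], clean[cut:]

rows = [
    {"age": 30, "label": "yes"},
    {"age": None, "label": "no"},
    {"age": 50, "label": "yes"},
    {"age": 40, "label": None},  # dirty record: no label
]
train, test = prepare(rows)
print(len(train), "training rows,", len(test), "testing rows")
```

In a package, comparable work would be done by data flow components rather than hand-written code. The sketch only shows the logic of each step.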
23. Data Mining Transformations<br />The data flow components can be categorized in three large groups, depending on their position in the data flow: sources, transformations, and destinations.<br />
24. Text Mining Transformations<br />To perform text mining with SQL Server Data Mining, you must first bring the text into a form that the algorithms can consume.<br />The solution included in the product is to represent each piece of text as a collection of words and phrases.<br />
25. Text Mining Transformations<br />After each document is represented as a collection of key phrases, you can perform data mining using one of the following model types:<br /><ul><li>Classification models that use the key words and phrases nested table as input to predict the class of a document
26. Clustering models that find similar documents based on common occurrences
27. Association models that detect cross-correlations between key words and phrases</li></li></ul><li>Text Mining Transformations<br />The process of text mining usually consists of at least the following three phases:<br />1. Term Extraction transformation: Build a dictionary of key words and phrases over a collection of representative documents.<br />2. Term Lookup transformation: Based on the dictionary, extract the list of significant key words and phrases for each document to be analyzed.<br />3. Train mining models on top of the transformed data.<br />
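The first two phases can be sketched in a few lines of Python. This is an illustration of the idea only, not how the SSIS transformations are implemented: it uses single words and a simple document-frequency threshold, and `STOPWORDS`, `build_dictionary`, `lookup_terms`, and the sample documents are all hypothetical. Model training (phase 3) is left out.

```python
import re
from collections import Counter

# Hypothetical stop list of words too common to be useful as key terms.
STOPWORDS = {"the", "a", "after", "near"}

def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def build_dictionary(documents, min_docs=2):
    """Phase 1: keep terms that occur in at least min_docs documents."""
    doc_freq = Counter()
    for text in documents:
        doc_freq.update(set(tokenize(text)))  # count each term once per document
    return {term for term, n in doc_freq.items() if n >= min_docs}

def lookup_terms(text, dictionary):
    """Phase 2: count how often each dictionary term occurs in one document."""
    return Counter(t for t in tokenize(text) if t in dictionary)

docs = [
    "The engine failed after the oil leak",
    "Oil leak found near the engine",
    "Customer praised the quick delivery",
]
dictionary = build_dictionary(docs)
terms = lookup_terms("The oil leak damaged the engine oil filter", dictionary)
print(sorted(dictionary))
print(dict(terms))
```

The per-document term counts play the role of the nested table of key words and phrases that the mining models take as input.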