Introduction to Microsoft SQL Server 2008 R2 Integration Services


  • Key Points:
    – Integration Services (SSIS) provides a scalable enterprise data integration platform with exceptional Extract, Transform, Load (ETL) and integration capabilities, enabling organizations to more easily manage data from a wide array of data sources.
    – Master Data Services (MDS) enables organizations to start with simple solutions for analytic or operational requirements, and then adapt those solutions to additional requirements incrementally.
    – The latest version of SQL Server from Microsoft, SQL Server 2008, offers hundreds of new DBMS features that boost the productivity of database administrators and developers, improve support for larger databases, and enhance security.
    – Reporting Services (SSRS) provides a full range of ready-to-use tools and services to help you create, deploy, and manage reports for your organization, as well as programming features that enable you to extend and customize your reporting functionality.
    – Analysis Services (SSAS) delivers online analytical processing (OLAP) and data mining functionality for business intelligence applications.
    – Conclusion: With SQL Server 2008 R2, customers get all the technologies needed to build a reliable and secure BI platform. SQL Server 2008 R2 has the strongest combination of price/performance, manageability, security, and DBA productivity.
  • Update column values or create new columns; transform each row in the pipeline input.
  • The transformations create new rowsets that can include aggregate and sorted values, sample rowsets, or pivoted and unpivoted rowsets.
  • The transformations distribute rows to different outputs, create copies of the transformation inputs, join multiple inputs into one output, and perform lookup operations.

    1. 1. Microsoft SQL Server 2008 R2 Introduction to Integration Services
    2. 2. SQL Server 2008 R2 BI Technologies
    3. 3. Contents • Understanding Data Integration • Understanding SQL Server 2008 R2 Integration Services • Understanding SSIS Packages • Understanding the SSIS Control Flow • Understanding the SSIS Data Flow
    4. 4. SQL Server 2008 R2 BI Structure – Presentation Layer: Reporting and Visualization Tools (Dashboard, KPI, Scorecard, …) – Analytical Layer: turn data into information (analysis) with a Multidimensional OLAP Database – Data Storage and Retrieval Layer: Data Warehouse in an RDBMS – Data Transformation Layer: 1. Extract the data from the multiple sources, 2. Modify the data to be consistent, 3. Load the data into the data storage system – Data Source Layer: Text, MS Excel, MS Access, MS SQL, Oracle, … | External Sources
    5. 5. Microsoft Business Intelligence Platform – Data Delivery: Scorecards, Analytics, and Planning applications (PerformancePoint Services), Portal (SharePoint), Report Builder, End-user Analysis (Excel), SSRS – BI Platform: Integrate (SQL Integration Services), Analyze (SQL Analysis Services), Report (SQL Reporting Services) – Infrastructure: Data Warehouse, Data Marts, Operational Data (SQL Server 2008 R2) – Product families: Office, SQL Server
    6. 6. Data Integration in the Real World – Extract data from sources – Transform the data – Load data into data stores
    7. 7. Data Integration Challenges • Multiple sources with different formats. • Structured, semi-structured, and unstructured data. • Huge data volumes. Enterprises spend 60%–80% of their resources developing and testing their ETL processes.
    8. 8. Introduction to Integration Services – SQL SERVER 2008 R2 INTEGRATION SERVICES
    9. 9. Introducing Integration Services 2008 R2 • Primarily designed to implement ETL processes • Provides a robust, flexible, fast, scalable, and extensible architecture • Challenges traditional ETL design approaches
    10. 10. Introducing Integration Services 2008 R2 • Its capabilities are useful in many other scenarios: – Assessing data quality – Cleansing and standardizing data – Merging data from heterogeneous data stores – Implementing ad hoc data transfers – Automating administrative tasks
    11. 11. SSIS Architecture • SQL Server Integration Services (SSIS) service • SSIS object model • Two distinct runtime engines: – Control flow – Data flow
    12. 12. SSIS Architecture • SSIS Designer – Graphical tool to create and maintain Integration Services packages. • Integration Services Runtime – Saves the layout of packages, runs packages, and provides support for logging, breakpoints, configuration, connections, and transactions. • Tasks and other executables – The Integration Services run-time executables are the package, containers, tasks, and event handlers.
    13. 13. SSIS Architecture • Data Flow engine (pipeline) – In-memory buffers • Data Flow components – Sources – Transformations – Destinations
    14. 14. SSIS Architecture • Object Model – Allows you to create custom components for use in packages • Integration Services Service – Lets you monitor running Integration Services packages and manage the storage of packages (see the monitoring sketch after the transcript)
    15. 15. Introduction to Integration Services – PACKAGE ESSENTIALS
    16. 16. What's an IS Package? • A package is the object that implements Integration Services functionality to extract, transform, and load data. • Creation tools: – SSIS Designer in BI Development Studio – SQL Server Import and Export Wizard – Integration Services Connections Project Wizard • Saved in XML format to the file system or to SQL Server (see the package execution sketch after the transcript)
    17. 17. Package Elements • Connection managers • Control flow components • Data flow components • Variables • Event handlers • Configurations
    18. 18. Connection Managers • Logical representation of a connection • Stored in the package and cannot be shared between packages • Used by package elements • Do not have to connect to SQL Server; many connection types are available (see the connection manager sketch after the transcript)
    19. 19. Introduction to Integration Services – CONTROL FLOW
    20. 20. Control Flow • Control flow is the process-oriented workflow engine • A package consists of a single control flow • Control flow elements: – Containers – Tasks – Precedence constraints – Variables – Event handlers
    21. 21. Containers • Provide structure and services for – Grouping tasks – Implementing repeating flows • Execute in the sequence defined by precedence constraints in the control flow • Manage variable and transactional boundaries (see the container sketch after the transcript)
    22. 22. Tasks • Perform discrete operations at runtime • Execute in the sequence defined by precedence constraints in the control flow • Use properties configured at design time or assigned dynamically at runtime by using expressions
    23. 23. Task Categories
        – Data Flow: the Data Flow task defines and runs data flows that extract data, apply transformations, and load data.
        – Data Preparation: data preparation tasks copy files and directories, download files and data, save data returned by Web methods, or work with XML documents.
        – Workflow: workflow tasks communicate with other processes to run packages or programs, send and receive messages between packages, send e-mail messages, read Windows Management Instrumentation (WMI) data, or watch for WMI events.
        – SQL Server: SQL Server tasks access, copy, insert, delete, or modify SQL Server objects and data.
        – Analysis Services: Analysis Services tasks create, modify, delete, or process Analysis Services objects.
        – Scripting: scripting tasks extend package functionality through custom scripts.
        – Maintenance: maintenance tasks perform administrative functions, such as backing up and shrinking SQL Server databases, rebuilding and reorganizing indexes, and running SQL Server Agent jobs.
    24. 24. Precedence Constraints • Precedence constraints link executables, containers, and tasks in packages into a control flow, and specify conditions that determine whether executables run • Configure conditions that determine whether the executable runs: – Success, Failure, or Completion constraints – Expressions – Logical AND/OR for multiple constraints (see the precedence constraint sketch after the transcript)
    25. 25. Variables • Variables customize package behavior by changing expression values or object properties • System variables store values collected during package execution • All variables use case-sensitive names • Variables can be scoped at the package, container, or task level (see the variables sketch after the transcript)
    26. 26. Event Handlers • At run time, executables raise events • Event handlers can be defined to respond to these events • Creating an event handler is similar to building a package; an event handler has tasks and containers, which are sequenced into the control flow
    27. 27. Event Handlers • Common events used to trigger event handlers: – OnPreExecute – OnPostExecute – OnError • Examples: – Retrieve system information to assess resource availability before the package runs – Send an e-mail message when an error occurs (see the event handler sketch after the transcript)
    28. 28. Introduction to Integration Services – DATA FLOW
    29. 29. Data Flow • The data flow is an optional element of a package that – Extracts data – Modifies data – Loads data into data stores • The main data flow elements are – Sources – Transformations – Destinations
    30. 30. Data Flow Sources • Sources extract data from: – Relational tables and views – Files – Analysis Services databases
    31. 31. Data Flow Transformations • Aggregate, merge, distribute, or modify data • Include error outputs in some cases • Transformation categories: – Row – Rowset – Split and Join – Script – Other
    32. 32. Row Transformations
        – Character Map: the transformation that applies string functions to character data.
        – Copy Column: the transformation that adds copies of input columns to the transformation output.
        – Data Conversion: the transformation that converts the data type of a column to a different data type.
        – Derived Column: the transformation that populates columns with the results of expressions.
        – Export Column: the transformation that inserts data from a data flow into a file.
        – Import Column: the transformation that reads data from a file and adds it to a data flow.
        – Script Component: the transformation that uses script to extract, transform, or load data.
        – OLE DB Command: the transformation that runs SQL commands for each row in a data flow.
    33. 33. Rowset Transformations
        – Aggregate: the transformation that performs aggregations such as AVERAGE, SUM, and COUNT.
        – Sort: the transformation that sorts data.
        – Percentage Sampling: the transformation that creates a sample data set using a percentage to specify the sample size.
        – Row Sampling: the transformation that creates a sample data set by specifying the number of rows in the sample.
        – Pivot: the transformation that creates a less normalized version of a normalized table.
        – Unpivot: the transformation that creates a more normalized version of a nonnormalized table.
    34. 34. Split and Join Transformations
        – Conditional Split: the transformation that routes data rows to different outputs.
        – Multicast: the transformation that distributes data sets to multiple outputs.
        – Union All: the transformation that merges multiple data sets.
        – Merge: the transformation that merges two sorted data sets.
        – Merge Join: the transformation that joins two data sets using a FULL, LEFT, or INNER join.
        – Lookup: the transformation that looks up values in a reference table using an exact match.
        – Cache: the transformation that writes data from a connected data source in the data flow to a Cache connection manager that saves the data to a cache file; the Lookup transformation performs lookups on the data in the cache file.
    35. 35. The Script Transformation • Extends the capabilities of the data flow • Similar to the Script Task: you develop VB.NET or C# .NET scripts to introduce custom logic into the data flow • Can be configured for these roles: – Source – Destination – Transformation • Delivers optimized performance because it is precompiled (see the Script Component sketch after the transcript)
    36. 36. Other Transformations • Add audit information • Populate lookup caches • Export and import data • Count rows • Manage slowly changing dimensions
    37. 37. Introduction to Integration Services – DEMO
    38. 38. THANK YOU
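
The C# sketches below expand on several of the slides above. They are illustrative only: they assume a project that references the SQL Server 2008 R2 managed SSIS assembly (Microsoft.SqlServer.ManagedDTS), and all package names, file paths, and connection strings are made up for the examples.

Slide 14 mentions that the Integration Services service lets you monitor running packages. A minimal sketch of that idea using the Application class (pass a server name instead of null to query a remote SSIS service):

    // List the packages currently running under the local Integration Services service.
    using System;
    using Microsoft.SqlServer.Dts.Runtime;

    class MonitorPackages
    {
        static void Main()
        {
            Application app = new Application();
            foreach (RunningPackage rp in app.GetRunningPackages(null))
            {
                Console.WriteLine("{0} ({1})", rp.PackageName, rp.InstanceID);
            }
        }
    }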
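
Slide 16 notes that a package is saved as XML (a .dtsx file) to the file system or to SQL Server. A minimal sketch that loads a saved package and runs it through the Integration Services runtime (the file path is hypothetical):

    // Load a .dtsx package definition from disk and execute it.
    // Application.LoadFromSqlServer can be used instead for packages stored in msdb.
    using System;
    using Microsoft.SqlServer.Dts.Runtime;

    class RunPackage
    {
        static void Main()
        {
            Application app = new Application();
            Package pkg = app.LoadPackage(@"C:\SSIS\LoadSales.dtsx", null);

            DTSExecResult result = pkg.Execute();
            Console.WriteLine("Package finished with result: " + result);
        }
    }

The creation tools listed on the slide build and save packages on top of this same object model.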
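
Slide 18 describes connection managers as logical representations of connections stored inside a package. A minimal sketch that adds an OLE DB connection manager in code (the name and connection string are illustrative):

    // Add an OLE DB connection manager to a package. "OLEDB" is the creation
    // name; other creation names include "FLATFILE", "ADO.NET", and "FILE",
    // which is why a package is not limited to SQL Server connections.
    using Microsoft.SqlServer.Dts.Runtime;

    class AddConnection
    {
        static void Main()
        {
            Package pkg = new Package();
            ConnectionManager cm = pkg.Connections.Add("OLEDB");
            cm.Name = "SourceDB";
            cm.ConnectionString =
                "Provider=SQLNCLI10;Data Source=(local);" +
                "Initial Catalog=AdventureWorks;Integrated Security=SSPI;";
        }
    }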
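
Slide 21 presents containers as a way to group tasks and give them shared services such as transactions and variable scope. A minimal sketch that places two tasks inside a Sequence container so they behave as one unit in the control flow (the task purposes in the comments are only examples):

    // Group tasks inside a Sequence container. "STOCK:" monikers identify the
    // built-in executables; "STOCK:PipelineTask" is the Data Flow task.
    using Microsoft.SqlServer.Dts.Runtime;

    class UseContainers
    {
        static void Main()
        {
            Package pkg = new Package();

            Sequence staging = (Sequence)pkg.Executables.Add("STOCK:SEQUENCE");
            staging.Executables.Add("STOCK:SQLTask");      // e.g. truncate staging tables
            staging.Executables.Add("STOCK:PipelineTask"); // e.g. reload them with a data flow
        }
    }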
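
Slide 24 explains that precedence constraints link executables and decide whether the next one runs. A minimal sketch that links two tasks with a Success constraint (the choice of tasks is arbitrary):

    // Add two tasks and a precedence constraint between them: the Send Mail
    // task runs only if the Execute SQL task succeeds.
    using Microsoft.SqlServer.Dts.Runtime;

    class BuildControlFlow
    {
        static void Main()
        {
            Package pkg = new Package();

            Executable loadStep   = pkg.Executables.Add("STOCK:SQLTask");
            Executable notifyStep = pkg.Executables.Add("STOCK:SendMailTask");

            PrecedenceConstraint pc = pkg.PrecedenceConstraints.Add(loadStep, notifyStep);
            pc.EvalOp = DTSPrecedenceEvalOp.Constraint;  // evaluate the execution result only
            pc.Value  = DTSExecResult.Success;           // Success, Failure, or Completion
        }
    }

Setting EvalOp to Expression or ExpressionAndConstraint is how the expression-based conditions mentioned on the slide come into play.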
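
Slide 25 covers user and system variables. A minimal sketch that adds a user variable and reads a system variable (the variable name is made up; note the case-sensitive User:: and System:: namespaces):

    // Add a package-scoped user variable and read values back by qualified name.
    using System;
    using Microsoft.SqlServer.Dts.Runtime;

    class UseVariables
    {
        static void Main()
        {
            Package pkg = new Package();

            Variable batchId = pkg.Variables.Add("BatchId", false, "User", 0);
            batchId.Value = 42;

            Console.WriteLine(pkg.Variables["User::BatchId"].Value);
            Console.WriteLine(pkg.Variables["System::PackageName"].Value);
        }
    }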
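
Slides 26 and 27 describe event handlers as small control flows attached to run-time events. A minimal sketch that attaches an OnError handler containing a Send Mail task, matching the e-mail example on slide 27 (the mail task would still need an SMTP connection manager and message properties, which are omitted here):

    // Attach an OnError event handler to the package. An event handler is a
    // container, so it gets its own executables just like the package itself.
    using Microsoft.SqlServer.Dts.Runtime;

    class AddErrorHandler
    {
        static void Main()
        {
            Package pkg = new Package();

            DtsEventHandler onError = (DtsEventHandler)pkg.EventHandlers.Add("OnError");
            onError.Executables.Add("STOCK:SendMailTask");
        }
    }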
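
Slide 35 describes the Script Component, where C# (or VB.NET) code runs inside the data flow. A minimal sketch of the user code for a Script Component configured as a transformation; ScriptMain, UserComponent, Input0Buffer, and the column names are generated from the component's metadata in the designer, so the names here are assumptions for illustration:

    // Runs once per row on the component's first input and derives a new column value.
    using Microsoft.SqlServer.Dts.Pipeline;
    using Microsoft.SqlServer.Dts.Pipeline.Wrapper;

    [SSISScriptComponentEntryPoint]
    public class ScriptMain : UserComponent
    {
        public override void Input0_ProcessInputRow(Input0Buffer Row)
        {
            // FullName, FirstName, and LastName correspond to columns selected
            // on the component's input and output pages (illustrative names).
            Row.FullName = Row.FirstName + " " + Row.LastName;
        }
    }

Because the script is compiled when the package is saved rather than interpreted row by row, it delivers the precompiled performance benefit mentioned on the slide.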