SSIS Best Practices, Israel BI User Group, Itay Braun

  • Be sure to welcome people to the presentation. Start by stating our direction: We will look at the challenges facing IT regarding Mission Critical applications. We will then show how SQL Server 2008 addresses those challenges.

    1. Integration Services Best Practices. Itay Braun, BI and SQL Server Consultant. Email: itay@twingo.co.il. Blog: http://blogs.microsoft.co.il/blogs/itaybraun/
    2. New website for SQL Server in Hebrew: www.sqlserver.co.il. Twingo is looking for experienced BI / SQL Server developers with at least two years of experience; please contact itay@twingo.co.il for more details. If you are looking for employees or looking for a job, please contact Yossi Elkayam (yelkayam@microsoft.com).
    3. If it moves, log it! • Establishing performance baseline • Package Configuration • Lookup Optimization • Data Profiling • Other tips and tricks
    4. SSIS Log Providers • Event handlers • Analyzing the data • Don’t forget the jobs
    5. Log providers are used to capture run-time information about a package, which helps to audit and troubleshoot the package every time it is run. Integration Services includes the following log providers: the Text File log provider (CSV); the SQL Server Profiler log provider; the SQL Server log provider (sysssislog table); the Windows Event Log provider; the XML File log provider.
    6. All tasks share the same basic events. Each task also has its own unique events.
    7. Build the logging table and the logged events manually. This allows better control over the collected data, for example a row count, or a record that an important step has finished.
    8. An event handler is a simple SSIS package within the package. It is mostly used to respond to OnError events, for example by logging the error and sending an email.
    9. SQL 2008: sysssislog table (http://technet.microsoft.com/en-us/library/ms186984.aspx). SQL 2005: sysdtslog90 (http://msdn.microsoft.com/en-us/library/ms186984(SQL.90).aspx). Analyze: total execution time; SSAS partition processing time; errors and warnings; time elapsed between PackageStart and PackageEnd.
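The analysis listed on this slide can be sketched in a few lines. This is a minimal illustration, not the presenter's method: the sample rows are invented, and they mimic only a subset of the sysssislog columns (executionid, event, source, starttime). In practice the rows would come from a query against the log table.

```python
from datetime import datetime

# Hypothetical rows shaped like a subset of sysssislog columns:
# (executionid, event, source, starttime). The values are made up.
SAMPLE_LOG = [
    ("exec-1", "PackageStart", "LoadSales", datetime(2009, 1, 5, 2, 0, 0)),
    ("exec-1", "OnWarning", "Data Flow Task", datetime(2009, 1, 5, 2, 3, 10)),
    ("exec-1", "PackageEnd", "LoadSales", datetime(2009, 1, 5, 2, 12, 30)),
    ("exec-2", "PackageStart", "LoadSales", datetime(2009, 1, 6, 2, 0, 0)),
    ("exec-2", "OnError", "Data Flow Task", datetime(2009, 1, 6, 2, 0, 40)),
    ("exec-2", "PackageEnd", "LoadSales", datetime(2009, 1, 6, 2, 1, 5)),
]


def summarize_executions(rows):
    """Group log rows by execution and compute elapsed time and event counts."""
    summary = {}
    for execution_id, event, _source, start_time in rows:
        entry = summary.setdefault(
            execution_id,
            {"start": None, "end": None, "errors": 0, "warnings": 0},
        )
        if event == "PackageStart":
            entry["start"] = start_time
        elif event == "PackageEnd":
            entry["end"] = start_time
        elif event == "OnError":
            entry["errors"] += 1
        elif event == "OnWarning":
            entry["warnings"] += 1

    # Elapsed time is only defined when both boundary events were logged.
    for entry in summary.values():
        if entry["start"] is not None and entry["end"] is not None:
            entry["elapsed_seconds"] = (entry["end"] - entry["start"]).total_seconds()
        else:
            entry["elapsed_seconds"] = None
    return summary


if __name__ == "__main__":
    for execution_id, entry in summarize_executions(SAMPLE_LOG).items():
        print(execution_id, entry["elapsed_seconds"], entry["errors"], entry["warnings"])
```

An execution that logged PackageStart but never PackageEnd ends up with `elapsed_seconds` set to `None`, which is itself a useful signal that a run was aborted.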
    10. Don’t forget to monitor the execution of the ETL jobs. Use Reporting Services to write simple reports about the ETL execution process.
    11. If it moves, log it! • Establishing performance baseline • Package Configuration • Lookup Optimization • Data Profiling • Other tips and tricks
    12. Understanding resource utilization: CPU bound; memory bound; I/O bound; network bound.
    13. Processor time: Process / % Processor Time (Total) for sqlservr.exe and dtexec.exe. Do the tasks run in parallel?
    14. Process / Private Bytes (DTEXEC.exe): the amount of memory currently in use by Integration Services. Process / Working Set (DTEXEC.exe): the total amount of memory allocated by Integration Services. SQL Server: Memory Manager / Total Server Memory: the total amount of memory allocated by SQL Server. Memory / Page Reads/sec: represents the total memory pressure on the system. If this consistently goes above 500, the system is under memory pressure.
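The slide does not say how "consistently" should be measured. The sketch below treats it as "at least 90% of the samples exceed the threshold", an assumption made only for illustration. The sample values are invented and stand in for Memory / Page Reads/sec readings collected with Performance Monitor.

```python
def is_consistently_above(samples, threshold=500.0, min_fraction=0.9):
    """Return True when at least min_fraction of the samples exceed threshold.

    threshold=500 comes from the slide; min_fraction=0.9 is an assumed
    definition of "consistently" and should be tuned to the environment.
    """
    if not samples:
        return False
    above = sum(1 for value in samples if value > threshold)
    return above / len(samples) >= min_fraction


if __name__ == "__main__":
    # Hypothetical Page Reads/sec samples taken during a package run.
    sustained = [620, 710, 540, 655, 590]
    single_spike = [120, 900, 80, 60, 95]
    print(is_consistently_above(sustained))
    print(is_consistently_above(single_spike))
```

Using a fraction rather than a maximum keeps a single spike from being reported as memory pressure.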
    15. SSIS Pipeline / Buffers in use: the number of pipeline buffers in use throughout the pipeline. SSIS Pipeline / Buffers spooled: the number of buffers spooled to disk. Buffers spooled has an initial value of 0; when it goes above 0, it indicates that the engine has started memory swapping. Rows read: the total number of rows read from all data sources. Rows written: the total number of rows written to all data destinations.
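As a rough sketch of how these pipeline counters might be used in a baseline, the helpers below find the first sample at which Buffers spooled leaves 0 and derive throughput from the cumulative Rows written counter. The sample lists and the fixed sampling interval are assumptions for illustration, and the function names are hypothetical.

```python
def first_spooled_sample(buffers_spooled):
    """Return the index of the first sample where buffers were spooled, or None."""
    for index, value in enumerate(buffers_spooled):
        if value > 0:
            return index
    return None


def rows_per_second(rows_written, interval_seconds):
    """Average throughput from cumulative counter samples taken at a fixed interval."""
    if len(rows_written) < 2 or interval_seconds <= 0:
        return 0.0
    elapsed = (len(rows_written) - 1) * interval_seconds
    return (rows_written[-1] - rows_written[0]) / elapsed


if __name__ == "__main__":
    # Hypothetical samples collected every 5 seconds during a package run.
    spooled = [0, 0, 0, 3, 7]
    written = [0, 50000, 100000, 150000]
    print(first_spooled_sample(spooled))
    print(rows_per_second(written, 5))
```

Recording the sample index at which spooling starts makes it easier to line the event up with the memory counters captured over the same run.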
    16. To ensure that Integration Services is minimally writing to disk, SSIS should only hit the disk when it reads from the source and writes to the target. For SAN / NAS storage, use the vendor's applications.
    17. SSIS moves data as fast as the network is able to handle it. Network Interface / Current Bandwidth: provides an estimate of the current bandwidth. Network Interface / Bytes Total/sec: the rate at which bytes are sent and received over each network adapter. Network Interface / Transfers/sec: how many network transfers per second are occurring. If this approaches 40,000 IOPs, get another NIC and use teaming between the NICs.
    18. If it moves, log it! • Establishing performance baseline • Package Configuration • Lookup Optimization • Data Profiling • Other tips and tricks
    19. The package needs to know where it is moving data from and where it is moving data to. Typically, Integration Services packages are built in a different environment from the one in which they are intended to be executed in production.
    20. Objects which can be configured: tasks; containers; variables; connection managers; data flow components.
    21. XML Configuration File: the most popular configuration type; easy deployment. Disadvantage: the path to the .dtsconfig file must be hard-coded within the package. Environment Variable Configuration: takes the value for a property from whatever is stored in a named environment variable; stores the property path inside the package and the value outside the package.
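To make the XML configuration type more concrete, here is a minimal sketch that writes and reads a simplified configuration document. The element and attribute names (DTSConfiguration, Configuration, ConfiguredType, Path, ValueType, ConfiguredValue) follow the layout of files produced by the Package Configuration wizard as best recalled, and the heading block the wizard also writes is omitted. The connection manager name `Staging`, the server name, and the database name are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_dtsconfig(settings):
    """Build a simplified .dtsConfig-style document from {property_path: value}."""
    root = ET.Element("DTSConfiguration")
    for property_path, value in settings.items():
        config = ET.SubElement(
            root,
            "Configuration",
            {"ConfiguredType": "Property", "Path": property_path, "ValueType": "String"},
        )
        ET.SubElement(config, "ConfiguredValue").text = value
    return ET.tostring(root, encoding="unicode")


def read_dtsconfig(xml_text):
    """Return {property_path: value} for every Configuration element."""
    root = ET.fromstring(xml_text)
    return {
        config.get("Path"): config.findtext("ConfiguredValue")
        for config in root.iter("Configuration")
    }


if __name__ == "__main__":
    # Hypothetical connection manager; integrated security avoids a stored password.
    path = r"\Package.Connections[Staging].Properties[ConnectionString]"
    value = "Data Source=PRODSQL01;Initial Catalog=Staging;Integrated Security=SSPI;"
    print(read_dtsconfig(build_dtsconfig({path: value})))
```

The property path stays the same across environments; only the configured value differs between the development and production copies of the file.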
    22. Parent Package Configuration: fetches a value from a variable in a calling package; stores the property path inside the package and the value outside the package. Registry Configuration: the value to be applied to a package property is stored in a registry entry; stores the property path inside the package and the value outside the package.
    23. SQL Server Configuration: the values are stored in a SQL Server table. The table can have any name you like and can be in any database on any server that you like.
    24. Consider command-line options as an alternative to configurations. The /SET option is used to apply a value to a property in the package that is being run. The /CONFIGFILE option is used to tell the package to use an XML configuration file, even if one has not been defined in the package. Configure only the ConnectionString property for connection managers, instead of ServerName, InitialCatalog, UserName and Password. Don’t save the password in XML files.
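A job step or wrapper script could assemble such a command line along these lines. This is a sketch under assumptions: the file paths and the variable name `User::BatchDate` are hypothetical, and only the `/FILE`, `/CONFIGFILE` and `/SET propertyPath;value` options of dtexec are used.

```python
def build_dtexec_args(package_path, config_file=None, overrides=None):
    """Build an argument list for dtexec from a package path and optional settings."""
    args = ["dtexec", "/FILE", package_path]
    if config_file:
        # Apply an XML configuration file even if the package defines none.
        args += ["/CONFIGFILE", config_file]
    for property_path, value in (overrides or {}).items():
        # /SET takes the property path and the value separated by a semicolon.
        args += ["/SET", f"{property_path};{value}"]
    return args


if __name__ == "__main__":
    # Hypothetical paths and variable name, for illustration only.
    args = build_dtexec_args(
        r"C:\etl\LoadSales.dtsx",
        config_file=r"C:\etl\prod.dtsConfig",
        overrides={r"\Package.Variables[User::BatchDate].Properties[Value]": "2009-01-05"},
    )
    print(" ".join(args))
    # To run the package: subprocess.run(args, check=True)
```

Returning a list rather than a single string lets `subprocess.run` pass each argument through without shell quoting problems.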
    25. If it moves, log it! • Establishing performance baseline • Package Configuration • Lookup Optimization • Data Profiling • Other tips and tricks
    26. Use the NOLOCK or TABLOCK hints to remove locking overhead. To optimize memory usage, SELECT only the columns you actually need. If possible, perform datetime conversions at the source or target databases, as it is more expensive to perform them within Integration Services. SQL Server 2008 Integration Services adds a new shared lookup cache feature.
    27. Commit size 0 is fastest on heap bulk targets, because only one transaction is committed. If commit size = 0 is not possible, use the highest possible value of commit size to reduce the overhead of multiple-batch writing. Commit size = 0 is a bad idea when inserting into a B-tree, because all incoming rows must be sorted at once into the target B-tree.
    28. Batch size = 0 is ideal for inserting into a heap. For an indexed destination, I recommend testing between 100,000 and 1,000,000 as batch size. Use a commit size of <5000 to avoid lock escalation when inserting. Use partitions and the partition SWITCH command. More info here: Getting Optimal Performance with Integration Services Lookups.
    29. If it moves, log it! • Establishing performance baseline • Package Configuration • Lookup Optimization • Data Profiling • Other tips and tricks
    30. Data profiling is a new feature in SSIS 2008. It is used to profile the data: null values; value distribution; column length.
    31. If it moves, log it! • Establishing performance baseline • Package Configuration • Lookup Optimization • Data Profiling • Other tips and tricks
    32. Make data types as narrow as possible so you will allocate less memory for your transformation. Watch precision issues when using the money, float, and decimal types: money is faster than decimal, and money has fewer precision considerations than float.
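The effect of narrow data types can be estimated with a simplified model of the data flow buffer: the number of rows that fit in one buffer is limited both by DefaultBufferMaxRows (10,000 by default) and by DefaultBufferSize (10 MB by default) divided by the row width. The real engine's sizing logic has more detail than this, and the column widths below are hypothetical.

```python
DEFAULT_BUFFER_SIZE = 10 * 1024 * 1024  # bytes; SSIS default for DefaultBufferSize
DEFAULT_BUFFER_MAX_ROWS = 10000         # SSIS default for DefaultBufferMaxRows


def estimated_rows_per_buffer(column_widths,
                              buffer_size=DEFAULT_BUFFER_SIZE,
                              max_rows=DEFAULT_BUFFER_MAX_ROWS):
    """Simplified estimate: rows per buffer is capped by size and by max rows."""
    row_width = sum(column_widths)
    if row_width <= 0:
        raise ValueError("row width must be positive")
    return min(max_rows, buffer_size // row_width)


if __name__ == "__main__":
    # Hypothetical rows: a 4-byte key plus one Unicode string column (2 bytes/char).
    wide_row = [4, 4000 * 2]   # string declared with length 4000
    narrow_row = [4, 50 * 2]   # same data declared with length 50
    print(estimated_rows_per_buffer(wide_row))
    print(estimated_rows_per_buffer(narrow_row))
```

Under this model the wide declaration fits far fewer rows in each buffer, so the same data set needs many more buffers to pass through the pipeline.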
    33. Do not sort within Integration Services unless it is absolutely necessary. In order to perform a sort, Integration Services allocates the memory space of the entire data set that needs to be transformed. There are times when using Transact-SQL will be faster than processing the data in SSIS. As a general rule, any and all set-based operations will perform faster in Transact-SQL.
    34. To perform delta detection, you can use a change detection mechanism such as the new SQL Server 2008 Change Data Capture (CDC) functionality.
    35. Custom logging using event handlers: http://blogs.conchango.com/jamiethomson/archive/2005/06/11/SSIS_3A00_-Custom-Logging-Using-Event-Handlers.aspx. Best Practices for Integration Services Configurations: http://msdn.microsoft.com/en-us/library/cc671628.aspx. Other best practices: http://bi-polar23.blogspot.com/2007/11/ssis-best-practices-part-1.html
    36. © 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
