SSIS Optimization - Better Designs

Published in: Technology

  1. SSIS Optimization - Better Designs (VarunRagul Mavoodu)
  2. Agenda
       - Common problems and solutions
       - Better designs
       - Tips and examples
  3. Source Optimization
       - Flat files: use fast parse for faster loading.
           - Average performance improvement is around 8% per column. (Example)
       - OLE DB source:
           - Optimize the query: apply more filters and remove unnecessary columns, joins, etc.
           - Packet size: the server default is 4 KB; change it to 32 KB.
  4. Transformations in SSIS
       Sync: the same buffers are used for each operation.
       Async: new buffers are created for each operation.

       Category              Row    Partial Blocking   Full Blocking
       Sync/Async            Sync   Async              Async
       Input = Output        Yes    Partially          No
       Wait for all input    No     No                 Yes
       New buffer or thread  No     Yes                Yes
  5. Transformations Split
       Row transformation: OLE DB Command, SCD, Row Count, Import/Export Column, Multicast, Lookup, Derived Column, Copy Column, Data Conversion, Conditional Split
       Partially blocking: Union All, Term Lookup, Data Mining Query, Merge and Merge Join, Pivot/Unpivot
       Full blocking: Aggregate, Sort, Fuzzy Lookup, Fuzzy Grouping, Row Sampling, Term Extraction
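The grouping on this slide can be turned into a quick design check. The sketch below is only an illustration, not an SSIS API: it assumes the three groups as listed on the slide, and the `blocking_steps` helper and the sample pipeline are hypothetical names made up for the example.

```python
# Blocking category per transformation, following the slide's three groups.
ROW = "row"
PARTIAL = "partially blocking"
FULL = "full blocking"

CATEGORY = {
    "OLE DB Command": ROW, "SCD": ROW, "Row Count": ROW,
    "Import/Export Column": ROW, "Multicast": ROW, "Lookup": ROW,
    "Derived Column": ROW, "Copy Column": ROW, "Data Conversion": ROW,
    "Conditional Split": ROW,
    "Union All": PARTIAL, "Term Lookup": PARTIAL, "Data Mining Query": PARTIAL,
    "Merge": PARTIAL, "Merge Join": PARTIAL, "Pivot": PARTIAL, "Unpivot": PARTIAL,
    "Aggregate": FULL, "Sort": FULL, "Fuzzy Lookup": FULL,
    "Fuzzy Grouping": FULL, "Row Sampling": FULL, "Term Extraction": FULL,
}


def blocking_steps(pipeline):
    """Return (name, category) for each step that is not a row transformation.

    Names missing from CATEGORY are reported as "unknown" so they get reviewed.
    """
    flagged = []
    for name in pipeline:
        category = CATEGORY.get(name, "unknown")
        if category != ROW:
            flagged.append((name, category))
    return flagged


# Hypothetical data flow: the Sort and the Merge Join are the steps to question first.
pipeline = ["Derived Column", "Sort", "Merge Join", "Lookup"]
print(blocking_steps(pipeline))
```

Every flagged step is a candidate for moving the work into the source query, for example sorting or joining in the database.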
  6. Transformations - Lookup
       - Lookup
           - Full and partial cache modes hold reference data in memory (a full cache is loaded during the pre-execute phase).
           - The memory is not deallocated until the package completes.
       - The more full-cache lookups a package has, the more RAM it takes.
       - Solution: replace the lookup with a LEFT JOIN in the source query wherever possible.
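A minimal sketch of the LEFT JOIN alternative, assuming the reference table lives in the same database as the source. SQLite (from Python's standard library) stands in for SQL Server here so the example runs anywhere, and the table and column names (`stg_sales`, `dim_customer`, and so on) are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the source SQL Server database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_sales (sale_id INTEGER, customer_code TEXT, amount REAL);
    CREATE TABLE dim_customer (customer_key INTEGER, customer_code TEXT);
    INSERT INTO stg_sales VALUES (1, 'C1', 10.0), (2, 'C2', 20.0), (3, 'C9', 5.0);
    INSERT INTO dim_customer VALUES (100, 'C1'), (200, 'C2');
""")

# The join does the lookup's work in the database, so the data flow
# never has to cache dim_customer in memory.
SOURCE_QUERY = """
    SELECT s.sale_id, s.amount, d.customer_key
    FROM stg_sales AS s
    LEFT JOIN dim_customer AS d ON d.customer_code = s.customer_code
    ORDER BY s.sale_id
"""
rows = conn.execute(SOURCE_QUERY).fetchall()

# Rows with no match keep a NULL key, like a lookup's no-match output.
matched = [r for r in rows if r[2] is not None]
unmatched = [r for r in rows if r[2] is None]
print(matched, unmatched)
```

A query of this shape would go into the OLE DB source; a Conditional Split on the NULL key can then route unmatched rows.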
  7. Transformations - Tips
       - Try to join data in the database.
       - For Merge, use data already sorted at the source and set the sort properties (IsSorted, SortKeyPosition) instead of adding a Sort.
       - Use new features of 2008 such as CDC, which is very promising and can effectively replace Type 2 handling in SSIS.
       - Try to use the T-SQL MERGE statement.
  8. Destination - Optimization
       - OLE DB insert options:
       - The commit size should always equal the number of rows per buffer in the data flow.
       - Uncheck "Check Constraints" if you are sure of the data quality.
       - Apply a table lock for faster access (except for Type 2).
       - Decide rows per batch based on the number of rows per buffer.
  9. Destination ...
       - SQL Server Destination: works well for small datasets.
           - Average improvement of 20% found on loading.
           - Documented limitations on error handling.
       - Data compression in 2008: loading took about 30% longer when the data was compressed, but SELECT queries were considerably faster.
       - During the data loading process, change the recovery model to simple.
  10. Destination ...
       - Enable trace flag 610 for bulk operations such as index rebuilds and bulk loading.
       - If the target table has a clustered index, an ordered insert will improve performance.
  11. Destination ...
       - Index strategy for data loading, based on the delta:
           - Single clustered index: leave as is.
           - Single nonclustered index and data > 100%: drop and reload.
           - Multiple nonclustered indexes and data ? 10%: drop and reload.
       - Always remember that SQL Server auto-updates statistics only on a 20% increase in data.
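The statistics point can be checked with simple arithmetic after each load. This is a rough sketch that assumes only the 20% figure quoted on the slide; the function name and the row counts are hypothetical.

```python
def auto_stats_due(rows_at_last_update, rows_changed, threshold=0.20):
    """Return True if the change ratio reaches the slide's 20% auto-update figure.

    rows_at_last_update: table row count when statistics were last refreshed.
    rows_changed: rows inserted or modified since then.
    """
    if rows_at_last_update <= 0:
        # Empty table: any change makes the old statistics meaningless.
        return rows_changed > 0
    return rows_changed / rows_at_last_update >= threshold


# A 5% nightly delta on a 1,000,000-row table does not reach the threshold,
# so the load should refresh statistics explicitly.
print(auto_stats_due(1_000_000, 50_000))
print(auto_stats_due(1_000_000, 250_000))
```

When the check is false after a sizeable load, an explicit statistics update step at the end of the package keeps query plans accurate.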
  12. Design Issues - Buffer Memory
       - Data flow buffer memory:
           - Tweaking the data flow buffer settings can give better performance.
           - Find the optimum size by trial and error under production-like load.
           - Remove unnecessary columns.
           - BLOB storage / buffer temp paths: point them to a fast drive; by default they use the temp path from the environment variable.
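As a starting point for that trial and error, the number of rows that fit in one buffer can be estimated from the row width. The sketch below is a simplified approximation, not the engine's exact algorithm: it assumes the default `DefaultBufferSize` of 10 MB and `DefaultBufferMaxRows` of 10,000, and the row widths are hypothetical.

```python
DEFAULT_BUFFER_SIZE = 10 * 1024 * 1024  # DefaultBufferSize default, in bytes
DEFAULT_BUFFER_MAX_ROWS = 10_000        # DefaultBufferMaxRows default


def rows_per_buffer(row_bytes,
                    buffer_size=DEFAULT_BUFFER_SIZE,
                    max_rows=DEFAULT_BUFFER_MAX_ROWS):
    """Estimate rows per buffer: capped by the row limit and by the byte limit."""
    if row_bytes <= 0:
        raise ValueError("row_bytes must be positive")
    return min(max_rows, buffer_size // row_bytes)


# Narrow rows hit the row cap; wide rows hit the size cap.
print(rows_per_buffer(200))
print(rows_per_buffer(4000))
```

The estimate is also a candidate value for the destination commit size, which the destination slide says should match the rows per buffer. Removing unused columns shrinks `row_bytes`, which is why it raises the rows carried per buffer.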
  13. Design Issues - OLE DB Command (SCD)
       - Update and insert issues:
           - Locking and possible lock escalation.
           - Delay in loading.
       - Solution: create a temporary staging table and replace the OLE DB Command with an OLE DB bulk insert into it.
       - Add a new Execute SQL task to apply the updates as one batch.
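The staging pattern can be sketched as two steps: bulk insert the changed rows, then apply one set-based UPDATE. SQLite stands in for SQL Server so the example is runnable, the table names are hypothetical, and on SQL Server the second step would be the statement inside the Execute SQL task.

```python
import sqlite3

# In-memory SQLite database standing in for the target SQL Server database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, price REAL);
    CREATE TABLE stg_product_updates (product_id INTEGER PRIMARY KEY, price REAL);
    INSERT INTO dim_product VALUES (1, 9.99), (2, 19.99), (3, 29.99);
""")

# Rows the data flow identified as changed.
changed_rows = [(1, 10.49), (3, 27.50)]

# Step 1: bulk insert into the staging table (the data flow's job),
# instead of firing one UPDATE per row from an OLE DB Command.
conn.executemany("INSERT INTO stg_product_updates VALUES (?, ?)", changed_rows)

# Step 2: a single set-based UPDATE (the Execute SQL task's job).
conn.execute("""
    UPDATE dim_product
    SET price = (SELECT s.price
                 FROM stg_product_updates AS s
                 WHERE s.product_id = dim_product.product_id)
    WHERE product_id IN (SELECT product_id FROM stg_product_updates)
""")
conn.commit()

result = conn.execute(
    "SELECT product_id, price FROM dim_product ORDER BY product_id"
).fetchall()
print(result)
```

One statement over the whole batch replaces thousands of single-row updates, which is what removes the per-row round trips behind the loading delay.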
  14. Design Issues
       - Always use a query in the Lookup; do not rely on the default table selection.
       - Use NOLOCK wherever possible; it will improve large table scans.
       - Try to use a shared Lookup when tables are reused.
       - Do CAST and CONVERT in SQL rather than in SSIS.
       - Sort at the source.
       - Use MERGE instead of SCD.
  15. Measuring Performance
       - Performance counters:
           - Buffers Spooled: should be as low as 0. It is the number of buffers that were written to BLOB storage, which indicates that RAM was exhausted and buffers were written to the file system.
           - Disk I/O: Disk Per/Sec should be less than 10 for optimum performance.
       - Try to dissect your SSIS package to analyze performance.
           - Example: use a Row Count as the target to test source speed.
  16. Myths
       - SSIS is not a service; it is an EXE.
       - The SSIS service installed on the server is only for monitoring purposes.
  17. Editions of SQL Server
       - SQL Server 2008 R2
           - Parallel DWH and SSRS improvements
       - SQL Server 2012
           - Cloud-ready SQL Server, better UI
           - Data tap, deployment wizard, etc.
           - Undo/redo and a couple of other transforms
  18. Blogs and Materials
       - MVPs: Brian Knight, Jamie, Phil Brammer, Rafael Salas, ...
       - Blue shirt guys: Matt Mason, Denny Lee, Bob Bojanic, David Noor, Matt Carroll, Thomas Kejser
       - SQL CAT team blog
