2012-03-29 (PASS BI Virtual Chapter) Blazing SSIS! High Performance Design Techniques

SSIS is a flexible tool that can be used for various data loading scenarios. However, when considered alongside the features offered by the SQL Server relational database engine, a downside emerges: there can be multiple approaches to particular solutions. This can negatively affect performance.

In a step-by-step fashion with live demos, you will be introduced to an approach for performance monitoring and tuning in SSIS. Along the way you will learn best practice design patterns and configuration tweaks that will make your packages fly!

Published in: Technology

  1. Blazing SSIS! High Performance Design Techniques
     Bhavik Merchant
     bhavik.merchant@gmail.com
     bhavik.merchant@csg.com.au
  2. Tweet, tweet..
     • Twitter: @BhavikMerchant
     • HashTag: #SQLPASS
  3. Agenda in a Nutshell
     • Introductions
     • Rationale
     • SSIS internal architecture, tuning approach
     • Source, Transforms, Destination
     • Updates, Data Flow tweaks
     • Advanced approaches
     • Tips
     • Testing approach
     • Q&A
  4. Brief Speaker Intro
     • Background
       – Certified End-to-End Microsoft BI practitioner
       – Team Lead and Consultant at CSG
       – Microsoft vTSP for BI
       – Trainer (SSAS, SSIS, SSRS, PowerPivot, SharePoint BI)
     • Experience
       – Variety of BI projects from SQL 2000 to 2008 R2 and MOSS 2007 to SP2010 over the past 6+ years
  5. Why this topic?
     • Many ways to skin a cat
       – SSIS vs SQL, e.g. “transforms”
       – Sometimes multiple approaches even within SSIS
     • Some settings are overlooked
     • Some settings are obscure
     • Will see examples as we go along
  6. SSIS Tuning – Setting the Stage
     • SSIS Architecture
       – In-memory engine. But spooling can occur.
       – Pipeline, Buffers. Want to minimise size and number
     • SSIS Pipeline
       – We’ll follow this from source to destination
       – Good way to tune your packages
       – Isolate from downstream using RowCount/Multicast
  7. Examining the Pipeline - Source
     • Tune the source
       – I thought we were inserting… OK then, why?
     • How?
       – Reduce number of rows – obvious
       – Reduce number of columns – buffers
       – Reduce column width – buffers again
       – SQL command instead of Table/View
       – FastParse on flat files (only numerics and date/time)
     • DEMO!
  8. Examining the Pipeline - Transforms
  9. Examining the Pipeline - Transforms
     • Also needs design consideration/tuning
     • Similar principles to Source apply
       – Reduce number of rows (Conditional Split)
       – Reduce number of columns
       – Perform Transforms in source (Sort, Aggregate, Trim)
     • Synchronous
       – Streaming, Row-based
     • .. vs Asynchronous
       – Partially Blocking, Blocking (beware – memory!)
     • Buffers, pointer passing, creation
     • DEMO will explain the Dam!
  10. Examining the Pipeline - Destination
     • Choice of Destination component
       – Fastest is “SQL Server”, with a limitation: the package must run on the same machine as the SQL Server instance
     • Fast Load with OLE DB Destination – batch based
     • Tablock ON, Check constraints OFF
     • Maximum insert commit size
       – 0 for heaps
       – 10k to 1m for B-Trees (table with clustered index)
       – Default value means commit whole batch. May lead to locks, large transaction log
     • Indexes
       – Drop then re-create after
     • DEMO!
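The trade-off behind the maximum insert commit size can be illustrated outside SSIS. The following is a minimal Python sketch, assuming an in-memory SQLite database as a stand-in for SQL Server; the table `target` and the function `load_in_batches` are hypothetical names and not part of SSIS. It commits after every `commit_size` rows, so a smaller value gives more, smaller transactions.

```python
import sqlite3
from itertools import islice


def load_in_batches(conn, rows, commit_size):
    """Insert rows, committing after every commit_size rows.

    Returns the number of commits issued. This mimics the idea behind
    a commit size setting; it is not how SSIS itself is implemented.
    """
    if commit_size <= 0:
        raise ValueError("commit_size must be positive in this sketch")
    iterator = iter(rows)
    commits = 0
    while True:
        batch = list(islice(iterator, commit_size))
        if not batch:
            break
        conn.executemany("INSERT INTO target (id, val) VALUES (?, ?)", batch)
        conn.commit()  # one transaction per batch
        commits += 1
    return commits


# Hypothetical demo data: 25 rows loaded with a commit size of 10.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, val TEXT)")
rows = [(i, f"row-{i}") for i in range(25)]
commit_count = load_in_batches(conn, rows, commit_size=10)
row_count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(commit_count, row_count)
```

In this sketch, a single commit for the whole load corresponds to passing a `commit_size` at least as large as the row count, which keeps one large transaction open for the entire load.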
  11. Updates – where SQL shines
     • Only mechanism for updates within a data flow is the OLE DB Command transform
     • It's synchronous and row-by-row
       – So this is good, right?
       – No. An UPDATE statement is issued for each row!
     • Instead, lean on SQL Server for a set-based approach via temp tables and the Execute SQL Task
     • DEMO!
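The difference between the two update styles can be sketched as follows. This is a minimal Python illustration, assuming an in-memory SQLite database in place of SQL Server; the tables `customer` and `customer_stage` and both function names are hypothetical. The row-by-row function issues one UPDATE per changed row, while the set-based function loads the changes into a staging table and issues a single UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE customer_stage (id INTEGER PRIMARY KEY, city TEXT);
""")
conn.executemany(
    "INSERT INTO customer (id, city) VALUES (?, ?)",
    [(1, "Perth"), (2, "Sydney"), (3, "Hobart")],
)
conn.commit()


def update_row_by_row(conn, changes):
    """One UPDATE statement per row, shown only for contrast."""
    statements = 0
    for customer_id, city in changes:
        conn.execute(
            "UPDATE customer SET city = ? WHERE id = ?", (city, customer_id)
        )
        statements += 1
    conn.commit()
    return statements


def update_set_based(conn, changes):
    """Stage the changes, then apply them with a single UPDATE."""
    conn.execute("DELETE FROM customer_stage")
    conn.executemany(
        "INSERT INTO customer_stage (id, city) VALUES (?, ?)", changes
    )
    conn.execute("""
        UPDATE customer
        SET city = (SELECT s.city FROM customer_stage AS s
                    WHERE s.id = customer.id)
        WHERE id IN (SELECT id FROM customer_stage)
    """)
    conn.commit()
    return 1  # number of UPDATE statements issued


changes = [(1, "Melbourne"), (3, "Darwin")]
update_count = update_set_based(conn, changes)
final_rows = conn.execute(
    "SELECT id, city FROM customer ORDER BY id"
).fetchall()
print(update_count, final_rows)
```

On SQL Server the single statement would typically be written as an UPDATE joined to the staging table and run from an Execute SQL Task after the data flow has filled that table.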
  12. Tweaking the Data Flow Component
     • Adjust data flow default buffer size
       – DefaultBufferMaxRows
       – DefaultBufferSize (bytes)
       – EngineThreads, incrementally
     • Beware of oversizing buffers – spooling
       – BufferTempStoragePath
       – BLOBTempStoragePath
     • DEMO!
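How the two buffer settings interact with row width can be approximated with simple arithmetic. The sketch below is a simplified model, not the engine's actual sizing algorithm: it assumes the rows per buffer are capped by whichever limit is hit first, and it uses the commonly documented defaults of 10,000 rows and 10 MB. The function names and sample row widths are hypothetical.

```python
# Commonly documented defaults for the data flow task properties.
DEFAULT_BUFFER_MAX_ROWS = 10_000
DEFAULT_BUFFER_SIZE = 10 * 1024 * 1024  # bytes


def rows_per_buffer(row_width_bytes,
                    max_rows=DEFAULT_BUFFER_MAX_ROWS,
                    buffer_size=DEFAULT_BUFFER_SIZE):
    """Simplified estimate: the smaller of the row cap and the byte cap."""
    if row_width_bytes <= 0:
        raise ValueError("row_width_bytes must be positive")
    return min(max_rows, buffer_size // row_width_bytes)


def buffers_needed(total_rows, rows_in_buffer):
    """Ceiling division: how many buffers a row set would occupy."""
    return -(-total_rows // rows_in_buffer)


wide = rows_per_buffer(4000)    # wide rows hit the byte cap first
narrow = rows_per_buffer(200)   # narrow rows hit the row cap first
print(wide, buffers_needed(1_000_000, wide))
print(narrow, buffers_needed(1_000_000, narrow))
```

Under this model, trimming columns at the source moves a data flow from the first case toward the second, which is why narrower rows mean fewer buffers for the same row count.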
  13. Taking it further…
     • Raw Files for transformation isolation
     • Table Partitioning
       – Load work partition, then switch
       – Parallel parameter-based loads – can also align with physical storage
     • Balanced Data Distributor – use when the source is not the bottleneck. Good for heaps.
     • Adjust MaxConcurrentExecutables at package level. Default (-1) = Logical Cores + 2
     • Network packet size – default 4096 (4k). Can increase in the connection string. Recommend 32767 (32k) for less network overhead
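The last two settings above can be expressed as a small sketch. It is illustrative only: the helper names, the server name `MYSERVER` and the database name `Staging` are hypothetical, and the `Packet Size` keyword is assumed to be the one accepted by the provider in use, so check the provider's documentation before relying on it.

```python
import os


def effective_max_concurrent_executables(setting, logical_cores=None):
    """Resolve the -1 default to logical cores + 2, as stated on the slide."""
    if logical_cores is None:
        logical_cores = os.cpu_count() or 1
    if setting == -1:
        return logical_cores + 2
    if setting < 1:
        raise ValueError("setting must be -1 or a positive integer")
    return setting


def with_packet_size(connection_string, packet_size=32767):
    """Return the connection string with a single Packet Size entry."""
    parts = [p for p in connection_string.split(";") if p.strip()]
    parts = [
        p for p in parts
        if not p.strip().lower().startswith("packet size=")
    ]
    parts.append(f"Packet Size={packet_size}")
    return ";".join(parts) + ";"


# Hypothetical connection string for illustration.
base = ("Provider=SQLNCLI10;Data Source=MYSERVER;"
        "Initial Catalog=Staging;Integrated Security=SSPI;")
tuned = with_packet_size(base)
print(effective_max_concurrent_executables(-1, logical_cores=8))
print(tuned)
```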
  14. General Tips
     • Use views or stored procs in sources where possible
     • Database Compression = less I/O
     • Available Server Memory
       – NOT part of the SQL Server DBMS Memory Pool
       – Rule of thumb:
         • Host OS up to 1.5 GB
         • SQL DBMS, SSRS, SSAS all have separate allocations
         • SSIS uses what's left. Important for large cached Lookups
  15. Benchmarking/Testing
     • Initial build, benchmark
     • Do at least 3 tests then average - Excel
     • Change one thing at a time! Then retest, record results
     • Testing – BIDS is not a true representation. Neither is your workstation.
     • Can use Perfmon to monitor SSIS counters. Combine with SSIS log events (e.g. PipelineComponentTime) and DMVs – especially sys.dm_os_wait_stats
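The "at least 3 tests then average" routine can also be scripted instead of recorded by hand. Below is a minimal Python sketch, assuming the thing being measured is wrapped in a callable; `benchmark` and `sample_workload` are hypothetical names, and the sample workload is a placeholder rather than a real package run.

```python
import statistics
import time


def benchmark(run, repeats=3):
    """Time run() the given number of times and summarise the results."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        timings.append(time.perf_counter() - start)
    return {
        "runs": timings,
        "mean": statistics.mean(timings),
        "min": min(timings),
        "max": max(timings),
    }


def sample_workload():
    """Placeholder for a package execution."""
    sum(i * i for i in range(100_000))


result = benchmark(sample_workload, repeats=3)
print(f"mean {result['mean']:.4f}s over {len(result['runs'])} runs")
```

In practice the callable would launch the package on the target server, and the summary would be recorded once per configuration change so that each change can be compared against the baseline.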
  16. Summary
     • Start from source and work through to destination
     • Isolate from downstream or upstream components
     • Benchmark before you start tuning
     • Resist the urge to change many things at once
     • Create a development/tuning checklist
  17. Thanks for listening… Question and Answer
  18. Related Links
     • http://sqlblog.com/blogs/jamie_thomson Jamie Thomson “SSIS Junkie”
     • http://toddmcdermid.blogspot.com Todd McDermid (Dimension Merge SCD)
     • http://blogs.msdn.com/b/mattm/ Matt Masson - SSIS Team Blog
     • http://sqlcat.com, http://sql-server-performance.com Look for SSIS best practices
     • http://bidshelper.codeplex.com
     • http://dimensionmergescd.codeplex.com
