2012-03-29 (PASS BI Virtual Chapter) Blazing SSIS! High Performance Design Techniques

SSIS is a flexible tool that can be used for various data loading scenarios. However, when considered alongside the features offered by the SQL Server relational database engine, a downside emerges: there can be multiple approaches to a particular solution, and the choice can negatively affect performance.

In a step-by-step fashion with live demos, you will be introduced to an approach for performance monitoring and tuning in SSIS. Along the way you will learn best-practice design patterns and configuration tweaks that will make your packages fly!

Published in Technology

Transcript

  • 1. Blazing SSIS! High Performance Design Techniques
    • Bhavik Merchant
    • bhavik.merchant@gmail.com
    • bhavik.merchant@csg.com.au
  • 2. Tweet, tweet..
    • Twitter: @BhavikMerchant
    • HashTag: #SQLPASS
  • 3. Agenda in a Nutshell
    • Introductions
    • Rationale
    • SSIS internal architecture, tuning approach
    • Source, Transforms, Destination
    • Updates, Data Flow tweaks
    • Advanced approaches
    • Tips
    • Testing approach
    • Q&A
  • 4. Brief Speaker Intro
    • Background
        – Certified End-to-End Microsoft BI practitioner
        – Team Lead and Consultant at CSG
        – Microsoft vTSP for BI
        – Trainer (SSAS, SSIS, SSRS, PowerPivot, SharePoint BI)
    • Experience
        – Variety of BI projects from SQL 2000 to 2008 R2 and MOSS 2007 to SP2010 over the past 6+ years
  • 5. Why this topic?
    • Many ways to skin a cat
        – SSIS vs SQL e.g. “transforms”
        – Sometimes multiple approaches even within SSIS
    • Some settings are overlooked
    • Some settings are obscure
    • Will see examples as we go along
  • 6. SSIS Tuning – Setting the Stage
    • SSIS Architecture
        – In-memory engine. But spooling can occur.
        – Pipeline, Buffers. Want to minimise size and number
    • SSIS Pipeline
        – We’ll follow this from source to destination
        – Good way to tune your packages
        – Isolate from downstream using RowCount/Multicast
  • 7. Examining the Pipeline - Source
    • Tune the source
        – I thought we were inserting… Ok then, why?
    • How?
        – Reduce number of rows – obvious
        – Reduce number of columns – buffers
        – Reduce column width – buffers again
        – SQL command instead of Table/View
        – FastParse on flat files (only numerics and date/time)
    • DEMO!
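
The "SQL command instead of Table/View" advice can be sketched outside SSIS. The example below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for the real source database, and the `sales` table, its columns, and the `build_source_query` helper are hypothetical names that do not come from the talk. It compares reading the whole table with reading only the rows and columns the data flow needs.

```python
import sqlite3


def build_source_query(table, columns, where=None):
    """Build a narrow source query: explicit column list, optional row filter."""
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if where:
        sql += f" WHERE {where}"
    return sql


# Hypothetical source table, created in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount REAL, notes TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [(i, "EU" if i % 2 else "US", i * 1.5, "x" * 200) for i in range(1000)],
)

# Pointing the source at the table: every row and every column enters the pipeline.
wide_rows = conn.execute("SELECT * FROM sales").fetchall()

# Using a SQL command: only the needed columns, filtered at the source.
narrow_sql = build_source_query("sales", ["id", "amount"], where="region = 'EU'")
narrow_rows = conn.execute(narrow_sql).fetchall()

print("wide:  ", len(wide_rows), "rows x", len(wide_rows[0]), "columns")
print("narrow:", len(narrow_rows), "rows x", len(narrow_rows[0]), "columns")
```

Dropping the wide `notes` column and half the rows at the source means less data has to be carried through every downstream buffer.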
  • 8. Examining the Pipeline - Transforms
  • 9. Examining the Pipeline - Transforms
    • Also needs design consideration/tuning
    • Similar principles to Source apply
        – Reduce number of rows (Cond split)
        – Reduce number of columns
        – Perform Transforms in source (Sort, Aggregate, Trim)
    • Synchronous
        – Streaming, Row-based
    • vs. Asynchronous
        – Partially Blocking, Blocking (beware – memory!)
    • Buffers, pointer passing, creation
    • DEMO will explain the Dam!
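
The difference between a synchronous (streaming) transform and a blocking one can be shown with plain Python generators. This is an analogy only, not how the SSIS engine is implemented, and all function names are hypothetical. The `source` generator records how many rows have been pulled from it, so we can see how much input each kind of transform needs before it emits its first output row.

```python
def source(values, consumed):
    """Hypothetical source that records every row pulled from it."""
    for value in values:
        consumed.append(value)
        yield value


def streaming_trim(rows):
    """Synchronous-style transform: emits each row as soon as it is read."""
    for row in rows:
        yield row.strip()


def blocking_sort(rows):
    """Blocking-style transform: must hold the whole input before emitting anything."""
    buffered = list(rows)  # the entire input sits in memory here
    yield from sorted(buffered)


values = [" c ", " a ", " b "]

consumed = []
first = next(streaming_trim(source(values, consumed)))
rows_read_streaming = len(consumed)

consumed = []
first_sorted = next(blocking_sort(source(values, consumed)))
rows_read_blocking = len(consumed)

print("streaming read", rows_read_streaming, "row(s) before first output")
print("blocking read", rows_read_blocking, "row(s) before first output")
```

The streaming transform needs one input row to produce one output row, while the sort has to drain the source first, which is why blocking transforms are the ones to watch for memory pressure.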
  • 10. Examining the Pipeline - Destination
    • Choice of Destination component
        – Fastest is “SQL Server”, with limitation
    • Fast Load with OLEDB Destination – batch based
    • Tablock ON, Check constraints OFF
    • Maximum insert commit size
        – 0 for heaps
        – 10k to 1m for B-Trees (table with clustered index)
        – Default value means commit whole batch. May lead to locks, large transaction log
    • Indexes – Drop then re-create after
    • DEMO!
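
The effect of a commit size can be sketched with a simple batching loop. This is a simplified model, not the OLE DB Destination itself: sqlite3 stands in for SQL Server, the `target` table and the `load_in_batches` helper are hypothetical, and a `commit_size` of 0 is modelled here as "one commit for the whole load".

```python
import sqlite3


def load_in_batches(conn, rows, commit_size):
    """Insert rows, committing every `commit_size` rows.

    A commit_size of 0 means a single commit at the end of the load.
    Returns the number of commits issued.
    """
    commits = 0
    pending = 0
    for row in rows:
        conn.execute("INSERT INTO target (id, val) VALUES (?, ?)", row)
        pending += 1
        if commit_size and pending >= commit_size:
            conn.commit()
            commits += 1
            pending = 0
    if pending:
        # Flush the final partial batch (or the whole load when commit_size is 0).
        conn.commit()
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER, val TEXT)")
rows = [(i, f"v{i}") for i in range(25)]

commits_batched = load_in_batches(conn, rows, commit_size=10)
commits_single = load_in_batches(conn, rows, commit_size=0)
total_rows = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]

print(commits_batched, commits_single, total_rows)
```

Smaller batches keep each transaction short at the cost of more commits; a single commit holds everything in one transaction, which is where the locking and transaction log concerns on the slide come from.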
  • 11. Updates – where SQL shines
    • Only mechanism for updates within a data flow is the OLEDB Command transform
    • It's synchronous and row-by-row
        – So this is good, right?
        – No. An UPDATE statement is issued for each row!
    • Instead, lean on SQL Server for a set-based approach via temp tables and the Execute SQL Task
    • DEMO!
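
The two update patterns can be compared in a small sketch. The assumptions are loud ones: sqlite3 stands in for SQL Server, the `customer` and `staging` tables and both helper functions are hypothetical, and the set-based statement uses a correlated subquery so that it runs on any SQLite version (in T-SQL an `UPDATE ... FROM` join against the staging table would be the usual form).

```python
import sqlite3


def fresh_db():
    """Create a small hypothetical target table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, city TEXT)")
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?)", [(i, "old") for i in range(1, 6)]
    )
    return conn


def update_row_by_row(conn, changes):
    """Row-by-row pattern: one UPDATE statement per incoming row."""
    statements = 0
    for cust_id, city in changes:
        conn.execute("UPDATE customer SET city = ? WHERE id = ?", (city, cust_id))
        statements += 1
    return statements


def update_set_based(conn, changes):
    """Set-based pattern: bulk-load the changes, then issue a single UPDATE."""
    conn.execute("CREATE TEMP TABLE staging (id INTEGER PRIMARY KEY, city TEXT)")
    conn.executemany("INSERT INTO staging VALUES (?, ?)", changes)
    conn.execute(
        "UPDATE customer "
        "SET city = (SELECT s.city FROM staging s WHERE s.id = customer.id) "
        "WHERE id IN (SELECT id FROM staging)"
    )
    conn.execute("DROP TABLE staging")
    return 1  # one UPDATE statement, regardless of how many rows changed


changes = [(2, "Perth"), (4, "Sydney")]  # hypothetical changed rows

conn_a, conn_b = fresh_db(), fresh_db()
row_statements = update_row_by_row(conn_a, changes)
set_statements = update_set_based(conn_b, changes)

query = "SELECT id, city FROM customer ORDER BY id"
result_a = conn_a.execute(query).fetchall()
result_b = conn_b.execute(query).fetchall()
print(row_statements, set_statements, result_a == result_b)
```

Both paths leave the table in the same state; the difference is that the row-by-row count grows with the number of changed rows while the set-based path stays at one UPDATE.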
  • 12. Tweaking the Data Flow Component
    • Adjust data flow default buffer size
        – DefaultBufferMaxRows
        – DefaultBufferSize (bytes)
        – EngineThreads, incrementally
    • Beware of oversizing buffers – spooling
        – BufferTempStoragePath
        – BLOBTempStoragePath
    • DEMO!
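
How the two buffer properties interact can be approximated with a little arithmetic. The sketch below is a deliberately simplified model (the real engine also applies a minimum buffer size and its own row-width estimate), and the row widths and row count are hypothetical. The two defaults used, 10,000 rows and 10 MB, are the SSIS defaults for `DefaultBufferMaxRows` and `DefaultBufferSize`.

```python
DEFAULT_BUFFER_MAX_ROWS = 10_000        # SSIS default for DefaultBufferMaxRows
DEFAULT_BUFFER_SIZE = 10 * 1024 * 1024  # SSIS default for DefaultBufferSize (bytes)


def rows_per_buffer(row_bytes,
                    max_rows=DEFAULT_BUFFER_MAX_ROWS,
                    buffer_bytes=DEFAULT_BUFFER_SIZE):
    """Simplified model: a buffer holds whichever limit is hit first."""
    if row_bytes <= 0:
        raise ValueError("row_bytes must be positive")
    return min(max_rows, buffer_bytes // row_bytes)


def buffers_needed(total_rows, row_bytes, **limits):
    """Number of buffers required to move total_rows through the pipeline."""
    per_buffer = rows_per_buffer(row_bytes, **limits)
    return -(-total_rows // per_buffer)  # ceiling division


# Hypothetical row widths: a wide 2,000-byte row vs. a trimmed 500-byte row.
wide = rows_per_buffer(2_000)
narrow = rows_per_buffer(500)
print("rows per buffer:", wide, "vs", narrow)
print("buffers for 1M rows:",
      buffers_needed(1_000_000, 2_000), "vs", buffers_needed(1_000_000, 500))
```

With wide rows the byte limit is reached before the row limit, so either the rows need trimming or `DefaultBufferSize` needs raising; with narrow rows the row limit is the binding one, so raising `DefaultBufferMaxRows` is the setting that matters.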
  • 13. Taking it further…
    • Raw Files for transformation isolation
    • Table Partitioning
        – Load work partition, then switch
        – Parallel parameter based loads – can also align with physical storage
    • Balanced Data Distributor – for when the source is not the bottleneck. Good for heaps.
    • Adjust MaxConcurrentExecutables at package level. Default (-1) = Logical Cores + 2
    • Network packet size – default 4096 (4k). Can increase in the connection string. Recommend 32767 (32k) for less network overhead
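
The last two settings on this slide can be expressed as a short sketch. Both helper functions are hypothetical, the sample connection string (server, database, and provider values) is made up for illustration, and the `Packet Size` keyword is an assumption about the provider in use, so check the documentation for your provider before relying on it.

```python
def effective_max_concurrent_executables(setting, logical_cores):
    """Resolve MaxConcurrentExecutables: -1 means logical cores + 2."""
    if setting == -1:
        return logical_cores + 2
    if setting < 1:
        raise ValueError("setting must be -1 or a positive integer")
    return setting


def with_packet_size(connection_string, packet_size=32767):
    """Return the connection string with a single Packet Size entry appended.

    'Packet Size' is assumed to be the keyword the provider understands.
    """
    parts = [
        part.strip()
        for part in connection_string.split(";")
        if part.strip() and not part.strip().lower().startswith("packet size=")
    ]
    parts.append(f"Packet Size={packet_size}")
    return ";".join(parts) + ";"


# Hypothetical connection string, for illustration only.
base = "Data Source=myServer;Initial Catalog=myDb;Provider=SQLNCLI10.1;Integrated Security=SSPI;"

print(effective_max_concurrent_executables(-1, logical_cores=8))
print(with_packet_size(base))
```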
  • 14. General Tips
    • Use views or stored procs in sources where possible
    • Database Compression = less I/O
    • Available Server Memory
        – NOT part of the SQL Server DBMS Memory Pool
        – Rule of thumb:
            • Host OS up to 1.5 GB
            • SQL DBMS, SSRS, SSAS all have separate allocations
            • SSIS uses what's left - Important for large cached Lookups
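
The memory rule of thumb is simple subtraction, sketched below. The function name and the example server figures (32 GB total, 20 GB for the SQL Server engine, 4 GB for SSAS, 2 GB for SSRS) are hypothetical; only the 1.5 GB host OS allowance comes from the slide.

```python
def memory_left_for_ssis(total_gb, sql_max_gb, ssas_gb=0.0, ssrs_gb=0.0, os_gb=1.5):
    """Rough estimate of the memory (GB) SSIS can use on a shared server.

    SSIS is not part of the SQL Server memory pool, so it gets whatever
    remains after the OS and the other services take their allocations.
    """
    left = total_gb - os_gb - sql_max_gb - ssas_gb - ssrs_gb
    return max(left, 0.0)


# Hypothetical 32 GB server shared by the engine, SSAS and SSRS.
available = memory_left_for_ssis(32, sql_max_gb=20, ssas_gb=4, ssrs_gb=2)
print(available, "GB left for SSIS buffers and cached Lookups")
```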
  • 15. Benchmarking/Testing
    • Initial build, benchmark
    • Do at least 3 tests, then average (Excel)
    • Change one thing at a time! Then retest, record results
    • Testing – BIDS is not a true representation. Neither is your workstation.
    • Can use Perfmon to monitor SSIS counters. Combine with SSIS log events (e.g. PipelineComponentTime) and DMVs – especially sys.dm_os_wait_stats
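
The "at least 3 tests, then average" routine can be scripted instead of kept in a spreadsheet. This is a minimal sketch: `benchmark` is a hypothetical helper, and `fake_package_run` is a placeholder for whatever actually launches the package on the target server.

```python
import time
from statistics import mean


def benchmark(run, repeats=3):
    """Time `run` several times and return (average seconds, individual timings)."""
    if repeats < 3:
        raise ValueError("use at least 3 runs before averaging")
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        durations.append(time.perf_counter() - start)
    return mean(durations), durations


def fake_package_run():
    """Hypothetical stand-in for executing a package; replace with a real launcher."""
    sum(range(100_000))


average, runs = benchmark(fake_package_run)
print(f"average {average:.6f}s over {len(runs)} runs")
```

Record the average alongside the one setting that changed, then repeat for the next change so the results stay comparable.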
  • 16. Summary
    • Start from source and work through to destination
    • Isolate from downstream or upstream components
    • Benchmark before you start tuning
    • Resist the urge to change many things at once
    • Create a development/tuning checklist
  • 17. Thanks for listening… Question and Answer
  • 18. Related Links
    • http://sqlblog.com/blogs/jamie_thomson – Jamie Thomson “SSIS Junkie”
    • http://toddmcdermid.blogspot.com – Todd McDermid (Dimension Merge SCD)
    • http://blogs.msdn.com/b/mattm/ – Matt Masson - SSIS Team Blog
    • http://sqlcat.com , http://sql-server-performance.com – Look for SSIS best practices
    • http://bidshelper.codeplex.com
    • http://dimensionmergescd.codeplex.com