SSIS is a flexible tool that can be used for various data loading scenarios. However, when considered alongside the features offered by the SQL Server relational database engine, a downside emerges: the same problem can often be solved in several ways, and the wrong choice can hurt performance.
Step by step, with live demos, you will be introduced to an approach for performance monitoring and tuning in SSIS. Along the way you will learn best-practice design patterns and configuration tweaks that will make your packages fly!
4. Brief Speaker Intro
• Background
– Certified End-to-End Microsoft BI practitioner
– Team Lead and Consultant at CSG
– Microsoft vTSP for BI
– Trainer (SSAS, SSIS, SSRS, PowerPivot, SharePoint BI)
• Experience
– Variety of BI projects from SQL Server 2000 to 2008 R2 and MOSS 2007 to SharePoint 2010, over the past 6+ years
5. Why this topic?
• Many ways to skin a cat
– SSIS vs SQL e.g. “transforms”
– Sometimes multiple approaches even within SSIS
• Some settings are overlooked
• Some settings are obscure
• Will see examples as we go along
6. SSIS Tuning – Setting the Stage
• SSIS Architecture
– In-memory engine, but buffers can still spool to disk
– Pipeline and buffers: aim to minimise buffer size and number
• SSIS Pipeline
– We’ll follow this from source to destination
– Good way to tune your packages
– Isolate a component from downstream ones using Row Count or Multicast
7. Examining the Pipeline - Source
• Tune the source
– I thought we were inserting… Ok then, why?
• How?
– Reduce number of rows – obvious
– Reduce number of columns – buffers
– Reduce column width – buffers again
– SQL command instead of Table/View
– FastParse on flat files (only numerics and date/time)
• DEMO!
9. Examining the Pipeline - Transforms
• Also needs design consideration/tuning
• Similar principles to Source apply
– Reduce number of rows (Conditional Split)
– Reduce number of columns
– Perform Transforms in source (Sort, Aggregate, Trim)
• Synchronous
– Streaming, Row-based
• ... vs Asynchronous
– Partially blocking or blocking (beware – memory!)
• Synchronous transforms reuse the incoming buffer (pointer passing); asynchronous transforms create new buffers
• DEMO will explain the Dam!
10. Examining the Pipeline - Destination
• Choice of Destination component
– Fastest is the “SQL Server” destination, with a limitation: the package must run on the destination server
• Fast Load with OLE DB Destination – batch based
• Table lock ON, Check constraints OFF
• Maximum insert commit size
– 0 for heaps
– 10K to 1M rows for B-trees (tables with a clustered index)
– Default value commits the whole load as one batch, which may lead to locking and a large transaction log
• Indexes – drop before the load, then re-create afterwards
• DEMO!
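The effect of the commit size setting can be sketched outside SSIS. The Python below is only an illustration, assuming an in-memory SQLite database as a stand-in for SQL Server; the table `target` and the helper `load_in_batches` are hypothetical names, not part of SSIS. In a package the equivalent control is the Maximum insert commit size property on the OLE DB Destination.

```python
import sqlite3


def load_in_batches(conn, rows, commit_size):
    """Insert rows, committing after every `commit_size` rows.

    commit_size=0 means one commit at the end, so the whole load
    runs as a single transaction. Returns the number of commits issued.
    """
    cur = conn.cursor()
    commits = 0
    pending = 0
    for row in rows:
        cur.execute("INSERT INTO target (id, amount) VALUES (?, ?)", row)
        pending += 1
        if commit_size and pending == commit_size:
            conn.commit()  # close the current batch
            commits += 1
            pending = 0
    if pending:
        conn.commit()  # commit the final, partial batch
        commits += 1
    return commits


# Hypothetical target table in an in-memory database (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, amount REAL)")

rows = [(i, i * 1.5) for i in range(25_000)]
commits = load_in_batches(conn, rows, commit_size=10_000)
row_count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(commits, row_count)
```

Smaller batches keep each transaction short at the cost of more commits; a single batch avoids commit overhead but holds locks and log space for the whole load.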
11. Updates – where SQL shines
• Only mechanism for updates within a data flow is the OLE DB Command transform
• It's synchronous and row-by-row
– So this is good, right?
– No! An UPDATE statement is issued for each row
• Instead, lean on SQL Server for a set-based approach via temp tables and the Execute SQL Task
• DEMO!
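The difference between the two update styles can be sketched as follows. This is a minimal illustration, assuming SQLite as a stand-in for SQL Server; the tables `customer` and `customer_staging` and both helper functions are hypothetical. In a package, the staging load would be a data flow and the single UPDATE would run in an Execute SQL Task, typically written in T-SQL as UPDATE ... FROM with a join rather than the correlated subquery used here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE customer_staging (id INTEGER PRIMARY KEY, city TEXT);
    INSERT INTO customer VALUES (1, 'Leeds'), (2, 'York'), (3, 'Bath');
""")

changes = [(1, "London"), (3, "Bristol")]


def update_row_by_row(conn, changes):
    """What a per-row command does: one UPDATE statement per changed row."""
    statements = 0
    for cust_id, city in changes:
        conn.execute("UPDATE customer SET city = ? WHERE id = ?", (city, cust_id))
        statements += 1
    return statements


def update_set_based(conn, changes):
    """Bulk-load the changes into staging, then apply them with one UPDATE."""
    conn.execute("DELETE FROM customer_staging")
    conn.executemany(
        "INSERT INTO customer_staging (id, city) VALUES (?, ?)", changes
    )
    conn.execute("""
        UPDATE customer
        SET city = (SELECT s.city FROM customer_staging AS s
                    WHERE s.id = customer.id)
        WHERE id IN (SELECT id FROM customer_staging)
    """)
    return 1  # a single UPDATE statement, however many rows changed


update_statements = update_set_based(conn, changes)
result = dict(conn.execute("SELECT id, city FROM customer"))
print(update_statements, result)
```

The row-by-row version issues as many UPDATE statements as there are changed rows, while the set-based version always issues one.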
12. Tweaking the Data Flow Component
• Adjust the data flow buffer and thread properties
– DefaultBufferMaxRows
– DefaultBufferSize (bytes)
– EngineThreads (increase incrementally)
• Beware of oversizing buffers: they can spool to disk
– BufferTempStoragePath
– BLOBTempStoragePath
• DEMO!
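How the two buffer properties interact with row width can be sketched with some arithmetic. This is a simplified model, not the engine's exact algorithm: it assumes a buffer holds the smaller of DefaultBufferMaxRows and however many estimated rows fit in DefaultBufferSize, and it uses the usual defaults of 10,000 rows and 10 MB. The function name `rows_per_buffer` is hypothetical.

```python
def rows_per_buffer(row_width_bytes,
                    default_buffer_max_rows=10_000,
                    default_buffer_size=10 * 1024 * 1024):
    """Estimate rows per buffer under a simplified model (assumption):
    the row cap and the byte cap both apply, and the tighter one wins.
    """
    if row_width_bytes <= 0:
        raise ValueError("row width must be positive")
    rows_that_fit = default_buffer_size // row_width_bytes
    return min(default_buffer_max_rows, rows_that_fit)


# A narrow row is limited by DefaultBufferMaxRows ...
narrow = rows_per_buffer(200)
# ... while a wide row is limited by DefaultBufferSize.
wide = rows_per_buffer(4_000)
# Raising the row cap only helps when the rows are narrow enough to fit.
narrow_tuned = rows_per_buffer(200, default_buffer_max_rows=50_000)

print(narrow, wide, narrow_tuned)
```

Under this model, trimming columns and column widths at the source is what lets a higher DefaultBufferMaxRows take effect without a larger DefaultBufferSize.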
13. Taking it further…
• Raw Files for transformation isolation
• Table Partitioning
– Load work partition, then switch
– Parallel parameter based loads
– Can also align with physical storage
• Balanced Data Distributor – useful when the source is not the bottleneck. Good for heaps.
• Adjust MaxConcurrentExecutables at package level. Default (-1) = logical cores + 2
• Network packet size – default 4096 (4 KB). Can be increased in the connection string. Recommend 32767 (32 KB) for less network overhead
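The MaxConcurrentExecutables default can be expressed as a small calculation. This is a sketch only: the function name is hypothetical, and treating any value other than -1 or a positive number as invalid is an assumption of the example.

```python
import os


def effective_max_concurrent_executables(setting, logical_cores):
    """Resolve the package-level setting to a concrete limit.

    -1 (the default) means logical cores + 2; a positive value is used as-is.
    Other values are rejected here (assumption for this sketch).
    """
    if setting == -1:
        return logical_cores + 2
    if setting < 1:
        raise ValueError("expected -1 or a positive number")
    return setting


default_on_8_cores = effective_max_concurrent_executables(-1, logical_cores=8)
explicit_limit = effective_max_concurrent_executables(4, logical_cores=8)

# On the machine running this sketch (os.cpu_count() can return None).
cores_here = os.cpu_count() or 1
default_here = effective_max_concurrent_executables(-1, cores_here)
print(default_on_8_cores, explicit_limit, default_here)
```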
14. General Tips
• Use views or stored procs in sources where possible
• Database Compression = less I/O
• Available Server Memory
– NOT part of the SQL Server DBMS Memory Pool
– Rule of thumb:
• Host OS: up to 1.5 GB
• SQL DBMS, SSRS, SSAS all have separate allocations
• SSIS uses what's left – important for large cached Lookups
15. Benchmarking/Testing
• Initial build, benchmark
• Run at least 3 tests, then average the results (e.g. in Excel)
• Change one thing at a time! Then retest, record results
• Testing – BIDS is not a true representation. Neither is your workstation.
• Can use Perfmon to monitor SSIS counters. Combine with SSIS log events (e.g. PipelineComponentTime) and DMVs – especially sys.dm_os_wait_stats
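The "at least 3 tests, then average" routine can also be scripted instead of done by hand. The harness below is a minimal sketch: `benchmark` and `sample_workload` are hypothetical names, and the workload is a placeholder. In practice `run` would launch the package on a representative server (for example through dtexec) rather than execute Python code.

```python
import statistics
import time


def benchmark(run, repeats=3):
    """Time `run` several times and report each run plus the average."""
    if repeats < 3:
        raise ValueError("use at least 3 runs before averaging")
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        timings.append(time.perf_counter() - start)
    return {
        "runs": timings,
        "mean": statistics.mean(timings),
        "stdev": statistics.stdev(timings),  # spread shows how noisy the runs are
    }


def sample_workload():
    # Placeholder for "execute the package" (assumption for this sketch).
    sum(i * i for i in range(100_000))


result = benchmark(sample_workload, repeats=3)
print(f"mean {result['mean']:.4f}s over {len(result['runs'])} runs")
```

Recording the result after each single change keeps the comparison against the initial benchmark meaningful.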
16. Summary
• Start from source and work through to destination
• Isolate from downstream or upstream components
• Benchmark before you start tuning
• Resist the urge to change many things at once
• Create a development/tuning checklist