1. Capture Change and Apply It With
Change Data Capture & SSIS
Steve Wake
Sr. Business Intelligence Consultant
President, Denver SQL Server User Group
MAKING BUSINESS INTELLIGENT
www.pragmaticworks.com
2. What is Change Data Capture (CDC)?
• CDC first introduced in SQL Server 2008
• First available only in the Database Engine
• Must be enabled for the database(s) and
table(s) you want to track changes on
• SQL Server 2012 Integration Services (SSIS)
added components for CDC
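The two enablement steps above (database first, then each table) can be sketched as a small helper that builds the T-SQL to run. This is a minimal sketch, not a definitive script: the helper name `enable_cdc_statements` and the database and table names `SalesDB` and `dbo.Orders` are hypothetical, while `sys.sp_cdc_enable_db` and `sys.sp_cdc_enable_table` are the stored procedures SQL Server provides for this.

```python
def quote_literal(value):
    """Quote a Python string as a T-SQL Unicode string literal."""
    return "N'" + value.replace("'", "''") + "'"


def enable_cdc_statements(database, schema, table, role_name=None):
    """Return the T-SQL batches that enable CDC for one database and one table.

    role_name=None emits @role_name = NULL, meaning no gating role is used
    to restrict access to the change data.
    """
    role = "NULL" if role_name is None else quote_literal(role_name)
    db_name = database.replace("]", "]]")  # escape for bracket quoting

    # Step 1: enable CDC at the database level.
    enable_db = f"USE [{db_name}];\nEXEC sys.sp_cdc_enable_db;"

    # Step 2: enable CDC for each table that should be tracked.
    enable_table = (
        "EXEC sys.sp_cdc_enable_table\n"
        f"    @source_schema = {quote_literal(schema)},\n"
        f"    @source_name = {quote_literal(table)},\n"
        f"    @role_name = {role};"
    )
    return [enable_db, enable_table]


# Hypothetical names, for illustration only.
for statement in enable_cdc_statements("SalesDB", "dbo", "Orders"):
    print(statement)
```

The statements are only printed here; running them requires a connection with sufficient permissions on an edition of SQL Server that supports CDC.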
3. What is Change Data Capture (CDC)?
Microsoft MSDN description:
Change data capture is designed to capture insert, update, and delete
activity applied to SQL Server tables, and to make the details of the
changes available in an easily consumed relational format. The change
tables used by change data capture contain columns that mirror the
column structure of a tracked source table, along with the metadata
needed to understand the changes that have occurred.
Change data capture is available only on the Enterprise, Developer, and
Evaluation editions of SQL Server.
Source: MSDN - http://msdn.microsoft.com/en-us/library/bb522489(v=sql.105).aspx
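To make "columns that mirror the source table, along with the metadata" concrete, the layout of a change table can be sketched as below. This is an illustration under assumptions: the helper `change_table_layout` and the `dbo.Orders` table with its columns are hypothetical, the capture instance is assumed to use the default `<schema>_<table>` naming, and the metadata columns listed are those of SQL Server 2008/2012 (later versions may add more).

```python
# Metadata columns that precede the mirrored source columns in a change table.
METADATA_COLUMNS = [
    "__$start_lsn",    # LSN of the commit for the change
    "__$end_lsn",      # reserved; always NULL in these versions
    "__$seqval",       # orders changes within a single transaction
    "__$operation",    # 1 = delete, 2 = insert, 3 = update (before), 4 = update (after)
    "__$update_mask",  # bit mask of the columns that changed
]


def change_table_layout(schema, table, source_columns):
    """Describe the change table created for a tracked source table."""
    capture_instance = f"{schema}_{table}"  # default capture instance name
    return {
        "name": f"cdc.{capture_instance}_CT",
        "columns": METADATA_COLUMNS + list(source_columns),
    }


# Hypothetical source table, for illustration only.
layout = change_table_layout("dbo", "Orders", ["OrderID", "Status", "Amount"])
print(layout["name"])
print(layout["columns"])
```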
4. Why use CDC?
• Loading a Data Warehouse
– Changes from Source Only
– Near Real-Time
• Maintaining an Audit DB
• Maintaining a Change Log
6. CDC Limitations
Source: MSDN - http://msdn.microsoft.com/en-us/library/cc645593(v=sql.105).aspx
Type of Column   | Changes Captured in Change Tables | Limitations
Sparse Columns   | Yes | Does not support capturing changes when using a columnset.
Computed Columns | No  | Changes to computed columns are not tracked. The column will appear in the change table with the appropriate type, but will have a value of NULL.
XML              | Yes | Changes to individual XML elements are not tracked.
Timestamp        | Yes | The data type in the change table is converted to binary.
BLOB data types  | Yes | The previous image of the BLOB column is stored only if the column itself is changed.
7. Performance Considerations
• Use different filegroups for CDC tables
• Set @supports_net_changes = 1 only when needed (it adds a
non-clustered index to the CDC change table, which costs extra writes)
• Use @captured_column_list with
sys.sp_cdc_enable_table to limit columns tracked
• For more information, see the SQL CAT whitepaper:
http://msdn.microsoft.com/en-us/library/dd266396(v=sql.100).aspx
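The three tuning options above map onto parameters of `sys.sp_cdc_enable_table`. The sketch below builds such a call; the helper name `enable_table_tuned`, the `dbo.Orders` table, its columns, and the `CDC_FG` filegroup are hypothetical. The filegroup is assumed to exist already, and net changes additionally require a primary key or a unique index named through `@index_name`, which this sketch does not cover.

```python
def quote_literal(value):
    """Quote a Python string as a T-SQL Unicode string literal."""
    return "N'" + value.replace("'", "''") + "'"


def enable_table_tuned(schema, table, captured_columns, filegroup, net_changes=False):
    """Build a sys.sp_cdc_enable_table call that applies the tuning options.

    captured_columns: only these columns are tracked (smaller change table).
    filegroup:        change table is placed on this filegroup, away from the source data.
    net_changes:      leave False unless net-change queries are needed, since
                      enabling it adds a non-clustered index to the change table.
    """
    column_list = ", ".join(
        "[" + column.replace("]", "]]") + "]" for column in captured_columns
    )
    lines = [
        "EXEC sys.sp_cdc_enable_table",
        f"    @source_schema = {quote_literal(schema)},",
        f"    @source_name = {quote_literal(table)},",
        "    @role_name = NULL,",
        f"    @supports_net_changes = {1 if net_changes else 0},",
        f"    @captured_column_list = {quote_literal(column_list)},",
        f"    @filegroup_name = {quote_literal(filegroup)};",
    ]
    return "\n".join(lines)


# Hypothetical table, columns, and filegroup, for illustration only.
print(enable_table_tuned("dbo", "Orders", ["OrderID", "Status"], "CDC_FG"))
```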
8. SSIS 2012
• CDC Control Task
– manages LSNs, handles errors and recovery
• CDC Source
– read CDC tables/metadata
• CDC Splitter
– splits based on __$operation
• Still requires the database/table to be set up for CDC
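The idea behind the CDC Splitter can be illustrated outside SSIS with a few lines that route rows by their `__$operation` value. This is a simplified sketch of the concept, not the component's implementation: the function name `split_changes` and the sample rows are hypothetical, both update images (3 and 4) are sent to the same bucket, and any other value is treated as an error row.

```python
from collections import defaultdict

# __$operation values as stored in SQL Server change tables.
OPERATION_BUCKETS = {
    1: "delete",
    2: "insert",
    3: "update",  # row image before the update
    4: "update",  # row image after the update
}


def split_changes(rows):
    """Group change rows into insert/update/delete/error buckets."""
    buckets = defaultdict(list)
    for row in rows:
        bucket = OPERATION_BUCKETS.get(row.get("__$operation"), "error")
        buckets[bucket].append(row)
    return dict(buckets)


# Hypothetical change rows, for illustration only.
changes = [
    {"__$operation": 2, "OrderID": 1, "Status": "new"},
    {"__$operation": 4, "OrderID": 1, "Status": "shipped"},
    {"__$operation": 1, "OrderID": 2, "Status": "cancelled"},
]
for bucket, bucket_rows in split_changes(changes).items():
    print(bucket, [r["OrderID"] for r in bucket_rows])
```

In a package, each bucket would feed its own destination or command, for example inserts into a staging table and deletes into a soft-delete update.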
9. Demo
• Set up CDC on database and table
• Managing CDC in SSMS
• Using SSIS 2012 Packages
10. Summary
• CDC added to SQL Server 2008
• Must be enabled manually, not on by default
• SSIS 2012 added support to more easily
handle CDC in SSIS packages
• CDC is a built-in way to track changes to SQL
databases/tables
11. Services
Speed development through training and
rapid development services from
Pragmatic Works.
Products
BI products to convert to a Microsoft BI
platform and simplify development on the
platform.
Foundation
Helping those who do not have the means
to get into information technology and to
achieve their dreams.
See you soon…
Sales : sales@pragmaticworks.com
My Email: swake@pragmaticworks.com
My Blog: http://blog.wakebi.com
My Twitter: @stevewake
Office Number: 904-638-3805