Change Systems
Critical Component Series
Doug McClurg
Founder
dmcclurg@sosra.com
Data systems engineered to last.
Change Systems
Agenda
The goal of this series is to give you the tools you need to push analytics forward at your company.
• Explain the nature and importance of change systems in an overall data platform
• Compare and contrast traditional and modern data warehouse architectures
• Discuss a key technology that is core to change systems in the enterprise
• Compare the SQL Server features that enable robust change data capture
Overview
The Source of Change
[Diagram: Database Engine with two files, MDF and LDF]
• A database engine manages files:
  • Data structures (the MDF data file)
  • Transaction logs (the LDF log file)
• Change systems accurately track modifications inside data structures.
• The source of record for change is the transaction log. Using this log directly is a characteristic of passive change systems.
• Active change systems watch the data structure and record observable change.
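An active change system can be sketched as a comparison of two snapshots of the same table. This is only an illustration: the function name `diff_snapshots` and the dict-of-rows layout are assumptions, and the sample rows are borrowed from the account tables later in the deck. A passive system would read the transaction log instead of comparing data.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def diff_snapshots(before: Dict[int, Row], after: Dict[int, Row]) -> List[Tuple[str, int]]:
    """Return the (operation, key) pairs observable between two snapshots."""
    changes = []
    for key in sorted(before.keys() | after.keys()):
        if key not in before:
            changes.append(("INSERT", key))
        elif key not in after:
            changes.append(("DELETE", key))
        elif before[key] != after[key]:
            changes.append(("UPDATE", key))
    return changes


# Hypothetical snapshots keyed by AccountID.
before = {
    4624572: {"CustomerID": 9875, "AccountBalance": "5001.01"},
    4745733: {"CustomerID": 8735, "AccountBalance": "478893.33"},
}
after = {
    4568456: {"CustomerID": 2342, "AccountBalance": "1234758.23"},
    4624572: {"CustomerID": 9875, "AccountBalance": "5768.01"},
}

changes = diff_snapshots(before, after)
print(changes)
```

A row that changes several times between two snapshots shows up here as a single update, so the intermediate states are lost. That is the trade-off of watching the data structure rather than reading the log.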
Overview
Modeling Change
Table

AccountID | CustomerID | AccountBalance | ModifyDate
4568456 | 2342 | 1234758.23 | 2017-03-11 04:11:05
4624572 | 9875 | 5768.01 | 2017-03-11 04:13:15
4745733 | 8735 | 478893.33 | 2017-03-11 04:13:01

Log

AccountID | CustomerID | Type | Amount | EventDate
4568456 | 2342 | Deposit | 1198575.32 | 2017-03-08 09:09:04
4624572 | 9875 | Deposit | 4438.70 | 2017-03-08 09:10:01
4745733 | 8735 | Deposit | 460436.02 | 2017-03-07 10:13:20
4568456 | 2342 | Deposit | 528.11 | 2017-03-08 06:13:45
4624572 | 9875 | Deposit | 1345.23 | 2017-03-09 10:22:25
4745733 | 8735 | Deposit | 635.20 | 2017-03-08 11:13:01
4568456 | 2342 | Withdrawal | 23.21 | 2017-03-09 12:12:02
4624572 | 9875 | Fee | 21.34 | 2017-03-09 06:13:45
4745733 | 8735 | Withdrawal | 42.66 | 2017-03-10 13:13:12
4568456 | 2342 | Transfer | 35678.01 | 2017-03-11 04:11:05
4624572 | 9875 | Deposit | 5.42 | 2017-03-11 04:13:15
4745733 | 8735 | Deposit | 17864.77 | 2017-03-11 04:13:01

Table = Log*
*Record the CRUD operations to the table and you get a changelog.

The duality is that a table holds data at rest while a log captures change. If you have the log, you can rebuild not only the original table but also a myriad of other derived tables. Logs therefore seem to be the more fundamental data structure.
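The table/log duality can be checked by replaying the log rows from this slide. The sketch below is illustrative only. It assumes deposits and the transfer credit the account while withdrawals and fees debit it, which the slide does not state, and the names `replay` and `deposit_counts` are made up for the example.

```python
from collections import Counter, defaultdict
from decimal import Decimal

# (AccountID, Type, Amount) rows copied from the Log table on the slide.
LOG = [
    (4568456, "Deposit", "1198575.32"),
    (4624572, "Deposit", "4438.70"),
    (4745733, "Deposit", "460436.02"),
    (4568456, "Deposit", "528.11"),
    (4624572, "Deposit", "1345.23"),
    (4745733, "Deposit", "635.20"),
    (4568456, "Withdrawal", "23.21"),
    (4624572, "Fee", "21.34"),
    (4745733, "Withdrawal", "42.66"),
    (4568456, "Transfer", "35678.01"),
    (4624572, "Deposit", "5.42"),
    (4745733, "Deposit", "17864.77"),
]

# Assumption: which event types add to the balance and which subtract.
SIGN = {"Deposit": 1, "Transfer": 1, "Withdrawal": -1, "Fee": -1}


def replay(log):
    """Fold the log into the current-state table (AccountID -> balance)."""
    balances = defaultdict(Decimal)
    for account_id, event_type, amount in log:
        balances[account_id] += SIGN[event_type] * Decimal(amount)
    return dict(balances)


def deposit_counts(log):
    """A second derived table built from the same log."""
    return Counter(acct for acct, event_type, _ in log if event_type == "Deposit")


balances = replay(LOG)
deposits = deposit_counts(LOG)
print(balances)
print(dict(deposits))
```

Under that sign assumption the replayed balances equal the AccountBalance column of the Table, and the deposit count is one example of a derived table that the current-state table alone could not produce.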
Overview
Modeling Change
Valid Time
John Doe, who lived in Flat Rock, NC, made his first visit to us on April 1st, 1985 and changed his permanent address during a sale on November 12th, 2005.

Name | Address | ValidFrom | ValidTo
John Doe | 81 Carl Sandberg Ln, Flat Rock, NC 28731 | 1985-04-01 10:00:00 | 2005-11-12 09:06:00
John Doe | 9433 Collingdale Way, Raleigh, NC 27617 | 2005-11-12 09:06:01 | 9999-12-31 23:59:59

Transaction Time
Our data warehouse went live on November 1st, 2005. The ETL runs daily at 4 AM.

Name | Address | CreateDate | ExpireDate
John Doe | 81 Carl Sandberg Ln, Flat Rock, NC 28731 | 2005-11-01 09:25:11 | 2005-11-13 04:54:11
John Doe | 9433 Collingdale Way, Raleigh, NC 27617 | 2005-11-13 04:54:12 | 9999-12-31 23:59:59
Overview
Modeling Change
Source

ID | Name | Address | ModifyDate
12345 | John Doe | 81 Carl Sandberg Ln, Flat Rock, NC 28731 | 1985-04-01 10:00:00
12345 | John Doe | 9433 Collingdale Way, Raleigh, NC 27617 | 2005-11-12 09:06:01

ETL (latency of 1 day at best)

Target: SCD 2 Dimension

Key | ID | Name | Address | ValidFrom | ValidTo | CreateDate | ExpireDate
1 | 12345 | John Doe | 81 Carl Sandberg Ln, Flat Rock, NC 28731 | 1985-04-01 10:00:00 | 2005-11-12 09:06:00 | 2005-11-01 09:25:11 | 2005-11-13 04:54:11
2 | 12345 | John Doe | 9433 Collingdale Way, Raleigh, NC 27617 | 2005-11-12 09:06:01 | 9999-12-31 23:59:59 | 2005-11-13 04:54:12 | 9999-12-31 23:59:59

[Callout on the diagram: This column creates risk]
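The valid-time half of the SCD 2 dimension can be derived from the source rows with a few lines of code. This is a minimal sketch, not the ETL used in the talk: the function name `build_scd2`, the tuple layout, and the rule that a version closes one second before its successor begins are all assumptions. It covers valid time only, since CreateDate and ExpireDate would come from the ETL's own load clock.

```python
from datetime import datetime, timedelta

# Open-ended marker used for the current version of a row.
OPEN_END = datetime(9999, 12, 31, 23, 59, 59)


def build_scd2(versions):
    """versions: (ID, Name, Address, ModifyDate) tuples from the source."""
    ordered = sorted(versions, key=lambda v: (v[0], v[3]))
    rows = []
    for i, (business_id, name, address, valid_from) in enumerate(ordered):
        nxt = ordered[i + 1] if i + 1 < len(ordered) else None
        if nxt is not None and nxt[0] == business_id:
            # Assumed rule: close this version one second before the next begins.
            valid_to = nxt[3] - timedelta(seconds=1)
        else:
            valid_to = OPEN_END
        rows.append({
            "Key": i + 1,  # surrogate key
            "ID": business_id,
            "Name": name,
            "Address": address,
            "ValidFrom": valid_from,
            "ValidTo": valid_to,
        })
    return rows


source = [
    (12345, "John Doe", "81 Carl Sandberg Ln, Flat Rock, NC 28731",
     datetime(1985, 4, 1, 10, 0, 0)),
    (12345, "John Doe", "9433 Collingdale Way, Raleigh, NC 27617",
     datetime(2005, 11, 12, 9, 6, 1)),
]

dimension = build_scd2(source)
for row in dimension:
    print(row["Key"], row["ValidFrom"], row["ValidTo"])
```

Note that the sketch trusts the source ModifyDate column completely: if the application fails to maintain it, the derived validity ranges are wrong.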
Traditional Architecture
The Pull Method
[Diagram labels: Application Database (SQL, DB2, SQL, …); Batch ETL Jobs; Enterprise Data Warehouse; Mart, Mart; Storage and Query]
Traditional Architecture
Focus on the Source
[Diagram labels: Application Database (SQL, DB2, SQL, …); Batch ETL Jobs; Enterprise Data Warehouse; Mart, Mart; Storage and Query; Focus Area One]
Friction & Frustration
Data Quality
• Timeliness
• Latency of change
• Latency of build
• Consistency
• Redundant ETL
• Accuracy
• Filters
• Logic
• Source
Lead Time
• Custom ETL
• Manual ETL
• Business case and ceremony
• Domain knowledge
Dependencies
• Business logic
• Redundancy
• Downstream effects
• Team
Collect and Route
Events Query | Model | Automate
Modern Architecture
The Push Method (Lambda)
Speed Layer
Batch Layer
Serving Layer
Real-time Views
Batch Views
Events
Query | Model | Automate
Stream
Modern Architecture
The Push Method (Kappa)
Unified Log Storage
Archive
Collect
Modern Architecture
The Fungibility of Data
LOG
• Ingest (don’t extract) disparate silos of data
• Store data in its atomic form (no transform)
• Collect changes as if they were events (immutable)
• Run downstream ETL more often (process less data each cycle)
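The last two lessons can be sketched together: changes are appended to an immutable log, and each downstream run processes only what arrived since the previous run. This is an illustration under assumptions. The class `EventLog`, the function `run_micro_batch`, and the use of a list offset as the watermark are hypothetical names and choices, not something taken from the talk.

```python
from typing import List, Tuple

Event = Tuple[int, int, str]  # (offset, AccountID, amount as text)


class EventLog:
    """Append-only log: events are added, never updated or deleted."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, account_id: int, amount: str) -> int:
        offset = len(self._events)
        self._events.append((offset, account_id, amount))
        return offset

    def read_from(self, offset: int) -> List[Event]:
        # Return a copy so callers cannot mutate the log.
        return list(self._events[offset:])


def run_micro_batch(log: EventLog, watermark: int):
    """Process only events at or after the watermark; return (batch, new watermark)."""
    batch = log.read_from(watermark)
    return batch, watermark + len(batch)


log = EventLog()
log.append(4568456, "528.11")
log.append(4624572, "1345.23")
log.append(4745733, "635.20")

first_batch, watermark = run_micro_batch(log, 0)

log.append(4624572, "5.42")
second_batch, watermark = run_micro_batch(log, watermark)

print(len(first_batch), len(second_batch), watermark)
```

Running the job more often shrinks each batch, which is the point of the fourth lesson: the second run touches one event rather than rereading all four.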
Modern Architecture
Lessons Learned
Mart
ETL
Application
Database
Enriched
Source
Mirror Layer
Mart
Storage and Query
Micro-Batch
ETL Jobs
Modern Architecture
Phase 1
Homogenize, Protect, and Standardize
= database transaction log
Mirror Layer
Analytical Model
Temporary Staging
Source
Why Have a Mirror Layer?
1. Improve the data structure of a source system (add primary keys, indexes)
2. Hide complexity related to the type of source system (SQL, API, Mainframe)
3. Improve the quality and performance of change tracking
4. Enable data governance programs by homogenizing sources
5. Enable prototyping of new automation solutions without developer support
Risks/Assumptions
This layer must be real-time and simple, close to the metal. The more it looks like another ETL layer, the more the risks will outweigh the benefits.
Transform
Near Real-time
Intensive
Transform
Mirror layer
Overview
But all I read is hate for replication on the internets!
Mirror layer
Replication in Production
[Diagram labels: Source Database Server; Sale Transaction; Customer Profile; T-LOG, T-LOG; Pub; Dist; Sub; Article, Article; Push; cmd]
• Set up everything in a lower environment and replay production activity to get an idea of the load.
• The source database is placed into an Always On availability group so that the database and replication can fail over.
• Distributor and subscriber are moved to their own failover cluster.
• Subscribers connect to an availability group listener so they can find the right server after a failover.
• Database and log backups are still taken regularly to support disaster recovery, but additional preparations are made to enable a smooth restore of replication.
Mirror Layer Demo
Features of SQL Server
Base Table

AccountID | CustomerID | AccountBalance | ModifyDate
4568456 | 2342 | 1234758.23 | 2017-03-11 04:11:05
4624572 | 9875 | 5768.01 | 2017-03-11 04:13:15

Change Table (Internal)

AccountID | Operation | Columns
4568456 | INSERT |
4624572 | UPDATE | AccountBalance
4745733 | DELETE |

History Table

AccountID | CustomerID | AccountBalance | ModifyDate | CreateDate | ExpireDate
4624572 | 9875 | 5001.01 | 2017-03-10 06:19:01 | 2017-03-10 06:20:35 | 2017-03-11 04:14:22
4745733 | 8735 | 478893.33 | 2017-03-11 04:13:01 | 2017-03-11 04:14:59 | 2017-03-12 09:01:12
Change Tracking
• Net changes only
• No data
• Internal tables
• Internal functions
• Retention period only
Temporal Tables
• Net changes not automatic
• Data
• Normal tables
• T-SQL language integration
• Full support for archiving
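The contrast between the two features can be modeled with a toy table that records changes both ways at once. This is a simplified sketch and not how SQL Server implements either feature: the class `TrackedTable`, the netting rule (an insert followed by updates still counts as an insert), and the insert timestamp for account 4568456 are assumptions made for illustration. The other values are taken from the tables on this slide.

```python
class TrackedTable:
    """Toy table that records changes two ways at once."""

    def __init__(self, initial):
        # Current rows: key -> (column values, CreateDate of this version).
        self.rows = dict(initial)
        # Change-tracking style: key -> net operation, no column data kept.
        self.net_changes = {}
        # Temporal style: every closed row version, with its data.
        self.history = []

    def _expire(self, key, at):
        values, created = self.rows.pop(key)
        self.history.append({"AccountID": key, **values,
                             "CreateDate": created, "ExpireDate": at})

    def insert(self, key, values, at):
        self.rows[key] = (values, at)
        self.net_changes[key] = "INSERT"

    def update(self, key, values, at):
        self._expire(key, at)
        self.rows[key] = (values, at)
        # Assumed netting rule: insert followed by updates still nets to INSERT.
        if self.net_changes.get(key) != "INSERT":
            self.net_changes[key] = "UPDATE"

    def delete(self, key, at):
        self._expire(key, at)
        self.net_changes[key] = "DELETE"


table = TrackedTable({
    4624572: ({"AccountBalance": "5001.01"}, "2017-03-10 06:20:35"),
    4745733: ({"AccountBalance": "478893.33"}, "2017-03-11 04:14:59"),
})
# The insert timestamp below is assumed; the slide does not show it.
table.insert(4568456, {"AccountBalance": "1234758.23"}, "2017-03-11 04:11:05")
table.update(4624572, {"AccountBalance": "5768.01"}, "2017-03-11 04:14:22")
table.delete(4745733, "2017-03-12 09:01:12")

print(table.net_changes)
for version in table.history:
    print(version)
```

The net-change map says which keys changed and how, but carries no values, so a consumer has to go back to the base table for the data. The history list keeps the old values with their CreateDate and ExpireDate, so past states can be reconstructed without the base table.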
https://github.com/dpmcclurg/ChangeSystemsDemo
Download the Code!

Implementing Change Systems in SQL Server 2016

  • 3. The goal of this series is to give you the tools you need to push analytics forward at your company. • The nature and importance of change systems in an overall data platform • Compare and contrast traditional and modern data warehouse architectures • Discuss a key technology that is core to change systems in the enterprise • Compare the SQL Server features that enable robust change data capture Change Systems Agenda
  • 4. Database Engine MDF LDF Overview The Source of Change • A database engine manages files. • Data structures • Transaction logs • Change systems accurately track modifications inside data structures. • The source of record for change is the transaction log. Using this log directly is a characteristic of passive change systems. • Active change systems watch the data structure and record observable change.
  • 5. Overview Modeling Change
      Table
        AccountID | CustomerID | AccountBalance | ModifyDate
        4568456 | 2342 | 1234758.23 | 2017-03-11 04:11:05
        4624572 | 9875 | 5768.01 | 2017-03-11 04:13:15
        4745733 | 8735 | 478893.33 | 2017-03-11 04:13:01
      Log
        AccountID | CustomerID | Type | Amount | EventDate
        4568456 | 2342 | Deposit | 1198575.32 | 2017-03-08 09:09:04
        4624572 | 9875 | Deposit | 4438.70 | 2017-03-08 09:10:01
        4745733 | 8735 | Deposit | 460436.02 | 2017-03-07 10:13:20
        4568456 | 2342 | Deposit | 528.11 | 2017-03-08 06:13:45
        4624572 | 9875 | Deposit | 1345.23 | 2017-03-09 10:22:25
        4745733 | 8735 | Deposit | 635.20 | 2017-03-08 11:13:01
        4568456 | 2342 | Withdrawal | 23.21 | 2017-03-09 12:12:02
        4624572 | 9875 | Fee | 21.34 | 2017-03-09 06:13:45
        4745733 | 8735 | Withdrawal | 42.66 | 2017-03-10 13:13:12
        4568456 | 2342 | Transfer | 35678.01 | 2017-03-11 04:11:05
        4624572 | 9875 | Deposit | 5.42 | 2017-03-11 04:13:15
        4745733 | 8735 | Deposit | 17864.77 | 2017-03-11 04:13:01
      Table = Log*
      *Record the CRUD operations to the table and you get a changelog. The duality is that a table supports data at rest and logs capture change. If you have a log you can not only create the original table but a myriad of other derived tables. Logs therefore seem to be a more fundamental data structure.
  • 6. Overview Modeling Change
      Valid Time: John Doe, who lived in Flat Rock, NC, made his first visit to us on April 1st, 1985 and changed his permanent address during a sale on November 12th, 2005.
        Name | Address | ValidFrom | ValidTo
        John Doe | 81 Carl Sandberg Ln, Flat Rock, NC 28731 | 1985-04-01 10:00:00 | 2005-11-12 09:05:00
        John Doe | 9433 Collingdale Way, Raleigh, NC 27617 | 2005-11-12 09:06:01 | 9999-12-31 23:59:59
      Transaction Time: Our data warehouse went live on November 1st, 2005. The ETL runs daily at 4 AM.
        Name | Address | CreateDate | ExpireDate
        John Doe | 81 Carl Sandberg Ln, Flat Rock, NC 28731 | 2005-11-01 09:25:11 | 2005-11-13 04:54:11
        John Doe | 9433 Collingdale Way, Raleigh, NC 27617 | 2005-11-13 04:54:12 | 9999-12-31 23:59:59
  • 7. Overview Modeling Change
      Source
        ID | Name | Address | ModifyDate
        12345 | John Doe | 81 Carl Sandberg Ln, Flat Rock, NC 28731 | 1985-04-01 10:00:00
        12345 | John Doe | 9433 Collingdale Way, Raleigh, NC 27617 | 2005-11-12 09:06:01
      ETL
      Target: SCD 2 Dimension
        Key | ID | Name | Address | ValidFrom | ValidTo | CreateDate | ExpireDate
        1 | 12345 | John Doe | 81 Carl Sandberg Ln, Flat Rock, NC 28731 | 1985-04-01 10:00:00 | 2005-11-12 09:06:00 | 2005-11-01 09:25:11 | 2005-11-13 04:54:11
        2 | 12345 | John Doe | 9433 Collingdale Way, Raleigh, NC 27617 | 2005-11-12 09:06:01 | 9999-12-31 23:59:59 | 2005-11-13 04:54:12 | 9999-12-31 23:59:59
      Slide callouts: "This column creates risk"; "Latency of 1 Day at best"
  • 8. Application Database SQL DB2 SQL … Enterprise Data Warehouse Mart Mart Batch ETL Jobs Storage and Query Traditional Architecture The Pull Method
  • 9. Application Database SQL DB2 SQL … Enterprise Data Warehouse Mart Mart Batch ETL Jobs Storage and Query Traditional Architecture Focus on the Source Focus Area One Friction & Frustration Data Quality • Timeliness • Latency of change • Latency of build • Consistency • Redundant ETL • Accuracy • Filters • Logic • Source Lead Time • Custom ETL • Manual ETL • Business case and ceremony • Domain knowledge Dependencies • Business logic • Redundancy • Downstream effects • Team
  • 10. Collect and Route Events Query | Model | Automate Modern Architecture The Push Method (Lambda) Speed Layer Batch Layer Serving Layer Real-time Views Batch Views
  • 11. Events Query | Model | Automate Stream Modern Architecture The Push Method (Kappa) Unified Log Storage Archive Collect
  • 13. • Ingest (don’t extract) disparate silos of data • Store data in its atomic form (no transform) • Collect changes as if they were events (immutable) • Run downstream ETL more often (process less data each cycle) Modern Architecture Lessons Learned
  • 14. Mart ETL Application Database Enriched Source Mirror Layer Mart Storage and Query Micro-Batch ETL Jobs Modern Architecture Phase 1 Homogenize, Protect, and Standardize = database transaction log
  • 15. Mirror Layer Analytical Model Temporary Staging Source Why Have a Mirror Layer? 1. Improve the data structure of a source system (add primary keys, indexes) 2. Hide complexity related to the type of source system (SQL, API, Mainframe) 3. Improve the quality and performance of change tracking 4. Enable data governance programs by homogenizing sources 5. Enable prototyping of new automation solutions without developer support Risks/Assumptions This layer must be real-time and simple, close to the metal. The more it looks like another ETL layer, the more the risks will outweigh the benefits. Transform Near Real-time Intensive Transform Mirror layer Overview
  • 16. But all I read is hate for replication on the internets!
  • 17. Mirror layer Replication in Production Sale Transaction Customer Profile Source Database Server T-LOGT-LOG Pub Sub ArticleArticle Push Dist cmd • Set up everything in a lower environment and replay production activity to get an idea of load. • The source database is placed into an Always On availability group so that the database and replication can failover. • Distributor and subscriber are moved to their own failover cluster. • Subscribers connect to an availability group listener so they can find the right server after a failover. • Database and log backups are still taken regularly to support disaster recovery, but additional preparations are made to enable a smooth restore of replication.
  • 18. Mirror Layer Demo Features of SQL Server
      Base Table
        AccountID | CustomerID | AccountBalance | ModifyDate
        4568456 | 2342 | 1234758.23 | 2017-03-11 04:11:05
        4624572 | 9875 | 5768.01 | 2017-03-11 04:13:15
      Change Table (Internal)
        AccountID | Operation | Columns
        4568456 | INSERT |
        4624572 | UPDATE | AccountBalance
        4745733 | DELETE |
      History Table
        AccountID | CustomerID | AccountBalance | ModifyDate | CreateDate | ExpireDate
        4624572 | 9875 | 5001.01 | 2017-03-10 06:19:01 | 2017-03-10 06:20:35 | 2017-03-11 04:14:22
        4745733 | 8735 | 478893.33 | 2017-03-11 04:13:01 | 2017-03-11 04:14:59 | 2017-03-12 09:01:12
      Change Tracking • Net changes only • No data • Internal tables • Internal functions • Retention period only
      Temporal Tables • Net changes not automatic • Data • Normal tables • T-SQL language integration • Full support for archiving
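
The table/log duality from slide 5 can be sketched in a few lines of Python. This is an illustration only, not part of the original deck: it replays the slide's log rows to rebuild the account table. The sign convention is an assumption (deposits and transfers are treated as credits, withdrawals and fees as debits), chosen because it reproduces the balances shown on the slide. The names `LOG`, `SIGN`, and `replay` are hypothetical.

```python
from decimal import Decimal

# Log rows from slide 5: (AccountID, CustomerID, Type, Amount, EventDate)
LOG = [
    (4568456, 2342, "Deposit", "1198575.32", "2017-03-08 09:09:04"),
    (4624572, 9875, "Deposit", "4438.70", "2017-03-08 09:10:01"),
    (4745733, 8735, "Deposit", "460436.02", "2017-03-07 10:13:20"),
    (4568456, 2342, "Deposit", "528.11", "2017-03-08 06:13:45"),
    (4624572, 9875, "Deposit", "1345.23", "2017-03-09 10:22:25"),
    (4745733, 8735, "Deposit", "635.20", "2017-03-08 11:13:01"),
    (4568456, 2342, "Withdrawal", "23.21", "2017-03-09 12:12:02"),
    (4624572, 9875, "Fee", "21.34", "2017-03-09 06:13:45"),
    (4745733, 8735, "Withdrawal", "42.66", "2017-03-10 13:13:12"),
    (4568456, 2342, "Transfer", "35678.01", "2017-03-11 04:11:05"),
    (4624572, 9875, "Deposit", "5.42", "2017-03-11 04:13:15"),
    (4745733, 8735, "Deposit", "17864.77", "2017-03-11 04:13:01"),
]

# Assumed sign convention (not stated on the slide).
SIGN = {"Deposit": 1, "Transfer": 1, "Withdrawal": -1, "Fee": -1}


def replay(log):
    """Fold a log of events into the current-state table, keyed by AccountID."""
    table = {}
    for account_id, customer_id, kind, amount, event_date in log:
        row = table.setdefault(
            account_id,
            {
                "CustomerID": customer_id,
                "AccountBalance": Decimal("0"),
                "ModifyDate": event_date,
            },
        )
        row["AccountBalance"] += SIGN[kind] * Decimal(amount)
        # ISO-style timestamps sort correctly as strings; keep the latest one,
        # since the slide's log rows are not in strict chronological order.
        row["ModifyDate"] = max(row["ModifyDate"], event_date)
    return table


table = replay(LOG)
for account_id, row in sorted(table.items()):
    print(account_id, row["CustomerID"], row["AccountBalance"], row["ModifyDate"])
```

Replaying a filtered or regrouped version of the same log (for example, totals per event type) would yield a different derived table, which is the sense in which the log is the more fundamental structure.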

Editor's Notes

  1. #1 challenge: active = observation is difficult, passive = logs are esoteric and sheltered. Which is more valuable? In general, logs are more valuable; however, they can be expensive to implement. It all depends on controls, trust, and latency. If your source database has the right audit fields, they’re trustworthy, and batch ETL is OK, then active change systems are the easy and cheap choice.
  2. These two things are equivalent, and they had better be, because this is the foundation of the reliability, availability, and resilience of SQL Server and all other RDBMS systems. It’s how we mirror databases, log ship, and create high availability in data center infrastructure. A log is essentially a backup of all possible states of the table and any other possible derived table.
  3. Which is more valuable? As you approach real-time integration these two concepts are effectively interchangeable at the margin.
  4. John spent $30,000 over 600 transactions with us while he lived in Flat Rock ($50/tr). He has spent $250 over 2 transactions since he “moved” to Raleigh ($125/tr). Raleigh is where our brick and mortar store resides. What if you have to side-load new data from, say, an acquisition, merger, or archive situation? What if you have to truncate and reload this table? What about failure? You want valid time more than transaction time. Most data warehouses keep only transaction time. If ModifyDate does not exist or you don’t trust it, then you’ve got a rough road ahead.
  5. LOTS OF CEREMONY. This is a synchronous world. Applications have their own databases. We reach in and extract large amounts of data, bring it down to disk, and search for changes. We transform the data and load complex schemas with information. How do you scale this system? You can’t do it horizontally. You can only scale up: bigger SQL servers, SSD SANs. Schemas must be designed and built before the Business can discover and analyze. Arbitrary questions are difficult to ask of the system and typically involve data points not yet modeled. In almost every experience, I have seen the Business’ need for information outpace IT’s capacity to build. Of course, it’s never this simple….
  6. This general architecture is called a lambda architecture. The traditional Extract portion of ETL is no longer relevant. We now “ingest” data in this architecture. Applications (and even devices) are “emitting” their events. VALUE: robustness, fault-tolerance, low latency reads/writes, scalability, generalization, extensibility, minimal maintenance, ad hoc queries, debuggability. I could fill this slide with the companies that implement this architecture including Microsoft, Walmart, Yahoo, LinkedIn, and Netflix. Nathan Marz – creator of Storm
  7. Jay Kreps – originally a software engineer at LinkedIn who was a key contributor to Kafka. LinkedIn is an interesting case because during Kreps’ tenure at the company they went from a VLB Oracle DW to Teradata to Hadoop, and along that path they discovered the value of immutable streams of data in stark contrast to batch analytics. Clients have to keep track of their place in the log. The unified log made data fungible.
  8. Oil is fungible because it has equivalent value regardless of source. It is commoditized. Data, like oil, derives its value from an even quality and its ability to flow freely between endpoints as a homogenous commodity. By simplifying the data to its most fundamental structure (a log), we can unify organizations around a data platform, and (click) we can unify our analytics process because we have standardized our data flow.
  9. Most organizations already have a data warehouse. Frankly, the architectures previously presented make a lot of assumptions about the organization like the reliance on software engineers, the way the organization approaches data integration and information management. So how do we get started? Well, luckily SQL Server has been around for a very long time, and we can bootstrap a progressive architecture by understanding and applying the good parts of what we’ve learned from others in the field.
  10. Cloud-born data should remain in the cloud.
  11. Here we move from a pull to a push architecture. We are closer to applications emitting their own events. This is not another ETL layer; we are ingesting database transactions as they appear in real time. This satisfies the principles of a mirror layer. Indeed, if you cannot satisfy these principles, it is best to move back to the traditional architecture. With this architecture, we can support micro-batch and batch processes with a robust, fault-tolerant tool that is close to the metal and simple. Downstream development becomes simpler and more confident because the focus is more on steering the analytical model and less on tracking source system data changes. Data quality and governance metrics become trustworthy because the mirror layer is sentient.
  12. Fundamentally, a mirror layer reflects source systems exactly, table by table. It makes every source system look like a SQL Server database: DB2, flat files, APIs, etc. We are hiding the complexity of a heterogeneous environment. It becomes the source for analytical data models. This is where you improve a system that you cannot control: you add primary keys, indexing, etc. Remember, a source system database is designed for transactional speed. We would rather it be designed for query speed; you can do this in the mirror layer. This is also where you track changes. Especially when a source system tracks changes poorly, a mirror helps you iron out transactional history in a way that is robust and fault tolerant. It should be able to heal when a source system changes its schema during a release, for example. By homogenizing sources, we are supporting data governance from a lineage perspective. Moreover, we are allowing governance personnel to access system data that they would not otherwise have access to in the production operation environment. Metrics can be compared more easily and new metrics can be created in less time. Maybe most importantly, we are providing a layer for the Business to prototype the next valuable solution. The Business should not report from or run ad hoc queries on source system databases. They can run them on this layer and help us build high-quality solutions faster through prototyping.
  13. This is all possible with SQL Server Standard edition. Always Encrypted makes this impossible.
  14. Change tracking is essentially just metadata. But it sure is powerful and can transform your ETL process if all you need is to know what has changed. Temporal tables are a fully supported time machine. Unfortunately, getting net changes is not automatic. This is transaction time, but if replication is set up as continuous then we are very close to valid time at the margin. You will still need to seed valid time as part of an initial load.
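
Note 14 says that net changes are not automatic when you keep full history. The Python sketch below shows one way to collapse an ordered change history into a single net operation per key. It illustrates the idea only and is not SQL Server's internal algorithm; the function name `net_changes` and the sample history (including the intermediate events and account 9999999) are hypothetical.

```python
def net_changes(history):
    """Collapse an ordered list of (key, operation) pairs into one net
    operation per key, based only on the first and last operation seen."""
    first_last = {}
    for key, op in history:
        if key not in first_last:
            first_last[key] = [op, op]
        else:
            first_last[key][1] = op

    net = {}
    for key, (first, last) in first_last.items():
        existed_before = first != "INSERT"  # row predates the window
        exists_after = last != "DELETE"     # row survives the window
        if existed_before and exists_after:
            net[key] = "UPDATE"
        elif existed_before:
            net[key] = "DELETE"
        elif exists_after:
            net[key] = "INSERT"
        # Inserted and deleted inside the window: no net change to report.
    return net


# Hypothetical history in commit order.
history = [
    (4568456, "INSERT"),
    (4624572, "UPDATE"),
    (4745733, "UPDATE"),
    (9999999, "INSERT"),
    (4624572, "UPDATE"),
    (9999999, "DELETE"),
    (4745733, "DELETE"),
]

changes = net_changes(history)
for key in sorted(changes):
    print(key, changes[key])
```

A downstream ETL job that only needs to know which rows to reload can work from this net view, while the full history remains available for point-in-time questions.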