Performance Tuning using Upsert and SCD
Written By: Chris Price
cprice@pragmaticworks.com
Contents
Upserts
  Upserts with SSIS
  Upsert with MERGE
  Upsert with Task Factory Upsert Destination
  Upsert Performance Testing
  Summary
Slowly Changing Dimensions
  Slowly Changing Dimension (SCD) Transform
  Custom SCD with SSIS
  SCD with MERGE
  SCD with Task Factory Dimension Merge
  SCD Performance Testing
  Summary
Wrap-Up
Pragmatic Works White Paper
Performance Tuning using Upsert and SCD
www.pragmaticworks.com PAGE 3
Upserts
Upsert is a portmanteau that blends the distinct actions of an
Update and an Insert and describes how both occur in the context
of a single execution. Logically speaking, the Upsert process is
extremely straightforward. Source rows are compared to a
destination; if a match is found based on some specified criteria
the row is updated, otherwise the row is considered new and
an insert occurs. The process can become more complex
if you decide to do conditional updates rather than blind
updates, but that is basically it.
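The match-then-update-or-insert logic described above can be sketched in a few lines of Python. This is only an illustration of the concept, not how SSIS or SQL Server implement it; the `upsert` function, the dictionary used as the destination, and the sample rows are all hypothetical.

```python
def upsert(destination, source_rows, key):
    """Blind upsert: update rows whose key already exists, insert the rest."""
    inserted = updated = 0
    for row in source_rows:
        k = row[key]
        if k in destination:
            destination[k] = dict(row)   # match found: overwrite (blind update)
            updated += 1
        else:
            destination[k] = dict(row)   # no match: the row is considered new
            inserted += 1
    return inserted, updated


# Hypothetical destination keyed by "id", plus two incoming source rows.
destination = {1: {"id": 1, "name": "Ann"}}
source = [{"id": 1, "name": "Anne"}, {"id": 2, "name": "Bob"}]
counts = upsert(destination, source, key="id")
```

A conditional update would add a comparison inside the matched branch so that unchanged rows are skipped.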
To implement an Upsert, you have three primary options in the
SQL Server environment. The first and most obvious is using SSIS
and its data flow components to orchestrate the Upsert process,
the second is using the T-SQL MERGE statement, and the third is
the Pragmatic Works Task Factory Upsert Destination component.
Upserts with SSIS
Implementing an Upsert using purely SSIS is a trivial task that
consists of a minimum of four data flow components. Data
originating from any source are piped through a Lookup
transformation and the output is split into two, one for rows
matched in lookup and one for rows that were not matched. The
no match output contains new rows that must be inserted using
one of the supported destinations in SSIS. The matched rows
are those that need to be updated and an OLE DB Command
transformation is used to issue an update for each row.
As a SQL Server BI Pro developing SSIS packages, you often encounter situations and scenarios that have a
number of different solutions. Choosing the right solution often means balancing tangible performance
requirements with more intangible requirements like making your packages more maintainable. This white
paper will focus on the options for handling two of these scenarios: Upserts and Slowly Changing Dimensions.
We will review multiple implementation options for each situation, discuss how each is accomplished, review
performance implications and the trade-offs for each in terms of complexity, manageability and opportunities
for configuration of auditing, and look at logging and error handling.
[Figure: Standard SSIS Upsert]
As this solution is currently designed, every row from the source
will be either inserted or updated. This may or may not be the
desired behavior based on your business requirements. Most of the
time, you will find that you can improve performance by screening
out rows that have not changed, which eliminates unnecessary updates.
To accomplish this you can use an expression in a Conditional
Split, the T-SQL CHECKSUM function (if both your source and
destination are SQL Server), or a Script transformation that generates
a hash for each row.
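As a rough illustration of the hash-per-row idea, the sketch below computes a digest over the tracked columns and compares the source and destination digests. It is written in Python rather than as an SSIS Script transformation, and the function names, the choice of SHA-256, and the sample rows are assumptions made for the example.

```python
import hashlib


def row_hash(row, columns):
    """Hash the tracked columns of a row.

    A unit-separator character keeps ("ab", "c") distinct from ("a", "bc").
    Note that None and the empty string hash to the same value here.
    """
    payload = "\x1f".join("" if row[c] is None else str(row[c]) for c in columns)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def has_changed(source_row, dest_row, columns):
    """True when any tracked column differs between the two rows."""
    return row_hash(source_row, columns) != row_hash(dest_row, columns)


# Hypothetical rows: only "name" and "city" are tracked for changes.
tracked = ["name", "city"]
same = has_changed({"name": "Ann", "city": "Tampa"}, {"name": "Ann", "city": "Tampa"}, tracked)
diff = has_changed({"name": "Ann", "city": "Miami"}, {"name": "Ann", "city": "Tampa"}, tracked)
```

Storing the destination hash alongside each row means only the source side has to be hashed on each load.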
While this is as simple as an Upsert gets in terms of implementation
and maintenance, there are several obvious performance
drawbacks to this approach as the volume of data grows. The
first is the Lookup transformation. The throughput, in
rows per second, that you get through the Lookup transformation
is directly correlated to the cache mode you configure on it.
Full Cache is the optimal setting, but depending on the
size of your destination dataset, the time and amount of memory
required to load the cache may exceed what is available. Partial
Cache and No Cache modes, on the other hand, are performance killers,
and there are limited scenarios in which you should use either option.
The second drawback, and the one most commonly encountered
in terms of performance issues, is the OLE DB Command used
to handle updates. The update command works row-by-row,
meaning that if you have 10,000 rows to update, 10,000 updates
will be issued sequentially. This form of processing is the opposite
of the batch processing you may be familiar with and has been
termed RBAR, or row-by-agonizing-row, because of the severe
effect it has on performance.
Despite these drawbacks, this solution excels when the set of
data contains no more than 20,000 rows. If you find that your
dataset is larger, there are two workarounds to mitigate the
drawbacks, both of which come at the expense of maintainability
and ease-of-use.
When the Lookup transformation is the bottleneck, you can
replace it with a Merge Join pattern. The Merge Join pattern
facilitates reading both the source and destination in a single
pass, which allows for handling large sets of data more efficiently.
To use this pattern, you need an extra source to read in your
destination data. Keep in mind that the Merge Join transformation
requires two sorted inputs. Allowing the source to handle the
sorting is the most efficient but requires that you configure
each source as sorted.
If your source does not support sorting, such as a text file, you
must use a Sort transformation. The Sort transformation is a fully
blocking transformation, meaning that it must read all rows before
it can output anything, which further degrades package performance.
The Merge Join transformation must be configured to use a left join
so that both source rows that match the destination and those
that do not are passed down the data flow. A Conditional Split
is then used to determine whether an Insert or Update is needed
for each row.
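To make the single-pass idea concrete, here is a minimal Python sketch of a left merge join over two inputs that are already sorted on the key, followed by the insert/update split. It is a conceptual stand-in for the Merge Join and Conditional Split components, not their implementation; the function name and sample data are hypothetical, and the destination keys are assumed to be unique.

```python
def merge_join_split(source, destination, key):
    """Left-join two key-sorted lists in one pass; return (inserts, updates)."""
    inserts, updates = [], []
    j = 0
    for row in source:
        # Advance the destination cursor past keys smaller than the source key.
        while j < len(destination) and destination[j][key] < row[key]:
            j += 1
        if j < len(destination) and destination[j][key] == row[key]:
            updates.append(row)   # matched: candidate for an update
        else:
            inserts.append(row)   # unmatched: new row to insert
    return inserts, updates


# Both inputs must already be sorted on "id", as the Merge Join requires.
source = [{"id": 1}, {"id": 3}, {"id": 4}]
destination = [{"id": 1}, {"id": 2}, {"id": 4}]
inserts, updates = merge_join_split(source, destination, key="id")
```

Because each input is read once from start to finish, memory use does not depend on the size of the destination, which is the advantage over a fully cached lookup.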
To overcome the row-by-row operation of the OLE DB Command,
a staging table is needed so that a single set-based Update can
be issued. After you have created the staging table, replace the OLE DB
Command with an OLE DB Destination and map the row columns
to the columns in the staging table. In the control flow, two
Execute SQL Tasks are needed. The first precedes the Data Flow
and simply truncates the staging table so that it is empty. The
second Execute SQL Task follows the Data Flow and is responsible
for issuing the set-based Update.
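The three steps above (empty the staging table, bulk-load it, issue one set-based update) can be sketched with Python's built-in sqlite3 module standing in for SQL Server. The table names, columns and sample rows are hypothetical, and the SQL is SQLite syntax rather than T-SQL, so treat this as an outline of the pattern only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer_stage (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
""")


def run_batch(conn, changed_rows):
    # Step 1 (first Execute SQL Task): empty the staging table.
    conn.execute("DELETE FROM customer_stage")
    # Step 2 (data flow): bulk-load the matched rows into staging.
    conn.executemany("INSERT INTO customer_stage VALUES (?, ?)", changed_rows)
    # Step 3 (second Execute SQL Task): one set-based UPDATE for all staged rows.
    cur = conn.execute("""
        UPDATE customer
        SET name = (SELECT s.name FROM customer_stage AS s WHERE s.id = customer.id)
        WHERE id IN (SELECT id FROM customer_stage)
    """)
    conn.commit()
    return cur.rowcount


updated = run_batch(conn, [(1, "Anne"), (3, "Cyrus")])
names = [r[0] for r in conn.execute("SELECT name FROM customer ORDER BY id")]
```

However many rows are staged, the destination sees a single update statement instead of one statement per row.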
When you combine both of these workarounds, the package
handles large sets of data with ease and even rivals
the performance of the MERGE statement when working with
sets of data that exceed 2 million rows. The trade-off, however,
is obvious: supporting and maintaining the package is now an
order of magnitude more difficult because of the additional
moving pieces and data structures required.
Upsert with MERGE
Unlike the prior solution that uses SSIS to execute multiple DML
statements to perform an Upsert operation, the MERGE feature
in SQL Server provides a high performance and efficient way to
perform the Upsert by calling both the Insert and Update in a
single statement.
To implement this solution you must stage all of your source data
in a table on the destination database. In the same manner as
the prior solution, an SSIS package can be used to orchestrate
truncating the staging table, moving the data from the source
to the staging table and then executing the MERGE command.
The difference lies in the T-SQL MERGE command. While
a detailed explanation of the MERGE statement is beyond the
scope of this white paper, MERGE combines both inserts and
updates into a single pass of the data, using defined criteria to
determine when records match and what operations to perform
when a match is or is not found.
[Figure: T-SQL MERGE Statement]
The drawback to this method is the complexity of the statement,
as the accompanying figure illustrates. Beyond the complexity of
the syntax, control is also sacrificed, as the MERGE statement is
essentially a black box. When you use the MERGE command you
have no row-level control or error handling ability; if a single record
fails on either insert or update, the entire transaction is rolled back.
It's clear that what the solution provides in terms of performance
and efficiency comes at the cost of complexity and loss of control.
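The all-or-nothing behavior of a single set-based statement can be seen in miniature with Python's sqlite3 module. This is an analogy only: SQLite has no MERGE statement, so an INSERT ... SELECT is used in its place, and the tables and rows are hypothetical. One bad row out of three causes the whole statement to fail and none of the rows to load.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dest (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE stage (id INTEGER, name TEXT);
    INSERT INTO stage VALUES (1, 'Ann'), (2, NULL), (3, 'Cy');
""")

try:
    # One set-based statement: the NULL name in row 2 violates NOT NULL,
    # so the changes made by the entire statement are backed out.
    conn.execute("INSERT INTO dest SELECT id, name FROM stage ORDER BY id")
    failed = False
except sqlite3.IntegrityError:
    failed = True

loaded = conn.execute("SELECT COUNT(*) FROM dest").fetchone()[0]
```

A row-by-row data flow could instead redirect the failing row to an error output and keep the other two, which is the kind of control the set-based approach gives up.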
A final note on MERGE is also required. If you find yourself
working on any version of SQL Server prior to 2008, this solution
is not applicable, as the MERGE statement was first introduced in
SQL Server 2008.
Upsert with Task Factory Upsert Destination
[Figure: Task Factory Upsert Destination UI]
The Upsert Destination is a component included in the Pragmatic
Works Task Factory library of components and is a balanced
alternative when implementing an Upsert operation. Without
sacrificing performance, much of the complexity is abstracted
away from the developer and is boiled down to configuring
settings across three tabs.
To implement the Upsert Destination, drag-and-drop the Upsert
Destination component to your data flow design surface. The
component requires an ADO.NET connection, so you will need to
create one if one does not already exist. From there, you simply
configure the destination table, map your source columns to
destination columns (making sure to identify the key column),
choose your update method, and you are ready to go.
Upsert Destination supports four update methods out of the box.
The first and fastest is the Bulk Update. This method is similar to
the one discussed previously, in that all rows that exist
in the destination are updated. You can also fine-tune the update
by choosing to do updates based on timestamps, a last-updated
column or even a configurable column comparison. Beyond
the update method, you can easily configure the component to
update a Last Modified column, enable identity inserts, provide
insert and update row counts, and take control over
the transactional container.
While none of these features are unique to the Task Factory
Upsert Destination, the ease with which you can be up and
running is a huge saving in terms of a developer's time and effort.
When you consider that there are no staging tables required, no special
requirements of the source data, no workarounds needed, and
that the component works with SQL Server 2005 and up, it is a solid
option to consider.
Upsert Performance Testing
To assess each of the methods discussed, a simple test was performed. Each test used the bulk update method, in which all rows are either
inserted or updated. The testing methodology required that each test be run three times; the average execution time of the
three executions was then used to calculate the throughput in rows per second. The results were then paired with rankings for each
method according to complexity, manageability and configurability.
Prior to each test being run, the SQL Server cache and buffers were cleared using DBCC FREEPROCCACHE and DBCC DROPCLEANBUFFERS.
All tests were run on an IBM x220 laptop with an i7 2640M processor and 16GB of RAM. A default install of SQL Server 2012, with the
maximum server memory set to 2GB, was used for all database operations.
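The throughput figure described above is simply the row count divided by the mean of the three run times. A small Python sketch of that arithmetic follows; the function name and the timings are made up for illustration and are not measurements from this paper.

```python
def throughput(rows, run_seconds):
    """Rows per second, based on the mean duration of several timed runs."""
    mean_seconds = sum(run_seconds) / len(run_seconds)
    return rows / mean_seconds


# Hypothetical timings for three runs of a 10,000-row test case.
rate = throughput(10_000, [2.0, 2.5, 3.0])
```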
Test Cases
Test Case Size    Rows Inserted    Rows Updated
10,000            6,500            3,500
100,000           65,000           35,000
500,000           325,000          175,000
1,000,000         650,000          350,000
Performance Results
Overall Results

Results in Rows per Second
Test Case Size    Merge          Upsert Destination    SSIS (Batch)    SSIS
10,000            6917.223887    5169.73979            6609.385327     4144.791379
100,000           28873.91723    19040.36558           28533.38406     1448.862402
500,000           37736.79841    24491.79525           36840.55408     1525.442861
1,000,000         36777.32555    24865.93119           33549.91668     1596.765592

Rankings (1 = most favorable)
Method                Performance    Complexity    Manageability    Configurability
Merge                 1              4             4                4
Upsert Destination    3              1             2                3
SSIS (Batch)          2              3             3                2
SSIS                  4              2             1                1
Summary
As expected, from a pure performance perspective the Upsert with MERGE outperformed all other methods of implementing an Upsert
operation. It also easily topped all others in terms of complexity while being the least manageable and least configurable. The SSIS (Batch)
method also performed well, as it is able to take advantage of bulk inserts into a staging table followed by a set-based update. While
not as complex as the MERGE method, it does require both sorted sources and staging tables, which ultimately bumps its manageability down.
The Upsert Destination performed well and was the only method whose performance did not degrade throughout testing. It also tested
out as the least complex and among the most manageable methods for implementing an Upsert operation. Finally, the standard SSIS
implementation, while easy to manage and allowing for the greatest degree of configuration, performed the worst.
Slowly Changing Dimensions
When Slowly Changing Dimensions are discussed, the two primary types considered are Type-1 and Type-2.
The difference between these two types is whether history is tracked when the dimension changes; the
fundamental implementation of each is the same. In terms of implementation options, you have three available out of the box: you can
use the Slowly Changing Dimension transformation, implement custom slowly changing dimension logic, or use an Insert over MERGE.
A fourth option is available using the Task Factory Dimension Merge transformation. No matter which option you choose, understanding
the strengths and weaknesses of each is critical to selecting the best solution for the task at hand.
Slowly Changing Dimension (SCD) Transform
The SCD Transform is a wizard-based component that consists
of five steps. The first step in the wizard requires that you select
the destination dimension table, map the input columns and
identify key columns. The second step allows you to configure
the SCD type for each column. The three types, Fixed (Type-0),
Changing (Type-1) and Historical (Type-2), allow for mixing
Slowly Changing Dimension types within the dimension table.
The third, fourth and fifth steps allow for further configuration
of the SCD implementation by letting you configure the
behavior for Fixed and Changing attributes, define how the
Historical versions are implemented and finally set up support
for inferred members.
Once the wizard completes, a series of new transformations
are added to the data flow of your package to implement the
configured solution. While the built-in SCD Transform excels in
ease-of-use, its numerous drawbacks have been thoroughly discussed
and dissected in a number of books, blogs and white papers.
[Figure: Built-In SCD Transform]
Starting with performance, the SCD Transform underachieves
both in the way source and dimension rows are compared
within the transform and in its reliance on the OLE DB Command
to handle the expiration of Type-2 rows as well as Type-1 updates.
As discussed previously, the OLE DB Command is a row-by-row
operation, which is a significant drag on performance.
Manageability is also an issue, since it is not possible to re-enter the
wizard to change or update the configuration without the
transformation regenerating each of the downstream data flow
transformations. This may or may not be a huge issue depending
on your requirements, but it can be a headache if you have manually
updated the downstream transforms for either performance tuning or
functionality reasons.
Despite its numerous issues, the SCD Transform has its place.
If your dimension is small and performance is not an issue, this
transform may be suitable, as it is the easiest to implement and
requires nothing beyond the default installation of SSIS.
Custom SCD with SSIS
Implementing a custom SCD solution is handled in a manner
similar to the output of the SCD Transform. Instead of relying
on the SCD Transform to look up and then compare rows, you as the
developer implement each of those tasks using data flow
transformations. In its simplest form, a custom SCD would use
a Lookup transformation to look up the dimension rows. New
rows that were not matched would be bulk inserted using the OLE
DB Destination. Rows that matched would need to be compared
using an expression, the T-SQL CHECKSUM function or another of the
methods previously discussed. A Conditional Split
transformation would then be used to send each matched row to the
appropriate output, whether Type-1, Type-2 or
Ignored for rows that have not changed.
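The routing decision made by the lookup and Conditional Split can be sketched as a small Python function. The function name, column lists and sample rows are hypothetical, and giving a Type-2 change precedence over a Type-1 change when both occur in the same row is an assumption of this sketch, not something the paper specifies.

```python
def classify(source_row, current_row, type1_cols, type2_cols):
    """Return the output a source row belongs to."""
    if current_row is None:
        return "New"          # no match in the lookup: bulk insert
    if any(source_row[c] != current_row[c] for c in type2_cols):
        return "Type-2"       # historical attribute changed: expire and re-insert
    if any(source_row[c] != current_row[c] for c in type1_cols):
        return "Type-1"       # changing attribute changed: update in place
    return "Unchanged"        # ignored


# Hypothetical dimension row: "phone" is Type-1, "city" is Type-2.
current = {"id": 7, "phone": "555-0100", "city": "Tampa"}
r_new = classify({"id": 8, "phone": "555-0199", "city": "Ocala"}, None, ["phone"], ["city"])
r_t1 = classify({"id": 7, "phone": "555-0111", "city": "Tampa"}, current, ["phone"], ["city"])
r_t2 = classify({"id": 7, "phone": "555-0100", "city": "Miami"}, current, ["phone"], ["city"])
r_same = classify(dict(current), current, ["phone"], ["city"])
```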
The Custom SCD implementation gives you the most flexibility,
as you would expect, since you are responsible for implementing
each and every step. While this flexibility can be beneficial, it also
adds complexity to the solution, particularly when the SCD is
extended to implement additional features such as surrogate key
management and inferred member support.
[Figure: Custom SCD]
Performance is another area of concern. Building the Custom
SCD allows you to bypass the lookup and match performance
issues associated with the built-in SCD Transform, but if you use
OLE DB Commands it ultimately means you are going to face the
performance penalty of row-by-row operations. Issues could also
arise with the lookup as the dimension grows.
Stepping back to the discussion on Upserts with SSIS, two
patterns are applicable to help you get around these performance
issues. The Merge Join pattern will optimize and facilitate lookups
against large dimension tables, while you could implement
staging tables to perform set-based updates instead of
using the RBAR approach. Both of these patterns will improve
performance but add further complexity to the overall solution.
SCD with MERGE
Implementing a Slowly Changing Dimension with the T-SQL MERGE
is an almost identical solution to that discussed in the Upsert with
MERGE section, with just two key differences. First, a straightforward
set-based update is executed to handle all the Type-1 changes. Next,
instead of a straight MERGE statement as done with the Upsert,
an Insert over MERGE is used to handle the expiration of Type-2
rows as well as the insertion of the new version of each row.
For the MERGE to work, the matching criterion is configured
such that only matching rows with Type-2 changes are affected.
The update statement simply expires the current row. The Insert
over MERGE statement takes advantage of the OUTPUT clause, which
allows you to pass the columns from your source, along with the
merge action in the form of the $action column, back out of
the merge. Using this functionality you can screen the rows that
were updated and pass them back into an insert statement to
complete the Type-2 change.
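The two-stage flow described above (an inner merge that expires current rows and reports its action, and an outer insert that consumes only the updated rows) can be mimicked in Python. This is a conceptual sketch, not T-SQL: the function names, the `start_date` and `end_date` columns, and the sample dimension are hypothetical, and the tuple of action and source row merely stands in for what OUTPUT with $action would return.

```python
from datetime import date


def merge_expire(dimension, source_rows, key, type2_cols, today):
    """Expire current rows with a Type-2 change; return (action, source_row) pairs."""
    current = {row[key]: row for row in dimension if row["end_date"] is None}
    output = []
    for src in source_rows:
        row = current.get(src[key])
        if row is not None and any(row[c] != src[c] for c in type2_cols):
            row["end_date"] = today           # the update simply expires the row
            output.append(("UPDATE", src))    # stand-in for OUTPUT $action
    return output


def insert_over_merge(dimension, source_rows, key, type2_cols, today):
    """Screen the merge output for updates and insert the new row versions."""
    for action, src in merge_expire(dimension, source_rows, key, type2_cols, today):
        if action == "UPDATE":
            dimension.append({**src, "start_date": today, "end_date": None})


dimension = [{"id": 7, "city": "Tampa", "start_date": date(2012, 1, 1), "end_date": None}]
source = [{"id": 7, "city": "Miami"}]
insert_over_merge(dimension, source, "id", ["city"], date(2013, 6, 1))
```

After the call the dimension holds two rows for key 7: the expired Tampa version and a current Miami version.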
The benefits and drawbacks of this solution are exactly the same
as with the Upsert using MERGE: it performs extremely
well at the expense of complexity and manageability.
[Figure: Sample Insert over Merge]
SCD with Task Factory Dimension Merge
Like the built-in SCD Transform, the Task Factory Dimension
Merge uses a wizard to allow for easy configuration of slowly
changing dimension support. You start by composing the existing
dimensions, which includes identifying the business and surrogate
keys as well as configuring the SCD type for each dimension
column. Column mappings between the source input and the
destination dimension are then defined and can be tweaked by
dragging and dropping the columns to create mappings.
From there, you get into more refined or advanced configuration
than is available in other implementations. You can configure
the Row Change Detection to ignore case, leading/trailing
spaces and nulls during comparisons. Advanced date handling
is supported for Type-2 changes to allow both specific date
endpoints and flexible flag columns to indicate current rows.
Other advanced features include built-in Surrogate Key Handling,
Inferred Member support, input and output row count auditing,
advanced component logging so you know what is happening
internally, and a performance tab that allows you to suppress
warnings, set timeouts, configure threading and select a hashing
algorithm to use.
[Figure: Task Factory Dimension Merge UI]
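To show what change detection that ignores case, leading/trailing spaces and nulls amounts to, here is a small Python sketch. It is not the Task Factory implementation; the function names and option flags are invented for the example.

```python
def normalize(value, ignore_case=True, trim=True, null_as_empty=True):
    """Reduce a value to the form used for comparison."""
    if value is None:
        return "" if null_as_empty else None
    text = str(value)
    if trim:
        text = text.strip()        # drop leading/trailing spaces
    if ignore_case:
        text = text.casefold()     # case-insensitive comparison
    return text


def values_differ(a, b, **options):
    """True when two values differ after normalization."""
    return normalize(a, **options) != normalize(b, **options)


d_spaces = values_differ(" Smith ", "smith")
d_null = values_differ(None, "")
d_real = values_differ("Smith", "Smyth")
d_case = values_differ("Smith", "smith", ignore_case=False)
```

Normalizing before comparing keeps cosmetic differences in the source from generating needless Type-1 updates or Type-2 versions.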
The Task Factory Dimension Merge does not perform any of the
inserts or updates required for the Slowly Changing Dimension.
Instead, each row is directed to one or more outputs, and
the outputs are then handled by the developer working with the
transformation. Standard outputs are available for New, Updated
Type-1, Type-2 Expiry/Type-1 Combined, Type-2 New, Invalid,
Unchanged and Deleted rows. In addition, outputs are provided
for auditing and statistical information. The flexibility this
implementation provides allows the developer to choose the level
of complexity of the implementation in terms of either a row-by-row
or set-based update approach.
[Figure: Task Factory Dimension Merge Implementation]
Performance-wise, the Task Factory Dimension Merge is
comparable to the Custom SCD implementation. While
the Custom SCD implementation will outperform the Dimension
Merge on smaller sets of data, the Dimension Merge excels as the
data set grows. Much like the Task Factory Upsert Destination,
the Dimension Merge also benefits from simplicity in set-up
and manageability, saving you both time and effort. Unlike
the built-in SCD Transform, you have the ability to edit the
transformation configuration at any time without losing anything
downstream.
SCD Performance Testing
Continuing the testing methodology used for the Upsert testing, a similar test was constructed for each SCD implementation discussed.
Each test consisted of a set of source data that contained both Type-1 and Type-2 changes as well as new rows and rows which were
unchanged. Every test was run three times and the average execution time was taken and used to calculate the throughput in terms of
rows per second. The hardware and environment set-up was the same as previously noted.

Test Cases
Source Size     New       Type-1    Type-2    Unchanged
15,000 rows     5,000     500       500       9,000
50,000 rows     20,000    1,000     1,000     23,000
100,000 rows    25,000    5,000     5,000     65,000
Performance Results
Overall Results

Results in Rows per Second
Source Size     Built-In SCD    Custom SCD    Dimension Merge    Merge
15,000 rows     297.626921      3669.87441    2543.666271        10804.322
50,000 rows     205.451308      2560.73203    2095.733087        15166.835
100,000 rows    170.500949      406.19859     501.1501396        18192.844

Rankings (1 = most favorable)
Method             Performance    Complexity    Manageability    Configurability
Built-In SCD       4              1             3                3
Custom SCD         2              3             2                2
Dimension Merge    2              2             1                1
Merge              1              4             4                4
Summary
The big winner in terms of performance was the MERGE implementation and, much like in the previous test, it was also the most complex,
least configurable and least manageable. The Dimension Merge and Custom SCD implementations are the most balanced approaches.
Both are similar in performance, with the Dimension Merge gaining an edge in terms of complexity, manageability and configurability.
The Built-In SCD transformation, as expected, performed the worst, yet is the simplest solution.
Wrap-Up
When it comes time to implement an Upsert and/or Slowly Changing Dimension, you clearly have options. Oftentimes, business
requirements and your environment will help eliminate one or more possible solutions. What remains requires that you balance
performance needs with complexity, manageability and the opportunity for configuration, whether it be to support auditing, logging or
error handling.
Integration Services offers you the opportunity to implement each of these tasks with a varying degree of support. When you use the
out-of-the-box tools, however, regardless of the implementation selected, performance and complexity are directly correlated. The Task
Factory Upsert Destination and Dimension Merge, on the other hand, both represent a balanced implementation. Both components offer
tangible performance while limiting the complexity found in other implementations. In addition, both will save you time and effort in
implementing either an Upsert or a Slowly Changing Dimension.

More Related Content

What's hot

Real Time Operational Analytics with Microsoft Sql Server 2016 [Liviu Ieran]
Real Time Operational Analytics with Microsoft Sql Server 2016 [Liviu Ieran]Real Time Operational Analytics with Microsoft Sql Server 2016 [Liviu Ieran]
Real Time Operational Analytics with Microsoft Sql Server 2016 [Liviu Ieran]
ITCamp
 
ORACLE 12C DATA GUARD: FAR SYNC, REAL-TIME CASCADE STANDBY AND OTHER GOODIES
ORACLE 12C DATA GUARD: FAR SYNC, REAL-TIME CASCADE STANDBY AND OTHER GOODIESORACLE 12C DATA GUARD: FAR SYNC, REAL-TIME CASCADE STANDBY AND OTHER GOODIES
ORACLE 12C DATA GUARD: FAR SYNC, REAL-TIME CASCADE STANDBY AND OTHER GOODIES
Ludovico Caldara
 
Presentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12cPresentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12c
Ronald Francisco Vargas Quesada
 
The Database Environment Chapter 7
The Database Environment Chapter 7The Database Environment Chapter 7
The Database Environment Chapter 7
Jeanie Arnoco
 
The Database Environment Chapter 13
The Database Environment Chapter 13The Database Environment Chapter 13
The Database Environment Chapter 13
Jeanie Arnoco
 
Exploring Scalability, Performance And Deployment
Exploring Scalability, Performance And DeploymentExploring Scalability, Performance And Deployment
Exploring Scalability, Performance And Deploymentrsnarayanan
 
Tech-Spark: Scaling Databases
Tech-Spark: Scaling DatabasesTech-Spark: Scaling Databases
Tech-Spark: Scaling Databases
Ralph Attard
 
Day 1 Data Stage Administrator And Director 11.0
Day 1 Data Stage Administrator And Director 11.0Day 1 Data Stage Administrator And Director 11.0
Day 1 Data Stage Administrator And Director 11.0
kshanmug2
 
Understanding System Performance
Understanding System PerformanceUnderstanding System Performance
Understanding System Performance
Teradata
 
DB2 LUW V11.1 CERTIFICATION TRAINING PART #1
DB2 LUW V11.1 CERTIFICATION TRAINING PART #1DB2 LUW V11.1 CERTIFICATION TRAINING PART #1
DB2 LUW V11.1 CERTIFICATION TRAINING PART #1
sunildupakuntla
 
Sql server 2008 r2 perf and scale datasheet
Sql server 2008 r2 perf and scale   datasheetSql server 2008 r2 perf and scale   datasheet
Sql server 2008 r2 perf and scale datasheetKlaudiia Jacome
 
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...
Alex Zaballa
 
Overview of Postgres 9.5
Overview of Postgres 9.5 Overview of Postgres 9.5
Overview of Postgres 9.5
EDB
 
Business Intelligence Portfolio
Business Intelligence PortfolioBusiness Intelligence Portfolio
Business Intelligence Portfoliogaryt1953
 
Oracle 12c New Features for Developers
Oracle 12c New Features for DevelopersOracle 12c New Features for Developers
Oracle 12c New Features for Developers
CompleteITProfessional
 

What's hot (16)

Real Time Operational Analytics with Microsoft Sql Server 2016 [Liviu Ieran]
Real Time Operational Analytics with Microsoft Sql Server 2016 [Liviu Ieran]Real Time Operational Analytics with Microsoft Sql Server 2016 [Liviu Ieran]
Real Time Operational Analytics with Microsoft Sql Server 2016 [Liviu Ieran]
 
ORACLE 12C DATA GUARD: FAR SYNC, REAL-TIME CASCADE STANDBY AND OTHER GOODIES
ORACLE 12C DATA GUARD: FAR SYNC, REAL-TIME CASCADE STANDBY AND OTHER GOODIESORACLE 12C DATA GUARD: FAR SYNC, REAL-TIME CASCADE STANDBY AND OTHER GOODIES
ORACLE 12C DATA GUARD: FAR SYNC, REAL-TIME CASCADE STANDBY AND OTHER GOODIES
 
Migration from 8.1 to 11.3
Migration from 8.1 to 11.3Migration from 8.1 to 11.3
Migration from 8.1 to 11.3
 
Presentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12cPresentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12c
 
The Database Environment Chapter 7
The Database Environment Chapter 7The Database Environment Chapter 7
The Database Environment Chapter 7
 
The Database Environment Chapter 13
The Database Environment Chapter 13The Database Environment Chapter 13
The Database Environment Chapter 13
 
Exploring Scalability, Performance And Deployment
Exploring Scalability, Performance And DeploymentExploring Scalability, Performance And Deployment
Exploring Scalability, Performance And Deployment
 
Tech-Spark: Scaling Databases
Tech-Spark: Scaling DatabasesTech-Spark: Scaling Databases
Tech-Spark: Scaling Databases
 
Day 1 Data Stage Administrator And Director 11.0
Day 1 Data Stage Administrator And Director 11.0Day 1 Data Stage Administrator And Director 11.0
Day 1 Data Stage Administrator And Director 11.0
 
Understanding System Performance
Understanding System PerformanceUnderstanding System Performance
Understanding System Performance
 
DB2 LUW V11.1 CERTIFICATION TRAINING PART #1
DB2 LUW V11.1 CERTIFICATION TRAINING PART #1DB2 LUW V11.1 CERTIFICATION TRAINING PART #1
DB2 LUW V11.1 CERTIFICATION TRAINING PART #1
 
Sql server 2008 r2 perf and scale datasheet
Sql server 2008 r2 perf and scale   datasheetSql server 2008 r2 perf and scale   datasheet
Sql server 2008 r2 perf and scale datasheet
 
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...
OOW16 - Oracle Database 12c - The Best Oracle Database 12c New Features for D...
 
Overview of Postgres 9.5
Overview of Postgres 9.5 Overview of Postgres 9.5
Overview of Postgres 9.5
 
Business Intelligence Portfolio
Business Intelligence PortfolioBusiness Intelligence Portfolio
Business Intelligence Portfolio
 
Oracle 12c New Features for Developers
Oracle 12c New Features for DevelopersOracle 12c New Features for Developers
Oracle 12c New Features for Developers
 

Whitepaper Performance Tuning using Upsert and SCD (Task Factory)

  • 1. Performance Tuning using Upsert and SCD Written By: Chris Price cprice@pragmaticworks.com
  • 2. Contents Upserts 3 Upserts with SSIS 3 Upsert with MERGE 6 Upsert with Task Factory Upsert Destination 7 Upsert Performance Testing 8 Summary 10 Slowly Changing Dimensions 11 Slowly Changing Dimension (SCD) Transform 11 Custom SCD with SSIS 12 SCD with MERGE 13 SCD with Task Factory Dimension Merge 14 SCD Performance Testing 16 Summary 18 Wrap-Up 19
  • 3. Pragmatic Works White Paper Performance Tuning using Upsert and SCD www.pragmaticworks.com PAGE 3 Upserts Upsert is a portmanteau that blends the distinct actions of an Update and an Insert and describes how both occur in the context of a single execution. Logically speaking, the Upsert process is extremely straightforward. Source rows are compared to a destination; if a match is found based on some specified criteria the row is updated, otherwise the row is considered new and an insert occurs. While the process can become more complex if you decide to do conditional updates rather than blind updates, that is basically it. To implement an Upsert, you have three primary options in the SQL Server environment. The first and most obvious is using SSIS and its data flow components to orchestrate the Upsert process, the second is using the T-SQL MERGE command and the third is the Pragmatic Works Task Factory Upsert component. Upserts with SSIS Implementing an Upsert using purely SSIS is a trivial task that consists of a minimum of four data flow components. Data originating from any source are piped through a Lookup transformation and the output is split into two, one for rows matched in the lookup and one for rows that were not matched. The no-match output contains new rows that must be inserted using one of the supported destinations in SSIS. The matched rows are those that need to be updated, and an OLE DB Command transformation is used to issue an update for each row. As a SQL Server BI Pro developing SSIS packages, you often encounter situations and scenarios that have a number of different solutions. Choosing the right solution often means balancing tangible performance requirements with more intangible requirements like making your packages more maintainable. This white paper will focus on the options for handling two of these scenarios: Upserts and Slowly Changing Dimensions.
We will review multiple implementation options for each situation, discuss how each is accomplished, review the performance implications and the trade-offs of each in terms of complexity, manageability and opportunities for configuration of auditing, and look at logging and error handling.
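The Lookup-and-split flow described on this slide can be sketched outside SSIS. The Python below is only an illustration of the logic, not SSIS code: the destination is modeled as an in-memory dictionary keyed on a hypothetical `id` column, and the function names are assumptions made for this example. Source keys are assumed to be unique.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def split_by_lookup(source_rows: Iterable[Row], destination_keys,
                    key: str = "id") -> Tuple[List[Row], List[Row]]:
    """Mimic the Lookup transformation: route each source row to the
    match output or the no-match output."""
    matched: List[Row] = []
    unmatched: List[Row] = []
    for row in source_rows:
        if row[key] in destination_keys:
            matched.append(row)
        else:
            unmatched.append(row)
    return matched, unmatched


def upsert(destination: Dict[object, Row], source_rows: Iterable[Row],
           key: str = "id") -> Tuple[int, int]:
    """Blind upsert: every matched row is updated, every other row inserted.
    Returns (rows inserted, rows updated)."""
    to_update, to_insert = split_by_lookup(source_rows, destination.keys(), key)
    for row in to_update:   # one update per row, like the OLE DB Command
        destination[row[key]] = dict(row)
    for row in to_insert:   # stands in for the insert destination
        destination[row[key]] = dict(row)
    return len(to_insert), len(to_update)
```

Holding every destination key in memory corresponds loosely to running the Lookup in Full Cache mode.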
  • 4. [Figure: Standard SSIS Upsert] As this solution is currently designed, every row from the source will either be inserted or updated. This may or may not be the desired behavior based on your business requirements. Most times, you will find that you can screen out rows that have not changed to improve performance by eliminating updates. To accomplish this you can use an expression in a conditional split, the T-SQL CHECKSUM function (if both your source and destination are SQL Server) or a script transformation to generate a hash for each row. While this is as simple as an Upsert gets in terms of implementation and maintenance, there are several obvious performance drawbacks to this approach as the volume of data grows. The first is the Lookup transformation. The throughput, in rows per second, that you get through the Lookup transformation is directly correlated to the cache mode you configure on the lookup. Full Cache is the optimal setting, but depending on the size of your destination dataset, the time and amount of memory required may exceed what’s available. Partial Cache mode and No Cache mode, on the other hand, are performance killers and there are limited scenarios in which you should use either option. The second drawback, and the one most commonly encountered in terms of performance issues, is the OLE DB Command used to handle updates. The update command works row-by-row, meaning that if you have 10,000 rows to update, 10,000 updates will be issued sequentially. This form of processing is the opposite of the batch processing you may be familiar with and has been termed RBAR, or row-by-agonizing-row, because of the severe effect it has on performance. Despite these drawbacks, this solution excels when the set of data contains no more than 20,000 rows.
If you find that your dataset is larger, there are several workarounds to mitigate the drawbacks, all of which come at the expense of maintainability and ease of use. When the Lookup transformation is the bottleneck, you can replace it with a Merge Join pattern. The Merge Join pattern facilitates reading both the source and destination in a single pass, which allows for handling large sets of data more efficiently.
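The idea raised earlier on this slide, screening out unchanged rows by generating a hash for each row, can be sketched as follows. This is a minimal Python illustration rather than a script transformation; the function names, the choice of SHA-256 and the column names in the usage are assumptions made for the example.

```python
import hashlib
from typing import Dict, Iterable, List, Sequence

Row = Dict[str, object]


def row_hash(row: Row, columns: Sequence[str]) -> str:
    """Hash the compared columns of one row.
    repr() keeps None distinct from the string 'None', and the unit
    separator keeps ('ab', 'c') distinct from ('a', 'bc')."""
    payload = "\x1f".join(repr(row[c]) for c in columns)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def changed_rows(source_rows: Iterable[Row], destination: Dict[object, Row],
                 key: str, columns: Sequence[str]) -> List[Row]:
    """Return only the matched source rows whose hash differs from the
    destination row, so unchanged rows never reach the update step."""
    result: List[Row] = []
    for row in source_rows:
        existing = destination.get(row[key])
        if existing is not None and row_hash(row, columns) != row_hash(existing, columns):
            result.append(row)
    return result
```

In practice the destination hash would usually be stored or computed once rather than recomputed per comparison; the sketch recomputes it only to stay short.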
  • 5. To use this pattern, you need an extra source to read in your destination data. Keep in mind that the Merge Join transformation requires two sorted inputs. Allowing the source to handle the sorting is the most efficient, but requires that you configure each Source as sorted. If your source does not support sorting, such as a text file, you must use a Sort Transformation. The Sort Transformation is a fully blocking transformation, meaning that it must read all rows before it can output anything, further degrading package performance. The Merge Join transform must be configured to use a left join to allow both source rows that match the destination and those that do not to be passed down the data flow. A conditional split is then used to determine whether an Insert or Update is needed for each row. To overcome the row-by-row operation of the OLE DB Command, a staging table is needed to allow a single set-based Update to be called. After you have created the staging table, replace the OLE DB Command with an OLE DB Destination and map the row columns to the columns in the staging table. In the control flow, two Execute SQL Tasks are needed. The first precedes the Data Flow and simply truncates the staging table so that it is empty. The second Execute SQL Task follows the data flow and is responsible for issuing the set-based Update. When you combine both of these workarounds, the package will handle large sets of data with ease and even rivals the performance of the MERGE statement when working with sets of data that exceed 2 million rows. The trade-off, however, is obvious: supporting and maintaining the package is now an order of magnitude more difficult because of the additional moving pieces and data structures required.
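The single-pass behavior of the Merge Join pattern, a left join over two sorted inputs followed by a conditional split, can be illustrated with a short Python sketch. It is not SSIS code; it assumes both inputs are already sorted ascending on a hypothetical `id` key and that keys are unique within each input.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def merge_join_split(source: Iterable[Row], destination: Iterable[Row],
                     key: str = "id") -> Tuple[List[Row], List[Row]]:
    """Left-join sorted source rows to sorted destination rows in one pass
    and split the result into (inserts, updates)."""
    inserts: List[Row] = []
    updates: List[Row] = []
    dest_iter = iter(destination)
    dest_row = next(dest_iter, None)
    for src_row in source:
        # Advance the destination cursor until it reaches or passes the source key.
        while dest_row is not None and dest_row[key] < src_row[key]:
            dest_row = next(dest_iter, None)
        if dest_row is not None and dest_row[key] == src_row[key]:
            updates.append(src_row)   # match found: candidate for update
        else:
            inserts.append(src_row)   # no match: new row
    return inserts, updates
```

Each input is read exactly once and neither has to be held in memory, which is what lets the pattern scale where a fully cached lookup cannot. The sorted-input requirement is also visible here: with unsorted inputs the cursor would skip past matches.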
  • 6. Upsert with MERGE Unlike the prior solution that uses SSIS to execute multiple DML statements to perform an Upsert operation, the MERGE feature in SQL Server provides a high-performance and efficient way to perform the Upsert by calling both the Insert and Update in a single statement. To implement this solution you must stage all of your source data in a table on the destination database. In the same manner as the prior solution, an SSIS package can be used to orchestrate truncating the staging table, moving the data from the source to the staging table and then executing the MERGE command. The difference lies in the T-SQL MERGE command. While a detailed explanation of the MERGE statement is beyond the scope of this white paper, MERGE combines both inserts and updates into a single pass of the data, using defined criteria to determine when records match and what operations to perform when a match is or is not found. [Figure: T-SQL MERGE Statement] The drawback to this method is the complexity of the statement, as the accompanying figure illustrates. Beyond the complexity of the syntax, control is also sacrificed, as the MERGE statement is essentially a black box. When you use the MERGE command you have no control or error handling ability; if a single record fails on either insert or update, the entire transaction is rolled back. It’s clear that what the solution provides in terms of performance and efficiency comes at the cost of complexity and loss of control. A final note on MERGE is also required. If you find yourself working on any version of SQL Server prior to 2008, this solution is not applicable, as the MERGE statement was first introduced in SQL Server 2008.
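The staging flow and the all-or-nothing behavior described above can be tried out with a small stand-in. The sketch below uses Python's standard-library `sqlite3` module, not SQL Server: SQLite has no MERGE statement, so a set-based UPDATE and a set-based INSERT inside one transaction are used in its place, and it has no TRUNCATE, so DELETE empties the staging table. The table names (`customer`, `stg_customer`) and helper names are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a destination table, a staging table and one existing row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute("CREATE TABLE stg_customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO customer (id, name) VALUES (1, 'old')")
    conn.commit()
    return conn


def apply_staged_upsert(conn: sqlite3.Connection) -> None:
    """Apply the staged rows with one set-based UPDATE and one set-based
    INSERT. The 'with' block commits both or rolls both back."""
    with conn:
        conn.execute("""
            UPDATE customer
            SET name = (SELECT s.name FROM stg_customer AS s WHERE s.id = customer.id)
            WHERE id IN (SELECT id FROM stg_customer)
        """)
        conn.execute("""
            INSERT INTO customer (id, name)
            SELECT s.id, s.name FROM stg_customer AS s
            WHERE NOT EXISTS (SELECT 1 FROM customer AS c WHERE c.id = s.id)
        """)


def load(conn: sqlite3.Connection, rows) -> None:
    """Empty the staging table, stage the source rows, then apply them."""
    with conn:
        conn.execute("DELETE FROM stg_customer")
        conn.executemany("INSERT INTO stg_customer (id, name) VALUES (?, ?)", rows)
    apply_staged_upsert(conn)
```

Staging a row that violates the destination's NOT NULL constraint, for example `load(conn, [(1, "newer"), (3, None)])`, raises `sqlite3.IntegrityError` and leaves the destination untouched, including the update to row 1 that had already been applied inside the transaction. That mirrors the loss of per-row error handling the paper attributes to MERGE.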
  • 7. Upsert with Task Factory Upsert Destination [Figure: Task Factory Upsert Destination UI] The Upsert Destination is a component included in the Pragmatic Works Task Factory library of components and is a balanced alternative when implementing an Upsert operation. Without sacrificing performance, much of the complexity is abstracted away from the developer and is boiled down to configuring settings across three tabs. To implement the Upsert Destination, drag and drop the Upsert Destination component onto your data flow design surface. The component requires an ADO.NET connection, so you will need to create one if one does not already exist. From there, you simply configure the Destination table, map your source columns to destination columns (making sure to identify the key column) and choose your update method, and you are ready to go. Upsert Destination supports four update methods out of the box. The first and fastest is the Bulk Update. This method is similar to the one that has been discussed previously, as all rows that exist in the destination are updated. You can also fine-tune the update by choosing to do updates based on timestamps, a last updated column or even a configurable column comparison. Beyond the update method, you can easily configure the component to update a Last Modified column, enable identity inserts, provide insert and update row counts, and take control over the transactional container. While none of these features are unique to the Task Factory Upsert Destination, the ease with which you can be up and running is huge in terms of a developer’s time and effort. When you consider that there are no staging tables required, no special requirements of the source data, no workarounds needed and that the component works with SQL Server 2005 and up, it is a solid option to consider.
  • 8. Upsert Performance Testing To assess each of the methods discussed, a simple test was performed. In each test the bulk update method, in which all rows are either inserted or updated, was used. The testing methodology required that each test be run three times, taking the average execution time for all three executions and then calculating the throughput in rows per second as the result. The results were then paired with rankings for each method according to complexity, manageability and configurability. Prior to each test being run, the SQL Server cache and buffers were cleared using DBCC FREEPROCCACHE and DBCC DROPCLEANBUFFERS. All tests were run on an IBM x220 laptop with an i7 2640M processor and 16GB of RAM. A default install of SQL Server 2012, with the maximum server memory set to 2GB, was used for all database operations.
    Test Cases
    Test Case Size    Rows Inserted    Rows Updated
    10,000            6,500            3,500
    100,000           65,000           35,000
    500,000           325,000          175,000
    1,000,000         650,000          350,000
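The throughput figure described in the methodology (average the three execution times, then divide the row count by that average) is simple arithmetic. The Python sketch below is an assumed reading of that description, and the timings in the usage note are made-up values, not measurements from the paper.

```python
from typing import Sequence


def throughput_rows_per_second(row_count: int, run_seconds: Sequence[float]) -> float:
    """Average the execution times of the runs, then convert the test case
    size into rows per second."""
    if not run_seconds:
        raise ValueError("at least one timed run is required")
    average_seconds = sum(run_seconds) / len(run_seconds)
    return row_count / average_seconds
```

For example, three hypothetical runs of the 10,000-row case taking 2.0, 2.5 and 1.5 seconds average 2.0 seconds, giving 5,000 rows per second.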
  • 9. Performance Results
    Test Case Size    Merge          Upsert Destination    SSIS (Batch)    SSIS
    10,000            6917.223887    5169.73979            6609.385327     4144.791379
    100,000           28873.91723    19040.36558           28533.38406     1448.862402
    500,000           37736.79841    24491.79525           36840.55408     1525.442861
    1,000,000         36777.32555    24865.93119           33549.91668     1596.765592
Results in Rows per Second
Overall Results
    Method                Performance    Complexity    Manageability    Configurability
    Merge                 1              4             4                4
    Upsert Destination    3              1             2                3
    SSIS (Batch)          2              3             3                2
    SSIS                  4              2             1                1
  • 10. Summary
As expected, from a pure performance perspective the Upsert with MERGE outperformed all other methods of implementing an Upsert operation. It also easily topped all others in terms of complexity while being the least manageable and least configurable.
The SSIS (Batch) method also performed well, as it is able to take advantage of bulk inserts into a staging table followed by a set-based update. While not as complex as the MERGE method, it does require both sorted sources and staging tables, ultimately bumping its manageability down.
The Upsert Destination performed well and was the only method whose performance did not degrade throughout testing. It also tested out as the least complex and most manageable method for implementing an Upsert operation.
Finally, the SSIS implementation, while easy to manage and allowing the greatest degree of configuration, performed the worst.
  • 11. Slowly Changing Dimensions
When Slowly Changing Dimensions are discussed, the two primary types considered are Type-1 and Type-2. Recall that the difference between the two is whether history is tracked when the dimension changes; the fundamental implementation of each is the same.
In terms of implementation options, you have three available out of the box. You can use the Slowly Changing Dimension transformation, implement custom slowly changing dimension logic or use the Insert over MERGE. A fourth option is available using the Task Factory Dimension Merge transformation. No matter which option you choose, understanding the strengths and weaknesses of each is critical to selecting the best solution for the task at hand.
Slowly Changing Dimension (SCD) Transform
The SCD Transform is a wizard-based component that consists of five steps. The first step in the wizard requires that you select the destination dimension table, map the input columns and identify key columns. The second step allows you to configure the SCD type for each column. The three types, Fixed (Type-0), Changing (Type-1) and Historical (Type-2), allow for mixing Slowly Changing Dimension types within the dimension table. The third, fourth and fifth steps allow you to configure the behavior for Fixed and Changing attributes, define how the Historical versions are implemented and finally set up support for inferred members. Once the wizard completes, a series of new transformations is added to the data flow of your package to implement the configured solution.
Figure: Built-In SCD Transform
While the built-in SCD Transform excels in ease of use, its numerous drawbacks have been thoroughly discussed and dissected in a number of books, blogs and white papers.
  • 12. Starting with performance, the SCD Transform underachieves both in the way source and dimension rows are compared within the transform and in its reliance on the OLE DB Command to handle the expiration of Type-2 rows as well as Type-1 updates. As discussed previously, the OLE DB Command is a row-by-row operation, which is a significant drag on performance.
Manageability is also an issue, since it is not possible to re-enter the wizard to change or update the configuration without the transformation regenerating each of the downstream data flow transformations. This may or may not be a huge issue depending on your requirements, but it can be a headache if you manually update the downstream transforms for either performance tuning or functionality reasons.
Despite its numerous issues, the SCD Transform has its place. If your dimension is small and performance is not an issue, this transform may be suitable, as it is the easiest to implement and requires nothing beyond the default installation of SSIS.
Custom SCD with SSIS
Implementing a custom SCD solution is handled in a manner similar to the output of the SCD Transform. Instead of relying on the SCD Transform to look up and then compare rows, you as the developer implement each of those tasks using data flow transformations.
In its simplest form, a custom SCD would use a Lookup transformation to look up the dimension rows. New rows that were not matched would be bulk inserted using the OLE DB Destination. Rows that matched would need to be compared using an expression, the T-SQL CHECKSUM or another of the methods that were previously discussed. A Conditional Split transformation would be used to send each matched row to the appropriate output destination, whether Type-1, Type-2 or Ignored for rows that have not changed.
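The lookup, compare and split steps described above can be sketched outside of SSIS in plain Python. This is a minimal illustration and not the SSIS data flow itself: the function name, the output names and the sample columns are hypothetical, and routing a row with both kinds of change to the Type-2 output is an assumption of this sketch.

```python
def classify_rows(source_rows, dimension_rows, key, type1_cols, type2_cols):
    """Route each source row to a new, type2, type1 or unchanged output."""
    # Lookup step: index the dimension rows by business key.
    lookup = {row[key]: row for row in dimension_rows}
    outputs = {"new": [], "type1": [], "type2": [], "unchanged": []}
    for row in source_rows:
        match = lookup.get(row[key])
        if match is None:
            outputs["new"].append(row)          # no match: insert as a new row
        elif any(row[c] != match[c] for c in type2_cols):
            outputs["type2"].append(row)        # historical attribute changed
        elif any(row[c] != match[c] for c in type1_cols):
            outputs["type1"].append(row)        # changing attribute overwritten
        else:
            outputs["unchanged"].append(row)    # ignored
    return outputs


# Hypothetical dimension and source rows.
dimension = [{"id": 1, "email": "a@x.com", "city": "Tampa"}]
source = [
    {"id": 1, "email": "a@x.com", "city": "Miami"},
    {"id": 2, "email": "b@x.com", "city": "Ocala"},
]
result = classify_rows(source, dimension, "id", ["email"], ["city"])
print({name: len(rows) for name, rows in result.items()})
```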
The Custom SCD implementation gives you the most flexibility, as you would expect, since you are responsible for implementing each and every step. While this flexibility can be beneficial, it also adds complexity to the solution, particularly when the SCD is extended to implement additional features such as surrogate key management and inferred member support.
Figure: Custom SCD
Performance is another area of concern. Building the Custom SCD allows you to bypass the lookup and match performance issues associated with the built-in SCD Transform, but if you use OLE DB Commands you will still face the performance penalty of row-by-row operations. Issues could also arise with the lookup as the dimension grows. Stepping back to the discussion on Upserts with SSIS, two patterns can help you get around these performance issues. The Merge Join pattern will optimize and facilitate lookups against large dimension tables, while staging tables let you perform set-based updates instead of using the RBAR approach. Both of these patterns will improve performance but add further complexity to the overall solution.
  • 13. SCD with MERGE
Implementing a Slowly Changing Dimension with the T-SQL MERGE is an almost identical solution to that discussed in the Upsert with MERGE, with just two key differences. First, a straightforward set-based update is executed to handle all the Type-1 changes. Next, instead of a straight MERGE statement as done with the Upsert, an Insert over MERGE is used to handle the expiration of Type-2 rows as well as the insertion of the new version of each row.
For the MERGE to work, the matching criterion is configured such that only matching rows with Type-2 changes are affected. The update statement simply expires the current row. The Insert over MERGE statement takes advantage of the OUTPUT clause, which allows you to pass the columns from your source, along with the merge action in the form of the $action column, back out of the merge. Using this functionality you can screen the rows that were updated and pass them back into an insert statement to complete the Type-2 change.
The benefits and drawbacks of this solution are exactly the same as with the Upsert using MERGE. This solution performs extremely well at the expense of both complexity and manageability.
Figure: Sample Insert over Merge
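The expire-then-insert flow that the Insert over MERGE achieves can be mimicked in Python to show the logic. This is an in-memory sketch, not T-SQL and not the sample from the paper: the column names `is_current`, `start_date` and `end_date` are hypothetical, and the `actions` list merely stands in for the rows returned by the OUTPUT clause.

```python
from datetime import date


def apply_type2_changes(dimension, source_rows, key, type2_cols, as_of):
    """Expire changed current rows, then insert their new versions."""
    current = {row[key]: row for row in dimension if row["is_current"]}
    actions = []  # stands in for the OUTPUT rows: (action, source row)
    for src in source_rows:
        tgt = current.get(src[key])
        # Only matched rows with a Type-2 change are affected.
        if tgt is not None and any(src[c] != tgt[c] for c in type2_cols):
            tgt["is_current"] = False   # the update simply expires the row
            tgt["end_date"] = as_of
            actions.append(("UPDATE", src))
    # Outer insert: screen for updated rows and add the new version of each.
    for action, src in actions:
        if action == "UPDATE":
            dimension.append(
                {**src, "start_date": as_of, "end_date": None, "is_current": True}
            )
    return dimension


# Hypothetical dimension with one current row whose city then changes.
dim = [{"customer_id": 1, "city": "Tampa", "start_date": date(2012, 1, 1),
        "end_date": None, "is_current": True}]
apply_type2_changes(dim, [{"customer_id": 1, "city": "Miami"}],
                    "customer_id", ["city"], date(2013, 1, 1))
print([(r["city"], r["is_current"]) for r in dim])
```

Source rows with no match in the dimension are left untouched here; in this sketch they would be handled by a separate insert.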
  • 14. SCD with Task Factory Dimension Merge
Like the built-in SCD Transform, the Task Factory Dimension Merge uses a wizard to allow for easy configuration of slowly changing dimension support. You start by composing the existing dimension, which includes identifying the business and surrogate keys as well as configuring the SCD type for each dimension column. Column mappings between the source input and the destination dimension are then defined and can be tweaked by dragging and dropping the columns to create mappings.
Figure: Task Factory Dimension Merge UI
From there, you get into more refined or advanced configuration than is available in other implementations. You can configure the Row Change Detection to ignore case, leading/trailing spaces and nulls during comparisons. Advanced date handling is supported for Type-2 changes to allow both specific date endpoints and flexible flag columns to indicate current rows. Other advanced features include built-in Surrogate Key Handling, Inferred Member support, input and output row count auditing, advanced component logging so you know what is happening internally, and a performance tab that allows you to suppress warnings, set timeouts, configure threading and select the hashing algorithm to use.
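Hash-based change detection that ignores case, surrounding spaces and nulls can be sketched with Python's standard `hashlib` module. This illustrates the general technique only; it is not how the Dimension Merge is implemented internally, and the function name, parameters and sample columns are hypothetical.

```python
import hashlib


def row_hash(row, columns, ignore_case=True, trim=True,
             null_as_empty=True, algorithm="sha256"):
    """Hash the compared columns of a row after optional normalization."""
    parts = []
    for col in columns:
        value = row.get(col)
        if value is None:
            # Optionally treat NULL the same as an empty string.
            text = "" if null_as_empty else "\x00NULL"
        else:
            text = str(value)
            if trim:
                text = text.strip()
            if ignore_case:
                text = text.lower()
        parts.append(text)
    # A separator keeps ("ab", "c") from hashing the same as ("a", "bc").
    payload = "\x1f".join(parts).encode("utf-8")
    return hashlib.new(algorithm, payload).hexdigest()


# Hypothetical rows that differ only in case, padding and a null.
a = {"name": "  Smith ", "city": None}
b = {"name": "smith", "city": ""}
print(row_hash(a, ["name", "city"]) == row_hash(b, ["name", "city"]))
```

Comparing one hash per row instead of every column is what makes the choice of hashing algorithm a performance setting: faster algorithms cost less per row, while stronger ones reduce the chance of two different rows colliding.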
  • 15. The Task Factory Dimension Merge does not perform any of the inserts or updates required for the Slowly Changing Dimension. Instead, each row is directed to one or more outputs, and the outputs are then handled by the developer working with the transformation. Standard outputs are available for New, Updated Type-1, Type-2 Expiry/Type-1 Combined, Type-2 New, Invalid, Unchanged and Deleted rows. In addition, outputs are provided for auditing and statistical information. The flexibility this implementation provides allows the developer to choose the level of complexity of the implementation, using either a row-by-row or a set-based update approach.
Figure: Task Factory Dimension Merge Implementation
Performance-wise, the Task Factory Dimension Merge is comparable to the Custom SCD implementation. While the Custom SCD implementation will outperform the Dimension Merge on smaller sets of data, the Dimension Merge excels as the data set grows. Much like the Task Factory Upsert Destination, the Dimension Merge also benefits from simplicity in set-up and manageability, saving you both time and effort. Unlike the built-in SCD Transform, you have the ability to edit the transformation configuration at any time without losing anything downstream.
  • 16. SCD Performance Testing
Continuing the testing methodology used for the Upsert testing, a similar test was constructed for each SCD implementation discussed. Each test consisted of a set of source data that contained both Type-1 and Type-2 changes as well as new rows and rows that were unchanged. Every test was run three times, and the average execution time was used to calculate the throughput in rows per second. The hardware and environment set-up was the same as previously noted.
Test Cases
    Source Size     New       Type-1    Type-2    Unchanged
    15,000 rows     5,000     500       500       9,000
    50,000 rows     20,000    1,000     1,000     23,000
    100,000 rows    25,000    5,000     5,000     65,000
  • 17. Performance Results
    Source Size     Built-In SCD    Custom SCD    Dimension Merge    Merge
    15,000 Rows     297.626921      3669.87441    2543.666271        10804.322
    50,000 Rows     205.451308      2560.73203    2095.733087        15166.835
    100,000 Rows    170.500949      406.19859     501.1501396        18192.844
Results in Rows per Second
Overall Results
    Method             Performance    Complexity    Manageability    Configurability
    Built-In SCD       4              1             3                3
    Custom SCD         2              3             2                2
    Dimension Merge    2              2             1                1
    Merge              1              4             4                4
  • 18. Summary
The big winner in terms of performance was the MERGE implementation and, much like in the previous test, it was also the most complex, least configurable and least manageable. The Dimension Merge and Custom SCD implementations are the most balanced approaches. Both are similar in performance, with the Dimension Merge gaining an edge in terms of complexity, manageability and configurability. The built-in SCD transformation, as expected, performed the worst, yet it is the simplest solution.
  • 19. Wrap-Up
When it comes time to implement an Upsert and/or Slowly Changing Dimension, you clearly have options. Oftentimes, business requirements and your environment will help eliminate one or more possible solutions. What remains requires that you balance performance needs against complexity, manageability and the opportunity for configuration, whether to support auditing, logging or error handling.
Integration Services offers you the opportunity to implement each of these tasks with a varying degree of support. When you use the out-of-the-box tools, however, regardless of the implementation selected, performance and complexity are directly correlated.
The Task Factory Upsert Destination and Dimension Merge, on the other hand, both represent a balanced implementation. Both components offer tangible performance while limiting the complexity found in other implementations. In addition, both will save you time and effort in implementing either an Upsert or a Slowly Changing Dimension.