This document discusses data replication and Informatica's data replication solution. It defines data replication as automating the cloning of thousands of application tables in real-time while managing transaction data capture, routing, and delivery. Informatica's data replication provides continuous availability during upgrades, reduces IT costs by offloading to lower cost systems, and enables uninterrupted migrations. It replicates transactional changes between source and target systems with high extraction and apply speeds. The solution benefits data warehouses, real-time reporting, migrations, and auditing requirements.
The presentation begins with an introduction to Informatica and its focus on Data Replication.
Data Replication automates cloning application data and provides continuous availability, ensuring no downtime during upgrades and maintenance.
Market consensus highlights the flexibility of Data Replication for integration and migration, making it a top request in Informatica's portfolio.
Informatica's solution supports various database systems, enabling fast data replication and real-time reporting while ensuring system performance.
A private financial firm successfully upgraded Oracle systems with Informatica's solutions, minimizing customer disruptions and reducing operational costs.
The solution's unique strengths include rapid deployment, optimization for performance, and compatibility across various sources and targets.
The presentation closes with additional resources and a comparison of functionality between Informatica Data Replication and PowerCenter/PowerExchange.
3
What is Data Replication?
A set of technologies that automates the cloning of application data, thousands and thousands of tables at once
It also manages the capture, routing and delivery of transaction data in real-time
4
Business Drivers for Data Replication
The needs and pains
Continuous Availability
• No downtime during hardware or software upgrades
• No downtime during application maintenance
Current Information for decision making
• Fresh data always available for reporting, including Transactional Audit Data
• Up-to-date information for critical business systems
Reduce and Control IT Spend
• Choose cost effective systems for analysis
• Offload reporting to lower cost systems
• Migrate applications from expensive systems
5
What the market is saying
Customers and Analysts agree…
“Data replication is a flexible technology that can be used for many purposes, including data integration, data movement for data warehouses and ODS, reporting server, migrations, and high-availability requirements. Transactional log-based replication is one of the most common usage scenarios”
Forrester TechRadar™, “Q1 2010 Enterprise Data Integration” report
At the CAB (Customer Advisory Board) meetings, the #1 requested enhancement to Informatica’s portfolio has been Data Replication
“Data Replication tools are the second most popular data integration tools types after ETL tools”
Gartner’s 6/6/08 report, “Survey on Data Integration Practices Shows Move Toward Strategic Initiatives”
6
Informatica Data Replication Solution
[Architecture diagram: Source Systems, High Speed Extraction, High Speed Apply, Target Systems]
Databases and Data Warehouse Appliances shown: Oracle, Teradata, Netezza, Greenplum, MySQL, SQL Server, DB2/UDB, PostgreSQL, HP Vertica, Sybase ASE
Data Replication: Replicate Transaction-oriented incremental changes from source to target
Fast Clone: copy data directly from Oracle to a supported target, in a batch mode process
7
Informatica’s Data Replication
The benefits of Data Replication in your environment
Supports Active Data Warehouses
• Ensures fresher, up-to-date information from business critical online systems
• Leverages investment in an existing ‘ETL’ data warehouse approach
Enables real-time Reporting with Zero Impact to the Production Systems
• Avoid performance impact on your operational database
• Get real-time data capture
• Up-to-date information for critical business systems
Provides Uninterrupted Migration Path
• Enables movement from an older OS or database platform version to a new one with no interruption to operations
Live Auditing
• Reduce compliance risks with full audit trail
8
Informatica Success Story:
Private Commercial and Financial firm
KEY BUSINESS IMPERATIVE AND IT INITIATIVE
This private financial firm had to upgrade its Oracle systems and migrate to Linux to bring them onto current versions without causing any disruptions to its customers.
THE CHALLENGE
• Minimum disruption to first line business applications
• Keep the operational systems up to date with latest versions of database software
• Reduce cost by migrating to Linux
• Ensure there is no overhead on the source system
INFORMATICA ADVANTAGE
• Ensure no system overhead on the source system by performing log based capture
• Effectively synchronize the source and target systems and continue to replicate the changes till the switchover
• Seamless transition from initial synchronization to real time change data capture
RESULTS/BENEFITS
• Reduce downtime on any planned maintenance
• Lower costs
• Mitigate risk by moving from unsupported database version
9
Informatica’s Data Replication Solution
The Unique Strengths
Fast time to Value
• Rapid deployment and quick time to completion of projects
• Data Replication User Interface facilitates quick specification and deployment
Optimized for Operational Reporting & Active data warehousing appliance loading
• Data Replication uses innovative techniques to optimize transaction capture and Appliance loading
Heterogeneous across a wide array of sources, targets and OSs
• Data Replication is architected to support transactional replication across a breadth of sources and targets
12
Positioning Replication and PWC/PWX
Capability: Purpose built U/I
  Informatica Data Replication: Easy to configure and deploy for replication scenarios
  PowerCenter + PowerExchange CDC: For development in complex projects requiring extensive transformations
Capability: Number of Tables
  Informatica Data Replication: Any number
  PowerCenter + PowerExchange CDC: Max 70 to 90 tables per session
Capability: Transformation
  Informatica Data Replication: Simple Transformations (Substring, Data Type…)
  PowerCenter + PowerExchange CDC: Advanced Transformation (DQ, Sorters, Aggregators...)
Capability: End-End Volume/Throughput (apply to RDBMS)
  Informatica Data Replication: Optimized for transactional replication throughput (handles Telco volumes, Big Data)
  PowerCenter + PowerExchange CDC: Medium (high capture rates, transform and apply through PC)
Capability: End-End Volume/Throughput (apply to Appliances natively)
  Informatica Data Replication: High (Microbatch Merge Apply)
  PowerCenter + PowerExchange CDC: Medium (high capture rates, transform and apply through PC)
Capability: Sources
  Informatica Data Replication: Relational: Oracle, log based SQL Server, DB2 LUW, Netezza. Will add more over time.
  PowerCenter + PowerExchange CDC: Mainframe, iSeries and Relational CDC
Informatica: Internal and Channel Use Only
Editor's Notes
#3 Hello everyone! I will be walking you through this introduction to Informatica’s Data Replication Solution. In this brief presentation we will provide a high-level overview of the solution and discuss how Data Replication enables companies and organizations to meet the challenges of running a 24x7x365 business.
#4 Now before we start it is important to have a common understanding of what Data Replication is and how it is defined in the industry.
#5 Sound Decisions need Accurate Data – The ability to react quickly to changes is increasingly important in the ‘always online’ environment in which we operate. Having immediate access to accurate, up to date information is a key factor in making the correct decisions. How many times does the business make a decision based on untimely information? And what are the ramifications in cost?
-----------------
IT Costs must be controlled – Not all IT platforms are equal, and re-hosting applications, or simply offloading non-critical query access for a more focused audience to a cheaper platform, can help effect real savings.
-----------------
Downtime costs money – Although hardware duplication and high availability solutions can help to eliminate unplanned downtime caused by failures, there are still occasions when ‘planned’ downtime is needed to implement hardware or software upgrades – and/or to absorb new applications as a result of a merger or acquisition. For every unproductive hour of downtime how much does that cost a company in terms of both monetary cost and reputation?
#6 This is a proof point for slide 4 that externally validates the pain and the cost of the pain.
Don’t just take our word for it: here is what our customers are telling us, and here is what the key analysts that buyers routinely talk to are saying as well.
#7 I will briefly introduce our product offerings in this architecture diagram and then in a moment we will see how these offerings integrate with a variety of business use cases, some of which you may be experiencing now.
First, Informatica Data Replication allows users to share information across different systems in a heterogeneous environment, and to manage and audit database transactional data.
Informatica Data Replication enables high-performance data replication between different hardware platforms and data sources without losing the transactional integrity of the data, replicating from source to target in real time.
Informatica Data Replication provides immediate update of the targets shown with the most recent transactional changes extracted from Oracle, Netezza, SQL Server and DB2/UDB.
Informatica Fast Clone enables high performance copies directly from Oracle to a depicted target of choice and is used where low latency is not a big requirement. This is a batch move.
#8 Let’s now take a look at the Replication use cases supported by Informatica’s Data Replication solution.
For the Active Data Warehouse use case, business insight is improved through real-time BI Data Warehousing.
It makes a Good Warehouse even better by providing up to the minute information for reporting.
-----------------
Offloading Operational Reporting: This reduces barriers to sharing data because the data can be offloaded and still be accessed in near real-time for current reporting needs.
-----------------
Zero Downtime Migration: This enables continuous operations for mission critical applications - which eliminates unplanned outages and reduces the cost of planned outages.
Reduces risk by ensuring data integrity and reliability between source and target systems. We will see an example of this in our customer use case.
-----------------
Live Auditing makes it possible to Address Compliance Requirements by providing a complete audit trail of all changes made to critical production systems, if required.
#9 This is an example of the Zero Downtime Migration – in real time - use case.
Eliminating database downtime is a significant challenge for IT organizations that need to upgrade or migrate (both, here!) mission-critical systems. This is particularly true for applications that must provide continuous or near-real time operations to clients who increasingly expect uninterrupted availability to get to their data. Any application outage, whether planned or not, can have an impact on both the revenue and the reputation of the business. Leveraging our Data Replication solution, these challenges were overcome. During their target load period, the client was able to replicate and store changes from the source until the switchover occurred. At that point in time, all of the incremental changes that had accumulated from the source database since the initial target load start time, were loaded into the target database. When the switchover occurred, it was seamless.
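The sequence described in this note (load the target from a point-in-time copy of the source, keep capturing source changes while that load runs, then apply the accumulated changes before switchover) can be sketched roughly as follows. This is a minimal illustration using in-memory dictionaries, not the product's implementation; the names `Change`, `initial_load` and `apply_changes` are hypothetical and are not part of any Informatica API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Row = Dict[str, object]


@dataclass
class Change:
    """One captured change; a hypothetical structure for illustration only."""
    op: str                    # "insert", "update", or "delete"
    key: int                   # primary key of the affected row
    row: Optional[Row] = None  # new row image (None for deletes)


def initial_load(source_snapshot: Dict[int, Row]) -> Dict[int, Row]:
    """Bulk-copy a point-in-time snapshot of the source into a new target."""
    return {key: dict(row) for key, row in source_snapshot.items()}


def apply_changes(target: Dict[int, Row], changes: List[Change]) -> None:
    """Apply the captured changes to the target in the order they occurred."""
    for change in changes:
        if change.op == "delete":
            target.pop(change.key, None)
        else:
            target[change.key] = dict(change.row or {})


# Snapshot taken when the target load starts.
snapshot = {1: {"name": "Ann"}, 2: {"name": "Bob"}}

# Changes captured from the source while the load is still running.
backlog = [
    Change("update", 1, {"name": "Anna"}),
    Change("insert", 3, {"name": "Cy"}),
    Change("delete", 2),
]

target = initial_load(snapshot)
apply_changes(target, backlog)  # catch up before the switchover
print(target)
```

In this toy model the target matches the current state of the source once the backlog has been applied, which is the condition that makes the switchover safe.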
#10 So Why Informatica for Data Replication?
Now let’s look at some of the assets which we believe make Informatica’s Data Replication Solution Unique
Rapid Implementation – by using the Informatica Data Replication Graphical Design Tool, the process of defining which tables are to be replicated, to which targets is accomplished quickly and efficiently, and the same configuration is used for both Initial Synchronization, and for continuing replication once targets have been populated.
Designed to Exploit Warehouse Appliances – because Informatica Data Replication was designed with Active Data Warehousing in mind, particular efforts have been made to efficiently exploit the specific interfaces provided by the Warehouse Appliances which we support.
Any to Any Replication – Informatica develops Data Integration Solutions – not Databases. So it is important to us that Informatica Data Replication can handle any source and any target to give our customers complete freedom of choice of Operational and Data Warehouse and/or BI Reporting platforms. And if you do review past decisions and change your mind about your database, you won’t have to change your Data Integration Solution.
#13 The overlap between our Data Replication solution and the PowerExchange CDC solution is minimal when looking at the definition of Replication and the use cases involved. In the past, PowerExchange CDC and PowerCenter have been leveraged in some mostly uni-directional use cases to try to fill the Replication gap, and now we have a solution that does fill the gap.
And keep in mind that for the Active Data Warehouse scenario, for instance, there may well be a need to replicate data quickly and continuously from an operational system into some type of operational staging area for subsequent use by a new or existing PowerCenter process. Remember that Data Replication covers pure data replication and high speed data transfer use cases with lighter transformation capability. Let’s now take a few minutes and get specific on when to look at which solution.
If there is a need for complex transformations and/or data quality integration, then PWC will provide your solution.
If however there is a need to build out and deploy a data movement scenario where there are little to no transformations, then Data Replication is your solution.
Throughput factor: Data Replication supports native apply mechanisms to these appliances: Teradata, Netezza, Greenplum, Vertica and provides focused optimized SQL to ensure high speed target loading.
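The comparison slide labels the appliance apply path "Microbatch Merge Apply" without describing how it works. One common reading of micro-batch merging is to collapse all changes in a batch down to the net effect per key and then apply that net result in a single pass. The sketch below illustrates that general idea only. It is an assumption about the technique, not a description of the product's internals, and the function names `net_changes` and `merge_apply` are hypothetical. It also assumes every change carries a full row image.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]
# (operation, primary key, new row image or None for deletes)
Change = Tuple[str, int, Optional[Row]]


def net_changes(batch: List[Change]) -> Dict[int, Optional[Row]]:
    """Collapse a micro-batch to the last state per key (None means deleted)."""
    net: Dict[int, Optional[Row]] = {}
    for op, key, row in batch:
        net[key] = None if op == "delete" else dict(row or {})
    return net


def merge_apply(target: Dict[int, Row], net: Dict[int, Optional[Row]]) -> None:
    """Apply the collapsed batch in one pass: upsert survivors, drop deletes."""
    for key, row in net.items():
        if row is None:
            target.pop(key, None)
        else:
            target[key] = row


batch: List[Change] = [
    ("insert", 1, {"qty": 1}),
    ("update", 1, {"qty": 5}),
    ("insert", 2, {"qty": 9}),
    ("delete", 2, None),
]

target: Dict[int, Row] = {}
net = net_changes(batch)
merge_apply(target, net)
print(len(batch), "changes collapsed to", len(net), "keys:", target)
```

The point of the sketch is that the target sees one operation per key per batch rather than one per source change, which suits targets that favor set-based loading over row-by-row updates.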
Another compelling throughput factor in considering the Data Replication solution: only the changed data is propagated from source to target, not all columns, which is how PWX CDC works today except for Mainframe DB2. That can offer compelling performance results.
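To make the difference between sending every column and sending only the changed columns concrete, here is a small sketch that compares the before and after images of one row and keeps only the columns whose values differ. The function name `changed_columns` and the sample row are hypothetical, and this illustrates the idea rather than how either product computes its change records.

```python
from typing import Dict

Row = Dict[str, object]


def changed_columns(before: Row, after: Row) -> Row:
    """Return only the columns that are new or whose values differ."""
    return {
        col: val
        for col, val in after.items()
        if col not in before or before[col] != val
    }


before = {"id": 7, "status": "open", "amount": 100, "note": "n/a"}
after = {"id": 7, "status": "closed", "amount": 100, "note": "n/a"}

delta = changed_columns(before, after)
print(delta)
```

Here one of the four columns changed, so the delta carries a single column instead of the full row. In practice the key would be sent alongside the delta so the target row can be located.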
It is also very important to look at which sources and targets are needed for any particular Data Replication use case. As of right now, if Mainframe (zSeries) or AS/400 (iSeries) is involved, Data Replication does not offer a solution. However, this is on the Roadmap for 2012; see your friendly Replication Product Manager for more details.
Here is a point specifically around our existing PWX Oracle CDC solution. If you are wondering whether to use PWX Oracle CDC (which includes both the LogMiner and Express options) or our Data Replication solution – the short answer is: if there is little to no (i.e. outside of data type conversion) transformation – regardless of the number of tables involved – take advantage of our Data Replication solution. Also… if you need to process Oracle logs off the Oracle Server or you need to process CLOB, LOB type columns, then Data Replication is also your solution available right now.
Note that PWX Ora Express for 9.x will be out end of August 2011 and will add support for RAC/ASM (Oracle) and support for 9.x of PowerCenter.
So – overall, looking at the two solutions at a very high level:
Data Replication is about More Replication, Little Transformation, the ability to perform massive set at a time processing.
PowerCenter + PowerExchange CDC is about Complex Transformations, Environments, Little Replication, Table at a time processing.
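As a rough aid to memory, the guidance in these notes can be written down as a small decision helper. This is a deliberately simplified sketch: the function name `suggest_solution` and its three yes/no inputs are hypothetical, it covers only the criteria mentioned in the notes (source platform, transformation complexity, data quality integration), and it is no substitute for a technical review of a specific client's requirements.

```python
def suggest_solution(
    complex_transformations: bool,
    data_quality_integration: bool,
    mainframe_or_iseries_source: bool,
) -> str:
    """Return the solution the notes point to for a given scenario."""
    # Per the notes, Mainframe and iSeries sources are covered only by
    # PowerCenter + PowerExchange CDC at the time of this presentation.
    if mainframe_or_iseries_source:
        return "PowerCenter + PowerExchange CDC"
    # Complex transformations or data quality integration call for PowerCenter.
    if complex_transformations or data_quality_integration:
        return "PowerCenter + PowerExchange CDC"
    # Little to no transformation, regardless of the number of tables.
    return "Data Replication"


print(suggest_solution(False, False, False))
print(suggest_solution(True, False, False))
print(suggest_solution(False, False, True))
```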
And, very important: if you are getting to this level of technical detail in any client discussion, this is where you will need to engage a technical resource, versed in our real-time data connectivity offerings, to support you in coming up with the best approach for that individual client’s data integration initiatives. This matrix is meant to be an overall guide to help you understand some of the distinctions/boundaries between the two solutions. So you want to work closely with your Informatica Account Team and your customer or prospect to come up with the best solution possible.