1
Informatica
Data Replication
3
What is Data Replication?
A set of technologies that automates the cloning
of application data – thousands and thousands of
tables at once
It also manages the capture, routing and delivery
of transaction data in real-time
4
Business Drivers for Data Replication
The needs and pains
Continuous Availability
• No downtime during hardware or software upgrades
• No downtime during application maintenance
Current Information for decision making
• Fresh data always available for reporting, including transactional audit data
• Up-to-date information for critical business systems
Reduce and Control IT Spend
• Choose cost-effective systems for analysis
• Offload reporting to lower-cost systems
• Migrate applications from expensive systems
5
What the market is saying
Customers and Analysts agree…
“Data replication is a flexible technology that can be used for many purposes,
including data integration, data movement for data warehouses and ODS,
reporting server, migrations, and high-availability requirements. Transactional
log-based replication is one of the most common usage scenarios”
Forrester TechRadar™, “Q1 2010 Enterprise Data Integration” report
At the CAB (Customer Advisory Board) meetings, the #1 requested
enhancement to Informatica’s portfolio has been Data Replication
“Data Replication tools are the second most popular data
integration tools types after ETL tools”
Gartner’s 6/6/08 report “Survey on Data Integration
Practices Shows Move Toward Strategic Initiatives”
6
Informatica Data Replication Solution
Source Systems Target Systems
Oracle
Teradata
Netezza
Greenplum
MySQL
SQL Server
DB2/UDB
PostgreSQL
HP Vertica
Sybase ASE
Databases
Data Warehouse Appliances
High Speed Extraction
High Speed Apply
Data Replication: replicates transaction-oriented incremental changes from source to target
Fast Clone: copies data directly from Oracle to a supported target in a batch-mode process
7
Informatica’s Data Replication
The benefits of Data Replication in your environment
Supports Active Data Warehouses
• Ensures fresher, up-to-date information from business-critical online systems
• Leverages investment in a potential existing ‘ETL’ data warehouse approach
Enables real-time Reporting with Zero Impact to the Production Systems
• Avoid performance impact on your operational database
• Get real-time data capture
• Up-to-date information for critical business systems
Provides Uninterrupted Migration Path
• Enables movement from an older OS or database platform version to a new one with no interruption to operations
Live Auditing
• Reduce compliance risks with full audit trail
8
Informatica Success Story:
Private Commercial and Financial firm
KEY BUSINESS IMPERATIVE AND IT INITIATIVE
This private financial firm had to upgrade its Oracle systems and migrate to Linux to bring its versions up to date without causing any disruption to its customers.
THE CHALLENGE
• Minimum disruption to first line business applications
• Keep the operational systems up to date with the latest versions of database software
• Reduce cost by migrating to Linux
• Ensure there is no overhead on the source system
INFORMATICA ADVANTAGE
• Ensure no system overhead on the source system by performing log-based capture
• Effectively synchronize the source and target systems and continue to replicate the changes until the switchover
• Seamless transition from initial synchronization to real-time change data capture
RESULTS/BENEFITS
• Reduce downtime on any planned maintenance
• Lower costs
• Mitigate risk by moving off an unsupported database version
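The migration pattern in this success story (initial synchronization, continued replication of changes, then switchover) can be sketched in a few lines. This is only an illustration under stated assumptions: the `Change` record, the `migrate` function and the in-memory dictionaries are hypothetical and do not represent Informatica's implementation or any product API.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """One captured row change (hypothetical shape, not an Informatica format)."""
    seq: int       # position in the source transaction log
    key: str       # primary key of the affected row
    value: object  # new row value, or None for a delete


def migrate(snapshot, change_log, snapshot_seq):
    """Build the target from an initial load plus the changes captured since.

    snapshot     -- rows copied from the source as of log position snapshot_seq
    change_log   -- changes captured from the source log, ordered by seq
    snapshot_seq -- log position at which the initial load was taken
    """
    target = dict(snapshot)  # step 1: initial synchronization
    for change in change_log:  # step 2: replay changes accumulated during the load
        if change.seq <= snapshot_seq:
            continue  # already reflected in the snapshot
        if change.value is None:
            target.pop(change.key, None)  # delete
        else:
            target[change.key] = change.value  # insert or update
    return target  # step 3: target is current, switchover can proceed


if __name__ == "__main__":
    snapshot = {"acct-1": 100, "acct-2": 250}
    log = [
        Change(1, "acct-1", 100),   # before the snapshot, skipped
        Change(2, "acct-2", 300),   # update made during the load
        Change(3, "acct-3", 75),    # insert made during the load
        Change(4, "acct-1", None),  # delete made during the load
    ]
    print(migrate(snapshot, log, snapshot_seq=1))
```

The sketch keys everything on a log sequence number so that changes already contained in the initial load are not applied twice; a real log-based capture would track an equivalent position in the database's transaction log.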
9
Informatica’s Data Replication Solution
The Unique Strengths
Fast time to Value
• Rapid deployment and quick time to completion of projects
• Data Replication User Interface facilitates quick specification and deployment
Optimized for Operational Reporting & Active data warehousing appliance loading
• Data Replication uses innovative techniques to optimize transaction capture and appliance loading
Heterogeneous across a wide array of sources, targets and operating systems
• Data Replication is architected to support transactional replication across a breadth of sources and targets
10
1
Additional Learning
Information you might find interesting
12
Positioning Replication and PWC/PWX

Capability: Purpose-built U/I
  Informatica Data Replication: Easy to configure and deploy for replication scenarios
  PowerCenter + PowerExchange CDC: For development in complex projects requiring extensive transformations

Capability: Number of Tables
  Informatica Data Replication: Any number
  PowerCenter + PowerExchange CDC: Max 70 to 90 tables per session

Capability: Transformation
  Informatica Data Replication: Simple Transformations (Substring, Data Type…)
  PowerCenter + PowerExchange CDC: Advanced Transformation (DQ, Sorters, Aggregators...)

Capability: End-End Volume/Throughput (apply to RDBMS)
  Informatica Data Replication: Optimized for transactional replication throughput (handles Telco volumes, Big Data)
  PowerCenter + PowerExchange CDC: Medium (high capture rates, transform and apply through PC)

Capability: End-End Volume/Throughput (apply to Appliances natively)
  Informatica Data Replication: High (Microbatch Merge Apply)
  PowerCenter + PowerExchange CDC: Medium (high capture rates, transform and apply through PC)

Capability: Sources
  Informatica Data Replication: Relational: Oracle, log-based SQL Server, DB2 LUW, Netezza. Will add more over time.
  PowerCenter + PowerExchange CDC: Mainframe, iSeries and Relational CDC
Informatica: Internal and Channel Use Only
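
Read as a rough selection rule, the positioning matrix above might be encoded as follows. This is a hedged sketch for orientation only: the function name, the constants and the lower-case platform labels are hypothetical, and the rule covers just the transformation and source rows of the matrix, not the full judgment a technical resource would apply.

```python
IDR = "Informatica Data Replication"
PWC_PWX = "PowerCenter + PowerExchange CDC"
ESCALATE = "Engage a technical resource"

# Source platforms as listed in the matrix (labels are illustrative).
IDR_SOURCES = {"oracle", "sql server", "db2 luw", "netezza"}
PWX_ONLY_SOURCES = {"mainframe", "iseries"}


def suggest_solution(needs_complex_transformations, source_platform):
    """Suggest a starting point based on two rows of the positioning matrix."""
    platform = source_platform.strip().lower()
    if platform in PWX_ONLY_SOURCES:
        # The matrix lists Mainframe and iSeries only under PowerExchange CDC.
        return PWC_PWX
    if needs_complex_transformations:
        # Advanced transformations (DQ, sorters, aggregators) sit with PowerCenter.
        return PWC_PWX
    if platform in IDR_SOURCES:
        # Little or no transformation on a supported relational source.
        return IDR
    # Anything not covered by the matrix needs a closer look.
    return ESCALATE


if __name__ == "__main__":
    print(suggest_solution(False, "Oracle"))
    print(suggest_solution(True, "Oracle"))
    print(suggest_solution(False, "iSeries"))
```

The rule checks the source platform first because a source that only one solution supports settles the choice regardless of transformation needs.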

informatica data replication (IDR)


Editor's Notes

  • #3 Hello everyone! I will be walking you through this introduction to Informatica’s Data Replication Solution. In this brief presentation we are going to provide a high-level overview and discuss how Data Replication enables companies and organizations to meet the challenges posed by the modern-day requirements of running a 24x7x365 business.
  • #4 Now before we start it is important to have a common understanding of what Data Replication is and how it is defined in the industry.
  • #5 Sound decisions need accurate data: The ability to react quickly to changes is increasingly important in the ‘always online’ environment in which we operate. Having immediate access to accurate, up-to-date information is a key factor in making the correct decisions. How many times does the business make a decision based on untimely information? And what are the ramifications in cost?
    IT costs must be controlled: Not all IT platforms are equal. Re-hosting applications, or simply offloading non-critical query access for a more focused audience to a cheaper platform, can help effect real savings.
    Downtime costs money: Although hardware duplication and high availability solutions can help to eliminate unplanned downtime caused by failures, there are still occasions when ‘planned’ downtime is needed to implement hardware or software upgrades, or to absorb new applications as a result of a merger or acquisition. For every unproductive hour of downtime, how much does that cost a company in terms of both money and reputation?
  • #6 This is a proof point for slide 4 that externally validates the pain and the cost of the pain. Don’t just take our word for it: here is what our customers are telling us, and here is what the key analysts that buyers routinely talk to are saying as well.
  • #7 I will briefly introduce our product offerings in this architecture diagram, and then in a moment we will see how these offerings fit a variety of business use cases, some of which you may be experiencing now.
    First, Informatica Data Replication allows users to share information across different systems in a heterogeneous environment, and to manage and audit database transactional data.
    Informatica Data Replication enables high-performance data replication between different hardware platforms and data sources without losing the transactional integrity of the data, replicating from source to target in real time.
    Informatica Data Replication provides immediate update of the targets shown with the most recent transactional changes extracted from Oracle, Netezza, SQL Server and DB2/UDB.
    Informatica Fast Clone enables high-performance copies directly from Oracle to a depicted target of choice and is used where low latency is not a major requirement. This is a batch move.
  • #8 Let’s now take a look at the replication use cases supported by Informatica’s Data Replication solution.
    Active Data Warehouse: Business insight is improved through real-time BI data warehousing. It makes a good warehouse even better by providing up-to-the-minute information for reporting.
    Offloading Operational Reporting: This reduces barriers to sharing data because the data can be offloaded and still be accessed in near real time for current reporting needs.
    Zero Downtime Migration: This enables continuous operations for mission-critical applications, which eliminates unplanned outages and reduces the cost of planned outages. It reduces risk by ensuring data integrity and reliability between source and target systems. We will see an example of this in our customer use case.
    Live Auditing: This makes it possible to address compliance requirements by providing a complete audit trail of all changes made to critical production systems, if required.
  • #9 This is an example of the real-time Zero Downtime Migration use case. Eliminating database downtime is a significant challenge for IT organizations that need to upgrade or migrate (both, here!) mission-critical systems. This is particularly true for applications that must provide continuous or near-real-time operations to clients who increasingly expect uninterrupted access to their data. Any application outage, whether planned or not, can have an impact on both the revenue and the reputation of the business. Leveraging our Data Replication solution, the client overcame these challenges. During the target load period, the client was able to replicate and store changes from the source until the switchover occurred. At that point, all of the incremental changes that had accumulated from the source database since the start of the initial target load were loaded into the target database. When the switchover occurred, it was seamless.
  • #10 So why Informatica for Data Replication? Now let’s look at some of the assets which we believe make Informatica’s Data Replication Solution unique.
    Rapid Implementation: By using the Informatica Data Replication Graphical Design Tool, the process of defining which tables are to be replicated to which targets is accomplished quickly and efficiently, and the same configuration is used both for initial synchronization and for continuing replication once targets have been populated.
    Designed to Exploit Warehouse Appliances: Because Informatica Data Replication was designed with Active Data Warehousing in mind, particular efforts have been made to efficiently exploit the specific interfaces provided by the warehouse appliances which we support.
    Any to Any Replication: Informatica develops data integration solutions, not databases. So it is important to us that Informatica Data Replication can handle any source and any target to give our customers complete freedom of choice of operational and data warehouse and/or BI reporting platforms. And if you do review past decisions and change your mind about your database, you won’t have to change your data integration solution.
  • #13 The overlap of our Data Replication solution and the PowerExchange CDC solution is minimal when looking at the definition of replication and the use cases involved. In the past, PowerExchange CDC and PowerCenter have been leveraged in some mostly uni-directional use cases to try to fill the replication gap, and now we have a solution that does fill the gap. Keep in mind that for the Active Data Warehouse scenario, for instance, there may well be a need to replicate data quickly and continuously from an operational system into some type of operational staging area for subsequent use by a new or existing PowerCenter process. Remember that Data Replication covers pure data replication and high-speed data transfer use cases with lighter transformation capability.
    Let’s now take a few minutes and get specific on when to look at which solution. If there is a need for complex transformations and/or data quality integration, then PWC will provide your solution. If, however, there is a need to build out and deploy a data movement scenario where there are little to no transformations, then Data Replication is your solution.
    Throughput factor: Data Replication supports native apply mechanisms to these appliances: Teradata, Netezza, Greenplum and Vertica, and provides focused, optimized SQL to ensure high-speed target loading. Another compelling throughput factor in considering the Data Replication solution: only changed data is propagated from source to target, not all columns, which is how PWX CDC works today except for Mainframe DB2. That can offer compelling performance results.
    It is also very important to look at what sources and targets are needed for any particular Data Replication use case. As of right now, if Mainframe (zSeries) or AS/400 (iSeries) is involved, Data Replication does not currently offer a solution. However, this is on the roadmap for 2012; see your friendly Replication Product Manager for more details.
    Here is a point specifically around our existing PWX Oracle CDC solution. If you are wondering whether to use PWX Oracle CDC (which includes both the LogMiner and Express options) or our Data Replication solution, the short answer is: if there is little to no transformation (i.e. nothing beyond data type conversion), regardless of the number of tables involved, take advantage of our Data Replication solution. Also, if you need to process Oracle logs off the Oracle server or you need to process CLOB or LOB type columns, then Data Replication is also your solution, available right now. Note that PWX Ora Express for 9.x will be out at the end of August 2011 and will add support for RAC/ASM (Oracle) and for 9.x of PowerCenter.
    So, overall, looking at the two solutions at a very high level: Data Replication is about more replication, little transformation, and the ability to perform massive set-at-a-time processing. PowerCenter + PowerExchange CDC is about complex transformations and environments, little replication, and table-at-a-time processing.
    And, very important: if you are getting to this level of technical detail in any client discussion, this is where you will need to engage a technical resource, versed in our real-time data connectivity offerings, to support you in coming up with the best approach for that individual client’s data integration initiatives. This matrix is meant to be an overall guide to help you understand some of the distinctions and boundaries between the two solutions. So you will want to work closely with your Informatica Account Team and your customer or prospect to come up with the best solution possible.
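
The throughput point in note #13, that only changed data rather than every column is propagated, can be illustrated with a small sketch. It assumes before and after images of a row are available as dictionaries; the `changed_columns` helper and the sample row are hypothetical and say nothing about how either product represents changes internally.

```python
def changed_columns(before, after, key_columns):
    """Return the key columns plus only those columns whose value changed.

    before, after -- dictionaries holding the row image before and after an update
    key_columns   -- columns needed to locate the row on the target
    """
    # Keys are always sent so the target row can be identified.
    delta = {col: after[col] for col in key_columns}
    for col, new_value in after.items():
        if col not in key_columns and before.get(col) != new_value:
            delta[col] = new_value
    return delta


if __name__ == "__main__":
    before = {"id": 7, "name": "Ann", "city": "Oslo", "balance": 10}
    after = {"id": 7, "name": "Ann", "city": "Bergen", "balance": 10}
    # Full-row propagation would send all four columns; the delta sends two.
    print(changed_columns(before, after, key_columns=["id"]))
```

On wide tables where updates touch a few columns, sending the delta instead of the full row reduces the volume moved per change, which is the performance effect the note describes.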