Disaster Recovery for the Real-Time Data Warehouse

More and more, front-line business operations depend on data warehouses and real-time analysis. Decisions are driven by data that’s captured from all over the enterprise, helping companies like yours compete more fiercely in crowded marketplaces.
But are your disaster recovery policies keeping up with the changing role of your real-time data warehouse? The sheer volume of data and the rate at which it changes make traditional backup and restore practices unworkable – so, what techniques do work?
In these slides, you will learn how to construct disaster recovery procedures that fit your 24-7, up-all-the-time data warehouse.

  • Big Data, Big Opportunity – University of Texas at Austin, Sept 2011
    A “real-time infrastructure” – Gartner (ranks 3rd after “developing business solutions” and “reducing the cost of IT”)
    Organizations using analytics for competitive advantage – “Outperforming in a Data-Rich and Hyper-Connected World.” IBM Center for Applied Insights and Economic Intelligence
  • Use this, instead, for “new role of the data warehouse” slide??
  • Benefit: better manage performance. Challenge: keep reporting systems up to date with changes.
  • Benefit: get changes out to remote sites faster
  • Second “about Tervela Turbo” slide??

Transcript

  • 1. Disaster Recovery for the Real-Time Data Warehouse: Replicating and Parallelizing Big Data
  • 2. What you will learn: 4 strategies
    1. Separate operational warehouses from reporting systems
    2. Use changed data capture and Big Data replication
    3. Implement parallel, active-active data warehouses
    4. Maintain a “golden event” warehouse in Hadoop
    Confidential & Proprietary
  • 3. Analytics Have a Measurable Effect
    • For the median Fortune 1000 company, a 10% increase in data usability corresponds to $2.01B in annual revenue gains (Big Data, Big Opportunity – University of Texas at Austin, Sept 2011)
    • A “real-time infrastructure” ranks #3 on the CIO’s list of strategies (A “real-time infrastructure” – Gartner)
    • Organizations adept at analytics see 1.6x the revenue growth, 2.0x the profit growth, and 2.5x the stock price appreciation of their peers (“Outperforming in a Data-Rich and Hyper-Connected World.” IBM Center for Applied Insights and Economic Intelligence)
  • 4. Data Warehousing: Now Part of Operations
    • real-time pricing
    • real-time marketing
    • fraud detection
    • inventory management
    • customer service
  • 5. Analytics in Business Operations: Constant, Up-to-Minute Access to Big Data
    ADVERTISING: Click-stream, Mobile ads
    CAPITAL MARKETS: Market Data, Securities Trading
    UTILITIES: Energy usage, Power production
    TRANSPORTATION: Traffic & Logistics, Fleet Deployment
    INFORMATION TECHNOLOGY: Network Activity, IT Root-Cause
    TELECOMMUNICATIONS: Call Activity, Capacity Allocation
  • 6. Expectations have changed
  • 7. What we need…vs. what we have
    Up-Time
      Need: SLAs: 99.999%
      Have: Backup and recovery can take days in the event of an outage or system failure
    Real-time
      Need: Access to information as it happens
      Have: ETL processes can take hours before information is available
    Distribution
      Need: Add new applications as the business demands
      Have: Access to warehouse is tightly controlled; performance bottlenecks of a single database can impact mission-critical systems
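To make the gap on slide 7 concrete, the short sketch below converts an availability target into a yearly downtime budget. The function name and the 365-day year are illustrative assumptions, not figures from the slides.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the minutes of downtime per year that an availability target allows."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


budget = downtime_budget_minutes(99.999)
one_day_restore = 24 * 60  # a restore that takes a single day, in minutes

print(f"Five-nines budget: {budget:.2f} minutes per year")
print(f"A one-day restore uses about {one_day_restore / budget:.0f} years of that budget")
```

Under these assumptions the budget is a little over five minutes per year, which is why a restore measured in days cannot meet a 99.999% target.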
  • 8. 4 disaster recovery strategies for big data
    1. Separate operational warehouses from reporting systems
    2. Use changed data capture and Big Data replication
    3. Implement parallel, active-active data warehousing
    4. Maintain a “golden event” warehouse in Hadoop
  • 9. 1. Separate operations from reporting
    Run day-to-day applications in one place. Ad-hoc reporting happens in a separate warehouse.
    Diagram labels: Primary application; Operations Warehouse; DB2; WAN; Secondary Reporting Warehouse
    BENEFIT: Better control over performance
    CHALLENGE: Keeping changes in sync
  • 10. 2. Changed data capture
    Determine what has changed, then replicate it to achieve parity between environments
    Diagram labels: Primary Cluster; application; Data Fabric (1 GB/s; 250 MB/s per box; load-balanced; linearly scalable; built-in persistence); WAN; Reporting Cluster
    BENEFIT: Quickly propagate changes to remote sites
    CHALLENGE: Identifying changes is difficult. The volume of data represents a stop-gap as it continues to grow.
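As a rough illustration of slide 10's "determine what has changed, then replicate it" step, the sketch below compares two snapshots of a table keyed by primary key and applies the resulting changes to a replica. Snapshot comparison is only one way to capture changes and is an assumption here; the slides do not say which method is used, and all names are hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]
Table = Dict[int, Row]         # primary key -> row
Change = Tuple[str, int, Row]  # (operation, primary key, row)


def capture_changes(before: Table, after: Table) -> List[Change]:
    """Diff two snapshots and list the inserts, updates, and deletes between them."""
    changes: List[Change] = []
    for key, row in after.items():
        if key not in before:
            changes.append(("insert", key, row))
        elif before[key] != row:
            changes.append(("update", key, row))
    for key, row in before.items():
        if key not in after:
            changes.append(("delete", key, row))
    return changes


def apply_changes(replica: Table, changes: List[Change]) -> None:
    """Apply captured changes so the replica reaches parity with the primary."""
    for operation, key, row in changes:
        if operation == "delete":
            replica.pop(key, None)
        else:
            replica[key] = dict(row)  # copy so the replica owns its rows


primary_before: Table = {1: {"sku": "A", "qty": 5}, 2: {"sku": "B", "qty": 7}}
primary_after: Table = {1: {"sku": "A", "qty": 4}, 3: {"sku": "C", "qty": 9}}

replica: Table = {key: dict(row) for key, row in primary_before.items()}
changes = capture_changes(primary_before, primary_after)
apply_changes(replica, changes)
print(replica == primary_after)
```

Only the three changed rows cross the WAN in this toy case, rather than the whole table. Comparing full snapshots grows more expensive as the data grows, which is in line with the slide's caution about data volume.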
  • 11. 3. Parallel, active-active data warehousing
    Capture application data streams and load to parallel data warehouses over the WAN
    Diagram labels: Primary Cluster; Data Fabric (1 GB/s; 250 MB/s per box; load-balanced; linearly scalable; built-in persistence); WAN; Reporting Cluster
    BENEFIT: Multiple warehouses are kept up to date
    CHALLENGE: Synchronization of many data streams
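One possible shape for slide 11's parallel loading is sketched below: each event is offered to every warehouse, and each warehouse has its own buffer so that an unreachable site does not hold up the others. This is a minimal in-memory sketch under stated assumptions. The class, the function names, and the simulated WAN outage are hypothetical and do not represent any product's API.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Event = Dict[str, object]


class WarehouseFeed:
    """Buffers events for one warehouse and retries delivery in order."""

    def __init__(self, name: str, load: Callable[[Event], None]) -> None:
        self.name = name
        self.load = load
        self.pending: Deque[Event] = deque()

    def offer(self, event: Event) -> None:
        self.pending.append(event)

    def drain(self) -> int:
        """Load buffered events in order; stop at the first failure and keep the rest."""
        loaded = 0
        while self.pending:
            try:
                self.load(self.pending[0])
            except ConnectionError:
                break  # leave the event buffered for the next attempt
            self.pending.popleft()
            loaded += 1
        return loaded


def fan_out(event: Event, feeds: List[WarehouseFeed]) -> None:
    """Send one captured event to every warehouse feed."""
    for feed in feeds:
        feed.offer(event)
        feed.drain()


primary_rows: List[Event] = []
reporting_rows: List[Event] = []
wan = {"up": False}  # simulated WAN link state


def load_primary(event: Event) -> None:
    primary_rows.append(event)


def load_reporting(event: Event) -> None:
    if not wan["up"]:
        raise ConnectionError("WAN link down")
    reporting_rows.append(event)


feeds = [WarehouseFeed("primary", load_primary), WarehouseFeed("reporting", load_reporting)]
fan_out({"id": 1}, feeds)
fan_out({"id": 2}, feeds)
rows_during_outage = (len(primary_rows), len(reporting_rows))

wan["up"] = True
feeds[1].drain()  # the reporting warehouse catches up from its buffer
print(rows_during_outage, reporting_rows == primary_rows)
```

The per-warehouse queue keeps ordering within a single stream. Coordinating many streams, which the slide names as the challenge, is not addressed by this sketch.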
  • 12. 4. “Golden Event” store
    Capture raw data and store it in Hadoop
    Diagram labels: application; Data Fabric (250 MB/s per box; load-balanced; linearly scalable; built-in persistence); Primary Data Warehouse; Reporting Data Warehouse (Optional); Golden Event Store; New Apps & Analytics
    BENEFIT: New analytics are always possible
    CHALLENGE: Best practices are only just being developed
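The idea behind slide 12 can be sketched as an append-only log of raw events that is replayed whenever a new analysis is needed. In the sketch below an in-memory list of JSON lines stands in for files in Hadoop. That substitution, the class name, and the sample events are assumptions made for illustration only.

```python
import json
from typing import Dict, Iterable, Iterator, List

Event = Dict[str, object]


class GoldenEventStore:
    """Append-only store of raw events, kept as JSON lines."""

    def __init__(self) -> None:
        self._lines: List[str] = []

    def append(self, event: Event) -> None:
        self._lines.append(json.dumps(event, sort_keys=True))

    def replay(self) -> Iterator[Event]:
        """Yield every stored event in the order it was captured."""
        for line in self._lines:
            yield json.loads(line)


def net_units_by_sku(events: Iterable[Event]) -> Dict[str, int]:
    """An analysis written after capture: orders minus returns per SKU."""
    totals: Dict[str, int] = {}
    for event in events:
        sign = -1 if event["type"] == "return" else 1
        sku = str(event["sku"])
        totals[sku] = totals.get(sku, 0) + sign * int(event["qty"])
    return totals


store = GoldenEventStore()
for raw in [
    {"type": "order", "sku": "A", "qty": 2},
    {"type": "order", "sku": "B", "qty": 1},
    {"type": "return", "sku": "A", "qty": 1},
]:
    store.append(raw)

result = net_units_by_sku(store.replay())
print(result)
```

Because the raw events are retained unchanged, an analysis that did not exist at capture time can still be computed over the full history. The same replay could be used to reload a warehouse after a failure.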
  • 13. About Tervela Turbo
    • New release!
    • Capture, share, and distribute data
    • Accelerate any of the use cases we discussed today
  • 14. Big Data Requires Big Data Movement
    As companies implement more big data solutions, the need to use high-performance message delivery with those systems will grow.
    Gartner: Hype Cycle for Big Data, 2012
  • 15. Key Features and Benefits of Tervela Turbo
    Data Capture (Key Benefit: Real-Time, regardless of data volume or number of sources)
      • Adapters for top data stores
      • Flexible multi-language API
      • Real-time acquisition
    Data Availability (Key Benefit: Reliable, for mission-critical operations that can’t go down)
      • Parallel loading
      • Large-volume buffering
      • Automatic retry
      • Data replay
    Data Distribution (Key Benefit: Multi-Platform, feeds explosion of analytic apps on any platform without disrupting other consumers)
      • Continuous loading
      • No disruption with bad consumers
      • Warehouses, DBs, Hadoop, etc.
      • Web, mobile, custom apps
  • 16. Learn More About Big Data Movement
    Capture, Share, and Distribute Big Data For Mission-Critical Analytics
    Access videos, how-to guides, and other educational materials at: www.terverla.com tervela.com/datafabric @tervela info@tervela.com