
Disaster Recovery for the Real-Time Data Warehouses



More and more, front-line business operations depend on data warehouses and real-time analysis. Decisions are driven by data that’s captured from all over the enterprise, helping companies like yours compete more fiercely in crowded marketplaces.
But are your disaster recovery policies keeping up with the changing role of your real-time data warehouse? The sheer volume of data and the rate at which it changes make traditional backup and restore practices unworkable. So which techniques do work?
In these slides, you will learn how to construct disaster recovery procedures that fit your 24/7, always-on data warehouse.
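
To make the scale problem concrete, here is a back-of-the-envelope sketch of how long a full restore might take and how much downtime a five-nines SLA leaves room for. The 100 TB warehouse size is a hypothetical figure chosen purely for illustration. The 250 MB/s rate is the per-box throughput quoted in the slides, and 99.999% is the SLA named on slide 7.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_YEAR = 365 * 24 * 60


def restore_days(warehouse_bytes: float, bytes_per_second: float) -> float:
    """Days needed to copy the whole warehouse back at a fixed rate."""
    return warehouse_bytes / bytes_per_second / SECONDS_PER_DAY


def allowed_downtime_minutes(availability: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


WAREHOUSE_BYTES = 100e12   # assumption: a 100 TB warehouse (not from the slides)
RESTORE_RATE = 250e6       # 250 MB/s, the per-box rate quoted in the slides

days = restore_days(WAREHOUSE_BYTES, RESTORE_RATE)
budget = allowed_downtime_minutes(0.99999)

print(f"Full restore: {days:.1f} days")
print(f"Five-nines downtime budget: {budget:.2f} minutes per year")
```

Under these assumptions a single-stream restore takes days, while the SLA allows only minutes per year, which is why the slides turn to replication rather than restore-from-backup.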


Disaster Recovery for the Real-Time Data Warehouses

  1. Disaster Recovery for the Real-Time Data Warehouse: Replicating and Parallelizing Big Data
  2. What you will learn: 4 strategies
       1. Separate operational warehouses from reporting systems
       2. Use changed data capture and Big Data replication
       3. Implement parallel, active-active data warehouses
       4. Maintain a “golden event” warehouse in Hadoop
     Confidential & Proprietary
  3. Analytics Have a Measurable Effect
       • For the median Fortune 1000 company, a 10% increase in data usability corresponds to $2.01B in annual revenue gains (Big Data, Big Opportunity, University of Texas at Austin, Sept 2011)
       • A “real-time infrastructure” ranks #3 on the CIO’s list of strategies (Gartner)
       • Organizations adept at analytics see 1.6x the revenue growth, 2.0x the profit growth, and 2.5x the stock price appreciation of their peers (“Outperforming in a Data-Rich and Hyper-Connected World,” IBM Center for Applied Insights and Economic Intelligence)
  4. Data Warehousing: Now Part of Operations
       • Real-time pricing
       • Real-time marketing
       • Fraud detection
       • Inventory management
       • Customer service
  5. Analytics in Business Operations: Constant, Up-to-Minute Access to Big Data
       • Advertising: click-stream, mobile ads
       • Capital markets: market data, securities trading
       • Utilities: energy usage, power production
       • Transportation: traffic & logistics, fleet deployment
       • Information technology: network activity, IT root-cause
       • Telecommunications: call activity, capacity allocation
  6. Expectations have changed
  7. What we need… vs. what we have
       • Up-time. Need: SLAs of 99.999%. Have: backup and recovery can take days in the event of an outage or system failure.
       • Real-time. Need: access to information as it happens. Have: ETL processes can take hours before information is available.
       • Distribution. Need: add new applications as the business demands. Have: access to the warehouse is tightly controlled; performance bottlenecks of a single database can impact mission-critical systems.
  8. 4 disaster recovery strategies for big data
       1. Separate operational warehouses from reporting systems
       2. Use changed data capture and Big Data replication
       3. Implement parallel, active-active data warehousing
       4. Maintain a “golden event” warehouse in Hadoop
  9. Strategy 1: Separate operations from reporting
       • Run day-to-day applications in one place. Ad-hoc reporting happens in a separate warehouse.
       • Benefit: better control over performance
       • Challenge: keeping changes in sync
       • Diagram: primary application, operations warehouse (DB2), WAN, secondary reporting warehouse
  10. Strategy 2: Changed data capture
       • Determine what has changed, then replicate it to achieve parity between environments.
       • Benefit: quickly propagate changes to remote sites
       • Challenge: identifying changes is difficult. The volume of data represents a stop-gap as it continues to grow.
       • Data Fabric: 1 GB/s, 250 MB/s per box, load-balanced, linearly scalable, built-in persistence
       • Diagram: primary application, primary cluster, Data Fabric, WAN, reporting cluster
  11. Strategy 3: Parallel, active-active data warehousing
       • Capture application data streams and load to parallel data warehouses over the WAN.
       • Benefit: multiple warehouses are kept up to date
       • Challenge: synchronization of many data streams
       • Data Fabric: 1 GB/s, 250 MB/s per box, load-balanced, linearly scalable, built-in persistence
       • Diagram: primary application, primary cluster, Data Fabric, WAN, reporting cluster
  12. Strategy 4: “Golden Event” store
       • Capture raw data and store it in Hadoop.
       • Benefit: new analytics are always possible
       • Challenge: best practices are only just being developed
       • Data Fabric: 250 MB/s per box, load-balanced, linearly scalable, built-in persistence
       • Diagram: primary application, Data Fabric, primary data warehouse, reporting data warehouse (optional), new apps & analytics, golden event store
  13. About Tervela Turbo
       • New release!
       • Capture, share, and distribute data
       • Accelerate any of the use cases we discussed today
  14. Big Data Requires Big Data Movement
       • “As companies implement more big data solutions, the need to use high-performance message delivery with those systems will grow.” (Gartner: Hype Cycle for Big Data, 2012)
  15. Key Features and Benefits of Tervela Turbo
       • Data Capture: adapters for top data stores; flexible multi-language API; real-time acquisition. Key benefit: Real-Time, regardless of data volume or number of sources.
       • Data Availability: parallel loading; large-volume buffering; automatic retry; data replay. Key benefit: Reliable, for mission-critical operations that can’t go down.
       • Data Distribution: continuous loading; no disruption with bad consumers; warehouses, DBs, Hadoop, etc.; web, mobile, custom apps. Key benefit: Multi-Platform, feeds an explosion of analytic apps on any platform without disrupting other consumers.
  16. Learn More About Big Data Movement
       • Capture, Share, and Distribute Big Data for Mission-Critical Analytics
       • Access videos, how-to guides, and other educational materials at: @tervela
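
The changed data capture and parallel-warehouse strategies (slides 10 and 11) can be sketched in a few lines. This is a minimal illustration, not the product's method: it finds changes by diffing two in-memory snapshots, whereas production change-data-capture tools commonly read the database's transaction log instead. The function names, the table layout, and the two replica names are all hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]
Table = Dict[int, Row]           # primary key -> row
Change = Tuple[str, int, Row]    # (operation, primary key, row)


def capture_changes(before: Table, after: Table) -> List[Change]:
    """Compare two snapshots and list the inserts, updates, and deletes."""
    changes: List[Change] = []
    for key, row in after.items():
        if key not in before:
            changes.append(("insert", key, row))
        elif before[key] != row:
            changes.append(("update", key, row))
    for key, row in before.items():
        if key not in after:
            changes.append(("delete", key, row))
    return changes


def apply_changes(target: Table, changes: List[Change]) -> None:
    """Replay captured changes against a replica warehouse."""
    for operation, key, row in changes:
        if operation == "delete":
            target.pop(key, None)
        else:
            target[key] = dict(row)


# Hypothetical primary-warehouse snapshots taken at two points in time.
before = {1: {"sku": "A", "qty": 10}, 2: {"sku": "B", "qty": 5}, 3: {"sku": "C", "qty": 7}}
after = {1: {"sku": "A", "qty": 8}, 3: {"sku": "C", "qty": 7}, 4: {"sku": "D", "qty": 2}}

changes = capture_changes(before, after)

# Two parallel replicas (hypothetical names) start from the old snapshot
# and receive the same change stream, as in the active-active layout.
reporting_east = {key: dict(row) for key, row in before.items()}
reporting_west = {key: dict(row) for key, row in before.items()}
for replica in (reporting_east, reporting_west):
    apply_changes(replica, changes)

print([operation for operation, _, _ in changes])
print(reporting_east == after and reporting_west == after)
```

Only three changes cross the WAN instead of the full table, which is the point of the technique. The sketch sidesteps the challenges the slides name: detecting changes without a full comparison, and keeping many streams in order.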
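
The “golden event” idea from slide 12 can be illustrated the same way: keep every raw event in an append-only store, then rebuild a warehouse table or derive a new analytic by replaying it. In this sketch a local JSON-lines file stands in for Hadoop, and the event fields (`region`, `amount`) and function names are hypothetical.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterator


def append_event(store: Path, event: dict) -> None:
    """Append one raw event to the store; existing lines are never rewritten."""
    with store.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(event) + "\n")


def replay(store: Path) -> Iterator[dict]:
    """Yield every stored event in the order it was captured."""
    with store.open(encoding="utf-8") as handle:
        for line in handle:
            yield json.loads(line)


def rebuild_sales_by_region(store: Path) -> Dict[str, float]:
    """Recreate a warehouse aggregate from raw events, e.g. after an outage."""
    totals: Dict[str, float] = defaultdict(float)
    for event in replay(store):
        totals[event["region"]] += event["amount"]
    return dict(totals)


def largest_order_by_region(store: Path) -> Dict[str, float]:
    """A new analytic added later, computed from the same raw events."""
    largest: Dict[str, float] = {}
    for event in replay(store):
        region, amount = event["region"], event["amount"]
        largest[region] = max(largest.get(region, amount), amount)
    return largest


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "golden_events.jsonl"   # stand-in for a Hadoop path
    for event in (
        {"region": "east", "amount": 10.0},
        {"region": "west", "amount": 5.5},
        {"region": "east", "amount": 20.0},
    ):
        append_event(store, event)

    rebuilt = rebuild_sales_by_region(store)
    largest = largest_order_by_region(store)

print(rebuilt)
print(largest)
```

Because the raw events are retained, the second function can be written long after the data was captured, which is the benefit the slide describes as new analytics always being possible.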