Hadoop, Big Data, and the Future of the Enterprise Data Warehouse



Under the umbrella of big data, the nature of data warehousing inside enterprises is undergoing a massive transformation. The data warehouse was originally designed as a clearinghouse for organizing data to discover and analyze historical trends, but business units are now putting extreme pressure on their data groups to enhance their services. Their goals: provide better customer service, real-time marketing, and more efficient business operations.

In this webcast, Big Data expert Barry Thompson will discuss how enterprise data warehouses are evolving to meet these challenges. Some of the topics we will cover include:

- How Hadoop and other big data technologies are coexisting with traditional data warehouses
- Dealing with multiple big data sources – and multiple versions of the truth
- Techniques like warehouse replication and parallel data loading that enable platforms with different levels of service for different types of applications

Published in: Technology, Business

Hadoop, Big Data, and the Future of the Enterprise Data Warehouse

  1. Hosted by Barry Thompson, Founder & CTO of Tervela
  2. What We’ll Discuss Today…
     • How is the role of the data warehouse changing in the face of big data?
     • How are Hadoop and other big data technologies coexisting with traditional data warehouses?
     • What happens when we have multiple big data sources (and multiple versions of the truth)?
     • How do I use replication, data loading, cloud integration, and other technologies during this transition period?
  3. About the presenter...
     • Barry Thompson
     • Founder and CTO of Tervela
     • Visionary with 20 years of experience
     • Background in transformative technologies (robotics, imaging & traditional enterprise)
     • Technology leadership for AIG, NatWest and UBS
     • X-Prize board of trustees
  4. Data Complexity Exploding
     • Data Explosion: 30 billion pieces of content shared on Facebook every month; 64 exabytes of data moved around the Internet per month by the end of the decade
     • End Point Proliferation: 30 million networked sensor nodes with 30% annual growth; 5 billion mobile phones in use in 2010; 40 billion devices connected to the Internet by the end of the decade
     • Global Distribution: 58% of Europe is on the Internet; 78% of North America is on the Internet; 23% of Asia is on the Internet
     • Regulatory Requirements: Dodd-Frank; HIPAA; Basel III; Consumer Protection
     • More Data, In More Places, By More People and Apps, Faster
  5. It Should Be Easy
     • Operational Data Stores: Transactional Data
     • Traditional Data Warehouses: Structured Analysis
     • Hadoop Map-Reduce: Unstructured Analysis
  6. But It’s Not
     • Platforms: Operational Data Stores; Traditional Data Warehouses; Hadoop Map-Reduce
     • Diagram labels: NoSQL; Transactional Data; Real-Time Operations; ETL Replacement; Structured Analysis; Real-Time Decision Support; Real-Time Analytics; Unstructured Analysis
  7. What’s Driving This Activity?
     • Explosion in Real-Time Analytics
     • Accessibility of Big Data Streams
     • Multi-Format, Multi-Type
     • Inconsistent Ingest Rates
     • Scaling Across Geographies
  8. A Question For You… What is the relationship at your company between Hadoop and your corporate data warehouse?
     1) We have an integrated Hadoop - Data Warehouse strategy
     2) We aren’t sure how Hadoop should fit with our warehouse
     3) There’s no interaction between Hadoop and our Data Warehouse
     4) We aren’t running Hadoop
     5) I don’t know
  9. Warehousing… The Old Way
     • Slows down data availability
     • Single location, single point of failure
     • Inflexible data formats
     • Diagram labels: Operational Data Store (Database); Data Feeds & Web Services; Flat Files; ETL; Data Warehouse; Data Mart; Business Report; Data Mart; Analytic App; “I don’t fit”
  10. The New Warehouse Paradigm
      • Real-time apps get immediate access to data
      • The right format, the right processing
      • DR & Backup for Big Data
      • Diagram labels: Real-Time Console; Operational Data Store (Database); Data Feeds & Web Services; Flat Files; Data Fabric; ETL; Data Warehouse; Data Mart; Business User; Data Mart; Analytic App; ETL; Backup Warehouse; Hadoop; Analytic App
  11. What is a Data Fabric?
      • Requirements: High performance; No loss; Centralized Management & Visibility; Ease of integration; 5 9’s of reliability
      • Data Sources: apps & SOA; file systems; DBs; ODS/clusters; clouds
      • Tervela Data Fabric: Software, Hardware Appliances or Cloud Services
      • Features: Data Capture; Data Movement; Data Availability; Data Protection; Data Management
      • Data Stores: clouds; warehouses; analytics
  12. High-Performance & Parallel Loading
      • Guaranteed delivery of data into multiple systems
      • Buffered and streamed to deal with slow consumers
      • Efficient multi-casting avoids excessive network traffic
  13. Real-Time Analytics
      • Streaming avoids bottlenecks in ETL or warehousing
      • Delivers the right format for your analytic system
      • Best way to handle the explosion of analytic apps
  14. Cloud Integration
      • Buffering simplifies big data transfer over slow WANs
      • Stream data between cloud apps without temp storage
      • Bridge your cloud apps with on-premise systems
  15. Global Data Synchronization
      • Backup heterogeneous Big Data over unreliable WANs
      • Create active-active configuration for DR & scale
      • Geographic distribution for better local performance
  16. Big Data Replication
      • 10-100x faster than existing / native replication over WAN
      • Multi-cast replication saves bandwidth
      • Local data improves performance
  17. For More Information
      • Request a trial: http://tervela.com/download
      • Read some case studies: http://tervela.com/customers
      • @tervela
      • barry@tervela.com
      • www.tervela.com
  18. Thank you!
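
The parallel loading idea on slide 12 (deliver each record into multiple systems, with buffering to absorb slow consumers) can be illustrated with a small sketch. This is not Tervela's implementation or API; it is a minimal in-process analogy using only the Python standard library, and every name in it (`FanOutLoader`, `publish`, the `warehouse` and `hadoop` sinks) is hypothetical. Each target system gets its own bounded buffer and its own worker thread, so a slow target drains at its own pace while a fast one is not held to the same speed.

```python
import queue
import threading
import time

_STOP = object()  # sentinel that tells a worker thread to exit


class FanOutLoader:
    """Copies every published record into one bounded buffer per target."""

    def __init__(self, sinks, buffer_size=1000):
        # sinks: mapping of name -> callable(record). These are hypothetical
        # stand-ins for real loaders (warehouse, Hadoop, and so on).
        self._sinks = sinks
        self._buffers = {
            name: queue.Queue(maxsize=buffer_size) for name in sinks
        }
        self._workers = [
            threading.Thread(target=self._drain, args=(name,), daemon=True)
            for name in sinks
        ]
        for worker in self._workers:
            worker.start()

    def publish(self, record):
        # put() blocks when a buffer is full, so the publisher is slowed
        # down (back-pressure) instead of records being dropped.
        for buf in self._buffers.values():
            buf.put(record)

    def _drain(self, name):
        buf, sink = self._buffers[name], self._sinks[name]
        while True:
            record = buf.get()
            if record is _STOP:
                return
            sink(record)

    def close(self):
        # Queue the sentinel behind all pending records, then wait until
        # every worker has delivered everything ahead of it.
        for buf in self._buffers.values():
            buf.put(_STOP)
        for worker in self._workers:
            worker.join()


# Usage: one fast target and one deliberately slow target.
warehouse, hadoop = [], []


def slow_hadoop(record):
    time.sleep(0.001)  # simulate a slow consumer
    hadoop.append(record)


loader = FanOutLoader(
    {"warehouse": warehouse.append, "hadoop": slow_hadoop}, buffer_size=8
)
for i in range(50):
    loader.publish({"id": i})
loader.close()
```

In this sketch the buffer only absorbs bursts: once the slow target's buffer is full, `publish` blocks, which preserves every record at the cost of slowing the producer. A real data fabric would also need persistence and network transport, which are out of scope here.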