Under the umbrella of big data, the nature of data warehousing inside enterprises is undergoing a massive transformation. The data warehouse was originally designed as a clearinghouse for organizing data to discover and analyze historical trends, but business units are now putting extreme pressure on their data groups to enhance their services. Their goals: provide better customer service, real-time marketing, and more efficient business operations.
In this webcast, Big Data expert Barry Thompson will discuss how enterprise data warehouses are evolving to meet these challenges. Some of the topics we will cover include:
- How Hadoop and other big data technologies are coexisting with traditional data warehouses
- Dealing with multiple big data sources – and multiple versions of the truth
- Techniques like warehouse replication and parallel data loading that enable platforms with different levels of service for different types of applications
2. What We’ll Discuss Today…
• How is the role of the data warehouse changing in the face of big data?
• How are Hadoop and other big data technologies coexisting with traditional data warehouses?
• What happens when we have multiple big data sources (and multiple versions of the truth)?
• How do I use replication, data loading, cloud integration, and other technologies during this transition period?
3. About the presenter...
• Barry Thompson
• Founder and CTO of Tervela
• Visionary with 20 years of experience
• Background in transformative technologies (robotics, imaging & traditional enterprise)
• Technology leadership for AIG, NatWest and UBS
• X-Prize board of trustees
4. Data Complexity Exploding
Data Explosion
• 30 billion pieces of content shared on Facebook every month
• 64 exabytes: amount of data moved around the Internet per month by the end of the decade
End Point Proliferation
• 30 million networked sensor nodes, with 30% annual growth
• 5 billion mobile phones in use in 2010
• 40 billion devices connected to the Internet by the end of the decade
Global Distribution
• 58% of Europe is on the Internet
• 78% of North America is on the Internet
• 23% of Asia is on the Internet
Regulatory Requirements
• Dodd-Frank
• HIPAA
• Basel III
• Consumer Protection
More Data In More Places By More People and Apps Faster
5. It Should Be Easy
• Operational Data Stores: Transactional Data
• Traditional Data Warehouses: Structured Analysis
• Hadoop Map-Reduce: Unstructured Analysis
6. But It’s Not
Platforms: Operational Data Stores, Traditional Data Warehouses, Hadoop Map-Reduce, NoSQL
Workloads:
• Transactional Data
• Real-Time Operations
• ETL Replacement
• Structured Analysis
• Real-Time Decision Support
• Real-Time Analytics
• Unstructured Analysis
7. What’s Driving This Activity?
Explosion in Real-Time Analytics
Accessibility of Big Data Streams
Multi-Format, Multi-Type
Inconsistent Ingest Rates
Scaling Across Geographies
8. A Question For You…
What is the relationship at your company between Hadoop and your corporate data warehouse?
1) We have an integrated Hadoop - Data Warehouse strategy
2) We aren't sure how Hadoop should fit with our warehouse
3) There's no interaction between Hadoop and our Data Warehouse
4) We aren't running Hadoop
5) I don’t know
9. Warehousing… The Old Way
Drawbacks:
• Slows down data availability
• Single location, single point of failure
• Inflexible data formats
Sources: Operational Data Store (Database), Data Feeds & Web Services, Flat Files
Pipeline: ETL, Data Warehouse, Data Marts
Consumers: Business Report, Analytic App
Callout: "I don’t fit"
10. The New Warehouse Paradigm
Benefits:
• Real-time apps get immediate access to data
• The right format, the right processing
• DR & Backup for Big Data
Sources: Operational Data Store (Database), Data Feeds & Web Services, Flat Files
Components: Data Fabric, ETL, Data Warehouse, Backup Warehouse, Hadoop, Data Marts
Consumers: Real-Time Console, Business User, Analytic Apps
11. What is a Data Fabric?
Data Sources: apps & SOA, file systems, DBs, ODS/clusters, clouds
Tervela Data Fabric: Software, Hardware Appliances or Cloud Services
Data Stores: clouds, warehouses, analytics
Requirements
• High performance
• No loss
• Centralized Management & Visibility
• Ease of integration
• 5 9’s of reliability
Features
• Data Capture
• Data Movement
• Data Availability
• Data Protection
• Data Management
12. High-Performance & Parallel Loading
• Guaranteed delivery of data into multiple systems
• Buffered and streamed to deal with slow consumers
• Efficient multi-casting avoids excessive network traffic
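The buffering idea in the bullets above can be sketched in a few lines of Python. This is only an illustration of fan-out with one bounded buffer per consumer, not Tervela's implementation: the function name `fan_out`, the `buffer_size` parameter, and the two example sinks are all hypothetical. Each sink gets its own queue, so a slow consumer fills only its own buffer. When that buffer is full, the producer blocks instead of dropping records, which keeps delivery lossless within a single process.

```python
import queue
import threading
import time

_DONE = object()  # unique sentinel marking the end of the stream


def fan_out(records, sinks, buffer_size=100):
    """Deliver every record to every sink, using one bounded buffer per sink."""
    buffers = [queue.Queue(maxsize=buffer_size) for _ in sinks]

    def drain(buf, sink):
        # Each sink consumes from its own buffer at its own pace.
        while True:
            item = buf.get()
            if item is _DONE:
                break
            sink(item)

    workers = [
        threading.Thread(target=drain, args=(buf, sink))
        for buf, sink in zip(buffers, sinks)
    ]
    for worker in workers:
        worker.start()

    for record in records:
        for buf in buffers:
            buf.put(record)  # blocks while this sink's buffer is full; nothing is dropped

    for buf in buffers:
        buf.put(_DONE)
    for worker in workers:
        worker.join()


# Hypothetical sinks: a fast "warehouse" loader and a slower "hadoop" loader.
warehouse, hadoop = [], []


def slow_hadoop_sink(record):
    time.sleep(0.001)  # simulate a slow consumer
    hadoop.append(record)


fan_out(range(50), [warehouse.append, slow_hadoop_sink], buffer_size=8)
```

With a small `buffer_size`, the producer eventually runs at the pace of the slowest sink. A larger buffer absorbs longer bursts at the cost of memory.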
13. Real-Time Analytics
• Streaming avoids bottlenecks in ETL or warehousing
• Delivers the right format for your analytic system
• Best way to handle the explosion of analytic apps
14. Cloud Integration
• Buffering simplifies big data transfer over slow WANs
• Stream data between cloud apps without temp storage
• Bridge your cloud apps with on-premise systems
15. Global Data Synchronization
• Back up heterogeneous Big Data over unreliable WANs
• Create an active-active configuration for DR & scale
• Geographic distribution for better local performance
16. Big Data Replication
• 10-100x faster than existing / native replication over WAN
• Multi-cast replication saves bandwidth
• Local data improves performance
17. For More Information
Request a trial: http://tervela.com/download
Read some case studies: http://tervela.com/customers
• @tervela
• barry@tervela.com
• www.tervela.com