EMC World 2014 Breakout: Move to the Business Data Lake – Not as Hard as It Sounds
1. © Copyright 2014 EMC Corporation. All rights reserved.
Breakout: Move to the Business Data Lake –
Not as Hard as it Sounds
Michael Wood & Steve Jones
2.
Agenda
• Introductions
• What is the Data Lake? (…and better yet, Why?)
• Business Demands on Data
• Dealing with People and Technology Realistically
• No Rip and Replace/Evolve Towards Business Value
• Call to Action
3.
What Do We Need to Change?
• Data Volume Exploding
• Importance of Analytics Accelerating
• Demand for Different Kinds of Data
Enterprise Data Systems
Limited by Schema!
Limited by Cost!
Data that Doesn’t Fit is Discarded!
4.
What if We Can Break Out?
BATTLE-TESTED MPP DATABASE
MPP QUERY ON HADOOP
IN-MEMORY DATA GRID
Store Everything!
Analyze Anything!
5.
Multiple Internal Views – Consistently Compromised
Corporate
Ad-hoc
LOB
Management
Operations
Market
Operations
LOB mart, Spreadsheets
Line of business
Transactional systems
CRM, ERP, PLM
EDW, Corporate, ODS
Web
6.
Multiple Internal Views – Consistently Compromised
Corporate
Ad-hoc
LOB
Management
Operations
Market
Operations
LOB mart, Spreadsheets
Line of business
Transactional systems
CRM, ERP, PLM
EDW, Corporate, ODS
Web
Fit, Detail, Freshness, Fidelity
7.
Why the Single View Fails
Division 1: Sales, Finance, Supply chain, Marketing, R&D; Personal KPIs; Divisional KPIs
Division 2: Sales, Finance, Supply chain, Marketing, R&D; Personal KPIs; Divisional KPIs
Division 3: Sales, Finance, Supply chain, Marketing, R&D; Personal KPIs; Divisional KPIs
Division 4: Sales, Finance, Supply chain, Marketing, R&D; Personal KPIs; Divisional KPIs
Corporate: Corporate KPIs
Now agree on everything
8.
And That Was When We Just Worked Internally…
• The volumes of data are exploding
• The ability to control and dictate in an ‘outside-in’ world is minimal
• More and more business value is beyond the core transactions
• The old approach of ‘a single view’ is impossible in a world of federated internal and external data
Core transactions
9.
Remember…
Culture eats strategy for breakfast.
– Peter Drucker
10.
How Do Pivotal & Capgemini Deliver the Business Data Lake?
Govern where it matters
Capgemini’s Information governance approach
• MDM & RDM data integrated
• Information RADAR approach to identification
Encourage local requirements
• HAWQ – Traditional disk-based structured SQL
• Pivotal GemFire XD – Fast in-memory database
• Pivotal GemFire XD – Real-time analytics and integration
Distill on demand
• HAWQ
• Structured SQL on Pivotal HD
• Pivotal Data Dispatch
• Data movement and transformation
Store everything
• Pivotal HD
• Low cost
• Simplified deployment
Save 80% on Data Storage
Compress the time to value
Sell to the business and IT
Capgemini’s end to end value
11.
What Does This Mean?
HDFS
Load everything
Keep the history
Business driven
North America operations
Marketing campaign
EMEA data mart
Distill
HAWQ
Transactional systems
CRM, PLM, ERP, Sensor Network, Web, Social Media, Market, Supplier
12.
What Does This Mean?
Business driven
Customers, Orders, Inventory
Customers, Campaign, Contract
Customers, Orders, Invoices
HDFS
Load everything
Keep the history
Distill
HAWQ
Transactional systems
CRM, PLM, ERP, Sensor Network, Web, Social Media, Market, Supplier
13.
What Does This Mean?
Business driven
Customers, Orders, Inventory
Customers, Campaign, Contract
Customers, Orders, Invoices
Distill
HAWQ
HDFS
Load everything
Keep the history
Transactional systems
CRM, PLM, ERP, Sensor Network, Web, Social Media, Market, Supplier
Information governance: MDM and RDM
The need to share
We need a global view on customers
Customers, Customers, Customers
The global view: Customer, Revenue
14.
Why the Business Data Lake Succeeds
Division 1: Sales, Finance, Supply chain, Marketing, R&D; Personal KPIs; Divisional KPIs
Division 2: Sales, Finance, Supply chain, Marketing, R&D; Personal KPIs; Divisional KPIs
Division 3: Sales, Finance, Supply chain, Marketing, R&D; Personal KPIs; Divisional KPIs
Division 4: Sales, Finance, Supply chain, Marketing, R&D; Personal KPIs; Divisional KPIs
Corporate: Corporate KPIs
Now agree where it counts
15.
Business Data Lake Architecture
Sources
Ingestion Tier: Real-time ingestion, Micro batch ingestion, Batch ingestion
Unified Operations Tier: System monitoring, System management
Unified Data Management Tier: Data mgmt. services, MDM, RDM, Audit and policy mgmt.
Processing Tier: Workflow management
Distillation Tier
HDFS storage: Unstructured and structured data
In-memory
MPP database
Real-time, Micro batch, Mega batch
SQL, NoSQL, SQL, MapReduce
Query interfaces
SQL
Insights Tier: Real-time insights, Interactive insights, Batch insights
Action Tier
16.
How the Business Data Lake Works
Structured data tier: Business mart, LOB Ad-hoc analytics, LOB analytics hub
Models: Business mart model, LOB Ad-hoc analytics model, LOB analytics model
LOB creates their model
Maps their model to the sources
Distillation tier: Map (8 shown)
Data storage: Source (8 shown), each with its SDH
All data loaded ‘as is’ from sources with history automatically added
* SDH = Source Data History
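The flow on this slide (load everything 'as is', add history automatically, then let each line of business map its own model onto the sources) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and chosen for illustration: `SourceDataHistory`, `distill`, and the field names are not Pivotal HD or HAWQ APIs, and an in-memory list stands in for HDFS storage.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


@dataclass
class HistoryRecord:
    """One raw load from a source, kept unchanged with a load timestamp."""
    source: str
    loaded_at: datetime
    payload: Dict[str, Any]


class SourceDataHistory:
    """Hypothetical append-only store: nothing is overwritten or discarded."""

    def __init__(self) -> None:
        self._records: List[HistoryRecord] = []

    def load(self, source: str, payload: Dict[str, Any],
             loaded_at: Optional[datetime] = None) -> HistoryRecord:
        # The history (timestamp) is added automatically on every load.
        record = HistoryRecord(
            source=source,
            loaded_at=loaded_at or datetime.now(timezone.utc),
            payload=dict(payload),  # copy so later caller changes do not leak in
        )
        self._records.append(record)
        return record

    def records(self, source: str) -> List[HistoryRecord]:
        return [r for r in self._records if r.source == source]


def distill(sdh: SourceDataHistory, source: str,
            mapping: Dict[str, str]) -> List[Dict[str, Any]]:
    """Apply a LOB-owned mapping {model_field: source_field} to the raw history."""
    return [
        {model_field: r.payload.get(source_field)
         for model_field, source_field in mapping.items()}
        for r in sdh.records(source)
    ]


# Usage: two loads of the same CRM customer are both kept; the LOB model
# picks and renames only the fields it cares about.
sdh = SourceDataHistory()
sdh.load("crm", {"cust_id": 7, "cust_nm": "Acme", "region": "EMEA"})
sdh.load("crm", {"cust_id": 7, "cust_nm": "Acme Ltd", "region": "EMEA"})
lob_view = distill(sdh, "crm", {"customer_id": "cust_id", "name": "cust_nm"})
```

The point of the sketch is the separation of concerns: the store never imposes a schema at load time, and each mapping belongs to the line of business that defines it.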
17.
How the Corporate View Works
Local view
Corporate standards
Master data and reference data
Corporate view
Customer x-ref
Customer MDM
Invoices, Orders
Customer, Invoices, Orders
Information governance
BU1, BU2, BU3
Customer, Invoices, Orders
Customer, Invoices, Orders
Customer, Invoices, Orders
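One way to read this slide is that each business unit keeps its own customer identifiers, and a governed customer cross-reference (x-ref) is the only shared piece needed to build the corporate view. The Python sketch below illustrates that idea under stated assumptions: the x-ref table, the identifiers, and `corporate_revenue` are hypothetical and do not represent any Capgemini or Pivotal MDM interface.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical customer x-ref: (business unit, local customer id) -> master id.
CUSTOMER_XREF: Dict[Tuple[str, str], str] = {
    ("BU1", "C-001"): "M-1",
    ("BU2", "9912"): "M-1",   # same real-world customer as BU1's C-001
    ("BU3", "ACME"): "M-2",
}

# Each invoice is (business unit, local customer id, amount).
Invoice = Tuple[str, str, float]


def corporate_revenue(
    invoices: List[Invoice],
    xref: Dict[Tuple[str, str], str],
) -> Tuple[Dict[str, float], List[Tuple[str, str]]]:
    """Roll local invoices up to master customers; report ids with no x-ref."""
    totals: Dict[str, float] = defaultdict(float)
    unmatched: List[Tuple[str, str]] = []
    for business_unit, local_id, amount in invoices:
        master_id = xref.get((business_unit, local_id))
        if master_id is None:
            # Local data stays local; it is just not part of the global view yet.
            unmatched.append((business_unit, local_id))
        else:
            totals[master_id] += amount
    return dict(totals), unmatched


# Usage: BU1 and BU2 invoices roll up to the same master customer.
totals, unmatched = corporate_revenue(
    [("BU1", "C-001", 100.0), ("BU2", "9912", 250.0), ("BU3", "ACME", 40.0)],
    CUSTOMER_XREF,
)
```

Only the cross-reference is governed centrally here; the local invoice and order structures are left to each business unit, which matches the "govern only the common" idea in the deck.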
18.
The New Philosophy
Business Data Lake
Store everything
Encourage local
Govern only the common
Treat global as a local view
It’s all about insight at the point of action
19.
Call to Action
• Learn More about the Business Data Lake:
- http://www.gopivotal.com/big-data/businessdatalake
• Learn about Capgemini’s capabilities
- http://www.capgemini.com/big-data-analytics/business-data-lake
• Partners can get involved at http://www.gopivotal.com/partners
• Visit the EMC booth to discover how the EMC Federation of Companies helps drive the Data Lake
• Follow Us on Twitter!
- Michael - @aBitCloudy
- Steve - @mosesjones