1. Hype, Hopes, Hell & Hadoop
Big Data: Reality Check and Infrastructure
Implications of “The Enterprise of Everything”
Jean-Luc Chatelain, EVP & CTO
StampedeCon 2014
2. And now, a quick word from my sponsor
© 2014 DataDirect Networks, Inc. * Other names and brands may be claimed as the property of others.
Any statements or representations around future events are subject to change. ddn.com
3. DDN | Who We Are
• Main Office: Santa Clara, California, USA
• Employees: ~550 in 20 Countries
• Installed Base: End Customers in 50 Countries
• Go To Market: Partner & Reseller Assisted, Direct
• DDN: World’s Largest Private Storage Company
We Design, Deploy and Optimize Storage Systems that Solve
HPC, Big Data and Cloud Business Challenges at Scale
World-Renowned & Award-Winning
4. Big Data & Cloud Infrastructure
DDN’s Award-Winning Product Portfolio
Analytics Reference Architectures
• EXAScaler™: Petascale Lustre® Storage
– 10Ks of Clients
– 1TB/s+, HSM
– Linux HPC Clients
– NFS & CIFS [2014]
• GRIDScaler™: Enterprise Scale-Out File Storage
– ~10K Clients
– 1TB/s+, HSM
– Linux/Windows HPC Clients
– NFS & CIFS
Storage Fusion Architecture™ Core Storage Platforms
• SFA12KX™
– 48GB/s, 1.7M IOPS
– 1,680 Drives in 2 Racks
– Optional Embedded Computing
• SFA7700™
– 13GB/s; 600K IOPS
– 7700X
– 7700E
• Flexible Drive Configuration: SATA, SSD, SAS
• SFX™ Automated Flash Caching
– Adaptive Transparent Flash Cache
– SFX API Gives Users Control [pre-staging, alignment, bypass]
Cloud Foundation
• WOS® 3.0
– 32 Trillion Unique Objects
– Geo-Replicated Cloud Storage
– 256 Million Objects/Second
– Self-Healing Cloud
– Embedded metadata mgmt
• WOS7000
– 60 Drives in 4U
– Self-Contained Servers
Big Data Platform Management
• DirectMon®
• Infinite Memory Engine™: Distributed File System Buffer Cache
• Cloud Tiering
• S3/Swift
6. Hype
[Figure labels: 2011, 2014, Today]
#bigdata in the trough of disillusionment is great news for the enterprise!
7. Back To The Future?
The term “Big Data” was coined circa 1999 (1)
• Pervasive in some existing markets since the late 90’s
– HPC sensu latissimo
– Life Sciences
– Intelligence
– ASP (remember that word?)
Is there anything new here? Why the hype?
(1) “A Personal Perspective on the Origin(s) and Development of Big Data”, Diebold, 2012
8. Is There a #bigdata Definition?
For some yes; for others no – or maybe there are multiple definitions
• It is “a basket of technologies”
• It creates “a mindset change in decision making”
“Data sets that exceed the boundaries and sizes of current infrastructure
capabilities, forcing technologists to take a non-traditional approach”
[Figure: axes “File/Object Size, Content Volume” and “Activity: IOPS”; regions labeled “Normal Processing Capabilities”, “Lots of data”, “Large file sizes” and “Lots of transactions”]
9. #bigdata: 2 Dimensions of the 3 V’s
• Volume: Petabytes of Data, but also Trillions of Information Objects
• Velocity: GB/s to TB/s, but also Millions of Information Objects per second
• Variety: Structured & Unstructured, but also Streams & Batches workloads
The “trillions” & “millions” are the primary drivers of complexity and challenge “Time to Results”
Remember . . .
1 ms lost per operation on a billion-operation workload = more than 11.5 days lost!
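The arithmetic behind that figure can be checked with a minimal sketch. It assumes the operations run strictly one after another, so every delay adds up with no parallelism to hide it; the function name `time_lost_days` is illustrative and not from the slides.

```python
SECONDS_PER_DAY = 86_400


def time_lost_days(operations: int, delay_ms_per_op: float) -> float:
    """Total delay in days when every operation is slowed by delay_ms_per_op.

    Assumption: operations are serialized, so the per-operation delays sum.
    """
    total_seconds = operations * delay_ms_per_op / 1000.0
    return total_seconds / SECONDS_PER_DAY


# One billion operations, each 1 ms slower than it could be.
lost = time_lost_days(1_000_000_000, 1.0)
print(f"{lost:.2f} days")  # prints "11.57 days"
```

On a workload spread over many workers the wall-clock loss would shrink accordingly, but the total amount of wasted compute time stays the same.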
10. So, is #bigdata the new thing?
11. Quiz!
12. The Dawn of a Telemetry Revolution
• Internet of Things
• Social
• Sensors
Telemetry Revolution
The Birth of a Mindset Change in Business Decision Making
14. Governance, Regulation, Compliance
The Universe of Big Data is a massive black hole into which GRC has fallen
• Governance
• Regulation
• Compliance
• Security
• Privacy
Now, welcome to the era of shadow data and behold the plague of hyper-scalability
15. Tackling #bigdata Is Non-trivial
Value extraction (insights driving business results) is only done on 1% of total enterprise data
Time to value & time to result are business critical
– Inadequate infrastructure = failure & credibility loss
The cardinality dimensions of the 3V’s are the infrastructure killers
– Material: network, compute, storage
– Human: DBA, sysadmin & storage admin
Today a #bigdata project cannot live in IT or it will fail
Dare to be different
#bigdata nullifies the feature race and favors the benefit race
16. Let’s Talk Real #bignumbers
HPC is a forward-looking time machine that eats #bigdata for lunch
• The enterprise’s #bigdata problems of today were HPC problems 3 to 5 years ago
• HPC & Web architectures are converging
17. The #bigdata Effect on Existing IT Infrastructures
18. Top 3 #bigdata Infrastructure Challenges
19. The Scalability Devil Effect on Typical Analytics
• Economics of large capacity EDW storage
• Scalability of NAS/SAN file systems
• Bandwidth demand of OLAP engine
• IOPS demand of modeling
• Memory requirements of visualization
• MPP drives I/O blending
[Figure: typical analytics pipeline with boxes Structured Data, Unstructured Data, ETL (four stages), EDW, NAS/SAN, OLAP Engine, Semantic Engine, Model, Visualize, Report]
21. Hadoop
• IS NOT a person or the solution to world famine or a BI platform or an analytics platform or an EDW or a CEP engine or …
• IS a growing basket of technologies facilitating BI and/or analytics, especially if there is a lot of unstructured data
• IS at the core of many “science projects”
• IS in the infancy of deployment in the traditional enterprise
• The HDFS “data lake” concept is very important
22. BI & Analytics Today
[Figure: boxes Database, File System, ETL (primary), Enterprise Data Warehouse, Reporting & Visualization, ETL (secondary), Analytics, CEP, Business Auditing & Planning]
23. Hadoop Effect
[Figure: boxes Database, ETL, Enterprise Data Warehouse, Reporting & Visualization, Analytics, CEP, Business Auditing & Planning, Business Data Warehouse]
24. #bigdata “At Work” with DDN
Case Studies
25. Accelerating Fraud Awareness
Harnessing Hadoop and Big Data
DDN helps PayPal’s Financial Linking System achieve 200–250ms processing and customer transparency
“On the cost side, the same performance at 3-4 times less cost, that’s clearly important. The fact is, you’ve got scalability you didn’t have previously.”
Ryan Quick, Principal Architect, PayPal
26. Accelerating Financial Insights
“Other technologies paled in comparison to the performance levels achieved with DDN’s SFA12K.”
Brian Alexseychuk, Managing Director of Infrastructure
• Resolved scaling challenges and parallelized workflows
• Exceeded competitors on metrics such as scalability, speed, density, and TCO
• Improved revenues, reduced trade slippage by 70% & cut telecom expenses
27. Accelerating Time To Cure
“If you can serve some of the fastest computers on the planet, then you can help us.”
Phil Butcher, Head of IT
“If you need 10K cores to perform an extra layer of analysis in an hour … you need a real solution that can address everything from very small to extremely large data sets.”
Tim Cutts, Head of Scientific Computing
28. Accelerating Intelligence Insights
Naval Research Lab
Large Data Program
Application
• Deep storage & fast distributed search
• Super-HD, 2/3-D, and streaming data
DDN enables rapid threat detection by speeding up real-time data and imagery by up to 500%.
30. 2 Faces of #bigdata = Opportunities for Innovation
Technology
– Hyper-scalability: DB & FS
– Privacy (masking, obfuscation)
– Keyless security
– Visualization and navigation of large datasets
– HDFS persistence
– Provenance
– In-memory computing
– In-Storage Processing
– GraphDB on MPP
– Brute force or machine learning?
– Predictive & prescriptive analytics
Business
– Agility
– Narrow-casted solutions with higher stickiness
– Data-driven business decisions
– Retain existing customers and gain new ones
31. @informationcto