Hype, Hopes, Hell & Hadoop (#bigdata and the enterprise of everything)

The amount of data in our world has been exploding, and storing and analyzing large data sets (so-called big data) will become a key basis of competition for the new “Enterprise of Things”, underpinning fresh waves of productivity growth, innovation, and consumer surplus. Leaders in every sector, from government to healthcare to finance, will have to grapple with the implications of big data, as data growth continues unabated for the foreseeable future. The quest to make sense of all this big data begins with breaking down data silos within organizations, using cost-appropriate, shared infrastructure to ensure optimal extraction and analysis of data, knowledge and insight.

This presentation highlights all aspects of #bigdata exploitation, good or bad. It also covers the infrastructure challenges associated with it, the place of #hadoop in the big picture, and areas of opportunity for innovation.

Usage Rights

© All Rights Reserved

Presentation Transcript

  • Hype, Hopes, Hell & Hadoop. Big Data: Reality Check and Infrastructure Implications of “The Enterprise of Everything”. Jean-Luc Chatelain, EVP & CTO. StampedeCon 2014
  • And now, a quick word from my sponsor. © 2014 DataDirect Networks, Inc. * Other names and brands may be claimed as the property of others. Any statements or representations around future events are subject to change. ddn.com
  • DDN | Who We Are
      • Main Office: Santa Clara, California, USA
      • Employees: ~550 in 20 Countries
      • Installed Base: End Customers in 50 Countries
      • Go To Market: Partner & Reseller Assisted, Direct
      • DDN: World’s Largest Private Storage Company
      We Design, Deploy and Optimize Storage Systems that Solve HPC, Big Data and Cloud Business Challenges at Scale. World-Renowned & Award-Winning.
  • Big Data & Cloud Infrastructure: DDN’s Award-Winning Product Portfolio
      • Analytics Reference Architectures
      • EXAScaler™ (Petascale Lustre® Storage): 10Ks of Clients; 1TB/s+, HSM; Linux HPC Clients; NFS & CIFS [2014]
      • GRIDScaler™ (Enterprise Scale-Out File Storage): ~10K Clients; 1TB/s+, HSM; Linux/Windows HPC Clients; NFS & CIFS
      • Storage Fusion Architecture™ Core Storage Platforms
          – SFA12KX™: 48GB/s, 1.7M IOPS; 1,680 Drives in 2 Racks; Optional Embedded Computing
          – SFA7700™: 13GB/s; 600K IOPS; 7700X, 7700E
          – Flexible Drive Configuration: SATA, SSD, SAS
      • SFX™ Automated Flash Caching: Adaptive Transparent Flash Cache; SFX API Gives Users Control [pre-staging, alignment, bypass]
      • Cloud Foundation
          – WOS® 3.0: 32 Trillion Unique Objects; Geo-Replicated Cloud Storage; 256 Million Objects/Second; Self-Healing Cloud; Embedded metadata mgmt
          – WOS7000: 60 Drives in 4U; Self-Contained Servers
      • S3/Swift
      • Cloud Tiering
      • Big Data Platform Management: DirectMon®
      • Infinite Memory Engine™: Distributed File System Buffer Cache
  • Hype & Hopes
  • Hype
      #bigdata in the trough of disillusionment is great news for the enterprise!
      [Chart labels: 2011, 2014, Today]
  • Back To The Future?
      The term “Big Data” was coined circa 1999 (1)
      • Pervasive in some existing markets since the late 90’s
          – HPC sensu latissimo
          – Life Sciences
          – Intelligence
          – ASP (remember that word?)
      Is there anything new here? Why the hype?
      (1) “A Personal Perspective on the Origin(s) and Development of Big Data”, Diebold 2012
  • Is There a #bigdata Definition?
      For some yes; for others no, or maybe there are multiple definitions
      • It is “a basket of technologies”
      • It creates “a mindset change in decision making”
      “Data sets that exceed the boundaries and sizes of current infrastructure capabilities, forcing technologists to take a non-traditional approach”
      [Diagram labels: Normal Processing Capabilities; File/Object Size, Content Volume, Activity: IOPS; Lots of data, Large file sizes, Lots of transactions]
  • #bigdata: 2 Dimensions of the 3 V’s
      • Volume: Petabytes of Data, but also Trillions of Information Objects
      • Velocity: GB/s to TB/s, but also Millions of Information Objects per second
      • Variety: Structured & Unstructured, but also Stream & Batch workloads
      The “trillions” & “millions” are the primary drivers of complexity and challenge
      “Time to Results”
      Remember . . . 1 ms lost per operation on a billion-operation workload = about 11.5 days lost!
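The "days lost" figure on this slide can be checked with a few lines of arithmetic. The sketch below is not from the presentation: the function name `days_lost` and its parameters are hypothetical, and it assumes the operations run strictly one after another, so every millisecond of added latency shows up in wall-clock time. Under that assumption the exact result is about 11.57 days, which the slide rounds down to 11.5.

```python
# Hypothetical helper for illustration; names are not from the presentation.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def days_lost(operations: int, delay_ms_per_op: float) -> float:
    """Extra wall-clock time in days, assuming operations run serially."""
    total_seconds = operations * delay_ms_per_op / 1000.0
    return total_seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # One billion operations, each delayed by 1 ms.
    lost = days_lost(operations=1_000_000_000, delay_ms_per_op=1.0)
    print(f"{lost:.2f} days")  # 11.57 days
```

If the workload ran across N independent workers, the wall-clock loss would shrink by roughly a factor of N, but the total compute time wasted would stay the same.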
  • So, is #bigdata the new thing?
  • Quiz!
  • The Dawn of a Telemetry Revolution
      The Birth of a Mindset Change in Business Decision Making
      [Diagram labels: Internet of Things, Social, Sensors, Telemetry Revolution]
  • Hell
  • Governance, Regulation, Compliance
      The Universe of Big Data is a massive black hole into which GRC has fallen
      • Governance
      • Regulation
      • Compliance
      • Security
      • Privacy
      Now, welcome to the era of shadow data and behold the plague of hyper-scalability
  • Tackling #bigdata Is Non-trivial
      Value extraction (insights driving business results) is only done on 1% of total enterprise data
      Time to value & time to result are business critical
          – Inadequate infrastructure = failure & credibility loss
      The cardinality dimensions of the 3 V’s are the infrastructure killers
          – Material: network, compute, storage
          – Human: DBA, sysadmin & storadmin
      Today a #bigdata project cannot live in IT or it will fail
      Dare to be different
      #bigdata nullifies the feature race and favors the benefit race
  • Let’s Talk Real #bignumbers
      HPC is a forward-looking time machine that eats #bigdata for lunch
      • The enterprise’s #bigdata problems of today were HPC problems 3 to 5 years ago
      • HPC & Web architectures are converging
  • The #bigdata Effect on Existing IT Infrastructures
  • Top 3 #bigdata Infrastructure Challenges
  • The Scalability Devil Effect on Typical Analytics
      • Economics of large capacity EDW storage
      • Scalability of NAS/SAN file systems
      • Bandwidth demand of OLAP engine
      • IOPS demand of modeling
      • Memory requirements of visualization
      • MPP drives I/O blending
      [Diagram labels: Structured Data, Unstructured Data, ETL, EDW, NAS/SAN, OLAP Engine, Semantic Engine, Model, Visualize, Report]
  • Hadoop
  • Hadoop
      • IS NOT a person or the solution to world famine or a BI platform or an analytics platform or an EDW or a CEP engine or …
      • IS a growing basket of technologies facilitating BI and/or analytics, especially if there is a lot of unstructured data
      • IS at the core of many “science projects”
      • IS in the infancy of deployment in the traditional enterprise
      • HDFS “data lake” concept is very important
  • BI & Analytics Today
      [Diagram labels: Database, File System, ETL (primary), Enterprise Data Warehouse, Reporting & Visualization, ETL (secondary), Analytics, CEP, Business Auditing & Planning]
  • Hadoop Effect
      [Diagram labels: Database, ETL, Enterprise Data Warehouse, Reporting & Visualization, Analytics, CEP, Business Auditing & Planning, Business Data Warehouse]
  • #bigdata “At Work” with DDN: Case Studies
  • Accelerating Fraud Awareness: Harnessing Hadoop and Big Data
      DDN helps PayPal’s Financial Linking System achieve 200–250ms processing and customer transparency
      “On the cost side, the same performance at 3-4 times less cost, that’s clearly important. The fact is, you’ve got scalability you didn’t have previously.” Ryan Quick, Principal Architect, PayPal
  • Accelerating Financial Insights
      “Other technologies paled in comparison to the performance levels achieved with DDN’s SFA12K.” Brian Alexseychuk, Managing Director of Infrastructure
      • Resolved scaling challenges and parallelized workflows
      • Exceeded competitors on metrics such as scalability, speed, density, and TCO
      • Improved revenues, reduced trade slippage by 70% & cut telecom expenses
  • Accelerating Time To Cure
      “If you can serve some of the fastest computers on the planet, then you can help us.” Phil Butcher, Head IT
      “If you need 10K cores to perform an extra layer of analysis in an hour … you need a real solution that can address everything from very small to extremely large data sets.” Tim Cutts, Head of Scientific Computing
  • Accelerating Intelligence Insights
      Naval Research Lab Large Data Program Application
      • Deep storage & fast distributed search
      • Super-HD, 2/3-D, and streaming data
      DDN enables rapid threat detection by speeding up real-time data and imagery up to 500%.
  • In Conclusion
  • 2 Faces of #bigdata = Opportunities for Innovation
      Technology
          – Hyper-scalability: DB & FS
          – Privacy (masking, obfuscation)
          – Keyless security
          – Visualization and navigation of large datasets
          – HDFS persistence
          – Provenance
          – In-memory computing
          – In-Storage Processing
          – GraphDB on MPP
          – Brute force or machine learning?
          – Predictive & prescriptive analytics
      Business
          – Agility
          – Narrow casted solutions with higher stickiness
          – Data driven business decision
          – Retain existing customers and gain new ones
  • @informationcto