A modern, flexible approach to Hadoop implementation incorporating innovations from HP Haven
Jeff Veis
Vice President
HP Software Big Data
Gilles Noisette
Master Solution Architect
HP EMEA Big Data CoE
3. Data Accessibility today
Typical Compromises
• Infrastructure that becomes unaffordable at scale
• Analytics power that is accessible to only the few
• A trade-off between quality of insight & the speed of decisions
• Data is often past its effective expiration date to add value
4. IT
• Static Reporting
• Uniformity & Traceability
• Resource Rationing
• Cost focus
• Governance through denial
Efficiency of the Answer
Business
• Interactive Exploration
• Unfettered access
• Always on anywhere access
• Results focus
• Governance through enablement
Importance of the Question
Over 50% of all analytics-related buying is now coming from the business and increasingly from individuals – Gartner '15
SHIFT >
5. Empty
• Loss of Control & Budget
• IT’s future viability
• Risk of Duplication
• Unintentional Siloed Data
• Tie IT results to IT operations
Full
• Opportunity to collaborate
• Refocus on innovation
• Enable data-driven risk taking
• Spur business agility
• Tie IT results to business outcomes
Changing Role of the CIO
Emergence of Decentralized Analytics
6.
OLD
NEW
Management &
Governance
Data lake
Business Aligned Insight in Action
Enabling ubiquitous data flows for business-driven composite applications & services
Data-driven Composite Apps & OnDemand Services
Business as a passive consumer of data
Business as an active, collaborative, data-driven partner with IT
EDW
Big Data Analytics
Descriptive Analytics (Data Discovery, Embedded Analytics, Analytic Applications)
7. Management & Governance
A connected intelligence platform designed to harness 100% of the data
EDW
App
DB
App
DB
App
DB
Next gen data
services
Composite analytic
apps
Next gen predictive
analytics
Data lake
HP Haven Big Data platform
Reporting
Other data
New Style of IT
Data Tone
HP Haven Big Data Platform
9. Haven Big Data Platform
Turn 100% of your data into action.
Powering Big Data Analytics to Applications
Insight
Haven OnDemand
• Open APIs
• Rapid POCs & deployment
• Elastic / Multi-tenant
• Private Cloud-ready
• Pay-as-you-go
Haven Enterprise
• SQL / BI / Reporting
• Predictive Analytics
• Machine Learning
• Log Analytics
• Search
• Image / Audio / Video
The HP Haven Big Data Platform
Haven OnHadoop
• Secure Data Lake
• Exploration
• Open Data Format
• YARN-ready
• Governance
• Native support for MapR, Hortonworks & Cloudera
Human Data
Business Data
Machine Data
HP Vertica, HP IDOL, KeyView,
HP Distributed R Predictive Analytics
HP Vertica SQL on Hadoop
HP IDOL for Hadoop
HP Vertica OnDemand & HP IDOL OnDemand
10. Gain insights into your data in near real time by running queries 50x-1,000x faster than legacy products
Blazing Fast Analytics
Speed, Scalability, and Openness at Lower TCO
HP Vertica
High-Performance Data Analytics Platform Purpose Built for Big Data
HP Vertica Analytics Platform
Infinitely scale your solution by adding an unlimited number of industry-standard servers
Massive Scalability
Protect and embrace your investment in hardware and software, with built-in support for
Hadoop, R, and a range of ETL and BI tools
Open Architecture
Store 10x-30x more data per server than row databases with patented columnar compression
Optimized Data Storage
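The storage claim above rests on columnar compression. The slide does not describe the patented encoding itself, so the following is only a generic sketch of one well-known columnar technique, run-length encoding, and not HP Vertica's actual implementation. The table contents and the function names `rle_encode` and `rle_decode` are hypothetical.

```python
from itertools import groupby

def rle_encode(column):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]

# Hypothetical table stored row by row: (region, status)
rows = [("EMEA", "open")] * 4 + [("EMEA", "closed")] * 3 + [("APJ", "open")] * 5

# Columnar layout: each attribute is stored contiguously, so repeated
# values sit next to each other and collapse into short runs.
region_column = [region for region, _ in rows]
status_column = [status for _, status in rows]

encoded_region = rle_encode(region_column)
encoded_status = rle_encode(status_column)

print(len(region_column), "values ->", len(encoded_region), "runs")
print(len(status_column), "values ->", len(encoded_status), "runs")
```

In a row store the two attributes are interleaved, so runs like these rarely form. That is the intuition behind column stores compressing better; the 10x-30x figure itself is the slide's claim and is not reproduced by this toy example.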
11. HP Vertica – Built for Speed
We boost performance
Used to take          Now takes
1 hour                3.6 seconds
8 hours (overnight)   Under 30 seconds
What the Vertica performance advantage means:
"When we did the first queries, they were done so fast, we thought they were broken."
- Michael Relich, Guess?
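As a quick check, the two before-and-after runtimes on this slide can be converted into speedup factors. The helper name `speedup` is illustrative only. Because the second case is stated as "under 30 seconds", its result is a lower bound.

```python
def speedup(before_seconds, after_seconds):
    """Return how many times faster the new runtime is."""
    return before_seconds / after_seconds

HOUR = 3600  # seconds in one hour

one_hour_case = speedup(1 * HOUR, 3.6)    # 1 hour -> 3.6 seconds
overnight_case = speedup(8 * HOUR, 30)    # 8 hours -> under 30 seconds

print(f"1 hour -> 3.6 s: about {one_hour_case:.0f}x")
print(f"8 hours -> 30 s: at least {overnight_case:.0f}x")
```

Both examples work out to roughly 1,000x and 960x, which places them at the top of the 50x-1,000x range quoted for HP Vertica earlier in the deck.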
13. Haven OnHadoop – Delivering a Smarter Data Lake
Vertica Optimized Storage Hadoop
Enterprise-class discovery analytics on ANY Hadoop node
HP Vertica SQL on Hadoop
HP Vertica SQL on Hadoop:
- Best-in-class ANSI SQL Analytics
- Hadoop Distribution Agnostic
- Query data in place in Hadoop Formats
- Co-Locate and leverage existing Hadoop infrastructure
- HP Vertica performance on lower-cost infrastructure
- Single query engine across diverse formats and infrastructure
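The idea of a single query engine across diverse formats can be illustrated with a toy example. This is not HP Vertica SQL on Hadoop and uses none of its syntax. It is a minimal sketch, assuming two small in-memory data sets that stand in for files of different formats, with hypothetical helper names (`read_csv`, `read_json_lines`, `total_by`).

```python
import csv
import io
import json

def read_csv(text):
    """Yield rows from CSV text as dictionaries."""
    yield from csv.DictReader(io.StringIO(text))

def read_json_lines(text):
    """Yield rows from JSON-lines text as dictionaries."""
    for line in text.splitlines():
        if line.strip():
            yield json.loads(line)

def total_by(rows, key, value):
    """Rough equivalent of SELECT key, SUM(value) ... GROUP BY key."""
    totals = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0.0) + float(row[value])
    return totals

# Hypothetical data sets standing in for files stored in different formats
csv_data = "region,amount\nEMEA,10\nAPJ,5\n"
json_data = '{"region": "EMEA", "amount": 2.5}\n{"region": "AMS", "amount": 4}\n'

def all_rows():
    """Read each source in its own format, without converting it first."""
    yield from read_csv(csv_data)
    yield from read_json_lines(json_data)

result = total_by(all_rows(), "region", "amount")
print(result)
```

The query logic in `total_by` is written once. Only the format-specific readers differ, which is the design point behind querying data in place rather than first copying it into one store.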
14. Apache YARN: The resource manager for Hadoop 2.0
HP Vertica on Hadoop YARN
HP Software is working on porting Vertica to YARN
Data Processing Engines Run Natively IN Hadoop
INTERACTIVE
Tez
STREAMING
Storm
GRAPH
Giraph
ANALYTICS
HP Vertica
ONLINE
HBase
OTHERS
…
HDFS: Redundant, Reliable Storage
YARN: Cluster Resource Management
BATCH
MapReduce
FUTURE
ANALYTICS
HP Vertica
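The layering on this slide (several processing engines sharing one cluster through YARN) can be sketched as a toy resource manager that hands out memory containers. This is a simplified model under stated assumptions, not the YARN API. The class names, node sizes, and first-fit policy are hypothetical, and real YARN schedulers add queues, priorities, and CPU accounting.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    free_mb: int

@dataclass
class ResourceManager:
    nodes: list
    allocations: list = field(default_factory=list)

    def request(self, app, container_mb):
        """Grant a container on the first node with enough free memory."""
        for node in self.nodes:
            if node.free_mb >= container_mb:
                node.free_mb -= container_mb
                self.allocations.append((app, node.name, container_mb))
                return node.name
        return None  # no capacity left: the request would have to wait

# Hypothetical two-node cluster shared by several engines
rm = ResourceManager([Node("node1", 8192), Node("node2", 4096)])

placements = {
    "MapReduce": rm.request("MapReduce", 4096),
    "Tez": rm.request("Tez", 4096),
    "Storm": rm.request("Storm", 4096),
    "HBase": rm.request("HBase", 2048),
}
print(placements)
```

The engines never claim machines directly. Each one asks the resource manager, which is what allows batch, interactive, and streaming workloads to coexist on the same HDFS-backed cluster.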
15. HP Haven Predictive Analytics
Delivering scale and performance with Distributed R breakthrough technology
Build models
Evaluate models
Deploy models (in-database scoring)
BI integration
1 Ingest and prepare data by leveraging HP Vertica
2 Build and evaluate predictive models on large data sets using Distributed R
3 Deploy models to Vertica and use in-database scoring to produce prediction results for BI and applications
5x performance improvement
A scalable, high-performance engine for the R language developed by HP Labs
• Native integration with HP Vertica
• Compatible with popular tools like RStudio and existing R libraries
• Open source, with enterprise-class support from HP
HP-powered clustered computing
New
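The three-step workflow on this slide (prepare data, build and evaluate a model on partitioned data, then score) can be sketched in miniature. This is not the Distributed R API and is written in Python rather than R. It is a toy illustration, assuming a simple linear regression whose per-partition statistics are combined, with hypothetical function names and data.

```python
def partial_stats(partition):
    """Sufficient statistics for simple linear regression on one partition."""
    n = len(partition)
    sx = sum(x for x, _ in partition)
    sy = sum(y for _, y in partition)
    sxx = sum(x * x for x, _ in partition)
    sxy = sum(x * y for x, y in partition)
    return n, sx, sy, sxx, sxy

def combine(stats):
    """Add the per-partition statistics element-wise."""
    return tuple(sum(values) for values in zip(*stats))

def fit(partitions):
    """Fit y = slope * x + intercept from partitioned data."""
    n, sx, sy, sxx, sxy = combine(partial_stats(p) for p in partitions)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept

def score(model, x):
    """Apply a fitted model to a new input value."""
    slope, intercept = model
    return slope * x + intercept

# Step 1: prepared data split across hypothetical workers (here y = 2x + 1)
partitions = [[(0, 1), (1, 3)], [(2, 5), (3, 7)], [(4, 9)]]

# Step 2: build the model from partition summaries, then evaluate it
model = fit(partitions)
max_error = max(abs(score(model, x) - y) for p in partitions for x, y in p)

# Step 3: use the deployed model to score a new record
prediction = score(model, 10)
print(model, max_error, prediction)
```

Each partition is summarized independently and only the small summaries are combined, so no single worker needs the full data set. That is the general pattern that lets model building scale out across a cluster.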
19. Haven OnDemand Big Data services powered by IDOL
50+ easy-to-use web services to power the next generation of apps
Now includes Speech to Text powered by Deep Neural Network technology that is 75% more accurate, as well as advanced Knowledge Graph search technology