Eliminating the Challenges of Big Data
Management Inside Hadoop
2 © RedPoint Global Inc. 2015 Confidential
Today’s Speakers
Justin Sears, Senior Manager, Product Marketing, Hortonworks
Jamie Keeffe, Product Marketing Manager, RedPoint Global
Kris Tomes, Solutions Director, RedPoint Global
Page 3 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Hortonworks: Hadoop for the Enterprise
We Do Hadoop
Spring 2015
Hadoop for the Enterprise:
Implement a Modern Data Architecture with HDP
Customer Momentum
•  330+ customers (as of year-end 2014)
Hortonworks Data Platform
•  Completely open multi-tenant platform for any app & any data.
•  A centralized architecture of consistent enterprise services for
resource management, security, operations, and governance.
Partner for Customer Success
•  Open source community leadership focus on enterprise needs
•  Unrivaled world class support
•  Founded in 2011
•  Original 24 architects, developers,
operators of Hadoop from Yahoo!
•  600+ Employees
•  1000+ Ecosystem Partners
Hadoop for the Enterprise: Implement a
Modern Data Architecture with HDP
Spring 2015
Hortonworks. We do Hadoop.
Traditional systems under pressure
Challenges
•  Constrains data to app
•  Can’t manage new data
•  Costly to Scale
[Figure: business value for industry leaders versus laggards. New data (clickstream, geolocation, web data, Internet of Things, docs and emails, server logs) grows from 2.8 zettabytes in 2012 to 40 zettabytes in 2020, alongside traditional data (ERP, CRM, SCM).]
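The growth figures on this slide (2.8 zettabytes in 2012, 40 zettabytes in 2020) imply a compound annual growth rate. A minimal sketch of that arithmetic, assuming smooth compounding over the eight years; the class and method names are illustrative and do not come from the deck:

```java
final class DataGrowth {
    // Compound annual growth rate between two volumes over a number of years.
    static double cagr(double startVolume, double endVolume, int years) {
        return Math.pow(endVolume / startVolume, 1.0 / years) - 1.0;
    }

    public static void main(String[] args) {
        double rate = cagr(2.8, 40.0, 2020 - 2012);
        System.out.printf("Overall growth: %.1fx%n", 40.0 / 2.8);
        System.out.printf("Implied annual growth: %.1f%%%n", rate * 100);
    }
}
```

Under these assumptions the volume grows roughly 14-fold overall, which corresponds to about 39% per year.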
Hadoop emerged as foundation of new data architecture
Apache Hadoop is an open source data platform for
managing large volumes of high-velocity, high-variety data
•  Built by Yahoo! to be the heartbeat of its ad & search business
•  Donated to Apache Software Foundation in 2005 with rapid adoption by
large web properties & early adopter enterprises
•  Incredibly disruptive to current platform economics
Traditional Hadoop Advantages
•  Manages new data paradigm
•  Handles data at scale
•  Cost effective
•  Open source
Traditional Hadoop Had Limitations
•  Batch-only architecture
•  Single purpose clusters, specific data sets
•  Difficult to integrate with existing investments
•  Not enterprise-grade
[Diagram: a single application running batch processing (MapReduce) on top of storage (HDFS).]
Modern Data Architecture emerges to unify data & processing
Modern Data Architecture
•  Enable applications to have access to
all your enterprise data through an
efficient centralized platform
•  Supported with a centralized approach to
governance, security and operations
•  Versatile enough to handle any application
and dataset, no matter the size or type
[Diagram: Modern Data Architecture. SOURCES: existing systems (ERP, CRM, SCM) plus clickstream, web & social, geolocation, sensor & machine, server logs and unstructured data. In the middle: HDFS (Hadoop Distributed File System) and YARN: Data Operating System, running batch, interactive, real-time and partner ISV workloads, next to MPP and EDW systems. ANALYTICS: data marts, business analytics, visualization & dashboards, applications.]
RedPoint Global is a Hortonworks Partner, certified on HDP and YARN.
With RedPoint, your existing data analysts and database administrators can easily work with data stored in Hadoop. No new skills are required.
Hadoop adoption follows a predictable journey
From cost optimization, to new analytic apps, and ultimately to a “data lake”
Hadoop Driver: Cost optimization
Archive Data off EDW
Move rarely used data to Hadoop as active
archive, store more data longer
Offload costly ETL process
Free your EDW to perform high-value functions
like analytics & operations, not ETL
Enrich the value of your EDW
Use Hadoop to refine new data sources, such as
web and machine data for new analytical context
HDP helps you reduce costs and optimize the value associated with your EDW
[Diagram: SOURCES (existing ERP, CRM and SCM systems plus clickstream, web & social, geolocation, sensor & machine, server logs and unstructured data) feed the DATA SYSTEMS layer. HDP 2.2 handles ELT, cold data, deeper archive and new sources; hot data stays in the enterprise data warehouse (MPP, in-memory). ANALYTICS: data marts, business analytics, visualization & dashboards.]
Hadoop Driver: Advanced analytic applications
Columns: Single View (improve acquisition and retention) | Predictive Analytics (identify your next best action) | Data Discovery (uncover new findings)
Financial Services
New Account Risk Screens | Trading Risk | Insurance Underwriting
Improved Customer Service | Insurance Underwriting | Aggregate Banking Data as a Service
Cross-sell & Upsell of Financial Products | Risk Analysis for Usage-Based Car Insurance | Identify Claims Errors for Reimbursement
Telecom
Unified Household View of the Customer | Searchable Data for NPTB Recommendations | Protect Customer Data from Employee Misuse
Analyze Call Center Contacts Records | Network Infrastructure Capacity Planning | Call Detail Records (CDR) Analysis
Inferred Demographics for Improved Targeting | Proactive Maintenance on Transmission Equipment | Tiered Service for High-Value Customers
Retail
360° View of the Customer | Supply Chain Optimization | Website Optimization for Path to Purchase
Localized, Personalized Promotions | A/B Testing for Online Advertisements | Data-Driven Pricing, improved loyalty programs
Customer Segmentation | Personalized, Real-time Offers | In-Store Shopper Behavior
Manufacturing
Supply Chain and Logistics | Optimize Warehouse Inventory Levels | Product Insight from Electronic Usage Data
Assembly Line Quality Assurance | Proactive Equipment Maintenance | Crowdsource Quality Assurance
Single View of a Product Throughout Lifecycle | Connected Car Data for Ongoing Innovation | Improve Manufacturing Yields
Healthcare
Electronic Medical Records | Monitor Patient Vitals in Real-Time | Use Genomic Data in Medical Trials
Improving Lifelong Care for Epilepsy | Rapid Stroke Detection and Intervention | Monitor Medical Supply Chain to Reduce Waste
Reduce Patient Re-Admittance Rates | Video Analysis for Surgical Decision Support | Healthcare Analytics as a Service
Oil & Gas
Unify Exploration & Production Data | Monitor Rig Safety in Real-Time | Geographic exploration
DCA to Slow Well Declines Curves | Proactive Maintenance for Oil Field Equipment | Define Operational Set Points for Wells
Government
Single View of Entity | CBM & Autonomic Logistic Analysis | Sentiment Analysis on Program Effectiveness
Prevent Fraud, Waste and Abuse | Proactive Maintenance for Public Infrastructure | Meet Deadlines for Government Reporting
Hadoop Driver: Enabling the data lake
[Figure axes: SCALE, SCOPE]
Data Lake Definition
•  Centralized Architecture
Multiple applications on a shared data set
with consistent levels of service
•  Any App, Any Data
Multiple applications accessing all data
affording new insights and opportunities.
•  Unlocks ‘Systems of Insight’
Advanced algorithms and applications
used to derive new value and optimize
existing value.
Drivers:
1.  Cost Optimization
2.  Advanced Analytic Apps
Goal:
•  Centralized Architecture
•  Data-driven Business
DATA LAKE
Journey to the Data Lake with Hadoop
Systems of Insight
Case Study: 12 month Hadoop evolution at TrueCar
12 months execution plan (chart axis: data platform capabilities)
•  June 2013: Begin Hadoop Execution
•  July 2013: Hortonworks Partnership
•  Aug 2013: Training & Dev Begins
•  Nov 2013: Production Cluster, 60 Nodes, 2 PB
•  Dec 2013: Three Production Apps (3 total)
•  Jan 2014: 40% Dev Staff, Perficient
•  Feb 2014: Three More Production Apps (6 total)
•  May ‘14: IPO
12 Month Results at TRUECar
•  Six Production Hadoop Applications
•  Sixty nodes/2PB data
•  Storage Costs/Compute Costs
from $19/GB to $0.23/GB
“We addressed our data platform capabilities
strategically as a pre-cursor to IPO.”
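The TrueCar figures above can be put into relative terms. A rough sketch of the arithmetic, assuming the $19/GB and $0.23/GB figures are directly comparable and that the 2 PB is spread evenly over the 60 nodes (decimal units, 1 PB = 1,000 TB); the class and method names are illustrative:

```java
final class TrueCarCosts {
    // Fractional reduction when a unit cost drops from 'before' to 'after'.
    static double reduction(double before, double after) {
        return (before - after) / before;
    }

    // Average data per node in TB, assuming an even spread across nodes.
    static double terabytesPerNode(double petabytes, int nodes) {
        return petabytes * 1000.0 / nodes;
    }

    public static void main(String[] args) {
        System.out.printf("Cost reduction: %.1f%%%n", reduction(19.0, 0.23) * 100);
        System.out.printf("Cost ratio: %.0fx%n", 19.0 / 0.23);
        System.out.printf("Data per node: %.1f TB%n", terabytesPerNode(2.0, 60));
    }
}
```

On these assumptions the per-gigabyte cost falls by about 99% (roughly an 83-fold drop), and each node holds about 33 TB on average.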
Hortonworks Data Platform
Hadoop for the Enterprise
Only HDP delivers a Centralized Architecture
HDP is uniquely built around YARN serving as a data operating system that provides multi-tenant Resource
Management, consistent Governance & Security and efficient Operations services across Hadoop applications.
Hortonworks Data Platform
YARN
Data Operating System
•  A centralized architecture of
consistent enterprise services for
resource management, security,
operations, and governance.
•  The versatility to support multiple
applications and diverse workloads
from batch to interactive to real-time,
open source and commercial.
Key Benefits
•  Multiple applications on a shared data set
with consistent levels of service: a
multitenant data platform.
•  Provides a shared platform to enable new
analytic applications.
•  Delivers maximum cost efficiency for
cluster resource management. Fewer
servers, fewer nodes.
[Diagram: existing applications, new analytics and partner applications use batch, interactive & real-time data access on top of YARN: Data Operating System and storage, with shared governance, security, operations and resource management services.]
HDP delivers a completely open data platform
Hortonworks Data Platform 2.2
Hortonworks Data Platform provides Hadoop for the Enterprise: a centralized architecture
of core enterprise services, for any application and any data.
Completely Open
•  HDP incorporates every element
required of an enterprise data
platform: data storage, data access,
governance, security, operations
•  All components are developed in
open source and then rigorously
tested, certified, and delivered as an
integrated open source platform
that’s easy to consume and use by
the enterprise and ecosystem.
[Diagram: HDP 2.2 components, built on YARN: Data Operating System (Cluster Resource Management) and HDFS (Hadoop Distributed File System)]
GOVERNANCE: Apache Falcon, Apache Sqoop, Apache Flume, Apache Kafka
BATCH, INTERACTIVE & REAL-TIME DATA ACCESS: Apache Pig, Apache Hive, Cascading, Apache HBase, Apache Accumulo, Apache Solr, Apache Spark, Apache Storm
SECURITY: Apache Ranger, Apache Knox, Apache Falcon
OPERATIONS: Apache Ambari, Apache ZooKeeper, Apache Oozie
HDP: Any Data, Any Application, Anywhere
Any Application
•  Deep integration with ecosystem
partners to extend existing
investments and skills
•  Broadest set of applications through
the stable of YARN-Ready applications
Any Data
Deploy applications fueled by clickstream, sensor,
social, mobile, geo-location, server log, and other new
paradigm datasets with existing legacy datasets.
Anywhere
Implement HDP naturally across the complete
range of deployment options
[Diagram: data sources: clickstream, web & social, geolocation, Internet of Things, server logs, files and emails, ERP, CRM, SCM. Deployment options: commodity, appliance, cloud, hybrid.]
Over 70 Hortonworks Certified YARN Apps
Hortonworks supports the full application lifecycle
Hadoop usage follows a consistent lifecycle
From architecture to expansion, all with a consistent support experience
Lifecycle phases: Architecture & Development, Implementation, Production, Expansion
Most Common Support Issues by Project Phase
Issues addressed by Hortonworks Support by type for the past year
Issue Type                  Share
Architecture                7%
Application Development     10%
Installation                10%
Performance                 5%
Configuration               25%
Executing Jobs              20%
Cluster Administration      18%
HDP Upgrades                3%
Enhancement Requests        3%
TOTAL                       100%
Hortonworks Support
Full Lifecycle Subscription Support
Support through EVERY phase of adoption of
your Hadoop project to ensure your success
[Chart: # tickets for Project 2, Project 3, ... Project N]
“Hortonworks loves and lives
open source innovation”
World Class Support and Services.
Hortonworks' Customer Support received a
maximum score and was significantly higher
than both Cloudera and MapR
A Leader in Hadoop
The Forrester Wave™
Big Data Hadoop Solutions
Q1 2014
Eliminating the Challenges of Big Data
Management Inside Hadoop
Overview of RedPoint Global
•  Launched 2006
•  Founded and staffed by industry veterans
•  Headquarters: Wellesley, Massachusetts
•  Offices in US, UK, Australia, Philippines
•  Global customer base
•  Serves most major industries
MAGIC QUADRANT: Data Quality
MAGIC QUADRANT: Multichannel Campaign Management
MAGIC QUADRANT: Integrated Marketing Management
Andrew Brust, GigaOm Research
New Data Straining Current Architectures
Unstructured documents, emails
Transactional data
Server logs
Sentiment, web data
Geolocation
Sensor, machine data
Clickstream
Hierarchical data
OLTP, ERP, CRM
Master data
2.8 ZB in 2013
85% from new data types
15x Machine Data by 2020
40 ZB by 2020
Source: IDC
Key Functions for Data Management
ETL & ELT
•  Profiling, reads/writes, transformations
•  Single project for all jobs
Data Quality
•  Cleanse data
•  Parsing, correction
•  Geo-spatial analysis
Integration & Matching
•  Grouping
•  Fuzzy match
Master Key Management
•  Create keys
•  Track changes
•  Maintain matches over time
Web Services Integration
•  Consume and publish
•  HTTP/HTTPS protocols
•  XML/JSON/SOAP formats
Process Automation & Operations
•  Job scheduling, monitoring, notifications
•  Central point of control
•  Meta Data Management
Overview - What is Hadoop?
Hadoop 1.0
•  All operations based on MapReduce
•  Intrinsic inconsistency of code-based solutions
•  Highly skilled and expensive resources needed
•  3rd party applications constrained by the need to generate code
Hadoop 2.0
•  Introduction of YARN: “a general-purpose, distributed, application management framework that supersedes the classic Apache Hadoop MapReduce framework for processing data in Hadoop clusters.”
•  Mature applications can now operate directly on Hadoop
•  Reduced skill requirements and increased consistency
[Diagram: HADOOP 2.0 stack in three layers (System, Engine, API). System: HDFS (Hadoop Distributed File System) and YARN: Data Operating System. Engine: MapReduce (batch), Tez (batch & interactive), Slider (real-time), Spark, Storm (stream), HBase and Accumulo (NoSQL), other ISV engines. API: Pig (scripting), Hive (SQL), Cascading (Java, Scala), direct access (Java, .NET), other ISV.]
RedPoint Data Management on Hadoop
[Diagram labels: Parallel Section; Key / Split Analysis; Data I/O; Partitioning (AM / Tasks); Execution (AM / Tasks); YARN; MapReduce.]
RedPoint DM for Hadoop: Processing Flow
[Diagram: the Data Management Designer and DM Execution Server hand a parallel section to the cluster. Components shown: Resource Manager; three Node Managers, one hosting the DM App Master and the others hosting DM Tasks. Arrow labels: "Launches DM App Master", "Launches Tasks", "Running DM Task". Steps are numbered 1 to 3.]
Benchmarks – Project Gutenberg
MapReduce: >150 lines of MR code; 6 hours of development; 6 minutes runtime; extensive optimization needed
Pig: ~50 lines of script code; 3 hours of development; 15 minutes runtime; User Defined Functions required prior to running script
RedPoint: 0 lines of code; 15 min. of development; 3 minutes runtime; no tuning or optimization required
Sample MapReduce (small subset of the entire code which totals nearly 150 lines):
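    // Imports needed at the top of the enclosing source file.
    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Nested inside the job class. WordOffset is the job's custom input key
    // type; its definition is not part of this excerpt.
    public static class MapClass
            extends Mapper<WordOffset, Text, Text, IntWritable> {

        // Characters treated as token separators (the inner quote is escaped).
        private final static String delimiters =
                "',./<>?;:\"[]{}-=_+()&*%^#$!@`~ |«»¡¢£¤¥¦©¬®¯±¶·¿";
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // Emit (token, 1) for every token found in the input line.
        public void map(WordOffset key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer itr = new StringTokenizer(line, delimiters);
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }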
Sample Pig script without the UDF:
    SET pig.maxCombinedSplitSize 67108864
    SET pig.splitCombination true
    A = LOAD '/testdata/pg/*/*/*';
    B = FOREACH A GENERATE FLATTEN(TOKENIZE((chararray)$0)) AS word;
    C = FOREACH B GENERATE UPPER(word) AS word;
    D = GROUP C BY word;
    E = FOREACH D GENERATE COUNT(C) AS occurrences, group;
    F = ORDER E BY occurrences DESC;
    STORE F INTO '/user/cleonardi/pg/pig-count';
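For comparison with the two excerpts above, the same word-count logic (tokenize, upper-case, group, count, order by count descending) can be sketched in plain single-machine Java with no Hadoop dependency. This is not the benchmark's, Pig's or RedPoint's implementation; the class name, method name and sample text are illustrative, and splitting on whitespace only is a simplifying assumption about tokenization:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class LocalWordCount {
    // Tokenize on whitespace, upper-case each token, count occurrences,
    // and return the entries ordered by count, highest first.
    static List<Map.Entry<String, Integer>> count(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // blank input yields a single empty token
            }
            counts.merge(token.toUpperCase(Locale.ROOT), 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> ordered = new ArrayList<>(counts.entrySet());
        ordered.sort(Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder()));
        return ordered;
    }

    public static void main(String[] args) {
        // Hypothetical sample text, not taken from the benchmark data.
        for (Map.Entry<String, Integer> e : count("the cat and the hat and the bat")) {
            System.out.println(e.getValue() + "\t" + e.getKey());
        }
    }
}
```

The sketch keeps everything in one in-memory map, so it only illustrates the logic; the MapReduce and Pig versions exist to distribute the same steps across a cluster. Words with equal counts come out in no particular order.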
RedPoint Ranks #1
Consistent High Rankings
Data Lake Architecture for MDM
Kris Tomes
Solution Director at RedPoint
Demonstration Introduction
Key Factors to Consider
•  Traditional data architectures are challenged
•  Maximize the scale & cost optimization of the Hortonworks Modern Data Architecture
•  Leverage your DBAs to control development / implementation / production costs and schedules
•  Smooth out your journey to a data lake
•  Expedite the speed for getting business applications into production
•  Insist on Any Data, Any Application, Any Environment
•  Do your data quality and data integration in the cluster
Thank You & Please Visit Us at www.RedPoint.net
Jamie Keeffe
Product Marketing Manager
RedPoint Global Inc.
Jamie.Keeffe@RedPoint.Net
+1 978-764-3839

More Related Content

What's hot

Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success DataWorks Summit/Hadoop Summit
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...Hortonworks
 
How Universities Use Big Data to Transform Education
How Universities Use Big Data to Transform EducationHow Universities Use Big Data to Transform Education
How Universities Use Big Data to Transform EducationHortonworks
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks
 
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...Hortonworks
 
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...Hortonworks
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks
 
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Hortonworks
 
Trucking demo w Spark ML - Paul Hargis - Hortonworks
Trucking demo w Spark ML - Paul Hargis - HortonworksTrucking demo w Spark ML - Paul Hargis - Hortonworks
Trucking demo w Spark ML - Paul Hargis - HortonworksKelly Kohlleffel
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks
 
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...Hortonworks
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalHortonworks
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez Hortonworks
 
HPE and Hortonworks join forces to Deliver Healthcare Transformation
HPE and Hortonworks join forces to Deliver Healthcare TransformationHPE and Hortonworks join forces to Deliver Healthcare Transformation
HPE and Hortonworks join forces to Deliver Healthcare TransformationHortonworks
 
Data Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationData Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationHortonworks
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack EuropeHortonworks
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopHortonworks
 
Big Data Expo 2015 - Hortonworks Common Hadoop Use Cases
Big Data Expo 2015 - Hortonworks Common Hadoop Use CasesBig Data Expo 2015 - Hortonworks Common Hadoop Use Cases
Big Data Expo 2015 - Hortonworks Common Hadoop Use CasesBigDataExpo
 

What's hot (19)

Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
 
How Universities Use Big Data to Transform Education
How Universities Use Big Data to Transform EducationHow Universities Use Big Data to Transform Education
How Universities Use Big Data to Transform Education
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration
 
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
 
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptx
 
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
 
Trucking demo w Spark ML - Paul Hargis - Hortonworks
Trucking demo w Spark ML - Paul Hargis - HortonworksTrucking demo w Spark ML - Paul Hargis - Hortonworks
Trucking demo w Spark ML - Paul Hargis - Hortonworks
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - Webinar
 
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_final
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez
 
HPE and Hortonworks join forces to Deliver Healthcare Transformation
HPE and Hortonworks join forces to Deliver Healthcare TransformationHPE and Hortonworks join forces to Deliver Healthcare Transformation
HPE and Hortonworks join forces to Deliver Healthcare Transformation
 
Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks
 
Data Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationData Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop Implementation
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack Europe
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache Hadoop
 
Big Data Expo 2015 - Hortonworks Common Hadoop Use Cases
Big Data Expo 2015 - Hortonworks Common Hadoop Use CasesBig Data Expo 2015 - Hortonworks Common Hadoop Use Cases
Big Data Expo 2015 - Hortonworks Common Hadoop Use Cases
 

Viewers also liked

Hadoop and Spark – Perfect Together
Hadoop and Spark – Perfect TogetherHadoop and Spark – Perfect Together
Hadoop and Spark – Perfect TogetherHortonworks
 
Apache Hive on ACID
Apache Hive on ACIDApache Hive on ACID
Apache Hive on ACIDHortonworks
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopHortonworks
 
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Data Con LA
 
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...Hortonworks
 
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...Leverage Big Data to Enhance Customer Experience in Telecommunications – with...
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...Hortonworks
 
Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks Technical Workshop: What's New in HDP 2.3Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks Technical Workshop: What's New in HDP 2.3Hortonworks
 
Hortonworks Technical Workshop: HDP everywhere - cloud considerations using...
Hortonworks Technical Workshop:   HDP everywhere - cloud considerations using...Hortonworks Technical Workshop:   HDP everywhere - cloud considerations using...
Hortonworks Technical Workshop: HDP everywhere - cloud considerations using...Hortonworks
 
Hortonworks technical workshop operations with ambari
Hortonworks technical workshop   operations with ambariHortonworks technical workshop   operations with ambari
Hortonworks technical workshop operations with ambariHortonworks
 

Eliminating the Challenges of Big Data Management

  • 1. Eliminating the Challenges of Big Data Management Inside Hadoop
  • 2. Today’s Speakers (© RedPoint Global Inc. 2015 Confidential)
      • Justin Sears, Senior Manager, Product Marketing, Hortonworks
      • Jamie Keeffe, Product Marketing Manager, RedPoint Global
      • Kris Tomes, Solutions Director, RedPoint Global
  • 3. Hortonworks: Hadoop for the Enterprise. We Do Hadoop. Spring 2015. © Hortonworks Inc. 2011 – 2015. All Rights Reserved
  • 4. Hadoop for the Enterprise: Implement a Modern Data Architecture with HDP
      Customer Momentum
        • 330+ customers (as of year-end 2014)
      Hortonworks Data Platform
        • Completely open multi-tenant platform for any app & any data.
        • A centralized architecture of consistent enterprise services for resource management, security, operations, and governance.
      Partner for Customer Success
        • Open source community leadership focus on enterprise needs
        • Unrivaled world class support
      • Founded in 2011
      • Original 24 architects, developers, operators of Hadoop from Yahoo!
      • 600+ Employees
      • 1000+ Ecosystem Partners
  • 5. Hadoop for the Enterprise: Implement a Modern Data Architecture with HDP. Spring 2015. Hortonworks. We do Hadoop.
  • 6. Traditional systems under pressure. Challenges: • Constrains data to app • Can’t manage new data • Costly to scale. New data: clickstream, geolocation, web data, Internet of Things, docs and emails, server logs. Traditional data: ERP, CRM, SCM. Data volume: 2.8 zettabytes in 2012, 40 zettabytes in 2020. [Chart labels: Business Value; Industry Leaders; Laggards]
  • 7. Hadoop emerged as foundation of new data architecture. Apache Hadoop is an open source data platform for managing large volumes of high-velocity, high-variety data. • Built by Yahoo! to be the heartbeat of its ad & search business • Donated to Apache Software Foundation in 2005 with rapid adoption by large web properties & early adopter enterprises • Incredibly disruptive to current platform economics. Traditional Hadoop Advantages: • Manages new data paradigm • Handles data at scale • Cost effective • Open source. Traditional Hadoop Had Limitations: • Batch-only architecture • Single-purpose clusters, specific data sets • Difficult to integrate with existing investments • Not enterprise-grade. [Diagram labels: Application; Storage: HDFS; Batch Processing: MapReduce]
  • 8. Modern Data Architecture emerges to unify data & processing. Modern Data Architecture: • Enable applications to have access to all your enterprise data through an efficient centralized platform • Supported by a centralized approach to governance, security and operations • Versatile enough to handle any application and dataset, no matter the size or type. [Diagram labels: Sources: clickstream, web & social, geolocation, sensor & machine, server logs, unstructured; Existing Systems: ERP, CRM, SCM; HDFS (Hadoop Distributed File System); YARN: Data Operating System; workloads: Batch, Interactive, Real-Time, Partner ISV; MPP; EDW; Analytics: data marts, applications, business analytics, visualization & dashboards]
  • 9. Modern Data Architecture emerges to unify data & processing. RedPoint Global is a Hortonworks Partner, certified on HDP and YARN. With RedPoint, your existing data analysts and database administrators can easily work with data stored in Hadoop. No new skills are required. [Diagram labels: Sources: clickstream, web & social, geolocation, sensor & machine, server logs, unstructured; Existing Systems: ERP, CRM, SCM; HDFS (Hadoop Distributed File System); YARN: Data Operating System; workloads: Batch, Interactive, Real-Time, Partner ISV; MPP; EDW; Analytics: data marts, applications, business analytics, visualization & dashboards]
  • 10. Hadoop adoption follows a predictable journey: from cost optimization to new analytic apps and, ultimately, to a “data lake”
  • 11. Hadoop Driver: Cost optimization. HDP helps you reduce costs and optimize the value associated with your EDW. Archive data off EDW: move rarely used data to Hadoop as an active archive and store more data longer. Offload costly ETL processes: free your EDW to perform high-value functions like analytics & operations, not ETL. Enrich the value of your EDW: use Hadoop to refine new data sources, such as web and machine data, for new analytical context. [Diagram labels: Sources: clickstream, web & social, geolocation, sensor & machine, server logs, unstructured; Existing Systems: ERP, CRM, SCM; Data Systems: HDP 2.2 (ELT; cold data, deeper archive & new sources), Enterprise Data Warehouse (hot), MPP, In-Memory; Analytics: data marts, business analytics, visualization & dashboards]
  • 12. Hadoop Driver: Advanced analytic applications. Single View: improve acquisition and retention. Predictive Analytics: identify your next best action. Data Discovery: uncover new findings. Financial Services: New Account Risk Screens; Trading Risk; Insurance Underwriting; Improved Customer Service; Aggregate Banking Data as a Service; Cross-sell & Upsell of Financial Products; Risk Analysis for Usage-Based Car Insurance; Identify Claims Errors for Reimbursement. Telecom: Unified Household View of the Customer; Searchable Data for NPTB Recommendations; Protect Customer Data from Employee Misuse; Analyze Call Center Contact Records; Network Infrastructure Capacity Planning; Call Detail Records (CDR) Analysis; Inferred Demographics for Improved Targeting; Proactive Maintenance on Transmission Equipment; Tiered Service for High-Value Customers. Retail: 360° View of the Customer; Supply Chain Optimization; Website Optimization for Path to Purchase; Localized, Personalized Promotions; A/B Testing for Online Advertisements; Data-Driven Pricing, Improved Loyalty Programs; Customer Segmentation; Personalized, Real-Time Offers; In-Store Shopper Behavior. Manufacturing: Supply Chain and Logistics; Optimize Warehouse Inventory Levels; Product Insight from Electronic Usage Data; Assembly Line Quality Assurance; Proactive Equipment Maintenance; Crowdsource Quality Assurance; Single View of a Product Throughout Lifecycle; Connected Car Data for Ongoing Innovation; Improve Manufacturing Yields. Healthcare: Electronic Medical Records; Monitor Patient Vitals in Real-Time; Use Genomic Data in Medical Trials; Improving Lifelong Care for Epilepsy; Rapid Stroke Detection and Intervention; Monitor Medical Supply Chain to Reduce Waste; Reduce Patient Re-Admittance Rates; Video Analysis for Surgical Decision Support; Healthcare Analytics as a Service. Oil & Gas: Unify Exploration & Production Data; Monitor Rig Safety in Real-Time; Geographic Exploration; DCA to Slow Well Decline Curves; Proactive Maintenance for Oil Field Equipment; Define Operational Set Points for Wells. Government: Single View of Entity; CBM & Autonomic Logistic Analysis; Sentiment Analysis on Program Effectiveness; Prevent Fraud, Waste and Abuse; Proactive Maintenance for Public Infrastructure; Meet Deadlines for Government Reporting.
  • 13. Hadoop Driver: Enabling the data lake. Data Lake Definition: • Centralized Architecture: multiple applications on a shared data set with consistent levels of service • Any App, Any Data: multiple applications accessing all data, affording new insights and opportunities • Unlocks ‘Systems of Insight’: advanced algorithms and applications used to derive new value and optimize existing value. Drivers: 1. Cost Optimization 2. Advanced Analytic Apps. Goal: • Centralized Architecture • Data-driven Business. [Chart: Journey to the Data Lake with Hadoop; axes: Scale, Scope; labels: Data Lake, Systems of Insight]
  • 14. Case Study: 12-month Hadoop evolution at TrueCar. 12-month execution plan (chart axis: Data Platform Capabilities): June 2013: Begin Hadoop Execution; July 2013: Hortonworks Partnership; Aug 2013: Training & Dev Begins; Nov 2013: Production Cluster, 60 Nodes, 2 PB; Dec 2013: Three Production Apps (3 total); Jan 2014: 40% Dev Staff Perficient; Feb 2014: Three More Production Apps (6 total); May 2014: IPO. 12-Month Results at TrueCar: • Six Production Hadoop Applications • Sixty nodes/2 PB data • Storage Costs/Compute Costs from $19/GB to $0.23/GB. “We addressed our data platform capabilities strategically as a pre-cursor to IPO.”
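
The cost figures on the TrueCar slide can be put in relative terms with a few lines of arithmetic. The sketch below is illustrative only: the class and method names are hypothetical, and the only inputs taken from the slide are the two per-GB figures ($19 and $0.23). It works out to roughly a 98.8% reduction, or about 83 times cheaper per GB.

```java
public class StorageCostSketch {
    /** Percentage reduction when a cost drops from before to after. */
    static double reductionPercent(double before, double after) {
        return (before - after) / before * 100.0;
    }

    /** How many times cheaper the new cost is than the old one. */
    static double costRatio(double before, double after) {
        return before / after;
    }

    public static void main(String[] args) {
        double before = 19.00; // USD per GB, as stated on the slide
        double after = 0.23;   // USD per GB, as stated on the slide
        System.out.printf("Reduction: %.1f%%%n", reductionPercent(before, after));
        System.out.printf("Ratio: %.1fx%n", costRatio(before, after));
    }
}
```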
  • 15. Hortonworks Data Platform: Hadoop for the Enterprise
  • 16. Only HDP delivers a Centralized Architecture. HDP is uniquely built around YARN serving as a data operating system that provides multi-tenant Resource Management, consistent Governance & Security and efficient Operations services across Hadoop applications. Hortonworks Data Platform, YARN Data Operating System: • A centralized architecture of consistent enterprise services for resource management, security, operations, and governance. • The versatility to support multiple applications and diverse workloads from batch to interactive to real-time, open source and commercial. Key Benefits: • Multiple applications on a shared data set with consistent levels of service: a multitenant data platform. • Provides a shared platform to enable new analytic applications. • Delivers maximum cost efficiency for cluster resource management: fewer servers, fewer nodes. [Diagram labels: Storage; YARN: Data Operating System; Resource Management; Governance; Security; Operations; Data Access: Batch, Interactive & Real-time; Existing Applications; New Analytics; Partner Applications]
  • 17. HDP delivers a completely open data platform. Hortonworks Data Platform 2.2 provides Hadoop for the Enterprise: a centralized architecture of core enterprise services, for any application and any data. Completely Open: • HDP incorporates every element required of an enterprise data platform: data storage, data access, governance, security, operations • All components are developed in open source and then rigorously tested, certified, and delivered as an integrated open source platform that’s easy to consume and use by the enterprise and ecosystem. [Diagram labels: HDFS (Hadoop Distributed File System); YARN: Data Operating System (Cluster Resource Management); Batch, Interactive & Real-Time Data Access: Apache Pig, Apache Hive, Cascading, Apache HBase, Apache Accumulo, Apache Solr, Apache Spark, Apache Storm; Governance: Apache Falcon, Apache Sqoop, Apache Flume, Apache Kafka; Security: Apache Ranger, Apache Knox, Apache Falcon; Operations: Apache Ambari, Apache Zookeeper, Apache Oozie]
  • 18. HDP: Any Data, Any Application, Anywhere. Any Application: • Deep integration with ecosystem partners to extend existing investments and skills • Broadest set of applications through the stable of YARN-Ready applications (over 70 Hortonworks Certified YARN Apps). Any Data: deploy applications fueled by clickstream, sensor, social, mobile, geo-location, server log, and other new-paradigm datasets alongside existing legacy datasets (ERP, CRM, SCM; files, emails; Internet of Things). Anywhere: implement HDP naturally across the complete range of deployment options: commodity, appliance, cloud, hybrid.
  • 19. Hortonworks supports the full application lifecycle. Hadoop usage follows a consistent lifecycle, from architecture to expansion, all with a consistent support experience. Project phases: Architecture & Development; Implementation; Production; Expansion. Most Common Support Issues by Project Phase (issues addressed by Hortonworks Support, by type, for the past year): Architecture 7%; Application Development 10%; Installation 10%; Performance 5%; Configuration 25%; Executing Jobs 20%; Cluster Administration 18%; HDP Upgrades 3%; Enhancement Requests 3%; TOTAL 100%. Hortonworks Support, Full Lifecycle Subscription Support: support through EVERY phase of adoption of your Hadoop project to ensure your success. [Chart labels: # tickets; Project 2, Project 3, ... Project N]
  • 20. A Leader in Hadoop: The Forrester Wave™: Big Data Hadoop Solutions, Q1 2014. “Hortonworks loves and lives open source innovation.” World Class Support and Services: Hortonworks' Customer Support received a maximum score and was significantly higher than both Cloudera and MapR.
  • 21. Eliminating the Challenges of Big Data Management Inside Hadoop
  • 22. Overview of RedPoint Global • Launched 2006 • Founded and staffed by industry veterans • Headquarters: Wellesley, Massachusetts • Offices in US, UK, Australia, Philippines • Global customer base • Serves most major industries. Magic Quadrant: Data Quality. Magic Quadrant: Multichannel Campaign Management. Magic Quadrant: Integrated Marketing Management.
  • 23. Andrew Brust, GigaOm Research
  • 24. New Data Straining Current Architectures. Data types: unstructured documents, emails; transactional data; server logs; sentiment, web data; geolocation; sensor, machine data; clickstream; hierarchical data; OLTP, ERP, CRM; master data. 2.8 ZB in 2013; 85% from new data types; 15x machine data by 2020; 40 ZB by 2020. Source: IDC
  • 25. Key Functions for Data Management. ETL & ELT: • Profiling, reads/writes, transformations • Single project for all jobs. Data Quality: • Cleanse data • Parsing, correction • Geo-spatial analysis. Integration & Matching: • Grouping • Fuzzy match. Master Key Management: • Create keys • Track changes • Maintain matches over time. Web Services Integration: • Consume and publish • HTTP/HTTPS protocols • XML/JSON/SOAP formats. Process Automation & Operations: • Job scheduling, monitoring, notifications • Central point of control • Metadata management.
  • 26. Overview: What is Hadoop? Hadoop 1.0: • All operations based on MapReduce • Intrinsic inconsistency of code-based solutions • Highly skilled and expensive resources needed • 3rd-party applications constrained by the need to generate code. Hadoop 2.0: • Introduction of YARN: “a general-purpose, distributed, application management framework that supersedes the classic Apache Hadoop MapReduce framework for processing data in Hadoop clusters.” • Mature applications can now operate directly on Hadoop • Reduced skill requirements and increased consistency. [Diagram: Hadoop 2.0 stack; System layer: HDFS (Hadoop Distributed File System), YARN: Data Operating System; Engine layer: Batch (MapReduce), Batch & Interactive (Tez), Real-Time (Slider), Spark, Other ISV; API layer: Stream (Storm), NoSQL (HBase, Accumulo), Cascading (Scala, Java), SQL (Hive), Scripting (Pig), Direct (Java, .NET), Other ISV]
  • 27. RedPoint Data Management on Hadoop. [Diagram labels: Parallel Section; Key / Split Analysis; Partitioning (AM / Tasks); Execution (AM / Tasks); Data I/O; YARN; MapReduce]
  • 28. RedPoint DM for Hadoop: Processing Flow. [Diagram labels: Data Management Designer; DM Execution Server; Resource Manager; Node Managers hosting the DM App Master and DM Tasks; Parallel Section; numbered steps 1 to 3 with captions: Launches DM App Master, Launches Tasks, Running DM Task]
  • 29. Benchmarks: Project Gutenberg. MapReduce: >150 lines of MR code; 6 hours of development; 6 minutes runtime; extensive optimization needed. Pig: ~50 lines of script code; 3 hours of development; 15 minutes runtime; User Defined Functions required prior to running script. RedPoint: 0 lines of code; 15 min. of development; 3 minutes runtime; no tuning or optimization required.
    Sample MapReduce (small subset of the entire code, which totals nearly 150 lines):
      public static class MapClass extends Mapper<WordOffset, Text, Text, IntWritable> {
        // Characters treated as word separators (the embedded double quote is escaped)
        private final static String delimiters = "',./<>?;:\"[]{}-=_+()&*%^#$!@`~ |«»¡¢£¤¥¦©¬®¯±¶·¿";
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();
        // Emit (word, 1) for every token in the input line
        public void map(WordOffset key, Text value, Context context) throws IOException, InterruptedException {
          String line = value.toString();
          StringTokenizer itr = new StringTokenizer(line, delimiters);
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
          }
        }
      }
    Sample Pig script without the UDF:
      SET pig.maxCombinedSplitSize 67108864
      SET pig.splitCombination true
      A = LOAD '/testdata/pg/*/*/*';
      B = FOREACH A GENERATE FLATTEN(TOKENIZE((chararray)$0)) AS word;
      C = FOREACH B GENERATE UPPER(word) AS word;
      D = GROUP C BY word;
      E = FOREACH D GENERATE COUNT(C) AS occurrences, group;
      F = ORDER E BY occurrences DESC;
      STORE F INTO '/user/cleonardi/pg/pig-count';
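
For readers without a cluster at hand, the computation that the benchmark jobs perform (a word count) can be sketched in plain Java. This is a single-process illustration, not the benchmarked code: the class name, method names, and the shortened delimiter set are assumptions made for the example, and it uses no Hadoop or Pig APIs. It follows the Pig script's steps: tokenize, upper-case, count per word, then order by count descending.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

public class WordCountSketch {
    // Hypothetical, shortened delimiter set (the slide's list is longer).
    private static final String DELIMITERS = " \t\n\r'\",./<>?;:[]{}-=_+()&*%^#$!@`~|";

    /** Tokenize every line and tally upper-cased words (map and reduce in one pass). */
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line, DELIMITERS);
            while (itr.hasMoreTokens()) {
                counts.merge(itr.nextToken().toUpperCase(), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Order entries by occurrences, highest first (like ORDER ... DESC in the Pig script). */
    static List<Map.Entry<String, Integer>> byCountDescending(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return entries;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("The quick brown fox", "the lazy dog; the end.");
        for (Map.Entry<String, Integer> e : byCountDescending(countWords(lines))) {
            System.out.println(e.getValue() + "\t" + e.getKey());
        }
    }
}
```

On a cluster the tokenizing step runs in parallel across input splits and the tallying step runs per key after a shuffle; the sketch collapses both into one in-memory map, which is only workable for small inputs.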
  • 30. RedPoint Ranks #1
  • 31. Consistent High Rankings
  • 32. Data Lake Architecture for MDM
  • 33. Demonstration Introduction: Kris Tomes, Solution Director at RedPoint
  • 34. Key Factors to Consider • Traditional data architectures are challenged • Maximize the scale & cost optimization of the Hortonworks Modern Data Architecture • Leverage your DBAs to control development / implementation / production costs and schedules • Smooth out your journey to a data lake • Speed up getting business applications into production • Insist on Any Data, Any Application, Any Environment • Do your data quality and data integration in the cluster
  • 35. Thank You & Please Visit Us at www.RedPoint.net. Jamie Keeffe, Product Marketing Manager, RedPoint Global Inc., Jamie.Keeffe@RedPoint.Net, +1 978-764-3839