SlideShare a Scribd company logo
1 of 30
Download to read offline
© Talend 2014 
1 
Starting Small and 
Scaling Big 
with Hadoop 
November 20, 2014
© Talend 2014 
2 
Your Speakers Today 
Jim Walker 
Director, Product Marketing 
Julien Sauvage 
Director, Product Marketing
Page 3 
© Hortonworks Inc. 2011 – 2014. All Rights Reserved 
2013 
Digital universe 2.3 Zettabytes 
1 Zettabyte (ZB) = 1 million Petabytes (PB); Sources: IDC and IDG Enterprise 
85% of growth from new types of data with machine-generated data increasing 15x 
2020 
Digital universe 40 Zettabytes 
& Hadoop Market $50B 
Analysts consensus estimates enterprise data growth of year over year through 2020 
50x
Page 4 
© Hortonworks Inc. 2011 – 2014. All Rights Reserved 
A shift from reactive to proactive interactions 
HDP and Hadoop allow organizations to shift interactions from… 
Reactive Post Transaction 
Proactive Pre Decision 
…to Real-time Personalization 
From static branding 
…to repair before break 
From break then fix 
…to Designer Medicine 
From mass treatment 
…to Automated Algorithms 
From Educated Investing 
…to 1x1 Targeting 
From mass branding 
A shift in Advertising 
A shift in Financial Services 
A shift in Healthcare 
A shift in Retail 
A shift in Telco
Page 5 
© Hortonworks Inc. 2011 – 2014. All Rights Reserved 
HDP Realized Cost Savings with EDW Optimization 
Archive Data away from EDW 
•Move cold or rarely used data to Hadoop as active archive 
•Store more of data longer 
Offload costly ETL process 
•Free your EDW to perform high-value functions like analytics & operations, not ETL. 
•Use Hadoop for advanced ETL 
Optimize the value of your EDW 
•Use Hadoop to refine new data sources, such as web and machine data for new analytical context 
ANALYTICS 
DATA SYSTEMS 
Data Marts 
Business Analytics 
Visualization 
& Dashboards 
Systems of Record 
RDBMS 
ERP 
CRM 
Other 
Clickstream 
Web & Social 
Geolocation 
Sensor & Machine 
Server Logs 
Unstructured 
NEW SOURCES 
HDP 2.2 
ELT 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
N 
Cold Data, 
Deeper Archive & New Sources 
Enterprise Data Warehouse 
Hot 
Hadoop Helps you optimize and reduce costs associated with your EDW
Page 6 
© Hortonworks Inc. 2011 – 2014. All Rights Reserved 
Realize dramatic savings for cost of storage 
Cost Efficiencies Reduce costs associated with expensive archive systems 
•Utilize existing relationships with hardware vendors 
•Open Source Software 
Active Archive Provide access to archived data not just collect dust 
MPP 
SAN 
Engineered System 
NAS 
HADOOP 
Cloud Storage 
$0 
$20,000 
$40,000 
$60,000 
$80,000 
$180,000 
Fully-loaded Cost Per Raw TB of Data (Min–Max Cost) 
Hadoop Enables Scalable Compute & Storage at a Compelling Cost Structure 
Storage Costs/Compute Costs from $19/GB to $0.23/GB
Page 7 
© Hortonworks Inc. 2011 – 2014. All Rights Reserved 
Unlock New Applications from New Types of Data 
INDUSTRY 
USE CASE 
Sentiment 
& Web 
Clickstream 
& Behavior 
Machine & Sensor 
Geographic 
Server Logs 
Structured & 
Unstructured 
Financial Services 
New Account Risk Screens 
✔ 
✔ 
Trading Risk 
✔ 
Insurance Underwriting 
✔ 
✔ 
✔ 
Telecom 
Call Detail Records (CDR) 
✔ 
✔ 
Infrastructure Investment 
✔ 
✔ 
Real-time Bandwidth Allocation 
✔ 
✔ 
✔ 
Retail 
360° View of the Customer 
✔ 
✔ 
✔ 
Localized, Personalized Promotions 
✔ 
Website Optimization 
✔ 
Manufacturing 
Supply Chain and Logistics 
✔ 
Assembly Line Quality Assurance 
✔ 
Crowd-sourced Quality Assurance 
✔ 
Healthcare 
Use Genomic Data in Medial Trials 
✔ 
✔ 
✔ 
Monitor Patient Vitals in Real-Time 
Pharmaceuticals 
Recruit and Retain Patients for Drug Trials 
✔ 
✔ 
Improve Prescription Adherence 
✔ 
✔ 
✔ 
✔ 
Oil & Gas 
Unify Exploration & Production Data 
✔ 
✔ 
✔ 
✔ 
Monitor Rig Safety in Real-Time 
✔ 
✔ 
✔ 
Government 
ETL Offload/Federal Budgetary Pressures 
✔ 
✔ 
Sentiment Analysis for Government Programs 
✔
Page 8 
© Hortonworks Inc. 2011 – 2014. All Rights Reserved 
End Game: Data Lake - An architectural shift 
SCALE 
SCOPE 
Unlocking the Data Lake 
RDBMS 
MPP 
EDW 
Data Lake Enabled by YARN 
•Single data repository, shared infrastructure 
•Multiple biz apps accessing all the data 
•Enable a shift from reactive to proactive interactions 
•Gain new insight across the entire enterprise 
New Analytic Apps or IT Optimization 
HDP 2.1 
Governance & Integration 
Security 
Operations 
Data Access 
Data Management 
YARN
Page 9 
© Hortonworks Inc. 2011 – 2014. All Rights Reserved 
Enterprise Goals for the Modern Data Architecture 
•Consolidate siloed data sets structured and unstructured 
•Central data set on a single cluster 
•Multiple workloads across batch interactive and real time 
•Central services for security, governance and operation 
•Preserve existing investment in current tools and platforms 
•Single view of the customer, product, supply chain 
APPLICATIONS 
DATA SYSTEM 
Business Analytics 
Custom Applications 
Packaged 
Applications 
RDBMS 
EDW 
MPP 
YARN: Data Operating System 
1 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
N 
Interactive 
Real-Time 
Batch 
CRM 
ERP 
Other 
1 
° 
° 
° 
° 
° 
° 
HDFS (Hadoop Distributed File System) 
SOURCES 
EXISTING Systems 
Clickstream 
Web &Social 
Geolocation 
Sensor & Machine 
Server Logs 
Unstructured
Page 10 
© Hortonworks Inc. 2011 – 2014. All Rights Reserved 
HDP delivers a comprehensive data management platform 
Hortonworks Data Platform 2.2 
YARN: Data Operating System 
(Cluster Resource Management) 
1 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
Script Pig 
SQL 
Hive 
Tez 
Tez 
Java Scala Cascading 
Tez 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
° 
Others ISV Engines 
HDFS (Hadoop Distributed File System) 
Stream 
Storm 
Search 
Solr 
NoSQL HBase Accumulo 
Slider 
Slider 
SECURITY 
GOVERNANCE 
OPERATIONS 
BATCH, INTERACTIVE & REAL-TIME DATA ACCESS 
In-Memory Spark 
Provision, Manage & Monitor 
Ambari 
Zookeeper 
Scheduling 
Oozie 
Data Workflow, Lifecycle & Governance 
Falcon 
Sqoop 
Flume 
Kafka 
NFS 
WebHDFS 
Authentication Authorization Accounting Data Protection Storage: HDFS Resources: YARN Access: Hive, … Pipeline: Falcon Cluster: Knox Cluster: Ranger 
Deployment Choice 
Linux 
Windows 
On-Premises 
Cloud 
YARN is the architectural center of HDP 
Enables batch, interactive and real-time workloads 
Provides comprehensive enterprise capabilities 
The widest range of deployment options 
Delivered Completely in the OPEN
Page 11 
© Hortonworks Inc. 2011 – 2014. All Rights Reserved 
OPERATIONAL TOOLS 
DEV & DATA TOOLS 
INFRASTRUCTURE 
HDP and Talend in the Modern Data Architecture 
SOURCES 
EXISTING Systems 
Clickstream 
Web &Social 
Geolocation 
Sensor & Machine 
Server Logs 
Unstructured 
DATA SYSTEM 
RDBMS 
EDW 
APPLICATIONS 
BusinessObjects BI 
HDP 2.1 
Governance & Integration 
Security 
Operations 
Data Access 
Data Management 
YARN 
Hadoop 2.0, YARN 
Data Quality 
Pig, Hive, 
ETL ELT 
HBase, NoSQL 
Deep Partnerships 
Hortonworks engages in deep engineered relationships with the leaders in the data center, such as Microsoft, Teradata, Redhat, HP, SAS & SAP 
Broad Partnerships 
Over 600 partners work with us to certify their applications to work with Hadoop so they can extend big data to their users
© Talend 2014 
12 
Connecting the Data-Driven Enterprise
© Talend 2014 
13 
The Talend Platform
© Talend 2014 
14 
Still Hand-Coding Data Integration? 
Hand-coding 
Talend Enterprise 
•Unproductive 
•Need specialized skills 
•Hard to maintain 
•Limited support 
•800+ drag-n-drop components 
•Generates optimized code 
•Collaboration & management 
•Gold support (SLAs)
© Talend 2014 
15 
Encumbered with Legacy ETL? 
Legacy ETL 
Talend Enterprise 
•Proprietary engine 
•Hard to scale Big Data 
•Expensive 
•Open 
•Generates native code 
•Low TCO
© Talend 2014 16 
Next big 
thing 
SQL 
ELT 
DW appliance 
Future-Proof Architecture 
ETL 
Day-to-day 
integration 
JAVA 
Hadoop 
Highly Scalable 
MapReduce 
CAMEL 
Message 
transform-ation 
CAMEL
© Talend 2014 
17 
ONE cluster to deploy 
ONE cluster to manage 
ONE cluster to monitor 
ONE cluster to scale 
ONE cluster to update 
ONE cluster to pay for! 
And it will be 100x faster in 2 years 
Infinite Scale
© Talend 2014 
18 
Unlock New Applications from New Types of Data 
INDUSTRY 
USE CASE 
Sentiment 
& Web 
Clickstream 
& Behavior 
Machine & Sensor 
Geographic 
Server Logs 
Structured & 
Unstructured 
Financial Services 
New Account Risk Screens 
✔ 
✔ 
Trading Risk 
✔ 
Insurance Underwriting 
✔ 
✔ 
✔ 
Telecom 
Call Detail Records (CDR) 
✔ 
✔ 
Infrastructure Investment 
✔ 
✔ 
Real-time Bandwidth Allocation 
✔ 
✔ 
✔ 
Retail 
360° View of the Customer 
✔ 
✔ 
✔ 
Localized, Personalized Promotions 
✔ 
Website Optimization 
✔ 
Manufacturing 
Supply Chain and Logistics 
✔ 
Assembly Line Quality Assurance 
✔ 
Crowd-sourced Quality Assurance 
✔ 
Healthcare 
Use Genomic Data in Medial Trials 
✔ 
✔ 
✔ 
Monitor Patient Vitals in Real-Time 
Pharmaceuticals 
Recruit and Retain Patients for Drug Trials 
✔ 
✔ 
Improve Prescription Adherence 
✔ 
✔ 
✔ 
✔ 
Oil & Gas 
Unify Exploration & Production Data 
✔ 
✔ 
✔ 
✔ 
Monitor Rig Safety in Real-Time 
✔ 
✔ 
✔ 
Government 
ETL Offload/Federal Budgetary Pressures 
✔ 
✔ 
Sentiment Analysis for Government Programs 
✔ 
integration 
jobs 
+ 
++
© Talend 2014 
19 
100x 
performance increase 
< 1 sec 
response 
Address new use cases 
(last minute defense, dynamic pricing, real-time fraud detection, etc.) 
Simplify Real-Time Big Data 
New components for streaming data
© Talend 2014 
20 
The Talend Solution 
Scalable 
•Generates native code 
•Future-proof 
•Built-in data quality 
•More productive 
•Open source 
•Innovative 
Agile 
•Open source platform 
•Learn once 
•Expand many times 
Easy 
•Subscription pricing 
•Per developer 
•Predictable cost 
Lowest TCO 
The ease of use of the Talend platform allows us to deliver
© Talend 2014 21 
The Three Drivers of Success 
Product Innovation Market Adoption Industry Recognition 
Customers 
Community 
Partners 
“Visionary” 
“Leader” 
Multi-award winner 
Big Data 
Cloud
Page 22 
© Hortonworks Inc. 2011 – 2014. All Rights Reserved 
Customer Case Study 
Product Inventory and Pricing
© Talend 2014 
23 
The Old Way to Do Forecasting 
Product category 
“HALLOWEEN”
© Talend 2014 
24 
Data Explosion in Size 
Multiple SKUs 
Multiple stores 
Product 2 
Product 1 
Product 3 
Halloween mask 
Halloween candies 
Pumpkin 
10,000’s 
1, 000’s 
X
© Talend 2014 
25 
Need for a Modern Architecture 
data at rest 
DAO 
Cassandra 
OLTP 
Hadoop 
EDW 
data in motion 
BI 
Viz & 
Analytics 
Graphical  Generates code  Runs on Hadoop
© Talend 2014 
26 
A New EDW “Eco” System 
Enterprise Intelligence 
& 
Advanced Analytics 
SSAS 
Enterprise 
Data 
Warehouse 
Advanced Analytics 
Platform 
Data Refinery & Ingest Engine 
Fast Data Cache
Page 27 
© Hortonworks Inc. 2011 – 2014. All Rights Reserved 
Talend + Hortonworks = Open = Awesome! 
•Pure open source governed cluster 
•Don’t need to recode or reformat data 
•No vendor lock-in 
•Subscription models 
•Most recent releases of Apache projects 
•We are always aligned and up to date
Page 28 
© Hortonworks Inc. 2011 – 2014. All Rights Reserved 
The Forrester Wave™ 
Big Data Hadoop Solutions Q1 2014 
“Hortonworks loves and lives open source innovation” 
World Class Support and Services. Hortonworks' Customer Support received a maximum score and was significantly higher than both Cloudera and MapR 
A Leader in Hadoop
© Talend 2014 
29 
Questions? 
Jim Walker 
@jaymce 
Julien Sauvage 
@sauvageju
© Talend 2014 
30 
Check Out Our Talend + Hortonworks Sandbox! 
http://www.talend.com

More Related Content

What's hot

Data Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationData Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationHortonworks
 
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache HadoopHortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache HadoopHortonworks
 
Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture Hortonworks
 
Hadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop SummitHadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop SummitDataWorks Summit
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopHortonworks
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalHortonworks
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Hortonworks
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopHortonworks
 
Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Hortonworks
 
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataHortonworks
 
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3Hortonworks
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceHortonworks
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHortonworks
 
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHortonworks
 
Hortonworks and HP Vertica Webinar
Hortonworks and HP Vertica WebinarHortonworks and HP Vertica Webinar
Hortonworks and HP Vertica WebinarHortonworks
 
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache HiveDiscover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache HiveHortonworks
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramHortonworks
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...Hortonworks
 
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...Hortonworks
 

What's hot (20)

Data Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationData Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop Implementation
 
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache HadoopHortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
 
Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture
 
Hadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop SummitHadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop Summit
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - Webinar
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.final
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
 
Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]
 
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
 
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar Slides
 
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
 
Hortonworks and HP Vertica Webinar
Hortonworks and HP Vertica WebinarHortonworks and HP Vertica Webinar
Hortonworks and HP Vertica Webinar
 
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache HiveDiscover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready Program
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
 
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
 

Viewers also liked

Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.
Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.
Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.OW2
 
Implementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceImplementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceHortonworks
 
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksPowering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksHortonworks
 
ETL using Big Data Talend
ETL using Big Data Talend  ETL using Big Data Talend
ETL using Big Data Talend Edureka!
 
Talend Data Preparation Overview
Talend Data Preparation OverviewTalend Data Preparation Overview
Talend Data Preparation OverviewJean-Michel Franco
 
Big Data Expo 2015 - Talend Delivering Real Time
Big Data Expo 2015 - Talend Delivering Real TimeBig Data Expo 2015 - Talend Delivering Real Time
Big Data Expo 2015 - Talend Delivering Real TimeBigDataExpo
 
Talend For Big Data : Secret Key to Hadoop
Talend For Big Data  : Secret Key to HadoopTalend For Big Data  : Secret Key to Hadoop
Talend For Big Data : Secret Key to HadoopEdureka!
 
Talend Open Studio for Big Data (powered by Apache Hadoop)
Talend Open Studio for Big Data (powered by Apache Hadoop)Talend Open Studio for Big Data (powered by Apache Hadoop)
Talend Open Studio for Big Data (powered by Apache Hadoop)Modern Data Stack France
 
Simplifying Big Data ETL with Talend
Simplifying Big Data ETL with TalendSimplifying Big Data ETL with Talend
Simplifying Big Data ETL with TalendEdureka!
 
Unleashing the value of metadata with Talend
Unleashing the value of metadata with Talend Unleashing the value of metadata with Talend
Unleashing the value of metadata with Talend Jean-Michel Franco
 
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Hortonworks
 
3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems
3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems
3 CTOs Discuss the Shift to Next-Gen Analytic EcosystemsHortonworks
 
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks
 
Hortonworks and Voltage Security webinar
Hortonworks and Voltage Security webinarHortonworks and Voltage Security webinar
Hortonworks and Voltage Security webinarHortonworks
 
Hortonworks, Novetta and Noble Energy Webinar
Hortonworks, Novetta and Noble Energy Webinar Hortonworks, Novetta and Noble Energy Webinar
Hortonworks, Novetta and Noble Energy Webinar Hortonworks
 
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoptionHortonworks
 
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and HortonworksHow to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and HortonworksHortonworks
 
Hadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data ProcessingHadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data ProcessingHortonworks
 
Adoption de Hadoop : des Possibilités Illimitées - Hortonworks and Talend
Adoption de Hadoop : des Possibilités Illimitées - Hortonworks and TalendAdoption de Hadoop : des Possibilités Illimitées - Hortonworks and Talend
Adoption de Hadoop : des Possibilités Illimitées - Hortonworks and TalendHortonworks
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks
 

Viewers also liked (20)

Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.
Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.
Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.
 
Implementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceImplementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data Governance
 
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksPowering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
 
ETL using Big Data Talend
ETL using Big Data Talend  ETL using Big Data Talend
ETL using Big Data Talend
 
Talend Data Preparation Overview
Talend Data Preparation OverviewTalend Data Preparation Overview
Talend Data Preparation Overview
 
Big Data Expo 2015 - Talend Delivering Real Time
Big Data Expo 2015 - Talend Delivering Real TimeBig Data Expo 2015 - Talend Delivering Real Time
Big Data Expo 2015 - Talend Delivering Real Time
 
Talend For Big Data : Secret Key to Hadoop
Talend For Big Data  : Secret Key to HadoopTalend For Big Data  : Secret Key to Hadoop
Talend For Big Data : Secret Key to Hadoop
 
Talend Open Studio for Big Data (powered by Apache Hadoop)
Talend Open Studio for Big Data (powered by Apache Hadoop)Talend Open Studio for Big Data (powered by Apache Hadoop)
Talend Open Studio for Big Data (powered by Apache Hadoop)
 
Simplifying Big Data ETL with Talend
Simplifying Big Data ETL with TalendSimplifying Big Data ETL with Talend
Simplifying Big Data ETL with Talend
 
Unleashing the value of metadata with Talend
Unleashing the value of metadata with Talend Unleashing the value of metadata with Talend
Unleashing the value of metadata with Talend
 
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
 
3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems
3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems
3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems
 
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
 
Hortonworks and Voltage Security webinar
Hortonworks and Voltage Security webinarHortonworks and Voltage Security webinar
Hortonworks and Voltage Security webinar
 
Hortonworks, Novetta and Noble Energy Webinar
Hortonworks, Novetta and Noble Energy Webinar Hortonworks, Novetta and Noble Energy Webinar
Hortonworks, Novetta and Noble Energy Webinar
 
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
 
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and HortonworksHow to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
 
Hadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data ProcessingHadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data Processing
 
Adoption de Hadoop : des Possibilités Illimitées - Hortonworks and Talend
Adoption de Hadoop : des Possibilités Illimitées - Hortonworks and TalendAdoption de Hadoop : des Possibilités Illimitées - Hortonworks and Talend
Adoption de Hadoop : des Possibilités Illimitées - Hortonworks and Talend
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
 

Similar to Starting Small and Scaling Big with Hadoop and Talend

Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataWANdisco Plc
 
Hortonworks & Bilot Data Driven Transformations with Hadoop
Hortonworks & Bilot Data Driven Transformations with HadoopHortonworks & Bilot Data Driven Transformations with Hadoop
Hortonworks & Bilot Data Driven Transformations with HadoopMats Johansson
 
IoT Crash Course Hadoop Summit SJ
IoT Crash Course Hadoop Summit SJIoT Crash Course Hadoop Summit SJ
IoT Crash Course Hadoop Summit SJDaniel Madrigal
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopHortonworks
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopHortonworks
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to HadoopPOSSCON
 
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...Hortonworks
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Barijaxconf
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalHortonworks
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalHortonworks
 
How Experian increased insights with Hadoop
How Experian increased insights with HadoopHow Experian increased insights with Hadoop
How Experian increased insights with HadoopPrecisely
 
Hortonworks - IBM Cognitive - The Future of Data Science
Hortonworks - IBM Cognitive - The Future of Data ScienceHortonworks - IBM Cognitive - The Future of Data Science
Hortonworks - IBM Cognitive - The Future of Data ScienceThiago Santiago
 
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...Hortonworks
 
Exclusive Verizon Employee Webinar: Getting More From Your CDR Data
Exclusive Verizon Employee Webinar: Getting More From Your CDR DataExclusive Verizon Employee Webinar: Getting More From Your CDR Data
Exclusive Verizon Employee Webinar: Getting More From Your CDR DataPentaho
 
Hortonworks Hadoop @ Oslo Hadoop User Group
Hortonworks Hadoop @ Oslo Hadoop User GroupHortonworks Hadoop @ Oslo Hadoop User Group
Hortonworks Hadoop @ Oslo Hadoop User GroupMats Johansson
 
Enabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseEnabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseHortonworks
 
Hadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHortonworks
 

Similar to Starting Small and Scaling Big with Hadoop and Talend (20)

Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big Data
 
Hortonworks & Bilot Data Driven Transformations with Hadoop
Hortonworks & Bilot Data Driven Transformations with HadoopHortonworks & Bilot Data Driven Transformations with Hadoop
Hortonworks & Bilot Data Driven Transformations with Hadoop
 
Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks
 
IoT Crash Course Hadoop Summit SJ
IoT Crash Course Hadoop Summit SJIoT Crash Course Hadoop Summit SJ
IoT Crash Course Hadoop Summit SJ
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data ...
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptx
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_final
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_final
 
How Experian increased insights with Hadoop
How Experian increased insights with HadoopHow Experian increased insights with Hadoop
How Experian increased insights with Hadoop
 
Hortonworks - IBM Cognitive - The Future of Data Science
Hortonworks - IBM Cognitive - The Future of Data ScienceHortonworks - IBM Cognitive - The Future of Data Science
Hortonworks - IBM Cognitive - The Future of Data Science
 
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
 
Exclusive Verizon Employee Webinar: Getting More From Your CDR Data
Exclusive Verizon Employee Webinar: Getting More From Your CDR DataExclusive Verizon Employee Webinar: Getting More From Your CDR Data
Exclusive Verizon Employee Webinar: Getting More From Your CDR Data
 
Hortonworks Hadoop @ Oslo Hadoop User Group
Hortonworks Hadoop @ Oslo Hadoop User GroupHortonworks Hadoop @ Oslo Hadoop User Group
Hortonworks Hadoop @ Oslo Hadoop User Group
 
Meetup oslo hortonworks HDP
Meetup oslo hortonworks HDPMeetup oslo hortonworks HDP
Meetup oslo hortonworks HDP
 
Enabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseEnabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical Enterprise
 
Hadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - Jaspersoft
 

More from Hortonworks

Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks
 
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyHortonworks
 
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakGetting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakHortonworks
 
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsHortonworks
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysHortonworks
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's NewHortonworks
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerHortonworks
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsHortonworks
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeHortonworks
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidHortonworks
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleHortonworks
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATAHortonworks
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Hortonworks
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseHortonworks
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseHortonworks
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationHortonworks
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementHortonworks
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHortonworks
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCHortonworks
 

More from Hortonworks (20)

Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
 
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
 
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakGetting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with Cloudbreak
 
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log Events
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's New
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data Landscape
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache Druid
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at Scale
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
 

Recently uploaded

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 

Recently uploaded (20)

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 

Starting Small and Scaling Big with Hadoop and Talend

  • 1. © Talend 2014 1 Starting Small and Scaling Big with Hadoop November 20, 2014
  • 2. © Talend 2014 2 Your Speakers Today Jim Walker Director, Product Marketing Julien Sauvage Director, Product Marketing
  • 3. Page 3 © Hortonworks Inc. 2011 – 2014. All Rights Reserved 2013 Digital universe 2.3 Zettabytes 1 Zettabyte (ZB) = 1 million Petabytes (PB); Sources: IDC and IDG Enterprise 85% of growth from new types of data with machine-generated data increasing 15x 2020 Digital universe 40 Zettabytes & Hadoop Market $50B Analysts consensus estimates enterprise data growth of year over year through 2020 50x
  • 4. Page 4 © Hortonworks Inc. 2011 – 2014. All Rights Reserved A shift from reactive to proactive interactions HDP and Hadoop allow organizations to shift interactions from… Reactive Post Transaction Proactive Pre Decision …to Real-time Personalization From static branding …to repair before break From break then fix …to Designer Medicine From mass treatment …to Automated Algorithms From Educated Investing …to 1x1 Targeting From mass branding A shift in Advertising A shift in Financial Services A shift in Healthcare A shift in Retail A shift in Telco
  • 5. Page 5 © Hortonworks Inc. 2011 – 2014. All Rights Reserved HDP Realized Cost Savings with EDW Optimization Archive Data away from EDW •Move cold or rarely used data to Hadoop as active archive •Store more of data longer Offload costly ETL process •Free your EDW to perform high-value functions like analytics & operations, not ETL. •Use Hadoop for advanced ETL Optimize the value of your EDW •Use Hadoop to refine new data sources, such as web and machine data for new analytical context ANALYTICS DATA SYSTEMS Data Marts Business Analytics Visualization & Dashboards Systems of Record RDBMS ERP CRM Other Clickstream Web & Social Geolocation Sensor & Machine Server Logs Unstructured NEW SOURCES HDP 2.2 ELT ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° N Cold Data, Deeper Archive & New Sources Enterprise Data Warehouse Hot Hadoop Helps you optimize and reduce costs associated with your EDW
  • 6. Page 6 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Realize dramatic savings for cost of storage Cost Efficiencies Reduce costs associated with expensive archive systems •Utilize existing relationships with hardware vendors •Open Source Software Active Archive Provide access to archived data not just collect dust MPP SAN Engineered System NAS HADOOP Cloud Storage $0 $20,000 $40,000 $60,000 $80,000 $180,000 Fully-loaded Cost Per Raw TB of Data (Min–Max Cost) Hadoop Enables Scalable Compute & Storage at a Compelling Cost Structure Storage Costs/Compute Costs from $19/GB to $0.23/GB
  • 7. Page 7 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Unlock New Applications from New Types of Data INDUSTRY USE CASE Sentiment & Web Clickstream & Behavior Machine & Sensor Geographic Server Logs Structured & Unstructured Financial Services New Account Risk Screens ✔ ✔ Trading Risk ✔ Insurance Underwriting ✔ ✔ ✔ Telecom Call Detail Records (CDR) ✔ ✔ Infrastructure Investment ✔ ✔ Real-time Bandwidth Allocation ✔ ✔ ✔ Retail 360° View of the Customer ✔ ✔ ✔ Localized, Personalized Promotions ✔ Website Optimization ✔ Manufacturing Supply Chain and Logistics ✔ Assembly Line Quality Assurance ✔ Crowd-sourced Quality Assurance ✔ Healthcare Use Genomic Data in Medial Trials ✔ ✔ ✔ Monitor Patient Vitals in Real-Time Pharmaceuticals Recruit and Retain Patients for Drug Trials ✔ ✔ Improve Prescription Adherence ✔ ✔ ✔ ✔ Oil & Gas Unify Exploration & Production Data ✔ ✔ ✔ ✔ Monitor Rig Safety in Real-Time ✔ ✔ ✔ Government ETL Offload/Federal Budgetary Pressures ✔ ✔ Sentiment Analysis for Government Programs ✔
  • 8. Page 8 © Hortonworks Inc. 2011 – 2014. All Rights Reserved End Game: Data Lake - An architectural shift SCALE SCOPE Unlocking the Data Lake RDBMS MPP EDW Data Lake Enabled by YARN •Single data repository, shared infrastructure •Multiple biz apps accessing all the data •Enable a shift from reactive to proactive interactions •Gain new insight across the entire enterprise New Analytic Apps or IT Optimization HDP 2.1 Governance & Integration Security Operations Data Access Data Management YARN
  • 9. Page 9 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Enterprise Goals for the Modern Data Architecture •Consolidate siloed data sets structured and unstructured •Central data set on a single cluster •Multiple workloads across batch interactive and real time •Central services for security, governance and operation •Preserve existing investment in current tools and platforms •Single view of the customer, product, supply chain APPLICATIONS DATA SYSTEM Business Analytics Custom Applications Packaged Applications RDBMS EDW MPP YARN: Data Operating System 1 ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° N Interactive Real-Time Batch CRM ERP Other 1 ° ° ° ° ° ° HDFS (Hadoop Distributed File System) SOURCES EXISTING Systems Clickstream Web &Social Geolocation Sensor & Machine Server Logs Unstructured
  • 10. Page 10 © Hortonworks Inc. 2011 – 2014. All Rights Reserved HDP delivers a comprehensive data management platform Hortonworks Data Platform 2.2 YARN: Data Operating System (Cluster Resource Management) 1 ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° Script Pig SQL Hive Tez Tez Java Scala Cascading Tez ° ° ° ° ° ° ° ° ° ° ° ° ° ° Others ISV Engines HDFS (Hadoop Distributed File System) Stream Storm Search Solr NoSQL HBase Accumulo Slider Slider SECURITY GOVERNANCE OPERATIONS BATCH, INTERACTIVE & REAL-TIME DATA ACCESS In-Memory Spark Provision, Manage & Monitor Ambari Zookeeper Scheduling Oozie Data Workflow, Lifecycle & Governance Falcon Sqoop Flume Kafka NFS WebHDFS Authentication Authorization Accounting Data Protection Storage: HDFS Resources: YARN Access: Hive, … Pipeline: Falcon Cluster: Knox Cluster: Ranger Deployment Choice Linux Windows On-Premises Cloud YARN is the architectural center of HDP Enables batch, interactive and real-time workloads Provides comprehensive enterprise capabilities The widest range of deployment options Delivered Completely in the OPEN
  • 11. Page 11 © Hortonworks Inc. 2011 – 2014. All Rights Reserved OPERATIONAL TOOLS DEV & DATA TOOLS INFRASTRUCTURE HDP and Talend in the Modern Data Architecture SOURCES EXISTING Systems Clickstream Web &Social Geolocation Sensor & Machine Server Logs Unstructured DATA SYSTEM RDBMS EDW APPLICATIONS BusinessObjects BI HDP 2.1 Governance & Integration Security Operations Data Access Data Management YARN Hadoop 2.0, YARN Data Quality Pig, Hive, ETL ELT HBase, NoSQL Deep Partnerships Hortonworks engages in deep engineered relationships with the leaders in the data center, such as Microsoft, Teradata, Redhat, HP, SAS & SAP Broad Partnerships Over 600 partners work with us to certify their applications to work with Hadoop so they can extend big data to their users
  • 12. © Talend 2014 12 Connecting the Data-Driven Enterprise
  • 13. © Talend 2014 13 The Talend Platform
  • 14. © Talend 2014 14 Still Hand-Coding Data Integration? Hand-coding Talend Enterprise •Unproductive •Need specialized skills •Hard to maintain •Limited support •800+ drag-n-drop components •Generates optimized code •Collaboration & management •Gold support (SLAs)
  • 15. © Talend 2014 15 Encumbered with Legacy ETL? Legacy ETL Talend Enterprise •Proprietary engine •Hard to scale Big Data •Expensive •Open •Generates native code •Low TCO
  • 16. © Talend 2014 16 Next big thing SQL ELT DW appliance Future-Proof Architecture ETL Day-to-day integration JAVA Hadoop Highly Scalable MapReduce CAMEL Message transform-ation CAMEL
  • 17. © Talend 2014 17 ONE cluster to deploy ONE cluster to manage ONE cluster to monitor ONE cluster to scale ONE cluster to update ONE cluster to pay for! And it will be 100x faster in 2 years Infinite Scale
  • 18. © Talend 2014 18 Unlock New Applications from New Types of Data INDUSTRY USE CASE Sentiment & Web Clickstream & Behavior Machine & Sensor Geographic Server Logs Structured & Unstructured Financial Services New Account Risk Screens ✔ ✔ Trading Risk ✔ Insurance Underwriting ✔ ✔ ✔ Telecom Call Detail Records (CDR) ✔ ✔ Infrastructure Investment ✔ ✔ Real-time Bandwidth Allocation ✔ ✔ ✔ Retail 360° View of the Customer ✔ ✔ ✔ Localized, Personalized Promotions ✔ Website Optimization ✔ Manufacturing Supply Chain and Logistics ✔ Assembly Line Quality Assurance ✔ Crowd-sourced Quality Assurance ✔ Healthcare Use Genomic Data in Medial Trials ✔ ✔ ✔ Monitor Patient Vitals in Real-Time Pharmaceuticals Recruit and Retain Patients for Drug Trials ✔ ✔ Improve Prescription Adherence ✔ ✔ ✔ ✔ Oil & Gas Unify Exploration & Production Data ✔ ✔ ✔ ✔ Monitor Rig Safety in Real-Time ✔ ✔ ✔ Government ETL Offload/Federal Budgetary Pressures ✔ ✔ Sentiment Analysis for Government Programs ✔ integration jobs + ++
  • 19. © Talend 2014 19 100x performance increase < 1 sec response Address new use cases (last minute defense, dynamic pricing, real-time fraud detection, etc.) Simplify Real-Time Big Data New components for streaming data
  • 20. © Talend 2014 20 The Talend Solution Scalable •Generates native code •Future-proof •Built-in data quality •More productive •Open source •Innovative Agile •Open source platform •Learn once •Expand many times Easy •Subscription pricing •Per developer •Predictable cost Lowest TCO The ease of use of the Talend platform allows us to deliver
  • 21. © Talend 2014 21 The Three Drivers of Success Product Innovation Market Adoption Industry Recognition Customers Community Partners “Visionary” “Leader” Multi-award winner Big Data Cloud
  • 22. Page 22 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Customer Case Study Product Inventory and Pricing
  • 23. © Talend 2014 23 The Old Way to Do Forecasting Product category “HALLOWEEN”
  • 24. © Talend 2014 24 Data Explosion in Size Multiple SKUs Multiple stores Product 2 Product 1 Product 3 Halloween mask Halloween candies Pumpkin 10,000’s 1, 000’s X
  • 25. © Talend 2014 25 Need for a Modern Architecture data at rest DAO Cassandra OLTP Hadoop EDW data in motion BI Viz & Analytics Graphical  Generates code  Runs on Hadoop
  • 26. © Talend 2014 26 A New EDW “Eco” System Enterprise Intelligence & Advanced Analytics SSAS Enterprise Data Warehouse Advanced Analytics Platform Data Refinery & Ingest Engine Fast Data Cache
  • 27. Page 27 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Talend + Hortonworks = Open = Awesome! •Pure open source governed cluster •Don’t need to recode or reformat data •No vendor lock-in •Subscription models •Most recent releases of Apache projects •We are always aligned and up to date
  • 28. Page 28 © Hortonworks Inc. 2011 – 2014. All Rights Reserved The Forrester Wave™ Big Data Hadoop Solutions Q1 2014 “Hortonworks loves and lives open source innovation” World Class Support and Services. Hortonworks' Customer Support received a maximum score and was significantly higher than both Cloudera and MapR A Leader in Hadoop
  • 29. © Talend 2014 29 Questions? Jim Walker @jaymce Julien Sauvage @sauvageju
  • 30. © Talend 2014 30 Check Out Our Talend + Hortonworks Sandbox! http://www.talend.com