Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Introduction to Hadoop
Eric Mizell – Director, Solution Engineering
Hortonworks. We do Hadoop.
Quick Audience Poll
Which best describes how your org is using Hadoop?
A. We’re using Hadoop
B. We’re in the process of getting Hadoop integrated
C. We don’t have Hadoop installed
D. What’s Hadoop?
E. I don’t know
Big Data, Hadoop, and the Modern Data Architecture
Big Data Explosion
Big Data Market Trends & Projections
•  20%: margin by which organizations leveraging modern information management systems outperform peers by 2015
•  1 Zettabyte (ZB) = 1 billion TB
•  15x: growth rate of machine-generated data by 2020
•  The US has 1/3 of the world’s data
•  Big Data is 1 of 5 US GDP game changers: $325 billion incremental annual GDP from big data analytics in retail and manufacturing by 2020
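The zettabyte figure above can be checked with one line of arithmetic. This is a minimal sketch that assumes decimal (SI) units, where 1 TB is 10^12 bytes and 1 ZB is 10^21 bytes; the constant names are illustrative and do not come from the deck.

```python
# Decimal (SI) byte units are assumed here; binary units (TiB, ZiB)
# would give a ratio of 2**30, slightly more than one billion.
BYTES_PER_TB = 10**12
BYTES_PER_ZB = 10**21

tb_per_zb = BYTES_PER_ZB // BYTES_PER_TB
print(f"1 ZB = {tb_per_zb:,} TB")
```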
Existing Siloed Data Architectures Under Pressure
APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
DATA SYSTEM: RDBMS, EDW, MPP (siloed)
SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs)
Data growth: New Data Types, 85% (Source: IDC)
•  OLTP, ERP, CRM Systems
•  Unstructured docs, emails
•  Clickstream
•  Server logs
•  Social/Web Data
•  Sensor, Machine Data
•  Geolocation
Limitations:
•  Can’t manage new data paradigm
•  Constrains data to specific schema
•  Siloed data
•  Limited scalability
•  Economically unfeasible
•  Limited analytics
Hadoop is Driving the New Data-driven Era of IT
•  1st Era: technology Mainframe, goal Processing Power
•  2nd Era: technology RDBMS, goal Automation + Efficiency
•  3rd Era: goal Real-time Data Driven
Key Drivers of Hadoop
A. Unlock New Approach to Analytics
•  Agile analytics via “Schema on Read” with ability to store all data in native format
•  Create new apps from new types of data
B. Optimize Investments, Cut Costs
•  Focus EDW on high value workloads
•  Use commodity servers & storage to enable all data (original and historical) to be accessible for ongoing exploration
C. Enable a Modern Data Architecture
•  Integrate new & existing data sets
•  Make all data available for shared access and processing in multitenant infrastructure
•  Batch, interactive & real-time use cases
•  Integrated with existing tools & skills
OPERATIONS TOOLS: Provision, Manage & Monitor
DEV & DATA TOOLS: Build & Test
APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
DATA SYSTEM REPOSITORIES: RDBMS, EDW, MPP
YARN: Data Operating System (Batch, Interactive, Real-Time)
HDFS: Hadoop Distributed File System
SOURCES: Existing Systems, Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs, Unstructured
Shift to Data-driven Means Treating Data like Capital
Hadoop enables organizations to cost-effectively store and use all of the data available in a way that shifts the business from reactive to proactive:
•  A shift in Advertising: from mass branding to 1x1 targeting
•  A shift in Financial Services: from educated investing to automated algorithms
•  A shift in Healthcare: from mass treatment to designer medicine
•  A shift in Retail: from static branding to real-time personalization
•  A shift in Manufacturing: from break then fix to repair before break
Enterprise Goals for the Modern Data Architecture
✓  Centrally manage new and existing data
✓  Data needs flexibility and lands in Hadoop without schema
✓  Prepare data with no predetermined questions
✓  User self-service – no limit to questions
✓  Run batch, interactive & real-time analytic applications on shared datasets
✓  Leverage new and existing data center infrastructure investments
✓  Scalable and affordable; low cost per TB
APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
DATA SYSTEM: RDBMS, EDW, MPP; YARN: Data Operating System (Batch, Interactive, Real-Time; nodes 1 to N); HDFS (Hadoop Distributed File System)
SOURCES: Existing Systems (CRM, ERP, Other), Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs, Unstructured
YARN and HDP Enable the Modern Data Architecture
YARN is the architectural center of Hadoop and HDP
•  YARN enables a common data set across all applications
•  Batch, interactive & real-time workloads
•  Support multi-tenant access & processing
HDP enables Apache Hadoop to become an enterprise-viable data platform with centralized services
•  Security
•  Governance
•  Operations
•  Productization
Enabled broad ecosystem adoption
Hortonworks drove this innovation of Hadoop through YARN
Hortonworks Data Platform 2.2
•  BATCH, INTERACTIVE & REAL-TIME DATA ACCESS: Script (Pig, on Tez); SQL (Hive, on Tez); Java/Scala (Cascading, on Tez); NoSQL (HBase, Accumulo, on Slider); Stream (Storm, on Slider); In-Memory (Spark); Search (Solr); Others (ISV Engines)
•  YARN: Data Operating System (Cluster Resource Management)
•  HDFS (Hadoop Distributed File System)
•  GOVERNANCE: Data Workflow, Lifecycle & Governance (Falcon, Sqoop, Flume, Kafka, NFS, WebHDFS)
•  SECURITY: Authentication, Authorization, Audit, Data Protection (Storage: HDFS; Resources: YARN; Access: Hive; Pipeline: Falcon; Cluster: Ranger; Cluster: Knox)
•  OPERATIONS: Provision, Manage & Monitor (Ambari, Zookeeper); Scheduling (Oozie)
•  Deployment Choice: Linux, Windows, On-Premises, Cloud
Modern Data Architecture
OPERATIONAL TOOLS
DEV & DATA TOOLS
INFRASTRUCTURE
APPLICATIONS: BusinessObjects BI
DATA SYSTEM: RDBMS, EDW, HANA; YARN: Data Operating System (Batch, Interactive, Real-Time; nodes 1 to N); HDFS (Hadoop Distributed File System)
SOURCES: Existing Systems, Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs, Unstructured
Deep Partnerships
Hortonworks engages in deep engineered relationships with the leaders in the data center, such as Microsoft, HP, Teradata, SAS, SAP & Red Hat
Broad Partnerships
Over 600 partners work with us to certify their applications to work with Hadoop so they can extend big data to their users
Hadoop unlocks a new approach: Iterative Analytics
Current Reality: Apply schema on write
Dependent on IT
•  Determine list of questions
•  Design solutions
•  Collect structured data
•  Ask questions from list
•  Detect additional questions
Repeatable Process: SQL Only
Augment w/ Hadoop: Apply schema on read
Support range of access patterns to data stored in HDFS: polymorphic access
•  Iterate over structure
•  Transform and Analyze
Right Engine, Right Job: batch, interactive, real-time, in-memory
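The contrast between schema on write and schema on read can be illustrated with a small in-process sketch. It is not Hadoop code: the sample records, the `read_with_schema` helper, and both schemas are hypothetical, and a Python list stands in for files stored in HDFS. The point it tries to show is that raw data is kept in its native format and a schema is chosen only when a question is asked.

```python
import json

# Raw events are stored untouched, in their native format.
# These sample records are made up for illustration.
RAW_EVENTS = [
    '{"user": "u1", "page": "/home", "ms": 120}',
    '{"user": "u2", "page": "/cart", "ms": 340, "coupon": "SPRING"}',
    '{"user": "u1", "page": "/cart"}',
]


def read_with_schema(raw_lines, schema):
    """Project each raw JSON line onto `schema` at read time.

    `schema` maps field name -> (cast, default). Fields missing from a
    record take the default; fields not named in the schema are ignored.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            name: cast(record[name]) if name in record else default
            for name, (cast, default) in schema.items()
        }


# Question 1: page latency. The schema is picked when the question is asked.
latency_schema = {"page": (str, ""), "ms": (int, 0)}
latency_rows = list(read_with_schema(RAW_EVENTS, latency_schema))

# Question 2, thought of later: coupon usage. The stored data is not reloaded.
coupon_schema = {"user": (str, ""), "coupon": (str, None)}
coupon_rows = list(read_with_schema(RAW_EVENTS, coupon_schema))

print(latency_rows)
print(coupon_rows)
```

Under schema on write, the `coupon` field would have been dropped at load time unless someone had anticipated the second question; here it is still available because nothing was discarded on ingest.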
Hadoop delivers compelling economics
EDW Optimization
Current Reality: OPERATIONS 50%, ANALYTICS 20%, ETL PROCESS 30%
•  EDW at capacity: some usage from low value workloads
•  Older data archived, unavailable for ongoing exploration
•  Source data often discarded
Augment w/ Hadoop: OPERATIONS 50%, ANALYTICS 50%
•  Free up EDW resources from low value tasks
•  Keep 100% of source data and historical data for ongoing exploration
•  Mine data for value after loading it because of schema-on-read
•  Hadoop: Parse, Cleanse, Apply Structure, Transform
Commodity Compute & Storage
Hadoop Enables Scalable Compute & Storage at a Compelling Cost Structure
[Chart: Fully-loaded Cost Per Raw TB of Data (Min–Max Cost); axis from $0 to $180,000; categories: MPP, SAN, Engineered System, NAS, Hadoop, Cloud Storage]
How to Get Started with Hadoop
Try Hadoop Today
Download the Hortonworks Sandbox
http://hortonworks.com/products/hortonworks-sandbox/
Learn Hadoop
Build a Proof of Concept
Test New Functionality
5 Reasons Hadoop is Kicking Can and Taking Names
Hadoop’s momentum is unstoppable as its open source roots grow wildly into enterprises. Its refreshingly unique approach to data management is transforming how companies store, process, analyze, and share big data.
Forrester believes that Hadoop will become must-have infrastructure for large enterprises.
Here are five reasons firms should adopt Hadoop today:
1.  Build a data lake with the Hadoop file system (HDFS)
2.  Enjoy cheap, quick processing with MapReduce
3.  Data scientists can wrangle big data faster
4.  Even the POC can make you money
5.  The future of Hadoop is real-time and transactional
http://blogs.forrester.com/mike_gualtieri/13-10-22-5_reasons_hadoop_is_kicking_can_and_taking_names
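For readers new to the MapReduce model named in reason 2, the sketch below walks through its three phases (map, shuffle, reduce) on a word count. It is a single-process illustration only: it does not use the Hadoop API, the function names are hypothetical, and the input lines are made up. On a real cluster the map and reduce calls would run in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: collapse all counts for one word into a total."""
    return key, sum(values)


# Made-up input; on a cluster each line could sit in a different HDFS block.
lines = ["big data big value", "data lake"]

mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

Because each mapper sees only its own line and each reducer sees only one key, the same logic can be spread over many commodity machines without the functions needing to know about each other.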
Hadoop Summit 2015
Thank You!
Eric Mizell - Director, Solutions Engineering
emizell@hortonworks.com

Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
 
Hortonworks and Red Hat Webinar - Part 2
Hortonworks and Red Hat Webinar - Part 2Hortonworks and Red Hat Webinar - Part 2
Hortonworks and Red Hat Webinar - Part 2
 
Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks
 

Introduction to Hadoop

  • 1. Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Introduction to Hadoop Eric Mizell – Director, Solution Engineering Hortonworks. We do Hadoop.
  • 2. © Hortonworks Inc. 2012 Page 2
  • 3. © Hortonworks Inc. 2012 Page 3
  • 4. © Hortonworks Inc. 2012 Page 4
  • 5. Quick Audience Poll: Which best describes how your org is using Hadoop? A. We’re using Hadoop B. We’re in the process of getting Hadoop integrated C. We don’t have Hadoop installed D. What’s Hadoop? E. I don’t know
  • 6. Big Data, Hadoop, and the Modern Data Architecture
  • 7. Big Data Explosion: Big Data Market Trends & Projections. 20%: % by which orgs leveraging modern info management systems outperform peers by 2015. 1 Zettabyte (ZB) = 1 Billion TBs. 15x: growth rate of machine generated data by 2020. The US has 1/3 of the world’s data. Big Data is 1 of 5 US GDP Game Changers: $325 billion incremental annual GDP from big data analytics in retail and manufacturing by 2020.
  • 8. Existing Siloed Data Architectures Under Pressure. Diagram: APPLICATIONS (Business Analytics, Custom Applications, Packaged Applications); DATA SYSTEM (silos: RDBMS, EDW, MPP); SOURCES (Existing Sources: CRM, ERP, Clickstream, Logs). Data growth: New Data Types, 85% (Source: IDC): OLTP, ERP, CRM Systems; Unstructured docs, emails; Clickstream; Server logs; Social/Web Data; Sensor, Machine Data; Geolocation. Problems: Can’t manage new data paradigm; Constrains data to specific schema; Siloed data; Limited scalability; Economically unfeasible; Limited analytics.
  • 9. Hadoop is Driving the New Data-driven Era of IT. Diagram labels: 1st Era; Real-time; Data Driven; RDBMS; 2nd Era; 3rd Era; Automation + Efficiency; Processing Power; Mainframe; Goal; Data; Technology.
  • 10. Key Drivers of Hadoop. A. Unlock New Approach to Analytics: agile analytics via “Schema on Read” with ability to store all data in native format; create new apps from new types of data. B. Optimize Investments, Cut Costs: focus EDW on high value workloads; use commodity servers & storage to enable all data (original and historical) to be accessible for ongoing exploration. C. Enable a Modern Data Architecture: integrate new & existing data sets; make all data available for shared access and processing in multitenant infrastructure; batch, interactive & real-time use cases; integrated with existing tools & skills. Diagram: OPERATIONS TOOLS (Provision, Manage & Monitor); DEV & DATA TOOLS (Build & Test); APPLICATIONS (Business Analytics, Custom Applications, Packaged Applications); DATA SYSTEM REPOSITORIES (RDBMS, EDW, MPP); YARN: Data Operating System (Batch, Interactive, Real-Time); HDFS: Hadoop Distributed File System; SOURCES (EXISTING Systems, Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs, Unstructured).
  • 11. Shift to Data-driven Means Treating Data like Capital. Hadoop enables organizations to cost-effectively store and use all of the data available in a way that shifts the business from reactive to proactive. Shifts in Advertising, Financial Services, Healthcare, Retail and Manufacturing: from static branding to real-time personalization; from break then fix to repair before break; from mass treatment to designer medicine; from educated investing to automated algorithms; from mass branding to 1x1 targeting.
  • 12. Enterprise Goals for the Modern Data Architecture: centrally manage new and existing data; data needs flexibility and lands in Hadoop without schema; prepare data with no predetermined questions; user self-service, with no limit to questions; run batch, interactive & real time analytic applications on shared datasets; leverage new and existing data center infrastructure investments; scalable and affordable, with low cost per TB. Diagram: APPLICATIONS (Business Analytics, Custom Applications, Packaged Applications); DATA SYSTEM (RDBMS, EDW, MPP; CRM, ERP, Other); YARN: Data Operating System (Batch, Interactive, Real-Time); HDFS (Hadoop Distributed File System); SOURCES (EXISTING Systems, Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs, Unstructured).
  • 13. YARN and HDP Enable the Modern Data Architecture. YARN is the architectural center of Hadoop and HDP: YARN enables a common data set across all applications; batch, interactive & real-time workloads; support for multi-tenant access & processing. HDP enables Apache Hadoop to become an Enterprise Viable Data Platform with centralized services: Security; Governance; Operations; Productization. Enabled broad ecosystem adoption. Hortonworks drove this innovation of Hadoop through YARN. Diagram (Hortonworks Data Platform 2.2): BATCH, INTERACTIVE & REAL-TIME DATA ACCESS: Script (Pig), SQL (Hive), Java/Scala (Cascading), Stream (Storm), Search (Solr), NoSQL (HBase, Accumulo), In-Memory (Spark), Others (ISV Engines), with Tez and Slider; YARN: Data Operating System (Cluster Resource Management); HDFS (Hadoop Distributed File System); GOVERNANCE: Data Workflow, Lifecycle & Governance (Falcon, Sqoop, Flume, Kafka, NFS, WebHDFS); SECURITY: Authentication, Authorization, Audit, Data Protection (Storage: HDFS; Resources: YARN; Access: Hive; Pipeline: Falcon; Cluster: Ranger; Cluster: Knox); OPERATIONS: Provision, Manage & Monitor (Ambari, Zookeeper), Scheduling (Oozie); Deployment Choice: Linux, Windows, Cloud, On-Premises.
  • 14. Modern Data Architecture. Deep Partnerships: Hortonworks engages in deep engineered relationships with the leaders in the data center, such as Microsoft, HP, Teradata, SAS, SAP & Red Hat. Broad Partnerships: Over 600 partners work with us to certify their applications to work with Hadoop so they can extend big data to their users. Diagram: OPERATIONAL TOOLS; DEV & DATA TOOLS; INFRASTRUCTURE; APPLICATIONS (BusinessObjects BI); DATA SYSTEM (RDBMS, EDW, HANA); YARN: Data Operating System (Batch, Interactive, Real-Time); HDFS (Hadoop Distributed File System); SOURCES (EXISTING Systems, Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs, Unstructured).
  • 15. Hadoop unlocks a new approach: Iterative Analytics. Current Reality: Apply schema on write; Dependent on IT; Repeatable Process: SQL Only (Determine list of questions; Design solutions; Collect structured data; Ask questions from list; Detect additional questions). Augment w/ Hadoop: Apply schema on read; Support range of access patterns to data stored in HDFS: polymorphic access; Iterate over structure; Transform and Analyze. Right Engine, Right Job: batch, interactive, real-time, in-memory.
  • 16. Hadoop delivers compelling economics. EDW Optimization. Current Reality (OPERATIONS 50%, ANALYTICS 20%, ETL PROCESS 30%): EDW at capacity, with some usage from low value workloads; older data archived, unavailable for ongoing exploration; source data often discarded. Augment w/ Hadoop (OPERATIONS 50%, ANALYTICS 50%): free up EDW resources from low value tasks; keep 100% of source data and historical data for ongoing exploration; mine data for value after loading it because of schema-on-read. Hadoop: Parse, Cleanse, Apply Structure, Transform. Commodity Compute & Storage: Hadoop Enables Scalable Compute & Storage at a Compelling Cost Structure. Chart: Fully-loaded Cost Per Raw TB of Data (Min–Max Cost), axis from $0 to $180,000, comparing MPP, SAN, Engineered System, NAS, HADOOP and Cloud Storage.
  • 17. How to Get Started with Hadoop
  • 18. Try Hadoop Today: Download the Hortonworks Sandbox (http://hortonworks.com/products/hortonworks-sandbox/). Learn Hadoop; Build a Proof of Concept; Test New Functionality.
  • 19. © Hortonworks Inc. 2013. 5 Reasons Hadoop is Kicking Cans and Taking Names. Hadoop’s momentum is unstoppable as its open source roots grow wildly into enterprises. Its refreshingly unique approach to data management is transforming how companies store, process, analyze, and share big data. Forrester believes that Hadoop will become must-have infrastructure for large enterprises. Here are five reasons firms should adopt Hadoop today: 1. Build a data lake with the Hadoop file system (HDFS) 2. Enjoy cheap, quick processing with MapReduce 3. Data scientists can wrangle big data faster 4. Even the POC can make you money 5. The future of Hadoop is real-time and transactional. Source: http://blogs.forrester.com/mike_gualtieri/13-10-22-5_reasons_hadoop_is_kicking_can_and_taking_names
  • 20. Hadoop Summit 2015
  • 21. Thank You! Eric Mizell - Director, Solutions Engineering, emizell@hortonworks.com
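
Slides 10, 15 and 16 contrast "schema on write" with "schema on read". The sketch below is one way to picture the idea in plain Python. It is an illustration only and does not use any Hadoop, Hive or HDFS API. The sample records, the `RAW_EVENTS` name and the `read_with_schema` helper are all hypothetical. Raw records are kept exactly as they arrived, and each reader decides at query time which fields it wants and how to type them.

```python
import json

# Hypothetical raw events, stored in native format with no schema enforced on write.
RAW_EVENTS = [
    '{"user": "a", "page": "/home", "ms": 120}',
    '{"user": "b", "page": "/cart"}',
    '{"user": "a", "lat": 33.9, "lon": -81.0}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time.

    `schema` maps field name -> cast function. Each raw record is projected
    onto those fields; a field missing from a record becomes None.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            name: (cast(record[name]) if name in record else None)
            for name, cast in schema.items()
        }


# Two different "questions" asked of the same stored data, each with its own schema.
clickstream = list(read_with_schema(RAW_EVENTS, {"user": str, "page": str}))
geo = list(read_with_schema(RAW_EVENTS, {"user": str, "lat": float, "lon": float}))
```

Because nothing was discarded or reshaped when the events were stored, a new question only needs a new schema passed to the reader, which is the property the slides describe as keeping source data available for ongoing exploration.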