SlideShare a Scribd company logo
1 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Hortonworks	Data	Cloud
Enterprise	ready	Hadoop	on	the	cloud
蒋 逸峰(しょう いつほう/Yifeng	Jiang)
Solutions	Engineer,	Hortonworks
@uprush
December	14,	2016
2 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
About Me
蒋 逸峰 (しょう いつほう / Yifeng Jiang)
• Solutions Engineer, Hortonworks
• Apache HBase book author
• I like hiking & running
• Twitter: @uprush
3 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Hortonworks	Data	Platform	(HDP)
4 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
What’s	Missing?
à Ambari makes	deploying	HDP	super	easy,	but..
– It	is	not	easy	to	get	there
– Cluster	sizing
– HW	purchase,	setup	in	DC,	network
– OS	setup
à Average	three	weeks	or	even	more
5 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved6
Introducing	Hortonworks	Data	Cloud	for	AWS
à A	new	cloud	product	from	Hortonworks
– Powered	by	Hortonworks Data	Platform
à Offers	Pay-As-You-Go	(PAYG)	pricing
à Delivered	and	sold	via	AWS	Marketplace
à Handles	most	common	big	data	use	cases	
with	Apache	Hadoop,	Spark,	and	Hive
– Choose	from	a	set	of	prescriptive	cluster	types
à Focuses	on	ease	of	use	and	business	agility
– Avoids	infinite	configurability	and	customization
à Optional	Free	Community	Support	**
**	Enterprise	Support	option	coming	soon
7 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
DEMO
8 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Architecture
Amazon	Web	Services
Cloudbreak	
Services
Cloud	controller	(aka	Cloudbreak)
Cloudbreak	
DB
Connector
AWS GCE Azure
HDP	Cluster:	ETL	/	EDW
Master	GroupMaster	Group:	
Hive,	Spark
Ambari
Slave	
Group
Blueprint
HDP	Cluster:	Analytics
Master	GroupMaster	Group:	
LLAP,	Zeppelin
Ambari
Slave	
Group
Blueprint
Cloudbreak	
Deployer
Access	tools
Shell REST	API Web	UI
OpenStack
S3aFileSystem
S3aFileSystem
9 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Hortonworks	Data	Cloud	- Summary
à Launch	and	manage	clusters	by	workload	type
– ETL	/	EDW,	Data	science,	Business	analytics
à Use	highly	scalable,	durable	storage	for	data	(S3)	
&	metadata	(RDS)
à Share	data	and	metadata	among	multiple	
ephemeral	clusters
à Scale	up	and	down	at	the	click	of	a	button
à Secure	clusters	with	IAM	roles,	security	groups,	
etc.
10 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Improving	Enterprise	
Readiness
11 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Enterprise	Readiness
Improving	enterprise	readiness	in	the	cloud
à Cloud	storage
à Security	and	governance
à Reliability	and	fault	tolerance
12 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Matching	Hadoop	with	the	Cloud
Datacenter
• Data	Locality
• Consistent	
Storage
• Single	cluster	
administration
Cloud
• Scalable	storage
• Customizability
• Cost	effective	
compute
• Scalable	storage	with
performance	and	
consistency
• Customizability	with
ease	of	administration
• Cost	effective	compute	
with SLA	policies
13 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Cloud	Storage	access	facts
HDFS
Applicati
on
Input Output tmp
Interaction	models
Applicati
on
HDFSInput
Output
Copy
à Cloud	storage	optimizes	for	scale
– S3	data	is	replicated	for	high	scale	access,	
durability
à Data	access	is	remote
– Data	locality
– Costlier	metadata	operations	(E.g.	hadoop fs
–mv is	actually	a	copy	and	delete)
à Eventual	Consistency
– Takes	time	for	effect	of	modification	operations	
to	permeate	to	all	copies
14 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Performance	with	Scalability
à General	strategy:	Optimize	by	workload	types
à ETL	workloads
– Typical	pipeline:	Bring	in	data	=>	Transform	=>	Repair	partitions	=>	Compute	statistics
– Multiple	metadata	calls:	Batched	and	issued	in	parallel	for	performance	gains
à Distcp
– Optimized	buffer	management	for	transferring	large	files
– Randomize	input	to	Distcp to	avoid	hot-spotting	S3	nodes
15 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Performance	with	Scalability
à Analytics	workloads	– ORC	file	related	optimizations
– Support	fast	random	access	reads	(both	directions)	by	avoiding	tearing	down	
S3	HTTP	connections
– Pass	index	information	to	compute	tasks	as	part	of	split	data	to	avoid	re-
computation
à Status:	Available,	but	performance	optimizations	never	stop	J
https://hortonworks.github.io/hdp-aws/s3-performance/index.html
16 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Correctness	with	strong	consistency
à Write	operations	followed	by	read	may	not	return	correct	
results
– Issues	for	data	pipelines,	multi-stage	jobs,	etc.
à S3Guard	project:	Intermediate,	consistent	metadata	store
à Write	calls	from	S3AFileSystem	update	both	S3	and	metadata	
store
à S3AFileSystem	automatically	tries	to	reconcile	metadata	
between	S3	and	metadata	store	on	subsequent	reads
– Inconsistencies	are	handled	based	on	policy
à Status:	In	progress
16
https://issues.apache.org/jira/browse/HADOOP-13345
17 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Securing	data	access	via	IAM	Roles
à Integration	with	cloud	provider
à Provide	an	IAM	role	as	instance	profile	for	
a	cluster
à Attach	policies	for	accessing	S3	to	the	role
– E.g.	Read-only	access	for	BI	cluster	to	
specific	buckets
à Status:	Available
18 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Data	Security	in	Hadoop
Apache	Ranger
à Fine	grained,	role-based	access	policies	to	
data
– Table/column	level	ACL
à Audit	access	information
à Row	level	filtering
à Dynamic	data	masking
19 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Data	Governance	in	Hadoop
Apache	Atlas
à Auto	discover	&	index	metadata
à Tag	data
à Track	data	lineage
20 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Data	governance	technical	architecture	– On	Premise
On	Premise	HDP	Cluster
Ranger	
Admin
Policy
Policy
Atlas	
Admin Metadata
Governed	HDP	
Component	
(E.g.	Hive)
Ranger	
Plugin
Atlas	
Plugin
LDAP	/	AD
Data	Steward
21 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Data	Governance	in	the	Cloud:	
Ease	of	administration	with	flexibility
à No	longer	a	single	compute	cluster	generating	/	accessing	data
à Data	&	Metadata	are	still	single	and	shared
à Evolve	Atlas	and	Ranger	to	be	data	lake	centric	than	cluster	centric
– Shared	long	running	Admin	components
– Ephemeral	plugins	on	compute	clusters
à Status:	Available	as	a	Tech	Preview
https://github.com/hortonworks/hdc-cli/blob/master/shared_cluster.md
22 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Shared	Ranger	/	Atlas	admin	services
Available	in	Tech	Preview	in	Hortonworks	Data	Cloud
ETL-EDW	Cluster
Governed	HDP	
Component	(E.g.	Hive)
LDAP	/	AD
Ranger	
Plugin
Atlas	
Plugin
Data	Analytics	Cluster
Governed	HDP	
Component	(E.g.	Hive)
Ranger	
Plugin
Atlas	
Plugin
Ranger	
Admin Policy
Policy
Atlas	
Admin Metadata
Cloud	
Controller
Shared	Enterprise	Services
Data	Steward
23 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
HDP	Cloud	Compute	nodes	on	AWS
à Regular	EC2	instances	
à Can	attach	EBS	volumes	or	ephemeral	storage	disks
à Grouped	according	to	functionality	/	access	requirements
à Opportunistic	provisioning	– spot	instances	(work	in	progress)
HDP	Cluster
Master	Group
Group	#1
Gateway	node:	
Ambari
Master	Group
Group	#2
Cloud	Controller
24 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
HDP	Cloud	Compute	nodes	on	AWS
24
25 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Reliability	with	cost	benefits
à HDP	host	instances	could	become	unhealthy
– Unreliable	underlying	infrastructure
– Spot	instances	are	transient,	dependent	on	bid	price
– SLA	impact	for	workloads
à Automatically	replace	un-healthy	nodes
– No	costs	incurred	if	node	is	not	functional
– Replace	unhealthy	instances	to	maintain	a	desired	capacity
à Status:	Work	in	progress
26 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Auto-recovery	of	slave	nodes
à Use	Ambari	to	detect	unhealthy	status	&	notify	Cloudbreak
à Decommission	and	terminate	unhealthy	instances
à Provision	new	instances	and	add	to	cluster
HDP	Cluster
Master	Group
Group	#1
Gateway	node:	
Ambari
Master	Group
Group	#2
Cloud	Controller
27 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Summary
28 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Our	Connected	Data	Platform	Solutions
Hortonworks	:		Powering	the	Future	of	Data
(Every	business	is	a	data	business,	master	value	of	data	via	open	approach)
Modern	Data	Applications
(CyberSecurity,	IoT,	Partners,	Custom,	etc.)
Connected	Data	Platforms
(Manage	All	Data:	data-at-rest,	data-in-motion,	data	center	&	cloud)
Training	|	Consulting	|	Community	Connection	|	Partnerworks
Data	Center	Solutions Cloud	Solutions
Hortonworks
Data	Cloud	
for	AWS
Azure	
HDInsight
Rackspace
Accenture
Others
HDP HDF
Syncsort
AtScale
Pivotal	HDB
Others
Enterprise	Subscription
SmartSense operational	svc’s
24x7	Support
Maintenance
Etc.
29 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
http://hortonworks.com/info/aws-marketplace-credits-signup/
30 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
THANK	YOU

More Related Content

What's hot

How to Use Apache Zeppelin with HWX HDB
How to Use Apache Zeppelin with HWX HDBHow to Use Apache Zeppelin with HWX HDB
How to Use Apache Zeppelin with HWX HDB
Hortonworks
 
S3Guard: What's in your consistency model?
S3Guard: What's in your consistency model?S3Guard: What's in your consistency model?
S3Guard: What's in your consistency model?
Hortonworks
 
Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016
Hortonworks
 
Webinar Series Part 5 New Features of HDF 5
Webinar Series Part 5 New Features of HDF 5Webinar Series Part 5 New Features of HDF 5
Webinar Series Part 5 New Features of HDF 5
Hortonworks
 
HDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical WorkshopHDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical Workshop
Hortonworks
 
Log Analytics Optimization
Log Analytics OptimizationLog Analytics Optimization
Log Analytics Optimization
Hortonworks
 
Apache Hadoop 0.23
Apache Hadoop 0.23Apache Hadoop 0.23
Apache Hadoop 0.23
Hortonworks
 
Hortonworks Data Cloud for AWS
Hortonworks Data Cloud for AWS Hortonworks Data Cloud for AWS
Hortonworks Data Cloud for AWS
Hortonworks
 
HDF 3.0 IoT Platform for Everyone
HDF 3.0 IoT Platform for EveryoneHDF 3.0 IoT Platform for Everyone
HDF 3.0 IoT Platform for Everyone
Yifeng Jiang
 
Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
DataWorks Summit/Hadoop Summit
 
Apache NiFi Toronto Meetup
Apache NiFi Toronto MeetupApache NiFi Toronto Meetup
Apache NiFi Toronto Meetup
Hortonworks
 
Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems
Delivering a Flexible IT Infrastructure for Analytics on IBM Power SystemsDelivering a Flexible IT Infrastructure for Analytics on IBM Power Systems
Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems
Hortonworks
 
Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks Hadoop summit 2011 keynote - eric14Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks
 
Hortonworks Data In Motion Series Part 3 - HDF Ambari
Hortonworks Data In Motion Series Part 3 - HDF Ambari Hortonworks Data In Motion Series Part 3 - HDF Ambari
Hortonworks Data In Motion Series Part 3 - HDF Ambari
Hortonworks
 
Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks Technical Workshop: What's New in HDP 2.3Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks
 
Apache Ambari: Past, Present, Future
Apache Ambari: Past, Present, FutureApache Ambari: Past, Present, Future
Apache Ambari: Past, Present, Future
Hortonworks
 
Scaling real time streaming architectures with HDF and Dell EMC Isilon
Scaling real time streaming architectures with HDF and Dell EMC IsilonScaling real time streaming architectures with HDF and Dell EMC Isilon
Scaling real time streaming architectures with HDF and Dell EMC Isilon
Hortonworks
 
An Overview on Optimization in Apache Hive: Past, Present Future
An Overview on Optimization in Apache Hive: Past, Present FutureAn Overview on Optimization in Apache Hive: Past, Present Future
An Overview on Optimization in Apache Hive: Past, Present Future
DataWorks Summit/Hadoop Summit
 
Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem
DataWorks Summit/Hadoop Summit
 
Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache Ambari
DataWorks Summit/Hadoop Summit
 

What's hot (20)

How to Use Apache Zeppelin with HWX HDB
How to Use Apache Zeppelin with HWX HDBHow to Use Apache Zeppelin with HWX HDB
How to Use Apache Zeppelin with HWX HDB
 
S3Guard: What's in your consistency model?
S3Guard: What's in your consistency model?S3Guard: What's in your consistency model?
S3Guard: What's in your consistency model?
 
Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016
 
Webinar Series Part 5 New Features of HDF 5
Webinar Series Part 5 New Features of HDF 5Webinar Series Part 5 New Features of HDF 5
Webinar Series Part 5 New Features of HDF 5
 
HDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical WorkshopHDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical Workshop
 
Log Analytics Optimization
Log Analytics OptimizationLog Analytics Optimization
Log Analytics Optimization
 
Apache Hadoop 0.23
Apache Hadoop 0.23Apache Hadoop 0.23
Apache Hadoop 0.23
 
Hortonworks Data Cloud for AWS
Hortonworks Data Cloud for AWS Hortonworks Data Cloud for AWS
Hortonworks Data Cloud for AWS
 
HDF 3.0 IoT Platform for Everyone
HDF 3.0 IoT Platform for EveryoneHDF 3.0 IoT Platform for Everyone
HDF 3.0 IoT Platform for Everyone
 
Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
 
Apache NiFi Toronto Meetup
Apache NiFi Toronto MeetupApache NiFi Toronto Meetup
Apache NiFi Toronto Meetup
 
Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems
Delivering a Flexible IT Infrastructure for Analytics on IBM Power SystemsDelivering a Flexible IT Infrastructure for Analytics on IBM Power Systems
Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems
 
Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks Hadoop summit 2011 keynote - eric14Hortonworks Hadoop summit 2011 keynote - eric14
Hortonworks Hadoop summit 2011 keynote - eric14
 
Hortonworks Data In Motion Series Part 3 - HDF Ambari
Hortonworks Data In Motion Series Part 3 - HDF Ambari Hortonworks Data In Motion Series Part 3 - HDF Ambari
Hortonworks Data In Motion Series Part 3 - HDF Ambari
 
Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks Technical Workshop: What's New in HDP 2.3Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks Technical Workshop: What's New in HDP 2.3
 
Apache Ambari: Past, Present, Future
Apache Ambari: Past, Present, FutureApache Ambari: Past, Present, Future
Apache Ambari: Past, Present, Future
 
Scaling real time streaming architectures with HDF and Dell EMC Isilon
Scaling real time streaming architectures with HDF and Dell EMC IsilonScaling real time streaming architectures with HDF and Dell EMC Isilon
Scaling real time streaming architectures with HDF and Dell EMC Isilon
 
An Overview on Optimization in Apache Hive: Past, Present Future
An Overview on Optimization in Apache Hive: Past, Present FutureAn Overview on Optimization in Apache Hive: Past, Present Future
An Overview on Optimization in Apache Hive: Past, Present Future
 
Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem
 
Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache Ambari
 

Similar to Introduction to Hortonworks Data Cloud for AWS

Log Analytics Optimization
Log Analytics OptimizationLog Analytics Optimization
Log Analytics Optimization
Isheeta Sanghi
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
Timothy Spann
 
Welcome to Apache Hadoop's Teenage Years, Arun Murthy Keynote
Welcome to Apache Hadoop's Teenage Years, Arun Murthy KeynoteWelcome to Apache Hadoop's Teenage Years, Arun Murthy Keynote
Welcome to Apache Hadoop's Teenage Years, Arun Murthy Keynote
DataWorks Summit/Hadoop Summit
 
Using Apache® NiFi to Empower Self-Organising Teams
Using Apache® NiFi to Empower Self-Organising TeamsUsing Apache® NiFi to Empower Self-Organising Teams
Using Apache® NiFi to Empower Self-Organising Teams
Sebastian Carroll
 
introduction-to-apache-kafka
introduction-to-apache-kafkaintroduction-to-apache-kafka
introduction-to-apache-kafka
Yifeng Jiang
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Hortonworks
 
Hadoop Present - Open Enterprise Hadoop
Hadoop Present - Open Enterprise HadoopHadoop Present - Open Enterprise Hadoop
Hadoop Present - Open Enterprise Hadoop
Yifeng Jiang
 
Storm Demo Talk - Colorado Springs May 2015
Storm Demo Talk - Colorado Springs May 2015Storm Demo Talk - Colorado Springs May 2015
Storm Demo Talk - Colorado Springs May 2015
Mac Moore
 
Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'
Hortonworks
 
[Hortonworks] Future Of Data: Madrid - HDF & Data in motion
[Hortonworks] Future Of Data: Madrid - HDF & Data in motion[Hortonworks] Future Of Data: Madrid - HDF & Data in motion
[Hortonworks] Future Of Data: Madrid - HDF & Data in motion
Raúl Marín
 
そのデータフロー NiFiで楽にしてあげましょう
そのデータフロー NiFiで楽にしてあげましょうそのデータフロー NiFiで楽にしてあげましょう
そのデータフロー NiFiで楽にしてあげましょう
Koji Kawamura
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks
 
Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]
Hortonworks
 
Achieving Mega-Scale Business Intelligence Through Speed of Thought Analytics...
Achieving Mega-Scale Business Intelligence Through Speed of Thought Analytics...Achieving Mega-Scale Business Intelligence Through Speed of Thought Analytics...
Achieving Mega-Scale Business Intelligence Through Speed of Thought Analytics...
VMware Tanzu
 
Hadoop In Action
Hadoop In ActionHadoop In Action
Hadoop In Action
Bigdata Meetup Kochi
 
Hortonworks for Financial Analysts Presentation
Hortonworks for Financial Analysts PresentationHortonworks for Financial Analysts Presentation
Hortonworks for Financial Analysts Presentation
Hortonworks
 
OSDC 2013 | Introduction into Hadoop by Olivier Renault
OSDC 2013 | Introduction into Hadoop by Olivier RenaultOSDC 2013 | Introduction into Hadoop by Olivier Renault
OSDC 2013 | Introduction into Hadoop by Olivier Renault
NETWAYS
 
Discover.hdp2.2.ambari.final[1]
Discover.hdp2.2.ambari.final[1]Discover.hdp2.2.ambari.final[1]
Discover.hdp2.2.ambari.final[1]
Hortonworks
 
Hadoop in adtech
Hadoop in adtechHadoop in adtech
Hadoop in adtech
Yuta Imai
 

Similar to Introduction to Hortonworks Data Cloud for AWS (20)

Log Analytics Optimization
Log Analytics OptimizationLog Analytics Optimization
Log Analytics Optimization
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Welcome to Apache Hadoop's Teenage Years, Arun Murthy Keynote
Welcome to Apache Hadoop's Teenage Years, Arun Murthy KeynoteWelcome to Apache Hadoop's Teenage Years, Arun Murthy Keynote
Welcome to Apache Hadoop's Teenage Years, Arun Murthy Keynote
 
Using Apache® NiFi to Empower Self-Organising Teams
Using Apache® NiFi to Empower Self-Organising TeamsUsing Apache® NiFi to Empower Self-Organising Teams
Using Apache® NiFi to Empower Self-Organising Teams
 
introduction-to-apache-kafka
introduction-to-apache-kafkaintroduction-to-apache-kafka
introduction-to-apache-kafka
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
 
Hadoop Present - Open Enterprise Hadoop
Hadoop Present - Open Enterprise HadoopHadoop Present - Open Enterprise Hadoop
Hadoop Present - Open Enterprise Hadoop
 
Storm Demo Talk - Colorado Springs May 2015
Storm Demo Talk - Colorado Springs May 2015Storm Demo Talk - Colorado Springs May 2015
Storm Demo Talk - Colorado Springs May 2015
 
Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'
 
[Hortonworks] Future Of Data: Madrid - HDF & Data in motion
[Hortonworks] Future Of Data: Madrid - HDF & Data in motion[Hortonworks] Future Of Data: Madrid - HDF & Data in motion
[Hortonworks] Future Of Data: Madrid - HDF & Data in motion
 
そのデータフロー NiFiで楽にしてあげましょう
そのデータフロー NiFiで楽にしてあげましょうそのデータフロー NiFiで楽にしてあげましょう
そのデータフロー NiFiで楽にしてあげましょう
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - Webinar
 
Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]
 
Achieving Mega-Scale Business Intelligence Through Speed of Thought Analytics...
Achieving Mega-Scale Business Intelligence Through Speed of Thought Analytics...Achieving Mega-Scale Business Intelligence Through Speed of Thought Analytics...
Achieving Mega-Scale Business Intelligence Through Speed of Thought Analytics...
 
Hadoop In Action
Hadoop In ActionHadoop In Action
Hadoop In Action
 
Hortonworks for Financial Analysts Presentation
Hortonworks for Financial Analysts PresentationHortonworks for Financial Analysts Presentation
Hortonworks for Financial Analysts Presentation
 
OSDC 2013 | Introduction into Hadoop by Olivier Renault
OSDC 2013 | Introduction into Hadoop by Olivier RenaultOSDC 2013 | Introduction into Hadoop by Olivier Renault
OSDC 2013 | Introduction into Hadoop by Olivier Renault
 
Discover.hdp2.2.ambari.final[1]
Discover.hdp2.2.ambari.final[1]Discover.hdp2.2.ambari.final[1]
Discover.hdp2.2.ambari.final[1]
 
Hadoop in adtech
Hadoop in adtechHadoop in adtech
Hadoop in adtech
 

More from Yifeng Jiang

Hive spark-s3acommitter-hbase-nfs
Hive spark-s3acommitter-hbase-nfsHive spark-s3acommitter-hbase-nfs
Hive spark-s3acommitter-hbase-nfs
Yifeng Jiang
 
Introduction to Streaming Analytics Manager
Introduction to Streaming Analytics ManagerIntroduction to Streaming Analytics Manager
Introduction to Streaming Analytics Manager
Yifeng Jiang
 
Real-time Analytics in Financial
Real-time Analytics in FinancialReal-time Analytics in Financial
Real-time Analytics in Financial
Yifeng Jiang
 
sparksql-hive-bench-by-nec-hwx-at-hcj16
sparksql-hive-bench-by-nec-hwx-at-hcj16sparksql-hive-bench-by-nec-hwx-at-hcj16
sparksql-hive-bench-by-nec-hwx-at-hcj16
Yifeng Jiang
 
Nifi workshop
Nifi workshopNifi workshop
Nifi workshop
Yifeng Jiang
 
Sub-second-sql-on-hadoop-at-scale
Sub-second-sql-on-hadoop-at-scaleSub-second-sql-on-hadoop-at-scale
Sub-second-sql-on-hadoop-at-scale
Yifeng Jiang
 
Yifeng hadoop-present-public
Yifeng hadoop-present-publicYifeng hadoop-present-public
Yifeng hadoop-present-public
Yifeng Jiang
 
Hive-sub-second-sql-on-hadoop-public
Hive-sub-second-sql-on-hadoop-publicHive-sub-second-sql-on-hadoop-public
Hive-sub-second-sql-on-hadoop-public
Yifeng Jiang
 
Yifeng spark-final-public
Yifeng spark-final-publicYifeng spark-final-public
Yifeng spark-final-public
Yifeng Jiang
 
Kinesis vs-kafka-and-kafka-deep-dive
Kinesis vs-kafka-and-kafka-deep-diveKinesis vs-kafka-and-kafka-deep-dive
Kinesis vs-kafka-and-kafka-deep-dive
Yifeng Jiang
 
Apache Hiveの今とこれから
Apache Hiveの今とこれからApache Hiveの今とこれから
Apache Hiveの今とこれから
Yifeng Jiang
 
HDFS Deep Dive
HDFS Deep DiveHDFS Deep Dive
HDFS Deep Dive
Yifeng Jiang
 
Hadoop Trends & Hadoop on EC2
Hadoop Trends & Hadoop on EC2Hadoop Trends & Hadoop on EC2
Hadoop Trends & Hadoop on EC2
Yifeng Jiang
 
Apache Ambari Overview -- Hadoop for Everyone
Apache Ambari Overview -- Hadoop for EveryoneApache Ambari Overview -- Hadoop for Everyone
Apache Ambari Overview -- Hadoop for Everyone
Yifeng Jiang
 
HDP Security Overview
HDP Security OverviewHDP Security Overview
HDP Security Overview
Yifeng Jiang
 
Data Science on Hadoop
Data Science on HadoopData Science on Hadoop
Data Science on Hadoop
Yifeng Jiang
 

More from Yifeng Jiang (16)

Hive spark-s3acommitter-hbase-nfs
Hive spark-s3acommitter-hbase-nfsHive spark-s3acommitter-hbase-nfs
Hive spark-s3acommitter-hbase-nfs
 
Introduction to Streaming Analytics Manager
Introduction to Streaming Analytics ManagerIntroduction to Streaming Analytics Manager
Introduction to Streaming Analytics Manager
 
Real-time Analytics in Financial
Real-time Analytics in FinancialReal-time Analytics in Financial
Real-time Analytics in Financial
 
sparksql-hive-bench-by-nec-hwx-at-hcj16
sparksql-hive-bench-by-nec-hwx-at-hcj16sparksql-hive-bench-by-nec-hwx-at-hcj16
sparksql-hive-bench-by-nec-hwx-at-hcj16
 
Nifi workshop
Nifi workshopNifi workshop
Nifi workshop
 
Sub-second-sql-on-hadoop-at-scale
Sub-second-sql-on-hadoop-at-scaleSub-second-sql-on-hadoop-at-scale
Sub-second-sql-on-hadoop-at-scale
 
Yifeng hadoop-present-public
Yifeng hadoop-present-publicYifeng hadoop-present-public
Yifeng hadoop-present-public
 
Hive-sub-second-sql-on-hadoop-public
Hive-sub-second-sql-on-hadoop-publicHive-sub-second-sql-on-hadoop-public
Hive-sub-second-sql-on-hadoop-public
 
Yifeng spark-final-public
Yifeng spark-final-publicYifeng spark-final-public
Yifeng spark-final-public
 
Kinesis vs-kafka-and-kafka-deep-dive
Kinesis vs-kafka-and-kafka-deep-diveKinesis vs-kafka-and-kafka-deep-dive
Kinesis vs-kafka-and-kafka-deep-dive
 
Apache Hiveの今とこれから
Apache Hiveの今とこれからApache Hiveの今とこれから
Apache Hiveの今とこれから
 
HDFS Deep Dive
HDFS Deep DiveHDFS Deep Dive
HDFS Deep Dive
 
Hadoop Trends & Hadoop on EC2
Hadoop Trends & Hadoop on EC2Hadoop Trends & Hadoop on EC2
Hadoop Trends & Hadoop on EC2
 
Apache Ambari Overview -- Hadoop for Everyone
Apache Ambari Overview -- Hadoop for EveryoneApache Ambari Overview -- Hadoop for Everyone
Apache Ambari Overview -- Hadoop for Everyone
 
HDP Security Overview
HDP Security OverviewHDP Security Overview
HDP Security Overview
 
Data Science on Hadoop
Data Science on HadoopData Science on Hadoop
Data Science on Hadoop
 

Recently uploaded

WWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders AustinWWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders Austin
Patrick Weigel
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j
 
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Julian Hyde
 
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CDKuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
rodomar2
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
Drona Infotech
 
socradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdfsocradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdf
SOCRadar
 
Energy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina JonuziEnergy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina Jonuzi
Green Software Development
 
SQL Accounting Software Brochure Malaysia
SQL Accounting Software Brochure MalaysiaSQL Accounting Software Brochure Malaysia
SQL Accounting Software Brochure Malaysia
GohKiangHock
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
Deuglo Infosystem Pvt Ltd
 
Oracle 23c New Features For DBAs and Developers.pptx
Oracle 23c New Features For DBAs and Developers.pptxOracle 23c New Features For DBAs and Developers.pptx
Oracle 23c New Features For DBAs and Developers.pptx
Remote DBA Services
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
TheSMSPoint
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Crescat
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
Łukasz Chruściel
 
Using Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query PerformanceUsing Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query Performance
Grant Fritchey
 
Using Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional SafetyUsing Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional Safety
Ayan Halder
 
SMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API ServiceSMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API Service
Yara Milbes
 
Unveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdfUnveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdf
brainerhub1
 
Oracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptxOracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptx
Remote DBA Services
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
lorraineandreiamcidl
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
Green Software Development
 

Recently uploaded (20)

WWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders AustinWWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders Austin
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
 
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)
 
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CDKuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
 
socradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdfsocradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdf
 
Energy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina JonuziEnergy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina Jonuzi
 
SQL Accounting Software Brochure Malaysia
SQL Accounting Software Brochure MalaysiaSQL Accounting Software Brochure Malaysia
SQL Accounting Software Brochure Malaysia
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
 
Oracle 23c New Features For DBAs and Developers.pptx
Oracle 23c New Features For DBAs and Developers.pptxOracle 23c New Features For DBAs and Developers.pptx
Oracle 23c New Features For DBAs and Developers.pptx
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
 
Using Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query PerformanceUsing Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query Performance
 
Using Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional SafetyUsing Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional Safety
 
SMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API ServiceSMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API Service
 
Unveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdfUnveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdf
 
Oracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptxOracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptx
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
 

Introduction to Hortonworks Data Cloud for AWS