1 ©	Hortonworks	Inc.	2011–2018.	All	rights	reserved
HDF	3.2	– What’s	New
Dinesh	Chandrasekhar,	Director,	Product	Marketing	@AppInt4All
Jeremy	Dyer,	Sr.	Product	Manager	@mightyjeremy
Modern	Data	Architecture	
DATA CENTER
CLOUD
IoT
• Machine Learning/Artificial Intelligence
• Telemetry – Connected Devices
• Time Series Databases
• Stream Analytics
• Deep Historical Analysis
• Exception Monitoring
• Legacy/Operational Data
• Sensors, Control Systems
• Cyber Security
• Edge Analytics
• Social
• Mobile
• Geo Location
• Capture streaming data
• Deliver perishable insights
• Combine new & old data
• Store data forever
• Access a multi-tenant data lake
• Model with machine learning
DATA AT REST (Hortonworks Data Platform)
DATA IN MOTION (Hortonworks DataFlow)
ACTIONABLE INTELLIGENCE
Perishable	Insights
Historical	Insights
A	Connected	Data	Strategy	Solves	for	All	Data
HORTONWORKS DATAPLANE SERVICE
Manage, Secure, Govern
MULTIPLE CLUSTERS AND SOURCES
MULTIHYBRID
HDF	Use	Cases
Data Movement
• Optimize resource utilization by moving data between data centers or between on-premises infrastructure and cloud infrastructure
Optimize Log Collection & Analysis
• Optimize log analytics solutions such as Splunk by using HDF as a single platform to collect and deliver multiple data sources and using HDP for lower-cost storage options
Gain Key Insights with Streaming Analytics
• Accelerate big data ROI by analyzing streaming data for patterns, comparing with ML models and delivering actionable intelligence
Single View / 360° View of Customer
• Ingest, transform and combine customer data from multiple sources into a single data view / lake
Stream Processing
• Combine multiple streams of data in real time, enrich the data and route it to different endpoints based on rules
Capture and Analyze IoT Data
• Ingest sensor data from IoT devices and stream it for further processing and comprehensive analysis
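The Stream Processing use case (combine, enrich, route by rules) can be pictured with a small sketch. This is plain Python rather than a NiFi flow or SAM application, and every name in it (`merge_streams`, `enrich`, `route`, the `ts`/`device` fields, the endpoint labels) is a hypothetical chosen for illustration, not part of HDF.

```python
import heapq
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Event = Dict[str, object]
Rule = Tuple[Callable[[Event], bool], str]  # (predicate, endpoint name)


def merge_streams(*streams: Iterable[Event]) -> Iterator[Event]:
    """Combine streams that are each ordered by 'ts' into one ordered stream."""
    return heapq.merge(*streams, key=lambda e: e["ts"])


def enrich(event: Event, regions: Dict[str, str]) -> Event:
    """Return a copy of the event with a 'region' looked up from its device id."""
    return {**event, "region": regions.get(event["device"], "unknown")}


def route(event: Event, rules: List[Rule], default: str) -> str:
    """Return the endpoint of the first matching rule, else the default."""
    for predicate, endpoint in rules:
        if predicate(event):
            return endpoint
    return default


def process(streams, regions, rules, default="archive") -> Dict[str, List[Event]]:
    """Merge, enrich and route every event; group the results by endpoint."""
    routed: Dict[str, List[Event]] = {}
    for event in merge_streams(*streams):
        enriched = enrich(event, regions)
        routed.setdefault(route(enriched, rules, default), []).append(enriched)
    return routed


if __name__ == "__main__":
    speed = [{"ts": 1, "device": "truck-1", "speed": 62},
             {"ts": 4, "device": "truck-2", "speed": 91}]
    temp = [{"ts": 2, "device": "truck-1", "temp": 88},
            {"ts": 3, "device": "truck-3", "temp": 131}]
    rules = [(lambda e: e.get("speed", 0) > 80, "alerts"),
             (lambda e: e.get("temp", 0) > 120, "alerts")]
    for endpoint, events in process([speed, temp], {"truck-1": "west"}, rules).items():
        print(endpoint, [e["ts"] for e in events])
```

Rules are evaluated in order and the first match wins, so more specific rules would go first; unmatched events fall through to a default endpoint rather than being dropped.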
HDF	3.2	Highlights
• EDGE INTELLIGENCE +
• CURE KAFKA BLINDNESS
• 100K PROCESSORS PER FLOW
• HDP 3.0 SUPPORT
HDF	3.2	Release	Themes
Cross-platform Integration
• Support deployment of a single cluster with HDF and HDP services
• Single Ambari instance for HDF and HDP – one Ranger, Atlas, Knox etc.
HDP 3.0 Support
• NiFi supports Hive 3 on HDP 3.0
• NiFi processors to support HDP 3.0
• Storm connectors to support HDP 3.0
• SAM components to support HDP 3.0
Product Enhancements
• Support for NiFi 1.7.0
• Kerberos keytab isolation
• Support for Kafka 1.1.1
• Support for TensorFlow in MiNiFi
Performance Enhancements
• NiFi flows can have as many as 100K processors
• More stability on large NiFi clusters
Cross-Platform	Integration
Single	Shared	Ambari	Instance
• Single	Ambari	instance
• Shared	common	services
• Ranger
• Atlas
• Knox
HDP	3.0	Platform	Support
HDF/HDP	3.0	Integration
• Hive	3	specific	processors
• HBase	2
• Updated	HDFS	processors
• New	GetHDFSFileInfo processor
• Ranger	1.1.0
• Support	for	group	based	policies
• Knox	1.0.0	– NiFi SSO
• Existed	previously	but	now	upgraded
• Atlas	1.0.0
• SAM	compatibility	with	HDP	3.0	services
Product	Enhancements
HDF	3.2	– NiFi,	Kafka	and	MiNiFi
NiFi: updated to 1.7.0
Kafka: updated to 1.1.1
Developer	enhancements
• NiFi-Registry	Git	Integration
• NiFi	Registry	event	hooks
Security	enhancements
• Kerberos	keytab isolation
• Support for TensorFlow
• New Expression Language features / updates
• New C-based SDK
• SuSE support
• Feature to capture producer/topic metrics at partition level
• Security and governance enhancements
Support	for
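To make "producer/topic metrics at partition level" concrete, here is a minimal sketch of how partition-level message counts roll up to topic and producer totals. It does not use any Kafka or HDF API; the `Sample` record and the function names are assumptions made up for this illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Sample(NamedTuple):
    """One hypothetical measurement: messages a producer wrote to one partition."""
    producer: str
    topic: str
    partition: int
    messages: int


PartitionKey = Tuple[str, str, int]  # (producer, topic, partition)


def by_partition(samples: Iterable[Sample]) -> Dict[PartitionKey, int]:
    """Sum message counts per (producer, topic, partition)."""
    totals: Dict[PartitionKey, int] = defaultdict(int)
    for s in samples:
        totals[(s.producer, s.topic, s.partition)] += s.messages
    return dict(totals)


def by_topic(partition_totals: Dict[PartitionKey, int]) -> Dict[str, int]:
    """Roll partition-level totals up to one total per topic."""
    totals: Dict[str, int] = defaultdict(int)
    for (_, topic, _), count in partition_totals.items():
        totals[topic] += count
    return dict(totals)


def by_producer(partition_totals: Dict[PartitionKey, int]) -> Dict[str, int]:
    """Roll partition-level totals up to one total per producer."""
    totals: Dict[str, int] = defaultdict(int)
    for (producer, _, _), count in partition_totals.items():
        totals[producer] += count
    return dict(totals)
```

Keeping the finest grain (the partition) as the stored metric means coarser views can always be derived from it, while the reverse is not possible.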
Performance Enhancements
NiFi 1.7.0	– Performance	Enhancements
• NiFi flows can have as many as 100K processors
• More	stability	on	large	NiFi	clusters
• Improved	multi-tenant	concurrent	usage
Hortonworks Streams Messaging Manager
Streaming	Analytics	Reference	Architecture
Kafka is Everywhere. Critical Component of Streaming Architectures
Data Collection at the Edge (Kafka Producers)
• US West Fleet: Truck Sensors, C++ Agent
• US Central Fleet: Truck Sensors, C++ Agent
• US East Fleet: Truck Sensors, C++ Agent
IoT Ingest Gateway, Powered by Kafka (Kafka Topics)
• gateway-west-raw-sensors
• gateway-central-raw-sensors
• gateway-east-raw-sensors
Data Flow Apps, Powered by NiFi (Kafka Consumers & Producers)
Data Syndication Services, Powered by Kafka (Kafka Topics)
• syndicate-speed
• syndicate-geo
• syndicate-transmission
• syndicate-temp
• syndicate-oil
• syndicate-breaks
• syndicate-battery
• syndicate-start/stop
• syndicate-acceleration
• syndicate-idle
Subscribing Streaming Analytics Apps (Kafka Consumers)
• Analytics App 1
• Analytics App 2
• Analytics App 3
• Analytics App 4
• Analytics App 5
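The middle stage of this architecture, where data flow apps read the regional gateway topics and republish per-sensor syndication topics, can be sketched as a pure function. The topic labels are taken from the slide; the record layout (`truck`, `readings`) and the function names are assumptions for illustration only, and no Kafka client is used. Note that `syndicate-start/stop` is kept as the slide shows it, although a real Kafka topic name cannot contain `/`.

```python
from typing import Dict, Iterator, Tuple

# Sensor labels exactly as they appear on the slide.
SENSORS = ("speed", "geo", "transmission", "temp", "oil", "breaks",
           "battery", "start/stop", "acceleration", "idle")
SYNDICATE_TOPICS = {name: f"syndicate-{name}" for name in SENSORS}

_PREFIX, _SUFFIX = "gateway-", "-raw-sensors"


def region_of(gateway_topic: str) -> str:
    """Extract the fleet region from a 'gateway-<region>-raw-sensors' topic name."""
    region = gateway_topic[len(_PREFIX):-len(_SUFFIX)]
    if not (gateway_topic.startswith(_PREFIX)
            and gateway_topic.endswith(_SUFFIX) and region):
        raise ValueError(f"not a gateway topic: {gateway_topic!r}")
    return region


def fan_out(gateway_topic: str, record: Dict) -> Iterator[Tuple[str, Dict]]:
    """Split one raw record into (syndicate topic, message) pairs, one per known sensor."""
    region = region_of(gateway_topic)
    for sensor, value in record["readings"].items():
        topic = SYNDICATE_TOPICS.get(sensor)
        if topic is None:
            continue  # unknown sensor: skipped in this sketch
        yield topic, {"truck": record["truck"], "region": region, "value": value}


if __name__ == "__main__":
    raw = {"truck": "t-17", "readings": {"speed": 61, "oil": 0.8}}
    for topic, message in fan_out("gateway-west-raw-sensors", raw):
        print(topic, message)
```

Splitting by sensor type lets each analytics app subscribe only to the topics it needs instead of filtering the full raw feed.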
HORTONWORKS DATA PLATFORM (HDP®) DATA-AT-REST
HORTONWORKS DATAFLOW (HDF) DATA-IN-MOTION
SMM REST Server
PRODUCERS, CONSUMERS, TOPICS, BROKERS
• Optimize your Kafka Clusters
• Troubleshoot Kafka Streams
• Trace end-to-end Kafka flows
Streams	Messaging	Manager	(SMM)
Cure	is	Here:	Hortonworks	Streams	Messaging	Manager	(SMM)
What	is	SMM?
• Kafka Management and Monitoring tool
• Cure the “Kafka Blindness”
• Single Monitoring Dashboard for all your Kafka Clusters across 4 entities
– Broker
– Producer
– Topic
– Consumer
• Supports multiple HDP and/or HDF Kafka Clusters
• REST as a First Class Citizen
• Delivered as a DataPlane Service
Questions?
