1 ©	Hortonworks	Inc.	2011–2018.	All	rights	reserved.
Wade	Salazar	- Senior	Solutions	Engineer,	Hortonworks	
Analytics	&	Industrial	Process	Data
OT	/	IT	Convergence	– Must	Occur	to	Achieve	Business	Improvement	
Source:	IBM
Connected Data Platforms Enable IIoT in Energy & Utilities
Source:	https://www.cm-collaborative-tech.com/wp-content/uploads/2016/11/Smart-grid-A-1.jpg
Predictive Maintenance
Fraud Detection
External Sources (Weather, Social Media, GPS, etc.)
Single View of Customer
Highest Value Data
Always-on, always-connected devices generate a constant stream of data related to the operations of industrial businesses
These datasets contain:
• What	events	occurred	
• Why an event occurred, or did not
• Quantification	of	an	event’s	impact
These datasets go by many names:
• “SCADA	Data”
• “Control	System	Data”
• “Historian	Data”
• “Machine	Data”
• “Measurement	Logs”
• “Telemetry”
How	are	my	…
People?
Processes?
Equipment?
Lots	of	misnomers
Instrumentation
§ Commonly, the only output is an electrical signal
§ Integration with sensors requires specialized hardware
§ Serial bus or wireless interfaces are increasingly available
Challenges	in	accessing	data	in	the	ICS	landscape
Control	Systems
§ Data is transmitted via proprietary, vendor-specific protocols
§ Direct integration with control systems requires protocol translation/parsing for each platform family
NiFi is a toolbox of connectors
§ Ingest text files and interrogate REST APIs using built-in connectors
§ Connect to industry-standard protocols like OPC UA with custom processors
§ Build your own
Existing	ICS	Components
PLC,	RTU	&	DCS
Open	Source	Tools
Governance & Integration
Security
Operations
Data Access
Data Management
Process Historians & OPC Servers
§ Data is typically available via programmatic access such as OPC, API or SQL
§ There is almost always an option to create text files
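Since historians can almost always produce text files, a minimal sketch of turning such an export into per-sample JSON records may help. The CSV layout, column names (`tag`, `timestamp`, `value`, `quality`) and tag names below are assumptions for illustration only; real historian exports vary by vendor.

```python
import csv
import io
import json

# Hypothetical historian text export: one row per sample. The column names
# and tag names are assumptions, not a vendor format.
SAMPLE_EXPORT = """tag,timestamp,value,quality
PUMP01.DischargePressure,2018-03-01T00:00:00Z,101.3,Good
PUMP01.DischargePressure,2018-03-01T00:00:10Z,101.9,Good
PUMP01.MotorTemp,2018-03-01T00:00:00Z,,Bad
"""


def parse_export(text):
    """Turn a CSV text export into a list of per-sample dictionaries."""
    samples = []
    for row in csv.DictReader(io.StringIO(text)):
        raw = row["value"].strip()
        samples.append({
            "tag": row["tag"],
            "timestamp": row["timestamp"],
            # Keep missing readings as None rather than guessing a value.
            "value": float(raw) if raw else None,
            "quality": row["quality"],
        })
    return samples


if __name__ == "__main__":
    for sample in parse_export(SAMPLE_EXPORT):
        print(json.dumps(sample))
```

Keeping missing readings as explicit nulls, instead of dropping the rows, leaves the decision about imputation to the analytics layer.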
Stepwise	approach	to	the	challenge
Remote Field or Manufacturing Site (Location 1):
• RDBMS & EDW
• Files / PDFs / Other Unstructured Data
• Photos, Video & Audio
• IoT Gateways
• Modbus/OPC/HTTPS/WITSML
• SCADA, DCS, PLC, RTU, Historians
Data Consumers:
• Data Marts
• Analytics, Statistics & Science
• Visualization & Dashboards
Data	lakes	address	part	of	the	problem
Field or Manufacturing Site (Location 1):
• RDBMS & EDW
• Files / PDFs / Other Unstructured Data
• Photos, Video & Audio
• IoT Gateways
• Modbus/OPC/HTTPS/WITSML
• SCADA, DCS, PLC, RTU, Historians
Data Consumers:
• Data Marts
• Analytics, Statistics & Science
• Visualization & Dashboards
Connected platform approach addresses the end-to-end challenge
Field or Manufacturing Site (Location 1):
• RDBMS & EDW
• Files / PDFs / Other Unstructured Data
• Photos, Video & Audio
• IoT Gateways
• Modbus/OPC/HTTPS/WITSML
• SCADA, DCS, PLC, RTU, Historians
Data Consumers:
• Data Marts
• Analytics, Statistics & Science
• Visualization & Dashboards
Deployment	Model	for	The	Connected	Data	Platform		
Field or Manufacturing Site (Location 1 through Location n):
• RDBMS & EDW
• Files / PDFs / Other Unstructured Data
• Photos, Video & Audio
• IoT Gateways
• Modbus/OPC/HTTPS/WITSML
• SCADA, DCS, PLC, RTU, Historians
Office or Datacenter:
• Central HDF Cluster
• Central HDP Cluster
Data Consumers:
• Data Marts
• Analytics, Statistics & Science
• Visualization & Dashboards
HDF & HDP Component Breakdown
Field or Manufacturing Site (Location 1 through Location n):
• RDBMS & EDW
• Files / PDFs / Other Unstructured Data
• Photos, Video & Audio
• IoT Gateways
• Modbus/OPC/HTTPS/WITSML
• SCADA, DCS, PLC, RTU, Historians
Office or Datacenter:
• Central HDF Cluster
• Central HDP Cluster
Data Consumers:
• Data Marts
• Analytics, Statistics & Science
• Visualization & Dashboards
Allen Bradley Example
Data Sources, Data Flow, Data Platform
Equipment Control System, Asset Based NiFi, Azure Based NiFi
DATA IN MOTION, DATA AT REST
Ethernet/IP, NiFi S2S, HBase API
kb/s, MB/s, MB/s, GB/d
The PLC program cyclically updates the values of 20k PLC registers.
Kepware automatically downloads the AB PLC’s tag database and configures polling of all available tags in the AB PLC.
In a similar fashion, NiFi browses Kepware’s database and polls each tag found in the IoT gateway database. The frequency of this polling is set in NiFi.
NiFi merges and then compresses a configurable number of responses from the Kepware server before transmitting them over the Site-to-Site protocol.
NiFi receives, decompresses, and then splits the merged text documents into small JSON documents containing individual data point samples.
Each sample is inserted into HBase serially using the NiFi Put processors.
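The merge, compress, decompress and split steps in this example can be sketched outside NiFi in plain Python. This is only an illustration of the data handling under stated assumptions, not how NiFi implements it: the function names, the newline-delimited batch layout and the sample fields (`tag`, `value`, `ts`) are all hypothetical.

```python
import gzip
import json


def merge_and_compress(responses, batch_size):
    """Group poll responses into newline-delimited batches and gzip each one."""
    batches = []
    for start in range(0, len(responses), batch_size):
        chunk = responses[start:start + batch_size]
        merged = "\n".join(json.dumps(r) for r in chunk)
        batches.append(gzip.compress(merged.encode("utf-8")))
    return batches


def decompress_and_split(batch):
    """Reverse the sender side: gunzip a batch and return one JSON document per sample."""
    text = gzip.decompress(batch).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line]


if __name__ == "__main__":
    # Hypothetical poll responses; tag names and fields are illustrative only.
    responses = [
        {"tag": f"AB_PLC.N7:{i}", "value": i * 1.5, "ts": 1520000000 + i}
        for i in range(5)
    ]
    for batch in merge_and_compress(responses, batch_size=2):
        for sample in decompress_and_split(batch):
            # The receiving side would insert each sample individually.
            print(sample)
```

Batching before compression is what shrinks the traffic between sites; `batch_size` here stands in for the configurable number of responses mentioned above.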
Typical	Goals	for	an	Industrial	Analytics	Practice
• Data democratization (broad, simple access)
• Event processing – create events or react to variables (e.g. pump overheat, weather, emission)
• Forecasting / Prediction – predict the most likely value
• Event correlation – measure the coincidence of two things, or the likeness of events or periods of time
• Impute missing values – what are the most likely values of missing data?
• Data normalization – clean up messy time series for BI purposes
• Anomaly detection – find “out of normal” events in a series, based on a model of expected behavior
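Two of the goals above, imputing missing values and anomaly detection, can be illustrated with a small standard-library sketch. The techniques are swapped-in examples, not the presenter's method: linear interpolation stands in for imputation, and a rolling z-score stands in for a "model of expected behavior". The function names, window size, threshold and sample readings are assumptions.

```python
from statistics import mean, stdev


def impute_linear(values):
    """Fill None gaps by linear interpolation between the nearest known neighbours.

    Leading or trailing gaps have no neighbour on one side and stay None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for i in range(left + 1, right):
            frac = (i - left) / gap
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled


def flag_anomalies(values, window=5, threshold=3.0):
    """Return indices that deviate from the mean of the preceding window
    by more than `threshold` standard deviations. Expects a gap-free series."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(values[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Hypothetical temperature readings with one gap and one spike.
    series = [70.0, 70.5, None, 71.5, 71.0, 70.8, 95.0, 71.2]
    filled = impute_linear(series)
    print(filled)
    print(flag_anomalies(filled))
```

Imputing first matters here because the rolling statistics cannot be computed over a window that contains missing values.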
Questions?		How	can	we	
help	you	get	started?

TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA