1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Harnessing Data-in-Motion with Hortonworks DataFlow
Edge intelligence for IoT with Apache MiNiFi (subproject of Apache NiFi)
Joe Percivall, Software Engineer
Anna Yong, Product Marketing
Agenda
• Apache NiFi Refresher
• Intro to Apache MiNiFi
• Demo
• Next Step
• How MiNiFi can be used – use cases
EMBRACE AN OPEN APPROACH
MASTER THE VALUE OF DATA
EVERY BUSINESS IS A DATA BUSINESS
Harnessing Data in Motion
Sources
• Constrained
• High-latency
• Localized context
Regional Infrastructure
• Hybrid – cloud / on-premises
Core Infrastructure
• Low-latency
• Global context
Hortonworks DataFlow Manages Data in Motion
Sources
• Constrained
• High-latency
• Localized context
Regional Infrastructure
• Hybrid – cloud / on-premises
Core Infrastructure
• Low-latency
• Global context
Requirements for Data in Motion
Perishable Insights
Connectivity
Prioritization
Security
Adaptability
Extensibility
Scalability
Provenance
Real-Time
Connecting Data Between Ecosystems Without Coding: 170+ Processors
Hash
Extract
Merge
Duplicate
Scan
GeoEnrich
Replace
Convert
Split
Translate
Route Content
Route Context
Route Text
Control Rate
Distribute Load
Generate Table Fetch
Jolt Transform JSON
Prioritized Delivery
Encrypt
Tail
Evaluate
Execute
HL7
FTP
UDP
XML
SFTP
HTTP
Syslog
Email
HTML
Image
AMQP
MQTT
Fetch
All Apache project logos are trademarks of the ASF and the respective projects.
Apache NiFi Key Features
• Guaranteed delivery
• Data buffering
‒ Backpressure
‒ Pressure release
• Prioritized queuing
• Flow specific QoS
‒ Latency vs. throughput
‒ Loss tolerance
• Data provenance
• Recovery / recording a rolling log of fine-grained history
• Designed for extension
• Visual command and control
• Flow templates
• Policy based security
• Clustering
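The buffering features listed for NiFi can be illustrated with a toy model. The sketch below is not NiFi code, and every name in it (`BoundedPriorityQueue`, `max_items`, `max_age_s`) is hypothetical. It assumes a queue that refuses new items once a threshold is reached (backpressure), drops items that have waited longer than an expiration age (one possible reading of "pressure release"), and hands out the most urgent item first (prioritized queuing).

```python
import heapq
import itertools


class BoundedPriorityQueue:
    """Toy connection queue: backpressure, expiration, priority order."""

    def __init__(self, max_items, max_age_s):
        self.max_items = max_items     # backpressure threshold
        self.max_age_s = max_age_s     # age after which data is dropped
        self._heap = []                # entries: (priority, seq, enqueued_at, payload)
        self._seq = itertools.count()  # tie-breaker: FIFO within one priority

    def _expire(self, now):
        # Pressure release: discard items that have waited too long.
        self._heap = [e for e in self._heap if now - e[2] <= self.max_age_s]
        heapq.heapify(self._heap)

    def offer(self, payload, priority, now):
        """Return True if accepted, False if the source must back off."""
        self._expire(now)
        if len(self._heap) >= self.max_items:
            return False               # backpressure: upstream should slow down
        heapq.heappush(self._heap, (priority, next(self._seq), now, payload))
        return True

    def poll(self, now):
        """Return the most urgent unexpired payload, or None if empty."""
        self._expire(now)
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[3]


q = BoundedPriorityQueue(max_items=2, max_age_s=60)
accepted = [
    q.offer("routine reading", priority=5, now=0),
    q.offer("alarm", priority=1, now=1),
    q.offer("another reading", priority=5, now=2),  # rejected: queue is full
]
first = q.poll(now=3)     # the lowest priority number is served first
second = q.poll(now=100)  # the routine reading has expired by this time
```

Time is passed in explicitly (`now`) so the behavior is deterministic; a real flow would use a clock, and would typically bound the queue by data size as well as item count.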
Simplified Ingestion Example
Let’s consider the needs of a courier service
[Diagram: courier service topology]
• Physical Store: Gateway Server, Mobile Devices, Registers
• Distribution Center: Server Cluster
• Core Data Center at HQ: Server Cluster
• On Delivery Routes: Trucks, Deliverers
Image credits:
Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/
Deliverer: Rigo Peter, https://thenounproject.com/rigo/
Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/
Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/
Courier service from the perspective of NiFi
[Diagram: the courier topology (Physical Store, Distribution Center, Core Data Center at HQ, On Delivery Routes) labeled with six NiFi instances]
Courier service from the perspective of NiFi & MiNiFi
[Diagram: the courier topology (Physical Store, Distribution Center, Core Data Center at HQ, On Delivery Routes) labeled with six NiFi instances, two MiNiFi instances, and three sets of client libraries]
Created to more effectively collect data at the edge
Apache MiNiFi Key Features
• Guaranteed delivery
• Data buffering
‒ Backpressure
‒ Pressure release
• Prioritized queuing
• Flow specific QoS
‒ Latency vs. throughput
‒ Loss tolerance
• Data provenance
• Recovery / recording a rolling log of fine-grained history
• Designed for extension
Different from Apache NiFi
• Design and Deploy
• Warm re-deploys
Command and Control vs. Design and Deploy
Different MiNiFi Distributions
Different focuses depending on requirements
• Java
– <40MB binary distribution
– Requires Java 1.8
– More feature complete
– Targets any system that can run a JVM (e.g., servers, Raspberry Pi)
• C++
– 600KB code size and ~50KB static data
– Dynamic heap of ~1MB, depending on use case
– Targets resource-constrained environments (e.g., edge IoT devices)
• Both use the same config format and NiFi terminology
Typical Simple Example
Tail a rolling file -> Site to Site
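As a rough illustration of what "tailing a rolling file" involves, the sketch below keeps a byte offset into a log file, emits only complete new lines, and starts over when the file appears to have been rolled. It is a simplified stand-in, not the TailFile processor's algorithm: the class name `RollingFileTailer` is hypothetical, rollover is detected only by the file shrinking, and the Site to Site hand-off is left out.

```python
import os
import tempfile


class RollingFileTailer:
    """Minimal tailer: remembers a byte offset and restarts after a roll."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # byte position of the next unread data

    def read_new_lines(self):
        """Return the complete lines appended since the last call."""
        try:
            size = os.path.getsize(self.path)
        except FileNotFoundError:
            return []        # file is mid-rotation; try again later
        if size < self.offset:
            self.offset = 0  # file shrank: assume it was rolled over
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            chunk = f.read()
        # Consume only up to the last newline so partial lines are held back.
        end = chunk.rfind(b"\n") + 1
        self.offset += end
        return chunk[:end].decode("utf-8").splitlines()


with tempfile.TemporaryDirectory() as d:
    log = os.path.join(d, "app.log")
    tailer = RollingFileTailer(log)
    with open(log, "w") as f:
        f.write("a\nb\npart")
    batch1 = tailer.read_new_lines()  # the unfinished line is held back
    with open(log, "a") as f:
        f.write("ial\n")
    batch2 = tailer.read_new_lines()
    os.replace(log, log + ".1")       # simulate the log being rolled
    with open(log, "w") as f:
        f.write("c\n")
    batch3 = tailer.read_new_lines()  # offset resets for the new file
```

This simple size check would miss a roll if the new file had already grown past the old offset, and it loses anything written to the old file just before the roll; handling those cases is what a rolling filename pattern is for.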
Simple config.yml
Tail a rolling file -> Site to Site
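The config shown on this slide did not survive extraction. As a hedged stand-in, the sketch below renders a trimmed config.yml for the tail-to-Site-to-Site flow from a template, which is also one way a "design and deploy" workflow could stamp out per-device configs. The YAML key names are written from memory of the early MiNiFi Java schema and may not match a given release, so check them against the MiNiFi System Administrator's Guide. The helper `render_config`, the flow name, the log path, the NiFi URL, and the port id are all hypothetical, and a real file needs more sections than are shown here.

```python
CONFIG_TEMPLATE = """\
Flow Controller:
    name: {flow_name}

Processors:
    - name: TailFile
      class: org.apache.nifi.processors.standard.TailFile
      max concurrent tasks: 1
      scheduling strategy: TIMER_DRIVEN
      scheduling period: 1 sec
      Properties:
          File to Tail: {log_path}

Connections:
    - name: TailToS2S
      source name: TailFile
      source relationship name: success
      destination name: {input_port_id}

Remote Processing Groups:
    - name: NiFi Flow
      url: {nifi_url}
      Input Ports:
          - id: {input_port_id}
            name: tailed log
"""


def render_config(flow_name, log_path, nifi_url, input_port_id):
    """Fill in the per-device values of the tail -> Site to Site flow."""
    return CONFIG_TEMPLATE.format(
        flow_name=flow_name,
        log_path=log_path,
        nifi_url=nifi_url,
        input_port_id=input_port_id,
    )


# All values below are placeholders chosen for illustration.
config_text = render_config(
    flow_name="store-42-register-logs",
    log_path="/var/log/register/app.log",
    nifi_url="https://nifi.example.com:8443/nifi",
    input_port_id="00000000-0000-0000-0000-000000000000",
)
print(config_text)
```

The connection's destination and the remote input port share one id, which is how the tailed data is pointed at a specific input port on the receiving NiFi instance in this sketch.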
NiFi and MiNiFi, Java and C++
NiFi vs MiNiFi Java Processes
• NiFi: NiFi Framework, User Interface, Components
• MiNiFi: NiFi Framework, Components
NiFi Java Processes
• Bootstrap reads bootstrap.conf and starts NiFi
• NiFi (including the UI) reads nifi.properties
• NiFi reads & modifies flow.xml.gz
MiNiFi Java Processes
• MiNiFi Bootstrap, with its Configuration Change Notifier(s), reads bootstrap.conf and config.yml
• Bootstrap transforms config.yml into nifi.properties and flow.xml.gz
• Bootstrap starts MiNiFi, which reads nifi.properties and flow.xml.gz
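The transform step in the MiNiFi Java bootstrap can be pictured as one input producing two derived files. The sketch below is an illustration only, assuming the config has already been parsed into a plain dict: the function name `transform`, the dict layout, and the XML element names are hypothetical and do not reproduce the real flow.xml.gz schema or the actual bootstrap code.

```python
import gzip
import xml.etree.ElementTree as ET


def transform(config):
    """Turn a parsed config (a plain dict here) into the two derived files."""
    # 1. Properties file: one key=value pair per line.
    props = "\n".join(
        f"{key}={value}" for key, value in sorted(config["properties"].items())
    ) + "\n"

    # 2. Flow definition: XML, gzip-compressed in the manner of flow.xml.gz.
    root = ET.Element("flowController")  # element names are illustrative
    for proc in config["processors"]:
        node = ET.SubElement(root, "processor")
        ET.SubElement(node, "name").text = proc["name"]
        ET.SubElement(node, "class").text = proc["class"]
    flow_xml_gz = gzip.compress(ET.tostring(root, encoding="utf-8"))
    return props, flow_xml_gz


config = {
    "properties": {"nifi.flow.configuration.file": "./conf/flow.xml.gz"},
    "processors": [
        {"name": "TailFile",
         "class": "org.apache.nifi.processors.standard.TailFile"},
    ],
}
props_text, flow_bytes = transform(config)

# Round-trip check: the compressed bytes decode back to the same flow.
flow_root = ET.fromstring(gzip.decompress(flow_bytes))
```

Keeping the framework's native files as derived artifacts is what lets the MiNiFi process run the same framework as NiFi while the operator edits only the one config file.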
Same Extensible Framework (NARs)
Allows MiNiFi Java to use NiFi processors
• NiFi standard processors are bundled
– TailFile
– UpdateAttribute
– Route on content and attributes
– PutEmail
– …
MiNiFi C++ Features
• Same design and deploy configuration
• Initial set of processors
– TailFile
– GetFile
– GenerateFlowFile
– LogAttribute
– ListenSyslog
• Site to Site Client implementation in C++ for talking to NiFi instances
Demo: Digital Temp & Humidity Sensor, Raspberry Pi Zero, MiNiFi
• Raspberry Pi Zero with MiNiFi
• USB hub for power & WiFi
• Digital Temp & Humidity Measurement Sensor
MiNiFi Command and Control
Currently a feature proposal, initial version being architected
• Design Flow at a centralized place, deploy on the edge
• Version control of flows
– Align with NiFi SDLC work
• Agent status monitoring
• Bi-directional command and control
https://cwiki.apache.org/confluence/display/MINIFI/MiNiFi+Command+and+Control
Hortonworks DataFlow Use Cases
[Diagram: Sources, Regional Infrastructure, Core Infrastructure]
Dataflow Management
• On-ramp into Hadoop
• Log Collection / Splunk Optimization
• Cyber Security
• IoT Ingestion
• Deliver data into stream processing engines
Real-time Event Processing (Kafka, Storm)
Move data between on-prem and cloud environments
Apache MiNiFi Use Case: Distributed Systems: Retail POS
Hundreds, thousands, or more terminals to gather:
• Real-time pricing info
• Real-time inventory updates
• Real-time offers
• Traceable logs of transactions
• Log of user interactions for troubleshooting devices
• Predictive help desk
Apache MiNiFi Use Case: Distributed Systems: Connected Car
[Diagram: connected-car data flow, with callouts numbered 1 to 5 on the original slide]
• Car Wireless Gateway: MiNiFi with an SD Card; raw data arrives via CAN and Ethernet
• Data classes: real-time data; delay-tolerant data, buffered offline
• Wireless links: Wi-Fi (Wi-Fi AP), DSRC (DSRC RSU), LTE (eNodeB)
• Back end: Data Storage, Decision-Making Server, Map Database
• Possibly modify route or adjust sampling rate and prepare to flush delay-tolerant buffers when DSRC RSU or Wi-Fi AP in range
Hortonworks maximizes intelligence with platforms that deliver actionable insights from connected car data
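The buffering behavior named in the connected-car diagram can be sketched as a small routing rule. This is an illustration under stated assumptions, not MiNiFi code: the class `GatewayBuffer` and its methods are hypothetical, and the choice to send real-time data over LTE while holding delay-tolerant data for Wi-Fi or DSRC is an assumption about the intent of the slide.

```python
from collections import deque


class GatewayBuffer:
    """Toy car gateway: send urgent data now, hold the rest for a bulk link."""

    BULK_LINKS = ("Wi-Fi", "DSRC")  # assumed to be the links worth flushing on

    def __init__(self, max_buffered):
        # Once full, the oldest delay-tolerant records are discarded.
        self.buffer = deque(maxlen=max_buffered)
        self.sent = []  # stands in for the uplink: (link, record) pairs

    def on_record(self, record, real_time, links):
        """Route one record given the set of links currently in range."""
        if real_time and links:
            link = "LTE" if "LTE" in links else sorted(links)[0]
            self.sent.append((link, record))
        else:
            self.buffer.append(record)

    def on_link_change(self, links):
        """Flush buffered records when a bulk link comes into range."""
        bulk = [name for name in self.BULK_LINKS if name in links]
        if not bulk:
            return 0
        flushed = len(self.buffer)
        while self.buffer:
            self.sent.append((bulk[0], self.buffer.popleft()))
        return flushed


gw = GatewayBuffer(max_buffered=1000)
gw.on_record({"speed": 88}, real_time=True, links={"LTE"})
gw.on_record({"diag": "log-1"}, real_time=False, links={"LTE"})
gw.on_record({"diag": "log-2"}, real_time=False, links={"LTE"})
flushed_on_lte = gw.on_link_change({"LTE"})            # no bulk link yet
flushed_on_wifi = gw.on_link_change({"LTE", "Wi-Fi"})  # Wi-Fi AP in range
```

The bounded `deque` is a deliberate choice for a constrained device: when the car stays out of range for a long time, the oldest delay-tolerant records are dropped rather than exhausting storage.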
GENIVI Alliance: Open Source, In-Vehicle Infotainment Software
The GENIVI Alliance is a nonprofit industry alliance committed to driving the broad adoption of specified, open source, In-Vehicle Infotainment software.
Apache MiNiFi Use Cases: Distributed Systems of Devices
Any distributed system of devices with data to be collected
• Routers
• Security cameras
• Cable modems
• ATMs
• Fleet of Trucks
• Manufacturing Line
• Security Appliance
• Point of Sale
• Weather detection system
• Thermostats
• Utility/Power meters
• Fleet of Ships
Questions?
Learn more and join us!
Apache NiFi site
http://nifi.apache.org
Subproject MiNiFi site
http://nifi.apache.org/minifi/
Subscribe to and collaborate at
dev@nifi.apache.org
users@nifi.apache.org
Submit Ideas or Issues
https://issues.apache.org/jira/browse/NIFI
https://issues.apache.org/jira/browse/MINIFI
Follow us on Twitter
@apachenifi
Questions?
Hortonworks Community Connection:
Data Ingestion and Streaming
https://community.hortonworks.com/
Thank you!

Hortonworks Data in Motion Webinar Series - Part 6, Edge Intelligence IoT MiniFi

