© Hortonworks Inc. 2011–2018. All rights reserved;1
Dataflow Management From Edge to Core with Apache NiFi
Andy LoPresto | @yolopey
Sr. Member of Technical Staff at Hortonworks, Apache NiFi PMC & Committer
06 February 2019 Dataworks Summit Melbourne
© Hortonworks Inc. 2011–2019. All rights reserved;2
Acknowledgement of Country
I acknowledge the Traditional Owners of the land on which we
are meeting. I pay my respects to their Elders, past and
present, and the Aboriginal Elders of other communities who
may be here today.
© Hortonworks Inc. 2011–2018. All rights reserved;3
Gauging Audience Familiarity With NiFi
“What’s a NeeFee?”
No experience with dataflow
No experience with NiFi
“I can pick this up pretty quickly”
Some experience with dataflow
Some experience with NiFi
“I refactored the Ambari integration endpoint to allow for mutual authentication TLS during my coffee break”
Forgotten more about NiFi than most of us will ever know
© Hortonworks Inc. 2011–2018. All rights reserved;4
Agenda
• What is dataflow and what are the challenges?
• Apache NiFi
• Apache MiNiFi
• Apache NiFi Registry
• Complementary Tools
• Community
• All slides provided online, so no need to transcribe
© Hortonworks Inc. 2011–2018. All rights reserved;5
What is dataflow?
© Hortonworks Inc. 2011–2018. All rights reserved;6
What is dataflow?
• Moving some content from A to B
• Content could be any bytes
• Logs
• HTTP
• XML
• CSV
• Images
• Video
• Telemetry
Producers, A.K.A. Things
Anything AND Everything
Internet!
Consumers
• User
• Storage
• System
• …More Things
© Hortonworks Inc. 2011–2018. All rights reserved;7
Moving data effectively is hard
“Data Pipeline” https://xkcd.com/2054/
© Hortonworks Inc. 2011–2018. All rights reserved;8
Dataflow Challenges In 3 Categories
Data
• Standards
• Formats
• Protocols
• Veracity
• Validity
• Schemas
• Partitioning/Bundling
Infrastructure
• “Exactly Once” Delivery
• Ensuring Security
• Overcoming Security
• Credential Management
• Network
People
• Compliance
• “That [person|team|group]”
• Consumers Change
• Requirements Change
• “Exactly Once” Delivery
© Hortonworks Inc. 2011–2018. All rights reserved;9
Raise your hand if you want to maintain Python scripts for the rest of your life
Let’s Connect Lots of As to Bs to As to Cs to Bs to Δs to Cs to ϕs
© Hortonworks Inc. 2011–2018. All rights reserved;10
Apache NiFi
© Hortonworks Inc. 2011–2018. All rights reserved;11
Apache NiFi
Key Features
• Guaranteed delivery
• Data buffering
• Backpressure
• Pressure release
• Prioritized queuing
• Flow specific QoS
• Latency vs. throughput
• Loss tolerance
• Data provenance
• Supports push and pull models
• Recovery/recording a rolling log of fine-grained history
• Visual command and control
• Flow templates
• Pluggable, multi-tenant security
• Designed for extension
• Clustering
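To make two of the features above concrete (backpressure and prioritized queuing), here is a minimal Python sketch. It is a toy model, not NiFi code: the class name, the object-count threshold, and the "lower number wins" priority rule are all assumptions made for illustration. In this toy the producer is simply refused when the queue is full.

```python
import heapq
import itertools


class BoundedPriorityQueue:
    """Toy connection queue: prioritized, with an object-count backpressure threshold."""

    def __init__(self, backpressure_threshold):
        self.backpressure_threshold = backpressure_threshold  # hypothetical setting
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order within a priority

    def is_full(self):
        return len(self._heap) >= self.backpressure_threshold

    def offer(self, item, priority):
        """Enqueue and return True, or return False when backpressure applies."""
        if self.is_full():
            return False
        heapq.heappush(self._heap, (priority, next(self._counter), item))
        return True

    def poll(self):
        """Return the item with the lowest priority number, or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = BoundedPriorityQueue(backpressure_threshold=2)
accepted = [
    queue.offer("telemetry", priority=5),
    queue.offer("alert", priority=1),
    queue.offer("log line", priority=9),  # refused: the queue already holds 2 items
]
first = queue.poll()  # "alert" leaves first because it has the lowest priority number
print(accepted, first)
```

Once an item has been polled, the queue drops below the threshold and accepts new items again, which is the basic shape of buffering with backpressure.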
© Hortonworks Inc. 2011–2018. All rights reserved;12
FlowFiles Are Like HTTP Data
HTTP Data
Header
HTTP/1.1 200 OK
Date: Sun, 10 Oct 2010 23:26:07 GMT
Server: Apache/2.2.8 (CentOS) OpenSSL/0.9.8g
Last-Modified: Sun, 26 Sep 2010 22:04:35 GMT
ETag: "45b6-834-49130cc1182c0"
Accept-Ranges: bytes
Content-Length: 13
Connection: close
Content-Type: text/html
Content
Hello world!
FlowFile
Header
Standard FlowFile Attributes
Key: 'entryDate' Value: 'Fri Jun 17 17:15:04 EDT 2016'
Key: 'lineageStartDate' Value: 'Fri Jun 17 17:15:04 EDT 2016'
Key: 'fileSize' Value: '23609'
FlowFile Attribute Map Content
Key: 'filename' Value: '15650246997242'
Key: 'path' Value: './'
Content
Binary Content *
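The header/content analogy can be sketched in a few lines of Python. This is an illustrative model only, assuming a FlowFile can be approximated as a key/value attribute map plus opaque bytes; the class `ToyFlowFile` and its `header_view` method are hypothetical names, not part of NiFi.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFlowFile:
    """Illustrative only: attributes act like headers, content is opaque bytes."""
    content: bytes
    attributes: dict = field(default_factory=dict)

    def header_view(self):
        # Render attributes the way HTTP renders headers: "Key: value" lines.
        lines = [f"{key}: {value}" for key, value in sorted(self.attributes.items())]
        # The size is derived from the content, much like Content-Length.
        lines.append(f"fileSize: {len(self.content)}")
        return "\n".join(lines)


flowfile = ToyFlowFile(
    content=b"Hello world!\n",
    attributes={"filename": "15650246997242", "path": "./"},
)
print(flowfile.header_view())
```

The point of the split is that routing decisions can read the small attribute map without touching the content bytes.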
© Hortonworks Inc. 2011–2018. All rights reserved;13
User Interface
Less of this… more of this
© Hortonworks Inc. 2011–2018. All rights reserved;14
Deeper Ecosystem Integration: 286+ Processors, 61 Controller Services
Hash
Extract
Merge
Duplicate
Scan
GeoEnrich
Replace
Convert
Split
Translate
Route Content
Route Context
Route Text
Control Rate
Distribute Load
Generate Table Fetch
Jolt Transform JSON
Prioritized Delivery
Encrypt
Tail
Evaluate
Execute
All Apache project logos are trademarks of the ASF and the respective projects.
Fetch
HTTP
Syslog
Email
HTML
Image
HL7
FTP
UDP
XML
SFTP
AMQP
WebSocket
Parse Records
Convert Records
© Hortonworks Inc. 2011–2018. All rights reserved;15
Apache MiNiFi
© Hortonworks Inc. 2011–2018. All rights reserved;16
IoT Challenges
• Limited computing capability
• Limited power/network
• Restricted software library/platform availability
• No UI
• Physically inaccessible
• Not frequently updated
• Competing standards/protocols
• Scalability
• Privacy & Security
@_lennart
© Hortonworks Inc. 2011–2018. All rights reserved;17
So Why Do We Need A Different Solution?
• NiFi is designed to “own the box”
• NiFi 0.7.x started up in about 10-15 minutes on RP3 (593 MB)
• NiFi 1.x started up in about 30 minutes on RP3 (760 MB)
• 33 new processors
• Rewrite for multi-tenant authorization
• Complete UI overhaul
© Hortonworks Inc. 2011–2018. All rights reserved;18
Apache NiFi Subproject: MiNiFi
• Get the key parts of NiFi close to where data begins and provide bidirectional communication
• NiFi lives in the data center; give it an enterprise server or a cluster of them
• MiNiFi lives as close as possible to where data is born and is a guest on that device or system
• IoT
• Connected car
• Legacy hardware
© Hortonworks Inc. 2011–2018. All rights reserved;19
Flavors of MiNiFi
• MiNiFi Java (v0.5.0)
• Modified version of NiFi
• No UI
• YAML configuration
• Reduced processor count
• 63+ by default, more available with additional NARs
• MiNiFi C++ (v0.5.0)
• Written from scratch
• 33 processors by default
• Bi-directional site-to-site & provenance data
© Hortonworks Inc. 2011–2018. All rights reserved;20
How Does MiNiFi Interact With NiFi?
• NiFi
• Design flows
• Aggregate data from many sources
• Perform routing/analysis/SEP
• MiNiFi
• Receive flows
• Collect data
• Send for processing
© Hortonworks Inc. 2011–2018. All rights reserved;21
Let’s Add Dimensionality
• We’ve been imagining EDGE to CORE as a bi-directional linear system
• Let’s expand that to the real world
© Hortonworks Inc. 2011–2018. All rights reserved;22
What does MiNiFi provide?
• Data tagging/provenance
• Governance from edge (geopolitical restrictions)
• Security (encryption, certificate-based authentication)
• Low latency (immediate reactions & decision-making)
Connected Car Reference Platform Box
Tuner + DSRC Card
Connectivity Card
© Hortonworks Inc. 2011–2018. All rights reserved;23
MiNiFi Exfil
• Site-to-Site
• NiFi protocol
• Two implementations
• Raw socket
• HTTP(S)
• Secured with mutual authentication TLS
• HTTP(S), (S)FTP, JMS, Syslog, File, Email, Process
© Hortonworks Inc. 2011–2018. All rights reserved;24
Apache NiFi Registry
© Hortonworks Inc. 2011–2018. All rights reserved;25
Flow Development Lifecycle (FDLC)
• Origins of NiFi
• Operator Experience
• MC data, don’t drop, mitigate temporarily
• Version Control
• Environment Promotion
© Hortonworks Inc. 2011–2018. All rights reserved;26
Operator Experience
© Hortonworks Inc. 2011–2018. All rights reserved;27
Component Property History
• Shows previous values (user, time changed)
• Sensitive values are always encrypted at rest and never returned via the API
© Hortonworks Inc. 2011–2018. All rights reserved;28
Exporting Flows
• XML templates
• Copying flow.xml.gz between systems
© Hortonworks Inc. 2011–2018. All rights reserved;29
Challenges
• Templates
• Updates/replacement
• Sensitive property replacement
• flow.xml.gz migration
• Key synchronization
• Environment promotion
• Approval processes
• Verifiability
© Hortonworks Inc. 2011–2018. All rights reserved;30
Template Replacement
• Export a new version of template
• Transfer (somehow)
• Verify?
• Import onto canvas side-by-side with existing flow
• Stop processors
• Empty queues
• Reconnect queues
• Start
• Pray?
© Hortonworks Inc. 2011–2018. All rights reserved;31
Template Replacement
© Hortonworks Inc. 2011–2018. All rights reserved;32
Introducing Apache NiFi Registry 0.3.0
NiFi Registry for Dataflows
• Previously, flows were exported via XML templates
• Didn’t contain sensitive values
• Couldn’t be updated in-place
• No tracking system
• NiFi Registry brings asset management as a first-class citizen to NiFi
• Flows can be versioned
© Hortonworks Inc. 2011–2018. All rights reserved;33
Flows can be promoted between environments
• Connect multiple NiFi instances to a NiFi Registry instance
• Communicate between multiple NiFi Registry instances
• via multiple Registry Clients
• via NiFi CLI
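The idea of versioned flows promoted between environments can be sketched with a small in-memory model. This is a toy under stated assumptions, not the NiFi Registry API: `ToyRegistry`, `commit`, `latest`, and `promote` are hypothetical names, and a flow snapshot is represented as a plain dictionary.

```python
import copy


class ToyRegistry:
    """In-memory stand-in for a flow registry: bucket -> flow name -> list of versions."""

    def __init__(self, name):
        self.name = name
        self.buckets = {}

    def commit(self, bucket, flow, snapshot, comment=""):
        """Append a snapshot as the next version and return its version number."""
        versions = self.buckets.setdefault(bucket, {}).setdefault(flow, [])
        versions.append({"snapshot": copy.deepcopy(snapshot), "comment": comment})
        return len(versions)

    def latest(self, bucket, flow):
        """Return (version number, snapshot) for the newest version."""
        versions = self.buckets[bucket][flow]
        return len(versions), versions[-1]["snapshot"]


def promote(source, target, bucket, flow):
    """Copy the latest version from one registry into another as a new version."""
    version, snapshot = source.latest(bucket, flow)
    comment = f"Promoted from {source.name} v{version}"
    return target.commit(bucket, flow, snapshot, comment)


dev = ToyRegistry("dev")
prod = ToyRegistry("prod")
dev.commit("ingest", "syslog-flow", {"processors": ["ListenSyslog"]})
dev.commit("ingest", "syslog-flow", {"processors": ["ListenSyslog", "PutFile"]})
prod_version = promote(dev, prod, "ingest", "syslog-flow")
print(prod_version, prod.latest("ingest", "syslog-flow"))
```

Each environment keeps its own version history, so the second dev version arrives in prod as version 1 with a comment recording where it came from.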
© Hortonworks Inc. 2011–2018. All rights reserved;34
Extensibility
• Git-backed persistence
• Share flows via GitHub, etc.
• Commit hooks
• Register a hook & action
• “When a new version of the flow is committed to QA Registry, email the QA team and post in the QA Deploy Slack channel”
• Pluggable DB implementations
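The "register a hook & action" pattern can be sketched as a small dispatcher. This is a generic illustration, not NiFi Registry's hook mechanism: `HookRegistry`, the event label `CREATE_FLOW_VERSION`, and both action functions are assumptions, and the actions only format messages rather than sending anything.

```python
from collections import defaultdict


class HookRegistry:
    """Maps an event type to the actions that should run when it occurs."""

    def __init__(self):
        self._actions = defaultdict(list)

    def register(self, event_type, action):
        self._actions[event_type].append(action)

    def fire(self, event_type, **event):
        """Run every action registered for event_type and return their results."""
        return [action(event) for action in self._actions[event_type]]


def email_qa_team(event):
    # A real action would send mail; this sketch only formats the message.
    return f"email qa-team: {event['flow']} v{event['version']} committed"


def post_to_slack(event):
    # A real action would call a chat webhook; this sketch only formats the message.
    return f"slack #qa-deploy: {event['flow']} v{event['version']} committed"


hooks = HookRegistry()
hooks.register("CREATE_FLOW_VERSION", email_qa_team)
hooks.register("CREATE_FLOW_VERSION", post_to_slack)
messages = hooks.fire("CREATE_FLOW_VERSION", flow="syslog-flow", version=3)
print(messages)
```

Keeping actions as plain callables means a new notification channel is one more `register` call, with no change to the code that fires the event.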
© Hortonworks Inc. 2011–2018. All rights reserved;35
Demo
© Hortonworks Inc. 2011–2018. All rights reserved;36
Create Registry
• Install nifi-registry
• $ mvn clean install
• $ ./bin/nifi-registry.sh start
• Browse to http://localhost:18080
© Hortonworks Inc. 2011–2018. All rights reserved;37
Create Bucket
© Hortonworks Inc. 2011–2018. All rights reserved;38
Connect to NiFi
© Hortonworks Inc. 2011–2018. All rights reserved;39
Create Process Group
© Hortonworks Inc. 2011–2018. All rights reserved;40
Commit Version
© Hortonworks Inc. 2011–2018. All rights reserved;41
View flow in Registry
© Hortonworks Inc. 2011–2018. All rights reserved;42
Import new instance into NiFi
© Hortonworks Inc. 2011–2018. All rights reserved;43
Modify the original flow
© Hortonworks Inc. 2011–2018. All rights reserved;44
See local changes before committing
© Hortonworks Inc. 2011–2018. All rights reserved;45
Commit
© Hortonworks Inc. 2011–2018. All rights reserved;46
Update new instance from Registry
© Hortonworks Inc. 2011–2018. All rights reserved;47
Complementary Tools
© Hortonworks Inc. 2011–2018. All rights reserved;48
Complementary Tools
• NiFi Toolkit
• NiPyAPI
• MiNiFi Converter Toolkit
© Hortonworks Inc. 2011–2018. All rights reserved;49
NiFi Toolkit
• TLS Toolkit
• Generates, signs, and packages keys and certificates for NiFi services (node/cluster, clients)
• Encrypt Config
• Protects sensitive configuration values like passwords
• CLI
• Interacts with NiFi & NiFi Registry to operate on flows
© Hortonworks Inc. 2011–2018. All rights reserved;50
NiPyAPI
• Python wrapper around NiFi REST API
• Community-provided by Daniel Chaffelson
• Exposes common operations for automation, batch processing, recursion, etc.
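import nipyapi

# Example names only; they must match a bucket and flow that already exist in the Registry.
dev_bucket_name = 'dev_bucket'
dev_ver_flow_name = 'dev_ver_flow'
# Look up the bucket, then the versioned flow inside it, by name.
dev_bucket = nipyapi.versioning.get_registry_bucket(dev_bucket_name)
dev_ver_flow = nipyapi.versioning.get_flow_in_bucket(
    dev_bucket.identifier,
    identifier=dev_ver_flow_name
)
# Export the flow version as YAML.
dev_export = nipyapi.versioning.export_flow_version(
    bucket_id=dev_bucket.identifier,
    flow_id=dev_ver_flow.identifier,
    mode='yaml'
)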
© Hortonworks Inc. 2011–2018. All rights reserved;51
MiNiFi Converter Toolkit
• Save as template from NiFi
• Run $ ./bin/config.sh transform template.xml config.yml
• MiNiFi flow ready to run
© Hortonworks Inc. 2011–2018. All rights reserved;52
Community
© Hortonworks Inc. 2011–2018. All rights reserved;53
More Resources
• FDLC with Apache NiFi, Kevin Doran
• NiPyAPI Docs, Daniel Chaffelson
• DevOps Tips, Tim Spann
• Automate Workflow, Pierre Villard
© Hortonworks Inc. 2011–2018. All rights reserved;54
New Announcements
• NiFi 1.8.0 - 26 Oct 2018 (212+ Jiras)
• Jetty, DB improvements
• Auto load-balancing queues
• TLS Toolkit w/ external CA
• Record processor improvements
• MiNiFi C++ 0.5.0 - 6 June 2018
• MiNiFi Java 0.5.0 - 7 July 2018
• NiFi Registry 0.3.0 - 25 Sept 2018
© Hortonworks Inc. 2011–2018. All rights reserved;55
Community Health
© Hortonworks Inc. 2011–2018. All rights reserved;56
Learn more and join us
Apache NiFi site
https://nifi.apache.org
Subproject MiNiFi site
https://nifi.apache.org/minifi/
Subscribe to and collaborate at
dev@nifi.apache.org
users@nifi.apache.org
Submit Ideas or Issues
https://issues.apache.org/jira/browse/NIFI
Follow us on Twitter
@apachenifi
© Hortonworks Inc. 2011–2018. All rights reserved;57
More NiFi Today
Title | Time | Room
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi | 1100 - 1140 | Room 103
Apache NiFi Crash Course | 1400 - 1600 | Room 109
Dataflow Management From Edge to Core with Apache NiFi | 1650 - 1730 | Room 112
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise | 1650 - 1730 | Room 103
© Hortonworks Inc. 2011–2018. All rights reserved;58
Thank you
alopresto@hortonworks.com | alopresto@apache.org | @yolopey
github.com/alopresto/slides
