Introduction to Apache NiFi 1.11.4

Timothy Spann
Principal DataFlow Field Engineer
Cloudera
@PaasDev

© 2020 Cloudera, Inc. All rights reserved.
Welcome to Future of Data - Princeton
@PaasDev
https://www.meetup.com/futureofdata-princeton/
From Big Data to AI to Streaming to Containers to
Cloud to Analytics to Cloud Storage to Fast Data to
Machine Learning to Microservices to ...

Meetup Presenter
Who am I?
Principal DataFlow Field Engineer
@PaasDev
DZone Zone Leader and Big Data MVB;
Princeton NJ Future of Data Meetup;
ex-Pivotal Field Engineer;
Apache Kafka, TensorFlow, Apache Spark RefCards
https://github.com/tspannhw https://www.datainmotion.dev/
https://dzone.com/users/297029/bunkertor.html

EXAMPLE REFERENCE ARCHITECTURE
[Diagram; labels recovered from the slide:]
• sensors
• DATA FLOW APPS POWERED BY NIFI (Apache NiFi)
• DATA SYNDICATION SERVICE BY KAFKA (Apache Kafka; Kafka Topic: iot)
• STORAGE LAYER
• Apache Impala
• MODEL EXECUTION (Cloudera Machine Learning)

Cloudera Flow Management
Enable easy ingestion, routing, management, and delivery of any data anywhere (edge, cloud,
data center) to any downstream system, with built-in end-to-end security and provenance
ACQUIRE PROCESS DELIVER
• Over 300 Prebuilt Processors
• Easy to build your own
• Parse, Enrich & Apply Schema
• Filter, Split, Merge & Route
• Throttle & Backpressure
• Guaranteed Delivery
• Full data provenance from acquisition to delivery
• Diverse, Non-Traditional Sources
• Ecosystem integration
Advanced tooling to industrialize flow development
(Flow Development Life Cycle)

NiFi 1.10

Stateless Engine
• Granular containers per flow
• Flows From NiFi Registry
https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html
bin/nifi.sh stateless RunFromRegistry Continuous --file kafka.json
https://github.com/apache/nifi/blob/ea1becac4fc519c54b8b4d21773e68f8da364755/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-stateless/README.md

Stateless Engine
• See also Parameters
• Docker
• YARN
• Kubernetes (K8s)
• Stateful NiFi clusters
• Apache OpenWhisk (FaaS)
{
  "registryUrl": "http://tspann-mbp15-hw14277:18080",
  "bucketId": "140b30f0-5a47-4747-9021-19d4fde7f993",
  "flowId": "0540e1fd-c7ca-46fb-9296-e37632021945",
  "ssl": {
    "keystoreFile": "",
    "keystorePass": "",
    "keyPass": "",
    "keystoreType": "",
    "truststoreFile": "/Library/Java/JavaVirtualMachines/amazon-corretto-11.jdk/Contents/Home/lib/security/cacerts",
    "truststorePass": "changeit",
    "truststoreType": "JKS"
  },
  "parameters": {
    "broker": "4.317.852.100:9092",
    "topic": "iot",
    "group_id": "nifi-stateless-kafka-consumer",
    "DestinationDirectory": "/tmp/nifistateless/output2/",
    "output_dir": "/Users/tspann/Documents/nifi-1.10.0-SNAPSHOT/logs/output"
  }
}
https://github.com/tspannhw/stateless-examples
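
A config file shaped like the one on this slide can also be generated from a script, which may help when the same registry flow runs against several brokers or topics. The sketch below is only an illustration using the Python standard library: the helper name `build_stateless_config`, the registry address, and the placeholder IDs are hypothetical, and the key names simply mirror the JSON shown above.

```python
import json


def build_stateless_config(registry_url, bucket_id, flow_id, parameters, ssl=None):
    """Return a dict with the same top-level keys as the slide's JSON file."""
    config = {
        "registryUrl": registry_url,
        "bucketId": bucket_id,
        "flowId": flow_id,
        "parameters": dict(parameters),
    }
    # The "ssl" section is only added when the caller supplies one.
    if ssl is not None:
        config["ssl"] = dict(ssl)
    return config


config = build_stateless_config(
    registry_url="http://localhost:18080",  # hypothetical registry address
    bucket_id="<bucket-uuid>",              # placeholder, not a real ID
    flow_id="<flow-uuid>",                  # placeholder, not a real ID
    parameters={
        "broker": "localhost:9092",         # hypothetical Kafka broker
        "topic": "iot",
        "group_id": "nifi-stateless-kafka-consumer",
    },
)

config_text = json.dumps(config, indent=2)
print(config_text)
```

Saving `config_text` to a file such as `kafka.json` gives something that could be passed to the `--file` option of the `bin/nifi.sh stateless` command shown earlier.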

Parameters
• Parameters
• Parameter Context

Parameters
• Advanced Editors
• Easy to Use
• PARAM

Parameters
• Configure Externally with JSON
Files to Execute Stateless Flows

Parameters
• Create / Edit Parameters from
NiFi or in JSON Files

Parameter Context
• Sensitive or Normal
• Connect to Multiple Process
Groups
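
To make the idea of a Parameter Context concrete, here is a minimal sketch of how `#{name}` references in a property value could be resolved against a context that marks each parameter as sensitive or normal. This is not NiFi's implementation: the functions `resolve` and `display`, the sample context, and the set of characters accepted in a parameter name are all assumptions made for illustration.

```python
import re

# Assumed name syntax: letters, digits, underscores, periods, hyphens, spaces.
PARAM_REF = re.compile(r"#\{([A-Za-z0-9_.\- ]+)\}")


def resolve(value, context):
    """Replace every #{name} reference in value with the parameter's value."""
    def replace(match):
        name = match.group(1)
        if name not in context:
            raise KeyError(f"undefined parameter: {name}")
        return context[name]["value"]
    return PARAM_REF.sub(replace, value)


def display(context):
    """Return a view of the context that masks sensitive parameters."""
    return {
        name: ("********" if param["sensitive"] else param["value"])
        for name, param in context.items()
    }


# Hypothetical context that several process groups could share.
context = {
    "broker": {"value": "localhost:9092", "sensitive": False},
    "topic": {"value": "iot", "sensitive": False},
    "kafka.password": {"value": "s3cret", "sensitive": True},
}

print(resolve("#{broker}", context))
print(resolve("/data/#{topic}/out", context))
print(display(context))
```

Because the context is a separate object from the flow, the same flow definition can be pointed at a different context per environment, which is the CI/CD benefit the slides describe.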

RetryFlowFile
• Configurable Retries
• Maximum #
• Penalties
• When to Fail
• Reuse Mode
https://medium.com/@abdelkrim.hadjidj/apache-nifi-1-10-series-simplifying-error-handling-7de86f130acd
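
The bullets above describe a counter-based retry pattern: count attempts in a FlowFile attribute, penalize while retrying, and give up after a maximum. The sketch below simulates that pattern in plain Python. It is an illustration rather than the processor's code; the function `route_retry`, the attribute name `flowfile.retries`, the default of 3, and the relationship names are assumptions here and should be checked against the processor documentation.

```python
def route_retry(attributes, max_retries=3, retry_attribute="flowfile.retries"):
    """Decide where a FlowFile goes next, based on its retry counter.

    Returns (relationship, updated_attributes, penalize).
    """
    raw = attributes.get(retry_attribute, "0")
    try:
        count = int(raw)
    except (TypeError, ValueError):
        # A non-numeric counter cannot be incremented, so fail the FlowFile.
        return "failure", dict(attributes), False

    updated = dict(attributes)
    if count < max_retries:
        updated[retry_attribute] = str(count + 1)
        return "retry", updated, True  # penalize before the next attempt

    # Limit reached: drop the counter and route to the "give up" path.
    updated.pop(retry_attribute, None)
    return "retries_exceeded", updated, False


# Simulate one FlowFile that keeps failing downstream.
attributes = {"filename": "reading.json"}
history = []
while True:
    relationship, attributes, penalize = route_retry(attributes)
    history.append(relationship)
    if relationship != "retry":
        break

print(history)  # ['retry', 'retry', 'retry', 'retries_exceeded']
```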

BackPressure
Prediction
• OrdinaryLeastSquares
• SimpleRegression
• Enable analytics feature
http://lonnifi.blogspot.com/2019/11/back-pressure-prediction-deep-dive.html?es_id=5233333939
https://youtu.be/Tt8TSlHu7PE
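
The idea behind back pressure prediction can be sketched with a simple least-squares line: fit queue size against time, then extrapolate to the point where the queue would reach its back pressure threshold. The code below is a simplified stand-in and not NiFi's analytics engine; the function names, the sample observations, and the threshold of 10,000 objects are assumptions chosen for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct time points")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def seconds_until_threshold(times, counts, threshold):
    """Estimate seconds from the last sample until the queue hits threshold.

    Returns None when the queue is not growing, so no crossing is predicted.
    """
    slope, intercept = fit_line(times, counts)
    if slope <= 0:
        return None
    time_at_threshold = (threshold - intercept) / slope
    return max(0.0, time_at_threshold - times[-1])


# Hypothetical samples: queue object count observed once a minute.
times = [0, 60, 120, 180]            # seconds
counts = [1000, 2000, 3000, 4000]    # queued FlowFiles

estimate = seconds_until_threshold(times, counts, threshold=10000)
print(f"estimated seconds until back pressure: {estimate:.0f}")
```

With these samples the queue grows by about 1,000 objects per minute, so the estimate comes out at roughly six minutes after the last observation.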

ParquetReader /
ParquetWriter
Records
• Native Record Processors for
Apache Parquet Files!
• CSV <-> Parquet
• XML <-> Parquet
• Avro <-> Parquet
• JSON <-> Parquet
• More...
https://www.datainmotion.dev/2019/10/migrating-apache-flume-flows-to-apache_7.html

PostSlack
• Post Images to Slack
https://www.datainmotion.dev/2019/11/nifi-110-postslack-easy-image-upload.html

Remote Input Port
in a Process Group
• Put Remote Connections for
Site-To-Site (S2S) Anywhere!
• Not only top level
• Drop down simplicity

Many New
Features
• Prometheus Reporting Task
• Experimental Encrypted content repository
• PublishKafka Partition Support
• Toolkit module to generate and build Swagger
• GeoEnrichIPRecord Processor
• Command Line Diagnostics
• RocksDB FlowFile Repository
• PutBigQueryStreaming Processor
• Enhanced DevOps and CI/CD
ELT/ETL Lookup Services
• DatabaseRecordLookupService
• KuduLookupService
• HBase_2_ListLookupService
https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version1.10.0
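
The lookup services listed above all support the same ELT/ETL pattern: take a key from each record, look it up in an external store, and add the result to the record. Here is a minimal sketch of that pattern with an in-memory dictionary standing in for the database, Kudu, or HBase table. The function `enrich`, the field names, and the sample data are hypothetical and are not part of any NiFi API.

```python
def enrich(records, lookup_table, key_field, result_field):
    """Return copies of records with a looked-up value added to each one.

    Records whose key is missing from the table get None in result_field.
    """
    enriched = []
    for record in records:
        out = dict(record)  # leave the input record untouched
        out[result_field] = lookup_table.get(record.get(key_field))
        enriched.append(out)
    return enriched


# Hypothetical reference data, standing in for a lookup table.
sensor_locations = {"s-001": "Princeton", "s-002": "Trenton"}

readings = [
    {"sensor_id": "s-001", "temp_c": 21.5},
    {"sensor_id": "s-003", "temp_c": 19.0},
]

for row in enrich(readings, sensor_locations, "sensor_id", "location"):
    print(row)
```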

NiFi 1.11 Features
• Improved handling and support for partitions when sending data to Azure Event Hubs.
• All repositories (Content, FlowFile, Provenance) can now be encrypted on disk controlled at an application level.
• Class loader isolation now includes isolating native libraries within the NARs! Huge help for interacting with many Hadoop
vendors or other systems from the same NiFi cluster.
• Keytab Credential Service now supported to ensure easily configured secure communications with the Hortonworks
Schema Registry.
• IBM MQ now easier to integrate with for existing NiFi JMS processors.
• Metrics Events Reporting Task
• Rules Action Handler Lookup Service
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316020&version=12346451

Apache NiFi 1.11.4 Features
Reporting Tasks
Total number of reporting tasks.
Examples of new components:
- Prometheus Reporting Task
- Azure Log Analytics RT
- Azure Provenance RT
- Query NiFi Reporting Task
- Metrics Event Reporting Task
Controller Services
Total number of controller services.
- Rules Engine Controller Service
- Kudu Lookup Service
- Azure Storage Credentials
- Amazon S3 Encryption Service
- HBase List Lookup Service
- Parquet Reader/Writer
Processors
Total number of processors.
- Accumulo processors
- Put Elasticsearch Record
- Put BigQuery Streaming
- RetryFlowFile

Other Features of Apache NiFi 1.11.4
JDK 11 Support
Improvements:
- Class loading isolation with native libraries
Security
- Encrypted content repository & flow file repository (tech preview)
Operations
Improvements:
- Monitoring analytics and rule-based monitoring
- Parameters to improve CI/CD and support sensitive properties
https://www.youtube.com/watch?v=IUjz-rhA3xs

Cloud, VMs, Containers and Pods
https://hub.docker.com/r/apache/nifi/
https://hub.helm.sh/charts/cetic/nifi

Example

Useful Links
https://www.datainmotion.dev/2020/02/connecting-apache-nifi-to-apache-atlas.html
https://dev.to/tspannhw/quicktip-ingesting-google-analytics-api-with-apache-nifi-mg1
https://dev.to/tspannhw/analyzing-wood-burning-stoves-with-flank-stack-minifi-flink-nifi-kafka-kudu-36on
https://dev.to/tspannhw/cloudera-edge2ai-minifi-java-agent-with-raspberry-pi-and-thermal-camera-and-air-quality-sensor-part-1-3oo9
https://dev.to/tspannhw/iot-series-minifi-agent-on-raspberry-pi-4-with-enviro-hat-for-environmental-monitoring-and-analytics-l8d
https://dev.to/tspannhw/introducing-mm-flank-an-apache-flink-stack-for-rapid-streaming-development-from-edge-2-ai-5c12
https://dev.to/tspannhw/nifi-1-10-postslack-easy-image-upload-22mh
https://dev.to/tspannhw/nifi-toolkit-cli-for-nifi-1-10-213h

THANK YOU
