Flink and NiFi,
Two Stars in the Apache Big Data Constellation
Matthew Ring, Chicago Apache Flink Meetup, Jan. 19, 2016
Presented to the Chicago Apache Flink Meetup, Jan. 19, 2016

Goal: To provide a non-exhaustive but interesting demonstration of Apache NiFi and Apache Flink working together. The talk included a demo in which NiFi and Flink together simulate a simplified trading ecosystem of brokers and day traders, with streaming market data, orders, executions, and P/L results.


  1. Flink and NiFi, Two Stars in the Apache Big Data Constellation
     Matthew Ring, Chicago Apache Flink Meetup, Jan. 19, 2016
  2. About me:
     ● Matthew Ring is currently a Senior Software Engineer at HP Enterprise.
     ● Matt has been a professional Java developer in multiple industries, including finance, healthcare and education, since 1999.
     ● Prior to that, he was an electrical engineer in defense communications.
     ● He is currently working on a new Investigative Analytics product for HP Enterprise.
     ● He has presented talks at JavaOne and Bank of America's developer conferences.
     ● His GitHub is https://github.com/mring33621
  3. What is NiFi?
     Origin: NSA -> Onyara -> Apache NiFi -> Hortonworks DataFlow
     Summary: Visual Dataflow Programming for Big Data/Fast Data Ingestion!
     (Or, yet another package where you drop stuff on the screen and connect it with arrows)
  4. What is NiFi?
     IMHO, good for:
     ● Ingestion
     ● Format Conversion
     ● Light (simple) Processing
     ● Delivery to other systems
  5. Screenshot?
     from: https://www.silvercloudcomputing.com/nifi.html
  6. What is Flink?
     I’m pretty sure you’ve already heard about it...
  7. Together?
     ● Similar, but different...
     ● Friends in common:
       ○ Sockets
       ○ Kafka
       ○ HDFS
       ○ Flume
       ○ RabbitMQ
       ○ NATS Messaging
       ○ Elasticsearch
       ○ Solr
     ● There is also the option of direct NiFi <-> Flink connections!
  8. Together?
     ● NiFi is visual
     ● NiFi keeps a paper trail RE: the data running through it
     ● Supports monitoring/metrics reporting
       ○ Ambari
       ○ Ganglia
       ○ Riemann
     ● Oh, and you can modify flows while they are LIVE!
     ● NiFi has more friends to bring to the party:
       ○ JSON/Avro/Parquet/Kite
       ○ HTTP/S, UDP, S/FTP
       ○ Text matching/parsing with regex
       ○ Tagging (metadata)
       ○ Scripting
       ○ AWS S3, SQS, SNS, Azure events
       ○ Tailing/Syslog
       ○ HL7
       ○ MongoDB
       ○ HBase
       ○ SQL
       ○ JMS
       ○ Images
       ○ ...AND MORE!
  9. Paper Trail!
     NiFi records:
     ● Content
     ● Metadata
     ● Provenance (touches)
     Sooooo what?
     ● Allows replay of individual items!
     ● Queryable through UI or REST interface
     ● Assists in post hoc data forensics (compliance? legal discovery?)
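To make the paper-trail idea on slide 9 concrete, here is a minimal sketch of an item that carries its content, metadata, and an ordered list of touches, so an earlier version can be replayed. This is a hypothetical in-memory model for illustration only, not NiFi's actual provenance repository or API; the class name `ProvenanceSketch`, its methods, and the step names are all made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model (not NiFi's API): one item with content, metadata, and a provenance trail.
class ProvenanceSketch {

    // One "touch": which step handled the item and what the content was afterwards.
    static final class Touch {
        final String step;
        final String contentAfter;

        Touch(String step, String contentAfter) {
            this.step = step;
            this.contentAfter = contentAfter;
        }
    }

    private final Map<String, String> metadata = new LinkedHashMap<>();
    private final List<Touch> trail = new ArrayList<>();

    ProvenanceSketch(String id, String originalContent) {
        metadata.put("id", id);
        // The first touch records the content exactly as it was received.
        trail.add(new Touch("RECEIVE", originalContent));
    }

    // Record that a step touched the item and keep the resulting content.
    void touch(String step, String contentAfter) {
        trail.add(new Touch(step, contentAfter));
    }

    // Replay: return the content as it was right after the given touch (0 = as received).
    String replayFrom(int touchIndex) {
        return trail.get(touchIndex).contentAfter;
    }

    // Query: list the steps that touched the item, in order.
    List<String> steps() {
        List<String> out = new ArrayList<>();
        for (Touch t : trail) {
            out.add(t.step);
        }
        return out;
    }

    Map<String, String> metadata() {
        return metadata;
    }

    public static void main(String[] args) {
        ProvenanceSketch item = new ProvenanceSketch("order-1", "BUY 100 ACME");
        item.touch("VALIDATE", "BUY 100 ACME");
        item.touch("ADD_TIMESTAMP", "BUY 100 ACME @t0");
        System.out.println(item.steps());
        System.out.println(item.replayFrom(0));
    }
}
```

Keeping the content after every touch is what makes replay of a single item possible; a real system would store this far more compactly.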
  10. Downsides?
      ● Weak deployment paradigm
        ○ Can import/export flow templates
        ○ But various processor config values will need to be updated by hand when moving from env to env
      ● Weak clustering story
        ○ non-elastic
        ○ SPOF master node
      ● Weak querying capability from UI
      ● Most processors are micro-batching (event-time stream processing is still experimental)
      ● Sometimes tedious -- have to think in terms of several little, built-in pieces to get a simple job done
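One way to picture the env-to-env problem from slide 10: config values that differ per environment could be kept outside the flow and filled in by a small script instead of by hand. The sketch below is a generic placeholder substitution in plain Java, offered only as an illustration of that idea. It is not a NiFi feature and is not from the talk; `EnvConfigSketch`, the `${...}` placeholder syntax used here, and the keys are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of NiFi): fill ${key} placeholders from per-environment values.
class EnvConfigSketch {

    // Matches placeholders such as ${kafka.brokers}; the key is captured in group 1.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([A-Za-z0-9_.]+)\\}");

    static String resolve(String template, Map<String, String> env) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String key = m.group(1);
            String value = env.get(key);
            if (value == null) {
                // Fail loudly rather than ship a half-configured flow.
                throw new IllegalArgumentException("No value for placeholder: " + key);
            }
            // quoteReplacement keeps '$' and '\' in values from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> prod = new HashMap<>();
        prod.put("kafka.brokers", "prod-kafka:9092"); // made-up value
        System.out.println(resolve("brokers=${kafka.brokers}", prod));
    }
}
```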
  11. NOW IS TIME FOR QUIZ!
      ...err, how ‘bout a demo?
  12. Demo Notes
      Custom Java code provides:
      ● synthetic intraday ticks
      ● trader state management
      ● glue logic
      ● websocket backend for dashboard UI
      Custom HTML/JS code provides:
      ● live dashboard UI
      ● smoothie.js charts
      ● knockout.js binding/templating
      NiFi:
      ● observes orders
        ○ can deny orders based on ‘compliance rules’
      ● observes executions
        ○ routes ‘suspicious’ executions to file system for future scrutiny
      Flink Streaming provides:
      ● trade recommendation engine
      ● execution engine
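The demo pieces on slide 12 can be pictured with a small stand-alone sketch: a seeded random walk for synthetic intraday ticks, a quantity-limit check standing in for a 'compliance rule', and a price-deviation check standing in for 'suspicious' execution routing. None of this is the talk's actual demo code. The random-walk model, the rules, the thresholds, and every name in `DemoSketch` are assumptions chosen for illustration; in the demo itself, the order and execution checks run as NiFi flows rather than as Java methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-ins for the demo's moving parts; not the code shown at the meetup.
class DemoSketch {

    // Synthetic intraday ticks as a seeded random walk (assumed model).
    static List<Double> syntheticTicks(double open, int count, long seed) {
        Random rng = new Random(seed);
        List<Double> ticks = new ArrayList<>();
        double price = open;
        for (int i = 0; i < count; i++) {
            // Move by at most 0.5% up or down per tick, never below one cent.
            double change = (rng.nextDouble() - 0.5) * 0.01;
            price = Math.max(0.01, price * (1.0 + change));
            ticks.add(price);
        }
        return ticks;
    }

    // Assumed 'compliance rule': allow only positive quantities up to a share limit.
    static boolean allowOrder(int quantity, int maxQuantity) {
        return quantity > 0 && quantity <= maxQuantity;
    }

    // Assumed 'suspicious' test: execution price deviates from the last tick by more than a tolerance.
    static String routeExecution(double execPrice, double lastTick, double tolerance) {
        double deviation = Math.abs(execPrice - lastTick) / lastTick;
        return deviation > tolerance ? "suspicious" : "normal";
    }

    public static void main(String[] args) {
        List<Double> ticks = syntheticTicks(100.0, 5, 42L);
        double last = ticks.get(ticks.size() - 1);
        System.out.println("ticks: " + ticks);
        System.out.println("order of 5000 shares allowed? " + allowOrder(5000, 1000));
        System.out.println("execution at " + (last * 1.10) + " is " + routeExecution(last * 1.10, last, 0.05));
    }
}
```

Using a fixed seed makes the tick stream reproducible, which is handy when the same scenario has to be shown twice.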
  13. Demo: Screenshot of NiFi Flow
  14. Demo: Screenshot of Live Web Dashboard
  15. Questions?
  16. Thank you!
