SlideShare a Scribd company logo
1 of 16
Download to read offline
Flink and NiFi,
Two Stars in the Apache Big Data Constellation
Matthew Ring, Chicago Apache Flink Meetup, Jan. 19, 2016
About me:
● Matthew Ring is currently a Senior Software Engineer at HP Enterprise.
● Matt has been a professional Java developer in multiple industries, including finance, healthcare
and education, since 1999.
● Prior to that, he was an electrical engineer in defense communications.
● He is currently working on a new Investigative Analytics product for HP Enterprise.
● He has presented talks at JavaOne and Bank of America's developer conferences.
● His github is https://github.com/mring33621
What is NiFi?
Origin:
NSA -> Onyara -> Apache NiFi
-> Hortonworks DataFlow
Summary:
Visual Dataflow Programming for Big Data/Fast Data Ingestion!
(Or, yet another package where you drop stuff on the screen and connect it with
arrows)
What is NiFi?
IMHO, good for:
● Ingestion
● Format Conversion
● Light (simple) Processing
● Delivery to other systems
Screenshot?
from: https://www.silvercloudcomputing.com/nifi.html
What is Flink?
I’m pretty sure you’ve
already heard about it...
Together?
● Similar, but different...
● Friends in common:
○ Sockets
○ Kafka
○ HDFS
○ Flume
○ RabbitMQ
○ NATS Messaging
○ Elasticsearch
○ Solr
● There is also the option of direct NiFi <-> Flink connections!
Together?
● NiFi is visual
● NiFi keeps a paper trail RE: the data
running through it
● Supports monitoring/metrics reporting
○ Ambari
○ Ganglia
○ Reimann
● Oh, and you can modify flows while they
are LIVE!
● NiFi has more friends to bring to the party:
○ JSON/Avro/Parquet/Kite
○ HTTP/S, UDP, S/FTP
○ Text matching/parsing with regex
○ Tagging (meta data)
○ Scripting
○ AWS S3, SQS, SNS, Azure events
○ Tailing/Syslog
○ HL7
○ MongoDB
○ HBase
○ SQL
○ JMS
○ Images
○ ...AND MORE!
Paper Trail!
NiFi records:
● Content
● Metadata
● Provenance (touches)
Sooooo what?
● Allows replay of individual items!
● Queryable through UI or REST interface
● Assists in post hoc data forensics (compliance? legal discovery?)
Downsides?
● Weak deployment paradigm
○ Can import/export flow templates
○ But various processor config values will need to be updated by hand when moving from env to
env
● Weak clustering story
○ non-elastic
○ SPOF master node
● Weak querying capability from UI
● Most processors are micro-batching (event-time stream processing is still
experimental)
● Sometimes tedious -- have to think in terms of several little, built-in pieces to
get a simple job done
NOW IS TIME FOR QUIZ!
...err, how ‘bout a demo?
Demo Notes
Custom Java code provides:
● synthetic intraday ticks
● trader state management
● glue logic
● websocket backend for dashboard UI
Custom HTML/JS code provides:
● live dashboard UI
● smoothie.js charts
● knockout.js binding/templating
NiFi:
● observes orders
○ can deny orders based on ‘compliance
rules’
● observes executions
○ routes ‘suspicious’ executions to file
system for future scrutiny
Flink Streaming provides:
● trade recommendation engine
● execution engine
Demo: Screenshot of NiFi Flow
Demo: Screenshot of Live Web Dashboard
Questions?
Thank you!

More Related Content

What's hot

Kafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsKafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsSlim Baltagi
 
Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®confluent
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large ScaleVerverica
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotFlink Forward
 
Virtual Flink Forward 2020: Autoscaling Flink at Netflix - Timothy Farkas
Virtual Flink Forward 2020: Autoscaling Flink at Netflix - Timothy FarkasVirtual Flink Forward 2020: Autoscaling Flink at Netflix - Timothy Farkas
Virtual Flink Forward 2020: Autoscaling Flink at Netflix - Timothy FarkasFlink Forward
 
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
Spark Operator—Deploy, Manage and Monitor Spark clusters on KubernetesDatabricks
 
Service Mesh with Apache Kafka, Kubernetes, Envoy, Istio and Linkerd
Service Mesh with Apache Kafka, Kubernetes, Envoy, Istio and LinkerdService Mesh with Apache Kafka, Kubernetes, Envoy, Istio and Linkerd
Service Mesh with Apache Kafka, Kubernetes, Envoy, Istio and LinkerdKai Wähner
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorFlink Forward
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used forAljoscha Krettek
 
How we got to 1 millisecond latency in 99% under repair, compaction, and flus...
How we got to 1 millisecond latency in 99% under repair, compaction, and flus...How we got to 1 millisecond latency in 99% under repair, compaction, and flus...
How we got to 1 millisecond latency in 99% under repair, compaction, and flus...ScyllaDB
 
Building an Interactive Query Service in Kafka Streams With Bill Bejeck | Cur...
Building an Interactive Query Service in Kafka Streams With Bill Bejeck | Cur...Building an Interactive Query Service in Kafka Streams With Bill Bejeck | Cur...
Building an Interactive Query Service in Kafka Streams With Bill Bejeck | Cur...HostedbyConfluent
 
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, ConfluentTemporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, ConfluentHostedbyConfluent
 
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022HostedbyConfluent
 
Flink history, roadmap and vision
Flink history, roadmap and visionFlink history, roadmap and vision
Flink history, roadmap and visionStephan Ewen
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeFlink Forward
 
Practical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobsPractical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobsFlink Forward
 
Kafka streams windowing behind the curtain
Kafka streams windowing behind the curtain Kafka streams windowing behind the curtain
Kafka streams windowing behind the curtain confluent
 
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022HostedbyConfluent
 

What's hot (20)

Kafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsKafka Streams for Java enthusiasts
Kafka Streams for Java enthusiasts
 
Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large Scale
 
Apache Airflow overview
Apache Airflow overviewApache Airflow overview
Apache Airflow overview
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
 
Virtual Flink Forward 2020: Autoscaling Flink at Netflix - Timothy Farkas
Virtual Flink Forward 2020: Autoscaling Flink at Netflix - Timothy FarkasVirtual Flink Forward 2020: Autoscaling Flink at Netflix - Timothy Farkas
Virtual Flink Forward 2020: Autoscaling Flink at Netflix - Timothy Farkas
 
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 
Service Mesh with Apache Kafka, Kubernetes, Envoy, Istio and Linkerd
Service Mesh with Apache Kafka, Kubernetes, Envoy, Istio and LinkerdService Mesh with Apache Kafka, Kubernetes, Envoy, Istio and Linkerd
Service Mesh with Apache Kafka, Kubernetes, Envoy, Istio and Linkerd
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes Operator
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used for
 
How we got to 1 millisecond latency in 99% under repair, compaction, and flus...
How we got to 1 millisecond latency in 99% under repair, compaction, and flus...How we got to 1 millisecond latency in 99% under repair, compaction, and flus...
How we got to 1 millisecond latency in 99% under repair, compaction, and flus...
 
Building an Interactive Query Service in Kafka Streams With Bill Bejeck | Cur...
Building an Interactive Query Service in Kafka Streams With Bill Bejeck | Cur...Building an Interactive Query Service in Kafka Streams With Bill Bejeck | Cur...
Building an Interactive Query Service in Kafka Streams With Bill Bejeck | Cur...
 
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, ConfluentTemporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
 
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
 
Apache airflow
Apache airflowApache airflow
Apache airflow
 
Flink history, roadmap and vision
Flink history, roadmap and visionFlink history, roadmap and vision
Flink history, roadmap and vision
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
 
Practical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobsPractical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobs
 
Kafka streams windowing behind the curtain
Kafka streams windowing behind the curtain Kafka streams windowing behind the curtain
Kafka streams windowing behind the curtain
 
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
 

Similar to Flink and NiFi, Two Stars in the Apache Big Data Constellation

Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop Neo4j
 
«Что такое serverless-архитектура и как с ней жить?» Николай Марков, Aligned ...
«Что такое serverless-архитектура и как с ней жить?» Николай Марков, Aligned ...«Что такое serverless-архитектура и как с ней жить?» Николай Марков, Aligned ...
«Что такое serverless-архитектура и как с ней жить?» Николай Марков, Aligned ...it-people
 
Accelerating Big Data beyond the JVM - Fosdem 2018
Accelerating Big Data beyond the JVM - Fosdem 2018Accelerating Big Data beyond the JVM - Fosdem 2018
Accelerating Big Data beyond the JVM - Fosdem 2018Holden Karau
 
Introduction to PHP (SDPHP)
Introduction to PHP   (SDPHP)Introduction to PHP   (SDPHP)
Introduction to PHP (SDPHP)Eric Johnson
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...Demi Ben-Ari
 
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Codemotion
 
Powering tensor flow with big data using apache beam, flink, and spark cern...
Powering tensor flow with big data using apache beam, flink, and spark   cern...Powering tensor flow with big data using apache beam, flink, and spark   cern...
Powering tensor flow with big data using apache beam, flink, and spark cern...Holden Karau
 
Python in Industry
Python in IndustryPython in Industry
Python in IndustryDharmit Shah
 
Powering tensorflow with big data (apache spark, flink, and beam) dataworks...
Powering tensorflow with big data (apache spark, flink, and beam)   dataworks...Powering tensorflow with big data (apache spark, flink, and beam)   dataworks...
Powering tensorflow with big data (apache spark, flink, and beam) dataworks...Holden Karau
 
Are general purpose big data systems eating the world?
Are general purpose big data systems eating the world?Are general purpose big data systems eating the world?
Are general purpose big data systems eating the world?Holden Karau
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017Demi Ben-Ari
 
Simplifying training deep and serving learning models with big data in python...
Simplifying training deep and serving learning models with big data in python...Simplifying training deep and serving learning models with big data in python...
Simplifying training deep and serving learning models with big data in python...Holden Karau
 
Making the big data ecosystem work together with python apache arrow, spark,...
Making the big data ecosystem work together with python  apache arrow, spark,...Making the big data ecosystem work together with python  apache arrow, spark,...
Making the big data ecosystem work together with python apache arrow, spark,...Holden Karau
 
Making the big data ecosystem work together with Python & Apache Arrow, Apach...
Making the big data ecosystem work together with Python & Apache Arrow, Apach...Making the big data ecosystem work together with Python & Apache Arrow, Apach...
Making the big data ecosystem work together with Python & Apache Arrow, Apach...Holden Karau
 
Go frugal with web services
Go frugal with web servicesGo frugal with web services
Go frugal with web servicesDaniel Fireman
 
Activity feeds (and more) at mate1
Activity feeds (and more) at mate1Activity feeds (and more) at mate1
Activity feeds (and more) at mate1Hisham Mardam-Bey
 
Apache Flink Adoption at Shopify
Apache Flink Adoption at ShopifyApache Flink Adoption at Shopify
Apache Flink Adoption at ShopifyYaroslav Tkachenko
 
Big Data Beyond the JVM - Strata San Jose 2018
Big Data Beyond the JVM - Strata San Jose 2018Big Data Beyond the JVM - Strata San Jose 2018
Big Data Beyond the JVM - Strata San Jose 2018Holden Karau
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3 Omid Vahdaty
 

Similar to Flink and NiFi, Two Stars in the Apache Big Data Constellation (20)

Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
 
«Что такое serverless-архитектура и как с ней жить?» Николай Марков, Aligned ...
«Что такое serverless-архитектура и как с ней жить?» Николай Марков, Aligned ...«Что такое serverless-архитектура и как с ней жить?» Николай Марков, Aligned ...
«Что такое serverless-архитектура и как с ней жить?» Николай Марков, Aligned ...
 
Accelerating Big Data beyond the JVM - Fosdem 2018
Accelerating Big Data beyond the JVM - Fosdem 2018Accelerating Big Data beyond the JVM - Fosdem 2018
Accelerating Big Data beyond the JVM - Fosdem 2018
 
Introduction to PHP (SDPHP)
Introduction to PHP   (SDPHP)Introduction to PHP   (SDPHP)
Introduction to PHP (SDPHP)
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
 
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
 
Powering tensor flow with big data using apache beam, flink, and spark cern...
Powering tensor flow with big data using apache beam, flink, and spark   cern...Powering tensor flow with big data using apache beam, flink, and spark   cern...
Powering tensor flow with big data using apache beam, flink, and spark cern...
 
Python in Industry
Python in IndustryPython in Industry
Python in Industry
 
Powering tensorflow with big data (apache spark, flink, and beam) dataworks...
Powering tensorflow with big data (apache spark, flink, and beam)   dataworks...Powering tensorflow with big data (apache spark, flink, and beam)   dataworks...
Powering tensorflow with big data (apache spark, flink, and beam) dataworks...
 
Are general purpose big data systems eating the world?
Are general purpose big data systems eating the world?Are general purpose big data systems eating the world?
Are general purpose big data systems eating the world?
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
 
Simplifying training deep and serving learning models with big data in python...
Simplifying training deep and serving learning models with big data in python...Simplifying training deep and serving learning models with big data in python...
Simplifying training deep and serving learning models with big data in python...
 
Making the big data ecosystem work together with python apache arrow, spark,...
Making the big data ecosystem work together with python  apache arrow, spark,...Making the big data ecosystem work together with python  apache arrow, spark,...
Making the big data ecosystem work together with python apache arrow, spark,...
 
Making the big data ecosystem work together with Python & Apache Arrow, Apach...
Making the big data ecosystem work together with Python & Apache Arrow, Apach...Making the big data ecosystem work together with Python & Apache Arrow, Apach...
Making the big data ecosystem work together with Python & Apache Arrow, Apach...
 
Go frugal with web services
Go frugal with web servicesGo frugal with web services
Go frugal with web services
 
Activity feeds (and more) at mate1
Activity feeds (and more) at mate1Activity feeds (and more) at mate1
Activity feeds (and more) at mate1
 
Apache Flink Adoption at Shopify
Apache Flink Adoption at ShopifyApache Flink Adoption at Shopify
Apache Flink Adoption at Shopify
 
Big Data Beyond the JVM - Strata San Jose 2018
Big Data Beyond the JVM - Strata San Jose 2018Big Data Beyond the JVM - Strata San Jose 2018
Big Data Beyond the JVM - Strata San Jose 2018
 
Handout: 'Open Source Tools & Resources'
Handout: 'Open Source Tools & Resources'Handout: 'Open Source Tools & Resources'
Handout: 'Open Source Tools & Resources'
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3
 

Recently uploaded

Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendArshad QA
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 

Recently uploaded (20)

Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 

Flink and NiFi, Two Stars in the Apache Big Data Constellation

  • 1. Flink and NiFi, Two Stars in the Apache Big Data Constellation Matthew Ring, Chicago Apache Flink Meetup, Jan. 19, 2016
  • 2. About me: ● Matthew Ring is currently a Senior Software Engineer at HP Enterprise. ● Matt has been a professional Java developer in multiple industries, including finance, healthcare and education, since 1999. ● Prior to that, he was an electrical engineer in defense communications. ● He is currently working on a new Investigative Analytics product for HP Enterprise. ● He has presented talks at JavaOne and Bank of America's developer conferences. ● His github is https://github.com/mring33621
  • 3. What is NiFi? Origin: NSA -> Onyara -> Apache NiFi -> Hortonworks DataFlow Summary: Visual Dataflow Programming for Big Data/Fast Data Ingestion! (Or, yet another package where you drop stuff on the screen and connect it with arrows)
  • 4. What is NiFi? IMHO, good for: ● Ingestion ● Format Conversion ● Light (simple) Processing ● Delivery to other systems
  • 6. What is Flink? I’m pretty sure you’ve already heard about it...
  • 7. Together? ● Similar, but different... ● Friends in common: ○ Sockets ○ Kafka ○ HDFS ○ Flume ○ RabbitMQ ○ NATS Messaging ○ Elasticsearch ○ Solr ● There is also the option of direct NiFi <-> Flink connections!
  • 8. Together? ● NiFi is visual ● NiFi keeps a paper trail RE: the data running through it ● Supports monitoring/metrics reporting ○ Ambari ○ Ganglia ○ Reimann ● Oh, and you can modify flows while they are LIVE! ● NiFi has more friends to bring to the party: ○ JSON/Avro/Parquet/Kite ○ HTTP/S, UDP, S/FTP ○ Text matching/parsing with regex ○ Tagging (meta data) ○ Scripting ○ AWS S3, SQS, SNS, Azure events ○ Tailing/Syslog ○ HL7 ○ MongoDB ○ HBase ○ SQL ○ JMS ○ Images ○ ...AND MORE!
  • 9. Paper Trail! NiFi records: ● Content ● Metadata ● Provenance (touches) Sooooo what? ● Allows replay of individual items! ● Queryable through UI or REST interface ● Assists in post hoc data forensics (compliance? legal discovery?)
  • 10. Downsides? ● Weak deployment paradigm ○ Can import/export flow templates ○ But various processor config values will need to be updated by hand when moving from env to env ● Weak clustering story ○ non-elastic ○ SPOF master node ● Weak querying capability from UI ● Most processors are micro-batching (event-time stream processing is still experimental) ● Sometimes tedious -- have to think in terms of several little, built-in pieces to get a simple job done
  • 11. NOW IS TIME FOR QUIZ! ...err, how ‘bout a demo?
  • 12. Demo Notes Custom Java code provides: ● synthetic intraday ticks ● trader state management ● glue logic ● websocket backend for dashboard UI Custom HTML/JS code provides: ● live dashboard UI ● smoothie.js charts ● knockout.js binding/templating NiFi: ● observes orders ○ can deny orders based on ‘compliance rules’ ● observes executions ○ routes ‘suspicious’ executions to file system for future scrutiny Flink Streaming provides: ● trade recommendation engine ● execution engine
  • 13. Demo: Screenshot of NiFi Flow
  • 14. Demo: Screenshot of Live Web Dashboard