Cracking the Nut, Solving Edge AI with Apache Tools and Frameworks

Timothy Spann
Principal DataFlow Field Engineer
Cloudera
@PaasDev
© 2020 Cloudera, Inc. All rights reserved.
Tim Spann
Who am I?
Cloudera Principal DataFlow Field Engineer
@PaasDev
DZone Zone Leader and Big Data MVB
Future of Data Meetup Leader
ex-Pivotal Field Engineer
https://github.com/tspannhw https://www.datainmotion.dev/
Welcome to Future of Data - Princeton - Virtual
@PaasDev
https://www.meetup.com/futureofdata-princeton/
From Big Data to AI to Streaming to Containers to
Cloud to Analytics to Cloud Storage to Fast Data to
Machine Learning to Microservices to ...
Where Can I Run Edge AI Easily?
• CDP services are optimized for the elastic compute & ‘always-on’ storage services provided by any cloud provider
• Web service hosted and managed by Cloudera
• Hosted in your cloud environment, but managed by the CDP Management Console
• Shared Data Experience (SDX) technologies form a secure and governed data lake backed by object storage (S3, ADLS, GCS)
• Flow Management
• Streams Messaging
• Streaming Analytics
Streaming Data Pipelines with Apache NiFi + Kafka + Flink
Apache Tools and Frameworks Used
Apache OpenNLP with Apache NiFi
Apache MXNet Native Processor for Apache NiFi
This is a beta, community release by me using the new beta Java API for Apache MXNet.
https://github.com/tspannhw/nifi-mxnetinference-processor
https://community.hortonworks.com/articles/229215/apache-nifi-processor-for-apache-mxnet-ssd-single.html
https://www.youtube.com/watch?v=Q4dSGPvqXSA
Apache MXNet Native Processor through DJL.AI for Apache NiFi
This processor uses the DJL.AI Java Interface
https://github.com/tspannhw/nifi-djl-processor
https://dev.to/tspannhw/easy-deep-learning-in-apache-nifi-with-djl-2d79
DJL NiFi Processors - Sentiment Analysis
https://www.datainmotion.dev/2020/09/using-djlai-for-deep-learning-based.html
https://github.com/tspannhw/nifi-djlsentimentanalysis-processor
probnegative: 0.99
probnegativeperc: 99.44
probpositive: 0.01
probpositiveperc: 0.56
rawclassification: [class: "Negative", probability: 0.99440, class: "Positive", probability: 0.00559]
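The attribute names and values on this slide suggest a simple mapping from class probabilities to flat string attributes. The following is a minimal Python sketch of that mapping, not the processor's actual code (the processor itself is written in Java against DJL). The function name `sentiment_attributes` and the two-decimal formatting are assumptions inferred from the values shown.

```python
def sentiment_attributes(probabilities):
    """Map class probabilities to flat string attributes.

    `probabilities` is a dict such as {"Negative": 0.9944, "Positive": 0.00559}.
    Hypothetical helper: the name and the rounding rule are assumptions.
    """
    attrs = {}
    for label, prob in probabilities.items():
        key = "prob" + label.lower()               # e.g. "probnegative"
        attrs[key] = f"{prob:.2f}"                 # probability, two decimals
        attrs[key + "perc"] = f"{prob * 100:.2f}"  # same value as a percentage
    return attrs


# Probabilities taken from the rawclassification attribute on the slide.
example = sentiment_attributes({"Negative": 0.99440, "Positive": 0.00559})
print(example)
```

Keeping the values as strings mirrors how NiFi stores FlowFile attributes, so downstream processors can route on them without further parsing of the raw classification text.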
BERT QA through DJL.AI for Apache NiFi
This processor uses the DJL.AI Java Interface
https://github.com/tspannhw/nifi-djlqa-processor
https://www.datainmotion.dev/2020/09/using-djlai-for-deep-learning-bert-q-in.html
https://dev.to/tspannhw/easy-deep-learning-in-apache-nifi-with-djl-2d79
The pretrained model is a DistilBERT model trained by HuggingFace using PyTorch.
BERT QA
What is Apache NiFi and MiNiFi used for?
https://blog.cloudera.com/benchmarking-nifi-performance-and-scalability/
NiFi Processing Billions of Events
STREAMS MESSAGING / APACHE KAFKA
Kafka Connect Support: Simple Data Movement In/Out of Kafka
Schema Registry Ranger Plugin: Improved ACL and Audit for Kafka and Schema Registry
Cruise Control Support: Intelligent Kafka Cluster Rebalancing & Self Healing
Key Capabilities
STREAMING ANALYTICS / APACHE FLINK
Flink SQL Support: Agile Streaming App Development using SQL
Apache Flink Atlas Hook: Capture operational Flink app metadata and lineage
Single View of Flink YARN Jobs: Improve Developer Experience & operational visibility
Demo
Edge AI to Cloud Streaming Pipeline
[Diagram labels: Device Data (Sensors, Energy, Logs, Weather); MiNiFi Agent; Deep Learning Classification; Sensors; Aggregates; Energy; SQL Analytics; tiers: Edge, Private Cloud, Multi-Public Cloud]
{"uuid": "rpi4_uuid_jfx_20200826203733", "amplitude100": 1.2, "amplitude500": 0.6, "amplitude1000": 0.3, "lownoise": 0.6,
"midnoise": 0.2, "highnoise": 0.2, "amps": 0.3, "ipaddress": "192.168.1.76", "host": "rp4", "host_name": "rp4", "macaddress":
"6e:37:12:08:63:e1", "systemtime": "08/26/2020 16:37:34", "endtime": "1598474254.75", "runtime": "28179.03", "starttime":
"08/26/2020 08:47:54", "cpu": 48.3, "cpu_temp": "72.0", "diskusage": "40219.3 MB", "memory": 24.3, "id":
"20200826203733_28ce9520-6832-4f80-b17d-f36c21fd8fc9", "temperature": "47.2", "adjtemp": "35.8", "adjtempf": "76.4",
"temperaturef": "97.0", "pressure": 1010.0, "humidity": 8.3, "lux": 67.4, "proximity": 0, "oxidising": 77.9, "reducing": 184.6, "nh3":
144.7, "gasKO": "Oxidising: 77913.04 OhmsnReducing: 184625.00 OhmsnNH3: 144651.47 Ohms"}
SHOW ME THE DATA
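In the record above, several numeric readings (for example `cpu_temp` and `temperature`) arrive as quoted strings while others (`cpu`, `humidity`) are plain numbers. Below is a minimal Python sketch of normalizing such a record before it is aggregated or queried. It uses a shortened copy of the record, and the `NUMERIC_FIELDS` list and `normalize` helper are illustrative assumptions, not part of the demo code.

```python
import json

# Shortened copy of the sensor record shown on the slide.
RAW = (
    '{"uuid": "rpi4_uuid_jfx_20200826203733", "host": "rp4", "cpu": 48.3, '
    '"cpu_temp": "72.0", "diskusage": "40219.3 MB", "temperature": "47.2", '
    '"adjtemp": "35.8", "humidity": 8.3, "endtime": "1598474254.75"}'
)

# Fields that the slide shows as quoted numbers (assumed list, for illustration).
NUMERIC_FIELDS = ("cpu_temp", "temperature", "adjtemp", "adjtempf",
                  "temperaturef", "endtime", "runtime")


def normalize(record):
    """Return a copy of the record with the known numeric strings cast to float."""
    out = dict(record)
    for field in NUMERIC_FIELDS:
        if field in out and isinstance(out[field], str):
            out[field] = float(out[field])
    return out


reading = normalize(json.loads(RAW))
print(reading["cpu_temp"], reading["temperature"], reading["diskusage"])
```

Fields that carry units inside the string, such as `diskusage`, are left untouched here; splitting value and unit would be a separate step.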
BME280 - temperature, pressure, humidity sensor
LTR-559 - light and proximity sensor
MICS6814 - analog gas sensor
ADS1015 ADC
MEMS - microphone
0.96-inch, 160 x 80 color LCD
WHERE DID THAT DATA COME FROM?
Learn More
DEMO SOURCE CODE
● https://github.com/tspannhw/FlinkForwardGlobal2020
● https://github.com/tspannhw/ApacheConAtHome2020
● https://github.com/tspannhw/minifi-xaviernx
● https://github.com/tspannhw/minifi-jetson-nano
● https://github.com/tspannhw/minifi-enviroplus
● https://github.com/tspannhw/EverythingApacheNiFi
● https://github.com/tspannhw/FlinkSQLWithCatalogsDemo
The code, build scripts, schemas, table DDL, Flink SQL, Kafka Connect configuration, NiFi flows, HBase tables, Kudu tables, Hive tables, HDFS directories, alerts, images, HTML, docs, links and all the goodies are here. Please fork and contribute.
DEEPER CONTENT
● https://www.datainmotion.dev/2020/10/running-flink-sql-against-kafka-using.html
● https://www.datainmotion.dev/2020/10/top-25-use-cases-of-cloudera-flow.html
THANK YOU
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue
ShapeBlue224 views

Cracking the nut, solving edge ai with apache tools and frameworks

  • 1. Cracking the Nut, Solving Edge AI with Apache Tools and Frameworks Timothy Spann Principal DataFlow Field Engineer Cloudera @PaasDev
  • 2. © 2020 Cloudera, Inc. All rights reserved. 2 Tim Spann Who am I? Cloudera Principal DataFlow Field Engineer @PaasDev DZone Zone Leader and Big Data MVB Future of Data Meetup Leader ex-Pivotal Field Engineer https://github.com/tspannhw https://www.datainmotion.dev/
  • 3. Welcome to Future of Data - Princeton - Virtual @PaasDev https://www.meetup.com/futureofdata-princeton/ From Big Data to AI to Streaming to Containers to Cloud to Analytics to Cloud Storage to Fast Data to Machine Learning to Microservices to ...
  • 4. Where Can I Run Edge AI Easily? CDP services are optimized for the elastic compute & ‘always-on’ storage services provided by any cloud provider. Web service hosted and managed by Cloudera. Hosted in your cloud environment, but managed by the CDP Management Console. Shared Data Experience (SDX) technologies form a secure and governed data lake backed by object storage (S3, ADLS, GCS). Flow Management, Streams Messaging, Streaming Analytics.
  • 6. Streaming Data Pipelines with Apache NiFi + Kafka + Flink
  • 7. Apache Tools and Frameworks Used
  • 8. Apache OpenNLP with Apache NiFi
  • 9. Apache MXNet Native Processor for Apache NiFi This is a beta, community release by me using the new beta Java API for Apache MXNet. https://github.com/tspannhw/nifi-mxnetinference-processor https://community.hortonworks.com/articles/229215/apache-nifi-processor-for-apache-mxnet-ssd-single.html https://www.youtube.com/watch?v=Q4dSGPvqXSA
  • 10. Apache MXNet Native Processor through DJL.AI for Apache NiFi This processor uses the DJL.AI Java Interface https://github.com/tspannhw/nifi-djl-processor https://dev.to/tspannhw/easy-deep-learning-in-apache-nifi-with-djl-2d79
  • 11. DJL NiFi Processors - Sentiment Analysis https://www.datainmotion.dev/2020/09/using-djlai-for-deep-learning-based.html https://github.com/tspannhw/nifi-djlsentimentanalysis-processor Output attributes: probnegative 0.99, probnegativeperc 99.44, probpositive 0.01, probpositiveperc 0.56, rawclassification [class: "Negative", probability: 0.99440, class: "Positive", probability: 0.00559]
  • 12. BERT QA through DJL.AI for Apache NiFi This processor uses the DJL.AI Java Interface https://github.com/tspannhw/nifi-djlqa-processor https://www.datainmotion.dev/2020/09/using-djlai-for-deep-learning-bert-q-in.html https://dev.to/tspannhw/easy-deep-learning-in-apache-nifi-with-djl-2d79 The pretrained model is a DistilBERT model trained by HuggingFace using PyTorch.
  • 13. What are Apache NiFi and MiNiFi used for?
  • 14. NiFi Processing Billions of Events https://blog.cloudera.com/benchmarking-nifi-performance-and-scalability/
  • 15. STREAMS MESSAGING / APACHE KAFKA Kafka Connect Support: Simple Data Movement In/Out of Kafka. Schema Registry Ranger Plugin: Improved ACL and Audit for Kafka and Schema Registry. Cruise Control Support: Intelligent Kafka Cluster Rebalancing & Self Healing.
  • 16. STREAMING ANALYTICS / APACHE FLINK Key Capabilities. Flink SQL Support: Agile Streaming App Development using SQL. Apache Flink Atlas Hook: Capture operational Flink app metadata and lineage. Single View of Flink YARN Jobs: Improve Developer Experience & operational visibility.
  • 17. Demo
  • 18. Edge AI to Cloud Streaming Pipeline Device Data Sensors Energy Logs Weather Sensors Aggregates Energy SQL Analytics MiNiFi Agent Deep Learning Classification Edge Private Cloud Multi-Public Cloud
  • 19. SHOW ME THE DATA {"uuid": "rpi4_uuid_jfx_20200826203733", "amplitude100": 1.2, "amplitude500": 0.6, "amplitude1000": 0.3, "lownoise": 0.6, "midnoise": 0.2, "highnoise": 0.2, "amps": 0.3, "ipaddress": "192.168.1.76", "host": "rp4", "host_name": "rp4", "macaddress": "6e:37:12:08:63:e1", "systemtime": "08/26/2020 16:37:34", "endtime": "1598474254.75", "runtime": "28179.03", "starttime": "08/26/2020 08:47:54", "cpu": 48.3, "cpu_temp": "72.0", "diskusage": "40219.3 MB", "memory": 24.3, "id": "20200826203733_28ce9520-6832-4f80-b17d-f36c21fd8fc9", "temperature": "47.2", "adjtemp": "35.8", "adjtempf": "76.4", "temperaturef": "97.0", "pressure": 1010.0, "humidity": 8.3, "lux": 67.4, "proximity": 0, "oxidising": 77.9, "reducing": 184.6, "nh3": 144.7, "gasKO": "Oxidising: 77913.04 Ohms\nReducing: 184625.00 Ohms\nNH3: 144651.47 Ohms"}
  • 20. WHERE DID THAT DATA COME FROM? BME280 - temperature, pressure, humidity sensor; LTR-559 - light and proximity sensor; MICS6814 - analog gas sensor; ADS1015 ADC; MEMS - microphone; 0.96-inch, 160 x 80 color LCD
  • 23. DEMO SOURCE CODE ● https://github.com/tspannhw/FlinkForwardGlobal2020 ● https://github.com/tspannhw/ApacheConAtHome2020 ● https://github.com/tspannhw/minifi-xaviernx ● https://github.com/tspannhw/minifi-jetson-nano ● https://github.com/tspannhw/minifi-enviroplus ● https://github.com/tspannhw/EverythingApacheNiFi ● https://github.com/tspannhw/FlinkSQLWithCatalogsDemo The code, build scripts, schemas, table DDL, Flink SQL, Kafka Connect configuration, NiFi flows, HBase tables, Kudu tables, Hive tables, HDFS directories, alerts, images, HTML, docs, links and all the goodies are here. Please fork and contribute.
  • 24. DEEPER CONTENT ● https://www.datainmotion.dev/2020/10/running-flink-sql-against-kafka-using.html ● https://www.datainmotion.dev/2020/10/top-25-use-cases-of-cloudera-flow.html
  • 25. THANK YOU
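
The sensor record on slide 19 mixes real JSON numbers ("cpu": 48.3) with numbers stored as strings ("cpu_temp": "72.0", "diskusage": "40219.3 MB"). A downstream consumer that wants to aggregate or query these values will probably need to normalize the types first. The following is a minimal Python sketch of that step, using a subset of the values from the slide. The function name normalize, the list of fields to cast, and the derived diskusage_mb field are assumptions made for illustration; they are not taken from the demo repositories, which may handle this differently (for example inside NiFi with a schema).

```python
import json

# Subset of the sensor record shown on slide 19 (values copied from the slide).
RAW_RECORD = '''{"uuid": "rpi4_uuid_jfx_20200826203733", "host": "rp4",
 "cpu": 48.3, "cpu_temp": "72.0", "temperature": "47.2", "adjtemp": "35.8",
 "pressure": 1010.0, "humidity": 8.3, "diskusage": "40219.3 MB"}'''

# Assumption: these fields arrive as numeric strings and should become floats.
NUMERIC_STRING_FIELDS = ("cpu_temp", "temperature", "adjtemp")


def normalize(raw: str) -> dict:
    """Parse one JSON record and cast known numeric-string fields to float."""
    record = json.loads(raw)
    for field in NUMERIC_STRING_FIELDS:
        if field in record:
            record[field] = float(record[field])
    # "diskusage" carries a unit suffix; keep the original string and add a
    # hypothetical numeric field holding the value in megabytes.
    if "diskusage" in record:
        value, unit = record["diskusage"].split()
        if unit == "MB":
            record["diskusage_mb"] = float(value)
    return record


record = normalize(RAW_RECORD)
print(record["cpu_temp"], record["temperature"], record["diskusage_mb"])
```

Fields that are already numeric, such as cpu and pressure, pass through unchanged, so the same function can be applied to every record regardless of which sensors reported.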