Apache Kafka for Smart Grid, Utilities and Energy Production

Apache Kafka in the Energy Industry
Real-Time Analytics at Scale for IoT, Smart Grids, Energy Production and Distribution
Kai Waehner
Field CTO
contact@kai-waehner.de
LinkedIn
@KaiWaehner
www.confluent.io
www.kai-waehner.de

IoT and Event Streaming with Apache Kafka in the Energy Industry – @KaiWaehner - www.kai-waehner.de
The Energy Sector is Changing…
Smart Grid Infrastructure

Requirements for a Smart Grid
• Reliability: Improve fault detection and allow self-healing of the network
• Flexibility: Network topology with bidirectional energy flows
• Efficiency: Demand-side management
• Sustainability: Green and clean technologies, distributed and smaller scale
• Market-enabling: Systematic communication between suppliers (their energy price) and consumers (their willingness-to-pay)
• Cybersecurity: Secure infrastructure with encrypted and authenticated communication and real-time anomaly detection at scale across the supply chain

Event Streaming with Apache Kafka

STREAM PROCESSING
Create and store materialized views
Filter
Analyze in-flight
[Diagram: events along a time axis, read by consumers (C)]

TRADITIONAL DATABASE
Active Query: SELECT * FROM DB_TABLE
Passive Data: DB Table
EVENT STREAM PROCESSING
Active Data: Event Stream
Passive Query: CREATE TABLE T AS SELECT * FROM EVENT_STREAM
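The contrast on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not ksqlDB or Kafka code: the event fields (`meter`, `wh`), the sum aggregation and the names `one_shot_query` and `MaterializedView` are all hypothetical. The first function is asked once and scans data at rest; the second object is registered once and updates its result every time an event arrives.

```python
from collections import defaultdict

# Hypothetical meter-reading events (watt-hours); field names are illustrative.
EVENTS = [
    {"meter": "m1", "wh": 1200},
    {"meter": "m2", "wh": 700},
    {"meter": "m1", "wh": 900},
]


def one_shot_query(table):
    """Traditional database style: the data is passive, the query is active.
    The caller runs the query once and gets a snapshot of the stored rows."""
    totals = defaultdict(int)
    for row in table:
        totals[row["meter"]] += row["wh"]
    return dict(totals)


class MaterializedView:
    """Event streaming style: the data is active, the query is passive.
    The aggregation is registered once and is updated by each new event."""

    def __init__(self):
        self._totals = defaultdict(int)

    def on_event(self, event):
        # Called for every arriving event; keeps the view continuously current.
        self._totals[event["meter"]] += event["wh"]

    def lookup(self, meter):
        # Cheap key lookup against the already-computed result.
        return self._totals.get(meter, 0)


snapshot = one_shot_query(EVENTS)

view = MaterializedView()
for event in EVENTS:
    view.on_event(event)
```

After the loop, `view.lookup("m1")` returns the same total as the snapshot query, but without rescanning the data on each request.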

TABLES
USER | CREDIT_SCORE
JAY | 695
SUE | 430
FRED | 710
STREAMS
USER | PAYMENTS
JAY | 42
SUE | 18
FRED | 65
... | ...
(Figure labels: V1, V3, V2)

Global Scale
Real-time
Persistent Storage
Stream Processing
Data Integration
Apache Kafka
The De-facto Standard for Real-Time Event Streaming
Edge
Cloud
Data Lake
Databases
Datacenter
IoT
SaaS Apps
Mobile
Microservices
Machine Learning

Apache Kafka at Scale at Tech Giants
> 7 trillion messages / day
> 6 petabytes / day
“You name it”
* Kafka is not just used for big data
** Kafka is not just used by tech giants

Improve Customer Experience (CX)
Increase Revenue (make money)
Business Value
Decrease Costs (save money)
Core Business Platform
Increase Operational Efficiency
Migrate to Cloud
Mitigate Risk (protect money)
Key Drivers
Strategic Objectives (sample)
Fraud Detection
IoT sensor ingestion
Digital replatforming / Mainframe Offload
Connected Car: Navigation & improved in-car experience: Audi
Customer 360
Simplifying Omni-channel Retail at Scale: Target
Faster transactional processing / analysis incl. Machine Learning / AI
Mainframe Offload: RBC
Microservices Architecture
Online Fraud Detection
Online Security (syslog, log aggregation, Splunk replacement)
Middleware replacement
Regulatory
Digital Transformation
Application Modernization: Multiple Examples
Website / Core Operations (Central Nervous System)
The [Silicon Valley] Digital Natives: LinkedIn, Netflix, Uber, Yelp...
Predictive Maintenance: Audi
Streaming Platform in a regulated environment (e.g. Electronic Medical Records): Celmatix
Real-time app updates
Real Time Streaming Platform for Communications and Beyond: Capital One
Developer Velocity - Building Stateful Financial Applications with Kafka Streams: Funding Circle
Detect Fraud & Prevent Fraud in Real Time: PayPal
Kafka as a Service - A Tale of Security and Multi-Tenancy: Apple
Example Use Cases
$↑
$↓
$↔
Example Case Studies (of many)

10 Reasons for Event Streaming with Apache Kafka
Real Time
Scalable
Cost Reduction
24/7 – Zero downtime, zero data loss
Decoupling – Storage, Domain-driven Design
Data (re-)processing and stateful client applications
Integration – Connectivity to IoT, legacy, big data, everything
Hybrid Architecture – On Premises, multi cloud, edge computing
Fully managed cloud
No vendor lock-in

Device management
Unreliable networks
Connectivity beyond standards
Lightweight edge hardware
…
Apache Kafka is not an IoT Platform!

Consumer IoT and Industrial IoT (IIoT)
Use Cases for Event Streaming

Ride-Sharing
More than just Messaging! Data correlation in real-time
for map-matching, ETA, cost calculation, and much more…
https://eng.lyft.com/a-new-real-time-map-matching-algorithm-at-lyft-da593ab7b006

Deutsche Bahn AG | Reisendeninformation (traveller information)
Consistent real-time information for travellers across Germany
RI-Plattform

Customer timetable
Operational timetable
Assignments
Railway station knowledge
Dispositions
Train positions
Matching
Aggregation
Consolidation
Apache Kafka
Analysis
Railway station
Trains
Mobile Apps
Employees

Connected Car Infrastructure
https://www.youtube.com/watch?v=yGLKi3TMJv8
• Real Time Data Analysis
• Swarm Intelligence
• Collaboration with Partners
• Predictive AI
• …

Track, manage, and locate tools and other equipment anytime and anywhere from the warehouse to the jobsite
https://www.confluent.io/customers/bosch/
https://events.confluent.io/online-talks/bosch-power-toolse-nables-real-time-analytics-on-iot-event-streams

Food Value Chain
IoT-Based and Data-Driven
Single source of truth across the food value chain (in the factories, and across regions)
Business critical operations (tracking, calculations, alerts, …)
https://www.confluent.io/blog/creating-iot-based-data-driven-food-value-chain-with-confluent-cloud/

Postmodern ERP (coined by Gartner)
Replace legacy, monolithic and highly customized ERP suites by a mixture of loosely coupled, exchangeable cloud-based and on-premises applications.
Core ERP
TMS: Legacy, Proprietary, SOAP Web Services
CRM: SaaS, Kafka Interface
MES: Proprietary, HTTP Web Services
LMS: Legacy, Homegrown, Database + CDC
SRM: Kafka-native
Supplier | Alert | Forecast | Inventory | Customer | Order

Real Time Supply Chain IoT Platform @ Mojix
https://www.confluent.io/customers/mojix/
Real-time operational intelligence with complex event processing
Inventory accuracy increased from 65% to 99%
Omnichannel sales
Built using Confluent Cloud, Kafka, Kafka Connect and Kafka Streams
Hybrid cloud across the edge (at retail stores and distribution centers) and the cloud
Variety of sources, including RFID readers, camera sensors, beacons, mobile devices and routers

Cross-Company Supply Chain Integration
Streaming Replication and API Management
MirrorMaker 2
Confluent Replicator
Cluster Linking
Tier 2 Supplier
Tier 1 Supplier
OEM
Streaming integration between companies
API Management (REST et al) is not appropriate for streaming data
Infosec and politics are your biggest hurdle

Cyber Intelligence Platform
leveraging Kafka Connect, Kafka Streams, Multi-Region Clusters (MRC), and more…
https://www.intel.com/content/www/us/en/it-management/intel-it-best-practices/modern-scalable-cyber-intelligence-platform-kafka.html

Real Time Streaming Machine Learning at the Edge @ Severstal
https://www.confluent.io/customers/severstal/

BMW Group
Industry-ready NLP Service Framework Based on Kafka
https://www.confluent.io/kafka-summit-lon19/industry-ready-nlp-service-framework-kafka/

Streaming Ingestion and Model Training with Kafka, Tiered Storage and TensorFlow IO
Direct streaming ingestion for model training with TensorFlow I/O + Kafka Plugin (no additional data storage like S3 or HDFS required!)
Producer
Distributed Commit Log
Time
Model A
Model B
Model X (at a later time)
https://github.com/tensorflow/io

Confluent Tiered Storage for Kafka
(Only available in Confluent Platform)
Kafka Apps
Processing: Transactions, auth, quota enforcement, compaction, ...
Storage: Local, Remote (Object Store)
Store Forever: Older data is offloaded to inexpensive object storage, permitting it to be consumed at any time.
Save $$$: Storage limitations, like capacity and duration, are effectively uncapped.
Instantaneously scale up and down: Your Kafka clusters will be able to automatically self-balance load and hence elastically scale.

Machine Vision for Quality Assurance and Yield Management
Apache Kafka and Applied Machine Learning
Sensor Data
SCADA
MES
PLCs
Images from Products of Assembly Lines
Filter, transform, aggregate, orchestrate
Machine Vision for Quality Inspection
Real-time alerting
Reporting
Backup
APP
BI Tool
AI/ML
Data Lake
OT Team
Plant Manager
IT Team
Live Ops
Data Science Team

Kafka Deployments in the Energy Sector

Edge Integration and Analytics @ WPX Energy
Edge processing and replication to the cloud in real-time at scale in the oil & gas industry
https://www.prweb.com/releases/wpx_energy_aims_to_improve_drilling_and_completion_operations_with_hivecell_edge_as_a_service_for_confluent/prweb17599610.htm

Tesla
Trillions of messages per day for IoT use cases
https://www.confluent.io/kafka-summit-san-francisco-2019/0-60-teslas-streaming-data-platform/
https://www.confluent.io/blog/stream-processing-iot-data-best-practices-and-techniques/

Edge, Hybrid, Global IoT Architectures

Global Event Streaming
Aggregate Small Footprint Edge Deployments with Replication (Aggregation)
Simplify Disaster Recovery Operations with Multi-Region Clusters with RPO=0 and RTO=0
Stream Data Globally with Replication and Cluster Linking

Energy Production and Distribution with a Hybrid Architecture
[Diagram: streams of real-time events connecting CRM, a 3rd party payment provider (payment processing and fraud detection as a service), Real-Time Asset Management, Outage Management, an API, Manager and Customers, with edge sites on Public 5G, Campus #1 5G and Campus #2 5G (Wavelength) serving Smart Meters and Smart Building. Event stream labels: customer data, truck schedule, train schedule, payment data, route details, loyalty information.]

Event Streaming for Energy Production at the Edge with a 5G Campus Network
[Diagram: Smart Grid, Smart Home, Upstream Operations Management, Manufacturing Process and Real-Time Supply Chain Management connected over Public 5G and Campus 5G (Wavelength). Event stream labels: customer data, truck schedule, payment data, route details.]
Energy Production at the Disconnected Edge
[Diagram: producer (P) writing to a log along a time axis, read by consumers C1, C2 and C3]
Sensors
Human Machine Interface
Predictive Analytics
Predictive Maintenance
Always on (even “offline”)
Replayability
Reduced traffic cost
Better latency
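The producer/consumer picture and the "Replayability" point can be illustrated with a toy append-only log in plain Python. This is a simplified sketch, not the Kafka client API: the classes `CommitLog` and `Consumer`, the sample sensor values, and the roles given to C1, C2 and C3 are assumptions made for illustration. The idea it shows is that reading never removes data, so each consumer keeps its own offset and can fall behind, reconnect later, or rewind.

```python
class CommitLog:
    """Append-only log: records stay in place after they are read."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset assigned to the new record

    def read(self, offset, max_records):
        return self._records[offset:offset + max_records]


class Consumer:
    """Each consumer tracks its own position (offset) in the shared log."""

    def __init__(self, log, name):
        self.log = log
        self.name = name
        self.offset = 0

    def poll(self, max_records=100):
        batch = self.log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        # Rewinding the offset is what makes replay possible.
        self.offset = offset


log = CommitLog()
for reading in (20.1, 20.4, 35.9, 20.2):  # producer P writes sensor values
    log.append(reading)

c1 = Consumer(log, "C1")  # assumed role: real-time consumer
c2 = Consumer(log, "C2")  # assumed role: slower consumer reading in small batches
c3 = Consumer(log, "C3")  # assumed role: consumer that was offline until now

first = c1.poll()                 # C1 reads everything written so far
partial = c2.poll(max_records=2)  # C2 has only processed the first two records
log.append(20.3)                  # P keeps producing regardless of consumers
late = c3.poll()                  # C3 connects late and still sees the full history

c1.seek(0)
replay = c1.poll()                # C1 reprocesses the log from the beginning
```

Because consumers only move their own offsets, a slow or disconnected consumer does not block the producer or the other consumers.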

Event Streaming + OT Middleware
OSIsoft PI, Siemens MindSphere, et al

Domain-Driven Design for your Integration Layer
Event Streaming Platform
Kafka Cluster
Kafka Connect
Schema Registry
CRM Integration
OT Integration
Custom Application
OSIsoft PI
Java / KSQL / Kafka Streams
Customer Domain
OT Domain
Asset Management Domain
→ Independent and loosely coupled, but scalable, highly available and reliable!

Year 0: Direct Communication between OT and IT App
Application
1) Direct OT Communication to IT App
Date Value
1/27/2017 4.56
1/22/2017 32.14
Utilities Infra ‘1970’
(Proprietary PLCs)

Year 1: Kafka for Decoupling between OT and IT App
Application
1) Direct OT Communication to IT App
2) Kafka for Decoupling between OT and IT App
Date Amount
1/27/2017 4.56
1/22/2017 32.14
OT Integration
- Change Data Capture (IIDR)
- Kafka Connect (JMS, MQ, JDBC)
- REST Proxy
- Kafka Client
- 3rd Party Tool like OSIsoft PI
(Proprietary PLCs)

Year 2 to 4: New Projects and Applications
Application
Kafka-native
Applications
Agile, Lightweight
(but Scalable, Robust)
Applications
Big Data Project
(Elastic, Spark,
AWS Services, …)
3) New IT Projects and Applications
External
Solution
Date Amount
1/27/2017 4.56
1/22/2017 32.14
OT Integration
- Change Data Capture (IIDR)
- Kafka Connect (JMS, MQ, JDBC)
- REST Proxy
- Kafka Client
- 3rd Party Tool like OSIsoft PI
(Proprietary PLCs)

Year 5: Proprietary PLC Replacement
Application
Agile, Lightweight
(but Scalable, Robust)
Applications
Big Data Project
(Elastic, Spark,
AWS Services, …)
3) New IT Projects and Applications
4) Proprietary PLC Replacement
External
Solution
(Modern Technology)
Date Amount
1/27/2017 4.56
1/22/2017 32.14
Kafka-native
Applications

for 100000 Connected Devices
Example

Smart Meters - High Frequency Noise Filter
Device Gateway
ksqlDB Filter
Asset Monitoring
Real-Time Reporting
~500 GB/day before the filter, 5 GB/day after (100x reduction)
2 readings/hour * 24 hours * 10 kB * 1M meters = ~480 GB/day
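The volume estimate on this slide can be checked with a few lines of Python, followed by one possible shape for a noise filter. The slide does not say how the ksqlDB filter decides what to drop, so the dead-band rule below (forward a reading only when it differs from the last forwarded value by more than a threshold) is purely an assumption, as are the function names and sample values.

```python
READINGS_PER_HOUR = 2
HOURS_PER_DAY = 24
READING_SIZE_KB = 10
METERS = 1_000_000


def daily_volume_gb(readings_per_hour, hours, size_kb, meters):
    """Raw daily volume in GB, using decimal units (1 GB = 1,000,000 kB)."""
    total_kb = readings_per_hour * hours * size_kb * meters
    return total_kb / 1_000_000


raw_gb = daily_volume_gb(READINGS_PER_HOUR, HOURS_PER_DAY, READING_SIZE_KB, METERS)
filtered_gb = raw_gb / 100  # the 100x reduction stated on the slide


def deadband_filter(readings, threshold):
    """Hypothetical noise filter: yield a reading only if it differs from
    the last forwarded reading by more than `threshold`."""
    last = None
    for value in readings:
        if last is None or abs(value - last) > threshold:
            last = value
            yield value


# Illustrative readings: small fluctuations are dropped, real changes pass.
kept = list(deadband_filter([100, 101, 100, 120, 121, 90], threshold=5))
```

With the slide's numbers, `raw_gb` comes out at 480 GB per day, which the slide rounds to ~500 GB, and a 100x reduction leaves roughly 5 GB per day.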

Cloud Aggregator for Field Management
ksqlDB
Replicator
Asset Management
ksqlDB
Status Updates
Pull Query
Site Information
Location
Filter

Real-Time Outage Management for a Better Customer Experience
Energy App
Backend Infrastructure
Hurricane
Power Outage Alert
Estimated Outage Time
Consume Food

for Predictive Maintenance with a Digital Twin
Devices
MQTT Proxy
Kafka Cluster
Kafka Connect
Kafka Streams (Java)
KSQL
TensorFlow
Python
Kafka Ecosystem
Other Components
All Data
Critical Data
Potential Detect
Ingest Data
Preprocess Data
Consume Data
Train Analytic Model
Deploy Analytic Model
Analytic Model
MongoDB
Storage
Dashboards
Search
Analytics
Mobile App
BI Tool
https://github.com/kaiwaehner/hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference

Architecture for 100000 Connected Devices
Kafka + KSQL + MQTT + TensorFlow + Kubernetes
https://www.kai-waehner.de/blog/2019/11/08/live-demo-iot-100-000-connected-cars-kubernetes-kafka-mqtt-tensorflow/

The Rise of Event Streaming
2010: Apache Kafka created at LinkedIn by Confluent founders
2014
2020: 80% of Fortune 100 companies trust and use Apache Kafka

Event Streaming Maturity Model
(Axes: Value vs. Investment & Time)
1. Initial Awareness / Pilot (1 Kafka Cluster)
2. Start to Build Pipeline / Deliver 1 New Outcome (1 Kafka Cluster)
3. Mission-Critical Deployment (Stretched, Hybrid, Multi-Region)
4. Build Contextual Event-Driven Apps (Stretched, Hybrid, Multi-Region)
5. Central Nervous System (Global Kafka)
Product, Support, Training, Partners, Technical Account Management...

Confluent Platform
Developer: UNRESTRICTED DEVELOPER PRODUCTIVITY
Multi-language Development: non-Java clients | REST Proxy
Rich Pre-built Ecosystem: Connectors | Hub | Schema Registry
SQL-based Stream Processing: KSQL (ksqlDB)
Operator: EFFICIENT OPERATIONS AT SCALE
GUI-driven Mgmt & Monitoring: Control Center
Flexible DevOps Automation: Operator | Ansible
Dynamic Performance & Elasticity: Auto Data Balancer | Tiered Storage
Architect: PRODUCTION-STAGE PREREQUISITES
Enterprise-grade Security: RBAC | Secrets | Audit logs
Data Compatibility: Schema Registry | Schema Validation
Global Resilience: Multi-Region Clusters | Replicator
Executive Buyer: PARTNERSHIP FOR BUSINESS SUCCESS
Complete Engagement Model
Revenue / Cost / Risk Impact
TCO / ROI
Apache Kafka: Open Source | Community licensed
FREEDOM OF CHOICE: Fully Managed Cloud Service | Self Managed Software
COMMITTER-DRIVEN EXPERTISE: Training | Partners | Enterprise Support | Professional Services

Kai Waehner
Field CTO
contact@kai-waehner.de
@KaiWaehner
www.kai-waehner.de
www.confluent.io
LinkedIn
Questions? Feedback?
Let’s connect!
