SlideShare a Scribd company logo
Stream Processing with Ballerina
S. Suhothayan
Director - Engineering, WSO2
The Problem
○ Integration is not always about request-response.
○ Highly scalable systems use Event Driven Architecture to asynchronously
communicate between multiple processing units.
○ Processing events from Webhooks, CDC, Realtime ETLs, and notification
systems fall into asynchronous event driven systems.
What is a Stream?
An unbounded continuous flow of records (having the same format)
E.g., sensor events, triggers from Webhooks, messages from MQ
Why Stream Processing ?
Doing continuous processing on the data forever !
Such as :
○ Monitor and detect anomalies
○ Real-time ETL
○ Streaming aggregations (e.g., average service response time in last 5
minutes)
○ Join/correlate multiple data streams
○ Detecting complex event patterns or trends
Stream Processing Constructs
○ Projection
○ Modifying the structure of the stream
○ Filter
○ Windows & Aggregations
○ Collection of streaming events over a time or length duration
(last 5 min or last 50 events)
○ Viewed in a sliding or tumbling manner
○ Aggregated over window (e.g., sum, count, min, max, avg, etc)
○ Joins
○ Joining multiple streams
○ Detecting Patterns
○ Trends, non-occurrence of events
How to write Stream Processing Logic?
Use language libraries :
○ Have different functions for each stream processor construct.
○ Pros: You can use the same language for implementation.
○ Cons: Quickly becomes very complex and messy.
User SQL dialog :
○ Use easy-to-use SQL to script the logic
○ Pros: Compact and easy to write the logic.
○ Cons: Need to write UDFs, which SQL does not support.
Solution for Programing Streaming Efficiently
Merging SQL and native programing
1. Consuming events to Ballerina using standard language constructs
○ Via HTTP, HTTP2, WebSocket, JMS and more.
2. Generate streams out of consumed data
○ Map JSON/XML/text messages into a record.
3. Define SQL to manipulate and process data in real time
○ If needed, use Ballerina functions within SQL
4. Generate output streams
5. Use standard language constructs to handle the output or send to an
endpoint
“
Having lots of sensors, among all valid sensors,
detect the sensors that have sent sensor readings
greater than 100 in total within the last minute.
A Use Case
Let’s see some
beautiful code!
Ballerina Stream Processing
Ballerina Stream Processing type Alert record {
string name; int total;
};
type SensorData record {
string name; int reading;
};
Define input and output
record types
Ballerina Stream Processing type Alert record {
string name; int total;
};
type SensorData record {
string name; int reading;
};
function alertQuery(
stream<SensorData> sensorDataStream,
stream<Alert> alertStream) {
}
Define input and output
record types
Function with
input/output Streams
Ballerina Stream Processing type Alert record {
string name; int total;
};
type SensorData record {
string name; int reading;
};
function alertQuery(
stream<SensorData> sensorDataStream,
stream<Alert> alertStream) {
forever {
}
}
Define input and output
record types
Function with
input/output Streams
Forever block
Ballerina Stream Processing type Alert record {
string name; int total;
};
type SensorData record {
string name; int reading;
};
function alertQuery(
stream<SensorData> sensorDataStream,
stream<Alert> alertStream) {
forever {
from sensorDataStream
where reading > 0
window time(60000)
select name, sum(reading) as total
group by name
having total > 100
}
}
Define input and output
record types
Function with
input/output Streams
Forever block
Among all valid sensors, select
ones having greater than 100 reading
in total within the last minute
Ballerina Stream Processing type Alert record {
string name; int total;
};
type SensorData record {
string name; int reading;
};
function alertQuery(
stream<SensorData> sensorDataStream,
stream<Alert> alertStream) {
forever {
from sensorDataStream
where reading > 0
window time(60000)
select name, sum(reading) as total
group by name
having total > 100
=> (Alert[] alerts) {
alertStream.publish(alerts);
}
}
}
Define input and output
record types
Function with
input/output Streams
Forever block
Among all valid sensors, select
ones having greater than 100 reading
in total within the last minute
Send Alert
Some more queries
Joining Two Streams Over Time
// Detect raw material input falls below 5% of the rate of production consumption
forever {
from productionInputStream window time(10000) as p
join rawMaterialStream window time(10000) as r
on r.name == p.name
select r.name, sum(r.amount) as totalRawMaterial, sum(p.amount) as totalConsumed
group by r.name
having ((totalRawMaterial - totalConsumed) * 100.0 / totalRawMaterial) < 5
=> (MaterialUsage[] materialUsages) {
materialUsageStream.publish(materialUsages);
}
}
Detecting Patterns Within Streams
// Detect small purchase transaction followed by a huge purchase transaction
// from the same card within a day
forever {
from every PurchaseStream where price < 20 as e1
followed by PurchaseStream where price > 200 && e1.id == id as e2
within 1 day
select e1.id as cardId, e1.price as initialPayment, e2.price as finalPayment
=> (Alert[] alerts) {
alertStream.publish(alerts);
}
}
Building Autonomous Services
○ Process incoming messages or
locally produced events
○ Process events at the receiving
node without sending to
centralised system
○ Services can monitor themselves
throw inbuilt matric streams
producing events locally
○ Do local optimizations and take
actions autonomously
Stream Processing at the Edge
○ Support microservices architecture
○ Summarize data at the edge.
○ When possible, take localized decisions.
○ Reduce the amount of data transferred
to the central node.
○ Ability to run independently
○ Highly scalable
The Roadmap
○ Support stream processing to incorporate Ballerina’s custom functions.
○ Building Ballerina Stream Processing using Ballerina.
○ Support streams joining with tables.
○ Improve query language.
○ Support State Recovery.
○ Support High Availability.
Q & A
THANK YOU

More Related Content

What's hot

Андрей Козлов (Altoros): Оптимизация производительности Cassandra
Андрей Козлов (Altoros): Оптимизация производительности CassandraАндрей Козлов (Altoros): Оптимизация производительности Cassandra
Андрей Козлов (Altoros): Оптимизация производительности Cassandra
Olga Lavrentieva
 
Druid
DruidDruid
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy IndustriesWebinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
MongoDB
 
Google Cloud Spanner Preview
Google Cloud Spanner PreviewGoogle Cloud Spanner Preview
Google Cloud Spanner Preview
DoiT International
 
Cassandra as event sourced journal for big data analytics
Cassandra as event sourced journal for big data analyticsCassandra as event sourced journal for big data analytics
Cassandra as event sourced journal for big data analytics
Anirvan Chakraborty
 
Streaming Operational Data with MariaDB MaxScale
Streaming Operational Data with MariaDB MaxScaleStreaming Operational Data with MariaDB MaxScale
Streaming Operational Data with MariaDB MaxScale
MariaDB plc
 
Log Events @Twitter
Log Events @TwitterLog Events @Twitter
Log Events @Twitter
lohitvijayarenu
 
codecentric AG: CQRS and Event Sourcing Applications with Cassandra
codecentric AG: CQRS and Event Sourcing Applications with Cassandracodecentric AG: CQRS and Event Sourcing Applications with Cassandra
codecentric AG: CQRS and Event Sourcing Applications with Cassandra
DataStax Academy
 
Story of migrating event pipeline from batch to streaming
Story of migrating event pipeline from batch to streamingStory of migrating event pipeline from batch to streaming
Story of migrating event pipeline from batch to streaming
lohitvijayarenu
 
Open Source india 2014
Open Source india 2014Open Source india 2014
Open Source india 2014
lohitvijayarenu
 
Cassandra at Finn.io — May 30th 2013
Cassandra at Finn.io — May 30th 2013Cassandra at Finn.io — May 30th 2013
Cassandra at Finn.io — May 30th 2013DataStax Academy
 
AWS Big Data Demystified #4 data governance demystified [security, networ...
AWS Big Data Demystified #4   data governance demystified   [security, networ...AWS Big Data Demystified #4   data governance demystified   [security, networ...
AWS Big Data Demystified #4 data governance demystified [security, networ...
Omid Vahdaty
 
Introduction to Real-time data processing
Introduction to Real-time data processingIntroduction to Real-time data processing
Introduction to Real-time data processing
Yogi Devendra Vyavahare
 
Scaling event aggregation at twitter
Scaling event aggregation at twitterScaling event aggregation at twitter
Scaling event aggregation at twitter
lohitvijayarenu
 
Cassandra data access
Cassandra data accessCassandra data access
Cassandra data access
techblog
 
druid.io
druid.iodruid.io
Argus Production Monitoring at Salesforce
Argus Production Monitoring at SalesforceArgus Production Monitoring at Salesforce
Argus Production Monitoring at Salesforce
HBaseCon
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Altinity Ltd
 
Symantec: Cassandra Data Modelling techniques in action
Symantec: Cassandra Data Modelling techniques in actionSymantec: Cassandra Data Modelling techniques in action
Symantec: Cassandra Data Modelling techniques in action
DataStax Academy
 
Improve your SQL workload with observability
Improve your SQL workload with observabilityImprove your SQL workload with observability
Improve your SQL workload with observability
OVHcloud
 

What's hot (20)

Андрей Козлов (Altoros): Оптимизация производительности Cassandra
Андрей Козлов (Altoros): Оптимизация производительности CassandraАндрей Козлов (Altoros): Оптимизация производительности Cassandra
Андрей Козлов (Altoros): Оптимизация производительности Cassandra
 
Druid
DruidDruid
Druid
 
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy IndustriesWebinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
 
Google Cloud Spanner Preview
Google Cloud Spanner PreviewGoogle Cloud Spanner Preview
Google Cloud Spanner Preview
 
Cassandra as event sourced journal for big data analytics
Cassandra as event sourced journal for big data analyticsCassandra as event sourced journal for big data analytics
Cassandra as event sourced journal for big data analytics
 
Streaming Operational Data with MariaDB MaxScale
Streaming Operational Data with MariaDB MaxScaleStreaming Operational Data with MariaDB MaxScale
Streaming Operational Data with MariaDB MaxScale
 
Log Events @Twitter
Log Events @TwitterLog Events @Twitter
Log Events @Twitter
 
codecentric AG: CQRS and Event Sourcing Applications with Cassandra
codecentric AG: CQRS and Event Sourcing Applications with Cassandracodecentric AG: CQRS and Event Sourcing Applications with Cassandra
codecentric AG: CQRS and Event Sourcing Applications with Cassandra
 
Story of migrating event pipeline from batch to streaming
Story of migrating event pipeline from batch to streamingStory of migrating event pipeline from batch to streaming
Story of migrating event pipeline from batch to streaming
 
Open Source india 2014
Open Source india 2014Open Source india 2014
Open Source india 2014
 
Cassandra at Finn.io — May 30th 2013
Cassandra at Finn.io — May 30th 2013Cassandra at Finn.io — May 30th 2013
Cassandra at Finn.io — May 30th 2013
 
AWS Big Data Demystified #4 data governance demystified [security, networ...
AWS Big Data Demystified #4   data governance demystified   [security, networ...AWS Big Data Demystified #4   data governance demystified   [security, networ...
AWS Big Data Demystified #4 data governance demystified [security, networ...
 
Introduction to Real-time data processing
Introduction to Real-time data processingIntroduction to Real-time data processing
Introduction to Real-time data processing
 
Scaling event aggregation at twitter
Scaling event aggregation at twitterScaling event aggregation at twitter
Scaling event aggregation at twitter
 
Cassandra data access
Cassandra data accessCassandra data access
Cassandra data access
 
druid.io
druid.iodruid.io
druid.io
 
Argus Production Monitoring at Salesforce
Argus Production Monitoring at SalesforceArgus Production Monitoring at Salesforce
Argus Production Monitoring at Salesforce
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
 
Symantec: Cassandra Data Modelling techniques in action
Symantec: Cassandra Data Modelling techniques in actionSymantec: Cassandra Data Modelling techniques in action
Symantec: Cassandra Data Modelling techniques in action
 
Improve your SQL workload with observability
Improve your SQL workload with observabilityImprove your SQL workload with observability
Improve your SQL workload with observability
 

Similar to Stream Processing with Ballerina

Stream Processing with Ballerina
Stream Processing with BallerinaStream Processing with Ballerina
Stream Processing with Ballerina
Ballerina
 
WSO2 Product Release Webinar: WSO2 Complex Event Processor 4.0
WSO2 Product Release Webinar: WSO2 Complex Event Processor 4.0WSO2 Product Release Webinar: WSO2 Complex Event Processor 4.0
WSO2 Product Release Webinar: WSO2 Complex Event Processor 4.0
WSO2
 
Introducing the WSO2 Complex Event Processor
Introducing the WSO2 Complex Event ProcessorIntroducing the WSO2 Complex Event Processor
Introducing the WSO2 Complex Event ProcessorWSO2
 
Strtio Spark Streaming + Siddhi CEP Engine
Strtio Spark Streaming + Siddhi CEP EngineStrtio Spark Streaming + Siddhi CEP Engine
Strtio Spark Streaming + Siddhi CEP Engine
Myung Ho Yun
 
Complex Event Processor 3.0.0 - An overview of upcoming features
Complex Event Processor 3.0.0 - An overview of upcoming features Complex Event Processor 3.0.0 - An overview of upcoming features
Complex Event Processor 3.0.0 - An overview of upcoming features WSO2
 
Streamsheets and Apache Kafka – Interactively build real-time Dashboards and ...
Streamsheets and Apache Kafka – Interactively build real-time Dashboards and ...Streamsheets and Apache Kafka – Interactively build real-time Dashboards and ...
Streamsheets and Apache Kafka – Interactively build real-time Dashboards and ...
confluent
 
Parallel Complex Event Processing
Parallel Complex Event ProcessingParallel Complex Event Processing
Parallel Complex Event ProcessingKarol Grzegorczyk
 
WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor
WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor
WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor WSO2
 
Stream Processing Overview
Stream Processing OverviewStream Processing Overview
Stream Processing Overview
Maycon Viana Bordin
 
INTERNET OF THINGS & AZURE
INTERNET OF THINGS & AZUREINTERNET OF THINGS & AZURE
INTERNET OF THINGS & AZURE
DotNetCampus
 
Apache Flink Stream Processing
Apache Flink Stream ProcessingApache Flink Stream Processing
Apache Flink Stream Processing
Suneel Marthi
 
Reactive Extensions
Reactive ExtensionsReactive Extensions
Reactive Extensions
Dmitri Nesteruk
 
Building Modern Data Pipelines for Time Series Data on GCP with InfluxData by...
Building Modern Data Pipelines for Time Series Data on GCP with InfluxData by...Building Modern Data Pipelines for Time Series Data on GCP with InfluxData by...
Building Modern Data Pipelines for Time Series Data on GCP with InfluxData by...
InfluxData
 
Reactive Extensions: classic Observer in .NET
Reactive Extensions: classic Observer in .NETReactive Extensions: classic Observer in .NET
Reactive Extensions: classic Observer in .NET
EPAM
 
Fabric - Realtime stream processing framework
Fabric - Realtime stream processing frameworkFabric - Realtime stream processing framework
Fabric - Realtime stream processing framework
Shashank Gautam
 
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...
Todd Whitehead
 
The program will read the file like this, java homework6Bank sma.pdf
The program will read the file like this, java homework6Bank sma.pdfThe program will read the file like this, java homework6Bank sma.pdf
The program will read the file like this, java homework6Bank sma.pdf
ivylinvaydak64229
 
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic SystemTimely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
Accumulo Summit
 
Flink Forward SF 2017: Konstantinos Kloudas - Extending Flink’s Streaming APIs
Flink Forward SF 2017: Konstantinos Kloudas -  Extending Flink’s Streaming APIsFlink Forward SF 2017: Konstantinos Kloudas -  Extending Flink’s Streaming APIs
Flink Forward SF 2017: Konstantinos Kloudas - Extending Flink’s Streaming APIs
Flink Forward
 
Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...
Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...
Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...
Flink Forward
 

Similar to Stream Processing with Ballerina (20)

Stream Processing with Ballerina
Stream Processing with BallerinaStream Processing with Ballerina
Stream Processing with Ballerina
 
WSO2 Product Release Webinar: WSO2 Complex Event Processor 4.0
WSO2 Product Release Webinar: WSO2 Complex Event Processor 4.0WSO2 Product Release Webinar: WSO2 Complex Event Processor 4.0
WSO2 Product Release Webinar: WSO2 Complex Event Processor 4.0
 
Introducing the WSO2 Complex Event Processor
Introducing the WSO2 Complex Event ProcessorIntroducing the WSO2 Complex Event Processor
Introducing the WSO2 Complex Event Processor
 
Strtio Spark Streaming + Siddhi CEP Engine
Strtio Spark Streaming + Siddhi CEP EngineStrtio Spark Streaming + Siddhi CEP Engine
Strtio Spark Streaming + Siddhi CEP Engine
 
Complex Event Processor 3.0.0 - An overview of upcoming features
Complex Event Processor 3.0.0 - An overview of upcoming features Complex Event Processor 3.0.0 - An overview of upcoming features
Complex Event Processor 3.0.0 - An overview of upcoming features
 
Streamsheets and Apache Kafka – Interactively build real-time Dashboards and ...
Streamsheets and Apache Kafka – Interactively build real-time Dashboards and ...Streamsheets and Apache Kafka – Interactively build real-time Dashboards and ...
Streamsheets and Apache Kafka – Interactively build real-time Dashboards and ...
 
Parallel Complex Event Processing
Parallel Complex Event ProcessingParallel Complex Event Processing
Parallel Complex Event Processing
 
WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor
WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor
WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor
 
Stream Processing Overview
Stream Processing OverviewStream Processing Overview
Stream Processing Overview
 
INTERNET OF THINGS & AZURE
INTERNET OF THINGS & AZUREINTERNET OF THINGS & AZURE
INTERNET OF THINGS & AZURE
 
Apache Flink Stream Processing
Apache Flink Stream ProcessingApache Flink Stream Processing
Apache Flink Stream Processing
 
Reactive Extensions
Reactive ExtensionsReactive Extensions
Reactive Extensions
 
Building Modern Data Pipelines for Time Series Data on GCP with InfluxData by...
Building Modern Data Pipelines for Time Series Data on GCP with InfluxData by...Building Modern Data Pipelines for Time Series Data on GCP with InfluxData by...
Building Modern Data Pipelines for Time Series Data on GCP with InfluxData by...
 
Reactive Extensions: classic Observer in .NET
Reactive Extensions: classic Observer in .NETReactive Extensions: classic Observer in .NET
Reactive Extensions: classic Observer in .NET
 
Fabric - Realtime stream processing framework
Fabric - Realtime stream processing frameworkFabric - Realtime stream processing framework
Fabric - Realtime stream processing framework
 
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...
Inflight to Insights: Real-time Insights with Event Hubs, Stream Analytics an...
 
The program will read the file like this, java homework6Bank sma.pdf
The program will read the file like this, java homework6Bank sma.pdfThe program will read the file like this, java homework6Bank sma.pdf
The program will read the file like this, java homework6Bank sma.pdf
 
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic SystemTimely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
Timely Year Two: Lessons Learned Building a Scalable Metrics Analytic System
 
Flink Forward SF 2017: Konstantinos Kloudas - Extending Flink’s Streaming APIs
Flink Forward SF 2017: Konstantinos Kloudas -  Extending Flink’s Streaming APIsFlink Forward SF 2017: Konstantinos Kloudas -  Extending Flink’s Streaming APIs
Flink Forward SF 2017: Konstantinos Kloudas - Extending Flink’s Streaming APIs
 
Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...
Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...
Fabian Hueske_Till Rohrmann - Declarative stream processing with StreamSQL an...
 

Recently uploaded

Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
g2nightmarescribd
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 

Recently uploaded (20)

Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 

Stream Processing with Ballerina

  • 1. Stream Processing with Ballerina S. Suhothayan Director - Engineering, WSO2
  • 2. The Problem ○ Integration is not always about request-response. ○ Highly scalable systems use Event Driven Architecture to asynchronously communicate between multiple processing units. ○ Processing events from Webhooks, CDC, Realtime ETLs, and notification systems fall into asynchronous event driven systems.
  • 3. What is a Stream? An unbounded continuous flow of records (having the same format) E.g., sensor events, triggers from Webhooks, messages from MQ
  • 4. Why Stream Processing ? Doing continuous processing on the data forever ! Such as : ○ Monitor and detect anomalies ○ Real-time ETL ○ Streaming aggregations (e.g., average service response time in last 5 minutes) ○ Join/correlate multiple data streams ○ Detecting complex event patterns or trends
  • 5. Stream Processing Constructs ○ Projection ○ Modifying the structure of the stream ○ Filter ○ Windows & Aggregations ○ Collection of streaming events over a time or length duration (last 5 min or last 50 events) ○ Viewed in a sliding or tumbling manner ○ Aggregated over window (e.g., sum, count, min, max, avg, etc) ○ Joins ○ Joining multiple streams ○ Detecting Patterns ○ Trends, non-occurrence of events
  • 6. How to write Stream Processing Logic? Use language libraries : ○ Have different functions for each stream processor construct. ○ Pros: You can use the same language for implementation. ○ Cons: Quickly becomes very complex and messy. User SQL dialog : ○ Use easy-to-use SQL to script the logic ○ Pros: Compact and easy to write the logic. ○ Cons: Need to write UDFs, which SQL does not support.
  • 7. Solution for Programing Streaming Efficiently Merging SQL and native programing 1. Consuming events to Ballerina using standard language constructs ○ Via HTTP, HTTP2, WebSocket, JMS and more. 2. Generate streams out of consumed data ○ Map JSON/XML/text messages into a record. 3. Define SQL to manipulate and process data in real time ○ If needed, use Ballerina functions within SQL 4. Generate output streams 5. Use standard language constructs to handle the output or send to an endpoint
  • 8. “ Having lots of sensors, among all valid sensors, detect the sensors that have sent sensor readings greater than 100 in total within the last minute. A Use Case
  • 11. Ballerina Stream Processing type Alert record { string name; int total; }; type SensorData record { string name; int reading; }; Define input and output record types
  • 12. Ballerina Stream Processing type Alert record { string name; int total; }; type SensorData record { string name; int reading; }; function alertQuery( stream<SensorData> sensorDataStream, stream<Alert> alertStream) { } Define input and output record types Function with input/output Streams
  • 13. Ballerina Stream Processing type Alert record { string name; int total; }; type SensorData record { string name; int reading; }; function alertQuery( stream<SensorData> sensorDataStream, stream<Alert> alertStream) { forever { } } Define input and output record types Function with input/output Streams Forever block
  • 14. Ballerina Stream Processing type Alert record { string name; int total; }; type SensorData record { string name; int reading; }; function alertQuery( stream<SensorData> sensorDataStream, stream<Alert> alertStream) { forever { from sensorDataStream where reading > 0 window time(60000) select name, sum(reading) as total group by name having total > 100 } } Define input and output record types Function with input/output Streams Forever block Among all valid sensors, select ones having greater than 100 reading in total within the last minute
  • 15. Ballerina Stream Processing type Alert record { string name; int total; }; type SensorData record { string name; int reading; }; function alertQuery( stream<SensorData> sensorDataStream, stream<Alert> alertStream) { forever { from sensorDataStream where reading > 0 window time(60000) select name, sum(reading) as total group by name having total > 100 => (Alert[] alerts) { alertStream.publish(alerts); } } } Define input and output record types Function with input/output Streams Forever block Among all valid sensors, select ones having greater than 100 reading in total within the last minute Send Alert
  • 17. Joining Two Streams Over Time // Detect raw material input falls below 5% of the rate of production consumption forever { from productionInputStream window time(10000) as p join rawMaterialStream window time(10000) as r on r.name == p.name select r.name, sum(r.amount) as totalRawMaterial, sum(p.amount) as totalConsumed group by r.name having ((totalRawMaterial - totalConsumed) * 100.0 / totalRawMaterial) < 5 => (MaterialUsage[] materialUsages) { materialUsageStream.publish(materialUsages); } }
  • 18. Detecting Patterns Within Streams // Detect small purchase transaction followed by a huge purchase transaction // from the same card within a day forever { from every PurchaseStream where price < 20 as e1 followed by PurchaseStream where price > 200 && e1.id == id as e2 within 1 day select e1.id as cardId, e1.price as initialPayment, e2.price as finalPayment => (Alert[] alerts) { alertStream.publish(alerts); } }
  • 19. Building Autonomous Services ○ Process incoming messages or locally produced events ○ Process events at the receiving node without sending to centralised system ○ Services can monitor themselves throw inbuilt matric streams producing events locally ○ Do local optimizations and take actions autonomously
  • 20. Stream Processing at the Edge ○ Support microservices architecture ○ Summarize data at the edge. ○ When possible, take localized decisions. ○ Reduce the amount of data transferred to the central node. ○ Ability to run independently ○ Highly scalable
  • 21. The Roadmap ○ Support stream processing to incorporate Ballerina’s custom functions. ○ Building Ballerina Stream Processing using Ballerina. ○ Support streams joining with tables. ○ Improve query language. ○ Support State Recovery. ○ Support High Availability.
  • 22. Q & A