SlideShare a Scribd company logo
1 of 32
Samza SQL
Srinivasulu Punuru
Agenda
1 What is Samza SQL?
2 Why SQL on Samza?
3 How does it work?
4 Demo
5 Q&A
What is Samza SQL?
Samza SQL by Example
Count page views of each member in a five minute window.
Send the result to kafka topic PageViewCount.
Samza low level task API
Repartitioner Job
public class PageViewRepartitioner implements StreamTask {
SystemStream outputStream = new SystemStream("kafka", "pvMemberId");
@Override
public void process(IncomingMessageEnvelope envelope, MessageCollector collector, TaskCoordinator coordinator) {
PageViewEvent pageViewEvent = (PageViewEvent) envelope.getMessage();
String key = pageViewEvent.getMemberId();
OutgoingMessageEnvelope outMessage = new OutgoingMessageEnvelope(outputStream, pageViewEvent, key, pageViewEvent);
collector.send(outMessage);
}
}
Samza low level task API (contd.)
Page view counter
job
public class PageViewCounter implements StreamTask {
SystemStream outputStream = new SystemStream("kafka", "pageviewCount");
private Instant lastTriggerTime = Instant.now();
private HashMap<String, Integer> counter = new HashMap<>();
@Override
public void process(IncomingMessageEnvelope envelope, MessageCollector collector, TaskCoordinator coordinator) {
PageViewEvent pageViewEvent = (PageViewEvent) envelope.getMessage();
String memberId = pageViewEvent.getMemberId();
counter.put(memberId, counter.getOrDefault(memberId, 0) + 1);
if (Duration.between(lastTriggerTime, Instant.now()).toMinutes() > 5) {
counter.forEach((key, value) -> collector.send(new OutgoingMessageEnvelope(outputStream, key, value)));
counter.clear();
}
}
}
Samza high level API
public class PageViewCountApplication implements StreamApplication {
@Override public void init(StreamGraph graph, Config config) {
MessageStream<PageViewEvent> pageViewEvents = graph.getInputStream("pageView" );
MessageStream pageViewCount = graph.getOutputStream("pageViewCount" );
pageView
.partitionBy(m -> m.memberId)
.window(Windows.keyedTumblingWindow(m -> m.memberId, Duration.ofMinutes(5),
initialValue, (m, c) -> c + 1))
.map(MyStreamOutput::new)
.sendTo(pageViewPerMember);
}
}
Samza SQL
INSERT INTO kafka.pageViewCount
SELECT memberId, count(*) FROM kafka.pageViewStream
GROUP BY memberId, TUMBLE(current_timestamp, INTERVAL '5' MINUTES)
Samza API stack
User can choose the API to
write a Samza job.
Why SQL on Samza
• Expand the target audience of stream processing.
• Obtain quick real time insights.
• Create stream processing applications quickly.
How does it work?
How do we execute below SQL on Samza?
INSERT INTO kafka.NewEmployees
SELECT firstName, lastName FROM kafka.profileUpdateStream
WHERE profile.newCompany = ‘LinkedIn’
High level architecture
Samza SQL to Calcite relational algebra
INSERT INTO kafka.NewLinkedInEmployees
SELECT firstName, lastName FROM kafka.profileChange
WHERE profile.newCompany = ‘LinkedIn’
LogicalTableModify
LogicalProject
LogicalFilter
LogicalTableScan
Samza operator graph conversion
LogicalTableModify
LogicalProject
LogicalFilter
LogicalTableScan
profileChange
.filter(p -> p.getNewCompany().equals("LinkedIn"))
.map(this::getFirstAndLastName)
.sendTo(newLinkedInEmployees);
Samza SQL message flow
Samza SQL message flow
Samza SQL rel message format
public class SamzaSqlRelMessage {
private final List<Object> relFieldValues = new ArrayList<>();
private final List<String> relFieldNames = new ArrayList<>();
public List<String> getRelFieldNames() {
return relFieldNames;
}
public List<Object> getRelFieldValues() {
return this.relFieldValues;
}
}
• Simple relational format that represents a row in a table
• Ordered list of named values
Pluggable input/output resolvers
INSERT INTO kafka.NewEmployees
SELECT firstName, lastName FROM kafka.profileUpdateStream
WHERE profile.newCompany = ‘LinkedIn’
Samza SQL architecture
Demo
Demo setup
How do you use it?
• Samza SQL is available in Samza 0.14 release.
• Tutorial – http://bit.ly/samzasql
Samza– 0.14
• Samza SQL
• Projection, Filtering, UDFs, Flatten, Union, Avro
• Apache Beam runner for Samza
• Azure EventHub support
• Amazon kinesis support
• Multi stage batch support
• High level API improvements
• Durable state
• Programmable SerDe
Samza SQL- Future
• Joins (Stream-Stream & Stream-Table)
• Aggregates & aggregate UDF
• Full Subquery support
• Samza SQL as a service
Samza SQL- Future
• Joins (Stream-Stream & Stream-Table)
• Aggregates & aggregate UDF
• Full Subquery support
• Samza SQL as a service
Questions?
Thank you
Samza operator graph conversion
LogicalTableModify
LogicalProject
LogicalFilter
LogicalTableScan
Pluggable schema and message converters

More Related Content

What's hot

KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafkaconfluent
 
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...Flink Forward
 
High Available Task Scheduling Design using Kafka and Kafka Streams | Naveen ...
High Available Task Scheduling Design using Kafka and Kafka Streams | Naveen ...High Available Task Scheduling Design using Kafka and Kafka Streams | Naveen ...
High Available Task Scheduling Design using Kafka and Kafka Streams | Naveen ...HostedbyConfluent
 
Getting Started with Confluent Schema Registry
Getting Started with Confluent Schema RegistryGetting Started with Confluent Schema Registry
Getting Started with Confluent Schema Registryconfluent
 
Apache Samza: Reliable Stream Processing Atop Apache Kafka and Hadoop YARN
Apache Samza: Reliable Stream Processing Atop Apache Kafka and Hadoop YARNApache Samza: Reliable Stream Processing Atop Apache Kafka and Hadoop YARN
Apache Samza: Reliable Stream Processing Atop Apache Kafka and Hadoop YARNblueboxtraveler
 
Going Reactive with Spring 5
Going Reactive with Spring 5Going Reactive with Spring 5
Going Reactive with Spring 5Drazen Nikolic
 
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud" Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud" Flink Forward
 
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...Flink Forward
 
Diving into the Deep End - Kafka Connect
Diving into the Deep End - Kafka ConnectDiving into the Deep End - Kafka Connect
Diving into the Deep End - Kafka Connectconfluent
 
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...HostedbyConfluent
 
Performance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State StoresPerformance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State Storesconfluent
 
Kafka Streams: the easiest way to start with stream processing
Kafka Streams: the easiest way to start with stream processingKafka Streams: the easiest way to start with stream processing
Kafka Streams: the easiest way to start with stream processingYaroslav Tkachenko
 
Kick your database_to_the_curb_reston_08_27_19
Kick your database_to_the_curb_reston_08_27_19Kick your database_to_the_curb_reston_08_27_19
Kick your database_to_the_curb_reston_08_27_19confluent
 
How to manage large amounts of data with akka streams
How to manage large amounts of data with akka streamsHow to manage large amounts of data with akka streams
How to manage large amounts of data with akka streamsIgor Mielientiev
 
Flink Forward SF 2017: Shaoxuan Wang_Xiaowei Jiang - Blinks Improvements to F...
Flink Forward SF 2017: Shaoxuan Wang_Xiaowei Jiang - Blinks Improvements to F...Flink Forward SF 2017: Shaoxuan Wang_Xiaowei Jiang - Blinks Improvements to F...
Flink Forward SF 2017: Shaoxuan Wang_Xiaowei Jiang - Blinks Improvements to F...Flink Forward
 
From Zero to Streaming Healthcare in Production (Alexander Kouznetsov, Invita...
From Zero to Streaming Healthcare in Production (Alexander Kouznetsov, Invita...From Zero to Streaming Healthcare in Production (Alexander Kouznetsov, Invita...
From Zero to Streaming Healthcare in Production (Alexander Kouznetsov, Invita...confluent
 
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanWebinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanVerverica
 
Apache kafka meet_up_zurich_at_swissre_from_zero_to_hero_with_kafka_connect_2...
Apache kafka meet_up_zurich_at_swissre_from_zero_to_hero_with_kafka_connect_2...Apache kafka meet_up_zurich_at_swissre_from_zero_to_hero_with_kafka_connect_2...
Apache kafka meet_up_zurich_at_swissre_from_zero_to_hero_with_kafka_connect_2...confluent
 
Pulsar in the Lakehouse: Overview of Apache Pulsar and Delta Lake Connector -...
Pulsar in the Lakehouse: Overview of Apache Pulsar and Delta Lake Connector -...Pulsar in the Lakehouse: Overview of Apache Pulsar and Delta Lake Connector -...
Pulsar in the Lakehouse: Overview of Apache Pulsar and Delta Lake Connector -...StreamNative
 
Actors or Not: Async Event Architectures
Actors or Not: Async Event ArchitecturesActors or Not: Async Event Architectures
Actors or Not: Async Event ArchitecturesYaroslav Tkachenko
 

What's hot (20)

KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafka
 
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...
 
High Available Task Scheduling Design using Kafka and Kafka Streams | Naveen ...
High Available Task Scheduling Design using Kafka and Kafka Streams | Naveen ...High Available Task Scheduling Design using Kafka and Kafka Streams | Naveen ...
High Available Task Scheduling Design using Kafka and Kafka Streams | Naveen ...
 
Getting Started with Confluent Schema Registry
Getting Started with Confluent Schema RegistryGetting Started with Confluent Schema Registry
Getting Started with Confluent Schema Registry
 
Apache Samza: Reliable Stream Processing Atop Apache Kafka and Hadoop YARN
Apache Samza: Reliable Stream Processing Atop Apache Kafka and Hadoop YARNApache Samza: Reliable Stream Processing Atop Apache Kafka and Hadoop YARN
Apache Samza: Reliable Stream Processing Atop Apache Kafka and Hadoop YARN
 
Going Reactive with Spring 5
Going Reactive with Spring 5Going Reactive with Spring 5
Going Reactive with Spring 5
 
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud" Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
 
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...
Flink Forward San Francisco 2018: Stefan Richter - "How to build a modern str...
 
Diving into the Deep End - Kafka Connect
Diving into the Deep End - Kafka ConnectDiving into the Deep End - Kafka Connect
Diving into the Deep End - Kafka Connect
 
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
 
Performance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State StoresPerformance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State Stores
 
Kafka Streams: the easiest way to start with stream processing
Kafka Streams: the easiest way to start with stream processingKafka Streams: the easiest way to start with stream processing
Kafka Streams: the easiest way to start with stream processing
 
Kick your database_to_the_curb_reston_08_27_19
Kick your database_to_the_curb_reston_08_27_19Kick your database_to_the_curb_reston_08_27_19
Kick your database_to_the_curb_reston_08_27_19
 
How to manage large amounts of data with akka streams
How to manage large amounts of data with akka streamsHow to manage large amounts of data with akka streams
How to manage large amounts of data with akka streams
 
Flink Forward SF 2017: Shaoxuan Wang_Xiaowei Jiang - Blinks Improvements to F...
Flink Forward SF 2017: Shaoxuan Wang_Xiaowei Jiang - Blinks Improvements to F...Flink Forward SF 2017: Shaoxuan Wang_Xiaowei Jiang - Blinks Improvements to F...
Flink Forward SF 2017: Shaoxuan Wang_Xiaowei Jiang - Blinks Improvements to F...
 
From Zero to Streaming Healthcare in Production (Alexander Kouznetsov, Invita...
From Zero to Streaming Healthcare in Production (Alexander Kouznetsov, Invita...From Zero to Streaming Healthcare in Production (Alexander Kouznetsov, Invita...
From Zero to Streaming Healthcare in Production (Alexander Kouznetsov, Invita...
 
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanWebinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
 
Apache kafka meet_up_zurich_at_swissre_from_zero_to_hero_with_kafka_connect_2...
Apache kafka meet_up_zurich_at_swissre_from_zero_to_hero_with_kafka_connect_2...Apache kafka meet_up_zurich_at_swissre_from_zero_to_hero_with_kafka_connect_2...
Apache kafka meet_up_zurich_at_swissre_from_zero_to_hero_with_kafka_connect_2...
 
Pulsar in the Lakehouse: Overview of Apache Pulsar and Delta Lake Connector -...
Pulsar in the Lakehouse: Overview of Apache Pulsar and Delta Lake Connector -...Pulsar in the Lakehouse: Overview of Apache Pulsar and Delta Lake Connector -...
Pulsar in the Lakehouse: Overview of Apache Pulsar and Delta Lake Connector -...
 
Actors or Not: Async Event Architectures
Actors or Not: Async Event ArchitecturesActors or Not: Async Event Architectures
Actors or Not: Async Event Architectures
 

Similar to Stream Processing using Samza SQL

SamzaSQL QCon'16 presentation
SamzaSQL QCon'16 presentationSamzaSQL QCon'16 presentation
SamzaSQL QCon'16 presentationYi Pan
 
Apache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's NextApache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's NextPrateek Maheshwari
 
Nextcon samza preso july - final
Nextcon samza preso   july - finalNextcon samza preso   july - final
Nextcon samza preso july - finalYi Pan
 
Data Microservices In The Cloud + 日本語コメント
Data Microservices In The Cloud + 日本語コメントData Microservices In The Cloud + 日本語コメント
Data Microservices In The Cloud + 日本語コメントTakuya Saeki
 
SynapseIndia dotnet development ajax client library
SynapseIndia dotnet development ajax client librarySynapseIndia dotnet development ajax client library
SynapseIndia dotnet development ajax client librarySynapseindiappsdevelopment
 
Scaling asp.net websites to millions of users
Scaling asp.net websites to millions of usersScaling asp.net websites to millions of users
Scaling asp.net websites to millions of usersoazabir
 
KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!Guido Schmutz
 
Event streaming webinar feb 2020
Event streaming webinar feb 2020Event streaming webinar feb 2020
Event streaming webinar feb 2020Maheedhar Gunturu
 
Integrating SAP the Java EE Way - JBoss One Day talk 2012
Integrating SAP the Java EE Way - JBoss One Day talk 2012Integrating SAP the Java EE Way - JBoss One Day talk 2012
Integrating SAP the Java EE Way - JBoss One Day talk 2012hwilming
 
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQLBuilding a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQLScyllaDB
 
10 performance and scalability secrets of ASP.NET websites
10 performance and scalability secrets of ASP.NET websites10 performance and scalability secrets of ASP.NET websites
10 performance and scalability secrets of ASP.NET websitesoazabir
 
Multi Client Development with Spring
Multi Client Development with SpringMulti Client Development with Spring
Multi Client Development with SpringJoshua Long
 
SMC304 Serverless Orchestration with AWS Step Functions
SMC304 Serverless Orchestration with AWS Step FunctionsSMC304 Serverless Orchestration with AWS Step Functions
SMC304 Serverless Orchestration with AWS Step FunctionsAmazon Web Services
 
Event Streaming with Kafka Streams and Spring Cloud Stream | Soby Chacko, VMware
Event Streaming with Kafka Streams and Spring Cloud Stream | Soby Chacko, VMwareEvent Streaming with Kafka Streams and Spring Cloud Stream | Soby Chacko, VMware
Event Streaming with Kafka Streams and Spring Cloud Stream | Soby Chacko, VMwareHostedbyConfluent
 
Spring MVC introduction HVA
Spring MVC introduction HVASpring MVC introduction HVA
Spring MVC introduction HVAPeter Maas
 
[JEEConf-2017] RxJava as a key component in mature Big Data product
[JEEConf-2017] RxJava as a key component in mature Big Data product[JEEConf-2017] RxJava as a key component in mature Big Data product
[JEEConf-2017] RxJava as a key component in mature Big Data productIgor Lozynskyi
 

Similar to Stream Processing using Samza SQL (20)

SamzaSQL QCon'16 presentation
SamzaSQL QCon'16 presentationSamzaSQL QCon'16 presentation
SamzaSQL QCon'16 presentation
 
Apache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's NextApache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's Next
 
Nextcon samza preso july - final
Nextcon samza preso   july - finalNextcon samza preso   july - final
Nextcon samza preso july - final
 
Data Microservices In The Cloud + 日本語コメント
Data Microservices In The Cloud + 日本語コメントData Microservices In The Cloud + 日本語コメント
Data Microservices In The Cloud + 日本語コメント
 
SynapseIndia dotnet development ajax client library
SynapseIndia dotnet development ajax client librarySynapseIndia dotnet development ajax client library
SynapseIndia dotnet development ajax client library
 
Scaling asp.net websites to millions of users
Scaling asp.net websites to millions of usersScaling asp.net websites to millions of users
Scaling asp.net websites to millions of users
 
KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!
 
Full Stack Scala
Full Stack ScalaFull Stack Scala
Full Stack Scala
 
Event streaming webinar feb 2020
Event streaming webinar feb 2020Event streaming webinar feb 2020
Event streaming webinar feb 2020
 
Integrating SAP the Java EE Way - JBoss One Day talk 2012
Integrating SAP the Java EE Way - JBoss One Day talk 2012Integrating SAP the Java EE Way - JBoss One Day talk 2012
Integrating SAP the Java EE Way - JBoss One Day talk 2012
 
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQLBuilding a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
 
Spring Web MVC
Spring Web MVCSpring Web MVC
Spring Web MVC
 
10 performance and scalability secrets of ASP.NET websites
10 performance and scalability secrets of ASP.NET websites10 performance and scalability secrets of ASP.NET websites
10 performance and scalability secrets of ASP.NET websites
 
Multi Client Development with Spring
Multi Client Development with SpringMulti Client Development with Spring
Multi Client Development with Spring
 
Serverless Apps with AWS Step Functions
Serverless Apps with AWS Step FunctionsServerless Apps with AWS Step Functions
Serverless Apps with AWS Step Functions
 
SMC304 Serverless Orchestration with AWS Step Functions
SMC304 Serverless Orchestration with AWS Step FunctionsSMC304 Serverless Orchestration with AWS Step Functions
SMC304 Serverless Orchestration with AWS Step Functions
 
Event Streaming with Kafka Streams and Spring Cloud Stream | Soby Chacko, VMware
Event Streaming with Kafka Streams and Spring Cloud Stream | Soby Chacko, VMwareEvent Streaming with Kafka Streams and Spring Cloud Stream | Soby Chacko, VMware
Event Streaming with Kafka Streams and Spring Cloud Stream | Soby Chacko, VMware
 
Spring MVC introduction HVA
Spring MVC introduction HVASpring MVC introduction HVA
Spring MVC introduction HVA
 
[JEEConf-2017] RxJava as a key component in mature Big Data product
[JEEConf-2017] RxJava as a key component in mature Big Data product[JEEConf-2017] RxJava as a key component in mature Big Data product
[JEEConf-2017] RxJava as a key component in mature Big Data product
 
Jsf intro
Jsf introJsf intro
Jsf intro
 

Recently uploaded

Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learningmisbanausheenparvam
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 

Recently uploaded (20)

Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 

Stream Processing using Samza SQL

  • 2. Agenda 1 What is Samza SQL? 2 Why SQL on Samza? 3 How does it work? 4 Demo 5 Q&A
  • 3.
  • 5. Samza SQL by Example Count page views of each member in a five minute window. Send the result to kafka topic PageViewCount.
  • 6. Samza low level task API Repartitioner Job public class PageViewRepartitioner implements StreamTask { SystemStream outputStream = new SystemStream("kafka", "pvMemberId"); @Override public void process(IncomingMessageEnvelope envelope, MessageCollector collector, TaskCoordinator coordinator) { PageViewEvent pageViewEvent = (PageViewEvent) envelope.getMessage(); String key = pageViewEvent.getMemberId(); OutgoingMessageEnvelope outMessage = new OutgoingMessageEnvelope(outputStream, pageViewEvent, key, pageViewEvent); collector.send(outMessage); } }
  • 7. Samza low level task API (contd.) Page view counter job public class PageViewCounter implements StreamTask { SystemStream outputStream = new SystemStream("kafka", "pageviewCount"); private Instant lastTriggerTime = Instant.now(); private HashMap<String, Integer> counter = new HashMap<>(); @Override public void process(IncomingMessageEnvelope envelope, MessageCollector collector, TaskCoordinator coordinator) { PageViewEvent pageViewEvent = (PageViewEvent) envelope.getMessage(); String memberId = pageViewEvent.getMemberId(); counter.put(memberId, counter.getOrDefault(memberId, 0) + 1); if (Duration.between(lastTriggerTime, Instant.now()).toMinutes() > 5) { counter.forEach((key, value) -> collector.send(new OutgoingMessageEnvelope(outputStream, key, value))); counter.clear(); } } }
  • 8. Samza high level API public class PageViewCountApplication implements StreamApplication { @Override public void init(StreamGraph graph, Config config) { MessageStream<PageViewEvent> pageViewEvents = graph.getInputStream("pageView" ); MessageStream pageViewCount = graph.getOutputStream("pageViewCount" ); pageView .partitionBy(m -> m.memberId) .window(Windows.keyedTumblingWindow(m -> m.memberId, Duration.ofMinutes(5), initialValue, (m, c) -> c + 1)) .map(MyStreamOutput::new) .sendTo(pageViewPerMember); } }
  • 9. Samza SQL INSERT INTO kafka.pageViewCount SELECT memberId, count(*) FROM kafka.pageViewStream GROUP BY memberId, TUMBLE(current_timestamp, INTERVAL '5' MINUTES)
  • 10. Samza API stack User can choose the API to write a Samza job.
  • 11. Why SQL on Samza • Expand the target audience of stream processing. • Obtain quick real time insights. • Create stream processing applications quickly.
  • 12. How does it work?
  • 13. How do we execute below SQL on Samza? INSERT INTO kafka.NewEmployees SELECT firstName, lastName FROM kafka.profileUpdateStream WHERE profile.newCompany = ‘LinkedIn’
  • 15. Samza SQL to Calcite relational algebra INSERT INTO kafka.NewLinkedInEmployees SELECT firstName, lastName FROM kafka.profileChange WHERE profile.newCompany = ‘LinkedIn’ LogicalTableModify LogicalProject LogicalFilter LogicalTableScan
  • 16. Samza operator graph conversion LogicalTableModify LogicalProject LogicalFilter LogicalTableScan profileChange .filter(p -> p.getNewCompany().equals("LinkedIn")) .map(this::getFirstAndLastName) .sendTo(newLinkedInEmployees);
  • 19. Samza SQL rel message format public class SamzaSqlRelMessage { private final List<Object> relFieldValues = new ArrayList<>(); private final List<String> relFieldNames = new ArrayList<>(); public List<String> getRelFieldNames() { return relFieldNames; } public List<Object> getRelFieldValues() { return this.relFieldValues; } } • Simple relational format that represents a row in a table • Ordered list of named values
  • 20. Pluggable input/output resolvers INSERT INTO kafka.NewEmployees SELECT firstName, lastName FROM kafka.profileUpdateStream WHERE profile.newCompany = ‘LinkedIn’
  • 22. Demo
  • 24. How do you use it? • Samza SQL is available in Samza 0.14 release. • Tutorial – http://bit.ly/samzasql
  • 25. Samza– 0.14 • Samza SQL • Projection, Filtering, UDFs, Flatten, Union, Avro • Apache Beam runner for Samza • Azure EventHub support • Amazon kinesis support • Multi stage batch support • High level API improvements • Durable state • Programmable SerDe
  • 26. Samza SQL- Future • Joins (Stream-Stream & Stream-Table) • Aggregates & aggregate UDF • Full Subquery support • Samza SQL as a service
  • 27. Samza SQL- Future • Joins (Stream-Stream & Stream-Table) • Aggregates & aggregate UDF • Full Subquery support • Samza SQL as a service
  • 29.
  • 31. Samza operator graph conversion LogicalTableModify LogicalProject LogicalFilter LogicalTableScan
  • 32. Pluggable schema and message converters