Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Mark Needham
@MarkHNeedham
15th November 2022
Mary Grygleski
@mgrygles
Mary Grygleski
The Passionate Developer Advocate
Mary is a Streaming Developer Advocate at DataStax, a
leading Data Management Company that specializes in
Database-as-a-Service, NoSQL, Big Data, Streaming, and
the Cloud-Native platform. Previously she was with the
Java and WebSphere/Open Source Advocacy team at
IBM.
Based out of Chicago, Mary is a Java Champion and
President and Executive Board Member of the Chicago
Java Users Group (CJUG). She is also a co-organizer of
the Data, Cloud and AI In Chicago, Chicago Cloud, and
IBM Cloud Chicago meetup groups.
She has extensive experience in product and application
design, development, integration, and deployment,
and specializes in event-driven, reactive
Java, open source, and cloud-enabled distributed
systems.
https://www.linkedin.com/in/mary-grygleski/
@mgrygles
https://www.twitch.tv/mgrygles
https://discord.gg/RMU4Juw
Who is Mary?
Mark Needham
Developer Relations Engineer
Mark Needham is an Apache Pinot advocate and
developer relations engineer at StarTree.
As a developer relations engineer, Mark helps users
learn how to use Apache Pinot to build their real-time
user-facing analytics applications. He also works on
developer experience, simplifying the getting-started
experience through product tweaks and
improvements to the documentation.
Mark writes about his experiences working with Pinot at
markhneedham.com.
https://www.linkedin.com/in/markhneedham/
@markhneedham
Who is Mark?
https://www.markhneedham.com/blog/
learndatawithmark.com
What is Real-Time Analytics?
Real-time analytics is the discipline that applies logic and mathematics
to data to provide insights for making better decisions quickly.
Events -> Insight -> Action
The value of data over time
[Chart: Value plotted against Time, with the Real-Time region marked]
Who’s interested in this data?
● Analysts
● Management
● Users
Real-Time Analytics Quadrant
Axes: Human Facing / Machine Facing and Internal / External
Use cases: Observability, Real-Time Dashboard, Recommendation Engine, Fraud Detection, Order Tracking Service
Total users: 700 million
QPS: 10,000+
Latency SLA: < 100 ms (p99)
Freshness: seconds
Examples of Real-Time Analytics
Examples of Real-Time Analytics
● Missed orders
● Inaccurate orders
● Downtime
● Top selling items
● Menu item feedback
Total users: 500,000+
QPS: 100s
Latency SLA: < 100 ms (p99)
Freshness: seconds to minutes
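The latency SLAs above are stated at the 99th percentile. As a minimal sketch of what that check means, the snippet below computes a nearest-rank p99 over a list of query latencies and compares it with a 100 ms limit. The function names and the sample latencies are hypothetical and are not taken from the talk.

```python
def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile (pct is an integer) using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 gives the 1-based rank.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def meets_sla(latencies_ms, pct=99, limit_ms=100):
    """True when the pct-th percentile latency is below the limit."""
    return percentile_nearest_rank(latencies_ms, pct) < limit_ms


# Hypothetical sample: 990 fast queries and 10 slow ones.
latencies = [20] * 990 + [250] * 10
print(percentile_nearest_rank(latencies, 99), meets_sla(latencies))
```

One more slow query in the same sample (11 out of 1000) pushes the p99 value over the limit, which is why a p99 target is stricter than an average.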
Examples of Real-Time Analytics
Source:
Peter Bakkum, Engineering Manager @Stripe Financial
Properties of Real-Time Analytics Systems
Building a User-facing Real-Time Analytics System
● Velocity of ingestion
● Real-time ingestion
● 1000s of QPS
● Millisecond latency
● Seconds freshness
● Highly available
● Scalable
● Cost effective
● High dimensionality
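Freshness in the list above can be read as the delay between an event happening and that event being queryable. A minimal sketch under that assumption follows; the function names, the 5-second budget, and the timestamp pairs are all hypothetical.

```python
def freshness_seconds(event_time_s, queryable_time_s):
    """Seconds between an event occurring and it becoming queryable."""
    return queryable_time_s - event_time_s


def within_budget(observations, budget_s):
    """True when every observed event became queryable within budget_s seconds."""
    return all(freshness_seconds(e, q) <= budget_s for e, q in observations)


# Hypothetical (event time, queryable time) pairs in epoch seconds.
observations = [
    (1647344554, 1647344555),
    (1647344556, 1647344559),
    (1647344560, 1647344561),
]
worst = max(freshness_seconds(e, q) for e, q in observations)
print(worst, within_budget(observations, budget_s=5))
```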
What is Apache Pulsar?
Open source
Created by Yahoo
Contributed to the Apache Software Foundation (ASF) in 2016
Top-level project (2018)
Cloud-native design
Cluster based
Multi-tenant
Simple client APIs (Java, C#, Python, Go, …)
➔ Separate compute and storage!
Guaranteed message delivery
If a message successfully reaches a Pulsar broker, it will be delivered to its intended target.
Light-weight serverless functions framework
Create complex processing logic within a Pulsar cluster (aka: data pipeline)
Tiered storage offloads
Offload data from hot/warm storage to cold/long-term storage when the data is aging out
Meet Pulsar
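The tiered storage idea can be illustrated with a toy model. The sketch below is not Pulsar's API and does not model how Pulsar's offload policies are configured; the `TieredStore` class, the age threshold, and the segment names are hypothetical and only show data moving from a hot tier to a cold tier as it ages.

```python
from dataclasses import dataclass, field


@dataclass
class TieredStore:
    """Toy two-tier store: segment id -> creation time in seconds."""
    max_hot_age_s: int
    hot: dict = field(default_factory=dict)
    cold: dict = field(default_factory=dict)

    def add_segment(self, segment_id, created_at_s):
        self.hot[segment_id] = created_at_s

    def offload(self, now_s):
        """Move segments older than max_hot_age_s to cold storage; return their ids."""
        aged = [s for s, t in self.hot.items() if now_s - t > self.max_hot_age_s]
        for segment_id in aged:
            self.cold[segment_id] = self.hot.pop(segment_id)
        return aged


store = TieredStore(max_hot_age_s=3600)
store.add_segment("seg-1", created_at_s=0)
store.add_segment("seg-2", created_at_s=3000)
moved = store.offload(now_s=4000)
print(moved, sorted(store.hot), sorted(store.cold))
```

Readers of the data would still need a way to find a segment in either tier; that lookup is left out to keep the sketch short.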
Streaming versus not streaming
Streaming (stages shown): Ingest data, Process data, Sink data, Select data
Not streaming (stages shown): Ingest data, Persist data, Select data, Process data
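One way to read the contrast is that a non-streaming system persists everything first and processes it afterwards, while a streaming system processes each event as it arrives. The sketch below illustrates that reading with a simple count per event type; the function names and sample events are hypothetical.

```python
def batch_counts(events):
    """Not streaming: persist all events first, then select and process them."""
    stored = list(events)  # persist
    counts = {}
    for event in stored:   # select and process after the fact
        counts[event["type"]] = counts.get(event["type"], 0) + 1
    return counts


def streaming_counts(events):
    """Streaming: update the result on every event and expose it immediately."""
    counts = {}
    for event in events:
        counts[event["type"]] = counts.get(event["type"], 0) + 1
        yield dict(counts)  # snapshot of the result so far


events = [{"type": "edit"}, {"type": "new"}, {"type": "edit"}]
snapshots = list(streaming_counts(events))
print(snapshots[0], snapshots[-1], batch_counts(events))
```

Both approaches reach the same final answer; the streaming version has a usable partial answer after the first event.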
What is Apache Pinot?
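Apache Pinot Architecture
[Diagram: Pinot Controller, Pinot Broker, Zookeeper, and Pinot Servers S1 to S4 holding segments 1 to 4]
Segment assignment (one server per segment):
Seg1 -> S1
Seg2 -> S2
Seg3 -> S3
Seg4 -> S4
Segment assignment (two servers per segment):
Seg1 -> S1, S4
Seg2 -> S2, S3
Seg3 -> S3, S1
Seg4 -> S4, S2
Example query:
select count(*) from X
where country = 'us'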
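The segment-to-server table and the count query suggest a scatter-gather flow: the broker picks one server per segment, each server counts matching rows in its segments, and the broker sums the partial counts. The sketch below is a toy model of that flow, not Pinot's routing implementation. The replica map is copied from the slide; the rows per segment, the function names, and the least-loaded selection rule are assumptions made for illustration.

```python
# Segment -> replica servers, as listed on the slide.
ROUTING = {
    "Seg1": ["S1", "S4"],
    "Seg2": ["S2", "S3"],
    "Seg3": ["S3", "S1"],
    "Seg4": ["S4", "S2"],
}

# Hypothetical rows stored in each segment.
SEGMENT_ROWS = {
    "Seg1": [{"country": "us"}, {"country": "uk"}],
    "Seg2": [{"country": "us"}],
    "Seg3": [{"country": "de"}, {"country": "us"}],
    "Seg4": [{"country": "fr"}],
}


def plan(routing, unavailable=frozenset()):
    """Pick one live replica per segment; return server -> list of segments."""
    assignment = {}
    for segment, servers in routing.items():
        live = [s for s in servers if s not in unavailable]
        if not live:
            raise RuntimeError(f"no live replica for {segment}")
        # Assumed rule: prefer the live replica with the fewest segments so far.
        target = min(live, key=lambda s: len(assignment.get(s, [])))
        assignment.setdefault(target, []).append(segment)
    return assignment


def server_count(segments, country):
    """Partial count computed by one server over its assigned segments."""
    return sum(
        1 for seg in segments for row in SEGMENT_ROWS[seg] if row["country"] == country
    )


def broker_count(country, unavailable=frozenset()):
    """Scatter the query to the chosen servers and gather the partial counts."""
    return sum(
        server_count(segs, country) for segs in plan(ROUTING, unavailable).values()
    )


print(broker_count("us"), broker_count("us", unavailable={"S1"}))
```

Because every segment has two replicas, the answer is unchanged when one server is unavailable; only the plan differs.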
Demo Time! 🥳
github.com/mneedham/pinot-wiki/tree/pulsar
Demo Architecture
Our data set: Wikimedia Recent Changes Feed
● A continuous stream of structured event data
describing changes made to Wikimedia properties.
● Published over HTTP using the Server-Sent Events
(SSE) protocol.
Wikimedia Recent Changes Feed events
event: message
id: [{"topic":"eqiad.mediawiki.recentchange","partition":0,"timestamp":1647344554001},{"topic":"codfw.mediawiki.recentchange","partition":0,"offset":-1}]
data: {"$schema":"/mediawiki/recentchange/1.0.0","meta":{"uri":"https://en.wikipedia.org/wiki/Bosmansdam_High_School","request_id":"f72015bb-376c-48b9-9863-afc0c75a72c8","id":"99c272ae-d31c-4535-9dac-69b0983171d6","dt":"2022-03-15T11:42:34Z","domain":"en.wikipedia.org","stream":"mediawiki.recentchange","topic":"eqiad.mediawiki.recentchange","partition":0,"offset":3714501013},"id":1485381286,"type":"edit","namespace":0,"title":"Bosmansdam High School","comment":"v2.04b - Fix errors for [[WP:WCW|CW project]] (Template value ends with break)","timestamp":1647344554,"user":"ZI Jony","bot":false,"minor":true,"length":{"old":16089,"new":16085},"revision":{"old":1075262250,"new":1077261343},"server_url":"https://en.wikipedia.org","server_name":"en.wikipedia.org","server_script_path":"/w","wiki":"enwiki","parsedcomment":"v2.04b - Fix errors for <a href=\"/wiki/Wikipedia:WCW\" class=\"mw-redirect\" title=\"Wikipedia:WCW\">CW project</a> (Template value ends with break)"}
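An event in this format can be turned into a Python dictionary with a small Server-Sent Events parser. The sketch below is a simplified reading of the SSE line format (fields separated by a colon, a blank line ending each event) and uses only the standard library. The `parse_sse` function is hypothetical, and the sample event is a shortened version of the one shown on the slide, not the full payload.

```python
import json


def parse_sse(lines):
    """Yield one dict per complete event from an iterable of text lines."""
    fields, data = {}, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; events without data are skipped.
            if data:
                yield {**fields, "data": "\n".join(data)}
            fields, data = {}, []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if name == "data":
            data.append(value)
        else:
            fields[name] = value


# Shortened, hypothetical version of the event shown above.
sample = [
    "event: message\n",
    'id: [{"topic":"eqiad.mediawiki.recentchange","partition":0,"timestamp":1647344554001}]\n',
    'data: {"type":"edit","title":"Bosmansdam High School","bot":false,"wiki":"enwiki"}\n',
    "\n",
]
changes = [json.loads(event["data"]) for event in parse_sse(sample)]
print(changes[0]["title"], changes[0]["bot"])
```

In a pipeline like the one in the demo, each decoded change would then be published to a stream for ingestion; that step is left out here because it depends on the client library in use.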
Demo Done! 😌
Powered by Apache Pinot
Community
GitHub stars: 3.9k
Slack users: 2400+
Companies: 100+
Performance
Events/sec: 1M+
Peak QPS: 200k+
Query latency: ms
pinot.apache.org
Who else is using Pulsar?
Takeaways
● Real-time analytics lets us create applications that give users
actionable insights
● Properties of these systems: Fresh data, fast querying, at scale
● Pulsar + Pinot is the perfect combination to achieve this
Thank you! (from Mark) 🙇
dev.startree.ai
@MarkHNeedham
stree.ai/slack
@learndatawithmark
Thank you! (from Mary)
@mgrygles
Apache Pulsar Slack sign-up
https://apache-pulsar.herokuapp.com/
https://pulsar-neighborhood.github.io/
Resources
Astra DB: https://astra.datastax.com
Astra Streaming:
https://www.datastax.com/products/astra-streaming
Luna Streaming:
https://www.datastax.com/products/luna-streaming
CDC for Astra DB:
https://docs.datastax.com/en/astra/docs/astream-cdc.html
https://pulsar.apache.org/
https://bookkeeper.apache.org/
https://zookeeper.apache.org
Check out 5 Minutes About Pulsar on
https://bit.ly/3bgkRxJ
How to start coding?
Check out Awesome-Astra
https://awesome-astra.github.io/docs/
Follow Mary’s Twitch Stream
(Different topics: Java, Open Source, Distributed Messaging, Event-Streaming, Cloud, DevOps, etc.)
Wednesday at 2pm US/CST
https://twitch.tv/mgrygles
Publishing Messages to Kafka
Creating Pinot Table
docker exec -it pinot-controller-wiki bin/pinot-admin.sh \
  AddTable \
  -tableConfigFile /config/table.json \
  -schemaFile /config/schema.json \
  -exec
Publishing Messages to Kafka
Pinot
Pinot
Streamlit Dashboard
Streamlit Dashboard: Bots?
Streamlit Dashboard: Top Users
Streamlit Dashboard: Top Bots/Non Bots
Streamlit Dashboard: What got changed?
Streamlit Dashboard: By who?

More Related Content

What's hot

Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
DataWorks Summit
 
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
StreamNative
 
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureServerless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Kai Wähner
 
CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®
confluent
 
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Databricks
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
Flink Forward
 
Building Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeBuilding Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta Lake
Flink Forward
 
Processing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesProcessing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial Services
Flink Forward
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
Gleb Kanterov
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its Benefits
Databricks
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in Flink
Flink Forward
 
Bootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkBootstrapping state in Apache Flink
Bootstrapping state in Apache Flink
DataWorks Summit
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
Databricks
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
Databricks
 
Building Reliable Data Lakes at Scale with Delta Lake
Building Reliable Data Lakes at Scale with Delta LakeBuilding Reliable Data Lakes at Scale with Delta Lake
Building Reliable Data Lakes at Scale with Delta Lake
Databricks
 
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
HostedbyConfluent
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
Altinity Ltd
 
Introducing Change Data Capture with Debezium
Introducing Change Data Capture with DebeziumIntroducing Change Data Capture with Debezium
Introducing Change Data Capture with Debezium
ChengKuan Gan
 
Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration Options
Timothy Spann
 
Iceberg: a fast table format for S3
Iceberg: a fast table format for S3Iceberg: a fast table format for S3
Iceberg: a fast table format for S3
DataWorks Summit
 

What's hot (20)

Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
 
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
 
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureServerless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
 
CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®
 
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
 
Building Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeBuilding Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta Lake
 
Processing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesProcessing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial Services
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its Benefits
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in Flink
 
Bootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkBootstrapping state in Apache Flink
Bootstrapping state in Apache Flink
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
 
Building Reliable Data Lakes at Scale with Delta Lake
Building Reliable Data Lakes at Scale with Delta LakeBuilding Reliable Data Lakes at Scale with Delta Lake
Building Reliable Data Lakes at Scale with Delta Lake
 
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
 
Introducing Change Data Capture with Debezium
Introducing Change Data Capture with DebeziumIntroducing Change Data Capture with Debezium
Introducing Change Data Capture with Debezium
 
Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration Options
 
Iceberg: a fast table format for S3
Iceberg: a fast table format for S3Iceberg: a fast table format for S3
Iceberg: a fast table format for S3
 

Similar to Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot

GIB2020 - Building Event-Driven Integration Architectures
GIB2020 - Building Event-Driven Integration ArchitecturesGIB2020 - Building Event-Driven Integration Architectures
GIB2020 - Building Event-Driven Integration Architectures
Daniel Toomey
 
Set Your Data In Motion - CTO Roundtable
Set Your Data In Motion - CTO RoundtableSet Your Data In Motion - CTO Roundtable
Set Your Data In Motion - CTO Roundtable
confluent
 
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesEvent Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
Kai Wähner
 
8base Hyperledger Miami Meetup Presentation
8base Hyperledger Miami Meetup Presentation8base Hyperledger Miami Meetup Presentation
8base Hyperledger Miami Meetup Presentation
8base
 
8base Hyperledger Miami Meetup 20180719
8base Hyperledger Miami Meetup 201807198base Hyperledger Miami Meetup 20180719
8base Hyperledger Miami Meetup 20180719
Oscar Perez
 
Sviluppare Applicazioni Real Time con AppSync Deck.pptx
Sviluppare Applicazioni Real Time con AppSync Deck.pptxSviluppare Applicazioni Real Time con AppSync Deck.pptx
Sviluppare Applicazioni Real Time con AppSync Deck.pptxAmazon Web Services
 
WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and m...
WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and m...WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and m...
WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and m...
Sriskandarajah Suhothayan
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
Guido Schmutz
 
Blueprint Series: Architecture Patterns for Implementing Serverless Microserv...
Blueprint Series: Architecture Patterns for Implementing Serverless Microserv...Blueprint Series: Architecture Patterns for Implementing Serverless Microserv...
Blueprint Series: Architecture Patterns for Implementing Serverless Microserv...
Matt Stubbs
 
Building Event-Driven (Micro)Services with Apache Kafka
Building Event-Driven (Micro)Services with Apache KafkaBuilding Event-Driven (Micro)Services with Apache Kafka
Building Event-Driven (Micro)Services with Apache Kafka
Guido Schmutz
 
From Kafka to BigQuery - Strata Singapore
From Kafka to BigQuery - Strata SingaporeFrom Kafka to BigQuery - Strata Singapore
From Kafka to BigQuery - Strata Singapore
Ofir Sharony
 
WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and More
WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and MoreWSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and More
WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and More
WSO2
 
GIBC2018 - Building Event Driven Cloud Solutions with Microsoft Azure Event Grid
GIBC2018 - Building Event Driven Cloud Solutions with Microsoft Azure Event GridGIBC2018 - Building Event Driven Cloud Solutions with Microsoft Azure Event Grid
GIBC2018 - Building Event Driven Cloud Solutions with Microsoft Azure Event Grid
Harris Kristanto
 
The Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
The Enterprise Guide to Building a Data Mesh - Introducing SpecMeshThe Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
The Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
IanFurlong4
 
Data & analytics challenges in a microservice architecture
Data & analytics challenges in a microservice architectureData & analytics challenges in a microservice architecture
Data & analytics challenges in a microservice architecture
Niels Naglé
 
WSO2 Analytics Platform - The one stop shop for all your data needs
WSO2 Analytics Platform - The one stop shop for all your data needsWSO2 Analytics Platform - The one stop shop for all your data needs
WSO2 Analytics Platform - The one stop shop for all your data needs
Sriskandarajah Suhothayan
 
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystemStrata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Shirshanka Das
 
Architecting for change: LinkedIn's new data ecosystem
Architecting for change: LinkedIn's new data ecosystemArchitecting for change: LinkedIn's new data ecosystem
Architecting for change: LinkedIn's new data ecosystem
Yael Garten
 
Introduction to Azure monitor
Introduction to Azure monitorIntroduction to Azure monitor
Introduction to Azure monitor
Praveen Nair
 
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Databricks
 

Similar to Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot (20)

GIB2020 - Building Event-Driven Integration Architectures
GIB2020 - Building Event-Driven Integration ArchitecturesGIB2020 - Building Event-Driven Integration Architectures
GIB2020 - Building Event-Driven Integration Architectures
 
Set Your Data In Motion - CTO Roundtable
Set Your Data In Motion - CTO RoundtableSet Your Data In Motion - CTO Roundtable
Set Your Data In Motion - CTO Roundtable
 
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesEvent Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
 
8base Hyperledger Miami Meetup Presentation
8base Hyperledger Miami Meetup Presentation8base Hyperledger Miami Meetup Presentation
8base Hyperledger Miami Meetup Presentation
 
8base Hyperledger Miami Meetup 20180719
8base Hyperledger Miami Meetup 201807198base Hyperledger Miami Meetup 20180719
8base Hyperledger Miami Meetup 20180719
 
Sviluppare Applicazioni Real Time con AppSync Deck.pptx
Sviluppare Applicazioni Real Time con AppSync Deck.pptxSviluppare Applicazioni Real Time con AppSync Deck.pptx
Sviluppare Applicazioni Real Time con AppSync Deck.pptx
 
WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and m...
WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and m...WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and m...
WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and m...
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 
Blueprint Series: Architecture Patterns for Implementing Serverless Microserv...
Blueprint Series: Architecture Patterns for Implementing Serverless Microserv...Blueprint Series: Architecture Patterns for Implementing Serverless Microserv...
Blueprint Series: Architecture Patterns for Implementing Serverless Microserv...
 
Building Event-Driven (Micro)Services with Apache Kafka
Building Event-Driven (Micro)Services with Apache KafkaBuilding Event-Driven (Micro)Services with Apache Kafka
Building Event-Driven (Micro)Services with Apache Kafka
 
From Kafka to BigQuery - Strata Singapore
From Kafka to BigQuery - Strata SingaporeFrom Kafka to BigQuery - Strata Singapore
From Kafka to BigQuery - Strata Singapore
 
WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and More
WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and MoreWSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and More
WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and More
 
GIBC2018 - Building Event Driven Cloud Solutions with Microsoft Azure Event Grid
GIBC2018 - Building Event Driven Cloud Solutions with Microsoft Azure Event GridGIBC2018 - Building Event Driven Cloud Solutions with Microsoft Azure Event Grid
GIBC2018 - Building Event Driven Cloud Solutions with Microsoft Azure Event Grid
 
The Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
The Enterprise Guide to Building a Data Mesh - Introducing SpecMeshThe Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
The Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
 
Data & analytics challenges in a microservice architecture
Data & analytics challenges in a microservice architectureData & analytics challenges in a microservice architecture
Data & analytics challenges in a microservice architecture
 
WSO2 Analytics Platform - The one stop shop for all your data needs
WSO2 Analytics Platform - The one stop shop for all your data needsWSO2 Analytics Platform - The one stop shop for all your data needs
WSO2 Analytics Platform - The one stop shop for all your data needs
 
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystemStrata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
 
Architecting for change: LinkedIn's new data ecosystem
Architecting for change: LinkedIn's new data ecosystemArchitecting for change: LinkedIn's new data ecosystem
Architecting for change: LinkedIn's new data ecosystem
 
Introduction to Azure monitor
Introduction to Azure monitorIntroduction to Azure monitor
Introduction to Azure monitor
 
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
 

More from Altinity Ltd

Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptxBuilding an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Altinity Ltd
 
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Altinity Ltd
 
Building an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open SourceBuilding an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open Source
Altinity Ltd
 
Fun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdfFun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdf
Altinity Ltd
 
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdfCloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Altinity Ltd
 
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Altinity Ltd
 
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Altinity Ltd
 
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot

  • 1. Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot. Mark Needham (@MarkHNeedham) and Mary Grygleski (@mgrygles), 15th November 2022.
  • 2. Who is Mary? Mary Grygleski, The Passionate Developer Advocate. Mary is a Streaming Developer Advocate at DataStax, a leading Data Management Company that specializes in Database-as-a-Service, NoSQL, Big Data, Streaming, and the Cloud-Native platform. Previously she was with the Java and WebSphere/Open Source Advocacy team at IBM. Based out of Chicago, Mary is a Java Champion and President and Executive Board Member of the Chicago Java Users Group (CJUG). She is also a co-organizer of the Data, Cloud and AI In Chicago, Chicago Cloud, and IBM Cloud Chicago meetup groups. She has extensive experience in product and application design, development, integration, and deployment, and specializes in Event-driven, Reactive Java, Open Source, and Cloud-enabled Distributed systems. Links: https://www.linkedin.com/in/mary-grygleski/ ● @mgrygles ● https://www.twitch.tv/mgrygles ● https://discord.gg/RMU4Juw
  • 3. Who is Mark? Mark Needham, Developer Relations Engineer. Mark Needham is an Apache Pinot advocate and developer relations engineer at StarTree. As a developer relations engineer, Mark helps users learn how to use Apache Pinot to build their real-time user-facing analytics applications. He also works on developer experience, simplifying the getting-started experience by making product tweaks and improvements to the documentation. Mark writes about his experiences working with Pinot at markhneedham.com. Links: https://www.linkedin.com/in/markhneedham/ ● @markhneedham ● https://www.markhneedham.com/blog/ ● learndatawithmark.com
  • 4. What is Real-Time Analytics? Real-time analytics is the discipline that applies logic and mathematics to data to provide insights for making better decisions quickly.
  • 7. Events -> Insight -> Action
  • 8. The value of data over time (chart axes: Time, Value)
  • 9. The value of data over time (chart axes: Time, Value; label: Real-Time)
  • 10. The value of data over time (chart axes: Time, Value; label: Real-Time). Who’s interested in this data? ● Analysts ● Management ● Users
  • 11. Real-Time Analytics Quadrant (axes: Human Facing vs. Machine Facing; Internal vs. External). Examples placed on the quadrant: Observability, Real-Time Dashboard, Recommendation Engine, Fraud Detection, Order Tracking Service
  • 12. Examples of Real-Time Analytics ● Total users: 700 million ● QPS: 10,000+ ● Latency SLA: < 100 ms (p99) ● Freshness: seconds
  • 13. Examples of Real-Time Analytics ● Use cases: Missed orders, Inaccurate orders, Downtime, Top selling items, Menu item Feedback ● Total users: 500,000+ ● QPS: 100s ● Latency SLA: < 100 ms (p99) ● Freshness: seconds to minutes
  • 14. Examples of Real-Time Analytics Source: Peter Bakkum, Engineering Manager @Stripe Financial
  • 15. Properties of Real-Time Analytics Systems
  • 16. Building a User-facing Real-Time Analytics System ● Velocity of ingestion: Real-Time Ingestion ● 1000s of QPS ● Milliseconds Latency ● Seconds Freshness ● Highly Available ● Scalable ● Cost Effective ● High Dimensionality
  • 17. What is Apache Pulsar?
  • 18. Meet Pulsar ● Open source: created by Yahoo; contributed to the Apache Software Foundation (ASF) in 2016; top-level project (2018) ● Cloud-native design: cluster based; multi-tenant; simple client APIs (Java, C#, Python, Go, …); separate compute and storage ● Guaranteed message delivery: if a message successfully reaches a Pulsar broker, it will be delivered to its intended target ● Light-weight serverless functions framework: create complex processing logic within a Pulsar cluster (aka a data pipeline) ● Tiered storage offloads: offload data from hot/warm storage to cold/long-term storage when the data is aging out
  • 19. Streaming versus not streaming (diagram labels, in extracted order) ● Streaming: Ingest data, Sink data, Select data, Process data ● Not streaming: Ingest data, Persist data, Select data, Process data ● Additional labels: Persist data, Select data
  • 20. What is Apache Pinot?
  • 21. Apache Pinot Architecture (diagram: Pinot Controller, Zookeeper, Pinot Broker, and Pinot Servers S1 to S4 holding segments 1 to 4). Segment-to-server mappings shown: (a) Seg1 -> S1, Seg2 -> S2, Seg3 -> S3, Seg4 -> S4; (b) Seg1 -> S1, S4; Seg2 -> S2, S3; Seg3 -> S3, S1; Seg4 -> S4, S2. Example query: select count(*) from X where country = us
  • 24. Real-Time Analytics Quadrant (axes: Human Facing vs. Machine Facing; Internal vs. External). Examples placed on the quadrant: Observability, Real-Time Dashboard, Recommendation Engine, Fraud Detection, Order Tracking Service
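To make the segment-to-server mapping on the architecture slide concrete, here is a minimal sketch of how a broker might scatter a `count(*)` query across servers and gather the partial results. It is an illustration only: the row data, the function names, and the rotate-by-query-id replica choice are all assumptions made for this example, not Pinot's actual routing logic.

```python
from collections import defaultdict

# Replicated segment-to-server mapping as listed on the slide.
ROUTING_TABLE = {
    "Seg1": ["S1", "S4"],
    "Seg2": ["S2", "S3"],
    "Seg3": ["S3", "S1"],
    "Seg4": ["S4", "S2"],
}

# Hypothetical contents of each segment: only the `country` column is modelled.
SEGMENT_ROWS = {
    "Seg1": ["us", "uk", "us"],
    "Seg2": ["fr", "us"],
    "Seg3": ["de"],
    "Seg4": ["us", "us", "in"],
}


def plan_query(routing_table, query_id=0):
    """Pick one replica per segment (assumed strategy: rotate by query_id)."""
    plan = defaultdict(list)
    for segment, replicas in sorted(routing_table.items()):
        server = replicas[query_id % len(replicas)]
        plan[server].append(segment)
    return dict(plan)


def server_count(segments, country):
    """Server side: count matching rows in the segments it was asked to scan."""
    return sum(SEGMENT_ROWS[segment].count(country) for segment in segments)


def broker_count(country, query_id=0):
    """Broker side: scatter to servers, gather partial counts, merge them."""
    plan = plan_query(ROUTING_TABLE, query_id)
    partials = {server: server_count(segs, country) for server, segs in plan.items()}
    return sum(partials.values())


if __name__ == "__main__":
    print(plan_query(ROUTING_TABLE, query_id=1))
    print(broker_count("us"))
```

Because every segment is scanned exactly once whichever replica is chosen, the merged count is the same for any `query_id`; the replicas only change which servers do the work.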
  • 26. Our data set: Wikimedia Recent Changes Feed ● A continuous stream of structured event data describing changes made to Wikimedia properties. ● Published over HTTP using the Server-Sent Events (SSE) protocol.
  • 27. Wikimedia Recent Changes Feed events
      event: message
      id: [{"topic":"eqiad.mediawiki.recentchange","partition":0,"timestamp":1647344554001},{"topic":"codfw.mediawiki.recentchange","partition":0,"offset":-1}]
      data: {"$schema":"/mediawiki/recentchange/1.0.0","meta":{"uri":"https://en.wikipedia.org/wiki/Bosmansdam_High_School","request_id":"f72015bb-376c-48b9-9863-afc0c75a72c8","id":"99c272ae-d31c-4535-9dac-69b0983171d6","dt":"2022-03-15T11:42:34Z","domain":"en.wikipedia.org","stream":"mediawiki.recentchange","topic":"eqiad.mediawiki.recentchange","partition":0,"offset":3714501013},"id":1485381286,"type":"edit","namespace":0,"title":"Bosmansdam High School","comment":"v2.04b - Fix errors for [[WP:WCW|CW project]] (Template value ends with break)","timestamp":1647344554,"user":"ZI Jony","bot":false,"minor":true,"length":{"old":16089,"new":16085},"revision":{"old":1075262250,"new":1077261343},"server_url":"https://en.wikipedia.org","server_name":"en.wikipedia.org","server_script_path":"/w","wiki":"enwiki","parsedcomment":"v2.04b - Fix errors for <a href=\"/wiki/Wikipedia:WCW\" class=\"mw-redirect\" title=\"Wikipedia:WCW\">CW project</a> (Template value ends with break)"}
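The event above follows the Server-Sent Events wire format: `field: value` lines, with a blank line ending each event. The following sketch shows one way such a stream could be parsed and flattened before being published to a topic. The function names, the shortened sample event, and the choice of output fields are assumptions made for illustration; they are not taken from the talk's code.

```python
import json


def parse_sse(lines):
    """Yield one dict per SSE event from an iterable of text lines.

    Only the `event`, `id`, and `data` fields are kept. A blank line
    dispatches the event; lines starting with ':' are comments.
    """
    event, data = {}, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            if data:  # events without data are not dispatched
                event["data"] = "\n".join(data)
                yield event
            event, data = {}, []
            continue
        if line.startswith(":"):
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is stripped
        if field == "data":
            data.append(value)
        elif field in ("event", "id"):
            event[field] = value


def to_row(event):
    """Flatten one recent-change event (field selection is hypothetical)."""
    change = json.loads(event["data"])
    return {
        "id": change["id"],
        "type": change["type"],
        "title": change["title"],
        "user": change["user"],
        "bot": change["bot"],
        "domain": change["meta"]["domain"],
        "ts": change["timestamp"],
    }


# Shortened version of the event shown on the slide.
SAMPLE = [
    "event: message\n",
    'id: [{"topic":"eqiad.mediawiki.recentchange","partition":0,"timestamp":1647344554001}]\n',
    'data: {"meta":{"domain":"en.wikipedia.org"},"id":1485381286,"type":"edit",'
    '"title":"Bosmansdam High School","timestamp":1647344554,"user":"ZI Jony","bot":false}\n',
    "\n",
]

rows = [to_row(e) for e in parse_sse(SAMPLE)]

if __name__ == "__main__":
    print(rows)
```

In a real pipeline the `lines` argument would come from a streaming HTTP response rather than a list, and each row would be serialized and sent to a Pulsar producer; both of those steps are left out here so the sketch runs without any network access.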
  • 30. Powered by Apache Pinot ● Community: 3.9k GitHub stars, 2400+ Slack users, 100+ companies ● Performance: 1M+ events/sec, 200k+ peak QPS, query latency in ms ● pinot.apache.org
  • 31. Who else is using Pulsar?
  • 32. Takeaways ● Real-time analytics lets us create applications that give users actionable insights ● Properties of these systems: Fresh data, fast querying, at scale ● Pulsar + Pinot is the perfect combination to achieve this
  • 33. Thank you! (from Mark) 🙇 dev.startree.ai @MarkHNeedham stree.ai/slack @learndatawithmark
  • 34. Thank you! (from Mary) @mgrygles ● Apache Pulsar Slack sign-up: https://apache-pulsar.herokuapp.com/ ● https://pulsar-neighborhood.github.io/
  • 35. Resources ● Astra DB: https://astra.datastax.com ● Astra Streaming: https://www.datastax.com/products/astra-streaming ● Luna Streaming: https://www.datastax.com/products/luna-streaming ● CDC for Astra DB: https://docs.datastax.com/en/astra/docs/astream-cdc.html ● https://pulsar.apache.org/ ● https://bookkeeper.apache.org/ ● https://zookeeper.apache.org
  • 36. Check out 5 Minutes About Pulsar on https://bit.ly/3bgkRxJ
  • 37. How to start coding? Check out Awesome-Astra https://awesome-astra.github.io/docs/
  • 38. Follow Mary’s Twitch Stream (Different topics: Java, Open Source, Distributed Messaging, Event-Streaming, Cloud, DevOps, etc) Wednesday at 2pm-US/CST https://twitch.tv/mgrygles
  • 40. Creating Pinot Table: docker exec -it pinot-controller-wiki bin/pinot-admin.sh AddTable -tableConfigFile /config/table.json -schemaFile /config/schema.json -exec
  • 42. Pinot
  • 43. Pinot
  • 47. Streamlit Dashboard: Top Bots/Non Bots