Pinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale

Seunghyun Lee, Software Engineer at LinkedIn
Enabling Real-time Analytics Applications @ LinkedIn’s Scale
Mayank Shrivastava Jackie Jiang
Senior Software Engineer
Seunghyun Lee
Senior Software Engineer
Staff Software Engineer
Apache Pinot
Agenda
1. Introduction
2. Pinot @ LinkedIn
3. How to use Pinot
4. Pinot Performance
How is data generated and used at LinkedIn
Actor Verb
Member
Job
Post
Company
Object Life Cycle
Create
Generate
Analyze
Product
Data
Insights
• 600+ million members
• Tens of millions of posts liked/shared per day
• 3+ million jobs posted per month
• 30 million companies
• Trillions of events per day
Real-time Analytics Applications at LinkedIn
How to build an online analytics application?
• Real-time data ingestion
• Millions of active users, 1000s of queries per sec
• Super low latency (10s of ms)
• Highly available, always on
Approach 1. Join on the fly
[Diagram: Event Stream (Profile View) → Profile View Table; the Application Server joins the Profile View Table with the Member Table to serve "Who viewed my profile"]
• Real-time (depending on storage)
• High latency due to join
Approach 2. Pre Join + Pre Aggregate
• Near real-time ingestion
• Latency varies with query selectivity
[Diagram: Event Stream (Profile View) and Member Table → Stream Processing Engine (Pre Join + Pre Aggr) → Profile View Table → Application Server → "Who viewed my profile"]
Approach 3. Pre Join + Pre Aggregate + Pre Cube
• Very fast
• Batch ingestion (hourly / daily)
• Storage explosion
• Re-bootstrap on schema change
[Diagram: Event Stream (Profile View) and Member Table → Batch Processing Engine (Pre Join + Pre Aggr + Pre Cube) → Profile View Table → Application Server → "Who viewed my profile"]
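To make the "storage explosion" of pre-cubing concrete, here is a minimal Python sketch that pre-computes the aggregate for every combination of dimensions. The column names (industry, geo, views) and the rows are made up for illustration and do not come from the talk.

```python
from itertools import combinations

def pre_cube(rows, dims, metric):
    """Pre-compute sum(metric) for every subset of dims (2 ** len(dims) group-bys)."""
    cube = {}
    for size in range(len(dims) + 1):
        for group in combinations(dims, size):
            for row in rows:
                key = (group, tuple(row[d] for d in group))
                cube[key] = cube.get(key, 0) + row[metric]
    return cube

# Hypothetical pre-joined profile-view rows.
rows = [
    {"industry": "tech", "geo": "us", "views": 2},
    {"industry": "tech", "geo": "ca", "views": 1},
    {"industry": "finance", "geo": "us", "views": 4},
]
cube = pre_cube(rows, ["industry", "geo"], "views")

# A query becomes a single key lookup, but 3 input rows already yield 8 cube entries.
print(cube[(("geo",), ("us",))], len(cube))
```

Every added dimension doubles the number of group-by combinations, which is why the slide pairs "Very fast" with "Storage explosion".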
Latency vs. Flexibility
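[Figure: systems placed on a Latency axis (low to high) and a Flexibility axis (low to high), across the stages Profile View Table / Member Table, Pre-Join, Pre-Aggregation, and Pre-Cube. Systems shown: Spark SQL, Presto, Hive, Big Query, Druid, Elastic Search, Pinot, Kylin, KV Store.]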
Who Viewed My Profile @ LinkedIn
[Diagram components: Raw Tracking Data, Espresso, Stream Processing (Pre Join + Pre Aggr), Pre-joined Data, Data Lake, and the consumers WVMP, Dashboard, and Ad-hoc Queries]
What is Apache Pinot?
• OLAP Datastore
• Columnar, indexed storage
• Low latency analytics
• Distributed – highly available, reliable, scalable
• Lambda architecture
○ Offline data pushes + Real-time stream ingestion
• Open Source
Agenda
1. Introduction
2. Pinot @ LinkedIn
3. How to use Pinot
4. Pinot Performance
Pinot @ LinkedIn
• 70+ Member Facing Use Cases
• 2000+ Dashboards for Internal Business Metrics
• 100K+ Queries Per Second
• 1M+ Records Ingested Per Second
Pinot @ LinkedIn: Member Facing Analytics Report
• Providing analytics reports for LinkedIn member-facing applications
• Very high QPS (thousands)
• Requires strict latency SLA (10s of ms to sub-second)
Pinot @ LinkedIn: Interactive Dashboard
• Visualization tool for multi-dimensional metrics
• Complex, exploratory queries
• 2000+ metrics, used by 1000+ employees
Pinot @ LinkedIn: Anomaly Detection
• Efficiently detect and investigate anomalies in metrics
• ThirdEye: part of Apache Pinot open source
Pinot Usage @ Other Companies
Agenda
1. Introduction
2. Pinot @ LinkedIn
3. How to use Pinot
4. Pinot Performance
How to use Pinot
Batch Data Ingestion
Real-time Data Ingestion
SQL-like Query Interface (PQL)
Let’s build something cool
Event RSVP Data
How to use Pinot: Workflow
One Time Setup: Define Schema → Define Table Configuration → Create Table
Data Ingestion
• Batch (Scheduled Job): Raw Data (HDFS, S3, ADLS, NFS...) → Generate Pinot Segments → Push Data
• Real-time (One Time Setup): Streaming Data (Kafka, Event Hub...) → Setup Stream Data Source
How to use Pinot: Define Schema
● Schema name: meetupRsvp
● Dimension field specs
○ event_name (string)
○ event_time (long)
○ country (string)
○ city (string)
○ …
● Metrics field specs
○ rsvp_count (int)
● Time field spec
○ timestamp (long)
■ timetype: epoch / datetime
■ granularity: millisecond / second / hour / day
• Dimension: an attribute of your data (filter, group by)
• Metric: a number that is used to measure characteristics of a dimension (aggregation)
• Time: a timestamp of an event (partitioning, retention management)
Example Query - Top 10 events in US
SELECT event_name, sum(rsvp_count)
FROM meetupRsvp
WHERE country = 'us'
GROUP BY event_name
TOP 10
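The roles of dimension and metric columns in the example query can be sketched in plain Python. This is only an in-memory illustration of the query's semantics (filter on a dimension, group by a dimension, sum a metric), not how Pinot executes it; the sample rows are invented.

```python
from collections import Counter

def top_events(rows, country, limit=10):
    """Sum rsvp_count per event_name for one country and return the top `limit`."""
    totals = Counter()
    for row in rows:
        if row["country"] == country:                          # filter on a dimension
            totals[row["event_name"]] += row["rsvp_count"]     # aggregate a metric
    return totals.most_common(limit)

# Hypothetical meetupRsvp rows.
rows = [
    {"event_name": "pinot-meetup", "country": "us", "rsvp_count": 3},
    {"event_name": "kafka-night", "country": "us", "rsvp_count": 5},
    {"event_name": "pinot-meetup", "country": "us", "rsvp_count": 4},
    {"event_name": "data-day", "country": "ca", "rsvp_count": 9},
]
print(top_events(rows, "us"))
```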
How to use Pinot: Configure and Create Table
[Diagram: Pinot Schema and Table Config are submitted through the Pinot Admin Client]
Table Config
● Table name: meetupRsvp
● Table type: batch / realtime / hybrid
● Replication factor: 2
● Index Columns: ...
● Bloom filters: ...
● Retention: 30 days
● ...
How to use Pinot: Batch Ingestion
[Diagram: Raw Data (Json, CSV, Avro, Parquet, ORC...) on HDFS, S3, ADLS, NFS... → Segment Generation Job (library), using the Pinot Schema and Table Config → Pinot Segments on HDFS, S3, ADLS, NFS...]
How to use Pinot: Batch Ingestion
[Diagram: Raw Data (Json, Avro, Parquet, ORC...) on HDFS, S3, ADLS, NFS... → Segment Generation Job (library), using the Pinot Schema and Table Config → Pinot Segments on HDFS, S3, ADLS, NFS... → Segment Push Job (library)]
How to use Pinot: Segment Assignment
[Diagram: Segment Push Job → Controller (Helix, Zookeeper) → Server-0, Server-1, Server-2 inside Pinot; segments S0, S1, S2 are kept in the Segment Store (HDFS, S3, ADLS, NFS...) and replicated on the servers]
Segment Push Job → Controller:
1. Table name
2. Segment name
3. Segment URI path
• Assignment strategies
○ Uniform
○ Replica Group
○ Partition Aware
Example assignment:
● S0: Server-0, Server-1
● S1: Server-1, Server-2
● S2: Server-0, Server-2
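The example assignment on this slide (three segments, three servers, replication factor 2) can be reproduced with a simple round-robin scheme. The sketch below is only an illustration of a uniform-style assignment; the function name is hypothetical and this is not Pinot's actual assignment code.

```python
def assign_segments(segments, servers, replication):
    """Place each segment on `replication` consecutive servers, round-robin."""
    assignment = {}
    for i, segment in enumerate(segments):
        assignment[segment] = [
            servers[(i + r) % len(servers)] for r in range(replication)
        ]
    return assignment

servers = ["Server-0", "Server-1", "Server-2"]
assignment = assign_segments(["S0", "S1", "S2"], servers, replication=2)
for segment, replicas in assignment.items():
    print(segment, sorted(replicas))
```

Each server ends up hosting two segments, so both storage and query load are spread evenly.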
How to use Pinot: Query Routing
[Diagram: Queries → Broker → Server-0, Server-1, Server-2 inside Pinot, each hosting replicas of segments S0, S1, S2; Segment Push Job, Controller, Helix, and Segment Store (HDFS, S3, ADLS, NFS...) shown alongside]
• Routing Strategies
○ Uniform
○ Replica Group
○ Partition Aware
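Because every segment has more than one replica, the broker only needs to query one copy of each segment. A minimal sketch of a balanced routing choice follows, assuming the broker knows which servers host each segment; the names and the least-loaded heuristic are illustrative assumptions, not Pinot's routing implementation.

```python
def route(assignment):
    """Pick one replica per segment, always choosing the least-loaded server."""
    load = {}
    plan = {}
    for segment, replicas in assignment.items():
        # Tie-break on the server name so the result is deterministic.
        server = min(replicas, key=lambda s: (load.get(s, 0), s))
        load[server] = load.get(server, 0) + 1
        plan.setdefault(server, []).append(segment)
    return plan

# Segment-to-server mapping taken from the Segment Assignment slide.
hosted = {
    "S0": ["Server-0", "Server-1"],
    "S1": ["Server-1", "Server-2"],
    "S2": ["Server-0", "Server-2"],
}
plan = route(hosted)
print(plan)
```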
How to use Pinot: Batch + Realtime
[Diagram: Segment Push Job → Controller (Helix) → Offline Servers; Streaming Data (Kafka, Event Hub, Kinesis...) → Real-time Servers; Queries → Broker → Offline Servers and Real-time Servers]
Table Config
● Table name: meetupRsvp
● Table type: real-time
● Replication factor: 2
● Kafka broker: ...
● Kafka topic name: ...
● Retention: 5 days
● ...
• A single schema for both offline + real-time tables
How to use Pinot: Batch + Realtime
[Diagram: Segment Push Job → Controller (Helix) → Offline Servers; Streaming Data (Kafka, Event Hub, Kinesis...) → Real-time Servers; Queries → Broker → Offline Servers and Real-time Servers]
• Real-time servers keep consumed data in memory and periodically flush it to the segment store.
• Broker handles offline and real-time federation.
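The slides do not spell out how the broker federates the two tables. One way to avoid double counting rows that exist in both is to split the query at a time boundary: the offline table answers up to the boundary and the real-time table answers everything after it. The sketch below assumes that approach; the function and field names are hypothetical.

```python
def federated_sum(offline_rows, realtime_rows, time_boundary, metric="rsvp_count"):
    """Sum a metric across offline and real-time data without double counting."""
    offline_part = sum(
        r[metric] for r in offline_rows if r["timestamp"] <= time_boundary
    )
    realtime_part = sum(
        r[metric] for r in realtime_rows if r["timestamp"] > time_boundary
    )
    return offline_part + realtime_part

# Hypothetical data: the row at timestamp 2 exists in both tables.
offline = [{"timestamp": 1, "rsvp_count": 1}, {"timestamp": 2, "rsvp_count": 2}]
realtime = [{"timestamp": 2, "rsvp_count": 2}, {"timestamp": 3, "rsvp_count": 4}]
total = federated_sum(offline, realtime, time_boundary=2)
print(total)
```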
Quick Demo
Event RSVP Data
Agenda
1. Introduction
2. Pinot @ LinkedIn
3. How to use Pinot
4. Pinot Performance
Interactive Dashboard
select sum(pageView) from T
where country = us
and browser = chrome
...
group by time
• Human-driven queries
• Slice and dice over arbitrary dimensions
5000 Queries    Pinot         Druid
Total Time      11 minutes    24 minutes
P50             84ms          136ms
P90             206ms         667ms
Site Facing Analytics
select sum(articleViewCount) from T
where articleId = x
...
and time >= y and time < z
group by viewer[title|geo|industry]
• Pre-defined queries with different filtering values
• Usually have a filter on the primary key (e.g. articleId)
• High QPS (thousands), low latency (< 100ms for 99%) requirements
Anomaly Detection
for d1 in [us, ca, ...]
for d2 in [chrome, firefox, ...]
...
select sum(pageViews) from T
where country = d1 and browser = d2…
group by time
Filter Aggregation
select … where country = us …
Slow, scans 60-70% of the data
select … where country = ireland …
Scans less than 1%
• Identifying issues requires monitoring all possible combinations
• Data distribution can be skewed
Secret behind Pinot
Aggregation: Scan, Star-Tree Pre-aggregation
Filter: Scan, Inverted Index, Sorted Index, Star-Tree Index
Storage: Columnar Store, Encoding/Compression
[Legend (color coding not recoverable): Common Techniques / Pinot & Druid / Pinot Only]
Columnar Store
select sum(pageView) from T
where country = us
and browser = chrome
• Read relevant columns only
Raw Data (Row Based):
country   browser   ...
us        chrome    ...
ca        firefox   ...
jp        ie        ...
us        firefox   ...
ca        ie        ...
…         …         ...
[Diagram: the same Raw Data laid out Row Based vs. Column Based. Stack: Aggregation / Filter / Storage]
Encoding & Compression
select sum(pageView) from T
where country = us
and browser = chrome
Column Based:
country: us, ca, jp, us, ca, …
browser: chrome, firefox, ie, firefox, ie, …
Dictionary:
country: ca, jp, us, …
browser: chrome, firefox, ie, …
Forward Index:
country: 2, 0, 1, 2, 0, ...
browser: 0, 1, 2, 1, 2, ...
• Storage compression
○ Dictionary encoding
○ Bit compression
[Stack: Aggregation / Filter / Storage; highlighted: Encoding/Compression]
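The dictionary and forward index shown on this slide can be reproduced with a few lines of Python. This is a minimal sketch of dictionary encoding with a sorted dictionary, using the slide's sample columns; the bits-per-value calculation is a simple illustration of why bit compression helps, not Pinot's on-disk format.

```python
def dictionary_encode(values):
    """Return (sorted dictionary, forward index of dictIds, bits needed per dictId)."""
    dictionary = sorted(set(values))
    dict_ids = {value: i for i, value in enumerate(dictionary)}
    forward_index = [dict_ids[value] for value in values]
    # Smallest number of bits that can represent every dictId.
    bits_per_value = max(1, (len(dictionary) - 1).bit_length())
    return dictionary, forward_index, bits_per_value

country = ["us", "ca", "jp", "us", "ca"]
browser = ["chrome", "firefox", "ie", "firefox", "ie"]

country_dict, country_fwd, country_bits = dictionary_encode(country)
browser_dict, browser_fwd, browser_bits = dictionary_encode(browser)
print(country_dict, country_fwd, country_bits)
print(browser_dict, browser_fwd, browser_bits)
```

With three distinct values, each entry in the forward index fits in 2 bits instead of a full string.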
Inverted Index
select sum(pageView) from T
where country = us
and browser = chrome
Raw Data:
docId   country   browser
0       us        chrome
1       ca        firefox
2       jp        ie
3       us        firefox
4       ca        ie
…       …         …
Inverted Index:
country   docIds
ca        1, 4...
jp        2...
us        0, 3...
...       ...
browser   docIds
chrome    0 ...
firefox   1, 3...
ie        2, 4...
...       ...
• Storing bitmap for each value
• Fast filtering:
○ Constant time value lookup
○ Bit operations for AND/OR clause
[Stack: Aggregation / Filter / Storage; highlighted: Inverted Index]
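A minimal sketch of the bitmap idea, using Python integers as bitmaps (bit i set means docId i matches). The columns are the slide's sample data; the pageView values are invented, and a real system would use compressed bitmaps rather than plain integers.

```python
def build_inverted_index(column):
    """Map each distinct value to a bitmap of the docIds that contain it."""
    index = {}
    for doc_id, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << doc_id)
    return index

def doc_ids(bitmap):
    """Expand a bitmap into a sorted list of docIds."""
    return [i for i in range(bitmap.bit_length()) if bitmap >> i & 1]

country = build_inverted_index(["us", "ca", "jp", "us", "ca"])
browser = build_inverted_index(["chrome", "firefox", "ie", "firefox", "ie"])
page_view = [10, 5, 3, 7, 2]  # hypothetical metric values

# where country = us and browser = chrome  ->  one lookup per value, then a bitwise AND
matching = doc_ids(country["us"] & browser["chrome"])
print(matching, sum(page_view[d] for d in matching))
```

An OR clause works the same way with `|` instead of `&`.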
Sorted Index
select sum(pageView) from T
where country = us
and browser = chrome
• Better data compression:
○ Run length encoding
○ Can be accessed as forward/inverted index
• Spatial locality
Sorted column:
docId   country
0       ca
...     …
100     jp
101     us
…       …
300     us
…       …
Sorted index (value → docId range, readable as an inverted index):
country   start docId   end docId
ca        0             80
jp        81            100
us        101           300
…         …             …
[Stack: Aggregation / Filter / Storage; highlighted: Sorted Index]
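When a column is sorted, each value occupies one contiguous docId range, so the whole column compresses to a (start, end) pair per value. The sketch below rebuilds the slide's range table and shows both access directions; the function names are illustrative.

```python
import bisect

def build_sorted_index(sorted_column):
    """Run-length encode a sorted column into value -> (start docId, end docId)."""
    ranges = {}
    for doc_id, value in enumerate(sorted_column):
        start, _ = ranges.get(value, (doc_id, doc_id))
        ranges[value] = (start, doc_id)
    return ranges

def value_at(ranges, doc_id):
    """Forward lookup: find the value stored at doc_id via binary search."""
    values = list(ranges)                      # insertion order follows docId order
    starts = [ranges[v][0] for v in values]
    return values[bisect.bisect_right(starts, doc_id) - 1]

# 81 rows of "ca", 20 of "jp", 200 of "us", matching the ranges on the slide.
column = ["ca"] * 81 + ["jp"] * 20 + ["us"] * 200
ranges = build_sorted_index(column)

print(ranges["us"])           # inverted lookup: one contiguous docId range
print(value_at(ranges, 100))  # forward lookup
```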
Latency vs. Space Trade-off
select sum(pageView) from T
where country = us
and browser = chrome
[Figure: latency plotted against space requirement, with scan, Star-Tree, and pre-cube marked]
Star-Tree Index
select sum(pageView) from T
where country = us
and browser = chrome
[Stack: Aggregation / Filter / Storage; highlighted: Star-Tree Pre-aggregation, Star-Tree Index]
[Figure: latency plotted against space requirement for thresholds T=infinity, T=1,000,000, T=10,000, T=100, T=1]
• Configurable trade-off between latency and space through partial pre-aggregation
• Can achieve a hard upper bound on query latency
Star-Tree Index
Flexible Query Execution Plan
Query Optimization
• select max(col) from T
→ Use metadata instead of scanning
• select sum(metric) from T where country = us and accountId = x
→ Reorder filters based on the available indexes (apply the accountId predicate before the country predicate)
The segment-level physical query planner chooses the best way to execute the query based on the segment metadata and the available indexes.
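Both optimizations can be sketched in a few lines. The segment layout below (a metadata dict with per-column max values and an inverted index of docId sets) is a hypothetical structure chosen for illustration, and ordering predicates by the number of matching docIds is one plausible heuristic rather than the planner's documented rule.

```python
def max_from_metadata(segments, column):
    """Answer select max(col) from per-segment metadata, without reading rows."""
    return max(segment["metadata"][column]["max"] for segment in segments)

def order_predicates(predicates, inverted_index):
    """Apply the most selective equality predicate first."""
    return sorted(predicates,
                  key=lambda p: len(inverted_index[p[0]].get(p[1], ())))

def filter_docs(predicates, inverted_index):
    """Intersect docId sets in planned order, stopping early when empty."""
    result = None
    for column, value in order_predicates(predicates, inverted_index):
        docs = inverted_index[column].get(value, set())
        result = docs if result is None else result & docs
        if not result:
            break
    return result or set()

# Hypothetical segment contents.
segments = [{"metadata": {"col": {"max": 42}}}, {"metadata": {"col": {"max": 17}}}]
inverted_index = {
    "country": {"us": {0, 1, 2, 3}, "ca": {4}},
    "accountId": {"x": {2}, "y": {0, 4}},
}
predicates = [("country", "us"), ("accountId", "x")]
print(max_from_metadata(segments, "col"))
print(order_predicates(predicates, inverted_index))
print(filter_docs(predicates, inverted_index))
```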
Global Optimizations
Problem: Querying all segments
Solution: Segment pruning to minimize the number of segments to query
Problem: Querying all servers
Solution: Smart segment assignment to reduce the fan-out to servers
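Segment pruning can be illustrated with time ranges: if each segment records the minimum and maximum timestamp it contains, a query with a filter such as time >= y and time < z only needs the segments whose range overlaps [y, z). The field names below are assumptions made for this sketch.

```python
def prune_segments(segments, start, end):
    """Keep only segments whose [min_time, max_time] overlaps the query range [start, end)."""
    return [
        segment["name"]
        for segment in segments
        if segment["max_time"] >= start and segment["min_time"] < end
    ]

# Hypothetical daily segments with their time ranges (in days).
segments = [
    {"name": "S0", "min_time": 0, "max_time": 9},
    {"name": "S1", "min_time": 10, "max_time": 19},
    {"name": "S2", "min_time": 20, "max_time": 29},
]
print(prune_segments(segments, start=12, end=20))
```

Only the segments that survive pruning are sent to servers, which also shrinks the set of servers that must take part in the query.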
Conclusion
[Diagram: User Activity Data feeding Member Facing Applications, Interactive Dashboard, and Anomaly Detection]
Contributing to Pinot
• We are looking for contributions!
• Apache Pinot (incubating) 0.1.0 is available at https://pinot.apache.org
• Pinot Twitter Account: https://twitter.com/ApachePinot
• Pinot Meetup Page: https://www.meetup.com/apache-pinot
• Pinot Slack Channel: https://tinyurl.com/pinotSlackChannel
Folks behind Pinot
Mayank Shrivastava
Subbu Subramaniam
Jean-Francois Im
Jackie Jiang
Seunghyun Lee
Jennifer Dai
Neha Pawar
Jialiang Li
Sunitha Beeram
Shraddha Sahay
Kishore Gopalakrishna
Xiang Fu
James Shao
Prasanna Ravi
John Gutmann
Dino Occhialini
Walter Huf
Xiaohui Sun
Long Huynh
Akshay Rai
Alexander Pucher
Jihao Zhang
Felix Cheung
Olivier Lamy
Jim Jagielski
Marcel Siegrist
Roman Shaposhnik
Anurag Shendge
Thank you
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale

  • 1. Enabling Real-time Analytics Applications @ LinkedIn’s Scale Mayank Shrivastava Jackie Jiang Senior Software Engineer Seunghyun Lee Senior Software EngineerStaff Software Engineer Apache Pinot
  • 2. 1 2 3 4 Agenda Introduction Pinot @ LinkedIn How to use Pinot Pinot Performance
  • 3. How is data generated and used at LinkedIn Actor Verb Member Job Post Company Object Life Cycle Create Generate Analyze Product DataInsights 600+ million members Tens of million posts likes/shared per day 3+ million jobs posted per month 30 million companies Trillions of events per day
  • 5. How to build an online analytics application? • Real-time data ingestion • Millions of active users, 1000s of queries per sec • Super low latency (10s ms) • Highly available, always on
  • 6. Approach 1. Join on the fly Event Stream Profile View Profile View Table Member Table Application Server Who viewed my profile • Real-time (depending on storage) • High latency due to join
  • 7. Approach 2. Pre Join + Pre Aggregate • Near real-time ingestion • Latency varies with query selectivity Event Stream Profile View Profile View Table Member Table Application Server Who viewed my profile Stream Processing Engine Pre Join + Pre Aggr
  • 8. Approach 3. Pre Join + Pre Aggregate + Pre Cube • Very fast • Batch ingestion (hourly / daily) • Storage explosion • Re-bootstrap on schema change Event Stream Profile View Profile View Table Member Table Application Server Who viewed my profile Batch Processing Engine Pre Join + Pre Aggr + Pre Cube
  • 9. Latency vs. Flexibility Profile View Table Member Table Pre-Join Pre-Aggregation Pre-Cube Spark SQL Presto Hive Big Query Druid Elastic Search Pinot Kylin KV Store Latency Flexibility lowhigh lowhigh Pinot
  • 10. Who Viewed My Profile @ LinkedIn Data Lake Stream Processing WVMP Dashboard Ad-hoc Queries Espresso Raw Tracking Data Pre-joined Data Pre Join + Pre Aggr
  • 11. What is Apache Pinot? • OLAP Datastore • Columnar, indexed storage • Low latency analytics • Distributed – highly available, reliable, scalable • Lambda architecture ○ Offline data pushes + Real-time stream ingestion • Open Source
  • 12. 1 2 3 4 Agenda Introduction Pinot @ LinkedIn How to use Pinot Pinot Performance
  • 13. Pinot @ LinkedIn 70+ 2000+ 100K+ 1M+ Member Facing Use Cases Dashboards for Internal Business Metrics Queries Per Second Records Ingested Per Second
  • 14. Pinot @ LinkedIn: Member Facing Analytics Report • Providing analytics reports for Linkedin member-facing applications • Very high QPS (Thousands) • Requires strict latency SLA (10s ms - sub-sec)
  • 15. Pinot @ LinkedIn: Interactive Dashboard • Visualization tool for multi-dimensional metrics • Complex, explorative queries • 2000+ metrics, used by 1000+ employees
  • 16. Pinot @ LinkedIn: Anomaly Detection • Efficiently detect and investigate anomalies in metrics • Third Eye: Part of Apache Pinot open source
  • 17. Pinot Usage @ Other Companies
  • 18. 1 2 3 4 Agenda Introduction Pinot @ LinkedIn How to use Pinot Pinot Performance
  • 19. How to use Pinot Batch Data Ingestion Real-time Data Ingestion SQL-like Query Interface (PQL)
  • 20. Let’s build something cool Event RSVP Data
  • 21. How to use Pinot: Workflow Define Schema Define Table Configuration Create Table One Time Setup Raw Data Generate Pinot Segments Push Data Streaming Data Setup Stream Data Source Batch (Scheduled Job) Real-time (One Time Setup) Data Ingestion HDFS, S3, ADSL, NFS... Kafka, Event Hub...
  • 22. How to use Pinot: Define Schema ● Schema name: meetupRsvp ● Dimension field specs ○ event_name (string) ○ event_time (long) ○ country (string) ○ city (string) ○ … ● Metrics field specs ○ rsvp_count (int) ● Time field spec ○ timestamp (long) ■ timetype: epoch / datetime ■ granularity: millisecond / second / hour / day • Dimension: an attribute of your data (filter, group by) • Metric: a number that is used to measure characteristics of a dimension (aggregation) • Time: a timestamp of an event (partitioning, retention management) • Example Query (Top 10 events in US): SELECT event_name, sum(rsvp_count) FROM meetupRsvp WHERE country = 'us' GROUP BY event_name TOP 10
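
As a rough illustration of the schema described on this slide, the sketch below builds the meetupRsvp schema as a Python dictionary and serializes it to JSON. The field names and types come from the slide; the JSON key names (`dimensionFieldSpecs`, `metricFieldSpecs`, `timeFieldSpec`, `incomingGranularitySpec`) are an assumption about Pinot's schema file layout and should be checked against the Pinot documentation for the version in use.

```python
import json

# Hypothetical schema for the meetupRsvp example. Column names and types are
# taken from the slide; the JSON key names are an assumption about Pinot's
# schema file layout, not something the slides spell out.
schema = {
    "schemaName": "meetupRsvp",
    "dimensionFieldSpecs": [
        {"name": "event_name", "dataType": "STRING"},
        {"name": "event_time", "dataType": "LONG"},
        {"name": "country", "dataType": "STRING"},
        {"name": "city", "dataType": "STRING"},
    ],
    "metricFieldSpecs": [
        {"name": "rsvp_count", "dataType": "INT"},
    ],
    "timeFieldSpec": {
        "incomingGranularitySpec": {
            "name": "timestamp",
            "dataType": "LONG",
            "timeType": "MILLISECONDS",
        }
    },
}

# Serialize so the schema could be written to a file and handed to an admin tool.
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```

Dimensions are the columns the example query filters and groups on (`country`, `event_name`), while the single metric (`rsvp_count`) is the column it aggregates.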
  • 23. How to use Pinot: Configure and Create Table Pinot Schema Table Config ● Table name: meetupRsvp ● Table type: batch / realtime / hybrid ● Replication factor: 2 ● Index Columns: ... ● Bloom filters: ... ● Retention: 30 days ● ... Pinot Admin Client
  • 24. How to use Pinot: Batch Ingestion Raw Data Raw Data Raw Data Segment Generation Job (library) Json, CSV, Avro, Parquet, ORC... Pinot Schema Table Config Pinot Segment Pinot Segment Pinot Segment HDFS, S3, ADLS, NFS... HDFS, S3, ADLS, NFS...
  • 25. How to use Pinot: Batch Ingestion Raw Data Segment Generation Job (library) Json, Avro, Parquet, ORC... Pinot Schema Table Config Pinot Segment Pinot Segment Pinot Segment Segment Push Job (library) HDFS, S3, ADLS, NFS... HDFS, S3, ADLS, NFS...
  • 26. How to use Pinot: Segment Assignment Segment Push Job Controller Helix Zookeeper Server-0 Server-1 Server-2 Pinot • Assignment strategies ○ Uniform ○ Replica Group ○ Partition Aware Segment Store S0 S1 S2 HDFS, S3, ADLS, NFS... ● S0: Server-0, Server-1 ● S1: Server-1, Server-2 ● S2: Server-0, Server-2 1. Table name 2. Segment name 3. Segment URI path
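
The example assignment on this slide (three segments, three servers, two replicas each) can be reproduced with a simple round-robin sketch. This is only a stand-in to show the idea of spreading replicas evenly; it is not Pinot's actual assignment code, and the function name `assign_uniform` is hypothetical.

```python
from typing import Dict, List


def assign_uniform(segments: List[str], servers: List[str],
                   replication: int) -> Dict[str, List[str]]:
    """Round-robin stand-in for a uniform strategy: each segment gets
    `replication` distinct servers, and load is spread evenly."""
    if replication > len(servers):
        raise ValueError("replication factor exceeds number of servers")
    assignment = {}
    for i, segment in enumerate(segments):
        # Start at a different server for each segment, then wrap around.
        assignment[segment] = [
            servers[(i + r) % len(servers)] for r in range(replication)
        ]
    return assignment


assignment = assign_uniform(
    ["S0", "S1", "S2"], ["Server-0", "Server-1", "Server-2"], replication=2
)
for segment, hosts in assignment.items():
    print(segment, hosts)
```

With these inputs every server ends up hosting two segments, and each segment lands on the same pair of servers as in the slide.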
  • 27. How to use Pinot: Query Routing Segment Push Job Controller Helix • Routing Strategies ○ Uniform ○ Replica Group ○ Partition Aware Broker Queries Segment Store S0 S1 S2 HDFS, S3, ADLS, NFS... Server-0 Server-1 Server-2 Pinot
  • 28. How to use Pinot: Batch + Realtime Segment Push Job Controller Helix Real-time Servers Offline Servers Broker Queries Pinot Streaming Data Kafka, Event Hub, Kinesis... Table Config ● Table name: meetupRsvp ● Table type: real-time ● Replication factor: 2 ● Kafka broker: ... ● Kafka topic name: ... ● Retention: 5 days ● ... • A single schema for both offline + real-time tables
  • 29. How to use Pinot: Batch + Realtime Segment Push Job Controller Helix Real-time Servers Offline Servers Broker Queries Pinot Streaming Data Kafka, Event Hub, Kinesis... • Real-time servers keep consumed data in memory, periodically flush data to segment store. • Broker handles offline and real-time federation.
  • 31. 1 2 3 4 Agenda Introduction Pinot @ LinkedIn How to use Pinot Pinot Performance
  • 32. Interactive Dashboard select sum(pageView) from T where country = us and browser = chrome ... group by time • Human-driven queries • Slice and dice over arbitrary dimensions • Benchmark over 5000 queries (Pinot vs. Druid): Total Time 11 minutes vs. 24 minutes; P50 84ms vs. 136ms; P90 206ms vs. 667ms
  • 33. Site Facing Analytics select sum(articleViewCount) from T where articleId = x ... and time >= y and time < z group by viewer[title|geo|industry] • Pre-defined queries with different filtering values • Usually have a filter on the primary key (e.g. articleId) • High QPS (thousands), low latency (< 100ms for 99% of queries) requirements
  • 34. Anomaly Detection for d1 in [us, ca, ...] for d2 in [chrome, firefox, ...] ... select sum(pageViews) from T where country = d1 and browser = d2… group by time • Identifying issues requires monitoring all possible combinations • Data distribution can be skewed ○ select … where country = us …: slow, scans 60-70% of the data ○ select … where country = ireland …: scans less than 1% [Diagram labels: Filter, Aggregation]
  • 35. Secret behind Pinot Aggregation Filter Storage Scan Star-Tree Pre-aggregation Scan Inverted Index Columnar Store Encoding/Compression Sorted Index Star-Tree Index ❏ Common Techniques ❏ Pinot & Druid ❏ Pinot Only select sum(pageView) from T where country = us and browser = chrome
  • 36. Columnar Store • Read relevant columns only • Raw Data (country, browser, ...): (us, chrome, ...), (ca, firefox, ...), (jp, ie, ...), (us, firefox, ...), (ca, ie, ...), … • Row Based: us chrome ... | ca firefox ... | jp ie ... • Column Based: country: us, ca, jp, us, ca, … | browser: chrome, firefox, ie, firefox, ie, … Aggregation Filter Storage Columnar select sum(pageView) from T where country = us and browser = chrome
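
A minimal sketch of the row-based versus column-based layouts on this slide, using the same five country/browser rows. The `pageView` values are made up for illustration (the slide elides them), and the function name is hypothetical. The point is that the query only touches the three columns it names.

```python
# Row-based layout: one record per row. pageView values are hypothetical.
rows = [
    {"country": "us", "browser": "chrome", "pageView": 10},
    {"country": "ca", "browser": "firefox", "pageView": 20},
    {"country": "jp", "browser": "ie", "pageView": 30},
    {"country": "us", "browser": "firefox", "pageView": 40},
    {"country": "ca", "browser": "ie", "pageView": 50},
]

# Column-based layout: one list per column, positions aligned by docId.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def sum_page_views(columns, country, browser):
    """select sum(pageView) from T where country = ... and browser = ...
    Reads only the country, browser and pageView columns."""
    total = 0
    for doc_id, (c, b) in enumerate(zip(columns["country"], columns["browser"])):
        if c == country and b == browser:
            total += columns["pageView"][doc_id]
    return total


print(columns["country"])
print(sum_page_views(columns, "us", "chrome"))
```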
  • 37. Encoding & Compression • Storage compression ○ Dictionary encoding ○ Bit compression • Column Based (docId 0, 1, 2, 3, 4, …): country: us, ca, jp, us, ca, … | browser: chrome, firefox, ie, firefox, ie, … • Dictionary (dictId 0, 1, 2, …): country: ca, jp, us, … | browser: chrome, firefox, ie, … • Forward Index (docId 0, 1, 2, 3, 4, ...): country: 2, 0, 1, 2, 0, ... | browser: 0, 1, 2, 1, 2, ... Aggregation Filter Storage Encoding/Compression select sum(pageView) from T where country = us and browser = chrome
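
The dictionary and forward index on this slide can be reproduced with a short sketch. It assumes a sorted dictionary (which matches the dictIds shown: ca=0, jp=1, us=2) and computes how many bits a fixed-width bit-packed dictId would need. The helper names are hypothetical, and actual bit packing is left out.

```python
def dictionary_encode(values):
    """Return (dictionary, forward_index): a sorted list of distinct values
    and, for each docId, the dictId of its value."""
    dictionary = sorted(set(values))
    dict_id_of = {value: dict_id for dict_id, value in enumerate(dictionary)}
    forward_index = [dict_id_of[value] for value in values]
    return dictionary, forward_index


def bits_per_value(cardinality):
    """Bits needed to store one dictId with fixed-width bit packing."""
    return max(1, (cardinality - 1).bit_length())


country = ["us", "ca", "jp", "us", "ca"]
browser = ["chrome", "firefox", "ie", "firefox", "ie"]

country_dict, country_fwd = dictionary_encode(country)
browser_dict, browser_fwd = dictionary_encode(browser)

print(country_dict, country_fwd)
print(browser_dict, browser_fwd)
print(bits_per_value(len(country_dict)))
```

With three distinct values, each entry in the forward index fits in 2 bits instead of a full string.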
  • 38. Inverted Index • Storing bitmap for each value • Fast filtering: ○ Constant time value lookup ○ Bit operations for AND/OR clause • Raw Data (docId, country, browser): (0, us, chrome), (1, ca, firefox), (2, jp, ie), (3, us, firefox), (4, ca, ie), … • Inverted Index (country: docIds): ca: 1, 4...; jp: 2...; us: 0, 3...; ... • Inverted Index (browser: docIds): chrome: 0 ...; firefox: 1, 3...; ie: 2, 4...; ... Aggregation Filter Storage Inverted Index select sum(pageView) from T where country = us and browser = chrome
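
A small sketch of the inverted index on this slide, using a Python integer as a stand-in for a bitmap (bit i set means docId i matches). A production system would use a dedicated compressed bitmap structure; the helper names here are hypothetical.

```python
def build_inverted_index(column):
    """Map each distinct value to a bitmap (Python int) of matching docIds."""
    index = {}
    for doc_id, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << doc_id)
    return index


def doc_ids(bitmap):
    """Expand a bitmap back into a sorted list of docIds."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


country_index = build_inverted_index(["us", "ca", "jp", "us", "ca"])
browser_index = build_inverted_index(["chrome", "firefox", "ie", "firefox", "ie"])

# where country = us and browser = chrome  ->  bitwise AND of two bitmaps
and_match = country_index["us"] & browser_index["chrome"]
# where country = us or browser = firefox  ->  bitwise OR of two bitmaps
or_match = country_index["us"] | browser_index["firefox"]

print(doc_ids(country_index["us"]))
print(doc_ids(and_match))
print(doc_ids(or_match))
```

The value lookup is a single dictionary access, and combining predicates never touches the raw column data.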
  • 39. Sorted Index • Better data compression: ○ Run length encoding ○ Can be accessed as forward/inverted index • Spatial locality • Sorted column (docId: country): 0: ca, ..., 100: jp, 101: us, …, 300: us, … • Sorted index (country: start docId, end docId): ca: 0, 80; jp: 81, 100; us: 101, 300; … Aggregation Filter Storage Sorted Index select sum(pageView) from T where country = us and browser = chrome
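
The sorted index on this slide stores one (start docId, end docId) range per value. The sketch below builds those ranges from a sorted column sized to match the slide's numbers and shows both access paths: value to docId range (inverted lookup) and docId to value (forward lookup). Function names are hypothetical.

```python
from bisect import bisect_right


def build_sorted_index(sorted_column):
    """Run-length style index: value -> (start docId, end docId), inclusive.
    Assumes the column is already sorted by value."""
    ranges = {}
    for doc_id, value in enumerate(sorted_column):
        start, _ = ranges.get(value, (doc_id, doc_id))
        ranges[value] = (start, doc_id)
    return ranges


def value_at(ranges, doc_id):
    """Forward lookup: find the value whose range contains doc_id."""
    values = list(ranges)                       # insertion order = sorted order
    starts = [ranges[v][0] for v in values]
    return values[bisect_right(starts, doc_id) - 1]


# Column sized to match the slide: ca 0..80, jp 81..100, us 101..300.
column = ["ca"] * 81 + ["jp"] * 20 + ["us"] * 200
sorted_index = build_sorted_index(column)

print(sorted_index)
print(value_at(sorted_index, 100))
```

Because matching documents are contiguous, a filter on the sorted column becomes a single range scan, which is where the spatial locality comes from.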
  • 40. Latency vs. Space Trade-off latency space requirement scan pre-cubeStar-Tree select sum(pageView) from T where country = us and browser = chrome Aggregation Filter Storage Star-Tree Pre-aggregation Star-Tree Index
  • 41. Star-Tree Index • Configurable trade-off between latency and space through partial pre-aggregation • Can achieve a hard upper bound on query latencies [Chart: latency vs. space requirement for T=infinity, T=1,000,000, T=10,000, T=100, T=1]
  • 43. Flexible Query Execution Plan • The segment-level physical query planner chooses the best way to execute a query based on the segment metadata and available indexes. • Query and its optimization: ○ select max(col) from T: use metadata instead of scanning ○ select sum(metric) from T where country = us and accountId = x: reorder filter based on the available indexes (apply accountId before country predicate)
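
The two optimizations on this slide can be sketched as follows. The `Segment` class, its metadata layout, and the rule "predicates on indexed columns go first" are all simplifying assumptions made for illustration; the slides do not describe how Pinot's planner actually ranks predicates.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    columns: dict                                  # column name -> list of values
    metadata: dict = field(default_factory=dict)   # column name -> {"min", "max"}


def build_segment(columns):
    """Record per-column min/max once, at segment build time."""
    metadata = {name: {"min": min(vals), "max": max(vals)}
                for name, vals in columns.items()}
    return Segment(columns, metadata)


def max_of(segment, column):
    """select max(col) from T: answered from metadata, no column scan."""
    return segment.metadata[column]["max"]


def order_predicates(predicates, indexed_columns):
    """Apply predicates on indexed columns first (assumed heuristic).
    sorted() is stable, so ties keep their original order."""
    return sorted(predicates, key=lambda p: p[0] not in indexed_columns)


segment = build_segment({"metric": [5, 42, 17], "accountId": [7, 7, 9]})
print(max_of(segment, "metric"))

# Hypothetical setup: accountId has an index, country does not.
plan = order_predicates([("country", "us"), ("accountId", "x")], {"accountId"})
print(plan)
```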
  • 44. Global Optimizations • Problem and its solution: ○ Querying all segments: segment pruning to minimize the number of segments to query ○ Querying all servers: smart segment assignment to reduce the fan-out to servers
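
One way to picture segment pruning is to skip segments whose time range cannot overlap the query's time filter. The slide does not say which criteria Pinot prunes on, so pruning by time range is an assumption here, and `SegmentInfo` and `prune_by_time` are hypothetical names.

```python
from typing import List, NamedTuple


class SegmentInfo(NamedTuple):
    name: str
    start_time: int   # smallest timestamp in the segment (inclusive)
    end_time: int     # largest timestamp in the segment (inclusive)


def prune_by_time(segments: List[SegmentInfo],
                  query_start: int, query_end: int) -> List[SegmentInfo]:
    """Keep only segments that overlap the query window
    [query_start, query_end); the rest never need to be queried."""
    return [
        s for s in segments
        if s.start_time < query_end and s.end_time >= query_start
    ]


segments = [
    SegmentInfo("S0", 0, 99),
    SegmentInfo("S1", 100, 199),
    SegmentInfo("S2", 200, 299),
]

# where time >= 150 and time < 250
selected = prune_by_time(segments, 150, 250)
print([s.name for s in selected])
```

Fewer selected segments also means fewer servers need to be contacted, which ties into the fan-out point on the same slide.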
  • 46. Contributing to Pinot • We are looking for contributions! • Apache Pinot (incubating) 0.1.0 is available at https://pinot.apache.org • Pinot Twitter Account https://twitter.com/ApachePinot • Pinot Meetup Page https://www.meetup.com/apache-pinot • Pinot Slack Channel https://tinyurl.com/pinotSlackChannel
  • 47. Folks behind Pinot Mayank Shrivastava Subbu Subramaniam Jean-Francois Im Jackie Jiang Seunghyun Lee Jennifer Dai Neha Pawar Jialiang Li Sunitha Beeram Shraddha Sahay Kishore Gopalakrishna Xiang Fu James Shao Prasanna Ravi John Gutmann Dino Occhialini Walter Huf Xiaohui Sun Long Huynh Akshay Rai Alexander Pucher Jihao Zhang Felix Cheung Olivier Lamy Jim Jagielski Marcel Siegrist Roman Shaposhnik Anurag Shendge