Apache Pinot Meetup @ Uber
05/05/2020
Presenters
Sidd Teotia, Xiang Fu
Exact match search - without indexing
select count(*) from foo where firstName = "John"
[Figure: the FirstName column shown as Raw Data (Adam, ..., John, ...), a Dictionary (ids 0-4), and a Forward Index of dictionary ids (0, 2, 2, 3, 3, 2, 2, 4), with “John” as the search value]
1. Find “John” in the dictionary using binary search
2. Scan the forward index and count the number of records that match the dictionary Id
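The two steps above can be sketched in a few lines of Python. This is only an illustration, not Pinot's implementation: the dictionary is modelled as a sorted list, the forward index as a list of dictionary ids, and every name other than "Adam" and "John" is made up for the example.

```python
from bisect import bisect_left


def dict_id_of(dictionary, value):
    """Binary-search a sorted dictionary; return the dictionary id or -1."""
    i = bisect_left(dictionary, value)
    return i if i < len(dictionary) and dictionary[i] == value else -1


def count_by_scan(dictionary, forward_index, value):
    """Count rows whose forward-index entry equals the value's dictionary id."""
    target = dict_id_of(dictionary, value)
    if target < 0:
        return 0
    # Without an inverted index every row of the forward index is visited.
    return sum(1 for dict_id in forward_index if dict_id == target)


# Hypothetical sorted dictionary; only "Adam" and "John" come from the slide.
dictionary = ["Adam", "Bob", "John", "Mike", "Zoe"]
forward_index = [0, 2, 2, 3, 3, 2, 2, 4]

print(count_by_scan(dictionary, forward_index, "John"))
```

The cost of the second step grows with the number of rows, which is what the inverted index on the next slide avoids.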
Exact match search - with indexing
select count(*) from foo where firstName = "John"
[Figure: the FirstName column shown as Raw Data (Adam, ..., John, ...), a Dictionary (ids 0-4), a Forward Index of dictionary ids (0, 2, 2, 3, 3, 2, 2, 4), and an Inverted Index that maps each dictionary id (0-4) to a bitmap of docIds, with “John” as the search value]
1. Find “John” in the dictionary using binary search
2. Look up the inverted index to get the matching docIds for dictionaryId=2
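A minimal sketch of the indexed lookup, under the same assumptions as before: the dictionary is a sorted list, and the "bitmap of docIds" is approximated by a plain Python list per dictionary id (a real system would use a compressed bitmap). All names except "Adam" and "John" are hypothetical.

```python
from bisect import bisect_left


def build_inverted_index(forward_index, cardinality):
    """Map each dictionary id to the ascending list of docIds that hold it."""
    inverted = [[] for _ in range(cardinality)]
    for doc_id, dict_id in enumerate(forward_index):
        inverted[dict_id].append(doc_id)
    return inverted


def matching_doc_ids(dictionary, inverted, value):
    """Binary-search the dictionary, then read the docIds for that dictionary id."""
    i = bisect_left(dictionary, value)
    if i == len(dictionary) or dictionary[i] != value:
        return []
    return inverted[i]


# Hypothetical sorted dictionary; only "Adam" and "John" come from the slide.
dictionary = ["Adam", "Bob", "John", "Mike", "Zoe"]
forward_index = [0, 2, 2, 3, 3, 2, 2, 4]
inverted = build_inverted_index(forward_index, len(dictionary))

john_docs = matching_doc_ids(dictionary, inverted, "John")
print(john_docs, len(john_docs))
```

The index is built once; each query is then a binary search plus one lookup, with no scan of the forward index.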
Text search - without indexing
select count(*) from foo where regexp_like( firstName, "John*")
[Figure: the FirstName column shown as Raw Data (Adam, ..., Johnny, John, ..., Johnson, ...), a Dictionary (ids 0-4), and a Forward Index of dictionary ids (0, 2, 2, 3, 3, 2, 2, 4)]
1. Scan the dictionary and get the raw value for each dictionary Id
2. Pattern match the raw value with the regex and compute the matching dictionary Id set
3. Scan the forward index and count the number of records whose dictionary Id is part of the matching set
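The three steps can be sketched as follows. This is a toy model rather than Pinot's code: the dictionary contents are invented (only "Adam", "John", "Johnny" and "Johnson" appear on the slide), the pattern `John.*` is an assumed stand-in for the query's regex, and partial matching via `re.search` is an assumption about the matching semantics.

```python
import re


def count_regex_matches(dictionary, forward_index, pattern):
    """Count rows whose value matches the regex, without any text index."""
    regex = re.compile(pattern)
    # Steps 1 and 2: scan the whole dictionary and collect matching dictionary ids.
    matching_ids = {
        dict_id for dict_id, raw in enumerate(dictionary) if regex.search(raw)
    }
    # Step 3: scan the whole forward index and count rows in the matching set.
    return sum(1 for dict_id in forward_index if dict_id in matching_ids)


# Hypothetical sorted dictionary for illustration.
dictionary = ["Adam", "Bob", "John", "Johnny", "Johnson"]
forward_index = [0, 2, 2, 3, 3, 2, 2, 4]

print(count_regex_matches(dictionary, forward_index, "John.*"))
```

Both the dictionary and the forward index are scanned in full, which is why this approach becomes slow on large text columns.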
Text match - with index?
● Pinot supports very fast query processing through its indexes on non-BLOB-like columns.
● For arbitrary text, we need more than exact match: phrase, regex, fuzzy.
○ regexp_like is not efficient since it is a scan-based operation.
○ Fuzzy search is not supported.
● In 0.3.0, we added support for text indexes to efficiently do arbitrary search on STRING columns where each column value is a large BLOB of text.
○ Use the new built-in function text_match
○ select count(*) from foo where text_match (colName, searchExpr)
Text match - with index
Number of rows - 500 million
Text corpus size - 200GB
Query                              Selectivity   Latency with text_match   Latency with regexp_like
Regex search for ‘united states’   150 million   2 secs                    9 mins
Regex search for ‘product’         20 million    500 ms                    7 mins
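The speedup implied by these latencies follows from simple arithmetic. The snippet below only divides the figures quoted on the slide; the helper name is made up for the example.

```python
def speedup(slow_seconds, fast_seconds):
    """Ratio of the scan-based latency to the indexed latency."""
    return slow_seconds / fast_seconds


# 'united states': 9 minutes with regexp_like vs 2 seconds with text_match.
united_states = speedup(9 * 60, 2)
# 'product': 7 minutes with regexp_like vs 500 ms with text_match.
product = speedup(7 * 60, 0.5)

print(united_states, product)
```

On these two queries text_match is therefore roughly 270 and 840 times faster than regexp_like.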
Text Analytics on Pinot - Introduction
Let’s take an example of a query log and resume file
● Store the query log and resume text in two STRING columns in a Pinot table.
● Create text indexes on both columns.
Count the number of GROUP BY queries that have a BETWEEN filter on timecol:
SELECT count(*) FROM MyTable WHERE text_match(logCol, '"timecol between" AND "group by"')
Count the number of candidates that have “machine learning” and “gpu processing”:
SELECT count(*) FROM MyTable WHERE text_match(resume, '"machine learning" AND "gpu processing"')
Count the number of candidates that have “distributed systems” and either “java” or “c++”:
SELECT count(*) FROM MyTable WHERE text_match(resume, '"distributed systems" AND (Java C++)')
Text Analytics on Pinot - Index Format
● Pinot uses Apache Lucene to create text indexes.
● Lucene’s inverted index maps words (aka terms) to the corresponding docIds.
○ A document (row) consists of fields and is the unit of indexing.
● Text indexes are created at a per column level during segment generation.
● Take the raw column value and create a Lucene document with two fields:
○ Text field - contains the actual column value representing the body of text that should be
indexed. The raw value is not stored by Lucene.
○ Stored field - contains a monotonically increasing docId counter to reverse map each
document indexed in Lucene back to its docId (rowId) in Pinot.
● Each text index is in its own directory (under Pinot segment directory)
○ dataDir/myTable_OFFLINE/myTable_0/v3/textCol.lucene.index
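To make the idea of a term-to-docId inverted index concrete, here is a deliberately simplified sketch. It is not Lucene's format or API: tokenization is plain lowercase whitespace splitting, there are no analyzers or phrase positions, and the sample rows and function names are invented for illustration.

```python
from collections import defaultdict


def build_text_index(rows):
    """Map each lowercase, whitespace-separated term to the docIds containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(rows):
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def match_all_terms(index, terms):
    """Return the docIds whose text contains every term (an AND of term lookups)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []


# Hypothetical column values; the row position plays the role of the Pinot docId.
rows = [
    "Java and distributed systems",
    "machine learning on gpu",
    "distributed machine learning",
]
index = build_text_index(rows)

print(match_all_terms(index, ["distributed"]))
print(match_all_terms(index, ["machine", "learning"]))
print(match_all_terms(index, ["distributed", "learning"]))
```

Each query touches only the postings of the terms it names, so the cost depends on how many documents contain those terms rather than on the size of the corpus.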
Text Analytics on Pinot - Index for Offline Segments
Create text indexes on two columns, textCol1 and textCol2
SELECT ... FROM foo WHERE text_match(textCol1, expr1) AND text_match(textCol2, expr2)
Text Analytics on Pinot - DocId Translation Optimization
● Lucene index is composed of multiple sub-indexes. Each sub-index is an
independent self-contained index with Lucene docIds being relative to each
sub-index.
● So we have to store Pinot docId in every document added to Lucene index.
● This results in two-pass query execution:
○ The search query will return a bitmap of matching Lucene docIds.
○ Iterate over each Lucene docId, get the corresponding document and retrieve the Pinot docId from the document. Store the Pinot docId in a bitmap.
● Retrieving the entire document from Lucene became a major bottleneck for
throughput testing.
● Solution:
○ Merge Lucene sub-indexes on segment completion.
○ Use a pre-built mapping of <lucene docId, pinot docId>. Built once during segment load.
● 40x-50x improvement in query performance at higher QPS.
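The difference between the two-pass approach and the pre-built mapping can be sketched with a toy stand-in for the index. Nothing here is Lucene's or Pinot's actual API: `ToyLuceneIndex`, its methods, and the `pinotDocId` field name are hypothetical, and the fetch counter merely stands in for the cost of retrieving a stored document.

```python
class ToyLuceneIndex:
    """Stand-in for a merged Lucene index: list position = Lucene docId."""

    def __init__(self, stored_docs):
        self._stored_docs = stored_docs
        self.document_fetches = 0  # counts the expensive document retrievals

    def document(self, lucene_doc_id):
        self.document_fetches += 1
        return self._stored_docs[lucene_doc_id]

    def max_doc(self):
        return len(self._stored_docs)


def translate_two_pass(index, lucene_hits):
    """Per query: fetch every matching document to read its stored Pinot docId."""
    return sorted(index.document(hit)["pinotDocId"] for hit in lucene_hits)


def build_doc_id_mapping(index):
    """Once, at segment load: record the Pinot docId for every Lucene docId."""
    return [index.document(i)["pinotDocId"] for i in range(index.max_doc())]


def translate_with_mapping(mapping, lucene_hits):
    """Per query: translate with array lookups only, no document retrieval."""
    return sorted(mapping[hit] for hit in lucene_hits)


index = ToyLuceneIndex(
    [{"pinotDocId": 3}, {"pinotDocId": 0}, {"pinotDocId": 2}, {"pinotDocId": 1}]
)
hits = [0, 2]  # Lucene docIds returned by some search

slow = translate_two_pass(index, hits)
mapping = build_doc_id_mapping(index)
fetches_after_load = index.document_fetches
fast = translate_with_mapping(mapping, hits)

print(slow, fast, index.document_fetches - fetches_after_load)
```

The mapping costs one pass over the index at load time, after which queries no longer retrieve documents at all.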
Text Analytics on Pinot - DocId Translation Optimization
[Chart: improved scalability with increase in QPS; 40x-50x improvement in query performance at higher QPS]
Pinot Query Execution Stack
SQL Support on Pinot
● Users currently use PQL query syntax for writing queries on Pinot.
● PQL is SQL-like but is syntactically different from SQL.
● Changes
○ Moved to Apache Calcite for query parsing and logical planning.
○ This allows Pinot to handle standard SQL syntax.
○ Tabular query response format that is easy to integrate with UI tools such as Superset and Grafana.
select avg(salary), min(salary) from employee group by age
PQL response:
[ {
  "columnNames": ["age", "salary:min"],
  "results": [["25", "60000"], ["26", "80000"]]
},
{
  "columnNames": ["age", "salary:avg"],
  "results": [["25", "180000"], ["26", "210000"]]
} ]

SQL response:
{
  "dataSchema": {
    "columnDataTypes": ["INT", "DOUBLE", "LONG"],
    "columnNames": ["age", "avg(salary)", "min(salary)"]},
  "rows": [
    [25, 180000.0000, 60000],
    [26, 210000.0000, 80000]
  ]
}
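To see why the tabular format is easier for UI tools, consider the reshaping a client would otherwise need in order to turn the PQL-style response into rows. The function below is a hypothetical client-side helper, not part of Pinot, and it assumes every aggregation block reports the same group keys.

```python
def pql_to_table(pql_response, key_column="age"):
    """Merge per-aggregation result blocks into one table keyed by the group column."""
    column_names = [key_column]
    rows_by_key = {}
    for block in pql_response:
        _, aggregation_column = block["columnNames"]
        column_names.append(aggregation_column)
        for key, value in block["results"]:
            # Start each row with its group key, then append one value per block.
            rows_by_key.setdefault(key, [key]).append(value)
    return {"columnNames": column_names, "rows": list(rows_by_key.values())}


# The PQL-style response from the slide.
pql_response = [
    {"columnNames": ["age", "salary:min"],
     "results": [["25", "60000"], ["26", "80000"]]},
    {"columnNames": ["age", "salary:avg"],
     "results": [["25", "180000"], ["26", "210000"]]},
]

table = pql_to_table(pql_response)
print(table["columnNames"])
print(table["rows"])
```

With the SQL response format this step is unnecessary because the rows already arrive in tabular form with typed values.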
New Plugin Architecture
● Simplified dependencies on file formats and external systems to support easy
pluggability and integration with other systems.
Where is the input data?
What’s the input data format?
How to ingest data?
Deploy Pinot In Kubernetes
- Pinot in Kubernetes
- Data Ingestion
- Access Pinot
- Operations
- Presto Pinot Integration
Kubernetes
- Container Orchestration
- Desired State Management
- Scalable
- Distributing Traffic
- Runs Anywhere
Pinot In Kubernetes
- Manage Stateful Services
- Scale Each Layer Independently
- No single point of failure
- Auto recovery
Pinot Deployment Overview
Deploy Pinot Using HelmChart
helm repo add pinot https://raw.githubusercontent.com/apache/incubator-pinot/master/kubernetes/helm
kubectl create ns pinot
helm install pinot pinot/pinot \
  -n pinot \
  --set cluster.name=pinot \
  --set server.replicaCount=2
- Source code
- Pinot Helm
- Deploy Pinot in three commands
Pinot Controller
- StatefulSet to keep a constant hostname
- Network
  - Headless Service: manage unique identities
  - Load Balancer Service: expose to external traffic
- Scaling Factor
  - Cluster size, usually 3
- Deep Store Management
  - Native Pinot FS plugin
  - Mount a PV that supports access mode ReadWriteMany (e.g. NFS/AzureFile/CephFS)
  - Leverage a Linux FUSE library, e.g. gcs-fuse
- Instance Type
  - Balanced CPU/Memory
  - E.g. EC2 m5.2xlarge (8 cores/32GB RAM)
Pinot Broker
- StatefulSet to keep a constant hostname
- Multi-Tenant mode
- Network
  - Headless Service: manage unique identities
  - Load Balancer: expose to external traffic
- Scaling Factor
  - Query load
  - Start with 2-3
- Instance Type
  - Balanced CPU and Memory
  - E.g. EC2 m5.xlarge (4 cores/16GB RAM)
Pinot Server
- StatefulSet to keep a constant hostname
- Network
  - Headless Service
- Scaling Factor
  - Data volume
  - Query load
  - Start with 2 for data replicas
- Persistent Volume
  - SSD
  - Remote vs Local
- Instance Type
  - High RAM
  - E.g. EC2 r5.4xlarge (16 cores/128GB RAM) + 4TB EBS
  - EC2 i3.4xlarge (16 cores/122GB RAM/2x1900GB NVMe SSD)
Table Creation
- Schema
- Table Config
- K8s Job Spec
Scripts can be found here.
Loading Data into Pinot
- Table
- Data Ingestion
- Batch
- Streaming
Batch Workflow
- Fetch Raw Data
- Preprocessing
- Pinot Segment Creation
- Pinot Segment Push
Scripts can be found here.
Access Pinot
- Query
- Query Console
- Broker Endpoint
- Admin APIs
- Schema
- Table Config
- Segments Assignment
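As a rough illustration of querying through the broker endpoint, the sketch below builds an HTTP request for the broker's SQL query path using only the Python standard library. The host and port (`localhost:8099`, e.g. reached through a port-forward) and the table name `foo` are assumptions for the example; the request is constructed but not sent.

```python
import json
import urllib.request


def build_sql_query_request(broker_url, sql):
    """Build (but do not send) a POST request carrying a SQL query for the broker."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        url=broker_url.rstrip("/") + "/query/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed broker address and table name, for illustration only.
request = build_sql_query_request("http://localhost:8099", "SELECT count(*) FROM foo")
print(request.get_method(), request.full_url)

# Against a reachable broker, the request could be executed with:
#   with urllib.request.urlopen(request) as response:
#       result = json.loads(response.read())
```

Keeping request construction separate from sending makes the payload easy to inspect before pointing it at a real cluster.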
Scale up Pinot
- Increase Pinot Servers from 2 -> 3
- Rebalance Segments in the Cluster
kubectl scale statefulsets pinot-server --replicas=3 -n pinot
./pinot-admin.sh RebalanceTable \
  -clusterName pinot \
  -zkAddress pinot-zookeeper:2181 \
  -tableName covid19_recovered_global_OFFLINE
Presto Integration
- Built-in Presto Pinot Connector
- Packing Most Recent Presto Release
- Pinot Catalog Config
kubectl apply -f presto-coordinator.yaml
./presto-cli.sh --server localhost:8080 --catalog pinot --schema default
Scripts can be found here.
Q&A
Scripts can be found here.
Text Analytics on Pinot - Index for Realtime Segments
Text match - with index
Number of rows - 500 million
Text corpus size - 200GB
Query 1 selectivity - 150 million
Query 2 selectivity - 20 million

 
"Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ..."Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ...
 
Analysis and Evaluation of Dal Lake Biomass for Conversion to Fuel/Green fert...
Analysis and Evaluation of Dal Lake Biomass for Conversion to Fuel/Green fert...Analysis and Evaluation of Dal Lake Biomass for Conversion to Fuel/Green fert...
Analysis and Evaluation of Dal Lake Biomass for Conversion to Fuel/Green fert...
 
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfComprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
 
Uk-NO1 kala jadu karne wale ka contact number kala jadu karne wale baba kala ...
Uk-NO1 kala jadu karne wale ka contact number kala jadu karne wale baba kala ...Uk-NO1 kala jadu karne wale ka contact number kala jadu karne wale baba kala ...
Uk-NO1 kala jadu karne wale ka contact number kala jadu karne wale baba kala ...
 
SOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATIONSOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATION
 
Katarzyna Lipka-Sidor - BIM School Course
Katarzyna Lipka-Sidor - BIM School CourseKatarzyna Lipka-Sidor - BIM School Course
Katarzyna Lipka-Sidor - BIM School Course
 
ADM100 Running Book for sap basis domain study
ADM100 Running Book for sap basis domain studyADM100 Running Book for sap basis domain study
ADM100 Running Book for sap basis domain study
 
Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfModule-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
 
Detection&Tracking - Thermal imaging object detection and tracking
Detection&Tracking - Thermal imaging object detection and trackingDetection&Tracking - Thermal imaging object detection and tracking
Detection&Tracking - Thermal imaging object detection and tracking
 
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSHigh Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
 
AntColonyOptimizationManetNetworkAODV.pptx
AntColonyOptimizationManetNetworkAODV.pptxAntColonyOptimizationManetNetworkAODV.pptx
AntColonyOptimizationManetNetworkAODV.pptx
 

New Features in Apache Pinot

  • 1. Apache Pinot Meetup @ Uber 05/05/2020
  • 3. Exact match search - without indexing. Query: select count(*) from foo where firstName = "John". [Diagram: raw FirstName values, dictionary (ids 0-4), forward index of dictionary ids.] 1. Find “John” in the dictionary using binary search. 2. Scan the forward index and count the number of records that match the dictionary Id.
  • 4. Exact match search - with indexing. Query: select count(*) from foo where firstName = "John". [Diagram: raw FirstName values, dictionary (ids 0-4), forward index of dictionary ids, inverted index mapping each dictionary id to a bitmap of docIds.] 1. Find “John” in the dictionary using binary search. 2. Look up the inverted index to get the matching docIds for dictionaryId=2.
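The two exact-match strategies in slides 3 and 4 can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Pinot's implementation: the column values are made up, plain Python sets stand in for the docId bitmaps, and the function names are hypothetical.

```python
from bisect import bisect_left

# Toy column values (illustrative only, not from a real table).
raw = ["John", "Adam", "Mary", "John", "Zoe", "Mary", "John"]

# Sorted dictionary of distinct values; a value's position is its dictionary id.
dictionary = sorted(set(raw))

# Forward index: one dictionary id per docId (row).
forward_index = [dictionary.index(v) for v in raw]

# Inverted index: dictionary id -> set of docIds (sets stand in for bitmaps).
inverted_index = [set() for _ in dictionary]
for doc_id, dict_id in enumerate(forward_index):
    inverted_index[dict_id].add(doc_id)


def lookup_dict_id(value):
    """Binary-search the sorted dictionary; return None if the value is absent."""
    i = bisect_left(dictionary, value)
    return i if i < len(dictionary) and dictionary[i] == value else None


def count_by_scan(value):
    """Without an inverted index: scan every forward-index entry."""
    dict_id = lookup_dict_id(value)
    if dict_id is None:
        return 0
    return sum(1 for d in forward_index if d == dict_id)


def count_by_inverted_index(value):
    """With an inverted index: read the docId set for the dictionary id directly."""
    dict_id = lookup_dict_id(value)
    if dict_id is None:
        return 0
    return len(inverted_index[dict_id])
```

Both functions return the same count; the difference is that the scan touches every row, while the inverted-index lookup touches only the entry for one dictionary id.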
  • 5. Text search - without indexing. Query: select count(*) from foo where regexp_like( firstName, "John*"). [Diagram: raw FirstName values (Adam, Johnny, John, Johnson, ...), dictionary (ids 0-4), forward index of dictionary ids.] 1. Scan the dictionary and get the raw value for each dictionary Id. 2. Pattern match the raw value with the regex and compute the matching dictionary Id set. 3. Scan the forward index and count the number of records whose dictionary Id is part of the matching set.
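The three scan steps in slide 5 can be sketched as follows. This is an illustrative toy, not Pinot code: the dictionary and forward index are made up, `count_regex` is a hypothetical name, and the use of Python's `re.search` (match anywhere in the value) is an assumption about matching semantics rather than a statement of how regexp_like behaves.

```python
import re

# Toy sorted dictionary and forward index (illustrative values only).
dictionary = ["Adam", "John", "Johnny", "Johnson", "Mary"]
forward_index = [1, 0, 2, 1, 4, 3, 1]


def count_regex(pattern):
    """Count rows whose value matches the pattern, without any text index."""
    compiled = re.compile(pattern)
    # Steps 1 and 2: scan the dictionary and collect the matching dictionary ids.
    matching_ids = {i for i, value in enumerate(dictionary) if compiled.search(value)}
    # Step 3: scan the forward index and count rows whose id is in the matching set.
    return sum(1 for dict_id in forward_index if dict_id in matching_ids)
```

The regex is evaluated once per distinct value rather than once per row, but the forward index is still scanned in full, which is why the approach stays scan-bound on large tables.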
  • 6. Text match - with index? ● Pinot supports super fast query processing through its indexes on non-BLOB like columns. ● For arbitrary text, we need more than exact match -- phrase, regex, fuzzy. ○ regexp_like is not efficient since it is a scan based operation. ○ Fuzzy search not supported ● In 0.3.0, we added support for text indexes to efficiently do arbitrary search on STRING columns where each column value is a large BLOB of text. ○ Use new in-built function text_match ○ select count(*) from foo where text_match (colName, searchExpr)
  • 7. Text match - with index. Number of rows - 500 million. Text corpus size - 200GB.
      Query | Selectivity | Latency with text_match | Latency with regexp_like
      Regex search for ‘united states’ | 150 million | 2 secs | 9 mins
      Regex search for ‘product’ | 20 million | 500 ms | 7 mins
  • 8. Text Analytics on Pinot - Introduction. Let’s take an example of a query log and a resume file. ● Store the query log and resume text in two STRING columns in a Pinot table. ● Create text indexes on both columns.
      Count the number of group by queries that have a between filter on timecol: SELECT count(*) FROM MyTable WHERE text_match(logCol, '"timecol between" AND "group by"')
      Count the number of candidates that have “machine learning” and “gpu processing”: SELECT count(*) FROM MyTable WHERE text_match(resume, '"machine learning" AND "gpu processing"')
      Count the number of candidates that have “distributed systems” and either “java” or “c++”: SELECT count(*) FROM MyTable WHERE text_match(resume, '"distributed systems" AND (Java C++)')
  • 9. Text Analytics on Pinot - Index Format ● Pinot uses Apache Lucene to create text indexes. ● Lucene’s inverted index maps words (aka terms) to the corresponding docIds. ○ A document (row) consists of fields and is the unit of indexing. ● Text indexes are created per column during segment generation. ● Take the raw column value and create a Lucene document with two fields: ○ Text field - contains the actual column value representing the body of text that should be indexed. The raw value is not stored by Lucene. ○ Stored field - contains a monotonically increasing docId counter to reverse map each document indexed in Lucene back to its docId (rowId) in Pinot. ● Each text index is in its own directory (under the Pinot segment directory) ○ dataDir/myTable_OFFLINE/myTable_0/v3/textCol.lucene.index
  • 10. Text Analytics on Pinot - Index for Offline Segments Create text indexes on two columns textCol1 and textCol2 select …. FROM foo where text_match(textCol1, expr1) AND text_match(textcol2, expr2)
  • 11. Text Analytics on Pinot - DocId Translation Optimization ● Lucene index is composed of multiple sub-indexes. Each sub-index is an independent self-contained index with Lucene docIds being relative to each sub-index. ● So we have to store Pinot docId in every document added to Lucene index. ● This results in two-pass query execution: ○ The search query will return a bitmap of matching Lucene docIds. ○ Iterate over each Lucene docId, get the corresponding document and retrieve the Pinot docId from the document. Store pinot docId in a bitmap. ● Retrieving the entire document from Lucene became a major bottleneck for throughput testing. ● Solution: ○ Merge Lucene sub-indexes on segment completion. ○ Use a pre-built mapping of <lucene docId, pinot docId>. Built once during segment load. ● 40x-50x improvement in query performance at higher QPS.
  • 12. Text Analytics on Pinot - DocId Translation Optimization Improved scalability with increase in QPS 40x-50x improvement in query performance at higher QPS
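The docId translation optimization in slides 11 and 12 can be illustrated with a small Python model. Everything here is a stand-in: `ToyLuceneIndex` is a hypothetical class that only mimics the cost of fetching a stored document and is not the Lucene API, and the docId values are made up.

```python
class ToyLuceneIndex:
    """Stand-in for a merged Lucene index; NOT the real Lucene API."""

    def __init__(self, pinot_doc_ids):
        # Position in the list plays the role of the Lucene docId; each
        # document carries the Pinot docId in a stored field.
        self._stored = [{"pinotDocId": p} for p in pinot_doc_ids]
        self.document_fetches = 0  # counts the expensive document retrievals

    def document(self, lucene_doc_id):
        self.document_fetches += 1
        return self._stored[lucene_doc_id]

    def __len__(self):
        return len(self._stored)


def translate_two_pass(index, lucene_hits):
    """Original approach: fetch every matching document at query time."""
    return {index.document(d)["pinotDocId"] for d in lucene_hits}


def build_mapping(index):
    """Built once at segment load: position = Lucene docId, value = Pinot docId."""
    return [index.document(d)["pinotDocId"] for d in range(len(index))]


def translate_with_mapping(mapping, lucene_hits):
    """Optimized approach: an array lookup per hit, no document retrieval."""
    return {mapping[d] for d in lucene_hits}


index = ToyLuceneIndex([2, 0, 3, 1])
mapping = build_mapping(index)
```

With the mapping in place, the per-query cost of translation no longer depends on document retrieval; that cost is paid once when the mapping is built.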
  • 14. SQL Support on Pinot ● Users currently use PQL query syntax for writing queries on Pinot. ● PQL is SQL-like but is syntactically different from SQL. ● Changes ○ Moved to Apache Calcite for query parsing and logical planning. ○ This allows Pinot to handle standard SQL syntax. ○ Tabular query response format - easy to integrate with UI tools - Superset, Grafana ..
      Example query: select avg(salary), min(salary) from employee group by age
      PQL response: [ { "columnNames": ["age", "salary:min"], "results": [["25", "60000"], ["26", "80000"]] }, { "columnNames": ["age", "salary:avg"], "results": [["25", "180000"], ["26", "210000"]] } ]
      SQL response: "dataSchema": { "columnDataTypes": ["INT", "DOUBLE", "LONG"], "columnNames": ["age", "avg(salary)", "min(salary)"]} "rows": [ [25, 180000.0000, 60000], [26, 210000.0000, 80000] ]
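One reason the tabular SQL response is easier to integrate is that each row lines up positionally with a single list of column names. A minimal Python sketch, assuming the response fragment from the slide is wrapped in one JSON object (the outer braces and the helper name `rows_as_dicts` are assumptions, not part of the slide):

```python
# Response shaped like the SQL response on the slide; the enclosing object
# is an assumption, since the slide shows only the inner fragment.
response = {
    "dataSchema": {
        "columnDataTypes": ["INT", "DOUBLE", "LONG"],
        "columnNames": ["age", "avg(salary)", "min(salary)"],
    },
    "rows": [
        [25, 180000.0, 60000],
        [26, 210000.0, 80000],
    ],
}


def rows_as_dicts(resp):
    """Pair each row's values with the column names from the schema."""
    names = resp["dataSchema"]["columnNames"]
    return [dict(zip(names, row)) for row in resp["rows"]]
```

With the PQL response, the same result requires joining two separate result sets on the group-by key, one per aggregation.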
  • 15. New Plugin Architecture ● Simplified dependencies on file formats and external systems to support easy pluggability and integration with other systems. Where is the input data What’s the input data format? How to ingest data?
  • 16. Deploy Pinot In Kubernetes - Pinot in Kubernetes - Data Ingestion - Access Pinot - Operations - Presto Pinot Integration
  • 17. Kubernetes - Container Orchestration - Desired State Management - Scalable - Distributing Traffic - Runs Anywhere
  • 18. Pinot In Kubernetes - Manage Stateful Services - Scale Each Layer Independently - No single point of failure - Auto recovery
  • 20. Deploy Pinot Using HelmChart
      helm repo add pinot https://raw.githubusercontent.com/apache/incubator-pinot/master/kubernetes/helm
      kubectl create ns pinot
      helm install pinot pinot/pinot -n pinot --set cluster.name=pinot --set server.replicaCount=2
      - Source code - Pinot Helm - Deploy Pinot in three commands
  • 22. Pinot Controller - StatefulSet to keep Constant hostname - Network - Headless Service: Manage Unique Identities - Load Balancer Service: Expose to external traffic - Scaling Factor - Cluster size, usually 3 - Deep Store Management - Native Pinot FS Plugin - Mount a PV that supports AccessMode ReadWriteMany (e.g. NFS/AzureFile/CephFS) - Leverage Linux FUSE lib, e.g. gcs-fuse - Instance Type - Balanced CPU/Memory - E.g. EC2 m5.2xlarge (8 cores/32GB RAM)
  • 23. Pinot Broker - StatefulSet to keep Constant hostname - Multi-Tenant mode - Network - Headless Service: Manage Unique Identities - Load Balancer: Expose to external traffic - Scaling Factor: - Query Load - Start with 2-3 - Instance Type: - Balanced CPU and Memory - E.g. EC2 m5.xlarge (4 cores/16GB RAM)
  • 24. Pinot Server - StatefulSet to keep Constant hostname - Network - Headless Service - Scaling Factor: - Data Volume - Query Load - Start with 2 for data replica - Persistent Volume - SSD - Remote vs Local - Instance Type - High RAM - E.g. EC2 r5.4xlarge(16 Cores/128G RAM) + 4TB EBS - EC2 i3.4xlarge(16 Cores/122G RAM/2x1900G NVMe SSD)
  • 25. Table Creation - Schema - Table Config - K8s Job Spec Scripts can be found here.
  • 26. Loading Data into Pinot - Table - Data Ingestion - Batch - Streaming
  • 27. Batch Workflow - Fetch Raw Data - Preprocessing - Pinot Segment Creation - Pinot Segment Push Scripts can be found here.
  • 28. Access Pinot - Query - Query Console - Broker Endpoint - Admin APIs - Schema - Table Config - Segments Assignment
  • 29. Scale up Pinot - Increase Pinot Servers from 2 -> 3 - Rebalance Segments in the Cluster
      kubectl scale statefulsets pinot-server --replicas=3 -n pinot
      ./pinot-admin.sh RebalanceTable -clusterName pinot -zkAddress pinot-zookeeper:2181 -tableName covid19_recovered_global_OFFLINE
  • 30. Presto Integration - Built-in Presto Pinot Connector - Packing Most Recent Presto Release - Pinot Catalog Config
      kubectl apply -f presto-coordinator.yaml
      ./presto-cli.sh --server localhost:8080 --catalog pinot --schema default
      Scripts can be found here.
  • 31. Q&A Scripts can be found here.
  • 32. Text Analytics on Pinot - Index for Realtime Segments
  • 33. Text match - with index Number of rows - 500 million Text corpus size - 200GB Query 1 selectivity - 150 million Query 2 selectivity - 20 million