Pinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale

  1. 1. Enabling Real-time Analytics Applications @ LinkedIn’s Scale • Speakers: Mayank Shrivastava, Jackie Jiang, Seunghyun Lee • Titles shown on the slide: Senior Software Engineer, Senior Software Engineer, Staff Software Engineer • Apache Pinot
  2. 2. Agenda: 1. Introduction 2. Pinot @ LinkedIn 3. How to use Pinot 4. Pinot Performance
  3. 3. How is data generated and used at LinkedIn • 600+ million members • Tens of millions of posts liked/shared per day • 3+ million jobs posted per month • 30 million companies • Trillions of events per day [Diagram labels: Actor, Verb; Member, Job, Post, Company; Object Life Cycle: Create, Generate, Analyze; Product, Data, Insights]
  4. 4. Real-time Analytics Applications at LinkedIn
  5. 5. How to build an online analytics application? • Real-time data ingestion • Millions of active users, 1000s of queries per sec • Super low latency (10s ms) • Highly available, always on
  6. 6. Approach 1. Join on the fly • Real-time (depending on storage) • High latency due to join [Diagram: Event Stream (Profile View), Profile View Table, Member Table, Application Server, Who viewed my profile]
  7. 7. Approach 2. Pre Join + Pre Aggregate • Near real-time ingestion • Latency varies with query selectivity [Diagram: Event Stream (Profile View), Stream Processing Engine (Pre Join + Pre Aggr), Profile View Table, Member Table, Application Server, Who viewed my profile]
  8. 8. Approach 3. Pre Join + Pre Aggregate + Pre Cube • Very fast • Batch ingestion (hourly / daily) • Storage explosion • Re-bootstrap on schema change [Diagram: Event Stream (Profile View), Batch Processing Engine (Pre Join + Pre Aggr + Pre Cube), Profile View Table, Member Table, Application Server, Who viewed my profile]
  9. 9. Latency vs. Flexibility [Chart: axes Latency (low to high) and Flexibility (low to high); stages Pre-Join, Pre-Aggregation, Pre-Cube applied to Profile View Table and Member Table; systems plotted: Spark SQL, Presto, Hive, BigQuery, Druid, Elasticsearch, Pinot, Kylin, KV Store]
  10. 10. Who Viewed My Profile @ LinkedIn [Diagram: Raw Tracking Data, Stream Processing (Pre Join + Pre Aggr), Pre-joined Data, Data Lake, Espresso, WVMP Dashboard, Ad-hoc Queries]
  11. 11. What is Apache Pinot? • OLAP Datastore • Columnar, indexed storage • Low latency analytics • Distributed – highly available, reliable, scalable • Lambda architecture ○ Offline data pushes + Real-time stream ingestion • Open Source
  12. 12. Agenda: 1. Introduction 2. Pinot @ LinkedIn 3. How to use Pinot 4. Pinot Performance
  13. 13. Pinot @ LinkedIn • 70+ Member Facing Use Cases • 2000+ Dashboards for Internal Business Metrics • 100K+ Queries Per Second • 1M+ Records Ingested Per Second
  14. 14. Pinot @ LinkedIn: Member Facing Analytics Report • Provides analytics reports for LinkedIn member-facing applications • Very high QPS (thousands) • Requires a strict latency SLA (tens of ms to sub-second)
  15. 15. Pinot @ LinkedIn: Interactive Dashboard • Visualization tool for multi-dimensional metrics • Complex, exploratory queries • 2000+ metrics, used by 1000+ employees
  16. 16. Pinot @ LinkedIn: Anomaly Detection • Efficiently detect and investigate anomalies in metrics • ThirdEye: part of the Apache Pinot open source project
  17. 17. Pinot Usage @ Other Companies
  18. 18. Agenda: 1. Introduction 2. Pinot @ LinkedIn 3. How to use Pinot 4. Pinot Performance
  19. 19. How to use Pinot • Batch Data Ingestion • Real-time Data Ingestion • SQL-like Query Interface (PQL)
  20. 20. Let’s build something cool Event RSVP Data
  21. 21. How to use Pinot: Workflow • One Time Setup: Define Schema, Define Table Configuration, Create Table • Data Ingestion, Batch (Scheduled Job): Raw Data (HDFS, S3, ADLS, NFS...), Generate Pinot Segments, Push Data • Data Ingestion, Real-time (One Time Setup): Streaming Data (Kafka, Event Hub...), Setup Stream Data Source
  22. 22. How to use Pinot: Define Schema ● Schema name: meetupRsvp ● Dimension field specs ○ event_name (string) ○ event_time (long) ○ country (string) ○ city (string) ○ … ● Metrics field specs ○ rsvp_count (int) ● Time field spec ○ timestamp (long) ■ timetype: epoch / datetime ■ granularity: millisecond / second / hour / day • Dimension: an attribute of your data (used in filter, group by) • Metric: a number that measures a characteristic of a dimension (used in aggregation) • Time: the timestamp of an event (used for partitioning, retention management) • Example query, top 10 events in the US: SELECT event_name, sum(rsvp_count) FROM meetupRsvp WHERE country = 'us' GROUP BY event_name TOP 10
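The roles of dimension and metric columns in the example query can be sketched in plain Python. This is only an illustration of the query semantics (filter on a dimension, sum a metric per group, keep the largest groups), not Pinot code; the sample rows and the helper name `top_events` are made up.

```python
from collections import Counter

# Toy RSVP rows using field names from the schema slide; the values are invented.
rows = [
    {"event_name": "pinot-meetup", "country": "us", "rsvp_count": 3},
    {"event_name": "kafka-summit", "country": "us", "rsvp_count": 5},
    {"event_name": "pinot-meetup", "country": "us", "rsvp_count": 4},
    {"event_name": "pinot-meetup", "country": "ca", "rsvp_count": 9},
]

def top_events(rows, country, limit=10):
    """Filter on a dimension, sum a metric per group, keep the largest groups."""
    totals = Counter()
    for row in rows:
        if row["country"] == country:                        # WHERE country = 'us'
            totals[row["event_name"]] += row["rsvp_count"]   # sum(rsvp_count) GROUP BY event_name
    return totals.most_common(limit)                         # TOP 10

result = top_events(rows, "us")
print(result)
```

Here `country` and `event_name` act as dimensions and `rsvp_count` as the metric, matching the definitions on the slide.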
  23. 23. How to use Pinot: Configure and Create Table Pinot Schema Table Config ● Table name: meetupRsvp ● Table type: batch / realtime / hybrid ● Replication factor: 2 ● Index Columns: ... ● Bloom filters: ... ● Retention: 30 days ● ... Pinot Admin Client
  24. 24. How to use Pinot: Batch Ingestion [Diagram: Raw Data files (JSON, CSV, Avro, Parquet, ORC...) on HDFS, S3, ADLS, NFS...; Segment Generation Job (library) using Pinot Schema and Table Config; Pinot Segments written to HDFS, S3, ADLS, NFS...]
  25. 25. How to use Pinot: Batch Ingestion [Diagram: Raw Data (JSON, Avro, Parquet, ORC...) on HDFS, S3, ADLS, NFS...; Segment Generation Job (library) using Pinot Schema and Table Config; Pinot Segments on HDFS, S3, ADLS, NFS...; Segment Push Job (library)]
  26. 26. How to use Pinot: Segment Assignment • Assignment strategies ○ Uniform ○ Replica Group ○ Partition Aware • Example assignment ● S0: Server-0, Server-1 ● S1: Server-1, Server-2 ● S2: Server-0, Server-2 • 1. Table name 2. Segment name 3. Segment URI path [Diagram: Segment Push Job, Controller, Helix, Zookeeper, Segment Store (HDFS, S3, ADLS, NFS...) holding S0, S1, S2, and Pinot servers Server-0, Server-1, Server-2 holding segment replicas]
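A uniform assignment like the one on this slide can be sketched as a simple rotation over the server list. This is an assumption made for illustration, not Pinot's actual assignment algorithm; the function name `assign_segments` is hypothetical, and the replication value of 2 is taken from the table config slide.

```python
def assign_segments(num_segments, servers, replication):
    """Give segment i to `replication` consecutive servers, starting at server i."""
    assignment = {}
    for i in range(num_segments):
        replicas = [servers[(i + r) % len(servers)] for r in range(replication)]
        assignment[f"S{i}"] = sorted(replicas)
    return assignment

servers = ["Server-0", "Server-1", "Server-2"]
assignment = assign_segments(3, servers, replication=2)
for segment, replicas in assignment.items():
    print(segment, replicas)
```

With three segments and three servers, every server ends up holding two segments, which reproduces the S0/S1/S2 placement listed on the slide.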
  27. 27. How to use Pinot: Query Routing • Routing Strategies ○ Uniform ○ Replica Group ○ Partition Aware [Diagram: Queries, Broker, Segment Push Job, Controller, Helix, Segment Store (HDFS, S3, ADLS, NFS...) holding S0, S1, S2, and Pinot servers Server-0, Server-1, Server-2 holding segment replicas]
  28. 28. How to use Pinot: Batch + Realtime • A single schema for both offline + real-time tables • Table Config ● Table name: meetupRsvp ● Table type: real-time ● Replication factor: 2 ● Kafka broker: ... ● Kafka topic name: ... ● Retention: 5 days ● ... [Diagram: Segment Push Job, Controller, Helix, Offline Servers, Real-time Servers, Broker, Queries, Streaming Data (Kafka, Event Hub, Kinesis...)]
  29. 29. How to use Pinot: Batch + Realtime • Real-time servers keep consumed data in memory and periodically flush it to the segment store. • The broker handles federation of offline and real-time data. [Diagram: Segment Push Job, Controller, Helix, Offline Servers, Real-time Servers, Broker, Queries, Streaming Data (Kafka, Event Hub, Kinesis...)]
  30. 30. Quick Demo Event RSVP Data
  31. 31. Agenda: 1. Introduction 2. Pinot @ LinkedIn 3. How to use Pinot 4. Pinot Performance
  32. 32. Interactive Dashboard • Human-driven queries • Slice and dice over arbitrary dimensions • Example query: select sum(pageView) from T where country = us and browser = chrome ... group by time • Benchmark over 5000 queries (Pinot vs. Druid): Total Time 11 minutes vs. 24 minutes; P50 84ms vs. 136ms; P90 206ms vs. 667ms
  33. 33. Site Facing Analytics • Example query: select sum(articleViewCount) from T where articleId = x ... and time >= y and time < z group by viewer[title|geo|industry] • Pre-defined queries with different filtering values • Usually have a filter on the primary key (e.g. articleId) • High QPS (thousands), low latency (< 100ms for 99%) requirements
  34. 34. Anomaly Detection • Query pattern: for d1 in [us, ca, ...], for d2 in [chrome, firefox, ...], ...: select sum(pageViews) from T where country = d1 and browser = d2 … group by time • Identifying issues requires monitoring all possible combinations • Data distribution can be skewed: select … where country = us … is slow (scans 60-70% of the data), while select … where country = ireland … scans less than 1% [Diagram labels: Filter, Aggregation]
  35. 35. Secret behind Pinot Aggregation Filter Storage Scan Star-Tree Pre-aggregation Scan Inverted Index Columnar Store Encoding/Compression Sorted Index Star-Tree Index ❏ Common Techniques ❏ Pinot & Druid ❏ Pinot Only select sum(pageView) from T where country = us and browser = chrome
  36. 36. Columnar Store • Read relevant columns only • Raw data (country, browser, ...): (us, chrome, ...), (ca, firefox, ...), (jp, ie, ...), (us, firefox, ...), (ca, ie, ...), … • Row based: us chrome ... | ca firefox ... | jp ie ... • Column based: country: us, ca, jp, us, ca, …; browser: chrome, firefox, ie, firefox, ie, … • Query: select sum(pageView) from T where country = us and browser = chrome [Diagram labels: Aggregation, Filter, Storage, Columnar]
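The difference between the two layouts can be sketched in a few lines of Python. The country and browser values are the ones on the slide; the pageView numbers are invented, and plain lists stand in for Pinot's on-disk column format.

```python
# Row-based layout: one tuple per record (country, browser, pageView).
# The pageView numbers are made up for illustration.
rows = [
    ("us", "chrome", 10),
    ("ca", "firefox", 20),
    ("jp", "ie", 30),
    ("us", "firefox", 40),
    ("ca", "ie", 50),
]

# Column-based layout: one list per column, aligned by docId (list position).
names = ("country", "browser", "pageView")
columns = {name: list(values) for name, values in zip(names, zip(*rows))}

def sum_page_views(columns, country, browser):
    """select sum(pageView) from T where country = ... and browser = ..."""
    # Only the three columns named in the query are read.
    needed = zip(columns["country"], columns["browser"], columns["pageView"])
    return sum(views for c, b, views in needed if c == country and b == browser)

total = sum_page_views(columns, "us", "chrome")
print(columns["country"], total)
```

In a table with many more columns, the query would still touch only these three lists, which is the point of the "read relevant columns only" bullet.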
  37. 37. Encoding & Compression • Storage compression ○ Dictionary encoding ○ Bit compression • Column based (docId 0, 1, 2, 3, 4, …): country: us, ca, jp, us, ca, …; browser: chrome, firefox, ie, firefox, ie, … • Dictionary (dictId 0, 1, 2, …): country: ca, jp, us, …; browser: chrome, firefox, ie, … • Forward Index (docId 0, 1, 2, 3, 4, ...): country: 2, 0, 1, 2, 0, ...; browser: 0, 1, 2, 1, 2, ... • Query: select sum(pageView) from T where country = us and browser = chrome [Diagram labels: Aggregation, Filter, Storage, Encoding/Compression]
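Dictionary encoding as shown on this slide can be sketched as follows. The helper name `dictionary_encode` is hypothetical, and the bit-width calculation is a simplified stand-in for whatever bit packing Pinot actually applies.

```python
def dictionary_encode(values):
    """Replace each value with its position in a sorted dictionary of distinct values."""
    dictionary = sorted(set(values))
    dict_id = {value: i for i, value in enumerate(dictionary)}
    forward_index = [dict_id[value] for value in values]
    # Bits needed to store the largest dictId (at least 1 bit).
    bits_per_value = max(1, (len(dictionary) - 1).bit_length())
    return dictionary, forward_index, bits_per_value

country = ["us", "ca", "jp", "us", "ca"]
browser = ["chrome", "firefox", "ie", "firefox", "ie"]

country_dict, country_fwd, country_bits = dictionary_encode(country)
browser_dict, browser_fwd, browser_bits = dictionary_encode(browser)
print(country_dict, country_fwd, country_bits)
print(browser_dict, browser_fwd, browser_bits)
```

The forward indexes come out as 2, 0, 1, 2, 0 and 0, 1, 2, 1, 2, the same dictIds listed on the slide, and each entry fits in 2 bits instead of a full string.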
  38. 38. Inverted Index • Storing bitmap for each value • Fast filtering: ○ Constant time value lookup ○ Bit operations for AND/OR clause • Raw data (docId, country, browser): (0, us, chrome), (1, ca, firefox), (2, jp, ie), (3, us, firefox), (4, ca, ie), … • Inverted index on country: ca: 1, 4...; jp: 2...; us: 0, 3... • Inverted index on browser: chrome: 0...; firefox: 1, 3...; ie: 2, 4... • Query: select sum(pageView) from T where country = us and browser = chrome [Diagram labels: Aggregation, Filter, Storage, Inverted Index]
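A minimal sketch of bitmap-based filtering, using Python integers as bitmaps where bit i is set when docId i holds the value. This is an illustration only; a production system would use a compressed bitmap structure rather than a plain integer, and the helper names are hypothetical.

```python
def build_inverted_index(column):
    """Map each distinct value to a bitmap of the docIds that contain it."""
    index = {}
    for doc_id, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << doc_id)
    return index

def doc_ids(bitmap):
    """Expand a bitmap back into a sorted list of docIds."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

country = build_inverted_index(["us", "ca", "jp", "us", "ca"])
browser = build_inverted_index(["chrome", "firefox", "ie", "firefox", "ie"])

# where country = us and browser = chrome  ->  bitwise AND of two bitmaps
us_and_chrome = doc_ids(country["us"] & browser["chrome"])
# where country = ca or country = jp       ->  bitwise OR of two bitmaps
ca_or_jp = doc_ids(country["ca"] | country["jp"])
print(us_and_chrome, ca_or_jp)
```

Looking up a value is a single dictionary access, and combining predicates never touches the raw column data.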
  39. 39. Sorted Index • Better data compression: ○ Run length encoding ○ Can be accessed as forward/inverted index • Spatial locality • Sorted column (docId, country): (0, ca), ..., (100, jp), (101, us), …, (300, us), … • Sorted index as inverted index (country, start docId, end docId): (ca, 0, 80), (jp, 81, 100), (us, 101, 300), … • Query: select sum(pageView) from T where country = us and browser = chrome [Diagram labels: Aggregation, Filter, Storage, Sorted Index]
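When a column is sorted, each value occupies one contiguous docId range, so the index only needs a start and an end per value. The sketch below rebuilds the ranges from the slide; the helper name `build_sorted_index` is hypothetical.

```python
def build_sorted_index(sorted_column):
    """Record the (start docId, end docId) range of each value in a sorted column."""
    ranges = {}
    for doc_id, value in enumerate(sorted_column):
        start, _ = ranges.get(value, (doc_id, doc_id))
        ranges[value] = (start, doc_id)
    return ranges

# A sorted country column with the same run lengths as the slide.
column = ["ca"] * 81 + ["jp"] * 20 + ["us"] * 200
sorted_index = build_sorted_index(column)

start, end = sorted_index["us"]       # filter country = us becomes one range lookup
matching_docs = end - start + 1
print(sorted_index, matching_docs)
```

Three (start, end) pairs replace 301 stored values, and the matching documents for a filter sit next to each other, which is where the spatial locality bullet comes from.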
  40. 40. Latency vs. Space Trade-off [Chart: latency against space requirement, with points labelled scan, pre-cube, and Star-Tree] • Query: select sum(pageView) from T where country = us and browser = chrome [Diagram labels: Aggregation, Filter, Storage, Star-Tree Pre-aggregation, Star-Tree Index]
  41. 41. Star-Tree Index • Configurable trade-off between latency and space through partial pre-aggregation • Can achieve a hard upper bound on query latency [Chart: latency against space requirement for T=infinity, T=1,000,000, T=10,000, T=100, T=1]
  42. 42. Star-Tree Index
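The idea of partial pre-aggregation with a threshold can be sketched with a much simplified tree. This is not Pinot's implementation: it assumes T means the largest number of records a node may leave for scanning, it splits on dimensions in a fixed order, and the records and function names are made up.

```python
STAR = "*"  # child key meaning "any value of this dimension"

def merge_without(records, keep_dims):
    """Pre-aggregate records that agree on every dimension in keep_dims."""
    merged = {}
    for dims, metric in records:
        key = tuple(dims[d] for d in keep_dims)
        merged[key] = merged.get(key, 0) + metric
    return [(dict(zip(keep_dims, key)), total) for key, total in merged.items()]

def build(records, dims, threshold):
    """Split on dims in order until a node holds at most `threshold` records."""
    node = {"sum": sum(metric for _, metric in records), "dims": dims}
    if len(records) <= threshold or not dims:
        node["records"] = records           # leaf: left for scanning at query time
        return node
    dim, rest = dims[0], dims[1:]
    groups = {}
    for record in records:
        groups.setdefault(record[0][dim], []).append(record)
    node["dim"] = dim
    node["children"] = {v: build(g, rest, threshold) for v, g in groups.items()}
    node["children"][STAR] = build(merge_without(records, rest), rest, threshold)
    return node

def query(node, filters):
    """Return (sum of the metric, number of records scanned)."""
    while "children" in node:
        node = node["children"].get(filters.get(node["dim"], STAR))
        if node is None:
            return 0, 0                      # the filter value never appears
    unresolved = [d for d in node["dims"] if d in filters]
    if not unresolved:
        return node["sum"], 0               # answered from the pre-aggregate
    scanned = node["records"]
    total = sum(m for dims, m in scanned
                if all(dims[d] == filters[d] for d in unresolved))
    return total, len(scanned)

records = [
    ({"country": "us", "browser": "chrome"}, 10),
    ({"country": "us", "browser": "chrome"}, 5),
    ({"country": "us", "browser": "firefox"}, 40),
    ({"country": "ca", "browser": "firefox"}, 20),
    ({"country": "ca", "browser": "ie"}, 50),
    ({"country": "jp", "browser": "ie"}, 30),
]
dims = ["country", "browser"]

tree = build(records, dims, threshold=2)
scan_only = build(records, dims, threshold=float("inf"))
print(query(tree, {"country": "us", "browser": "chrome"}))
print(query(scan_only, {"country": "us", "browser": "chrome"}))
```

With an infinite threshold the tree is a single leaf and every query scans all records; lowering the threshold stores more pre-aggregated nodes, and no query in this sketch scans more than `threshold` records.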
  43. 43. Flexible Query Execution Plan • Query: select max(col) from T; Optimization: use metadata instead of scanning • Query: select sum(metric) from T where country = us and accountId = x; Optimization: reorder filters based on the available indexes (apply the accountId predicate before the country predicate) • The segment-level physical query planner chooses how to execute each query based on the segment metadata and the available indexes.
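Both optimizations can be sketched on a toy segment. The layout below (per-column min/max metadata plus a value-to-docIds index) and the rule of applying the predicate with the fewest matches first are assumptions made for illustration, not Pinot's planner; all names and values are hypothetical.

```python
def build_segment(columns):
    """Attach per-column min/max metadata and a value -> docIds index to raw columns."""
    metadata = {name: {"min": min(vals), "max": max(vals)} for name, vals in columns.items()}
    inverted = {}
    for name, vals in columns.items():
        postings = inverted.setdefault(name, {})
        for doc_id, value in enumerate(vals):
            postings.setdefault(value, set()).add(doc_id)
    return {"columns": columns, "metadata": metadata, "inverted": inverted}

def max_value(segment, column):
    """select max(col) from T: answered from metadata, no scan."""
    return segment["metadata"][column]["max"]

def filtered_sum(segment, metric, predicates):
    """Apply the most selective predicate first, then sum the metric over the survivors."""
    def matches(item):
        column, value = item
        return segment["inverted"][column].get(value, set())
    ordered = sorted(predicates.items(), key=lambda item: len(matches(item)))
    docs = set(range(len(segment["columns"][metric])))
    for item in ordered:
        docs &= matches(item)
    total = sum(segment["columns"][metric][d] for d in docs)
    return total, [column for column, _ in ordered]

segment = build_segment({
    "country": ["us", "us", "us", "ca"],
    "accountId": [1, 2, 3, 1],
    "metric": [10, 20, 30, 40],
})
largest = max_value(segment, "metric")
total, order = filtered_sum(segment, "metric", {"country": "us", "accountId": 2})
print(largest, total, order)
```

The accountId predicate matches one document while the country predicate matches three, so it is applied first, mirroring the ordering described on the slide.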
  44. 44. Global Optimizations • Problem: querying all segments; Solution: segment pruning to minimize the number of segments to query • Problem: querying all servers; Solution: smart segment assignment to reduce the fan-out to servers
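One way to prune segments is to compare each segment's time range with the time filter of the query. The slide does not say which criteria Pinot prunes on, so the time-range rule, the segment metadata fields, and the function name below are all assumptions.

```python
# Hypothetical per-segment metadata: the time range each segment covers.
segments = [
    {"name": "S0", "min_time": 0, "max_time": 99},
    {"name": "S1", "min_time": 100, "max_time": 199},
    {"name": "S2", "min_time": 200, "max_time": 299},
]

def prune_by_time(segments, start, end):
    """Keep only segments that can contain rows with start <= time < end."""
    return [
        seg["name"]
        for seg in segments
        if seg["max_time"] >= start and seg["min_time"] < end
    ]

# where time >= 150 and time < 220
to_query = prune_by_time(segments, 150, 220)
print(to_query)
```

Segment S0 is skipped without being opened, so the query touches two segments instead of three.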
  45. 45. Conclusion User Activity Data Member Facing Applications Interactive Dashboard Anomaly Detection
  46. 46. Contributing to Pinot • We are looking for contributions! • Apache Pinot (incubating) 0.1.0 is available at https://pinot.apache.org • Pinot Twitter Account https://twitter.com/ApachePinot • Pinot Meetup Page https://www.meetup.com/apache-pinot • Pinot Slack Channel https://tinyurl.com/pinotSlackChannel
  47. 47. Folks behind Pinot Mayank Shrivastava Subbu Subramaniam Jean-Francois Im Jackie Jiang Seunghyun Lee Jennifer Dai Neha Pawar Jialiang Li Sunitha Beeram Shraddha Sahay Kishore Gopalakrishna Xiang Fu James Shao Prasanna Ravi John Gutmann Dino Occhialini Walter Huf Xiaohui Sun Long Huynh Akshay Rai Alexander Pucher Jihao Zhang Felix Cheung Olivier Lamy Jim Jagielski Marcel Siegrist Roman Shaposhnik Anurag Shendge
  48. 48. Thank you
