Qian Qiao Neng
eBay Senior Product Manager
Sep 2017
When OLAP Meets Real-time, What
Happens in eBay?
PLATFORM ENGINEERING REVIEW
Agenda
What’s OLAP & Apache Kylin
Challenge of Real-time OLAP
eBay Real-time OLAP
Architecture
3 Stages Data Store
Segment Window
Others
2
PLATFORM ENGINEERING REVIEW
What’s OLAP
• OLAP(Online Analytical Processing), is an approach to
answering multi-dimensional analytical queries quickly
in computing.
• OLAP Cube, a data set consists of numeric facts
called measures that are categorized by dimensions.
• OLAP Cube Build is the data process pre-calculating
measures by multi-dimensions, and save them in storage as a
cube.
• OLAP Cube trades off data processing latency for online
query latency
3
PLATFORM ENGINEERING REVIEW
What is Apache Kylin
Extreme OLAP Engine for Big Data
Apache Kylin is an open source Distributed Analytics Engine
designed to provide SQL interface and multi-dimensional analysis
(OLAP) on Hadoop supporting extremely large datasets, initiated and
continuously contributed by eBay Inc.
kylin 麒麟
--n. (in Chinese art) a mythical animal of composite form
Kylin fills an important gap – OLAP on Hadoop
4
PLATFORM ENGINEERING REVIEW
Kylin Architecture
5
PLATFORM ENGINEERING REVIEW
The Challenge
Remember Kylin is about pre-aggregations, it successfully reduced the query latencies by spending
more time and resources on data processing.
But for real-time OLAP, cubing time and data visibility latency is critical.
Can we pre-build the cubes in real-time in a more cost effective manner? This is hard but still doable.
6
PLATFORM ENGINEERING REVIEW
Real-time OLAP
Milliseconds Query Latency + Milliseconds Data Processing Latency
7
PLATFORM ENGINEERING REVIEW
3 Stages Data Store
We save the unbounded incoming streaming data into 3 stages.
8
InMem Stage
On Disk Stage
Full Cubing Stage
Continuously InMem
Aggregations
Flush to disk, columnar
based storage and indexes
Full cubing with MR or
Spark, save to HBase.
Unbounded
streaming events
PLATFORM ENGINEERING REVIEW
3 Stages Data Store
99
InMem Stage On Disk Stage Full Cubing Stage
InMemoryScanner FragementScanner HBaseScanner
GTScanner Interface
Small Real-time Data Volume
Few Aggregations
Large Historical Data Volume
All Aggregations
Low Data
Process
Latency
Users
Transparent
Low Query
Latency
PLATFORM ENGINEERING REVIEW
eBay’s Real-time OLAP built on Apache Kylin
10
PLATFORM ENGINEERING REVIEW
Streaming
Receivers Cluster
11
ReplicaSet1Streaming Sources
Cube
Storage
(HBase)
new streaming cube request
1
2
3
4
5
Build EngineSteaming Coordinator
6
7
How Real-time Cube Build Engine Works
8
ReplicaSet2
ReplicaSet3 ReplicaSet4
ReplicaSet5
9
10
PLATFORM ENGINEERING REVIEW
Streaming Receivers
Cluster
12
SQL Query
Steaming CoordinatorQuery Engine
1
2 3
How Real-time Cube Query Engine Works
12
Cube
Storage
(HBase)
PLATFORM ENGINEERING REVIEW
Data Storage
13
Raw
Record_1
…
RR_x
RR_x+1
…
RR_y
Seg_LL
Seg_4
Seg_3
1 … I
1 … J
1 … K
In Memory Active Fragments
Seg_LL
Seg_2
Seg_1
1 … L
1 … M
1 … N
Immutable Fragments
InMemoryScanner FragementScanner HBaseScanner
Open to Write Close to Process
GTScanner Interface
How to support milliseconds query latency
Processed
Unbounded
streaming events
PLATFORM ENGINEERING REVIEW
Hierarchy Data on Disk Stage
14
Fragment
Segment
Cube Cube
Segment 1
Fragment 1 Fragment 2 ...
...
...
Meta data Meta data Meta data Meta data
PLATFORM ENGINEERING REVIEW
Segments are divided by event time.
New created segments are first
In Active state.
After a preconfigured data retention
period, no further events coming,
The segment will be changed to
Immutable state and then write to
remote HDFS.
15
Segment Window And States
PLATFORM ENGINEERING REVIEW 16
Seg_4
1 … J
Active Segments
Seg_3
Seg_2
Seg_1
1 … L
1 … M
1 … N
Immutable Segments
Open to Write Close to Process
Segment States
Unbounded
streaming events
In Memory
Store
Fragments
PLATFORM ENGINEERING REVIEW 17
Segment Store On Disk
PLATFORM ENGINEERING REVIEW 18
Columnar Based Fragment File Design
Dimensions : D1,D2…Dn.
Metrics : M1,M2...Mm.
RawRecords : R1,R2...Rr.
D_1
…
D_n
Metric_1
…
Metric_m
Dimension_1 Dictionary
Dimension_1 Encoded
Value by Row
Index Data
Dimensions
Metrics
PLATFORM ENGINEERING REVIEW 19
Dimension_1 Dictionary
Dimension_1 Encoded
Value by Row
Inverted Index
Dim_1Value_1 Dic_1 Value_1
… …
Dim_1Value_x Dic_1 Value_x
Row_1’s Dic_1 Value
…
Row_r’s Dic_1 Value
Dic_1Value_1 Offset_1
… …
Dic_1Value_x Offset_x
Dic_1Value_1‘s occurrence of Rows
…
Dic_1Value_x‘s occurrence of Rows
Offset_1
Offset_...
Offset_x R_1
Occu
R_2
Occu
…
R_r
Occu
PLATFORM ENGINEERING REVIEW 20
Dimensions : Site, Category, Date
Metrics : SPS, PV.
RawRecords :
Site Category Date SPS PV
US Collectibles 8/27/16 11757 204143
US Collectibles 8/28/16 10949 192350
GE Fashion 8/28/16 5338 186659
US Collectibles 8/29/16 9609 186283
US Fashion 8/27/16 15174 177123
US Fashion 8/28/16 14612 172552
GE Home & Garden 8/27/16 7336 171364
US Fashion 8/29/16 13927 171235
GE Home & Garden 8/28/16 8615 169846
US Home & Garden 8/29/16 4560 166506
PLATFORM ENGINEERING REVIEW 21
D1
…
Dn
Metrics1
…
Metrics m
Dimension_1 Dictionary
Dimension_1 Encoded
Value by Row
Inverted Index
Dimensions
Metrics
US 1
GE 2
1121112121
1 offset1
2 offset2
1101110101
0010001010
11757
10949
5338
9609
15174
14612
7336
13927
8615
4560
Record_1 M1
Record_2 M1
……
Record_r M1
PLATFORM ENGINEERING REVIEW 22
Replica Set
The concept of Replica Set is similar to many other
distributed systems like Kafka, Mongo, Kubernetes, etc.
Replica Set ensures the replications for HA purpose.
Streaming receiver instances in the same replica set
have the exact same local state.
Streaming receiver instances are pre-allocated to join the
Replica Set.
Streaming
Receivers Cluster
ReplicaSet1 ReplicaSet2
ReplicaSet3 ReplicaSet4
ReplicaSet5
PLATFORM ENGINEERING REVIEW 23
Replica Set
Replica Set
Receiver1
Receiver2
Assignment:
“cube1”:[1,2]
“cube2”:[2,3]
Zookeeper
• All receivers in the set
share the same
assignment.
• The lead of the
ReplicaSet responsible
to upload segments to
HDFS
• Use Zookeeper to do
leader election
PLATFORM ENGINEERING REVIEW 24
Date Time Topic Name Partition # Offset SeqID of Active Segments
2016/10/01
12:00:00
Topic_1 1 x Seg_5, I Seg_4, J Seg_3, K
Check Point
Kafka
Seg_5
Seg_4
Seg_3
Topic_1 Part_1
1 … I
1 … J
1 … K
Active Fragments
... x x+1 … y y+1 …
Kafka
PLATFORM ENGINEERING REVIEW 25
Question?

When OLAP Meets Real-Time, What Happens in eBay?

  • 1.
    Qian Qiao Neng eBaySenior Product Manager Sep 2017 When OLAP Meets Real-time, What Happens in eBay?
  • 2.
    PLATFORM ENGINEERING REVIEW Agenda What’sOLAP & Apache Kylin Challenge of Real-time OLAP eBay Real-time OLAP Architecture 3 Stages Data Store Segment Window Others 2
  • 3.
    PLATFORM ENGINEERING REVIEW What’sOLAP • OLAP(Online Analytical Processing), is an approach to answering multi-dimensional analytical queries quickly in computing. • OLAP Cube, a data set consists of numeric facts called measures that are categorized by dimensions. • OLAP Cube Build is the data process pre-calculating measures by multi-dimensions, and save them in storage as a cube. • OLAP Cube trades off data processing latency for online query latency 3
  • 4.
    PLATFORM ENGINEERING REVIEW Whatis Apache Kylin Extreme OLAP Engine for Big Data Apache Kylin is an open source Distributed Analytics Engine designed to provide SQL interface and multi-dimensional analysis (OLAP) on Hadoop supporting extremely large datasets, initiated and continuously contributed by eBay Inc. kylin 麒麟 --n. (in Chinese art) a mythical animal of composite form Kylin fills an important gap – OLAP on Hadoop 4
  • 5.
  • 6.
    PLATFORM ENGINEERING REVIEW TheChallenge Remember Kylin is about pre-aggregations, it successfully reduced the query latencies by spending more time and resources on data processing. But for real-time OLAP, cubing time and data visibility latency is critical. Can we pre-build the cubes in real-time in a more cost effective manner? This is hard but still doable. 6
  • 7.
    PLATFORM ENGINEERING REVIEW Real-timeOLAP Milliseconds Query Latency + Milliseconds Data Processing Latency 7
  • 8.
    PLATFORM ENGINEERING REVIEW 3Stages Data Store We save the unbounded incoming streaming data into 3 stages. 8 InMem Stage On Disk Stage Full Cubing Stage Continuously InMem Aggregations Flush to disk, columnar based storage and indexes Full cubing with MR or Spark, save to HBase. Unbounded streaming events
  • 9.
    PLATFORM ENGINEERING REVIEW 3Stages Data Store 99 InMem Stage On Disk Stage Full Cubing Stage InMemoryScanner FragementScanner HBaseScanner GTScanner Interface Small Real-time Data Volume Few Aggregations Large Historical Data Volume All Aggregations Low Data Process Latency Users Transparent Low Query Latency
  • 10.
    PLATFORM ENGINEERING REVIEW eBay’sReal-time OLAP built on Apache Kylin 10
  • 11.
    PLATFORM ENGINEERING REVIEW Streaming ReceiversCluster 11 ReplicaSet1Streaming Sources Cube Storage (HBase) new streaming cube request 1 2 3 4 5 Build EngineSteaming Coordinator 6 7 How Real-time Cube Build Engine Works 8 ReplicaSet2 ReplicaSet3 ReplicaSet4 ReplicaSet5 9 10
  • 12.
    PLATFORM ENGINEERING REVIEW StreamingReceivers Cluster 12 SQL Query Steaming CoordinatorQuery Engine 1 2 3 How Real-time Cube Query Engine Works 12 Cube Storage (HBase)
  • 13.
    PLATFORM ENGINEERING REVIEW DataStorage 13 Raw Record_1 … RR_x RR_x+1 … RR_y Seg_LL Seg_4 Seg_3 1 … I 1 … J 1 … K In Memory Active Fragments Seg_LL Seg_2 Seg_1 1 … L 1 … M 1 … N Immutable Fragments InMemoryScanner FragementScanner HBaseScanner Open to Write Close to Process GTScanner Interface How to support milliseconds query latency Processed Unbounded streaming events
  • 14.
    PLATFORM ENGINEERING REVIEW HierarchyData on Disk Stage 14 Fragment Segment Cube Cube Segment 1 Fragment 1 Fragment 2 ... ... ... Meta data Meta data Meta data Meta data
  • 15.
    PLATFORM ENGINEERING REVIEW Segmentsare divided by event time. New created segments are first In Active state. After a preconfigured data retention period, no further events coming, The segment will be changed to Immutable state and then write to remote HDFS. 15 Segment Window And States
  • 16.
    PLATFORM ENGINEERING REVIEW16 Seg_4 1 … J Active Segments Seg_3 Seg_2 Seg_1 1 … L 1 … M 1 … N Immutable Segments Open to Write Close to Process Segment States Unbounded streaming events In Memory Store Fragments
  • 17.
    PLATFORM ENGINEERING REVIEW17 Segment Store On Disk
  • 18.
    PLATFORM ENGINEERING REVIEW18 Columnar Based Fragment File Design Dimensions : D1,D2…Dn. Metrics : M1,M2...Mm. RawRecords : R1,R2...Rr. D_1 … D_n Metric_1 … Metric_m Dimension_1 Dictionary Dimension_1 Encoded Value by Row Index Data Dimensions Metrics
  • 19.
    PLATFORM ENGINEERING REVIEW19 Dimension_1 Dictionary Dimension_1 Encoded Value by Row Inverted Index Dim_1Value_1 Dic_1 Value_1 … … Dim_1Value_x Dic_1 Value_x Row_1’s Dic_1 Value … Row_r’s Dic_1 Value Dic_1Value_1 Offset_1 … … Dic_1Value_x Offset_x Dic_1Value_1‘s occurrence of Rows … Dic_1Value_x‘s occurrence of Rows Offset_1 Offset_... Offset_x R_1 Occu R_2 Occu … R_r Occu
  • 20.
    PLATFORM ENGINEERING REVIEW20 Dimensions : Site, Category, Date Metrics : SPS, PV. RawRecords : Site Category Date SPS PV US Collectibles 8/27/16 11757 204143 US Collectibles 8/28/16 10949 192350 GE Fashion 8/28/16 5338 186659 US Collectibles 8/29/16 9609 186283 US Fashion 8/27/16 15174 177123 US Fashion 8/28/16 14612 172552 GE Home & Garden 8/27/16 7336 171364 US Fashion 8/29/16 13927 171235 GE Home & Garden 8/28/16 8615 169846 US Home & Garden 8/29/16 4560 166506
  • 21.
    PLATFORM ENGINEERING REVIEW21 D1 … Dn Metrics1 … Metrics m Dimension_1 Dictionary Dimension_1 Encoded Value by Row Inverted Index Dimensions Metrics US 1 GE 2 1121112121 1 offset1 2 offset2 1101110101 0010001010 11757 10949 5338 9609 15174 14612 7336 13927 8615 4560 Record_1 M1 Record_2 M1 …… Record_r M1
  • 22.
    PLATFORM ENGINEERING REVIEW22 Replica Set The concept of Replica Set is similar to many other distributed systems like Kafka, Mongo, Kubernetes, etc. Replica Set ensures the replications for HA purpose. Streaming receiver instances in the same replica set have the exact same local state. Streaming receiver instances are pre-allocated to join the Replica Set. Streaming Receivers Cluster ReplicaSet1 ReplicaSet2 ReplicaSet3 ReplicaSet4 ReplicaSet5
  • 23.
    PLATFORM ENGINEERING REVIEW23 Replica Set Replica Set Receiver1 Receiver2 Assignment: “cube1”:[1,2] “cube2”:[2,3] Zookeeper • All receivers in the set share the same assignment. • The lead of the ReplicaSet responsible to upload segments to HDFS • Use Zookeeper to do leader election
  • 24.
    PLATFORM ENGINEERING REVIEW24 Date Time Topic Name Partition # Offset SeqID of Active Segments 2016/10/01 12:00:00 Topic_1 1 x Seg_5, I Seg_4, J Seg_3, K Check Point Kafka Seg_5 Seg_4 Seg_3 Topic_1 Part_1 1 … I 1 … J 1 … K Active Fragments ... x x+1 … y y+1 … Kafka
  • 25.