Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix

Ohad Shacham
Yahoo Research
Edward Bortnikov
Yahoo Research
Yonatan Gottesman
Yahoo Research

Agenda
Translytics = Transactions + Analytics
Cloud-Scale Use Cases
Doing it in the HBase-Omid-Phoenix World
Omid and Phoenix Deep Dive

Real-Time Data Processing on the Rise

The Applications Perspective
Event-to-action/insight latency becomes king
Stream processing, asynchronous execution
Data consistency becomes nontrivial
Complex processing patterns (online reporting to AI)
Data integration across multiple feeds and schemas
OLTP World
Analytics World

Translytics Platforms Vision
The best of all worlds: OLTP and Analytics all-in-one
Enable complex, consistent, real-time data processing
Simple APIs with strong guarantees
Built to scale on top of NoSQL data platforms

OLTP Coming to NoSQL
Traditional NoSQL guarantees row-level atomicity
Translytics applications often bundle reads and writes
Asynchronous design patterns drive concurrency
Without ACID guarantees, chaos rules!

ACID transactions
Multiple data accesses in a single logical operation
Atomic
“All or nothing” – no partial effect observable
Consistent
The DB transitions from one valid state to another
Isolated
Appear to execute in isolation
Durable
Committed data cannot disappear

Use Case: Audience Targeting for Ads
Advertisers optimize campaigns to reach the right user audiences
Ad-tech platforms build and sell audience segments (identity sets)
Segmentation is based on user features (demographics, behavior, …)
Algorithms vary from rule-based heuristics to AI classification
Timeliness directly affects revenue

Real-Time Targeting Platform
Storm for Compute
Audience segmentation algorithms embedded in bolts
HBase for Storage
User Profiles (U), Segments (S), and U ↔ S relationships
Kafka for Messaging
Scale: trillions of touchpoints/month

Challenge: Keeping the Data Consistent
Shared data is accessed in parallel by multiple bolts
Access patterns are complex
User profile update: read+compute+write
User↔Segment mapping update: two writes
Segment query (scan): read multiple rows
The HBase read/write API does not provide atomicity guarantees across multiple operations or rows

Omid Comes to Help
Transaction Processing layer for Apache HBase
Apache Incubation (started 2015, graduation planned 2019)
Easy-to-use API (good old NoSQL)
Popular consistency model (snapshot isolation)
Battle tested (in prod @Yahoo since 2015, new customers onboarding)

Omid Programming
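// Example column family and qualifier (assumed names, not on the original slide)
byte[] family = Bytes.toBytes("MY_CF");
byte[] qualifier = Bytes.toBytes("MY_Q");
TransactionManager tm = HBaseTransactionManager.newInstance();
TTable txTable = new TTable("MY_TX_TABLE");
Transaction tx = tm.begin(); // Control path: start the transaction
Put row1 = new Put(Bytes.toBytes("EXAMPLE_ROW1"));
row1.add(family, qualifier, Bytes.toBytes("val1"));
txTable.put(tx, row1); // Data path: write in the transaction's context
Put row2 = new Put(Bytes.toBytes("EXAMPLE_ROW2"));
row2.add(family, qualifier, Bytes.toBytes("val2"));
txTable.put(tx, row2); // Data path
tm.commit(tx); // Control path: both rows become visible atomically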

SQL Coming to NoSQL
NoSQL API is simple but crude and non-standardized
Hard to manage complex schemas (low-level data abstraction)
Hard to implement analytics queries (low-level access primitives)
Hard to optimize for speed (server-side programming required)
Hard to integrate with relational data sources

Use Case: Real-Time Ad Inventory Ingestion
Advertisers deploy campaign content & metadata in the marketplace
SQL-speaking external client
Complex schema (many campaign types and optimization goals)
High scalability (growing market)
Campaign operations run multidimensional inventory analytics
Aggregate queries by advertiser, product, time, etc.
ML pipeline learns recommendation models for new campaigns
NoSQL-style access to data

Phoenix Comes to Help
OLTP and Real-Time Analytics for HBase
Query optimizer transforms SQL to native HBase API calls
Standard SQL interface with JDBC APIs
High level data abstractions (e.g., secondary indexes)
High performance (leverages server-side coprocessors)

Phoenix/Omid Integration
Phoenix is designed for public-cloud scale (>10K query servers)
Omid is extremely scalable (>600k tps), low-latency (<5ms), and HA
New Omid release (1.0.1) - SQL features, improved performance
Supports secondary indexes, extended Snapshot Isolation, downstream filters
Phoenix releases 4.15 and 5.1 include Omid as a Phoenix transaction processing (TP) backend
Phoenix refactored to support multiple TP backends (Omid is default)

Phoenix/Omid Integration performance
1M initial inserts, 1 KB per row
Omid in sync post-commit mode

Why do we care?
SQL transactions
SELECT * FROM my_table; -- This will start a transaction
UPSERT INTO my_table VALUES (1,'A');
SELECT count(*) FROM my_table WHERE k=1;
DELETE FROM my_other_table WHERE k=2;
!commit -- Other transactions will now see your updates and you will see theirs

Why do we care?
A non-transactional secondary index update might break consistency
[Diagram: Write (k1, [v1,v2,v3]) stores (k1, [v1,v2,v3]) in the Table and (v1, k1) in the Index]

Why do we care?
Updating the secondary index fails
Out of handlers
Many JIRAs discuss this issue
[Diagram: Write (k1, [v1,v2,v3]) stores (k1, [v1,v2,v3]) in the Table, but no entry reaches the Index]
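
The failure on this slide can be reproduced with a small in-memory Java sketch. Everything in it is hypothetical (the class `IndexConsistencySketch`, the `indexAvailable` flag, and the rollback in `transactionalWrite`); the rollback is only a stand-in for all-or-nothing behavior and is not how Omid or Phoenix implement transactions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: a table (k -> v) and a secondary index (v -> k).
class IndexConsistencySketch {
    final Map<String, String> table = new HashMap<>();
    final Map<String, String> index = new HashMap<>();
    boolean indexAvailable = true;  // false simulates "out of handlers"

    private void putIndex(String v, String k) {
        if (!indexAvailable) throw new IllegalStateException("index update failed");
        index.put(v, k);
    }

    // Two independent writes: a failure between them leaves a dangling row.
    void nonTransactionalWrite(String k, String v) {
        table.put(k, v);
        putIndex(v, k);
    }

    // All-or-nothing stand-in: undo the table write if the index write fails.
    boolean transactionalWrite(String k, String v) {
        String previous = table.put(k, v);
        try {
            putIndex(v, k);
            return true;
        } catch (IllegalStateException e) {
            if (previous == null) table.remove(k); else table.put(k, previous);
            return false;
        }
    }

    // True when every table row has a matching index entry.
    boolean consistent() {
        for (Map.Entry<String, String> e : table.entrySet()) {
            if (!e.getKey().equals(index.get(e.getValue()))) return false;
        }
        return true;
    }
}
```

With `indexAvailable` set to false, `nonTransactionalWrite` leaves a table row that the index cannot find, while `transactionalWrite` leaves both structures unchanged.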

Transactions and snapshot isolation
Aborts only on write-write conflicts
[Diagram: transaction timeline from begin (read point) to commit (write point), with read(x), write(y), write(x), read(y) in between]
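
The rule on this slide can be illustrated with a minimal, single-threaded Java sketch. It is an in-memory model written for illustration, not Omid's implementation: `SnapshotIsolationSketch`, its `Tx` type, and the logical clock are all hypothetical. A transaction reads the latest version committed at or before its read point, buffers its writes, and aborts at commit only if another transaction has committed a write to the same key after that read point.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical single-threaded model of snapshot isolation (not Omid code).
class SnapshotIsolationSketch {
    private long clock = 0;  // logical timestamp source
    // key -> (commit timestamp -> value)
    private final Map<String, NavigableMap<Long, String>> versions = new HashMap<>();

    class Tx {
        final long readTs = ++clock;  // read point, taken at begin
        private final Map<String, String> writes = new HashMap<>();

        String read(String key) {
            if (writes.containsKey(key)) return writes.get(key);  // own buffered write
            NavigableMap<Long, String> v = versions.get(key);
            if (v == null) return null;
            Map.Entry<Long, String> e = v.floorEntry(readTs);     // snapshot read
            return e == null ? null : e.getValue();
        }

        void write(String key, String value) { writes.put(key, value); }

        // Returns true on commit, false on abort.
        boolean commit() {
            for (String key : writes.keySet()) {
                NavigableMap<Long, String> v = versions.get(key);
                // Write-write conflict: the key was committed after our read point.
                if (v != null && !v.isEmpty() && v.lastKey() > readTs) return false;
            }
            long commitTs = ++clock;  // write point
            for (Map.Entry<String, String> w : writes.entrySet()) {
                versions.computeIfAbsent(w.getKey(), k -> new TreeMap<>())
                        .put(commitTs, w.getValue());
            }
            return true;
        }
    }

    Tx begin() { return new Tx(); }
}
```

Two concurrent transactions that both write `x` cannot both commit; a transaction that only reads keys written by a concurrent one is never aborted, it simply keeps seeing its snapshot.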

Omid architecture
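[Diagram: the Client sends Begin/Commit to the Transaction Manager and receives Results/Timestamp; the Client reads and writes Data and verifies commits against the Commit Table; the Transaction Manager performs Conflict Detection and persists commits to the Commit Table]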
Omid low latency (LL) architecture
[Diagram: Client, Transaction Manager, Data, and Commit Table, with Begin/Commit, Results/Timestamp, and Persist Commit flows]

Execution example
Client sends Begin to the Transaction Manager and receives t1 (tr = t1)
Write (k1, v1, t1), Write (k2, v2, t1)
Read (k’, last committed t’ < t1)
Data: (k1, v1, t1) (k2, v2, t1)

Execution example
Client sends Commit: t1, {k1, k2} to the Transaction Manager and receives t2 (tr = t1, tc = t2)
Write (t1, t2) to the Commit Table
Data: (k1, v1, t1) (k2, v2, t1)
Commit Table: (t1, t2)

Execution example
A later Client reads with tr = t3
Read (k1, t3) finds (k1, v1, t1) in Data
Read (t1) finds (t1, t2) in the Commit Table
Data: (k1, v1, t1) (k2, v2, t1)
Commit Table: (t1, t2)
Bottleneck! (the extra Commit Table read)

Post-Commit
tr = t1, tc = t2
Update commit cells: Data becomes (k1,v1,t1,t2) (k2,v2,t1,t2)
Delete(t1) removes (t1, t2) from the Commit Table

Using Commit Cells
Client reads with tr = t3
Read (k1, t3) finds (k1,v1,t1,t2) in Data, commit timestamp included
Data: (k1,v1,t1,t2) (k2,v2,t1,t2)
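
The read path across the last few slides can be sketched in Java as follows. This is a simplified in-memory model under stated assumptions, not Omid's code: the class `CommitCellSketch`, the `Cell` type, and the `commitTableLookups` counter are hypothetical, and timestamps 1, 2, 3 stand in for t1, t2, t3. A reader resolves a cell's commit timestamp from the commit cell when it is present and falls back to the Commit Table otherwise.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of Data cells, the Commit Table, and commit cells.
class CommitCellSketch {
    static final class Cell {
        final String key, value;
        final long startTs;  // timestamp of the writing transaction
        Long commitTs;       // commit cell; null until post-commit fills it in
        Cell(String key, String value, long startTs) {
            this.key = key; this.value = value; this.startTs = startTs;
        }
    }

    final List<Cell> data = new ArrayList<>();
    final Map<Long, Long> commitTable = new HashMap<>();  // startTs -> commitTs
    int commitTableLookups = 0;

    void write(String key, String value, long startTs) {
        data.add(new Cell(key, value, startTs));
    }

    void commit(long startTs, long commitTs) { commitTable.put(startTs, commitTs); }

    // Copy the commit timestamp into the cells, then drop the Commit Table entry.
    void postCommit(long startTs) {
        Long commitTs = commitTable.get(startTs);
        if (commitTs == null) return;
        for (Cell c : data) if (c.startTs == startTs) c.commitTs = commitTs;
        commitTable.remove(startTs);
    }

    // Return the value with the latest commit timestamp below readTs, if any.
    String read(String key, long readTs) {
        Cell best = null;
        long bestCommit = -1;
        for (Cell c : data) {
            if (!c.key.equals(key)) continue;
            Long commitTs = c.commitTs;
            if (commitTs == null) {  // no commit cell yet: ask the Commit Table
                commitTableLookups++;
                commitTs = commitTable.get(c.startTs);
            }
            if (commitTs != null && commitTs < readTs && commitTs > bestCommit) {
                best = c;
                bestCommit = commitTs;
            }
        }
        return best == null ? null : best.value;
    }
}
```

Before `postCommit` runs, every read of `k1` costs a Commit Table lookup; afterwards the same read is answered from Data alone.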

Durability
[Diagram: Client, Transaction Manager, Data, and Commit Table, with Begin/Commit, Results/Timestamp, and Persist Commit flows; the Commit Table is labeled as an HBase table]

What about high availability?
[Diagram: Client, Transaction Manager, Data, and Commit Table, with Begin/Commit, Results/Timestamp, and Persist Commit flows; the Transaction Manager is marked as a single point of failure]

High availability
[Diagram: two Transaction Manager (TSO) instances with shared Recovery state; flows are Begin/Commit and Results/Timestamp with the Client, and Persist Commit and Force abort on the Commit Table]

Benchmark: single-write transaction workload
Easily scales beyond 500K tps
Latency problem solved
[Chart annotation: TSO latency bottleneck!]

New scenarios for Omid
Secondary Indexes
Atomic Updates
How can we update metadata?
On-the-Fly Index Creation
What should we do with in-flight transactions?
Extended Snapshot Isolation
Read-Your-Own-Writes Queries
Do not fit standard snapshot isolation

Secondary index: creation and maintenance
[Diagram: timeline T1, T2, T3, CREATE INDEX started, T4, CREATE INDEX complete, T5, T6]

Secondary index: creation and maintenance
[Diagram: timeline T1, T2, T3, CREATE INDEX started, T4, CREATE INDEX complete, T5, T6, with annotations: Bulk-Insert into index; Abort (enforced upon commit); Added by a coprocessor (twice); Index update (stored procedure)]

Extended snapshot isolation
CREATE TABLE T (ID INT);
...
BEGIN;
INSERT INTO T
SELECT ID+10 FROM T;
INSERT INTO T
SELECT ID+100 FROM T;
COMMIT;

Moving snapshot implementation
Checkpoint for
Statement 1
Checkpoint for
Statement 2
Writes by
Statement 1
Timestamps allocated by TM in blocks.
Client promotes the checkpoint.
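
The checkpoint idea can be sketched in Java as below. This is a toy model under stated assumptions, not the Phoenix or Omid implementation: `MovingSnapshotSketch`, `insertSelectPlus`, and the starting table contents (a single row with ID 1) are all hypothetical. Each statement reads only rows written before its checkpoint, writes at the checkpoint timestamp, and the checkpoint is promoted when the statement ends, so a later statement sees the earlier one's writes but a statement never re-reads its own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model of one transaction running INSERT ... SELECT statements.
class MovingSnapshotSketch {
    static final class Row {
        final int id;
        final long writeTs;
        Row(int id, long writeTs) { this.id = id; this.writeTs = writeTs; }
    }

    private final List<Row> rows = new ArrayList<>();
    private long checkpoint;  // write timestamp of the current statement

    // firstTs is the first timestamp of the block assumed to come from the TM.
    MovingSnapshotSketch(long firstTs, int... initialIds) {
        for (int id : initialIds) rows.add(new Row(id, firstTs - 1));  // already committed
        checkpoint = firstTs;
    }

    // Models: INSERT INTO T SELECT ID + delta FROM T
    void insertSelectPlus(int delta) {
        // The scan walks a list that grows while it runs. Skipping rows written
        // at the current checkpoint is what makes the statement terminate.
        for (int i = 0; i < rows.size(); i++) {
            Row r = rows.get(i);
            if (r.writeTs < checkpoint) rows.add(new Row(r.id + delta, checkpoint));
        }
        checkpoint++;  // client promotes the checkpoint for the next statement
    }

    List<Integer> ids() {
        List<Integer> out = new ArrayList<>();
        for (Row r : rows) out.add(r.id);
        Collections.sort(out);
        return out;
    }
}
```

Starting from the single row 1, `insertSelectPlus(10)` followed by `insertSelectPlus(100)` yields the IDs 1, 11, 101, and 111: the second statement sees row 11 from the first, and neither statement feeds on its own inserts.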

Summary
Apache Phoenix is a relational database layer for HBase
Apache Phoenix needs a scalable and highly available (HA) transaction processing service
Omid is a battle-tested, highly scalable, low-latency transaction processing service
Phoenix-Omid integration provides an efficient OLTP for Hadoop
Cloud-scale use cases at Yahoo
