Percolator

Large-scale Incremental Processing Using Distributed Transactions and Notifications

Agenda
- Introduction
- Design
  - Bigtable overview
  - Transactions
  - Timestamps
  - Notifications
- Discussion
- Reference

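Introduction
- Why Percolator?
  - Target workload: data processing tasks that transform a large repository of data via small, independent mutations
  - RDBMS? Does not meet the storage and throughput requirements at this scale
  - MapReduce? Built for large batches, so a small update means reprocessing the whole repository
- What it is: a system for incrementally processing updates to a large data set
  - Used to create the Google web search index
  - Reduced the average age of documents in Google search results by 50%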

Introduction (cont.)
- Percolator features
  - Random access to a multi-PB repository
  - ACID-compliant transactions with snapshot isolation semantics
  - Observers: like triggers in a DBMS; applications are structured as a series of observers
- User scenarios: a computation that
  - is very large in some dimension
  - can be broken down into small updates
  - has strong consistency requirements

Design
- Two main abstractions
  - ACID transactions over a random-access repository
  - Observers
- Components
  - On each machine: a Percolator worker, a Bigtable tablet server, and a GFS chunk server
  - Timestamp oracle
  - Lightweight lock service

Design: Bigtable overview
- Bigtable provides single-row transactions (HBase?)
- Percolator's API closely resembles Bigtable's API
- The Percolator library largely consists of Bigtable operations wrapped in Percolator-specific computations
- Challenges: what Bigtable does not provide
  - Multirow transactions
  - The observer framework

Design: Transactions
- Cross-row, cross-table transactions with ACID snapshot-isolation semantics
  - Snapshot isolation does not guarantee serializability
- No central transaction manager; Percolator is built as a client library that accesses Bigtable
- Requirements on the lock service
  - It needs to be replicated, distributed and balanced, and it must write to a persistent data store
  - Solution: store locks in special in-memory columns in the same Bigtable that stores the data

Design: Transactions (cont.)
- The transaction's constructor asks the timestamp oracle for a start timestamp
  - The start timestamp determines the consistent snapshot seen by Get()
- Calls to Set() are buffered until commit time
- Two-phase commit
  - Phase 1: try to lock all the cells being written
  - Phase 2: obtain the commit timestamp, then for each cell release the lock and make the write visible by replacing the lock with a write record
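The steps on this slide can be sketched in a few lines of Python. This is a toy, single-process illustration and not Percolator's code: `Store`, `Transaction`, the `data`/`lock`/`write` dictionaries and the counter used as a timestamp oracle are all hypothetical names chosen for the example, and the real system performs each step as a Bigtable row transaction.

```python
import itertools


class Store:
    """Hypothetical in-memory stand-in for one Bigtable table."""

    def __init__(self):
        self.data = {}    # (key, start_ts) -> value
        self.lock = {}    # key -> (start_ts, primary_key)
        self.write = {}   # key -> [(commit_ts, start_ts)], in commit order
        self._clock = itertools.count(1)  # stand-in for the timestamp oracle

    def next_ts(self):
        return next(self._clock)


class Transaction:
    def __init__(self, store):
        self.store = store
        self.start_ts = store.next_ts()  # fixes the snapshot seen by get()
        self.writes = {}                 # set() calls are buffered here

    def set(self, key, value):
        self.writes[key] = value

    def get(self, key):
        lock = self.store.lock.get(key)
        if lock is not None and lock[0] <= self.start_ts:
            # A real reader would back off or try to clean up the lock.
            raise RuntimeError("cell is locked by a pending transaction")
        # Return the newest version committed before our start timestamp.
        for commit_ts, data_ts in reversed(self.store.write.get(key, [])):
            if commit_ts < self.start_ts:
                return self.store.data[(key, data_ts)]
        return None

    def _prewrite(self, key, value, primary):
        # Conflict if someone committed after we started, or holds a lock.
        if any(c >= self.start_ts for c, _ in self.store.write.get(key, [])):
            return False
        if key in self.store.lock:
            return False
        self.store.data[(key, self.start_ts)] = value
        self.store.lock[key] = (self.start_ts, primary)
        return True

    def commit(self):
        keys = list(self.writes)
        if not keys:
            return True
        primary = keys[0]
        locked = []
        for key in keys:  # phase 1: lock every cell being written
            if not self._prewrite(key, self.writes[key], primary):
                for k in locked:  # undo our own partial work
                    del self.store.lock[k]
                    del self.store.data[(k, self.start_ts)]
                return False
            locked.append(key)
        commit_ts = self.store.next_ts()
        for key in keys:  # phase 2: primary first, lock -> write record
            self.store.write.setdefault(key, []).append((commit_ts, self.start_ts))
            del self.store.lock[key]
        return True


store = Store()
t1 = Transaction(store); t1.set("a", 1); t1.set("b", 2); t1.commit()
t2 = Transaction(store)                       # snapshot taken after t1
t3 = Transaction(store); t3.set("a", 10); t3.commit()
snapshot_value = t2.get("a")                  # t2 still sees t1's value
latest_value = Transaction(store).get("a")    # a new snapshot sees t3's value

first, second = Transaction(store), Transaction(store)
first.set("b", 1); second.set("b", 2)
first_ok, second_ok = first.commit(), second.commit()  # write-write conflict
```

In this sketch the second of two overlapping writers to the same cell fails at prewrite, which is the write-write conflict rule of snapshot isolation.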

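Design: Transactions (cont.)
- Error recovery
  - A client can fail while its transaction is being committed, leaving locks behind
  - Lazy approach to cleanup: a transaction that runs into a leftover lock cleans it up
  - The primary lock decides whether the failed transaction committed
    - Primary lock still in place: the transaction did not commit, so roll back
    - Client failed during the second phase of commit (past the commit point): roll forward
- Lock cleanup
  - Only clean up a lock that belongs to a dead or stuck worker (liveness is tracked with Chubby)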
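A minimal sketch of the roll-back / roll-forward decision, assuming the same toy layout as a dictionary of locks and write records. The function name `resolve_stranded_lock` and the `owner_is_alive` callback (a stand-in for the Chubby-based liveness check) are hypothetical, and erasing the rolled-back data versions is left out for brevity.

```python
def resolve_stranded_lock(locks, writes, key, owner_is_alive):
    """Decide what to do with a lock found on `key`.

    locks:  key -> (start_ts, primary_key)
    writes: key -> list of (commit_ts, start_ts)
    """
    start_ts, primary = locks[key]
    if owner_is_alive(start_ts):
        return "wait"  # never clean up after a live worker

    primary_lock = locks.get(primary)
    if primary_lock is not None and primary_lock[0] == start_ts:
        # Primary is still locked: the commit point was never reached.
        del locks[primary]
        locks.pop(key, None)  # no-op when key is the primary itself
        return "rolled_back"

    # Primary lock is gone: look for the write record that replaced it.
    for commit_ts, ts in writes.get(primary, []):
        if ts == start_ts:
            writes.setdefault(key, []).append((commit_ts, start_ts))
            del locks[key]
            return "rolled_forward"

    # No lock and no write record on the primary: someone rolled it back.
    del locks[key]
    return "rolled_back"


# Case A: the client died before the commit point.
locks_a = {"a": (5, "a"), "b": (5, "a")}
writes_a = {}
outcome_a = resolve_stranded_lock(locks_a, writes_a, "b", lambda ts: False)

# Case B: the client died after committing the primary.
locks_b = {"b": (5, "a")}
writes_b = {"a": [(7, 5)]}
outcome_b = resolve_stranded_lock(locks_b, writes_b, "b", lambda ts: False)

# Case C: the owner is still alive, so the lock is left alone.
locks_c = {"b": (5, "a")}
outcome_c = resolve_stranded_lock(locks_c, {}, "b", lambda ts: True)
```

Removing the primary lock first mirrors the idea that cleanup and commit both race on a single cell, so only one of them can win.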

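Design: Timestamps
- The timestamp oracle hands out timestamps in strictly increasing order
  - Timestamp requests are batched
  - About 2 million timestamps per second from a single machine
- Guarantee: Get() returns all writes committed before the transaction's start timestamp
  - A reader with start timestamp $T_R$ sees every write with commit timestamp $T_W$ such that $T_W < T_R$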
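One way a single machine can hand out strictly increasing timestamps cheaply is to reserve a whole range at a time and serve requests from memory until the range is used up. The sketch below assumes that approach; `TimestampOracle`, `batch_size` and `persist_count` are hypothetical names, and the "durable write" is only simulated by a counter.

```python
import threading


class TimestampOracle:
    """Toy oracle: strictly increasing integers, reserved in ranges."""

    def __init__(self, batch_size=1000):
        self._mutex = threading.Lock()
        self._batch_size = batch_size
        self._next = 0            # next timestamp to hand out
        self._reserved_until = 0  # exclusive upper bound already reserved
        self.persist_count = 0    # number of simulated durable writes

    def _reserve_range(self):
        # A real oracle would durably record the new upper bound here, so
        # that after a crash it can resume above anything it handed out.
        self._reserved_until = self._next + self._batch_size
        self.persist_count += 1

    def get_timestamp(self):
        with self._mutex:
            if self._next >= self._reserved_until:
                self._reserve_range()
            ts = self._next
            self._next += 1
            return ts


oracle = TimestampOracle(batch_size=4)
stamps = [oracle.get_timestamp() for _ in range(10)]
```

With a range size of 4, ten requests cost three simulated durable writes instead of ten, which is the point of reserving in ranges.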

Design: Notifications
- An observer registers a function and a set of columns with Percolator
- Percolator scans two special columns and calls the corresponding observers
  - Ack
  - Notify
- In practice there are very few observers (about 10), and only one observer runs on a particular column
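The observer flow can be illustrated with a small in-memory table. This is a simplified sketch: `NotifyingTable` and its fields are hypothetical, the ack bookkeeping uses a per-cell version counter instead of transaction timestamps, and the scan is a plain loop instead of distributed workers.

```python
class NotifyingTable:
    """Toy table with a notify set and an ack map per cell."""

    def __init__(self):
        self.data = {}       # (row, column) -> value
        self.version = {}    # (row, column) -> number of writes so far
        self.notify = set()  # dirty cells waiting for their observer
        self.ack = {}        # (row, column) -> version last observed
        self.observers = {}  # column -> callback(row, value)

    def register(self, column, callback):
        if column in self.observers:
            raise ValueError("only one observer per column in this sketch")
        self.observers[column] = callback

    def write(self, row, column, value):
        cell = (row, column)
        self.data[cell] = value
        self.version[cell] = self.version.get(cell, 0) + 1
        if column in self.observers:
            self.notify.add(cell)  # mark the cell dirty

    def scan_once(self):
        """Run the observer for every dirty cell; return how many ran."""
        ran = 0
        for cell in sorted(self.notify):  # sorted() copies the set
            self.notify.discard(cell)
            if self.ack.get(cell) == self.version[cell]:
                continue  # this change was already observed
            self.ack[cell] = self.version[cell]
            row, column = cell
            self.observers[column](row, self.data[cell])
            ran += 1
        return ran


table = NotifyingTable()
indexed = []
# Observers chain: writing "raw" triggers a write to "parsed".
table.register("raw", lambda row, value: table.write(row, "parsed", value.upper()))
table.register("parsed", lambda row, value: indexed.append((row, value)))

table.write("doc1", "raw", "hello")
runs = []
while True:
    count = table.scan_once()
    if count == 0:
        break
    runs.append(count)
```

Each scan here runs one observer, and the second observer fires only because the first one wrote to a column that is itself observed.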

Design: Discussion
- Many RPCs per work unit
  - Around 50 Bigtable operations to process a single document
  - Solutions
    - Add conditional mutations to the Bigtable API
    - Batch operations
    - Prefetch
- All API calls are blocking
  - Rely on running thousands of threads to provide enough parallelism
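To see why a conditional mutation reduces RPCs, compare taking a lock with a separate read and write against a single check-and-write call. The table below only counts round trips; `CountingTable` and `write_if_absent` are hypothetical names for this illustration and are not the Bigtable API.

```python
class CountingTable:
    """Toy table that counts how many round trips a client makes."""

    def __init__(self):
        self.cells = {}
        self.rpcs = 0

    def read(self, key):
        self.rpcs += 1
        return self.cells.get(key)

    def write(self, key, value):
        self.rpcs += 1
        self.cells[key] = value

    def write_if_absent(self, key, value):
        # Hypothetical conditional mutation: check and write in one RPC.
        self.rpcs += 1
        if key in self.cells:
            return False
        self.cells[key] = value
        return True


def lock_two_step(table, key, owner):
    """Read the lock cell, then write it: two round trips on success."""
    if table.read(key) is not None:
        return False
    table.write(key, owner)
    return True


def lock_conditional(table, key, owner):
    """Same outcome with a single conditional mutation."""
    return table.write_if_absent(key, owner)


plain = CountingTable()
plain_ok = lock_two_step(plain, "row1:lock", "txn-1")

conditional = CountingTable()
cond_ok = lock_conditional(conditional, "row1:lock", "txn-1")
cond_again = lock_conditional(conditional, "row1:lock", "txn-2")
```

The two-step version is only safe in this toy because nothing runs concurrently; the conditional form also closes the gap between the check and the write.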

Reference
- "Large-scale Incremental Processing Using Distributed Transactions and Notifications", OSDI'10
- http://www.infoq.com/cn/news/2010/10/google-percolator

HBase Coprocessor
- Provides a framework both for distributed computation directly within the HBase server processes and for flexible, generic extension
- Observer
  - RegionObserver
  - MasterObserver
  - WALObserver
- Endpoint
