Percolator
Published in Technology
Transcript

  • 1. Large-scale Incremental Processing Using Distributed Transactions and Notifications [email_address]
  • 2. Agenda
    • Introduction
    • Design
      • Bigtable overview
      • Transaction
      • Timestamps
      • Notifications
      • Discussion
    • Reference
  • 3. Introduction
    • why Percolator?
      • data processing tasks that transform a large repository of data via small, independent mutations.
        • RDBMS?
        • MapReduce?
      • a system for incrementally processing updates to a large data set
        • create the Google web search index
        • reduce the average age of documents in Google search results by 50%
  • 4. Introduction(cont.)
    • Percolator
      • features
        • random access to a multi-PB repository
        • ACID-compliant transactions
        • snapshot isolation semantics
        • observers: like triggers in DBMS, applications are structured as a series of observers
      • user scenarios
        • computation should be very large in some dimension
        • can be broken down into small updates
        • have some strong consistency requirements
  • 5. Design
    • Two main abstractions
      • ACID transactions over a random-access repository
      • observers
    • Components
      • on every machine: a Percolator worker, a Bigtable tablet server, and a GFS chunkserver
      • timestamp oracle
      • lightweight lock service
  • 6. Design: Bigtable overview
    • Bigtable
      • single-row transactions only (similar to HBase?)
      • Percolator’s API closely resembles Bigtable’s API
      • the Percolator library largely consists of Bigtable operations wrapped in Percolator-specific computations
    • Challenges
      • multirow transactions
      • the observer framework
  • 7. Design: Transactions
    • Cross-row, cross-table transactions with ACID snapshot-isolation semantics
      • snapshot isolation is weaker than serializability (write skew is possible)
    • No central transaction manager; Percolator is built as a client library that accesses Bigtable
      • a lock service would need to be replicated, distributed, and balanced, and would need to write to a persistent data store
      • so locks are stored in special in-memory columns in the same Bigtable that stores the data
  • 8. Design: Transactions(cont.)
    • The transaction’s constructor asks the timestamp oracle for a start timestamp.
      • determines the consistent snapshot seen by Get()
      • calls to Set() are buffered until commit time.
    • Two-phase commit
      • phase 1: try to lock all the cells being written
      • phase 2: obtain the commit timestamp, then release each lock and make the write visible by replacing the lock with a write record
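The Get()/Set()/commit flow on slides 7 and 8 can be sketched as a toy, single-process model. This is a minimal illustration, not Percolator's implementation: the `Oracle`, `Store`, `Cell`, and `Transaction` classes are hypothetical names, a Python dict stands in for Bigtable, and a reader that meets a pending lock simply raises instead of waiting or cleaning up.

```python
import itertools


class Oracle:
    """Hypothetical stand-in for the timestamp oracle: strictly increasing integers."""

    def __init__(self):
        self._counter = itertools.count(1)

    def next(self):
        return next(self._counter)


class Cell:
    """One (row, column) cell with the three metadata columns used by this sketch."""

    def __init__(self):
        self.data = {}   # start_ts -> value
        self.lock = {}   # start_ts -> key of the transaction's primary cell
        self.write = {}  # commit_ts -> start_ts of the committed data


class Store:
    """In-memory dict standing in for the Bigtable that holds data and locks."""

    def __init__(self):
        self.cells = {}

    def cell(self, key):
        return self.cells.setdefault(key, Cell())


class Transaction:
    def __init__(self, store, oracle):
        self.store, self.oracle = store, oracle
        self.start_ts = oracle.next()  # fixes the snapshot seen by get()
        self.writes = {}               # set() calls are buffered until commit

    def set(self, key, value):
        self.writes[key] = value

    def get(self, key):
        cell = self.store.cell(key)
        # A lock at or below start_ts may hide a write this snapshot should see.
        if any(ts <= self.start_ts for ts in cell.lock):
            raise RuntimeError("pending lock; a real reader would wait or clean up")
        visible = [c for c in cell.write if c <= self.start_ts]
        return cell.data[cell.write[max(visible)]] if visible else None

    def _prewrite(self, key, value, primary):
        cell = self.store.cell(key)
        # Conflict: someone committed after our snapshot, or the cell is locked.
        if any(c >= self.start_ts for c in cell.write) or cell.lock:
            return False
        cell.data[self.start_ts] = value
        cell.lock[self.start_ts] = primary
        return True

    def _rollback(self):
        for key in self.writes:
            cell = self.store.cell(key)
            cell.lock.pop(self.start_ts, None)
            cell.data.pop(self.start_ts, None)

    def commit(self):
        keys = list(self.writes)
        if not keys:
            return True
        primary = keys[0]
        # Phase 1: lock every cell being written.
        for key in keys:
            if not self._prewrite(key, self.writes[key], primary):
                self._rollback()
                return False
        # Phase 2: replace each lock with a write record, primary first.
        commit_ts = self.oracle.next()
        for key in keys:
            cell = self.store.cell(key)
            cell.write[commit_ts] = self.start_ts
            del cell.lock[self.start_ts]
        return True
```

In this sketch a transaction whose snapshot predates a conflicting commit fails in phase 1 and has to be retried by the caller with a fresh start timestamp.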
  • 9. Design: Transactions(cont.)
    • Error recovery
      • client failure while the transaction is being committed
        • lazy approach to cleanup
        • deciding whether the transaction committed: check the primary lock
        • roll back
      • client failure during the second phase of commit.
        • past the commit point
        • roll forward
    • Lock cleanup
      • only clean up a lock that belongs to a dead or stuck worker (liveness is tracked through Chubby)
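The roll-back versus roll-forward decision on slide 9 can be sketched as a single function. This is an illustrative sketch under stated assumptions: `resolve_stale_lock` is a hypothetical name, each cell is a plain dict with `data`, `lock`, and `write` maps keyed by timestamp, and the `owner_alive` flag stands in for the liveness check that the slides attribute to Chubby.

```python
def resolve_stale_lock(cells, key, start_ts, owner_alive=False):
    """Resolve a lock left on `key` by the transaction that started at `start_ts`.

    cells: dict mapping key -> {"data": {...}, "lock": {...}, "write": {...}}
           data:  start_ts  -> value
           lock:  start_ts  -> key of the transaction's primary cell
           write: commit_ts -> start_ts of the committed data
    """
    if owner_alive:
        # Never clean up after a worker that is still making progress.
        return "left alone"

    cell = cells[key]
    primary = cell["lock"][start_ts]
    primary_cell = cells[primary]

    # Has the primary lock been replaced by a write record for this transaction?
    commit_ts = next(
        (c for c, s in primary_cell["write"].items() if s == start_ts), None
    )
    if commit_ts is not None:
        # Past the commit point: roll forward by finishing the commit here.
        cell["write"][commit_ts] = start_ts
        del cell["lock"][start_ts]
        return "rolled forward"

    # Not committed: roll back, erasing the primary lock first.
    for k in (primary, key):
        cells[k]["lock"].pop(start_ts, None)
        cells[k]["data"].pop(start_ts, None)
    return "rolled back"
```

Erasing the primary before the secondary matters in this sketch because the primary is the single place where the committed-or-not decision is recorded.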
  • 10. Design: Timestamps
    • Hands out timestamps in strictly increasing order.
      • batches timestamp requests
      • 2 million timestamps per second from a single machine
    • Guarantees that Get() returns all writes committed before the transaction’s start timestamp.
      • a write committed at T_W is visible to a read at T_R whenever T_W < T_R
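A strictly increasing timestamp service that hands out batches can be sketched as follows. This is a hedged illustration, not the oracle's real code: the `TimestampOracle` class, its `block_size` parameter, and the dict used as "stable storage" are all assumptions. The idea shown is to persist only a high-water mark, serve requests from memory, and jump past that mark after a restart so timestamps never go backwards.

```python
class TimestampOracle:
    """Toy oracle: persists a high-water mark, serves timestamps from memory."""

    def __init__(self, stable, block_size=1000):
        self.stable = stable          # dict standing in for stable storage
        self.block_size = block_size  # hypothetical tuning knob
        # After a restart, skip everything that might already have been issued.
        self.limit = stable.get("max_allocated", 0)
        self.next_ts = self.limit + 1

    def get_timestamps(self, n=1):
        """Return n consecutive timestamps, e.g. for one batched worker request."""
        last = self.next_ts + n - 1
        if last > self.limit:
            # Persist the new high-water mark before handing anything out.
            self.limit = last + self.block_size
            self.stable["max_allocated"] = self.limit
        first = self.next_ts
        self.next_ts = last + 1
        return list(range(first, last + 1))
```

A worker could call `get_timestamps(n)` once for several pending transactions instead of issuing one request per transaction, which is one way to read the batching bullet above.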
  • 11. Design: Notifications
    • Observer
      • registers a function and a set of columns with Percolator
      • Percolator scans two special columns and calls the corresponding observers
        • Ack
        • Notify
      • in practice there are very few observers (about 10), with one observer running on a particular column
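The observer mechanism can be sketched as a small in-memory registry. This is a simplified illustration with hypothetical names (`ObserverTable`, `register`, `run_pending`): a Python set stands in for the Notify column, observers run directly rather than inside transactions, and the Ack column is left out entirely.

```python
class ObserverTable:
    """Toy table where a write to an observed column triggers its observer."""

    def __init__(self):
        self.table = {}      # (row, column) -> value
        self.observers = {}  # column -> callback
        self.notify = set()  # dirty (row, column) pairs, standing in for Notify

    def register(self, columns, fn):
        for column in columns:
            # Mirrors the slide: one observer per column.
            if column in self.observers:
                raise ValueError(f"column {column!r} already has an observer")
            self.observers[column] = fn

    def write(self, row, column, value):
        self.table[(row, column)] = value
        if column in self.observers:
            self.notify.add((row, column))

    def run_pending(self):
        """Run observers until no notifications remain; return how many ran."""
        ran = 0
        while self.notify:
            row, column = self.notify.pop()
            self.observers[column](self, row, self.table[(row, column)])
            ran += 1
        return ran


def parse(table, row, raw):
    # Writing "parsed" notifies the next observer in the chain.
    table.write(row, "parsed", raw.upper())


def index(table, row, parsed):
    table.write(row, "indexed", len(parsed))


docs = ObserverTable()
docs.register(["raw"], parse)
docs.register(["parsed"], index)
docs.write("doc1", "raw", "hello")
ran = docs.run_pending()
```

Each observer finishes a small piece of work and writes a column that the next observer watches, which is how an application ends up structured as a series of observers.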
  • 12. Design: Discussion
    • Many RPCs per work unit
      • about 50 Bigtable operations to process a single document
      • solutions
        • Add conditional mutations in Bigtable API
        • Batch operations
        • Prefetch
    • All API calls are blocking
      • relies on running thousands of threads to provide enough parallelism
  • 13. Reference
    • “Large-scale Incremental Processing Using Distributed Transactions and Notifications”, OSDI’10
    • http://www.infoq.com/cn/news/2010/10/google-percolator
  • 14. HBase Coprocessor
    • provides a framework both for distributed computation directly within the HBase server processes and for flexible, generic extension.
      • Observer
        • RegionObserver
        • MasterObserver
        • WALObserver
      • Endpoint
  • 15. Thank you!