Summary of "YCSB" paper for "nosql summer reading in Tokyo" on Sep 15, 2010

These are the summary materials for the paper "Benchmarking Cloud Serving Systems with YCSB", presented at the nosql summer reading in Tokyo on September 15, 2010 at Gemini Mobile Technologies in Shibuya, Tokyo.

    • Benchmarking Cloud Serving Systems with YCSB by Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., Sears, R.
      Gemini Mobile Technologies, Inc.
      NOSQL Tokyo Reading Group
      (http://nosqlsummer.org/city/tokyo)
      September 15, 2010
      Tags: #ycsb #nosql
      10.9.11
      Gemini Mobile Technologies, Inc.
      1
    • Benchmarking Cloud Serving Systems with YCSB
      Authors: Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., Sears, R.
      Abstract: … We present the "Yahoo! Cloud Serving Benchmark" (YCSB) framework, with the goal of facilitating performance comparisons of the new generation of cloud data serving systems. We define a core set of benchmarks and report results for four widely used systems: Cassandra, HBase, Yahoo!'s PNUTS, and a simple sharded MySQL implementation. We also hope to foster the development of additional cloud benchmark suites that represent other classes of applications by making our benchmark tool available via open source. In this regard, a key feature of the YCSB framework/tool is that it is extensible: it supports easy definition of new workloads, in addition to making it easy to benchmark new systems.
      Appeared in: ACM Symposium on Cloud Computing, ACM, Indianapolis, IN, USA (2010)
      http://research.yahoo.com/files/ycsb.pdf
      10.9.11
      Gemini Mobile Technologies, Inc. All rights reserved.
      2
    • 1. Introduction
      Hard to compare non-relational DBs
      • Data model varies. Key-Value vs. Column-oriented vs. Document-oriented.
      • Each DB’s performance profile emphasizes different operations (writes/reads/updates).
      • Consistency model, replication, fault handling, etc. are all different.
      Goal: A standard benchmarking framework to evaluate “serving” systems that do online read/write data ops.
      YCSB (Yahoo! Cloud Serving Benchmark)
      • Workload generating client.
      • Package of standard workloads (e.g., read-heavy, scan, etc.)
      • Package of DB interface layers for Cassandra, HBase, MongoDB.
      • Extensible. Add new workloads. Add new DBs.
    • 2.1. Cloud Serving System Characteristics
      Scale-out
      To add capacity, add servers.
      Goal is constant performance/node.
      Elasticity
      Load is distributed by adding a server to a running system.
      Temporary performance decrease as data is re-distributed.
      High Availability
      System remains available in face of failures.
    • 2.2 Classifications of Systems and Tradeoffs
      Read vs. Write Performance
      Write-optimized. Log-structured systems. Append updates to commit log. Reads may need to merge update information.
      Latency vs. Durability
      Syncing each write to disk improves durability but adds latency.
      Synchronous vs. Asynchronous Replication
      Data Partitioning
      Row-based storage: A row’s data is stored contiguously on disk.
      Column storage: Different columns can be stored separately.
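
The write-optimized, log-structured tradeoff noted under "Read vs. Write Performance" can be sketched as follows. This is a minimal illustration, not how Cassandra or HBase are implemented: the class name `LogStructuredStore` and its methods are hypothetical, and a real system would add on-disk files, indexes, and compaction.

```python
class LogStructuredStore:
    """Toy log-structured store: writes append, reads merge (hypothetical sketch)."""

    def __init__(self):
        # Append-only log of (key, partial_update) entries, oldest first.
        self._log = []

    def write(self, key, fields):
        # A write never looks up the old record; it only appends to the log.
        self._log.append((key, dict(fields)))

    def read(self, key):
        # A read must merge every logged update for the key, oldest to newest,
        # so later updates overwrite earlier values of the same field.
        merged = {}
        found = False
        for logged_key, fields in self._log:
            if logged_key == key:
                merged.update(fields)
                found = True
        return merged if found else None
```

Writes cost one append regardless of history, while the cost of a read grows with the number of logged updates, which is the reason such systems periodically compact their logs.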
    • 3.1 Benchmark Tiers
      Tier 1: Performance (Latency)
      Measure latency as throughput is increased until system is saturated.
      Tier 2: Scaling
      • Scaleup. Increase the number of servers while scaling the amount of data and the offered throughput proportionally. Latency should remain constant.
      • Elastic Speedup. In running system, add more servers. Performance should improve.
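
The Tier 1 procedure (raise the offered throughput until the system saturates) can be sketched as a simple sweep. Everything here is an assumption for illustration: `sweep_until_saturated`, the `measure` callback, and the 95% tolerance are hypothetical and are not part of YCSB.

```python
def sweep_until_saturated(measure, targets, tolerance=0.95):
    """Run increasing throughput targets until the system stops keeping up.

    measure(target) is a caller-supplied function (hypothetical) that runs the
    workload at the given target ops/sec and returns (achieved_ops_per_sec,
    average_latency_ms).
    """
    results = []
    for target in targets:
        achieved, latency_ms = measure(target)
        results.append((target, achieved, latency_ms))
        # Treat the system as saturated once it delivers noticeably less
        # throughput than was requested.
        if achieved < tolerance * target:
            break
    return results
```

Plotting latency against achieved throughput from the returned tuples gives the kind of latency-versus-throughput curve that Tier 1 calls for.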
    • 4. Benchmark Workloads
      Operation Types
      Insert
      Update
      Read
      Scan
      Data size
      Number of fields (e.g., 10)
      Field length (e.g., 100 bytes)
      Request distribution
      Uniform: All items equally likely.
      Zipfian: Some records are very popular, most records are unpopular.
      Latest: Like Zipfian with most recently inserted records as the most popular
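
The three request distributions can be sketched with small key choosers. This is an illustrative sketch rather than YCSB's own generator code: the function names are hypothetical, the skew parameter `theta=0.99` is an assumed value, and the Zipfian sampler uses a plain inverse-CDF table, which is simple but only practical for modest key counts.

```python
import bisect
import itertools
import random


def choose_uniform(n, rng=random):
    """Uniform: every record index in [0, n) is equally likely."""
    return rng.randrange(n)


def make_zipfian_chooser(n, theta=0.99, rng=random):
    """Zipfian: rank r is chosen with probability proportional to 1/(r+1)**theta."""
    weights = [1.0 / (rank + 1) ** theta for rank in range(n)]
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]

    def choose():
        # Inverse-CDF sampling: find the first rank whose cumulative weight
        # exceeds a uniform draw; the min() guards against float rounding.
        idx = bisect.bisect_right(cumulative, rng.random() * total)
        return min(idx, n - 1)

    return choose


def make_latest_chooser(n, theta=0.99, rng=random):
    """Latest: Zipfian over recency, so the newest record (index n-1) is hottest."""
    zipf = make_zipfian_chooser(n, theta, rng)

    def choose():
        return n - 1 - zipf()

    return choose
```

Here record index 0 is taken to be the oldest insert and `n - 1` the newest, so the "latest" chooser simply mirrors the Zipfian ranks onto insertion order.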
    • 4.2 Core Workloads
    • 5.1 YCSB Client Architecture
      Workload Executor. Traffic generation for both “load” and “transaction” phases.
      DB Interface Layer. Custom for each DB.
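
The two phases driven by the Workload Executor can be sketched as below. This is a simplified, single-threaded illustration and not the YCSB client itself: a plain `dict` stands in for the database, and the function names, key format (`user0`, `user1`, ...), and default parameters are assumptions.

```python
import random
import time


def load_phase(store, record_count, field_count=10, field_length=100):
    """Load phase: insert the initial data set before any measurement."""
    for i in range(record_count):
        store[f"user{i}"] = {
            f"field{j}": "x" * field_length for j in range(field_count)
        }


def transaction_phase(store, operation_count, read_proportion=0.5, rng=random):
    """Transaction phase: run a read/update mix and record per-op latency."""
    keys = list(store)
    latencies = {"read": [], "update": []}
    for _ in range(operation_count):
        key = rng.choice(keys)  # uniform request distribution for simplicity
        op = "read" if rng.random() < read_proportion else "update"
        start = time.perf_counter()
        if op == "read":
            _ = store[key]
        else:
            # Overwrite one field with a new value of the same length.
            store[key]["field0"] = "y" * len(store[key]["field0"])
        latencies[op].append(time.perf_counter() - start)
    return latencies
```

For example, `transaction_phase(store, 1000, read_proportion=0.95)` approximates a read-heavy mix, and the returned lists can be averaged to get per-operation latency.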
    • 5.2 Extensibility
      YCSB package is open-source Java code.
      Workload Executor
      Modify configuration (e.g., operation mix, distribution, data size, etc.)
      Custom Java class to define workload.
      DB Interface Layer
      Implement interface (read, update, insert, delete, scan) for DB.
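
The shape of a DB interface layer can be sketched as an abstract class with the five operations named above, plus one toy backend. The real YCSB layer is written in Java; this Python analogue is only an illustration, and the class names, method signatures, and return values are assumptions rather than YCSB's actual API.

```python
from abc import ABC, abstractmethod


class DBInterface(ABC):
    """Hypothetical analogue of a per-database interface layer."""

    @abstractmethod
    def read(self, table, key): ...

    @abstractmethod
    def update(self, table, key, values): ...

    @abstractmethod
    def insert(self, table, key, values): ...

    @abstractmethod
    def delete(self, table, key): ...

    @abstractmethod
    def scan(self, table, start_key, count): ...


class InMemoryDB(DBInterface):
    """Toy backend that keeps every table in a dict, for illustration only."""

    def __init__(self):
        self._tables = {}

    def read(self, table, key):
        record = self._tables.get(table, {}).get(key)
        return dict(record) if record is not None else None

    def update(self, table, key, values):
        record = self._tables.get(table, {}).get(key)
        if record is None:
            return False
        record.update(values)
        return True

    def insert(self, table, key, values):
        self._tables.setdefault(table, {})[key] = dict(values)
        return True

    def delete(self, table, key):
        return self._tables.get(table, {}).pop(key, None) is not None

    def scan(self, table, start_key, count):
        # Return up to `count` records in key order, starting at start_key.
        rows = self._tables.get(table, {})
        keys = sorted(k for k in rows if k >= start_key)[:count]
        return [(k, dict(rows[k])) for k in keys]
```

Keeping the workload code dependent only on the abstract class is what lets a new database be benchmarked by adding one adapter class.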
    • 6. Results: Setup
      Tested 4 DBs
      Cassandra 0.5.0
      HBase 0.20.3
      PNUTS (on MySQL 5.1.24)
      MySQL (sharded) 5.1.32
      6 servers. Dual 64-bit quad-core 2.5 GHz Intel Xeon CPUs, 8GB RAM, 6-disk RAID-10 array, gigabit Ethernet.
      YCSB Client on a separate 8-core server.
      Up to 500 threads.
      Client was not the bottleneck.
      No replication
      Data is 120M 1KB records (total size: 120GB), so each server stored 20GB of data.
      Cassandra, PNUTS, MySQL configured to sync to disk. HBase was not configured to sync to disk.
      Periodic compaction operations.
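
The data-set sizing above can be checked with a few lines of arithmetic. This sketch assumes a 1KB record is 10 fields of 100 bytes (the example values from the workload slide), uses decimal gigabytes, and ignores key and storage overhead.

```python
RECORDS = 120_000_000   # 120M records
FIELDS = 10             # assumed fields per record
FIELD_BYTES = 100       # assumed bytes per field
SERVERS = 6

record_bytes = FIELDS * FIELD_BYTES          # 1,000 bytes per record
total_bytes = RECORDS * record_bytes         # whole data set
total_gb = total_bytes / 1e9                 # decimal GB
per_server_gb = total_bytes / SERVERS / 1e9  # even split across servers

print(f"record size: {record_bytes} bytes")
print(f"total: {total_gb:.0f} GB, per server: {per_server_gb:.0f} GB")
```

Under these assumptions each server holds about 20GB, which is more than its 8GB of RAM, so a large share of the data has to be served from disk.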
    • 6. Results: Read vs. Write Performance
      Cassandra and HBase had better performance on write-heavy workload.
      PNUTS and MySQL had better performance on read-heavy workload.
    • 6. Results: Scalability
      Vary number of servers from 2 to 12. Data size and request rate varied proportionally.
      HBase is erratic.
      Cassandra and PNUTS scale well.
    • 6. Results: Elasticity
      Start with 2 servers with 120GB data. Then add more servers up to 6.
      Cassandra, HBase, PNUTS were able to grow elastically.
      HBase does not repartition data until next compaction.
      PNUTS was best, most stable latency while elastically repartitioning data.
      Go from 5 to 6 servers at the 10-minute mark.
    • 7. Future Work
      Tier 3: Availability
      Tier 4: Replication
    • Further Study
      Main Site: http://research.yahoo.com/Web_Information_Management/YCSB
      Source Code: http://github.com/brianfrankcooper/YCSB
      Mailing list: http://tech.groups.yahoo.com/group/ycsb-users/