Denis Nelubin, Thumbtack ("Тамтэк")

HighLoad++ 2013

Published in: Technology, Business

  1. NoSQL Database Performance Testing, Denis Nelubin
  2. Thumbtack Technology Inc.
  3. Aerospike
     A.K.A.: Citrusleaf
     Creator: Aerospike, August 2012
     License: Proprietary, Community edition
     Category: Key-value, Complex data types + Secondary indexes (from v.3.0)
  4. Couchbase
     A.K.A.: CouchDB + Membase
     Creator: Couchbase, Inc. (CouchOne + Membase), January 2012
     License: Apache 2.0, Proprietary (Enterprise edition)
     Category: Key-value, Document + Secondary indexes (from v.2.0)
  5. Cassandra
     A.K.A.: Apache Cassandra
     Creator: Facebook, July 2008
     License: Apache 2.0
     Category: Key-value, BigTable, Column-oriented
  6. MongoDB
     A.K.A.: Mongo
     Creator: 10gen (MongoDB, Inc.), March 2010
     License: AGPL, Commercial license
     Category: Document-oriented
  7. YCSB
     A.K.A.: Yahoo! Cloud Serving Benchmark
     Creator: Yahoo! Research, June 2010
     License: Apache 2.0
     Category: NoSQL benchmark
  8. YCSB
     Data set
     ● key: "user" + 64-bit Fowler-Noll-Vo hash
     ● value: 10 fields of random data
     Load
     ● insert N records
     Run
     ● update and read on N records by the key
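The key scheme on this slide can be sketched in a few lines. This is a minimal illustration, not YCSB's own code: the slide only says "64-bit Fowler-Noll-Vo hash", so the FNV-1a variant, the little-endian byte order, and the decimal formatting of the hash are assumptions, and `record_key` is a hypothetical helper name.

```python
# Standard 64-bit FNV parameters.
FNV_OFFSET_BASIS_64 = 0xCBF29CE484222325
FNV_PRIME_64 = 0x100000001B3
MASK_64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """FNV-1a, 64-bit: XOR each byte into the hash, then multiply by the prime."""
    h = FNV_OFFSET_BASIS_64
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_64) & MASK_64  # keep the value within 64 bits
    return h


def record_key(sequence_number: int) -> str:
    """Build a key of the form 'user<hash>' from a record's sequence number.

    Assumption: the sequence number is hashed as 8 little-endian bytes and
    the hash is appended in decimal. The slide does not specify either.
    """
    return "user" + str(fnv1a_64(sequence_number.to_bytes(8, "little")))


if __name__ == "__main__":
    for n in range(3):
        print(record_key(n))
```

Hashing the sequence number spreads consecutive inserts across the key space, so a load phase does not write keys in sorted order.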
  9. YCSB
  10. YCSB
      Does NOT do, does NOT check:
      ● join
      ● secondary index
      ● where clause
      ● partial update
  11. Why YCSB?
      ● applicable to any database
      ● popular
      ● de-facto standard
  12. Hardware
      Servers: 4 * (8 * Xeon + 32GB RAM + 4 * 120GB SSD)
      Clients: 8 * (4 * i5 + 4GB RAM)
      Single client is not enough
  13. Hardware: CPU
      8 cores Xeon ≈ 4 cores i5*
      * (unproved)
  14. Hardware: Network
      1 Gbps is not enough
      1 Gbit/sec / 1 KB of data ≈ 100 000 ops/sec
      Single IO queue on single CPU is not enough
      # cat /proc/interrupts | grep eth
      90:         0 0 0 0 IR-PCI-MSI-edge eth0
      91: 275107859 0 0 0 IR-PCI-MSI-edge eth0-TxRx-0
      92: 227858040 0 0 0 IR-PCI-MSI-edge eth0-TxRx-1
      93: 242082684 0 0 0 IR-PCI-MSI-edge eth0-TxRx-2
      94: 230651008 0 0 0 IR-PCI-MSI-edge eth0-TxRx-3
      95: 217273950 0 0 0 IR-PCI-MSI-edge eth0-TxRx-4
      96: 240149262 0 0 0 IR-PCI-MSI-edge eth0-TxRx-5
      97: 194736879 0 0 0 IR-PCI-MSI-edge eth0-TxRx-6
      98: 270089080 0 0 0 IR-PCI-MSI-edge eth0-TxRx-7
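Both points on this slide can be checked with a short script. The sketch below first reproduces the bandwidth arithmetic (ignoring protocol overhead and replication traffic, which is presumably why the slide rounds down to about 100 000), then sums the per-CPU counters from `/proc/interrupts` lines like the ones shown. The function names and the two-line `SAMPLE` excerpt are illustrative assumptions, not part of the talk.

```python
def max_ops_per_sec(link_bits_per_sec: float, bytes_per_op: int) -> float:
    """Upper bound on operations per second for a link, payload bytes only."""
    return link_bits_per_sec / 8 / bytes_per_op


def per_cpu_interrupts(interrupts_text: str, device: str = "eth") -> list:
    """Sum interrupt counts per CPU for IRQ lines whose name contains `device`."""
    totals = []
    for line in interrupts_text.splitlines():
        fields = line.split()
        # IRQ rows look like "91: <count per CPU>... <type> <name>".
        if len(fields) < 2 or not fields[0].endswith(":"):
            continue
        if device not in fields[-1]:
            continue
        counts = []
        for token in fields[1:]:
            if not token.isdigit():
                break  # reached the interrupt type column
            counts.append(int(token))
        if len(totals) < len(counts):
            totals.extend([0] * (len(counts) - len(totals)))
        for cpu, count in enumerate(counts):
            totals[cpu] += count
    return totals


def busy_cpus(totals: list) -> list:
    """Indexes of CPUs that handled at least one interrupt."""
    return [cpu for cpu, count in enumerate(totals) if count > 0]


# Two rows copied from the slide; all counts sit in the first CPU column.
SAMPLE = """\
91: 275107859 0 0 0 IR-PCI-MSI-edge eth0-TxRx-0
92: 227858040 0 0 0 IR-PCI-MSI-edge eth0-TxRx-1
"""

if __name__ == "__main__":
    print(max_ops_per_sec(1e9, 1000))             # payload-only bound for 1 Gbit/s
    print(busy_cpus(per_cpu_interrupts(SAMPLE)))  # CPUs serving the NIC queues
```

If `busy_cpus` returns a single index, as it does for the slide's output, every NIC queue is being served by one core even though the card exposes eight queues.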
  15. Hardware: SSD
      Overprovisioning
      ● hdparm
      ● fdisk
      http://en.wikipedia.org/wiki/Write_amplification
  16. OS (GNU/Linux)
      ulimit
      ● nofile > 4k
      RAID (RAID 0?)
      ● mdadm
      ● lvm
      Read-ahead
      ● minimal
      http://upload.wikimedia.org/wikipedia/commons/a/a4/Gnu-linux-on-white.png
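The `nofile` recommendation can be verified from inside a process before a test run. This is a small sketch using Python's standard `resource` module (Unix only). Reading "4k" as 4096 is an assumption, and `nofile_sufficient` is a hypothetical helper name.

```python
import resource

MIN_NOFILE = 4096  # assumption: the slide's "4k" read as 4096 descriptors


def nofile_sufficient(soft_limit: int, minimum: int = MIN_NOFILE) -> bool:
    """True if the soft open-files limit is above the recommended minimum."""
    if soft_limit == resource.RLIM_INFINITY:
        return True  # unlimited always satisfies the recommendation
    return soft_limit > minimum


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft:", soft, "hard:", hard)
    if not nofile_sufficient(soft):
        print("nofile is too low for a benchmark run; raise it with ulimit -n")
```

Each client connection consumes a file descriptor, so a database server or load generator with many threads can hit a low default limit well before it saturates CPU or disk.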
  17. Test
      Data sets:
      ● RAM: 50M * 100 byte ≈ 5GB
      ● SSD: 500M * 100 byte ≈ 50GB
      ● replication factor = 2
      Workloads:
      ● Heavy Write: 50% update / 50% read
      ● Mostly Read: 5% update / 95% read
      Consistency:
      ● Sync replication
      ● Async replication
      http://kushsrivastava.files.wordpress.com/2012/11/test.gif
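The sizing and the two workload mixes above can be written down as a short calculation. This is a sketch under stated assumptions: the property names follow YCSB's core workload (`recordcount`, `fieldcount`, `fieldlength`, `readproportion`, `updateproportion`), the 10 bytes per field is inferred from "10 fields" and "100 byte" rather than stated in the talk, and the helper names are hypothetical.

```python
def data_set_bytes(records: int, record_bytes: int = 100,
                   replication_factor: int = 1) -> int:
    """Raw payload size of a data set, optionally counting every replica."""
    return records * record_bytes * replication_factor


def workload_properties(record_count: int, update_pct: int) -> dict:
    """Settings for a read/update mix, keyed by YCSB core workload names."""
    return {
        "recordcount": record_count,
        "fieldcount": 10,
        "fieldlength": 10,  # assumption: 10 fields * 10 bytes = 100 bytes
        "readproportion": (100 - update_pct) / 100,
        "updateproportion": update_pct / 100,
    }


def to_properties_text(props: dict) -> str:
    """Render the settings as key=value lines."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    ram_records, ssd_records = 50_000_000, 500_000_000
    print(data_set_bytes(ram_records) / 1e9, "GB in the RAM data set")
    print(data_set_bytes(ssd_records, replication_factor=2) / 1e9,
          "GB in the SSD data set including replicas")
    print(to_properties_text(workload_properties(ram_records, update_pct=50)))
    print(to_properties_text(workload_properties(ram_records, update_pct=5)))
```

With replication factor 2 the cluster stores twice the raw payload, so the 50GB SSD data set occupies about 100GB across the four servers before any per-record overhead added by the database.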
  18. Insert, RAM
  19. Insert, SSD
  20. Heavy Update, RAM
  21. Heavy Update, SSD
  22. Heavy Update, Latency
  23. Mostly Read, RAM
  24. Mostly Read, SSD
  25. Speed
      Insert: Couchbase*, Aerospike, Cassandra, MongoDB
      Update: Couchbase*, Aerospike, Cassandra, MongoDB
      Read: Aerospike, Couchbase*, MongoDB, Cassandra
      * in memory or on smaller data set
  26. Failover test
      ● 50%, 75%, 100% of max throughput
      ● Heavy Update
      ● 10min warmup
      ● kill -9
      ● 10min without one node
      ● service start
      ● 20min after restore
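The failover procedure can be expressed as a small schedule. The sketch below only models the timeline and the throttled load levels from the slide. The `Phase` type, the placeholder commands in angle brackets, and the example maximum throughput are hypothetical, and the code does not start or kill anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    minutes: int
    action_at_start: str  # what the operator does when the phase begins


def failover_plan() -> list:
    """Timeline of one failover run, as listed on the slide."""
    return [
        Phase("warmup", 10, ""),
        Phase("without one node", 10, "kill -9 <database process>"),
        Phase("after restore", 20, "service <database> start"),
    ]


def target_throughput(max_ops_per_sec: int, load_pct: int) -> int:
    """Throttled request rate for a run at a given share of max throughput."""
    return max_ops_per_sec * load_pct // 100


if __name__ == "__main__":
    measured_max = 100_000  # hypothetical value from an unthrottled run
    for load_pct in (50, 75, 100):
        print(f"run at {load_pct}%:",
              target_throughput(measured_max, load_pct), "ops/sec")
    minute = 0
    for phase in failover_plan():
        print(f"t+{minute:2d}min {phase.name:18s} {phase.action_at_start}")
        minute += phase.minutes
```

Running below maximum throughput matters here: at 50% load the three surviving nodes have headroom to absorb the failed node's share, while at 100% they do not. A client-side throttle, for example YCSB's `-target` option, is one way to hold the rate fixed.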
  27. Aerospike, Sync, SSD, 50%
  28. Cassandra, Async, SSD, 50%
  29. Couchbase, Async, RAM, 100%
  30. MongoDB, Async, SSD, 50%
  31. Replication
      MongoDB, Cassandra, Couchbase/Aerospike
  32. Data storage reliability
      Cassandra: archive
      MongoDB: live data
      Aerospike: fast cache, eviction
      Couchbase: cache (async only)
  33. Capacity
      Cassandra: packed archive
      MongoDB: unpacked live data
      Aerospike: indexes in RAM + SSD
      Couchbase*: metadata and cache in RAM
      * was able to take only 200M records
  34. Deployment
      Couchbase: four clicks
      Aerospike: powerful config
      Cassandra: config + config + calculator
      MongoDB: shards of replica sets
  35. Managing
      Couchbase: super-duper web console
      MongoDB: commands and docs*
      Cassandra: exists**
      Aerospike: raw***
      * use MMS (MongoDB Management Service)
      ** use DataStax products
      *** try AMC (Aerospike Monitoring Console)
  36. Unique features
      Aerospike
      ● SSD support, speed
      Couchbase
      ● good web console, easy deployment
      Cassandra
      ● writes faster than reads ;)
      MongoDB
      ● documents
  37. Troubles
      Aerospike
      ● eviction
      ● secret config options
      ● long start
      Couchbase
      ● big data
      ● strange client behaviour
      ● long start
      ● long shutdown
      http://www.spreadshirt.com/herecomes-trouble-women-s-t-shirtsC3376A9069098
  38. Troubles
      Cassandra
      ● need to think about the config ;)
      MongoDB
      ● mongos has to be restarted
      ● replica-set is too surviving ;)
      http://www.spreadshirt.com/herecomes-trouble-women-s-t-shirtsC3376A9069098
  39. When to use: Aerospike
      Big Fast Cache
      http://x-celestia-x.deviantart.com/art/I-am-the-best-Rainbow-Dash-358472521
  40. When to use: Couchbase
      In-memory Cache with Persistence
      http://zutheskunk.deviantart.com/art/MLP-Resource-Shadowbolt-Female-02-238973870
  41. When to use: Cassandra
      Big-Data Archive
      http://www.deviantart.com/art/Zecora-324988216
  42. When to use: MongoDB
      Universal DB for Web
      http://www.deviantart.com/art/Trixie-221583239
  43. Not only YCSB
      ● From scratch, inspired by YCSB
      ● More tests
        ○ Secondary indexes (cardinality, overhead)
        ○ Aggregation (average value)
        ○ Collection data types (stack, array, wide row)
  44. Thanks
      Denis Nelubin <dnelubin@gmail.com>
      Alexey Remnev <aremnev@thumbtack.net>
