Cassandra troubleshooting: out of the shadows

Benjamin Black, b@b3k.us
Introducing: This Guy
The Allegory of the Cave
Most people start troubleshooting by interpreting shadows on the wall.
Common shadows.
Paths out of the cave.
Combination of basic system tools & nodetool/JMX
I’m using RP.
My ring is very unbalanced.
WTF?
nodetool ring
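Address         Status  Load      Range                                     Ring
                                  148873535527910577765226390751398592511
10.248.54.192   Up      5.59 GB   0                                         |<--|
10.248.254.15   Up      10.58 GB  42535295865117307932921825928971026431    |   ^
10.248.135.239  Up      11.01 GB  85070591730234615865843651857942052863    v   |
10.248.223.191  Up      5.42 GB   106338239662793269832304564822427566079   |   ^
10.248.122.240  Up      5.51 GB   127605887595351923798765477786913079295   v   |
10.248.34.80    Up      5.45 GB   148873535527910577765226390751398592511   |-->|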
Autobootstrap + Automatic token assignment
Automatic token algorithm:
Assign a token that will give me half the range of the most loaded node.
32
16 16
8 8 16
8 8 8 8
4 4 8 8 8
4 4 4 4 8 8
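The sequence above can be reproduced with a small simulation. This is a minimal sketch, assuming that load is proportional to the range a node owns and measuring the ring in 32 arbitrary units; bisect_largest and ring_growth are hypothetical helper names made up for illustration, not Cassandra code.

```ruby
# Illustrative model only: each bootstrapping node takes half the range
# of the node that currently owns the largest range.
def bisect_largest(ranges)
  i = ranges.index(ranges.max)          # first node with the largest range
  half = ranges[i] / 2
  ranges[0...i] + [half, ranges[i] - half] + ranges[(i + 1)..-1]
end

# Returns the list of range sizes after each node has joined.
def ring_growth(total, nodes)
  history = [[total]]
  (nodes - 1).times { history << bisect_largest(history.last) }
  history
end

ring_growth(32, 6).each { |ranges| puts ranges.join(' ') }
```

Under this model a six-node ring ends with two nodes owning twice the range of the other four.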
nodetool move + Manual token assignment
Token range: 0 to 2**127 - 1

# Print evenly spaced tokens for a ring of the given size.
def tokens(nodes)
  0.upto(nodes - 1) do |n|
    puts n * (2**127 - 1) / nodes
  end
end
=> tokens(6)
0
28356863910078205288614550619314017621
56713727820156410577229101238628035242
85070591730234615865843651857942052863
113427455640312821154458202477256070484
141784319550391026443072753096570088105
YES:
This means you need to change tokens on most of the nodes in your cluster whenever you add a node.
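How many tokens change can be checked by comparing two balanced layouts. A minimal sketch, assuming the same even-spacing formula as tokens above and counting a node as "moved" when its old token does not appear in the new layout; RING_MAX, balanced_tokens and tokens_to_move are hypothetical names introduced here.

```ruby
RING_MAX = 2**127 - 1 # upper bound of the token range used in the slides

# Evenly spaced tokens for a ring of the given size.
def balanced_tokens(nodes)
  (0...nodes).map { |n| n * RING_MAX / nodes }
end

# Counts existing nodes whose token is absent from the new balanced layout.
def tokens_to_move(old_count, new_count)
  (balanced_tokens(old_count) - balanced_tokens(new_count)).size
end

puts tokens_to_move(6, 7)
puts tokens_to_move(6, 12)
```

Under these assumptions, growing from 6 to 7 nodes leaves only the node at token 0 in place, while doubling to 12 nodes keeps every existing token.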
Writes are fast.
Reads keep getting slower.
WTF?
iostat -x
look at %util
nodetool tpstats
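Pool Name                    Active   Pending      Completed
STREAM-STAGE                      0         0              0
RESPONSE-STAGE                    0         0         516280
ROW-READ-STAGE                    8      4096        1164326
LB-OPERATIONS                     0         0              0
MESSAGE-DESERIALIZER-POOL         1    682008        1818682
GMFD                              0         0           6467
LB-TARGET                         0         0              0
CONSISTENCY-MANAGER               0         0         661477
ROW-MUTATION-STAGE                0         0         998780
MESSAGE-STREAMING-POOL            0         0              0
LOAD-BALANCER-STAGE               0         0              0
FLUSH-SORTER-POOL                 0         0              0
MEMTABLE-POST-FLUSHER             0         0              4
FLUSH-WRITER-POOL                 0         0              4
AE-SERVICE-STAGE                  0         0              0
HINTED-HANDOFF-POOL               0         0              3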
YOU ARE OUT OF DISK BANDWIDTH
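The backlog that leads to this conclusion can be picked out of tpstats output mechanically. A minimal sketch: the sample rows reuse the ROW-READ-STAGE, MESSAGE-DESERIALIZER-POOL and ROW-MUTATION-STAGE figures from the talk's tpstats output, while backlogged_pools and its threshold of 1000 pending tasks are hypothetical choices made for illustration.

```ruby
# Header plus three rows in the four-column tpstats layout.
SAMPLE = <<~TPSTATS
  Pool Name                    Active   Pending      Completed
  ROW-READ-STAGE                    8      4096        1164326
  MESSAGE-DESERIALIZER-POOL         1    682008        1818682
  ROW-MUTATION-STAGE                0         0         998780
TPSTATS

# One pool name followed by three integer columns; the header does not match.
ROW = /\A(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\z/

# Returns [pool, pending] pairs whose pending count is at or above the
# threshold (an arbitrary cutoff, not a Cassandra recommendation).
def backlogged_pools(tpstats_text, threshold = 1000)
  tpstats_text.each_line.map { |line| ROW.match(line.strip) }
              .compact
              .map { |m| [m[1], m[3].to_i] }
              .select { |_pool, pending| pending >= threshold }
end

backlogged_pools(SAMPLE).each { |pool, pending| puts "#{pool}: #{pending} pending" }
```

In this sample the read path is queued up while ROW-MUTATION-STAGE has nothing pending, which matches "writes are fast, reads keep getting slower".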
You can:
Throttle reads at clients
Adjust memtable settings (size/ops/time)
Less frequent memtable flush
-> less frequent compaction
-> less disk bandwidth demand
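The effect of the three memtable settings on flush frequency can be estimated with back-of-envelope arithmetic. This is a rough sketch, assuming a memtable is flushed as soon as any one of its size, operation-count or age limits is reached and that the write rate is steady; flush_interval_seconds and all of its parameter names are hypothetical, not Cassandra settings, and the workload figures are made up.

```ruby
# Rough model: a memtable is flushed when the first of three limits is hit.
def flush_interval_seconds(size_mb:, max_ops:, max_minutes:, write_mb_per_s:, write_ops_per_s:)
  [
    size_mb / write_mb_per_s.to_f,     # seconds until the size limit
    max_ops / write_ops_per_s.to_f,    # seconds until the operation limit
    max_minutes * 60.0                 # seconds until the age limit
  ].min
end

workload = { write_mb_per_s: 2, write_ops_per_s: 5_000 } # made-up write rates

puts flush_interval_seconds(size_mb: 64,  max_ops: 300_000, max_minutes: 60, **workload)
puts flush_interval_seconds(size_mb: 128, max_ops: 300_000, max_minutes: 60, **workload)
```

In this example, doubling only the size limit moves the flush interval from 32 to 60 seconds rather than 64, because the operation limit becomes the one that triggers first. The three settings have to be considered together.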
Add more nodes
Add more spindles per node
Switch to SSDs
I inserted a bunch of data.
Now my nodes are flapping.
WTF?
iostat -x
look at %util
vmstat
look at swap
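The swap check can be scripted by reading the si (swap-in) and so (swap-out) columns of vmstat. A minimal sketch: the sample text below is a made-up illustration in the usual Linux vmstat layout, not output captured from a real node, and swap_activity is a hypothetical helper name.

```ruby
# Made-up sample in the usual Linux vmstat layout (illustrative numbers only).
SAMPLE = <<~VMSTAT
  procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
   r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
   2  1 524288  10240   2048 409600  350  420  1800   900  600 1200 10  5 60 25  0
VMSTAT

# Returns one { si:, so: } hash per data row, locating the columns by name
# in the header line rather than by fixed position.
def swap_activity(vmstat_text)
  lines = vmstat_text.lines.map(&:split)
  header_index = lines.index { |f| f.include?('si') && f.include?('so') }
  return [] if header_index.nil?

  header = lines[header_index]
  lines[(header_index + 1)..-1].select { |f| f.size == header.size }.map do |fields|
    row = header.zip(fields.map(&:to_i)).to_h
    { si: row['si'], so: row['so'] }
  end
end

rows = swap_activity(SAMPLE)
puts "swapping" if rows.any? { |r| r[:si] > 0 || r[:so] > 0 }
```

Sustained non-zero si/so values are the signal to look for; a one-off blip is less interesting than a steady stream.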
INFO 13:27:35,309 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
mmap() in Cassandra consumes up to 2GB.
Per segment.
NOT tracked as JVM heap.
JVM heap not locked in memory.
*See: https://issues.apache.org/jira/browse/CASSANDRA-1214
When your data set exceeds memory, swapping is likely.
Swapping can delay gossip long enough to cause a node to be marked down.
<DiskAccessMode>mmap_index_only</DiskAccessMode>   (storage-conf.xml)
or
disk_access_mode: mmap_index_only   (cassandra.yaml)
On Linux: vm.swappiness=0
INFO 13:27:35,309 DiskAccessMode is standard, indexAccessMode is mmap
Most people start troubleshooting by interpreting shadows on the wall.
You can now see the path and the sunlight outside.
YOU CAN HELP!
What things have confused you?
What problems have you solved?
What tools have you used to solve them?
GET INVOLVED!
http://wiki.apache.org/cassandra
#cassandra on freenode
Cassandra Summit 2010 - Operations & Troubleshooting Intro

Slides from my talk at the Cassandra Summit 2010 introducing some common things to troubleshoot and how to do so in Cassandra.
