TechTalk v2.0 - Performance tuning Cassandra + AWS

Published in: Engineering, Technology


Transcript

  • 1. Eddie Garcia, VP of InfoSec and Services, Gazzang, Inc.
      I/O Performance tuning for Cassandra running on AWS with Gazzang
  • 2. Today's Agenda
      • Tips and tricks to achieve high performance when running Cassandra on AWS
      • Configuration tuning for Cassandra
      • Tools to benchmark raw file system I/O
      • AWS available AMIs to boost performance
      • Stress testing on AWS i2 HVM instances
      • Configuring AWS EC2 instances with SSDs and EBS storage with PIOPS
      4/24/14 © Gazzang, Inc. -- CONFIDENTIAL -- 2
  • 3. Performance tuning
      • Tuning at every layer
          – Tune the AWS layer
          – Tune the Cassandra layer
          – Tune the file system / security layer
  • 4. Tune the AWS layer
  • 5. Tune the AWS layer
      • i2 HVM instances provide better I/O than other instance types
      • i2 instances support SSD TRIM for better SSD health and performance over time
      • Use the Amazon Linux distribution AMI, or a kernel version 3.8 or greater, for higher I/O performance
      • Use the Amazon Linux distribution AMI for its built-in SR-IOV (single root I/O virtualization) drivers, which enable higher-performance AWS Enhanced Networking when running in a VPC
  • 6. Amazon Linux AMI Instance Types and Sizes
      http://aws.amazon.com/amazon-linux-ami/
  • 7. Amazon Linux AMI Instance Types and Cost, on-demand in US East
      http://aws.amazon.com/ec2/pricing/
  • 8. Tune the Cassandra layer
  • 9. Tune the Cassandra layer
      • Follow the Cassandra best practices published by DataStax:
        http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installRecommendSettings.html
      • Put the data directory on the mounted ephemeral instance storage; avoid EBS storage for maximum I/O performance
      • IMPORTANT: You must have a backup strategy when using ephemeral storage, for example using S3 for backups
      • RAID-0 (stripe) of SSDs is supported, but Cassandra also does a great job of using all mounted drives without RAID
      • Scale by adding smaller instances rather than increasing instance size
  • 10. Tune the Cassandra layer
      • Cassandra writes immutable sstable files to disk. It then compacts multiple sstables into one larger sstable, with some cleanup occurring along the way, which also helps TRIM
      • The more OS memory the better: on read, the sstables are cached as normal memory-mapped files loaded into OS memory
      • Increasing the JVM heap size can cause performance issues for Cassandra during garbage collection ("Death by Garbage Collection")
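
The trade-off between JVM heap and OS memory on slide 10 can be made concrete with a small calculation. The sketch below assumes the default heap rule described in the comments of the stock cassandra-env.sh for the Cassandra 1.2/2.0 line, max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB)); check it against the installed version before relying on it. The 61 GiB memory size for an i2.2xlarge is also an assumption taken from the published instance specifications, not from the slides, and `default_max_heap_mb` is a name made up for this example.

```python
def default_max_heap_mb(system_memory_mb: int) -> int:
    """Assumed default heap rule: max(min(RAM/2, 1024 MB), min(RAM/4, 8192 MB))."""
    half = system_memory_mb // 2
    quarter = system_memory_mb // 4
    return max(min(half, 1024), min(quarter, 8192))


if __name__ == "__main__":
    # Assumption: an i2.2xlarge has 61 GiB of RAM (not stated in the slides).
    ram_mb = 61 * 1024
    heap_mb = default_max_heap_mb(ram_mb)
    # Whatever the heap does not take stays available to the OS page cache,
    # which is where memory-mapped sstables are cached on read.
    print(f"heap: {heap_mb} MB, remaining for the OS: {ram_mb - heap_mb} MB")
```

Under these assumptions the heap is capped at 8 GB and most of the instance memory remains available for caching sstables, which matches the advice to favor OS memory over a larger heap.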
  • 11. Tune the file system / security layer
  • 12. Tune the file system layer
      • Format the file system with ext4 vs ext3 or xfs if supported by your chosen Linux distribution
      • Use the most current Linux version for your distribution; many performance fixes are available only in newer kernels
      • Use IOZone or other file system tests before and after configuration changes to benchmark raw file I/O before loading your Cassandra data
  • 13. Tune the file security layer
      • Use block-level encryption, dedicating the entire SSD volume
      • Encrypt the cluster before loading data whenever possible
      • Use systems that support hardware encryption acceleration like Intel AES-NI: http://aws.amazon.com/ec2/instance-types
  • 14. Test and measure
  • 15. Performance Testing
      • When testing performance, reduce the number of variables that can affect the test
          – Stopping and starting a server can switch your instance to a different host with different performance
          – The time of day when you run tests can affect the performance
          – Eliminate cached in-memory data from prior tests, which may contaminate your results
          – Avoid testing on systems with unknown state and size of data
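
One way to notice an uncontrolled variable from slide 15 is to repeat the same workload several times and look at the spread between runs. The following is a minimal sketch, not part of the original talk; `time_runs` and `summarize` are hypothetical helper names, and the workload is any callable you supply (for example, one that shells out to a benchmark).

```python
import statistics
import time
from typing import Callable, Dict, List


def time_runs(workload: Callable[[], None], runs: int = 5) -> List[float]:
    """Run the same workload several times; return wall-clock seconds per run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples: List[float]) -> Dict[str, float]:
    """Median plus relative spread; a large spread hints at an uncontrolled variable."""
    median = statistics.median(samples)
    spread = (max(samples) - min(samples)) / median if median else 0.0
    return {
        "median_s": median,
        "min_s": min(samples),
        "max_s": max(samples),
        "relative_spread": spread,
    }
```

If the relative spread is large compared with the difference you are trying to measure, the comparison is probably dominated by noise such as caching, host placement, or time of day.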
  • 16. Cassandra Test Environment
      [Diagram: a Cassandra stress client and Cassandra nodes 1 to 6; storage variants EBS clear text, EBS 4K PIOPS, SSD clear text, SSD encrypted; IOZone tests and Cassandra stress tests; S3 backups]
  • 17. Test Environment Specifications
      Instance: i2.2xlarge
      AZ: us-east-1a
      AMI information: amzn-ami-hvm-2013.09.2.x86_64-ebs (ami-e9a18d80)
      Linux distribution: Amazon Linux AMI release 2013.09
      Kernel version: 3.4.73-64.112.amzn1.x86_64
      Drive layout:
          Filesystem             Size  Used Avail Use% Mounted on
          /dev/xvda1             7.9G  1.8G  6.1G  23% /            (EBS backed for tests, ephemeral is better)
          tmpfs                   30G     0   30G   0% /dev/shm
          /dev/xvdb              734G  197M  697G   1% /mount/ssd1  (Cleartext test SSD)
          /dev/mapper/encrypted  734G   36G  662G   6% /encrypted   (Encrypted test SSD)
      Cassandra stress client: m1.medium
      Cassandra cluster: 6 nodes
      DataStax Enterprise: dse-libcassandra-3.2.2-1.noarch
      Cassandra: version 1.2.12.2
      Java HotSpot(TM) 64-Bit Server VM/1.6.0_45
  • 18. IOZone SSD vs. Non-SSD
      IOZone test configuration (http://www.iozone.org/):
          time iozone -ORa -s 163840 -r 16384
          Iozone: Performance Test of File I/O
                  Version $Revision: 3.420 $
                  Compiled for 64 bit mode.
                  Build: linux-AMD64
          OPS Mode. Output is in operations per second.
          Excel chart generation enabled
          Auto Mode
          File size set to 163840 KB
          Record Size 16384 KB
          Command line used: iozone -ORa -s 163840 -r 16384
          Time Resolution = 0.000001 seconds.
          Processor cache size set to 1024 Kbytes.
          Processor cache line size set to 32 bytes.
          File stride size set to 17 * record size.
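
The sizes in the IOZone output on slide 18 are given in KB, so it can help to convert them before comparing runs. This is a small worked calculation rather than anything from the talk; `iozone_geometry` is a hypothetical helper name.

```python
def iozone_geometry(file_size_kb: int, record_size_kb: int) -> dict:
    """Convert the reported file and record sizes to MiB and count records per pass."""
    return {
        "file_mib": file_size_kb / 1024,
        "record_mib": record_size_kb / 1024,
        "records_per_pass": file_size_kb // record_size_kb,
    }


if __name__ == "__main__":
    # Values taken from the "File size set to" and "Record Size" lines of the output.
    print(iozone_geometry(163840, 16384))
```

With the values from the slide, each pass moves a 160 MiB file in ten 16 MiB records.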
  • 19. Cassandra Test Environment
      [Diagram: IOZone tests on a Cassandra node; labels on the slide: EBS Clear text, EBS 4K PIOPS, encrypted SSD, SSD Encrypted]
      Timings, in the order they appear on the slide:
          real 1m6.360s   user 0m0.084s   sys 0m0.911s
          real 0m15.223s  user 0m0.115s   sys 0m1.391s
          real 0m9.951s   user 0m0.291s   sys 0m3.595s
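
To compare the three wall-clock results on slide 19, the `real` values can be converted to seconds and expressed relative to the slowest run. The transcript does not preserve which storage configuration each timing belongs to, so the sketch below uses the placeholder names `run_a`, `run_b`, and `run_c` in slide order instead of guessing; the function names are also made up for this example.

```python
import re
from typing import Dict


def parse_real_time(text: str) -> float:
    """Convert a shell `time` value such as '1m6.360s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# 'real' values in slide order; the labels are placeholders, not storage types.
REAL_TIMES = {"run_a": "1m6.360s", "run_b": "0m15.223s", "run_c": "0m9.951s"}


def speedups(times: Dict[str, str]) -> Dict[str, float]:
    """Return how many times faster each run is than the slowest run."""
    seconds = {name: parse_real_time(value) for name, value in times.items()}
    slowest = max(seconds.values())
    return {name: slowest / value for name, value in seconds.items()}


if __name__ == "__main__":
    for name, factor in speedups(REAL_TIMES).items():
        print(f"{name}: {factor:.2f}x relative to the slowest run")
```

Because only wall-clock time is compared, the ratios say nothing about CPU cost; the `user` and `sys` columns on the slide would need a separate look for that.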
  • 20. Cassandra stress
      The cassandra-stress tool
      • A Java-based stress testing utility for benchmarking and load testing a Cassandra cluster.
      • The binary installation of the tool also includes a daemon, which in larger-scale testing can prevent potential skews in the test results by keeping the JVM warm.
      • Modes of operation:
          – Inserting: Loads test data.
          – Reading: Reads test data.
          – Indexed range slicing: Works with RandomPartitioner on indexed tables.
      http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsCStress_t.html
  • 21. Current Cassandra stress test configuration
      • Cassandra stress test command
          – <cassandra home>/tools/bin/cassandra-stress -l 3 -o insert -n 100000000 -i 1 -e ONE -c 10 -d <Cassandra Node IPs> -t 150 -f T1.csv &
      • In the stress test, stress clients 1 to 3 each target two separate Cassandra nodes; client 4 targets all Cassandra nodes.
          – Client#1 -> CAS 1, 2
          – Client#2 -> CAS 3, 4
          – Client#3 -> CAS 5, 6
          – Client#4 -> CAS 1, 2, 3, 4, 5, 6
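
The client-to-node mapping on slide 21 can be turned into one command line per stress client. This is a sketch under stated assumptions: the node addresses are hypothetical placeholders, the flags are copied from the slide, and the comma-separated form of the `-d` node list follows the legacy cassandra-stress help as best recalled, so confirm it against your version of the tool.

```python
from typing import Dict, List

# Hypothetical addresses for Cassandra nodes 1 to 6; the slides do not list them.
CASSANDRA_NODES: Dict[int, str] = {i: f"10.0.0.{10 + i}" for i in range(1, 7)}

# Mapping from the slide: clients 1 to 3 target two nodes each, client 4 targets all.
CLIENT_TARGETS: Dict[int, List[int]] = {
    1: [1, 2],
    2: [3, 4],
    3: [5, 6],
    4: [1, 2, 3, 4, 5, 6],
}


def stress_command(client: int, stress_bin: str = "cassandra-stress") -> List[str]:
    """Build the argument list for one stress client, using the flags from the slide."""
    nodes = ",".join(CASSANDRA_NODES[n] for n in CLIENT_TARGETS[client])
    return [
        stress_bin,
        "-l", "3",            # replication factor
        "-o", "insert",       # operation
        "-n", "100000000",    # number of keys
        "-i", "1",            # progress interval in seconds
        "-e", "ONE",          # consistency level
        "-c", "10",           # columns per key
        "-d", nodes,          # target nodes (assumed comma separated)
        "-t", "150",          # client threads
        "-f", "T1.csv",       # output file
    ]


if __name__ == "__main__":
    for client in sorted(CLIENT_TARGETS):
        print(f"client {client}: {' '.join(stress_command(client))}")
```

Generating the commands from one mapping keeps every client on identical settings, so the only variable between clients is the set of target nodes.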
  • 22. Cassandra Test Environment
      [Diagram: stress clients 1 to 4 driving Cassandra nodes 1 to 6; storage variants SSD clear text and SSD encrypted; Cassandra stress tests]
  • 23. Benchmark clear text vs encrypted inserts (write)
  • 24. Summary
      • Test in your environment with your data; results will vary greatly depending on OS, HW and application configurations
          – Baseline before you tune
          – Tune
          – Test after tuning
          – Measure
          – Rinse and repeat twice
      • Security and performance are not mutually exclusive; encryption can coexist with high I/O performance
      • Do your homework: configure and run tests that map to your use case
  • 25. About Gazzang
      • Headquartered in Austin, Texas
      • Focus on securing sensitive data in cloud and big data environments
      • Enable customers to meet compliance requirements like HIPAA, PCI, FIPS and FERPA
      • Satisfy internal security mandates
      • Protect valuable client information
  • 26. Gazzang is focused on data at-rest encryption
      Security in the cloud is a layered approach
      • Data in process (in application)
      • Data at rest (storage)
      • Data in transit (SSL)
      4/24/14 Gazzang - All rights reserved 2013
  • 27. and key management
      Security in the cloud is a layered approach
      • Data in process (in application)
      • Data at rest (storage)
      • Data in transit (SSL)
  • 28. Thank you!
      Gazzang, Inc.
      www.gazzang.com
      Eddie Garcia, VP of InfoSec and Services
      eddie.garcia@gazzang.com