MongoDB and server performance

  • Never forget that load on Linux is misleading: it also counts processes in uninterruptible sleep, not only processes waiting for the CPU, and that includes processes waiting for I/O without using any CPU time. It's a very weak indicator.
  • @sirgilot The graph screenshots are from the MongoDB MMS
  • It's awesome! Good job. I have a question: what is the name of the software/platform that you use for monitoring in your slides?

    Thanks

Transcript

  • 1. MongoDB And Server Performance
    Alon Horev
    Israel MongoDB user group
    August 2013
  • 2. Meta
    Alon Horev
    Twitter: @alonhorev
    Mail: alon@horev.net
    Blog: http://alon.horev.net
  • 3. Performance Considerations
    • Speed - make things fast:
      • Optimize queries
      • Optimize schema
    • Growth - things should stay fast:
      • When will queries slow down?
      • When will we run out of resources?
    • Troubleshoot - what went wrong?
  • 4. Tools Of The Trade
    • Profiling – focus on specific queries
      • Query explain
      • system.profile collection
    • Monitoring – focus on overall performance
      • Logs
      • MMS
      • Mongo commands
      • CLI tools
      • munin, nagios, etc.
  • 5. Server Resources
    • Storage: disk, RAM
    • Network
    • CPU
  • 6. Storage
    MongoDB storage stack:
    • MongoDB
    • Memory Mapped Files
    • Page Cache
    • Disk and RAM
  • 7. Storage – Disk And RAM
    • Variety of storage types: RAM, SSD, HDD
    • Different attributes: price, volatility, speed, access patterns, wear, capacity, latency
    • Most systems use a mix
  • 8. Compare
    Attribute     | RAM       | SSD      | HDD
    Volatile      | Yes       | No       | No
    Wear          | Very slow | Fast     | Slow
    Latency       | Low       | Low      | High
    Read MB/s     | 4000+     | 520      | 164
    Write MB/s    | 4000+     | 480      | 160
    Price per GB  | 5.5 USD   | 0.5 USD  | 0.05 USD
  • 9. Page Cache
    • Uses RAM to cache the disk
    • Recently accessed buffers are saved in RAM
    • Writes are flushed to disk in the background
    • Exists in all modern operating systems
    • Already extensively used by file systems
  • 10. [Diagram] A user-space process calls read/write. In kernel space, the page cache maps file + offset -> physical memory. The disk is written on flush and read on fault.
  • 11. Memory Mapped Files
    • Maps a chunk of a file to a chunk of memory
    • Memory access translates to file read/write
    • Page cache caches reads and writes
    • Reduces system calls and memory duplication
    • Reading from a page (4k) that's not in memory triggers the infamous page fault
  • 12. [Diagram] A user-space process reads or writes to memory, with no system call. In kernel space, the virtual memory manager maps a process memory segment to file + offset, and the page cache maps file + offset -> physical memory. The disk is written on flush and read on fault.
  • 13. Pros & Cons
    • Pros:
      • Simple! Abstracts away RAM and disk
    • Cons:
      • Performance not always predictable
      • Other applications can hurt mongo
      • Warm up – first query is slow
      • Can't lock documents in memory
  • 14. Disk Limits - Capacity
    • Indications: crash or read-only mode
    • Capacity can be predictable
    • db.collection.stats()
      • size
      • avgObjSize
      • indexSizes
    • You can limit: addShard accepts maxSize
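The "capacity can be predictable" point on slide 14 can be sketched as a small projection. This is a minimal illustration, not the speaker's method: `footprintBytes` and `daysUntilFull` are hypothetical helpers, the `user_1` index name and all numbers are made up, and the inputs are plain objects shaped like `db.collection.stats()` output (only `size` and `indexSizes`, both in bytes, are read). Actual on-disk usage is usually larger than `size` because of preallocation and padding, so treat the result as optimistic.

```javascript
// Hypothetical helper: bytes used by documents plus all indexes,
// read from a db.collection.stats()-shaped object.
function footprintBytes(stats) {
  const indexBytes = Object.values(stats.indexSizes).reduce((sum, n) => sum + n, 0);
  return stats.size + indexBytes;
}

// Hypothetical helper: linear projection of when the disk fills up,
// based on two samples taken `daysBetween` days apart.
function daysUntilFull(older, newer, daysBetween, diskBytes) {
  const growthPerDay = (footprintBytes(newer) - footprintBytes(older)) / daysBetween;
  if (growthPerDay <= 0) return Infinity; // not growing: no projected fill date
  return (diskBytes - footprintBytes(newer)) / growthPerDay;
}

// Illustrative numbers only, not measurements.
const GiB = 1024 ** 3;
const lastWeek = { size: 40 * GiB, indexSizes: { _id_: 3 * GiB, user_1: 5 * GiB } };
const today = { size: 47 * GiB, indexSizes: { _id_: 3 * GiB, user_1: 5 * GiB } };

console.log(daysUntilFull(lastWeek, today, 7, 100 * GiB)); // 45
```

A linear fit over two samples is the simplest possible model; more samples or a per-collection breakdown would give a better estimate.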
  • 15. Disk Limits - Throughput
    • Indications: slow queries, high lock %
    • Is the disk saturated?
      • iostat -x shows disk utilization
    • Is it Mongo?
      • iotop shows which process does most I/O
    • Disk is usually faster than mongo
    • Load often indicates lack of RAM
  • 16. [Graph annotations]
    • Many page faults = reading a lot from disk = slower queries
    • Disk is loaded for reads = disk is loaded for writes
  • 17. [Graph annotations]
    • Queues contain blocked queries waiting in line for execution
    • Write lock blocks all reads. This time slow writes impact reads!
  • 18. Offending Queries
    > db.currentOp()
    {
      "secs_running" : 9,
      "numYields" : 132,  # number of yields (a proxy for page faults)
      "lockStats" : {
        "timeLockedMicros" : {
          "r" : NumberLong(4774),  # held the read lock
          "w" : NumberLong(0) },
        "timeAcquiringMicros" : {
          "r" : NumberLong(2),  # waited for read lock
          "w" : NumberLong(0) } } }
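Slide 18 reads one operation by eye; with many operations in flight it may help to filter. The sketch below is an assumption-laden illustration: `slowOps` is a hypothetical helper, the sample data is invented, and the input is a plain object shaped like the `db.currentOp()` result (an `inprog` array whose entries carry `opid`, `ns`, `secs_running`, and `numYields`).

```javascript
// Hypothetical helper: list operations running for at least `minSecs`
// seconds, longest first, from a db.currentOp()-shaped object.
function slowOps(currentOp, minSecs) {
  return currentOp.inprog
    .filter((op) => (op.secs_running || 0) >= minSecs)
    .sort((a, b) => b.secs_running - a.secs_running)
    .map((op) => ({
      opid: op.opid,
      ns: op.ns,
      secs: op.secs_running,
      yields: op.numYields,
    }));
}

// Invented sample in the shape of db.currentOp() output.
const sampleCurrentOp = {
  inprog: [
    { opid: 101, ns: "shop.orders", secs_running: 9, numYields: 132 },
    { opid: 102, ns: "shop.users", secs_running: 0, numYields: 0 },
    { opid: 103, ns: "shop.events", secs_running: 42, numYields: 880 },
  ],
};

console.log(slowOps(sampleCurrentOp, 5).map((op) => op.opid)); // [ 103, 101 ]
```

In the mongo shell the same function could be fed `db.currentOp()` directly; the 5-second threshold is arbitrary and should reflect what counts as slow for the application.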
  • 19. CLI utilities
    • mongotop – time spent reading and writing, per collection
    • mongostat – faults, queues, locks, queries
    • mongostat and the MMS rely on db.serverStatus()
  • 20. Memory Limits
    • Indications: page faults, slow queries
    • Hard to predict: limit is application specific
    • Not all data has to be in memory
    • Not all indexes have to be in memory
    • Page faults can be acceptable
      • Example: a user searching an archive can experience degraded performance
  • 21. Working Set
    • Working set: what has to stay in RAM
    • A boundary that, once passed, hurts the application
    • Examples:
      • Documents that are often accessed
      • Parts of an index that are often traversed
      • Entire collection when a query isn't indexed
  • 22. Estimate Working Set
    > db.serverStatus({workingSet: 1})  # new in 2.4
    "workingSet" : {
      "pagesInMemory" : <num>,  # 4k pages
      "overSeconds" : <num>  # oldest page age
    }
    # if the oldest page is rather new, there is not enough RAM
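The two `workingSet` fields on slide 22 can be turned into a size and a rough verdict. This is only a sketch: `describeWorkingSet` is a hypothetical helper, the sample values are invented, and the 900-second cutoff for "rather new" is an assumption, since the slide gives no number.

```javascript
const PAGE_BYTES = 4096; // the slide states 4k pages

// Hypothetical helper: interpret the workingSet section of
// db.serverStatus({workingSet: 1}) output.
function describeWorkingSet(ws, minAgeSeconds) {
  return {
    // Pages touched during the sampling window, converted to MiB.
    estimatedMiB: (ws.pagesInMemory * PAGE_BYTES) / (1024 * 1024),
    // A young "oldest page" suggests pages are evicted quickly.
    ramLooksTight: ws.overSeconds < minAgeSeconds,
  };
}

// Invented sample values.
const sampleWorkingSet = { pagesInMemory: 262144, overSeconds: 120 };

console.log(describeWorkingSet(sampleWorkingSet, 900));
// { estimatedMiB: 1024, ramLooksTight: true }
```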
  • 23. [Graph annotations]
    • Mapped - size of the database
    • Resident - data in RAM
    • Non-mapped - internal data structures like threads and connections
  • 24. Network
    • Usually not the bottleneck
      • Could be for multi data center clusters
    • Latency hurts cross shard queries
    • Bandwidth is limited
    • CLI monitoring tools
      • nethogs – bandwidth per process
      • iftop – bandwidth per remote host
  • 25. CPU
    • Can be a bottleneck for query-heavy apps
    • Load often indicates inefficient queries
    • JavaScript (both SpiderMonkey and V8)
      • One interpreter per mongod
      • Means one core per mongod!
    • Aggregation framework is better!
      • Fast – implemented in C++
  • 26. CPU limits
    • top – understand 'load average'
      • Needed processors for a period of time
      • Three periods (minutes): 1, 5, 15
    • Example:
      • load average: 4.12, 3.79, 3.61
      • For a two processor system, it's overloaded
      • For a 16 processor system, it's underutilized
    • Find number of processors in /proc/cpuinfo
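The load-average example on slide 26 comes down to dividing each figure by the processor count. A minimal sketch, with `parseLoadAverage`, `loadPerProcessor`, and `isOverloaded` as hypothetical helper names and "above 1.0 per processor in every period" as an assumed definition of overloaded:

```javascript
// Hypothetical helper: pull the three load figures out of a line
// such as the one printed by top or uptime.
function parseLoadAverage(line) {
  const match = line.match(/load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)/);
  if (!match) throw new Error("no load average found");
  return match.slice(1, 4).map(Number);
}

// Load per processor for the 1, 5, and 15 minute periods.
function loadPerProcessor(line, processors) {
  return parseLoadAverage(line).map((load) => load / processors);
}

// Assumption: sustained load above 1.0 per processor means overloaded.
function isOverloaded(line, processors) {
  return loadPerProcessor(line, processors).every((ratio) => ratio > 1);
}

const loadLine = "load average: 4.12, 3.79, 3.61";
console.log(isOverloaded(loadLine, 2));  // true
console.log(isOverloaded(loadLine, 16)); // false
```

On Linux the load figure also includes processes blocked on I/O, so a high ratio does not by itself prove a CPU bottleneck.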
  • 27. Hardware Specification
    • No one size fits all
    • Simulate your application
      • Disk: fill the database
      • RAM + CPU: run common queries
  • 28. To Scale Or Not To Scale?
    • Will you really need a cluster?
      • It has several moving parts
      • You can do it later with no code changes
      • You will need to migrate the data
    • Sharding
      • Scale writes
      • Unlimited capacity
    • Replica sets
      • Scale reads
      • Failover
  • 29. Read/write ratio can help predict scaling requirements
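One way to obtain the read/write ratio from slide 29 is to compare two samples of the `opcounters` section of `db.serverStatus()`, whose counters are cumulative since the server started. The sketch below assumes that approach; `readWriteRatio` is a hypothetical helper, the sample numbers are invented, and counting `query` plus `getmore` as reads and `insert`, `update`, and `delete` as writes is a simplification.

```javascript
// Hypothetical helper: reads per write between two opcounters samples.
function readWriteRatio(before, after) {
  const delta = (field) => after[field] - before[field];
  const reads = delta("query") + delta("getmore");
  const writes = delta("insert") + delta("update") + delta("delete");
  return writes === 0 ? Infinity : reads / writes;
}

// Invented samples in the shape of db.serverStatus().opcounters.
const opsBefore = { insert: 100, query: 1000, update: 50, delete: 10, getmore: 200 };
const opsAfter = { insert: 200, query: 1900, update: 100, delete: 20, getmore: 300 };

console.log(readWriteRatio(opsBefore, opsAfter)); // 6.25
```

Read against slide 28, a read-heavy ratio points toward replica sets and a write-heavy one toward sharding.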