MongoDB at MapMyFitness from a DevOps Perspective


  1. MongoDB at MMF: From a DevOps Perspective (Jan 24, 2013)
  2. Introduction
       • MapMyFitness was founded in 2007
       • Offices in Denver, CO & Austin, TX (w/ associates in SF, Boston, New York, LA, and Chicago)
       • Over 13 million registered users
       • ~80 million geo-data routes (runs, rides, walks, hikes, etc)
       • Core sites, mobile apps, API, white-label (MapMyRun, MapMyRide, MapMyFitness)
  3. MMF Platform Overview
       • Python (django) & PHP (legacy API)
       • Although MySQL is the core backing db for Django, the majority of MMF data lives in various MongoDB datastores.
       • Routes datastore has ~120 million objects, currently 7TB+ of data (3 member replica set backed by an EMC SAN, 48GB RAM each)
       • Django sessions converted to using MongoDB (functional scaling example, 600M sessions stored)
       • Live Tracking system utilizes elastic replica set membership to handle load scaling for events
       • Granular API access/error logging via json to MongoDB
  4. Route & Elevation data example (Lost on the way to MongoSeattle)
  5. Implementation Patterns
       • Standard Datastore - 3 member replica set (small to med implementations)
       • Big Data implementation - sharded cluster (TB+)
       • Buffering Layer - high memory (load all data and index files into RAM)
       • Write Heavy - utilize sharding to optimize for writes
       • Read Heavy - 3+n replica set configuration for rapid read scaling (up to 12 nodes)
  6. Implementation Patterns
       • In the cloud, tune the instance type to the mongo implementation
       • On iron, plan carefully and dedicate servers completely to mongo to avoid memory map contention
       • For DR, spin up a delayed, hidden replica node (preferably in a different datacenter)
       • Aggregation framework can be used in myriad ways, including bridging the gap to SQL data warehousing via ETL.
       • Automate install patterns for rapid development, prototyping, and infrastructure scaling.
  7. Operational Automation (example of automated mongodb install via puppet)
  8. Replica Set Expansion
       • MongoDB is “replication made elegant”
       • Ridiculously simple to add additional members
       • Be sure to run InitialSync from a secondary!
           rs.add( { "host" : "livetrack_db09", "initialSync" : { "state" : 2 } } )
       • Both rs.add() and rs.remove() can be scripted and connected to monitoring systems for autoscaling
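
As a rough illustration of scripting replica set expansion, here is a minimal Python sketch of what a member-add script might do: take the current replica set configuration document, bump its version, and append a new member. The function name `add_member` and the sample configuration are assumptions made for this example, not code from the talk. The `initialSync` option is copied from the slide and belongs to the MongoDB versions of that era. The sketch only builds the document; sending it to the server (for instance with the `replSetReconfig` admin command) is left out.

```python
import copy


def add_member(config, host, options=None):
    """Return a new replica-set config document with `host` appended.

    `config` is a dict shaped like the output of rs.conf(). The input is
    not modified. `options` holds extra per-member settings.
    """
    new_config = copy.deepcopy(config)
    # A reconfig must carry a higher version number than the current config.
    new_config["version"] = config["version"] + 1
    # Give the new member the next unused _id.
    next_id = max((m["_id"] for m in config["members"]), default=-1) + 1
    member = {"_id": next_id, "host": host}
    member.update(options or {})
    new_config["members"].append(member)
    return new_config


# Hypothetical current configuration, for illustration only.
current = {
    "_id": "livetrack",
    "version": 3,
    "members": [
        {"_id": 0, "host": "livetrack_db01"},
        {"_id": 1, "host": "livetrack_db02"},
    ],
}

# Member options taken from the slide: sync the new node from a secondary.
expanded = add_member(current, "livetrack_db09", {"initialSync": {"state": 2}})
```

A monitoring trigger could call a function like this and then submit the resulting document, which keeps the scaling decision and the reconfiguration step separate.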
  9. Monitoring and Introspection
       • MMS, 10gen's cloud-based monitoring service (best available)
       • Supported by Zabbix, Nagios, Munin, Server Density, etc
       • mongostat, mongotop, REST interface, database profiler
       • Monitoring system triggers can initiate node additions, removals, service restarts, etc
       • In addition to service-level monitoring, use more advanced tests to check for and alert on query latency spikes
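
One possible shape for a query latency spike check is sketched below in Python. It compares each latency sample with the median of the samples just before it. The function name, the window size, and the threshold factor are all assumptions chosen for illustration, and the talk does not say how MMF implemented its checks.

```python
from statistics import median


def latency_spikes(samples_ms, window=5, factor=3.0):
    """Return the indexes of samples that exceed `factor` times the
    median of the `window` samples immediately preceding them."""
    spikes = []
    for i in range(window, len(samples_ms)):
        baseline = median(samples_ms[i - window:i])
        if baseline > 0 and samples_ms[i] > factor * baseline:
            spikes.append(i)
    return spikes


# Hypothetical timings (in ms) of a probe query run once per interval.
timings = [10, 12, 11, 9, 10, 95, 11, 10]
spike_indexes = latency_spikes(timings)  # only the 95 ms sample is flagged
```

Using a median rather than a mean keeps a single spike from inflating the baseline for the samples that follow it.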
  10. 10gen's MMS (the one-stop shop for mongodb metrics)
  11. Mongo in Zabbix (Mikoomi Plugins: http://code.google.com/p/mikoomi)
  12. mongostat (Very useful for real-time troubleshooting)
  13. Operational Automation (example of automated mongodb restart action)
  14. Security Considerations
       • MongoDB provides authentication support and basic permissions
       • Auth is turned off by default to allow for optimal performance
       • Always run databases in a trusted network environment
       • Lock down host based firewalls to limit access to required clients
       • Automate iptables with puppet or chef, in EC2 use security groups
  15. Network Security Automation

       ## Puppet Pattern for Mongodb network security
       class iptables::public {
         iptables::add_rule { '001 MongoDB established':
           rule => '-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT'
         }
         iptables::add_rule { '002 MongoDB':
           rule => '-A RH-Firewall-1-INPUT -i eth1 -p tcp -m tcp --dport 27017 -j ACCEPT'
         }
         iptables::add_rule { '003 MongoDB MMF Phase II Network':
           rule => '-A RH-Firewall-1-INPUT -i eth0 -s 172.16.16.0/20 -p tcp -m tcp --dport 27017 -j ACCEPT'
         }
         iptables::add_rule { '004 MongoDB MMF Cloud Network':
           rule => '-A RH-Firewall-1-INPUT -i eth0 -s 10.178.52.0/24 -p tcp -m tcp --dport 27017 -j ACCEPT'
         }
       }
  16. Security Considerations
       • Use the rule of least-privilege to allow access to environments
       • Data sensitivity should determine the extent of security measures
       • For non-sensitive data, good network security can be sufficient
       • In open environments, be sure experience matches access level
       • Lack of granular perms allows for full admin access, use discretion
  17. Maintenance
       • Far less maintenance required than traditional RDBMS systems
       • Regularly perform query profile analysis and index auditing
       • Rebuild databases to reclaim space lost due to fragmentation
       • Automate checks of log files for known red-flags
       • Regularly review data throughput rate, storage growth rate, and overall business growth graphs to inform capacity planning.
       • For HA testing, periodically step-down the primary to force failover
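
The automated log check mentioned above could look something like the following Python sketch. The two patterns, the flag names, and the sample log lines are illustrative assumptions, not a list taken from the talk. A real deployment would maintain its own set of red flags.

```python
import re

# Illustrative patterns only; tune these to the messages that matter locally.
RED_FLAGS = {
    # Operations whose logged duration is 100 ms or more.
    "slow_op": re.compile(r"\b\d{3,}ms\s*$"),
    # A secondary that has fallen too far behind to resync from the oplog.
    "stale_secondary": re.compile(r"too stale to catch up", re.IGNORECASE),
}


def scan_log(lines, patterns=RED_FLAGS):
    """Return {flag name: [1-based line numbers]} for every matching line."""
    hits = {name: [] for name in patterns}
    for number, line in enumerate(lines, start=1):
        for name, pattern in patterns.items():
            if pattern.search(line):
                hits[name].append(number)
    return hits


# Hypothetical log excerpt.
sample = [
    "Thu Jan 24 10:00:01 [conn12] query routes.route nscanned:120000 1532ms",
    "Thu Jan 24 10:00:02 [rsSync] replSet error RS102 too stale to catch up",
    "Thu Jan 24 10:00:03 [initandlisten] connection accepted",
]
report = scan_log(sample)
```

The returned line numbers can feed an alerting system directly, or be summarized into counts for a daily report.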
  18. Indexing Patterns or “Know Your App”
       • Proper indexing critical to performance at scale (monitor slow queries to catch non-performant requests)
       • MongoDB is ultimately flexible, being schemaless (mongo gives you enough rope to hang yourself, choose wisely)
       • Avoid un-indexed queries at all costs (it's the quickest way to crater your app... consider --notablescan)
       • Onus on DevOps to match application to indexes (know your query profile, never assume)
       • Shoot for covered queries wherever possible (answer can be obtained from indexes only)
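
To make the idea of a covered query concrete, here is a deliberately simplified Python check. It tests whether every field a query filters on and every field it returns is present in one index. The function name and the field names are hypothetical, and the rule ignores real-world details such as array (multikey) fields and embedded documents. It should be read as a teaching sketch, not as how MongoDB's query planner decides.

```python
def is_covered(index_fields, query_fields, projection):
    """Simplified test for whether a query can be answered from an index alone.

    index_fields: field names in the index
    query_fields: field names used in the query filter
    projection:   dict of field name -> 1 (include) or 0 (exclude)
    """
    indexed = set(index_fields)
    returned = {field for field, flag in projection.items() if flag}
    if not returned:
        # Without an inclusion projection the whole document is returned.
        return False
    if projection.get("_id", 1):
        # _id comes back by default unless it is explicitly excluded.
        returned.add("_id")
    return set(query_fields) <= indexed and returned <= indexed


# Hypothetical index on (user_id, created).
index = ["user_id", "created"]
covered = is_covered(index, ["user_id"], {"created": 1, "_id": 0})
not_covered = is_covered(index, ["user_id"], {"created": 1})
```

The second call fails the check only because `_id` is returned by default, which is a common reason a query that looks covered still has to read documents.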
  19. Capped Collections
       • Use standard capped collections for retaining a fixed amount of data. Uses a FIFO strategy for pruning. (based on data size, not number of rows)
       • TTL Collections (2.2) age out data based on a retention time configuration. (great for data retention requirements of all types)
       • Gotcha! Explicitly create the capped collection before any data is put into the system to avoid auto-creation of collection
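
The size-based FIFO pruning described above can be pictured with a small in-memory model. This Python class is only an analogy written for this example. It is not MongoDB code, and the class name and byte-string "documents" are assumptions.

```python
from collections import deque


class CappedBuffer:
    """Toy model of a capped collection: a byte budget with FIFO eviction."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.docs = deque()

    def insert(self, doc):
        if len(doc) > self.max_bytes:
            raise ValueError("document is larger than the whole collection")
        self.docs.append(doc)
        self.used += len(doc)
        # Evict the oldest documents until the size budget is met again.
        while self.used > self.max_bytes:
            self.used -= len(self.docs.popleft())


buf = CappedBuffer(max_bytes=10)
for doc in (b"aaaa", b"bbbb", b"cccc"):
    buf.insert(doc)
# The third insert pushed usage to 12 bytes, so the oldest document was dropped.
```

Note that one large insert can evict several small documents at once, which is why the retained row count is not fixed. In MongoDB itself the budget is set when the collection is created, for example with `db.createCollection(name, { capped: true, size: <bytes> })` in the mongo shell.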
  20. Lessons Learned
       • Upgrading to Mongo 2.2 with a capped collection that had been created in 1.8.4 severely impacted replication (RC: no "_id" index, FIX: add "_id" index)
       • Never start mongo when a mount point is missing or incorrectly configured. Mongo may decide to take matters into its own hands and resync itself with the replica set. Make sure your devops and your hosting provider admins are aware of this
       • Some drivers that use connection pooling can freak the freaky freak when the primary member changes (older pymongo). Kicking the application can fix, also: upgrade drivers
       • High locked % is a big red-flag, and can be caused by a large number of simultaneous dml actions (high insert rate, high update rate). Consider this in the design phase.
       • Be wary of automation that can change the state of a node during maintenance mode. Disable automation agents for reduced risk during critical administrative operations (filesystem maint, etc)
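
One way to guard against the missing mount point problem is a pre-start check in whatever script launches mongod. The Python sketch below assumes a POSIX host. The function name and the example paths are hypothetical, and it only reports whether starting looks safe; it does not start anything.

```python
import os


def safe_to_start(dbpath, mount_point):
    """Return True only if `mount_point` is really mounted and `dbpath`
    is an existing directory located on it."""
    mount_point = os.path.realpath(mount_point)
    dbpath = os.path.realpath(dbpath)
    return (
        os.path.ismount(mount_point)
        and os.path.isdir(dbpath)
        and os.path.commonpath([dbpath, mount_point]) == mount_point
    )


# Hypothetical usage in a start-up wrapper:
#   if not safe_to_start("/data/mongo/db", "/data/mongo"):
#       raise SystemExit("data volume not mounted; refusing to start mongod")
```

If the volume is not mounted, the data directory either does not exist or sits on the root filesystem, and in both cases the check returns False before the database has a chance to see an empty data path.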
  21. Thank you! chris@mapmyfitness.com
