
MongoDB Beijing Conf: Yottaa (2011.3.3)

Yottaa's MongoDB production practice, presented at the MongoDB Beijing conference, March 3, 2011.



  1. MongoDB In Production: Yottaa Practice
     XiangJun Wu, System Engineer, xwu@yottaa.com
     Yottaa Inc., 2 Canal Park 5th Floor, Cambridge MA 02141
     http://www.yottaa.com
  2. Overview
     • About Yottaa
     • Engineering challenges
     • System architecture
     • Collection design
     • Production environment
     • Lessons learned
     • Q&A
  3. What is Yottaa?
  4. We Monitor More Sites Than Anyone Else
  5. Demo
     We are recruiting! http://www.yottaa.com/about#jobs
  6. Engineering Challenges
     • We collect lots of data:
       - 27,000+ URLs monitored
       - ~300 samples per URL per day
       - Some samples are >1 MB (Firebug)
       - Missing a sample isn't a big deal
       - Over 10 kinds of metrics collected: DNS lookup, time to display,
         time to interactive, Firebug, YSlow, and so forth
     • We try to make everything real time:
       - No batch jobs; everything is displayed as it happens
       - A "Check Now" button runs tests on demand
  7. Engineering Challenges
     • Small engineering team:
       - Started with a team of 2
       - Must be agile
       - We didn't know exactly what features we'd need
       - Requirements change daily
     • Limited operations budget:
       - No full-time operations staff
       - 100% in the cloud: EC2, Voxel, Linode, Rackspace, and other providers
  8. [Architecture diagram: users hit a load balancer in front of Nginx/Passenger
     app servers (collection, reporting, data source), which route through mongos
     to multiple mongod shards. Callouts: "Sharding!", "High Concurrency",
     "Scale-Out", "Easy as Rails!"]
  9. Database Architecture
     The primary data store is broken into 5 parts (see the sketch after this list):
     • Users: user-related data.
     • Web metrics: DNS lookup, HTTP connection, Firebug data, etc., stored at
       different scales (daily, monthly) to speed up front-end reporting; raw
       data is kept for detailed queries.
     • Alerts: monitoring for performance degradation of a website/URL.
     • Summary: the most frequently read URL information; also used as a message
       queue from which workers fetch URL access tasks.
     • URL optimization logic: optimization switches such as enable CDN, enable
       compression, CSS minification, and so forth.
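     The slides don't show the actual schemas, but the split above suggests one
     collection per concern. A minimal mongo-shell sketch of what documents in
     two of these collections might look like (collection names and fields here
     are assumptions, not from the deck):

        // Hypothetical 'users' document
        db.users.insert({ email: 'alice@example.com', plan: 'pro',
                          created_at: new Date() });

        // Hypothetical 'optimizations' document: per-URL optimization switches
        db.optimizations.insert({
          url: 'www.example.com',
          cdn_enabled: true,
          compression_enabled: true,
          css_minify: false
        });

        // Flipping a switch is a single atomic update
        db.optimizations.update(
          { url: 'www.example.com' },
          { $set: { css_minify: true } }
        );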
  10. Database Architecture
      MongoDB has other use cases:
      • System metrics: CPU / memory / network
      • Application metrics: cache hits, processing speed, health
      • All log information: logstash (http://code.google.com/p/logstash/) feeds
        logs from the different components into MongoDB; logs are searched via
        Rails, with a plan to add a Sinatra interface for both log feeding and
        querying. A sketch of this follows.
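      As a rough illustration of the MongoDB side of that log pipeline (the
      document shape and collection name are assumptions; the slides don't show
      them), in mongo-shell JavaScript:

        // A log event roughly as logstash might store it (hypothetical shape)
        db.logs.insert({
          '@timestamp': new Date(),
          component: 'collector',
          level: 'ERROR',
          message: 'timeout fetching www.example.com'
        });

        // Indexing by component and time keeps searches from the Rails UI fast
        db.logs.ensureIndex({ component: 1, '@timestamp': -1 });

        // "Recent errors from the collector" style query
        db.logs.find({ component: 'collector', level: 'ERROR' })
               .sort({ '@timestamp': -1 })
               .limit(50);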
  11. Database Architecture [diagram]
  12. Thinking in rows
      Each sample is one row with columns URL, Location, Connect, First Byte,
      Last Byte, Timestamp:

        { url: 'www.google.com',
          location: 'Beijing',
          connect: 23,
          first_byte: 123,
          last_byte: 245,
          timestamp: 1234 }

        { url: 'www.google.com',
          location: 'Shanghai',
          connect: 23,
          first_byte: 123,
          last_byte: 245,
          timestamp: 2345 }
  13. Thinking in rows
      What was the average connect time for google on Friday?
      From Beijing? From Shanghai? Between 1AM and 2AM?
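      With one row per sample, each of those questions is a scan-and-average
      over raw rows. A sketch of what that looks like in the mongo shell (the
      'samples' collection name and the toy time bounds are assumptions):

        // Average connect time for one URL on one day, row at a time
        var fridayStart = 1234, fridayEnd = fridayStart + 86400;
        var sum = 0, count = 0;
        db.samples.find({
          url: 'www.google.com',
          timestamp: { $gte: fridayStart, $lt: fridayEnd }
        }).forEach(function (s) {
          sum += s.connect;
          count += 1;
        });
        print('avg connect: ' + (count ? sum / count : 'n/a'));

      Narrowing to Beijing just adds location: 'Beijing' to the query; every
      variant still walks all matching raw rows.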
  14. Thinking in rows
      Up to 100s of samples per URL per day!
      With a 30-day average query range, an "average" chart had to hit ~3,000
      rows: compute the average for day 1, day 2, day 3, and so on, then
      combine them into the result.
  15. Thinking in documents
      One document contains all data for www.google.com collected during
      9/20/2010. For each metric (e.g. last_byte) it holds the aggregate sum
      for the whole URL/time period (2312) alongside per-location sums (e.g.
      SFO: 1200, another location: 1112), so the average value overall, or from
      Beijing or Shanghai, can be read straight out of one document.
  16. More efficient charts
      1 document per URL per day: 30 days == 30 documents, so an average chart
      hits 30 documents instead of ~3,000 rows. 100x fewer.
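      Reading the chart then becomes a scan over at most 30 pre-aggregated
      documents. A sketch (the 'metrics.dailies' collection name comes from the
      next slide; the exact field layout is an assumption consistent with it):

        // 30-day average connect time from daily rollup documents
        var sum = 0, count = 0;
        db.metrics.dailies.find({
          url: 'www.google.com',
          day: { $gte: new Date(2010, 8, 3), $lte: new Date(2010, 9, 2) }
        }).forEach(function (d) {
          sum += d.connect.sum;    // per-day aggregate, maintained by $inc upserts
          count += d.connect.count;
        });
        print('30-day avg connect: ' + (count ? sum / count : 'n/a'));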
  17. Storing a sample
      One upsert atomically updates the daily document: the query picks which
      document we're updating, $inc updates both the aggregate value and the
      location-specific value, and the upsert flag creates the document if it
      doesn't already exist.

        db.metrics.dailies.update(
          { url: 'www.google.com',
            day: new Date(2010, 9, 2) },
          { $inc: { 'connect.sum': 1234,
                    'connect.count': 1,
                    'connect.bj.sum': 1234,
                    'connect.bj.count': 1 } },
          true  // upsert
        );
  18. Putting it together
      For each incoming sample...

        { url: 'www.google.com',
          location: 'Beijing',
          connect: 23,
          first_byte: 123,
          last_byte: 245,
          timestamp: 1234 }

      ...three atomic updates are made: 1) the daily data, 2) the weekly data,
      3) the monthly data. A sketch of the fan-out follows.
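      A minimal sketch of that fan-out, assuming weekly and monthly collections
      shaped like the daily one (the slides only show the daily update; the
      names 'metrics.weeklies' and 'metrics.monthlies' are assumptions):

        function recordSample(s) {
          // Same $inc pattern at three granularities; each update is atomic
          var inc = {};
          inc['connect.sum'] = s.connect;
          inc['connect.count'] = 1;
          inc['connect.' + s.location_code + '.sum'] = s.connect;
          inc['connect.' + s.location_code + '.count'] = 1;

          db.metrics.dailies.update(
            { url: s.url, day: s.day }, { $inc: inc }, true);
          db.metrics.weeklies.update(
            { url: s.url, week: s.week }, { $inc: inc }, true);
          db.metrics.monthlies.update(
            { url: s.url, month: s.month }, { $inc: inc }, true);
        }

        // Example: the Beijing sample above
        recordSample({ url: 'www.google.com', location_code: 'bj', connect: 23,
                       day: new Date(2010, 9, 2), week: new Date(2010, 8, 27),
                       month: new Date(2010, 9, 1) });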
  19. MongoDB in Production
      • EC2-based large instances: 2 CPUs, 8 GB memory
      • 4 MongoDB servers in 3 DB clusters
      • Master/slave setup in the same datacenter; one master and one slave for
        the core database
      • The entire database is backed up every day and restored to a new
        MongoDB server to verify data integrity (a sketch follows)
      • MongoDB logs are saved for slow query/ops analysis
      • After 120 days we have >500 GB of data, growing by about 5 GB/day
      • 101 reads/s, 70.96 writes/s; global lock rate 34.9%
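      The slides don't show the backup commands; with the standard tools of
      that era the daily dump-and-verify cycle would look roughly like this
      (host names and paths are placeholders):

        # Dump the entire database once a day
        mongodump --host mongo-slave.internal --out /backups/2011-03-03

        # Restore into a fresh mongod to verify the dump is actually usable
        mongorestore --host mongo-verify.internal /backups/2011-03-03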
  20. Production: Sharding
      [Diagram: a collection server spreads write load evenly across shards 1-4;
      a reporting server's reads mostly hit a single shard. Shard key: URL.]
      • Scale-out architecture: Mongo auto-sharding lets us "just add servers"
        at the Rails and DB tiers. Right now, no sharding is used.
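      When sharding is turned on, sharding the metrics collection by URL would
      look roughly like this in the mongo shell (the database name is a
      placeholder; these are the standard admin commands of that era):

        // Run against a mongos router
        db.adminCommand({ enablesharding: 'yottaa' });
        db.adminCommand({ shardcollection: 'yottaa.metrics.dailies',
                          key: { url: 1 } });

      Sharding by URL matches the access pattern on the slide: writes for many
      URLs spread across shards, while a report for one URL stays on one shard.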
  21. Production: Monitoring
      We use a RESTful API to send mongostat metrics into our own monitoring
      system, so we can watch MongoDB performance in real time.
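      mongostat itself reads these numbers from serverStatus, so a poller can
      fetch the same counters directly; a sketch in the mongo shell (these
      field names existed in serverStatus output of that era, but treat the
      exact paths as assumptions):

        // The counters behind mongostat's columns
        var s = db.serverStatus();
        printjson({
          reads:  s.opcounters.query,
          writes: s.opcounters.insert + s.opcounters.update
                  + s.opcounters['delete'],
          lock_ratio: s.globalLock.ratio  // the "global lock rate" from slide 19
        });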
  22. Lessons Learned
      • Consider collection sharding from day one.
      • Preallocate the oplog before starting MongoDB if you are using an
        ext2/ext3 file system; ext4/xfs perform better and don't need oplog
        preallocation.
      • Review all slow queries and add the proper indexes in the staging
        environment.
      • Be careful adding indexes in production: build them in the background
        or during off-peak hours (see the sketch after this list).
      • Avoid slow write operations and holding locks for too long.
      • Watch the MongoDB logs after every new deployment.
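      For the index advice, the shell of that era supported background builds
      via an option to ensureIndex; a minimal sketch (the collection and key
      are placeholders):

        // Build in the background so the build doesn't hold the write lock
        // for its whole duration (slower to build, but non-blocking)
        db.metrics.dailies.ensureIndex({ url: 1, day: 1 },
                                       { background: true });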
  23. We Are Hiring!
      MongoDB, Ruby, Web, and Java talent: http://www.yottaa.com/about#jobs
      Thank you for viewing.
