Mongodb beijingconf yottaa_3.3

Yottaa mongodb production in Mongodb Beijing 2011.3.3 conference.

Transcript

  • 1. MongoDB In Production:
    Yottaa Practice
    XiangJun Wu
    System Engineer
    xwu@yottaa.com
    Yottaa Inc.
    2 Canal Park 5th Floor
    Cambridge MA 02141
    http://www.yottaa.com
  • 2. Overview
    • About Yottaa
    • Engineering challenges
    • System architecture
    • Collection design
    • Production environment
    • Lessons learned
    • Q&A
  • 9. What is Yottaa?
  • 10. We Monitor More Sites Than Anyone Else
  • 11. Demo
    We are recruiting!
    http://www.yottaa.com/about#jobs
  • 12. Engineering Challenges
    • We collect lots of data
    • 27,000+ URLs monitored
    • ~300 samples per URL per day
    • Some samples are >1 MB (Firebug)
    • Missing a sample isn’t a big deal
    • We collect over 10 kinds of metrics: DNS lookup, time to display, time to interactive, Firebug, YSlow, and so forth
    • We try to make everything real time
    • No batch jobs; everything is displayed as it happens
    • The “Check Now” button runs tests on demand
  • 21. Engineering Challenges
    • Small engineering team
    • Started with a team of 2
    • Must be Agile
    • We didn’t know exactly what features we’d need
    • Requirements change daily
    • Limited operations budget
    • No full-time operations staff
    • 100% in the cloud: EC2, Voxel, Linode, Rackspace, and other cloud providers
  • 29. Sharding!
    [Architecture diagram: User → Load Balancer → Nginx → App Server (Passenger, “Easy as Rails!”) for collection and reporting → Mongos → multiple MongoD shards, with external data sources feeding the collection tier. Goals: high concurrency and scale-out.]
  • 30. Database Architecture
    The primary data store is broken into 5 parts:
    • Users - user-related data.
    • Web metrics - stores DNS lookup, HTTP connection, Firebug, etc. Web metrics are kept at different scales (daily, monthly) to speed up frontend reports; raw data is kept for detailed queries.
    • Alerts - monitors whether a website/URL has a performance downgrade.
    • Summary - stores the most frequently read URL information; also used as a message queue for workers to fetch URL access tasks.
    • URL optimization logic - stores optimization switches: enable CDN, enable compression, CSS minify, and so forth.
  • 35. Database Architecture
    MongoDB has other use cases:
    • System metrics - CPU/memory/network
    • Application metrics - cache hit, process speed, health
    • All log information - we use logstash
    (http://code.google.com/p/logstash/)
    to feed and store logs for the different components in MongoDB. Logs are searched via Rails; we plan to apply a Sinatra interface for both log feeding and querying.
  • 38. Database Architecture
  • 39. Thinking in rows
    Row schema: URL, Location, Connect, First Byte, Last Byte, Timestamp
    { url: 'www.google.com',
      location: 'Beijing',
      connect: 23,
      first_byte: 123,
      last_byte: 245,
      timestamp: 1234 }
    { url: 'www.google.com',
      location: 'Shanghai',
      connect: 23,
      first_byte: 123,
      last_byte: 245,
      timestamp: 2345 }
  • 40. Thinking in rows
    Row schema: URL, Location, Connect, First Byte, Last Byte, Timestamp
    What was the average connect time for google on Friday?
    From Beijing? From Shanghai? Between 1AM-2AM?
  • 41. Thinking in rows
    Up to 100s of samples per URL per day!!
    Row schema: URL, Location, Connect, First Byte, Last Byte, Timestamp
    With a 30-day average query range, an “average” chart had to hit 3000 rows.
    [Diagram: Day 1 AVG + Day 2 AVG + Day 3 AVG + … → Result]
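To make the cost concrete, here is a minimal JavaScript sketch (illustrative only, not Yottaa’s actual code) of averaging in a row-per-sample schema: every chart query has to touch every raw sample row for the URL.

```javascript
// Row-per-sample schema: one object per measurement, as on the slide.
// Averaging must scan every matching row.
function averageConnect(rows, url) {
  let sum = 0, count = 0;
  for (const row of rows) {
    if (row.url === url) {
      sum += row.connect;
      count += 1;
    }
  }
  return count === 0 ? null : sum / count;
}

// ~100 samples per day for 30 days => ~3000 rows scanned per chart.
const rows = [];
for (let day = 0; day < 30; day++) {
  for (let s = 0; s < 100; s++) {
    rows.push({ url: 'www.google.com', connect: 20 + (s % 10), timestamp: day * 86400 + s });
  }
}
console.log(rows.length);                            // 3000
console.log(averageConnect(rows, 'www.google.com')); // 24.5
```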
  • 42. Thinking in Documents
    This document contains all data for www.google.com collected during 9/20/2010.
    [Diagram of one daily document: url www.google.com, a last_byte aggregate (Sum 2312) that gives the average value for this metric for this URL/time period, and per-location aggregates (e.g. SFO, Sum 1200; Sum 1112) that give the average value from each location, such as Beijing and Shanghai.]
  • 43. More efficient charts
    1 document per URL per day: { URL, Day, <data> }
    30 days == 30 documents, so an average chart hits 30 documents - 100x fewer.
    [Diagram: Day 1 AVG + Day 2 AVG + Day 3 AVG + … → Result]
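Under the document-per-day layout, the same average reads one pre-aggregated document per day. A sketch, assuming the sum/count field shapes shown on the surrounding slides:

```javascript
// Each daily document carries running sum/count per metric,
// so a 30-day average only reads 30 documents.
function averageFromDailies(dailies) {
  let sum = 0, count = 0;
  for (const doc of dailies) {
    sum += doc.connect.sum;
    count += doc.connect.count;
  }
  return count === 0 ? null : sum / count;
}

// 30 documents instead of ~3000 raw rows.
const dailies = [];
for (let day = 1; day <= 30; day++) {
  dailies.push({ url: 'www.google.com', day: day, connect: { sum: 2450, count: 100 } });
}
console.log(dailies.length);              // 30
console.log(averageFromDailies(dailies)); // 24.5
```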
  • 44. Storing a sample
    Atomically update the document: the query selects which document we’re updating, $inc updates both the aggregate value and the location-specific value, and the upsert flag creates the document if it doesn’t already exist.
    db.metrics.dailies.update(
      { url: 'www.google.com',
        day: new Date(2010, 9, 2) },
      { '$inc': {
          'connect.sum': 1234,
          'connect.count': 1,
          'connect.bj.sum': 1234,
          'connect.bj.count': 1 } },
      true // upsert
    );
  • 45. Putting it together
    One incoming sample:
    { url: 'www.google.com',
      location: 'Beijing',
      connect: 23,
      first_byte: 123,
      last_byte: 245,
      timestamp: 1234 }
    1. Atomically update the daily data
    2. Atomically update the weekly data
    3. Atomically update the monthly data
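The three granularities can share one $inc spec built from the sample. This sketch assumes the connect.* field names from the earlier upsert slide; the weekly/monthly collection names in the comment are hypothetical:

```javascript
// Build the $inc document for one sample. The same spec is applied as an
// upsert against the daily, weekly, and monthly collections.
function incSpec(sample, locationKey) {
  return {
    'connect.sum': sample.connect,
    'connect.count': 1,
    ['connect.' + locationKey + '.sum']: sample.connect,
    ['connect.' + locationKey + '.count']: 1,
  };
}

const sample = { url: 'www.google.com', location: 'Beijing', connect: 23 };
const spec = incSpec(sample, 'bj');
console.log(spec['connect.sum']);      // 23
console.log(spec['connect.bj.count']); // 1

// In the mongo shell this would then be applied three times, e.g.:
//   db.metrics.dailies.update({url: sample.url, day: ...},     {$inc: spec}, true);
//   db.metrics.weeklies.update({url: sample.url, week: ...},   {$inc: spec}, true);
//   db.metrics.monthlies.update({url: sample.url, month: ...}, {$inc: spec}, true);
```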
  • 46. MongoDB In Production
    • EC2-based large server: 2 CPUs, 8 GB memory
    • 4 MongoDB servers in 3 DB clusters
    • Master/slave setup in the same datacenter
    • One master and one slave for the core database
    • Back up the entire database every day
    • Restore the entire data set to a new MongoDB server to verify data integrity
    • Save MongoDB logs for slow query/ops analysis
    • After 120 days, we have > 500 GB of data
    • Adding about 5 GB/day today
    • 101 reads/s, 70.96 writes/s
    • Global lock rate 34.9%
  • 57. Production: Sharding
    [Diagram: Collection Server and Reporting Server in front of Shards 1-4, sharded by URL. Write load is evenly distributed; most reads hit a single shard.]
    • Scale-out architecture: Mongo auto-sharding allows us to “just add servers” at the Rails & DB tiers. Right now, no sharding is used.
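Why sharding by URL balances writes while keeping reads local can be sketched with a toy routing function (mongos does the real routing in a deployed cluster; the hash here is purely illustrative):

```javascript
// Toy shard router: every sample for a given URL lands on the same shard,
// so a report for one URL reads from a single shard, while many distinct
// URLs spread write load across shards.
function shardFor(url, numShards) {
  let h = 0;
  for (const ch of url) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h % numShards;
}

const urls = ['www.google.com', 'www.yottaa.com', 'www.example.org', 'news.ycombinator.com'];

// Deterministic: repeated samples for one URL route identically.
console.log(shardFor('www.google.com', 4) === shardFor('www.google.com', 4)); // true

// Different URLs map into the shard range [0, numShards).
for (const u of urls) {
  console.log(u, '-> shard', shardFor(u, 4));
}
```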
  • 58. Production: Monitor
    • We apply a RESTful API to send mongostat metrics to our own monitoring system, so we can watch MongoDB performance in real time.
  • 59. Lessons Learned
    • Consider collection sharding from day one
    • Preallocate the oplog before starting MongoDB if you are using an ext3/ext2 file system; ext4/xfs performs better and doesn’t need oplog preallocation
    • Review all slow queries and add proper indexes in the staging environment
    • Be careful adding indexes in production; try to add indexes in the background or during “off” time
    • Avoid slow write operations and holding the lock too long
    • Watch MongoDB logs after a new deployment
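The background-index tip above can be written in the mongo shell like this (a config fragment; the collection and key names are assumptions, not Yottaa’s actual schema):

```javascript
// Build the index in the background so it does not block reads/writes
// on a live server (foreground index builds hold the lock).
db.metrics.dailies.ensureIndex({ url: 1, day: -1 }, { background: true });
```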
  • 64. We Are Hiring!
    MongoDB, Ruby, Web, and Java talent
    http://www.yottaa.com/about#jobs
    Thank you for viewing
