Discover MongoDB - Israel

Published in: Technology

  • “Who here runs an application using databases in production?”
    “Who has ever had an application or service outage due to database problems?”
  • “I know I have.”
    I have been up late at night, wondering why replication doesn’t work and manually failing over to a slave.
  • Since today is about “Discover MongoDB”, some of what I cover may be relevant to you and some may not.
    For the next 30 minutes or so, I will show you some examples of how MongoDB is easier to implement than many other data storage products.
  • Worked with companies from startup to enterprise, banking to web tech, and everything in between.
    From SQL Server and MySQL to Hadoop and Vertica.
  • Agility:
    For developers: really easy to get started. Download, run, use a driver, show examples, and it “just works”.
  • How many people have seen something like this before?
    This is a data model diagram of ....
    The number of tables and joins is carefully calculated and modeled, and changing that once in production is not an easy task.
  • JSON-style objects, BSON data format. Almost every modern language speaks JSON, as do many web APIs.

    "all bugs reported by michael"
    "all bugs with project website and status open"
  • Write concern:
    Fire and forget
    Wait for error
    Wait for journal sync
    Wait for fsync
    Wait for replication

    Eventual consistency:
    In reality, many applications are very useful with eventual consistency.
  • Components:
    application/driver
    mongod
    database
  • mongod - the primary MongoDB process and storage system.

    config - a mongod process used to synchronously replicate the state information of a sharded environment. In a sharded environment, config servers store the metadata of the cluster. Although a config server can run standalone, any production deployment should have exactly three config servers holding copies of the same metadata (for data safety and high availability).

    mongos - the routing and coordination process that makes the mongod nodes in the cluster look like a single system. Mongos processes have no persistent state; rather, they keep a cached copy of config server data in memory. Any changes that occur on the config servers are propagated to each mongos process. Mongos processes may be run on the shard servers themselves, but they are lightweight enough to run on each application server. Many mongos processes can run simultaneously, since they do not coordinate with one another.
  • For Operations:
    Replica set automatic failover with re-election.
    Packages for deb and rpm are maintained by 10gen; distros like FreeBSD maintain their own ports.
    The mongo shell has many admin-style helpers for setting up replica set and sharding configuration.
    In progress: Chef cookbooks, Puppet manifests, EC2 CloudFormation, Ubuntu Juju charms, etc.
  • Components:
    application/driver
    mongod
    database
    arbiter
  • mongos
  • Pre-production:
    6 month period
    unlimited servers
    engineer support & health check
  • Scalability:
    Any good data storage mechanism today has scalability, one way or another.
    Note: replica sets max out at 7 voting members.
  • mongos
  • In virtualization:
    don’t scale up as far as you can before you hit the max
  • application/driver
    mongod
      database*
      arbiter
      config
    mongos router
  • mongos
  • L1 cache: sandwich in front of you
    L2 cache: walk to the kitchen and make a sandwich
    RAM: go to the store and get ingredients, go home and make a sandwich
    Disk: start a farm, and grow the wheat and vegetables to make a sandwich

    mongod database, config: RAM, disk, CPU
    mongod arbiter: network
    mongos router: network, CPU
  • Logs:
    Talk about the different types of logs
  • RAID!!!!
    More personal experiences
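The write concern levels listed in the notes above can be sketched as a small lookup table. This is only an illustration: the level names come from the notes, while the option objects (`w`, `j`, `fsync`) are an assumed mapping onto MongoDB's write-concern options, and `writeConcernLevels` and `optionsFor` are hypothetical names, not part of any driver.

```javascript
// Hypothetical lookup table. Level names are taken from the speaker notes;
// the option objects are an ASSUMED mapping onto write-concern options.
const writeConcernLevels = [
  { level: "Fire and forget",       options: { w: 0 } },
  { level: "Wait for error",        options: { w: 1 } },
  { level: "Wait for journal sync", options: { w: 1, j: true } },
  { level: "Wait for fsync",        options: { w: 1, fsync: true } },
  { level: "Wait for replication",  options: { w: 2 } },
];

// Return the options for a named level; throw on an unknown name.
function optionsFor(level) {
  const entry = writeConcernLevels.find((e) => e.level === level);
  if (!entry) {
    throw new Error("unknown write concern level: " + level);
  }
  return entry.options;
}

console.log(optionsFor("Wait for journal sync"));
```

Each step down the table trades write latency for a stronger durability guarantee, which is why the notes pair the list with eventual consistency: many applications can stay near the top of the table.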
  • Discover MongoDB - Israel

    1. Technical Considerations
       Michael A. Fiedler
    2. Who uses MongoDB?
    3. (no slide text)
    4. Agenda
       • Why MongoDB is easier to...
         • Develop
         • Operate
         • Scale
    5. $ whoami
       { name: "Michael A. Fiedler", desc: ["Systems Engineer", "SysAdmin", "DevOps", "that guy"], hist: "Building tech platforms 16+ years", twitter: "@mikefiedler", github: "@miketheman", website: "http://www.miketheman.net" }
    6. Why MongoDB?
       • Easier for developers
         • Running within minutes
         • MANY drivers!
    7. Why MongoDB?
       • Easier for developers
         • Running within minutes
         • MANY drivers!
    8. RDBMS Data Model
    9. Easier to... develop
       updated_bug = { _id: "3004", desc: "web page wrong color", status: "resolved", reported_by: "michael", assigned_to: "ben", verified_by: "amelia", product: "website", tags: ["web", "ux", "css"] }
       db.bugs.save(updated_bug)
    10. Development concepts
        • JSON-like
        • Flexible data model
        • Data control at app layer
          • Write Concern
    11. Typical development model
        [diagram: app; Single Node: DB]
    12. Easier to... get help
        • Open Source
        • Rich community, examples
        • 10gen offers free community support!
          • http://groups.google.com/group/mongodb-user
          • IRC: #mongodb@freenode.net
    13. Why MongoDB?
        • Easier for operations
          • minimal skill set required
          • scaling within minutes
          • automatic replica failover
    14. Easier to... operate
        • Packages, init scripts
        • Shell helpers for admin functions
        • Docs: Admin Zone
        • Future: cookbooks and manifests
    15. Adding fault tolerance
        [diagram: app; Replica Set: db, db, DB, arb]
    16. Adding more front-end servers
        [diagram: app ×3; Replica Set: db, DB, db]
    17. Easier to... get more help
        • 10gen offers:
          • consulting
          • training
          • commercial support
          • pre-production subscription (new!!)
    18. Why MongoDB?
        • Easier for growth
          • grow as you need to
          • plan your capacity
          • open source
    19. Easier to... scale
        • Distribute reads, eventual consistency
        • Shard databases across multiple servers
          • Replicate sharded databases for HA
        • Automatic (re)balancing of data
    20. Slave reads, eventual consistency
        [diagram: app ×3; Replica Set, slaveOk: db, DB, db]
    21. When should I scale?
        • Metrics:
          • Business Performance
          • Page faults?
        • More reads or writes? Access patterns?
          • Find the bottleneck!
        • Shard when needed
        • Do not wait until 75-80% capacity to scale
    22. Shard the data!
        [diagram: app ×3; Sharded Replicas: mongos ×3, DB ×3, db ×6]
    23. Shards can be in distinct locations
        [diagram: app ×3; Sharded Replicas: mongos ×3; dc1: DB ×3; dc2: db ×3; dc3: db ×3]
    24. Easier to... plan for resources
        • Database size vs. working set vs. indexes only
    25. Easier to... see what’s going on
        • Operational (or “run”) log
        • free, iostat, vmstat, sar, dstat
        • mongostat
        • plugins for munin, ganglia
        • MMS - FREE! https://mms.10gen.com
    26. Things to consider
        • More RAM & IOPS
        • Write buffering w/o battery backup - bad!
        • Config servers are not a replica set
        • Sharded cluster: use balancer
        • Choose shard key carefully
        • Test your backups
    27. Online resources
        • http://mongodb.org
        • http://www.10gen.com
        • Bug Tracking: https://jira.mongodb.org
        • Book: MongoDB in Action, Kyle Banker
        • Slides: Deployment Tips, Jared Rosoff
        • Video: Schema Design, Eliot Horowitz
        • Many, many more on 10gen site
    28. Why MongoDB?
        • It’s easier to:
          • develop
          • operate
          • scale
    29. Why MongoDB?
        • It’s easier to:
          • develop
          • operate
          • scale
        • No, really.
    30. Why MongoDB?
        • It’s easier to:
          • develop
          • operate
          • scale
        • No, really.
        • Try it, I dare you. http://try.mongodb.org/
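
The two example queries from the speaker notes ("all bugs reported by michael" and "all bugs with project website and status open") can be sketched against documents shaped like the bug on slide 9. This is a minimal in-memory stand-in, not the MongoDB driver: the `find` helper here is hypothetical and supports only simple equality matching, bugs "3005" and "3006" are invented sample data, and it assumes the notes' "project" refers to the slide's `product` field. The comments show what the equivalent mongo shell calls would likely look like.

```javascript
// Sample documents shaped like the bug on slide 9. Only "3004" comes from
// the slides; "3005" and "3006" are hypothetical sample data.
const bugs = [
  { _id: "3004", desc: "web page wrong color", status: "resolved",
    reported_by: "michael", assigned_to: "ben", verified_by: "amelia",
    product: "website", tags: ["web", "ux", "css"] },
  { _id: "3005", desc: "login button misaligned", status: "open",
    reported_by: "michael", product: "website", tags: ["web", "css"] },
  { _id: "3006", desc: "export times out", status: "open",
    reported_by: "amelia", product: "reports", tags: ["backend"] },
];

// Minimal equality matcher: a document matches when every field in the
// query strictly equals the same field in the document (no operators).
function find(docs, query) {
  return docs.filter((doc) =>
    Object.keys(query).every((key) => doc[key] === query[key]));
}

// "all bugs reported by michael"
// mongo shell form: db.bugs.find({ reported_by: "michael" })
const byMichael = find(bugs, { reported_by: "michael" });

// "all bugs with project website and status open"
// mongo shell form: db.bugs.find({ product: "website", status: "open" })
const openWebsite = find(bugs, { product: "website", status: "open" });

console.log(byMichael.map((b) => b._id).join(", "));   // 3004, 3005
console.log(openWebsite.map((b) => b._id).join(", ")); // 3005
```

The query is itself a JSON-style object with the same field names as the stored document, which is the point the deck makes about the flexible data model: there is no join or separate schema to consult before filtering.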
