Techorama - Evolvable Application Development with MongoDB


Introduction to MongoDB for .NET developers.

Published in: Software

  1. 1. Evolvable Application Development with MongoDB for .NET developers. Gerd Teniers, Bart Wullems
  2. 2. WARNING – This session is rated as a ‘Grandma session’ (=Level 200)
  3. 3. 3 goals of this presentation. When you leave this presentation you should have learned: how easy it is to get started using MongoDB; how using MongoDB changes the way you design and build your applications; how MongoDB’s flexibility supports evolutionary design; that giving speakers beer before a session is never a good idea.
  4. 4. What is not cool? White socks & sandals
  5. 5. What is not cool? Dancing like Miley Cyrus
  6. 6. What is not cool? Relational databases
  7. 7. What is cool? Short pants and very large socks
  8. 8. What is cool? Dancing like Psy
  9. 9. What is cool? NO-SQL (=Not Only SQL)
  10. 10. ThoughtWorks Technology Radar
  11. 11. Entity Framework 7 will support No-SQL
  12. 12. Gartner
  13. 13. What is MongoDB?
  14. 14. MongoDB (from "HuMongous"). General-purpose database. Document-oriented database using JSON document syntax. Features: flexibility, power, scaling, ease of use, built-in JavaScript. Users: Craigslist, eBay, Foursquare, SourceForge, and The New York Times.
  15. 15. High performance. Written in C++. Extensive use of memory-mapped files, i.e. read-through/write-through memory caching. Runs nearly everywhere. Data serialized as BSON (fast parsing). Full support for primary and secondary indexes. Document model = less work.
  16. 16. MongoDB Database Architecture: Document { _id: ObjectId("5099803df3f4948bd2f98391"), name: { first: "Alan", last: "Turing" }, birth: new Date('Jun 23, 1912'), death: new Date('Jun 07, 1954'), contribs: [ "Turing machine", "Turing test", "Turingery" ], views : NumberLong(1250000) }
  17. 17. MongoDB Database Architecture: Collection. A logical group of documents. Documents may or may not share the same keys. The schema is dynamic/application maintained.
  18. 18. Why should I use it? (Or how do I convince my boss?) Developer productivity: avoid ORM pain, no mapping needed. Performance (again): scaling out is easy (or at least easier), optimized for reads. Flexibility: dynamic schema.
  19. 19. How to run it? Exe Windows service Azure 3rd party commercial hosting
  20. 20. How to talk to it? Mongo shell Official and non official drivers >12 languages supported
  21. 21. DEMO 1 - PROTOTYPING
  22. 22. Schema design
  23. 23. The first step in any application is to determine your domain/entities
  24. 24. In a relational based app We would start by doing schema design
  25. 25. In a MongoDB based app We start building our app and let the schema evolve
  26. 26. Comparison. Relational: Album (id, artistid, title); Track (no, name, unitPrice, popularity); Artist (id, name). Document db: Album (_id, title, artist, tracks[] with _id, name).
  27. 27. Modeling
  28. 28. Modeling. Start from application-specific queries: “What questions do I have?” vs “What answers”. “Data like the application wants it”. Base parent documents on the most common usage: what do I want returned?
  29. 29. Modeling: embedding vs linking vs hybrid. Album (_id, artist (_id, name), cover); Artist (_id, name, photo).
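The three modeling options can be illustrated with plain JavaScript objects standing in for MongoDB documents. This is only a sketch: the sample values, the `artistId` field name, and the helper functions are assumptions made for illustration, and the hybrid shape is one reading of the slide (the album carries a small copy of the artist, while the full artist document lives elsewhere).

```javascript
// Plain objects standing in for documents; no MongoDB driver is involved.

// Linking: the album stores only a reference; the artist has its own collection.
const artists = [{ _id: 1, name: "Miles Davis", photo: "miles.png" }];
const linkedAlbum = { _id: 10, title: "Kind of Blue", artistId: 1 };

// Embedding: the whole artist document is nested inside the album.
const embeddedAlbum = {
  _id: 10,
  title: "Kind of Blue",
  artist: { _id: 1, name: "Miles Davis", photo: "miles.png" },
};

// Hybrid: embed only the fields the album view needs (_id, name);
// the photo stays in the separate artist document.
const hybridAlbum = {
  _id: 10,
  title: "Kind of Blue",
  cover: "kind-of-blue.png",
  artist: { _id: 1, name: "Miles Davis" },
};

// With linking, the application does the "join" itself via a second lookup.
function artistNameLinked(album, artistCollection) {
  const artist = artistCollection.find((a) => a._id === album.artistId);
  return artist ? artist.name : null;
}

// With embedding or the hybrid form, the name is already in the album.
function artistNameEmbedded(album) {
  return album.artist.name;
}
```

The trade-off is the usual one: the embedded copy saves a lookup on reads, but a renamed artist has to be updated in every album that embeds the name.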
  30. 30. Product: single collection inheritance. Relational: Product (_id, price); Book (author, title); Album (artist, title); Jeans (size, color). Document db: (_id, price, author, title); (_id, price, size, color).
  31. 31. Product: single collection inheritance. Relational: Product (_id, price); Book (author, title); Album (artist, title); Jeans (size, color). Document db: (_type: Book, _id, price, author, title); (_type: Jeans, _id, price, size, color).
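A minimal sketch of how an application might use the `_type` discriminator, with a plain array standing in for the single product collection. The sample values and the `describe` helper are hypothetical; only the field names come from the slide.

```javascript
// One "collection" holding documents of different shapes, told apart by _type.
const products = [
  { _id: 1, _type: "Book", price: 12, author: "A. Author", title: "Some Book" },
  { _id: 2, _type: "Jeans", price: 40, size: 32, color: "blue" },
  { _id: 3, _type: "Album", price: 10, artist: "Some Artist", title: "Some Album" },
];

// Type-specific fields are only read after checking the discriminator.
function describe(product) {
  switch (product._type) {
    case "Book":
      return `${product.title} by ${product.author}`;
    case "Album":
      return `${product.title} by ${product.artist}`;
    case "Jeans":
      return `${product.color} jeans, size ${product.size}`;
    default:
      return `unknown product ${product._id}`;
  }
}

// Shared fields (_id, price) work across all subtypes without any check.
const totalPrice = products.reduce((sum, p) => sum + p.price, 0);

// Selecting one subtype is a simple filter on the discriminator.
const books = products.filter((p) => p._type === "Book");
```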
  32. 32. One-to-many. Embedded array / array keys: some queries get harder; you can index arrays! Normalized approach: more flexibility; a lot less performance. BlogPost - _id - content - tags: ["foo", "bar"] - comments: ["id1", "id2"]
  33. 33. Demo 2 – MODELING
  34. 34. CRUD
  35. 35. CRUD operations Create: insert, save Read: find, findOne Update: update, save Delete: remove, drop
  36. 36. ACID transactions. No support for multi-document transactions (commit/rollback). Atomic operations at the document level: multiple actions inside the same document, including embedded documents. By keeping transaction support extremely simple, MongoDB can provide greater performance, especially for partitioned or replicated systems.
  37. 37. Demo 3 – CRUD
  38. 38. GridFS
  39. 39. Storing binary documents. Although MongoDB is a document database, it’s not good for documents :-S Document != .PNG & .PDF files. Document size is limited: the max document size is 16MB; the recommended document size is <250KB. The solution is GridFS, a mechanism for storing large binary files in MongoDB. It stores metadata in a single document inside the fs.files collection, splits files into chunks, and stores them inside the fs.chunks collection. The GridFS implementation is handled completely by the client driver.
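The split into one metadata document plus several chunk documents can be sketched in plain Node.js, without a driver. The field names (`filename`, `length`, `chunkSize`, `files_id`, `n`, `data`) follow the GridFS convention; the function names and the tiny chunk size are assumptions for illustration, and a real driver does this work for you.

```javascript
// Toy sketch of the GridFS layout: one fs.files-style document
// plus N fs.chunks-style documents that point back to it.
function toGridFsDocuments(fileId, filename, data, chunkSize) {
  const fileDoc = { _id: fileId, filename, length: data.length, chunkSize };
  const chunkDocs = [];
  for (let offset = 0, n = 0; offset < data.length; offset += chunkSize, n += 1) {
    // n is the chunk's sequence number; the last chunk may be shorter.
    chunkDocs.push({
      files_id: fileId,
      n,
      data: data.subarray(offset, offset + chunkSize),
    });
  }
  return { fileDoc, chunkDocs };
}

// Reassemble the file by concatenating the chunks in order of n.
function fromGridFsDocuments(chunkDocs) {
  const ordered = [...chunkDocs].sort((a, b) => a.n - b.n);
  return Buffer.concat(ordered.map((c) => c.data));
}

// A 10-byte "file" with a 4-byte chunk size yields chunks of 4, 4 and 2 bytes.
const original = Buffer.from("0123456789");
const { fileDoc, chunkDocs } = toGridFsDocuments("file-1", "digits.txt", original, 4);
const restored = fromGridFsDocuments(chunkDocs);
```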
  40. 40. Demo 4 – Evolving your domain model & GridFS
  41. 41. Evolving your domain model Great for small changes! Hot swapping Minimal impact on your application and database Avoid Migrations Handle changes in your application instead of your database
  42. 42. Performance
  43. 43. Creating indexes. Avoid collection scans by using indexes: > db.albums.ensureIndex({title: 1}) Compound indexes index on multiple fields: > db.albums.ensureIndex({title: 1, year: 1}) Indexes have their price: every write takes longer; max 64 indexes on a collection; try to limit them. Indexes are useful as long as the number of records you want to return is limited. If you return >30% of a collection, check whether a collection scan is faster.
  44. 44. Aggregations with the Aggregation Framework. Pipeline stages and their LINQ counterparts: $project = Select(); $unwind = SelectMany(); $match = Where(); $group = GroupBy(); $sort = OrderBy(); $skip = Skip(); $limit = Take(). Largely replaces the original Map/Reduce. Much faster! Implemented in multi-threaded C++. No support in the LINQ provider yet (but in development).
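To make the stage-by-stage correspondence concrete, here is a sketch that pairs an aggregation pipeline (written as data, in the shape the shell's `aggregate` expects) with an equivalent computation over a plain JavaScript array. The sample albums, the field names `artist`, `year`, `tracks` and `unitPrice`, and the helper function are assumptions for illustration; the array code only mimics what the server would do.

```javascript
// The pipeline as data: total track price per artist for albums from 1990 on,
// highest total first, top result only.
const pipeline = [
  { $match: { year: { $gte: 1990 } } },
  { $unwind: "$tracks" },
  { $group: { _id: "$artist", total: { $sum: "$tracks.unitPrice" } } },
  { $sort: { total: -1 } },
  { $limit: 1 },
];

// Hypothetical sample documents.
const albums = [
  { artist: "A", year: 1985, tracks: [{ name: "t1", unitPrice: 5 }] },
  { artist: "A", year: 1995, tracks: [{ name: "t2", unitPrice: 1 }, { name: "t3", unitPrice: 2 }] },
  { artist: "B", year: 2001, tracks: [{ name: "t4", unitPrice: 4 }] },
];

// The same steps over an in-memory array, one per pipeline stage.
function topArtistByTrackTotal(docs) {
  const totals = new Map();
  docs
    .filter((d) => d.year >= 1990) // $match, like Where()
    .flatMap((d) => d.tracks.map((t) => ({ artist: d.artist, track: t }))) // $unwind, like SelectMany()
    .forEach(({ artist, track }) => {
      // $group with $sum, like GroupBy() plus Sum()
      totals.set(artist, (totals.get(artist) || 0) + track.unitPrice);
    });
  return [...totals]
    .map(([_id, total]) => ({ _id, total }))
    .sort((a, b) => b.total - a.total) // $sort descending
    .slice(0, 1); // $limit, like Take(1)
}

const result = topArtistByTrackTotal(albums);
```

Against a real server the pipeline array would be passed to `db.albums.aggregate(...)`; the in-memory version is here only so the example can run on its own.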
  45. 45. Demo 5 – Optimizations
  46. 46. Conclusion
  47. 47. Benefits. Scalable: good for a lot of data and traffic; horizontal scaling to more nodes; good for web apps. Performance: no joins and constraints. Dev/user friendly: data is modeled to how the app is going to use it; no conversion between object-oriented and relational. No static schema = agile. Evolvable.
  48. 48. Drawbacks. Forget what you have learned: a new way of building and designing your application. Can collect garbage: no data integrity checks; add a clean-up job. The database model is determined by usage: requires insight into the usage.
  50. 50. Things we didn’t talk about
  51. 51. Things we didn’t talk about… Security (HTTPS/SSL); compile the code yourself; eventual consistency; geospatial features; realtime aggregation.
  52. 52. Things we didn’t talk about… Many-to-many: multiple approaches; references on one side; references on both sides.
  53. 53. Things we didn’t talk about… Write concerns: acknowledged vs unacknowledged writes; stick with acknowledged writes (= default).
  54. 54. Things we didn’t talk about… GridFS disadvantages. Slower performance: accessing files from MongoDB will not be as fast as going directly through the filesystem. You can only modify files by deleting them and resaving the whole thing. Drivers are required.
  55. 55. Things we didn’t talk about… Schema migrations: avoid them; make your app backwards compatible; add a version field to your documents.
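One way to read "make your app backwards compatible" and "add a version field" is to upgrade old documents in application code when they are read. The sketch below assumes a hypothetical change (version 1 stored `name` as a single string, version 2 splits it into `first` and `last`); the field name `version` and the function `upgradePerson` are illustrative, not part of any driver.

```javascript
// Hypothetical schema change: v1 has name: "Alan Turing",
// v2 has name: { first: "Alan", last: "Turing" }.
const CURRENT_VERSION = 2;

// Upgrade a document to the current shape when it is read.
function upgradePerson(doc) {
  // Documents written before the version field existed count as v1.
  const version = doc.version || 1;
  if (version >= CURRENT_VERSION) {
    return doc; // already current, returned unchanged
  }
  const [first, ...rest] = doc.name.split(" ");
  return { _id: doc._id, version: CURRENT_VERSION, name: { first, last: rest.join(" ") } };
}

const oldDoc = { _id: 1, name: "Alan Turing" };
const newDoc = { _id: 2, version: 2, name: { first: "Grace", last: "Hopper" } };

// Old and new documents can sit side by side in the same collection.
const people = [oldDoc, newDoc].map(upgradePerson);
```

The upgraded document can be written back on the next save, so the collection converges on the new shape gradually without a separate migration step.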
  56. 56. Things we didn’t talk about… Why you should not use regexes: slow! Advanced indexing: indexing objects and arrays; unique vs sparse indexes; geospatial indexes; full-text indexes. MapReduce: avoid it; very slow in MongoDB; use the Aggregation Framework instead.
  57. 57. Things we didn’t talk about… Sharding: based on a shard key (= field); commands are sent to the shard that includes the relevant range of the data; data is evenly distributed across the shards; automatic reallocation of data when adding or removing servers.
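The idea of routing by shard key range can be shown with a toy model in plain JavaScript. This is not how MongoDB implements sharding internally; the shard names, the ranges, and the `shardFor` helper are all hypothetical, and the sketch only shows how a key value selects the shard that owns its range.

```javascript
// Toy model: each shard owns a half-open range [min, max) of shard key values.
const shardRanges = [
  { shard: "shard-a", min: "", max: "h" },
  { shard: "shard-b", min: "h", max: "p" },
  { shard: "shard-c", min: "p", max: "\uffff" },
];

// Pick the shard whose range contains the document's shard key value.
function shardFor(doc, shardKey, ranges) {
  const value = doc[shardKey];
  const match = ranges.find((r) => value >= r.min && value < r.max);
  if (!match) {
    throw new Error(`no shard covers ${shardKey}=${value}`);
  }
  return match.shard;
}

// Using the (lower-cased) album title as the shard key.
const targets = [
  { title: "abbey road" },
  { title: "kind of blue" },
  { title: "thriller" },
].map((album) => shardFor(album, "title", shardRanges));
```

A query that includes the shard key can be sent to one shard in the same way, while a query without it has to ask every shard.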