
Eat Your Own Dog Food: Using MongoDB at MongoDB


Using the MongoDB University implementation as a case study, this talk looks at how MongoDB is used inside MongoDB.

Published in: Technology


  1. Eat Your Own Dog Food: Migrating MongoDB University From SQL to MongoDB (John Yu)
  2. What Is MongoDB University?
  3. MOOC: Massive Open Online Courses (~2010); free MongoDB courses on the web; MongoDB certification (developer, DBA); developed by MongoDB Inc.
  4. Why was it using SQL?
  5. History of MongoDB University: started in 2012 with a fork of edX; MySQL DB, Python Django, XML; Django is designed for SQL databases; future option to use MongoDB for course materials
  6. Why should we move to MongoDB?
  7. Maybe we shouldn't: the site works fine; SQL is fine; moving to MongoDB is a lot of work; MongoDB is not a great fit for Django; we don't use many of MongoDB's standout features (sharding)
  8. Eat your own dog food: if we think MongoDB is good, then we should use it; help test MongoDB products
  9. MongoDB is good for University too: MongoDB is closer to application data; arrays; subclasses (flexible schema); ease of development (PyMongo); integration with other MongoDB tools (Atlas, Compass, Charts)
  10. "While you are attending PyCon, please visit the MongoDB booth to learn about PyMongo!" #PyCon #Cleveland #MongoDB #Python
      MongoDB:
        {
          "text": "While you are attending PyCon, please visit the MongoDB booth to learn about PyMongo!",
          "tags": ["PyCon", "Cleveland", "MongoDB", "Python"]
        }
      SQL:
        text                                | id
        "While you are attending PyCon..."  | 1

        blog_id | tag
        1       | "PyCon"
        1       | "Cleveland"
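
The contrast on this slide can be sketched in Python. This is a minimal illustration, not the University code: the table names `blog` and `blog_tag` and the helper `tags_from_sql` are assumptions, and SQLite from the standard library stands in for the SQL side.

```python
import sqlite3

# Document model: the tags live inside the post as an array.
post = {
    "text": "While you are attending PyCon, please visit the MongoDB booth "
            "to learn about PyMongo!",
    "tags": ["PyCon", "Cleveland", "MongoDB", "Python"],
}

# Relational model: the same tags need a second table (names are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog (id INTEGER PRIMARY KEY, text TEXT)")
conn.execute("CREATE TABLE blog_tag (blog_id INTEGER, tag TEXT)")
conn.execute("INSERT INTO blog (id, text) VALUES (?, ?)", (1, post["text"]))
conn.executemany(
    "INSERT INTO blog_tag (blog_id, tag) VALUES (?, ?)",
    [(1, tag) for tag in post["tags"]],
)


def tags_from_sql(connection, blog_id):
    """Collect the tags of one post by joining the two tables."""
    rows = connection.execute(
        "SELECT tag FROM blog JOIN blog_tag ON blog.id = blog_tag.blog_id "
        "WHERE blog.id = ? ORDER BY blog_tag.rowid",
        (blog_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Reading the tags from the document is a single key lookup (`post["tags"]`), while the relational version needs the join in `tags_from_sql`.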
  11. PyMongo vs Python SQL connector
      PyMongo:
        > database.collection.find_one({'user_id': 1})
        {
          "email": "john.yu@mongodb.com",
          "address": {
            "street": "1633 Broadway",
            "city": "New York",
            "state": "NY",
            "country": "United States"
          }
        }
        > user['address']['country']
        "United States"
      SQL connector:
        > connection.execute('SELECT * FROM people where id=1')
        ('john.yu@mongodb.com', '1633 Broadway', 'New York', 'NY', 'United States')
        > user[4]
        "United States"
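
A runnable sketch of the same comparison follows. It assumes a plain dict in place of the document that `find_one` would return, and uses the standard-library `sqlite3` module as the SQL connector; the `people` table layout is hypothetical.

```python
import sqlite3

# Stand-in for the document a PyMongo find_one() call would return.
user_doc = {
    "email": "john.yu@mongodb.com",
    "address": {
        "street": "1633 Broadway",
        "city": "New York",
        "state": "NY",
        "country": "United States",
    },
}

# The same record as a flat SQL row (hypothetical table layout).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people "
    "(id INTEGER, email TEXT, street TEXT, city TEXT, state TEXT, country TEXT)"
)
conn.execute(
    "INSERT INTO people VALUES (1, ?, ?, ?, ?, ?)",
    ("john.yu@mongodb.com", "1633 Broadway", "New York", "NY", "United States"),
)
user_row = conn.execute(
    "SELECT email, street, city, state, country FROM people WHERE id = 1"
).fetchone()

# Document access is by name; row access is by column position.
country_from_doc = user_doc["address"]["country"]
country_from_row = user_row[4]
```

The positional lookup `user_row[4]` silently changes meaning if the column order changes, which is the readability point the slide makes.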
  12. Flexible Schema Within a Collection: analogous to subclasses in programming languages. Example: multiple-choice problem vs free-text problem
      {
        type: "multiple-choice",
        question: "Who was the first president of the US?",
        choices: [
          { "text": "Barack Obama", "is_correct": false },
          { "text": "George Washington", "is_correct": true }
        ]
      }
      {
        type: "text",
        question: "Who was the president during the civil war?",
        answer: "Abraham Lincoln"
      }
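
One way application code might handle the two document shapes is to branch on the `type` field, much like dispatching on a subclass. The function `is_correct` and its grading rules (an index for multiple-choice, a case-insensitive string match for text) are assumptions for illustration, not the University's grader.

```python
multiple_choice = {
    "type": "multiple-choice",
    "question": "Who was the first president of the US?",
    "choices": [
        {"text": "Barack Obama", "is_correct": False},
        {"text": "George Washington", "is_correct": True},
    ],
}

text_problem = {
    "type": "text",
    "question": "Who was the president during the civil war?",
    "answer": "Abraham Lincoln",
}


def is_correct(problem, submission):
    """Grade a submission against a problem document of either shape."""
    if problem["type"] == "multiple-choice":
        # Assumption: the submission is the index of the chosen option.
        return problem["choices"][submission]["is_correct"]
    if problem["type"] == "text":
        # Assumption: text answers are compared case-insensitively.
        return submission.strip().lower() == problem["answer"].lower()
    raise ValueError(f"unknown problem type: {problem['type']!r}")
```

Both documents can sit in the same collection; only the code that reads them needs to know which fields each type carries.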
  13. Summary: MongoDB can be normalized like SQL, but you can also have arrays and embedded documents. MongoDB gives you more options than a tabular DB.
  14. How did we do it?
  15. Flexible schema is great, but there are more decisions to make. Top-down design: how will the data be CRUDed? What operations are needed to render a web page? Optimize for queries, since querying happens more often than creating, updating, or deleting.
  16. What is a course?
      Course:
        - A course has one or more chapters
        - A chapter has one or more lessons
        - A lesson has one or more units: lectures and problems (multiple-choice, text)
      Student progress (progress, grades):
        - Did the student view a chapter/lesson/problem?
        - Student submissions for problems: problem ID, answer submitted, submitted date
        - Student grade
        - Many students per course
  17. Old Way: Course
      <course id="M101P" title="MongoDB for Python Developers" start="Aug 12 17:00 UTC 2013" end="Sep 21 17:00 UTC 2013">
        <chapter title="Week 1: Introduction" start="Aug 12 17:00 UTC 2013" end="Aug 19 17:00 UTC 2013">
          <lesson title="Welcome to M101P">
            <problem id="5d4340a7eba" title="Quiz: MongoDB vs SQL">
              Which DB should you use?
              <choice correct="false">MySQL</choice>
              <choice correct="false">Postgres</choice>
              <choice correct="true">MongoDB</choice>
            </problem>
          </lesson>
          <lesson> … </lesson>
        </chapter>
        <chapter title="Week 2: CRUD operations">…</chapter>
      </course>
  18. New Way: Course
      {
        start: 2018-08-01 17:00 UTC,
        end: 2018-09-01 17:00 UTC,
        title: "US History",
        chapters: [
          {
            title: "Chapter 1: Introduction",
            lessons: [
              {
                title: "First President",
                video: "youtube.com/123456",
                problem: {
                  type: "multiple-choice",
                  question: "Who was the first president of the US?",
                  choices: [
                    { "text": "Barack Obama", "is_correct": false },
                    { "text": "George Washington", "is_correct": true }
                  ]
                }
              },
              {
                title: "Second president",
                video: "youtube.com/2334566",
                problem: {
                  type: "text",
                  question: "Who was the president during the civil war?",
                  answer: "Abraham Lincoln"
                }
              }
            ]
          },
          { title: "Chapter 2: Wars", lessons: [ ] }
        ]
      }
      Good: Courseware needs bits of the entire offering; can project just the fields we need
      Bad: Offering can be a big document
      Note: previously, offerings were in memory, not DB
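
"Can project just the fields we need" might look like the sketch below. The projection document uses MongoDB's dotted-path syntax; with PyMongo it would be passed as the second argument to `find_one`. The names `TOC_PROJECTION` and `table_of_contents` are hypothetical, and the course dict is a trimmed stand-in so the example runs without a database.

```python
# Trimmed stand-in for the course document on the slide.
course = {
    "title": "US History",
    "chapters": [
        {
            "title": "Chapter 1: Introduction",
            "lessons": [
                {"title": "First President", "video": "youtube.com/123456"},
                {"title": "Second president", "video": "youtube.com/2334566"},
            ],
        },
        {"title": "Chapter 2: Wars", "lessons": []},
    ],
}

# Projection naming only the fields a table-of-contents page needs.
# Assumption: passed to find_one() so problems and videos are not fetched.
TOC_PROJECTION = {
    "title": 1,
    "chapters.title": 1,
    "chapters.lessons.title": 1,
}


def table_of_contents(course_doc):
    """Return (chapter title, [lesson titles]) pairs from a course document."""
    return [
        (chapter["title"],
         [lesson["title"] for lesson in chapter.get("lessons", [])])
        for chapter in course_doc.get("chapters", [])
    ]
```

Because `table_of_contents` only reads titles, it works the same on the full document and on the smaller projected one.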
  19. Old Way: Student Progress
      student_id | course_id         | problem_id     | state
      71495      | "M101P/2019_July" | "5d4340a7eba"  | '{ "answer": [0,1,2,3], "score": 1, "submit_date": 2019-07-21 09:15 UTC }'
      13789      | "M101/2015_May"   | "21b172e26113" | '{ "answer": "Barack Obama", "score": 0, "submit_date": 2015-05-12 10:15 UTC }'
      ?
  20. Approach 1: Mechanically move SQL tables to MongoDB collections
      {
        student_id: 71495,
        course_id: "M101P/2019_July",
        problem_id: "5d4340a7eba",
        state: { "answer": [0,1,2,3], "score": 1, "submit_date": 2019-07-21 09:15 UTC }
      },
      {
        student_id: 13789,
        course_id: "M101P/2015_May",
        lecture_id: "21b172e26113",
        state: { "last_viewed": 2015-05-12 10:15 UTC }
      }
      Good: Easy to migrate; already better than the previous table
      Bad: Many queries required per page
  21. Approach 2A: All progress for a course in 1 document
      {
        course_id: "M101P/2019_May",
        students: [
          {
            user_id: 11111,
            units: [
              {
                id: "Problem 1",
                attempts: [
                  { date: 2016-06-02 15:02 UTC, index: 2 },
                  { date: 2017-06-05 11:08 UTC, index: 0 }
                ]
              }
            ]
          },
          {
            user_id: 22222,
            units: [
              {
                id: "Problem 1",
                attempts: [
                  { date: 2016-06-02 15:02 UTC, index: 2 },
                  { date: 2017-06-05 11:08 UTC, index: 0 }
                ]
              }
            ]
          }
        ]
      }
      Good: ?
      Bad: Doesn't fit common use case; grows without bound
  22. Approach 2B: All courses for a student in 1 document
      {
        user_id: 11111,
        courses: [
          {
            course_id: "M101P",
            units: {
              "lecture_1": { last_viewed: 2016-06-01 10:10 UTC },
              "problem_1": {
                attempts: [
                  { date: 2016-06-02 15:02 UTC, index: 2 },
                  { date: 2017-06-05 11:08 UTC, index: 0 }
                ]
              }
            }
          },
          {
            course_id: "M101P",
            units: {
              "lecture_1": { last_viewed: 2016-06-01 10:10 UTC },
              "problem_1": {
                attempts: [
                  { date: 2016-06-02 15:02 UTC, index: 2 },
                  { date: 2017-06-05 11:08 UTC, index: 0 }
                ]
              }
            }
          }
        ]
      }
      Good: ?
      Bad: Will probably grow larger than the document size limit
  23. Better Approach: Fit to use case (progress)
      {
        user_id: 71495,
        course_id: "M101P/2019_July",
        units: {
          "lecture_1": { last_viewed: 2016-06-01 10:10 UTC },
          "problem_1": {
            attempts: [
              { date: 2016-06-02 15:02 UTC, index: 2 },
              { date: 2017-06-05 11:08 UTC, index: 0 }
            ]
          },
          "problem_2": {
            attempts: [
              { date: 2017-06-05 11:08 UTC, text: "Barack Obama" }
            ]
          }
        }
      }
      Good:
        - Courseware often needs multiple units at a time
        - Grade student's progress in one document
        - Can still update just parts of the document
      Bad: Document can grow without bound
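
"Can still update just parts of the document" can be sketched with MongoDB's `$push` operator and a dotted field path. The helpers `progress_filter` and `record_attempt_update` are hypothetical; they only build the filter and update documents, which with PyMongo would be passed to `update_one`.

```python
from datetime import datetime, timezone


def progress_filter(user_id, course_id):
    """Select the single progress document for one student in one course."""
    return {"user_id": user_id, "course_id": course_id}


def record_attempt_update(unit_id, attempt):
    """Build an update that appends one attempt to one unit's attempts array."""
    # Dots and a leading "$" have special meaning in MongoDB field paths,
    # so this sketch rejects unit ids that contain them.
    if "." in unit_id or unit_id.startswith("$"):
        raise ValueError(f"unsupported unit id: {unit_id!r}")
    return {"$push": {f"units.{unit_id}.attempts": attempt}}


attempt = {"date": datetime(2016, 6, 2, 15, 2, tzinfo=timezone.utc), "index": 2}
query = progress_filter(71495, "M101P/2019_July")
update = record_attempt_update("problem_1", attempt)

# Assumption: `progress` is a PyMongo collection holding these documents.
# progress.update_one(query, update, upsert=True)
```

Only the one attempts array is touched, so the rest of the progress document does not have to be read or rewritten.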
  24. ODM: We use PyMODM (https://github.com/mongodb/pymodm). We can use Python classes instead of dictionaries: application-side schema validation (MongoDB itself now also offers schema validation); type checking; convenience. Downsides: new querying language (but it mimics the Django ORM); unclear when queries are actually being executed
  25. How about performance? Performance gains from the data model; basic indexes on queries
  26. Timeline: Certification Exams (just PyMongo); Certification Exams v2 (ODM: MongoEngine); Courseware (ODM: MongoEngine); Courseware v2 (ODM: PyMODM)
  27. Summary: SQL is fine, but MongoDB is also good, and sometimes better. We moved to MongoDB because it is great for developers. Beware of pitfalls with document DBs.
  28. Future Plans: move the rest of the SQL tables to MongoDB; try newer MongoDB features (schema validation, transactions)
  29. Thank You
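
The idea of "Python classes instead of dictionaries" with application-side validation can be illustrated without an ODM. The sketch below uses a standard-library dataclass as a stand-in; it is not PyMODM's API, and the `Attempt` class and its rules are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Attempt:
    """One submission to a multiple-choice problem (hypothetical model)."""
    date: datetime
    index: int

    def __post_init__(self):
        # Application-side validation: reject bad data before it is stored.
        if not isinstance(self.date, datetime):
            raise TypeError("date must be a datetime")
        if not isinstance(self.index, int) or self.index < 0:
            raise ValueError("index must be a non-negative integer")

    def to_document(self):
        """Convert to the plain dict a MongoDB driver would store."""
        return {"date": self.date, "index": self.index}


first_attempt = Attempt(datetime(2016, 6, 2, 15, 2, tzinfo=timezone.utc), 2)
```

An ODM adds querying and persistence on top of this pattern, which is where the downsides listed on the slide come from.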
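
As a rough idea of what schema validation could look like for the per-student progress documents, here is a `$jsonSchema` validator written as a Python dict. The field rules are assumptions based on the example document from the "Better Approach" slide; the talk does not show the actual validator.

```python
# Hypothetical validator for the progress collection.
# Assumption: with PyMongo this dict would be supplied as the `validator`
# option when creating the collection.
PROGRESS_VALIDATOR = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["user_id", "course_id", "units"],
        "properties": {
            # Python ints may be stored as 32-bit or 64-bit BSON integers.
            "user_id": {"bsonType": ["int", "long"]},
            "course_id": {"bsonType": "string"},
            # Keys of `units` are unit ids, so only the container is typed.
            "units": {"bsonType": "object"},
        },
    }
}
```

Server-side validation of this kind would complement, not replace, the application-side checks an ODM performs.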
