MongoDB

MongoDB is a popular NoSQL database. This presentation was delivered during a workshop.

The first part covers NoSQL databases and the shift in design paradigm behind them, looks more closely at document-based NoSQL databases, and draws parallels with SQL databases.

The second part is a hands-on session with MongoDB using the mongo shell; the slides alone are of little help for this part.

Finally, it touches on advanced topics such as data replication for disaster recovery and handling big data with map-reduce and sharding.

Transcript

  • 1. NoSQL Database Akshay Mathur Sarang Shravagi @akshaymathu, @_sarangs {name: ‘mongo’, type: ‘db’}
  • 2. Who uses MongoDB
  • 3. Let’s Know Each Other • Do you code? • OS? • Programming Language? • Why are you attending?
  • 4. Akshay Mathur • Managed development, testing and release teams over the last 14+ years – Currently Principal Architect at ShopSocially • Founding Team Member of – ShopSocially (Enabling “social” for retailers) – AirTight Networks (Global leader of WIPS)
  • 5. Sarang Shravagi • 10gen Certified Developer and DBA • CS graduate from PICT Pune • 3+ years in Software Product industry • Currently Senior Full-stack Developer at ShopSocially
  • 6. How we use MongoDB • Python • MongoDB • MongoEngine
  • 7. Where MongoDB Fits
  • 8. Program Outline: Understanding NoSQL • Data Landscape • Different Storage Needs • Design Paradigm Shift from SQL to NoSQL • Different Datastores • Closer look at Document Storage • Drawing parallels with RDBMS
  • 9. Program Outline: Hands on Lab • Installation and basic configuration • Mongo Shell • Creating and Changing Schema • Create, Read, Update and Delete of Data • Analyzing Performance • Improving performance by creating Indices • Assignment • Problem solving for the assignment
  • 10. Program Outline: Advanced Topics • Handling Big Data – Introduction to Map/Reduce – Introduction to Data Partitioning (Sharding) • Disaster Recovery – Introduction to Replica set and High Availability
  • 11. Ground Rules • Disturb Everyone – Not by phone rings – Not by local talks – By more information and questions
  • 12. Data Patterns & Storage Needs
  • 13. Data at an Online Store • Product Information • User Information • Purchase Information • Product Reviews • Site Interactions • Social Graph • Search Index
  • 14. SQL to NoSQL Design Paradigm Shift
  • 15. SQL Storage • Was designed when – Storage and data transfer were costly – Processing was slow – Applications were oriented more towards data collection • Initial adopters were financial institutions
  • 16. SQL Storage • Structured – schema • Relational – foreign keys, constraints • Transactional – Atomicity, Consistency, Isolation, Durability • High Availability through robustness – Minimize failures • Optimized for Writes • Typically Scale Up
  • 17. NoSQL Storage • Is designed when – Storage is cheap – Data transfer is fast – Much more processing power is available • Clustering of machines is also possible – Applications are oriented towards consumption of User Generated Content – Better on-screen user experience is in demand
  • 18. NoSQL Storage • Semi-structured – Schemaless • Consistency, Availability, Partition Tolerance • High Availability through clustering – expect failures • Optimized for Reads • Typically Scale Out
  • 19. Different Datastores: Half a Level Deep
  • 20. SQL: RDBMS • MySQL, PostgreSQL, Oracle etc. • Stores data in tables having columns – Basic (number, text) data types • Strong query language • Transparent values – Query language can read and filter on them – Relationship between tables based on values • Suited for user info and transactions
  • 21. NoSQL: Key/Value • Redis, DynamoDB etc. • Stores a value against a key – Strings • Values are opaque – Cannot be part of a query • Suited for site interactions
  • 22. NoSQL: Key/Value
  • 23. NoSQL: Document • MongoDB, CouchDB etc. • Object Oriented data models – Stores data in document objects having fields – Basic and compound (list, dict) data types • SQL-like queries • Transparent values – Can be part of query • Suited for product info and its reviews
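The difference between opaque key/value entries and transparent documents can be sketched in plain Python. This is only an illustration using in-memory dicts and lists, not a real Redis or MongoDB client; the product data and the `find` helper are hypothetical names made up for the example.

```python
import json

# Key/value store: the value is an opaque string, so the store can only
# look it up by key. Filtering on its contents means decoding each value.
kv_store = {
    "product:1": json.dumps({"name": "Kettle", "price": 25}),
    "product:2": json.dumps({"name": "Toaster", "price": 40}),
}
kettle = json.loads(kv_store["product:1"])

# Document store: each document has named fields, including compound
# (list, dict) values, and queries can filter on those fields.
products = [
    {"name": "Kettle", "price": 25,
     "reviews": [{"user": "asha", "stars": 5}, {"user": "ravi", "stars": 3}]},
    {"name": "Toaster", "price": 40,
     "reviews": [{"user": "meera", "stars": 4}]},
]


def find(collection, **criteria):
    """Return documents whose top-level fields equal all given criteria."""
    return [doc for doc in collection
            if all(doc.get(field) == value for field, value in criteria.items())]


cheap = find(products, price=25)

# Embedded reviews travel with the product, so reading them needs no join.
five_star_products = [doc["name"] for doc in products
                      if any(review["stars"] == 5 for review in doc["reviews"])]
```

Keeping reviews embedded in the product document is the reason this category is described as suited for product info and its reviews.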
  • 24. NoSQL: Document
  • 25. NoSQL: Column Family • Cassandra, Bigtable etc. • Stores data in columns • Transparent values – Can be part of query • SQL-like queries • Suited for search
  • 26. NoSQL: Column Family
  • 27. NoSQL: Graph • Neo4j • Stores data in the form of nodes and relationships • Queries take the form of traversals • In-memory • Suited for social graph
  • 28. NoSQL: Graph
  • 29. Document Storage: Closer Look
  • 30. MongoDB • Document database • Powerful query language • Docs, sub-docs, indexes • Map/reduce • Replicas, shards, replicated shards • SDKs/drivers for many languages – C, C++, C#, Python, Erlang, PHP, Java, JavaScript, Node.js, Perl, Ruby, Scala
  • 31. RDBMS: DB Design
  • 32. RDBMS: Query
  • 33. RDBMS → MongoDB • Database → Database • Table → Collection • Row → Document • Column → Field • SQL: Select c1, c2 from Table where c1 = ‘v1’ order by c2 limit n • MongoEngine: Collection.objects(c1 = ‘v1’).order_by(‘c2’).limit(n)
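To make the mapping concrete, here is a small sketch that runs the SQL query from the slide against an in-memory SQLite table, then performs the same filter, order and limit steps over a list of dict "documents". The table name `t`, the sample rows and the value of n (2) are assumptions made for the example, and the document side uses plain Python in place of a real MongoDB collection.

```python
import sqlite3

rows = [("v1", 3), ("v2", 1), ("v1", 2), ("v1", 1)]

# RDBMS side: a table with columns, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c1 TEXT, c2 INTEGER)")
conn.executemany("INSERT INTO t (c1, c2) VALUES (?, ?)", rows)
sql_result = conn.execute(
    "SELECT c1, c2 FROM t WHERE c1 = ? ORDER BY c2 LIMIT ?", ("v1", 2)
).fetchall()
conn.close()

# Document side: a collection of documents with fields.
collection = [{"c1": c1, "c2": c2} for c1, c2 in rows]
matching = [doc for doc in collection if doc["c1"] == "v1"]   # where / filter
ordered = sorted(matching, key=lambda doc: doc["c2"])          # order by
doc_result = [(doc["c1"], doc["c2"]) for doc in ordered[:2]]   # limit n
```

Both sides return the same two records; only the vocabulary (table/row/column versus collection/document/field) changes.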
  • 34. MongoDB: Design
  • 35. MongoDB: Query • Movies.objects()
  • 37. Have you Installed? http://www.mongodb.org/downloads
  • 38. Hands-on Dive-in with Sarang
  • 39. MongoDB: Core Binaries • mongod – Database server • mongo – Database client shell • mongos – Router for Sharding
  • 40. Getting Help • For mongo shell – mongo --help • Shows options available for running the shell • Inside mongo shell – Object.help(), e.g. db.help() • Shows commands available on the object
  • 41. Import Export Tools • For objects – mongodump – mongorestore – bsondump – mongooplog • For data items – mongoimport – mongoexport
  • 42. Database Operations • Database creation • Creating/changing collection • Data insertion • Data read • Data update • Creating indices • Data deletion • Dropping collection
  • 43. Diagnostic Tools • mongostat • mongoperf • mongosniff • mongotop
  • 45. Assignment • Go to http://www.velocitainc.com/mongo/ – Tasks • assignments.txt – Data • students.json
  • 46. Disaster Recovery: Introduction to Replica Sets and High Availability
  • 47. Disasters • Physical Failure – Hardware – Network • Solution – Replica Sets • Provide redundant storage for High Availability – Real time data synchronization • Automatic failover for zero downtime
  • 48. Replication
  • 49. Multi Replication • Data can be replicated to multiple places simultaneously • A replica set should have an odd number of voting members, so that elections are not tied
  • 50. Single Replication • With only one secondary (or any odd number of secondaries) the set has an even number of members, so you need to set up an arbiter to break ties
  • 51. Failover • When the primary fails, the remaining machines vote to elect a new primary
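The voting rule behind failover can be illustrated with a little arithmetic. The sketch below is a toy model, not MongoDB's election protocol: it only encodes the rule that a new primary needs votes from a strict majority of all voting members, and both function names are hypothetical.

```python
def can_elect_primary(voting_members: int, reachable: int) -> bool:
    """A new primary needs votes from a strict majority of all voting members."""
    return 2 * reachable > voting_members


def failures_tolerated(voting_members: int) -> int:
    """Largest number of members that can fail while a majority remains."""
    return (voting_members - 1) // 2


# Primary + one secondary: if the primary fails, 1 of 2 is not a majority.
two_member_set = can_elect_primary(voting_members=2, reachable=1)

# Primary + one secondary + arbiter: 2 of 3 is a majority, so failover works.
with_arbiter = can_elect_primary(voting_members=3, reachable=2)

# Growing from an odd to the next even size adds no extra fault tolerance.
tolerance_by_size = {size: failures_tolerated(size) for size in (2, 3, 4, 5)}
```

An arbiter votes but stores no data, which is why it is a cheap way to turn a two-member set into a three-vote set.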
  • 52. Handling Big Data: Introduction to Map/Reduce and Sharding
  • 53. Large Data Sets • Problem 1 – Performance • Queries run slowly • Solution – Map/Reduce
  • 54. Map Reduce • A way to divide a large query computation into smaller chunks • May run in multiple processes across multiple machines • Think of it as the GROUP BY of SQL
  • 55. Map/Reduce Example • The Map function digs through the data and returns the required values
  • 56. Map/Reduce Example • The Reduce function takes the output of the Map function and generates an aggregated value
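The original slides showed the example as an image, so the following is a stand-in sketch rather than the presenters' code. It runs the map and reduce steps in a single Python process over hypothetical purchase documents; a real deployment would distribute the same two functions across processes or machines.

```python
from collections import defaultdict

purchases = [
    {"product": "Kettle", "amount": 25},
    {"product": "Toaster", "amount": 40},
    {"product": "Kettle", "amount": 25},
]


def map_purchase(doc):
    """Map: emit a (key, value) pair for each document."""
    yield doc["product"], doc["amount"]


def reduce_amounts(key, values):
    """Reduce: combine all values emitted for one key into one result."""
    return sum(values)


def map_reduce(docs, map_fn, reduce_fn):
    """Group mapped values by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


totals = map_reduce(purchases, map_purchase, reduce_amounts)
```

The SQL counterpart would be roughly `SELECT product, SUM(amount) FROM purchases GROUP BY product`.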
  • 57. Large Data Sets • Problem 2 – Vertical Scaling of Hardware • Can’t increase machine size beyond a limit • Solution – Sharding
  • 58. Sharding • A method for storing data across multiple machines • Data is partitioned using Shard Keys
  • 59. Data Partitioning: Range Based • Each chunk holds a contiguous range of Shard Key values
  • 60. Data Partitioning: Hash Based • A hash function on the Shard Key decides the chunk
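A rough way to compare the two partitioning schemes is to compute a chunk number for a few neighbouring shard keys. The boundaries, the chunk count and the use of MD5 here are assumptions chosen for illustration; this is not the hash function or chunk layout MongoDB itself uses.

```python
import bisect
import hashlib

NUM_CHUNKS = 4

# Range based: chunk 0 holds keys below 250, chunk 1 holds 250-499,
# chunk 2 holds 500-749, and chunk 3 holds everything from 750 upward.
RANGE_BOUNDS = [250, 500, 750]


def range_chunk(shard_key: int) -> int:
    """Pick the chunk whose key range contains the shard key."""
    return bisect.bisect_right(RANGE_BOUNDS, shard_key)


def hash_chunk(shard_key: int) -> int:
    """Pick the chunk from a hash of the shard key."""
    digest = hashlib.md5(str(shard_key).encode()).hexdigest()
    return int(digest, 16) % NUM_CHUNKS


neighbours = [100, 101, 102, 103]
range_chunks = [range_chunk(key) for key in neighbours]  # all land in chunk 0
hash_chunks = [hash_chunk(key) for key in neighbours]    # decided by the hash
```

Range-based placement keeps neighbouring keys together, which helps range queries but can concentrate sequential inserts in one chunk; hashing trades that locality for a more even spread.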
  • 61. Sharded Cluster
  • 62. Optimizing Shards: Splitting • When a chunk in a shard grows too large, it is split into two chunks
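Splitting can be pictured with a toy model in which each chunk is a sorted list of shard keys. The threshold `MAX_CHUNK_DOCS` and the helper `insert_key` are hypothetical; a document count stands in here for whatever size limit the database actually applies.

```python
import bisect

MAX_CHUNK_DOCS = 4  # hypothetical limit on keys per chunk


def insert_key(chunks, key):
    """Insert key into the chunk covering it, splitting the chunk if it overflows.

    chunks is a list of sorted key lists, ordered by key range.
    """
    # Choose the last chunk whose first key is <= key (default: first chunk).
    target = 0
    for index, chunk in enumerate(chunks):
        if chunk and chunk[0] <= key:
            target = index
    bisect.insort(chunks[target], key)

    # Split an oversized chunk into two halves at its middle key.
    if len(chunks[target]) > MAX_CHUNK_DOCS:
        chunk = chunks[target]
        middle = len(chunk) // 2
        chunks[target:target + 1] = [chunk[:middle], chunk[middle:]]
    return chunks


chunks = [[]]
for key in [10, 20, 30, 40, 50]:
    insert_key(chunks, key)
after_split = [list(chunk) for chunk in chunks]

insert_key(chunks, 25)  # lands in the first chunk, no split needed
```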
  • 63. Optimizing Shards: Balancing • When a shard accumulates many more chunks than the others, some of its chunks are migrated to other shards
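Balancing can be sketched the same way: keep moving one chunk from the fullest shard to the emptiest until the counts are close. The shard names, the `threshold` parameter and the `balance` function are assumptions for this toy model and do not reflect the actual balancer's policy.

```python
def balance(shards, threshold=2):
    """Migrate chunks from the fullest to the emptiest shard.

    shards maps a shard name to a list of chunk ids. Migration stops once
    the difference in chunk counts is below threshold.
    """
    if threshold < 2:
        raise ValueError("threshold must be at least 2 to guarantee termination")
    shards = {name: list(chunk_ids) for name, chunk_ids in shards.items()}
    while True:
        fullest = max(shards, key=lambda name: len(shards[name]))
        emptiest = min(shards, key=lambda name: len(shards[name]))
        if len(shards[fullest]) - len(shards[emptiest]) < threshold:
            return shards
        # Move one chunk at a time, as a migration would.
        shards[emptiest].append(shards[fullest].pop())


before = {"shard_a": [1, 2, 3, 4, 5, 6], "shard_b": [7], "shard_c": []}
after = balance(before)
chunk_counts = sorted(len(chunk_ids) for chunk_ids in after.values())
```

The input mapping is copied rather than modified, so `before` still shows the unbalanced layout for comparison.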
  • 64. Summary • MongoDB is good – Stores objects as we use them in a programming language – Flexible semi-structured design – Scales out to store big data – Embedded documents eliminate the need for joins • MongoDB is bad – No multi-document query – De-normalized storage – No support for transactions
  • 65. Thanks @akshaymathu @_sarangs