Introduction to MongoDB
1. Pragati Software Pvt. Ltd., 207, Lok Center, Marol-Maroshi Road, Marol, Andheri (East), Mumbai 400 059. www.pragatisoftware.com
Pradyumn Sharma
Pragati Software Pvt. Ltd., India
pradyumn.sharma@pragatisoftware.com
www.pragatisoftware.com
What is NoSQL?
• Generic term …
• … for various non-relational database alternatives
Modern Applications Require…
1. Storage and handling of huge volumes of data
Twitter (March 2013): 200 million active users, 400 million tweets
per day
Facebook (Aug 2012): 100 PB of data as on that date; >500 TB of
data per day, 2.5 billion pieces of content, 2.7 billion Like actions,
300 million photos
Wal-Mart: 1 million customer transactions per hour
And the grand-daddy of all: Google
RDBMS: Big challenge
RDBMS: Scalability beyond a point is not practical or economical
2. Very high level of performance
RDBMS: Lack of linear performance
Normalized tables lead to many joins => drop in performance
3. 100% uptime with no single point of failure
RDBMS: single point of failure
Typically master-slave architecture
Not designed for multi-DC, geo-clusters, cloud
4. Ease of managing the database
RDBMS: complex to manage
Complex, old architecture, often requiring a lot of administration and
tuning work
5. Flexibility in schema design
RDBMS: Not easy to change schema online
Limited support for new data type needs
What is NoSQL?
• Architected from ground-up, with considerations for
High performance
Huge data volumes => high, linear scalability
High availability
High flexibility of database structure
• Don’t use the relational model at all, or use very little of it
• Compromise on various features of RDBMS (including joins,
normalization, ACID transactions in most cases)
• Schema-less, or flexible schemas
• Mostly open source
• Mostly distributed systems, run on clusters
Types of NoSQL Databases
• Column family (Apache Cassandra, HBase)
• Document (MongoDB, CouchDB)
• Graph (Neo4J, Titan)
• Key value (Riak, DynamoDB)
MongoDB
Introducing MongoDB
• A Document database
• Stores JSON-like documents (internally in a binary format called BSON)
• Developed by MongoDB Inc
• Latest version is 2.6.1, released on May 5, 2014
MongoDB Terminology
RDBMS           MongoDB
Database        Database
Table           Collection
Row / Record    Document

Example tables (each table maps to a collection, each row to a document):
Employees
ID  Name         Gender  Dept
1   Ahmad        M       Fin
2   Bajrang      M       Sales
3   Catherine    F       HR
4   Dostoyevski  M       Prod
Products
ID  Name      Unit
1   Pens      NO
2   Biscuits  KG
Persons Collection
Person
- FirstName: String
- LastName: String
- Gender: String
- YearOfBirth: int
- LivesIn: String
- Married: int
- CountriesVisited: List <Country> (zero or more)
- LanguagesKnown: List <LanguageKnown> (zero or more)
Country
- CountryName: String
LanguageKnown
- LanguageName: String
- Proficiency: int
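For illustration, a document in the persons collection might look like the sketch below. The field names follow the queries used in this deck (gender, yearOfBirth, livesIn, countriesVisited, languages.name, name.last). All values, and the numeric proficiency scale, are made up for the example and are not taken from the original data set.

```javascript
// Hypothetical person document; every value here is illustrative only.
const samplePerson = {
  name: { first: 'Asha', last: 'Rao' },  // embedded sub-document
  gender: 'F',
  yearOfBirth: 1975,
  livesIn: 'Mumbai',
  countriesVisited: ['India', 'Nepal'],  // array of plain values
  languages: [                           // array of sub-documents
    { name: 'Hindi', proficiency: 3 },   // assumed 1-to-5 scale
    { name: 'English', proficiency: 4 }
  ]
};

// In the mongo shell, such a document could be stored with:
//   db.persons.insert(samplePerson)
console.log(JSON.stringify(samplePerson, null, 2));
```

Embedding the lists inside the person document is what replaces the separate Country and LanguageKnown tables, and the joins, that a relational design would need.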
Querying Data
db.persons.find( {gender: 'F'} )
db.persons.find( {gender: 'F'} ).pretty()
The _id Field
• Reserved for use as a primary key
• If you do not specify a value, the insert() method adds the field
to the document with a unique ObjectId as its value
• ObjectId is a 12-byte unique identifier.
• Value must be unique in a collection
• Is immutable
• May be of any type other than an array
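One useful property of the 12-byte ObjectId is that its first four bytes encode the creation time in seconds since the Unix epoch. The sketch below, in plain JavaScript, reads that timestamp from the 24-character hex form. The function name and the sample id are hypothetical.

```javascript
// Extract the creation time from the 24-character hex form of an ObjectId.
// The first 4 bytes (8 hex characters) hold seconds since the Unix epoch.
function objectIdToDate(hexId) {
  if (!/^[0-9a-fA-F]{24}$/.test(hexId)) {
    throw new Error('expected 24 hex characters (12 bytes)');
  }
  const seconds = parseInt(hexId.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Hypothetical id whose timestamp bytes are 0x5366e000; the rest is zeroed.
const createdAt = objectIdToDate('5366e0000000000000000000');
console.log(createdAt.toISOString());
```

In the mongo shell, ObjectId values expose getTimestamp() for the same purpose, so a helper like this is only needed outside the shell.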
Operators
// All persons born before 1980
db.persons.find(
  { yearOfBirth: { $lt: 1980 } }
)
// All persons not living in Jaipur
db.persons.find(
  { livesIn: { $ne: 'Jaipur' } }
)
// Find all persons who either live in Mumbai or have visited India
db.persons.find(
  { $or: [
    { livesIn: 'Mumbai' },
    { countriesVisited: 'India' }
  ] }
)
// Find all persons who have visited India or know Hindi
db.persons.find(
  { $or: [
    { countriesVisited: 'India' },
    { 'languages.name': 'Hindi' }
  ] }
)
// Find all persons who have visited India and know Hindi
db.persons.find(
  { $and: [
    { countriesVisited: 'India' },
    { 'languages.name': 'Hindi' }
  ] }
)
A Complex Query
// All males born before 1970 and all females born before 1980
db.persons.find(
  { $or: [
    { $and: [
      { gender: 'M' },
      { yearOfBirth: { $lt: 1970 } }
    ] },
    { $and: [
      { gender: 'F' },
      { yearOfBirth: { $lt: 1980 } }
    ] }
  ] }
)
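To make the semantics of the query operators concrete, here is a minimal in-memory evaluator in plain JavaScript. It is only a sketch of how $lt, $ne, $or and $and combine, not MongoDB's implementation: it ignores dotted paths, array fields and indexes. The function name and the sample people are hypothetical.

```javascript
// Minimal evaluator for a few query operators (illustration only).
function matches(doc, query) {
  return Object.keys(query).every(function (key) {
    const cond = query[key];
    if (key === '$or') return cond.some(function (q) { return matches(doc, q); });
    if (key === '$and') return cond.every(function (q) { return matches(doc, q); });
    const value = doc[key];
    if (cond !== null && typeof cond === 'object') {
      // Operator document such as {$lt: 1980}
      return Object.keys(cond).every(function (op) {
        if (op === '$lt') return value < cond[op];
        if (op === '$ne') return value !== cond[op];
        throw new Error('unsupported operator: ' + op);
      });
    }
    return value === cond; // plain equality, e.g. {gender: 'F'}
  });
}

// Hypothetical sample data.
const samplePeople = [
  { name: 'A', gender: 'M', yearOfBirth: 1965 },
  { name: 'B', gender: 'M', yearOfBirth: 1975 },
  { name: 'C', gender: 'F', yearOfBirth: 1975 },
  { name: 'D', gender: 'F', yearOfBirth: 1985 }
];

// Males born before 1970 and females born before 1980.
const complexQuery = { $or: [
  { $and: [ { gender: 'M' }, { yearOfBirth: { $lt: 1970 } } ] },
  { $and: [ { gender: 'F' }, { yearOfBirth: { $lt: 1980 } } ] }
] };

const olderPeople = samplePeople.filter(function (p) {
  return matches(p, complexQuery);
});
console.log(olderPeople.map(function (p) { return p.name; }));
```

Because the conditions listed in one query document are implicitly combined with "and", each inner $and could also be written as a single document such as {gender: 'M', yearOfBirth: {$lt: 1970}}.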
Sorting the Documents
// Show all persons, sorted in ascending order of their year of birth
db.persons.find().sort( { yearOfBirth: 1 } )
// As above, and further sorted in descending order of last name,
// returning only the yearOfBirth and name fields
db.persons.find(
  {},
  { yearOfBirth: 1, name: 1, _id: 0 }
).sort( { yearOfBirth: 1, 'name.last': -1 } )
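The two-key sort can be read as "compare on the first key, and use the second key only to break ties". The plain JavaScript comparator below sketches that ordering for year of birth ascending and last name descending. The sample records are hypothetical.

```javascript
// Order by yearOfBirth ascending, then by name.last descending,
// mirroring sort({yearOfBirth: 1, 'name.last': -1}).
function byYearThenLastNameDesc(a, b) {
  if (a.yearOfBirth !== b.yearOfBirth) return a.yearOfBirth - b.yearOfBirth;
  if (a.name.last < b.name.last) return 1;
  if (a.name.last > b.name.last) return -1;
  return 0;
}

// Hypothetical sample data.
const sortedPeople = [
  { name: { last: 'Rao' }, yearOfBirth: 1980 },
  { name: { last: 'Khan' }, yearOfBirth: 1975 },
  { name: { last: 'Shah' }, yearOfBirth: 1980 }
].sort(byYearThenLastNameDesc);

console.log(sortedPeople.map(function (p) { return p.name.last; }));
```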
Aggregation Framework
• All cities with population more than 10000.
db.zips.find( { pop: { $gt: 10000 } } ).pretty()
• State-wise population
db.zips.aggregate( [
  { $group: { _id: "$state", totalpop: { $sum: "$pop" } } }
] )
• All states with population more than 10 million
db.zips.aggregate( [
  { $group: { _id: "$state", totalpop: { $sum: "$pop" } } },
  { $match: { totalpop: { $gt: 10*1000*1000 } } }
] )
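The pipeline idea (group first, then filter the grouped results) can be sketched in plain JavaScript as below. This only illustrates what the $group and $match stages compute; it is not how the server executes them. The sample documents and function name are hypothetical.

```javascript
// Hypothetical documents shaped like those in the zips collection.
const zipDocs = [
  { state: 'NY', pop: 6000000 },
  { state: 'NY', pop: 5000000 },
  { state: 'VT', pop: 600000 }
];

// Equivalent of the $group stage: sum pop per state.
function groupPopulationByState(docs) {
  const totals = {};
  docs.forEach(function (d) {
    totals[d.state] = (totals[d.state] || 0) + d.pop;
  });
  return Object.keys(totals).map(function (state) {
    return { _id: state, totalpop: totals[state] };
  });
}

// Equivalent of the $match stage, applied to the grouped results.
const bigStates = groupPopulationByState(zipDocs).filter(function (g) {
  return g.totalpop > 10 * 1000 * 1000;
});
console.log(bigStates);
```

The order of the stages matters: a $match placed before $group would filter individual zip code documents rather than state totals.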
Time To Live (TTL)
db.persons.ensureIndex (
{tempPassword : 1},
{expireAfterSeconds : 7200}
)
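• A background task that runs once a minute removes the
expired documents.
• The indexed field must hold date values; a document expires
once that date is more than expireAfterSeconds old.
• TTL background thread runs only on primary members of a
replica set. Secondaries replicate deletion from the primary.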
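The expiry rule itself is simple arithmetic on the date stored in the indexed field. The sketch below states it in plain JavaScript, assuming the indexed field holds a date. The function name and sample dates are hypothetical.

```javascript
// A document becomes eligible for removal once the date stored in the
// TTL-indexed field is at least expireAfterSeconds in the past.
function isExpired(indexedDate, expireAfterSeconds, now) {
  return now.getTime() - indexedDate.getTime() >= expireAfterSeconds * 1000;
}

// Hypothetical example with a 2-hour (7200-second) TTL.
const issuedAt = new Date('2014-05-05T10:00:00Z');
console.log(isExpired(issuedAt, 7200, new Date('2014-05-05T11:59:59Z')));
console.log(isExpired(issuedAt, 7200, new Date('2014-05-05T12:00:00Z')));
```

Since the background task runs about once a minute, an expired document may remain visible for a short while after it becomes eligible.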
Text Indexing and Search
db.persons.ensureIndex (
{"qualifications.institute" : "text"} )
db.collection.runCommand(
"text" , { search: <string> } )
Geospatial Indexing and Search
• Indexes and query mechanism to handle geospatial
information.
• Spherical (earth-like) surfaces, as well as flat surfaces
(Euclidean planes) are supported.
• You can query for things like:
Locations contained entirely within a specified polygon
Locations that intersect with a given geometry
Points nearest to another point
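As a sketch of the "points nearest to another point" case, the helper below builds a $near query document for a field that has a 2dsphere index. The collection name places, the field name location, and the coordinates are assumptions for illustration only.

```javascript
// Build a $near query for a GeoJSON point field indexed with "2dsphere".
function nearQuery(field, lng, lat, maxMeters) {
  const query = {};
  query[field] = {
    $near: {
      // GeoJSON order is longitude first, then latitude.
      $geometry: { type: 'Point', coordinates: [lng, lat] },
      $maxDistance: maxMeters // in meters for 2dsphere indexes
    }
  };
  return query;
}

// Shell usage, assuming a collection `places` with a `location` field:
//   db.places.ensureIndex({ location: '2dsphere' })
//   db.places.find(nearQuery('location', 72.87, 19.11, 1000))
const nearby = nearQuery('location', 72.87, 19.11, 1000);
console.log(JSON.stringify(nearby));
```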
Capped Collections
• Example: you need to only consider recently added
documents; older ones can be safely discarded.
• Capped Collections: fixed-size collections
• Guarantee preservation of the insertion order.
• Order of documents on disk is identical to the insertion order.
• Oldest documents get automatically removed.
• Update causing the document size to grow will fail.
• Deletion of selected documents not possible.
• Cannot shard a capped collection.
• You can create a tailable cursor, which tails the end of a
capped collection. You can continue retrieving documents
using this.
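The behaviour described above resembles a fixed-size queue. The toy class below sketches it in plain JavaScript. It counts capacity in documents for simplicity, whereas a real capped collection is sized in bytes; the class name is hypothetical.

```javascript
// Toy stand-in for a capped collection: fixed capacity, insertion order kept.
function CappedList(maxDocs) {
  this.maxDocs = maxDocs;
  this.docs = [];
}

CappedList.prototype.insert = function (doc) {
  this.docs.push(doc);
  if (this.docs.length > this.maxDocs) {
    this.docs.shift(); // the oldest document is discarded automatically
  }
};

const recentEvents = new CappedList(3);
[1, 2, 3, 4, 5].forEach(function (n) {
  recentEvents.insert({ seq: n });
});
console.log(recentEvents.docs);

// In the mongo shell, a real capped collection is created with, for example:
//   db.createCollection('log', { capped: true, size: 1048576 })
```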
Replication
• Replica Set: cluster of mongod instances that replicate amongst
one another and ensure automated failover.
• Up to 12 members in a replica set, of which only up to 7 have
votes.
• Master-slave replication, with one primary and the rest as
secondary members.
• Clients direct writes to the primary; the secondary members
replicate from the primary asynchronously.
• Oplog is a special capped collection…
• …that keeps a rolling record of all changes to the database.
• MongoDB applies changes to the primary…
• …and then records the operations on the primary’s oplog.
• Secondary members then replicate the oplog…
• …and apply the operations to themselves in an asynchronous
manner.
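The oplog flow above can be sketched as a small simulation in plain JavaScript: the primary applies an operation and records it, and a secondary later copies and replays whatever it has not yet seen. All names are hypothetical, and the log here grows without bound, whereas the real oplog is a capped collection.

```javascript
// Toy replica set member: a data store plus a log of applied operations.
function Member() {
  this.data = {};
  this.oplog = [];
}

function applyOp(data, op) {
  if (op.type === 'set') data[op.key] = op.value;
  else if (op.type === 'unset') delete data[op.key];
}

// The primary applies the change, then records it in its oplog.
function writeToPrimary(primary, op) {
  applyOp(primary.data, op);
  primary.oplog.push(op);
}

// A secondary copies the operations it has not seen yet and replays them.
function syncFrom(secondary, primary) {
  while (secondary.oplog.length < primary.oplog.length) {
    const op = primary.oplog[secondary.oplog.length];
    applyOp(secondary.data, op);
    secondary.oplog.push(op);
  }
}

const primaryNode = new Member();
const secondaryNode = new Member();
writeToPrimary(primaryNode, { type: 'set', key: 'city', value: 'Mumbai' });
writeToPrimary(primaryNode, { type: 'set', key: 'year', value: 2014 });
const lagging = Object.keys(secondaryNode.data).length; // nothing replicated yet
syncFrom(secondaryNode, primaryNode);
console.log(lagging, secondaryNode.data);
```

The gap between the write on the primary and the call to syncFrom is the replication lag during which a secondary can return stale data.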
Failover
• Members are interconnected with each other to exchange
heartbeat messages.
• A crashed server with missing heartbeat is detected by other
members and is removed from the replica set membership.
• When a server recovers in future, it can rejoin the cluster by
connecting to the primary to replicate the changes since it
crashed.
• If a primary member fails, the remaining members
automatically try to elect a new primary, without human
intervention.
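An election succeeds only when a candidate gets votes from a majority of the voting members of the set, including members that are currently unreachable. The check below is a minimal sketch of that rule; the function name is hypothetical and the real election protocol involves more than this count.

```javascript
// A candidate needs votes from a strict majority of all voting members,
// counting members that are currently down or unreachable.
function hasMajority(votesReceived, totalVotingMembers) {
  return votesReceived > Math.floor(totalVotingMembers / 2);
}

console.log(hasMajority(2, 3)); // two of three members can elect a primary
console.log(hasMajority(2, 4)); // two of four cannot
```

This is why replica sets are usually given an odd number of voting members: a four-member set tolerates no more failures than a three-member one.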
Failover and Rollback
• A possible scenario:
The primary has completed a write request
None of the secondaries have replicated it
The primary crashes
Remaining secondaries elect a new primary and continue operating,
unaware that they have lost a write request already acknowledged by
the former primary.
When the former primary rejoins, it has to roll back that write
operation to maintain database consistency across the replica set
• MongoDB writes the rollback data to a BSON file. If you want
to keep those writes, you have to manually apply the rollback
data after the former primary rejoins the cluster as a secondary.
• Alternatively, you can specify the number of secondaries to
receive the modification before the primary acknowledges to
the client.
• Tradeoff: between latency and reliability.
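In the shell this is expressed through the write concern option of a write operation. The wrapper below is a sketch: the function name and the sample document are hypothetical, while writeConcern, w and wtimeout are the option names accepted by insert() from version 2.6. Note that w counts members including the primary, so w: 2 means the primary plus one secondary.

```javascript
// Insert a document and wait until `members` replica set members
// (the primary counts as one) have the write, or until the timeout elapses.
function insertAcknowledgedBy(collection, doc, members, timeoutMs) {
  return collection.insert(doc, {
    writeConcern: { w: members, wtimeout: timeoutMs }
  });
}

// Shell usage (hypothetical document):
//   insertAcknowledgedBy(db.persons, { livesIn: 'Mumbai' }, 2, 5000)
```

A higher w makes a rollback of an acknowledged write less likely, at the cost of waiting for replication on every such write.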
Read Operations
• By default all read operations against a replica set are
returned from the primary. Users may configure, on a
per-connection basis, a preference for reading from a secondary.
• All read operations issued to the primary of a replica set are
consistent with the last write operation.
• Strict consistency for read operations from secondary
members cannot be guaranteed.
Replication
• Members can be designated as:
Secondary-only: for dedicated backup, for off-site data centers
Hidden: invisible to client application, an isolated member for
reporting and monitoring, cannot become primary, but they vote in
elections
Delayed: replication after a specified delay, a form of rolling backup,
protection against human errors and change control; they must also be
hidden and secondary-only
Arbiters: no data but only participate in elections; cannot become
primary
Non-voting
Default: can become primary, hold data, replicate immediately, have
vote
Chained Replication
• A secondary replicating from another secondary
• Reduces the load on primary…
• …but can increase replication lag.
Journaling
• Write-ahead logging to an on-disk journal…
• …to guarantee write operation durability and to provide crash
resiliency.
• When a write operation occurs:
MongoDB writes data about the write to the private view in RAM and
then copies the same to the journal on disk, in batches called group
commits, by default every 100 milliseconds
It then applies the changes to the shared view, which now becomes
inconsistent with the data files.
At default intervals of 60 seconds, MongoDB flushes the shared view
to disk, and removes the write operations from the journal.
Sharding
• Partitions a collection and stores different portions on
different machines.
• To run sharding you set up a sharded cluster.
• Within a cluster, sharding is enabled on a per collection basis.
• Typically each shard is a replica set, though not mandatory.
• Every sharded collection has a shard key, a field that exists in
every document of the collection.
• Sharding options:
Range-based sharding: documents are distributed according to the
range of values in the shard key
Hash-based sharding: documents are distributed according to a hash
of the shard key value
• A shard key can be a single field or a composite.
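The difference between the two options can be sketched with two small routing functions in plain JavaScript. This is a simplification under stated assumptions: the shard names, ranges and the toy hash are hypothetical, and MongoDB itself assigns ranges of hashed values to shards rather than taking a modulo.

```javascript
// Range-based: pick the shard whose [min, max) range contains the key.
function routeByRange(key, ranges) {
  for (let i = 0; i < ranges.length; i++) {
    if (key >= ranges[i].min && key < ranges[i].max) return ranges[i].shard;
  }
  throw new Error('no range covers key ' + key);
}

// Toy string hash, for illustration only (not the hash MongoDB uses).
function toyHash(text) {
  let h = 0;
  for (let i = 0; i < text.length; i++) {
    h = (h * 31 + text.charCodeAt(i)) >>> 0;
  }
  return h;
}

// Hash-based (simplified): spread keys across shards by hash value.
function routeByHash(key, shardNames) {
  return shardNames[toyHash(String(key)) % shardNames.length];
}

const yearRanges = [
  { min: 1900, max: 1980, shard: 'shardA' },
  { min: 1980, max: 2100, shard: 'shardB' }
];
console.log(routeByRange(1975, yearRanges));
console.log(routeByHash(1975, ['shardA', 'shardB']));
```

Range-based routing keeps neighbouring key values together, which helps range queries; hashing spreads neighbouring values apart, which helps distribute writes evenly.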
Security
• Role based access control
• Auditing
• Encryption in flight: SSL
• Encryption at rest: Gazzang
• Supports Kerberos authentication