MongoDB Replication Fundamentals
Desert Code Camp – Oct 2014
By Avinash Ramineni
Agenda
• Introduction to MongoDB
• MongoDB Replication
• Understanding Oplog
• Stream data from Oplog
• Demo
• Gotchas
• Questions
Why use a NoSQL Database?
• NoSQL describes a horizontally scalable, non-relational database
with built-in replication support
• One Size does not Fit All
– RDBMS
• Horizontal or Vertical Scalability?
– Key-Value stores
– Column
– Document and Graph
• High Availability and Scalability
• CAP Theorem
– Choose any two of Consistency, Availability, Partition Tolerance
• NoSQL way: Availability and Partition Tolerance
Why use a NoSQL Database? -2
• NoSQL’s primary goal is to achieve horizontal scalability. It attains
this by reducing transactional semantics and referential integrity.
MongoDB -1
• Document Oriented Database
– Bridges the gap between RDBMS and Key-Value Stores
– Atomicity
– Indexing
– Sharding - horizontal Scalability
• BSON format
– Binary encoded JSON representation
• No Joins
• Complex Queries / Indices
• Atomic single-document operations (locking is per database as of MongoDB 2.6, not per row)
MongoDB -2
• MongoDB Cluster
– Master - Slave
• Slave can become Master in case of failover
• Only Master is allowed to commit changes to Store
– Master – Master in limited capacity
• Inserts/Queries/Deletions are done by Id
• Does not work if the use case expects the same object to
be updated concurrently
– Replica Sets
MongoDB -2
Replication
• Why Replication ?
– Failover Scenarios
• Hot Backups
– Disaster Recovery
• Provides Redundancy and Increases Data
Availability
• Increases Read Capacity
• Different uses of data
• Normal processing
• DR / Backup
• Reporting
MongoDB Terminology
• Database
– Collection (RDBMS – table)
– Document (RDBMS – row)
• Cluster Node Types
– Primary
– Secondary
– Arbiter
– Hidden
MongoDB Replication
Replicasets
• Primary
– Primary accepts all write operations
– Only one Primary
– Strict Consistency for reads
– Logs all the changes in data to the “oplog”
• Secondary
– Replicate by reading Primary’s “oplog”
– Reads might return stale data
– Can become primary
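The primary/secondary relationship above can be sketched with a small in-memory model. This is only an illustration under simplifying assumptions: the `Primary` and `Secondary` classes, the integer `clock`, and the `syncFrom` method are hypothetical names, not MongoDB APIs, and only inserts are modeled.

```javascript
// In-memory model only; not how mongod is implemented.
class Primary {
  constructor() {
    this.data = new Map();
    this.oplog = [];
    this.clock = 0; // hypothetical integer timestamp
  }
  insert(doc) {
    this.data.set(doc._id, doc);
    // Every change is also logged to the oplog.
    this.clock += 1;
    this.oplog.push({ ts: this.clock, op: 'i', o: doc });
  }
}

class Secondary {
  constructor() {
    this.data = new Map();
    this.lastApplied = 0;
  }
  // Read the primary's oplog and apply every entry not yet applied.
  syncFrom(primary) {
    for (const entry of primary.oplog) {
      if (entry.ts > this.lastApplied) {
        this.data.set(entry.o._id, entry.o);
        this.lastApplied = entry.ts;
      }
    }
  }
}

const primary = new Primary();
const secondary = new Secondary();
primary.insert({ _id: 1, item: 'envelopes' });
console.log(secondary.data.has(1)); // false: a read here would be stale
secondary.syncFrom(primary);
console.log(secondary.data.has(1)); // true
```

The gap between the insert and the call to `syncFrom` is the window in which a secondary read returns stale data.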
Cluster
Primary Election
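A toy check of the majority rule used in an election: a candidate needs more than half of the votes of the voting members. The function name `hasMajority` is hypothetical, and real elections also consider factors that this sketch ignores, such as member priority and how current a member's oplog is.

```javascript
// Toy majority rule; not the actual election protocol.
function hasMajority(votesReceived, votingMembers) {
  return votesReceived > Math.floor(votingMembers / 2);
}

console.log(hasMajority(2, 3)); // true
console.log(hasMajority(2, 4)); // false
console.log(hasMajority(1, 2)); // false
```

The last two cases suggest why an even-sized set is often given an Arbiter: it adds a vote without holding data, so a majority can still form.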
Read Preference
• Routes Read operations to Replica set Members
• Increases Read throughput
• Reduces Latency
• Secondary reads might be stale
• Modes
– Primary
– Primary Preferred (secondary if primary unavailable)
– Secondary
– Secondary Preferred
– Nearest (read from member with least network
latency)
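The five modes can be sketched as a routing function over an in-memory list of members. This is a simplified model, not a driver API: `pickMember` and the `{ name, state, pingMs }` member shape are hypothetical, and where real drivers choose among members within a latency window, this sketch simply takes the lowest ping.

```javascript
// Simplified routing model; member objects are hypothetical shapes.
function pickMember(members, mode) {
  const primary = members.find((m) => m.state === 'PRIMARY') || null;
  const secondaries = members.filter((m) => m.state === 'SECONDARY');
  // Lowest round-trip time wins among a candidate list.
  const closest = (list) =>
    list.length === 0 ? null : list.reduce((a, b) => (b.pingMs < a.pingMs ? b : a));

  switch (mode) {
    case 'primary':
      return primary;
    case 'primaryPreferred':
      return primary || closest(secondaries);
    case 'secondary':
      return closest(secondaries);
    case 'secondaryPreferred':
      return closest(secondaries) || primary;
    case 'nearest':
      return closest(primary ? [primary, ...secondaries] : secondaries);
    default:
      throw new Error('unknown read preference mode: ' + mode);
  }
}

const members = [
  { name: 'A', state: 'PRIMARY', pingMs: 20 },
  { name: 'B', state: 'SECONDARY', pingMs: 5 },
  { name: 'C', state: 'SECONDARY', pingMs: 12 },
];
console.log(pickMember(members, 'primary').name);            // A
console.log(pickMember(members, 'secondaryPreferred').name); // B
console.log(pickMember(members, 'nearest').name);            // B
```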
Write Preference
• Acknowledged by the Primary only (Default)
• Acknowledged by N replica set members (w: N)
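// Return only after the primary and at least one secondary have the write,
// or time out after 5000 ms.
db.products.insert(
   { item: "envelopes", qty: 100, type: "Clasp" },
   { writeConcern: { w: 2, wtimeout: 5000 } }
)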
WriteConcern: Unacknowledged
WriteConcern: Acknowledged
WriteConcern: Journaled
WriteConcern w:2
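The four write concern levels named on these slides can be compared with a small decision function. This is a model for illustration, not driver or server code: `isSatisfied` and the `state` shape (`appliedBy`, `journaledOnPrimary`) are hypothetical, `wtimeout` is not modeled, and edge cases such as combining `w: 0` with `j` are ignored.

```javascript
// Decide whether a write may be reported as successful to the client.
// `state` describes how far the write has progressed (hypothetical shape).
function isSatisfied(state, concern) {
  const w = concern.w === undefined ? 1 : concern.w;
  if (w === 0) return true; // unacknowledged: the client does not wait
  if (concern.j && !state.journaledOnPrimary) return false; // journaled
  return state.appliedBy.length >= w; // acknowledged by w members
}

const state = { appliedBy: ['primary'], journaledOnPrimary: false };
console.log(isSatisfied(state, { w: 0 }));          // true
console.log(isSatisfied(state, { w: 1 }));          // true
console.log(isSatisfied(state, { w: 1, j: true })); // false
console.log(isSatisfied(state, { w: 2 }));          // false

state.appliedBy.push('secondary-1');
console.log(isSatisfied(state, { w: 2 }));          // true
```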
Stream data from MongoDB
Oplog (Operation Log)
• Similar to Oracle Redo log
– Rolling record of all operations that modify the
data
– All writes (insert/update/delete) get an entry in
the Oplog
• Each replica set member has an oplog collection
– local.oplog.rs
– The oplog is just another collection in the database
Oplog in Action - Demo
Dissecting Oplog
Dissecting Oplog ..
• Oplog Contents
– ts: the time this operation occurred (a BSON Timestamp).
– h: a unique ID for this operation. Each operation will
have a different value in this field.
– op: the type of write operation to apply on the
secondary.
– ns: the namespace (database and collection) affected by this
operation.
– o: the document representing the operation.
– v: the version of the oplog format.
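For orientation, an oplog entry for an insert might look roughly like the object below. All values are hypothetical samples written as plain JavaScript (a real entry stores `ts` as a BSON Timestamp and `h` as a 64-bit number), and `splitNamespace` is a helper invented for this sketch.

```javascript
// Hypothetical sample entry; field names follow the list above.
const entry = {
  ts: { t: 1412553600, i: 1 }, // seconds since epoch + ordinal within that second
  h: '5369862912431462350',    // kept as a string here to avoid 64-bit precision loss
  v: 2,
  op: 'i',
  ns: 'test.products',
  o: { _id: 1, item: 'envelopes', qty: 100, type: 'Clasp' },
};

// The database is everything before the first dot; the rest is the
// collection name, which may itself contain dots.
function splitNamespace(ns) {
  const dot = ns.indexOf('.');
  return { db: ns.slice(0, dot), collection: ns.slice(dot + 1) };
}

console.log(splitNamespace(entry.ns));         // db "test", collection "products"
console.log(splitNamespace('local.oplog.rs')); // db "local", collection "oplog.rs"
```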
Oplog - op
• Op – Operation
– i inserts
– u updates
– d deletes
– n no-op
• Updates have an extra field
– o2
• o has the update information
• o2 has the _id of the document that was updated
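How a secondary might replay these operation types can be sketched against an in-memory `Map` keyed by `_id`. This is a simplification: `applyOp` is a hypothetical helper, only the `$set` form of an update is handled, and the entries are made-up samples.

```javascript
// Replay one oplog entry against an in-memory collection (Map keyed by _id).
function applyOp(collection, entry) {
  switch (entry.op) {
    case 'i': // insert: o is the full document
      collection.set(entry.o._id, Object.assign({}, entry.o));
      break;
    case 'u': { // update: o2 identifies the document, o describes the change
      const target = collection.get(entry.o2._id);
      if (target && entry.o.$set) Object.assign(target, entry.o.$set);
      break;
    }
    case 'd': // delete: o identifies the document
      collection.delete(entry.o._id);
      break;
    case 'n': // no-op: nothing to apply
      break;
    default:
      throw new Error('unsupported op: ' + entry.op);
  }
}

const products = new Map();
[
  { op: 'i', ns: 'test.products', o: { _id: 1, item: 'envelopes', qty: 100 } },
  { op: 'i', ns: 'test.products', o: { _id: 2, item: 'stamps', qty: 50 } },
  { op: 'u', ns: 'test.products', o2: { _id: 1 }, o: { $set: { qty: 90 } } },
  { op: 'n', ns: '', o: { msg: 'sample no-op' } },
  { op: 'd', ns: 'test.products', o: { _id: 2 } },
].forEach((e) => applyOp(products, e));

console.log(products.get(1).qty); // 90
console.log(products.has(2));     // false
```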
Triggers?
• Does MongoDB have triggers?
– Not built in; tailable cursors on the oplog can stand in
• tail –f oplog
• Notice any issues with the oplog?
– Aren't we doubling the size of the database?
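The trigger-like pattern can be sketched without a server: remember the last timestamp seen and hand only newer entries for one namespace to a callback. This is a polling model under stated assumptions; `tailOplog` is a hypothetical helper, timestamps are plain integers, and a real tailable cursor keeps the cursor open on the capped collection instead of rescanning an array.

```javascript
// Polling model of "tail -f" on an oplog array.
function tailOplog(oplog, ns, handler) {
  let lastTs = 0;
  return function poll() {
    let fired = 0;
    for (const entry of oplog) {
      if (entry.ts <= lastTs) continue; // already seen
      lastTs = entry.ts;
      if (entry.ns === ns) {
        handler(entry);
        fired += 1;
      }
    }
    return fired;
  };
}

const oplog = [
  { ts: 1, op: 'i', ns: 'test.products', o: { _id: 1, qty: 100 } },
  { ts: 2, op: 'i', ns: 'test.orders', o: { _id: 9 } },
];
const fired = [];
const poll = tailOplog(oplog, 'test.products', (e) => fired.push(e.op + '@' + e.ts));

poll(); // fires for ts 1, skips the test.orders entry
oplog.push({ ts: 3, op: 'd', ns: 'test.products', o: { _id: 1 } });
poll(); // fires only for the new entry, ts 3
console.log(fired); // [ 'i@1', 'd@3' ]
```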
Oplog ..
• Capped Collection (fixed Size collection)
– Circular Queue
– Default Oplog size depends on the OS
– Oldest entries get overwritten
• What if a secondary falls so far behind that the oplog entries
it needs have been overwritten?
– Full Resync
• copyDatabase starts streaming from oplog
– What if the oplog rolls over while the secondary is
completing the copy?
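The circular-queue behaviour and the resync condition can be sketched as follows. This is a model with stated simplifications: `CappedOplog` and `canCatchUp` are hypothetical names, the cap is counted in entries (a real capped collection is capped by size in bytes), and timestamps are consecutive integers.

```javascript
// Fixed-capacity log that drops its oldest entry when full.
class CappedOplog {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = [];
  }
  append(entry) {
    this.entries.push(entry);
    if (this.entries.length > this.maxEntries) this.entries.shift(); // oldest is overwritten
  }
  // A secondary can resume only if its last applied entry is still in the log;
  // otherwise it has fallen off the end and needs a full resync.
  canCatchUp(lastAppliedTs) {
    if (this.entries.length === 0) return true;
    return this.entries[0].ts <= lastAppliedTs;
  }
}

const log = new CappedOplog(3);
for (let ts = 1; ts <= 5; ts += 1) log.append({ ts, op: 'n' });

console.log(log.entries.map((e) => e.ts)); // [ 3, 4, 5 ]
console.log(log.canCatchUp(3));            // true: can resume from ts 3
console.log(log.canCatchUp(2));            // false: full resync needed
```

The same check answers the question about the copy: if the log rolls past the point where the copy started, the secondary cannot resume from it.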
Non-Replicated Collections
• local database
– Collections in local don’t get replicated
– Changes to the collections in local database don’t
show up in the oplog
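A minimal model of this rule: a write to any namespace is applied on the member, but only writes outside the `local` database are logged for replication. `Member` and its fields are hypothetical names for this sketch, and `local.scratch` is a made-up collection.

```javascript
// Writes are stored locally; only non-local namespaces reach the oplog.
class Member {
  constructor() {
    this.collections = new Map();
    this.oplog = [];
    this.clock = 0; // hypothetical integer timestamp
  }
  insert(ns, doc) {
    if (!this.collections.has(ns)) this.collections.set(ns, []);
    this.collections.get(ns).push(doc);
    if (!ns.startsWith('local.')) {
      this.clock += 1;
      this.oplog.push({ ts: this.clock, op: 'i', ns, o: doc });
    }
  }
}

const member = new Member();
member.insert('test.products', { _id: 1 });
member.insert('local.scratch', { _id: 1 });

console.log(member.oplog.map((e) => e.ns));           // [ 'test.products' ]
console.log(member.collections.get('local.scratch')); // the document is still stored locally
```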
Questions


Editor's Notes

  • #4 One size does not fit all: the ability of the system to store, analyze, and manipulate the data without losing availability, performance, and throughput as the data increases; to enforce data integrity and schema rules; to enable high-performance queries on complex, connected data; and to easily represent the complex, connected data stored in today’s applications.
    The type of NoSQL database you choose depends on what type of data you need to store and how you want to access it. A graph database, for instance, models real-world connections better than other NoSQL databases. A column-family store is a powerful way to capture semi-structured data, but often sacrifices consistency for availability. A document database contains a collection of key-value pairs stored in documents. While it is good at storing documents, it was not designed with enterprise-strength transactions and durability in mind. Document databases are the most flexible of the key-value style stores, perfect for storing a large collection of unrelated, discrete documents.
    Relational DBs scale by adding more processor, memory, or disk space (loading data from disk?). Try adding a new column to a very large relational database. Oracle RAC: multiple computers with access to the same database over shared storage facilities that do not scale out.
    Availability and Consistency: a single database with all your data; this might not work, since all the data needs to be in a single instance of the database. Partition Tolerance and Consistency: two-phase commits across databases. Availability and Partition Tolerance: the NoSQL way. NoSQL storage is highly replicated (a commit doesn’t occur until the data is successfully written to at least two separate storage devices) and the file systems are optimized for write-only commits.
    Consistency: all nodes see the same data at the same time. Availability: node failures don’t prevent survivors from continuing to operate. Partition tolerance: no failures less than total network failure cause the system to fail. Don’t be stubborn; neither NoSQL nor traditional databases apply to all cases. Apply the CAP Theorem to your use cases to determine feasibility.
  • #5 Relational DBs scale by adding more processor, memory, or disk space (loading data from disk?). Try adding a new column to a very large relational database. Oracle RAC: multiple computers with access to the same database over shared storage facilities that do not scale out.
    Availability and Consistency: a single database with all your data; this might not work, since all the data needs to be in a single instance of the database. Partition Tolerance and Consistency: two-phase commits across databases. Availability and Partition Tolerance: the NoSQL way. To scale horizontally, you need strong network partition tolerance, which requires giving up either consistency or availability. NoSQL storage is highly replicated (a commit doesn’t occur until the data is successfully written to at least two separate storage devices) and the file systems are optimized for write-only commits.
    Consistency means that each client always has the same view of the data. Availability means that all clients can always read and write. Partition tolerance means that the system works well across physical network partitions. Don’t be stubborn; neither NoSQL nor traditional databases apply to all cases. Apply the CAP Theorem to your use cases to determine feasibility.
  • #6 MongoDB is a document-based NoSQL database that bridges the gap between scalable key-value stores like Datastore and Memcache DB, and the querying and robustness capabilities of an RDBMS.
    – Document-oriented storage: data is manipulated as JSON-like documents
    – Querying: uses JavaScript and has APIs for submitting queries in every major programming language
    – In-place updates: atomicity
    – Indexing: any attribute in a document may be used for indexing and query optimization
    – Auto-sharding: enables horizontal scalability
    – Map/reduce: the MongoDB cluster may run smaller MapReduce jobs than a Hadoop cluster with significant cost and efficiency improvements
  • #11 A replica set is a group of mongod instances that host the same data set. One mongod, the primary, receives all write operations. All other instances, secondaries, apply operations from the primary so that they have the same data set.
  • #13  When a primary does not communicate with the other members of the set for more than 10 seconds, the replica set will attempt to select another member to become the new primary. The first secondary that receives a majority of the votes becomes primary.
  • #15 With request association, a thread keeps reading from the same member until the application performs a read with a different read preference, the thread terminates, or the client receives a socket exception, as is the case when there’s a network error or when the mongod closes connections during a failover. This triggers a retry, which may be transparent to the application. When using request association, if the client detects that the set has elected a new primary, the driver will discard all associations between threads and members.
  • #16 You can override the default write concern, for example to confirm write operations on a specified number of the replica set members. MongoDB does not provide any multi-document transactions or isolation.
  • #17 The following method includes a write concern that specifies that the method return only after the write propagates to the primary and at least one secondary, or the method times out after 5 seconds.
  • #18 Acknowledged: with a receipt-acknowledged write concern, the mongod confirms that it received the write operation and applied the change to the in-memory view of data. Acknowledged write concern allows clients to catch network, duplicate key, and other errors. MongoDB uses the acknowledged write concern by default in recent driver releases. Acknowledged write concern does not confirm that the write operation has persisted to the disk system.
  • #19 With a journaled write concern, MongoDB acknowledges the write operation only after committing the data to the journal. This write concern ensures that MongoDB can recover the data following a shutdown or power interruption. You must have journaling enabled to use this write concern. With a journaled write concern, write operations must wait for the next journal commit. To reduce latency for these operations, MongoDB also increases the frequency that it commits operations to the journal.
  • #22 Capped collection: a fixed-size collection that automatically overwrites its oldest entries when it reaches its maximum size. The MongoDB oplog used in replication is a capped collection.
  • #26 GO OVER other possible updates
  • #28 Oplog is the reason why capped collections were invented