12. #MDBlocal
CHANGE STREAMS PRESENT A DEFINED API, UTILIZE COLLECTION ACCESS CONTROLS, SHOW ONLY DURABLE CHANGES, AND ENABLE SCALING ACROSS ALL YOUR NODES.
1. SHARD KEY IS USED (BY THE MONGOS) TO ROUTE OPERATIONS
2. _ID IS USED TO UNIQUELY IDENTIFY A DOCUMENT
[Diagram: mongos routing operations to Shard 1, Shard 2, and Shard 3]
DEFINING THE documentKey
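To make the two points above concrete, here is a sketch of what a change event's documentKey looks like. The field names and values are illustrative; the key idea is that for a sharded collection the documentKey carries the shard key fields alongside _id.

```javascript
// Illustrative change event (hypothetical data). Suppose the collection
// is sharded on { roomId: 1 }: the mongos uses the shard key to route,
// and _id uniquely identifies the document, so the documentKey for a
// sharded collection includes both.
const changeEvent = {
  operationType: 'insert',
  ns: { db: 'iot', coll: 'temperatures' },
  // documentKey = _id plus the shard key field(s)
  documentKey: { _id: 'abc123', roomId: 'living-room' },
  fullDocument: { _id: 'abc123', roomId: 'living-room', temperature: 68 },
};

console.log(Object.keys(changeEvent.documentKey));
```

For an unsharded collection, the documentKey would contain only _id.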
Who am I, what do I do; then introduce the new feature in 3.6, but first some background.
Simple definition.
Change streams allow you to watch all the changes against a collection. When you open a change stream against a collection, you will see every change applied to documents that belong to that collection. On each change, a listening application is notified with a document describing the change.
Imagine this scenario: I have a smart thermometer in my apartment. Every second it inserts a document into MongoDB with temperature data. Change streams can then notify any listening application of every change in my temperature collection. So as the thermometer inserts new data, change streams will inform my listening action handler. In this case, the action handler is an application I built to turn on a fan every time the temperature goes above 70 degrees.
Don’t let the simplicity of this example fool you; the concept applies to everything from back-office management applications to end-user applications. Change streams are extremely powerful.
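The thermometer scenario above can be sketched as a handler that reacts to each change event. This is a minimal sketch under stated assumptions: the event shape is the one change streams deliver, and the fan object is a stand-in for real hardware.

```javascript
// Hypothetical action handler for the thermometer example: given a change
// event document, turn the fan on when the temperature exceeds the
// 70-degree threshold from the example above.
const FAN_THRESHOLD = 70;

function handleChange(change, fan) {
  if (change.operationType !== 'insert') return; // the thermometer only inserts
  const temp = change.fullDocument.temperature;
  if (temp > FAN_THRESHOLD) fan.turnOn();
  else fan.turnOff();
}

// Simulated fan with a simple on/off state (stand-in for real hardware).
const fan = {
  on: false,
  turnOn() { this.on = true; },
  turnOff() { this.on = false; },
};

handleChange(
  { operationType: 'insert', fullDocument: { temperature: 74 } },
  fan
);
```

Wiring this handler to a real change stream is shown on the next slide.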
So, this slide is to show you how incredibly simple it is to implement change streams in your application. This example happens to be in node.
Here, I am defining a change stream, then starting it so that every change gets printed out.
It’s extremely simple and incredibly powerful. With change streams, you get a real-time feed of the changes to your data, enabling your application to react to these changes immediately.
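The slide's Node example amounts to a few lines like the sketch below. The collection would come from the official MongoDB driver (e.g. `client.db(...).collection(...)`); here the function only assumes it exposes `watch()` returning an emitter of `'change'` events, and the extra callback parameter is my addition so the handler can be anything, not just printing.

```javascript
// Minimal sketch of opening a change stream in Node. `collection` is
// assumed to have a watch() method returning a ChangeStream that emits
// 'change' events, as the official driver's collections do.
function watchCollection(collection, onChange) {
  const changeStream = collection.watch(); // open the change stream
  changeStream.on('change', (change) => {
    // The slide simply prints every change; we also hand it to a callback.
    console.log(change.operationType, change.documentKey);
    onChange(change);
  });
  return changeStream;
}
```

In a real application you would call this with a driver collection; the returned stream should be closed with `close()` when you are done.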
What type of operations can you "watch"? Essentially, write operations on documents. You cannot see queries, commands, or metadata events, but you can see any write that changes a document. I'll go into the specifics of how each of these looks a bit later.
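A handler often dispatches on the event's operationType. The sketch below covers the document-level write types discussed here (insert, update, replace, delete); the wording of each message is illustrative.

```javascript
// Sketch of dispatching on operationType. Change streams report writes
// that modify documents; insert, update, replace and delete are the
// document-level event types.
function describe(change) {
  switch (change.operationType) {
    case 'insert':
      return `inserted ${change.documentKey._id}`;
    case 'update':
      return `updated ${change.documentKey._id}`;
    case 'replace':
      return `replaced ${change.documentKey._id}`;
    case 'delete':
      return `deleted ${change.documentKey._id}`;
    default:
      return `other event: ${change.operationType}`;
  }
}
```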
Change streams provide certain guarantees to make it easier for you to implement powerful functionality:
stable API
access control
guaranteed durable data
scaling
well ordered (even sharded)
resumable
aggregation syntax for more advanced features
Sharding: the mongos routes data to shards, so what's the order of operations on shards relative to each other?
open change stream against mongos
Guaranteed ordering: a Lamport clock, or logical cluster time, that all components of the cluster know about. If they fall behind, they get the latest cluster time and increment relative to that.
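The "fall behind, catch up, then increment" behavior is the classic Lamport clock rule. This is an illustrative sketch of the idea, not MongoDB's actual cluster-time implementation.

```javascript
// Illustrative Lamport-style logical clock. Each component keeps a
// counter, bumps it on local events, and on seeing a later time adopts
// max(own, received) before incrementing, so causally related events
// get a consistent order across the cluster.
class LamportClock {
  constructor() { this.time = 0; }
  tick() { return ++this.time; }        // local event
  receive(remoteTime) {                 // saw a later cluster time?
    this.time = Math.max(this.time, remoteTime) + 1; // catch up, then advance
    return this.time;
  }
}
```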
Durable: no changes are shown that can be rolled back.
What is rollback?
Changes appear only once committed to a majority of the nodes in the replica set (so there are no change streams against a standalone).
Resumable: what does that mean? If you lose your place, retry from where you last were.
The driver takes care of most details.
As with retryable writes, only certain errors can be resumed from.
If your application crashes or is stopped for some reason, you are responsible for storing the resume token and passing it to the watch method when you restart. If you don't, you will start from "now" and may miss changes made while the application was down.
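A sketch of that restart logic, assuming the resume token is the `_id` of each change event and that persisting it (to a file, another collection, etc.) is handled elsewhere:

```javascript
// Build the options object for collection.watch(pipeline, options).
// With a saved token we resume where we left off via resumeAfter;
// without one the stream starts from "now" and changes made while the
// application was down are missed.
function watchOptions(savedToken) {
  return savedToken ? { resumeAfter: savedToken } : {};
}

// Remember our place after handling each event. How savedToken is
// persisted across restarts is up to the application.
let savedToken = null;
function onChange(change) {
  savedToken = change._id; // the resume token
}
```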
Familiar syntax for additional processing of change stream data: you provide an aggregation pipeline, and the documents returned stream through it; hence only some stages are allowed.
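For example, a pipeline like the one below could be passed to `collection.watch(pipeline)`; $match and $project are among the permitted stages. The field names and threshold tie back to the thermometer example and are illustrative.

```javascript
// Example pipeline for collection.watch(pipeline): keep only inserts
// whose temperature exceeds 70, and trim each event down to the fields
// the handler needs.
const pipeline = [
  { $match: { operationType: 'insert', 'fullDocument.temperature': { $gt: 70 } } },
  { $project: { documentKey: 1, 'fullDocument.temperature': 1 } },
];
```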
SUMMARY
Look closer at the different operation types and what details they provide.
Now we are going to move into some of the characteristics of change streams.
NOW you should close the change stream and re-open it with a different list of watched rooms, and you should probably add an $or watching for fullDocument.username matching you, in case there's a change there.
The oplog is an internal mechanism that tracks all changes across the entire cluster, whether they are data or system changes.
The primary accepts all writes and records them into the oplog.
Secondaries tail the oplog, meaning they watch it, sort of like watching a change stream(!). In fact, that was a way you could do this yourself in the past.