Making Mongo realtime - oplog tailing in Meteor


Transcript

  • 1. Making Mongo Realtime: Oplog tailing in Meteor
    David Glasser, Meteor DevShop 10, 2013-Dec-05
  • 2. Meteor makes realtime the default
    When data changes in your database, everybody's web UI updates automatically without you having to write any custom code.
  • 3. How the Meteor realtime stack works
    The server runs its publish function, which typically returns a cursor:
        Meteor.publish("top-scores", function (gameId) {
          check(gameId, String);
          return Scores.find({game: gameId},
                             {sort: {score: -1}, limit: 5, fields: {score: 1, user: 1}});
        });
    The server watches the query in the database and observes its changes.
    When changes happen, the server sends DDP data messages to the client.
    The client updates its local cache.
    Changes to the local cache cause Meteor UI to re-render templates.
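As a rough illustration of the last three steps, the sketch below applies DDP-style data messages to a plain in-memory cache. The message fields (`msg`, `collection`, `id`, `fields`, `cleared`) follow the DDP data messages; the `applyDdpMessage` function and the Map-based cache are hypothetical stand-ins for illustration, not Meteor's client code.

```javascript
// Hypothetical client-side cache: collection name -> Map of id -> document.
function applyDdpMessage(cache, message) {
  if (!cache.has(message.collection)) cache.set(message.collection, new Map());
  const docs = cache.get(message.collection);
  switch (message.msg) {
    case "added":
      docs.set(message.id, Object.assign({}, message.fields));
      break;
    case "changed": {
      const doc = docs.get(message.id);
      if (!doc) throw new Error("changed for unknown document " + message.id);
      Object.assign(doc, message.fields || {});            // new or updated fields
      for (const key of message.cleared || []) delete doc[key]; // removed fields
      break;
    }
    case "removed":
      docs.delete(message.id);
      break;
    default:
      break; // other message types are ignored in this sketch
  }
  return cache;
}

const cache = new Map();
applyDdpMessage(cache, {msg: "added", collection: "scores", id: "xxx",
                        fields: {user: "avital", score: 150}});
applyDdpMessage(cache, {msg: "changed", collection: "scores", id: "xxx",
                        fields: {user: "avi"}});
console.log(cache.get("scores").get("xxx"));
```

In a real client, a UI layer would react to each cache change; here the cache is only inspected by hand.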
  • 4. observeChanges makes Mongo a realtime database
    observeChanges is a brand new API in Meteor's Mongo client interface.
        handle = Messages.find({room: roomId}).observeChanges({
          added: function (id, fields) {...},
          changed: function (id, fields) {...},
          removed: function (id) {...}
        });
    observeChanges executes the query and calls the added callback for each matching document.
    It continues to watch the database and notices when the query's results change.
    When the results change, it calls the added, changed, and removed callbacks asynchronously.
    This continues until you call handle.stop().
  • 5. observeChanges supports all Mongo queries
    Meteor turns the full query API of a real database into a live query API.
    No more custom per-query code to monitor the database and see when it changes.
    It's our job to make observeChanges as efficient as possible for as many queries as possible.
  • 6. poll-and-diff
    Run a query over and over, and compare the results each time.
        var results = {};
        var pollAndDiff = function () {
          cursor.rewind();
          var oldResults = results;
          results = {};
          cursor.forEach(function (doc) {
            results[doc._id] = doc;
            if (_.has(oldResults, doc._id))
              callbacks.changed(doc._id, changedFieldsBetween(oldResults[doc._id], doc));
            else
              callbacks.added(doc._id, doc);
          });
          _.each(oldResults, function (doc, id) {
            if (!_.has(results, id))
              callbacks.removed(id);
          });
        };
        setInterval(pollAndDiff, 10 * 1000);
        whenWeWriteToTheCollection(pollAndDiff);
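The poll-and-diff step can be tried outside Meteor with the self-contained sketch below. It takes a plain array of documents in place of a cursor. The field-diff helper is only a guess at what such a helper might do (a shallow comparison), and unlike the slide this version assumes `changed` should fire only when some field actually differs.

```javascript
// Hypothetical helper: fields whose values differ between two versions of a
// document. Shallow comparison via JSON; fields that disappeared map to undefined.
function changedFieldsBetween(oldDoc, newDoc) {
  const changed = {};
  for (const key of Object.keys(newDoc)) {
    if (key !== "_id" && JSON.stringify(oldDoc[key]) !== JSON.stringify(newDoc[key])) {
      changed[key] = newDoc[key];
    }
  }
  for (const key of Object.keys(oldDoc)) {
    if (!(key in newDoc)) changed[key] = undefined;
  }
  return changed;
}

// One poll-and-diff step: compare a fresh result set with the previous one
// and return the new result map for the next round.
function diffResults(oldResults, newDocs, callbacks) {
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  const results = {};
  for (const doc of newDocs) {
    results[doc._id] = doc;
    if (has(oldResults, doc._id)) {
      const fields = changedFieldsBetween(oldResults[doc._id], doc);
      if (Object.keys(fields).length > 0) callbacks.changed(doc._id, fields);
    } else {
      callbacks.added(doc._id, doc);
    }
  }
  for (const id of Object.keys(oldResults)) {
    if (!has(results, id)) callbacks.removed(id);
  }
  return results;
}

const events = [];
const callbacks = {
  added: (id) => events.push(["added", id]),
  changed: (id, fields) => events.push(["changed", id, fields]),
  removed: (id) => events.push(["removed", id]),
};
let results = diffResults({}, [{_id: "a", score: 1}, {_id: "b", score: 2}], callbacks);
results = diffResults(results, [{_id: "a", score: 5}, {_id: "c", score: 3}], callbacks);
console.log(events);
```

Each call costs a full pass over the result set, which is the cost the later slides set out to avoid.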
  • 7. When do we re-run the query?
    Every time we think the query may have changed: specifically, any time that the current Meteor server process writes to the collection.
    Additionally, every 10 seconds, to catch writes from other processes.
  • 8. Benefits of poll-and-diff
    Code is short and correct.
    Writes from the current process are reflected on clients immediately.
    Writes from other processes are reflected on clients eventually.
  • 9. Drawbacks of poll-and-diff
    1. Cost proportional to poll frequency: number of polls grows with the frequency of data change.
    2. Cost proportional to query result size: Mongo bandwidth, CPU to parse BSON and perform recursive diffs, etc.
    3. Latency depends on whether a write originated from the same server (very low) or another process (10 seconds).
  • 10. Optimizations to poll-and-diff
    1. Infer that a write does not affect an observe, then skip the poll (e.g., when both the write and the query specify a specific _id).
    2. Query de-duplication: if multiple connections want to subscribe to the same query, use the same poll-and-diff fiber for all of them.
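Both optimizations can be sketched in a few lines. The names `canSkipPoll`, `PollerRegistry`, and `startPoller` are hypothetical and chosen for illustration; the registry shares one poller per identical query by using the serialized query as a key, which assumes callers spell identical queries with the same key order.

```javascript
// Optimization 1 (assumed form): if the write and the query each pin a
// specific, different _id, the write cannot touch the query's results.
function canSkipPoll(writeSelector, querySelector) {
  const isPlainId = (v) => typeof v === "string" || typeof v === "number";
  return isPlainId(writeSelector._id) && isPlainId(querySelector._id) &&
    writeSelector._id !== querySelector._id;
}

// Optimization 2: one poller per distinct query, shared by all subscribers.
class PollerRegistry {
  constructor(startPoller) {
    this.startPoller = startPoller; // hypothetical factory: (selector, options) => poller
    this.pollers = new Map();
  }
  subscribe(selector, options) {
    const key = JSON.stringify([selector, options]);
    let entry = this.pollers.get(key);
    if (!entry) {
      entry = {poller: this.startPoller(selector, options), refCount: 0};
      this.pollers.set(key, entry);
    }
    entry.refCount += 1;
    return {
      poller: entry.poller,
      // Call stop once per subscription; the last one drops the shared poller.
      stop: () => {
        entry.refCount -= 1;
        if (entry.refCount === 0) this.pollers.delete(key);
      },
    };
  }
}

console.log(canSkipPoll({_id: "a"}, {_id: "b"}));   // different ids: safe to skip
console.log(canSkipPoll({_id: "a"}, {game: "x"}));  // query has no _id: must poll
```

A production version would also need to stop the underlying poller when the last subscriber leaves; the sketch only forgets it.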
  • 11. Oplog tailing
  • 12. The Mongo oplog
    MongoDB replication uses an operation log describing exactly what has changed in the database.
    You can tail the oplog: follow along and find out about every change immediately.
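For orientation, the sketch below converts raw oplog entries into a simplified `{op, id, ...}` shape like the one used on the conceptual slides. The raw field names (`op` of "i", "u", or "d", `ns`, `o`, `o2`) reflect the classic replica-set oplog format as I understand it; newer MongoDB versions may record updates differently. `simplifyOplogEntry` is a hypothetical name.

```javascript
// Convert a raw oplog entry into a simplified description of the change.
// Returns null for entries that are not document writes (no-ops, commands).
function simplifyOplogEntry(entry) {
  switch (entry.op) {
    case "i": // insert: `o` is the whole new document
      return {op: "insert", ns: entry.ns, id: entry.o._id, doc: entry.o};
    case "u": // update: `o2` identifies the document, `o` is the modifier
      return {op: "update", ns: entry.ns, id: entry.o2._id, modifier: entry.o};
    case "d": // delete: `o` identifies the document
      return {op: "remove", ns: entry.ns, id: entry.o._id};
    default:
      return null;
  }
}

console.log(simplifyOplogEntry({
  op: "u", ns: "app.scores", o2: {_id: "xxx"}, o: {$set: {user: "avi"}},
}));
```

Reading the entries themselves means following a tailable cursor on the oplog collection, which is not shown here.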
  • 13. Using oplog tailing for observeChanges
    We're going to let the database tell us what changed.
  • 14. Oplog tailing, conceptually
        Scores.find({game: "carrom"}, {sort: {score: -1}, limit: 3, fields: {score: 1, user: 1}})
    Run the query and cache:
        {_id: "xxx", user: "avital", score: 150}
        {_id: "yyy", user: "naomi", score: 140}
        {_id: "zzz", user: "slava", score: 130}
    Oplog says:
        {op: "insert", id: "www", {game: "skee-ball", user: "glasser", score: 1000}}
    Ignore it: does not match selector.
  • 15. Oplog tailing, conceptually
        Scores.find({game: "carrom"}, {sort: {score: -1}, limit: 3, fields: {score: 1, user: 1}})
    Cache is:
        {_id: "xxx", user: "avital", score: 150}
        {_id: "yyy", user: "naomi", score: 140}
        {_id: "zzz", user: "slava", score: 130}
    Oplog says:
        {op: "insert", id: "aaa", {game: "carrom", user: "glasser", score: 10}}
    Ignore it: selector matches, but the score is not high enough.
  • 16. Oplog tailing, conceptually
        Scores.find({game: "carrom"}, {sort: {score: -1}, limit: 3, fields: {score: 1, user: 1}})
    Cache is:
        {_id: "xxx", user: "avital", score: 150}
        {_id: "yyy", user: "naomi", score: 140}
        {_id: "zzz", user: "slava", score: 130}
    Oplog says:
        {op: "remove", id: "ppp"}
    Ignore it: removing something we aren't publishing can't affect us (unless the skip option is set!).
  • 17. Oplog tailing, conceptually
        Scores.find({game: "carrom"}, {sort: {score: -1}, limit: 3, fields: {score: 1, user: 1}})
    Cache is:
        {_id: "xxx", user: "avital", score: 150}
        {_id: "yyy", user: "naomi", score: 140}
        {_id: "zzz", user: "slava", score: 130}
    Oplog says:
        {op: "update", id: "ccc", {$set: {color: "blue"}}}
    This is a document not currently in the cursor. This change does not affect the selector or the sort criteria, so it can't affect the results. Ignore it!
  • 18. Oplog tailing, conceptually
        Scores.find({game: "carrom"}, {sort: {score: -1}, limit: 3, fields: {score: 1, user: 1}})
    Cache is:
        {_id: "xxx", user: "avital", score: 150}
        {_id: "yyy", user: "naomi", score: 140}
        {_id: "zzz", user: "slava", score: 130}
    Oplog says:
        {op: "update", id: "xxx", {$set: {color: "red"}}}
    This is a document in the cursor, but it does not affect the selector, sort criteria, or any published fields. Ignore it!
  • 19. Oplog tailing, conceptually
        Scores.find({game: "carrom"}, {sort: {score: -1}, limit: 3, fields: {score: 1, user: 1}})
    Cache is:
        {_id: "xxx", user: "avital", score: 150}
        {_id: "yyy", user: "naomi", score: 140}
        {_id: "zzz", user: "slava", score: 130}
    Oplog says:
        {op: "update", id: "ddd", {$set: {game: "dominion"}}}
    This is a document not currently in the cursor. This change is to a field from the selector, but it can't make it true. Ignore it!
  • 20. Oplog tailing, conceptually
        Scores.find({game: "carrom"}, {sort: {score: -1}, limit: 3, fields: {score: 1, user: 1}})
    Cache is:
        {_id: "xxx", user: "avital", score: 150}
        {_id: "yyy", user: "naomi", score: 140}
        {_id: "zzz", user: "slava", score: 130}
    Oplog says:
        {op: "update", id: "xxx", {$set: {user: "avi"}}}
    Invoke changed("xxx", {user: "avi"}). Cache is now:
        {_id: "xxx", user: "avi", score: 150}
        {_id: "yyy", user: "naomi", score: 140}
        {_id: "zzz", user: "slava", score: 130}
  • 21. Oplog tailing, conceptually
        Scores.find({game: "carrom"}, {sort: {score: -1}, limit: 3, fields: {score: 1, user: 1}})
    Cache is:
        {_id: "xxx", user: "avi", score: 150}
        {_id: "yyy", user: "naomi", score: 140}
        {_id: "zzz", user: "slava", score: 130}
    Oplog says:
        {op: "insert", id: "bbb", {user: "glasser", game: "carrom", score: 200}}
    Matches and sorts at the top! Invoke added("bbb", {user: "glasser", score: 200}) and removed("zzz"). Cache is now:
        {_id: "bbb", user: "glasser", score: 200}
        {_id: "xxx", user: "avi", score: 150}
        {_id: "yyy", user: "naomi", score: 140}
  • 22. Oplog tailing, conceptually
        Scores.find({game: "carrom"}, {sort: {score: -1}, limit: 3, fields: {score: 1, user: 1}})
    Cache is:
        {_id: "bbb", user: "glasser", score: 200}
        {_id: "xxx", user: "avi", score: 150}
        {_id: "yyy", user: "naomi", score: 140}
    Oplog says:
        {op: "update", id: "eee", {$set: {score: 500}}}
    This matches if "eee" has game: "carrom". We have to fetch doc "eee" from Mongo and check. If it does, invoke added("eee", {user: "emily", score: 500}) and removed("yyy"). Otherwise, do nothing.
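The decisions walked through on these slides can be collected into one small handler. This is a sketch under loud assumptions, not Meteor's implementation: the selector is equality-only, there is a single descending numeric sort field, updates use only `$set`, and `db.fetchDoc` and `db.runQuery` are hypothetical stand-ins for round trips to Mongo. Whenever local reasoning is not enough, it falls back to re-running the query.

```javascript
function observeViaOplog({selector, sortField, limit, fields, db, callbacks}) {
  const matches = (doc) => Object.keys(selector).every((k) => doc[k] === selector[k]);
  const project = (doc) => {
    const out = {};
    for (const f of fields) if (f in doc) out[f] = doc[f];
    return out;
  };
  let cache = []; // matching documents, highest sortField first, at most `limit`

  // Fallback: re-run the query and report the difference from the cache.
  const resync = () => {
    const fresh = db.runQuery();
    for (const old of cache) {
      if (!fresh.some((d) => d._id === old._id)) callbacks.removed(old._id);
    }
    for (const doc of fresh) {
      const old = cache.find((d) => d._id === doc._id);
      if (!old) { callbacks.added(doc._id, project(doc)); continue; }
      const diff = {};
      for (const f of fields) if (old[f] !== doc[f]) diff[f] = doc[f];
      if (Object.keys(diff).length > 0) callbacks.changed(doc._id, diff);
    }
    cache = fresh.map((d) => Object.assign({}, d));
  };

  // A matching document enters the cache only if it sorts inside the limit.
  const tryAdd = (doc) => {
    const full = cache.length >= limit;
    if (full && doc[sortField] <= cache[cache.length - 1][sortField]) return;
    cache.push(Object.assign({}, doc));
    cache.sort((a, b) => b[sortField] - a[sortField]);
    callbacks.added(doc._id, project(doc));
    if (cache.length > limit) callbacks.removed(cache.pop()._id);
  };

  const handle = (entry) => {
    const cached = cache.find((d) => d._id === entry.id);
    if (entry.op === "insert") {
      if (matches(entry.doc)) tryAdd(entry.doc);
    } else if (entry.op === "remove") {
      if (!cached) return; // not published, so it cannot affect us
      cache = cache.filter((d) => d !== cached);
      callbacks.removed(entry.id);
      resync(); // another document may move up into the limit
    } else if (entry.op === "update") {
      const set = entry.modifier.$set || {};
      const touched = Object.keys(set);
      const relevant = touched.some((k) => k === sortField || k in selector);
      if (cached) {
        if (relevant) return resync(); // may stop matching or change rank
        const diff = {};
        for (const k of touched) {
          if (fields.includes(k) && cached[k] !== set[k]) diff[k] = set[k];
        }
        Object.assign(cached, set);
        if (Object.keys(diff).length > 0) callbacks.changed(entry.id, diff);
      } else {
        if (!relevant) return; // cannot start matching or change rank
        const breaks = touched.some((k) => k in selector && set[k] !== selector[k]);
        if (breaks) return; // the new value makes the selector false
        const doc = db.fetchDoc(entry.id); // need the rest of the document
        if (doc && matches(doc)) tryAdd(doc);
      }
    }
  };

  resync(); // initial query: emits `added` for each starting document
  return {handle, getCache: () => cache.map((d) => project(d))};
}

// Fake "Mongo": writes land in `collection` before the oplog entry is handled.
const collection = [
  {_id: "xxx", game: "carrom", user: "avital", score: 150},
  {_id: "yyy", game: "carrom", user: "naomi", score: 140},
  {_id: "zzz", game: "carrom", user: "slava", score: 130},
  {_id: "eee", game: "carrom", user: "emily", score: 5},
];
const db = {
  fetchDoc: (id) => collection.find((d) => d._id === id),
  runQuery: () => collection.filter((d) => d.game === "carrom")
    .sort((a, b) => b.score - a.score).slice(0, 3),
};
const log = [];
const observer = observeViaOplog({
  selector: {game: "carrom"}, sortField: "score", limit: 3, fields: ["score", "user"], db,
  callbacks: {
    added: (id, f) => log.push(["added", id, f]),
    changed: (id, f) => log.push(["changed", id, f]),
    removed: (id) => log.push(["removed", id]),
  },
});
collection[0].user = "avi";
observer.handle({op: "update", id: "xxx", modifier: {$set: {user: "avi"}}});
console.log(log[log.length - 1]);
```

The fallback to `resync` is the sketch's way of staying correct in cases it cannot reason about locally, such as a cached document's score changing; the real work lies in shrinking that set of cases.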
  • 23. Minimongo on the server
    In order to process the oplog, we need to be able to interpret Mongo selectors, field specifiers, sort specifiers, etc.
    This was not the case for poll-and-diff.
    Fortunately, Meteor already can do this: minimongo, our client-side local database cache!
    When moving minimongo to the server, we need to be very careful that we perfectly match Mongo's implementation, even in complex cases (nested arrays, nulls, etc).
    Synchronizing between the "initial query" and the oplog tailing is very subtle.
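To give a feel for why matching Mongo exactly takes care, here is a tiny equality matcher that handles two behaviors that are easy to miss: a selector value of null also matches documents lacking the field, and a scalar selector value matches an array field that contains it. `matchesEquality` is a hypothetical name; it covers only top-level keys and is nowhere near a full selector implementation.

```javascript
// Equality-only matching on top-level fields, with two Mongo-style rules:
//   {field: null}  matches documents where field is null OR missing
//   {field: "x"}   matches documents where field is "x" OR an array containing "x"
function matchesEquality(doc, selector) {
  return Object.keys(selector).every((key) => {
    const wanted = selector[key];
    const actual = doc[key];
    if (wanted === null) return actual === null || actual === undefined;
    if (Array.isArray(actual) && !Array.isArray(wanted)) return actual.includes(wanted);
    return JSON.stringify(actual) === JSON.stringify(wanted);
  });
}

console.log(matchesEquality({user: "avi"}, {team: null}));          // missing field
console.log(matchesEquality({tags: ["fast", "new"]}, {tags: "new"})); // array element
```

Operators, dotted paths, and nested arrays each add further cases that a server-side matcher would have to get right.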
  • 24. Benchmarks
    Running benchmarks with various high write loads.
    Benchmark with lots of inserts and few updates: 10x more connected clients.
    Benchmark with lots of updates: 2.5x more connected clients.
    Goal: Scale Meteor so that the DB is the limiting factor.
    Bottleneck: Mongo server CPU/bandwidth.
    Can fix by reading from Mongo replicas.
    More unimplemented heuristics.
  • 25. On devel today!
    Oplog tailing for an initial class of Mongo queries in the next release.
    Other classes of queries will be supported by 1.0.
    Current implementation runs automatically for dev-mode meteor run and can be enabled in production with $MONGO_OPLOG_URL.
    Works with Galaxy!
  • 26. Thanks! Any questions?