MongoDB for Time Series Data Part 3: Sharding

  • Priority
    Floating-point number between 0 and 1000
    The highest-priority member that is up to date wins the election
    Up to date == within 10 seconds of the primary
    If a higher-priority member catches up, it will force an election and win

    Slave Delay
    Lags behind the primary by a configurable time delay
    Automatically hidden from clients
    Protects against operator errors, e.g. fat-fingered commands or an application corrupting data
    (See the configuration sketch after these notes.)
  • Initialize -> Election
    A primary is elected, then data replicates from the primary to the secondaries
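  • (Not in the original deck) A minimal replica set configuration sketch showing how priority and slaveDelay are set; hosts A-E mirror the configuration slides later in the deck, and priority : 0 is added to the delayed member here because a delayed member should not be electable:
    > conf = {
        _id : "mySet",
        members : [
          {_id : 0, host : "A", priority : 3},   // preferred primary; priority is a float from 0 to 1000
          {_id : 1, host : "B", priority : 2},   // takes over only if up to date (within 10 s of the primary)
          {_id : 2, host : "C"},                 // default priority 1
          {_id : 3, host : "E", hidden : true, priority : 0, slaveDelay : 3600}   // hidden, lags the primary by one hour
        ]
      }
    > rs.initiate(conf)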
  • MongoDB for Time Series Data Part 3: Sharding

    1. Sharding Time Series Data. Jake Angerman, Sr. Solutions Architect, MongoDB. #MongoDBWorld
    2. Let's Pretend We Are DevOps. What my friends think I do, what society thinks I do, what my Mom thinks I do, what my boss thinks I do, what I think I do, what I really do.
    3. Sharding Overview. (Diagram: the application and driver connect through a tier of query routers to Shard 1 … Shard N, each shard a replica set with a primary and two secondaries.)
    4. Why do we need to shard? Reaching a limit on some resource: RAM (working set), disk space, disk IO, client network latency on writes (tag-aware sharding), CPU.
    5. Do we need to shard right now? Two schools of thought: 1. shard at the outset to avoid technical debt later; 2. shard later to avoid complexity and overhead today. Either way, shard before you need to! A 256 GB data-size threshold is published in the documentation, and chunk migrations can cause memory contention and disk IO. (Diagram: working set fits in free RAM and things seemed fine, then working set plus a chunk migration after waiting too long to shard.)
    6. Collection stats:
        > db.mdbw.stats()
        {
          "ns" : "test.mdbw",
          "count" : 16000,            // one hour's worth of documents
          "size" : 65280000,          // size of user data, padding included
          "avgObjSize" : 4080,
          "storageSize" : 93356032,   // size of data extents, unused space included
          "numExtents" : 11,
          "nindexes" : 1,
          "lastExtentSize" : 31354880,
          "paddingFactor" : 1,
          "systemFlags" : 1,
          "userFlags" : 1,
          "totalIndexSize" : 801248,
          "indexSizes" : { "_id_" : 801248 },
          "ok" : 1
        }
    7. Storage model spreadsheet:
        sensors: 16,000
        years to keep data: 6
        docs per day: 384,000
        docs per year: 140,160,000
        docs total across all years: 840,960,000
        indexes per day: 801,248 bytes
        storage per hour: 63 MB
        storage per day: 1.5 GB
        storage per year: 539 GB
        storage across all years: 3,235 GB
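    (Not in the original deck) The spreadsheet figures can be sanity-checked with a little shell arithmetic, assuming one ~4,080-byte summary document per sensor per hour; the deck's slightly higher storage numbers presumably include extent and padding overhead:
        > var sensors = 16000, years = 6, avgObjSize = 4080
        > var docsPerDay  = sensors * 24                          // 384,000
        > var docsPerYear = docsPerDay * 365                      // 140,160,000
        > var docsTotal   = docsPerYear * years                   // 840,960,000
        > var mbPerHour   = sensors * avgObjSize / 1024 / 1024    // ~62 MB of user data per hour
        > var gbPerYear   = mbPerHour * 24 * 365 / 1024           // ~533 GB per year
        > var gbTotal     = gbPerYear * years                     // ~3,200 GB across all years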
    8. Why we need to shard now: 539 GB in year one alone. (Chart: total storage in GB versus year, growing toward roughly 3,235 GB by year 6.) 16,000 sensors today… … 47,000 tomorrow?
    9. What will our sharded cluster look like? We need to model the application to answer this question. The model should include: application write patterns (sensors), application read patterns (clients), analytic read patterns, data storage requirements. Two main collections: summary data (fast query times) and historical data (analysis of environmental conditions).
    10. Option 1: Everything in one sharded cluster. (Diagram: Shards 1 through N, each a replica set, one of them the primary shard.) Issue: prevent analytics jobs from affecting application performance. Summary data is small (16,000 * N bytes) and accessed frequently.
    11. Option 2: Distinct replica set for summaries. (Diagram: a standalone replica set for summaries alongside Shards 1 through N.) Pros: operational separation between business functions. Cons: the application must write to two different databases.
    12. Application read patterns. Web browsers, mobile phones, and in-car navigation devices. The working set should be kept in RAM. 5M subscribers * 1% active * 50 sensors/query * 1 device query/min = 41,667 reads/sec. 41,667 reads/sec * 4080 bytes = 162 MB/sec, and that's without any protocol overhead. Gigabit Ethernet is ≈ 118 MB/sec. (Diagram: the summary replica set behind a single 1 Gbps link.)
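    (Not in the original deck) The bandwidth arithmetic on that slide, worked through in the shell:
        > var readsPerSec = 5000000 * 0.01 * 50 / 60    // ~41,667 reads/sec
        > var bytesPerSec = readsPerSec * 4080           // ~170,000,000 bytes/sec
        > bytesPerSec / 1024 / 1024                      // ~162 MB/sec, against ~118 MB/sec usable on gigabit Ethernet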
    13. Application read patterns (continued). Options: provision more bandwidth ($$$), tune the application read pattern, add a caching layer, or do secondary reads from the replica set. (Diagram: reads spread across the primary and both secondaries, 1 Gbps each.)
    14. Secondary Reads from the Replica Set. Stale data is OK in this use case. Caution: a read preference of secondary could be disastrous in a 3-member replica set if a secondary fails! App servers with mixed read preferences of primary and secondary are operationally cumbersome. Use the nearest read preference to access all nodes: db.collection.find().readPref( { mode: 'nearest' } )
    15. Replica Set Tags. App servers in different data centers use replica set tags plus the nearest read preference: db.collection.find().readPref( { mode: 'nearest', tags: [ {'datacenter': 'east'} ] } ). (Diagram: primary and secondaries tagged east.)
        > rs.conf()
        { "_id" : "rs0",
          "version" : 2,
          "members" : [
            { "_id" : 0, "host" : "node0.example.net:27017", "tags" : { "datacenter" : "east" } },
            { "_id" : 1, "host" : "node1.example.net:27017", "tags" : { "datacenter" : "east" } },
            { "_id" : 2, "host" : "node2.example.net:27017", "tags" : { "datacenter" : "east" } }
          ]
        }
    16. Replica Set Tags. Enables geographic distribution. (Diagram: primary and secondaries spread across the west, central, and east data centers.)
    17. Replica Set Tags. Enables geographic distribution. Allows scaling within each data center. (Diagram: a primary plus multiple secondaries in each of the west, central, and east data centers.)
    18. Analytic read patterns. How does an analyst look at the data on the sharded cluster? 1 year of data = 539 GB. (Chart: number of machines versus server RAM required to hold it, with points around 3 machines * 256 GB, 3 * 192 GB, 5 * 128 GB, 9 * 64 GB, and 17 * 32 GB.)
    19. Application write patterns. 16,000 sensors every minute = 267 writes/sec. Could we handle 16,000 writes in one second? 16,000 writes * 4080 bytes = 62 MB. Load test the app!
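    (Not in the original deck) A crude shell-only sketch of that load-test question; a real test would drive the application tier, and the _id format here merely mimics the "linkID:date" style chosen later:
        > var t0 = new Date()
        > for (var i = 0; i < 16000; i++) {
        ...   db.mdbw.insert({ _id: "900006:140312:" + i, payload: new Array(4080).join("x") })   // ~4 KB document
        ... }
        > print((new Date() - t0) + " ms to insert 16,000 documents")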
    20. Modeling the Application - summary. We modeled: application write patterns (sensors), application read patterns (clients), analytic read patterns, data storage requirements, and the network, a little bit.
    21. Shard Key
    22. Shard Key characteristics. A good shard key has: sufficient cardinality, distributed writes, and targeted reads ("query isolation"). The shard key should be in every query if possible; otherwise queries are scatter-gather. Choosing a good shard key is important! It affects performance and scalability, and changing it later is expensive.
    23. Hashed shard key. Pros: evenly distributed writes. Cons: random data (and index) updates can be IO intensive, and range-based queries turn into scatter-gather. (Diagram: mongos spreading writes across Shards 1 through N.)
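    (Not in the original deck) For reference, this is how a hashed shard key is declared in the shell; the namespace test.mdbw is reused from the stats slide:
        > sh.enableSharding("test")
        > sh.shardCollection("test.mdbw", { _id: "hashed" })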
    24. Low cardinality shard key. Induces "jumbo chunks". Example: sensor ID. (Diagram: mongos routing to Shards 1 through N, with one shard stuck holding the single chunk [ a, b ).)
    25. Ascending shard key. Monotonically increasing shard key values cause "hot spots" on inserts. Examples: timestamps, _id. (Diagram: every insert lands on the shard holding the last chunk, [ ISODate(…), $maxKey ).)
    26. Choosing a shard key for time series data. Consider a compound shard key: {arbitrary value, incrementing value}. Best of both worlds: local hot spotting, targeted reads. (Diagram: each shard owns contiguous time ranges for a subset of values, e.g. Shard 1 holds [ {V1, ISODate(A)}, {V1, ISODate(B)} ), [ {V1, ISODate(B)}, {V1, ISODate(C)} ), …, Shard 2 the corresponding V2 ranges, and so on through V4.)
    27. What is our shard key? Let's choose: linkID, date. Example: { linkID: 9000006, date: 140312 }. Example: { _id: "900006:140312" }. This application's _id is in this form already, yay!
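    (Not in the original deck) A sketch of sharding the historical collection on that compound key; the namespace test.mdbw is assumed from earlier slides, and if the concatenated "linkID:date" _id is used instead, the collection would simply be sharded on { _id: 1 }:
        > sh.shardCollection("test.mdbw", { linkID: 1, date: 1 })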
    28. Summary. Model the read/write patterns and storage. Choose an appropriate shard key. DevOps influenced the application: write recent summary data to a separate database, use replica set tags for the summary database, avoid synchronous sensor check-ins, consider changing the client polling frequency, and consider throttling REST API access to the app servers.
    29. Which DevOps person are you?
    30. Thank You. Jake Angerman, Sr. Solutions Architect, MongoDB. #MongoDBWorld
    31. Sharding Experimentation:
        $ mongo --nodb
        > cluster = new ShardingTest({"shards": 1, "chunksize": 1})
        $ mongo --nodb
        > // now connect to mongos on 30999
        > db = (new Mongo("localhost:30999")).getDB("test")
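    (Not in the original deck) From that second shell you can shard a test collection, load some data, and watch the 1 MB chunks split; the collection and key follow the earlier slides:
        > sh.enableSharding("test")
        > sh.shardCollection("test.mdbw", { linkID: 1, date: 1 })
        > for (var i = 0; i < 100000; i++) { db.mdbw.insert({ linkID: i % 16000, date: 140312 + i }) }
        > sh.status()                        // shards, databases, and chunk ranges
        > db.mdbw.getShardDistribution()     // per-shard document counts and sizes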
    32. I decided to shard from the outset. Sensor summary documents can all fit in RAM (16,000 sensors * N bytes). Velocity of sensor events is only 267 writes/sec. Volume of sensor events is what dictates sharding.
        { _id: <linkID>,
          update: ISODate("2013-10-10T23:06:37.000Z"),
          last10: { avgSpeed: <int>, avgTime: <int> },
          lastHour: { avgSpeed: <int>, avgTime: <int> },
          speeds: [52,49,45,51,...],
          times: [237,224,246,233,...],
          pavement: "Wet Spots",
          status: "Wet Conditions",
          weather: "Light Rain" }
    33. Configuring Sharding:
        > this_is_for_replica_sets_not_sharding = {
            _id : "mySet",
            members : [
              {_id : 0, host : "A", priority : 3},
              {_id : 1, host : "B", priority : 2},
              {_id : 2, host : "C"},
              {_id : 3, host : "D", hidden : true},
              {_id : 4, host : "E", hidden : true, slaveDelay : 3600}
            ]
          }
        > rs.initiate(conf)
    34. I'm off to my private island in New Zealand
    35. Replica Set Diagram
    36. Configuration Options:
        > conf = {
            _id : "mySet",
            members : [
              {_id : 0, host : "A", priority : 3},
              {_id : 1, host : "B", priority : 2},
              {_id : 2, host : "C"},
              {_id : 3, host : "D", hidden : true},
              {_id : 4, host : "E", hidden : true, slaveDelay : 3600}
            ]
          }
        > rs.initiate(conf)
    37. My Wonderful Subsection
    38. Configuration Options (Primary DC):
        > conf = {
            _id : "mySet",
            members : [
              {_id : 0, host : "A", priority : 3},
              {_id : 1, host : "B", priority : 2},
              {_id : 2, host : "C"},
              {_id : 3, host : "D", hidden : true},
              {_id : 4, host : "E", hidden : true, slaveDelay : 3600}
            ]
          }
        > rs.initiate(conf)
    39. Tag Aware Sharding. Control where data is written to, and read from. Each member can have one or more tags, e.g. tags: {dc: "ny"} or tags: {dc: "ny", subnet: "192.168", rack: "row3rk7"}. The replica set defines rules for write concerns. Rules can change without changing app code.
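    (Not in the original deck) The shard-level counterpart uses shard tags and tag ranges; a minimal sketch with illustrative shard names and linkID boundaries:
        > sh.addShardTag("shard0000", "ny")
        > sh.addShardTag("shard0001", "sf")
        > sh.addTagRange("test.mdbw", { linkID: MinKey, date: MinKey }, { linkID: 500000, date: MinKey }, "ny")
        > sh.addTagRange("test.mdbw", { linkID: 500000, date: MinKey }, { linkID: MaxKey, date: MaxKey }, "sf")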
