NoSQL - how it works (@pavlobaron)

Slides of my OOP'12 talk

Transcript

  • 1. NoSQL. How it works. Pavlo Baron
  • 2. Geek’s Guide To The Working Life. Pavlo Baron, pavlo.baron@codecentric.de, @pavlobaron
  • 3. NoSQL is not about … <140’000 things NoSQL is not about> … NoSQL is about choice (Jan Lehnardt on NoSQL)
  • 4. (John Muellerleile on NoSQL)
  • 5. NoSQL addresses the issue of poorly structured data
  • 6. NoSQL addresses the issue of data management simplicity
  • 7. NoSQL addresses the issue of data flood
  • 8. NoSQL addresses the issue of extremely frequent reads/writes
  • 9. NoSQL addresses the issue of big data streams
  • 10. NoSQL addresses the issue of real-time data processing and analysis
  • 11. NoSQL addresses the issue of huge data storage
  • 12. NoSQL addresses the issue of fast data filtering
  • 13. NoSQL addresses the issue of complex, deep relations
  • 14. NoSQL addresses the issue of pure web existences
  • 15. NoSQL addresses the issue of picking the right tool for the job
  • 16. How?
  • 17. Chop in smaller pieces
  • 18. Chop in bite-size, manageable pieces
  • 19. Separate reading from writing
  • 20. Caching. Variations: eager write, append only, lazy write, eventual consistency
  • 21. Write through (diagram labels: write, read, cache, write through, read miss; data store holding products and users)
  • 22. Write back / snapshotting (diagram labels: write, read, cache, read miss, write back; data store holding products and users)
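
The two caching variants on slides 21 and 22 can be sketched as follows. This is a minimal illustration, not the talk's own code: the class names, the `dirty` set and the use of a plain dict as the data store are assumptions made for the example.

```python
class WriteThroughCache:
    """Eager write: every write goes to the cache and to the data store."""

    def __init__(self, store):
        self.store = store      # a dict standing in for the data store (assumption)
        self.cache = {}

    def write(self, key, value):
        self.cache[key] = value
        self.store[key] = value             # write through

    def read(self, key):
        if key not in self.cache:           # read miss
            self.cache[key] = self.store[key]
        return self.cache[key]


class WriteBackCache(WriteThroughCache):
    """Lazy write: writes stay in the cache until flush() snapshots them."""

    def __init__(self, store):
        super().__init__(store)
        self.dirty = set()                  # keys not yet written back

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        for key in self.dirty:
            self.store[key] = self.cache[key]   # write back
        self.dirty.clear()
```

With write back, the data store lags behind the cache until `flush()` runs, which is where the eventual consistency mentioned on slide 20 comes from.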
  • 23. Design for theoretically unlimited amount of data
  • 24. Append, update, mark, recycle, don’t delete and restructure
  • 25. Minimize hard relations
  • 26. Parallelize and distribute
  • 27. Avoid single bottlenecks
  • 28. Decentralize with “equal” nodes
  • 29. Build upon consensus, agreement, voting, quorum
  • 30. Gossip – RM (diagram labels: write, RM1, RM2, clock table, replica clock, update log, stable updates, value clock, value, executed operation table)
  • 31. Gossip – node down/up (diagram labels: Node 1, Node 2, Node 3, Node 4; update, read, 4 down, 4 up)
  • 32. Don’t trust time and timestamps
  • 33. Clocks. V(i), V(j): competing. Conflict resolution: 1: siblings, client; 2: merge, system; 3: voting, system
  • 34. Timestamps. Node 1: 10:00, 10:10, 10:20. Node 2: 10:01, 10:11, 10:20. Node 3: 9:59, 10:09, 10:18, 10:19
  • 35. Logical clocks. Node 1: 1, 4, 5, 6, 7. Node 2: 2, 3, 6, 7. Node 3: 2, 4, 5, 6, 7 (the diagram also carries two “?” marks)
  • 36. Vector clocks. Node 1: (1,0,0), (2,2,0), (3,2,0), (4,3,3). Node 2: (1,1,0), (1,2,0), (1,3,3), (4,4,3). Node 3: (1,0,1), (1,2,2), (1,2,3), (4,3,4)
  • 37. Vector clocks. Node 1: (1,0,0,0). Node 2: (1,1,0,0), (1,2,0,0), (1,3,0,3). Node 3: (1,0,1,0), (1,0,2,0). Node 4: (1,0,0,1), (1,2,0,2), (1,2,0,3)
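
A small sketch of how vector clocks such as those on slides 36 and 37 are advanced and compared. The function names are illustrative assumptions; clocks are plain tuples with one counter per node.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's own counter advanced."""
    new = list(clock)
    new[node] += 1
    return tuple(new)


def merge(a, b):
    """Element-wise maximum: what a node knows after receiving a message."""
    return tuple(max(x, y) for x, y in zip(a, b))


def compare(a, b):
    """Return 'before', 'after', 'equal' or 'concurrent'."""
    le = all(x <= y for x, y in zip(a, b))
    ge = all(x >= y for x, y in zip(a, b))
    if le and ge:
        return "equal"
    if le:
        return "before"
    if ge:
        return "after"
    return "concurrent"     # competing versions, need conflict resolution
```

For example, when Node 1 at (1,0,0) receives a message stamped (1,2,0) from Node 2, merging and then incrementing its own counter gives (2,2,0), the second value listed for Node 1 on slide 36. Two clocks that compare as concurrent are the competing V(i), V(j) of slide 33.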
  • 38. Strive for O(1) for data lookups (#)
  • 39. Merkle Trees. N, M: nodes. HT(N), HT(M): hash trees. M needs update: obtain HT(N); calc delta(HT(M), HT(N)); pull keys(delta)
  • 40. Merkle Trees (diagram: Node a.1 tree with a; ab, ac; abc, abd, acb, acc. Node a.2 tree with a; ab, ad; abe, abd, ada, adb)
  • 41. Merkle Trees (diagram: Node a.1 tree with a; ab; abc, abd. Node a.2 tree with a; ab, ad; abd, ada, adb)
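
The update procedure on slide 39 (obtain HT(N), compute the delta, pull the keys in the delta) can be sketched with a deliberately shallow two-level hash tree. The bucketing by key prefix, the use of SHA-256 and all function names are assumptions for the example; real stores use deeper trees.

```python
import hashlib


def h(data: str) -> str:
    return hashlib.sha256(data.encode()).hexdigest()


def hash_tree(data, prefix_len=2):
    """Two-level tree: one leaf hash per key prefix, one root over the leaves."""
    buckets = {}
    for key in sorted(data):
        buckets.setdefault(key[:prefix_len], []).append(f"{key}={data[key]}")
    leaves = {p: h("|".join(items)) for p, items in buckets.items()}
    root = h("|".join(f"{p}:{leaves[p]}" for p in sorted(leaves)))
    return root, leaves


def delta(ht_m, ht_n):
    """Prefixes whose leaf hashes differ between HT(M) and HT(N)."""
    (root_m, leaves_m), (root_n, leaves_n) = ht_m, ht_n
    if root_m == root_n:
        return set()        # equal roots: no need to look at the leaves
    return {p for p in set(leaves_m) | set(leaves_n)
            if leaves_m.get(p) != leaves_n.get(p)}


def sync(m, n, prefix_len=2):
    """Pull into m only the keys under prefixes that differ from n."""
    changed = delta(hash_tree(m, prefix_len), hash_tree(n, prefix_len))
    for key, value in n.items():
        if key[:prefix_len] in changed:
            m[key] = value
    return changed
```

Only the subtrees that differ are transferred. This sketch copies keys from N to M and does not remove keys that exist only on M.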
  • 42. Vertical sharding (diagram: Node 1 holds users, addresses, contracts, orders; Node 2 holds invoices, products, items; request: “read contract”, user=foo)
  • 43. Range based sharding (diagram: Node 1 holds users id(1-N), addresses zip(1234-2345); Node 2 holds users id(1-M), addresses zip(2346-9999); other labels: products, read, write)
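
Range based routing as on slide 43 can be sketched with a sorted list of upper bounds. The zip bounds are taken from the slide; the lookup table and the function name are assumptions for the example.

```python
import bisect

# Inclusive upper bound of each shard's zip range (bounds from slide 43).
UPPER_BOUNDS = [2345, 9999]
SHARDS = ["Node 1", "Node 2"]


def shard_for_zip(zip_code: int) -> str:
    """Route an address to the node that owns its zip range."""
    if not 1234 <= zip_code <= 9999:
        raise ValueError("zip code outside the sharded range")
    return SHARDS[bisect.bisect_left(UPPER_BOUNDS, zip_code)]
```

A lookup is a binary search over the range boundaries, so adding a shard means splitting one range rather than rehashing every key.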
  • 44. Hash based sharding. Start with 3 nodes: N = # mod 3 (N: node, #: hash). Add 2 nodes: N = # mod 5. Kill 2 nodes: N = # mod 3
  • 45. Insert key. Key = “foo”, # = N
  • 46. Add 2 nodes (diagram labels: rehash, leave, rehash, leave)
  • 47. Lookup key. Key = “foo”, # = N, Value = “bar”
  • 48. Remove node (diagram labels: rehash, leave, rehash, leave)
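
The rehashing shown on slides 44 to 48 can be made concrete by counting how many keys change nodes when the modulus changes. This is a sketch under assumptions: MD5 is used only as a stable hash, and the function names are made up for the example.

```python
import hashlib


def node_for(key: str, nodes: int) -> int:
    """N = # mod nodes, with # taken from a stable hash of the key."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % nodes


def moved_fraction(keys, before, after):
    """Share of keys whose node changes when the node count changes."""
    moved = sum(1 for k in keys if node_for(k, before) != node_for(k, after))
    return moved / len(keys)
```

A key stays put when going from 3 to 5 nodes only if its hash gives the same remainder for both moduli, which for a uniform hash happens for 3 of every 15 hash values. Roughly four in five keys therefore have to be rehashed and moved, which is the cost the ring on the following slides is meant to avoid.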
  • 49. The ring. X bit integer space: 0 <= N <= 2 ^ X. Or 2 x Pi: 0 <= A <= 2 x Pi, x(N) = cos(A), y(N) = sin(A)
  • 50. Clustering. 12 partitions (constant): 3 nodes, 4 vnodes each; add node: 4 nodes, 3 vnodes each. Alternatives: 3 nodes, 2 x 5 + 1 x 2 vnodes; container based
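
The clustering example on slide 50 (12 constant partitions, 3 nodes with 4 vnodes each, then 4 nodes with 3 vnodes each) can be sketched as below. The round-robin assignment and the donor selection rule are assumptions; actual stores use their own claim algorithms.

```python
import hashlib

PARTITIONS = 12     # constant number of ring partitions (vnodes), as on slide 50


def partition_for(key: str) -> int:
    """Map a key onto one of the fixed partitions; this never changes."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % PARTITIONS


def assign(nodes):
    """Initial ownership: round-robin the partitions over the nodes."""
    return {p: nodes[p % len(nodes)] for p in range(PARTITIONS)}


def add_node(owners, new_node):
    """Give the new node its share, taking one partition at a time
    from whichever existing node currently owns the most."""
    owners = dict(owners)
    target = PARTITIONS // (len(set(owners.values())) + 1)
    for _ in range(target):
        counts = {}
        for node in owners.values():
            counts[node] = counts.get(node, 0) + 1
        donor = max(sorted(counts), key=counts.get)
        victim = next(p for p in sorted(owners) if owners[p] == donor)
        owners[victim] = new_node
    return owners
```

Because keys map to partitions and only partition ownership changes, adding the fourth node moves 3 of the 12 partitions and leaves the rest where they were.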
  • 51. Quorum. V: vnodes holding a key. W: write quorum. R: read quorum. DW: durable write quorum. W > 0.5 * V; R + W > V
  • 52. Insert key (sloppy quorum). Key = “foo”, # = N, W = 2 (diagram labels: replicate, ok)
  • 53. Add node (diagram labels: copy, leave, leave, copy, leave, copy)
  • 54. Lookup key (sloppy quorum). Key = “foo”, # = N, R = 2, Value = “bar”
  • 55. Remove node (diagram labels: copy, leave)
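
The quorum conditions on slide 51 can be sketched with in-memory replicas. This models a strict quorum only, not the sloppy variant of slides 52 and 54; the version numbers, the `reachable` and `asked` index lists and the function names are assumptions for the example.

```python
def valid_quorum(v: int, w: int, r: int) -> bool:
    """Check W > 0.5 * V and R + W > V."""
    return w > 0.5 * v and r + w > v


def write(replicas, reachable, key, value, version, w):
    """Store (version, value) on the reachable replicas.
    The write counts as successful if at least w of them took it."""
    for i in reachable:
        replicas[i][key] = (version, value)
    return len(reachable) >= w


def read(replicas, asked, key, r):
    """Ask the given replicas; with at least r answers, return the newest value."""
    answers = [replicas[i][key] for i in asked if key in replicas[i]]
    if len(answers) < r:
        raise RuntimeError("read quorum not reached")
    return max(answers)[1]      # highest version wins
```

With V = 3, W = 2 and R = 2, any two replicas asked on read overlap with the two that acknowledged the last write in at least one replica, so the read sees the newest version even if the third replica is stale.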
  • 56. Minimize the distance between the data and its processors
  • 57. Utilize commodity hardware
  • 58. MapReduce. Model: functional map/fold. Out-database MR: irrelevant. In-database MR: data locality, no splitting needed, distributed querying, distributed processing
  • 59. In-database MapReduce (diagram: Node X receives query = "Alice"; map runs on Node A, Node B and Node C, each with N = "Alice"; reduce on Node X yields the hit list)
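
The in-database MapReduce query of slide 59 follows the functional map/fold model and can be sketched in a single process. The shard contents and function names are hypothetical; in a real store each map call would run on the node that holds the data.

```python
from functools import reduce

# Hypothetical shards: each node holds its own documents.
NODES = {
    "A": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}],
    "B": [{"id": 3, "name": "Carol"}],
    "C": [{"id": 4, "name": "Alice"}],
}


def map_phase(docs, name):
    """Runs next to the data: emit the ids of matching documents."""
    return [doc["id"] for doc in docs if doc["name"] == name]


def reduce_phase(hits, partial):
    """Runs on the coordinating node: fold partial results into one hit list."""
    return hits + partial


def query(name):
    partials = [map_phase(docs, name) for docs in NODES.values()]
    return sorted(reduce(reduce_phase, partials, []))
```

Only the small partial hit lists travel to the coordinating node, which is the data locality argument of slide 58.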
  • 60. Design with eventual actuality/consistency in mind
  • 61. BASE: Basically Available, Soft-state, Eventually consistent. Opposite to ACID
  • 62. Read your write consistency (diagram: FE1, FE2; operations: write, read, write, read; values: v2, v2, v1, v1; data store versions: v1, v2, v3)
  • 63. Session consistency (diagram: FE with Session 1, Session 2; operations: write, read, write, read; values: v2, v2, v1, v1; data store versions: v1, v2, v3)
  • 64. Monotonic read consistency (diagram: FE1, FE2; operations: read, read, read, read, read; values: v2, v2, v3, v3, v4; data store versions: v1, v2, v3, v4)
  • 65. Monotonic write consistency (diagram: FE1, FE2; operations: write, write, read, read; values: v1, v2, v3, v3; data store versions: v1, v2, v3, v4)
  • 66. Eventual consistency (diagram: FE1, FE2; operations: read, read, read, read, write; values: v1, v2, v2, v3, v3; data store versions: v1, v2, v3)
  • 67. Implement redundancy and replication
  • 68. Replication – state transfer (diagram: source node with addresses, products, users; target node; label: take)
  • 69. Replication – operational transfer (diagram: source node with deletes, inserts, updates; target node; labels: take, run)
  • 70. Eager replication - 3PC (diagram: Coordinator, Cohort 1, Cohort 2; messages: can commit?, yes, pre commit, ACK, commit, ok)
  • 71. Eager replication – 3PC (failure) (diagram: Coordinator, Cohort 1, Cohort 2; messages: can commit?, yes, pre commit, ACK, abort, ok)
  • 72. Eager replication - Paxos Commit. 2F + 1 acceptors overall, F + 1 correct ones to achieve consensus. Stability, Consistency, Non-Triviality, Non-Blocking
  • 73. Eager replication – Paxos Commit (diagram: RM1, other RMs, initial leader, Acceptors; messages: begin commit, prepare, 2a prepared, 2b prepared, commit)
  • 74. Eager replication – Paxos Commit (failure) (diagram: RM 1, other RMs, initial leader, Acceptors; messages: begin commit, prepare, 2a prepared, timeout, no decision, prepare, abort)
  • 75. Lazy replication – master/slave (diagram: master node with addresses, products, users; slave node(s); labels: write, read, read)
  • 76. Lazy replication – master/master (diagram: master node(s) with users id(1-N), items id(1-K); master node(s) with users id(1-M), items id(1-L); labels: write, read)
  • 77. Hinted handoff. N: node, G: group including N. node(N) is unavailable: replicate to G or store data(N) locally, hint handoff for later. node(N) is alive: handoff data to node(N)
  • 78. Direct replica fails. Key = “foo”, # = N -> handoff hint = true (diagram labels: Key = “foo”, N, replicate)
  • 79. Replica recovers (diagram label: handoff)
  • 80. All replicas fail. Key = “foo”, # = N -> handoff hint = true
  • 81. All replicas recover (diagram labels: handoff, replicate)
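
The hinted handoff steps of slides 77 to 81 can be sketched as follows. The `Node` class, its fields and the single fallback node standing in for group G are assumptions made for the example.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.data = {}
        self.hints = []     # (target node, key, value) kept for later handoff


def replicate(key, value, target, fallback):
    """Write to the target; if it is down, store on the fallback with a hint."""
    if target.alive:
        target.data[key] = value
    else:
        fallback.data[key] = value
        fallback.hints.append((target, key, value))


def handoff(fallback):
    """Deliver hinted writes to targets that are alive again; keep the rest."""
    remaining = []
    for target, key, value in fallback.hints:
        if target.alive:
            target.data[key] = value
        else:
            remaining.append((target, key, value))
    fallback.hints = remaining
```

While node(N) is down, the write still succeeds on the fallback, and the hint records where the data belongs. Once the node is alive again, `handoff` delivers the data and clears the hint.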
  • 82. Consider latency an adjustment screw
  • 83. Consider availability an adjustment screw
  • 84. CAP – the variations. CA – irrelevant. CP – eventually unavailable, offering maximum consistency. AP – eventually inconsistent, offering maximum availability
  • 85. CAP – the tradeoff (diagram: A versus C)
  • 86. CP (diagram: Replica 1, Replica 2; labels: v1, read, v2, write, v2, v2, v1, read)
  • 87. CP (partition) (diagram: Replica 1, Replica 2; labels: v1, read, v2, write, v2, v1, read)
  • 88. AP (diagram: Replica 1, Replica 2; labels: v1, write, v2, v2, read, replicate, v2, v1, read)
  • 89. AP (partition) (diagram: Replica 1, Replica 2; labels: v1, write, v2, v2, read, hint, handoff, v2, v1, read)
  • 90. Build upon appropriate storage strategy, not upon a general one
  • 91. Design for frequent structure changes
  • 92. Most queries are known up front. Ad-hoc queries are seldom necessary. Prepared queries can greatly speed up data retrieval. An index can help ad-hoc querying, and can be externalized. An index should be incremental
  • 93. Store as: Document (semi-structured), Key/Value (unstructured), Graph (special case), ... Externalize relations and properties
  • 94. The graph case. Saving a graph in a table leads to: limited depth, fixed relation types, expensive nested subselects, full table scan tendency. Graph data stores store graph data optimally
  • 95. Thank you
  • 96. Many graphics I’ve created myself. Some images originate from istockphoto.com, except a few taken from Wikipedia and product pages
