Big Data & NoSQL - EFS'11 (Pavlo Baron)
These are the slides from my half-day workshop at EFS'11 in Stuttgart, where I covered some theoretical aspects of NoSQL data stores that are relevant for dealing with large amounts of data.

Published in: Technology

Transcript

  • 1.  
  • 2. Pavlo Baron http://www.pbit.org [email_address] @pavlobaron
  • 3. Agenda Blah-blah More blah-blah Color pics Standing ovations
  • 4. So, come on, sell this to me
  • 5. Agenda Blah-blah More blah-blah Color pics Standing ovations
  • 6. Somewhere a mosquito coughs…
  • 7. … and somewhere else a data center gets flooded with data (PB)
  • 8. Big Data describes datasets that grow so large that they become awkward to work with using on-hand database management tools (Wikipedia)
  • 9. NoSQL is not about … <140’000 things NoSQL is not about>… NoSQL is about choice (Jan Lehnardt, CouchDB)
  • 10. Look here brother, who you jivin' with that Cosmik Debris?
  • 11. (John Muellerleile)
  • 12. Agenda Blah-blah More blah-blah Color pics Standing ovations
  • 13. So, you think you can tell heaven from hell ...
  • 14. Where does your data actually come from?
  • 15. Do you have a million well structured records?
  • 16. Or a couple of Gigabytes of storage?
  • 17. Does your data get modified every now and then?
  • 18. Do you look at your data once a month to create a management report?
  • 19. Or is your data an unstructured chaos?
  • 20. Do you get flooded by tera-/petabytes of data?
  • 21. Or do you simply get bombed with data?
  • 22. Does your data flow on streams at a very high rate from different locations?
  • 23. Or do you have to read The Matrix?
  • 24. Do you need to distribute your data over the whole world?
  • 25. Or does your existence depend on (the quality of) your data?
  • 26. Look back and turn back. Look at yourself
  • 27. Is it the storage that you need to focus on?
  • 28. Or are you mostly preparing data?
  • 29. Or do you have your customers spread all over the world?
  • 30. Or do you have complex statistical analysis to do?
  • 31. Or do you have to filter data as it comes?
  • 32. Or is it necessary to visualize the data?
  • 33. ...every blade is sharp, the arrows fly...
  • 34. Chop in smaller pieces
  • 35. Chop in bite-size, manageable pieces
  • 36. Separate reading from writing
  • 37. Update and mark, don’t delete physically
  • 38. Minimize hard relations
  • 39. Separate archive from accessible data
  • 40. Trash everything that only has to be analyzed in real time
  • 41. Parallelize and distribute
  • 42. Avoid single bottlenecks
  • 43. Decentralize with “equal” nodes
  • 44. Design with Byzantine faults in mind
  • 45. Build upon consensus, agreement, voting, quorum
  • 46. Don’t trust time and timestamps
  • 47. Strive for O(1) for data lookups #
  • 48. Minimize the distance between the data and its processors
  • 49. Utilize commodity hardware
  • 50. Consider hardware fallibility
  • 51. Relax new hardware startup procedure
  • 52. Bring data to its users
  • 53. Build upon asynchronous message passing
  • 54. Consider network unreliability
  • 55. Consider asynchronous message passing unreliability
  • 56. Design with eventual actuality/consistency in mind
  • 57. Implement redundancy and replication
  • 58. Consider latency an adjustment screw
  • 59. Consider availability an adjustment screw
  • 60. Be prepared for disaster
  • 61. Utilize the fog/clouds
  • 62. Design for theoretically unlimited amount of data
  • 63. Design for frequent structure changes
  • 64. Design for the all-in-one mix
  • 65. Agenda Blah-blah More blah-blah Color pics Standing ovations
  • 66. Why can we never be sure till we die. Or have killed for an answer
  • 67. CAP – Consistency, Availability, Partition tolerance
  • 68. CAP – the variations: CA – irrelevant; CP – eventually unavailable, offering maximum consistency; AP – eventually inconsistent, offering maximum availability
  • 69. CAP – the tradeoff (diagram: A versus C)
  • 70. CP (diagram: Replica 1 and Replica 2; labels: read, write, v1, v2)
  • 71. CP (partition) (diagram: Replica 1 and Replica 2; labels: read, write, v1, v2)
  • 72. AP (diagram: Replica 1 and Replica 2; labels: read, write, replicate, v1, v2)
  • 73. AP (partition) (diagram: Replica 1 and Replica 2; labels: read, write, hint, handoff, v1, v2)
  • 74. BASE
  • 75. BASE: Basically Available, Soft-state, Eventually consistent. Opposite to ACID
  • 76. Causal ordering / consistency (diagram: RM1, RM2, RM3)
  • 77. Read-your-writes consistency (diagram: FE1, FE2, data store; labels: write v2, read v2, write v1, read v1; versions v1, v2, v3)
  • 78. Session consistency (diagram: Session 1, Session 2, FE, data store; labels: write v2, read v2, write v1, read v1; versions v1, v2, v3)
  • 79. FIFO ordering (diagram: RM1, RM2, RM3)
  • 80. Monotonic read consistency (diagram: FE1, FE2, data store; labels: read v2, read v2, read v3, read v4, read v3; versions v1, v2, v3, v4)
  • 81. Total ordering (diagram: RM1, RM2, RM3)
  • 82. Monotonic write consistency (diagram: FE1, FE2, data store; labels: write v1, write v4, write v2, write v3; versions v1, v2, v3, v4)
  • 83. Eventual consistency (diagram: FE1, FE2, data store; labels: read v1, read v2, write v3, read v3, read v2; versions v1, v2, v3)
  • 84. Run, rabbit, run. Dig that hole, forget the sun
  • 85. Logical sharding
  • 86. Vertical sharding (diagram: Node 1 and Node 2; tables: users, products, contracts, items, orders, addresses, invoices; request: “read contract”, user=foo)
  • 87. Range-based sharding (diagram: Node 1 and Node 2; data: users id(1-N), users id(1-M), products, addresses zip(1234-2345), addresses zip(2346-9999); labels: read, write)
  • 88. Hash-based sharding: start with 3 nodes: node hash N = # mod 3; add 2 nodes: N = # mod 5; kill 2 nodes: N = # mod 3
  • 89. Insert key (diagram: Key = “foo”, # = N, node N)
  • 90. Add 2 nodes (diagram labels: rehash, leave, leave, rehash)
  • 91. Lookup key (diagram: Key = “foo”, # = N, node N, Value = “bar”)
  • 92. Remove node (diagram labels: rehash, leave, leave, rehash)
  • 93. Consistent hashing
  • 94. The ring: X-bit integer space, 0 <= N <= 2^X; or: 2 x Pi, 0 <= A <= 2 x Pi, x(N) = cos(A), y(N) = sin(A)
  • 95. Insert key (diagram: Key = “foo”, # = N, node N)
  • 96. Add node (diagram labels: copy, leave, rehash, leave, leave, rehash)
  • 97. Lookup key (diagram: Key = “foo”, # = N, node N, Value = “bar”)
  • 98. Remove node (diagram labels: copy/miss, leave, rehash, leave, leave, rehash)
  • 99. Clustering: 12 partitions (constant); 3 nodes, 4 vnodes each; add node: 4 nodes, 3 vnodes each. Alternatives: 3 nodes, 2 x 5 + 1 x 2 vnodes; container based
  • 100. Quorum: V: vnodes holding a key; W: write quorum; R: read quorum; DW: durable write quorum; W > 0.5 * V; R + W > V
  • 101. Insert key (sloppy quorum) (diagram: Key = “foo”, # = N, W = 2, node N; labels: replicate, ok)
  • 102. Add node (diagram labels: leave, copy, copy, leave)
  • 103. Lookup key (sloppy quorum) (diagram: Key = “foo”, # = N, R = 2, node N, Value = “bar”)
  • 104. Remove node (diagram labels: leave, copy, copy, leave)
  • 105. Inside out, outside in. Perpetual change
  • 106. Clocks: V(i), V(j): competing. Conflict resolution: 1: siblings, client; 2: merge, system; 3: voting, system
  • 107. Timestamps (diagram: Node 1, Node 2, Node 3 with event times 10:00, 10:11, 10:20, 10:20, 10:01, 9:59, 10:09, 10:10, 10:18, 10:19)
  • 108. Logical clocks (diagram: Node 1, Node 2, Node 3 with counters 1, 3, 5, 6, 2, 2, 4, 5, 4, 7, 7, 7, 6, 6, ?, ?)
  • 109. Vector clocks (diagram: Node 1, Node 2, Node 3 with clocks (1,0,0), (1,2,0), (3,2,0), (1,3,3), (1,1,0), (1,0,1), (1,2,2), (1,2,3), (2,2,0), (4,3,3), (4,4,3), (4,3,4))
  • 110. Vector clocks (diagram: Node 1, Node 2, Node 3, Node 4 with clocks (1,1,0,0), (1,0,1,0), (1,0,0,1), (1,3,0,3), (1,2,0,2), (1,2,0,3), (1,0,0,0), (1,2,0,0), (1,0,2,0))
  • 111. Merkle Trees: N, M: nodes; HT(N), HT(M): hash trees. M needs update: obtain HT(N); calc delta(HT(M), HT(N)); pull keys(delta)
  • 112. Merkle Trees (diagram: Node a.1 and Node a.2; trees: a, ab, ac, abc, abd, acb, acc and a, ab, ad, abe, abd, ada, adb)
  • 113. Merkle Trees (diagram: Node a.1 and Node a.2; trees: a, ab, abc, abd and a, ab, ad, abd, ada, adb)
  • 114. Sudden call shouldn't take away the startled memory
  • 115. Replication – state transfer (diagram: Source node, Target node; data: users, products, addresses; label: take)
  • 116. Replication – operational transfer (diagram: Source node, Target node; operations: updates, inserts, deletes; labels: take, run)
  • 117. Eager replication – 3PC (diagram: Coordinator, Cohort 1, Cohort 2; messages: can commit?, yes, pre commit, ACK, commit, ok)
  • 118. Eager replication – 3PC (failure) (diagram: Coordinator, Cohort 1, Cohort 2; messages: can commit?, yes, pre commit, ACK, abort, ok)
  • 119. Eager replication – Paxos Commit: 2F + 1 acceptors overall, F + 1 correct ones to achieve consensus. Stability, Consistency, Non-Triviality, Non-Blocking
  • 120. Eager replication – Paxos Commit (diagram: RM1, other RMs, initial leader, Acceptors; messages: begin commit, prepare, 2a prepared, 2b prepared, commit)
  • 121. Eager replication – Paxos Commit (failure) (diagram: RM 1, other RMs, initial leader, Acceptors; messages: begin commit, prepare, 2a prepared, timeout, no decision, abort)
  • 122. Lazy replication – Master/slave (diagram: Master node, Slave node(s); data: users, products, addresses; labels: read, write, read)
  • 123. Lazy replication – Master/master (diagram: two groups of master node(s); data: users id(1-N), users id(1-M), items id(1-K), items id(1-L); labels: read, write)
  • 124. Gossip – RM (diagram: RM1 with Clock table, Replica clock, Update log, Value clock, Value, Executed operation table; RM2; labels: write, stable updates, gossip)
  • 125. Gossip – node down/up (diagram: Node 1, Node 2, Node 3, Node 4; messages: update; update, 4 down; read; read, 4 up)
  • 126. Hinted handoff: N: node, G: group including N. node(N) is unavailable: replicate to G or store data(N) locally; hint handoff for later. node(N) is alive: handoff data to node(N)
  • 127. Direct replica fails (diagram: Key = “foo”, # = N -> handoff hint = true; label: replicate)
  • 128. Replica recovers (diagram label: handoff)
  • 129. All replicas fail (diagram: Key = “foo”, # = N -> handoff hint = true)
  • 130. All replicas recover (diagram labels: replicate, handoff)
  • 131. I’m a speed king, see me fly
  • 132. MapReduce
  • 133. MapReduce: model: functional map/fold; out-database MR: irrelevant; in-database MR: data locality, no splitting needed, distributed querying, distributed processing
  • 134. In-database MapReduce (diagram: Node X, query = "Alice"; Node A, Node B, Node C with N = "Alice"; labels: map, reduce, hit list)
  • 135. Caching
  • 136. Caching variations: eager write, append only; lazy write, eventual consistency
  • 137. Write through (diagram: cache, data store; data: users, products; labels: read, write, write through, read miss)
  • 138. Write back / snapshotting (diagram: cache, data store; data: users, products; labels: read, write, write back, read miss)
  • 139. Physical storage
  • 140. Physical storage: row based: irrelevant; column based: many rows, few columns; value based: ad-hoc querying
  • 141. Column based storage. Table (ID, Name, City): (1, Peter, London), (2, Anna, Paris). Data store: 1, 2 | Peter, Anna | London, Paris
  • 142. Value based storage. Table (ID, Name, City): (1, Peter, London), (2, Anna, Paris). Data store: 1:1, 3:Peter, 5:London, 2:2, 4:Anna, 6:Paris, 7:[1, 3, 5], 8:[2, 4, 6]
  • 143. Agenda Blah-blah More blah-blah Color pics Standing ovations
  • 144. Thank you
  • 145. I created many of the graphics myself, though I should have asked @mononcqc for help, 'cause his drawings are awesome. Some images originate from istockphoto.com, except for a few taken from Wikipedia and product pages
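
Slide 88 places a key on node N = # mod (number of nodes) and notes that the modulus changes whenever nodes are added or killed. The following is a minimal Python sketch of why that forces a large rehash. The helper names (`key_hash`, `node_for`, `moved_keys`) and the choice of SHA-256 as the hash are assumptions made for illustration and do not come from the slides.

```python
import hashlib


def key_hash(key: str) -> int:
    """Stable integer hash of a key (SHA-256 is an illustrative choice)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def node_for(key: str, node_count: int) -> int:
    """Hash-based sharding as on slide 88: N = # mod node_count."""
    return key_hash(key) % node_count


def moved_keys(keys, old_count: int, new_count: int):
    """Keys whose node changes when the node count changes."""
    return [k for k in keys
            if node_for(k, old_count) != node_for(k, new_count)]


keys = [f"key-{i}" for i in range(10_000)]
moved = moved_keys(keys, 3, 5)
print(f"{len(moved)} of {len(keys)} keys change nodes when going from 3 to 5 nodes")
```

A key stays put only when its hash gives the same remainder mod 3 and mod 5, so with a well-distributed hash most keys have to be rehashed. This is the cost that the consistent hashing slides set out to reduce.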
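
Slides 93 to 98 describe consistent hashing on a ring over an X-bit integer space, where adding or removing a node only affects the neighbouring arc. The sketch below is one possible way to express that idea, assuming a 32-bit ring, SHA-256 for positions, and the rule that a key belongs to the first node found clockwise from its position. The class name `HashRing` and its methods are hypothetical, and virtual nodes (slide 99) are deliberately left out.

```python
import bisect
import hashlib

RING_BITS = 32  # the "X" of the X-bit integer space; 32 is an assumption


def ring_position(name: str) -> int:
    """Map a node name or key to a position on the ring."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:RING_BITS // 8], "big")


class HashRing:
    """Illustrative consistent-hash ring without virtual nodes."""

    def __init__(self, nodes=()):
        self._positions = []  # sorted ring positions of all nodes
        self._owners = {}     # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        pos = ring_position(node)
        if pos in self._owners:
            raise ValueError(f"ring position already taken: {node}")
        bisect.insort(self._positions, pos)
        self._owners[pos] = node

    def remove_node(self, node: str) -> None:
        pos = ring_position(node)
        if self._owners.get(pos) != node:
            raise KeyError(node)
        self._positions.remove(pos)
        del self._owners[pos]

    def node_for(self, key: str) -> str:
        """First node clockwise from the key's position, wrapping at the end."""
        if not self._positions:
            raise LookupError("ring has no nodes")
        idx = bisect.bisect_left(self._positions, ring_position(key))
        if idx == len(self._positions):
            idx = 0  # wrap around the ring
        return self._owners[self._positions[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.node_for("foo"))
```

With this layout, adding a node takes over keys only from the node that follows it on the ring, and removing it hands exactly those keys back, so all other keys can be left where they are.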
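
Slide 100 gives two quorum conditions, W > 0.5 * V and R + W > V. As a small sketch, they can be checked directly, and the second one can be verified by brute force: it holds exactly when every possible read set shares at least one vnode with every possible write set. The function names below are hypothetical, and the durable write quorum DW is not modelled.

```python
from itertools import combinations


def satisfies_quorum(v: int, w: int, r: int) -> bool:
    """Both conditions from slide 100: W > 0.5 * V and R + W > V."""
    if v <= 0 or not 0 < w <= v or not 0 < r <= v:
        raise ValueError("need V > 0, 0 < W <= V and 0 < R <= V")
    return 2 * w > v and r + w > v


def reads_always_overlap_writes(v: int, w: int, r: int) -> bool:
    """Brute force: does every R-subset of the V vnodes share a vnode
    with every W-subset?"""
    vnodes = range(v)
    return all(set(read_set) & set(write_set)
               for read_set in combinations(vnodes, r)
               for write_set in combinations(vnodes, w))


print(satisfies_quorum(3, 2, 2))             # True
print(satisfies_quorum(3, 2, 1))             # False, because R + W = V
print(reads_always_overlap_writes(3, 2, 2))  # True
print(reads_always_overlap_writes(3, 2, 1))  # False
```

The overlap is what lets a read see the latest acknowledged write, while W > 0.5 * V prevents two conflicting writes from both reaching a quorum.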
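
Slides 106 and 109 introduce vector clocks and competing versions. A minimal sketch of the usual comparison rules follows, using clock values that appear on slide 109. Which events the slide's arrows actually connect cannot be recovered from the transcript, so the pairings below are illustrative assumptions, as are the function names.

```python
def tick(clock: tuple, node_index: int) -> tuple:
    """Local event on a node: increment that node's own counter."""
    return tuple(c + 1 if i == node_index else c
                 for i, c in enumerate(clock))


def merge(a: tuple, b: tuple) -> tuple:
    """Element-wise maximum, used when a node receives another clock."""
    if len(a) != len(b):
        raise ValueError("clocks must have the same length")
    return tuple(max(x, y) for x, y in zip(a, b))


def happened_before(a: tuple, b: tuple) -> bool:
    """a precedes b if no entry of a exceeds b's and the clocks differ."""
    if len(a) != len(b):
        raise ValueError("clocks must have the same length")
    return a != b and all(x <= y for x, y in zip(a, b))


def concurrent(a: tuple, b: tuple) -> bool:
    """Neither clock precedes the other: the versions compete."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


print(happened_before((1, 0, 0), (1, 2, 0)))  # True
print(concurrent((1, 2, 0), (1, 0, 1)))       # True
print(merge((1, 2, 0), (1, 0, 1)))            # (1, 2, 1)
print(tick((1, 2, 1), 0))                     # (2, 2, 1)
```

When `concurrent` returns true, the store has to fall back on one of the conflict resolution strategies listed on slide 106: siblings handed to the client, a merge by the system, or voting.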
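
Slide 111 outlines anti-entropy with Merkle trees: node M obtains HT(N), calculates delta(HT(M), HT(N)), and pulls only the keys in the delta. The sketch below is one simplified way to do this, assuming keys are grouped into a fixed number of hash buckets that form the leaves of a binary hash tree. The bucket count, the bucketing rule, and the function names are assumptions for illustration, and N is treated as the authoritative side.

```python
import hashlib

BUCKETS = 8  # number of leaves; must be a power of two (an assumption)


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def bucket_of(key: str) -> int:
    """Assign a key to a leaf bucket by its hash."""
    return _h(key.encode("utf-8"))[0] % BUCKETS


def build_tree(store: dict) -> list:
    """HT(node): list of levels, leaves first; the last level is the root."""
    items = [[] for _ in range(BUCKETS)]
    for key in sorted(store):
        items[bucket_of(key)].append(f"{key}={store[key]}")
    level = [_h("\n".join(entries).encode("utf-8")) for entries in items]
    levels = [level]
    while len(level) > 1:
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def delta(tree_m: list, tree_n: list) -> list:
    """Bucket indices that differ, descending only into differing subtrees."""
    candidates = [0]  # start at the root
    for depth in range(len(tree_m) - 1, 0, -1):
        candidates = [child
                      for idx in candidates
                      if tree_m[depth][idx] != tree_n[depth][idx]
                      for child in (2 * idx, 2 * idx + 1)]
    return [idx for idx in candidates if tree_m[0][idx] != tree_n[0][idx]]


def sync(store_m: dict, store_n: dict) -> list:
    """Bring M up to date from N by pulling only the differing buckets."""
    changed = delta(build_tree(store_m), build_tree(store_n))
    for bucket in changed:
        for key in [k for k in store_m if bucket_of(k) == bucket]:
            del store_m[key]
        store_m.update({k: v for k, v in store_n.items()
                        if bucket_of(k) == bucket})
    return changed


node_n = {"abc": "1", "abd": "2", "acb": "3", "acc": "4"}
node_m = {"abc": "1", "abd": "stale", "acb": "3"}
print(sync(node_m, node_n), node_m == node_n)
```

If the two root hashes match, `delta` returns an empty list after a single comparison, so the cost of the check grows with the amount of divergence rather than with the size of the data set.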
