See https://bitbucket.org/vsmirnov/memoria/wiki/MemoriaForBigData for additional details.

Published in:
Data & Analytics


- 1. Advanced Non-Relational Schemas For Big Data by Victor Smirnov
- 2. Non-Relational Schema ● Is just a data structure ● That uses some Memory Model ● Typically, Key->Value mapping ● Where Key is an Integer ID ● And Value is an arbitrary array of a limited size or memory block ● It's assumed that operations on memory blocks are atomic.
- 3. Storage Options
- 4. Partial (Prefix) Sums Tree ● Given a sequence S[0, N) = s0...sN-1 of non-negative integers ● Sum(i) returns X = s0+s1+...+si ● FindLT(X) returns the position i of the largest Sum(i) < X ● FindLE(X) is the same, but with Sum(i) <= X ● We can also define range versions Sum(i, j) and FindLT(j, X) ● All operations run in O(log N) time.
- 5. Packing Perfect Balanced Tree into an Array
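
The Sum, FindLT and FindLE operations of the partial sums tree can be sketched over a perfectly balanced binary tree packed into a flat array. This is a minimal illustration, not the implementation from the slides: the class name `PackedSumTree`, the heap-style layout (node `k` has children `2k` and `2k+1`), the zero padding up to a power of two, and the `-1` result for "no such position" are all assumptions made for the example.

```python
class PackedSumTree:
    """Partial sums tree stored in one array (hypothetical layout)."""

    def __init__(self, values):
        values = list(values)
        if any(v < 0 for v in values):
            raise ValueError("values must be non-negative")
        self.n = len(values)
        size = 1
        while size < max(self.n, 1):
            size *= 2
        self.size = size                      # number of leaves (power of two)
        self.tree = [0] * (2 * size)          # tree[1] is the root
        self.tree[size:size + self.n] = values
        for k in range(size - 1, 0, -1):      # each node holds its subtree sum
            self.tree[k] = self.tree[2 * k] + self.tree[2 * k + 1]

    def sum(self, i):
        """Return s0 + ... + si (inclusive), walking leaf-to-root."""
        k = self.size + i
        total = self.tree[k]
        while k > 1:
            if k & 1:                         # right child: add left sibling
                total += self.tree[k - 1]
            k //= 2
        return total

    def _find(self, x, strict):
        fits = (lambda s: s < x) if strict else (lambda s: s <= x)
        k, acc = 1, 0                         # acc = sum of leaves left of k
        while k < self.size:
            left = self.tree[2 * k]
            if fits(acc + left):              # whole left subtree qualifies
                acc += left
                k = 2 * k + 1
            else:
                k = 2 * k
        i = k - self.size
        if not fits(acc + self.tree[k]):
            i -= 1
        return min(i, self.n - 1)             # ignore padding leaves

    def find_lt(self, x):
        """Largest i with Sum(i) < x, or -1 if there is none."""
        return self._find(x, strict=True)

    def find_le(self, x):
        """Largest i with Sum(i) <= x, or -1 if there is none."""
        return self._find(x, strict=False)
```

For the values `[3, 0, 2, 5, 1]` the prefix sums are 3, 3, 5, 10, 11, so `find_lt(5)` yields position 1 and `find_le(5)` yields position 2. Both directions of traversal touch one node per tree level, which is where the O(log N) bound comes from.
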
- 6. Some Performance Bits [Plot: PackedTree random read performance, 1 million random reads. X axis: Memory Block Size, Kb (1 to 262144). Y axis: Performance, operations/sec (0 to 5e+07). Series: PackedTree<BigInt>, 2 children; PackedTree<BigInt>, 32 children; std::set<BigInt>, 2 children. Region markers: L1, L2, L3, RAM.]
- 7. Dynamic Vector ● An ordered sequence of elements (bytes, integers, strings) of size N ● Access(i) is O(log N) ● Insert(i, value) is O(log N) ● Delete(i) is O(log N) ● We can also define batch operations: ● Insert(i, value[]) ● Delete(i, j) ● Split(i); Merge(AnotherVector);...
- 8. Dynamic Vector
- 9. Dynamic Vector Operations ● FindLT(i) returns the block B that contains position i and the offset j of i within B ● Access(i) is O(log N) ● Insert(i, value) and Delete(i) are also O(log N) because the tree is balanced.
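
The block-plus-offset addressing described for the Dynamic Vector can be sketched as follows. This is a deliberately simplified illustration: the class `BlockedVector` and its `block_capacity` parameter are hypothetical, and `_locate` uses a linear scan over block sizes as a stand-in for FindLT on a balanced partial sums tree, so the sketch does not achieve the O(log N) bounds.

```python
class BlockedVector:
    """Sequence stored as a list of bounded-size blocks (simplified sketch)."""

    def __init__(self, block_capacity=4):
        self.cap = block_capacity
        self.blocks = [[]]

    def __len__(self):
        return sum(len(b) for b in self.blocks)

    def _locate(self, i):
        """Map position i to (block index, offset inside that block).

        Linear scan; a real implementation would search a partial
        sums tree built over the block sizes instead.
        """
        for b, block in enumerate(self.blocks):
            if i < len(block):
                return b, i
            i -= len(block)
        return len(self.blocks) - 1, len(self.blocks[-1])  # append position

    def access(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        b, j = self._locate(i)
        return self.blocks[b][j]

    def insert(self, i, value):
        if not 0 <= i <= len(self):
            raise IndexError(i)
        b, j = self._locate(i)
        block = self.blocks[b]
        block.insert(j, value)
        if len(block) > self.cap:             # split an overflowing block
            mid = len(block) // 2
            self.blocks[b:b + 1] = [block[:mid], block[mid:]]

    def delete(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        b, j = self._locate(i)
        del self.blocks[b][j]
        if not self.blocks[b] and len(self.blocks) > 1:
            del self.blocks[b]                # drop a block that became empty

    def to_list(self):
        return [x for block in self.blocks for x in block]
```

Each update touches a single block, plus at most one split or removal, which is the property that lets the tree over block sizes be updated along one root-to-leaf path.
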
- 10. File System: Map<ID, Vector<T>> ● Maps ID to Vector<T> ● Merge all values into one large Dynamic Vector, in ID order ● Create a separate “index” sequence from pairs <ID, Offset> in ID order ● We can represent this “index” sequence as two partial sums trees, one for ID and one for Offset ● We can merge both trees into one because they have exactly the same structure: a multi-index balanced partial sums tree.
- 11. Map<ID, Vector<T>>
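
The Map<ID, Vector<T>> layout can be sketched with two parallel index sequences whose prefix sums recover the IDs and the offsets. This is an illustrative sketch under stated assumptions: the function names `build_map` and `lookup` are hypothetical, IDs are assumed to be non-negative integers, and the prefix sums are recomputed with `itertools.accumulate` on each lookup (O(N)) where the slides would use a partial sums tree.

```python
from bisect import bisect_left
from itertools import accumulate


def build_map(mapping):
    """Flatten {id: list} into (id_deltas, sizes, data), in ID order."""
    ids = sorted(mapping)
    data, sizes = [], []
    for key in ids:
        data.extend(mapping[key])
        sizes.append(len(mapping[key]))
    # Store differences between consecutive IDs so that prefix sums
    # over id_deltas give back the IDs themselves.
    id_deltas = []
    previous = 0
    for key in ids:
        id_deltas.append(key - previous)
        previous = key
    return id_deltas, sizes, data


def lookup(id_deltas, sizes, data, key):
    """Return the vector stored under key, or None if key is absent."""
    ids = list(accumulate(id_deltas))         # prefix sums -> IDs
    ends = list(accumulate(sizes))            # prefix sums -> end offsets
    k = bisect_left(ids, key)
    if k == len(ids) or ids[k] != key:
        return None
    return data[ends[k] - sizes[k]:ends[k]]
```

Because `id_deltas` and `sizes` always have the same length and are updated together, they can share one tree structure, which is the observation the slide makes about merging the two trees.
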
- 12. Sharing Tree Structures ● Tree structure sharing saves both space and time: SPMD principle (single program, multiple data) ● We can align partial sums trees with different structures using interpolation (padding with zeroes) ● We can merge the index and data streams of Map<ID, Vector<T>> into one multi-stream tree ● When merging the trees, we try to fit index pairs and their corresponding data into the same leaf node of the multi-stream tree.
- 13. Multistream Tree Node Layout
- 14. Multistream Balanced Tree
- 15. ACID ● Atomic block operations are not enough ● Even a simple tree update affects several blocks ● So, ACID is mandatory for advanced non-relational schemas ● We can get ACID for free with Multi-Version Concurrency Control (MVCC) ● We need a Version History over data blocks ● Where each transaction is a version.
- 16. Transaction History via MVCC
- 17. Version History Implementation ● Version History maps a pair <ID, Version> to the ID of the real data block for that version and given ID ● We have Map<ID, Vector<Version, ID>> ● We can turn it into a Version History by sorting each Vector<Version, ID> (less space, slower) ● Or by creating an additional partial sums tree index on top of it (more space, but much faster) ● We can do it in just one multi-stream balanced tree ● MVCC requires some other data structures but they can be designed by analogy.
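
The "sorted Vector<Version, ID>" variant of the Version History can be sketched as below. The class `VersionHistory` and its method names are hypothetical, and the lookup rule (return the block written by the latest version not newer than the requested one) is an assumption about snapshot semantics that the slides do not spell out.

```python
from bisect import bisect_right, insort


class VersionHistory:
    """Maps (logical ID, version) to a physical block ID (sketch)."""

    def __init__(self):
        # logical ID -> list of (version, physical ID), kept sorted
        self._history = {}

    def record(self, logical_id, version, physical_id):
        """Register that `version` wrote `logical_id` into `physical_id`."""
        insort(self._history.setdefault(logical_id, []),
               (version, physical_id))

    def resolve(self, logical_id, version):
        """Physical block visible at `version`, or None if none exists.

        Assumed rule: the entry with the largest version <= `version`.
        """
        entries = self._history.get(logical_id, [])
        k = bisect_right(entries, (version, float("inf")))
        return entries[k - 1][1] if k else None
```

A reader at version 2 of a block written at versions 1 and 3 resolves to the version 1 copy, so writers never overwrite data that an older snapshot can still see.
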
- 18. Concurrency Handling ● Version History is a complicated data structure ● Concurrent access to it must be restricted ● Split the whole Version History into shards ● And shard blocks by ID to reduce lock contention on Version History
- 19. Distributed Storage and Processing ● MVCC is very Raft/Paxos-friendly ● Because of Version History and MVCC ● So we can join storage nodes to Raft groups ● And join Raft groups to larger groups with 2PC ● Using split/merge model to map data to nodes.
- 20. Bonus Slides
- 21. Searchable Bitmaps ● rank1(n) = number of ones in [0, n) ● select1(i) = position of i-th 1 in the bitmap ● rank0(n) = number of zeroes in [0, n) ● select0(i) = position of i-th 0 in the bitmap
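
The four rank/select operations can be sketched with per-block counters and a prefix sum over them. This is an illustration only: the class `SearchableBitmap`, the block size of 8 bits, the storage of bits as a Python list of 0/1 integers, and the zero-based counting in `select` (the "0-th" one is the first one) are assumptions made for the example.

```python
from bisect import bisect_right
from itertools import accumulate


class SearchableBitmap:
    BLOCK = 8  # bits per block (arbitrary choice for the sketch)

    def __init__(self, bits):
        self.bits = list(bits)                # each element is 0 or 1
        starts = range(0, len(self.bits), self.BLOCK)
        ones = [sum(self.bits[s:s + self.BLOCK]) for s in starts]
        zeros = [len(self.bits[s:s + self.BLOCK]) - c
                 for s, c in zip(starts, ones)]
        # before[bit][b] = number of `bit` values in blocks [0, b)
        self.before = {1: [0] + list(accumulate(ones)),
                       0: [0] + list(accumulate(zeros))}

    def rank1(self, n):
        """Number of ones in [0, n)."""
        b, r = divmod(n, self.BLOCK)
        start = b * self.BLOCK
        return self.before[1][b] + sum(self.bits[start:start + r])

    def rank0(self, n):
        """Number of zeroes in [0, n)."""
        return n - self.rank1(n)

    def _select(self, i, bit):
        before = self.before[bit]
        if not 0 <= i < before[-1]:
            raise IndexError(i)
        b = bisect_right(before, i) - 1       # block holding the i-th `bit`
        remaining = i - before[b]
        for pos in range(b * self.BLOCK, len(self.bits)):
            if self.bits[pos] == bit:
                if remaining == 0:
                    return pos
                remaining -= 1

    def select1(self, i):
        """Position of the i-th one (zero-based)."""
        return self._select(i, 1)

    def select0(self, i):
        """Position of the i-th zero (zero-based)."""
        return self._select(i, 0)
```

With zero-based counting the two operations are inverses in the sense that `rank1(select1(i)) == i`. Replacing the flat counter arrays with a partial sums tree gives the same queries on a bitmap that also supports updates.
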
- 22. Searchable Bitmap: Structure
- 23. Searchable Bitmaps: Views
- 24. LOUDS Tree
- 25. LOUDS Tree: Parent()
- 26. Wavelet Tree ● Searchable sequence [0...N) for large alphabets ● Rank(i, s) returns the number of symbols s in [0, i) ● Select(k, s) returns the position i of the k-th symbol s ● Insert(i, s), Delete(i), Access(i) – insert, remove and access the symbol at position i respectively ● All these operations have O(log N) time complexity ● By mapping numbers to symbols we can perform the following lookup operations: >, >=, <, <=, <> in O(log N) time.
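
Rank and the "<" lookup on a wavelet tree can be sketched with a static, pointer-based tree over integer symbols. This is a simplified illustration rather than the structure from the slides: the class `WaveletTree` and the method `count_less` are hypothetical names, the input sequence is assumed to be non-empty, each node stores a plain prefix-count list instead of a searchable bitmap, and Insert, Delete and Select are left out.

```python
class WaveletTree:
    """Static wavelet tree over integers in [lo, hi) (sketch)."""

    def __init__(self, seq, lo=None, hi=None):
        if lo is None:
            lo, hi = min(seq), max(seq) + 1
        self.lo, self.hi = lo, hi
        self.left = self.right = None
        if hi - lo <= 1 or not seq:
            return                            # leaf: one symbol, or empty
        mid = (lo + hi) // 2
        # prefix[i] = how many of seq[0:i] are < mid (i.e. go left)
        self.prefix = [0]
        for s in seq:
            self.prefix.append(self.prefix[-1] + (s < mid))
        self.left = WaveletTree([s for s in seq if s < mid], lo, mid)
        self.right = WaveletTree([s for s in seq if s >= mid], mid, hi)

    def rank(self, i, s):
        """Number of occurrences of symbol s in [0, i)."""
        if not self.lo <= s < self.hi:
            return 0
        node = self
        while node.left is not None:
            went_left = node.prefix[i]
            if s < (node.lo + node.hi) // 2:
                i, node = went_left, node.left
            else:
                i, node = i - went_left, node.right
        return i

    def count_less(self, i, s):
        """Number of symbols strictly smaller than s in [0, i)."""
        if s <= self.lo:
            return 0
        if s >= self.hi:
            return i
        node, total = self, 0
        while node.left is not None:
            went_left = node.prefix[i]
            if s < (node.lo + node.hi) // 2:
                i, node = went_left, node.left
            else:
                total += went_left            # everything on the left is < s
                i, node = i - went_left, node.right
        return total
```

Both queries follow a single root-to-leaf path, so their cost is proportional to the tree height, which here is logarithmic in the alphabet size. The other comparisons listed on the slide can be derived from `count_less` and `rank` by subtraction from `i`.
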
- 27. Wavelet Tree: Structure
- 28. Wavelet Tree: Rank
- 29. Wavelet Tree: Inverted Index
- 30. Inverted Index Lookup
- 31. Thanks! More details are at: https://bitbucket.org/vsmirnov/memoria/wiki/MemoriaForBigData
