SlideShare a Scribd company logo
Next Generation Indexes For Big Data Engineering
Daniel Lemire and collaborators
blog: https://lemire.me
twitter: @lemire
Université du Québec (TÉLUQ)
Montreal
“One Size Fits All”: An Idea Whose Time Has Come and Gone (Stonebraker, 2005)
3
Rediscover Unix
In 2018, Big Data Engineering is made of several specialized and re‑usable components:
Calcite : SQL + optimization
Hadoop
etc.
4
"Make your own database engine from parts"
We are in a Cambrian explosion, with thousands of organizations and companies building their
custom high‑speed systems.
Specialized used cases
Heterogeneous data (not everything is in your Oracle DB)
5
For high‑speed in data engineering you need...
High‑level optimizations (e.g., Calcite)
Indexes (e.g., Pilosa, Elasticsearch)
Great compression routimnes
Specialized data structures
....
6
Sets
A fundamental concept (sets of documents, identifiers, tuples...)
→ For performance, we often work with sets of integers (identifiers).
→ Often 32‑bit integers suffice (local identifiers).
7
tests : x ∈ S?
intersections : S ∩ S , unions : S ∪ S , differences : S ∖ S
Similarity (Jaccard/Tanimoto): ∣S ∩ S ∣/∣S ∪ S ∣
Iteration
for x in S do
print(x)
2 1 2 1 2 1
1 1 1 2
8
How to implement sets?
sorted arrays ( std::vector<uint32_t> )
hash tables ( java.util.HashSet<Integer> ,  std::unordered_set<uint32_t> )
…
bitmap ( java.util.BitSet )
compressed bitmaps
9
Arrays are your friends
while (low <= high) {
int mI =
(low + high) >>> 1;
int m = array.get(mI);
if (m < key) {
low = mI + 1;
} else if (m > key) {
high = mI - 1;
} else {
return mI;
}
}
return -(low + 1);
10
Hash tables
value x at index h(x)
random access to a value in expected constant‑time
much faster than arrays
11
in‑order access is kind of terrible
[15, 3, 0, 6, 11, 4, 5, 9, 12, 13, 8, 2, 1, 14, 10, 7]
[15, 3, 0, 6, 11, 4, 5, 9, 12, 13, 8, 2, 1, 14, 10, 7]
[15, 3, 0, 6, 11, 4, 5, 9, 12, 13, 8, 2, 1, 14, 10, 7]
[15, 3, 0, 6, 11, 4, 5, 9, 12, 13, 8, 2, 1, 14, 10, 7]
[15, 3, 0, 6, 11, 4, 5, 9, 12, 13, 8, 2, 1, 14, 10, 7]
[15, 3, 0, 6, 11, 4, 5, 9, 12, 13, 8, 2, 1, 14, 10, 7]
(Robin Hood, linear probing, MurmurHash3 hash function)
12
Set operations on hash tables
h1 <- hash set
h2 <- hash set
...
for(x in h1) {
insert x in h2 // cache miss?
}
13
"Crash" Swift
var S1 = Set<Int>(1...size)
var S2 = Set<Int>()
for i in d {
S2.insert(i)
}
14
Some numbers: half an hour for 64M keys
size time (s)
1M 0.8
8M 22
64M 1400
Maps and sets can have quadratic‑time performance
https://lemire.me/blog/2017/01/30/maps‑and‑sets‑can‑have‑quadratic‑time‑performance/
Rust hash iteration+reinsertion
https://accidentallyquadratic.tumblr.com/post/153545455987/rust‑hash‑iteration‑reinsertion
15
16
Bitmaps
Efficient way to represent sets of integers.
For example, 0, 1, 3, 4 becomes  0b11011 or "27".
{0} →  0b00001 
{0, 3} →  0b01001 
{0, 3, 4} →  0b11001 
{0, 1, 3, 4} →  0b11011 
17
Manipulate a bitmap
64‑bit processor.
Given x , word index is  x/64 and bit index  x % 64 .
add(x) {
array[x / 64] |= (1 << (x % 64))
}
18
How fast is it?
index = x / 64 -> a shift
mask = 1 << ( x % 64) -> a shift
array[ index ] |- mask -> a OR with memory
One bit every ≈ 1.65 cycles because of superscalarity
19
Bit parallelism
Intersection between {0, 1, 3} and {1, 3}
a single AND operation
between  0b1011 and  0b1010 .
Result is 0b1010 or {1, 3}.
No branching!
20
Bitmaps love wide registers
SIMD: Single Intruction Multiple Data
SSE (Pentium 4), ARM NEON 128 bits
AVX/AVX2 (256 bits)
AVX‑512 (512 bits)
AVX‑512 is now available (e.g., from Dell!) with Skylake‑X processors.
21
Bitsets can take too much memory
{1, 32000, 64000} : 1000 bytes for three values
We use compression!
22
Git (GitHub) utilise EWAH
Run‑length encoding
Example: 000000001111111100 est
00000000 − 11111111 − 00
Code long runs of 0s or 1s efficiently.
https://github.com/git/git/blob/master/ewah/bitmap.c
23
Complexity
Intersection : O(∣S ∣ + ∣S ∣) or O(min(∣S ∣, ∣S ∣))
In‑place union (S ← S ∪ S ): O(∣S ∣ + ∣S ∣) or O(∣S ∣)
1 2 1 2
2 1 2 1 2 2
24
Roaring Bitmaps
http://roaringbitmap.org/
Apache Lucene, Solr et Elasticsearch, Metamarkets’ Druid, Apache Spark, Apache Hive,
Apache Tez, Netflix Atlas, LinkedIn Pinot, InfluxDB, Pilosa, Microsoft Visual Studio Team
Services (VSTS), Couchbase's Bleve, Intel’s Optimized Analytics Package (OAP), Apache
Hivemall, eBay’s Apache Kylin.
Java, C, Go (interoperable)
Roaring bitmaps 25
Hybrid model
Set of containers
sorted arrays ({1,20,144})
bitset (0b10000101011)
runs ([0,10],[15,20])
Related to: O'Neil's RIDBit + BitMagic‑
Roaring bitmaps 26
Roaring bitmaps 27
Roaring
All containers are small (8 kB), fit in CPU cache
We predict the output container type during computations
E.g., when array gets too large, we switch to a bitset
Union of two large arrays is materialized as a bitset...
Dozens of heuristics... sorting networks and so on
Roaring bitmaps 28
Use Roaring for bitmap compression whenever possible. Do not use other bitmap compression
methods (Wang et al., SIGMOD 2017)
Roaring bitmaps 29
Unions of 200 bitmaps (cycles per input input value)
bitset array hash table Roaring
census1881 524 32 195 15.1
weather 15.3 32 195 5.38
bitset array hash table Roaring
census1881 9.85 542 1010 2.6
weather 0.35 94 237 0.16
Roaring bitmaps 30
Integer compression
"Standard" technique: VByte, VarInt, VInt
Use 1, 2, 3, 4, ... byte per integer
Use one bit per byte to indicate the length of the integers in bytes
Lucene, Protocol Buffers, etc.
Integer compression 31
varint‑GB from Google
VByte: one branch per integer
varint‑GB: one branch per 4 integers
each 4‑integer block is preceded byte a control byte
Integer compression 32
Vectorisation
Stepanov (STL in C++) working for Amazon proposed varint‑G8IU
Use vectorization (SIMD)
Patented
Fastest byte‑oriented compression technique (until recently)
SIMD‑Based Decoding of Posting Lists, CIKM 2011
https://stepanovpapers.com/SIMD_Decoding_TR.pdf
Integer compression 33
Observations from Stepanov et al.
We can vectorize Google's varint‑GB, but it is not as fast as varint‑G8IU
Integer compression 34
Stream VByte
Reuse varint‑GB from Google
But instead of mixing control bytes and data bytes, ...
We store control bytes separately and consecutively...
Daniel Lemire, Nathan Kurz, Christoph Rupp
Stream VByte: Faster Byte‑Oriented Integer Compression
Information Processing Letters 130, 2018
Integer compression 35
Integer compression 36
Stream VByte is used by...
Redis (within RediSearch) https://redislabs.com
upscaledb https://upscaledb.com
Trinity https://github.com/phaistos‑networks/Trinity
Integer compression 37
Dictionary coding
Use, e.g., by Apache Arrow
Given a list of values:
"Montreal", "Toronto", "Boston", "Montreal", "Boston"...
Map to integers
0, 1, 2, 0, 2
Compress integers:
Given 2 distinct values...
Can use n‑bit per values (binary packing, patched coding, frame‑of‑reference)
n
Integer compression 38
Dictionary coding + SIMD
dict. size bits per value scalar AVX2 (256‑bit) AVX‑512 (512‑bit)
32 5 8 3 1.5
1024 10 8 3.5 2
65536 16 12 5.5 4.5
(cycles per value decoded)
https://github.com/lemire/dictionary
Integer compression 39
To learn more...
Blog (twice a week) : https://lemire.me/blog/
GitHub: https://github.com/lemire
Home page : https://lemire.me/en/
CRSNG : Faster Compressed Indexes On Next‑Generation Hardware (2017‑2022)
Twitter @lemire
@lemire 40

More Related Content

What's hot

WebAssembly向け多倍長演算の実装
WebAssembly向け多倍長演算の実装WebAssembly向け多倍長演算の実装
WebAssembly向け多倍長演算の実装
MITSUNARI Shigeo
 
Fast Wavelet Tree Construction in Practice
Fast Wavelet Tree Construction in PracticeFast Wavelet Tree Construction in Practice
Fast Wavelet Tree Construction in Practice
Rakuten Group, Inc.
 
Seeing with Python presented at PyCon AU 2014
Seeing with Python presented at PyCon AU 2014Seeing with Python presented at PyCon AU 2014
Seeing with Python presented at PyCon AU 2014
Mark Rees
 
Introduction to RevKit
Introduction to RevKitIntroduction to RevKit
Introduction to RevKit
Mathias Soeken
 
Reversible Logic Synthesis and RevKit
Reversible Logic Synthesis and RevKitReversible Logic Synthesis and RevKit
Reversible Logic Synthesis and RevKit
Mathias Soeken
 
High Performance Systems Without Tears - Scala Days Berlin 2018
High Performance Systems Without Tears - Scala Days Berlin 2018High Performance Systems Without Tears - Scala Days Berlin 2018
High Performance Systems Without Tears - Scala Days Berlin 2018
Zahari Dichev
 
Deep dumpster diving 2010
Deep dumpster diving 2010Deep dumpster diving 2010
Deep dumpster diving 2010RonnBlack
 
Python opcodes
Python opcodesPython opcodes
Python opcodes
alexgolec
 
TCO in Python via bytecode manipulation.
TCO in Python via bytecode manipulation.TCO in Python via bytecode manipulation.
TCO in Python via bytecode manipulation.
lnikolaeva
 
On Mining Bitcoins - Fundamentals & Outlooks
On Mining Bitcoins - Fundamentals & OutlooksOn Mining Bitcoins - Fundamentals & Outlooks
On Mining Bitcoins - Fundamentals & Outlooks
Filip Maertens
 
Understanding the Disruptor
Understanding the DisruptorUnderstanding the Disruptor
Understanding the Disruptor
Trisha Gee
 
Segmentation Faults, Page Faults, Processes, Threads, and Tasks
Segmentation Faults, Page Faults, Processes, Threads, and TasksSegmentation Faults, Page Faults, Processes, Threads, and Tasks
Segmentation Faults, Page Faults, Processes, Threads, and Tasks
David Evans
 
Java Performance Tweaks
Java Performance TweaksJava Performance Tweaks
Java Performance Tweaks
Jim Bethancourt
 
ZK Study Club: Sumcheck Arguments and Their Applications
ZK Study Club: Sumcheck Arguments and Their ApplicationsZK Study Club: Sumcheck Arguments and Their Applications
ZK Study Club: Sumcheck Arguments and Their Applications
Alex Pruden
 
Yurii Shevtsov "V8 + libuv = Node.js. Under the hood"
Yurii Shevtsov "V8 + libuv = Node.js. Under the hood"Yurii Shevtsov "V8 + libuv = Node.js. Under the hood"
Yurii Shevtsov "V8 + libuv = Node.js. Under the hood"
OdessaJS Conf
 
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
Alex Pruden
 
Efficient Two-level Homomorphic Encryption in Prime-order Bilinear Groups and...
Efficient Two-level Homomorphic Encryption in Prime-order Bilinear Groups and...Efficient Two-level Homomorphic Encryption in Prime-order Bilinear Groups and...
Efficient Two-level Homomorphic Encryption in Prime-order Bilinear Groups and...
MITSUNARI Shigeo
 
Timur Shemsedinov "Пишу на колбеках, а что... (Асинхронное программирование)"
Timur Shemsedinov "Пишу на колбеках, а что... (Асинхронное программирование)"Timur Shemsedinov "Пишу на колбеках, а что... (Асинхронное программирование)"
Timur Shemsedinov "Пишу на колбеках, а что... (Асинхронное программирование)"
OdessaJS Conf
 
Functional Reactive Programming with RxJS
Functional Reactive Programming with RxJSFunctional Reactive Programming with RxJS
Functional Reactive Programming with RxJS
stefanmayer13
 

What's hot (20)

WebAssembly向け多倍長演算の実装
WebAssembly向け多倍長演算の実装WebAssembly向け多倍長演算の実装
WebAssembly向け多倍長演算の実装
 
Fast Wavelet Tree Construction in Practice
Fast Wavelet Tree Construction in PracticeFast Wavelet Tree Construction in Practice
Fast Wavelet Tree Construction in Practice
 
Seeing with Python presented at PyCon AU 2014
Seeing with Python presented at PyCon AU 2014Seeing with Python presented at PyCon AU 2014
Seeing with Python presented at PyCon AU 2014
 
Introduction to RevKit
Introduction to RevKitIntroduction to RevKit
Introduction to RevKit
 
Reversible Logic Synthesis and RevKit
Reversible Logic Synthesis and RevKitReversible Logic Synthesis and RevKit
Reversible Logic Synthesis and RevKit
 
High Performance Systems Without Tears - Scala Days Berlin 2018
High Performance Systems Without Tears - Scala Days Berlin 2018High Performance Systems Without Tears - Scala Days Berlin 2018
High Performance Systems Without Tears - Scala Days Berlin 2018
 
Deep dumpster diving 2010
Deep dumpster diving 2010Deep dumpster diving 2010
Deep dumpster diving 2010
 
Python opcodes
Python opcodesPython opcodes
Python opcodes
 
TCO in Python via bytecode manipulation.
TCO in Python via bytecode manipulation.TCO in Python via bytecode manipulation.
TCO in Python via bytecode manipulation.
 
On Mining Bitcoins - Fundamentals & Outlooks
On Mining Bitcoins - Fundamentals & OutlooksOn Mining Bitcoins - Fundamentals & Outlooks
On Mining Bitcoins - Fundamentals & Outlooks
 
Understanding the Disruptor
Understanding the DisruptorUnderstanding the Disruptor
Understanding the Disruptor
 
Segmentation Faults, Page Faults, Processes, Threads, and Tasks
Segmentation Faults, Page Faults, Processes, Threads, and TasksSegmentation Faults, Page Faults, Processes, Threads, and Tasks
Segmentation Faults, Page Faults, Processes, Threads, and Tasks
 
Java Performance Tweaks
Java Performance TweaksJava Performance Tweaks
Java Performance Tweaks
 
ZK Study Club: Sumcheck Arguments and Their Applications
ZK Study Club: Sumcheck Arguments and Their ApplicationsZK Study Club: Sumcheck Arguments and Their Applications
ZK Study Club: Sumcheck Arguments and Their Applications
 
Yurii Shevtsov "V8 + libuv = Node.js. Under the hood"
Yurii Shevtsov "V8 + libuv = Node.js. Under the hood"Yurii Shevtsov "V8 + libuv = Node.js. Under the hood"
Yurii Shevtsov "V8 + libuv = Node.js. Under the hood"
 
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
 
Efficient Two-level Homomorphic Encryption in Prime-order Bilinear Groups and...
Efficient Two-level Homomorphic Encryption in Prime-order Bilinear Groups and...Efficient Two-level Homomorphic Encryption in Prime-order Bilinear Groups and...
Efficient Two-level Homomorphic Encryption in Prime-order Bilinear Groups and...
 
Faster Python, FOSDEM
Faster Python, FOSDEMFaster Python, FOSDEM
Faster Python, FOSDEM
 
Timur Shemsedinov "Пишу на колбеках, а что... (Асинхронное программирование)"
Timur Shemsedinov "Пишу на колбеках, а что... (Асинхронное программирование)"Timur Shemsedinov "Пишу на колбеках, а что... (Асинхронное программирование)"
Timur Shemsedinov "Пишу на колбеках, а что... (Асинхронное программирование)"
 
Functional Reactive Programming with RxJS
Functional Reactive Programming with RxJSFunctional Reactive Programming with RxJS
Functional Reactive Programming with RxJS
 

Similar to Next Generation Indexes For Big Data Engineering (ODSC East 2018)

Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
Spark Summit
 
Deep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instancesDeep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instances
Amazon Web Services
 
HeroLympics Eng V03 Henk Vd Valk
HeroLympics  Eng V03 Henk Vd ValkHeroLympics  Eng V03 Henk Vd Valk
HeroLympics Eng V03 Henk Vd Valk
hvdvalk
 
Joblib: Lightweight pipelining for parallel jobs (v2)
Joblib:  Lightweight pipelining for parallel jobs (v2)Joblib:  Lightweight pipelining for parallel jobs (v2)
Joblib: Lightweight pipelining for parallel jobs (v2)
Marcel Caraciolo
 
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
Altinity Ltd
 
Interview with Anatoliy Kuznetsov, the author of BitMagic C++ library
Interview with Anatoliy Kuznetsov, the author of BitMagic C++ libraryInterview with Anatoliy Kuznetsov, the author of BitMagic C++ library
Interview with Anatoliy Kuznetsov, the author of BitMagic C++ library
PVS-Studio
 
Porting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to RustPorting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to Rust
Evan Chan
 
Programmable Exascale Supercomputer
Programmable Exascale SupercomputerProgrammable Exascale Supercomputer
Programmable Exascale Supercomputer
Sagar Dolas
 
Data Science Connect, July 22nd 2014 @IBM Innovation Center Zurich
Data Science Connect, July 22nd 2014 @IBM Innovation Center ZurichData Science Connect, July 22nd 2014 @IBM Innovation Center Zurich
Data Science Connect, July 22nd 2014 @IBM Innovation Center Zurich
Romeo Kienzler
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
Amazon Web Services
 
Spark Streaming with Cassandra
Spark Streaming with CassandraSpark Streaming with Cassandra
Spark Streaming with Cassandra
Jacek Lewandowski
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
Amazon Web Services
 
Ehsan parallel accelerator-dec2015
Ehsan parallel accelerator-dec2015Ehsan parallel accelerator-dec2015
Ehsan parallel accelerator-dec2015
Christian Peel
 
General Purpose Computing using Graphics Hardware
General Purpose Computing using Graphics HardwareGeneral Purpose Computing using Graphics Hardware
General Purpose Computing using Graphics HardwareDaniel Blezek
 
GWT is Smarter Than You
GWT is Smarter Than YouGWT is Smarter Than You
GWT is Smarter Than You
Robert Cooper
 
Java 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, TuningJava 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, TuningCarol McDonald
 
深層学習フレームワークにおけるIntel CPU/富岳向け最適化法
深層学習フレームワークにおけるIntel CPU/富岳向け最適化法深層学習フレームワークにおけるIntel CPU/富岳向け最適化法
深層学習フレームワークにおけるIntel CPU/富岳向け最適化法
MITSUNARI Shigeo
 
Java on arm theory, applications, and workloads [dev5048]
Java on arm  theory, applications, and workloads [dev5048]Java on arm  theory, applications, and workloads [dev5048]
Java on arm theory, applications, and workloads [dev5048]
Aleksei Voitylov
 
Distributed caching-computing v3.8
Distributed caching-computing v3.8Distributed caching-computing v3.8
Distributed caching-computing v3.8
Rahul Gupta
 
GDC2014: Boosting your ARM mobile 3D rendering performance with Umbra
GDC2014: Boosting your ARM mobile 3D rendering performance with Umbra GDC2014: Boosting your ARM mobile 3D rendering performance with Umbra
GDC2014: Boosting your ARM mobile 3D rendering performance with Umbra
Umbra Software
 

Similar to Next Generation Indexes For Big Data Engineering (ODSC East 2018) (20)

Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
 
Deep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instancesDeep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instances
 
HeroLympics Eng V03 Henk Vd Valk
HeroLympics  Eng V03 Henk Vd ValkHeroLympics  Eng V03 Henk Vd Valk
HeroLympics Eng V03 Henk Vd Valk
 
Joblib: Lightweight pipelining for parallel jobs (v2)
Joblib:  Lightweight pipelining for parallel jobs (v2)Joblib:  Lightweight pipelining for parallel jobs (v2)
Joblib: Lightweight pipelining for parallel jobs (v2)
 
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
 
Interview with Anatoliy Kuznetsov, the author of BitMagic C++ library
Interview with Anatoliy Kuznetsov, the author of BitMagic C++ libraryInterview with Anatoliy Kuznetsov, the author of BitMagic C++ library
Interview with Anatoliy Kuznetsov, the author of BitMagic C++ library
 
Porting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to RustPorting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to Rust
 
Programmable Exascale Supercomputer
Programmable Exascale SupercomputerProgrammable Exascale Supercomputer
Programmable Exascale Supercomputer
 
Data Science Connect, July 22nd 2014 @IBM Innovation Center Zurich
Data Science Connect, July 22nd 2014 @IBM Innovation Center ZurichData Science Connect, July 22nd 2014 @IBM Innovation Center Zurich
Data Science Connect, July 22nd 2014 @IBM Innovation Center Zurich
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
 
Spark Streaming with Cassandra
Spark Streaming with CassandraSpark Streaming with Cassandra
Spark Streaming with Cassandra
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
 
Ehsan parallel accelerator-dec2015
Ehsan parallel accelerator-dec2015Ehsan parallel accelerator-dec2015
Ehsan parallel accelerator-dec2015
 
General Purpose Computing using Graphics Hardware
General Purpose Computing using Graphics HardwareGeneral Purpose Computing using Graphics Hardware
General Purpose Computing using Graphics Hardware
 
GWT is Smarter Than You
GWT is Smarter Than YouGWT is Smarter Than You
GWT is Smarter Than You
 
Java 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, TuningJava 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, Tuning
 
深層学習フレームワークにおけるIntel CPU/富岳向け最適化法
深層学習フレームワークにおけるIntel CPU/富岳向け最適化法深層学習フレームワークにおけるIntel CPU/富岳向け最適化法
深層学習フレームワークにおけるIntel CPU/富岳向け最適化法
 
Java on arm theory, applications, and workloads [dev5048]
Java on arm  theory, applications, and workloads [dev5048]Java on arm  theory, applications, and workloads [dev5048]
Java on arm theory, applications, and workloads [dev5048]
 
Distributed caching-computing v3.8
Distributed caching-computing v3.8Distributed caching-computing v3.8
Distributed caching-computing v3.8
 
GDC2014: Boosting your ARM mobile 3D rendering performance with Umbra
GDC2014: Boosting your ARM mobile 3D rendering performance with Umbra GDC2014: Boosting your ARM mobile 3D rendering performance with Umbra
GDC2014: Boosting your ARM mobile 3D rendering performance with Umbra
 

More from Daniel Lemire

Accurate and efficient software microbenchmarks
Accurate and efficient software microbenchmarksAccurate and efficient software microbenchmarks
Accurate and efficient software microbenchmarks
Daniel Lemire
 
Parsing JSON Really Quickly: Lessons Learned
Parsing JSON Really Quickly: Lessons LearnedParsing JSON Really Quickly: Lessons Learned
Parsing JSON Really Quickly: Lessons Learned
Daniel Lemire
 
Ingénierie de la performance au sein des mégadonnées
Ingénierie de la performance au sein des mégadonnéesIngénierie de la performance au sein des mégadonnées
Ingénierie de la performance au sein des mégadonnées
Daniel Lemire
 
SIMD Compression and the Intersection of Sorted Integers
SIMD Compression and the Intersection of Sorted IntegersSIMD Compression and the Intersection of Sorted Integers
SIMD Compression and the Intersection of Sorted Integers
Daniel Lemire
 
Decoding billions of integers per second through vectorization
Decoding billions of integers per second through vectorizationDecoding billions of integers per second through vectorization
Decoding billions of integers per second through vectorization
Daniel Lemire
 
Logarithmic Discrete Wavelet Transform for High-Quality Medical Image Compres...
Logarithmic Discrete Wavelet Transform for High-Quality Medical Image Compres...Logarithmic Discrete Wavelet Transform for High-Quality Medical Image Compres...
Logarithmic Discrete Wavelet Transform for High-Quality Medical Image Compres...
Daniel Lemire
 
MaskedVByte: SIMD-accelerated VByte
MaskedVByte: SIMD-accelerated VByteMaskedVByte: SIMD-accelerated VByte
MaskedVByte: SIMD-accelerated VByte
Daniel Lemire
 
Roaring Bitmaps (January 2016)
Roaring Bitmaps (January 2016)Roaring Bitmaps (January 2016)
Roaring Bitmaps (January 2016)
Daniel Lemire
 
Roaring Bitmap : June 2015 report
Roaring Bitmap : June 2015 reportRoaring Bitmap : June 2015 report
Roaring Bitmap : June 2015 report
Daniel Lemire
 
La vectorisation des algorithmes de compression
La vectorisation des algorithmes de compression La vectorisation des algorithmes de compression
La vectorisation des algorithmes de compression
Daniel Lemire
 
Decoding billions of integers per second through vectorization
Decoding billions of integers per second through vectorization  Decoding billions of integers per second through vectorization
Decoding billions of integers per second through vectorization
Daniel Lemire
 
Extracting, Transforming and Archiving Scientific Data
Extracting, Transforming and Archiving Scientific DataExtracting, Transforming and Archiving Scientific Data
Extracting, Transforming and Archiving Scientific Data
Daniel Lemire
 
Innovation without permission: from Codd to NoSQL
Innovation without permission: from Codd to NoSQLInnovation without permission: from Codd to NoSQL
Innovation without permission: from Codd to NoSQL
Daniel Lemire
 
Write good papers
Write good papersWrite good papers
Write good papers
Daniel Lemire
 
Faster Column-Oriented Indexes
Faster Column-Oriented IndexesFaster Column-Oriented Indexes
Faster Column-Oriented Indexes
Daniel Lemire
 
Compressing column-oriented indexes
Compressing column-oriented indexesCompressing column-oriented indexes
Compressing column-oriented indexes
Daniel Lemire
 
All About Bitmap Indexes... And Sorting Them
All About Bitmap Indexes... And Sorting ThemAll About Bitmap Indexes... And Sorting Them
All About Bitmap Indexes... And Sorting Them
Daniel Lemire
 
A Comparison of Five Probabilistic View-Size Estimation Techniques in OLAP
A Comparison of Five Probabilistic View-Size Estimation Techniques in OLAPA Comparison of Five Probabilistic View-Size Estimation Techniques in OLAP
A Comparison of Five Probabilistic View-Size Estimation Techniques in OLAP
Daniel Lemire
 
Tag-Cloud Drawing: Algorithms for Cloud Visualization
Tag-Cloud Drawing: Algorithms for Cloud VisualizationTag-Cloud Drawing: Algorithms for Cloud Visualization
Tag-Cloud Drawing: Algorithms for Cloud Visualization
Daniel Lemire
 

More from Daniel Lemire (20)

Accurate and efficient software microbenchmarks
Accurate and efficient software microbenchmarksAccurate and efficient software microbenchmarks
Accurate and efficient software microbenchmarks
 
Parsing JSON Really Quickly: Lessons Learned
Parsing JSON Really Quickly: Lessons LearnedParsing JSON Really Quickly: Lessons Learned
Parsing JSON Really Quickly: Lessons Learned
 
Ingénierie de la performance au sein des mégadonnées
Ingénierie de la performance au sein des mégadonnéesIngénierie de la performance au sein des mégadonnées
Ingénierie de la performance au sein des mégadonnées
 
SIMD Compression and the Intersection of Sorted Integers
SIMD Compression and the Intersection of Sorted IntegersSIMD Compression and the Intersection of Sorted Integers
SIMD Compression and the Intersection of Sorted Integers
 
Decoding billions of integers per second through vectorization
Decoding billions of integers per second through vectorizationDecoding billions of integers per second through vectorization
Decoding billions of integers per second through vectorization
 
Logarithmic Discrete Wavelet Transform for High-Quality Medical Image Compres...
Logarithmic Discrete Wavelet Transform for High-Quality Medical Image Compres...Logarithmic Discrete Wavelet Transform for High-Quality Medical Image Compres...
Logarithmic Discrete Wavelet Transform for High-Quality Medical Image Compres...
 
MaskedVByte: SIMD-accelerated VByte
MaskedVByte: SIMD-accelerated VByteMaskedVByte: SIMD-accelerated VByte
MaskedVByte: SIMD-accelerated VByte
 
Roaring Bitmaps (January 2016)
Roaring Bitmaps (January 2016)Roaring Bitmaps (January 2016)
Roaring Bitmaps (January 2016)
 
Roaring Bitmap : June 2015 report
Roaring Bitmap : June 2015 reportRoaring Bitmap : June 2015 report
Roaring Bitmap : June 2015 report
 
La vectorisation des algorithmes de compression
La vectorisation des algorithmes de compression La vectorisation des algorithmes de compression
La vectorisation des algorithmes de compression
 
OLAP and more
OLAP and moreOLAP and more
OLAP and more
 
Decoding billions of integers per second through vectorization
Decoding billions of integers per second through vectorization  Decoding billions of integers per second through vectorization
Decoding billions of integers per second through vectorization
 
Extracting, Transforming and Archiving Scientific Data
Extracting, Transforming and Archiving Scientific DataExtracting, Transforming and Archiving Scientific Data
Extracting, Transforming and Archiving Scientific Data
 
Innovation without permission: from Codd to NoSQL
Innovation without permission: from Codd to NoSQLInnovation without permission: from Codd to NoSQL
Innovation without permission: from Codd to NoSQL
 
Write good papers
Write good papersWrite good papers
Write good papers
 
Faster Column-Oriented Indexes
Faster Column-Oriented IndexesFaster Column-Oriented Indexes
Faster Column-Oriented Indexes
 
Compressing column-oriented indexes
Compressing column-oriented indexesCompressing column-oriented indexes
Compressing column-oriented indexes
 
All About Bitmap Indexes... And Sorting Them
All About Bitmap Indexes... And Sorting ThemAll About Bitmap Indexes... And Sorting Them
All About Bitmap Indexes... And Sorting Them
 
A Comparison of Five Probabilistic View-Size Estimation Techniques in OLAP
A Comparison of Five Probabilistic View-Size Estimation Techniques in OLAPA Comparison of Five Probabilistic View-Size Estimation Techniques in OLAP
A Comparison of Five Probabilistic View-Size Estimation Techniques in OLAP
 
Tag-Cloud Drawing: Algorithms for Cloud Visualization
Tag-Cloud Drawing: Algorithms for Cloud VisualizationTag-Cloud Drawing: Algorithms for Cloud Visualization
Tag-Cloud Drawing: Algorithms for Cloud Visualization
 

Recently uploaded

The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 

Recently uploaded (20)

The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 

Next Generation Indexes For Big Data Engineering (ODSC East 2018)

  • 1.
  • 2. Next Generation Indexes For Big Data Engineering Daniel Lemire and collaborators blog: https://lemire.me twitter: @lemire Université du Québec (TÉLUQ) Montreal
  • 3. “One Size Fits All”: An Idea Whose Time Has Come and Gone (Stonebraker, 2005) 3
  • 4. Rediscover Unix In 2018, Big Data Engineering is made of several specialized and re‑usable components: Calcite : SQL + optimization Hadoop etc. 4
  • 5. "Make your own database engine from parts" We are in a Cambrian explosion, with thousands of organizations and companies building their custom high‑speed systems. Specialized used cases Heterogeneous data (not everything is in your Oracle DB) 5
  • 6. For high‑speed in data engineering you need... High‑level optimizations (e.g., Calcite) Indexes (e.g., Pilosa, Elasticsearch) Great compression routimnes Specialized data structures .... 6
  • 7. Sets A fundamental concept (sets of documents, identifiers, tuples...) → For performance, we often work with sets of integers (identifiers). → Often 32‑bit integers suffice (local identifiers). 7
  • 8. tests : x ∈ S? intersections : S ∩ S , unions : S ∪ S , differences : S ∖ S Similarity (Jaccard/Tanimoto): ∣S ∩ S ∣/∣S ∪ S ∣ Iteration for x in S do print(x) 2 1 2 1 2 1 1 1 1 2 8
  • 9. How to implement sets? sorted arrays ( std::vector<uint32_t> ) hash tables ( java.util.HashSet<Integer> ,  std::unordered_set<uint32_t> ) … bitmap ( java.util.BitSet ) compressed bitmaps 9
  • 10. Arrays are your friends while (low <= high) { int mI = (low + high) >>> 1; int m = array.get(mI); if (m < key) { low = mI + 1; } else if (m > key) { high = mI - 1; } else { return mI; } } return -(low + 1); 10
  • 11. Hash tables value x at index h(x) random access to a value in expected constant‑time much faster than arrays 11
  • 12. in‑order access is kind of terrible [15, 3, 0, 6, 11, 4, 5, 9, 12, 13, 8, 2, 1, 14, 10, 7] [15, 3, 0, 6, 11, 4, 5, 9, 12, 13, 8, 2, 1, 14, 10, 7] [15, 3, 0, 6, 11, 4, 5, 9, 12, 13, 8, 2, 1, 14, 10, 7] [15, 3, 0, 6, 11, 4, 5, 9, 12, 13, 8, 2, 1, 14, 10, 7] [15, 3, 0, 6, 11, 4, 5, 9, 12, 13, 8, 2, 1, 14, 10, 7] [15, 3, 0, 6, 11, 4, 5, 9, 12, 13, 8, 2, 1, 14, 10, 7] (Robin Hood, linear probing, MurmurHash3 hash function) 12
  • 13. Set operations on hash tables h1 <- hash set h2 <- hash set ... for(x in h1) { insert x in h2 // cache miss? } 13
  • 14. "Crash" Swift var S1 = Set<Int>(1...size) var S2 = Set<Int>() for i in d { S2.insert(i) } 14
  • 15. Some numbers: half an hour for 64M keys size time (s) 1M 0.8 8M 22 64M 1400 Maps and sets can have quadratic‑time performance https://lemire.me/blog/2017/01/30/maps‑and‑sets‑can‑have‑quadratic‑time‑performance/ Rust hash iteration+reinsertion https://accidentallyquadratic.tumblr.com/post/153545455987/rust‑hash‑iteration‑reinsertion 15
  • 16. 16
  • 17. Bitmaps Efficient way to represent sets of integers. For example, 0, 1, 3, 4 becomes  0b11011 or "27". {0} →  0b00001  {0, 3} →  0b01001  {0, 3, 4} →  0b11001  {0, 1, 3, 4} →  0b11011  17
  • 18. Manipulate a bitmap 64‑bit processor. Given x , word index is  x/64 and bit index  x % 64 . add(x) { array[x / 64] |= (1 << (x % 64)) } 18
  • 19. How fast is it? index = x / 64 -> a shift mask = 1 << ( x % 64) -> a shift array[ index ] |- mask -> a OR with memory One bit every ≈ 1.65 cycles because of superscalarity 19
  • 20. Bit parallelism Intersection between {0, 1, 3} and {1, 3} a single AND operation between  0b1011 and  0b1010 . Result is 0b1010 or {1, 3}. No branching! 20
  • 21. Bitmaps love wide registers SIMD: Single Intruction Multiple Data SSE (Pentium 4), ARM NEON 128 bits AVX/AVX2 (256 bits) AVX‑512 (512 bits) AVX‑512 is now available (e.g., from Dell!) with Skylake‑X processors. 21
  • 22. Bitsets can take too much memory {1, 32000, 64000} : 1000 bytes for three values We use compression! 22
  • 23. Git (GitHub) utilise EWAH Run‑length encoding Example: 000000001111111100 est 00000000 − 11111111 − 00 Code long runs of 0s or 1s efficiently. https://github.com/git/git/blob/master/ewah/bitmap.c 23
  • 24. Complexity Intersection : O(∣S ∣ + ∣S ∣) or O(min(∣S ∣, ∣S ∣)) In‑place union (S ← S ∪ S ): O(∣S ∣ + ∣S ∣) or O(∣S ∣) 1 2 1 2 2 1 2 1 2 2 24
  • 25. Roaring Bitmaps http://roaringbitmap.org/ Apache Lucene, Solr et Elasticsearch, Metamarkets’ Druid, Apache Spark, Apache Hive, Apache Tez, Netflix Atlas, LinkedIn Pinot, InfluxDB, Pilosa, Microsoft Visual Studio Team Services (VSTS), Couchbase's Bleve, Intel’s Optimized Analytics Package (OAP), Apache Hivemall, eBay’s Apache Kylin. Java, C, Go (interoperable) Roaring bitmaps 25
  • 26. Hybrid model Set of containers sorted arrays ({1,20,144}) bitset (0b10000101011) runs ([0,10],[15,20]) Related to: O'Neil's RIDBit + BitMagic‑ Roaring bitmaps 26
  • 28. Roaring All containers are small (8 kB), fit in CPU cache We predict the output container type during computations E.g., when array gets too large, we switch to a bitset Union of two large arrays is materialized as a bitset... Dozens of heuristics... sorting networks and so on Roaring bitmaps 28
  • 29. Use Roaring for bitmap compression whenever possible. Do not use other bitmap compression methods (Wang et al., SIGMOD 2017) Roaring bitmaps 29
  • 30. Unions of 200 bitmaps (cycles per input input value) bitset array hash table Roaring census1881 524 32 195 15.1 weather 15.3 32 195 5.38 bitset array hash table Roaring census1881 9.85 542 1010 2.6 weather 0.35 94 237 0.16 Roaring bitmaps 30
  • 31. Integer compression "Standard" technique: VByte, VarInt, VInt Use 1, 2, 3, 4, ... byte per integer Use one bit per byte to indicate the length of the integers in bytes Lucene, Protocol Buffers, etc. Integer compression 31
  • 32. varint‑GB from Google VByte: one branch per integer varint‑GB: one branch per 4 integers each 4‑integer block is preceded byte a control byte Integer compression 32
  • 33. Vectorisation Stepanov (STL in C++) working for Amazon proposed varint‑G8IU Use vectorization (SIMD) Patented Fastest byte‑oriented compression technique (until recently) SIMD‑Based Decoding of Posting Lists, CIKM 2011 https://stepanovpapers.com/SIMD_Decoding_TR.pdf Integer compression 33
  • 34. Observations from Stepanov et al. We can vectorize Google's varint‑GB, but it is not as fast as varint‑G8IU Integer compression 34
  • 35. Stream VByte Reuse varint‑GB from Google But instead of mixing control bytes and data bytes, ... We store control bytes separately and consecutively... Daniel Lemire, Nathan Kurz, Christoph Rupp Stream VByte: Faster Byte‑Oriented Integer Compression Information Processing Letters 130, 2018 Integer compression 35
  • 37. Stream VByte is used by... Redis (within RediSearch) https://redislabs.com upscaledb https://upscaledb.com Trinity https://github.com/phaistos‑networks/Trinity Integer compression 37
  • 38. Dictionary coding Use, e.g., by Apache Arrow Given a list of values: "Montreal", "Toronto", "Boston", "Montreal", "Boston"... Map to integers 0, 1, 2, 0, 2 Compress integers: Given 2 distinct values... Can use n‑bit per values (binary packing, patched coding, frame‑of‑reference) n Integer compression 38
  • 39. Dictionary coding + SIMD dict. size bits per value scalar AVX2 (256‑bit) AVX‑512 (512‑bit) 32 5 8 3 1.5 1024 10 8 3.5 2 65536 16 12 5.5 4.5 (cycles per value decoded) https://github.com/lemire/dictionary Integer compression 39
  • 40. To learn more... Blog (twice a week) : https://lemire.me/blog/ GitHub: https://github.com/lemire Home page : https://lemire.me/en/ CRSNG : Faster Compressed Indexes On Next‑Generation Hardware (2017‑2022) Twitter @lemire @lemire 40