x86/x64 Optimization Study Group #4 (x86/x64最適化勉強会#4)
An x86-optimized rank/select
dictionary for bit sequences
                             2012/6/16
                     Takeshi Yamamuro




What Is a Succinct Data Structure?

SDS: Succinct Data Structure
        • Recently gaining popularity in some areas
              – Both research and engineering

        • Not a data structure, but a data representation
              – A compressed representation for other data structures
              – e.g., alphabets, trees, and graphs

        • Transparent operations without explicit unpacking
              – e.g., succinct LZ77 compression*1

*1 Kreft, S. and Navarro, G.: LZ77-Like Compression with Fast Random Access, In Proceedings of DCC, 2010

More Details
• SDS = Succinct Data + Succinct Index

• Succinct Data
  – Compact representation of the target data
  – Close to the information-theoretic lower bound
               e.g., if there are N possible patterns, the lower bound is log N bits


• Succinct Index
  – O(1) operations on the target data
  – o(N) space cost: asymptotically negligible
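
As a worked instance of the lower bound above (an illustrative example, not taken from the slides): the number of binary trees with n nodes is the Catalan number C_n, so any representation needs about 2n bits, far fewer than the 2n pointers of a conventional node-based layout.

```latex
\[
  \log_2 C_n
  = \log_2 \left( \frac{1}{n+1} \binom{2n}{n} \right)
  = 2n - \Theta(\log n) \ \text{bits}
\]
```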




More Details

   If you need more information, ...




                  cited from: http://goo.gl/rkQ5z
A rank/select dictionary for SDS




Rank/Select Operations
• SDS are composed of rank/select operations
  – Rank/select are called many times internally

• Rank/select for succinct bit sequences: B[i]
  – rankx(n, B): the number of occurrences of x in B[0...n]
  – selectx(n, B): the position of the n-th occurrence of x in B[]



        i   0    1     2    3   4    5   6     7   8
     B[i]   1    0     1    1   0    0   1     1   0
                     rank1(5, B)=3   select1(4, B)=6
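
The two definitions can be sketched as naive linear scans. This is only a reference for the semantics, not the optimized implementation discussed later. The function names are hypothetical, bits are stored one per byte for readability, and the sketch assumes that rank counts the first n bits B[0..n-1] and that select takes a 1-based n, which reproduces both results in the table.

```c
#include <assert.h>
#include <stddef.h>

/* Count occurrences of bit value x among the first n bits B[0..n-1]. */
size_t rank_naive(const unsigned char *B, size_t n, unsigned char x) {
    assert(B != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (B[i] == x) count++;
    return count;
}

/* Position of the n-th (1-based) occurrence of x in B[0..len-1],
 * or len if x occurs fewer than n times. */
size_t select_naive(const unsigned char *B, size_t len, size_t n, unsigned char x) {
    assert(B != NULL);
    size_t seen = 0;
    for (size_t i = 0; i < len; i++)
        if (B[i] == x && ++seen == n) return i;
    return len;
}
```

Both scans cost O(N) per query; the index structures in the following slides exist to bring this down to O(1).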


Rank/Select Operations
• Available rank/select implementations
  – ux-trie: http://code.google.com/p/ux-trie/
  – rx: http://code.google.com/p/mozc/
  – marisa-trie: http://code.google.com/p/marisa-trie/


• Today's contribution
  – x86-optimized rank/select
  – https://github.com/maropu/dbitv

Performance Results
        • Benchmark setup*1
              – Generate a random bit sequence with 50% density
              – Issue random rank/select queries over the bits
              – CPU: Intel Core-i5 U470@1.33GHz

        • Observed latency
              – Median latency over 11 trials
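
The setup above could be approximated with two small helpers. This is a sketch under assumptions: the function names are hypothetical, `rand()` stands in for whatever generator the actual benchmark uses, and the timing loop itself is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Fill words[] with pseudo-random bits; each bit is 1 with probability about 50%. */
void fill_random_bits(uint32_t *words, size_t nwords, unsigned seed) {
    srand(seed);
    for (size_t i = 0; i < nwords; i++) {
        /* rand() guarantees only 15 random bits, so combine three calls. */
        words[i] = ((uint32_t)rand() << 30) ^ ((uint32_t)rand() << 15) ^ (uint32_t)rand();
    }
}

static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of an odd number of trial latencies (sorts the array in place). */
double median_latency(double *trials, size_t ntrials) {
    assert(ntrials % 2 == 1);
    qsort(trials, ntrials, sizeof trials[0], cmp_double);
    return trials[ntrials / 2];
}
```

Taking the median of 11 trials rather than the mean keeps a single slow run (for example, one disturbed by other processes) from skewing the reported latency.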




*1 Reference: http://d.hatena.ne.jp/s-yata/20111216/1324032373
Performance Results: Rank

[Figure: averaged rank latency (ns, log scale from 1.E+00 to 1.E+03) versus bit length, for ux, rx, marisa, and opt]

Performance Results: Select

[Figure: averaged select latency (ns, log scale from 1.E+00 to 1.E+04) versus bit length, for ux, rx, marisa, and opt]

Implementation Details




Implementation: Four Russians Method
• Rule: O(1) operation cost with o(N) space

 B[] = a sequence of bits (N bits)

Implementation: Four Russians Method
• Rule: O(1) operation cost with o(N) space

  B[] = a sequence of bits, split into blocks of log^2 N bits
  L[] = l1, l2, ... (one entry per block)

• Split B[] into fixed-length blocks of log^2 N bits
• Cumulative counts of 1s are pre-computed in L[]

\[
  \mathrm{rank}_1(x, B) = \sum_{i=1}^{x} B[i]
  = \underbrace{\sum_{i=1}^{k \log^2 N} B[i]}_{L[k]:\ O(1)}
  + \underbrace{\sum_{i=k \log^2 N + 1}^{x} B[i]}_{O(\log^2 N)},
  \qquad k = \left\lfloor x / \log^2 N \right\rfloor
\]
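
The one-level scheme can be sketched as follows. This is a minimal illustration with hypothetical names: bits are stored one per byte for readability, the block length `blk` stands in for log^2 N, and rank counts the first x bits (0-based storage, so it corresponds to the 1-based sum in the formula).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Build L[]: L[k] = number of 1s in the first k blocks, i.e. in B[0 .. k*blk - 1].
 * Returns NULL on allocation failure; the caller frees the result. */
size_t *build_L(const unsigned char *B, size_t nbits, size_t blk) {
    assert(blk > 0);
    size_t nblocks = nbits / blk + 1;
    size_t *L = malloc(nblocks * sizeof *L);
    if (L == NULL) return NULL;
    size_t count = 0;
    for (size_t i = 0; i < nbits; i++) {
        if (i % blk == 0) L[i / blk] = count;
        count += B[i];
    }
    if (nbits % blk == 0) L[nbits / blk] = count;
    return L;
}

/* rank1 over the first x bits (x <= nbits): one table lookup
 * plus a scan of fewer than blk remaining bits. */
size_t rank1_one_level(const unsigned char *B, const size_t *L, size_t blk, size_t x) {
    size_t k = x / blk;
    size_t count = L[k];
    for (size_t i = k * blk; i < x; i++) count += B[i];
    return count;
}
```

The remaining scan is what the second level S[] and the popcount step remove in the following slides.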
Implementation: Four Russians Method
• Rule: O(1) operation cost with o(N) space

  B[] = a sequence of bits, split into blocks of log^2 N bits
  L[] = l1, l2, ... (one entry per block)

• L[]: o(N) space cost

\[
  \frac{N}{\log^2 N} \cdot \log N = O\!\left( \frac{N}{\log N} \right) = o(N)
\]

Implementation: Four Russians Method
• Rule: O(1) operation cost with o(N) space

  B[] = a sequence of bits, split into blocks of log^2 N bits
  L[] = l1, l2, ... (one entry per block)
  S[] = s1, s2, ... (one entry per sub-block of 1/2 log N bits)

• Split each block again into fixed-length sub-blocks of 1/2 log N bits
• Counts of 1s are pre-computed in S[]

\[
  \mathrm{rank}_1(x, B) = \sum_{i=1}^{x} B[i]
  = \underbrace{\sum_{i=1}^{j \log^2 N} B[i]}_{L[j]:\ O(1)}
  + \underbrace{\sum_{i=j \log^2 N + 1}^{k \cdot \frac{1}{2} \log N} B[i]}_{S[k]:\ O(1)}
  + \underbrace{\sum_{i=k \cdot \frac{1}{2} \log N + 1}^{x} B[i]}_{O(\log N)}
\]
\[
  j = \left\lfloor x / \log^2 N \right\rfloor, \qquad
  k = \left\lfloor x / \tfrac{1}{2} \log N \right\rfloor
\]
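Implementation: Four Russians Method
• Rule: O(1) operation cost with o(N) space

  B[] = a sequence of bits, split into blocks of log^2 N bits
  L[] = l1, l2, ... (one entry per block)
  S[] = s1, s2, ... (one entry per sub-block of 1/2 log N bits)

• S[]: o(N) space cost

\[
  \frac{N}{\frac{1}{2} \log N} \cdot \log\!\left( \log^2 N \right)
  = O\!\left( N \cdot \frac{\log \log N}{\log N} \right) = o(N)
\]
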
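Implementation: Four Russians Method
• Rule: O(1) operation cost with o(N) space

  B[] = a sequence of bits, split into blocks of log^2 N bits
  L[] = l1, l2, ... (one entry per block)
  S[] = s1, s2, ... (one entry per sub-block of 1/2 log N bits)

• O(1) popcount/table lookup for the last term

\[
  \mathrm{rank}_1(x, B) = \sum_{i=1}^{x} B[i]
  = \underbrace{\sum_{i=1}^{j \log^2 N} B[i]}_{L[j]:\ O(1)}
  + \underbrace{\sum_{i=j \log^2 N + 1}^{k \cdot \frac{1}{2} \log N} B[i]}_{S[k]:\ O(1)}
  + \underbrace{\sum_{i=k \cdot \frac{1}{2} \log N + 1}^{x} B[i]}_{O(\log N)\ \to\ O(1)}
\]
\[
  j = \left\lfloor x / \log^2 N \right\rfloor, \qquad
  k = \left\lfloor x / \tfrac{1}{2} \log N \right\rfloor
\]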
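
The last term can be evaluated in constant time by masking a machine word and counting its set bits. The sketch below shows the table-lookup variant. The names are hypothetical, and it assumes that bit i of the sequence is stored at bit position i of the word (least significant bit first), which the slides do not specify.

```c
#include <assert.h>
#include <stdint.h>

/* 8-bit lookup table: POP8[v] = number of 1s in byte v; filled by init_pop8(). */
static uint8_t POP8[256];

void init_pop8(void) {
    POP8[0] = 0;
    for (int v = 1; v < 256; v++)
        POP8[v] = (uint8_t)((v & 1) + POP8[v >> 1]);
}

/* Table-lookup popcount: four lookups for a 32-bit word. */
unsigned popcount32_table(uint32_t w) {
    return POP8[w & 0xff] + POP8[(w >> 8) & 0xff] +
           POP8[(w >> 16) & 0xff] + POP8[w >> 24];
}

/* Number of 1s among the lowest n bits (0 <= n <= 32) of w. */
unsigned rank_in_word(uint32_t w, unsigned n) {
    assert(n <= 32);
    uint32_t mask = (n == 32) ? 0xffffffffu : ((1u << n) - 1u);
    return popcount32_table(w & mask);
}
```

With GCC or Clang, `__builtin_popcount` can take the place of the table; when the target supports it, the compiler can lower that builtin to the x86 POPCNT instruction.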
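Implementation: Four Russians Method
• Rule: O(1) operation cost with o(N) space

  B[] = a sequence of bits, split into blocks of log^2 N bits
  L[] = l1, l2, ... (one entry per block)
  S[] = s1, s2, ... (one entry per sub-block of 1/2 log N bits)

• As a result, o(N) space cost

\[
  \underbrace{\frac{N}{\log N}}_{L[]\ \text{size}}
  + \underbrace{\frac{4 N \log \log N}{\log N}}_{S[]\ \text{size}}
  = O\!\left( N \cdot \frac{\log \log N}{\log N} \right) = o(N)
\]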
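
Plugging a concrete size into the bound gives a feel for the constants. The numbers below are a worked example, not from the slides, and assume base-2 logarithms and N = 2^32 bits (512 MiB of raw bits).

```latex
\[
  |L| = \frac{N}{\log_2 N} = \frac{2^{32}}{32} = 2^{27}\ \text{bits} = 16\ \text{MiB}
\]
\[
  |S| = \frac{4 N \log_2 \log_2 N}{\log_2 N}
      = \frac{4 \cdot 2^{32} \cdot 5}{32}
      = 5 \cdot 2^{29}\ \text{bits} = 320\ \text{MiB}
\]
```

At this size S[] alone is 62.5% of the raw bits: the o(N) bound is asymptotic, so the overhead shrinks only slowly as N grows.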
Implementation: Practice
• Low computation cost, but high cache penalties
   – 3 cache/TLB misses per rank: one each in B[], L[], and S[]

   ex. rank1(402=256*1+32*4+18, B), with 256-bit large blocks and 32-bit small blocks

  B[]: 01..000000....101......0 0110....001...............0 0000100 ...   <- Miss! (popcount the remaining bits on the left)
  L[]:            18                     21                     …         <- Miss!
  S[]: 1 3 4 6 7 9 10 13 2 5 7 9 12 13 18 19 1 3 7 …                       <- Miss!
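
The three-array layout can be sketched as below, using the 256-bit and 32-bit block sizes from the example. The struct, the function names, and the element widths (32-bit entries in L[], 8-bit entries in S[]) are assumptions for illustration; rank counts the first x bits, with bit i stored at bit position i % 32 of word i / 32. Each query reads L[], S[], and B[] from three unrelated addresses, which is where the three misses come from.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    const uint32_t *B;  /* bit sequence, 32 bits per word */
    uint32_t *L;        /* L[j]: 1s before the j-th 256-bit block */
    uint8_t *S;         /* S[k]: 1s before word k inside its 256-bit block (at most 224) */
    size_t nwords;
} rank_index;

static unsigned popcount32(uint32_t w) {
    unsigned c = 0;
    while (w) { w &= w - 1; c++; }  /* clear the lowest set bit */
    return c;
}

/* Build L[] and S[] over nwords 32-bit words; returns 0 on success, -1 on failure. */
int rank_index_build(rank_index *idx, const uint32_t *B, size_t nwords) {
    idx->B = B;
    idx->nwords = nwords;
    idx->L = malloc((nwords / 8 + 1) * sizeof *idx->L);
    idx->S = malloc((nwords + 1) * sizeof *idx->S);
    if (!idx->L || !idx->S) { free(idx->L); free(idx->S); return -1; }
    uint32_t total = 0, in_block = 0;
    for (size_t k = 0; k <= nwords; k++) {
        if (k % 8 == 0) { idx->L[k / 8] = total; in_block = 0; }
        idx->S[k] = (uint8_t)in_block;
        if (k < nwords) {
            unsigned c = popcount32(B[k]);
            total += c;
            in_block += c;
        }
    }
    return 0;
}

/* rank1 over the first x bits (x <= 32 * nwords). */
uint32_t rank1(const rank_index *idx, size_t x) {
    assert(x <= 32 * idx->nwords);
    size_t k = x / 32;               /* small-block (word) index */
    unsigned r = (unsigned)(x % 32); /* bits left to popcount inside word k */
    uint32_t count = idx->L[k / 8] + idx->S[k];
    if (r != 0)
        count += popcount32(idx->B[k] & ((1u << r) - 1u));
    return count;
}
```

For x = 402 this reads L[1], S[12], and the low 18 bits of B[12], matching the decomposition 256*1 + 32*4 + 18.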
Implementation: Practice
• Pack the required data into a single cache line

   64B cache line = 56B chunk + padding
   56B chunk:  4B | 1B ・・・ | 12B padding | 32B of bits (0110....001..........0)
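
One way to realize such a chunk in C is sketched below. The field meanings are an assumption: the slide gives only the sizes, so the 4B field is taken to be the large-block count, the 1B fields to be eight small-block counts, and the 32B field to be 256 bits of the sequence. All names are hypothetical, and the actual layout in dbitv may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 64-byte-aligned chunk: 56 bytes of fields plus 8 bytes of tail padding. */
typedef struct {
    _Alignas(64) uint32_t large; /* 4B: 1s before this chunk (assumed meaning) */
    uint8_t small[8];            /* 8 x 1B: 1s before each 32-bit word in the chunk (assumed) */
    uint8_t pad[12];             /* 12B padding, as on the slide */
    uint32_t bits[8];            /* 32B: 256 bits of the sequence */
} rank_chunk;

_Static_assert(sizeof(rank_chunk) == 64, "a chunk must fill exactly one cache line");

static unsigned popcount32(uint32_t w) {
    unsigned c = 0;
    while (w) { w &= w - 1; c++; }  /* clear the lowest set bit */
    return c;
}

/* Fill nchunks chunks from a plain word array holding 8 * nchunks words. */
void chunks_fill(rank_chunk *chunks, size_t nchunks, const uint32_t *words) {
    uint32_t total = 0;
    for (size_t c = 0; c < nchunks; c++) {
        unsigned in_chunk = 0;
        chunks[c].large = total;
        for (unsigned w = 0; w < 8; w++) {
            chunks[c].small[w] = (uint8_t)in_chunk;
            chunks[c].bits[w] = words[8 * c + w];
            in_chunk += popcount32(words[8 * c + w]);
        }
        total += in_chunk;
    }
}

/* rank1 over the first x bits (x < 256 * nchunks); touches a single chunk. */
uint32_t rank1_packed(const rank_chunk *chunks, size_t nchunks, size_t x) {
    assert(x < 256 * nchunks);
    const rank_chunk *c = &chunks[x / 256];
    unsigned w = (unsigned)(x % 256) / 32;
    unsigned r = (unsigned)(x % 32);
    uint32_t count = c->large + c->small[w];
    if (r != 0) count += popcount32(c->bits[w] & ((1u << r) - 1u));
    return count;
}
```

Because both counters and the bits they describe sit in the same aligned 64 bytes, a rank query can incur at most one cache miss instead of three, at the price of the padding bytes.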
Implementation: Practice
• BTW, what about select?
  – Omitted due to my time limit
  – Please see the code ...


• Two implementation approaches
  – O(log N) complexity
     • ux-trie, rx, and marisa-trie
     • Binary search with rank
     • Suffers many cache/TLB misses


  – O(1) complexity
     • My implementation, designed to minimize these penalties
     • 1 rank, 1 SIMD comparison, and an O(1) bsf (bit scan forward)
     • Only 2 cache/TLB misses
     • Not implemented yet ...
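
The O(log N) approach, select by binary search over rank, can be sketched as follows. This is a generic illustration, not the code of ux-trie, rx, or marisa-trie. The names are hypothetical, and the linear-scan `rank1` is only a stand-in so that the example is self-contained; a real implementation would call an indexed O(1) rank.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in rank1: number of 1s among the first x bits (linear scan, illustration only). */
static size_t rank1(const uint32_t *B, size_t x) {
    size_t count = 0;
    for (size_t i = 0; i < x; i++) count += (B[i / 32] >> (i % 32)) & 1u;
    return count;
}

/* select1: position of the n-th 1 (n >= 1), or nbits if there are fewer than n 1s. */
size_t select1_bsearch(const uint32_t *B, size_t nbits, size_t n) {
    assert(n >= 1);
    if (rank1(B, nbits) < n) return nbits;
    size_t lo = 0, hi = nbits;  /* invariant: rank1(lo) < n <= rank1(hi) */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (rank1(B, mid) < n) lo = mid; else hi = mid;
    }
    /* rank1(lo) = n - 1 and rank1(lo + 1) = n, so bit lo is the n-th 1. */
    return lo;
}
```

Each probe of the binary search is a rank query at an unrelated position, which is why this approach suffers many cache/TLB misses on large sequences.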

More Related Content

What's hot

Multinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkMultinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache Spark
DB Tsai
 
Data assimilation with OpenDA
Data assimilation with OpenDAData assimilation with OpenDA
Data assimilation with OpenDA
nilsvanvelzen
 
Liszt los alamos national laboratory Aug 2011
Liszt los alamos national laboratory Aug 2011Liszt los alamos national laboratory Aug 2011
Liszt los alamos national laboratory Aug 2011
Ed Dodds
 
4241
42414241
The Impact of Smoothness on Model Class Selection in Nonlinear System Identif...
The Impact of Smoothness on Model Class Selection in Nonlinear System Identif...The Impact of Smoothness on Model Class Selection in Nonlinear System Identif...
The Impact of Smoothness on Model Class Selection in Nonlinear System Identif...
Yusuf Bhujwalla
 
Binary decision diagrams
Binary decision diagramsBinary decision diagrams
Binary decision diagrams
haroonrashidlone
 
2016-01 Lucene Solr spatial in 2015, NYC Meetup
2016-01 Lucene Solr spatial in 2015, NYC Meetup2016-01 Lucene Solr spatial in 2015, NYC Meetup
2016-01 Lucene Solr spatial in 2015, NYC Meetup
David Smiley
 
An32272275
An32272275An32272275
An32272275
IJERA Editor
 
Lucene/Solr spatial in 2015
Lucene/Solr spatial in 2015Lucene/Solr spatial in 2015
Lucene/Solr spatial in 2015
David Smiley
 
STAQ based Matrix estimation - initial concept (presented at hEART conference...
STAQ based Matrix estimation - initial concept (presented at hEART conference...STAQ based Matrix estimation - initial concept (presented at hEART conference...
STAQ based Matrix estimation - initial concept (presented at hEART conference...
Luuk Brederode
 
The status of the GeoServer WPS
The status of the GeoServer WPSThe status of the GeoServer WPS
The status of the GeoServer WPS
GeoSolutions
 
Reduced ordered binary decision diagram
Reduced ordered binary decision diagramReduced ordered binary decision diagram
Reduced ordered binary decision diagram
Team-VLSI-ITMU
 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model
佳蓉 倪
 
NIPS2017 Few-shot Learning and Graph Convolution
NIPS2017 Few-shot Learning and Graph ConvolutionNIPS2017 Few-shot Learning and Graph Convolution
NIPS2017 Few-shot Learning and Graph Convolution
Kazuki Fujikawa
 
19. algorithms and-complexity
19. algorithms and-complexity19. algorithms and-complexity
19. algorithms and-complexity
ashishtinku
 
Algorithm Complexity and Main Concepts
Algorithm Complexity and Main ConceptsAlgorithm Complexity and Main Concepts
Algorithm Complexity and Main Concepts
Adelina Ahadova
 
MSc Presentation
MSc PresentationMSc Presentation
MSc Presentation
eriprandopacces
 

What's hot (17)

Multinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkMultinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache Spark
 
Data assimilation with OpenDA
Data assimilation with OpenDAData assimilation with OpenDA
Data assimilation with OpenDA
 
Liszt los alamos national laboratory Aug 2011
Liszt los alamos national laboratory Aug 2011Liszt los alamos national laboratory Aug 2011
Liszt los alamos national laboratory Aug 2011
 
4241
42414241
4241
 
The Impact of Smoothness on Model Class Selection in Nonlinear System Identif...
The Impact of Smoothness on Model Class Selection in Nonlinear System Identif...The Impact of Smoothness on Model Class Selection in Nonlinear System Identif...
The Impact of Smoothness on Model Class Selection in Nonlinear System Identif...
 
Binary decision diagrams
Binary decision diagramsBinary decision diagrams
Binary decision diagrams
 
2016-01 Lucene Solr spatial in 2015, NYC Meetup
2016-01 Lucene Solr spatial in 2015, NYC Meetup2016-01 Lucene Solr spatial in 2015, NYC Meetup
2016-01 Lucene Solr spatial in 2015, NYC Meetup
 
An32272275
An32272275An32272275
An32272275
 
Lucene/Solr spatial in 2015
Lucene/Solr spatial in 2015Lucene/Solr spatial in 2015
Lucene/Solr spatial in 2015
 
STAQ based Matrix estimation - initial concept (presented at hEART conference...
STAQ based Matrix estimation - initial concept (presented at hEART conference...STAQ based Matrix estimation - initial concept (presented at hEART conference...
STAQ based Matrix estimation - initial concept (presented at hEART conference...
 
The status of the GeoServer WPS
The status of the GeoServer WPSThe status of the GeoServer WPS
The status of the GeoServer WPS
 
Reduced ordered binary decision diagram
Reduced ordered binary decision diagramReduced ordered binary decision diagram
Reduced ordered binary decision diagram
 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model
 
NIPS2017 Few-shot Learning and Graph Convolution
NIPS2017 Few-shot Learning and Graph ConvolutionNIPS2017 Few-shot Learning and Graph Convolution
NIPS2017 Few-shot Learning and Graph Convolution
 
19. algorithms and-complexity
19. algorithms and-complexity19. algorithms and-complexity
19. algorithms and-complexity
 
Algorithm Complexity and Main Concepts
Algorithm Complexity and Main ConceptsAlgorithm Complexity and Main Concepts
Algorithm Complexity and Main Concepts
 
MSc Presentation
MSc PresentationMSc Presentation
MSc Presentation
 

Viewers also liked

Haswellサーベイと有限体クラスの紹介
Haswellサーベイと有限体クラスの紹介Haswellサーベイと有限体クラスの紹介
Haswellサーベイと有限体クラスの紹介MITSUNARI Shigeo
 
x86x64 SSE4.2 POPCNT
x86x64 SSE4.2 POPCNTx86x64 SSE4.2 POPCNT
x86x64 SSE4.2 POPCNT
takesako
 
AVX2時代の正規表現マッチング 〜半群でぐんぐん!〜
AVX2時代の正規表現マッチング 〜半群でぐんぐん!〜AVX2時代の正規表現マッチング 〜半群でぐんぐん!〜
AVX2時代の正規表現マッチング 〜半群でぐんぐん!〜
Ryoma Sin'ya
 
Popcntによるハミング距離計算
Popcntによるハミング距離計算Popcntによるハミング距離計算
Popcntによるハミング距離計算
Norishige Fukushima
 
X86opti01 nothingcosmos
X86opti01 nothingcosmosX86opti01 nothingcosmos
X86opti01 nothingcosmos
nothingcosmos
 
明日使えないすごいビット演算
明日使えないすごいビット演算明日使えないすごいビット演算
明日使えないすごいビット演算
京大 マイコンクラブ
 

Viewers also liked (6)

Haswellサーベイと有限体クラスの紹介
Haswellサーベイと有限体クラスの紹介Haswellサーベイと有限体クラスの紹介
Haswellサーベイと有限体クラスの紹介
 
x86x64 SSE4.2 POPCNT
x86x64 SSE4.2 POPCNTx86x64 SSE4.2 POPCNT
x86x64 SSE4.2 POPCNT
 
AVX2時代の正規表現マッチング 〜半群でぐんぐん!〜
AVX2時代の正規表現マッチング 〜半群でぐんぐん!〜AVX2時代の正規表現マッチング 〜半群でぐんぐん!〜
AVX2時代の正規表現マッチング 〜半群でぐんぐん!〜
 
Popcntによるハミング距離計算
Popcntによるハミング距離計算Popcntによるハミング距離計算
Popcntによるハミング距離計算
 
X86opti01 nothingcosmos
X86opti01 nothingcosmosX86opti01 nothingcosmos
X86opti01 nothingcosmos
 
明日使えないすごいビット演算
明日使えないすごいビット演算明日使えないすごいビット演算
明日使えないすごいビット演算
 

Similar to A x86-optimized rank&select dictionary for bit sequences

Introduction to Ultra-succinct representation of ordered trees with applications
Introduction to Ultra-succinct representation of ordered trees with applicationsIntroduction to Ultra-succinct representation of ordered trees with applications
Introduction to Ultra-succinct representation of ordered trees with applications
Yu Liu
 
Threshold and Proactive Pseudo-Random Permutations
Threshold and Proactive Pseudo-Random PermutationsThreshold and Proactive Pseudo-Random Permutations
Threshold and Proactive Pseudo-Random Permutations
Aleksandr Yampolskiy
 
Slide11 icc2015
Slide11 icc2015Slide11 icc2015
Slide11 icc2015
T. E. BOGALE
 
Graph Regularised Hashing
Graph Regularised HashingGraph Regularised Hashing
Graph Regularised Hashing
Sean Moran
 
1_Asymptotic_Notation_pptx.pptx
1_Asymptotic_Notation_pptx.pptx1_Asymptotic_Notation_pptx.pptx
1_Asymptotic_Notation_pptx.pptx
pallavidhade2
 
Mmclass5
Mmclass5Mmclass5
Mmclass5
Hassan Dar
 
Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
Implementing 3D SPHARM Surfaces Registration on Cell B.E. ProcessorImplementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
PTIHPA
 
Numerical Linear Algebra for Data and Link Analysis.
Numerical Linear Algebra for Data and Link Analysis.Numerical Linear Algebra for Data and Link Analysis.
Numerical Linear Algebra for Data and Link Analysis.
Leonid Zhukov
 
Basic data structures part I
Basic data structures part IBasic data structures part I
Basic data structures part I
Daniel Gomez-Prado
 
Ch01 basic concepts_nosoluiton
Ch01 basic concepts_nosoluitonCh01 basic concepts_nosoluiton
Ch01 basic concepts_nosoluiton
shin
 
Generic parallelization strategies for data assimilation
Generic parallelization strategies for data assimilationGeneric parallelization strategies for data assimilation
Generic parallelization strategies for data assimilation
nilsvanvelzen
 
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
Alex Pruden
 
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...
Sean Moran
 
system software 16 marks
system software 16 markssystem software 16 marks
system software 16 marks
vvcetit
 
Faster Practical Block Compression for Rank/Select Dictionaries
Faster Practical Block Compression for Rank/Select DictionariesFaster Practical Block Compression for Rank/Select Dictionaries
Faster Practical Block Compression for Rank/Select Dictionaries
Rakuten Group, Inc.
 
Code generation in Compiler Design
Code generation in Compiler DesignCode generation in Compiler Design
Code generation in Compiler Design
Kuppusamy P
 
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Matthew Lease
 
15 06-0459-02-003c-cm-matlab-release-0-85-support-document
15 06-0459-02-003c-cm-matlab-release-0-85-support-document15 06-0459-02-003c-cm-matlab-release-0-85-support-document
15 06-0459-02-003c-cm-matlab-release-0-85-support-document
maomao125
 
Selective encoding for abstractive sentence summarization
Selective encoding for abstractive sentence summarizationSelective encoding for abstractive sentence summarization
Selective encoding for abstractive sentence summarization
Kodaira Tomonori
 
Systemsoftwarenotes 100929171256-phpapp02 2
Systemsoftwarenotes 100929171256-phpapp02 2Systemsoftwarenotes 100929171256-phpapp02 2
Systemsoftwarenotes 100929171256-phpapp02 2
Khaja Dileef
 

Similar to A x86-optimized rank&select dictionary for bit sequences (20)

Introduction to Ultra-succinct representation of ordered trees with applications
Introduction to Ultra-succinct representation of ordered trees with applicationsIntroduction to Ultra-succinct representation of ordered trees with applications
Introduction to Ultra-succinct representation of ordered trees with applications
 
Threshold and Proactive Pseudo-Random Permutations
Threshold and Proactive Pseudo-Random PermutationsThreshold and Proactive Pseudo-Random Permutations
Threshold and Proactive Pseudo-Random Permutations
 
Slide11 icc2015
Slide11 icc2015Slide11 icc2015
Slide11 icc2015
 
Graph Regularised Hashing
Graph Regularised HashingGraph Regularised Hashing
Graph Regularised Hashing
 
1_Asymptotic_Notation_pptx.pptx
1_Asymptotic_Notation_pptx.pptx1_Asymptotic_Notation_pptx.pptx
1_Asymptotic_Notation_pptx.pptx
 
Mmclass5
Mmclass5Mmclass5
Mmclass5
 
Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
Implementing 3D SPHARM Surfaces Registration on Cell B.E. ProcessorImplementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
 
Numerical Linear Algebra for Data and Link Analysis.
Numerical Linear Algebra for Data and Link Analysis.Numerical Linear Algebra for Data and Link Analysis.
Numerical Linear Algebra for Data and Link Analysis.
 
Basic data structures part I
Basic data structures part IBasic data structures part I
Basic data structures part I
 
Ch01 basic concepts_nosoluiton
Ch01 basic concepts_nosoluitonCh01 basic concepts_nosoluiton
Ch01 basic concepts_nosoluiton
 
Generic parallelization strategies for data assimilation
Generic parallelization strategies for data assimilationGeneric parallelization strategies for data assimilation
Generic parallelization strategies for data assimilation
 
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
 
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...
 
system software 16 marks
system software 16 markssystem software 16 marks
system software 16 marks
 
Faster Practical Block Compression for Rank/Select Dictionaries
Faster Practical Block Compression for Rank/Select DictionariesFaster Practical Block Compression for Rank/Select Dictionaries
Faster Practical Block Compression for Rank/Select Dictionaries
 
Code generation in Compiler Design
Code generation in Compiler DesignCode generation in Compiler Design
Code generation in Compiler Design
 
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
 
15 06-0459-02-003c-cm-matlab-release-0-85-support-document
15 06-0459-02-003c-cm-matlab-release-0-85-support-document15 06-0459-02-003c-cm-matlab-release-0-85-support-document
15 06-0459-02-003c-cm-matlab-release-0-85-support-document
 
Selective encoding for abstractive sentence summarization
Selective encoding for abstractive sentence summarizationSelective encoding for abstractive sentence summarization
Selective encoding for abstractive sentence summarization
 
Systemsoftwarenotes 100929171256-phpapp02 2
Systemsoftwarenotes 100929171256-phpapp02 2Systemsoftwarenotes 100929171256-phpapp02 2
Systemsoftwarenotes 100929171256-phpapp02 2
 

More from Takeshi Yamamuro

LT: Spark 3.1 Feature Expectation
LT: Spark 3.1 Feature ExpectationLT: Spark 3.1 Feature Expectation
LT: Spark 3.1 Feature Expectation
Takeshi Yamamuro
 
Apache Spark + Arrow
Apache Spark + ArrowApache Spark + Arrow
Apache Spark + Arrow
Takeshi Yamamuro
 
Quick Overview of Upcoming Spark 3.0 + α
Quick Overview of Upcoming Spark 3.0 + αQuick Overview of Upcoming Spark 3.0 + α
Quick Overview of Upcoming Spark 3.0 + α
Takeshi Yamamuro
 
MLflowによる機械学習モデルのライフサイクルの管理
MLflowによる機械学習モデルのライフサイクルの管理MLflowによる機械学習モデルのライフサイクルの管理
MLflowによる機械学習モデルのライフサイクルの管理
Takeshi Yamamuro
 
Taming Distributed/Parallel Query Execution Engine of Apache Spark
Taming Distributed/Parallel Query Execution Engine of Apache SparkTaming Distributed/Parallel Query Execution Engine of Apache Spark
A x86-optimized rank&select dictionary for bit sequences

  • 1. x86/x64最適化勉強会#4 (x86/x64 Optimization Study Group #4): A x86-optimized rank/select dictionary for bit sequences. 2012/6/16, Takeshi Yamamuro
  • 2. What Is a Succinct Data Structure?
  • 3. SDS: Succinct Data Structure
      • Recently gaining popularity in some areas
          – Research and engineering
      • Not a data structure, but a data representation
          – A compression method for other data structures
          – e.g., alphabets, trees, and graphs
      • Transparent operations without explicit unpacking
          – e.g., succinct LZ77 compression*1
      *1 Kreft, S. and Navarro, G.: LZ77-Like Compression with Fast Random Access, In Proceedings of DCC, 2010
  • 4. More Details
      • SDS = Succinct Data + Succinct Index
      • Succinct Data
          – Compact representation of the target data
          – Close to the information-theoretic lower bound, e.g., if there are N possible patterns, the lower bound is log N bits
      • Succinct Index
          – O(1) operations on the target data
          – o(N) space cost: asymptotically negligible
  • 5. More Details
      If you need more information, ... (cited from: http://goo.gl/rkQ5z)
  • 7. Rank/Select Operations
      • SDS is composed of rank/select operations
          – Many calls to rank/select inside
      • Rank/select for succinct bit sequences: B[i]
          – rank_x(n, B): the total number of x's in B[0...n]
          – select_x(n, B): the position of the n-th x in B[]
      Example:
          i     0  1  2  3  4  5  6  7  8
          B[i]  1  0  1  1  0  0  1  1  0
          rank_1(5, B) = 3, select_1(4, B) = 6
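The definitions on slide 7 can be sketched as a pair of naive linear scans. This is only an illustration of the semantics, not the implementation from the talk: the function names are hypothetical, rank is assumed to count B[0...n] inclusively, and select is assumed to take n counted from 1 and return a 0-based position, which is the reading that reproduces both values in the slide's example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of 1s in bits[0..n], inclusive (assumed reading of the slide's rank_1).
std::size_t naive_rank1(const std::vector<int>& bits, std::size_t n) {
    assert(n < bits.size());
    std::size_t count = 0;
    for (std::size_t i = 0; i <= n; ++i) count += (bits[i] != 0);
    return count;
}

// 0-based position of the n-th 1 (n counted from 1).
// Returns bits.size() when the sequence holds fewer than n ones.
std::size_t naive_select1(const std::vector<int>& bits, std::size_t n) {
    assert(n >= 1);
    std::size_t seen = 0;
    for (std::size_t i = 0; i < bits.size(); ++i) {
        if (bits[i] != 0 && ++seen == n) return i;
    }
    return bits.size();
}
```

Both scans take O(N) time per query, which is the cost the index structures in the following slides are designed to avoid.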
  • 8. Rank/Select Operations
      • Available rank/select implementations
          – ux-trie: http://code.google.com/p/ux-trie/
          – rx: http://code.google.com/p/mozc/
          – marisa-trie: http://code.google.com/p/marisa-trie/
      • Today's contribution
          – x86-optimized rank/select
          – https://github.com/maropu/dbitv
  • 9. Performance Results
      • Benchmark setup*1
          – Generate a random bit sequence with 50% density
          – Issue random rank/select queries over the bits
          – CPU: Intel Core-i5 U470@1.33GHz
      • Observed latency
          – Median latency over 11 trials
      *1 Reference: http://d.hatena.ne.jp/s-yata/20111216/1324032373
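A measurement loop in the spirit of the setup on slide 9 might look like the sketch below. It is not the harness used for the talk: `random_bits`, `median`, and `median_latency_ns` are hypothetical helpers, the bits are assumed to be packed LSB-first into 64-bit words, and the query under test is passed in as a callable.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Random bit sequence with the given density of 1s, packed into 64-bit words.
std::vector<std::uint64_t> random_bits(std::size_t num_bits, double density,
                                       std::uint32_t seed) {
    std::mt19937 gen(seed);
    std::bernoulli_distribution coin(density);
    std::vector<std::uint64_t> words((num_bits + 63) / 64, 0);
    for (std::size_t i = 0; i < num_bits; ++i) {
        if (coin(gen)) words[i / 64] |= std::uint64_t{1} << (i % 64);
    }
    return words;
}

// Middle element after sorting; the true median when the sample count is odd.
double median(std::vector<double> samples) {
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}

// Runs `trials` timed trials of `queries_per_trial` calls to query(q)
// and returns the median per-query latency in nanoseconds.
template <typename Query>
double median_latency_ns(Query&& query, std::size_t queries_per_trial,
                         int trials = 11) {
    assert(queries_per_trial > 0 && trials > 0);
    std::vector<double> per_query_ns;
    for (int t = 0; t < trials; ++t) {
        auto start = std::chrono::steady_clock::now();
        for (std::size_t q = 0; q < queries_per_trial; ++q) query(q);
        auto stop = std::chrono::steady_clock::now();
        double ns = std::chrono::duration<double, std::nano>(stop - start).count();
        per_query_ns.push_back(ns / static_cast<double>(queries_per_trial));
    }
    return median(per_query_ns);
}
```

Taking the median over an odd number of trials keeps a single outlier trial (for example, one disturbed by a context switch) from skewing the reported latency.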
  • 10. Performance Results: Rank [Figure: averaged rank latency (ns, log scale from 1.E+00 to 1.E+03) vs. bit length, for ux, rx, marisa, and opt]
  • 11. Performance Results: Select [Figure: averaged select latency (ns, log scale from 1.E+00 to 1.E+04) vs. bit length, for ux, rx, marisa, and opt]
  • 13. Implementation: Four Russians Method
      • Rule: O(1) operation cost with o(N) space
      [Figure: B[] = a sequence of N bits]
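  • 14. Implementation: Four Russians Method
      • Rule: O(1) operation cost with o(N) space
      [Figure: B[] = a sequence of bits, divided into blocks of $\log^2 N$ bits; L[] = l1, l2, ...]
      • Split B[] into fixed-length blocks of $\log^2 N$ bits
      • Total counts are pre-computed in L[]
      $$\mathrm{rank}_1(x, B) = \sum_{i=1}^{x} B[i] = \underbrace{\sum_{i=1}^{\lfloor x/\log^2 N \rfloor} B[i]}_{L[\lfloor x/\log^2 N \rfloor]} + \sum_{i=\lfloor x/\log^2 N \rfloor + 1}^{x} B[i]$$
  • 15. Implementation: Four Russians Method
      • Rule: O(1) operation cost with o(N) space
      [Figure: B[] = a sequence of bits, divided into blocks of $\log^2 N$ bits; L[] = l1, l2, ...]
      • Split B[] into fixed-length blocks of $\log^2 N$ bits
      • Total counts are pre-computed in L[]
      $$\mathrm{rank}_1(x, B) = \sum_{i=1}^{x} B[i] = \underbrace{\sum_{i=1}^{\lfloor x/\log^2 N \rfloor} B[i]}_{L[\lfloor x/\log^2 N \rfloor]} + \sum_{i=\lfloor x/\log^2 N \rfloor + 1}^{x} B[i]$$
      The L[] lookup costs O(1); the remaining sum costs $O(\log^2 N)$.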
  • 16. Implementation: Four Russians Method
      • Rule: O(1) operation cost with o(N) space
      [Figure: B[] = a sequence of bits, divided into blocks of $\log^2 N$ bits; L[] = l1, l2, ...]
      • L[]: o(N) space cost
      $$\frac{N}{\log^2 N} \cdot \log N = O\!\left(\frac{N}{\log N}\right) = o(N)$$
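The single-level scheme from slides 14 to 16 can be sketched as follows. This is a teaching sketch under stated assumptions, not the dbitv code: `OneLevelRank` is a hypothetical name, bits are stored one per byte for readability, the block length is a constructor parameter rather than exactly log²N, and rank counts B[0...x] inclusively with 0-based positions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct OneLevelRank {
    std::vector<std::uint8_t> bits;  // one bit per element (unpacked, for clarity)
    std::vector<std::uint32_t> L;    // L[k] = number of 1s in bits[0 .. k*block)
    std::size_t block;               // block length in bits

    OneLevelRank(std::vector<std::uint8_t> b, std::size_t block_bits)
        : bits(std::move(b)), block(block_bits) {
        assert(block > 0);
        std::uint32_t running = 0;
        // Iterate one past the end so a boundary at bits.size() also gets an entry.
        for (std::size_t i = 0; i <= bits.size(); ++i) {
            if (i % block == 0) L.push_back(running);
            if (i < bits.size()) running += bits[i] & 1u;
        }
    }

    // Number of 1s in bits[0..x], inclusive.
    std::size_t rank1(std::size_t x) const {
        assert(x < bits.size());
        const std::size_t end = x + 1;     // count the first `end` bits
        const std::size_t k = end / block; // block containing the boundary
        std::size_t count = L[k];          // O(1) lookup: the pre-computed term
        for (std::size_t i = k * block; i < end; ++i) {
            count += bits[i] & 1u;         // at most block-1 bits: the residual sum
        }
        return count;
    }
};
```

The residual loop is what makes this level alone O(log²N) per query; the second table introduced next shrinks that loop.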
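  • 17. Implementation: Four Russians Method
      • Rule: O(1) operation cost with o(N) space
      [Figure: B[] = a sequence of bits; L[] = l1, l2, ... over blocks of $\log^2 N$ bits; S[] = s1, s2, ... over blocks of $\frac{1}{2}\log N$ bits]
      • Split each block again into fixed-length blocks of $\frac{1}{2}\log N$ bits
      • Total counts are pre-computed in S[]
      $$\mathrm{rank}_1(x, B) = \sum_{i=1}^{x} B[i] = \underbrace{\sum_{i=1}^{\lfloor x/\log^2 N \rfloor} B[i]}_{L[\lfloor x/\log^2 N \rfloor]} + \underbrace{\sum_{i=\lfloor x/\log^2 N \rfloor + 1}^{\lfloor x/\frac{1}{2}\log N \rfloor} B[i]}_{S[\lfloor x/\frac{1}{2}\log N \rfloor]} + \sum_{i=\lfloor x/\frac{1}{2}\log N \rfloor + 1}^{x} B[i]$$
  • 18. Implementation: Four Russians Method
      • Rule: O(1) operation cost with o(N) space
      [Figure: B[] = a sequence of bits; L[] = l1, l2, ... over blocks of $\log^2 N$ bits; S[] = s1, s2, ... over blocks of $\frac{1}{2}\log N$ bits]
      • Split each block again into fixed-length blocks of $\frac{1}{2}\log N$ bits
      • Total counts are pre-computed in S[]
      $$\mathrm{rank}_1(x, B) = \sum_{i=1}^{x} B[i] = \underbrace{\sum_{i=1}^{\lfloor x/\log^2 N \rfloor} B[i]}_{L[\lfloor x/\log^2 N \rfloor]} + \underbrace{\sum_{i=\lfloor x/\log^2 N \rfloor + 1}^{\lfloor x/\frac{1}{2}\log N \rfloor} B[i]}_{S[\lfloor x/\frac{1}{2}\log N \rfloor]} + \sum_{i=\lfloor x/\frac{1}{2}\log N \rfloor + 1}^{x} B[i]$$
      The L[] and S[] lookups cost O(1) each; the last sum costs $O(\log N)$.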
  • 19. Implementation: Four Russians Method
      • Rule: O(1) operation cost with o(N) space
      [Figure: B[] = a sequence of bits; L[] = l1, l2, ... over blocks of $\log^2 N$ bits; S[] = s1, s2, ... over blocks of $\frac{1}{2}\log N$ bits]
      • S[]: o(N) space cost
      $$\frac{N}{\frac{1}{2}\log N} \cdot \log(\log^2 N) = O\!\left(N \cdot \frac{\log\log N}{\log N}\right) = o(N)$$
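  • 20. Implementation: Four Russians Method
      • Rule: O(1) operation cost with o(N) space
      [Figure: B[] = a sequence of bits; L[] = l1, l2, ... over blocks of $\log^2 N$ bits; S[] = s1, s2, ... over blocks of $\frac{1}{2}\log N$ bits]
      • O(1) popcount or table lookup for the last term
      $$\mathrm{rank}_1(x, B) = \sum_{i=1}^{x} B[i] = \underbrace{\sum_{i=1}^{\lfloor x/\log^2 N \rfloor} B[i]}_{L[\lfloor x/\log^2 N \rfloor]} + \underbrace{\sum_{i=\lfloor x/\log^2 N \rfloor + 1}^{\lfloor x/\frac{1}{2}\log N \rfloor} B[i]}_{S[\lfloor x/\frac{1}{2}\log N \rfloor]} + \sum_{i=\lfloor x/\frac{1}{2}\log N \rfloor + 1}^{x} B[i]$$
      The L[] and S[] lookups cost O(1) each; the last sum drops from $O(\log N)$ to O(1).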
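Putting the two tables and the popcount step together, a rank query becomes two lookups plus one masked popcount. The sketch below assumes concrete sizes borrowed from the practical example later in the deck (256-bit large blocks, 32-bit small blocks) rather than log²N and ½ log N, stores bits LSB-first in 32-bit words, and uses a portable bit-twiddling popcount so it compiles anywhere; on x86 the POPCNT instruction would take its place. `TwoLevelRank` and `popcount32` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Portable population count (stand-in for the x86 POPCNT instruction).
inline std::uint32_t popcount32(std::uint32_t v) {
    v = v - ((v >> 1) & 0x55555555u);
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;
    return (v * 0x01010101u) >> 24;
}

struct TwoLevelRank {
    static constexpr std::size_t kWordBits = 32;     // small block: one word
    static constexpr std::size_t kWordsPerLarge = 8; // large block: 256 bits

    std::vector<std::uint32_t> words; // bit i lives in words[i/32], bit (i%32)
    std::vector<std::uint64_t> L;     // 1s before each large block
    std::vector<std::uint8_t> S;      // 1s before each word, within its large block

    explicit TwoLevelRank(std::vector<std::uint32_t> w) : words(std::move(w)) {
        std::uint64_t total = 0;
        std::uint32_t in_block = 0;   // never exceeds 7 * 32 = 224 when stored
        // One extra iteration adds sentinel entries for a boundary at the very end.
        for (std::size_t i = 0; i <= words.size(); ++i) {
            if (i % kWordsPerLarge == 0) { L.push_back(total); in_block = 0; }
            S.push_back(static_cast<std::uint8_t>(in_block));
            if (i < words.size()) {
                const std::uint32_t c = popcount32(words[i]);
                total += c;
                in_block += c;
            }
        }
    }

    // Number of 1s in positions 0..x, inclusive.
    std::uint64_t rank1(std::size_t x) const {
        const std::size_t end = x + 1;
        assert(end <= words.size() * kWordBits);
        const std::size_t w = end / kWordBits;
        const std::size_t off = end % kWordBits;
        std::uint64_t count = L[w / kWordsPerLarge] + S[w];  // two O(1) lookups
        if (off != 0) {                                      // O(1) masked popcount
            count += popcount32(words[w] & ((std::uint32_t{1} << off) - 1));
        }
        return count;
    }
};
```

Every step is a constant number of operations, which is the O(1) claim of the slide; the cost that remains in practice is the memory access pattern across three separate arrays.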
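  • 21. Implementation: Four Russians Method
      • Rule: O(1) operation cost with o(N) space
      [Figure: B[] = a sequence of bits; L[] = l1, l2, ... over blocks of $\log^2 N$ bits; S[] = s1, s2, ... over blocks of $\frac{1}{2}\log N$ bits]
      • As a result, o(N) space cost in total
      $$\underbrace{\frac{N}{\log N}}_{L[]\ \text{size}} + \underbrace{\frac{4N\log\log N}{\log N}}_{S[]\ \text{size}} = O\!\left(N \cdot \frac{\log\log N}{\log N}\right) = o(N)$$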
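To get a feel for the two terms on slide 21, the sketch below evaluates them relative to N for a given log N. It only plugs numbers into the slide's formulas, taking logarithms base 2 and ignoring rounding of block counts; `IndexOverhead` and `index_overhead` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

struct IndexOverhead {
    double l_ratio;  // size of L[] divided by N
    double s_ratio;  // size of S[] divided by N
};

// Evaluates the slide's size formulas for a sequence with log2(N) = log_n.
IndexOverhead index_overhead(double log_n) {
    assert(log_n > 1.0);
    IndexOverhead o;
    o.l_ratio = 1.0 / log_n;                     // (N / log^2 N) * log N, over N
    o.s_ratio = 4.0 * std::log2(log_n) / log_n;  // (N / (log N / 2)) * log(log^2 N), over N
    return o;
}
```

For N = 2^32 this gives about 3.1% of N for L[] and 62.5% of N for S[]: the bound is o(N), but log log N / log N shrinks slowly, which is one reason practical implementations pick fixed block sizes instead.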
  • 22. Implementation: Four Russians Method
      • Rule: O(1) operation cost with o(N) space
  • 23. Implementation: Practice
      • Low computation cost, but high cache penalties
          – 3 cache/TLB misses per rank
      Example: rank1(402 = 256*1 + 32*4 + 18, B)
      [Figure: B[] (01..000000....101......0 0110....001...............0 0000100 ...) divided into 256-bit blocks and 32-bit sub-blocks, with the remaining bits popcounted; L[]: 18 21 …; S[]: 1 3 4 6 7 9 10 13 2 5 7 9 12 13 18 19 1 3 7 …]
  • 24. Implementation: Practice
      • Low computation cost, but high cache penalties
          – 3 cache/TLB misses per rank
      Example: rank1(402 = 256*1 + 32*4 + 18, B)
      [Figure: same as slide 23, with "Miss!" marked on each of the three accesses: B[], L[], and S[]]
  • 25. Implementation: Practice
      • Pack the required data into a single cache line
      [Figure: 56B chunk inside a 64B cache line; field labels: 4B, 1B ・・・, 32B (bits 0110....001..........0), 12B padding, padding]
  • 26. Implementation: Practice
      • Pack the required data into a single cache line
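One way to realize the packing idea is to interleave the counters with the bits they describe, so that a rank query touches a single 64-byte line. The layout below is an assumption pieced together from the slide's labels and the 256-bit/32-bit example (a 4-byte block count, eight 1-byte sub-block counts, 32 bytes of bits, padded to 64 bytes); it is not claimed to be the exact dbitv layout. `RankChunk`, `build_chunks`, and `chunk_rank1` are hypothetical names, and C++17 is assumed so that `std::vector` honors the 64-byte alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Portable population count (stand-in for the x86 POPCNT instruction).
inline std::uint32_t popcount32(std::uint32_t v) {
    v = v - ((v >> 1) & 0x55555555u);
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;
    return (v * 0x01010101u) >> 24;
}

// Assumed layout: 4 B + 8 x 1 B + 32 B = 44 B of payload, padded to one line.
struct alignas(64) RankChunk {
    std::uint32_t large_count;     // 1s before this 256-bit chunk
    std::uint8_t small_count[8];   // 1s before each 32-bit word, within the chunk
    std::uint32_t bits[8];         // the 256 bits, LSB-first within each word
};
static_assert(sizeof(RankChunk) == 64, "one chunk per 64-byte cache line");

std::vector<RankChunk> build_chunks(const std::vector<std::uint32_t>& words) {
    std::vector<RankChunk> chunks((words.size() + 7) / 8);
    std::uint32_t total = 0;
    for (std::size_t c = 0; c < chunks.size(); ++c) {
        RankChunk& ch = chunks[c];
        ch.large_count = total;
        std::uint32_t in_chunk = 0;  // at most 7 * 32 = 224 when stored
        for (std::size_t w = 0; w < 8; ++w) {
            const std::size_t i = c * 8 + w;
            ch.bits[w] = i < words.size() ? words[i] : 0;
            ch.small_count[w] = static_cast<std::uint8_t>(in_chunk);
            in_chunk += popcount32(ch.bits[w]);
        }
        total += in_chunk;
    }
    return chunks;
}

// Number of 1s in positions 0..x, inclusive; reads exactly one chunk.
std::uint32_t chunk_rank1(const std::vector<RankChunk>& chunks, std::size_t x) {
    assert(x / 256 < chunks.size());
    const RankChunk& ch = chunks[x / 256];
    const std::size_t w = (x % 256) / 32;
    const unsigned off = static_cast<unsigned>(x % 32);
    const std::uint32_t mask = 0xFFFFFFFFu >> (31 - off);  // low off+1 bits
    return ch.large_count + ch.small_count[w] + popcount32(ch.bits[w] & mask);
}
```

With this layout the query for position 402 reads chunk 1, word 4, bit offset 18, mirroring the decomposition 402 = 256*1 + 32*4 + 18 from the earlier slide, and all three reads fall in the same cache line.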
  • 27. Implementation: Practice
      • By the way, where is select?
          – Omitted due to time constraints
          – Please see the code ...
      • Two ways to implement it
          – O(logN) complexity
              • ux-trie, rx, and marisa-trie
              • Binary searches with rank
              • Suffers many cache/TLB misses
          – O(1) complexity
              • My implementation, designed to minimize these penalties
              • 1 rank, 1 SIMD comparison, and an O(1) bsf
              • Only 2 cache/TLB misses
  • 28. Implementation: Practice
      • By the way, where is select?
          – Omitted due to time constraints
          – Please see the code ...
      • Two ways to implement it
          – O(logN) complexity
              • ux-trie, rx, and marisa-trie
              • Binary searches with rank
              • Suffers many cache/TLB misses
          – O(1) complexity
              • My implementation, designed to minimize these penalties
              • 1 rank, 1 SIMD comparison, and an O(1) bsf
              • Only 2 cache/TLB misses
      Not implemented yet ...
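The O(logN) approach listed on the slide, a binary search over rank values, can be sketched as below. It illustrates only that baseline, not the O(1) variant the slides describe (which they mark as not implemented yet). The names `popcount64`, `prefix_counts`, and `select1_binary` are hypothetical, bits are assumed to be packed LSB-first in 64-bit words, and the final loop is a portable stand-in for the x86 bsf/tzcnt instruction.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Portable population count for 64-bit words.
inline std::uint64_t popcount64(std::uint64_t v) {
    v = v - ((v >> 1) & 0x5555555555555555ULL);
    v = (v & 0x3333333333333333ULL) + ((v >> 2) & 0x3333333333333333ULL);
    v = (v + (v >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return (v * 0x0101010101010101ULL) >> 56;
}

// cum[k] = number of 1s in words[0 .. k); cum.size() == words.size() + 1.
std::vector<std::uint64_t> prefix_counts(const std::vector<std::uint64_t>& words) {
    std::vector<std::uint64_t> cum(words.size() + 1, 0);
    for (std::size_t k = 0; k < words.size(); ++k) {
        cum[k + 1] = cum[k] + popcount64(words[k]);
    }
    return cum;
}

// 0-based position of the n-th 1 (n counted from 1).
// Returns the total bit length when there are fewer than n ones.
std::size_t select1_binary(const std::vector<std::uint64_t>& words,
                           const std::vector<std::uint64_t>& cum,
                           std::uint64_t n) {
    assert(cum.size() == words.size() + 1);
    if (n == 0 || n > cum.back()) return words.size() * 64;
    // Binary search over the rank values: first k with cum[k] >= n.
    const auto it = std::lower_bound(cum.begin(), cum.end(), n);
    const std::size_t k = static_cast<std::size_t>(it - cum.begin()) - 1;
    std::uint64_t word = words[k];
    // Drop the lower-ranked 1s so the target becomes the lowest set bit.
    for (std::uint64_t r = n - cum[k]; r > 1; --r) word &= word - 1;
    std::size_t pos = 0;  // count trailing zeros (bsf/tzcnt on x86)
    while ((word & 1) == 0) { word >>= 1; ++pos; }
    return k * 64 + pos;
}
```

Each probe of the binary search can land on a different cache line of the count array, which is the source of the many cache/TLB misses that the slide attributes to this approach.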