Lucene KV-Store
A high-performance key-value store

Mark Harwood
Benefits

High-speed reads and writes of key/value pairs sustained over growing volumes of data

Read costs are always 0 or 1 disk seek

Efficient use of memory

Simple file structures with strong durability guarantees
Why “Lucene” KV store?

Uses Lucene’s “Directory” APIs for low-level file access

Based on Lucene’s concepts of segment files, soft deletes, background merges, commit points etc., BUT a fundamentally different form of index

I’d like to offer it to the Lucene community as a “contrib” module because they have a track record in optimizing these same concepts (and could potentially make use of it in Lucene?)
Example benchmark results

[Benchmark chart not reproduced here]

Note: regular Lucene search indexes follow the same trajectory as the “Common KV Store” when it comes to lookups on a store with millions of keys.
KV-Store High-level Design

Map held in RAM:

Key hash (int)   Disk pointer (int)
23434            0
6545463          10
874382           22

Disk record layout:

Num keys with   Key 1 size   Key 1      Value 1 size   Value 1    Key/values 2,3,4…
hash (VInt)     (VInt)       (byte[])   (VInt)         (byte[])
1               3            Foo        3              Bar
2               5            Hello      5              World      7,Bonjour,8,Le Mon..

Most hashes have only one associated key and value. Some hashes will have key collisions, requiring the use of extra columns here.
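
A minimal Java sketch of the two structures on this slide, kept self-contained with plain JDK types. The real store uses a Trove primitive-int map and Lucene's VInt encoding; all class and method names below are illustrative rather than taken from the actual code.

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Sketch of the two structures: an in-RAM map from key hash to file pointer, and an
// on-disk record holding every key/value that shares that hash.
public class KvRecordSketch {

    // RAM index: key hash -> offset of the record in the values file
    private final Map<Integer, Long> ramMap = new HashMap<>();

    // Record layout: [numKeysWithHash][keyLen][keyBytes][valueLen][valueBytes]...
    static byte[] encodeRecord(Map<String, String> keysAndValuesForOneHash) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(keysAndValuesForOneHash.size());      // num keys with this hash
        for (Map.Entry<String, String> e : keysAndValuesForOneHash.entrySet()) {
            byte[] k = e.getKey().getBytes(StandardCharsets.UTF_8);
            byte[] v = e.getValue().getBytes(StandardCharsets.UTF_8);
            out.writeInt(k.length);                        // key size
            out.write(k);                                  // key bytes
            out.writeInt(v.length);                        // value size
            out.write(v);                                  // value bytes
        }
        return bytes.toByteArray();
    }
}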
Read logic (pseudo code)

int keyHash=hash(searchKey);
int filePointer=ramMap.get(keyHash);
if filePointer is null
      return null for value;
file.seek(filePointer);
int numKeysWithHash=file.readInt();
for numKeysWithHash
{
      storedKey=file.readKeyData();
      if(storedKey==searchKey)
            return file.readValueData();
      file.readValueData(); //skip the value of a non-matching key
}

There is a guaranteed maximum of one random disk seek for any lookup.
With a good hashing function most lookups will only need to go once around this loop.
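
A runnable Java sketch of this read path, assuming the record layout from the design slide with a fixed-width int in place of VInt; field and method names are illustrative.

import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;
import java.util.Map;

// Illustrative read path: at most one random disk seek per lookup, then a short scan
// over the (usually single) key/value pair stored under that hash.
public class KvReaderSketch {
    private final Map<Integer, Long> ramMap;    // key hash -> file pointer
    private final RandomAccessFile file;        // append-only key/value store

    public KvReaderSketch(Map<Integer, Long> ramMap, RandomAccessFile file) {
        this.ramMap = ramMap;
        this.file = file;
    }

    public byte[] get(byte[] searchKey) throws IOException {
        int keyHash = Arrays.hashCode(searchKey);
        Long filePointer = ramMap.get(keyHash);
        if (filePointer == null) {
            return null;                        // unknown hash: no disk access at all
        }
        file.seek(filePointer);                 // the single random disk seek
        int numKeysWithHash = file.readInt();
        for (int i = 0; i < numKeysWithHash; i++) {
            byte[] storedKey = readChunk();     // key bytes
            byte[] storedValue = readChunk();   // value bytes
            if (Arrays.equals(storedKey, searchKey)) {
                return storedValue;             // usually found on the first pass
            }
        }
        return null;                            // hash collision but no matching key
    }

    private byte[] readChunk() throws IOException {
        int len = file.readInt();
        byte[] data = new byte[len];
        file.readFully(data);
        return data;
    }
}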
Write logic (pseudo code)

int keyHash=hash(newKey);
int oldFilePointer=ramMap.get(keyHash);
ramMap.put(keyHash,file.length());
if oldFilePointer is null
{
      file.append(1); //only 1 key with this hash
      file.append(newKey);
      file.append(newValue);
}else
{
      file.seek(oldFilePointer);
      int numOldKeys=file.readInt();
      Map tmpMap=file.readNextNKeysAndValues(numOldKeys);
      tmpMap.put(newKey,newValue);
      file.append(tmpMap.size());
      file.appendKeysAndValues(tmpMap);
}

Updates always append to the end of the file, leaving older values unreferenced.
In case of any key collisions, previously stored values are copied to the new position at the end of the file along with the new content.
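
A matching Java sketch of the write path under the same simplified record layout; it shows why updates always append and how colliding keys are carried forward to the new record. Names are illustrative.

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative write path: new values are always appended at the end of the file; on a
// hash collision the previously stored keys/values for that hash are re-read and written
// again at the new position, so each hash always has a single, contiguous record.
public class KvWriterSketch {
    private final Map<Integer, Long> ramMap;     // key hash -> file pointer
    private final RandomAccessFile file;         // append-only key/value store

    public KvWriterSketch(Map<Integer, Long> ramMap, RandomAccessFile file) {
        this.ramMap = ramMap;
        this.file = file;
    }

    public void put(byte[] newKey, byte[] newValue) throws IOException {
        int keyHash = Arrays.hashCode(newKey);
        Long oldFilePointer = ramMap.get(keyHash);
        long newFilePointer = file.length();     // updates always go to the end

        // Collect every key/value that will live under this hash at the new position.
        // (Strings over ISO-8859-1 round-trip raw bytes and give us map-key equality.)
        Map<String, byte[]> entries = new LinkedHashMap<>();
        if (oldFilePointer != null) {
            file.seek(oldFilePointer);
            int numOldKeys = file.readInt();
            for (int i = 0; i < numOldKeys; i++) {
                byte[] k = readChunk();
                byte[] v = readChunk();
                entries.put(new String(k, StandardCharsets.ISO_8859_1), v);
            }
        }
        entries.put(new String(newKey, StandardCharsets.ISO_8859_1), newValue);

        // Append the merged record and repoint the RAM map at it; the old record is
        // left behind as unreferenced garbage for a later segment merge to reclaim.
        file.seek(newFilePointer);
        file.writeInt(entries.size());
        for (Map.Entry<String, byte[]> e : entries.entrySet()) {
            writeChunk(e.getKey().getBytes(StandardCharsets.ISO_8859_1));
            writeChunk(e.getValue());
        }
        ramMap.put(keyHash, newFilePointer);
    }

    private byte[] readChunk() throws IOException {
        int len = file.readInt();
        byte[] data = new byte[len];
        file.readFully(data);
        return data;
    }

    private void writeChunk(byte[] data) throws IOException {
        file.writeInt(data.length);
        file.write(data);
    }
}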
Segment generations: writes

[Diagram: a chain of segment generations (0 being the oldest), each with its own in-RAM hash-to-pointer map and key/value disk store]

Writes append to the end of the latest-generation segment until it reaches a set size; then it is made read-only and a new segment is created.
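
A small sketch of that rollover rule, assuming a configurable maximum segment size; the Segment type and names are illustrative.

import java.util.ArrayList;
import java.util.List;

// Illustrative rollover rule: the active segment accepts appends until it reaches a
// configured size, is then marked read-only, and a fresh segment becomes the write target.
public class SegmentChainSketch {
    static class Segment {
        long bytesWritten;
        boolean readOnly;
    }

    private final long maxSegmentBytes;
    private final List<Segment> segments = new ArrayList<>();

    SegmentChainSketch(long maxSegmentBytes) {
        this.maxSegmentBytes = maxSegmentBytes;
        segments.add(new Segment());              // generation 0
    }

    Segment activeSegment() {
        Segment latest = segments.get(segments.size() - 1);
        if (latest.bytesWritten >= maxSegmentBytes) {
            latest.readOnly = true;               // freeze the full segment
            latest = new Segment();               // start a new generation
            segments.add(latest);
        }
        return latest;
    }

    void recordAppend(long numBytes) {
        activeSegment().bytesWritten += numBytes;
    }
}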
Segment generations: reads

[Diagram: the same chain of segments, oldest to newest, each with its in-RAM map and disk store]

Read operations search the in-memory maps in reverse order (newest first). The first map found to contain a hash is expected to hold a pointer into its associated file for all of the latest keys/values with that hash.
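
A sketch of that newest-first lookup across segment maps, using an illustrative per-segment reader interface.

import java.io.IOException;
import java.util.List;

// Illustrative multi-segment read: consult segment maps from newest to oldest and stop at
// the first segment whose map knows the hash; because writes always go to the newest
// segment, that segment holds the freshest record for the hash.
public class MultiSegmentReaderSketch {
    interface SegmentReader {
        boolean containsHash(int keyHash);
        byte[] get(byte[] searchKey) throws IOException;   // at most one disk seek
    }

    private final List<SegmentReader> segmentsOldToNew;

    MultiSegmentReaderSketch(List<SegmentReader> segmentsOldToNew) {
        this.segmentsOldToNew = segmentsOldToNew;
    }

    byte[] get(int keyHash, byte[] searchKey) throws IOException {
        for (int i = segmentsOldToNew.size() - 1; i >= 0; i--) {   // newest first
            SegmentReader segment = segmentsOldToNew.get(i);
            if (segment.containsHash(keyHash)) {
                return segment.get(searchKey);    // latest values for this hash live here
            }
        }
        return null;                              // hash unknown in every segment
    }
}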
Segment generations: merges

[Diagram: read-only segments from the chain being merged into a new, more compact segment (4)]

A background thread merges read-only segments with many outdated entries into new, more compact versions.
Segment generations: durability

[Diagram: in-RAM maps and disk stores for segments 0 and 4 plus the active segment 3, alongside the “segments” file]

“segments” file contents:

Completed Segment IDs             0,4
Active Segment ID                 3
Active segment committed length   423423

Like Lucene, commit operations create a new generation of a “segments” file, the contents of which reflect the committed (i.e. fsync’ed) state of the store.
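
A sketch of what a commit might do under these assumptions: fsync the active segment, then publish a new generation of the segments file recording the completed segment IDs and the active segment's committed length. The file format and names here are illustrative, not the actual on-disk format.

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Illustrative commit: make the active data durable first, then atomically publish a new
// generation of the "segments" file describing exactly what has been fsync'ed.
public class CommitSketch {
    static void commit(Path dir, long generation, List<Integer> completedSegmentIds,
                       int activeSegmentId, RandomAccessFile activeSegmentFile) throws IOException {
        // 1. Make the active segment's data durable before advertising it.
        activeSegmentFile.getFD().sync();
        long committedLength = activeSegmentFile.length();

        // 2. Write a new generation of the segments file; readers recovering after a
        //    crash trust only the IDs and lengths recorded here.
        String contents = "completedSegments=" + completedSegmentIds + "\n"
                + "activeSegment=" + activeSegmentId + "\n"
                + "activeSegmentCommittedLength=" + committedLength + "\n";
        Path segmentsFile = dir.resolve("segments_" + generation);
        Files.write(segmentsFile, contents.getBytes(StandardCharsets.UTF_8));
    }
}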
Implementation details

JVM needs sufficient RAM for 2 ints for every active key (note: using “modulo N” on the hash can reduce the RAM maximum to N x 2 ints, at the cost of more key collisions = more disk IO; see the sketch below)

Uses Lucene Directory for
   Abstraction from choice of file system
   Buffered reads/writes
   Support for VInt encoding of numbers
   Rate-limited merge operations

Borrows successful Lucene concepts:
   Multiple segments flushed then made read-only
   “Segments” file used to list committed content (could potentially support multiple commit points)
   Background merges

Uses LGPL “Trove” for maps of primitives
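
A tiny sketch of the modulo-N trade-off noted above: bucketing hashes into N slots caps the RAM map at N entries, but deliberately collides unrelated keys so more lookups pay extra disk reads. Names are illustrative.

import java.util.Arrays;

// Illustrative modulo-N bucketing: the RAM map can never grow beyond N entries, at the
// cost of deliberately colliding unrelated keys into the same on-disk record.
public class BucketedHashSketch {
    private final int numBuckets;   // N: caps the RAM map at ~N x 2 ints of entries

    BucketedHashSketch(int numBuckets) {
        this.numBuckets = numBuckets;
    }

    int bucketFor(byte[] key) {
        int h = Arrays.hashCode(key);
        return Math.floorMod(h, numBuckets);   // more keys per bucket => more disk IO per lookup
    }
}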
