Lucene KV-Store
A fast Key-Value store for large datasets

Transcript

  • 1. Lucene KV-Store: a high-performance key-value store. Mark Harwood
  • 2. Benefits
    High-speed reads and writes of key/value pairs, sustained over growing volumes of data.
    Read costs are always 0 or 1 disk seeks.
    Efficient use of memory.
    Simple file structures with strong durability guarantees.
  • 3. Why “Lucene” KV store?
    Uses Lucene’s “Directory” APIs for low-level file access.
    Based on Lucene’s concepts of segment files, soft deletes, background merges, commit points etc., BUT a fundamentally different form of index.
    I’d like to offer it to the Lucene community as a “contrib” module because they have a track record in optimizing these same concepts (and could potentially make use of it in Lucene?).
  • 4. Example benchmark results
    [Benchmark chart]
    Note: regular Lucene search indexes follow the same trajectory as the “Common KV Store” when it comes to lookups on a store with millions of keys.
  • 5. KV-Store High-level Design
    Held in RAM: a map from key hash (int) to disk pointer (int), e.g. 23434 → 0, 6545463 → 10, 874382 → 22.
    Held on disk: one record per hash, laid out as
        num keys with hash (VInt) | key 1 size (VInt) | key 1 (byte[]) | value 1 size (VInt) | value 1 (byte[]) | key/values 2, 3, 4…
    Example records: [1 | 3 | Foo | 3 | Bar] and [2 | 5 | Hello | 5 | World | 7 | Bonjour | 8 | Le Mon..].
    Most hashes have only one associated key and value; some hashes will have key collisions, requiring the extra key/value entries in the record.
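To make the record layout concrete, here is a minimal Java sketch of how one such record could be encoded. It is illustrative only: the class name, the hand-rolled VInt helper and the use of ByteArrayOutputStream are assumptions, not code from the deck.

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;

    public class RecordLayoutSketch {

        // Lucene-style VInt: 7 bits per byte, high bit set on every byte except the last.
        static void writeVInt(ByteArrayOutputStream out, int value) {
            while ((value & ~0x7F) != 0) {
                out.write((value & 0x7F) | 0x80);
                value >>>= 7;
            }
            out.write(value);
        }

        // Encodes one record: [num keys with hash][key size][key][value size][value]...
        static byte[] encodeRecord(byte[][] keys, byte[][] values) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            writeVInt(out, keys.length);            // number of keys sharing this hash
            for (int i = 0; i < keys.length; i++) {
                writeVInt(out, keys[i].length);     // key size
                out.write(keys[i]);                 // key bytes
                writeVInt(out, values[i].length);   // value size
                out.write(values[i]);               // value bytes
            }
            return out.toByteArray();
        }

        public static void main(String[] args) throws IOException {
            byte[] record = encodeRecord(
                    new byte[][] { "Foo".getBytes(StandardCharsets.UTF_8) },
                    new byte[][] { "Bar".getBytes(StandardCharsets.UTF_8) });
            // The RAM map would then store: hash("Foo") -> offset at which this record was appended.
            System.out.println("record length = " + record.length + " bytes");
        }
    }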
  • 6. Read logic (pseudo code)
        int keyHash = hash(searchKey);
        int filePointer = ramMap.get(keyHash);
        if (filePointer is null)
            return null;                        // key is not in the store
        file.seek(filePointer);                 // guaranteed maximum of one random disk seek for any lookup
        int numKeysWithHash = file.readInt();
        for (numKeysWithHash)                   // with a good hashing function most lookups only go once around this loop
        {
            storedKey = file.readKeyData();
            if (storedKey == searchKey)
                return file.readValueData();
            file.readValueData();               // skip the value of a colliding key
        }
        return null;                            // hash collision, but this key was never stored
  • 7. Write logic (pseudo code)
        int keyHash = hash(newKey);
        int oldFilePointer = ramMap.get(keyHash);
        ramMap.put(keyHash, file.length());     // updates always append to the end of the file, leaving older values unreferenced
        if (oldFilePointer is null)
        {
            file.append(1);                     // only 1 key with this hash
            file.append(newKey);
            file.append(newValue);
        }
        else
        {
            // On a key collision (or overwrite), previously stored values are copied
            // to the new position at the end of the file along with the new content.
            file.seek(oldFilePointer);
            int numOldKeys = file.readInt();
            Map tmpMap = file.readNextNKeysAndValues(numOldKeys);
            tmpMap.put(newKey, newValue);
            file.append(tmpMap.size());
            file.appendKeysAndValues(tmpMap);
        }
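As a rough, runnable interpretation of this write path, the sketch below uses java.io.RandomAccessFile, fixed-width int length prefixes and a java.util.HashMap in place of the store's Lucene Directory files, VInt encoding and Trove maps; all names here are illustrative, not taken from the deck.

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.util.HashMap;
    import java.util.LinkedHashMap;
    import java.util.Map;

    public class WritePathSketch {

        private final Map<Integer, Long> ramMap = new HashMap<>();  // key hash -> file pointer
        private final RandomAccessFile file;

        public WritePathSketch(RandomAccessFile file) {
            this.file = file;
        }

        public void put(byte[] newKey, byte[] newValue) throws IOException {
            int keyHash = java.util.Arrays.hashCode(newKey);
            Long oldFilePointer = ramMap.get(keyHash);
            long newFilePointer = file.length();           // updates always append to the end

            // Collect every key/value that must live at the new position (deduped by key).
            Map<String, byte[]> entries = new LinkedHashMap<>();
            if (oldFilePointer != null) {
                // Hash collision or overwrite: copy previously stored pairs forward.
                file.seek(oldFilePointer);
                int numOldKeys = file.readInt();
                for (int i = 0; i < numOldKeys; i++) {
                    byte[] k = new byte[file.readInt()];
                    file.readFully(k);
                    byte[] v = new byte[file.readInt()];
                    file.readFully(v);
                    entries.put(new String(k, java.nio.charset.StandardCharsets.UTF_8), v);
                }
            }
            entries.put(new String(newKey, java.nio.charset.StandardCharsets.UTF_8), newValue);

            // Append the merged record; older bytes stay on disk but become unreferenced.
            file.seek(newFilePointer);
            file.writeInt(entries.size());
            for (Map.Entry<String, byte[]> e : entries.entrySet()) {
                byte[] k = e.getKey().getBytes(java.nio.charset.StandardCharsets.UTF_8);
                file.writeInt(k.length);
                file.write(k);
                file.writeInt(e.getValue().length);
                file.write(e.getValue());
            }
            ramMap.put(keyHash, newFilePointer);
        }
    }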
  • 8. Segment generations: writes
    [Diagram: a row of hash → pointer maps held in RAM, each paired with its own key/value disk store; segments 0, 1, 2… ordered old → new]
    Writes append to the end of the latest-generation segment until it reaches a set size; then it is made read-only and a new segment is created.
  • 9. Segment generations: reads
    [Diagram: the same RAM maps and key/value disk stores, old → new]
    Read operations search the in-memory maps in reverse order. The first map found to contain a hash is expected to have a pointer into its associated file for all the latest keys/values with this hash.
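A minimal sketch of that newest-first search, assuming each segment pairs its in-RAM map with a reader over its own store file; the Segment interface and all method names are invented for illustration.

    import java.io.IOException;
    import java.util.List;
    import java.util.Map;

    public class SegmentReadSketch {

        interface Segment {
            Map<Integer, Long> ramMap();                    // key hash -> pointer into this segment's file
            byte[] readValue(long filePointer, byte[] searchKey) throws IOException;  // record scan as on slide 6
        }

        // Segments are ordered oldest -> newest; search newest first so the latest write wins.
        static byte[] get(List<Segment> segments, byte[] searchKey, int keyHash) throws IOException {
            for (int i = segments.size() - 1; i >= 0; i--) {
                Segment segment = segments.get(i);
                Long pointer = segment.ramMap().get(keyHash);
                if (pointer != null) {
                    // The first (newest) map containing the hash holds the latest
                    // key/values for that hash, so the search stops here.
                    return segment.readValue(pointer, searchKey);
                }
            }
            return null;  // hash unknown in every segment
        }
    }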
  • 10. Segment generations: merges
    [Diagram: older read-only segments 0, 1, 2 being merged into a new, more compact segment 4]
    A background thread merges read-only segments containing many outdated entries into new, more compact versions.
  • 11. Segment generations: durability
    [Diagram: a “segments” file recording the completed segment IDs (0, 4), the active segment ID (3) and the active segment’s committed length (423423)]
    Like Lucene, commit operations create a new generation of a “segments” file, the contents of which reflect the committed (i.e. fsync’ed) state of the store.
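Purely as an illustration of that idea, the sketch below writes such a segments file and fsyncs it. The deck does not give the actual file format, so the field order, the fixed-width encoding, the single-file layout and all names here are assumptions.

    import java.io.DataOutputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    public class CommitSketch {

        static void commit(String segmentsFileName,
                           int[] completedSegmentIds,
                           int activeSegmentId,
                           long activeCommittedLength) throws IOException {
            try (FileOutputStream fos = new FileOutputStream(segmentsFileName);
                 DataOutputStream out = new DataOutputStream(fos)) {
                out.writeInt(completedSegmentIds.length);
                for (int id : completedSegmentIds) {
                    out.writeInt(id);                   // e.g. 0, 4
                }
                out.writeInt(activeSegmentId);          // e.g. 3
                out.writeLong(activeCommittedLength);   // readers ignore bytes past this point
                out.flush();
                fos.getFD().sync();                     // fsync: this is the store's durable, committed state
            }
        }
    }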
  • 12. Implementation details
    JVM needs sufficient RAM for 2 ints for every active key (note: using “modulo N” on the hash can cap RAM at N × 2 ints, at the cost of more key collisions = more disk IO).
    Uses Lucene Directory for:
        abstraction from the choice of file system
        buffered reads/writes
        support for VInt encoding of numbers
        rate-limited merge operations
    Borrows successful Lucene concepts:
        multiple segments, flushed then made read-only
        a “segments” file used to list committed content (could potentially support multiple commit points)
        background merges
    Uses LGPL “Trove” for maps of primitives.
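The “modulo N” trade-off can be shown in a few lines; this tiny sketch is illustrative only and not part of the presented code.

    public class ModuloHashSketch {
        // Bounding the RAM map with a "modulo N" hash: N caps the number of distinct
        // map entries (so at most N x 2 ints of map data), but more keys share a slot,
        // so more key/value entries must be scanned on disk per lookup.
        static int boundedHash(byte[] key, int n) {
            int h = java.util.Arrays.hashCode(key);
            return Math.floorMod(h, n);   // always in [0, n), regardless of the sign of h
        }

        public static void main(String[] args) {
            byte[] key = "Hello".getBytes(java.nio.charset.StandardCharsets.UTF_8);
            System.out.println(boundedHash(key, 1 << 20));  // at most ~1M map entries
        }
    }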