Improvements in Bitsy 1.5
Sridhar Ramachandran
Founder, LambdaZen LLC
This presentation covers the improvements to the Bitsy Graph Database in version 1.5.

Published in: Technology, Education
  1. Improvements in Bitsy 1.5
     Sridhar Ramachandran
     Founder, LambdaZen LLC
  2. Background
     ● Bitsy is a small, fast, embeddable, durable, in-memory graph database that implements the Tinkerpop Blueprints API.
     ● The original presentation on Bitsy is available at http://slideshare.net/lambdazen/bitsy-graphdatabase
     ● Bitsy 1.5 is faster and leaner than before!
       ○ Has a smaller memory footprint
       ○ Uses (mostly) lock-free read algorithms
     ● This presentation covers the improvements in the 1.5 release.
  3. Major features in the 1.5 release
     ● The 1.5 release features:
       ○ Memory-efficient data structures
       ○ Mostly lock-free read algorithms
     ● Bitsy’s new memory-efficient data structures are designed to reduce the overhead of maintaining adjacency lists and properties.
     ● Bitsy’s new read algorithms are designed to use the latest Java “compare-and-set” (CAS) concurrency features to reduce the overhead of locks in highly threaded scenarios.
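As a reminder of what a compare-and-set (CAS) retry loop looks like in Java, here is a minimal sketch built on `java.util.concurrent.atomic.AtomicLong`. The class `CasCounter` is a hypothetical illustration and is not taken from Bitsy's source.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example class; not part of Bitsy.
class CasCounter {
    private final AtomicLong value = new AtomicLong();

    /** Adds delta without taking a lock; retries if another thread changed the value first. */
    long add(long delta) {
        while (true) {
            long current = value.get();
            long next = current + delta;
            // compareAndSet succeeds only if the value is still 'current'
            if (value.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    long get() {
        return value.get();
    }
}
```

No thread ever blocks here: a thread that loses the race simply re-reads the value and tries again.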
  4. Memory-efficient data structures
     ● Bitsy 1.0 relied on Java Collections to maintain adjacency lists and properties of vertices.
     ● Java Collections aren’t memory efficient for small-sized data structures because they create many holder objects.
     ● The 1.5 release stores small adjacency lists (N<24) and small properties (N<16) in hand-coded objects with minimal overhead.
  5. Memory-efficient data structures
     ● Different concrete classes capture adjacency lists and properties for small N.
       ○ This approach reduces the overall number of objects.
       ○ Large adjacency lists are stored in a compact hash-set by label referring to memory-efficient lists.
     Figures: Adjacency lists for out-degree 0, 1 and 2; Vertex properties for N = 0, 1 and 2
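The idea of one concrete class per small size can be sketched as follows. This is only an illustration under assumed names (`PropDict`, `Dict0`, `Dict1`, `DictN` are hypothetical) and it switches to a `HashMap` after a single entry purely to stay short; the slides state that Bitsy keeps hand-coded objects up to N<16 for properties.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; this is not Bitsy's actual class hierarchy.
interface PropDict {
    Object get(String key);
    /** Returns the dictionary to keep after setting key; it may be a different class. */
    PropDict with(String key, Object value);
    int size();
}

/** Zero properties: a shared singleton, so empty vertices cost no extra objects. */
final class Dict0 implements PropDict {
    static final Dict0 INSTANCE = new Dict0();
    private Dict0() { }
    public Object get(String key) { return null; }
    public PropDict with(String key, Object value) { return new Dict1(key, value); }
    public int size() { return 0; }
}

/** One property: two fields, no map, no entry objects. */
final class Dict1 implements PropDict {
    private final String k1;
    private final Object v1;
    Dict1(String k1, Object v1) { this.k1 = k1; this.v1 = v1; }
    public Object get(String key) { return k1.equals(key) ? v1 : null; }
    public PropDict with(String key, Object value) {
        if (k1.equals(key)) return new Dict1(key, value);
        return new DictN(k1, v1, key, value); // grow into the general form
    }
    public int size() { return 1; }
}

/** General fallback backed by a HashMap (used here from two entries up, for brevity). */
final class DictN implements PropDict {
    private final Map<String, Object> map = new HashMap<>();
    DictN(String k1, Object v1, String k2, Object v2) { map.put(k1, v1); map.put(k2, v2); }
    public Object get(String key) { return map.get(key); }
    public PropDict with(String key, Object value) { map.put(key, value); return this; }
    public int size() { return map.size(); }
}
```

A vertex would hold a single `PropDict` reference and replace it with whatever `with(...)` returns, for example `props = props.with("name", "alice")`.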
  6. Lock-free reading
     ● Bitsy 1.5 also introduces lock-free reading using sequential locks (seqlock).
     ● Read operations track the sequence numbers at the start and end.
       ○ If they are the same -- Success.
       ○ If they are different -- Retry!
     ● Reads don’t start till the counter is even.
     ● Writers increment the counter twice:
       ○ Before the write to make the counter an odd number
       ○ After the write to make the counter an even number
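The seqlock protocol from this slide can be sketched in Java as below. This is a minimal illustration, not Bitsy's implementation: the class `SeqLockedPair` and its two fields are hypothetical, writers are serialized with `synchronized` as an assumption, and the guarded fields are declared `volatile` so that the Java memory model cannot move the data reads past the second counter read.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical class; a sketch of the protocol, not Bitsy's code.
class SeqLockedPair {
    // Even value = stable, odd value = a write is in progress.
    private final AtomicLong seq = new AtomicLong();
    // volatile keeps the data reads ordered between the two counter reads.
    private volatile long x;
    private volatile long y;

    /** Writers are serialized here with 'synchronized' (an assumption of this sketch). */
    synchronized void write(long newX, long newY) {
        seq.incrementAndGet(); // counter becomes odd: readers will wait or retry
        x = newX;
        y = newY;
        seq.incrementAndGet(); // counter becomes even again
    }

    /** Lock-free read: returns a consistent {x, y} snapshot. */
    long[] read() {
        while (true) {
            long start = seq.get();
            if ((start & 1L) != 0) { // odd: don't start until the counter is even
                Thread.yield();
                continue;
            }
            long rx = x;
            long ry = y;
            if (seq.get() == start) { // same at start and end: success
                return new long[] { rx, ry };
            }
            // different: a writer intervened, so retry
        }
    }

    long sequence() {
        return seq.get();
    }
}
```

Readers never block writers in this scheme, which is also why a steady stream of writes can keep a reader retrying indefinitely.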
  7. (Mostly) lock-free reading
     ● Bitsy’s sequential locks can cause “live lock” situations when there are too many writers.
     ● To avoid this, readers degrade to RW locks after a certain number of retries.
     ● Seqlocks are faster than RW locks in highly threaded environments where the number of active threads exceeds the number of cores.
     ● Bitsy uses locks on writes because
       ○ write-retries are complex with transactions, and
       ○ locking is not the bottleneck for writes -- the file system is the bottleneck.
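The "optimistic first, then fall back to a read lock" pattern can be illustrated with the JDK's `java.util.concurrent.locks.StampedLock`, whose optimistic read mode works much like a seqlock. This swaps in a standard library class for illustration; it is not how Bitsy itself is written, and the class name `FallbackReader` and the retry limit of 3 are assumptions.

```java
import java.util.concurrent.locks.StampedLock;

// Hypothetical class; uses the JDK's StampedLock rather than Bitsy's own seqlock.
class FallbackReader {
    private static final int MAX_OPTIMISTIC_TRIES = 3; // assumed limit, for illustration

    private final StampedLock lock = new StampedLock();
    private long a;
    private long b;

    /** Writes always take the exclusive lock, as on the slide. */
    void write(long newA, long newB) {
        long stamp = lock.writeLock();
        try {
            a = newA;
            b = newB;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    long[] read() {
        // Phase 1: lock-free attempts.
        for (int i = 0; i < MAX_OPTIMISTIC_TRIES; i++) {
            long stamp = lock.tryOptimisticRead(); // returns 0 if a writer holds the lock
            if (stamp != 0L) {
                long ra = a;
                long rb = b;
                if (lock.validate(stamp)) { // no write happened since the stamp was issued
                    return new long[] { ra, rb };
                }
            }
        }
        // Phase 2: too many retries, so degrade to a blocking read lock.
        long stamp = lock.readLock();
        try {
            return new long[] { a, b };
        } finally {
            lock.unlockRead(stamp);
        }
    }
}
```

The bounded retry count is what prevents the live lock: under heavy write traffic a reader waits for the read lock once and is then guaranteed to finish.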
  8. Benchmarks
     ● The plot below shows the read throughput [*] of a test application [!] that repeatedly loops through a graph.
     [*] Tests performed on a $600 HP p7-1287c desktop PC with a single 7200rpm hard disk.
     [!] The code for this test can be found in BitsyGraphTest.java under the method testMultiThreadedCommits().
  9. Benchmarks
     ● The lock-free read algorithms in Bitsy 1.5 show a significantly higher throughput than Bitsy 1.0.
       ○ Bitsy 1.0 had a drop in performance when the number of threads exceeded the number of cores.
       ○ The read throughput exceeds 10M reads/sec!
     ● Bitsy is now comparable to Neo4J in read throughput*.
       ○ This is an apples-to-apples comparison since Neo4J is embedded and the graph is fully cached.
       ○ Most “bad” Neo4J benchmarks are taken when the graph doesn’t fit in memory.
       ○ Neo4J is extremely fast when the graph fits in memory -- and now, so is Bitsy!
  10. Another read benchmark
     ● The following plot shows the traversal performance of Bitsy 1.5 vs Neo4J 1.9.2 in a multi-threaded setting on a bipartite graph with 1M vertices and out-degree of 3.
     ● Again, you can see that the performance is comparable.
  11. Benchmarks for write
     ● As with the 1.0 release, Bitsy’s write throughput is much higher than Neo4J’s because of the “No Seek” principle.
       ○ For more info, please refer to the project page at http://bitbucket.org/lambdazen/bitsy/
  12. Wrap-up
     ● The 1.5 release introduces memory-efficient data structures and (mostly) lock-free reading to the Bitsy graph database.
       ○ With these improvements, Bitsy’s read performance is comparable to Neo4J’s cache.
       ○ Bitsy’s “No Seek” write algorithms continue to outperform other graph databases, including Neo4J.
     ● Bitsy is a dual-licensed product with
       ○ an AGPL license for open-source projects, and
       ○ a liberal unlimited-use OEM/end-user license for commercial projects. Details at lambdazen.com.
