What is Bitsy?
● A small, fast, embeddable,
durable, in-memory graph database.
● Maintains an on-disk copy of
the graph database.
● Designed for multi-threaded OLTP environments.
● Provides ACID guarantees
and optimistic concurrency
control for transactions.
● Compatible with
Tinkerpop/Blueprints -- the
graph database standard.
Tinkerpop software stack
In-memory and durable?
● Bitsy maintains a copy of the entire graph in memory
● Bitsy saves all changes made to the database to
disk during a commit operation.
● Commits from different threads are forced to the disk at
once, thereby improving the write performance in a
multithreaded OLTP environment.
● The database is loaded from files during startup.
● All database files are append-only text files with JSON-encoded vertices and edges.
● The database files are periodically compacted by a
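The load-at-startup and append-on-commit behaviour described above can be illustrated with a minimal sketch. This is not Bitsy's code or file format: the class name `AppendOnlyStore`, the tab-separated `id<TAB>json` record layout, and the use of a plain `HashMap` are all assumptions made for illustration, and the sketch does not force writes to disk or compact the file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: an in-memory map backed by an append-only text file. */
public class AppendOnlyStore {
    private final Path file;
    private final Map<String, String> inMemory = new HashMap<>();

    /** Startup: replay the file from top to bottom; later records win. */
    public AppendOnlyStore(Path file) {
        this.file = file;
        if (!Files.exists(file)) {
            return; // nothing to load yet
        }
        try {
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                int tab = line.indexOf('\t');
                if (tab > 0) {
                    inMemory.put(line.substring(0, tab), line.substring(tab + 1));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Commit: append one record to the file, then update the in-memory copy. */
    public void put(String id, String json) {
        // Assumed record layout: id, a tab, then a single-line JSON string.
        String record = id + "\t" + json + "\n";
        try {
            Files.write(file, record.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        inMemory.put(id, json);
    }

    /** Reads are served from memory only. */
    public String get(String id) {
        return inMemory.get(id);
    }
}
```

Because old records are never overwritten, the file grows with every update, which is why an append-only design needs periodic compaction.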
Design Principle #1: No Seek
● Bitsy appends all changes to an
unordered transaction log, unlike
most databases, which persist data in
B-Trees and other ordered
data structures.
● Ordered data structures perform
multiple seeks per updated element.
● Seek operations on the hard-disk are
expensive (5-15 ms).
● Bitsy avoids seeks per element, and
addresses rotational latency by
combining commits from concurrent
threads.
Hard disk head: Seek
operations require a
mechanical movement of
the hard disk head which
Rotational latency is the
time taken for the
requested sector in the
rotating platter to reach
the head. Takes 2-4ms.
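Combining commits from concurrent threads is commonly called group commit. The sketch below shows one way to do it, assuming a single background flusher thread that drains a queue and issues one `FileChannel.force` per batch. The class `GroupCommitLog` and its methods are hypothetical and are not taken from Bitsy's source.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical group-commit sketch: many committers, one force per batch. */
public class GroupCommitLog {
    private static final class Pending {
        final byte[] bytes;
        final CompletableFuture<Void> done = new CompletableFuture<>();
        Pending(String record) {
            this.bytes = (record + "\n").getBytes(StandardCharsets.UTF_8);
        }
    }

    private final BlockingQueue<Pending> queue = new LinkedBlockingQueue<>();
    private final AtomicInteger forces = new AtomicInteger();
    private final AtomicInteger committed = new AtomicInteger();
    private final FileChannel channel;
    private final Thread flusher;
    private volatile boolean running = true;

    public GroupCommitLog(Path file) {
        try {
            channel = FileChannel.open(file, StandardOpenOption.CREATE,
                    StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        flusher = new Thread(this::flushLoop, "group-commit-flusher");
        flusher.setDaemon(true);
        flusher.start();
    }

    /** Blocks the caller until its record has been forced to disk. */
    public void commit(String record) {
        Pending p = new Pending(record);
        queue.add(p);
        p.done.join();
    }

    private void flushLoop() {
        List<Pending> batch = new ArrayList<>();
        while (running || !queue.isEmpty()) {
            try {
                Pending first = queue.poll(20, TimeUnit.MILLISECONDS);
                if (first == null) {
                    continue;
                }
                batch.add(first);
                queue.drainTo(batch); // pick up everything queued meanwhile
                for (Pending p : batch) {
                    ByteBuffer buf = ByteBuffer.wrap(p.bytes);
                    while (buf.hasRemaining()) {
                        channel.write(buf);
                    }
                }
                channel.force(false); // one force covers the whole batch
                forces.incrementAndGet();
                committed.addAndGet(batch.size());
                for (Pending p : batch) {
                    p.done.complete(null);
                }
            } catch (Exception e) {
                for (Pending p : batch) {
                    p.done.completeExceptionally(e);
                }
            } finally {
                batch.clear();
            }
        }
    }

    /** Stops the flusher after pending records are written. Do not commit after close. */
    public void close() {
        running = false;
        try {
            flusher.join();
            channel.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public int forceCount() { return forces.get(); }
    public int committedCount() { return committed.get(); }
}
```

With many threads calling `commit` at once, `forceCount()` can be much smaller than `committedCount()`, so the cost of waiting for the disk is shared across the batch.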
Design Principle #2: No Socket
● Typical databases run in a separate
process exposing a socket-based
protocol to applications.
● The cost of serializing and
deserializing the requests and
responses, and calling OS-level
functions, reduces the overall
throughput of the database.
● By avoiding a socket-based protocol
between the application and the
database, Bitsy can achieve sub-microsecond query latencies.
The OSI model requires
deserialization as the
packet crosses from one
layer to another
Design Principle #3: No SQL
● Tuning a SQL database is
a non-trivial task.
● The biggest factor in a
SQL query's efficiency is
its execution plan.
● By avoiding SQL and the
execution plans that come
with it, Bitsy ensures that
all queries and updates
An example execution plan from Oracle's
* The "allow full-graph scan" option must be disabled to guarantee quick responses.
● Bitsy is designed to work in multi-threaded OLTP environments.
● It implements optimistic concurrency control where
edges and vertices are tied to version numbers that are
incremented on updates.
● A BitsyRetryException is raised during a transaction
commit if an updated vertex or edge has a different
version at commit time than it had when it was queried.
● The application should retry the entire transaction in
case of conflict.
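The version check and the retry loop can be sketched as follows. This is a self-contained illustration of optimistic concurrency control, not Bitsy's API: `VersionedStore`, `Versioned`, and `RetryException` are hypothetical names, with `RetryException` standing in for the role that BitsyRetryException plays.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of version-based optimistic concurrency control. */
public class VersionedStore {
    /** Stand-in for a conflict exception such as BitsyRetryException. */
    public static class RetryException extends RuntimeException {
        public RetryException(String message) {
            super(message);
        }
    }

    /** Immutable snapshot: a value plus the version seen at query time. */
    public static final class Versioned {
        public final String value;
        public final int version;
        Versioned(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final ConcurrentHashMap<String, Versioned> elements = new ConcurrentHashMap<>();

    /** Query: elements that do not exist yet are reported at version 0. */
    public Versioned read(String id) {
        return elements.getOrDefault(id, new Versioned(null, 0));
    }

    /** Commit succeeds only if the version is unchanged since the read. */
    public void commit(String id, Versioned seen, String newValue) {
        elements.compute(id, (key, current) -> {
            int currentVersion = (current == null) ? 0 : current.version;
            if (currentVersion != seen.version) {
                throw new RetryException("version changed for " + id);
            }
            return new Versioned(newValue, currentVersion + 1); // increment on update
        });
    }

    /** Retries the whole read-modify-commit transaction on conflict. */
    public void update(String id, UnaryOperator<String> change, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            Versioned seen = read(id); // re-read on every attempt
            try {
                commit(id, seen, change.apply(seen.value));
                return;
            } catch (RetryException e) {
                if (attempt >= maxAttempts) {
                    throw e;
                }
            }
        }
    }
}
```

The important detail is that `update` re-reads the element inside the loop: retrying only the commit with stale data would fail again with the same conflict.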
The write algorithms operate on
three levels of "double buffers".
The transaction buffers capture
transactions to be committed
The commit waits for the buffer to
flush to a transaction file (A/B).
Transaction files are moved to
vertex and edge files on exceeding
a threshold size (default is 4MB).
Vertex and edge files are
reorganized after a period of growth
(default is +1x initial size).
Online backups trigger a
transaction flush, and then copy
the vertex and edge files
representing the DB snapshot.
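The double-buffer idea can be sketched for the in-memory level only. The class below is a hypothetical illustration, assuming a single flusher thread: writers keep filling one buffer while the flusher drains the other. It does not model the A/B transaction files or the vertex and edge files.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical double buffer: one list fills while the other is flushed. */
public class DoubleBuffer<T> {
    private List<T> active = new ArrayList<>();   // receives new items
    private List<T> flushing = new ArrayList<>(); // owned by the flusher

    /** Called by writer threads; only the cheap list append holds the lock. */
    public synchronized void add(T item) {
        active.add(item);
    }

    /**
     * Called by the single flusher thread. Swaps the two buffers and returns
     * the filled one, so slow I/O can run outside the lock. The returned list
     * is reused: the caller must finish with it before calling swap() again.
     */
    public synchronized List<T> swap() {
        List<T> filled = active;
        flushing.clear();   // the previous batch has already been written out
        active = flushing;  // writers now fill the recycled buffer
        flushing = filled;
        return filled;
    }
}
```

Writers are blocked only for the duration of the swap itself, never for the duration of the disk write.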
Write throughput in an OLTP setting
The plot below shows the throughput of a test application* that repeatedly
commits a small transaction (1 vertex + 1 edge) from multiple threads.
The throughput exceeds 50K ops/second at 750 concurrent threads.
The comparison with Neo4J 1.9.2 illustrates the benefit of "No Seek".
* Tests performed on a $600 HP p7-1287c desktop PC with a single 7200 rpm hard disk.
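A back-of-envelope estimate suggests why batching matters for this result. It is not a measurement, and it rests on three assumptions: one "op" equals one commit, every forced write waits at least one rotational latency $t_{rot}$, and seek and transfer times are ignored. Using the 2-4 ms rotational latency quoted under Design Principle #1:

```latex
% Upper bound on forced writes per second at the best-case latency
\text{forces/s} \le \frac{1}{t_{rot}} = \frac{1}{2\ \text{ms}} = 500

% Minimum average batch size needed to reach 50K commits per second
\text{commits per force} \ge \frac{50{,}000\ \text{commits/s}}{500\ \text{forces/s}} = 100
```

At $t_{rot} = 4$ ms the same arithmetic gives at most 250 forces per second and at least 200 commits per force. Under these assumptions, a design that forces every commit separately could not exceed a few hundred commits per second on this disk.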
Read throughput in an OLTP setting
The plot below shows the read throughput of threads repeatedly traversing
separate portions of the graph on a desktop PC*.
Bitsy implements mostly lock-free read algorithms that can perform close
to 20M ops/second at 1000 threads -- on par with Neo4J’s warm caches.
* Tests performed on a $600 HP p7-1287c desktop PC with 4 cores
Monitoring and Management
● Offline backup and
restore operations are
simple file copy
operations on the
● Bitsy exposes a JMX
interface to make online
backups, and adjust
● Bitsy logs messages
using the SLF4J API with
logger names starting
Online backup through jconsole
Jackson JSON Processor
Ness Computing Core Component: For fast UUID
● Bitsy is a dual-licensed product.
● The AGPL v3 license can be used for open-source
projects and internally-used closed-source projects.
● The commercial license is an extremely liberal license
that provides rights to modify and use Bitsy in an
unlimited number of instances, products* and services.
Pricing details with a 15% promotional discount (till Feb 2014)
Startups and small
* The products must not encourage the direct use of Bitsy APIs.
● Bitsy is a small, fast, embeddable, durable, in-memory
graph database, with the following features:
○ ACID guarantees and clean recovery from crashes
○ Sub-microsecond query latency
○ High transaction throughput in an OLTP setting with multiple
clients/threads accessing the database
○ Well-defined optimistic concurrency model
○ Support for online backups
○ Human-readable database files
○ Small code footprint (~1.5MB with dependencies)
● Bitsy is dual-licensed under AGPL and a liberal
commercial license for unlimited enterprise-wide use.
Questions and Feedback
● The project is hosted at
https://bitbucket.org/lambdazen/bitsy with publicly accessible
○ Documentation and install instructions (in Wiki)
○ Links to downloads
○ Issue management
● Please email your questions and feedback to