Improvements in Bitsy 1.5
Sridhar Ramachandran
Founder, LambdaZen LLC
Background
● Bitsy is a small, fast, embeddable, durable,
in-memory graph database that implements
the Tinkerpop Blueprints API.
● The original presentation on Bitsy is
available at
http://slideshare.net/lambdazen/bitsy-graphdatabase
● Bitsy 1.5 is faster and leaner than before!
○ Has a smaller memory footprint
○ Uses (mostly) lock-free read algorithms
● This presentation covers the improvements
in the 1.5 release.
Major features in the 1.5 release
● The 1.5 release features:
○ Memory-efficient data structures
○ Mostly lock-free read algorithms
● Bitsy’s new memory-efficient data structures
are designed to reduce the overhead of
maintaining adjacency lists and properties.
● Bitsy’s new read algorithms are designed to
use the latest Java “compare-and-set” (CAS)
concurrency features to reduce the overhead
of locks in highly threaded scenarios.
Memory-efficient data structures
● Bitsy 1.0 relied on Java Collections to
maintain adjacency lists and properties of
vertices.
● Java Collections aren’t memory efficient for
small-sized data structures because they
create many holder objects.
● The 1.5 release stores small adjacency lists
(N<24) and small properties (N<16) in hand-
coded objects with minimal overhead.
Memory-efficient data structures
● Different concrete
classes capture
adjacency lists and
properties for small N.
○ This approach reduces
the overall number of
objects.
○ Large adjacency lists are
stored in a compact hash-
set by label referring to
memory-efficient lists.
Adjacency lists for out-degree 0, 1 and 2
Vertex properties for N = 0, 1 and 2
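The idea of one concrete class per small N can be sketched in Java as follows. This is only an illustration under stated assumptions: the names PropertyDict, Dict0, Dict1, Dict2, MapDict and the method with() are hypothetical and are not taken from the Bitsy source, and the sketch falls back to a HashMap at N = 3, whereas the slides say Bitsy keeps hand-coded objects for properties up to N < 16.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names; a sketch of the idea, not Bitsy's actual classes.
interface PropertyDict {
    Object get(String key);
    // Returns the holder to keep using; it may be a different, larger class.
    PropertyDict with(String key, Object value);
    int size();
}

// N = 0: a single shared instance with no fields.
final class Dict0 implements PropertyDict {
    static final Dict0 EMPTY = new Dict0();
    private Dict0() {}
    public Object get(String key) { return null; }
    public PropertyDict with(String key, Object value) { return new Dict1(key, value); }
    public int size() { return 0; }
}

// N = 1: key and value are plain fields, so no array or entry objects are needed.
final class Dict1 implements PropertyDict {
    private final String k1;
    private Object v1;
    Dict1(String k1, Object v1) { this.k1 = k1; this.v1 = v1; }
    public Object get(String key) { return k1.equals(key) ? v1 : null; }
    public PropertyDict with(String key, Object value) {
        if (k1.equals(key)) { v1 = value; return this; }
        return new Dict2(k1, v1, key, value);
    }
    public int size() { return 1; }
}

// N = 2: two key/value pairs held directly in fields.
final class Dict2 implements PropertyDict {
    private final String k1, k2;
    private Object v1, v2;
    Dict2(String k1, Object v1, String k2, Object v2) {
        this.k1 = k1; this.v1 = v1; this.k2 = k2; this.v2 = v2;
    }
    public Object get(String key) {
        if (k1.equals(key)) return v1;
        if (k2.equals(key)) return v2;
        return null;
    }
    public PropertyDict with(String key, Object value) {
        if (k1.equals(key)) { v1 = value; return this; }
        if (k2.equals(key)) { v2 = value; return this; }
        // Assumption for brevity: switch to a general-purpose map at N = 3.
        MapDict big = new MapDict();
        big.with(k1, v1);
        big.with(k2, v2);
        return big.with(key, value);
    }
    public int size() { return 2; }
}

// Larger N: an ordinary Java collection.
final class MapDict implements PropertyDict {
    private final Map<String, Object> map = new HashMap<>();
    public Object get(String key) { return map.get(key); }
    public PropertyDict with(String key, Object value) { map.put(key, value); return this; }
    public int size() { return map.size(); }
}
```

A vertex would hold a single PropertyDict reference and replace it with whatever with() returns, so a vertex with one property costs one small object instead of a map, its table, and an entry.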
Lock-free reading
● Bitsy 1.5 also introduces lock-free reading
using sequence locks (seqlocks).
● Read operations track the sequence
numbers at the start and end.
○ If they are the same -- Success.
○ If they are different -- Retry!
● Reads don’t start till the counter is even.
● Writers increment the counter twice:
○ Before the write to make the counter an odd number
○ After the write to make the counter an even number
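The read and write protocol above can be sketched in Java as follows. This is a minimal sketch, not Bitsy's code: the class SeqLockPair and its fields are hypothetical, writers are serialized with synchronized (the slides only say that writes use locks), and the data fields are declared volatile so that the sketch stays valid under the Java memory model without explicit fences.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical class; illustrates the even/odd counter protocol only.
final class SeqLockPair {
    // Even value: data is stable. Odd value: a write is in progress.
    private final AtomicLong seq = new AtomicLong();
    // Volatile is an assumption made here to keep the sketch simple and correct.
    private volatile long x;
    private volatile long y;

    // Writers take a lock, so at most one write is in progress at a time.
    synchronized void write(long newX, long newY) {
        seq.incrementAndGet();   // counter becomes odd
        x = newX;
        y = newY;
        seq.incrementAndGet();   // counter becomes even again
    }

    // Readers take no lock; they retry until they see a stable snapshot.
    long[] read() {
        while (true) {
            long start = seq.get();
            if ((start & 1L) != 0) {   // odd: a write is in progress, do not start yet
                Thread.yield();
                continue;
            }
            long rx = x;
            long ry = y;
            if (seq.get() == start) {  // same number at start and end: success
                return new long[] { rx, ry };
            }
            // Different number: a writer interfered, so retry.
        }
    }
}

final class SeqLockDemo {
    public static void main(String[] args) throws InterruptedException {
        SeqLockPair pair = new SeqLockPair();
        Thread writer = new Thread(() -> {
            for (long i = 1; i <= 100_000; i++) {
                pair.write(i, -i);     // invariant: x + y == 0
            }
        });
        writer.start();
        for (int i = 0; i < 100_000; i++) {
            long[] snap = pair.read();
            if (snap[0] + snap[1] != 0) {
                throw new IllegalStateException("torn read");
            }
        }
        writer.join();
        System.out.println("no torn reads observed");
    }
}
```

The demo's invariant (x + y == 0) makes a torn read detectable: a reader that saw the new x with the old y would fail the check.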
(Mostly) lock-free reading
● Bitsy’s sequence locks can cause “livelock”
situations when there are too many writers.
● To avoid this, readers degrade to RW locks
after a certain number of retries.
● Seqlocks are faster than RW locks in highly
threaded environments where the # of active
threads exceeds the # of cores.
● Bitsy uses locks on writes because
○ write-retries are complex with transactions, and
○ locking is not the bottleneck for writes -- the file
system is the bottleneck.
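The fallback from optimistic reads to a read-write lock can be sketched in Java as follows. This is a sketch under assumptions: the class HybridPair, the retry limit of 3, and the fallback counter are hypothetical, since the slides only say "a certain number of retries". The data fields are volatile to keep the sketch valid under the Java memory model.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical class; illustrates "optimistic first, RW lock after N retries".
final class HybridPair {
    private static final int MAX_OPTIMISTIC_TRIES = 3;  // assumed threshold

    private final AtomicLong seq = new AtomicLong();     // even = stable, odd = writing
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private final AtomicLong fallbacks = new AtomicLong(); // for observation only
    private volatile long a;
    private volatile long b;

    // Writers always take the write lock and bump the counter around the update.
    void write(long newA, long newB) {
        rw.writeLock().lock();
        try {
            seq.incrementAndGet();   // odd
            a = newA;
            b = newB;
            seq.incrementAndGet();   // even
        } finally {
            rw.writeLock().unlock();
        }
    }

    long[] read() {
        // Optimistic phase: no locking, bounded number of attempts.
        for (int attempt = 0; attempt < MAX_OPTIMISTIC_TRIES; attempt++) {
            long start = seq.get();
            if ((start & 1L) == 0) {
                long ra = a;
                long rb = b;
                if (seq.get() == start) {
                    return new long[] { ra, rb };
                }
            }
        }
        // Pessimistic phase: the read lock excludes writers, so the read cannot be torn
        // and the reader is guaranteed to make progress.
        rw.readLock().lock();
        try {
            fallbacks.incrementAndGet();
            return new long[] { a, b };
        } finally {
            rw.readLock().unlock();
        }
    }

    long fallbackCount() {
        return fallbacks.get();
    }
}
```

With no concurrent writers the optimistic phase succeeds on the first attempt and the read lock is never touched; under heavy write load the bounded loop caps how long a reader can spin.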
Benchmarks
● The plot below shows the read throughput* of a test†
application that repeatedly loops through a graph.
* Tests performed on a $600 HP p7-1287c desktop PC with a single 7200rpm hard disk.
† The code for this test can be found in BitsyGraphTest.java under the method testMultiThreadedCommits().
Benchmarks
● The lock-free read algorithms in Bitsy 1.5 show a
significantly higher throughput than Bitsy 1.0.
○ Bitsy 1.0 had a drop in performance when the
number of threads exceeded the number of cores.
○ The read throughput exceeds 10M reads/sec!
● Bitsy is now comparable to Neo4J in read throughput*.
○ This is an apples-to-apples comparison since Neo4J
is embedded and the graph is fully cached.
○ Most “bad” Neo4J benchmarks are taken when the
graph doesn’t fit in memory.
○ Neo4J is extremely fast when the graph fits in
memory -- and now, so is Bitsy!
Another read benchmark
● The following plot shows the traversal performance of
Bitsy 1.5 vs Neo4J 1.9.2 in a multi-threaded setting on a
bipartite graph with 1M vertices and out-degree of 3.
● Again, you can see that the performance is comparable.
Benchmarks for write
● As with the 1.0 release, Bitsy’s write throughput is much
higher than Neo4J’s because of the “No Seek” principle.
○ For more info, please refer to the project page at
http://bitbucket.org/lambdazen/bitsy/
Wrap-up
● The 1.5 release introduces memory-efficient
data structures and (mostly) lock-free
reading to the Bitsy graph database.
○ With these improvements, Bitsy’s read performance
is comparable to Neo4J’s when the graph is fully cached.
○ Bitsy’s “No Seek” write algorithms continue to
outperform other graph databases, including Neo4J.
● Bitsy is a dual-licensed product with
○ an AGPL license for open-source projects, and
○ a liberal unlimited-use OEM/end-user license for
commercial projects. Details at lambdazen.com.
