Manoj Awasthi,
Tech Architect @Tokopedia
Boltdb
an embedded key value database
Structure of this talk..
A bit of history..
[Diagram: tokopedia image router, a filename-to-server mapping, and Image Server 1, Image Server 2, ... Image Server N]
123.jpg : server_03;
246.jpg : server_02;
345.jpg : server_17;
….
.. as time passed
gradually, we started storing newer images in s3:// ..
• All images uploaded from that point onwards could be served from a single server
• No mapping (in mongodb) was required for them
• Old images were still served the same way and still needed the mapping
• But the database was now “read only” and of fixed size


Also: we suffered frequent memory spikes, and the linux “out of memory killer” kept killing the mongodb process, which led to both latency and downtime.
Search for alternative..
Requirements boiled down to: 

• Fast retrieval - needed all across
• Scalable - to tens of thousands of queries per second
• Persistent - don’t have to recompute everything from scratch on each bootup (in case!)

Also, we can do with read only usage - not a constraint, but this could help in the “trade off”

Why not redis?
Redis! Well, it could work well given our fixed data size and read only usage.
In fact, we did try it and saw scaling problems with redis (high cpu load).
Also $$.

We needed a lightweight embedded database .. “BoltDB” - an embedded key value database written in golang looked interesting.
Why boltdb?
Compact, fast.
Based on LMDB [0]. Both use a B+ tree for storage, maintain ACID semantics with fully serializable transactions, and support many other database features.
Simple. While LMDB focuses on raw performance, Boltdb focuses on ease of use.
Fits better for “read heavy” usage (read more, write less).
Written in golang, so it fits well with the rest of the stack at Tokopedia.
[0] https://symas.com/products/lightning-memory-mapped-database/
Why boltdb?
In the traditional sense, boltdb is not really a database but simply a memory mapped file. But it provides ACID semantics and other properties associated with databases, so calling it a DB is not a misnomer.
No installation required
● It comes as a library
● Installation is as simple as importing it in your go program
Opening the database..
Add a key value
Fetch a value by key
bolt - command line utility
bolt is a tool for inspecting bolt databases.

Things to use it for:
• Check the integrity of a bolt database
• Run synthetic benchmarks against a bolt database to gauge read and write performance
• Print basic info about a database
• Generate useful statistics on all pages in the database
Available under cmd/bolt in the github repository.
Caveat: random writes slow down as the db grows!
Let’s get back to the problem we were solving.

The raw data from mongodb, exported using the mongoexport utility, was ~4G.
This translated to a ~13G boltdb database file.

The tool that we wrote to load the mongo export into boltdb became much slower as the database grew. Hence we used sharding: we horizontally partitioned the mongo data into many small files and built a smaller boltdb file for each of them.
The result!
Following is the output of `free -m` on one of the servers we use:

[Image: `free -m` output]

Snippet of `top` output from the same server:

[Image: `top` output]
Limitations
• Bolt is good for read intensive workloads. Random writes can be slow.
• Bolt uses a B+ tree internally, so there can be a lot of random page access. SSDs provide a significant performance boost over spinning disks.
• Bolt can handle databases much larger than the available physical RAM, provided its memory map fits in the process address space. This may be problematic on 32 bit systems.
• The data structures used by bolt are memory mapped and hence endian specific. This means that you cannot copy a bolt file from a little endian machine to a big endian machine and have it work. (Most modern CPUs are little endian.)
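
To see why byte order matters for a memory mapped file, here is a small sketch using only Go's standard library. It is not bolt's own code; it just shows that the same 8 bytes on disk mean different numbers depending on the byte order of the machine reading them. The helper name `reinterpret` is made up for this example.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// reinterpret writes v the way a little endian machine would lay it out
// in memory, then reads the same bytes back the way a big endian machine
// would interpret them.
func reinterpret(v uint64) uint64 {
	buf := make([]byte, 8)
	binary.LittleEndian.PutUint64(buf, v)
	return binary.BigEndian.Uint64(buf)
}

func main() {
	// A small value such as 13 comes back as 13 shifted into the top byte.
	fmt.Println(reinterpret(13) == uint64(13)<<56) // prints true
}
```

A page number or size field read this way would point to the wrong place, which is why the file only works on machines with the same byte order as the one that wrote it.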
Conclusion
Boltdb worked pretty well for our use case.
The service handles many thousands of queries per second, is not limited by physical RAM, and is doing well! :D
Do give it a try if it fits your use case.
References:
[1] https://github.com/boltdb/bolt
[2] http://tech.tokopedia.com/blog/using-boltdb-as-a-fast-persistent-kv-store/
[3] https://symas.com/products/lightning-memory-mapped-database/
Connect with me over:
{ “Email”: “awasthi.manoj@gmail.com”,
“Twitter”: “https://twitter.com/awmanoj”,
“Linkedin”: “https://www.linkedin.com/in/manojawasthi”,
“Github”: “https://github.com/awmanoj/”,
“Blog”: [ “http://awmanoj.github.io/”, “http://www.manojawasthi.com”]
}
Thank you!
