A zoom on membase vng
1. A Zoom on Membase
Dedicated to VNG
Viet-Trung TRAN
ENS Cachan, INRIA/IRISA France
1 www.trungtv.com 19/06/11
2. What’s Membase
A key/value store
Simple, fast, elastic
Membase’s API is simple but not simpler
SET(key, value)
Value = GET(key)
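To make the two primitives concrete, here is a minimal sketch of the SET/GET contract. The class `ToyKeyValueStore` is a hypothetical in-memory stand-in written for illustration; it is not the Membase client API.

```python
class ToyKeyValueStore:
    """Hypothetical in-memory stand-in for a key/value store (not a Membase client)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # SET(key, value): store or overwrite the value under this key.
        self._data[key] = value

    def get(self, key):
        # Value = GET(key): return the stored value, or None if the key is unknown.
        return self._data.get(key)


store = ToyKeyValueStore()
store.set("user:42", "Trung")
value = store.get("user:42")
```

There is no query language and no schema: the application decides what the key means and how the value is encoded.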
3. Where’s Membase
SQL database? No
No complex queries, no schema, no ACID
NoSQL
Non-relational, distributed and HORIZONTALLY scalable
Key/value store
Dynamo, Membase, Voldemort, Riak, Redis, etc.
Column-oriented store
BigTable, HBase, Cassandra, etc.
Document store
MongoDB, CouchDB, Terrastore, etc.
Array-oriented store
Pyramid, SciDB
4. Why NoSQL
For over 40 years, we have mostly used RDBMSs
So good but so COMPLEX
Hard to SCALE
2005: “One size fits all”: An idea whose time has come and gone
Called for “Scale OUT” design
Cheap, easy
5. Why Membase
Membase = So-called Memcached + persistent storage
Membase = A distributed caching system + persistent storage
Membase speaks the Memcached protocol
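As an illustration of what speaking the Memcached protocol means on the wire, the sketch below builds `set` and `get` requests in the memcached text protocol. The helper names `build_set` and `build_get` are hypothetical; only the command layout comes from the protocol itself, and no network connection is made.

```python
def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Build a text-protocol storage command: set <key> <flags> <exptime> <bytes>."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    # The data block follows the header and is terminated by CRLF.
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Build a text-protocol retrieval command: get <key>."""
    return f"get {key}\r\n".encode("ascii")


request = build_set("greeting", b"hello")
```

Because the requests have the same shape, an application written against a Memcached client can in principle be pointed at a Membase node without changing its GET/SET calls.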
6. MEMBASE = SIMPLE, FAST, ELASTIC
Simple
2 primitives: GET(key), SET(key, value)
Fast
Cost for I/O routing: O(1)
Give me a key, I know exactly where to go
Elastic
Scale UP and DOWN freely
Scale from 1 to thousands of machines
Fault-tolerance
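The O(1) routing claim can be sketched as a hash followed by a table lookup. The version below is a simplification under stated assumptions: the bucket count `NUM_VBUCKETS`, the node names in `SERVERS`, the round-robin `VBUCKET_MAP`, and the plain `crc32 % N` hash are all illustrative choices, not Membase's exact hash function or cluster map.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed bucket count, chosen for illustration
SERVERS = ["node-a:11211", "node-b:11211", "node-c:11211"]  # hypothetical nodes

# Bucket map: index is the bucket id, value is the index of the owning server.
# A round-robin assignment stands in for a real cluster map here.
VBUCKET_MAP = [vb % len(SERVERS) for vb in range(NUM_VBUCKETS)]


def vbucket_of(key: str) -> int:
    """Hash the key to a bucket id in constant time."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def server_for(key: str) -> str:
    """One hash plus one table lookup: no search, no coordinator round trip."""
    return SERVERS[VBUCKET_MAP[vbucket_of(key)]]


target = server_for("user:42")
```

In this design, adding or removing a machine only rewrites entries of the map; the key-to-bucket hash never changes, which is what makes elastic scaling compatible with constant-time routing.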
12. Membase’s design choices
CAP theorem: Pick 2 out of 3
Consistency
Availability
Partition tolerance
Membase is CA
Do we really need strong consistency?
13. Strong consistency
Pessimistic replication may be costly
A write blocks until the data is completely replicated
A single master node coordinates reads and writes
Lower I/O performance under concurrency
Synchronous replication scheme
If one replica fails, the I/O fails
Proposal: use different consistency models depending on the application
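The cost of pessimistic replication can be sketched by contrasting a synchronous write with an asynchronous one. Everything here (`Replica`, `write_sync`, `write_async`, the `pending` queue) is a hypothetical toy model for illustration, not Membase's replication code.

```python
class Replica:
    """Toy replica that can be marked unreachable."""

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive
        self.data = {}

    def apply(self, key, value):
        if not self.alive:
            raise ConnectionError(f"{self.name} is unreachable")
        self.data[key] = value


def write_sync(replicas, key, value):
    """Pessimistic: the write succeeds only if every replica applies it."""
    for replica in replicas:
        replica.apply(key, value)  # raises on the first failed replica
    return True


def write_async(master, others, key, value, pending):
    """Optimistic: acknowledge after the master write, replicate later."""
    master.apply(key, value)
    for replica in others:
        pending.append((replica, key, value))  # to be drained in the background
    return True


replicas = [Replica("a"), Replica("b", alive=False)]
try:
    sync_ok = write_sync(replicas, "k", "v1")
except ConnectionError:
    sync_ok = False

pending = []
async_ok = write_async(replicas[0], replicas[1:], "k", "v2", pending)
```

With one replica down, the synchronous write fails while the asynchronous one is acknowledged, at the price of replicas that may lag behind the master. Note that this toy `write_sync` does not roll back replicas that already applied the write; a real scheme would have to handle that case.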
14. Data migration & replication
LRU algorithm
Replication factor is configurable per (key, value)?
Vbucket
Re-replication in case of failure?
“Anti-entropy” replica synchronisation?
Proposal: application-aware migration works best
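One way to read the LRU point above is as the policy that decides which items stay in RAM and which live only on persistent storage. The sketch below assumes that reading; `LruCacheWithBackingStore` is a hypothetical class, and a plain dict stands in for the disk.

```python
from collections import OrderedDict


class LruCacheWithBackingStore:
    """Toy model: bounded RAM cache with LRU eviction over a persistent store."""

    def __init__(self, capacity, disk=None):
        self.capacity = capacity
        self.memory = OrderedDict()  # least recently used item comes first
        self.disk = {} if disk is None else disk  # stand-in for persistent storage

    def _admit(self, key, value):
        # Place the item in RAM as most recently used, then evict if over capacity.
        self.memory[key] = value
        self.memory.move_to_end(key)
        while len(self.memory) > self.capacity:
            self.memory.popitem(last=False)  # drop the least recently used item

    def set(self, key, value):
        self.disk[key] = value  # write-through, so eviction never loses data
        self._admit(key, value)

    def get(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)  # a hit refreshes recency
            return self.memory[key]
        if key in self.disk:
            value = self.disk[key]
            self._admit(key, value)  # bring the item back into RAM
            return value
        return None


cache = LruCacheWithBackingStore(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # "a" becomes most recently used
cache.set("c", 3)  # evicts "b" from RAM; it stays on disk
```

An application-aware policy would replace the recency rule in `_admit` with knowledge of which keys the application is about to need.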
15. Cluster management
One single node is elected as cluster leader
Runs efficiently only in a single-cluster environment
High load on the leader at large scale
Rebalancing?
Permanent failure vs temporary failure?
“Node capacity-aware” load balancing?
Heartbeat frequency should be well configured
Depending on cluster size and network type
Efficiency of leader election algorithm?
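The questions about heartbeat frequency and temporary versus permanent failure can be made concrete with a small failure detector. This is a sketch under assumptions: `HeartbeatMonitor`, its two thresholds, and the lowest-name election rule in `elect_leader` are hypothetical and are not Membase's cluster manager. Timestamps are passed in explicitly so the behaviour is deterministic.

```python
class HeartbeatMonitor:
    """Toy failure detector driven by explicit timestamps (in seconds)."""

    def __init__(self, suspect_after, dead_after):
        self.suspect_after = suspect_after  # silence treated as a temporary failure
        self.dead_after = dead_after        # silence treated as a permanent failure
        self.last_seen = {}

    def heartbeat(self, node, now):
        self.last_seen[node] = now

    def status(self, node, now):
        silence = now - self.last_seen[node]
        if silence >= self.dead_after:
            return "dead"
        if silence >= self.suspect_after:
            return "suspect"
        return "alive"

    def elect_leader(self, now):
        # Illustrative rule only: the alive node with the smallest name wins.
        alive = [n for n in self.last_seen if self.status(n, now) == "alive"]
        return min(alive) if alive else None


monitor = HeartbeatMonitor(suspect_after=3, dead_after=10)
monitor.heartbeat("node-a", now=0)
monitor.heartbeat("node-b", now=0)
monitor.heartbeat("node-b", now=4)  # node-a has gone silent
leader = monitor.elect_leader(now=5)
```

The two thresholds are the tuning knobs the slide refers to: on a large cluster or a slow network, values that are too small cause false suspicions, while values that are too large delay re-replication and re-election.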
16. Conclusion
Pros
In production at many companies
Well-known API
Cons
Not so well documented
Maybe better documented in the source code?
Some key techniques should be clarified
“One size fits all” has come and gone: Design patterns
Application-aware
Infrastructure-aware
Human resource-aware