Membase Introduction

Introduction to Membase, presented at SD Forum Cloud SIG on Oct. 26, 2010.

  • 1. membase.org: The Simple, Fast, Elastic NoSQL Database Membase, Inc. Matt Ingenthron matt@membase.com
  • 2. Membase is an Open Source distributed, key-value database management system optimized for storing data behind interactive web applications. All aspects of membase are simple, fast and elastic by design.
  • 3. Value. Image courtesy http://www.flickr.com/photos/vintagedept/3617706196/
  • 4. Simple. Image courtesy http://www.flickr.com/photos/brenda-starr/3509344100/sizes/m/in/photostream/
  • 5. Simple (with a replica). Image courtesy http://www.flickr.com/photos/brenda-starr/3509344100/sizes/m/in/photostream/
  • 6. Fast • Original use case: speed up access to authoritative data as a distributed hashtable • Must be at least as fast as a highly tuned DBMS • Designed for modern datacenter substrate – Designed for VM and cloud deployments
  • 7. Elastic • Add nodes without losing access to data • Maintain consistency when accessing data – membase is a CP type system • Scale linearly by just adding more nodes
  • 8. Before: Application scales linearly, data hits wall • Application Scales Out: Just add more commodity web servers • Database Scales Up: Get a bigger, more complex server
  • 9. Membase is a distributed database [Diagram labels: Application user; Web application server; Membase Servers; In the data center; On the administrator console]
  • 10. Built-in Memcached Caching Layer [Diagram labels: Memcached; Membase Database; Memcached Mode; Membase Mode] Fact: Membase development team has also contributed over half of the code to the Memcached project.
  • 11. Proven at small, and extra large scale • Leading cloud service (PaaS) provider – Over 65,000 hosted applications – Over 2,000 users to date – Membase Server serving over 3,000 Heroku customers • Social game leader (FarmVille, Mafia Wars, Café World) – Over 230 million monthly users – Membase Server is the 500,000 ops-per-second database behind FarmVille and Café World
  • 12. After: Data layer scales like application logic layer • Data layer now scales with linear cost and constant performance. • Application Scales Out: Just add more commodity web servers • Database Scales Out (Membase Servers): Just add more commodity data servers • Scaling out flattens the cost and performance curves.
  • 13. Who?
  • 14. Fault-tolerant memcached cluster at NHN, the biggest web portal in Korea
  • 15. What is Project Arcus? • Memcached – Common protocol across PHP, Java, C applications • Moxi (Memcached proxy) based • In-house automatic fault-detection and failover solution • Collectd-based monitoring • Proxy and cache server administration UI • Private cloud service
  • 16. Previous Deployments • A few individual memcached installations • Problems – No fault-tolerance • Hardware failures are common (heat, network switch failure, etc.) – No automatic scalability • To add / remove a memcached server, they need to rebuild code, distribute, and restart all clients
  • 17. Today • Memcached clusters – Fault-tolerance transparent to clients • Consistent hashing in moxi (memcached proxy) – Cache As A Service (CaaS) • All major services in NHN started using cache • Multitenancy across cache services
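The consistent hashing mentioned for moxi can be illustrated with a minimal sketch. This is not moxi's implementation: the `ConsistentHashRing` class, the MD5 hash, the server names, and the `points_per_server` parameter are all assumptions chosen for illustration. It shows the property that matters for the "Today" slide: when a cache server leaves, only the keys it owned move.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Toy consistent-hash ring (hypothetical; not moxi's actual code)."""

    def __init__(self, servers, points_per_server=100):
        # Each server is placed on the ring at many pseudo-random points.
        ring = []
        for server in servers:
            for i in range(points_per_server):
                ring.append((self._hash(f"{server}#{i}"), server))
        ring.sort()
        self._ring = ring
        self._hashes = [h for h, _ in ring]

    @staticmethod
    def _hash(value):
        # MD5 is an assumption; any well-mixed hash works for the sketch.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def server_for(self, key):
        # Walk clockwise to the first point at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


if __name__ == "__main__":
    full = ConsistentHashRing(["cache1", "cache2", "cache3"])
    reduced = ConsistentHashRing(["cache1", "cache2"])  # cache3 has failed
    keys = [f"user:{i}" for i in range(1000)]
    moved = sum(full.server_for(k) != reduced.server_for(k) for k in keys)
    print(f"{moved} of {len(keys)} keys changed servers")
```

Keys that were not on the removed server keep their placement, which is why clients do not need to be rebuilt or restarted when the server list changes.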
  • 18. Performance impact [Chart: panels "Performance" and "DB Load"; axis labels "Response Time" and "Throughput"; values shown: X 16.6, 50 %, X 10, 34 %]
  • 19. Membase-Cloudera Partnership “AOL serves more than 5 billion impressions per day from our ad serving platforms, and any incremental improvement in processing time translates to huge benefits in our ability to more effectively serve the ads needed to meet our contractual commitments. Traditional databases like MySQL lack the scalability required to support our goal of five milliseconds per read/write. Creating user profiles with Hadoop, then serving them from Membase, reduces profile read and write access to under a millisecond, leaving the bulk of the processing time budget for improved targeting and customization.” Pero Subasic, Chief Architect, AOL
  • 20. Membase-Cloudera Partnership • Joint development of bi-directional software integration between Membase and Hadoop – Membase NodeCode Module streaming interface to Cloudera Distribution for Hadoop via Flume interface – Sqoop-derived command line utility for bi-directional batch movement of data between Membase and Cloudera Distribution for Hadoop • Joint marketing and sales of integrated distributed OLTP-OLAP solution – Membase: the distributed OLTP solution – Cloudera: the distributed OLAP solution • Cloudera to distribute integration
  • 21. Customer use case – Ad targeting • 40 milliseconds to come up with an answer. [Diagram: three numbered steps (1, 2, 3); labels: events; profiles, campaigns; profiles, real time campaign statistics]
  • 22. Demo
  • 23. The Guts. Photo courtesy http://www.flickr.com/photos/pellis/76804760/
  • 24. Clustering • Underlying cluster functionality based on Erlang/OTP • Uses a custom, vector-clock-based way of storing and propagating: – Cluster topology – vBucket mapping • Collect statistics from many nodes of the cluster – Identify hot keys, resource utilization
  • 25. vBucket mapping
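The vBucket mapping slide survives only as a title, so the following is a hedged sketch of the general idea rather than Membase's actual algorithm. It assumes a key hashes to a fixed vBucket id (CRC32 modulo the vBucket count is an assumption, as are the count of 1024 and the function names), and that a separate table maps each vBucket to a server. Adding a node then means reassigning table entries, not rehashing keys.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed count, for illustration only


def vbucket_for(key, num_vbuckets=NUM_VBUCKETS):
    """Map a key to a vBucket id; depends only on the key and the count."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def build_map(servers, num_vbuckets=NUM_VBUCKETS):
    """Spread vBuckets round-robin over servers (a hypothetical layout)."""
    return [servers[v % len(servers)] for v in range(num_vbuckets)]


def server_for(key, vbucket_map):
    """Two-step lookup: key -> vBucket id -> server."""
    return vbucket_map[vbucket_for(key, len(vbucket_map))]


if __name__ == "__main__":
    vmap = build_map(["node-a:11210", "node-b:11210"])
    key = "user:42"
    vb = vbucket_for(key)
    print(key, "-> vBucket", vb, "->", server_for(key, vmap))

    # Moving one vBucket to a new node changes the table entry only;
    # the key still hashes to the same vBucket id.
    vmap[vb] = "node-c:11210"
    print(key, "-> vBucket", vbucket_for(key), "->", server_for(key, vmap))
```

Because clients only need the current table, the cluster can propagate a new map and clients follow it without any change to how keys are hashed.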
  • 26. TAP • A generic, scalable method of streaming mutations from a given server – As data operations arrive, they can be sent to arbitrary TAP receivers • Leverages the existing memcached engine interface, and the non-blocking IO interfaces to send data • Three modes of operation [Diagram labels: Data Mutations; Working set]
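The idea of streaming mutations to arbitrary receivers can be sketched in a few lines. This is an in-process toy, not the TAP wire protocol: `TapSource`, `register`, the `backfill` flag, and `apply_events` are hypothetical names, and the flag is not meant to reproduce the slide's three modes exactly. It only shows a receiver seeing either future mutations alone or existing data followed by future mutations.

```python
from collections import deque


class TapSource:
    """Toy key-value store that streams its mutations to receivers."""

    def __init__(self):
        self._data = {}
        self._receivers = []

    def register(self, receiver, backfill=False):
        # With backfill, replay existing items before any live mutations.
        if backfill:
            for key, value in self._data.items():
                receiver.append(("set", key, value))
        self._receivers.append(receiver)

    def set(self, key, value):
        self._data[key] = value
        self._publish(("set", key, value))

    def delete(self, key):
        self._data.pop(key, None)
        self._publish(("delete", key, None))

    def _publish(self, event):
        # Every registered receiver gets every mutation, in order.
        for receiver in self._receivers:
            receiver.append(event)


def apply_events(events):
    """Rebuild a copy of the data (e.g. a replica) from a mutation stream."""
    copy = {}
    for op, key, value in events:
        if op == "set":
            copy[key] = value
        else:
            copy.pop(key, None)
    return copy


if __name__ == "__main__":
    source = TapSource()
    source.set("a", 1)
    replica_stream = deque()
    source.register(replica_stream, backfill=True)
    source.set("b", 2)
    source.delete("a")
    print(apply_events(replica_stream))
```

A receiver that replays the stream ends up with the same contents as the source, which is the basis for uses such as replication or moving data between nodes.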
  • 27. Disk > Memory • Dataset may have many items infrequently accessed. • However, memcached has different behavior (LRU) than wanted with membase. • Still, traditional (most) RDBMS implementations are not 100% correct for us either. • The speed of a miss is very, very important. [Diagram: Bucket Configuration with memory quota, mem_high_wat, mem_low_wat]
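The slide names `mem_high_wat` and `mem_low_wat` without describing their behavior, so the sketch below assumes the usual high/low watermark reading: once resident memory exceeds the high mark, items are ejected from memory until usage falls to the low mark, while every item stays on disk. The `WatermarkCache` class, the dict standing in for disk, and the least-recently-used ejection order are illustrative assumptions, not Membase's actual policy.

```python
from collections import OrderedDict


class WatermarkCache:
    """Toy memory layer over a 'disk' dict, with high/low watermarks."""

    def __init__(self, mem_high_wat, mem_low_wat):
        if not 0 <= mem_low_wat <= mem_high_wat:
            raise ValueError("need 0 <= mem_low_wat <= mem_high_wat")
        self.mem_high_wat = mem_high_wat
        self.mem_low_wat = mem_low_wat
        self._resident = OrderedDict()  # in-memory items, oldest access first
        self._disk = {}                 # stand-in for the persistent store
        self.used = 0                   # bytes of resident values

    def set(self, key, value):
        self._disk[key] = value         # everything is persisted
        self._load(key, value)

    def get(self, key):
        if key in self._resident:       # hit: served from memory
            self._resident.move_to_end(key)
            return self._resident[key]
        value = self._disk[key]         # miss: fetched from disk
        self._load(key, value)
        return value

    def resident_keys(self):
        return list(self._resident)

    def _load(self, key, value):
        if key in self._resident:
            self.used -= len(self._resident.pop(key))
        self._resident[key] = value
        self.used += len(value)
        if self.used > self.mem_high_wat:
            self._eject_until_low()

    def _eject_until_low(self):
        # Eject from memory only; the items remain readable from disk.
        while self.used > self.mem_low_wat and self._resident:
            _, value = self._resident.popitem(last=False)
            self.used -= len(value)


if __name__ == "__main__":
    cache = WatermarkCache(mem_high_wat=10, mem_low_wat=6)
    for key in ("a", "b", "c"):
        cache.set(key, b"1234")
    print(cache.resident_keys(), cache.used)
    print(cache.get("a"))  # a miss, served from the disk stand-in
```

Unlike a pure cache, an ejected item is not lost; a later `get` pays the cost of a disk read instead, which is why the slide stresses the speed of a miss.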
  • 28. Clients, nodes and other nodes [Diagram labels: Client; moxi + Client; port 11210; port 11211; memcached operations; REST/comet (cluster topology and vbucket map); moxi; ns_server; membase (memcached + membase engine); TAP (memcached operations with tap commands); vbucketmigrator]