Membase Meetup 2010



  1. 1. Membase Meetup<br />Technical Overview and Architectural Discussion<br />
  2. 2. Tonight<br />Membase Overview<br />Use Cases and Deployment Examples<br />Membase Architecture<br />Demo!<br />Developing with Membase<br />A Glimpse into the Future<br />
  3. 3. What is Membase?<br />
  4. 4. Before: Application scales linearly, data hits a wall<br />Application Scales Out<br />Just add more commodity web servers<br />Database Scales Up<br />Get a bigger, more complex server<br />
  5. 5. Membase is a distributed database<br />Application user<br />Web application server<br />Membase Servers<br />In the data center<br />On the administrator console<br />
  6. 6. Built-in Memcached Caching Layer<br />Memcached<br />Membase Database<br />Memcached Mode<br />Membase Mode<br />The Membase development team has contributed over half of the source code to the Memcached project.<br />
  7. 7. Deployment options<br />
  8. 8. Secure multitenant support<br />Application user<br />Web application server<br />Bucket 1<br />Bucket 2<br />Aggregate Cluster Memory and Disk Capacity<br />Membase data servers<br />In the data center<br />On the administrator console<br />
  9. 9. Membase is Simple, Fast, Elastic<br />Five minutes or less to a working cluster<br />Downloads for Linux and Windows<br />Start with a single node<br />One button press joins nodes to a cluster<br />Easy to develop against<br />Just SET and GET – no schema required<br />Drop it in. 10,000+ existing applications already “speak membase” (via memcached)<br />Practically every language and application framework is supported, out of the box<br />Easy to manage<br />One-click failover and cluster rebalancing<br />Graphical and programmatic interfaces<br />Configurable alerting<br />
  10. 10. Membase is Simple, Fast, Elastic<br />Predictable<br />“Never keep an application waiting”<br />Quasi-deterministic latency and throughput<br />Low latency<br />Built-in Memcached technology<br />Auto-migration of hot data to lowest latency storage technology (RAM, SSD, Disk)<br />Selectable write behavior – asynchronous, synchronous (on replication, persistence)<br />High throughput<br />Multi-threaded<br />Low lock contention<br />Asynchronous wherever possible<br />Automatic write de-duplication<br />
  11. 11. Membase is Simple, Fast, Elastic<br />Zero-downtime elasticity<br />Spread I/O and data across commodity servers (or VMs)<br />Consistent performance with linear cost<br />Dynamic rebalancing of a live cluster<br />All nodes are created equal<br />No special case nodes<br />Clone to grow<br />Extensible<br />Filtered TAP interface provides hook points for external systems (e.g. full-text search, backup, warehouse)<br />Data bucket – engine API for specialized container types<br />Membase NodeCode [FUTURE]<br />
  12. 12. Proven at small and extra-large scale<br />Leading cloud service (PaaS) provider<br />Over 65,000 hosted applications<br />Membase Server supporting over 3,000 Heroku customers<br />Social game leader – FarmVille, Mafia Wars, Café World<br />Over 230 million monthly users<br />Zynga is a core contributor to and large-scale user of Membase Server<br />
  13. 13. After: Data layer scales like application logic layer<br />Data layer now scales with linear cost and constant performance.<br />Application Scales Out<br />Just add more commodity web servers<br />Membase Servers<br />Database Scales Out<br />Just add more commodity data servers<br />Scaling out flattens the cost and performance curves.<br />
  14. 14. Membase - A practical path to “NoSQL” adoption<br />
  15. 15. Use Cases<br />
  16. 16. Where does Membase fit?<br />Applications with many, many users<br />Online, dynamic applications<br />Applications with growing datasets that need quick access, and a growing audience<br />
  17. 17. Use case – Ad targeting<br />40 milliseconds to come up with an answer.<br />Diagram labels: events; profiles, campaigns; profiles, real-time campaign statistics<br />
  18. 18. Search and Gaming Portal<br />Database<br />
  19. 19. Membase Architecture<br />
  20. 20. Clustering<br />Underlying cluster functionality based on Erlang/OTP<br />Have a custom, vector-clock-based way of storing and propagating...<br />Cluster topology<br />vBucket mapping<br />Collect statistics from many nodes of the cluster<br />Identify hot keys, resource utilization<br />
  21. 21. 21<br />
  22. 22. TAP<br />A generic, scalable method of streaming mutations from a given server<br />As data operations arrive, they can be sent to arbitrary TAP receivers<br />Leverages the existing memcached engine interface, and the non-blocking I/O interfaces to send data<br />Three modes of operation<br />
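The idea of streaming mutations to arbitrary receivers can be sketched in a few lines. This is only an in-process illustration, not the TAP wire protocol: `TapLikeStore`, `register`, and the key filter are hypothetical names, and delivery here is synchronous, whereas the slide describes sending data over non-blocking I/O.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types for illustration; not part of any Membase API.
Mutation = Tuple[str, str, Optional[bytes]]  # (operation, key, value)
Receiver = Callable[[Mutation], None]
KeyFilter = Callable[[str], bool]


class TapLikeStore:
    """In-memory stand-in for a server that streams its mutations."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}
        self._receivers: List[Tuple[Receiver, KeyFilter]] = []

    def register(self, receiver: Receiver,
                 key_filter: KeyFilter = lambda key: True) -> None:
        # Each receiver may supply a filter so it only sees keys it cares about.
        self._receivers.append((receiver, key_filter))

    def _publish(self, mutation: Mutation) -> None:
        for receiver, key_filter in self._receivers:
            if key_filter(mutation[1]):
                receiver(mutation)

    def set(self, key: str, value: bytes) -> None:
        self._data[key] = value
        self._publish(("set", key, value))

    def delete(self, key: str) -> None:
        # Only publish a delete if the key actually existed.
        if self._data.pop(key, None) is not None:
            self._publish(("delete", key, None))


store = TapLikeStore()
backup_log: List[Mutation] = []   # e.g. a backup system that wants everything
search_log: List[Mutation] = []   # e.g. a search indexer that wants only documents
store.register(backup_log.append)
store.register(search_log.append, key_filter=lambda key: key.startswith("doc:"))

store.set("doc:1", b"hello")
store.set("session:9", b"state")
store.delete("doc:1")
```

The unfiltered receiver sees all three mutations, while the filtered one sees only those for keys beginning with `doc:`.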
  23. 23. Membase Architecture<br />11211<br />11210<br />memcapable 1.0<br />memcapable 2.0<br />moxi<br />memcached<br />protocol listener/sender<br />REST management API/Web UI<br />vBucket state and replication manager<br />Node health monitor<br />Rebalance orchestrator<br />Heartbeat<br />Process monitor<br />Global singleton supervisor<br />Configuration manager<br />engine interface<br />membase storage engine<br />http<br />on each node<br />one per cluster<br />Erlang/OTP<br />HTTP<br />distributed Erlang<br />Erlang port mapper<br />21100 – 21199<br />4369<br />8080<br />
  24. 24. Clients, nodes and other nodes<br />
  25. 25. Data buckets are secure membase “slices”<br />Application user<br />Web application server<br />Bucket 1<br />Bucket 2<br />Aggregate Cluster Memory and Disk Capacity<br />Membase data servers<br />In the data center<br />On the administrator console<br />
  26. 26. Membase Architecture<br />11211<br />11210<br />memcapable 1.0<br />memcapable 2.0<br />moxi<br />memcached<br />protocol listener/sender<br />REST management API/Web UI<br />vBucket state and replication manager<br />Rebalance orchestrator<br />Node health monitor<br />Data Manager<br />Cluster Manager<br />Global singleton supervisor<br />Configuration manager<br />Heartbeat<br />Process monitor<br />engine interface<br />membase storage engine<br />http<br />on each node<br />one per cluster<br />Erlang/OTP<br />HTTP<br />distributed Erlang<br />Erlang port mapper<br />21100 – 21199<br />4369<br />8080<br />
  27. 27. vBucket mapping<br />
  28. 28. Membase “write” data flow – application view<br />1. User action results in the need to change the VALUE of KEY<br />2. Application updates key’s VALUE, performs SET operation<br />3. Membase (memcached) client hashes KEY, identifies KEY’s master server<br />4. SET request sent over network to master server<br />5. Membase replicates KEY-VALUE pair, caches it in memory and stores it to disk<br />
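A minimal sketch of the client-side part of the write path: hash the key to a vBucket, then look up that vBucket's master server in a map. The vBucket count, the server addresses, and the use of CRC32 are assumptions made for illustration; the slides do not state the actual hash function or map layout.

```python
import zlib
from typing import Dict, List

NUM_VBUCKETS = 16  # assumed value for illustration; the slides do not give one

# Hypothetical cluster of three servers; vBuckets are assigned round-robin.
SERVERS: List[str] = ["10.0.0.1:11210", "10.0.0.2:11210", "10.0.0.3:11210"]
VBUCKET_MAP: Dict[int, str] = {
    vb: SERVERS[vb % len(SERVERS)] for vb in range(NUM_VBUCKETS)
}


def vbucket_for(key: str) -> int:
    """Hash the key to a vBucket id (CRC32 is used here as a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def master_for(key: str) -> str:
    """Look up the server that currently masters the key's vBucket."""
    return VBUCKET_MAP[vbucket_for(key)]


def rebalance(vbucket: int, new_server: str) -> None:
    """Moving a vBucket only changes the map; key hashing stays the same."""
    VBUCKET_MAP[vbucket] = new_server


key = "user:42"
vb = vbucket_for(key)
old_master = master_for(key)

# Simulate a rebalance that moves this vBucket to a newly added (hypothetical) node.
rebalance(vb, "10.0.0.4:11210")
new_master = master_for(key)
```

The indirection is the point of the design: the key always hashes to the same vBucket, so adding a server only requires updating the vBucket map rather than rehashing every key.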
  29. 29. Disk > Memory<br />A dataset may hold many infrequently accessed items. However, memcached’s LRU eviction is not the behavior wanted with Membase.<br />Still, most traditional RDBMS implementations are not 100% right for us either. The speed of a miss is very, very important.<br />
  30. 30. Membase Demo<br />
  31. 31. Thanks!<br />
  32. 32. Key-Value Patterns<br />
  33. 33. Key-Value<br />(with a replica)<br />Items have:<br />Key<br />Value<br />Expiration<br />Flags<br />CAS (more on this later)<br />Operations include:<br />Get/Set<br />Increment/Decrement<br />Append/Prepend<br />
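The item fields listed above can be pictured as a small record. The sketch below is an assumption-laden illustration: the `Item` class, the flag values, and the JSON encoding are hypothetical, chosen only to show how a client might use the flags field to remember how it serialized a value that the server treats as opaque bytes.

```python
import json
from dataclasses import dataclass
from typing import Any

# Hypothetical flag values; real client libraries define their own conventions.
FLAG_RAW = 0
FLAG_JSON = 1


@dataclass
class Item:
    key: str
    value: bytes      # the server stores opaque bytes
    expiration: int   # seconds; 0 means "never expires" in this sketch
    flags: int        # opaque to the server, interpreted by the client
    cas: int          # version stamp used for optimistic locking


def encode(key: str, obj: Any, expiration: int = 0) -> Item:
    """Serialize obj to bytes and record the format in the flags field."""
    if isinstance(obj, bytes):
        return Item(key, obj, expiration, FLAG_RAW, cas=0)
    return Item(key, json.dumps(obj).encode("utf-8"), expiration, FLAG_JSON, cas=0)


def decode(item: Item) -> Any:
    """Use the flags field to decide how to turn the bytes back into an object."""
    if item.flags == FLAG_JSON:
        return json.loads(item.value.decode("utf-8"))
    return item.value


profile = encode("user:42", {"name": "Ada", "level": 3}, expiration=3600)
blob = encode("thumb:42", b"\x00\x01\x02")
```

Decoding `profile` yields the original dictionary, while `blob` comes back as the same raw bytes.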
  34. 34. Membase Datatypes<br />“Any customer can have a car painted any colour that he wants so long as it is black.”<br />byte[]<br />Does your data have 1s and 0s?<br /><ul><li>Items do have flags</li><li>Many clients use flags</li><li>Data type options</li><li>Google protobuf</li><li>Thrift</li><li>Avro</li></ul>
  35. 35. Transactions<br />Lock == slow me down<br />CAS operations<br />Optimistic locking<br />Very useful with complex datatypes<br />Imagine two clients trying to update a complex item<br />You’re likely using CAS already... if you use a CPU<br />User 1<br />User 2<br />Success<br />Fail!<br />
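The two-user scenario on the Transactions slide can be sketched with an in-memory stand-in for the server. `CasStore` is a hypothetical class written for this illustration; its `gets` and `cas` methods are named after the memcached commands of the same name, but this is not a client library and the CAS numbering scheme is an assumption.

```python
from typing import Dict, Optional, Tuple


class CasStore:
    """In-memory stand-in for a server that tags each item with a CAS value."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[bytes, int]] = {}
        self._next_cas = 1

    def set(self, key: str, value: bytes) -> int:
        # Every successful write gives the item a new CAS value.
        cas = self._next_cas
        self._next_cas += 1
        self._items[key] = (value, cas)
        return cas

    def gets(self, key: str) -> Optional[Tuple[bytes, int]]:
        """Return (value, cas) so the caller can attempt a conditional write."""
        return self._items.get(key)

    def cas(self, key: str, value: bytes, expected_cas: int) -> bool:
        """Store value only if the item is unchanged since expected_cas was read."""
        current = self._items.get(key)
        if current is None or current[1] != expected_cas:
            return False
        self.set(key, value)
        return True


store = CasStore()
store.set("cart:7", b"apple")

# User 1 and User 2 both read the same version of the item.
value1, cas1 = store.gets("cart:7")
value2, cas2 = store.gets("cart:7")

user1_ok = store.cas("cart:7", value1 + b",pear", cas1)  # first writer succeeds
user2_ok = store.cas("cart:7", value2 + b",kiwi", cas2)  # stale CAS value is rejected

# User 2 recovers by re-reading and retrying against the fresh CAS value.
fresh_value, fresh_cas = store.gets("cart:7")
retry_ok = store.cas("cart:7", fresh_value + b",kiwi", fresh_cas)
final_value, _ = store.gets("cart:7")
```

No lock is ever held: the losing writer simply learns that its copy was stale and retries, which is why the slide contrasts CAS with "Lock == slow me down".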
  36. 36. Common Use: Sessions<br />Web user sessions<br />Read-heavy, with fewer writes in many cases<br />Protocol advantage of memcached<br />Options already for PHP, Ruby and Java<br />Application state<br />Not necessarily “entity” style things<br />May be appropriate for a “cache” pool<br />
  37. 37. Common Use (cache): Rate Limiting<br />Want to provide API calls into the system<br />Twitter search<br />Google search services<br />Use the atomic increment<br />Set an item with a unique ID<br />Upon API request, increment and check<br />HTTP 420: go away and come back later<br />Your Users<br />¡Ouch!<br />Your App<br />
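The increment-and-check pattern from the rate-limiting slide, sketched against an in-memory counter store. The limit, the window length, the `rate:` key prefix, and the `CounterStore` class are assumptions for illustration. As a simplification, `incr` here creates a missing counter itself, whereas the slide describes setting the item first and then incrementing it on each request.

```python
import time
from typing import Callable, Dict, Tuple

LIMIT = 3            # assumed quota per window
WINDOW_SECONDS = 60  # assumed window length


class CounterStore:
    """In-memory stand-in for a store with expiring items and atomic increment."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._counters: Dict[str, Tuple[int, float]] = {}  # key -> (count, expires_at)

    def incr(self, key: str, ttl: float) -> int:
        """Increment the counter, creating it with a TTL if absent or expired."""
        now = self._clock()
        count, expires_at = self._counters.get(key, (0, 0.0))
        if expires_at <= now:
            count, expires_at = 0, now + ttl
        count += 1
        self._counters[key] = (count, expires_at)
        return count


def handle_request(store: CounterStore, client_id: str) -> int:
    """Return an HTTP status: 200 while under the limit, 420 once over it."""
    count = store.incr("rate:" + client_id, WINDOW_SECONDS)
    return 200 if count <= LIMIT else 420


# A fake clock keeps the example deterministic.
fake_now = [1000.0]
store = CounterStore(clock=lambda: fake_now[0])

statuses = [handle_request(store, "app-123") for _ in range(5)]

# Once the window has elapsed, the counter expires and requests are allowed again.
fake_now[0] += WINDOW_SECONDS
status_after_window = handle_request(store, "app-123")
```

Letting the counter expire with the window means no separate cleanup job is needed; the store forgets idle clients on its own.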
  38. 38. Looking Ahead: NodeCode<br />Frank Weigel, Membase<br />
  39. 39. NodeCode – Motivation<br />Beyond key-value<br /><ul><li>Indexing/Range Queries</li><li>Advanced Data Structures</li><li>Sub-object direct manipulation</li></ul>Validation and In-flight transformation<br /><ul><li>Block mutations failing validation</li><li>Enrich or transform objects</li></ul>Connectors (Integrate easily with other systems)<br /><ul><li>Solr</li><li>Hadoop</li><li>MySQL</li></ul>
  40. 40. NodeCode - What is it?<br />Method for extending & customizing Membase<br />Separate code modules<br />Defined interface to datapath and cluster manager<br />Notification on events<br /><ul><li>Synchronous</li><li>Asynchronous</li></ul>
  41. 41. NodeCode – Drivers<br />Simple<br /><ul><li>Packaged modules for easy install and enable</li><li>Library of “off the shelf” modules</li><li>Module monitoring</li><li>Straightforward development and debugging</li></ul>Fast<br /><ul><li>Low latency/high-throughput</li><li>Per-bucket process isolation</li><li>Don’t break data manager performance/correctness</li></ul>Elastic<br /><ul><li>Automatically migrate and instantiate on rebalance</li><li>Provide support for migration of internal data</li><li>Leverage native Membase engine for internal data storage</li></ul>
  42. 42. Block-level architecture<br />
  43. 43. NodeCode 1.0 Plans<br />Java only<br /><ul><li>jar format</li></ul>Must implement minimal module API<br /><ul><li>Initial module startup</li><li>Module removal</li><li>Association with bucket</li></ul>NodeCode library helper functions<br /><ul><li>Register synchronous & asynchronous listeners/callbacks</li><li>Register protocol extension/callbacks</li><li>Register rebalance callback</li><li>Register cluster manager event callbacks</li><li>Membase data access</li></ul>
  45. 45. Q&A<br />
  46. 46. Attributions<br />