Membase Meetup - Silicon Valley


Published in: Technology

  1. Silicon Valley
  2. Tonight
     • Membase Overview
     • Use Cases and Deployment Examples
     • Membase Architecture
     • Demo!
     • Developing with Membase
     • A Glimpse into the Future
  3. What is Membase?
  4. Before: Application scales linearly, data hits wall
     • Application scales out: just add more commodity web servers
     • Database scales up: get a bigger, more complex server
  5. Membase is a distributed database
     • Diagram labels: application user; web application server; Membase servers (in the data center); administrator console
  6. Built-in Memcached Caching Layer
     • Diagram labels: Memcached, Membase Database; Memcached Mode, Membase Mode
     • The Membase development team has contributed over half of the source code to the Memcached project.
  7. Deployment options
     • Deployment Option 1: embedded proxy
     • Deployment Option 2: standalone proxy
     • Deployment Option 3: “vBucket-aware” client
     • Diagram labels: application logic, OTC memcached client, NEW memcached client, server list, proxy, vBucket map, data operations, cluster operations, ports 11211 and 11210
  8. Secure multitenant support
     • Diagram labels: application user; web application server; Membase data servers (in the data center); administrator console; Bucket 1, Bucket 2; aggregate cluster memory and disk capacity
  9. Membase is Simple, Fast, Elastic
     Five minutes or less to a working cluster
       • Downloads for Linux and Windows
       • Start with a single node
       • One button press joins nodes to a cluster
     Easy to develop against
       • Just SET and GET – no schema required
       • Drop it in. 10,000+ existing applications already “speak membase” (via memcached)
       • Practically every language and application framework is supported, out of the box
     Easy to manage
       • One-click failover and cluster rebalancing
       • Graphical and programmatic interfaces
       • Configurable alerting
  10. Membase is Simple, Fast, Elastic
      Predictable
        • “Never keep an application waiting”
        • Quasi-deterministic latency and throughput
      Low latency
        • Built-in Memcached technology
        • Auto-migration of hot data to lowest latency storage technology (RAM, SSD, Disk)
        • Selectable write behavior – asynchronous, synchronous (on replication, persistence)
      High throughput
        • Multi-threaded
        • Low lock contention
        • Asynchronous wherever possible
        • Automatic write de-duplication
  11. Membase is Simple, Fast, Elastic
      Zero-downtime elasticity
        • Spread I/O and data across commodity servers (or VMs)
        • Consistent performance with linear cost
        • Dynamic rebalancing of a live cluster
      All nodes are created equal
        • No special case nodes
        • Clone to grow
      Extensible
        • Filtered TAP interface provides hook points for external systems (e.g. full-text search, backup, warehouse)
        • Data bucket – engine API for specialized container types
        • Membase NodeCode [FUTURE]
  12. Proven at small, and extra large scale
      Leading cloud service (PaaS) provider
        • Over 65,000 hosted applications
        • Membase Server supporting over 3,000 Heroku customers
      Social game leader – FarmVille, Mafia Wars, Café World
        • Over 230 million monthly users
        • Zynga is a core contributor to and large-scale user of Membase Server
  13. After: Data layer scales like application logic layer
      • Data layer now scales with linear cost and constant performance.
      • Application scales out: just add more commodity web servers
      • Database scales out: just add more commodity data servers (Membase servers)
      • Scaling out flattens the cost and performance curves.
  14. Membase - A practical path to “NoSQL” adoption
  15. Use Cases
  16. Leading Membase Deployments
      Leading cloud service (PaaS) provider
        • Over 65,000 hosted applications
        • Membase Server serving over 1,200 Heroku customers (as of June 10, 2010)
      Social game leader – FarmVille, Mafia Wars, Café World
        • Over 230 million monthly users
        • Membase Server is the 500,000 ops-per-second database behind FarmVille and Café World
  17. Use case – Ad targeting
      • Diagram labels: events; profiles, campaigns; profiles, real-time campaign statistics
      • 40 milliseconds to come up with an answer.
  18. Search and Gaming Portal Database
  19. Targeting at ShareThis
  20. About ShareThis
      • Largest integrated sharing network
      • We make sharing simple, engaging & valuable
      • Powerful Social Analytics & Audience Monetization
      • 450 million consumers per month
      • ~850 thousand sites
      • 50+ social channels
  21. This is how it works
      • Diagram labels: Log Files, Search Keywords, Page Views, Sharing Behavior, HDFS, Map/Reduce, Content Analysis, Taxonomy, Ad Server, User, Membase
  22. ShareThis Ad Products
  23. Membase Architecture
  24. Clustering
      • Underlying cluster functionality based on Erlang/OTP
      • Have a custom, vector-clock-based way of storing and propagating...
        – Cluster topology
        – vBucket mapping
      • Collect statistics from many nodes of the cluster
        – Identify hot keys, resource utilization
  25.
  26. TAP
      • A generic, scalable method of streaming mutations from a given server
        – As data operations arrive, they can be sent to arbitrary TAP receivers
      • Leverages the existing memcached engine interface, and the non-blocking IO interfaces to send data
      • Three modes of operation
  27. Clients, nodes and other nodes
  28. Data buckets are secure membase “slices”
      • Diagram labels: application user; web application server; Membase data servers (in the data center); administrator console; Bucket 1, Bucket 2; aggregate cluster memory and disk capacity
  29. vBucket mapping
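
The transcript keeps only the title of the vBucket mapping slide, so the following is a minimal sketch of the general idea rather than Membase's actual implementation. It assumes a key is hashed to one of a fixed number of vBuckets and that a separate vBucket map assigns each vBucket to a server. The hash function (CRC32), the vBucket count, the round-robin assignment, and all function names are assumptions chosen for illustration.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed count, for illustration only


def vbucket_for_key(key: str, num_vbuckets: int = NUM_VBUCKETS) -> int:
    """Hash a key to a vBucket id. CRC32 is an assumed choice of hash."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def build_vbucket_map(servers, num_vbuckets: int = NUM_VBUCKETS):
    """Assign vBuckets to servers round-robin (a hypothetical policy)."""
    return [servers[i % len(servers)] for i in range(num_vbuckets)]


def server_for_key(key: str, vbucket_map) -> str:
    """Two-step lookup: key -> vBucket id -> server that owns the vBucket."""
    return vbucket_map[vbucket_for_key(key, len(vbucket_map))]


servers = ["10.0.0.1:11210", "10.0.0.2:11210"]  # hypothetical addresses
vbucket_map = build_vbucket_map(servers)
print(server_for_key("user:42", vbucket_map))
```

The point of the indirection is that a key's vBucket id depends only on the key. Adding a server changes the map, not the hash, which is what lets a cluster move whole vBuckets during a rebalance while clients keep hashing the same way.
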
  30. Disk > Memory
      • A dataset may have many infrequently accessed items.
      • However, memcached's eviction behavior (LRU) is not what is wanted with Membase.
      • Most traditional RDBMS implementations are not quite right for us either.
      • The speed of a miss is very, very important.
  31. Membase Demo
  32. Thanks!
  33. Key-Value Patterns
  34. Key-Value (with a replica)
      Items have:
        • Key
        • Value
        • Expiration
        • Flags
        • CAS (more on this later)
      Operations include:
        • Get/Set
        • Increment/Decrement
        • Append/Prepend
  35. Membase Datatypes
      • byte[]
        – Does your data have 1s and 0s?
        – “Any customer can have a car painted any colour that he wants so long as it is black.”
      • Items do have flags
        – Many clients use flags
        – Data type options
          • Google protobuf
          • Thrift
          • Avro
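
Since the server stores only bytes, a client can use the item's flags field to remember how a value was serialized. The sketch below illustrates that convention under stated assumptions: the flag values and the `encode`/`decode` helpers are hypothetical, and JSON from the standard library stands in for the formats the slide names (protobuf, Thrift, Avro).

```python
import json

# Hypothetical flag values; real client libraries each define their own.
FLAG_BYTES = 0
FLAG_TEXT = 1
FLAG_JSON = 2


def encode(value):
    """Serialize a value to bytes and pick the flag that records how."""
    if isinstance(value, bytes):
        return value, FLAG_BYTES
    if isinstance(value, str):
        return value.encode("utf-8"), FLAG_TEXT
    return json.dumps(value).encode("utf-8"), FLAG_JSON


def decode(data: bytes, flags: int):
    """Reverse encode() by dispatching on the stored flag."""
    if flags == FLAG_BYTES:
        return data
    if flags == FLAG_TEXT:
        return data.decode("utf-8")
    if flags == FLAG_JSON:
        return json.loads(data.decode("utf-8"))
    raise ValueError(f"unknown flags value: {flags}")


data, flags = encode({"name": "alice", "visits": 3})
print(decode(data, flags))
```

Because each client library picks its own flag values, two different clients reading the same item may not agree on how to decode it unless they share a convention.
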
  36. Transactions
      • Lock == slow me down
      • CAS operations
        – Optimistic locking
      • Very useful with complex datatypes
        – Imagine two clients trying to update a complex item
      • You’re likely using CAS already... if you use a CPU
      • Diagram labels: User 1, User 2
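
The optimistic-locking pattern on this slide can be sketched with a tiny in-memory stand-in for the server. `TinyStore` and `update_with_cas` are hypothetical names for illustration, not a real client API. The idea is that every write gives the item a new CAS value, and a conditional write succeeds only if the caller still holds the current one.

```python
import itertools


class TinyStore:
    """In-memory stand-in for a key-value server with CAS support."""

    def __init__(self):
        self._items = {}
        self._next_cas = itertools.count(1)

    def set(self, key, value):
        """Unconditional write; assigns the item a fresh CAS value."""
        cas_id = next(self._next_cas)
        self._items[key] = (value, cas_id)
        return cas_id

    def gets(self, key):
        """Return (value, cas_id) for the key."""
        return self._items[key]

    def cas(self, key, value, expected_cas):
        """Write only if nobody has changed the item since it was read."""
        _, current_cas = self._items[key]
        if current_cas != expected_cas:
            return False
        self.set(key, value)
        return True


def update_with_cas(store, key, mutate, max_attempts=10):
    """Read, modify, and conditionally write; retry when another writer wins."""
    for _ in range(max_attempts):
        value, cas_id = store.gets(key)
        if store.cas(key, mutate(value), cas_id):
            return True
    return False


store = TinyStore()
store.set("cart", ["apple"])
update_with_cas(store, "cart", lambda items: items + ["pear"])
print(store.gets("cart")[0])
```

No client ever blocks waiting for a lock. A writer that loses the race simply re-reads the item and tries again, which is cheap when conflicts are rare.
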
  37. Common Use: Sessions
      • Web user sessions
        – Read-heavy, with fewer writes in many cases
        – Protocol advantage of memcached
          • Options already for PHP, Ruby and Java
      • Application state
        – Not necessarily “entity” style things
        – May be appropriate for a “cache” pool
  38. Common Use (cache): Rate Limiting
      • Want to provide API calls into the system
        – Twitter search
        – Google search services
      • Use the atomic increment
        – Set an item with a unique ID
        – Upon API request, increment and check
          • HTTP 420: go away and come back later
      • Diagram labels: Your Users, Your App
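
The increment-and-check steps above can be sketched as follows. `CounterStore` is a hypothetical in-memory stand-in for the server's atomic increment, and the key layout, window length, and `check_rate_limit` helper are assumptions for illustration. Putting the time window into the key stands in for an item expiration, which a real deployment would more likely set on the counter.

```python
import time


class CounterStore:
    """In-memory stand-in for a server-side atomic increment."""

    def __init__(self):
        self._counts = {}

    def incr(self, key):
        # Creates the counter at 1 if missing; this is a simplification,
        # since a real server may require the item to be set first.
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]


def check_rate_limit(store, client_id, limit, window_seconds=60, now=None):
    """Return an HTTP status: 200 if under the limit, 420 if over it."""
    now = time.time() if now is None else now
    window = int(now // window_seconds)
    key = f"rate:{client_id}:{window}"  # hypothetical key layout
    count = store.incr(key)
    return 200 if count <= limit else 420


store = CounterStore()
for _ in range(4):
    print(check_rate_limit(store, "api-key-123", limit=3, now=1000))
```

In a shared deployment the counter lives on the server, so every application node increments the same item and the limit holds across the whole fleet.
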
  39. Looking Ahead: NodeCode
      Frank Weigel, Membase
  40. NodeCode – Motivation
      Beyond key-value
        • Indexing/Range Queries
        • Advanced Data Structures
        • Sub-object direct manipulation
      Validation and in-flight transformation
        • Block mutations failing validation
        • Enrich or transform objects
      Connectors (integrate easily with other systems)
        • Solr
        • Hadoop
        • MySQL
  41. NodeCode - What is it?
      • Method for extending & customizing Membase
      • Separate code modules
      • Defined interface to datapath and cluster manager
      • Notification on events
        – Synchronous
        – Asynchronous
  42. NodeCode – Drivers
      Simple
        • Packaged modules for easy install and enable
        • Library of “off the shelf” modules
        • Module monitoring
        • Straightforward development and debugging
      Fast
        • Low latency/high-throughput
        • Per-bucket process isolation
        • Don’t break data manager performance/correctness
      Elastic
        • Automatically migrate and instantiate on rebalance
        • Provide support for migration of internal data
        • Leverage native Membase engine for internal data storage
  43. Block-level architecture
  44. NodeCode 1.0 Plans
      • Java only – jar format
      • Must implement minimal module API
        – Initial module startup
        – Module removal
        – Association with bucket
      • NodeCode library helper functions
        – Register synchronous & asynchronous listeners/callbacks
        – Register protocol extension/callbacks
        – Register rebalance callback
        – Register cluster manager event callbacks
        – Membase data access
  45. Q&A
  46. Attributions
      • g_of_China.png
      • g_of_South_Korea.svg
      • g_of_Japan.svg