Container Independent failover framework - Mobicents Summit 2011




  1. Container Independent Failover Framework - Mobicents 2011 Summit
  2. Agenda
     - Current Scaling Limitations
     - DataGrid and Eventual Consistency
  3. Current Scaling Limitations
  4. Cluster Replication Default
     - Total replication: each node has to store a copy of the state of the N other nodes in the cluster.
     - Involves keeping 100% of the state consistent across the entire cluster.
     - Becomes very expensive quickly; it scales poorly and degrades performance fast.
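The cost of total replication can be made concrete with a back-of-the-envelope sketch (the node count and per-node state size are made-up illustrative numbers):

```python
def per_node_memory_mb(nodes: int, state_per_node_mb: float) -> float:
    """With total replication every node holds its own state plus a copy
    of every other node's state, i.e. the whole cluster's state."""
    return nodes * state_per_node_mb

# Each node's footprint grows linearly with the cluster size:
assert per_node_memory_mb(4, 100) == 400.0
assert per_node_memory_mb(8, 100) == 800.0   # doubling the cluster doubles it
```

This is why adding nodes under total replication buys little extra capacity: every new node adds state that all existing nodes must also store and keep in sync.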
  5. Buddy Replication
     - Data is replicated to a finite number of nodes in the cluster rather than to the entire cluster.
     - Each node holds its own data plus the backup data of N (configurable) other node(s).
     - Data is replicated only to the buddy nodes.
     - Synchronous and asynchronous replication modes are supported.
     - If a node fails, its data is still backed up on its buddies.
     - If failover lands on a non-buddy node that looks for the data, the data "gravitates" to that node, which then owns the data and acts as the new backup node.
     (diagram: nodes A B C D E; arrows indicate replication direction)
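A minimal sketch of buddy replication with one buddy per node, including gravitation when failover lands on a non-buddy. The class and method names are illustrative, not the JBoss Cache API:

```python
class BuddyCluster:
    """Toy buddy-replication ring: each node backs up its data on the
    next node in the ring (replication direction A -> B -> C ...)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.own = {n: {} for n in nodes}      # data owned by each node
        self.backup = {n: {} for n in nodes}   # backup copies held by each node

    def buddy_of(self, node):
        return self.nodes[(self.nodes.index(node) + 1) % len(self.nodes)]

    def put(self, node, key, value):
        self.own[node][key] = value
        self.backup[self.buddy_of(node)][key] = value  # replicate to buddy only

    def failover(self, failed, takeover):
        """Failover to a non-buddy: the data 'gravitates' from the buddy's
        backup to the takeover node, which becomes the new owner."""
        buddy = self.buddy_of(failed)
        data = dict(self.backup[buddy])
        self.backup[buddy].clear()
        self.own[takeover].update(data)
        for k, v in data.items():              # takeover's buddy now backs it up
            self.backup[self.buddy_of(takeover)][k] = v


c = BuddyCluster(["A", "B", "C", "D", "E"])
c.put("A", "call-1", "session-state")
assert c.backup["B"] == {"call-1": "session-state"}  # only the buddy has a copy
assert c.backup["C"] == {}
c.failover(failed="A", takeover="D")                 # D is not A's buddy
assert c.own["D"]["call-1"] == "session-state"       # data gravitated to D
assert c.backup["E"]["call-1"] == "session-state"    # E is D's buddy
```

The `failover` method also shows why gravitation is costly: it moves the whole data set over the network at the worst possible moment, right after a failure, regardless of the takeover node's current load.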
  6. Buddy Replication Limitations
     - Sounded promising for controlling memory growth and network utilization, and for allowing larger scaling.
     - Gravitation is expensive and uncontrolled with regard to current server load; it can make the entire cluster fail.
  7. N-Node Mini Clusters
     - Scales very well.
     - Higher management and operational costs.
     - Firing up a new instance may often require firing up a new cluster instead.
     - If an entire cluster goes down, calls fail.
     - Requires a SIP load balancer aware of the topology.
  8. DataGrids and Eventual Consistency
  9. CAP (Brewer's) Theorem
     - Consistency: all nodes see the same data at the same time.
     - Availability: a guarantee that every request receives a response about whether it was successful or failed.
     - Partition tolerance: the system continues to operate despite arbitrary message loss.
     - A distributed system can satisfy any two of these guarantees at the same time, but not all three.
  10. Drop Partition Tolerance
  11. Drop Availability: on a network partition failure, all affected nodes wait until the partition is whole again before replying, thus losing availability.
  12. Drop Consistency: on a network partition failure, consistency is broken.
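The trade-off on the three slides above can be sketched with a toy replica that loses its link to the rest of the cluster: a CP-style replica refuses to answer while partitioned, an AP-style one answers with possibly stale data. Purely illustrative, not any real system's behavior:

```python
class Replica:
    def __init__(self, value, mode):
        self.value = value
        self.mode = mode          # "CP" sacrifices availability, "AP" consistency
        self.partitioned = False

    def read(self):
        if self.partitioned and self.mode == "CP":
            raise TimeoutError("waiting for partition to heal")  # unavailable
        return self.value          # possibly stale while partitioned in AP mode


primary = Replica("v1", "AP")
backup_cp = Replica("v1", "CP")
backup_ap = Replica("v1", "AP")

# A partition occurs, then the primary accepts a write the backups never see:
backup_cp.partitioned = backup_ap.partitioned = True
primary.value = "v2"

try:
    backup_cp.read()               # CP: blocks/fails -> availability dropped
    cp_answered = True
except TimeoutError:
    cp_answered = False
assert not cp_answered

assert backup_ap.read() == "v1"    # AP: replies, but stale -> consistency dropped
assert primary.value == "v2"
```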
  13. DataGrid vs JBoss Cache
     - A DataGrid is similar to JBoss Cache Buddy Replication, except:
       - transaction atomicity is not preserved
       - data is eventually replicated (as with JBoss Cache in asynchronous mode)
     - Loosening consistency allows much better scaling and better overall performance, as it is less intensive in terms of node synchronization.
     - Failover to the correct buddy is ensured by consistent hashing of the keys.
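How consistent hashing routes failover to the right owner can be sketched with a minimal hash ring: when a node fails, only the keys it owned move, so surviving nodes keep serving their keys from the same place. MD5 is used here only as a stable hash; real grids such as Infinispan use their own hash functions and virtual nodes:

```python
import bisect
import hashlib


def _h(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)


class HashRing:
    """Minimal consistent-hash ring: each key belongs to the first node
    clockwise from the key's position on the ring."""

    def __init__(self, nodes):
        self.ring = sorted((_h(n), n) for n in nodes)

    def node_for(self, key):
        hashes = [h for h, _ in self.ring]
        i = bisect.bisect(hashes, _h(key)) % len(self.ring)
        return self.ring[i][1]

    def remove(self, node):
        self.ring = [(h, n) for h, n in self.ring if n != node]


ring = HashRing(["A", "B", "C", "D", "E"])
before = {k: ring.node_for(k) for k in (f"call-{i}" for i in range(20))}

failed = ring.node_for("call-0")
ring.remove(failed)                      # the owner of call-0 dies

for key, owner in before.items():
    if owner != failed:                  # keys on surviving nodes don't move,
        assert ring.node_for(key) == owner
assert ring.node_for("call-0") != failed # only the failed node's keys relocate
```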
  14. Customer and Community HA Use Cases
     - No high availability: one node.
     - High availability with no replication: up to hundreds of nodes.
     - High availability with replication but controlled load: a fixed, low number of nodes (< 5).
       - Usually needs strong consistency.
     - High availability with replication and uncontrolled load: up to hundreds of nodes and more.
       - Eventual consistency can't be avoided.
     - Geolocation failover needs to be added to the last two items.
       - Eventual consistency will typically help here.
  15. Peer-to-Peer Mode with DIST and L1 Cache
     The application container or the apps need to access sessions, dialogs and other objects in the distributed store. Each container maintains a quick-lookup local cache of deserialized objects; in SIP Servlets this is implemented for sessions. If that fails (the session needed by the app is not in the local cache), the container looks it up in Infinispan. Each Infinispan instance has a data store with items hashed to that specific node (DIST-assigned serialized data); this is where the majority of the memory is consumed. However, the data in this store is distributed based on hash values and is not cached based on LRU or any other common caching rule. To help with that, Infinispan allows enabling an L1 cache, which applies an LRU policy to commonly accessed items that are missing from the DIST-assigned store.
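The lookup path described above can be sketched as a small LRU tier in front of the DIST-assigned store. The names are illustrative, not Infinispan's API, and a plain dict stands in for the remote store:

```python
from collections import OrderedDict


class L1Lookup:
    """Toy two-tier lookup: an LRU L1 of hot items in front of the
    DIST-assigned store (a dict standing in for the remote grid node)."""

    def __init__(self, dist_store, l1_capacity=2):
        self.dist = dist_store
        self.l1 = OrderedDict()
        self.capacity = l1_capacity
        self.remote_lookups = 0

    def get(self, key):
        if key in self.l1:                 # hot item: no remote lookup needed
            self.l1.move_to_end(key)
            return self.l1[key]
        self.remote_lookups += 1           # miss: fetch from the DIST store
        value = self.dist[key]
        self.l1[key] = value
        if len(self.l1) > self.capacity:   # evict the least recently used item
            self.l1.popitem(last=False)
        return value


store = L1Lookup({"s1": "a", "s2": "b", "s3": "c"}, l1_capacity=2)
store.get("s1")
store.get("s2")
store.get("s1")                       # served from L1, no remote round trip
assert store.remote_lookups == 2
store.get("s3")                       # evicts s2, the least recently used
assert "s2" not in store.l1 and "s1" in store.l1
```

The point of the L1 tier is exactly what the slide describes: hash-based placement decides where data lives, while recency of access decides what is worth keeping close to the application.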
  16. Data Grid HA Model
     In this model the data is primarily stored in a remote data grid, partitioned by some key value. Communication with the grid happens only over the network, with protocols such as Hot Rod, memcached or Thrift. Caching deserialized data would be very difficult in this model if concurrent changes are allowed: protocols for network lookup and transport in grid systems rarely deliver asynchronous events to clients (here the local node is the grid's client), so we are never eagerly notified that a session has changed and cannot know whether the cached deserialized local copy is still usable. An L1 cache for serialized chunks may offer a way to check whether the data is up to date, but even without locking the data this still requires at least one network request. Each grid protocol supports different operations and features that can be used to optimize network access and support caching more efficiently. Local cache invalidation is harder in this model because the HA grid architecture is not guaranteed to provide notifications that keep the caches up to date with remote changes.
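Without push-based invalidation from the grid, the best a client-side cache can do is validate an entry before using it, which still costs one network round trip per check. A sketch of that pattern; the version-returning call is an assumption for illustration, not a specific Hot Rod, memcached or Thrift operation:

```python
class RemoteGrid:
    """Stand-in for a remote data grid: every call counts as a network trip."""

    def __init__(self):
        self.data = {}        # key -> (version, value)
        self.trips = 0

    def put(self, key, value):
        self.trips += 1
        version = self.data.get(key, (0, None))[0] + 1
        self.data[key] = (version, value)

    def version(self, key):
        self.trips += 1       # even a freshness check is a round trip
        return self.data[key][0]

    def get(self, key):
        self.trips += 1
        return self.data[key]


class ValidatingCache:
    """Local cache that must validate each entry against the grid, since
    the grid pushes no invalidation events to its clients."""

    def __init__(self, grid):
        self.grid = grid
        self.local = {}       # key -> (version, value)

    def get(self, key):
        cached = self.local.get(key)
        if cached and cached[0] == self.grid.version(key):
            return cached[1]                # cached copy is still valid
        self.local[key] = self.grid.get(key)
        return self.local[key][1]


grid = RemoteGrid()
grid.put("dialog-1", "state-a")
cache = ValidatingCache(grid)
assert cache.get("dialog-1") == "state-a"   # initial fetch
trips_before = grid.trips
assert cache.get("dialog-1") == "state-a"   # still valid, yet one trip spent
assert grid.trips == trips_before + 1
grid.put("dialog-1", "state-b")             # concurrent remote change
assert cache.get("dialog-1") == "state-b"   # caught only via version mismatch
```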
  17. Thank you!