VMware vFabric GemFire for high performance, resilient distributed apps

Learn how VMware vFabric GemFire helps build

Published in: Technology

  1. Building High Performance, Data Intensive, Resilient, Distributed Applications
  2. The Challenge: data explosion; decision time compression. It is critical to act within the time frame that matters: milliseconds, seconds, minutes, or even hours.
  3. Today’s Modern Architectures - Web Applications (diagram labels: Load Balancer, Web, Stateless, Application, Database Tier, Stateful, Storage Tier)
  4. Today’s Modern Architectures - Data Ingest Applications
  5. Agenda - Challenges and Solutions: High Performance & Data Intensive (Latency, Scale); Resilient (Reliability, Availability); Solutions
  8. Sources of Latency: disk access; serial access; network time; sockets (open/close); marshalling and unmarshalling; security overhead
  9. Sources of Latency - Disk Access (Mitre Public Release: 10-0861. Distribution Unlimited)
  10. Network Time: keep sockets open to all members (this also helps security performance); minimize network hops; push computing to the data
  11. Marshalling/Unmarshalling: lazy deserialization; index serialized data; shared compact data format
  12. Security Overhead: mutual authentication at socket time; process/user level (optional)
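
The lazy deserialization idea from the Marshalling/Unmarshalling slide can be sketched as follows. This is a minimal illustration, not GemFire's implementation: the `LazyValue` and `Store` names are hypothetical, and Python's `pickle` stands in for whatever compact wire format a real data grid would use. The point is only that a server can store and forward bytes without paying the deserialization cost until some code actually reads the object.

```python
import pickle


class LazyValue:
    """Keeps a value as serialized bytes and deserializes only on first read."""

    def __init__(self, payload: bytes):
        self.payload = payload
        self._value = None
        self._loaded = False
        self.deserialize_count = 0  # exposed only to make the behavior observable

    def get(self):
        if not self._loaded:
            self._value = pickle.loads(self.payload)
            self._loaded = True
            self.deserialize_count += 1
        return self._value


class Store:
    """Hypothetical server-side store that holds entries in serialized form."""

    def __init__(self):
        self._entries = {}

    def put_serialized(self, key, payload: bytes):
        self._entries[key] = LazyValue(payload)

    def forward(self, key) -> bytes:
        # Forwarding to another member needs only the bytes, so nothing is deserialized.
        return self._entries[key].payload

    def get(self, key):
        return self._entries[key].get()

    def deserialize_count(self, key) -> int:
        return self._entries[key].deserialize_count
```

Forwarding an entry any number of times leaves `deserialize_count` at zero; the first `get` raises it to one and later reads reuse the cached object.
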
  13. Agenda - Challenges and Solutions: High Performance & Data Intensive (Latency, Scale); Resilient (Reliability, Availability); Solutions
  16. Architecting Infinitely Scalable Systems: a seminal paper on the architecture of elastic applications by Pat Helland (Tandem Computers, Amazon.com, Microsoft), “Life Beyond Distributed Transactions: an Apostate’s Opinion”, http://www.cidrdb.org/cidr2007/papers/cidr07p15.pdf, http://blogs.msdn.com/b/pathelland/. Application architectures need to change to achieve infinite scalability and elasticity without using large hardware.
  17. Scale - Layered Code: the common layered architecture in the largest-scale applications. Top layer: scale-agnostic code. Abstraction layer: programming abstraction. Bottom layer: scale-aware code, which understands that the application is distributed.
  18. Scale: shared nothing; partitioning/sharding; collocated relations; replicated reference data
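
The partitioning ideas on the Scale slide (shared nothing, sharding, collocated relations, replicated reference data) can be sketched roughly as below. Everything here is an assumption made for illustration: the `Cluster` class, the CRC32 hash, and the choice of customers, orders, and products as example data sets are not taken from GemFire's API. Orders are routed by their customer's ID so that related rows land on the same node, while small reference data is copied to every node.

```python
import zlib


class Cluster:
    """Hypothetical shared-nothing cluster; each node owns its own dictionaries."""

    def __init__(self, node_count: int):
        self.nodes = [
            {"customers": {}, "orders": {}, "products": {}}
            for _ in range(node_count)
        ]

    def owner_of(self, customer_id: str) -> int:
        # A stable hash of the routing key picks exactly one owning node.
        return zlib.crc32(customer_id.encode("utf-8")) % len(self.nodes)

    def put_customer(self, customer_id: str, customer: dict):
        self.nodes[self.owner_of(customer_id)]["customers"][customer_id] = customer

    def put_order(self, order_id: str, customer_id: str, order: dict):
        # Collocation: an order is routed by its customer's ID, not its own ID.
        node = self.nodes[self.owner_of(customer_id)]
        node["orders"][order_id] = (customer_id, order)

    def put_product(self, product_id: str, product: dict):
        # Reference data is replicated to every node so joins stay local.
        for node in self.nodes:
            node["products"][product_id] = product

    def orders_for(self, customer_id: str) -> list:
        # Answered entirely on the owning node, with no cross-node hop.
        node = self.nodes[self.owner_of(customer_id)]
        return [o for (cid, o) in node["orders"].values() if cid == customer_id]
```

Because a customer and its orders share a node, `orders_for` never has to contact a second member, which is the property the "minimize network hops" advice depends on.
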
  19. Agenda - Challenges and Solutions: High Performance & Data Intensive (Latency, Scale); Resilient (Reliability, Availability); Solutions
  22. Reliability: no data loss; no data corruption; consistency; race conditions; synchronous vs. asynchronous
  23. No Data Loss, No Data Corruption, Consistency: distributed semaphore (lightweight); primary copy; distributed transaction(s) (heavyweight); MVCC (multiversion concurrency control). Acronyms are annoying.
  24. Race Conditions, Consistency (diagram labels: Eventually Consistent, Stateless ?, Application, Data Tier, Stateful, Controllably Consistent)
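
The "primary copy" approach named on the consistency slide can be sketched as follows. This is an illustrative model under stated assumptions, not the product's protocol: the `Replica` and `PrimaryCopy` classes and the per-key integer version are hypothetical. All writes for a key pass through one primary, which assigns an increasing version and applies the update synchronously to every secondary; a secondary ignores anything older than what it already holds, which removes the race between concurrent writers.

```python
class Replica:
    """Holds key -> (version, value) and refuses stale updates."""

    def __init__(self):
        self.data = {}

    def apply(self, key, version: int, value) -> bool:
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)
            return True
        return False  # stale or duplicate update, ignored


class PrimaryCopy:
    """Single writer for its keys; orders updates and copies them to secondaries."""

    def __init__(self, secondaries):
        self.local = Replica()
        self.secondaries = list(secondaries)

    def put(self, key, value) -> int:
        current = self.local.data.get(key)
        version = 1 if current is None else current[0] + 1
        self.local.apply(key, version, value)
        # Synchronous replication: put() returns only after every copy is updated.
        for secondary in self.secondaries:
            secondary.apply(key, version, value)
        return version
```

A late-arriving copy of an old update (for example, version 1 after version 2 has been applied) is rejected by `apply`, so every replica converges on the primary's order.
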
  25. Agenda - Challenges and Solutions: High Performance & Data Intensive (Latency, Scale); Resilient (Reliability, Availability); Solutions
  28. Availability - on Server: protect data; extra copies; disk?; data center crashes; network splits; split-brain detection
  29. Availability – Between Client/Server: slow consumers; HA queues; client network drops; durable subscribers
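
The slides list split-brain detection without describing a mechanism. One common approach, shown here purely as an assumption and not as GemFire's actual algorithm, is a majority quorum: after a network split, a side keeps serving only if it can still see more than half of the last known membership. The function name and member labels are hypothetical.

```python
def has_quorum(visible_members, all_members) -> bool:
    """Return True if the visible members form a strict majority of the cluster."""
    known = set(all_members)
    visible = set(visible_members) & known
    # Strict majority: a tie (for example 2 of 4) is treated as no quorum,
    # so two halves of an even split can never both stay active.
    return len(visible) * 2 > len(known)


def decide(visible_members, all_members) -> str:
    """Hypothetical policy: keep serving with quorum, otherwise stop accepting writes."""
    return "serve" if has_quorum(visible_members, all_members) else "shutdown"
```

With five members split three and two, only the side of three continues; the side of two stops accepting writes and so cannot diverge.
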
  30. Agenda - Challenges and Solutions: High Performance & Data Intensive (Latency, Scale); Resilient (Reliability, Availability); Solutions
  33. Latency & Reliability - Memory-based Performance. Perform: use memory on a peer machine to make data updates durable; writes return 10x to 100x faster than disk (10s to 100s of microseconds vs. 10s to 100s of milliseconds). Diagram labels: Customers, Orders, Product. Protect: keep redundant copies of data; update through the primary; zero data loss; optionally write updates to disk; optionally write to a data warehouse asynchronously and reliably.
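
The write path described on this slide (a synchronous redundant copy in a peer's memory, with disk or warehouse writes deferred) can be sketched as below. The `WriteBehindStore` class is hypothetical, the peer is modeled as a plain dictionary, and the slow sink as a list; a real system would drain the queue on a background thread rather than through an explicit `flush` call, which is used here only to keep the example deterministic.

```python
from collections import deque


class WriteBehindStore:
    """In-memory store with a synchronous peer copy and an asynchronous slow sink."""

    def __init__(self, peer: dict, sink: list):
        self.memory = {}
        self.peer = peer          # stands in for memory on another machine
        self.sink = sink          # stands in for disk or a data warehouse
        self.pending = deque()    # ordered queue of updates not yet written to the sink

    def put(self, key, value):
        self.memory[key] = value
        self.peer[key] = value    # redundant copy made before put() returns
        self.pending.append((key, value))  # slow write is deferred

    def flush(self) -> int:
        """Drain the queue to the sink in arrival order; returns the number written."""
        written = 0
        while self.pending:
            self.sink.append(self.pending.popleft())
            written += 1
        return written
```

The caller's latency is bounded by the memory-to-memory copy, while the sink still receives every update in order once the queue drains.
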
  34. Memory-based Performance. Perform: in situ data processing; real-time controls. Calculate: current total fuel left.
  35. Latency - Data-Aware Access. Perform: application client (Java, C++, .NET, SQL)
  36. Latency & Reliability - HA Data-Aware Function. Execute: data-aware function; client; move behavior to the data
  37. Parallel Queries. Compute: client; scatter-gather queries & functions
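
The scatter-gather pattern from the last few slides, applied to the "current total fuel left" calculation mentioned earlier, might look like the sketch below. The partition layout, the vehicle keys, and both function names are assumptions made for illustration; threads stand in for remote cluster members. Each partition computes a partial sum over its own data, and the client only combines the small partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def local_fuel_total(partition: dict) -> float:
    """Runs next to the data: sums fuel for the entries one member owns."""
    return sum(partition.values())


def total_fuel(partitions: list) -> float:
    """Scatter the function to every partition, then gather and combine the partials."""
    if not partitions:
        return 0.0
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(local_fuel_total, partitions))
    return sum(partials)
```

For example, `total_fuel([{"truck-1": 40.0, "truck-2": 15.5}, {"truck-3": 22.0}, {}])` sends three small tasks out and moves three numbers back, instead of shipping every entry to the client.
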
  38. Data Distribution. Distribute: keep clusters synchronized in real time; operate reliably in disconnected, intermittent, and low-bandwidth network environments.
  39. Distributed Events. Notify: targeted, guaranteed delivery; event notification & continuous queries; disconnected, intermittent, and low-bandwidth network environments.
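
A continuous query, as named on the Distributed Events slide, can be thought of as a predicate registered once and evaluated against every later update, with only matching events pushed to the subscriber. The sketch below is a single-process model under that assumption; the `Region` class and its method names are hypothetical and do not reproduce GemFire's API, and guaranteed delivery over an unreliable network is out of scope here.

```python
class Region:
    """Hypothetical key-value region that notifies registered continuous queries."""

    def __init__(self):
        self._data = {}
        self._queries = []  # list of (predicate, listener) pairs

    def register_cq(self, predicate, listener):
        """Register a predicate; the listener is called for each matching update."""
        self._queries.append((predicate, listener))

    def put(self, key, value):
        self._data[key] = value
        # Targeted notification: only subscribers whose predicate matches are called.
        for predicate, listener in self._queries:
            if predicate(key, value):
                listener(key, value)
```

A subscriber interested in low fuel levels would register something like `lambda k, v: v < 10.0` and receive only those updates, rather than polling the whole region.
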
  40. Cloud Ready. Soar: Load Balancer; Web Tier; Application Tier; GemFire JAR 11 MB (or less); optional reliable, asynchronous feed to a data warehouse or archival database
  41. Thank you
