An Introduction To Space Based Architecture



Presentation at the 3rd Iranian JUG Meeting. Introduces the reasons behind scalable architecture, space-based architecture, and a case study using GigaSpaces.

Published in: Technology


  1. 1. An Introduction to Space Based Architecture. Amin Abbaspour, MAGFA IT Development Center
  2. 2. Agenda
      Title (slides)                               Time (min)
      Scalability, why and how (15)                10
      Space Based Architecture (6)                 5
      Java Spaces (4)                              5
      GigaSpaces (5)                               10
      Migrating Spring Apps to GigaSpaces (6)      5
      Case Study (2)                               10
      Conclusion (2)                               1
      Q/A                                          -
      Total: 40 slides                             45
  3. 3. Innovation Comes From Need: Application workload increases every day; this is inevitable. We expect fast and reliable software even under increasing workload. Speed and reliability can mean life or death for a business.
  4. 4. But Why So Much Workload? Today's software is no longer limited to operators and a small community. It interacts directly with millions of people and thousands of other software systems. ● Large-scale community sites such as Facebook, hi5, Twitter ● Prepaid telecoms ● Banking/XTP ● Online gaming ● Online fraud/risk management
  5. 5. Need For Speed A brokerage can lose up to $4 million per millisecond of latency. - The Tabb Group An additional 500 ms latency resulted in -20% traffic. - Google An additional 100 ms in latency resulted in -1% sales. - Amazon
  6. 6. Cost of Downtime: According to a 2004 Forrester survey of 235 companies, the hourly cost of downtime was:
      Percent of Companies   Hourly Cost
      33%                    $10K-100K
      25%                    $100K-500K
      13%                    $500K-1M
      4%                     >$1M
      25%                    Didn't know
  7. 7. Unpredictable Workload: ● How do you design and build applications that scale cost-effectively under such conditions? ● Without compromising reliability, performance and time-to-market?
  8. 8. Scalability: The solution is scalable software. Scalability gives us both speed and reliability. – Vertical scalability: more powerful machines lead to faster software. – Horizontal scalability: more boxes lead to faster and more reliable software. – Linear scalability: overall throughput = (number of processing units) * (throughput per unit). – Dynamic scalability: scale on demand (usually using some sort of provisioning and monitoring capabilities). We usually mean horizontal scalability, since it is more applicable and cost-effective. Budget is a great excuse.
  9. 9. Amdahl's Law: If, for example, only 10% of a given function in your program is synchronized, and the throughput of that function on a single CPU is 100 messages per second, then approaching a tenfold increase (towards 1,000 msg/sec) requires roughly 100 times the CPU resources. Amdahl's law gives a speedup of about 9.2 at 100 CPUs, and 10 is the theoretical ceiling. That is about 10 times more than would have been required if the code had no synchronized blocks.
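The arithmetic behind this slide follows from the usual statement of Amdahl's law, speedup(N) = 1 / (s + (1 - s) / N), where s is the serialized (synchronized) fraction and N the number of CPUs. The following is a minimal sketch for experimenting with that formula; the class and method names are illustrative and not taken from the presentation.

```java
// Sketch of Amdahl's law; AmdahlSketch and speedup are hypothetical names.
final class AmdahlSketch {

    /** Speedup on n processing units when serialFraction of the work cannot run in parallel. */
    static double speedup(double serialFraction, int n) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / n);
    }

    public static void main(String[] args) {
        double serial = 0.10; // 10% synchronized, as in the slide's example
        for (int n : new int[] {1, 10, 100, 1000}) {
            System.out.printf("%4d CPUs -> speedup %.2f%n", n, speedup(serial, n));
        }
    }
}
```

With a 10% serialized fraction the speedup can approach but never reach 10, no matter how many CPUs are added, which is why removing contention matters more than adding hardware.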
  10. 10. Scalability Wall: Non-scalable applications are expensive and risky. At some point the application will hit a wall: ● Application crashes ● Re-architecting the application every few months/years. Chart parameters: server cost 20,000; server throughput 1,000 tx/sec; contention 15%.
  11. 11. Amdahl's Law Consequences: To build scalable software, we should eliminate synchronized blocks, that is, eliminate bottlenecks and contention.
  12. 12. Do We Build Scalable Software?
  13. 13. Order Management Example
  14. 14. Need Availability? Things Get Worse
  15. 15. Tier Based Car-wash: ● Total CPH is the minimal CPH. ● A failure in any warehouse makes the whole business fail. ● Increasing performance requires budgeting for all three warehouses. ● Personnel with specialized capabilities.
  16. 16. All in One Car-wash: ● To increase CPH, simply add new warehouses. ● Better resource utilization. ● Each warehouse is independent. ● Fewer steps.
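The contrast between the two car-wash slides can be put in numbers. The sketch below assumes CPH means cars per hour (the slides do not spell it out) and uses made-up figures: in the tier-based layout the slowest stage caps throughput, while in the all-in-one layout the throughput of independent warehouses adds up.

```java
import java.util.Arrays;

// Illustrative only: class name, method names and CPH figures are assumptions.
final class CarWashThroughput {

    /** Tier-based: every car passes through every stage, so the slowest stage caps total CPH. */
    static int tierBasedCph(int... stageCph) {
        return Arrays.stream(stageCph).min().orElse(0);
    }

    /** All-in-one: each warehouse handles whole cars, so the CPH values add up. */
    static int allInOneCph(int... warehouseCph) {
        return Arrays.stream(warehouseCph).sum();
    }

    public static void main(String[] args) {
        System.out.println("tier-based, 3 stages:      " + tierBasedCph(30, 12, 20));
        System.out.println("all-in-one, 3 warehouses:  " + allInOneCph(10, 10, 10));
        System.out.println("all-in-one, 4 warehouses:  " + allInOneCph(10, 10, 10, 10));
    }
}
```

Upgrading the fastest tier changes nothing in the first model, whereas adding one more warehouse in the second model raises the total directly.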
  17. 17. Scaling Made Simple – Process Unit Design
  18. 18. Space-Based Architecture: ● Based on the Object Space computational model. ● Processing Unit – Self-sufficient unit of scale – Combination of data, processing and messaging ● Principles of partitioning ● Content-based routing ● Interaction model abstractions
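Partitioning with content-based routing can be sketched as follows: a value taken from the object itself (for example an account id) decides which processing unit receives it, so related data and processing stay together. The hash-modulo scheme and all names here are assumptions for illustration, not a description of how GigaSpaces routes internally.

```java
// Hypothetical router: maps a routing value to one of a fixed number of partitions.
final class ContentBasedRouter {
    private final int partitions;

    ContentBasedRouter(int partitions) {
        if (partitions <= 0) {
            throw new IllegalArgumentException("partitions must be positive");
        }
        this.partitions = partitions;
    }

    /** Returns a partition index in [0, partitions); equal routing values always map to the same index. */
    int partitionFor(Object routingValue) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(routingValue.hashCode(), partitions);
    }

    public static void main(String[] args) {
        ContentBasedRouter router = new ContentBasedRouter(4);
        for (String accountId : new String[] {"acct-1", "acct-2", "acct-1"}) {
            System.out.println(accountId + " -> partition " + router.partitionFor(accountId));
        }
    }
}
```

Because the same routing value always lands on the same partition, each processing unit can work on its own slice of the data without coordinating with the others.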
  19. 19. Inside A Processing Unit
  20. 20. Closer Look at PU
  21. 21. Parallel PUs – Bring Linear Scalability
  22. 22. The Ideal Scenario - “Write Once Scale Anywhere”: ● Scale out to get more processing power when volume increases. ● Through caching ● Parallelizing of TX ● Low-cost commodity resources ● Better utilization
  23. 23. Space Based Architecture – Theory Basics: ● Object Spaces is a paradigm for developing distributed computing applications. ● Spaces can be used to achieve scalability through parallel processing. ● Objects deposited in an Object Space are passive, i.e., their methods cannot be invoked while the objects are in the Object Space. ● This paradigm inherently provides mutual exclusion. ● The Linda coordination language was developed at Yale. ● Object Spaces are usually called Tuple Spaces, since a space consists of tuples that are unrelated to each other.
  24. 24. SBA Paradigm in Java; Sun Didn't (Re)invent The Wheel: ● Linda is a language and platform built on tuple spaces. ● The space model called for a plug-and-play infrastructure. – Jini was there ● So JavaSpaces was invented, based on the TupleSpaces paradigm and on top of the Jini platform. ● By the way, Java is not the only language to adopt the concept. Tuple spaces have been ported to many other languages and platforms, such as Python, Ruby, Scala, C, .NET, ...
  25. 25. Tuple/Java Spaces Basic Operations
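The basic tuple-space operations are commonly described as write (deposit a tuple), read (copy a matching tuple) and take (remove a matching tuple). The sketch below is a tiny in-memory stand-in written for illustration only; it is not the JavaSpaces API, it does not block or support leases and transactions, and the class name is hypothetical. A null field in a template acts as a wildcard, and because take removes the tuple atomically, only one caller can obtain it, which is where the mutual exclusion mentioned earlier comes from.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;

// Hypothetical, non-blocking tuple space for illustration; not the JavaSpaces API.
final class MiniTupleSpace {
    private final List<Object[]> tuples = new ArrayList<>();

    /** write: deposit a copy of the tuple; it stays passive until it is read or taken. */
    synchronized void write(Object... tuple) {
        tuples.add(tuple.clone());
    }

    /** read: return a copy of the first matching tuple and leave it in the space. */
    synchronized Optional<Object[]> read(Object... template) {
        for (Object[] t : tuples) {
            if (matches(template, t)) {
                return Optional.of(t.clone());
            }
        }
        return Optional.empty();
    }

    /** take: remove and return the first matching tuple, so only one caller can get it. */
    synchronized Optional<Object[]> take(Object... template) {
        Iterator<Object[]> it = tuples.iterator();
        while (it.hasNext()) {
            Object[] t = it.next();
            if (matches(template, t)) {
                it.remove();
                return Optional.of(t);
            }
        }
        return Optional.empty();
    }

    /** A null template field is a wildcard; every other field must be equal. */
    private static boolean matches(Object[] template, Object[] tuple) {
        if (template.length != tuple.length) {
            return false;
        }
        for (int i = 0; i < template.length; i++) {
            if (template[i] != null && !template[i].equals(tuple[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        MiniTupleSpace space = new MiniTupleSpace();
        space.write("order", 42, "NEW");
        System.out.println("read finds it:   " + space.read("order", null, null).isPresent());
        System.out.println("first take:      " + space.take("order", 42, null).isPresent());
        System.out.println("second take:     " + space.take("order", 42, null).isPresent());
    }
}
```

Workers that each take a task tuple, process it and write a result tuple back never need to share a lock, since the space itself hands every tuple to exactly one of them.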
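  26. 26. JavaSpaces Standard API
        import net.jini.core.entry.Entry;
        import net.jini.core.lease.Lease;
        import net.jini.space.JavaSpace;
        // An Entry class: entries need public fields and a public no-arg constructor.
        public class SpaceEntry implements Entry {
            public Integer count = 0;
            public String toString() { return "Count: " + count; }
        }
        public class Server {
            public static void main(String[] args) throws Exception {
                SpaceEntry entry = new SpaceEntry();
                // space() is the slide's own lookup helper; its body is not shown in the original.
                JavaSpace space = (JavaSpace) space();
                // Register and write the Entry into the Space
                space.write(entry, null, Lease.FOREVER);
                // Retrieve the Entry and check its state; the template (count = 0) matches the entry written above.
                SpaceEntry e = (SpaceEntry) space.read(new SpaceEntry(), null, Long.MAX_VALUE);
            }
        }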
  27. 27. Java Spaces Implementations: ● Sun RI (now River Project) ● Orbitz (running ● Blitz (open source) ● Openwings (?) ● Semispaces ● TSpaces (IBM's implementation) ● GigaSpaces
  28. 28. About GigaSpaces Technologies: ● Provides an application platform product (XAP) for applications characterized by: – High-volume transaction processing and – Very low latency requirements – Large data volumes – Scaled-out application server ● GigaSpaces XAP – In-memory data grid – Service grid – Java, .NET and C++ ● Customer base – Financial services, retail, banking, gaming
  29. 29. XAP – eXtreme Application Platform: XAP (pronounced "zap") is a new class of application server focused on cloud computing and scale-out architectures. It is used in two main domains: ● Data intensive/EDG (write-behind cache) ● Compute intensive
  30. 30. GigaSpaces Architecture – Sub Systems
  31. 31. GigaSpaces Architecture - Runtime
  32. 32. GigaSpaces Architecture - SLA
  33. 33. How Can GigaSpaces Help Me: ● Too much data and the DB is slow. My application has too many interactions with the database. ● The application does not scale well. We have (strong) hardware but throughput no longer increases (a symptom of tier-based architecture). ● We develop an XTP platform, e.g. billing, banking, finance. ● Not pleased with my HTTP session clustering solution. ● Want a scalable SOA/ESB platform. ● Need in-memory indexing and searching. ● Want to deploy my application in the cloud (pay-as-you-go). ● Want CBR over my MQ. ● We don't use Java. Want to stay in C++ or .NET. ● Want SLA in my application/data-partition.
  34. 34. Expectations From Application Server: ● Data access ● Messaging / Event processing ● Remoting ● TX management ● Web
  35. 35. Migration to GigaSpaces: ● Messaging / Event processing – Replace MDBs with GigaSpaces event listeners ● Remoting – Replace SLSBs with GigaSpaces SVF (Remoting/Executors) ● Data access – Use GigaSpaces as a 2nd-level cache for Hibernate – Convert your DAOs to use GigaSpaces, use mirror to persist ● TX management – Use Spring… ● Web – Use GigaSpaces web processing unit – Use GS HTTP session replication
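One way to picture the data-access step, where DAOs talk to in-memory storage and persistence happens behind the scenes, is a write-behind store: writes complete against memory and are pushed to the database later. The sketch below is plain Java written for illustration under that assumption; it is not the GigaSpaces mirror API, the class name is hypothetical, and a Map stands in for the real database.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical write-behind store; a Map plays the role of the database.
final class WriteBehindStore {
    private final Map<String, String> memory = new HashMap<>();
    private final Set<String> dirtyKeys = new LinkedHashSet<>();
    private final Map<String, String> database;

    WriteBehindStore(Map<String, String> database) {
        this.database = database;
    }

    /** Writes hit memory only; the key is remembered for a later flush. */
    synchronized void put(String key, String value) {
        memory.put(key, value);
        dirtyKeys.add(key);
    }

    /** Reads are served from memory and never touch the database. */
    synchronized String get(String key) {
        return memory.get(key);
    }

    /** Persists pending writes; a real system would call this from a background thread. */
    synchronized int flush() {
        for (String key : dirtyKeys) {
            database.put(key, memory.get(key));
        }
        int flushed = dirtyKeys.size();
        dirtyKeys.clear();
        return flushed;
    }

    public static void main(String[] args) {
        Map<String, String> db = new HashMap<>();
        WriteBehindStore store = new WriteBehindStore(db);
        store.put("order-1", "NEW");
        System.out.println("in database before flush: " + db.containsKey("order-1"));
        store.flush();
        System.out.println("in database after flush:  " + db.containsKey("order-1"));
    }
}
```

The trade-off of this design is that the application no longer waits for the database on every write, at the cost of a window in which the latest data exists only in memory.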
  36. 36. Migration in Practice Converted Code change Config Effort Layer change (3 is biggest) Messaging Minor to none Yes 1 1 Remoting No Yes Yes 2-3 Data ORM 2nd level cache: No Access DAO: Yes 1 Http No No Session
  37. 37. XAP and Integration with Other (JEE) Platforms: ● Spring ● ORM ● Lucene/Compass ● Mule ESB ● JGroovy, JRuby, and hopefully Scala ● C++ ● .NET
  38. 38. XAP Alternatives: ● Data Grid – GemFire, Coherence ● Shared Memory – Terracotta/NAM, Memcached, Tokyo Cabinet, Infinispan ● Computation Grids – IBM Extreme Scale Platform ● Cloud/Grid – Google AppEngine, GridGain ● Map Reduce Engines – Hadoop, Disco, Skynet
  39. 39. Case Study – MAGFA SMPP Gateway
  40. 40. Case Study – Let's See it in Action
  41. 41. Other Useful Results: ● Use everything in its place. Memcached helped us a lot to have a fast, simple and centralized key/value store. ● BASE-like transactions in preference to full XA. Memcached as a transaction memory. ● Tried on both Linux and Solaris. No tangible difference. ● Don't design for a specific platform. Use platforms as tools. – Easily switched to ActiveMQ, RabbitMQ, Coherence ● Spring greatly helps in applying the above rule. ● Love immutable. It prevents bugs before they happen. ● Reduce contention as much as possible.
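The advice on immutability can be illustrated with a small value class: all fields are final, and a state change produces a new object instead of modifying the existing one, so instances can be shared between threads without locks. The class below is a hypothetical example loosely themed on the SMPP gateway case study; it is not taken from the actual project.

```java
// Hypothetical immutable message; state changes return a new instance.
final class SmsMessage {
    private final String id;
    private final String status;

    SmsMessage(String id, String status) {
        this.id = id;
        this.status = status;
    }

    String id() {
        return id;
    }

    String status() {
        return status;
    }

    /** Returns a copy with the new status; the original object is left unchanged. */
    SmsMessage withStatus(String newStatus) {
        return new SmsMessage(id, newStatus);
    }

    public static void main(String[] args) {
        SmsMessage queued = new SmsMessage("msg-1", "QUEUED");
        SmsMessage delivered = queued.withStatus("DELIVERED");
        System.out.println(queued.id() + " is still " + queued.status());
        System.out.println(delivered.id() + " is now " + delivered.status());
    }
}
```

Since no thread can ever observe a half-updated message, readers need no synchronization, which removes one common source of contention.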
  42. 42. References: ● Migrating JEE Apps to GigaSpaces, Uri Cohen ● Scaling Out Tier Based Applications, Nati Shalom ● Cloud Computing; Designing Applications for Efficiency, Geva Perry ● Characteristics of The Next Generation Application Servers, Guy Nirpaz ● GigaSpaces Wiki ● Wikipedia
  43. 43. Thanks for Your Attention Q/A