Application architecture for cloud


How can we redesign our applications to meet the challenges of scaling in the cloud? Use patterns, so that you can pick the best technology for every tier.

Published in: Technology
  • To build for large scale, use more of the same pieces rather than bigger pieces, although a different approach may be needed.

    1. 1. Many Application Types…
    2. 2. Many Application Models…
        Web Hosting
          • Massive scale infrastructure
          • Burst & overflow capacity
          • Temporary, ad-hoc sites
        High Performance Computing
          • Parallel & distributed processing
          • Massive modeling & simulation
          • Advanced analytics
        Application Hosting
          • Hybrid applications
          • Composite applications
          • Automated agents / jobs
        Information Sharing
          • Reference data
          • Common data repositories
          • Knowledge discovery & mgmt
        Media Hosting & Processing
          • CGI rendering
          • Content transcoding
          • Media streaming
        Collaborative Processes
          • Multi-enterprise integration
          • B2B & e-commerce
          • Supply chain management
          • Health & life sciences
          • Domain-specific services
        Distributed Storage
          • External backup and storage
    3. 3. Computing-Oriented Application
    4. 4. Computing-Oriented Applications
        Calculation intensive. Focus on:
          • Algorithms
          • Speed
          • Parallelism
    5. 5. Data-Oriented Application
    6. 6. Data-Oriented Applications
        Data access intensive. Focus on:
          • Storage Capacity
          • Storage Reliability
          • Access Speed
          • Query API
    7. 7. Traditional CRUD application
        Let’s take a step back. Why do we build applications like we do today? It started with a stack of paper… that needed to be keyed into the machine… and along came the CRUD app!
    8. 8. Traditional scale-up architecture
        Common characteristics:
          • synchronous processes
          • sequential units of work
          • tight coupling
          • stateful
          • pessimistic concurrency
          • clustering for HA
          • vertical scaling
    9. 9. Statefulness
    10. 10. Synchronous Calls
    11. 11. Tightly coupled architectures
    12. 12. Scalability
        The ability to increase or reduce the number of resources without affecting the end-user experience.
    13. 13. Traditional scale-up architecture
        To scale, get bigger servers:
          • expensive
          • has scaling limits
          • inefficient use of resources
        [Diagram: web, app server, data store]
    14. 14. When failure is happening…
    15. 15. When problems occur…
          • bigger failure impact
        [Diagram: web, app server, data store]
    16. 16. Complex solution…
    17. 17. DB-oriented
    18. 18. Fear??
    19. 19. Stale Data
        In computer processing, if a processor changes the value of an operand and then, at a subsequent time, fetches the operand and obtains the old rather than the new value of the operand, then it is said to have seen stale data.
    20. 20. Why is CQRS needed?
        To understand this better, let’s look at a basic multi-user system.
        [Diagram labels: Retrieve data; User is looking at stale data; Modify data]
        Stale data is inherent in a multi-user system. The machine is now the source of truth… not a piece of paper.
    21. 21. Why is CQRS needed?
        All of this to provide scalability & a consistent view of the data. But did we succeed?
    22. 22. Why is CQRS needed?
        Back to our CRUD app… Where is the consistency? We have stale data all over the place!
    23. 23. Why is a new model needed?
        Since stale data always exists, is all of this complexity really needed to scale? No, we need a different approach, one that:
          • offers extreme scalability
          • inherently handles multiple users
          • can grow to handle complex problems without growing development costs
    24. 24. Use more pieces, not bigger pieces
          • LEGO 7778 Midi-scale Millennium Falcon: 9.3 x 6.7 x 3.2 inches (L/W/H), 356 pieces
          • LEGO 10179 Ultimate Collectors Millennium Falcon: 33 x 22 x 8.3 inches (L/W/H), 5,195 pieces
    25. 25. Scale out
    26. 26. Fundamental concepts
        Horizontal scaling: small pieces, loosely coupled.
        Distributed computing best practices:
          • asynchronous processes (event-driven design)
          • parallelization
          • idempotent operations (handle duplicates)
          • optimistic concurrency
          • shared-nothing architecture
          • fault tolerance by redundancy and replication
          • etc.
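
Of the practices listed above, optimistic concurrency is the easiest to show in a few lines. The following is a minimal sketch, not taken from the slides: `VersionedStore`, `ConflictError` and `update_with_retry` are hypothetical names, and the store is an in-memory stand-in for a real database. No lock is held while the caller works; a conflicting write is detected by comparing versions at write time, and the caller simply re-reads and retries.

```python
class ConflictError(Exception):
    """Raised when a write is based on an outdated version."""


class VersionedStore:
    """Hypothetical in-memory store that keeps a version number per key."""

    def __init__(self):
        self._rows = {}  # key -> (version, value)

    def read(self, key):
        # Unknown keys are reported as version 0 with no value.
        return self._rows.get(key, (0, None))

    def write(self, key, value, expected_version):
        current_version, _ = self.read(key)
        if current_version != expected_version:
            # Another writer got there first; nothing was locked, so the
            # conflict is only detected now, at write time.
            raise ConflictError(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        self._rows[key] = (current_version + 1, value)
        return current_version + 1


def update_with_retry(store, key, change, attempts=3):
    """Read, apply `change`, write; on conflict re-read and try again."""
    for _ in range(attempts):
        version, value = store.read(key)
        try:
            return store.write(key, change(value), expected_version=version)
        except ConflictError:
            continue
    raise ConflictError(f"{key}: gave up after {attempts} attempts")
```

A pessimistic design would lock the row for the whole read-modify-write cycle instead; the optimistic variant trades that lock for an occasional retry.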
    27. 27. Cloud-Scale Architecture
        Design
          • Horizontal scaling
          • Service-oriented composition
          • Eventual consistency
          • Fault tolerant (expect failures)
        Security
          • Claims-based authentication & access control
          • Federated identity
          • Data encryption & key mgmt.
        Management
          • Policy-driven automation
          • Aware of application lifecycles
          • Handle dynamic data schema and configuration changes
        Data & Content
          • De-normalization
          • Logical partitioning
          • Distributed in-memory cache
          • Diverse data storage options (persistent & transient, relational & unstructured, text & binary, read & write, etc.)
        Processes
          • Loosely coupled components
          • Parallel & distributed processing
          • Asynchronous distributed communication
          • Idempotent (handle duplicates)
          • Isolation (separation of concerns)
    28. 28. Scale-out architecture
        Common characteristics:
          • small logical units of work
          • loosely-coupled processes
          • stateless
          • event-driven design
          • optimistic concurrency
          • partitioned data
          • redundancy fault-tolerance
          • re-try-based recoverability
        [Diagram: several web, app server, data store stacks side by side]
    29. 29. Scale-out architecture
        To scale, add more servers:
          • not bigger servers
        When problems occur:
          • smaller failure impact
          • higher perceived availability
          • simpler recovery
        [Diagram: several web, app server, data store stacks side by side]
    30. 30. Scale-out architecture + distributed computing
        Scalable performance at extreme scale:
          • asynchronous processes
          • parallelization
          • smaller footprint
          • optimized resource usage
          • reduced response time
          • improved throughput
        [Diagram labels: parallel tasks; async tasks; perceived response time]
    31. 31. How does CQRS work? Task-based UI
        Why rethink the user interface?
          • Grids don’t capture the user’s intent
    32. 32. CQRS
        As a concept:
          • A set of principles
          • A way of thinking about software architecture
        As a pattern:
          • A way of designing and developing scalable and robust enterprise solutions where reads are independent from writes
        What it is not:
          • The CQRS pattern says nothing about how this should be implemented
    33. 33. Some new architectures [CQRS?!?]
    34. 34. Common components of the CQRS pattern
          • Task-based UI
              • ViewModels
          • Commands
          • Domain Objects
          • Events
          • Persistent View Model
    35. 35. Task-Driven User Interface
          • Scrum-based analysis
          • Collect user stories
              • Scenario
          • Each user story is not an entity
          • Every user story is a task
    36. 36. How does CQRS work? Rethinking the User Interface
        Adjust UI design to capture intent:
          • what did the user really mean?
          • intent becomes a command
        Why is intent important?
          • Last name changed because of misspelling
          • Last name changed because of marriage
          • Last name changed because of divorce
        The user interface can affect your architecture.
    37. 37. View Models
        ViewModel:
          • Only data
          • Flat, only strings
        Why is the domain model (DM) not a good fit for views?
          • Views should not know how to traverse the DM
          • Views usually need fewer properties
          • With an ORM you might trigger a SQL query by mistake
        How to do it?
          • Copy the properties needed from the DM to the view model (VM)
          • Possibly flatten data
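
The "copy and flatten" step above can be sketched as follows. This is an illustration under stated assumptions: `Customer`, `Address` and `CustomerListItemVM` are hypothetical types invented for the example, not part of the deck.

```python
from dataclasses import dataclass


@dataclass
class Address:
    city: str
    country: str


@dataclass
class Customer:
    """Hypothetical domain object with nested data and business fields."""
    id: int
    first_name: str
    last_name: str
    address: Address
    credit_limit: float


@dataclass(frozen=True)
class CustomerListItemVM:
    """View model: flat, strings only, and only what the view displays."""
    id: str
    name: str
    city: str


def to_view_model(customer: Customer) -> CustomerListItemVM:
    # Copy just the needed properties and flatten the nested address, so the
    # view never has to traverse the domain model itself.
    return CustomerListItemVM(
        id=str(customer.id),
        name=f"{customer.first_name} {customer.last_name}",
        city=customer.address.city,
    )
```

The view receives no reference to `Customer`, so it cannot trigger lazy loading or reach fields such as `credit_limit` by accident.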
    38. 38. How does CQRS work? Validation
          • increase the likelihood of the command succeeding
          • validate client-side
          • optimize validation using the persistent view model
        What about user feedback?
          • Polling: wait until the read model is updated
          • Use asynchronous messaging such as email
              • “Your request is being processed. You will receive an email when it is completed”
          • Just fake it!
              • Scope the change to the current user. Update a local in-memory model
    39. 39. How do Commands work?
        Commands encapsulate the user’s intent but do not contain business logic, only enough data for the command.
        What makes a good command?
          • A command is an action: it starts with a verb
          • The kind you can reply to with: “Thank you. Your confirmation email will arrive shortly”. Inherently asynchronous.
        Commands can be considered messages:
          • Messaging provides an asynchronous delivery mechanism for the commands. As a message, a command can be routed, queued, and transformed, all independently of the sender & receiver
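
A command along these lines might look like the sketch below. The class names reuse the last-name example from the earlier slide but are otherwise hypothetical, and the in-process `deque` only stands in for a real message queue.

```python
import json
from collections import deque
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CorrectLastNameSpelling:
    """Intent: the stored name was misspelled. Data only, no business logic."""
    customer_id: str
    corrected_last_name: str


@dataclass(frozen=True)
class ChangeLastNameAfterMarriage:
    """Same field changes, but a different intent and so a different command."""
    customer_id: str
    new_last_name: str


def to_message(command) -> str:
    # As a plain message the command can be routed, queued and transformed
    # without the sender knowing anything about the receiver.
    return json.dumps({"type": type(command).__name__, "data": asdict(command)})


queue = deque()  # assumption: stands in for a message broker


def send(command) -> str:
    queue.append(to_message(command))
    # The sender gets an immediate reply; processing happens later.
    return "Thank you. Your request is being processed."
```

Both commands end up changing the same column, yet keeping them separate preserves the reason for the change, which a generic "update customer" call would lose.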
    40. 40. Domain Model
        The domain model is used for processing commands; it is unnecessary for queries. Unlike the entity objects you may be used to, aggregate roots in CQRS only have methods (no getters/setters).
    41. 41. Events
          • Events describe changes in the system state
          • An event bus can be used to dispatch events to subscribers
          • The primary purpose of events is to update the read model
          • Events can also provide integration with external systems
          • CQRS can also be used in conjunction with Event Sourcing
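
How an event bus keeps the read model up to date can be sketched as below. This is a minimal synchronous, in-process illustration; `EventBus`, `CustomerRenamed` and the `read_model` dictionary are hypothetical names, and a production bus would typically deliver events asynchronously.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerRenamed:
    """Event: a fact about something that has already happened."""
    customer_id: str
    new_name: str


class EventBus:
    """Dispatches each published event to the handlers subscribed to its type."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event):
        for handler in self._subscribers[type(event)]:
            handler(event)


# Assumption: a dictionary stands in for the persistent view model.
read_model = {"42": {"name": "Ann Smith", "phone": "555-0100"}}


def apply_customer_renamed(event):
    # The subscriber's only job is to keep the read side in step.
    read_model[event.customer_id]["name"] = event.new_name


bus = EventBus()
bus.subscribe(CustomerRenamed, apply_customer_renamed)
bus.publish(CustomerRenamed("42", "Ann Jones"))
```

An integration with an external system would be just another subscriber registered for the same event type.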
    42. 42. Persistent View Model
          • Reads are usually the most common activity, often 80-90% of all operations. Why not optimize them?
          • The read model is based on how the user wants to see the data.
          • The read model can be a denormalized RDBMS, a document store, etc.
          • Reads from the view model don’t need to be loaded into the domain model; they can be bound directly to the UI.
    43. 43. Persistent View Model
        Data duplicated, no relationships, data pre-calculated.

        View                       | List of customers shows         | Backing table              | Columns
        Customer Service Rep view  | ID, Name, Phone                 | Rep_Customers_Table        | ID, Name, Phone
        Supervisor view            | ID, Name, Phone, Lifetime value | Supervisor_Customers_Table | ID, Name, Phone, Lifetime Value

    44. 44. When should I not use CQRS?
        CQRS can be overkill for simple applications. Don’t use it in a non-collaborative domain or where you can horizontally add more database servers to support more users/requests/data at the same time you’re adding web servers – there is no real scalability problem – Udi Dahan
    45. 45. When should I use CQRS?
        Guidelines for using CQRS:
          • Large, multi-user systems: CQRS is designed to address concurrency issues.
          • Scalability matters: with CQRS you can achieve great read and write performance. The system intrinsically supports scaling out. By separating read & write operations, each can be optimized.
          • Difficult business logic: CQRS forces you not to mix domain logic and infrastructural operations.
          • Large or distributed teams: you can split development tasks between different teams with defined interfaces.
    46. 46. Compute Services
          • Low Availability Computing Nodes
          • High Availability Computing Nodes
    47. 47. Service Level Agreement
    48. 48. Low-Availability Compute Node
        Many virtual servers in public clouds are offered at low availability. Sometimes availability is also expressed in an unusual manner. For example, Amazon guarantees an availability of EC2 instances of 99.95% during a service year of 365 days [8].
          • 99.95% allows about 4.4 h of downtime per year
        However, this does not mean that a single instance has 99.95% availability during this period, as might be expected. Instead, unavailability is defined as the state in which all running instances are unreachable for longer than five minutes and no replacement instances can be provisioned.
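
The downtime figure follows directly from the availability percentage and the 365-day service year. A small worked calculation, with `allowed_downtime_hours` as a hypothetical helper name:

```python
HOURS_PER_SERVICE_YEAR = 365 * 24  # 8760 hours in the 365-day service year


def allowed_downtime_hours(availability, period_hours=HOURS_PER_SERVICE_YEAR):
    """Hours of unavailability that still satisfy the stated availability."""
    return (1.0 - availability) * period_hours


# 0.05% of 8760 hours is about 4.38 hours, which the slide rounds to 4.4.
print(round(allowed_downtime_hours(0.9995), 2))
```

Note that this budget applies to the service-level definition quoted above (all instances unreachable), not to any single instance.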
    49. 49. Elastic Infrastructure
        Resources shall be assigned to and revoked from applications dynamically, depending on the current load. The infrastructure must support dynamic provisioning and deprovisioning of resources. This functionality must be offered through an API to be used by automated management tools and by the applications hosted in the environment. An elastic infrastructure supports the dynamic allocation of (virtual) resources that constitute a common resource pool.
    50. 50. Storage Consistency
          • Strict Consistency
          • Eventual Consistency
    51. 51. Strict Consistency
        A storage offering usually consists of multiple replicas to ensure fault tolerance. It is of major importance that the consistency of the data contained in these replicas is preserved at all times, while performance is of secondary importance. The highest level of consistency is achieved if all replicas are updated whenever the data they contain is altered. However, this would drastically decrease the availability of the overall storage solution. The storage must therefore remain available even if not all replicas are available, while still ensuring that the correct version of the data is read.
    52. 52. Eventual Consistency
        Eventually consistent data storage reduces data consistency in order to increase availability and performance, since the impact of network partitioning is reduced and fewer replicas have to be accessed during read and write operations. While strictly consistent databases ensure that at least one replica holding the current version is always read, eventually consistent databases allow obsolete versions to be read as well. This increases the availability of the storage offering, since only one replica has to be available to successfully execute a read operation.
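
The difference between the two consistency levels can be illustrated with a toy replicated register. This is a sketch under assumptions the slides do not state: it uses the common quorum rule (read from R replicas and write to W replicas with R + W > N, so that every read overlaps the latest write), and `ReplicatedRegister` is a hypothetical name. For brevity the writer looks at all replicas to pick the next version number, which a real system could not do.

```python
class ReplicatedRegister:
    """One value stored on n replicas as (version, value) pairs."""

    def __init__(self, n):
        self.replicas = [(0, None)] * n

    def write(self, value, replica_ids):
        # Simplification: the next version is derived from global knowledge.
        version = max(v for v, _ in self.replicas) + 1
        for i in replica_ids:
            self.replicas[i] = (version, value)

    def read(self, replica_ids):
        # Among the replicas contacted, return the value with the highest version.
        newest = max((self.replicas[i] for i in replica_ids), key=lambda r: r[0])
        return newest[1]


register = ReplicatedRegister(3)
register.write("v1", [0, 1, 2])  # every replica holds v1
register.write("v2", [0, 1])     # W = 2: replica 2 has not caught up yet

# Strict read, R = 2: any two replicas include at least one holding v2.
strict_result = register.read([1, 2])

# Eventual read, R = 1: a single reachable replica is enough,
# but it may still hold the obsolete version.
eventual_result = register.read([2])
```

With N = 3 and W = 2, choosing R = 2 always overlaps the latest write, while R = 1 needs only one reachable replica and may return the older value.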
    53. 53. ACID vs. BASE
    54. 54. CAP (Consistency, Availability, Partition tolerance) Theorem
        A shared-data system can have at most two of these properties.
        Consistency + Availability
          • High data integrity
          • Single site, cluster database, LDAP, xFS file system, etc.
          • 2-phase commit, data replication, etc.
        Consistency + Partition tolerance
          • Distributed database, distributed locking, etc.
          • Pessimistic locking, minority partition unavailable, etc.
        Availability + Partition tolerance
          • High scalability
          • Distributed cache, DNS, etc.
          • Optimistic locking, expiration/leases, etc.
    55. 55. Hybrid architectures
        Scale-out (horizontal)
          • BASE: Basically Available, Soft state, Eventually consistent
          • availability first; best effort
          • aggressive (optimistic)
          • shared nothing
          • favor extreme size
          • e.g., user requests, data collection & processing, etc.
        Scale-up (vertical)
          • ACID: Atomicity, Consistency, Isolation, Durability
          • focus on “commit”
          • conservative (pessimistic)
          • transactional
          • favor accuracy/consistency
          • e.g., BI & analytics, financial processing, etc.
        Most distributed systems employ both approaches.
    56. 56. Storage Services
          • Relational Data Storage
          • Blob Data Storage
          • Block Data Storage
          • NoSQL Storage
    57. 57. Relational Data Store
        An application uses a central database for storing data elements and performs complex queries on them.
    58. 58. Blob Storage
        A distributed application needs to manage large data elements, such as virtual server images or videos, which are too large for traditional databases. In a distributed application, data elements must be made available to all application components and to distributed users. Access to the data needs to be performed in a standardized fashion, and access control has to be established.
        Organize the data elements in a folder hierarchy similar to a traditional file system. Give each data element a unique identifier that can be used to access it over a network. Also, establish access control mechanisms.
    59. 59. Block Storage
        Resources in clouds are often unreliable (low-availability compute nodes). Therefore, the data that they access locally should in fact be stored in a highly available central data store. This way, if a server fails, the data is not lost, and a new server can be started to use the secured data.
        Offer data elements in a central storage that can be accessed by distributed servers, and integrate them as local drives.
    60. 60. NoSQL Storage
        Applications need to handle very large amounts of data and to adapt flexibly to new user demands. A database solution is required that focuses on scaling out rather than on optimizing the use of a single resource, and that can adjust flexibly to changes in the data structure.
        Use a schema-free storage solution with limited query capabilities to enable extreme scale-out through easy data replication.
    61. 61. Communication Services
          • Message-Oriented Middleware
          • Reliable Messaging
          • Exactly-Once Delivery
          • At-Least-Once Delivery
    62. 62. Message-oriented middleware
        Different applications usually use different languages, data formats, and technology platforms. When one application (component) needs to exchange information with another one, the format of the target application has to be respected. Sending messages directly to the target application results in a tight coupling of sender and receiver, since format changes directly affect both implementations.
        Connect applications through an intermediary, the message-oriented middleware, which hides the complexity of addressing and the availability of communication partners and supports the transformation of different message formats.
    63. 63. Reliable Messaging
        The message transfer from one communication partner to the other is performed under a transactional context. In particular, this transaction includes the operation that stores the messages in persistent storage. Thus, if an error occurs while receiving, sending, or processing a message, the transaction can be compensated, returning the overall system to a correct and consistent state.
    64. 64. At-least-once delivery
        The receiver of messages sends special acknowledgement messages to the sender. If the sender does not receive such an acknowledgement within a given time frame, it retransmits the message. Thus, messages that are lost due to communication errors are still received eventually. However, duplicate messages can occur, for example if an acknowledgement message is lost. To reduce the communication overhead, acknowledgement messages can be sent either after each individual message or after an agreed-upon number of messages.
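
The retransmission loop, and the duplicate it can produce, can be sketched as follows. Everything here is a hypothetical test harness: `make_link` simulates an unreliable link by dropping scripted attempts, and a missing acknowledgement (a timeout in a real system) is modelled as `deliver` returning `False`.

```python
class Receiver:
    def __init__(self):
        self.inbox = []

    def receive(self, message):
        self.inbox.append(message)
        return True  # the acknowledgement


def make_link(receiver, lose_message_on=(), lose_ack_on=()):
    """Return a deliver() function that drops the scripted attempt numbers."""
    attempt = 0

    def deliver(message):
        nonlocal attempt
        attempt += 1
        if attempt in lose_message_on:
            return False  # message lost on the way: no acknowledgement
        ack = receiver.receive(message)
        if attempt in lose_ack_on:
            return False  # message arrived, but the acknowledgement was lost
        return ack

    return deliver


def send_at_least_once(message, deliver, max_attempts=5):
    """Retransmit until an acknowledgement arrives; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        if deliver(message):
            return attempt
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")


receiver = Receiver()
link = make_link(receiver, lose_message_on={1}, lose_ack_on={2})
attempts_used = send_at_least_once("m1", link)
```

In this run the first attempt is lost, the second arrives but its acknowledgement is lost, and the third succeeds, so the receiver's inbox contains the same message twice. That duplicate is the price of the at-least-once guarantee.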
    65. 65. Exactly-once delivery
        Whenever a message is created, it is associated with a unique identifier. A filtering component on the message path uses this identifier to delete duplicates. It does so by storing the identifiers of the messages it has already seen. The identifiers of messages passing through the filter are compared with the recorded identifiers in order to identify and delete duplicates.
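
The filtering component described above might be sketched like this. `Message` and `DuplicateFilter` are hypothetical names, the identifier is assumed to be a UUID assigned at creation time, and the set of seen identifiers is kept in memory without any expiry, which a long-running system would have to add.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    body: str
    # A unique identifier is attached when the message is created.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


class DuplicateFilter:
    """Sits on the message path and passes each identifier on only once."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # identifiers of messages already passed on

    def on_message(self, message):
        if message.id in self._seen:
            return False  # duplicate: dropped before reaching the handler
        self._seen.add(message.id)
        self._handler(message)
        return True


processed = []
dedup = DuplicateFilter(processed.append)
payment = Message("debit 10")
dedup.on_message(payment)
dedup.on_message(payment)  # retransmission of the same message is ignored
```

Placed behind an at-least-once channel, such a filter lets retransmissions happen freely while the handler still sees each message a single time.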