Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Strategic Design by Architecture and Organisation @ - JavaZone 2016


Published on

Experience talk about how architecture and organization come together to address the challenges we face at How we believe decentralised ownership and decision making can help improve development speed and product quality over time. Where we still see complexity in FINN after we started using micro services. And how we try to use the inverse Conway manoeuvre together with DDD to extract the strategic parts of our legacy code. The talk will also address how we currently address data flows across services, and how we are moving in the direction of using events and data streams.

Published in: Technology
  • Be the first to comment

  • Be the first to like this

Strategic Design by Architecture and Organisation @ - JavaZone 2016

  1. 1. S Sebastian Verheughe Chief Enterprise Architect @ JavaZone - 8 September 2016 Strategic Design Architecture and Organization
  2. 2. Norway´s Classifieds Marketplace ~ 250 services ~ 150 deploys / day ~ 1.4 B NOK ~ 120 developers #1
  3. 3. 6800 employees 30 countries 200 million users One Global Product and Technology Organization Building Global Platforms & Services part of
  4. 4. How to use architecture and organization to… • deliver more effectively and efficiently on the business strategy • address complexity without needing total rewrites each time • minimize risk and maintain quality over time... Strategic
  5. 5. • I will be talking about the big picture in general terms • You will learn about real challenges we observe at FINN • You will learn how we try to solve these challenges • I will provide you with some recommendations… What to Expect
  6. 6. Partitioning of FINN Chapter I
  7. 7. The Value of Domain Driven Design “Getting service boundaries wrong can be expensive. It can lead to a larger number of cross-service changes, overly coupled components, and in general could be worse than just having a single monolithic system” Sam Newman, Thoughtworks Extract from the article Microservices for greenfields?
  8. 8. Dependency Graph of FINN With microservices, each service is simpler but the distributed environment is more complex
  9. 9. Conceptual Architecture of FINN Not only microservices, monolithic front-end and legacy system/database Front-EndFEFEFE Legacy (Big Ball of Mud) µsµsµsµsµsµsµsµsµs API The highest complexity seems to be related to a few central structures, and especially legacy database integration makes it hard to split. µsµsµsµsµsµs Front-End
  10. 10. Risk of Unhandled Complexity Key areas with unhandled complexity, may damage your business Complexity Time Point of Total Rewrite (or worse) Missed Business Opportunities Simple full speed ahead The Twilight Zone Complex What happened?
  11. 11. Architecture Challenges @ FINN • A few central structures with high complexity used in many contexts proves very hard to modify and can cause unintended behavior elsewhere. • High coupling between services and long request call chains affects performance and availability negatively (8 fallacies of distributed computing). • Clients aggregating services become monolithic front-ends, bottleneck, single- points-of-failures, and require / desire standardization.
  12. 12. Should we partition by business entities, business objectives or both? Partitioning Strategy Customer Service Agreement Service Store Service Login Service User Service Authentication Service Enrollment Service Review Service
  13. 13. Challenges with Services Based on Entities They don´t do one thing well • support multiple biz challenges • have unnecessary dependencies • are complex / hard to change • Will always be suboptimal solutions • impact performance and availability (single point of failure) • grow in complexity for each new business capability using them Customer Customer Service Id Password Billing Address Credit Card Purchases Shipping Address Mr. Customer, what do you do? Front-End Client Service Service
  14. 14. Benefits with Services Based on Objectives Auth ShippingTradePaymentBilling They do one thing well • have fewer dependencies • are simpler to change • allow for optimal models (of entities in context) • improves performance and availability. • are only as complex as the business objective they deliver on! Well, I know why we need Mr. Billing around… Front-End Client Customer Id Billing Address … Customer Lead Id Interests … Customer Id Password … Customer Seller Id Customer Buyer Id Object Id Customer Id Shipping Address … Transactional business boundary
  15. 15. Why do we Default to Global Entities / Models? We have a global information model / information architect that created one global model of all entities and their relationships, and you implemented it. We learned object-oriented programming at school usually solving problems within a single context, believing that you should never model the same entity twice (DRY). We live in a world filled with entities around us… and we try to clone the world as-is. It might just be harder for the human brain to work with abstract models of specific problems, so we default to easy…
  16. 16. Aligning the Architecture with the Business Technical solutions align with business capabilities (objectives / domains) Rump Rib Tenderloin Business Domain Tech Solution Product Sales Advertising Billing Communication They will be stable, with a minimum of dependencies between them, and form transactional boundaries from a user / business perspective
  17. 17. Target Architecture (we call it direction in FINN) The front-ends aggregate responses from services clustered by domain Partitioning applies to both front-end (components) and back-end Capability / domain (problem space) & services (solution space) Front-End Message BUS (utilizing eventual consistency) Front-End Authentication Product Sales Billing ShippingAdvertising µs COMP µs µs µs µs µs µs µs µs µs Service / System COMP COMP COMP COMP COMP COMP COMP COMP
  18. 18. Recommendations Invest heavily in building competence around distributed systems design - NOW! Writing a single microservices is easy, testing and running hundreds together is hard. Distribution requires rethinking of service boundaries, integration and handling of failures. Handle the increase in number of services by automation and standardization in tools and infrastructure. Tomorrow: Use a circuit breaker (Hystrix), use pub-sub between two services to communicate, mock dependencies between two services for testing, read “8 fallacies of distributed systems”. Design you services around business objectives, with several entity models Entity services become monolithic & complex, and result in a lot of unnecessary dependencies. Instead create different models of entities for the different problems you try to solve. Tomorrow: select a complex entity service (or table) and divide it by different known contexts
  19. 19. Guide to Objective Based Service Boundaries The service name should reveal the business purpose. Ideally in the form of verb and optionally subject. E.g. authentication / authenticate user (does not always work) Evaluate each field of your global entity model, determine data & relations that are only of interest in a given context that serves a business purpose (e.g. authentication). Evaluate which fields are updated at the same time (different use cases), these form transactional boundaries around data groups and should be good candidates. Ensure that data / fields are only mastered by one service (to avoid sync conflicts) Never share / integrate via the database (complicated to break away from later…)
  20. 20. Organization Strategy Chapter II
  21. 21. Organizational Observations @ FINN Many features required several hand-offs between teams in order to be solved. This was really slowing down development speed. Teams where not able deliver a complete user / business capability by themselves. We observed sub-optimal APIs, badly placed or duplicated business logic, presentation inconsistency, and lack of full-stack operational responsibility (serving the end user). Layered teams sometimes became overly focused about technical perfection. This resulted in bottlenecks in the organization, maybe because they lacked clear business goals. Static teams made it difficult to move resources to changing business challenges.
  22. 22. Conway's Law “Organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations”  Melvin E. Conway Extract from the article How do Committees Invent?
  23. 23. Aligning the Organization with the Business The organization should align with the business (and the technical solution) The goal was an organization that more effectively was able to deliver on the business challenges Inverse Conway Maneuver: Organize to promote your desired architecture (end state) Rump Rib Tenderloin Org Ownership Business Domain Tech Solution Product Sales Advertising Billing Communication
  24. 24. Infrastructure FINN Technology Organization Organized primarily by business domains, secondarily by cross-cutting concerns Front-End Platform(s) Space A DomainA DomainB DomainC Space BDomainD DomainE DomainF Space C DomainG DomainH DomainI Space D DomainJ DomainK DomainL HARDER TO CHANGE EASIERTOCHANGE FULLSTACK WebiOSAndroid ServersDatabasesData BusMon ToolsDeploy PL All cross-cutting teams are structured as enabling teams, meaning they should deliver self-service solutions upfront to the vertical functional teams.
  25. 25. Spaces allows for Flexible Developer Allocation A space contains all kinds of developers, assigned to dynamic teams to solve prioritized tasks. Size from 15-25, small enough to know each other well. Space Leads, two persons in charge of organization and flow of prioritized tasks affecting domains within the space. Decides the teams to support the current prioritized business challenges. Developers, free to move within the space. Grouped into teams depending on the current problems needing to be solved / prioritized. Technical Domain Experts (tech leads), responsible for the long term development of the technical solution (QoS) delivering a full-stack business capability. Architects (4 in FINN), responsible for everything that affects multiple domains (e.g. domain boundaries, integration, front-end architecture… Space A Advertising ProductSales Billing TDE TDE TDE DEV DEV DEV SLSL DEV DEV DEV DEV DEV DEV Same space 2-4 teams Teams
  26. 26. Transitioning the Apps Team About 25% of visits are from native apps and growing, but we had one team. Since we believe that our main biz challenges are not related to native apps development, it makes sense to focus on full-stack functional and more autonomous teams instead. Front-End Platform(s) Space A DomainA DomainB DomainC Space BDomainD DomainE DomainF Space C DomainG DomainH DomainI Space D DomainJ DomainK DomainL WebiOSAndroid
  27. 27. Decentralized Authority (Framing) The TDEs are free to choose solutions within their domain, no sign-offs needed Cross-Domain Front-End Domain Domain Domain DomainDomain TDE TDE TDE TDE TDE The architects only govern integration and boundaries between the domains and the front-end, in addition to help succeeding with the implementation of the strategy.
  28. 28. Reorganization Observations (one year in) • A single team has proven that they can improve a full-stack capability autonomously, and it seems like faster (we have no god way of measuring) • Tech leads are starting to take full-stack responsibility for the capability they deliver. • The teams can freely choose how they within a domain solve each business challenge, there are less technical “religious” discussions. And no exploitation of new technologies. • One year in, the systems does not automagically partition themselves by org. structure, but needs to be driven in each case. The inverse Conway maneuver only removed barriers. • The organization structure (silos) works as barriers when cross domain work is needed, and developers rather wait than contribute if changes are needed in someone else domain. It may be a smaller problem now, but we need to change the culture and expectations.
  29. 29. Organizational Recommendations Primarily align your teams with your business domains, not tech layers Teams focusing on business problems first, and who are more efficient as they can take all decisions internally in order to solve a business problem. Staff the team with the necessary competence (e.g. full-stack) to reach their goal. Use Conway´s law to your advantage. Decentralize decision making whenever applicable When teams understand which problem they must solve, allowing them to take internal decisions improved development speed, and probably results in the best design (for that specific problem). Minimize central governance, but where needed make it effective and clear to the org. Structure cross-cutting teams as enabling teams, not bottlenecks or guards Cross-cutting functions such as infrastructure or architecture should always deliver tools and rules (for cross-cutting concerns) up-front, not acting as guards slowing down development. Make sure your organization culture encourages cross-organizational collaboration
  30. 30. Being Strategic about Complexity Chapter III
  31. 31. Living With an Imperfect World Chances are that your current system is not exactly how you want it to be… What to do about it… Domain Domain Domain DomainDomain Message BUS Big Ball of Mud front-end front-end Medium Ball of Mud
  32. 32. Complexity & Business Paralysis Business: We need to be able to “something important”! Tech: Oh, that’s part of a complex area, we need to fix that first! Business: Well, can you fix it this month? Tech: Funny guy, it will take 9 months to rewrite that mess! Business: Bugger, what about a simple meaningless feature? Tech: Sure, with nanoservices it will be in production tomorrow!
  33. 33. The Total Rewrite Disease Thoughts about why this happens… • We take decisions about existing structures (systems and structures), not about future user / business challenges we need to solve. • We take the decision we can take, and when organization and ownership is not aligned with user / business challenges, neither will our decisions be it. • High complexity and lack of knowledge about how to address such complex systems strategically forces us to solve a larger problem than necessary.
  34. 34. Identifying the smallest step needed to deliver a modified or new capability Fastest Minimum Viable Capability
  35. 35. 1. Identifying What Strategic Capabilities to Improve In short: fix the basics, focus on advantages, leave the rest when possible… THE REST Basics Advantages Avoid feature driven…Ignore how for now… Customers expect / require… Why customers choose you…
  36. 36. 2. Understanding How & Where to Deliver When you know what, identify where it should go and how to deliver it. Domain Domain Domain DomainDomain Big Ball of Mud front-end front-end Medium Ball of Mud Capability model that needs improvement… About this time, it might be helpful with an architecture strategy…
  37. 37. Slicing a Part of a Legacy Model @ FINN Illustration of how we modified parts of a legacy model without needing to rewrite all dependencies at the same time / at all. (Strangler Application) Evil Monolithic Big Ball of Mud Legacy System Front-End 1. Identify Domain New Narrower Model 2. Create new master 3. (a)sync data 4. Move master and get business value quick 5, 6, 7, … Rewrite other dependencies gradually (if any biz value) The goal is not to have the new model behave identically to the legacy model. The entire point is to change the capability in this case.
  38. 38. Faster Delivery of the Business Strategy Tech: What would success look like for “something important”? Business: Well, if these users could do this when doing that… Tech: Right, so you don’t need this and that also to use it? Business: No, I only care about these users being able to do that! Tech: Why didn´t you say so, I can actually fix that in 3 weeks. Business: Really? Thanks a lot…
  39. 39. Recommendations Always question large rewrites without a clear strategic user / business value Start by trying to understand what the business is trying to accomplish – or rephrased which ability it needs to deliver using IT. E.g. able to deploy continuously, able to recommending products, able to register the user. Ask ”why” 5 times. Deliver new / modified capabilities outside of legacy monoliths Existing capabilities are often used by many solutions, but the cost and risk of rewriting all dependencies is very high. Instead, only deliver the modified capability to those who need it. Syncing the data with the legacy solution minimizes the amount of rewriting needed. Always find the simplest / smallest solution that delivers on strategic capabilities When you know what, you can start discussing how best to achieve it. Suppress the temptation to “upgrade everything” you think as bad with the existing solution. Find the simplest solution that delivers the capability in a good way (should be possible to test).
  40. 40. Overflow… Chapter IV
  41. 41. Infrastructure / Cloud Default to Cloud Services Unless you have special needs accept the fact that companies of all sizes choosing cloud services will have an advantage It´s simple, fast, scalable, and let´s you focus on the business! FINN is cloudifying… Register and deploy to e.g. Heroku or Google today and see for yourself (application containers, message busses, etc.)
  42. 42. Data Pipeline (accessing distributed data) With smart products and data analytics, how to best access distributed data? To avoid exponential complexity, move the complexity to the infrastructure PPPP CCCC CCCC PPPP Message BUS (not ESB) - Kafka pub subPublish Domain Events (not commands)
  43. 43. Pipeline Challenges @ FINN Challenges we experienced at FINN when learning to use a pipeline Reliable messaging is more than a product feature on the message bus We had serious reliability issues related to order and payment in FINN (not good). Creating a single transaction boundary between the service update and domain event was not trivial. Running and configuring Kafka our self was also not without issues. Introducing event based communication is hard, developers are used to RPC For the developer, it´s easier just to make an RPC call another service. And in the beginning, there are no data available to consume, and no motivation to publish since the publisher and consumer are two different teams. You may need to enforce publishing until it is the norm.
  44. 44. FINN - Technology Stack Mostly Java (some Scala) Moving from Sybase to PostgreSQL Ramping up Node in the front end Moving data around with the bus Kafka (separate JavaZone presentation) Using Cassandra for high frequency writing (counting page clicks etc.) Service calls using REST (default) or Thrift
  45. 45. Key Takeaways 1. Partition by business objective (domains), avoid global entity models 2. Organize by business objective, not by competence, technology or process 3. Take the smallest possible step to reach strategic capability, no total rewrites But more importantly, it all requires an development organization with good knowledge of the business domain and a great culture for working together across organizational boundaries. The End Open for Questions…