14. What is a Microservice
• A collection of single-function modules with well-defined interfaces and operations that can be deployed and scaled independently
• Service-oriented architecture composed of loosely-coupled elements that have bounded contexts
20. B2B Platform Physical Architecture
[Diagram: Business Processes (Pricebook, Order Management, Invoicing & Payments, Portal, POS Integration, Inventory) and Admin UIs, served by a Java application and MS SQL Server running on physical hardware ("iron")]
Jun 2016
21. [Diagram: AWS Cloud, one AWS Region, a VPC spanning Availability Zones 1 and 2, each with a public subnet (NAT gateway) and a private subnet; Docker containers in a Cluster with an Auto Scaling group, alongside the legacy components]
22. [Diagram: the same AWS topology as the previous slide: VPC, two Availability Zones, public and private subnets, NAT gateways, Auto Scaling group, Docker Cluster, legacy components]
SCALABLE
SECURE
RESILIENT
23. Architecture Principle
We tend to prefer AWS tools over
custom or 3rd party tools because
they rapidly improve over time
24. In praise of Monoliths
[Diagram: the legacy app and its Stored Procedures in Availability Zone 1]
EASE OF CHANGE
AUTOMATED
July 2016
29. Slice by Domain
[Diagram: Retailers, Suppliers and Fulfilment domains around "Legacy & Friends"; services include Product, Order, Retailer, Supplier, Pricing, Demand, Invoice, Catalog, Transport, Despatch, Inventory, Promo, Ranging, FA API, FA Admin, Craft API and Craft Admin]
TEAM SCALE
SPEED OF LEARNING
August 2016
30. Slice by Domain
[Diagram: the same domain slicing as the previous slide]
TEAM SCALE
SPEED OF LEARNING
COST
August 2016
34. We get slower
[Diagram: the same domain services around Legacy & Friends]
EASE OF CHANGE
TEAM SCALE
COST
September 2016
35. And slower
[Diagram: the same domain services around Legacy & Friends]
EASE OF CHANGE
TEAM SCALE
September 2016
36. And slower still
[Diagram: the domain services, now centred on "Vision & Friends"]
EASE OF CHANGE
MORALE
TEAM SCALE
September 2016
37. December 2018
There is a happy ending
http://www.unicornsrule.com/rainbows-and-unicorns/
50. Platform Overview
irexchange B2B platform components

Pricebook
• Product attributes
• Pricing
• Promotions
• Ranging

Order Management
• Optimised ordering
• Aggregation by distribution centre
• Aggregation by supplier
• Order days and supplier lead times
• Purchase order generation & submission
• Delivery confirmation by supplier

Flow-through & delivery
• Receiving
• Optimised routing
• Pick-to-zero
• Small carton pick
• Exception management
• Despatch & delivery tracking

Invoicing & Payments
• Transparency
• Product
• Service fee
• Freight
• Payment terms
• eInvoice

Portal
• Navigation and search
• Shopping cart

POS Integration
• Host file
• Order interface

Other components: Analytics & insights, CRM, Finance system

Sub-components: Vision (V), CRAFT/FlowAssist (C/F), Dynamics Online (O), NAV/Sage (GL), CRM, Power BI (BI)

End-to-end flow: Supplier uploads to GS1 → irexchange publishes product → Retailer places an order → Intelligent supply → Single order to supplier → Supplier delivery → Flow-through distribution process (inbound/outbound) → Optimised delivery to retailers
Editor's Notes
I’m Head of Development at irexchange – a small start-up based here in Melbourne.
I wanted to share a case study of how we evolved our architecture, and our experiences with adopting microservices.
Hopefully you will get some ideas that you can apply in your own organisation
Irexchange is a technology and distribution start-up aiming to disrupt the traditional wholesaler model in the FMCG domain.
The company has been around for 3 years and is going well. We have raised over $40M in investment capital and all the business metrics are looking good.
It is a good news story and I am proud to have been part of it.
That business growth has been built on top of the growth of our technology platform.
Conceptually, we have 3 main systems.
We have no fixed infrastructure – everything is in AWS
Nowadays I feel that it is like this – and the business wants everything.
But when you dig into the Business strategy for our start-up there are two over-arching business needs that come to the fore.
Most important was speed to market –
Speed in delivering new features for our customers
Speed in evolving features based on customer feedback
And Speed in identifying and fixing issues
No data loss or breach
Acceptable performance
High availability
The other boxes are mostly hygiene factors – they need to be considered but they aren't as important as the first two.
So what sort of architecture will give us those benefits – speed to market and minimise reputational damage?
The answer, according to a bunch of experts was Microservices
We did a bit of research, bought Russell’s book, attended Fred’s workshop, and listened to Martin’s keynote.
We used to draw our architecture like this
Conceptually microservices promise a lot.
A collection of single function modules with well-defined interfaces and operations.
They abstract away a lot of hard networking stuff
Ensure consistency of messaging and networking, and logging and alerting and monitoring
Keep the functional devs focussed on features rather than chasing configuration
If you like drawing neat boxes and clouds, then adding hexagons to your tool kit doesn't seem that hard.
The reality doesn't necessarily match the packaging.
When you apply these ideas in the real world, it gets complicated and messy and it stops looking like the pretty pictures in the book
I want to share our ongoing journey of adopting microservices, and in particular share some of the lessons we learned.
Throughout the talk I’ll call out some of the principles that we adopted based on what we learned.
When I started at irexchange 3 years ago, we had acquired an inventory management system, that had been modified to demonstrate our unique B2B workflow
Conceptually the system can be represented like this.
It is a two-sided marketplace with Suppliers and Retailers trading groceries through the platform. Underpinning that was our Logistics and Fulfilment team that managed the flow of the physical goods.
The system can be represented as a set of core business concepts that are tied together to represent the business processes
The physical architecture looked like this
A monolithic 2-tier architecture running on physical hardware.
Our first goal, was to move the application into AWS and satisfy the “minimise reputational damage” business driver.
Warning – a bit of AWS jargon
We moved the Java application into Docker, migrated the SQL Server to RDS, and encrypted the tables
The EC2 instance sizes gave us the required performance.
The multi-AZ and Auto Scaling Groups gave us the resilience
The private subnets, bastion box, Server-Side Encryption and MFA gave us our security.
There is a bit more that we did about security, but if I told you, I would have to kill you.
It all sounds very easy when I say it that fast. The reality is that it was more complicated than that.
When we started, Containers and Clusters were fairly new concepts and the tooling to manage them was very immature.
We had to write our own scripts to manage the Docker containers within their Clusters.
We lost time trying a few cluster-management tools before settling on AWS CloudFormation, ECS and ECR. They were under-featured but have improved over the last 3 years.
One architecture principle that emerged is “We tend to prefer the AWS tool over custom or third-party tools because they rapidly improve over time.” An example is using Cost Explorer over third-party SaaS cost-monitoring tools. When it was first launched it was very rudimentary, but now it is quite full featured.
Having established these patterns for building our Containers, we codified them into our Build scripts and templates, and shifted our focus to the second business driver “Speed to Market”
As a start-up building a well-designed monolith is an excellent strategy to go fast.
There is only one thing to deploy. All the functionality is one project, it is easy to trace, and easy to understand your dependencies.
In this case, the application had been designed to be lightning fast. And they achieved that by encapsulating most of the business logic into stored procedures. There was over 30 person-years of development in the application. It had a very rich feature set and was well proven.
But without a suite of tests, it was difficult for the new team to understand the complexities of the logic, and in particular it was difficult to predict the side effects when we made a change.
Our approach to this problem was to adopt a strangler vine pattern. We wanted to interact through the front-end, and to manipulate the data in the tables, but not write new business logic in the stored proc layer.
We did that by creating a set of pass through services that talked to the tables that were relevant to each Domain concept.
Now we could write new business logic in a testable maintainable way, and over time deprecate and retire the existing stored proc logic.
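The pass-through idea above can be sketched as follows. This is a minimal illustration, not the actual irexchange code: the class names, the in-memory table stand-in and the "reject empty orders" rule are all invented for the example; the real services talked to the legacy SQL Server tables.

```python
# Sketch of the strangler-vine approach: a thin pass-through service
# owns reads/writes for one domain's tables, so new business logic
# lands in testable code instead of new stored procedures.

class LegacyOrderTable:
    """Stand-in for a legacy SQL Server table the service wraps."""
    def __init__(self):
        self.rows = {}

    def read(self, order_id):
        return self.rows.get(order_id)

    def write(self, order_id, row):
        self.rows[order_id] = row


class OrderService:
    """Pass-through service: talks to the legacy tables directly, but
    keeps any *new* business rules here, where they can be unit-tested,
    rather than in the stored-procedure layer."""
    def __init__(self, table):
        self.table = table

    def place_order(self, order_id, lines):
        # Hypothetical new business rule, kept in testable code:
        if not lines:
            raise ValueError("an order needs at least one line")
        self.table.write(order_id, {"lines": lines, "status": "PLACED"})

    def get_order(self, order_id):
        return self.table.read(order_id)
```

The point of the shape is that every new rule lands in the service layer, while the stored procedures are bypassed and can be deprecated over time.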
Then, we had a new business driver.
It was June, and we needed to be in market by November in time for the Christmas demand.
Accordingly we quickly ramped up to a team of 24 engineers – 3 teams of 8.
But we couldn’t wrap that many people around the codebase – The surface area was too small.
We were getting significantly slower as we now had 24 people who had to understand the codebase and make changes in a coherent manner.
Our approach to this problem was to split into 3 teams that were lined up with the three Domains – Suppliers, Retailers and Network and Operations. Our thinking was that each team would be able to focus on rapidly delivering features for their customer segment.
Architecturally, we created three new Clusters of related Domain services. Each team had their own Environment which was complete set of all of the infrastructure and Services and could work in parallel extending the functionality in their Domain.
Firstly our AWS costs doubled over the month.
It turns out there were three related reasons.
It was easy for the teams to spin up environments – so they did. But no-one was spinning them down.
Secondly we had over-spec'd some of the instances and were paying for processing power that we didn't need.
And thirdly, we hadn't factored running cost into some of our designs, and had built an expensive solution.
We initially solved this by borrowing some open-source scripts from REA to turn stuff off.
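The "turn stuff off" scripts can be sketched as a simple schedule check. The hours and the rule that only production runs around the clock are assumptions for illustration; the real REA scripts and later Gorillastack rules differ.

```python
# Sketch of the environment cost control: decide whether a
# non-production environment should be running at a given hour,
# so idle environments get spun down automatically.

BUSINESS_HOURS = range(8, 19)  # 08:00-18:59 local time (assumed)

def should_be_running(env_name: str, hour: int, weekday: int) -> bool:
    """Production always runs; dev/test environments only run
    during weekday business hours. weekday: Mon=0 .. Sun=6."""
    if env_name == "production":
        return True
    is_weekday = weekday < 5
    return is_weekday and hour in BUSINESS_HOURS
```

A scheduled job would evaluate this for each environment and stop or start the matching instances accordingly.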
Out of this experience a few more principles emerged.
As well, we elevated running cost to a first-class citizen in our design discussions, which resulted in principle 4:
Finally we moved to Gorillastack and used that tool to manage our environments.
So now, whenever you deployed a change to a service, you had to deploy the infrastructure first. Every single time. This made deployments take longer. Within a team, if we had multiple changes to deploy, we had to coordinate all of those changes and test that there were no configuration issues. Because deployments took a long time and required a fair amount of confirmation testing, we started having to batch changes. This meant that we were getting slower.
We were holding changes for up to a week or a fortnight and deploying large chunks of functionality at a time. These large batch sizes meant that when something went wrong in Production we took longer to diagnose the root cause because there was more new code to examine.
Each Service was well written, with Unit and Contract tests and good automation, but in the collective environment it was getting harder to understand the dependencies as 20+ Engineers made simultaneous changes.
One answer to the situation was to split the infrastructure from being associated with the Cluster to being associated with each service. We called this Atomic Deployments, and spent considerable time refactoring the deployment scripts to allow us to deploy an individual service and its associated infrastructure in one smaller deployment.
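The atomic-deployment idea can be sketched as pairing each service with the infrastructure it owns and emitting one small, ordered plan. The step names are invented for illustration; they are not the actual deployment scripts.

```python
# Sketch of an "atomic deployment": one service plus exactly the
# infrastructure it owns, deployed in order (infra first, then code),
# without touching the rest of the cluster.

def atomic_deploy_plan(service: str, infra: list) -> list:
    """Return the ordered steps for deploying one service and its
    own infrastructure in a single, small deployment."""
    steps = ["deploy-infra:" + item for item in infra]  # infra first
    steps.append("deploy-service:" + service)           # then the code
    steps.append("smoke-test:" + service)               # confirm just this slice
    return steps
```

Because the plan covers only one service, a failure implicates a small batch of changes, which is what made diagnosis faster.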
Chasing variations between environments became more common and took longer and longer.
The final nail in our microservices dream came when we realised that although we had the three clear business Domains, the actual business processes that we were codifying crossed the domain boundaries.
As an example, to calculate the final price of a product for a specific customer, you needed to know who the customer was, which state they were in, what promotions they had access to, what else they had ordered, what the product was, which state it was coming from, and a bunch of other information.
So to determine Product price we needed changes in multiple Services across the multiple teams. Now we were batching changes as we waited for other teams to complete their part of the business feature. This was resulting in less frequent, even larger deployments.
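A toy version of the cross-domain pricing problem: even this stripped-down calculation needs inputs owned by the Pricing/Promo, Retailer and Supplier domains. The discount rule and freight uplift are invented numbers, not the real pricing logic.

```python
# Sketch of why a price calculation crossed domain boundaries:
# the final price combines data that three different teams owned.

def final_price(base_price: float,
                retailer_state: str,
                supplier_state: str,
                promo_discount: float) -> float:
    """Combine inputs from several domains into one price."""
    price = base_price * (1.0 - promo_discount)   # Pricing/Promo domain
    if retailer_state != supplier_state:          # Retailer + Supplier domains
        price += 2.50                             # illustrative interstate freight
    return round(price, 2)
```

A one-line change to any of these rules touches services owned by different teams, which is exactly the coordination cost described above.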
It was all getting very hard and the deadline was looming. And our cash reserves were decreasing!
Actually skipping forward two years – it’s a very different situation now.
We successfully achieved that deadline and many more since
Last month we deployed changes to Production 55 times (roughly 2 a day)
In that month we had three low-severity defects.
Each one was detected and resolved in less than 30 minutes.
Our average Cycle Time on a new Story is about a day.
And no one works long hours or on the weekends
So how did we get there?
Firstly we made the team smaller
That change forced us to focus on prioritising Value, and reduced the Comms overhead and reduced the dependencies
We moved all of our Configuration into Parameter Store and refactored our build pipelines to be able to deploy a Service or Infrastructure in a more consistent, decoupled fashion that minimised the occurrences of environment mismatches.
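The configuration move can be sketched like this. In practice the values would sit in AWS Systems Manager Parameter Store (fetched with something like boto3's ssm.get_parameter); here a plain dict stands in so the lookup-and-fallback shape is visible and testable, and the /environment/service/name key convention is an assumption.

```python
# Sketch of centralised configuration with a shared fallback, so
# environments cannot silently diverge from each other.

class ConfigStore:
    def __init__(self, parameters: dict):
        # keys follow an assumed /environment/service/name convention
        self.parameters = parameters

    def get(self, env: str, service: str, name: str, default=None):
        """Look up an environment-specific value, falling back to a
        shared default, then to the caller's default."""
        for key in ("/%s/%s/%s" % (env, service, name),
                    "/shared/%s/%s" % (service, name)):
            if key in self.parameters:
                return self.parameters[key]
        return default
```

With every pipeline reading the same store, "chasing variations between environments" becomes a diff of parameters rather than an investigation.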
And most importantly, we realised that our Domain model idea was wrong. We weren’t building a bunch of loosely coupled microservices that were all small single-function modules
We had a bunch of larger Domain services that were overlaid with a set of business processes that touched all of those core services.
Those business processes were instantiated as a collection of Lambdas and Queues, Streams and Messages, and, yes, single-function microservices. Those processes were encapsulated in the Environment configuration, and in the infrastructure settings.
We had a Pricing service – It was big and we broke it up
But every time there was a Pricing business change we found that we had to change every single pricing microservice
A lot of those business processes are synchronous. Most are short-running and bursty.
Some are long-running
Some are scheduled batch jobs
They all leverage the Domain primitives.
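The final shape, business processes orchestrating larger domain primitives, can be sketched like this. The Inventory and Transport stand-ins and the despatch steps are invented for illustration; the real processes ran over Lambdas, queues and streams.

```python
# Sketch of a short-running business process composed from larger
# domain services ("domain primitives"), rather than one
# microservice per function.

class Inventory:
    """Domain primitive: stock reservations."""
    def __init__(self, stock):
        self.stock = stock

    def reserve(self, sku, qty):
        if self.stock.get(sku, 0) >= qty:
            self.stock[sku] -= qty
            return True
        return False


class Transport:
    """Domain primitive: delivery bookings."""
    def __init__(self):
        self.bookings = []

    def book(self, order_id):
        self.bookings.append(order_id)


def despatch_process(order, inventory, transport):
    """A business process that touches several domain primitives."""
    events = []
    if inventory.reserve(order["sku"], order["qty"]):
        events.append("stock-reserved")
        transport.book(order["id"])
        events.append("transport-booked")
    else:
        events.append("back-ordered")
    return events
```

A business change then lands in one process function, not in a scattering of single-purpose services.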
There are a bunch of services and infrastructure and configuration that encapsulate our business processes
Our architecture doesn’t look like this:
It looks a lot like the monolith that we started with:
I think that you could describe it as a distributed monolith
And that’s a good thing, because that is what we needed.