VINCENT KOK | ENGINEERING MANAGER, TRELLO | @VINCENTKOK
Microservices
5... make that 6 things I wish I’d known
Part-time speaker
For fun and zero profit
About me: @vincentkok
Trello
Engineering Manager on the
Trello team
Dutch
You probably heard that already ;)
Microservices
Everybody seems to want them. Do we
really know the impact of our choices?
Why do we want them so badly?
Microservices are messy!
https://flic.kr/p/9u5pDA
http://geek-and-poke.com/geekandpoke/2013/7/13/foodprints
Grow Fat
Code base grows. All
the things slow
down.
Age
Your code base will
become a Jurassic
Park. Introducing new
tech becomes hard
Ownership
Who is responsible
for which part and,
more importantly, who
has the pager?
Economies of
Scale
The bigger the team
the more they
interrupt each other
Monolith issues
8100
Build jobs ran last week
31992
Automated tests
The cause of issues can be extremely hard to find
Who has the pager?
INCIDENT RESPONSE
Remember, we’re not all
webscale
Optimise for rapid and
sustainable flow of value.
DAN NORTH
Small
The size will be
reasonable and
manageable
Independent
lifecycle
Nothing will hold the
team back. Go as
fast as you can
Optimise for
the problem
Pick the solution and
tech based on the
problem at hand
Replaceable
It is easier to replace
if there is a need for
it
The microservice promise
Patterns
Basics
Deployments
Testing
Security
Operations
https://flic.kr/p/9t2138
Decomposition
#1: Basics
https://flic.kr/p/5E9ZF
MINIMAL SERVICE
Health check
200: app is alive. 500: app is unhealthy,
destroy the node
Stateless*
Run as many nodes as you need
Expose a port
The only access to the service
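The health check above can be sketched with nothing but the Python standard library. This is a minimal illustration, not how any particular platform implements it: the `/healthcheck` path, the `HealthHandler` class, and the `app_is_healthy` flag are all assumptions made for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def health_status(app_is_healthy: bool) -> int:
    """Map the app's own view of its health to the status code on the slide."""
    return 200 if app_is_healthy else 500


class HealthHandler(BaseHTTPRequestHandler):
    # Hypothetical flag; a real service would derive this from its own state.
    app_is_healthy = True

    def do_GET(self):
        if self.path != "/healthcheck":  # path name is an assumption
            self.send_error(404)
            return
        self.send_response(health_status(self.app_is_healthy))
        self.send_header("Content-Length", "0")
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Expose the single port the service is reachable on."""
    HTTPServer(("", port), HealthHandler).serve_forever()
```

Because the handler keeps no per-request state, any number of identical nodes can run behind a load balancer, and a node answering 500 can simply be destroyed and replaced.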
DEEPCHECK
Deep check
Quickly discover if a service
fails to connect to a dependency
DEEPCHECK EXAMPLE
{
  "avatar": {
    "details": {
      "avatarRepository": {
        "isHealthy": true
      },
      "crowd": {
        "isHealthy": true
      },
      "deadlock": {
        "isHealthy": true
      }
    }
  }
}
CODE & BUILDS
1 repository 1 build
Libraries
Feel free to use
shared libraries but
keep them loose
Config
Be aware of the
configuration
lifecycle
Schemas
Make sure that
services are resilient
to schema changes
-> Postel’s law
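Postel's law ("be conservative in what you send, be liberal in what you accept") often shows up in code as a tolerant reader. The sketch below is illustrative only: the `read_user` function and the field names `id`, `name`, and `locale` are invented for the example.

```python
import json


def read_user(payload: str) -> dict:
    """Pick out only the fields this service needs; tolerate everything else."""
    data = json.loads(payload)
    return {
        "id": data["id"],                      # required: fail loudly if absent
        "name": data.get("name", ""),          # optional: default when missing
        "locale": data.get("locale", "en"),    # optional: default when missing
    }
    # Unknown fields are ignored, so the producer can add new ones safely.
```

A consumer written this way keeps working when the producer adds a field, which lets the two services deploy on independent schedules.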
Testing
Test in isolation.
Keep them decoupled
Strict separation of config
from code.
12 FACTOR APP
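One common way to keep config out of code is to read it from environment variables at start-up. The following is a minimal sketch; the `load_config` function and the variable names `SMTP_HOST` and `SMTP_PORT` are assumptions chosen for illustration.

```python
import os


def load_config(env=os.environ) -> dict:
    """Build the service configuration from the environment, not from code."""
    return {
        # Required setting: a missing value raises KeyError at start-up.
        "smtp_host": env["SMTP_HOST"],
        # Optional setting with a default; environment values are strings.
        "smtp_port": int(env.get("SMTP_PORT", "587")),
    }
```

The same build artifact can then run in staging and production, with only the environment differing.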
Redeploy
Part of the service
configuration.
Configuration lifecycles
Instant change
Switches you would like to
enable/disable straight away
Rebuild
Rebuild to apply changes
Treat them as cattle, not
pets.
BILL BAKER
#2: Deployments
https://flic.kr/p/qP31Tf
Only one person
There is only one person in
the team that owns it
Deployment smells
Takes more than 15
mins
Setting it up should be quick
and the initial deployment should
be quick
Requires a ticket
A ticket for the deployment
team
Always deploy an empty
service into production
ME AND PROBABLY OTHERS
Developers in control
Artifact
What is the artifact we’re running.
We’re mostly standardising on Docker
Resources
What resources are required: RDS,
SQS, Dynamo, etc.
Compute
Which EC2 instance type do we want, how
many of them, and when do we scale?
Alarms
What are the alarm thresholds for this
service?
Ownership
Who owns the service?
Configuration
DECLARATIVE DEPLOYMENT
name: Confluence
description: Confluence Cloud
links:
  binary:
    type: docker
    name: docker.atlassian.io/confluence
    tag: latest
  healthcheck:
    uri: /wiki/internal/healthcheck
  deepcheck:
    uri: /wiki/internal/deepcheck
  semanticCheck:
CONFIGURATION
config:
  environmentVariables:
    ASAP_AUDIENCE: "foo"
    ASAP_ISSUER: "foo"
    CONFLUENCE_VERTIGO_SMTP_HOST: "smtp.foo.com"
    CONFLUENCE_VERTIGO_SMTP_PORT: "587"
    LOG4J_EXTRA_RULES: "log4j.logger.org.hiberate=DEBU
environmentOverrides:
  staging:
    config:
      environmentVariables:
        ASAP_PUBLIC_KEY_FALLBACK_REPOSITORY_URL:
RESOURCES
resources:
  - type: sqs
    name: default
    attributes:
      MaxReceiveCount: 20
      VisibilityTimeout: 60
scaling:
  instance: m3.xlarge
  min: 7
SIDECARS
compose:
  httpfrontend:
    image: nginx
    tag: '1.13.6'
    ports:
      - 8080:80
500
Services in production
#3: Testing
https://flic.kr/p/hn4K4b
Testing microservices
TESTING MONOLITHS IS EASY
Unit
Integration
UI
TESTING
Live service
Test against a real service
Mock service
Test against a mock service
In process
A local implementation of
your client
Out of process
Use tools like WireMock and
MockServer
Two options
MOCKING SERVICES - IN PROCESS
<beans profile="integration-test">
  <bean id="attachmentService"
        class="c.a.attachment.AttachmentStub"/>
</beans>
MOCKING SERVICES - WIREMOCK
{
  "request": {
    "url": "/rest/api/content",
    "method": "POST",
    "headers": {
      "Accept": {
        "matches": "application/json"
      }
    }
  },
  "response": {
    "status": 200
  }
}
Stable API
If it is external, it should
already have a CTK, so rely on
it
How to trust your mock?
Contract testing
Internal, fast-moving APIs can
benefit from this
Rely on monitoring
Small service, low MTTR
therefore low impact
Semantic Check
Automated test that runs against a
node before it will be added to the
load balancer
#4: Security
https://flic.kr/p/7LcF2W
OAuth 2.0
Grant a client access to
resources based on a newly
created set of credentials
Common standards
OpenID Connect
Identity on top of OAuth 2
OpenID
Allows identity and some
metadata only
How to secure a set of many
services?
SECURING SERVICES
ASAP
Atlassian Service Authentication Protocol
HOW DOES IT WORK?
Foo → Bar (JWT)
WHAT’S INSIDE?
Foo Bar
{
  "typ": "JWT",
  "kid": "foo/key1",
  "alg": "RS256"
}
{
  "sub": "32769:87e…",
  "aud": "bar",
  "nbf": 1494284564,
  "iss": "foo",
  "exp": 1494284624,
  "iat": 1494284564,
  "jti": "961253cf-ac…"
}
{
  "kid": "foo/key1"
}
{
  "sub": "32769:87e…",
  "aud": "bar",
  "iss": "foo"
}
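To make the token contents concrete, here is a rough sketch of how a receiving service such as Bar might read the header and claims and check audience and validity window. It uses only the standard library and deliberately leaves out the RS256 signature check, which a real receiver must perform against the issuer's public key before trusting anything in the token. The helper names are hypothetical and this is not the ASAP library's API.

```python
import base64
import json
import time


def b64url_decode(segment: str) -> bytes:
    """JWT segments are base64url without padding; restore it before decoding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def read_unverified(token: str):
    """Split header.claims.signature and decode the two JSON parts.

    WARNING: nothing is verified here; this only inspects the token.
    """
    header_b64, claims_b64, _signature = token.split(".")
    return json.loads(b64url_decode(header_b64)), json.loads(b64url_decode(claims_b64))


def claims_acceptable(claims: dict, expected_audience: str, now=None) -> bool:
    """Check that the token is addressed to us and inside its nbf..exp window."""
    now = time.time() if now is None else now
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    in_window = claims.get("nbf", 0) <= now < claims.get("exp", 0)
    return expected_audience in audiences and in_window
```

In the example token, `exp` is only 60 seconds after `iat`, so a captured token is useful to an attacker for a very short time.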
s2sauth.bitbucket.io
AVAILABLE ON BITBUCKET
#5: Operations
https://flic.kr/p/npbxAm
100 lbs
99% water
Dehydrate to 98% water
Guess the weight!
https://flic.kr/p/npbxAm
50 lbs
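The surprising 50 lbs follows from the solids staying constant: 100 lbs at 99% water contains 1 lb of solids, and once water is 98% that same 1 lb must be 2% of the total. A small worked check, using exact fractions (the function name is made up for the example):

```python
from fractions import Fraction


def weight_after_dehydration(total, water_before, water_after):
    """Solids are unchanged, so new_total = solids / (1 - water_after)."""
    solids = total * (1 - water_before)
    return solids / (1 - water_after)


new_weight = weight_after_dehydration(
    Fraction(100), Fraction(99, 100), Fraction(98, 100)
)
```

A one-point change in the percentage halves the total, which is the intuition the next slides apply to availability numbers.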
Uptime of a system with 30
services at 99.99% each?
TRANSLATING THIS TO A MICROSERVICE ARCHITECTURE
2 hours
99.99% ^ 30 ≈ 99.7%
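The arithmetic behind these numbers can be checked in a few lines. The sketch assumes that a request depends on all 30 services and that they fail independently, so availabilities multiply. It also assumes the "2 hours" refers to downtime over a 30-day month, which the slide does not state.

```python
def composite_availability(per_service: float, count: int) -> float:
    """Availability of a chain in which every service must be up."""
    return per_service ** count


def downtime_hours(availability: float, period_hours: float) -> float:
    """Expected downtime over a period for a given availability."""
    return (1 - availability) * period_hours


system = composite_availability(0.9999, 30)       # roughly 0.997
monthly = downtime_hours(system, 30 * 24)         # roughly 2.2 hours
single = downtime_hours(0.9999, 30 * 24)          # roughly 4 minutes, in hours
```

Each service on its own is down for only a few minutes a month, yet the chain as a whole loses about two hours.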
Failure is imminent
RESILIENCE
Circuit breakers
Write code with failure in
mind
Three must haves
Request tracing
Don’t spend hours debugging
Log aggregation
Stream all logs into one
place.
DO YOU KNOW YOUR SYSTEM?
CREATE INSIGHT: AGGREGATED LOGGING
Response times
How much time do services
spend calling other services.
Back pressure
Stop putting pressure on a
system that is in trouble
and fail fast
Fallback
How do you handle failure? A
mandatory step in the
programming model.
Circuit breakers
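The back pressure and fallback ideas can be illustrated with a very small circuit breaker. This is a teaching sketch, not the implementation of any specific library: the class name, thresholds, and the injectable `clock` are assumptions made for the example.

```python
import time


class CircuitBreaker:
    """Stop calling a failing dependency for a while and use a fallback."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after   # seconds to stay open before a retry
        self.clock = clock
        self.failures = 0
        self.opened_at = None            # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: fail fast, put no extra pressure on the dependency.
                return fallback()
            # Otherwise half-open: let a single trial call through.
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()   # open, or re-open after a trial
            return fallback()
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Making `fallback` a required argument reflects the point that handling failure is a mandatory step in the programming model rather than an afterthought.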
CREATE INSIGHT: CIRCUIT BREAKERS
Request Tracing
X-B3-TraceId : 1
X-B3-SpanId : 1
X-B3-TraceId : 1
X-B3-SpanId : 2
X-B3-ParentSpanId : 1
X-B3-TraceId : 1
X-B3-SpanId : 3
X-B3-ParentSpanId : 2
X-B3-TraceId : 1
X-B3-SpanId : 4
X-B3-ParentSpanId : 3
TRACE IDS
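The header chain above follows a simple rule: the trace id is passed along unchanged, every hop gets a fresh span id, and the caller's span id becomes the parent. A minimal sketch of that propagation, with a hypothetical `child_headers` helper and plain dicts standing in for real HTTP requests:

```python
import secrets


def child_headers(incoming: dict) -> dict:
    """Build B3 headers for an outgoing call made while handling `incoming`."""
    # Reuse the caller's trace id, or start a new trace at the edge.
    trace_id = incoming.get("X-B3-TraceId") or secrets.token_hex(8)
    outgoing = {
        "X-B3-TraceId": trace_id,
        "X-B3-SpanId": secrets.token_hex(8),   # new 64-bit span id for this hop
    }
    parent = incoming.get("X-B3-SpanId")
    if parent:
        outgoing["X-B3-ParentSpanId"] = parent
    return outgoing
```

Logging the trace id with every log line is what ties request tracing and log aggregation together: one search then returns the whole request path across services.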
You Build It You Run It
The team who builds it looks after it.
Ops Team
Hand over your services and let them
deal with the fun. Don’t do this.
#6: Decomposition
https://flic.kr/p/4hAC16
The monolith is deprecated
MAKE A STATEMENT
A CONFLUENCE EXAMPLE
Core functionality
Scheduler
Attachments
Operational
Transformation
Platform Services
Front end
Code
Team is responsible for the codebase.
Focus on ownership
Pipeline
Team responsible for CI and Deployment
Incidents
You build it, you run it
Decomposing core functionality
GraphQL service
What should you take home?
Basics
Services are cattle not pets.
Testing
Testing a monolith is “easy”. Think
about your service testing strategy.
Deployment
Deploying a service shouldn’t take
longer than 15 minutes
Operations
You build it you run it.
Security
Think about how you would like to secure
service-to-service communication
Focus on value
Optimise for rapid and sustainable
flow of value
Thanks!

Microservices: 5 things I wish I'd known (Java with the Best 2018)