SlideShare a Scribd company logo
1 of 364
Download to read offline
Microservices Workshop:
Why, what, and how to get there
Adrian Cockcroft @adrianco
Technology Fellow - Battery Ventures
October 2016
CC BY 4.0 https://creativecommons.org/licenses/by/4.0/legalcode
Workshop vs. Presentation
Questions at any time
Interactive discussions
Share your experiences
Everyone’s voice should be heard
What does @adrianco do?
@adrianco
Technology Due
Diligence on Deals
Presentations at
Companies and
Conferences
Tech and Board
Advisor
Support for
Portfolio
Companies
Consulting and
Training
Networking with
Interesting PeopleTinkering with
Technologies
Vendor
Relationships
Previously: Netflix, eBay, Sun Microsystems, Cambridge Consultants, City University London - BSc Applied Physics
Topics
Scene Setting
State of the Cloud
What Changes?
Product Processes
Microservices
State of the Art
Segmentation
What’s Missing?
Monitoring
Challenges
Migration
Response Times
Serverless
Lock-In
Teraservices
Wrap-Up
Setting the Scene
Why am I here?
%*&!”
By Simon Wardley http://enterpriseitadoption.com/
Why am I here?
%*&!”
By Simon Wardley http://enterpriseitadoption.com/
2009
Why am I here?
%*&!”
By Simon Wardley http://enterpriseitadoption.com/
2009
Why am I here?
@adrianco’s job at the
intersection of cloud
and Enterprise IT,
looking for disruption
and opportunities.
%*&!”
By Simon Wardley http://enterpriseitadoption.com/
20142009
Disruptions in 2016
coming from server-
less computing and
teraservices.
Typical reactions to my Netflix talks…
Typical reactions to my Netflix talks…
“You guys are
crazy! Can’t
believe it”
– 2009
Typical reactions to my Netflix talks…
“You guys are
crazy! Can’t
believe it”
– 2009
“What Netflix is doing
won’t work”
– 2010
Typical reactions to my Netflix talks…
“You guys are
crazy! Can’t
believe it”
– 2009
“What Netflix is doing
won’t work”
– 2010
It only works for
‘Unicorns’ like
Netflix”
– 2011
Typical reactions to my Netflix talks…
“You guys are
crazy! Can’t
believe it”
– 2009
“What Netflix is doing
won’t work”
– 2010
It only works for
‘Unicorns’ like
Netflix”
– 2011
“We’d like to do 

that but can’t”
– 2012
Typical reactions to my Netflix talks…
“You guys are
crazy! Can’t
believe it”
– 2009
“What Netflix is doing
won’t work”
– 2010
It only works for
‘Unicorns’ like
Netflix”
– 2011
“We’d like to do 

that but can’t”
– 2012
“We’re on our way using
Netflix OSS code”
– 2013
What I learned from my time at Netflix
What I learned from my time at Netflix
•Speed wins in the marketplace
What I learned from my time at Netflix
•Speed wins in the marketplace
•Remove friction from product development
What I learned from my time at Netflix
•Speed wins in the marketplace
•Remove friction from product development
•High trust, low process, no hand-offs between teams
What I learned from my time at Netflix
•Speed wins in the marketplace
•Remove friction from product development
•High trust, low process, no hand-offs between teams
•Freedom and responsibility culture
What I learned from my time at Netflix
•Speed wins in the marketplace
•Remove friction from product development
•High trust, low process, no hand-offs between teams
•Freedom and responsibility culture
•Don’t do your own undifferentiated heavy lifting
What I learned from my time at Netflix
•Speed wins in the marketplace
•Remove friction from product development
•High trust, low process, no hand-offs between teams
•Freedom and responsibility culture
•Don’t do your own undifferentiated heavy lifting
•Use simple patterns automated by tooling
What I learned from my time at Netflix
•Speed wins in the marketplace
•Remove friction from product development
•High trust, low process, no hand-offs between teams
•Freedom and responsibility culture
•Don’t do your own undifferentiated heavy lifting
•Use simple patterns automated by tooling
•Self service cloud makes impossible things instant
“You build it, you run it.”
Werner Vogels 2006
In 2014 Enterprises finally embraced
public cloud and in 2015 began
replacing entire datacenters.
In 2014 Enterprises finally embraced
public cloud and in 2015 began
replacing entire datacenters.
Oct 2014
In 2014 Enterprises finally embraced
public cloud and in 2015 began
replacing entire datacenters.
Oct 2014 Oct 2015
In 2014 Enterprises finally embraced
public cloud and in 2015 began
replacing entire datacenters.
Oct 2014 Oct 2015
Key Goals of the CIO?
Align IT with the business
Develop products faster
Try not to get breached
Security Blanket Failure
Insecure applications
hidden behind firewalls
make you feel safe until
the breach happens…
http://peanuts.wikia.com/wiki/Linus'_security_blanket
“Web scale”
vs.
“Enterprise”
“Webscale”
Freedom and responsibility
High trust
“Enterprise”
Bureaucracy and blame
Low trust
How can everyone get
speed, low cost, and better
usability?
Mixed methods:
Disaggregation into
microservices helps!
Example Monolith:
Sign
Up
Login
Home
Page
Payment
Method
Personal
Data
Reports
Monolithic
“kitchen sink”
database
Monolithic
application
Complex mix of
queries
User
Because one
part of the
monolithic
application and
database holds
sensitive data all
of it is subject to
the most rigorous
policies
Microservices version:
Sign
Up
Login
Home
Page
Payment
Method
Personal
Data
Reports
Optimized
datastores
Microservices
separation of concerns
Isolated single purpose
connections
User
Because each
microservice can
conform to the
appropriate policy,
demands for agility
can be separated
from requirements
for security
Segregated team owns
secure data sources and
infrequent updates
Segregated team owns
rapid improvement of
most common use cases
State of the Cloud
Previous Cloud Trend Updates
GigaOM Structure May 2014
D&B Cloud Innovation July 2015
GigaOM Structure November 2015
1
Previous Cloud Trend Updates
GigaOM Structure May 2014
D&B Cloud Innovation July 2015
GigaOM Structure November 2015
1
Trends from 2014: Noted as appropriate
Cloud Ecosystem Matures
Staying Power
Support
Scale
Location
Cloud Ecosystem Matures
Staying Power
Support
Scale
Location
Trends from 2014: Even more emphasis on location
Safe Bets
Safe Bets
Safe Bets
Trends from 2014: AWS capacity share vs. Azure still growing
AWS growing 60% year on year, Enterprise share growing fast
Azure providing strong Linux and Open Source support
The Global Land-Grab
Azure
AWS
GCE
20 Regions
11 Regions
6 Regions
http://aws.amazon.com/about-aws/global-infrastructure/
http://azure.microsoft.com/en-us/regions/
https://cloud.google.com/compute/docs/zones#available
http://www.google.com/about/datacenters/inside/locations/index.html
The Global Land-Grab
Azure
AWS
GCE
20 Regions
11 Regions
6 Regions
http://aws.amazon.com/about-aws/global-infrastructure/
http://azure.microsoft.com/en-us/regions/
https://cloud.google.com/compute/docs/zones#available
http://www.google.com/about/datacenters/inside/locations/index.html
The Global Land-Grab
Azure
AWS
GCE
20 Regions
11 Regions
6 Regions
http://aws.amazon.com/about-aws/global-infrastructure/
http://azure.microsoft.com/en-us/regions/
https://cloud.google.com/compute/docs/zones#available
http://www.google.com/about/datacenters/inside/locations/index.html
The Global Land-Grab
Azure
AWS
GCE
20 Regions
11 Regions
6 Regions
http://aws.amazon.com/about-aws/global-infrastructure/
http://azure.microsoft.com/en-us/regions/
https://cloud.google.com/compute/docs/zones#available
http://www.google.com/about/datacenters/inside/locations/index.html
The Global Land-Grab
Azure
AWS
GCE
20 Regions
11 Regions
6 Regions
http://aws.amazon.com/about-aws/global-infrastructure/
http://azure.microsoft.com/en-us/regions/
https://cloud.google.com/compute/docs/zones#available
http://www.google.com/about/datacenters/inside/locations/index.html
India, UK, Korea
coming in 2016
The Global Land-Grab
Azure
AWS
GCE
20 Regions
11 Regions
6 Regions
http://aws.amazon.com/about-aws/global-infrastructure/
http://azure.microsoft.com/en-us/regions/
https://cloud.google.com/compute/docs/zones#available
http://www.google.com/about/datacenters/inside/locations/index.html
India, UK, Korea
coming in 2016
Trends from 2014: AWS and Azure go after big markets
2016 Google finally decides it needs more regions too…
Questions we keep

hearing about cloud
Who is challenging
VMware for in-house
cloud automation?
Challenges for Private Cloud
Stability
Functionality
Total Cost of
Ownership
Challenges for Private Cloud
Stability
Functionality
Total Cost of
Ownership
Challenges for Private Cloud
Stability
Functionality
Total Cost of
Ownership
Some enterprise vendor
responses to cloud and container
ecosystem growth…
The ship is sinking, let’s re-brand as a submarine!
The ship is sinking, let’s re-brand as a submarine!
The ship is sinking, let’s merge with a submarine!
The ship is sinking, let’s merge with a submarine!
Look! we cut our ship in two really quickly!
What needs to
change?
Developer responsibilities:
Faster, cheaper, safer
“It isn't what we don't know that
gives us trouble, it's what we
know that ain't so.”
Will Rogers
Assumptions
Optimizations
Assumption:
Process prevents
problems
Organizations build up
slow complex “Scar
tissue” processes
"This is the IT swamp draining manual for anyone who is
neck deep in alligators.”
1984 2014
Product
Development
Processes
Waterfall Product Development
Business
Need
• Documents
• Weeks
Approval
Process
• Meetings
• Weeks
Hardware
Purchase
• Negotiations
• Weeks
Software
Development
• Specifications
• Weeks
Deployment and
Testing
• Reports
• Weeks
Customer
Feedback
• It sucks!
• Weeks
Waterfall Product Development
Hardware provisioning is undifferentiated heavy lifting – replace it with IaaS
Business
Need
• Documents
• Weeks
Approval
Process
• Meetings
• Weeks
Hardware
Purchase
• Negotiations
• Weeks
Software
Development
• Specifications
• Weeks
Deployment and
Testing
• Reports
• Weeks
Customer
Feedback
• It sucks!
• Weeks
Waterfall Product Development
Hardware provisioning is undifferentiated heavy lifting – replace it with IaaS
Business
Need
• Documents
• Weeks
Approval
Process
• Meetings
• Weeks
Hardware
Purchase
• Negotiations
• Weeks
Software
Development
• Specifications
• Weeks
Deployment and
Testing
• Reports
• Weeks
Customer
Feedback
• It sucks!
• Weeks
IaaS
Cloud
Waterfall Product Development
Hardware provisioning is undifferentiated heavy lifting – replace it with IaaS
Business
Need
• Documents
• Weeks
Software
Development
• Specifications
• Weeks
Deployment and
Testing
• Reports
• Weeks
Customer
Feedback
• It sucks!
• Weeks
Process Hand-Off Steps for Agile
Development on IaaS
Product Manager
Development Team
QA Integration
Team
Operations Deploy
Team
BI Analytics Team
IaaS Agile Product Development
Business Need
• Documents
• Weeks
Software Development
• Specifications
• Weeks
Deployment and Testing
• Reports
• Days
Customer Feedback
• It sucks!
• Days
IaaS Agile Product Development
Business Need
• Documents
• Weeks
Software Development
• Specifications
• Weeks
Deployment and Testing
• Reports
• Days
Customer Feedback
• It sucks!
• Days
etc…
IaaS Agile Product Development
Software provisioning is undifferentiated heavy lifting – replace it with PaaS
Business Need
• Documents
• Weeks
Software Development
• Specifications
• Weeks
Deployment and Testing
• Reports
• Days
Customer Feedback
• It sucks!
• Days
etc…
IaaS Agile Product Development
Software provisioning is undifferentiated heavy lifting – replace it with PaaS
Business Need
• Documents
• Weeks
Software Development
• Specifications
• Weeks
Deployment and Testing
• Reports
• Days
Customer Feedback
• It sucks!
• Days
PaaS
Cloud
etc…
IaaS Agile Product Development
Software provisioning is undifferentiated heavy lifting – replace it with PaaS
Business Need
• Documents
• Weeks
Software Development
• Specifications
• Weeks
Customer Feedback
• It sucks!
• Days
etc…
Process for Continuous Delivery of
Features on PaaS
Product Manager
Developer
BI Analytics Team
PaaS CD Feature Development
Business Need
• Discussions
• Days
Software Development
• Code
• Days
Customer Feedback
• Fix this Bit!
• Hours
etc…
PaaS CD Feature Development
Building your own business apps is undifferentiated heavy lifting – use SaaS
Business Need
• Discussions
• Days
Software Development
• Code
• Days
Customer Feedback
• Fix this Bit!
• Hours
etc…
PaaS CD Feature Development
Building your own business apps is undifferentiated heavy lifting – use SaaS
Business Need
• Discussions
• Days
Software Development
• Code
• Days
Customer Feedback
• Fix this Bit!
• Hours
SaaS/
BPaaS
Cloud
etc…
PaaS CD Feature Development
Building your own business apps is undifferentiated heavy lifting – use SaaS
Business Need
• Discussions
• Days
Customer Feedback
• Fix this Bit!
• Hours
etc…
SaaS Based Business Application
Development
Business Need
•GUI Builder
•Hours
Customer Feedback
•Fix this bit!
•Seconds
SaaS Based Business Application
Development
Business Need
•GUI Builder
•Hours
Customer Feedback
•Fix this bit!
•Seconds
and thousands more…
Value Chain Mapping
Simon Wardley http://blog.gardeviance.org/2014/11/how-to-get-to-strategy-in-ten-steps.html
Related tools and training http://www.wardleymaps.com/
Value Chain Mapping
Simon Wardley http://blog.gardeviance.org/2014/11/how-to-get-to-strategy-in-ten-steps.html
Related tools and training http://www.wardleymaps.com/
Your unique product - Agile
Value Chain Mapping
Simon Wardley http://blog.gardeviance.org/2014/11/how-to-get-to-strategy-in-ten-steps.html
Related tools and training http://www.wardleymaps.com/
Your unique product - Agile Best of breed as a Service - Lean
Value Chain Mapping
Simon Wardley http://blog.gardeviance.org/2014/11/how-to-get-to-strategy-in-ten-steps.html
Related tools and training http://www.wardleymaps.com/
Your unique product - Agile
Undifferentiated
utility suppliers - 6sigma
Best of breed as a Service - Lean
Observe
Orient
Decide
Act Continuous
Delivery
Observe
Orient
Decide
Act
Land grab
opportunity Competitive
Move
Customer Pain
Point
Measure
Customers
Continuous
Delivery
Observe
Orient
Decide
Act
Land grab
opportunity Competitive
Move
Customer Pain
Point
INNOVATION
Measure
Customers
Continuous
Delivery
Observe
Orient
Decide
Act
Land grab
opportunity Competitive
Move
Customer Pain
Point
Analysis
Model
Hypotheses
INNOVATION
Measure
Customers
Continuous
Delivery
Observe
Orient
Decide
Act
Land grab
opportunity Competitive
Move
Customer Pain
Point
Analysis
Model
Hypotheses
BIG DATA
INNOVATION
Measure
Customers
Continuous
Delivery
Observe
Orient
Decide
Act
Land grab
opportunity Competitive
Move
Customer Pain
Point
Analysis
JFDI
Plan Response
Share Plans
Model
Hypotheses
BIG DATA
INNOVATION
Measure
Customers
Continuous
Delivery
Observe
Orient
Decide
Act
Land grab
opportunity Competitive
Move
Customer Pain
Point
Analysis
JFDI
Plan Response
Share Plans
Model
Hypotheses
BIG DATA
INNOVATION
CULTURE
Measure
Customers
Continuous
Delivery
Observe
Orient
Decide
Act
Land grab
opportunity Competitive
Move
Customer Pain
Point
Analysis
JFDI
Plan Response
Share Plans
Incremental
Features
Automatic
Deploy
Launch AB
Test
Model
Hypotheses
BIG DATA
INNOVATION
CULTURE
Measure
Customers
Continuous
Delivery
Observe
Orient
Decide
Act
Land grab
opportunity Competitive
Move
Customer Pain
Point
Analysis
JFDI
Plan Response
Share Plans
Incremental
Features
Automatic
Deploy
Launch AB
Test
Model
Hypotheses
BIG DATA
INNOVATION
CULTURE
CLOUD
Measure
Customers
Continuous
Delivery
Observe
Orient
Decide
Act
Land grab
opportunity Competitive
Move
Customer Pain
Point
Analysis
JFDI
Plan Response
Share Plans
Incremental
Features
Automatic
Deploy
Launch AB
Test
Model
Hypotheses
BIG DATA
INNOVATION
CULTURE
CLOUD
Measure
Customers
Continuous
Delivery
Observe
Orient
Decide
Act
Land grab
opportunity Competitive
Move
Customer Pain
Point
Analysis
JFDI
Plan Response
Share Plans
Incremental
Features
Automatic
Deploy
Launch AB
Test
Model
Hypotheses
BIG DATA
INNOVATION
CULTURE
CLOUD
Measure
Customers
Continuous
Delivery
Breaking Down the SILOs
Breaking Down the SILOs
QA DBA
Sys
Adm
Net
Adm
SAN
Adm
DevUX
Prod
Mgr
Breaking Down the SILOs
QA DBA
Sys
Adm
Net
Adm
SAN
Adm
DevUX
Prod
Mgr
Product Team Using Monolithic Delivery
Product Team Using Monolithic Delivery
Breaking Down the SILOs
QA DBA
Sys
Adm
Net
Adm
SAN
Adm
DevUX
Prod
Mgr
Product Team Using Microservices
Product Team Using Monolithic Delivery
Product Team Using Microservices
Product Team Using Microservices
Product Team Using Monolithic Delivery
Breaking Down the SILOs
QA DBA
Sys
Adm
Net
Adm
SAN
Adm
DevUX
Prod
Mgr
Product Team Using Microservices
Product Team Using Monolithic Delivery
Platform TeamProduct Team Using Microservices
Product Team Using Microservices
Product Team Using Monolithic Delivery
Breaking Down the SILOs
QA DBA
Sys
Adm
Net
Adm
SAN
Adm
DevUX
Prod
Mgr
Product Team Using Microservices
Product Team Using Monolithic Delivery
Platform Team
A
P
I
Product Team Using Microservices
Product Team Using Microservices
Product Team Using Monolithic Delivery
Breaking Down the SILOs
QA DBA
Sys
Adm
Net
Adm
SAN
Adm
DevUX
Prod
Mgr
Product Team Using Microservices
Product Team Using Monolithic Delivery
Platform Team
Re-Org from project teams to product teams
A
P
I
Product Team Using Microservices
Product Team Using Microservices
Product Team Using Monolithic Delivery
Release Plan
Developer
Developer
Developer
Developer
Developer
QA Release
Integration
Ops Replace Old
With New
Release
Monolithic service updates
Works well with a small number
of developers and a single
language like php, java or ruby
Release Plan
Developer
Developer
Developer
Developer
Developer
QA Release
Integration
Ops Replace Old
With New
Release
Bugs
Monolithic service updates
Works well with a small number
of developers and a single
language like php, java or ruby
Release Plan
Developer
Developer
Developer
Developer
Developer
QA Release
Integration
Ops Replace Old
With New
Release
Bugs
Bugs
Monolithic service updates
Works well with a small number
of developers and a single
language like php, java or ruby
Use monolithic apps for small teams,
simple systems and when you must,
to optimize for efficiency and latency
Developer
Developer
Developer
Developer
Developer
Old Release Still
Running
Release Plan
Release Plan
Release Plan
Release Plan
Immutable microservice deployment
scales, is faster with large teams and
diverse platform components
Developer
Developer
Developer
Developer
Developer
Old Release Still
Running
Release Plan
Release Plan
Release Plan
Release Plan
Deploy
Feature to
Production
Deploy
Feature to
Production
Deploy
Feature to
Production
Deploy
Feature to
Production
Immutable microservice deployment
scales, is faster with large teams and
diverse platform components
Developer
Developer
Developer
Developer
Developer
Old Release Still
Running
Release Plan
Release Plan
Release Plan
Release Plan
Deploy
Feature to
Production
Deploy
Feature to
Production
Deploy
Feature to
Production
Deploy
Feature to
Production
Bugs
Immutable microservice deployment
scales, is faster with large teams and
diverse platform components
Developer
Developer
Developer
Developer
Developer
Old Release Still
Running
Release Plan
Release Plan
Release Plan
Release Plan
Deploy
Feature to
Production
Deploy
Feature to
Production
Deploy
Feature to
Production
Deploy
Feature to
Production
Bugs
Deploy
Feature to
Production
Immutable microservice deployment
scales, is faster with large teams and
diverse platform components
Configure
Configure
Developer
Developer
Developer
Release Plan
Release Plan
Release Plan
Deploy
Standardized
Services
Standardized container deployment
saves time and effort
https://hub.docker.com
Configure
Configure
Developer
Developer
Developer
Release Plan
Release Plan
Release Plan
Deploy
Standardized
Services
Deploy
Feature to
Production
Deploy
Feature to
Production
Deploy
Feature to
Production
Bugs
Deploy
Feature to
Production
Standardized container deployment
saves time and effort
https://hub.docker.com
Developer Developer
Run What You Wrote
Developer Developer
Developer Developer
Run What You Wrote
Micro
service
Micro
service
Micro
service
Micro
service
Micro
service
Micro
service
Micro
service
Developer Developer
DeveloperDeveloper Developer
Run What You Wrote
Micro
service
Micro
service
Micro
service
Micro
service
Micro
service
Micro
service
Micro
service
Developer Developer
Monitoring
Tools
DeveloperDeveloper Developer
Run What You Wrote
Micro
service
Micro
service
Micro
service
Micro
service
Micro
service
Micro
service
Micro
service
Developer Developer
Site
Reliability
Monitoring
Tools
Availability
Metrics
99.95% customer
success rate
DeveloperDeveloper Developer
Run What You Wrote
Micro
service
Micro
service
Micro
service
Micro
service
Micro
service
Micro
service
Micro
service
Developer Developer
Manager Manager
Site
Reliability
Monitoring
Tools
Availability
Metrics
99.95% customer
success rate
DeveloperDeveloper Developer
Run What You Wrote
Micro
service
Micro
service
Micro
service
Micro
service
Micro
service
Micro
service
Micro
service
Developer Developer
Manager Manager
VP
Engineering
Site
Reliability
Monitoring
Tools
Availability
Metrics
99.95% customer
success rate
Non-Destructive Production Updates
● “Immutable Code” Service Pattern
● Existing services are unchanged, old code remains in service
● New code deploys as a new service group
● No impact to production until traffic routing changes
● A|B Tests, Feature Flags and Version Routing control traffic
● First users in the test cell are the developer and test engineers
● A cohort of users is added looking for measurable improvement
Deliver four features every four weeks
Work In Progress = 4
Opportunity for bugs: 100% (baseline)
Time to debug each: 100% (baseline)
Deliver four features every four weeks
Bugs! Which feature broke?
Need more time to test!
Extend release to six weeks?
Work In Progress = 4
Opportunity for bugs: 100% (baseline)
Time to debug each: 100% (baseline)
Deliver four features every four weeks
But: risk of bugs in delivery increases with interactions!
Bugs! Which feature broke?
Need more time to test!
Extend release to six weeks?
Work In Progress = 4
Opportunity for bugs: 100% (baseline)
Time to debug each: 100% (baseline)
Deliver four features every four weeks
16
16
16
But: risk of bugs in delivery increases with interactions!
Bugs! Which feature broke?
Need more time to test!
Extend release to six weeks?
Work In Progress = 4
Opportunity for bugs: 100% (baseline)
Time to debug each: 100% (baseline)
Deliver six features every six weeks
Deliver six features every six weeks
Work In Progress = 6
Individual bugs: 150%
Interactions: 150%?
Deliver six features every six weeks
More features
What broke?
More interactions
Even more bugs!!
Work In Progress = 6
Individual bugs: 150%
Interactions: 150%?
36
36
Deliver six features every six weeks
More features
What broke?
More interactions
Even more bugs!!
Work In Progress = 6
Individual bugs: 150%
Interactions: 150%?
36
36
Deliver six features every six weeks
Risk of bugs in delivery increased to 225% of original!
More features
What broke?
More interactions
Even more bugs!!
Work In Progress = 6
Individual bugs: 150%
Interactions: 150%?
4
4
4
4
4
4
Deliver two features every two weeks
Complexity of delivery decreased by 75% from original
Fewer interactions
Fewer bugs
Better flow
Less Work In Progress
Work In Progress = 2
Opportunity for bugs: 50%
Time to debug each: 50%
Change One Thing at a Time!
If it hurts, do it more often!
What Happened?
Rate of change
increased
Cost and size and
risk of change
reduced
Low Cost of Change Using Docker
Developers
• Compile/Build
• Seconds
Extend container
• Package dependencies
• Seconds
Deploy Container
• Docker startup
• Seconds
Low Cost of Change Using Docker
Fast tooling supports continuous delivery of many tiny changes
Developers
• Compile/Build
• Seconds
Extend container
• Package dependencies
• Seconds
Deploy Container
• Docker startup
• Seconds
Disruptor:
Continuous Delivery with
Containerized Microservices
It’s what you know that isn’t so
It’s what you know that isn’t so
● Make your assumptions explicit
It’s what you know that isn’t so
● Make your assumptions explicit
● Extrapolate trends to the limit
It’s what you know that isn’t so
● Make your assumptions explicit
● Extrapolate trends to the limit
● Listen to non-customers
It’s what you know that isn’t so
● Make your assumptions explicit
● Extrapolate trends to the limit
● Listen to non-customers
● Follow developer adoption, not IT spend
It’s what you know that isn’t so
● Make your assumptions explicit
● Extrapolate trends to the limit
● Listen to non-customers
● Follow developer adoption, not IT spend
● Map evolution of products to services to utilities
It’s what you know that isn’t so
● Make your assumptions explicit
● Extrapolate trends to the limit
● Listen to non-customers
● Follow developer adoption, not IT spend
● Map evolution of products to services to utilities
● Re-organize your teams for speed of execution
Nuances of Lock-In
Most IT Ops people will
try to avoid lock-in
Most product
developers will pick the
best of breed option
Analogy:
Dev’s get to go on dates and get married
Ops get to run the family and get divorced
DevOps gets the whole lifecycle!
DevOps to the rescue!
https://www.youtube.com/watch?v=7g3uqSzWVZs
"End the practice of awarding business on the
basis of a price tag. Instead, minimize total cost.
Move toward a single supplier for any one item,
on a long-term relationship of loyalty and trust.”
W. Edwards Deming - 4th Point
"End the practice of awarding business on the
basis of a price tag. Instead, minimize total cost.
Move toward a single supplier for any one item,
on a long-term relationship of loyalty and trust.”
How did we end up here?
dysfunctional exploitation and abuse
Project vs. Product
Leads to lock-in Evolves to follow
best of breed
Evolution
Technology Refresh
Move to open Source
On-prem -> as a Service
Best of breed is now OSS and as a
Service
Less inherent lock-in
What kinds of lock-in
are there?
Business lock-in
Hardest to escape…
e.g. compliance with laws that exclude
alternatives based on jurisdiction or certification
Legal lock-in
e.g. compliance with laws that exclude
alternatives based on jurisdiction or certification
Financial lock-in
e.g. budget spent in advance on long term
deal with a vendor
Legal lock-in
e.g. compliance with laws that exclude
alternatives based on jurisdiction or certification
Contractual lock-in
e.g. partnership or investment deal with
one vendor prevents using alternatives
Financial lock-in
e.g. budget spent in advance on long term
deal with a vendor
Legal lock-in
Technology lock-in
Possible to escape given time and work…
Proximity lock-in
e.g. chatty clients don’t work unless they
are co-located with their server
e.g. quorum based availability (C*, Riak) needs
three zones/datacenters per region
Topology lock-in
Proximity lock-in
e.g. chatty clients don’t work unless they
are co-located with their server
e.g. quorum based availability (C*, Riak) needs
three zones/datacenters per region
Topology lock-in
Proximity lock-in
e.g. chatty clients don’t work unless they
are co-located with their server
Implementation
e.g. interface is the same but behavior is different
Soft lock-in
Relatively easy to escape…
Interface lock-in
e.g. different APIs that get the same result,
easy to hide behind an abstraction layer
Query syntax lock-in
e.g. SQL variants for different databases
Interface lock-in
e.g. different APIs that get the same result,
easy to hide behind an abstraction layer
Query syntax lock-in
e.g. SQL variants for different databases
Interface lock-in
e.g. different APIs that get the same result,
easy to hide behind an abstraction layer
Web service lock-in
Interface lock-in, but remote access
unlocks ability to migrate applications
Data gravity lock-in
e.g. lots of data to move or duplicate
Query syntax lock-in
e.g. SQL variants for different databases
Interface lock-in
e.g. different APIs that get the same result,
easy to hide behind an abstraction layer
Web service lock-in
Interface lock-in, but remote access
unlocks ability to migrate applications
Cloud native
microservices
Example for discussion
Example for discussion
Example for discussion
Example for discussion
Example for discussion
AWS Aurora
Example for discussion
AWS Aurora
Example for discussion
Microservices
A Microservice Definition
Loosely coupled service oriented
architecture with bounded contexts
A Microservice Definition
Loosely coupled service oriented
architecture with bounded contexts
If every service has to be
updated at the same time
it’s not loosely coupled
A Microservice Definition
Loosely coupled service oriented
architecture with bounded contexts
If every service has to be
updated at the same time
it’s not loosely coupled
If you have to know too much about surrounding
services you don’t have a bounded context. See the
Domain Driven Design book by Eric Evans.
Coupling Concerns
http://en.wikipedia.org/wiki/Conway's_law
●Conway’s Law - organizational coupling
●Centralized Database Schemas
●Enterprise Service Bus - centralized message queues
●Inflexible Protocol Versioning
Speeding Up The Platform
Datacenter Snowflakes
• Deploy in months
• Live for years
Speeding Up The Platform
Datacenter Snowflakes
• Deploy in months
• Live for years
Virtualized and Cloud
• Deploy in minutes
• Live for weeks
Speeding Up The Platform
Datacenter Snowflakes
• Deploy in months
• Live for years
Virtualized and Cloud
• Deploy in minutes
• Live for weeks
Container Deployments
• Deploy in seconds
• Live for minutes/hours
Speeding Up The Platform
Datacenter Snowflakes
• Deploy in months
• Live for years
Virtualized and Cloud
• Deploy in minutes
• Live for weeks
Container Deployments
• Deploy in seconds
• Live for minutes/hours
Lambda Deployments
• Deploy in milliseconds
• Live for seconds
Speeding Up The Platform
AWS Lambda is leading exploration of serverless architectures in 2016
Datacenter Snowflakes
• Deploy in months
• Live for years
Virtualized and Cloud
• Deploy in minutes
• Live for weeks
Container Deployments
• Deploy in seconds
• Live for minutes/hours
Lambda Deployments
• Deploy in milliseconds
• Live for seconds
Separate Concerns with Microservices
http://en.wikipedia.org/wiki/Conway's_law
● Invert Conway’s Law – teams own service groups and backend stores
● One “verb” per single function micro-service, size doesn’t matter
● One developer independently produces a micro-service
● Each micro-service is it’s own build, avoids trunk conflicts
● Deploy in a container: Tomcat, AMI or Docker, whatever…
● Stateless business logic. Cattle, not pets.
● Stateful cached data access layer using replicated ephemeral instances
Inspiration
http://www.infoq.com/presentations/Twitter-Timeline-Scalability
http://www.infoq.com/presentations/twitter-soa
http://www.infoq.com/presentations/Zipkin
http://www.infoq.com/presentations/scale-gilt
Go-Kit https://www.youtube.com/watch?v=aL6sd4d4hxk
http://www.infoq.com/presentations/circuit-breaking-distributed-systems
https://speakerdeck.com/mattheath/scaling-micro-services-in-go-highload-plus-plus-2014
State of the Art in Web Scale
Microservice Architectures
AWS Re:Invent : Asgard to Zuul https://www.youtube.com/watch?v=p7ysHhs5hl0
Resiliency at Massive Scale https://www.youtube.com/watch?v=ZfYJHtVL1_w
Microservice Architecture https://www.youtube.com/watch?v=CriDUYtfrjs
New projects for 2015 and Docker Packaging https://www.youtube.com/watch?v=hi7BDAtjfKY
Spinnaker deployment pipeline https://www.youtube.com/watch?v=dwdVwE52KkU
http://www.infoq.com/presentations/spring-cloud-2015
Microservice Architectures
ConfigurationTooling Discovery Routing Observability
Development: Languages and Container
Operational: Orchestration and Deployment Infrastructure
Datastores
Policy: Architectural and Security Compliance
Microservices
Edda
Archaius
Configuration
Spinnaker
SpringCloud
Tooling
Eureka
Prana
Discovery
Denominator
Zuul
Ribbon
Routing
Hystrix
Pytheus
Atlas
Observability
Development using Java, Groovy, Scala, Clojure, Python with AMI and Docker Containers
Orchestration with Autoscalers on AWS, Titus exploring Mesos & ECS for Docker
Ephemeral datastores using Dynomite, Memcached, Astyanax, Staash, Priam, Cassandra
Policy via the Simian Army - Chaos Monkey, Chaos Gorilla, Conformity Monkey, Security Monkey
Cloud Native Storage
Business
Logic
Database
Master
Fabric
Storage
Arrays
Database
Slave
Fabric
Storage
Arrays
Cloud Native Storage
Business
Logic
Database
Master
Fabric
Storage
Arrays
Database
Slave
Fabric
Storage
Arrays
Business
Logic
Cassandra
Zone A nodes
Cassandra
Zone B nodes
Cassandra
Zone C nodes
Cloud Object
Store Backups
Cloud Native Storage
Business
Logic
Database
Master
Fabric
Storage
Arrays
Database
Slave
Fabric
Storage
Arrays
Business
Logic
Cassandra
Zone A nodes
Cassandra
Zone B nodes
Cassandra
Zone C nodes
Cloud Object
Store Backups
SSDs inside
arrays disrupt
incumbent
suppliers
Cloud Native Storage
Business
Logic
Database
Master
Fabric
Storage
Arrays
Database
Slave
Fabric
Storage
Arrays
Business
Logic
Cassandra
Zone A nodes
Cassandra
Zone B nodes
Cassandra
Zone C nodes
Cloud Object
Store Backups
SSDs inside
ephemeral
instances
disrupt an
entire industry
SSDs inside
arrays disrupt
incumbent
suppliers
Cloud Native Storage
Business
Logic
Database
Master
Fabric
Storage
Arrays
Database
Slave
Fabric
Storage
Arrays
Business
Logic
Cassandra
Zone A nodes
Cassandra
Zone B nodes
Cassandra
Zone C nodes
Cloud Object
Store Backups
SSDs inside
ephemeral
instances
disrupt an
entire industry
SSDs inside
arrays disrupt
incumbent
suppliers
NetflixOSS Uses Priam to create Cassandra clusters in minutes
Twitter Microservices
Decider
ConfigurationTooling
Finagle
Zookeeper
Discovery
Finagle
Netty
Routing
Zipkin
Observability
Scala with JVM Container
Orchestration using Aurora deployment in datacenters using Mesos
Custom Cassandra-like datastore: Manhattan
Twitter Microservices
Decider
ConfigurationTooling
Finagle
Zookeeper
Discovery
Finagle
Netty
Routing
Zipkin
Observability
Scala with JVM Container
Orchestration using Aurora deployment in datacenters using Mesos
Custom Cassandra-like datastore: Manhattan
Focus on efficient datacenter deployment at scale
Gilt Microservices
Decider
Configuration
Ion Cannon
SBT
Rake
Tooling
Finagle
Zookeeper
Discovery
Akka
Finagle
Netty
Routing
Zipkin
Observability
Scala and Ruby with Docker Containers
Deployment on AWS
Datastores per Microservice using MongoDB, Postgres, Voldemort
Gilt Microservices
Decider
Configuration
Ion Cannon
SBT
Rake
Tooling
Finagle
Zookeeper
Discovery
Akka
Finagle
Netty
Routing
Zipkin
Observability
Scala and Ruby with Docker Containers
Deployment on AWS
Datastores per Microservice using MongoDB, Postgres, Voldemort
Focus on fast development with Scala and Docker
Hailo Microservices
Configuration
Hubot
Janky
Jenkins
Tooling
go-platform
Discovery
go-platform
RabbitMQ
Routing Observability
Go using AMI Container and Docker
Deployment on AWS
Deployment on AWS
Hailo Microservices
Configuration
Hubot
Janky
Jenkins
Tooling
go-platform
Discovery
go-platform
RabbitMQ
Routing Observability
Go using AMI Container and Docker
Deployment on AWS
Deployment on AWS
See: go-micro and https://github.com/peterbourgon/gokit
Next Generation Applications
Fill in the gaps, rapidly evolving ecosystem choices
Wasabi
LaunchDarkly
Habitat
Configuration
Docker
Desktop
Spinnaker
Terraform
Tooling
Etcd
Eureka
Consul
Discovery
Compose
Linkerd
Weave
Routing
OpenTracing
Prometheus
Hystrix
Observability
Development: Containers and languages e.g. Docker, Lambda, Node.js, Go, Rust
Operational: Mesos, Kubernetes, Swarm, Nomad for private clouds. ECS, GKS for public
Datastores: Orchestrated, Distributed Ephemeral e.g. Cassandra, or DBaaS e.g. DynamoDB
Policy: Security compliance e.g. Vault, Docker Content Trust. Scaffolding e.g. Cloud Foundry
Segmentation
Industry Trends
Airgaps closing
Industrial IoT
Security blanket perimeter firewalls
Datacenter to cloud transitions
New systems of engagement
http://peanuts.wikia.com/wiki/Linus'_security_blanket
Policy
A can talk to B
B can talk to C
A must not talk to C
A B C
Y and Z failure modes
must be independent so
X can always succeed
Y
X
Z
Availability requirements drive
a need for distributed
segmentation
Choices?
Too many
choices!
Over-reliance on one
mechanism leads to abuse…
Lack of coordination across
many mechanisms leads to
fragility
Example
segmentation
mechanisms
B
Accounts and Roles
Who can set policy for what?
Needs distributed policy management
A C
Network Segmentation
Who controls the network?
A B C
Network Segmentation
Datacenter policies are based on
separation of duties. Tickets,
Network admins and VLANs
Network Segmentation
AWS VPC networking uses
developer-driven automation,
loses separation of duties…
VPC Abuse Antipattern
Lots of small VPC networks for
microservices, end up in IP address
space capacity management hell…
Hypervisor and Security
Group Segmentation
Distributed firewall rules
A B CA B
Security Group Abuse Antipattern
Too many microservices need to be in the
same group, overloads configuration
limitations
Kernel eBPF & Calico
IPtables Segmentation
Distributed firewall rules
A B CA B
IPtables Segmentation
Can use IP Sets to scale
Managed in the container host OS
Separates routing reachability from access
policy
Docker & Weave
Segmentation
Docker daemon manages connections
B CA B C
proxy:
build: ./proxy
ports:
- "8080:8080"
links:
- app
app:
build: ./app
links:
- db
db:
image: postgres
Docker Compose V1
proxy app db
8080
version: '2'
services:
proxy:
build: ./proxy
ports:
- "8080:8080"
networks:
- front
app:
build: ./app
networks:
- front
- back
db:
image: postgres
networks:
- back
networks:
front:
back:
Docker Compose V2
proxy app db
8080
front backfront
Docker Segmentation
Overlay network created and
managed by Docker or Weave.
DNS based lookups.
Segmentation Scalability
Real world microservices
architectures have hundreds to
thousands of distinct microservices
Segmentation Scalability
There’s often a few very popular
microservices that everyone else
wants to talk to
In Search of Segmentation
Ops
Dev
Datacenters
AD/LDAP Roles
VLAN Networks
Hypervisor
IPtables
Docker Links
In Search of Segmentation
Ops
Dev
Datacenters
AD/LDAP Roles
VLAN Networks
Hypervisor
IPtables
Docker Links
AWS Accounts
IAM Roles
VPC
Security Groups
Calico Policy
Docker Net/Weave
Hierarchical Segmentation
B CA
B C
E FD
E F
Homepage Team Security Group Reports Team Security Group
VPC Z - Manage a small number of large network spaces
D
Setup all the layers with Terraform?
AWS Account - Manage across multiple accounts
containers and links
What’s Missing?
@adrianco
Advanced Microservices Topics
Failure injection testing
Versioning, routing
Binary protocols and interfaces
Timeouts and retries
Denormalized data models
Monitoring, tracing
Simplicity through symmetry
@adrianco
Failure Injection Testing
Netflix Chaos Monkey, Simian Army, FIT and Gremlin
http://techblog.netflix.com/2011/07/netflix-simian-army.html
http://techblog.netflix.com/2014/10/fit-failure-injection-testing.html
http://techblog.netflix.com/2016/01/automated-failure-testing.html
https://www.infoq.com/presentations/failure-test-research-netflix
● Chaos Monkey - enforcing stateless business logic
● Chaos Gorilla - enforcing zone isolation/replication
● Chaos Kong - enforcing region isolation/replication
● Security Monkey - watching for insecure configuration settings
● FIT & Gremlin - inject errors to enforce robust dependencies
● See over 100 NetflixOSS projects at netflix.github.com
● Get “Technical Indigestion” reading techblog.netflix.com
Trust with Verification
@adrianco
Benefits of version aware routing
Immediately and safely introduce a new version
Canary test in production
Use DIY feature flags, , A|B tests with Wasabi
Route clients to a version so they can’t get disrupted
Change client or dependencies but not both at once
Eventually remove old versions
Incremental or infrequent “break the build” garbage collection
@adrianco
Versioning, Routing
Version numbering: Interface.Feature.Bugfix
V1.2.3 to V1.2.4 - Canary test then remove old version
V1.2.x to V1.3.x - Canary test then remove or keep both
Route V1.3.x clients to new version to get new feature
Remove V1.2.x only after V1.3.x is found to work for V1.2.x clients
V1.x.x to V2.x.x - Route clients to specific versions
Remove old server version when all old clients are gone
@adrianco
Protocols
Measure serialization, transmission, deserialization costs
Sending a megabyte of XML between microservices will
make you sad, but not as sad as 10yrs ago with SOAP
Use Thrift, Protobuf/gRPC, Avro, SBE internally
Use JSON for external/public interfaces
https://github.com/real-logic/simple-binary-encoding
@adrianco
Interfaces
When you build a service, build a “driver” client for it
Reference implementation error handling and serialization
Release automation stress test using client
Validate that service interface is usable!
Minimize additional dependencies
Swagger - OpenAPI Specification
Datawire Quark adds behaviors to API spec
@adrianco
Interface Version Pinning
Change one thing at a time!
Pin the version of everything else
Incremental build/test/deploy pipeline
Deploy existing app code with new platform
Deploy existing app code with new dependencies
Deploy new app code with pinned platform/dependencies
@adrianco
Interfaces between teams
@adrianco
Interfaces between teams
Client
Code
Minimal
Object Model
@adrianco
Interfaces between teams
Service
Code
Client
Code
Minimal
Object Model
Full Object
Model
@adrianco
Interfaces between teams
Service
Code
Client
Code
Minimal
Object Model
Full Object
Model
Cache
Code
Common
Object Model
Decoupled
object
models
@adrianco
Interfaces between teams
Service
Code
Client
Code
Minimal
Object Model
Service
Driver
Service
Handler
Full Object
Model
Cache
Code
Common
Object Model
Decoupled
object
models
@adrianco
Interfaces between teams
Service
Code
Client
Code
Minimal
Object Model
Cache
Driver
Service
Driver
Service
Handler
Full Object
Model
Cache
Code
Cache
Handler
Common
Object Model
Decoupled
object
models
@adrianco
Interfaces between teams
Service
Code
Client
Code
Minimal
Object Model
Cache
Driver
Service
Driver
Platform Platform
Service
Handler
Full Object
Model
Cache
Code
Platform
Cache
Handler
Common
Object Model
Decoupled
object
models
@adrianco
Interfaces between teams
Service
Code
Client
Code
Minimal
Object Model
Cache
Driver
Service
Driver
Platform Platform
Service
Handler
Full Object
Model
Cache
Code
Platform
Cache
Handler
Common
Object Model
Versioned
dependency
interfacesDecoupled
object
models
@adrianco
Interfaces between teams
Service
Code
Client
Code
Minimal
Object Model
Cache
Driver
Service
Driver
Platform Platform
Service
Handler
Full Object
Model
Cache
Code
Platform
Cache
Handler
Common
Object Model
Versioned
dependency
interfaces
Versioned
platform
interface
Decoupled
object
models
@adrianco
Interfaces between teams
Service
Code
Client
Code
Minimal
Object Model
Cache
Driver
Service
Driver
Platform Platform
Service
Handler
Full Object
Model
Cache
Code
Platform
Cache
Handler
Common
Object Model
Versioned
dependency
interfaces
Versioned
platform
interface
Decoupled
object
models
Versioned routing
@adrianco
Timeouts and Retries
Connection timeout vs. request timeout confusion
Usually setup incorrectly, global defaults
Systems collapse with “retry storms”
Timeouts too long, too many retries
Services doing work that can never be used
@adrianco
Connections and Requests
TCP makes a connection, HTTP makes a request
HTTP hopefully reuses connections for several requests
Both have different timeout and retry needs!
TCP timeout is purely a property of one network latency hop
HTTP timeout depends on the service and its dependencies
connection path
request path
@adrianco
Timeouts and Retries
Edge
Service
Good
Service
Good
Service
Bad config: Every service defaults to 2 second timeout, two retries
Edge
Service not
responding
Overloaded
service not
responding
Failed
Service
If anything breaks, everything upstream stops responding
Retries add unproductive work
@adrianco
Timeouts and Retries
Edge
Service
Good
Service
Good
Service
Bad config: Every service defaults to 2 second timeout, two retries
Edge
Service not
responding
Overloaded
service not
responding
Failed
Service
If anything breaks, everything upstream stops responding
Retries add unproductive work
@adrianco
Timeouts and Retries
Edge
Service
Good
Service
Good
Service
Bad config: Every service defaults to 2 second timeout, two retries
Edge
Service not
responding
Overloaded
service not
responding
Failed
Service
If anything breaks, everything upstream stops responding
Retries add unproductive work
@adrianco
Timeouts and Retries
Bad config: Every service defaults to 2 second timeout, two retries
Edge
service
responds
slowly
Overloaded
service
Partially
failed
service
@adrianco
Timeouts and Retries
Bad config: Every service defaults to 2 second timeout, two retries
Edge
service
responds
slowly
Overloaded
service
Partially
failed
service
First request from Edge timed out so it ignores the successful
response and keeps retrying. Middle service load increases as
it’s doing work that isn’t being consumed
@adrianco
Timeouts and Retries
Bad config: Every service defaults to 2 second timeout, two retries
Edge
service
responds
slowly
Overloaded
service
Partially
failed
service
First request from Edge timed out so it ignores the successful
response and keeps retrying. Middle service load increases as
it’s doing work that isn’t being consumed
@adrianco
Timeout and Retry Fixes
Cascading timeout budget
Static settings that decrease from the edge
or dynamic budget passed with request
How often do retries actually succeed?
Don’t ask the same instance the same thing
Only retry on a different connection
@adrianco
Timeouts and Retries
Edge
Service
Good
Service
Budgeted timeout, one retry
Failed
Service
@adrianco
Timeouts and Retries
Edge
Service
Good
Service
Budgeted timeout, one retry
Failed
Service
3s
1s
1s
Fast fail
response
after 2s
Upstream timeout must always be longer than
total downstream timeout * retries delay
No unproductive work while fast failing
@adrianco
Timeouts and Retries
Edge
Service
Good
Service
Budgeted timeout, failover retry
Failed
Service
For replicated services with multiple instances
never retry against a failed instance
No extra retries or unproductive work
Good
Service
@adrianco
Timeouts and Retries
Edge
Service
Good
Service
Budgeted timeout, failover retry
Failed
Service
3s 1s
For replicated services with multiple instances
never retry against a failed instance
No extra retries or unproductive work
Good
Service
Successful
response
delayed 1s
@adrianco
Manage Inconsistency
ACM Paper: "The Network is Reliable"
Distributed systems are inconsistent by nature
Clients are inconsistent with servers
Most caches are inconsistent
Versions are inconsistent
Get over it and
Deal with it
See http://queue.acm.org/detail.cfm?id=2655736
@adrianco
Denormalized Data Models
Any non-trivial organization has many databases
Cross references exist, inconsistencies exist
Microservices work best with individual simple stores
Scale, operate, mutate, fail them independently
NoSQL allows flexible schema/object versions
@adrianco
Denormalized Data Models
Build custom cross-datasource check/repair processes
Ensure all cross references are up to date
Read these Pat Helland papers
Immutability Changes Everything
http://highscalability.com/blog/2015/1/26/paper-immutability-changes-everything-by-pat-helland.html
Memories, Guesses and Apologies
https://blogs.msdn.microsoft.com/pathelland/2007/05/15/memories-guesses-and-apologies/
Standing on the Distributed Shoulders of Giants
http://queue.acm.org/detail.cfm?id=2953944
Cloud Native
Monitoring and
Microservices
Cloud Native Microservices
● High rate of change
Code pushes can cause floods of new instances and metrics
Short baseline for alert threshold analysis – everything looks unusual
● Ephemeral Configurations
Short lifetimes make it hard to aggregate historical views
Hand tweaked monitoring tools take too much work to keep running
● Microservices with complex calling patterns
End-to-end request flow measurements are very important
Request flow visualizations get overwhelmed
Microservice Based Architectures
See http://www.slideshare.net/LappleApple/gilt-from-monolith-ruby-app-to-micro-service-scala-service-architecture
Continuous Delivery and DevOps
● Changes are smaller but more frequent
● Individual changes are more likely to be broken
● Changes are normally deployed by developers
● Feature flags are used to enable new code
● Instant detection and rollback matters much more
Whoops! I didn’t mean that!
Reverting…



Not cool if it takes 5 minutes to see it failed and 5 more to see a fix

No-one notices if it only takes 5 seconds to detect and 5 to see a fix
NetflixOSS Hystrix/Turbine Circuit Breaker
http://techblog.netflix.com/2012/12/hystrix-dashboard-and-turbine.html
NetflixOSS Hystrix/Turbine Circuit Breaker
http://techblog.netflix.com/2012/12/hystrix-dashboard-and-turbine.html
Low Latency SaaS Based Monitors
https://www.datadoghq.com/ http://www.instana.com/ www.bigpanda.io www.vividcortex.com signalfx.com wavefront.com sysdig.com
dataloop.io
See www.battery.com for a list of portfolio investments
A Tragic Quadrant
Ability to scale
Ability to
handle
rapidly
changing
microservices
In-house tools
at web scale
companies
Most current
monitoring & APM
tools
Next generation
APM
Next generation
Monitoring
Datacenter
Cloud
Containers
100s 1,000s 10,000s 100,000s
Lambda
A Tragic Quadrant
Ability to scale
Ability to
handle
rapidly
changing
microservices
In-house tools
at web scale
companies
Most current
monitoring & APM
tools
Next generation
APM
Next generation
Monitoring
Datacenter
Cloud
Containers
100s 1,000s 10,000s 100,000s
Lambda
YMMV: Opinionated approximate positioning only
Metric to display latency needs to be
less than human attention span (~10s)
Challenges for
Microservice
Platforms
Managing Scale
A Possible Hierarchy
Continents
Regions
Zones
Services
Versions
Containers
Instances
How Many?
3 to 5
2-4 per Continent
1-5 per Region
100’s per Zone
Many per Service
1000’s per Version
10,000’s
It’s much more challenging
than just a large number of
machines
Flow
Some tools can show
the request flow
across a few services
Interesting
architectures have a
lot of microservices!
Flow visualization is
a big challenge.
See http://www.slideshare.net/LappleApple/gilt-from-monolith-ruby-app-to-micro-service-scala-service-architecture
Simulated Microservices
Model and visualize microservices
Simulate interesting architectures
Generate large scale configurations
Eventually stress test real tools
Code: github.com/adrianco/spigo
Simulate Protocol Interactions in Go
Visualize with D3
See for yourself: http://simianviz.surge.sh
Follow @simianviz for updates
ELB Load Balancer
Zuul
API Proxy
Karyon
Business Logic
Staash
Data Access Layer
Priam
Cassandra Datastore
Three
Availability
Zones
Denominator
DNS Endpoint
Spigo Nanoservice Structure
func Start(listener chan gotocol.Message) {
...
for {
select {
case msg := <-listener:
flow.Instrument(msg, name, hist)
switch msg.Imposition {
case gotocol.Hello: // get named by parent
...
case gotocol.NameDrop: // someone new to talk to
...
case gotocol.Put: // upstream request handler
...
outmsg := gotocol.Message{gotocol.Replicate, listener, time.Now(),
msg.Ctx.NewParent(), msg.Intention}
flow.AnnotateSend(outmsg, name)
outmsg.GoSend(replicas)
}
case <-eurekaTicker.C: // poll the service registry
...
}
}
}
Skeleton code for replicating a Put message
Instrument incoming requests
Instrument outgoing requests
update trace context
Flow Trace Records
riak2
us-east-1
zoneC
riak9
us-west-2
zoneA
Put s896
Replicate
riak3
us-east-1
zoneA
riak8
us-west-2
zoneC
riak4
us-east-1
zoneB
riak10
us-
west-2
us-east-1.zoneC.riak2 t98p895s896 Put
us-east-1.zoneA.riak3 t98p896s908 Replicate
us-east-1.zoneB.riak4 t98p896s909 Replicate
us-west-2.zoneA.riak9 t98p896s910 Replicate
us-west-2.zoneB.riak10 t98p910s912 Replicate
us-west-2.zoneC.riak8 t98p910s913 Replicate
staash
us-east-1
zoneC
s910 s908s913
s909s912
Replicate Put
Open Zipkin
A common format for trace annotations
A Java tool for visualizing traces
Standardization effort to fold in other formats
Driven by Adrian Cole (currently at Pivotal)
Extended to load Spigo generated trace files
Zipkin Trace Dependencies
Zipkin Trace Dependencies
Trace for one Spigo Flow
Global Visualization
NetflixOSS
Vizceral
https://github.com/Netflix/vizceral
https://github.com/adrianco/go-vizceral
Regional Visualization
NetflixOSS
Vizceral
Service Context Visualization
NetflixOSS
Vizceral
Simple Architecture Principles
Symmetry
Invariants
Stable assertions
No special cases
Migrating to Microservices
See for yourself: http://simianviz.surge.sh/migration
Endpoint
ELB
PHP
MySQL
MySQL
Next step Controls node
placement distance
Select models
Migrating to Microservices
See for yourself: http://simianviz.surge.sh/migration
Step 1 - Add Memcache Step 2 - Add Web Proxy Service
Migrating to Microservices
See for yourself: http://simianviz.surge.sh/migration
Step 3 - Add Data Access Layer Step 4 - Add Microservices
Data Access
node.js
memcache per zone
Migrating to Microservices
See for yourself: http://simianviz.surge.sh/migration
Step 5 - Add Cassandra Step 6 - Remove MySQL
12 node cross zone
Cassandra cluster
MySQL
Migrating to Microservices
See for yourself: http://simianviz.surge.sh/migration
Step 7 - Add Second Region Step 8 - Connect Cassandra Regions
Endpoint with
location
routed DNS
Migrating to Microservices
See for yourself: http://simianviz.surge.sh/migration
Step 9 - Add Third Region
Endpoint with
location
routed DNS
Definition of an architecture
{
"arch": "lamp",
"description":"Simple LAMP stack",
"version": "arch-0.0",
"victim": "webserver",
"services": [
{ "name": "rds-mysql", "package": "store", "count": 2, "regions": 1, "dependencies": [] },
{ "name": "memcache", "package": "store", "count": 1, "regions": 1, "dependencies": [] },
{ "name": "webserver", "package": "monolith", "count": 18, "regions": 1, "dependencies": ["memcache", "rds-mysql"] },
{ "name": "webserver-elb", "package": "elb", "count": 0, "regions": 1, "dependencies": ["webserver"] },
{ "name": "www", "package": "denominator", "count": 0, "regions": 0, "dependencies": ["webserver-elb"] }
]
}
Header includes
chaos monkey victim
New tier
name
Tier
package
0 = non
Regional
Node
count
List of tier
dependencies
See for yourself: http://simianviz.surge.sh/lamp
Running Spigo
$ ./spigo -a lamp -j -d 2
2016/01/26 23:04:05 Loading architecture from json_arch/lamp_arch.json
2016/01/26 23:04:05 lamp.edda: starting
2016/01/26 23:04:05 Architecture: lamp Simple LAMP stack
2016/01/26 23:04:05 architecture: scaling to 100%
2016/01/26 23:04:05 lamp.us-east-1.zoneB.eureka01....eureka.eureka: starting
2016/01/26 23:04:05 lamp.us-east-1.zoneA.eureka00....eureka.eureka: starting
2016/01/26 23:04:05 lamp.us-east-1.zoneC.eureka02....eureka.eureka: starting
2016/01/26 23:04:05 Starting: {rds-mysql store 1 2 []}
2016/01/26 23:04:05 Starting: {memcache store 1 1 []}
2016/01/26 23:04:05 Starting: {webserver monolith 1 18 [memcache rds-mysql]}
2016/01/26 23:04:05 Starting: {webserver-elb elb 1 0 [webserver]}
2016/01/26 23:04:05 Starting: {www denominator 0 0 [webserver-elb]}
2016/01/26 23:04:05 lamp.*.*.www00....www.denominator activity rate 10ms
2016/01/26 23:04:06 chaosmonkey delete: lamp.us-east-1.zoneC.webserver02....webserver.monolith
2016/01/26 23:04:07 asgard: Shutdown
2016/01/26 23:04:07 lamp.us-east-1.zoneB.eureka01....eureka.eureka: closing
2016/01/26 23:04:07 lamp.us-east-1.zoneA.eureka00....eureka.eureka: closing
2016/01/26 23:04:07 lamp.us-east-1.zoneC.eureka02....eureka.eureka: closing
2016/01/26 23:04:07 spigo: complete
2016/01/26 23:04:07 lamp.edda: closing
-a architecture lamp
-j graph json/lamp.json
-d run for 2 seconds
Riak IoT Architecture
{
"arch": "riak",
"description":"Riak IoT ingestion example for the RICON 2015 presentation",
"version": "arch-0.0",
"victim": "",
"services": [
{ "name": "riakTS", "package": "riak", "count": 6, "regions": 1, "dependencies": ["riakTS", "eureka"]},
{ "name": "ingester", "package": "staash", "count": 6, "regions": 1, "dependencies": ["riakTS"]},
{ "name": "ingestMQ", "package": "karyon", "count": 3, "regions": 1, "dependencies": ["ingester"]},
{ "name": "riakKV", "package": "riak", "count": 3, "regions": 1, "dependencies": ["riakKV"]},
{ "name": "enricher", "package": "staash", "count": 6, "regions": 1, "dependencies": ["riakKV", "ingestMQ"]},
{ "name": "enrichMQ", "package": "karyon", "count": 3, "regions": 1, "dependencies": ["enricher"]},
{ "name": "analytics", "package": "karyon", "count": 6, "regions": 1, "dependencies": ["ingester"]},
{ "name": "analytics-elb", "package": "elb", "count": 0, "regions": 1, "dependencies": ["analytics"]},
{ "name": "analytics-api", "package": "denominator", "count": 0, "regions": 0, "dependencies": ["analytics-elb"]},
{ "name": "normalization", "package": "karyon", "count": 6, "regions": 1, "dependencies": ["enrichMQ"]},
{ "name": "iot-elb", "package": "elb", "count": 0, "regions": 1, "dependencies": ["normalization"]},
{ "name": "iot-api", "package": "denominator", "count": 0, "regions": 0, "dependencies": ["iot-elb"]},
{ "name": "stream", "package": "karyon", "count": 6, "regions": 1, "dependencies": ["ingestMQ"]},
{ "name": "stream-elb", "package": "elb", "count": 0, "regions": 1, "dependencies": ["stream"]},
{ "name": "stream-api", "package": "denominator", "count": 0, "regions": 0, "dependencies": ["stream-elb"]}
]
}
New tier
name
Tier
package
Node
count
List of tier
dependencies
0 = non
Regional
Single Region Riak IoT
See for yourself: http://simianviz.surge.sh/riak
Single Region Riak IoT
IoT Ingestion Endpoint
Stream Endpoint
Analytics Endpoint
See for yourself: http://simianviz.surge.sh/riak
Single Region Riak IoT
IoT Ingestion Endpoint
Stream Endpoint
Analytics Endpoint
Load Balancer
Load Balancer
Load Balancer
See for yourself: http://simianviz.surge.sh/riak
Single Region Riak IoT
IoT Ingestion Endpoint
Stream Endpoint
Analytics Endpoint
Load Balancer
Normalization Services
Load Balancer
Load Balancer
Stream Service
Analytics Service
See for yourself: http://simianviz.surge.sh/riak
Single Region Riak IoT
IoT Ingestion Endpoint
Stream Endpoint
Analytics Endpoint
Load Balancer
Normalization Services
Enrich Message Queue
Riak KV
Enricher Services
Load Balancer
Load Balancer
Stream Service
Analytics Service
See for yourself: http://simianviz.surge.sh/riak
Single Region Riak IoT
IoT Ingestion Endpoint
Stream Endpoint
Analytics Endpoint
Load Balancer
Normalization Services
Enrich Message Queue
Riak KV
Enricher Services
Ingest Message Queue
Load Balancer
Load Balancer
Stream Service
Analytics Service
See for yourself: http://simianviz.surge.sh/riak
Single Region Riak IoT
IoT Ingestion Endpoint
Stream Endpoint
Analytics Endpoint
Load Balancer
Normalization Services
Enrich Message Queue
Riak KV
Enricher Services
Ingest Message Queue
Load Balancer
Load Balancer
Stream Service Riak TS
Analytics Service
Ingester Service
See for yourself: http://simianviz.surge.sh/riak
Two Region Riak IoT
IoT Ingestion Endpoint
Stream Endpoint
Analytics Endpoint
East Region Ingestion
West Region Ingestion
Multi Region TS Analytics
See for yourself: http://simianviz.surge.sh/riak
Two Region Riak IoT
IoT Ingestion Endpoint
Stream Endpoint
Analytics Endpoint
East Region Ingestion
West Region Ingestion
Multi Region TS Analytics
What’s the response
time of the stream
endpoint?
See for yourself: http://simianviz.surge.sh/riak
Response Times
What’s the response time distribution of a very
simple storage backed web service?
memcached
mysql
disk volume
web
service
load
generator
memcached
See http://www.getguesstimate.com/models/1307
memcached hit %
memcached response mysql response
service cpu time
memcached hit mode
mysql cache hit mode
mysql disk access mode
Hit rates: memcached 40% mysql 70%
Hit rates: memcached 60% mysql 70%
Hit rates: memcached 20% mysql 90%
Measuring
Response Time With
Histograms
Changes made to codahale/hdrhistogram
Changes made to go-kit/kit/metrics
Implementation in adrianco/spigo/collect
What to measure?
Client Server
GetRequest
GetResponse
Client
Time
Client Send CS
Server Receive SR
Server Send SS
Client Receive CR
Server
Time
What to measure?
Client Server
GetRequest
GetResponse
Client
Time
Client Send CS
Server Receive SR
Server Send SS
Client Receive CR
Response
CR-CS
Service
SS-SR
Network
SR-CS
Network
CR-SS
Net Round Trip
(SR-CS) + (CR-SS)
(CR-CS) - (SS-SR)
Server
Time
Spigo Histogram Results
Collected with: % spigo -d 60 -j -a storage -c
name: storage.*.*..load00...load.denominator_serv
quantiles: [{50 47103} {99 139263}]
From To Count Prob Bar
20480 21503 2 0.0007 :
21504 22527 2 0.0007 |
23552 24575 1 0.0003 :
24576 25599 5 0.0017 |
25600 26623 5 0.0017 |
26624 27647 1 0.0003 |
27648 28671 3 0.0010 |
28672 29695 5 0.0017 |
29696 30719 127 0.0421 |####
30720 31743 126 0.0418 |####
31744 32767 74 0.0246 |##
32768 34815 281 0.0932 |#########
34816 36863 201 0.0667 |######
36864 38911 156 0.0518 |#####
38912 40959 185 0.0614 |######
40960 43007 147 0.0488 |####
43008 45055 161 0.0534 |#####
45056 47103 125 0.0415 |####
47104 49151 135 0.0448 |####
49152 51199 99 0.0328 |###
51200 53247 82 0.0272 |##
53248 55295 77 0.0255 |##
55296 57343 66 0.0219 |##
57344 59391 54 0.0179 |#
59392 61439 37 0.0123 |#
61440 63487 45 0.0149 |#
63488 65535 33 0.0109 |#
65536 69631 63 0.0209 |##
69632 73727 98 0.0325 |###
73728 77823 92 0.0305 |###
77824 81919 112 0.0372 |###
81920 86015 88 0.0292 |##
86016 90111 55 0.0182 |#
90112 94207 38 0.0126 |#
94208 98303 51 0.0169 |#
98304 102399 32 0.0106 |#
102400 106495 35 0.0116 |#
106496 110591 17 0.0056 |
110592 114687 19 0.0063 |
114688 118783 18 0.0060 |
118784 122879 6 0.0020 |
122880 126975 8 0.0027 |
Normalized probability
Response time distribution
measured in nanoseconds
using High Dynamic
Range Histogram
:# Zero counts skipped
|# Contiguous buckets
Median and 99th
percentile values
service time for
load generator
Cache hit Cache miss
Go-Kit Histogram Example
const (
maxHistObservable = 1000000 // one millisecond
sampleCount = 1000 // data points will be sampled 5000 times to build a distribution by guesstimate
)
var sampleMap map[metrics.Histogram][]int64
var sampleLock sync.Mutex
func NewHist(name string) metrics.Histogram {
var h metrics.Histogram
if name != "" && archaius.Conf.Collect {
h = expvar.NewHistogram(name, 1000, maxHistObservable, 1, []int{50, 99}...)
sampleLock.Lock()
if sampleMap == nil {
sampleMap = make(map[metrics.Histogram][]int64)
}
sampleMap[h] = make([]int64, 0, sampleCount)
sampleLock.Unlock()
return h
}
return nil
}
func Measure(h metrics.Histogram, d time.Duration) {
if h != nil && archaius.Conf.Collect {
if d > maxHistObservable {
h.Observe(int64(maxHistObservable))
} else {
h.Observe(int64(d))
}
sampleLock.Lock()
s := sampleMap[h]
if s != nil && len(s) < sampleCount {
sampleMap[h] = append(s, int64(d))
sampleLock.Unlock()
}
}
}
Nanoseconds resolution!
Median and 99%ile
Slice for first 500
values as samples for
export to Guesstimate
Golang Guesstimate Interface
https://github.com/adrianco/goguesstimate
{
"space": {
"name": "gotest",
"description": "Testing",
"is_private": "true",
"graph": {
"metrics": [
{"id": "AB", "readableId": "AB", "name": "memcached", "location": {"row": 2, "column":4}},
{"id": "AC", "readableId": "AC", "name": "memcached percent", "location": {"row": 2, "column":
3}},
{"id": "AD", "readableId": "AD", "name": "staash cpu", "location": {"row": 3, "column":3}},
{"id": "AE", "readableId": "AE", "name": "staash", "location": {"row": 3, "column":2}}
],
"guesstimates": [
{"metric": "AB", "input": null, "guesstimateType": "DATA", "data":
[119958,6066,13914,9595,6773,5867,2347,1333,9900,9404,13518,9021,7915,3733,10244,5461,12243,7931,9044,11706,
5706,22861,9022,48661,15158,28995,16885,9564,17915,6610,7080,7065,12992,35431,11910,11465,14455,25790,8339,9
991]},
{"metric": "AC", "input": "40", "guesstimateType": "POINT"},
{"metric": "AD", "input": "[1000,4000]", "guesstimateType": "LOGNORMAL"},
{"metric": "AE", "input": "=100+((randomInt(0,100)>AC)?AB:AD)", "guesstimateType": "FUNCTION"}
]
}
}
}
See http://www.getguesstimate.com
% cd json_metrics; sh guesstimate.sh storage
@adrianco
Simplicity through symmetry
Symmetry
Invariants
Stable assertions
No special cases
Single purpose components
What’s Next?
Serverless
Serverless Architectures
AWS Lambda starting to take off
Google Cloud Functions, Azure Functions on the way…
Open Source: IBM OpenWhisk, open-lambda.org, apex.run
Startup activity: iron.io , serverless.com, nano-lambda.com
With AWS Lambda
compute resources are charged
by the 100ms, not the hour
First 1M node.js executions/month are free
@adrianco
Monitorless Architecture
API Gateway
Kinesis S3DynamoDB
@adrianco
Monitorless Architecture
API Gateway
Kinesis S3DynamoDB
@adrianco
Monitorless Architecture
API Gateway
Kinesis S3DynamoDB
AWS Lambda Reference Archhttp://www.allthingsdistributed.com/2016/05/aws-lambda-serverless-reference-architectures.html
Serverless Programming Model
Event driven functions
Role based permissions
Whitelisted API based security
Good for simple single threaded code
Serverless Cost Efficiencies
100% useful work, no agents, overheads
100% utilization, no charge between requests
No need to size capacity for peak traffic
Anecdotal costs ~1% of conventional system
Ideal for low traffic, Corp IT, spiky workloads
Serverless Work in Progress
Tooling for ease of use
Multi-region HA/DR patterns
Debugging and testing frameworks
Monitoring tools vendor support
Tracing - see iopipe.com
DIY Serverless Operating Challenges
Startup latency
Execution overhead
Charging model
Capacity planning
Intelligent Monitoring
Sprinkle machine learning on everything!
Diagnosis and remediation - Instana, Stackstorm
Personalized role based user interfaces?
Canary signature analysis - Netflix Spinnaker & OpsMx.com
Event correlation & filtering - BigPanda
Monitoring Challenges
Too much new stuff
Too ephemeral
Too dumb
Teraservices
Terabyte Memory Directions
Engulf dataset in memory for analytics
Balanced config for memory intensive workloads
Replace high end systems at commodity cost point
Explore non-volatile memory implications
Terabyte Memory Options
Flash based Diablo 64/128/256GB DDR4 DIMM
Shipping now as volatile memory, future non-volatile
Intel 3D XPoint - new non-volatile technology on the way
Announced May 2016
AWS X1 Instance Type - over 2TB RAM for $14/hr
Easy availability should drive innovation
Diablo Memory1: Flash DIMM
NO CHANGES to CPU or Server
NO CHANGES to Operating System
NO CHANGES to Applications
✓ UP TO 256GB DDR4 MEMORY PER MODULE
✓ UP TO 4TB MEMORY IN 2 SOCKET SYSTEM
TM
@adrianco
“We see the world as increasingly more complex and chaotic
because we use inadequate concepts to explain it. When we
understand something, we no longer see it as chaotic or complex.”
Jamshid Gharajedaghi - 2011
Systems Thinking: Managing Chaos and Complexity: A Platform for Designing Business Architecture
Learn More…
Q&A
Adrian Cockcroft @adrianco
http://slideshare.com/adriancockcroft
Technology Fellow - Battery Ventures
See www.battery.com for a list of portfolio investments
Security
Visit http://www.battery.com/our-companies/ for a full list of all portfolio companies in which all Battery Funds have invested.
Palo Alto Networks
Enterprise IT
Operations &
Management
Big DataCompute
Networking
Storage

More Related Content

What's hot

Design patterns for microservice architecture
Design patterns for microservice architectureDesign patterns for microservice architecture
Design patterns for microservice architectureThe Software House
 
Four Strategies to Create a DevOps Culture & System that Favors Innovation & ...
Four Strategies to Create a DevOps Culture & System that Favors Innovation & ...Four Strategies to Create a DevOps Culture & System that Favors Innovation & ...
Four Strategies to Create a DevOps Culture & System that Favors Innovation & ...Amazon Web Services
 
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...Databricks
 
AWS VS AZURE VS GCP.pptx
AWS VS AZURE VS GCP.pptxAWS VS AZURE VS GCP.pptx
AWS VS AZURE VS GCP.pptxRaneesh Ramesan
 
Microservices Testing Strategies JUnit Cucumber Mockito Pact
Microservices Testing Strategies JUnit Cucumber Mockito PactMicroservices Testing Strategies JUnit Cucumber Mockito Pact
Microservices Testing Strategies JUnit Cucumber Mockito PactAraf Karsh Hamid
 
MicroService Architecture
MicroService ArchitectureMicroService Architecture
MicroService ArchitectureFred George
 
Evolution from EDA to Data Mesh: Data in Motion
Evolution from EDA to Data Mesh: Data in MotionEvolution from EDA to Data Mesh: Data in Motion
Evolution from EDA to Data Mesh: Data in Motionconfluent
 
MicroCPH - Managing data consistency in a microservice architecture using Sagas
MicroCPH - Managing data consistency in a microservice architecture using SagasMicroCPH - Managing data consistency in a microservice architecture using Sagas
MicroCPH - Managing data consistency in a microservice architecture using SagasChris Richardson
 
DataOps: Nine steps to transform your data science impact Strata London May 18
DataOps: Nine steps to transform your data science impact  Strata London May 18DataOps: Nine steps to transform your data science impact  Strata London May 18
DataOps: Nine steps to transform your data science impact Strata London May 18Harvinder Atwal
 
Monitoring With Prometheus
Monitoring With PrometheusMonitoring With Prometheus
Monitoring With PrometheusKnoldus Inc.
 
Microservices Architecture - Bangkok 2018
Microservices Architecture - Bangkok 2018Microservices Architecture - Bangkok 2018
Microservices Architecture - Bangkok 2018Araf Karsh Hamid
 
Modern Data Flow
Modern Data FlowModern Data Flow
Modern Data Flowconfluent
 
Prometheus Overview
Prometheus OverviewPrometheus Overview
Prometheus OverviewBrian Brazil
 
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...Dr. Arif Wider
 

What's hot (20)

Cloud migration
Cloud migrationCloud migration
Cloud migration
 
Design patterns for microservice architecture
Design patterns for microservice architectureDesign patterns for microservice architecture
Design patterns for microservice architecture
 
Data Mesh 101
Data Mesh 101Data Mesh 101
Data Mesh 101
 
Four Strategies to Create a DevOps Culture & System that Favors Innovation & ...
Four Strategies to Create a DevOps Culture & System that Favors Innovation & ...Four Strategies to Create a DevOps Culture & System that Favors Innovation & ...
Four Strategies to Create a DevOps Culture & System that Favors Innovation & ...
 
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
 
Zero-Trust SASE DevSecOps
Zero-Trust SASE DevSecOpsZero-Trust SASE DevSecOps
Zero-Trust SASE DevSecOps
 
AWS VS AZURE VS GCP.pptx
AWS VS AZURE VS GCP.pptxAWS VS AZURE VS GCP.pptx
AWS VS AZURE VS GCP.pptx
 
Microservices Testing Strategies JUnit Cucumber Mockito Pact
Microservices Testing Strategies JUnit Cucumber Mockito PactMicroservices Testing Strategies JUnit Cucumber Mockito Pact
Microservices Testing Strategies JUnit Cucumber Mockito Pact
 
MicroService Architecture
MicroService ArchitectureMicroService Architecture
MicroService Architecture
 
Evolution from EDA to Data Mesh: Data in Motion
Evolution from EDA to Data Mesh: Data in MotionEvolution from EDA to Data Mesh: Data in Motion
Evolution from EDA to Data Mesh: Data in Motion
 
MicroCPH - Managing data consistency in a microservice architecture using Sagas
MicroCPH - Managing data consistency in a microservice architecture using SagasMicroCPH - Managing data consistency in a microservice architecture using Sagas
MicroCPH - Managing data consistency in a microservice architecture using Sagas
 
DataOps: Nine steps to transform your data science impact Strata London May 18
DataOps: Nine steps to transform your data science impact  Strata London May 18DataOps: Nine steps to transform your data science impact  Strata London May 18
DataOps: Nine steps to transform your data science impact Strata London May 18
 
Monitoring With Prometheus
Monitoring With PrometheusMonitoring With Prometheus
Monitoring With Prometheus
 
Introduction to microservices
Introduction to microservicesIntroduction to microservices
Introduction to microservices
 
Microservices Architecture - Bangkok 2018
Microservices Architecture - Bangkok 2018Microservices Architecture - Bangkok 2018
Microservices Architecture - Bangkok 2018
 
Introduction to AWS Glue
Introduction to AWS GlueIntroduction to AWS Glue
Introduction to AWS Glue
 
Modern Data Flow
Modern Data FlowModern Data Flow
Modern Data Flow
 
Prometheus Overview
Prometheus OverviewPrometheus Overview
Prometheus Overview
 
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
 
Google Cloud Dataflow
Google Cloud DataflowGoogle Cloud Dataflow
Google Cloud Dataflow
 

Viewers also liked

Principles of microservices velocity
Principles of microservices   velocityPrinciples of microservices   velocity
Principles of microservices velocitySam Newman
 
Continuous integration with business intelligence and analytics
Continuous integration with business intelligence and analyticsContinuous integration with business intelligence and analytics
Continuous integration with business intelligence and analyticsAlex Meadows
 
Estrategias/consejos para mejorar tu eCommerce y hacerlo más rendible
Estrategias/consejos para mejorar tu eCommerce y hacerlo más rendibleEstrategias/consejos para mejorar tu eCommerce y hacerlo más rendible
Estrategias/consejos para mejorar tu eCommerce y hacerlo más rendibleJordi Catà
 
Microservice
MicroserviceMicroservice
MicroserviceVenkat R
 
Dockercon State of the Art in Microservices
Dockercon State of the Art in MicroservicesDockercon State of the Art in Microservices
Dockercon State of the Art in MicroservicesAdrian Cockcroft
 
DevOps Days Boston 2017: Developer first workflows for Kubernetes
DevOps Days Boston 2017: Developer first workflows for KubernetesDevOps Days Boston 2017: Developer first workflows for Kubernetes
DevOps Days Boston 2017: Developer first workflows for KubernetesAmbassador Labs
 
Microservices Application Tracing Standards and Simulators - Adrians at OSCON
Microservices Application Tracing Standards and Simulators - Adrians at OSCONMicroservices Application Tracing Standards and Simulators - Adrians at OSCON
Microservices Application Tracing Standards and Simulators - Adrians at OSCONAdrian Cockcroft
 
Microservices: What's Missing - O'Reilly Software Architecture New York
Microservices: What's Missing - O'Reilly Software Architecture New YorkMicroservices: What's Missing - O'Reilly Software Architecture New York
Microservices: What's Missing - O'Reilly Software Architecture New YorkAdrian Cockcroft
 
Microservices, Kubernetes and Istio - A Great Fit!
Microservices, Kubernetes and Istio - A Great Fit!Microservices, Kubernetes and Istio - A Great Fit!
Microservices, Kubernetes and Istio - A Great Fit!Animesh Singh
 
O'Reilly Software Architecture Conference London 2017: Building Resilient Mic...
O'Reilly Software Architecture Conference London 2017: Building Resilient Mic...O'Reilly Software Architecture Conference London 2017: Building Resilient Mic...
O'Reilly Software Architecture Conference London 2017: Building Resilient Mic...Ambassador Labs
 
.Net Microservices with Event Sourcing, CQRS, Docker and... Windows Server 20...
.Net Microservices with Event Sourcing, CQRS, Docker and... Windows Server 20....Net Microservices with Event Sourcing, CQRS, Docker and... Windows Server 20...
.Net Microservices with Event Sourcing, CQRS, Docker and... Windows Server 20...Javier García Magna
 
Sortir de notre zone de confort
Sortir de notre zone de confortSortir de notre zone de confort
Sortir de notre zone de confortThomas Pierrain
 
Ddd reboot (english version)
Ddd reboot (english version)Ddd reboot (english version)
Ddd reboot (english version)Thomas Pierrain
 
Decouvrir son sujet grace à l'event storming
Decouvrir son sujet grace à l'event stormingDecouvrir son sujet grace à l'event storming
Decouvrir son sujet grace à l'event stormingThomas Pierrain
 
The Velvet Revolution: Modernizing Traditional ASP.NET Apps with Docker
The Velvet Revolution: Modernizing Traditional ASP.NET Apps with DockerThe Velvet Revolution: Modernizing Traditional ASP.NET Apps with Docker
The Velvet Revolution: Modernizing Traditional ASP.NET Apps with DockerElton Stoneman
 
Coder sans peur du changement avec la meme pas mal hexagonal architecture
Coder sans peur du changement avec la meme pas mal hexagonal architectureCoder sans peur du changement avec la meme pas mal hexagonal architecture
Coder sans peur du changement avec la meme pas mal hexagonal architectureThomas Pierrain
 
Faible latence, haut debit PerfUG (Septembre 2014)
Faible latence, haut debit PerfUG (Septembre 2014)Faible latence, haut debit PerfUG (Septembre 2014)
Faible latence, haut debit PerfUG (Septembre 2014)Thomas Pierrain
 
Windows Containers and Docker: Why You Should Care
Windows Containers and Docker: Why You Should CareWindows Containers and Docker: Why You Should Care
Windows Containers and Docker: Why You Should CareElton Stoneman
 
Culture craft humantalks
Culture craft humantalksCulture craft humantalks
Culture craft humantalksThomas Pierrain
 

Viewers also liked (20)

Principles of microservices velocity
Principles of microservices   velocityPrinciples of microservices   velocity
Principles of microservices velocity
 
Continuous integration with business intelligence and analytics
Continuous integration with business intelligence and analyticsContinuous integration with business intelligence and analytics
Continuous integration with business intelligence and analytics
 
Estrategias/consejos para mejorar tu eCommerce y hacerlo más rendible
Estrategias/consejos para mejorar tu eCommerce y hacerlo más rendibleEstrategias/consejos para mejorar tu eCommerce y hacerlo más rendible
Estrategias/consejos para mejorar tu eCommerce y hacerlo más rendible
 
Microservice
MicroserviceMicroservice
Microservice
 
Dockercon State of the Art in Microservices
Dockercon State of the Art in MicroservicesDockercon State of the Art in Microservices
Dockercon State of the Art in Microservices
 
DevOps Days Boston 2017: Developer first workflows for Kubernetes
DevOps Days Boston 2017: Developer first workflows for KubernetesDevOps Days Boston 2017: Developer first workflows for Kubernetes
DevOps Days Boston 2017: Developer first workflows for Kubernetes
 
Microservices Application Tracing Standards and Simulators - Adrians at OSCON
Microservices Application Tracing Standards and Simulators - Adrians at OSCONMicroservices Application Tracing Standards and Simulators - Adrians at OSCON
Microservices Application Tracing Standards and Simulators - Adrians at OSCON
 
Microservices: What's Missing - O'Reilly Software Architecture New York
Microservices: What's Missing - O'Reilly Software Architecture New YorkMicroservices: What's Missing - O'Reilly Software Architecture New York
Microservices: What's Missing - O'Reilly Software Architecture New York
 
Microservices, Kubernetes and Istio - A Great Fit!
Microservices, Kubernetes and Istio - A Great Fit!Microservices, Kubernetes and Istio - A Great Fit!
Microservices, Kubernetes and Istio - A Great Fit!
 
O'Reilly Software Architecture Conference London 2017: Building Resilient Mic...
O'Reilly Software Architecture Conference London 2017: Building Resilient Mic...O'Reilly Software Architecture Conference London 2017: Building Resilient Mic...
O'Reilly Software Architecture Conference London 2017: Building Resilient Mic...
 
.Net Microservices with Event Sourcing, CQRS, Docker and... Windows Server 20...
.Net Microservices with Event Sourcing, CQRS, Docker and... Windows Server 20....Net Microservices with Event Sourcing, CQRS, Docker and... Windows Server 20...
.Net Microservices with Event Sourcing, CQRS, Docker and... Windows Server 20...
 
Async await...oh wait!
Async await...oh wait!Async await...oh wait!
Async await...oh wait!
 
Sortir de notre zone de confort
Sortir de notre zone de confortSortir de notre zone de confort
Sortir de notre zone de confort
 
Ddd reboot (english version)
Ddd reboot (english version)Ddd reboot (english version)
Ddd reboot (english version)
 
Decouvrir son sujet grace à l'event storming
Decouvrir son sujet grace à l'event stormingDecouvrir son sujet grace à l'event storming
Decouvrir son sujet grace à l'event storming
 
The Velvet Revolution: Modernizing Traditional ASP.NET Apps with Docker
The Velvet Revolution: Modernizing Traditional ASP.NET Apps with DockerThe Velvet Revolution: Modernizing Traditional ASP.NET Apps with Docker
The Velvet Revolution: Modernizing Traditional ASP.NET Apps with Docker
 
Coder sans peur du changement avec la meme pas mal hexagonal architecture
Coder sans peur du changement avec la meme pas mal hexagonal architectureCoder sans peur du changement avec la meme pas mal hexagonal architecture
Coder sans peur du changement avec la meme pas mal hexagonal architecture
 
Faible latence, haut debit PerfUG (Septembre 2014)
Faible latence, haut debit PerfUG (Septembre 2014)Faible latence, haut debit PerfUG (Septembre 2014)
Faible latence, haut debit PerfUG (Septembre 2014)
 
Windows Containers and Docker: Why You Should Care
Windows Containers and Docker: Why You Should CareWindows Containers and Docker: Why You Should Care
Windows Containers and Docker: Why You Should Care
 
Culture craft humantalks
Culture craft humantalksCulture craft humantalks
Culture craft humantalks
 

Similar to Microservices Workshop: Why, what, and how to get there

Innovation and Architecture
Innovation and ArchitectureInnovation and Architecture
Innovation and ArchitectureAdrian Cockcroft
 
Microservices Workshop - Craft Conference
Microservices Workshop - Craft ConferenceMicroservices Workshop - Craft Conference
Microservices Workshop - Craft ConferenceAdrian Cockcroft
 
Fast Delivery DevOps Israel
Fast Delivery DevOps IsraelFast Delivery DevOps Israel
Fast Delivery DevOps IsraelAdrian Cockcroft
 
Iurii Garasym - Cloud Security Alliance Now in Ukraine. Mission, Opportunitie...
Iurii Garasym - Cloud Security Alliance Now in Ukraine. Mission, Opportunitie...Iurii Garasym - Cloud Security Alliance Now in Ukraine. Mission, Opportunitie...
Iurii Garasym - Cloud Security Alliance Now in Ukraine. Mission, Opportunitie...Cloud Security Alliance Lviv Chapter
 
TechEvent 2019: More Agile, More AI, More Cloud! Less Work?!; Oliver Dörr - T...
TechEvent 2019: More Agile, More AI, More Cloud! Less Work?!; Oliver Dörr - T...TechEvent 2019: More Agile, More AI, More Cloud! Less Work?!; Oliver Dörr - T...
TechEvent 2019: More Agile, More AI, More Cloud! Less Work?!; Oliver Dörr - T...Trivadis
 
Status Quo is Death: nib health funds’ Innovative Journey to the Cloud: AWS S...
Status Quo is Death: nib health funds’ Innovative Journey to the Cloud: AWS S...Status Quo is Death: nib health funds’ Innovative Journey to the Cloud: AWS S...
Status Quo is Death: nib health funds’ Innovative Journey to the Cloud: AWS S...Amazon Web Services
 
Getting Managers to Ride the Cloud
Getting Managers to Ride the CloudGetting Managers to Ride the Cloud
Getting Managers to Ride the CloudDavid Amaya
 
Opening Keynote by Dr. Werner Vogels
Opening Keynote by Dr. Werner VogelsOpening Keynote by Dr. Werner Vogels
Opening Keynote by Dr. Werner VogelsAmazon Web Services
 
From Monoliths to Services: Paying Your Technical Debt
From Monoliths to Services: Paying Your Technical DebtFrom Monoliths to Services: Paying Your Technical Debt
From Monoliths to Services: Paying Your Technical DebtTechWell
 
CuriousMinds and Siemens in Brasov 2015 - Building and Developing for the Clo...
CuriousMinds and Siemens in Brasov 2015 - Building and Developing for the Clo...CuriousMinds and Siemens in Brasov 2015 - Building and Developing for the Clo...
CuriousMinds and Siemens in Brasov 2015 - Building and Developing for the Clo...Vadim Zendejas
 
Bob Lozano - DoDIIS Worldwide 2010
Bob Lozano - DoDIIS Worldwide 2010Bob Lozano - DoDIIS Worldwide 2010
Bob Lozano - DoDIIS Worldwide 2010GovCloud Network
 
Cloud Computing Project
Cloud Computing Project Cloud Computing Project
Cloud Computing Project Ayush Mukherjee
 
HDI Capital Area and Corporate Updates & Demystifying Cloud Computing Present...
HDI Capital Area and Corporate Updates & Demystifying Cloud Computing Present...HDI Capital Area and Corporate Updates & Demystifying Cloud Computing Present...
HDI Capital Area and Corporate Updates & Demystifying Cloud Computing Present...hdicapitalarea
 
Cloud Innovation Tour - Discover Track
Cloud Innovation Tour - Discover TrackCloud Innovation Tour - Discover Track
Cloud Innovation Tour - Discover TrackLaurenWendler
 
Microservices the Good Bad and the Ugly
Microservices the Good Bad and the UglyMicroservices the Good Bad and the Ugly
Microservices the Good Bad and the UglyAdrian Cockcroft
 
ClientSummit2010_CloudWorkshop
ClientSummit2010_CloudWorkshopClientSummit2010_CloudWorkshop
ClientSummit2010_CloudWorkshopRazorfish
 
Meetup HybridCloud successful 14.12.2016 #hybridcloudsuccessful
Meetup HybridCloud successful 14.12.2016 #hybridcloudsuccessfulMeetup HybridCloud successful 14.12.2016 #hybridcloudsuccessful
Meetup HybridCloud successful 14.12.2016 #hybridcloudsuccessfulSebastian Straube
 

Similar to Microservices Workshop: Why, what, and how to get there (20)

Innovation and Architecture
Innovation and ArchitectureInnovation and Architecture
Innovation and Architecture
 
Microservices Workshop - Craft Conference
Microservices Workshop - Craft ConferenceMicroservices Workshop - Craft Conference
Microservices Workshop - Craft Conference
 
Fast Delivery DevOps Israel
Fast Delivery DevOps IsraelFast Delivery DevOps Israel
Fast Delivery DevOps Israel
 
Iurii Garasym - Cloud Security Alliance Now in Ukraine. Mission, Opportunitie...
Iurii Garasym - Cloud Security Alliance Now in Ukraine. Mission, Opportunitie...Iurii Garasym - Cloud Security Alliance Now in Ukraine. Mission, Opportunitie...
Iurii Garasym - Cloud Security Alliance Now in Ukraine. Mission, Opportunitie...
 
TechEvent 2019: More Agile, More AI, More Cloud! Less Work?!; Oliver Dörr - T...
TechEvent 2019: More Agile, More AI, More Cloud! Less Work?!; Oliver Dörr - T...TechEvent 2019: More Agile, More AI, More Cloud! Less Work?!; Oliver Dörr - T...
TechEvent 2019: More Agile, More AI, More Cloud! Less Work?!; Oliver Dörr - T...
 
Status Quo is Death: nib health funds’ Innovative Journey to the Cloud: AWS S...
Status Quo is Death: nib health funds’ Innovative Journey to the Cloud: AWS S...Status Quo is Death: nib health funds’ Innovative Journey to the Cloud: AWS S...
Status Quo is Death: nib health funds’ Innovative Journey to the Cloud: AWS S...
 
Getting Managers to Ride the Cloud
Getting Managers to Ride the CloudGetting Managers to Ride the Cloud
Getting Managers to Ride the Cloud
 
Microxchg Microservices
Microxchg MicroservicesMicroxchg Microservices
Microxchg Microservices
 
Opening Keynote by Dr. Werner Vogels
Opening Keynote by Dr. Werner VogelsOpening Keynote by Dr. Werner Vogels
Opening Keynote by Dr. Werner Vogels
 
From Monoliths to Services: Paying Your Technical Debt
From Monoliths to Services: Paying Your Technical DebtFrom Monoliths to Services: Paying Your Technical Debt
From Monoliths to Services: Paying Your Technical Debt
 
The Future of Cloud Innovation, featuring Adrian Cockcroft
The Future of Cloud Innovation, featuring Adrian CockcroftThe Future of Cloud Innovation, featuring Adrian Cockcroft
The Future of Cloud Innovation, featuring Adrian Cockcroft
 
CuriousMinds and Siemens in Brasov 2015 - Building and Developing for the Clo...
CuriousMinds and Siemens in Brasov 2015 - Building and Developing for the Clo...CuriousMinds and Siemens in Brasov 2015 - Building and Developing for the Clo...
CuriousMinds and Siemens in Brasov 2015 - Building and Developing for the Clo...
 
Bob Lozano - DoDIIS Worldwide 2010
Bob Lozano - DoDIIS Worldwide 2010Bob Lozano - DoDIIS Worldwide 2010
Bob Lozano - DoDIIS Worldwide 2010
 
20161220 - microservice
20161220 - microservice20161220 - microservice
20161220 - microservice
 
Cloud Computing Project
Cloud Computing Project Cloud Computing Project
Cloud Computing Project
 
HDI Capital Area and Corporate Updates & Demystifying Cloud Computing Present...
HDI Capital Area and Corporate Updates & Demystifying Cloud Computing Present...HDI Capital Area and Corporate Updates & Demystifying Cloud Computing Present...
HDI Capital Area and Corporate Updates & Demystifying Cloud Computing Present...
 
Cloud Innovation Tour - Discover Track
Cloud Innovation Tour - Discover TrackCloud Innovation Tour - Discover Track
Cloud Innovation Tour - Discover Track
 
Microservices the Good Bad and the Ugly
Microservices the Good Bad and the UglyMicroservices the Good Bad and the Ugly
Microservices the Good Bad and the Ugly
 
ClientSummit2010_CloudWorkshop
ClientSummit2010_CloudWorkshopClientSummit2010_CloudWorkshop
ClientSummit2010_CloudWorkshop
 
Meetup HybridCloud successful 14.12.2016 #hybridcloudsuccessful
Meetup HybridCloud successful 14.12.2016 #hybridcloudsuccessfulMeetup HybridCloud successful 14.12.2016 #hybridcloudsuccessful
Meetup HybridCloud successful 14.12.2016 #hybridcloudsuccessful
 

More from Adrian Cockcroft

Gophercon 2016 Communicating Sequential Goroutines
Gophercon 2016 Communicating Sequential GoroutinesGophercon 2016 Communicating Sequential Goroutines
Gophercon 2016 Communicating Sequential GoroutinesAdrian Cockcroft
 
Monitoring Challenges - Monitorama 2016 - Monitoringless
Monitoring Challenges - Monitorama 2016 - MonitoringlessMonitoring Challenges - Monitorama 2016 - Monitoringless
Monitoring Challenges - Monitorama 2016 - MonitoringlessAdrian Cockcroft
 
Evolution of Microservices - Craft Conference
Evolution of Microservices - Craft ConferenceEvolution of Microservices - Craft Conference
Evolution of Microservices - Craft ConferenceAdrian Cockcroft
 
What's Missing? Microservices Meetup at Cisco
What's Missing? Microservices Meetup at CiscoWhat's Missing? Microservices Meetup at Cisco
What's Missing? Microservices Meetup at CiscoAdrian Cockcroft
 
Microxchg Analyzing Response Time Distributions for Microservices
Microxchg Analyzing Response Time Distributions for MicroservicesMicroxchg Analyzing Response Time Distributions for Microservices
Microxchg Analyzing Response Time Distributions for MicroservicesAdrian Cockcroft
 
Cloud Trends Nov2015 Structure
Cloud Trends Nov2015 StructureCloud Trends Nov2015 Structure
Cloud Trends Nov2015 StructureAdrian Cockcroft
 
Openstack Silicon Valley - Vendor Lock In
Openstack Silicon Valley - Vendor Lock InOpenstack Silicon Valley - Vendor Lock In
Openstack Silicon Valley - Vendor Lock InAdrian Cockcroft
 
When Developers Operate and Operators Develop
When Developers Operate and Operators DevelopWhen Developers Operate and Operators Develop
When Developers Operate and Operators DevelopAdrian Cockcroft
 
Dockercon 2015 - Faster Cheaper Safer
Dockercon 2015 - Faster Cheaper SaferDockercon 2015 - Faster Cheaper Safer
Dockercon 2015 - Faster Cheaper SaferAdrian Cockcroft
 
Gluecon Monitoring Microservices and Containers: A Challenge
Gluecon Monitoring Microservices and Containers: A ChallengeGluecon Monitoring Microservices and Containers: A Challenge
Gluecon Monitoring Microservices and Containers: A ChallengeAdrian Cockcroft
 
Software Architecture Conference - Monitoring Microservices - A Challenge
Software Architecture Conference -  Monitoring Microservices - A ChallengeSoftware Architecture Conference -  Monitoring Microservices - A Challenge
Software Architecture Conference - Monitoring Microservices - A ChallengeAdrian Cockcroft
 
Cloud Native Cost Optimization UCC
Cloud Native Cost Optimization UCCCloud Native Cost Optimization UCC
Cloud Native Cost Optimization UCCAdrian Cockcroft
 
Goto Berlin - Migrating to Microservices (Fast Delivery)
Goto Berlin - Migrating to Microservices (Fast Delivery)Goto Berlin - Migrating to Microservices (Fast Delivery)
Goto Berlin - Migrating to Microservices (Fast Delivery)Adrian Cockcroft
 
Cloud Native Cost Optimization
Cloud Native Cost OptimizationCloud Native Cost Optimization
Cloud Native Cost OptimizationAdrian Cockcroft
 
Monktoberfest Fast Delivery
Monktoberfest Fast DeliveryMonktoberfest Fast Delivery
Monktoberfest Fast DeliveryAdrian Cockcroft
 
QCon New York - Migrating to Cloud Native with Microservices
QCon New York - Migrating to Cloud Native with MicroservicesQCon New York - Migrating to Cloud Native with Microservices
QCon New York - Migrating to Cloud Native with MicroservicesAdrian Cockcroft
 
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...Adrian Cockcroft
 
Disrupting the Storage Industry talk at SNIA Data Storage Innovation Conference
Disrupting the Storage Industry talk at SNIA Data Storage Innovation ConferenceDisrupting the Storage Industry talk at SNIA Data Storage Innovation Conference
Disrupting the Storage Industry talk at SNIA Data Storage Innovation ConferenceAdrian Cockcroft
 
Hack Kid Con - Learn to be a Data Scientist for $1
Hack Kid Con - Learn to be a Data Scientist for $1Hack Kid Con - Learn to be a Data Scientist for $1
Hack Kid Con - Learn to be a Data Scientist for $1Adrian Cockcroft
 

More from Adrian Cockcroft (20)

Gophercon 2016 Communicating Sequential Goroutines
Gophercon 2016 Communicating Sequential GoroutinesGophercon 2016 Communicating Sequential Goroutines
Gophercon 2016 Communicating Sequential Goroutines
 
Monitoring Challenges - Monitorama 2016 - Monitoringless
Monitoring Challenges - Monitorama 2016 - MonitoringlessMonitoring Challenges - Monitorama 2016 - Monitoringless
Monitoring Challenges - Monitorama 2016 - Monitoringless
 
Evolution of Microservices - Craft Conference
Evolution of Microservices - Craft ConferenceEvolution of Microservices - Craft Conference
Evolution of Microservices - Craft Conference
 
What's Missing? Microservices Meetup at Cisco
What's Missing? Microservices Meetup at CiscoWhat's Missing? Microservices Meetup at Cisco
What's Missing? Microservices Meetup at Cisco
 
In Search of Segmentation
In Search of SegmentationIn Search of Segmentation
In Search of Segmentation
 
Microxchg Analyzing Response Time Distributions for Microservices
Microxchg Analyzing Response Time Distributions for MicroservicesMicroxchg Analyzing Response Time Distributions for Microservices
Microxchg Analyzing Response Time Distributions for Microservices
 
Cloud Trends Nov2015 Structure
Cloud Trends Nov2015 StructureCloud Trends Nov2015 Structure
Cloud Trends Nov2015 Structure
 
Openstack Silicon Valley - Vendor Lock In
Openstack Silicon Valley - Vendor Lock InOpenstack Silicon Valley - Vendor Lock In
Openstack Silicon Valley - Vendor Lock In
 
When Developers Operate and Operators Develop
When Developers Operate and Operators DevelopWhen Developers Operate and Operators Develop
When Developers Operate and Operators Develop
 
Dockercon 2015 - Faster Cheaper Safer
Dockercon 2015 - Faster Cheaper SaferDockercon 2015 - Faster Cheaper Safer
Dockercon 2015 - Faster Cheaper Safer
 
Gluecon Monitoring Microservices and Containers: A Challenge
Gluecon Monitoring Microservices and Containers: A ChallengeGluecon Monitoring Microservices and Containers: A Challenge
Gluecon Monitoring Microservices and Containers: A Challenge
 
Software Architecture Conference - Monitoring Microservices - A Challenge
Software Architecture Conference -  Monitoring Microservices - A ChallengeSoftware Architecture Conference -  Monitoring Microservices - A Challenge
Software Architecture Conference - Monitoring Microservices - A Challenge
 
Cloud Native Cost Optimization UCC
Cloud Native Cost Optimization UCCCloud Native Cost Optimization UCC
Cloud Native Cost Optimization UCC
 
Goto Berlin - Migrating to Microservices (Fast Delivery)
Goto Berlin - Migrating to Microservices (Fast Delivery)Goto Berlin - Migrating to Microservices (Fast Delivery)
Goto Berlin - Migrating to Microservices (Fast Delivery)
 
Cloud Native Cost Optimization
Cloud Native Cost OptimizationCloud Native Cost Optimization
Cloud Native Cost Optimization
 
Monktoberfest Fast Delivery
Monktoberfest Fast DeliveryMonktoberfest Fast Delivery
Monktoberfest Fast Delivery
 
QCon New York - Migrating to Cloud Native with Microservices
QCon New York - Migrating to Cloud Native with MicroservicesQCon New York - Migrating to Cloud Native with Microservices
QCon New York - Migrating to Cloud Native with Microservices
 
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
Monitorama - Please, no more Minutes, Milliseconds, Monoliths or Monitoring T...
 
Disrupting the Storage Industry talk at SNIA Data Storage Innovation Conference
Disrupting the Storage Industry talk at SNIA Data Storage Innovation ConferenceDisrupting the Storage Industry talk at SNIA Data Storage Innovation Conference
Disrupting the Storage Industry talk at SNIA Data Storage Innovation Conference
 
Hack Kid Con - Learn to be a Data Scientist for $1
Hack Kid Con - Learn to be a Data Scientist for $1Hack Kid Con - Learn to be a Data Scientist for $1
Hack Kid Con - Learn to be a Data Scientist for $1
 

Recently uploaded

eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolsosttopstonverter
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesKrzysztofKkol1
 
SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?Alexandre Beguel
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Rob Geurden
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...OnePlan Solutions
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identityteam-WIBU
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxRTS corp
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencessuser9e7c64
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesVictoriaMetrics
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogueitservices996
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorTier1 app
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingShane Coughlan
 
Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfmaor17
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jNeo4j
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecturerahul_net
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfRTS corp
 

Recently uploaded (20)

eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration tools
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
 
SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identity
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conference
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 Updates
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogue
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
 
Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdf
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
 

Microservices Workshop: Why, what, and how to get there

  • 1. Microservices Workshop: Why, what, and how to get there Adrian Cockcroft @adrianco Technology Fellow - Battery Ventures October 2016 CC BY 4.0 https://creativecommons.org/licenses/by/4.0/legalcode
  • 2. Workshop vs. Presentation Questions at any time Interactive discussions Share your experiences Everyone’s voice should be heard
  • 3. What does @adrianco do? @adrianco Technology Due Diligence on Deals Presentations at Companies and Conferences Tech and Board Advisor Support for Portfolio Companies Consulting and Training Networking with Interesting PeopleTinkering with Technologies Vendor Relationships Previously: Netflix, eBay, Sun Microsystems, Cambridge Consultants, City University London - BSc Applied Physics
  • 4. Topics Scene Setting State of the Cloud What Changes? Product Processes Microservices State of the Art Segmentation What’s Missing? Monitoring Challenges Migration Response Times Serverless Lock-In Teraservices Wrap-Up
  • 6. Why am I here? %*&!” By Simon Wardley http://enterpriseitadoption.com/
  • 7. Why am I here? %*&!” By Simon Wardley http://enterpriseitadoption.com/ 2009
  • 8. Why am I here? %*&!” By Simon Wardley http://enterpriseitadoption.com/ 2009
  • 9. Why am I here? @adrianco’s job at the intersection of cloud and Enterprise IT, looking for disruption and opportunities. %*&!” By Simon Wardley http://enterpriseitadoption.com/ 20142009 Disruptions in 2016 coming from server- less computing and teraservices.
  • 10. Typical reactions to my Netflix talks…
  • 11. Typical reactions to my Netflix talks… “You guys are crazy! Can’t believe it” – 2009
  • 12. Typical reactions to my Netflix talks… “You guys are crazy! Can’t believe it” – 2009 “What Netflix is doing won’t work” – 2010
  • 13. Typical reactions to my Netflix talks… “You guys are crazy! Can’t believe it” – 2009 “What Netflix is doing won’t work” – 2010 It only works for ‘Unicorns’ like Netflix” – 2011
  • 14. Typical reactions to my Netflix talks… “You guys are crazy! Can’t believe it” – 2009 “What Netflix is doing won’t work” – 2010 It only works for ‘Unicorns’ like Netflix” – 2011 “We’d like to do 
 that but can’t” – 2012
  • 15. Typical reactions to my Netflix talks… “You guys are crazy! Can’t believe it” – 2009 “What Netflix is doing won’t work” – 2010 It only works for ‘Unicorns’ like Netflix” – 2011 “We’d like to do 
 that but can’t” – 2012 “We’re on our way using Netflix OSS code” – 2013
  • 16. What I learned from my time at Netflix
  • 17. What I learned from my time at Netflix •Speed wins in the marketplace
  • 18. What I learned from my time at Netflix •Speed wins in the marketplace •Remove friction from product development
  • 19. What I learned from my time at Netflix •Speed wins in the marketplace •Remove friction from product development •High trust, low process, no hand-offs between teams
  • 20. What I learned from my time at Netflix •Speed wins in the marketplace •Remove friction from product development •High trust, low process, no hand-offs between teams •Freedom and responsibility culture
  • 21. What I learned from my time at Netflix •Speed wins in the marketplace •Remove friction from product development •High trust, low process, no hand-offs between teams •Freedom and responsibility culture •Don’t do your own undifferentiated heavy lifting
  • 22. What I learned from my time at Netflix •Speed wins in the marketplace •Remove friction from product development •High trust, low process, no hand-offs between teams •Freedom and responsibility culture •Don’t do your own undifferentiated heavy lifting •Use simple patterns automated by tooling
  • 23. What I learned from my time at Netflix •Speed wins in the marketplace •Remove friction from product development •High trust, low process, no hand-offs between teams •Freedom and responsibility culture •Don’t do your own undifferentiated heavy lifting •Use simple patterns automated by tooling •Self service cloud makes impossible things instant
  • 24. “You build it, you run it.” Werner Vogels 2006
  • 25. In 2014 Enterprises finally embraced public cloud and in 2015 began replacing entire datacenters.
  • 26. In 2014 Enterprises finally embraced public cloud and in 2015 began replacing entire datacenters. Oct 2014
  • 27. In 2014 Enterprises finally embraced public cloud and in 2015 began replacing entire datacenters. Oct 2014 Oct 2015
  • 28. In 2014 Enterprises finally embraced public cloud and in 2015 began replacing entire datacenters. Oct 2014 Oct 2015
  • 29. Key Goals of the CIO? Align IT with the business Develop products faster Try not to get breached
  • 30. Security Blanket Failure Insecure applications hidden behind firewalls make you feel safe until the breach happens… http://peanuts.wikia.com/wiki/Linus'_security_blanket
  • 34. How can everyone get speed, low cost, and better usability?
  • 36. Example Monolith: Sign Up Login Home Page Payment Method Personal Data Reports Monolithic “kitchen sink” database Monolithic application Complex mix of queries User Because one part of the monolithic application and database holds sensitive data all of it is subject to the most rigorous policies
  • 37. Microservices version: Sign Up Login Home Page Payment Method Personal Data Reports Optimized datastores Microservices separation of concerns Isolated single purpose connections User Because each microservice can conform to the appropriate policy, demands for agility can be separated from requirements for security Segregated team owns secure data sources and infrequent updates Segregated team owns rapid improvement of most common use cases
  • 38. State of the Cloud
  • 39. Previous Cloud Trend Updates GigaOM Structure May 2014 D&B Cloud Innovation July 2015 GigaOM Structure November 2015 1
  • 40. Previous Cloud Trend Updates GigaOM Structure May 2014 D&B Cloud Innovation July 2015 GigaOM Structure November 2015 1 Trends from 2014: Noted as appropriate
  • 41. Cloud Ecosystem Matures Staying Power Support Scale Location
  • 42. Cloud Ecosystem Matures Staying Power Support Scale Location Trends from 2014: Even more emphasis on location
  • 45. Safe Bets Trends from 2014: AWS capacity share vs. Azure still growing AWS growing 60% year on year, Enterprise share growing fast Azure providing strong Linux and Open Source support
  • 46. The Global Land-Grab Azure AWS GCE 20 Regions 11 Regions 6 Regions http://aws.amazon.com/about-aws/global-infrastructure/ http://azure.microsoft.com/en-us/regions/ https://cloud.google.com/compute/docs/zones#available http://www.google.com/about/datacenters/inside/locations/index.html
  • 47. The Global Land-Grab Azure AWS GCE 20 Regions 11 Regions 6 Regions http://aws.amazon.com/about-aws/global-infrastructure/ http://azure.microsoft.com/en-us/regions/ https://cloud.google.com/compute/docs/zones#available http://www.google.com/about/datacenters/inside/locations/index.html
  • 48. The Global Land-Grab Azure AWS GCE 20 Regions 11 Regions 6 Regions http://aws.amazon.com/about-aws/global-infrastructure/ http://azure.microsoft.com/en-us/regions/ https://cloud.google.com/compute/docs/zones#available http://www.google.com/about/datacenters/inside/locations/index.html
  • 49. The Global Land-Grab Azure AWS GCE 20 Regions 11 Regions 6 Regions http://aws.amazon.com/about-aws/global-infrastructure/ http://azure.microsoft.com/en-us/regions/ https://cloud.google.com/compute/docs/zones#available http://www.google.com/about/datacenters/inside/locations/index.html
  • 50. The Global Land-Grab Azure AWS GCE 20 Regions 11 Regions 6 Regions http://aws.amazon.com/about-aws/global-infrastructure/ http://azure.microsoft.com/en-us/regions/ https://cloud.google.com/compute/docs/zones#available http://www.google.com/about/datacenters/inside/locations/index.html India, UK, Korea coming in 2016
  • 51. The Global Land-Grab Azure AWS GCE 20 Regions 11 Regions 6 Regions http://aws.amazon.com/about-aws/global-infrastructure/ http://azure.microsoft.com/en-us/regions/ https://cloud.google.com/compute/docs/zones#available http://www.google.com/about/datacenters/inside/locations/index.html India, UK, Korea coming in 2016 Trends from 2014: AWS and Azure go after big markets 2016 Google finally decides it needs more regions too…
  • 52. Questions we keep
 hearing about cloud Who is challenging VMware for in-house cloud automation?
  • 53. Challenges for Private Cloud Stability Functionality Total Cost of Ownership
  • 54. Challenges for Private Cloud Stability Functionality Total Cost of Ownership
  • 55. Challenges for Private Cloud Stability Functionality Total Cost of Ownership
  • 56. Some enterprise vendor responses to cloud and container ecosystem growth…
  • 57. The ship is sinking, let’s re-brand as a submarine!
  • 58. The ship is sinking, let’s re-brand as a submarine!
  • 59. The ship is sinking, let’s merge with a submarine!
  • 60. The ship is sinking, let’s merge with a submarine!
  • 61. Look! we cut our ship in two really quickly!
  • 64. “It isn't what we don't know that gives us trouble, it's what we know that ain't so.” Will Rogers
  • 68. Organizations build up slow complex “Scar tissue” processes
  • 69. "This is the IT swamp draining manual for anyone who is neck deep in alligators.” 1984 2014
  • 71. Waterfall Product Development Business Need • Documents • Weeks Approval Process • Meetings • Weeks Hardware Purchase • Negotiations • Weeks Software Development • Specifications • Weeks Deployment and Testing • Reports • Weeks Customer Feedback • It sucks! • Weeks
  • 72. Waterfall Product Development Hardware provisioning is undifferentiated heavy lifting – replace it with IaaS Business Need • Documents • Weeks Approval Process • Meetings • Weeks Hardware Purchase • Negotiations • Weeks Software Development • Specifications • Weeks Deployment and Testing • Reports • Weeks Customer Feedback • It sucks! • Weeks
  • 73. Waterfall Product Development Hardware provisioning is undifferentiated heavy lifting – replace it with IaaS Business Need • Documents • Weeks Approval Process • Meetings • Weeks Hardware Purchase • Negotiations • Weeks Software Development • Specifications • Weeks Deployment and Testing • Reports • Weeks Customer Feedback • It sucks! • Weeks IaaS Cloud
  • 74. Waterfall Product Development Hardware provisioning is undifferentiated heavy lifting – replace it with IaaS Business Need • Documents • Weeks Software Development • Specifications • Weeks Deployment and Testing • Reports • Weeks Customer Feedback • It sucks! • Weeks
  • 75. Process Hand-Off Steps for Agile Development on IaaS Product Manager Development Team QA Integration Team Operations Deploy Team BI Analytics Team
  • 76. IaaS Agile Product Development Business Need • Documents • Weeks Software Development • Specifications • Weeks Deployment and Testing • Reports • Days Customer Feedback • It sucks! • Days
  • 77. IaaS Agile Product Development Business Need • Documents • Weeks Software Development • Specifications • Weeks Deployment and Testing • Reports • Days Customer Feedback • It sucks! • Days etc…
  • 78. IaaS Agile Product Development Software provisioning is undifferentiated heavy lifting – replace it with PaaS Business Need • Documents • Weeks Software Development • Specifications • Weeks Deployment and Testing • Reports • Days Customer Feedback • It sucks! • Days etc…
  • 79. IaaS Agile Product Development Software provisioning is undifferentiated heavy lifting – replace it with PaaS Business Need • Documents • Weeks Software Development • Specifications • Weeks Deployment and Testing • Reports • Days Customer Feedback • It sucks! • Days PaaS Cloud etc…
  • 80. IaaS Agile Product Development Software provisioning is undifferentiated heavy lifting – replace it with PaaS Business Need • Documents • Weeks Software Development • Specifications • Weeks Customer Feedback • It sucks! • Days etc…
  • 81. Process for Continuous Delivery of Features on PaaS Product Manager Developer BI Analytics Team
  • 82. PaaS CD Feature Development Business Need • Discussions • Days Software Development • Code • Days Customer Feedback • Fix this Bit! • Hours etc…
  • 83. PaaS CD Feature Development Building your own business apps is undifferentiated heavy lifting – use SaaS Business Need • Discussions • Days Software Development • Code • Days Customer Feedback • Fix this Bit! • Hours etc…
  • 84. PaaS CD Feature Development Building your own business apps is undifferentiated heavy lifting – use SaaS Business Need • Discussions • Days Software Development • Code • Days Customer Feedback • Fix this Bit! • Hours SaaS/ BPaaS Cloud etc…
  • 85. PaaS CD Feature Development Building your own business apps is undifferentiated heavy lifting – use SaaS Business Need • Discussions • Days Customer Feedback • Fix this Bit! • Hours etc…
  • 86. SaaS Based Business Application Development Business Need •GUI Builder •Hours Customer Feedback •Fix this bit! •Seconds
  • 87. SaaS Based Business Application Development Business Need •GUI Builder •Hours Customer Feedback •Fix this bit! •Seconds and thousands more…
  • 88. Value Chain Mapping Simon Wardley http://blog.gardeviance.org/2014/11/how-to-get-to-strategy-in-ten-steps.html Related tools and training http://www.wardleymaps.com/
  • 89. Value Chain Mapping Simon Wardley http://blog.gardeviance.org/2014/11/how-to-get-to-strategy-in-ten-steps.html Related tools and training http://www.wardleymaps.com/ Your unique product - Agile
  • 90. Value Chain Mapping Simon Wardley http://blog.gardeviance.org/2014/11/how-to-get-to-strategy-in-ten-steps.html Related tools and training http://www.wardleymaps.com/ Your unique product - Agile Best of breed as a Service - Lean
  • 91. Value Chain Mapping Simon Wardley http://blog.gardeviance.org/2014/11/how-to-get-to-strategy-in-ten-steps.html Related tools and training http://www.wardleymaps.com/ Your unique product - Agile Undifferentiated utility suppliers - 6sigma Best of breed as a Service - Lean
  • 93. Observe Orient Decide Act Land grab opportunity Competitive Move Customer Pain Point Measure Customers Continuous Delivery
  • 94. Observe Orient Decide Act Land grab opportunity Competitive Move Customer Pain Point INNOVATION Measure Customers Continuous Delivery
  • 95. Observe Orient Decide Act Land grab opportunity Competitive Move Customer Pain Point Analysis Model Hypotheses INNOVATION Measure Customers Continuous Delivery
  • 96. Observe Orient Decide Act Land grab opportunity Competitive Move Customer Pain Point Analysis Model Hypotheses BIG DATA INNOVATION Measure Customers Continuous Delivery
  • 97. Observe Orient Decide Act Land grab opportunity Competitive Move Customer Pain Point Analysis JFDI Plan Response Share Plans Model Hypotheses BIG DATA INNOVATION Measure Customers Continuous Delivery
  • 98. Observe Orient Decide Act Land grab opportunity Competitive Move Customer Pain Point Analysis JFDI Plan Response Share Plans Model Hypotheses BIG DATA INNOVATION CULTURE Measure Customers Continuous Delivery
  • 99. Observe Orient Decide Act Land grab opportunity Competitive Move Customer Pain Point Analysis JFDI Plan Response Share Plans Incremental Features Automatic Deploy Launch AB Test Model Hypotheses BIG DATA INNOVATION CULTURE Measure Customers Continuous Delivery
  • 100. Observe Orient Decide Act Land grab opportunity Competitive Move Customer Pain Point Analysis JFDI Plan Response Share Plans Incremental Features Automatic Deploy Launch AB Test Model Hypotheses BIG DATA INNOVATION CULTURE CLOUD Measure Customers Continuous Delivery
  • 101. Observe Orient Decide Act Land grab opportunity Competitive Move Customer Pain Point Analysis JFDI Plan Response Share Plans Incremental Features Automatic Deploy Launch AB Test Model Hypotheses BIG DATA INNOVATION CULTURE CLOUD Measure Customers Continuous Delivery
  • 102. Observe Orient Decide Act Land grab opportunity Competitive Move Customer Pain Point Analysis JFDI Plan Response Share Plans Incremental Features Automatic Deploy Launch AB Test Model Hypotheses BIG DATA INNOVATION CULTURE CLOUD Measure Customers Continuous Delivery
  • 104. Breaking Down the SILOs QA DBA Sys Adm Net Adm SAN Adm DevUX Prod Mgr
  • 105. Breaking Down the SILOs QA DBA Sys Adm Net Adm SAN Adm DevUX Prod Mgr Product Team Using Monolithic Delivery Product Team Using Monolithic Delivery
  • 106. Breaking Down the SILOs QA DBA Sys Adm Net Adm SAN Adm DevUX Prod Mgr Product Team Using Microservices Product Team Using Monolithic Delivery Product Team Using Microservices Product Team Using Microservices Product Team Using Monolithic Delivery
  • 107. Breaking Down the SILOs QA DBA Sys Adm Net Adm SAN Adm DevUX Prod Mgr Product Team Using Microservices Product Team Using Monolithic Delivery Platform TeamProduct Team Using Microservices Product Team Using Microservices Product Team Using Monolithic Delivery
  • 108. Breaking Down the SILOs QA DBA Sys Adm Net Adm SAN Adm DevUX Prod Mgr Product Team Using Microservices Product Team Using Monolithic Delivery Platform Team A P I Product Team Using Microservices Product Team Using Microservices Product Team Using Monolithic Delivery
  • 109. Breaking Down the SILOs QA DBA Sys Adm Net Adm SAN Adm DevUX Prod Mgr Product Team Using Microservices Product Team Using Monolithic Delivery Platform Team Re-Org from project teams to product teams A P I Product Team Using Microservices Product Team Using Microservices Product Team Using Monolithic Delivery
  • 110. Release Plan Developer Developer Developer Developer Developer QA Release Integration Ops Replace Old With New Release Monolithic service updates Works well with a small number of developers and a single language like php, java or ruby
  • 111. Release Plan Developer Developer Developer Developer Developer QA Release Integration Ops Replace Old With New Release Bugs Monolithic service updates Works well with a small number of developers and a single language like php, java or ruby
  • 112. Release Plan Developer Developer Developer Developer Developer QA Release Integration Ops Replace Old With New Release Bugs Bugs Monolithic service updates Works well with a small number of developers and a single language like php, java or ruby
  • 113. Use monolithic apps for small teams, simple systems and when you must, to optimize for efficiency and latency
  • 114. Developer Developer Developer Developer Developer Old Release Still Running Release Plan Release Plan Release Plan Release Plan Immutable microservice deployment scales, is faster with large teams and diverse platform components
  • 115. Developer Developer Developer Developer Developer Old Release Still Running Release Plan Release Plan Release Plan Release Plan Deploy Feature to Production Deploy Feature to Production Deploy Feature to Production Deploy Feature to Production Immutable microservice deployment scales, is faster with large teams and diverse platform components
  • 116. Developer Developer Developer Developer Developer Old Release Still Running Release Plan Release Plan Release Plan Release Plan Deploy Feature to Production Deploy Feature to Production Deploy Feature to Production Deploy Feature to Production Bugs Immutable microservice deployment scales, is faster with large teams and diverse platform components
  • 117. Developer Developer Developer Developer Developer Old Release Still Running Release Plan Release Plan Release Plan Release Plan Deploy Feature to Production Deploy Feature to Production Deploy Feature to Production Deploy Feature to Production Bugs Deploy Feature to Production Immutable microservice deployment scales, is faster with large teams and diverse platform components
  • 118. Configure Configure Developer Developer Developer Release Plan Release Plan Release Plan Deploy Standardized Services Standardized container deployment saves time and effort https://hub.docker.com
  • 119. Configure Configure Developer Developer Developer Release Plan Release Plan Release Plan Deploy Standardized Services Deploy Feature to Production Deploy Feature to Production Deploy Feature to Production Bugs Deploy Feature to Production Standardized container deployment saves time and effort https://hub.docker.com
  • 120. Developer Developer Run What You Wrote Developer Developer
  • 121. Developer Developer Run What You Wrote Micro service Micro service Micro service Micro service Micro service Micro service Micro service Developer Developer
  • 122. DeveloperDeveloper Developer Run What You Wrote Micro service Micro service Micro service Micro service Micro service Micro service Micro service Developer Developer Monitoring Tools
  • 123. DeveloperDeveloper Developer Run What You Wrote Micro service Micro service Micro service Micro service Micro service Micro service Micro service Developer Developer Site Reliability Monitoring Tools Availability Metrics 99.95% customer success rate
  • 124. DeveloperDeveloper Developer Run What You Wrote Micro service Micro service Micro service Micro service Micro service Micro service Micro service Developer Developer Manager Manager Site Reliability Monitoring Tools Availability Metrics 99.95% customer success rate
  • 125. DeveloperDeveloper Developer Run What You Wrote Micro service Micro service Micro service Micro service Micro service Micro service Micro service Developer Developer Manager Manager VP Engineering Site Reliability Monitoring Tools Availability Metrics 99.95% customer success rate
  • 126. Non-Destructive Production Updates ● “Immutable Code” Service Pattern ● Existing services are unchanged, old code remains in service ● New code deploys as a new service group ● No impact to production until traffic routing changes ● A|B Tests, Feature Flags and Version Routing control traffic ● First users in the test cell are the developer and test engineers ● A cohort of users is added looking for measurable improvement
  • 127. Deliver four features every four weeks Work In Progress = 4 Opportunity for bugs: 100% (baseline) Time to debug each: 100% (baseline)
  • 128. Deliver four features every four weeks Bugs! Which feature broke? Need more time to test! Extend release to six weeks? Work In Progress = 4 Opportunity for bugs: 100% (baseline) Time to debug each: 100% (baseline)
  • 129. Deliver four features every four weeks But: risk of bugs in delivery increases with interactions! Bugs! Which feature broke? Need more time to test! Extend release to six weeks? Work In Progress = 4 Opportunity for bugs: 100% (baseline) Time to debug each: 100% (baseline)
  • 130. Deliver four features every four weeks 16 16 16 But: risk of bugs in delivery increases with interactions! Bugs! Which feature broke? Need more time to test! Extend release to six weeks? Work In Progress = 4 Opportunity for bugs: 100% (baseline) Time to debug each: 100% (baseline)
  • 131. Deliver six features every six weeks
  • 132. Deliver six features every six weeks Work In Progress = 6 Individual bugs: 150% Interactions: 150%?
  • 133. Deliver six features every six weeks More features What broke? More interactions Even more bugs!! Work In Progress = 6 Individual bugs: 150% Interactions: 150%?
  • 134. 36 36 Deliver six features every six weeks More features What broke? More interactions Even more bugs!! Work In Progress = 6 Individual bugs: 150% Interactions: 150%?
  • 135. 36 36 Deliver six features every six weeks Risk of bugs in delivery increased to 225% of original! More features What broke? More interactions Even more bugs!! Work In Progress = 6 Individual bugs: 150% Interactions: 150%?
  • 136. 4 4 4 4 4 4 Deliver two features every two weeks Complexity of delivery decreased by 75% from original Fewer interactions Fewer bugs Better flow Less Work In Progress Work In Progress = 2 Opportunity for bugs: 50% Time to debug each: 50%
  • 137. Change One Thing at a Time! If it hurts, do it more often!
  • 138. What Happened? Rate of change increased Cost and size and risk of change reduced
  • 139. Low Cost of Change Using Docker Developers • Compile/Build • Seconds Extend container • Package dependencies • Seconds Deploy Container • Docker startup • Seconds
  • 140. Low Cost of Change Using Docker Fast tooling supports continuous delivery of many tiny changes Developers • Compile/Build • Seconds Extend container • Package dependencies • Seconds Deploy Container • Docker startup • Seconds
  • 142. It’s what you know that isn’t so
  • 143. It’s what you know that isn’t so ● Make your assumptions explicit
  • 144. It’s what you know that isn’t so ● Make your assumptions explicit ● Extrapolate trends to the limit
  • 145. It’s what you know that isn’t so ● Make your assumptions explicit ● Extrapolate trends to the limit ● Listen to non-customers
  • 146. It’s what you know that isn’t so ● Make your assumptions explicit ● Extrapolate trends to the limit ● Listen to non-customers ● Follow developer adoption, not IT spend
  • 147. It’s what you know that isn’t so ● Make your assumptions explicit ● Extrapolate trends to the limit ● Listen to non-customers ● Follow developer adoption, not IT spend ● Map evolution of products to services to utilities
  • 148. It’s what you know that isn’t so ● Make your assumptions explicit ● Extrapolate trends to the limit ● Listen to non-customers ● Follow developer adoption, not IT spend ● Map evolution of products to services to utilities ● Re-organize your teams for speed of execution
  • 150. Most IT Ops people will try to avoid lock-in
  • 151. Most product developers will pick the best of breed option
  • 152. Analogy: Dev’s get to go on dates and get married Ops get to run the family and get divorced DevOps gets the whole lifecycle!
  • 153. DevOps to the rescue!
  • 155. "End the practice of awarding business on the basis of a price tag. Instead, minimize total cost. Move toward a single supplier for any one item, on a long-term relationship of loyalty and trust.” W. Edwards Deming - 4th Point
  • 156. "End the practice of awarding business on the basis of a price tag. Instead, minimize total cost. Move toward a single supplier for any one item, on a long-term relationship of loyalty and trust.” How did we end up here? dysfunctional exploitation and abuse
  • 157. Project vs. Product Leads to lock-in Evolves to follow best of breed
  • 158. Evolution Technology Refresh Move to open Source On-prem -> as a Service
  • 159. Best of breed is now OSS and as a Service Less inherent lock-in
  • 160. What kinds of lock-in are there?
  • 162. e.g. compliance with laws that exclude alternatives based on jurisdiction or certification Legal lock-in
  • 163. e.g. compliance with laws that exclude alternatives based on jurisdiction or certification Financial lock-in e.g. budget spent in advance on long term deal with a vendor Legal lock-in
  • 164. e.g. compliance with laws that exclude alternatives based on jurisdiction or certification Contractual lock-in e.g. partnership or investment deal with one vendor prevents using alternatives Financial lock-in e.g. budget spent in advance on long term deal with a vendor Legal lock-in
  • 165. Technology lock-in Possible to escape given time and work…
  • 166.
  • 167. Proximity lock-in e.g. chatty clients don’t work unless they are co-located with their server
  • 168. e.g. quorum based availability (C*, Riak) needs three zones/datacenters per region Topology lock-in Proximity lock-in e.g. chatty clients don’t work unless they are co-located with their server
  • 169. e.g. quorum based availability (C*, Riak) needs three zones/datacenters per region Topology lock-in Proximity lock-in e.g. chatty clients don’t work unless they are co-located with their server Implementation e.g. interface is the same but behavior is different
  • 171.
  • 172. Interface lock-in e.g. different APIs that get the same result, easy to hide behind an abstraction layer
  • 173. Query syntax lock-in e.g. SQL variants for different databases Interface lock-in e.g. different APIs that get the same result, easy to hide behind an abstraction layer
  • 174. Query syntax lock-in e.g. SQL variants for different databases Interface lock-in e.g. different APIs that get the same result, easy to hide behind an abstraction layer Web service lock-in Interface lock-in, but remote access unlocks ability to migrate applications
  • 175. Data gravity lock-in e.g. lots of data to move or duplicate Query syntax lock-in e.g. SQL variants for different databases Interface lock-in e.g. different APIs that get the same result, easy to hide behind an abstraction layer Web service lock-in Interface lock-in, but remote access unlocks ability to migrate applications
  • 182. AWS Aurora Example for discussion
  • 183. AWS Aurora Example for discussion
  • 185. A Microservice Definition Loosely coupled service oriented architecture with bounded contexts
  • 186. A Microservice Definition Loosely coupled service oriented architecture with bounded contexts If every service has to be updated at the same time it’s not loosely coupled
  • 187. A Microservice Definition Loosely coupled service oriented architecture with bounded contexts If every service has to be updated at the same time it’s not loosely coupled If you have to know too much about surrounding services you don’t have a bounded context. See the Domain Driven Design book by Eric Evans.
  • 188. Coupling Concerns http://en.wikipedia.org/wiki/Conway's_law ●Conway’s Law - organizational coupling ●Centralized Database Schemas ●Enterprise Service Bus - centralized message queues ●Inflexible Protocol Versioning
  • 189. Speeding Up The Platform Datacenter Snowflakes • Deploy in months • Live for years
  • 190. Speeding Up The Platform Datacenter Snowflakes • Deploy in months • Live for years Virtualized and Cloud • Deploy in minutes • Live for weeks
  • 191. Speeding Up The Platform Datacenter Snowflakes • Deploy in months • Live for years Virtualized and Cloud • Deploy in minutes • Live for weeks Container Deployments • Deploy in seconds • Live for minutes/hours
  • 192. Speeding Up The Platform Datacenter Snowflakes • Deploy in months • Live for years Virtualized and Cloud • Deploy in minutes • Live for weeks Container Deployments • Deploy in seconds • Live for minutes/hours Lambda Deployments • Deploy in milliseconds • Live for seconds
  • 193. Speeding Up The Platform AWS Lambda is leading exploration of serverless architectures in 2016 Datacenter Snowflakes • Deploy in months • Live for years Virtualized and Cloud • Deploy in minutes • Live for weeks Container Deployments • Deploy in seconds • Live for minutes/hours Lambda Deployments • Deploy in milliseconds • Live for seconds
  • 194. Separate Concerns with Microservices http://en.wikipedia.org/wiki/Conway's_law ● Invert Conway’s Law – teams own service groups and backend stores ● One “verb” per single function micro-service, size doesn’t matter ● One developer independently produces a micro-service ● Each micro-service is it’s own build, avoids trunk conflicts ● Deploy in a container: Tomcat, AMI or Docker, whatever… ● Stateless business logic. Cattle, not pets. ● Stateful cached data access layer using replicated ephemeral instances
  • 196. http://www.infoq.com/presentations/Twitter-Timeline-Scalability http://www.infoq.com/presentations/twitter-soa http://www.infoq.com/presentations/Zipkin http://www.infoq.com/presentations/scale-gilt Go-Kit https://www.youtube.com/watch?v=aL6sd4d4hxk http://www.infoq.com/presentations/circuit-breaking-distributed-systems https://speakerdeck.com/mattheath/scaling-micro-services-in-go-highload-plus-plus-2014 State of the Art in Web Scale Microservice Architectures AWS Re:Invent : Asgard to Zuul https://www.youtube.com/watch?v=p7ysHhs5hl0 Resiliency at Massive Scale https://www.youtube.com/watch?v=ZfYJHtVL1_w Microservice Architecture https://www.youtube.com/watch?v=CriDUYtfrjs New projects for 2015 and Docker Packaging https://www.youtube.com/watch?v=hi7BDAtjfKY Spinnaker deployment pipeline https://www.youtube.com/watch?v=dwdVwE52KkU http://www.infoq.com/presentations/spring-cloud-2015
  • 197. Microservice Architectures ConfigurationTooling Discovery Routing Observability Development: Languages and Container Operational: Orchestration and Deployment Infrastructure Datastores Policy: Architectural and Security Compliance
  • 198. Microservices Edda Archaius Configuration Spinnaker SpringCloud Tooling Eureka Prana Discovery Denominator Zuul Ribbon Routing Hystrix Pytheus Atlas Observability Development using Java, Groovy, Scala, Clojure, Python with AMI and Docker Containers Orchestration with Autoscalers on AWS, Titus exploring Mesos & ECS for Docker Ephemeral datastores using Dynomite, Memcached, Astyanax, Staash, Priam, Cassandra Policy via the Simian Army - Chaos Monkey, Chaos Gorilla, Conformity Monkey, Security Monkey
  • 201. Cloud Native Storage Business Logic Database Master Fabric Storage Arrays Database Slave Fabric Storage Arrays Business Logic Cassandra Zone A nodes Cassandra Zone B nodes Cassandra Zone C nodes Cloud Object Store Backups SSDs inside arrays disrupt incumbent suppliers
  • 202. Cloud Native Storage Business Logic Database Master Fabric Storage Arrays Database Slave Fabric Storage Arrays Business Logic Cassandra Zone A nodes Cassandra Zone B nodes Cassandra Zone C nodes Cloud Object Store Backups SSDs inside ephemeral instances disrupt an entire industry SSDs inside arrays disrupt incumbent suppliers
  • 203. Cloud Native Storage Business Logic Database Master Fabric Storage Arrays Database Slave Fabric Storage Arrays Business Logic Cassandra Zone A nodes Cassandra Zone B nodes Cassandra Zone C nodes Cloud Object Store Backups SSDs inside ephemeral instances disrupt an entire industry SSDs inside arrays disrupt incumbent suppliers NetflixOSS Uses Priam to create Cassandra clusters in minutes
  • 204.
  • 205.
  • 206. Twitter Microservices Decider ConfigurationTooling Finagle Zookeeper Discovery Finagle Netty Routing Zipkin Observability Scala with JVM Container Orchestration using Aurora deployment in datacenters using Mesos Custom Cassandra-like datastore: Manhattan
  • 207. Twitter Microservices Decider ConfigurationTooling Finagle Zookeeper Discovery Finagle Netty Routing Zipkin Observability Scala with JVM Container Orchestration using Aurora deployment in datacenters using Mesos Custom Cassandra-like datastore: Manhattan Focus on efficient datacenter deployment at scale
  • 208.
  • 209.
  • 210. Gilt Microservices Decider Configuration Ion Cannon SBT Rake Tooling Finagle Zookeeper Discovery Akka Finagle Netty Routing Zipkin Observability Scala and Ruby with Docker Containers Deployment on AWS Datastores per Microservice using MongoDB, Postgres, Voldemort
  • 211. Gilt Microservices Decider Configuration Ion Cannon SBT Rake Tooling Finagle Zookeeper Discovery Akka Finagle Netty Routing Zipkin Observability Scala and Ruby with Docker Containers Deployment on AWS Datastores per Microservice using MongoDB, Postgres, Voldemort Focus on fast development with Scala and Docker
  • 212.
  • 214. Hailo Microservices Configuration Hubot Janky Jenkins Tooling go-platform Discovery go-platform RabbitMQ Routing Observability Go using AMI Container and Docker Deployment on AWS Deployment on AWS See: go-micro and https://github.com/peterbourgon/gokit
  • 215.
  • 216. Next Generation Applications Fill in the gaps, rapidly evolving ecosystem choices Wasabi LaunchDarkly Habitat Configuration Docker Desktop Spinnaker Terraform Tooling Etcd Eureka Consul Discovery Compose Linkerd Weave Routing OpenTracing Prometheus Hystrix Observability Development: Containers and languages e.g. Docker, Lambda, Node.js, Go, Rust Operational: Mesos, Kubernetes, Swarm, Nomad for private clouds. ECS, GKS for public Datastores: Orchestrated, Distributed Ephemeral e.g. Cassandra, or DBaaS e.g. DynamoDB Policy: Security compliance e.g. Vault, Docker Content Trust. Scaffolding e.g. Cloud Foundry
  • 219. Airgaps closing Industrial IoT Security blanket perimeter firewalls Datacenter to cloud transitions New systems of engagement http://peanuts.wikia.com/wiki/Linus'_security_blanket
  • 220. Policy
  • 221. A can talk to B B can talk to C A must not talk to C A B C
  • 222. Y and Z failure modes must be independent so X can always succeed Y X Z
  • 223. Availability requirements drive a need for distributed segmentation
  • 226. Over-reliance on one mechanism leads to abuse…
  • 227. Lack of coordination across many mechanisms leads to fragility
  • 229. B Accounts and Roles Who can set policy for what? Needs distributed policy management A C
  • 230. Network Segmentation Who controls the network? A B C
  • 231. Network Segmentation Datacenter policies are based on separation of duties. Tickets, Network admins and VLANs
  • 232. Network Segmentation AWS VPC networking uses developer-driven automation, loses separation of duties…
  • 233. VPC Abuse Antipattern Lots of small VPC networks for microservices, end up in IP address space capacity management hell…
  • 234. Hypervisor and Security Group Segmentation Distributed firewall rules A B CA B
  • 235. Security Group Abuse Antipattern Too many microservices need to be in the same group, overloads configuration limitations
  • 236. Kernel eBPF & Calico IPtables Segmentation Distributed firewall rules A B CA B
  • 237. IPtables Segmentation Can use IP Sets to scale Managed in the container host OS Separates routing reachability from access policy
  • 238. Docker & Weave Segmentation Docker daemon manages connections B CA B C
  • 239. proxy: build: ./proxy ports: - "8080:8080" links: - app app: build: ./app links: - db db: image: postgres Docker Compose V1 proxy app db 8080
  • 240. version: '2' services: proxy: build: ./proxy ports: - "8080:8080" networks: - front app: build: ./app networks: - front - back db: image: postgres networks: - back networks: front: back: Docker Compose V2 proxy app db 8080 front backfront
  • 241. Docker Segmentation Overlay network created and managed by Docker or Weave. DNS based lookups.
  • 242. Segmentation Scalability Real world microservices architectures have hundreds to thousands of distinct microservices
  • 243. Segmentation Scalability There’s often a few very popular microservices that everyone else wants to talk to
  • 244. In Search of Segmentation Ops Dev Datacenters AD/LDAP Roles VLAN Networks Hypervisor IPtables Docker Links
  • 245. In Search of Segmentation Ops Dev Datacenters AD/LDAP Roles VLAN Networks Hypervisor IPtables Docker Links AWS Accounts IAM Roles VPC Security Groups Calico Policy Docker Net/Weave
  • 246. Hierarchical Segmentation B CA B C E FD E F Homepage Team Security Group Reports Team Security Group VPC Z - Manage a small number of large network spaces D Setup all the layers with Terraform? AWS Account - Manage across multiple accounts containers and links
  • 248. @adrianco Advanced Microservices Topics Failure injection testing Versioning, routing Binary protocols and interfaces Timeouts and retries Denormalized data models Monitoring, tracing Simplicity through symmetry
  • 249. @adrianco Failure Injection Testing Netflix Chaos Monkey, Simian Army, FIT and Gremlin http://techblog.netflix.com/2011/07/netflix-simian-army.html http://techblog.netflix.com/2014/10/fit-failure-injection-testing.html http://techblog.netflix.com/2016/01/automated-failure-testing.html https://www.infoq.com/presentations/failure-test-research-netflix
  • 250. ● Chaos Monkey - enforcing stateless business logic ● Chaos Gorilla - enforcing zone isolation/replication ● Chaos Kong - enforcing region isolation/replication ● Security Monkey - watching for insecure configuration settings ● FIT & Gremlin - inject errors to enforce robust dependencies ● See over 100 NetflixOSS projects at netflix.github.com ● Get “Technical Indigestion” reading techblog.netflix.com Trust with Verification
  • 251. @adrianco Benefits of version aware routing Immediately and safely introduce a new version Canary test in production Use DIY feature flags, , A|B tests with Wasabi Route clients to a version so they can’t get disrupted Change client or dependencies but not both at once Eventually remove old versions Incremental or infrequent “break the build” garbage collection
  • 252. @adrianco Versioning, Routing Version numbering: Interface.Feature.Bugfix V1.2.3 to V1.2.4 - Canary test then remove old version V1.2.x to V1.3.x - Canary test then remove or keep both Route V1.3.x clients to new version to get new feature Remove V1.2.x only after V1.3.x is found to work for V1.2.x clients V1.x.x to V2.x.x - Route clients to specific versions Remove old server version when all old clients are gone
  • 253. @adrianco Protocols Measure serialization, transmission, deserialization costs Sending a megabyte of XML between microservices will make you sad, but not as sad as 10yrs ago with SOAP Use Thrift, Protobuf/gRPC, Avro, SBE internally Use JSON for external/public interfaces https://github.com/real-logic/simple-binary-encoding
  • 254. @adrianco Interfaces When you build a service, build a “driver” client for it Reference implementation error handling and serialization Release automation stress test using client Validate that service interface is usable! Minimize additional dependencies Swagger - OpenAPI Specification Datawire Quark adds behaviors to API spec
  • 255. @adrianco Interface Version Pinning Change one thing at a time! Pin the version of everything else Incremental build/test/deploy pipeline Deploy existing app code with new platform Deploy existing app code with new dependencies Deploy new app code with pinned platform/dependencies
  • 259. @adrianco Interfaces between teams Service Code Client Code Minimal Object Model Full Object Model Cache Code Common Object Model Decoupled object models
  • 260. @adrianco Interfaces between teams Service Code Client Code Minimal Object Model Service Driver Service Handler Full Object Model Cache Code Common Object Model Decoupled object models
  • 261. @adrianco Interfaces between teams Service Code Client Code Minimal Object Model Cache Driver Service Driver Service Handler Full Object Model Cache Code Cache Handler Common Object Model Decoupled object models
  • 262. @adrianco Interfaces between teams Service Code Client Code Minimal Object Model Cache Driver Service Driver Platform Platform Service Handler Full Object Model Cache Code Platform Cache Handler Common Object Model Decoupled object models
  • 263. @adrianco Interfaces between teams Service Code Client Code Minimal Object Model Cache Driver Service Driver Platform Platform Service Handler Full Object Model Cache Code Platform Cache Handler Common Object Model Versioned dependency interfacesDecoupled object models
  • 264. @adrianco Interfaces between teams Service Code Client Code Minimal Object Model Cache Driver Service Driver Platform Platform Service Handler Full Object Model Cache Code Platform Cache Handler Common Object Model Versioned dependency interfaces Versioned platform interface Decoupled object models
  • 265. @adrianco Interfaces between teams Service Code Client Code Minimal Object Model Cache Driver Service Driver Platform Platform Service Handler Full Object Model Cache Code Platform Cache Handler Common Object Model Versioned dependency interfaces Versioned platform interface Decoupled object models Versioned routing
  • 266. @adrianco Timeouts and Retries Connection timeout vs. request timeout confusion Usually setup incorrectly, global defaults Systems collapse with “retry storms” Timeouts too long, too many retries Services doing work that can never be used
  • 267. @adrianco Connections and Requests TCP makes a connection, HTTP makes a request HTTP hopefully reuses connections for several requests Both have different timeout and retry needs! TCP timeout is purely a property of one network latency hop HTTP timeout depends on the service and its dependencies connection path request path
  • 268. @adrianco Timeouts and Retries Edge Service Good Service Good Service Bad config: Every service defaults to 2 second timeout, two retries Edge Service not responding Overloaded service not responding Failed Service If anything breaks, everything upstream stops responding Retries add unproductive work
  • 269. @adrianco Timeouts and Retries Edge Service Good Service Good Service Bad config: Every service defaults to 2 second timeout, two retries Edge Service not responding Overloaded service not responding Failed Service If anything breaks, everything upstream stops responding Retries add unproductive work
  • 270. @adrianco Timeouts and Retries Edge Service Good Service Good Service Bad config: Every service defaults to 2 second timeout, two retries Edge Service not responding Overloaded service not responding Failed Service If anything breaks, everything upstream stops responding Retries add unproductive work
  • 271. @adrianco Timeouts and Retries Bad config: Every service defaults to 2 second timeout, two retries Edge service responds slowly Overloaded service Partially failed service
  • 272. @adrianco Timeouts and Retries Bad config: Every service defaults to 2 second timeout, two retries Edge service responds slowly Overloaded service Partially failed service First request from Edge timed out so it ignores the successful response and keeps retrying. Middle service load increases as it’s doing work that isn’t being consumed
  • 273. @adrianco Timeouts and Retries Bad config: Every service defaults to 2 second timeout, two retries Edge service responds slowly Overloaded service Partially failed service First request from Edge timed out so it ignores the successful response and keeps retrying. Middle service load increases as it’s doing work that isn’t being consumed
  • 274. @adrianco Timeout and Retry Fixes Cascading timeout budget Static settings that decrease from the edge or dynamic budget passed with request How often do retries actually succeed? Don’t ask the same instance the same thing Only retry on a different connection
  • 276. @adrianco Timeouts and Retries Edge Service Good Service Budgeted timeout, one retry Failed Service 3s 1s 1s Fast fail response after 2s Upstream timeout must always be longer than total downstream timeout * retries delay No unproductive work while fast failing
  • 277. @adrianco Timeouts and Retries Edge Service Good Service Budgeted timeout, failover retry Failed Service For replicated services with multiple instances never retry against a failed instance No extra retries or unproductive work Good Service
  • 278. @adrianco Timeouts and Retries Edge Service Good Service Budgeted timeout, failover retry Failed Service 3s 1s For replicated services with multiple instances never retry against a failed instance No extra retries or unproductive work Good Service Successful response delayed 1s
  • 279. @adrianco Manage Inconsistency ACM Paper: "The Network is Reliable" Distributed systems are inconsistent by nature Clients are inconsistent with servers Most caches are inconsistent Versions are inconsistent Get over it and Deal with it See http://queue.acm.org/detail.cfm?id=2655736
  • 280. @adrianco Denormalized Data Models Any non-trivial organization has many databases Cross references exist, inconsistencies exist Microservices work best with individual simple stores Scale, operate, mutate, fail them independently NoSQL allows flexible schema/object versions
  • 281. @adrianco Denormalized Data Models Build custom cross-datasource check/repair processes Ensure all cross references are up to date Read these Pat Helland papers Immutability Changes Everything http://highscalability.com/blog/2015/1/26/paper-immutability-changes-everything-by-pat-helland.html Memories, Guesses and Apologies https://blogs.msdn.microsoft.com/pathelland/2007/05/15/memories-guesses-and-apologies/ Standing on the Distributed Shoulders of Giants http://queue.acm.org/detail.cfm?id=2953944
  • 283. Cloud Native Microservices ● High rate of change Code pushes can cause floods of new instances and metrics Short baseline for alert threshold analysis – everything looks unusual ● Ephemeral Configurations Short lifetimes make it hard to aggregate historical views Hand tweaked monitoring tools take too much work to keep running ● Microservices with complex calling patterns End-to-end request flow measurements are very important Request flow visualizations get overwhelmed
  • 284. Microservice Based Architectures See http://www.slideshare.net/LappleApple/gilt-from-monolith-ruby-app-to-micro-service-scala-service-architecture
  • 285. Continuous Delivery and DevOps ● Changes are smaller but more frequent ● Individual changes are more likely to be broken ● Changes are normally deployed by developers ● Feature flags are used to enable new code ● Instant detection and rollback matters much more
  • 286. Whoops! I didn’t mean that! Reverting…
 
 Not cool if it takes 5 minutes to see it failed and 5 more to see a fix
 No-one notices if it only takes 5 seconds to detect and 5 to see a fix
  • 287. NetflixOSS Hystrix/Turbine Circuit Breaker http://techblog.netflix.com/2012/12/hystrix-dashboard-and-turbine.html
  • 288. NetflixOSS Hystrix/Turbine Circuit Breaker http://techblog.netflix.com/2012/12/hystrix-dashboard-and-turbine.html
  • 289. Low Latency SaaS Based Monitors https://www.datadoghq.com/ http://www.instana.com/ www.bigpanda.io www.vividcortex.com signalfx.com wavefront.com sysdig.com dataloop.io See www.battery.com for a list of portfolio investments
  • 290. A Tragic Quadrant Ability to scale Ability to handle rapidly changing microservices In-house tools at web scale companies Most current monitoring & APM tools Next generation APM Next generation Monitoring Datacenter Cloud Containers 100s 1,000s 10,000s 100,000s Lambda
  • 291. A Tragic Quadrant Ability to scale Ability to handle rapidly changing microservices In-house tools at web scale companies Most current monitoring & APM tools Next generation APM Next generation Monitoring Datacenter Cloud Containers 100s 1,000s 10,000s 100,000s Lambda YMMV: Opinionated approximate positioning only
  • 292. Metric to display latency needs to be less than human attention span (~10s)
  • 295. A Possible Hierarchy Continents Regions Zones Services Versions Containers Instances How Many? 3 to 5 2-4 per Continent 1-5 per Region 100’s per Zone Many per Service 1000’s per Version 10,000’s It’s much more challenging than just a large number of machines
  • 296. Flow
  • 297. Some tools can show the request flow across a few services
  • 298. Interesting architectures have a lot of microservices! Flow visualization is a big challenge. See http://www.slideshare.net/LappleApple/gilt-from-monolith-ruby-app-to-micro-service-scala-service-architecture
  • 299. Simulated Microservices Model and visualize microservices Simulate interesting architectures Generate large scale configurations Eventually stress test real tools Code: github.com/adrianco/spigo Simulate Protocol Interactions in Go Visualize with D3 See for yourself: http://simianviz.surge.sh Follow @simianviz for updates ELB Load Balancer Zuul API Proxy Karyon Business Logic Staash Data Access Layer Priam Cassandra Datastore Three Availability Zones Denominator DNS Endpoint
  • 300. Spigo Nanoservice Structure func Start(listener chan gotocol.Message) { ... for { select { case msg := <-listener: flow.Instrument(msg, name, hist) switch msg.Imposition { case gotocol.Hello: // get named by parent ... case gotocol.NameDrop: // someone new to talk to ... case gotocol.Put: // upstream request handler ... outmsg := gotocol.Message{gotocol.Replicate, listener, time.Now(), msg.Ctx.NewParent(), msg.Intention} flow.AnnotateSend(outmsg, name) outmsg.GoSend(replicas) } case <-eurekaTicker.C: // poll the service registry ... } } } Skeleton code for replicating a Put message Instrument incoming requests Instrument outgoing requests update trace context
  • 301. Flow Trace Records riak2 us-east-1 zoneC riak9 us-west-2 zoneA Put s896 Replicate riak3 us-east-1 zoneA riak8 us-west-2 zoneC riak4 us-east-1 zoneB riak10 us- west-2 us-east-1.zoneC.riak2 t98p895s896 Put us-east-1.zoneA.riak3 t98p896s908 Replicate us-east-1.zoneB.riak4 t98p896s909 Replicate us-west-2.zoneA.riak9 t98p896s910 Replicate us-west-2.zoneB.riak10 t98p910s912 Replicate us-west-2.zoneC.riak8 t98p910s913 Replicate staash us-east-1 zoneC s910 s908s913 s909s912 Replicate Put
  • 302. Open Zipkin A common format for trace annotations A Java tool for visualizing traces Standardization effort to fold in other formats Driven by Adrian Cole (currently at Pivotal) Extended to load Spigo generated trace files
  • 305. Trace for one Spigo Flow
  • 310. Migrating to Microservices See for yourself: http://simianviz.surge.sh/migration Endpoint ELB PHP MySQL MySQL Next step Controls node placement distance Select models
  • 311. Migrating to Microservices See for yourself: http://simianviz.surge.sh/migration Step 1 - Add Memcache Step 2 - Add Web Proxy Service
  • 312. Migrating to Microservices See for yourself: http://simianviz.surge.sh/migration Step 3 - Add Data Access Layer Step 4 - Add Microservices Data Access node.js memcache per zone
  • 313. Migrating to Microservices See for yourself: http://simianviz.surge.sh/migration Step 5 - Add Cassandra Step 6 - Remove MySQL 12 node cross zone Cassandra cluster MySQL
  • 314. Migrating to Microservices See for yourself: http://simianviz.surge.sh/migration Step 7 - Add Second Region Step 8 - Connect Cassandra Regions Endpoint with location routed DNS
  • 315. Migrating to Microservices See for yourself: http://simianviz.surge.sh/migration Step 9 - Add Third Region Endpoint with location routed DNS
  • 316. Definition of an architecture { "arch": "lamp", "description":"Simple LAMP stack", "version": "arch-0.0", "victim": "webserver", "services": [ { "name": "rds-mysql", "package": "store", "count": 2, "regions": 1, "dependencies": [] }, { "name": "memcache", "package": "store", "count": 1, "regions": 1, "dependencies": [] }, { "name": "webserver", "package": "monolith", "count": 18, "regions": 1, "dependencies": ["memcache", "rds-mysql"] }, { "name": "webserver-elb", "package": "elb", "count": 0, "regions": 1, "dependencies": ["webserver"] }, { "name": "www", "package": "denominator", "count": 0, "regions": 0, "dependencies": ["webserver-elb"] } ] } Header includes chaos monkey victim New tier name Tier package 0 = non Regional Node count List of tier dependencies See for yourself: http://simianviz.surge.sh/lamp
  • 317. Running Spigo $ ./spigo -a lamp -j -d 2 2016/01/26 23:04:05 Loading architecture from json_arch/lamp_arch.json 2016/01/26 23:04:05 lamp.edda: starting 2016/01/26 23:04:05 Architecture: lamp Simple LAMP stack 2016/01/26 23:04:05 architecture: scaling to 100% 2016/01/26 23:04:05 lamp.us-east-1.zoneB.eureka01....eureka.eureka: starting 2016/01/26 23:04:05 lamp.us-east-1.zoneA.eureka00....eureka.eureka: starting 2016/01/26 23:04:05 lamp.us-east-1.zoneC.eureka02....eureka.eureka: starting 2016/01/26 23:04:05 Starting: {rds-mysql store 1 2 []} 2016/01/26 23:04:05 Starting: {memcache store 1 1 []} 2016/01/26 23:04:05 Starting: {webserver monolith 1 18 [memcache rds-mysql]} 2016/01/26 23:04:05 Starting: {webserver-elb elb 1 0 [webserver]} 2016/01/26 23:04:05 Starting: {www denominator 0 0 [webserver-elb]} 2016/01/26 23:04:05 lamp.*.*.www00....www.denominator activity rate 10ms 2016/01/26 23:04:06 chaosmonkey delete: lamp.us-east-1.zoneC.webserver02....webserver.monolith 2016/01/26 23:04:07 asgard: Shutdown 2016/01/26 23:04:07 lamp.us-east-1.zoneB.eureka01....eureka.eureka: closing 2016/01/26 23:04:07 lamp.us-east-1.zoneA.eureka00....eureka.eureka: closing 2016/01/26 23:04:07 lamp.us-east-1.zoneC.eureka02....eureka.eureka: closing 2016/01/26 23:04:07 spigo: complete 2016/01/26 23:04:07 lamp.edda: closing -a architecture lamp -j graph json/lamp.json -d run for 2 seconds
  • 318. Riak IoT Architecture { "arch": "riak", "description":"Riak IoT ingestion example for the RICON 2015 presentation", "version": "arch-0.0", "victim": "", "services": [ { "name": "riakTS", "package": "riak", "count": 6, "regions": 1, "dependencies": ["riakTS", "eureka"]}, { "name": "ingester", "package": "staash", "count": 6, "regions": 1, "dependencies": ["riakTS"]}, { "name": "ingestMQ", "package": "karyon", "count": 3, "regions": 1, "dependencies": ["ingester"]}, { "name": "riakKV", "package": "riak", "count": 3, "regions": 1, "dependencies": ["riakKV"]}, { "name": "enricher", "package": "staash", "count": 6, "regions": 1, "dependencies": ["riakKV", "ingestMQ"]}, { "name": "enrichMQ", "package": "karyon", "count": 3, "regions": 1, "dependencies": ["enricher"]}, { "name": "analytics", "package": "karyon", "count": 6, "regions": 1, "dependencies": ["ingester"]}, { "name": "analytics-elb", "package": "elb", "count": 0, "regions": 1, "dependencies": ["analytics"]}, { "name": "analytics-api", "package": "denominator", "count": 0, "regions": 0, "dependencies": ["analytics-elb"]}, { "name": "normalization", "package": "karyon", "count": 6, "regions": 1, "dependencies": ["enrichMQ"]}, { "name": "iot-elb", "package": "elb", "count": 0, "regions": 1, "dependencies": ["normalization"]}, { "name": "iot-api", "package": "denominator", "count": 0, "regions": 0, "dependencies": ["iot-elb"]}, { "name": "stream", "package": "karyon", "count": 6, "regions": 1, "dependencies": ["ingestMQ"]}, { "name": "stream-elb", "package": "elb", "count": 0, "regions": 1, "dependencies": ["stream"]}, { "name": "stream-api", "package": "denominator", "count": 0, "regions": 0, "dependencies": ["stream-elb"]} ] } New tier name Tier package Node count List of tier dependencies 0 = non Regional
  • 319. Single Region Riak IoT See for yourself: http://simianviz.surge.sh/riak
  • 320. Single Region Riak IoT IoT Ingestion Endpoint Stream Endpoint Analytics Endpoint See for yourself: http://simianviz.surge.sh/riak
  • 321. Single Region Riak IoT IoT Ingestion Endpoint Stream Endpoint Analytics Endpoint Load Balancer Load Balancer Load Balancer See for yourself: http://simianviz.surge.sh/riak
  • 322. Single Region Riak IoT IoT Ingestion Endpoint Stream Endpoint Analytics Endpoint Load Balancer Normalization Services Load Balancer Load Balancer Stream Service Analytics Service See for yourself: http://simianviz.surge.sh/riak
  • 323. Single Region Riak IoT IoT Ingestion Endpoint Stream Endpoint Analytics Endpoint Load Balancer Normalization Services Enrich Message Queue Riak KV Enricher Services Load Balancer Load Balancer Stream Service Analytics Service See for yourself: http://simianviz.surge.sh/riak
  • 324. Single Region Riak IoT IoT Ingestion Endpoint Stream Endpoint Analytics Endpoint Load Balancer Normalization Services Enrich Message Queue Riak KV Enricher Services Ingest Message Queue Load Balancer Load Balancer Stream Service Analytics Service See for yourself: http://simianviz.surge.sh/riak
  • 325. Single Region Riak IoT IoT Ingestion Endpoint Stream Endpoint Analytics Endpoint Load Balancer Normalization Services Enrich Message Queue Riak KV Enricher Services Ingest Message Queue Load Balancer Load Balancer Stream Service Riak TS Analytics Service Ingester Service See for yourself: http://simianviz.surge.sh/riak
  • 326. Two Region Riak IoT IoT Ingestion Endpoint Stream Endpoint Analytics Endpoint East Region Ingestion West Region Ingestion Multi Region TS Analytics See for yourself: http://simianviz.surge.sh/riak
  • 327. Two Region Riak IoT IoT Ingestion Endpoint Stream Endpoint Analytics Endpoint East Region Ingestion West Region Ingestion Multi Region TS Analytics What’s the response time of the stream endpoint? See for yourself: http://simianviz.surge.sh/riak
  • 329. What’s the response time distribution of a very simple storage backed web service? memcached mysql disk volume web service load generator memcached
  • 331. memcached hit % memcached response mysql response service cpu time memcached hit mode mysql cache hit mode mysql disk access mode Hit rates: memcached 40% mysql 70%
  • 332. Hit rates: memcached 60% mysql 70%
  • 333. Hit rates: memcached 20% mysql 90%
  • 335. Changes made to codahale/hdrhistogram Changes made to go-kit/kit/metrics Implementation in adrianco/spigo/collect
  • 336. What to measure? Client Server GetRequest GetResponse Client Time Client Send CS Server Receive SR Server Send SS Client Receive CR Server Time
  • 337. What to measure? Client Server GetRequest GetResponse Client Time Client Send CS Server Receive SR Server Send SS Client Receive CR Response CR-CS Service SS-SR Network SR-CS Network CR-SS Net Round Trip (SR-CS) + (CR-SS) (CR-CS) - (SS-SR) Server Time
  • 338. Spigo Histogram Results Collected with: % spigo -d 60 -j -a storage -c name: storage.*.*..load00...load.denominator_serv quantiles: [{50 47103} {99 139263}] From To Count Prob Bar 20480 21503 2 0.0007 : 21504 22527 2 0.0007 | 23552 24575 1 0.0003 : 24576 25599 5 0.0017 | 25600 26623 5 0.0017 | 26624 27647 1 0.0003 | 27648 28671 3 0.0010 | 28672 29695 5 0.0017 | 29696 30719 127 0.0421 |#### 30720 31743 126 0.0418 |#### 31744 32767 74 0.0246 |## 32768 34815 281 0.0932 |######### 34816 36863 201 0.0667 |###### 36864 38911 156 0.0518 |##### 38912 40959 185 0.0614 |###### 40960 43007 147 0.0488 |#### 43008 45055 161 0.0534 |##### 45056 47103 125 0.0415 |#### 47104 49151 135 0.0448 |#### 49152 51199 99 0.0328 |### 51200 53247 82 0.0272 |## 53248 55295 77 0.0255 |## 55296 57343 66 0.0219 |## 57344 59391 54 0.0179 |# 59392 61439 37 0.0123 |# 61440 63487 45 0.0149 |# 63488 65535 33 0.0109 |# 65536 69631 63 0.0209 |## 69632 73727 98 0.0325 |### 73728 77823 92 0.0305 |### 77824 81919 112 0.0372 |### 81920 86015 88 0.0292 |## 86016 90111 55 0.0182 |# 90112 94207 38 0.0126 |# 94208 98303 51 0.0169 |# 98304 102399 32 0.0106 |# 102400 106495 35 0.0116 |# 106496 110591 17 0.0056 | 110592 114687 19 0.0063 | 114688 118783 18 0.0060 | 118784 122879 6 0.0020 | 122880 126975 8 0.0027 | Normalized probability Response time distribution measured in nanoseconds using High Dynamic Range Histogram :# Zero counts skipped |# Contiguous buckets Median and 99th percentile values service time for load generator Cache hit Cache miss
  • 339. Go-Kit Histogram Example const ( maxHistObservable = 1000000 // one millisecond sampleCount = 1000 // data points will be sampled 5000 times to build a distribution by guesstimate ) var sampleMap map[metrics.Histogram][]int64 var sampleLock sync.Mutex func NewHist(name string) metrics.Histogram { var h metrics.Histogram if name != "" && archaius.Conf.Collect { h = expvar.NewHistogram(name, 1000, maxHistObservable, 1, []int{50, 99}...) sampleLock.Lock() if sampleMap == nil { sampleMap = make(map[metrics.Histogram][]int64) } sampleMap[h] = make([]int64, 0, sampleCount) sampleLock.Unlock() return h } return nil } func Measure(h metrics.Histogram, d time.Duration) { if h != nil && archaius.Conf.Collect { if d > maxHistObservable { h.Observe(int64(maxHistObservable)) } else { h.Observe(int64(d)) } sampleLock.Lock() s := sampleMap[h] if s != nil && len(s) < sampleCount { sampleMap[h] = append(s, int64(d)) sampleLock.Unlock() } } } Nanoseconds resolution! Median and 99%ile Slice for first 500 values as samples for export to Guesstimate
  • 340. Golang Guesstimate Interface https://github.com/adrianco/goguesstimate { "space": { "name": "gotest", "description": "Testing", "is_private": "true", "graph": { "metrics": [ {"id": "AB", "readableId": "AB", "name": "memcached", "location": {"row": 2, "column":4}}, {"id": "AC", "readableId": "AC", "name": "memcached percent", "location": {"row": 2, "column": 3}}, {"id": "AD", "readableId": "AD", "name": "staash cpu", "location": {"row": 3, "column":3}}, {"id": "AE", "readableId": "AE", "name": "staash", "location": {"row": 3, "column":2}} ], "guesstimates": [ {"metric": "AB", "input": null, "guesstimateType": "DATA", "data": [119958,6066,13914,9595,6773,5867,2347,1333,9900,9404,13518,9021,7915,3733,10244,5461,12243,7931,9044,11706, 5706,22861,9022,48661,15158,28995,16885,9564,17915,6610,7080,7065,12992,35431,11910,11465,14455,25790,8339,9 991]}, {"metric": "AC", "input": "40", "guesstimateType": "POINT"}, {"metric": "AD", "input": "[1000,4000]", "guesstimateType": "LOGNORMAL"}, {"metric": "AE", "input": "=100+((randomInt(0,100)>AC)?AB:AD)", "guesstimateType": "FUNCTION"} ] } } }
  • 341. See http://www.getguesstimate.com % cd json_metrics; sh guesstimate.sh storage
  • 342. @adrianco Simplicity through symmetry Symmetry Invariants Stable assertions No special cases Single purpose components
  • 345. Serverless Architectures AWS Lambda starting to take off Google Cloud Functions, Azure Functions on the way… Open Source: IBM OpenWhisk, open-lambda.org, apex.run Startup activity: iron.io , serverless.com, nano-lambda.com
  • 346. With AWS Lambda compute resources are charged by the 100ms, not the hour First 1M node.js executions/month are free
  • 350. AWS Lambda Reference Archhttp://www.allthingsdistributed.com/2016/05/aws-lambda-serverless-reference-architectures.html
  • 351. Serverless Programming Model Event driven functions Role based permissions Whitelisted API based security Good for simple single threaded code
  • 352. Serverless Cost Efficiencies 100% useful work, no agents, overheads 100% utilization, no charge between requests No need to size capacity for peak traffic Anecdotal costs ~1% of conventional system Ideal for low traffic, Corp IT, spiky workloads
  • 353. Serverless Work in Progress Tooling for ease of use Multi-region HA/DR patterns Debugging and testing frameworks Monitoring tools vendor support Tracing - see iopipe.com
  • 354. DIY Serverless Operating Challenges Startup latency Execution overhead Charging model Capacity planning
  • 355. Intelligent Monitoring Sprinkle machine learning on everything! Diagnosis and remediation - Instana, Stackstorm Personalized role based user interfaces? Canary signature analysis - Netflix Spinnaker & OpsMx.com Event correlation & filtering - BigPanda
  • 356. Monitoring Challenges Too much new stuff Too ephemeral Too dumb
  • 358. Terabyte Memory Directions Engulf dataset in memory for analytics Balanced config for memory intensive workloads Replace high end systems at commodity cost point Explore non-volatile memory implications
  • 359. Terabyte Memory Options Flash based Diablo 64/128/256GB DDR4 DIMM Shipping now as volatile memory, future non-volatile Intel 3D XPoint - new non-volatile technology on the way Announced May 2016 AWS X1 Instance Type - over 2TB RAM for $14/hr Easy availability should drive innovation
  • 360. Diablo Memory1: Flash DIMM NO CHANGES to CPU or Server NO CHANGES to Operating System NO CHANGES to Applications ✓ UP TO 256GB DDR4 MEMORY PER MODULE ✓ UP TO 4TB MEMORY IN 2 SOCKET SYSTEM TM
  • 361. @adrianco “We see the world as increasingly more complex and chaotic because we use inadequate concepts to explain it. When we understand something, we no longer see it as chaotic or complex.” Jamshid Gharajedaghi - 2011 Systems Thinking: Managing Chaos and Complexity: A Platform for Designing Business Architecture
  • 363. Q&A Adrian Cockcroft @adrianco http://slideshare.com/adriancockcroft Technology Fellow - Battery Ventures See www.battery.com for a list of portfolio investments
  • 364. Security Visit http://www.battery.com/our-companies/ for a full list of all portfolio companies in which all Battery Funds have invested. Palo Alto Networks Enterprise IT Operations & Management Big DataCompute Networking Storage