1. Cloud Native London
Skills Matter
03/10/2017
linkerd: The Cloud Native Service Mesh
Make your application and infrastructure more resilient, safer and faster
Dario Simonetti
askattest.com @AskAttest
2. Summary
● What is a service mesh?
● Why do I need one?
● The linkerd approach
● Ain’t all sunshine and rainbows
3. Hello
● Head of Core Engineering at Attest
● Responsible for backend, architecture and infrastructure
● Mild obsession for clean code and product quality
● Previously at OVO Energy
● Sorry, me no speak k8s
https://dario.tech
Dario Simonetti
6. What is a service mesh?
● There is no clear definition
● Handles service-to-service communication
● Control Plane (namerd, Istio, NGINX Controller, …)
○ “Centralised brain” with controller logic
● Data Plane (linkerd, envoy, Weave, NGINX Plus, ...)
○ Application-unaware set of lightweight network proxies
7. Why do I need one?
“The way that microservices interact with each other at runtime needs to be
monitored, managed, and controlled”
https://buoyant.io/2017/04/25/whats-a-service-mesh-and-why-do-i-need-one
8. Why do I need one?
● Client-side load balancing
● Circuit breaking
● Service discovery
● Retries and deadlines
● TLS
● Connection pooling
● Distributed tracing
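To make one item on this list concrete, here is a minimal, single-threaded sketch of circuit breaking in Python. It is an illustration of the general pattern only, not linkerd's implementation; the names `CircuitBreaker`, `CircuitOpenError`, `max_failures` and `reset_timeout` are hypothetical and chosen for this example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical, single-threaded circuit breaker (illustrative only)."""

    def __init__(self, max_failures=3, reset_timeout=5.0, clock=time.monotonic):
        self.max_failures = max_failures    # consecutive failures before opening
        self.reset_timeout = reset_timeout  # seconds to wait before a trial call
        self.clock = clock                  # injectable so tests can control time
        self.failures = 0
        self.opened_at = None               # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast instead of hitting an unhealthy service.
                raise CircuitOpenError("circuit open, failing fast")
            # Timeout elapsed: half-open, so let this one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens it.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Without a mesh, every service in every language needs something like this wrapped around its outbound calls. A data-plane proxy applies the same idea outside the application.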
9. The Eight Fallacies of Distributed Computing
1. The network is reliable
2. Latency is zero
3. Bandwidth is infinite
4. The network is secure
5. Topology doesn't change
6. There is one administrator
7. Transport cost is zero
8. The network is homogeneous
https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing
11. The linkerd approach
● It’s a Data Plane, but integrates with Control Planes including namerd and Istio
● JVM based, built on battle-tested Netty and Finagle (Twitter)
● Open Source + great community
● Does one thing well and is open to integration with other tools that do their thing well
13. The linkerd approach
Different deployment models:
● Once per host
● Once per service (sidecar)
Different deployment configurations:
● service-to-linker
● linker-to-service
● linker-to-linker
14. Ain’t all sunshine and rainbows
● Big memory footprint (~256 MB)
● Opening a new connection takes a lot of time (>100 ms)
● Some service mesh configuration should live in your application logic instead
15. Thanks for listening!
Questions?
Website: https://linkerd.io
GitHub: https://github.com/linkerd/linkerd
Examples: https://github.com/linkerd/linkerd-examples
Slack: https://linkerd.slack.com
linkerd on AWS ECS: https://medium.com/attest-engineering/937f201f847a
Editor's Notes
Does anybody know what this is?
Can anybody guess what number 83 is?
Attest:
Small startup
Started April 2015
Seed round Oct 2015
I joined a bit more than a year ago
Now 10 employees / 4 developers
Just closed series A round
Going to 18-20 employees / 8 developers
We’re hiring!
Technologies:
Microservices (not-so-micro)
Java
Spring
Go
Postgres, Redis, SimpleDB, ElastiCache
AWS
Git
Jenkins
Objective:
Reduce time-to-market while not compromising quality
Roadmap & vision:
Event-driven architecture (Kafka/Kinesis)
Plenty of microservices to reduce conflicts and increase productivity/efficiency
“Infrastructure layer that handles service-to-service communication”
Control Plane = “centralised brain”, APIs and controller logic
Data Plane = service proxies
Who hasn’t?
Why it’s getting even more important
1. It can fail at any layer of the ISO/OSI model, from cut or bitten cables to outages in big data centres (load balancing, circuit breaking, retries)
2. At 300,000 km/s, a ping from Europe to the USA and back takes at least 30 ms (load balancing)
3. Bottlenecks (load balancing, circuit breaking)
4. Unknown unknowns (TLS)
5. Services are volatile, come up and go down all the time (Service discovery)
6. More than one, less safe, risky (e.g. Edward Snowden)
7. Two sides:
Monetary amount
Transforming application-layer data (HTTP) into transport-layer data (TCP) takes time for serialization and deserialization (TCP itself, retries). Opening a connection takes time (connection pooling)
8. More of an exception these days, but something to keep in mind when integrating with 3rd parties
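The latency figure in note 2 can be checked with a rough calculation. This is a sketch under stated assumptions: the distance used (about 5,570 km, roughly the London to New York great-circle distance) is an assumption, not a number from the talk, and `min_round_trip_ms` is a hypothetical helper.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in a vacuum


def min_round_trip_ms(distance_km, speed_km_s=SPEED_OF_LIGHT_KM_S):
    """Lower bound on round-trip time: there and back at the given speed."""
    return 2 * distance_km / speed_km_s * 1000


# Assumed distance: ~5,570 km (approximate London to New York great circle).
vacuum_ms = min_round_trip_ms(5_570)
# Light in optical fibre travels at roughly two thirds of c (~200,000 km/s).
fibre_ms = min_round_trip_ms(5_570, speed_km_s=200_000)

print(f"vacuum: {vacuum_ms:.1f} ms, fibre: {fibre_ms:.1f} ms")
```

Under these assumptions the physical floor is about 37 ms in a vacuum and about 56 ms in fibre, before any routing, queueing or processing delay is added.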
Ok you make it more resilient, safer and faster
You don’t need a service mesh for that
If you have heterogeneous services (different languages)
Quicker and decrease time to market