2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace - Doug Jones and Kevin Lynch, Microservices Practitioner Virtual Summit 2017

This talk covers the past, present, and future of Microservices at Squarespace. We begin with our journey to microservices, and describe the platform that made this possible. We introduce our idea of the “Pillars of Microservices”, everything a developer needs to have a successful production service. For each pillar we describe why we think it is important and discuss the implementation and how we utilize it in our environment. Next, we look to the future evolution of our microservices environment including how we are using containerization and Kubernetes to overcome some of the problems we’ve faced with more static infrastructure.

  1. 1. AGENDA
     Part 1: Building the Pillars of Microservices
     Part 2: Containerization and Orchestration (Kubernetes)
  2. 2. Part 1: Building the Pillars
     01 The Journey to Microservices
     02 Building the Pillars of Microservices
  3. 3. Microservices Journey: A Story of Growth
     2013: small (< 50 engineers)
       build product & grow customer base
       whatever works
     2014: medium (< 100 engineers)
       we have a lot of customers now!
       whatever works doesn't work anymore
     2016: large (100+ engineers)
       architect for scalability and reliability
       organizational structures
     ?: XL (200+ engineers)
  4. 4. Challenges with a Monolith
     What were the increasingly difficult challenges with a monolith?
     ● Reliability
     ● Performance
     ● Engineering agility/speed, cross-team coupling
     ● Engineering time spent fire fighting rather than building new functionality
  5. 5. Challenges with a Monolith
     Story of an Outage... During the Super Bowl
     https://www.squarespace.com/?gclid=<unique-id>
  6. 6. Challenges with a Monolith
     Challenges with Monitoring/Finding Faults
     ● Monitoring typically starts at the edges
       ○ Think requests in, DB queries out, etc.
     ● What about the guts of the app? How much visibility do you have there?
     ● How long does it take you to recover from an issue? Find the cause and fix the issue?
  7. 7. The Journey to Microservices
     Design a Platform for Production, Remove Challenges
     ● Define Pillars: ideas we consider necessary for successful production microservices
     ● Implement these pillars as part of our platform
     ● Reduce boilerplate and reinventing the wheel syndrome
     ● Service authors get these for free and can focus on their application domain
  8. 8. Pillars
     Microservice Framework: HTTP API, Service Discovery, Software Load Balancer, Observability, Async Client, Fault Tolerance
     https://engineering.squarespace.com/blog/2017/the-pillars-of-squarespace-services
  9. 9. Platform Features
     Service Discovery, API Documentation, Structured Logging, Metrics & Dashboards, Distributed Tracing, Contextual Information, Alert Definitions, Standardized Deployments, Healthchecks, Dynamic Configuration, Client-Side Load Balancing, Latency & Fault Tolerance, Client-Side Caching, HTTP Request Builders, Code Generation, Service Dashboard, Traffic Visualization
     Groups shown on the slide: Server Platform, Client Platform, Tooling
  10. 10. Platform Features
      Service Discovery, API Documentation, Structured Logging, Metrics & Dashboards, Distributed Tracing, Contextual Information, Alert Definitions, Standardized Deployments, Healthchecks, Dynamic Configuration, Client-Side Load Balancing, Latency & Fault Tolerance, Client-Side Caching, HTTP Request Builders, Code Generation, Service Dashboard, Traffic Visualization, Async/Reactive, Alert Management, Log Aggregation
  11. 11. Building the Pillars of Microservices
      Pillar: HTTP APIs
      ● HTTP + JSON
        ○ Industry standard. Tons of tools.
      ● Solid open source Java API server platforms
        ○ Started with Dropwizard
        ○ Now on Spring Boot (configured to use Jetty and Jersey 2)
  12. 12. Building the Pillars of Microservices
      Even Easier APIs
      ● Swagger (OpenAPI Specification)
      ● Code generation
        ○ Swagger spec → models, server API, client
  13. 13. Swagger Path Example
      paths:
        /currency-info:
          put:
            tags:
              - CurrencyInfo
            description: "Creates a new {@link CurrencyInfo} resource."
            summary: Create a new currency info
            operationId: save
            parameters:
              - name: info
                in: body
                schema:
                  $ref: '#/definitions/CurrencyInfo'
            responses:
              200:
                description: ok
                schema:
                  $ref: '#/definitions/CurrencyInfo'
  14. 14. Interactive API Documentation
  15. 15. Building the Pillars of Microservices
      Pillar: Service Discovery
      ● Services announce themselves, publishing their name and host/port information
      ● Started with a simple announcement payload and found that was enough
      ● Healthchecks to mark services down
  16. 16. Building the Pillars of Microservices
      Service Discovery Systems
      ● First: Zookeeper
        ○ Complicated clients (no HTTP API)
        ○ Must build discovery on Zookeeper primitives
        ○ Strong consistency is unnecessary
        ○ Client heartbeats can’t be expanded upon
        ○ No great way to support multiple data centers
  17. 17. Building the Pillars of Microservices
      Service Discovery Systems
      ● Now: Consul
        ○ First class discovery support
        ○ Built in multi-data center support
        ○ Simple HTTP API
        ○ Configurable healthchecks
        ○ Key/value store
          ■ We use it for dynamic config and leader election
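The announcement and healthcheck described in the service discovery slides can be sketched as the JSON body a service might send to its local Consul agent. This is a minimal sketch, assuming Consul's agent registration endpoint (`PUT /v1/agent/service/register`) and its `Name`, `Address`, `Port`, and `Check` fields. The service name, address, port, and `/healthcheck` path are hypothetical; the deck does not show Squarespace's actual payload.

```python
import json


def build_registration(name, address, port, health_path="/healthcheck", interval="10s"):
    """Build a Consul service registration body with an HTTP health check.

    Field names follow Consul's agent API; all values are illustrative only.
    """
    return {
        "Name": name,
        "Address": address,
        "Port": port,
        "Check": {
            # Consul polls this URL and marks the instance unhealthy on failure.
            "HTTP": f"http://{address}:{port}{health_path}",
            "Interval": interval,
        },
    }


# Hypothetical service instance; a real service would PUT this body to its agent.
payload = build_registration("taxation", "10.0.0.12", 8080)
body = json.dumps(payload)
```

The payload stays as small as the slides suggest: a name, a host/port pair, and a check that lets the discovery system mark the instance down.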
  18. 18. Multi DC with Consul
      [Diagram: DC1 and DC2, each with three Consul servers. Labels: Service, Announce, Primary DB, Replica DB, Replicate, WAN Gossip, Consistent Set]
  19. 19. Multi DC with Consul
      [Diagram: DC1 and DC2, each with three Consul servers. Labels: Service, Primary DB, Replica DB, Replicate, Service Query ?dc=”DC2”, Remote DC forwarding]
  20. 20. Building the Pillars of Microservices
      Pillar: Software Load Balancers
      ● Avoid middleware/extra configuration
      ● Customizable logic
      ● Connection pooling
      ● System awareness to increase fault tolerance
      ● Builds on Netflix Ribbon OSS
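To illustrate what a client-side (software) load balancer does, here is a minimal sketch of round-robin selection that skips instances marked down. It is not Netflix Ribbon's API and not Squarespace's implementation; the class name, method names, and instance addresses are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Pick instances in rotation, skipping any currently marked down."""

    def __init__(self, instances):
        self._instances = list(instances)
        self._down = set()
        self._cursor = itertools.cycle(range(len(self._instances)))

    def mark_down(self, instance):
        self._down.add(instance)

    def mark_up(self, instance):
        self._down.discard(instance)

    def choose(self):
        # Try each instance at most once per call so an all-down pool fails fast.
        for _ in range(len(self._instances)):
            candidate = self._instances[next(self._cursor)]
            if candidate not in self._down:
                return candidate
        raise RuntimeError("no healthy instances available")
```

Because the selection logic lives in the client, no extra middleware hop is needed, and the health information can come straight from service discovery.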
  21. 21. Building the Pillars of Microservices
      Pillar: Observability
      ● Metrics
      ● Dashboards
      ● Distributed Tracing
      ● Structured Logging
      ● Healthchecks
      ● Alerts
  22. 22. Metrics & Dashboards
  23. 23. Distributed Tracing
  24. 24. Structured Logging
      tail -f /data/logs/taxation-access.log
      2017-03-22 07:24:45:026 GMT thread=jetty-846 contextId=JaOLrH2O contextSourceType=billing clientVersion=taxation-client-3.1 level=INFO class=AccessLog ip=10.100.101.205 method=GET uri=/api/1/taxation/rates queryString= httpVersion=HTTP/1.1 responseCode=200 responseTimeMs=39
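One benefit of the `key=value` layout above is that a log line can be split into fields without a custom grammar. The following is a minimal sketch, assuming values never contain spaces and that everything before the first `key=value` token is the timestamp; `parse_access_log` is a hypothetical helper, not part of the platform described in the talk.

```python
def parse_access_log(line):
    """Split a key=value access log line into (timestamp, fields)."""
    tokens = line.split()
    # Everything before the first key=value token is treated as the timestamp.
    first_kv = next(i for i, tok in enumerate(tokens) if "=" in tok)
    timestamp = " ".join(tokens[:first_kv])
    fields = {}
    for tok in tokens[first_kv:]:
        # partition keeps empty values (e.g. "queryString=") as empty strings.
        key, _, value = tok.partition("=")
        fields[key] = value
    return timestamp, fields


line = (
    "2017-03-22 07:24:45:026 GMT thread=jetty-846 contextId=JaOLrH2O "
    "contextSourceType=billing clientVersion=taxation-client-3.1 level=INFO "
    "class=AccessLog ip=10.100.101.205 method=GET uri=/api/1/taxation/rates "
    "queryString= httpVersion=HTTP/1.1 responseCode=200 responseTimeMs=39"
)
timestamp, fields = parse_access_log(line)
```

With the fields in a dictionary, filtering by `contextId` or aggregating `responseTimeMs` becomes a plain lookup.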
  25. 25. Contextual Information
      [Diagram labels: Client v3.1, Taxation Service, Billing Service]
      Propagated context: Context Id (JaOLrH2O), Client Version, Client Source Type
  26. 26. Building the Pillars of Microservices
      Pillar: Async Client
      ● Addresses the Fanout problem, improved latency
      ● Reactive: RxJava with RxNetty
      ● Allows greater composition and reuse. Avoid “callback hell”
  27. 27. Fanout Depicted
      [Diagram labels: Client, Application Container, Service A, Service Z, Service B, Service C, Service D]
  28. 28. Sync Execution
      [Diagram labels: Client, Application Container, Service A, Service Z, Service B, Service C, Service D; call order 1, 2, 3, 4, 5]
      Total Latency = A + B + C + D + Z
  29. 29. Async Execution
      [Diagram labels: Client, Application Container, Service A, Service Z, Service B, Service C, Service D; call order 1, 2, 2, 2, 1]
      Total Latency = max(A, Z)
      A = max(B, C, D) + A’s latency
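The two latency formulas on the sync and async slides can be compared with a small calculation. This is a sketch under stated assumptions: `a_own` is Service A's own processing time, `downstream` holds the latencies of B, C, and D, and the millisecond values are made up for illustration.

```python
def sync_latency(a_own, downstream, z):
    """Sequential calls: every latency adds up (A + B + C + D + Z)."""
    return a_own + sum(downstream) + z


def async_latency(a_own, downstream, z):
    """Concurrent calls: A waits only for its slowest dependency; A and Z overlap."""
    a_total = a_own + max(downstream, default=0)
    return max(a_total, z)


# Hypothetical latencies in milliseconds.
a_own, b, c, d, z = 20, 30, 50, 40, 60
sync_total = sync_latency(a_own, [b, c, d], z)
async_total = async_latency(a_own, [b, c, d], z)
```

With these numbers the sequential path costs 200 ms while the concurrent path costs 70 ms, since only the slowest branch at each level contributes.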
  30. 30. Building the Pillars of Microservices
      Pillar: Fault Tolerance
      ● Circuit breakers
      ● Retry logic
        ○ Much easier to implement w/ RxJava
      ● Timeouts
      ● Fallbacks (cached or static values)
      ● Netflix Hystrix
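The circuit breaker and fallback ideas listed above can be sketched in a few lines. This is a simplified illustration, not Hystrix's implementation or API: the class, its parameters, and the injectable `clock` are hypothetical, and the breaker counts consecutive failures rather than a rolling error rate.

```python
import time


class CircuitBreaker:
    """Open after repeated failures; serve the fallback while open."""

    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, func, fallback):
        # While open, skip the remote call entirely until the reset window passes.
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                return fallback()
            self._opened_at = None  # half-open: allow one trial call
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            return fallback()
        self._failures = 0
        return result
```

A caller would wrap each remote request, for example `breaker.call(fetch_rates, lambda: cached_rates)`, where both names stand in for application code.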
  31. 31. Fault Tolerance
      [Diagram labels: User Request, Application Container, Service A Client, Service B Client, Service C Client, Service A, Service B, Service C]
  32. 32. Fault Tolerance
      Fail fast, fail silent, or fallback
      [Diagram labels: User Request, Application Container, Service A Client (10 Threads), Service B Client (5 Threads), Service C Client (5 Threads), Service A, Service B, Service C]
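The per-client thread limits on the slide (10, 5, and 5) isolate each dependency so that one slow service cannot exhaust the whole application container. A minimal sketch of that isolation follows, using a semaphore as a stand-in for a dedicated thread pool; the `Bulkhead` class and the service keys are hypothetical.

```python
import threading


class Bulkhead:
    """Cap concurrent calls to one dependency; reject instead of queueing."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func, fallback):
        # Non-blocking acquire: if all slots are taken, fail fast to the fallback.
        if not self._slots.acquire(blocking=False):
            return fallback()
        try:
            return func()
        finally:
            self._slots.release()


# Limits mirror the slide; the dictionary keys are illustrative names.
bulkheads = {
    "service-a": Bulkhead(10),
    "service-b": Bulkhead(5),
    "service-c": Bulkhead(5),
}
```

When a dependency's slots are full, callers get the fallback immediately, which corresponds to the "fail fast, fail silent, or fallback" options on the slide.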
  33. 33. Pillars
      Microservice Framework: HTTP API, Service Discovery, Software Load Balancer, Observability, Async Client, Fault Tolerance
      https://engineering.squarespace.com/blog/2017/the-pillars-of-squarespace-services
  34. 34. Building the Pillars of Microservices
      Future Work
      ● Entirely Async Systems
        ○ Async servers, Streaming, gRPC, Netty
      ● Distributed task management
        ○ Serverless computing
      ● Easier/better alert definition and management
      ● Better tooling to create and deploy services
  35. 35. Part 2: Containerization & Kubernetes Orchestration
      01 The problem with static infrastructure
      02 Kubernetes in a datacenter?
      03 Challenges
  36. 36. Containerization & Kubernetes Orchestration
      Stuck in a loop
      ● Engineering org grows...
      ● More services…
      ● More infrastructure to spin up…
      ● Ops becomes a blocker...
  37. 37. Containerization & Kubernetes Orchestration
      Static infrastructure and microservices do not mix!
      ● Difficult to find resources
      ● Slow to provision and scale
      ● Already have discovery!
      ● Metrics system must support short lived metrics
      ● Alerts are usually per instance
  38. 38. Traditional Provisioning Process
      ● Pick ESX with available resources
      ● Pick IP
      ● Register host to Cobbler
      ● Register DNS entry
      ● Create new VM on ESX
      ● PXE boot VM and install OS and base configuration
      ● Install system dependencies (LDAP, NTP, CollectD, Sensu…)
      ● Install app dependencies (Java, FluentD/Filebeat, Consul, Mongo-S…)
      ● Install the app
      ● App registers with discovery system and begins receiving traffic
  39. 39. Kubernetes Provisioning Process
      ● kubectl apply -f app.yaml
  40. 40. Containerization & Kubernetes Orchestration
      So how do we make this magic work?
      ● Provisioning/Scaling: Kubernetes
      ● Monitoring: Prometheus
      ● Alerting: AlertManager
      ● Discovery: Consul + Kubernetes
      ● Decentralization
  41. 41. Kubernetes in a datacenter?
  42. 42. Kubernetes Architecture
  43. 43. Spine and Leaf Layer 3 Clos Topology
      ● Each leaf switch represents a Top-of-Rack switch (ToR)
      ● All work is performed at the leaf switch
      ● Each leaf switch is a separate Layer 3 domain
      ● Each leaf is a separate BGP domain (ASN)
      ● No Spanning Tree Protocol issues seen in L2 networks (convergence time, loops)
      [Diagram: two Spine switches, four Leaf switches]
  44. 44. Spine and Leaf Layer 3 Clos Topology
      ● Simple to understand
      ● Easy to scale
      ● Predictable and consistent latency (hops = 2)
      ● Allows for Anycast IPs
      [Diagram: two Spine switches, four Leaf switches]
  45. 45. Calico Networking
      ● No network overlay required
      ● Communicates directly with existing L3 mesh network
      ● BGP Peering with Top of Rack switch
      ● Calico supports Kubernetes NetworkPolicy firewall rules
  46. 46. Monitoring
  47. 47. Traditional Monitoring & Alerting
      ● Graphite does not scale well with ephemeral instances
      ● Easy to have combinatoric explosion of metrics
      ● Application and system alerts are tightly coupled
      ● Difficult to create alerts on SLAs
      ● Difficult to route alerts
  48. 48. Kubernetes Monitoring & Alerting
  49. 49. Kubernetes Monitoring & Alerting
  50. 50. Kubernetes Monitoring & Alerting
  51. 51. Microservice Pod Definition
      resources:
        requests:
          cpu: 2
          memory: 4Gi
        limits:
          cpu: 2
          memory: 4Gi
      Microservice Pod containers: Java Microservice, fluentd, consul
  52. 52. Challenges
  53. 53. Microservice Pod Definition
      resources:
        requests:
          cpu: 2
          memory: 4Gi
        limits:
          cpu: 2
          memory: 4Gi
      ● Kubernetes assumes no other processes are consuming significant resources
      ● Completely Fair Scheduler (CFS)
        ○ Schedules a task based on CPU Shares
        ○ Throttles a task once it hits CPU Quota
  54. 54. Microservice Pod Definition
      resources:
        requests:
          cpu: 2
          memory: 4Gi
        limits:
          cpu: 2
          memory: 4Gi
      ● Shares = CPU Request * 1024
      ● Total Kubernetes Shares = # Cores * 1024
      ● Quota = CPU Limit * 100ms
      ● Period = 100ms
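The share and quota formulas above can be worked through for the pod definition on the slide. This is a sketch of the arithmetic only; the function name and the dictionary keys (chosen to mirror cgroup v1 file names) are assumptions for illustration.

```python
CFS_PERIOD_US = 100_000  # 100ms period, as stated on the slide


def cgroup_cpu_settings(cpu_request, cpu_limit, period_us=CFS_PERIOD_US):
    """Translate Kubernetes CPU request/limit into CFS shares and quota."""
    return {
        # Shares = CPU Request * 1024
        "cpu.shares": int(cpu_request * 1024),
        # Quota = CPU Limit * period, expressed in microseconds
        "cpu.cfs_quota_us": int(cpu_limit * period_us),
        "cpu.cfs_period_us": period_us,
    }


# The pod from the slide: requests.cpu = 2, limits.cpu = 2.
settings = cgroup_cpu_settings(cpu_request=2, cpu_limit=2)
```

For `cpu: 2` this gives 2048 shares and a 200ms quota per 100ms period, meaning the container is throttled once it has consumed two cores' worth of CPU time within a period.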
  55. 55. Java in a Container
      ● JVM is able to detect # of cores via sysconf(_SC_NPROCESSORS_ONLN)
      ● Scales tasks relative to this
  56. 56. Java in a Container
      ● Provide a base container that calculates the container’s resources!
      ● Detect # of “cores” assigned
        ○ /sys/fs/cgroup/cpu/cpu.cfs_quota_us divided by /sys/fs/cgroup/cpu/cpu.cfs_period_us
      ● Automatically tune the JVM:
        ○ -XX:ParallelGCThreads=${core_limit}
        ○ -XX:ConcGCThreads=${core_limit}
        ○ -Djava.util.concurrent.ForkJoinPool.common.parallelism=${core_limit}
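The core detection and JVM tuning steps above can be sketched as two small functions. The deck does not show the base container's actual script, so this is an illustration under assumptions: fractional results are rounded up, a non-positive quota (cgroup v1 reports -1 when no limit is set) falls back to the host core count, and the function names are hypothetical.

```python
import math


def core_limit(quota_us, period_us, host_cores):
    """Derive the number of cores assigned from the CFS quota and period."""
    # A quota of -1 means no CFS limit is set, so use the host core count.
    if quota_us <= 0 or period_us <= 0:
        return host_cores
    # Assumption: round up, and never report fewer than one core.
    return max(1, math.ceil(quota_us / period_us))


def jvm_flags(cores):
    """Build the JVM flags listed on the slide for a given core limit."""
    return [
        f"-XX:ParallelGCThreads={cores}",
        f"-XX:ConcGCThreads={cores}",
        f"-Djava.util.concurrent.ForkJoinPool.common.parallelism={cores}",
    ]


# Values as they would be read from cpu.cfs_quota_us and cpu.cfs_period_us
# for a container limited to 2 CPUs on a hypothetical 32-core host.
flags = jvm_flags(core_limit(200_000, 100_000, host_cores=32))
```

In a container entrypoint, the two inputs would be read from the cgroup files named on the slide and the resulting flags passed to the `java` command.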
  57. 57. Java in a Container
      ● Many libraries rely on Runtime.getRuntime().availableProcessors()
        ○ Jetty
        ○ ForkJoinPool
        ○ GC Threads
        ○ That mystery dependency...
  58. 58. Java in a Container
      ● Use Linux preloading to override availableProcessors()
      #include <stdlib.h>
      #include <unistd.h>

      /* Report the container's core limit when set; otherwise the host's online CPUs. */
      int JVM_ActiveProcessorCount(void) {
          char* val = getenv("CONTAINER_CORE_LIMIT");
          return val != NULL ? atoi(val) : sysconf(_SC_NPROCESSORS_ONLN);
      }
      https://engineering.squarespace.com/blog/2017/understanding-linux-container-scheduling
  59. 59. Communication With External Services
      ● Environment specific services should not be encoded in application
      ● Single deployment for all environments and datacenters
      ● Federation API expects same deployment
      ● Not all applications are using consul
  60. 60. Communication With External Services
  61. 61. Communication With External Services
      apiVersion: v1
      kind: Service
      metadata:
        name: kafka
        namespace: elk
      spec:
        type: ClusterIP
        clusterIP: None
        sessionAffinity: None
        ports:
          - port: 9092
            protocol: TCP
            targetPort: 9092

      apiVersion: v1
      kind: Endpoints
      metadata:
        name: kafka
        namespace: elk
      subsets:
        - addresses:
            - ip: 10.120.201.33
            - ip: 10.120.201.34
            - ip: 10.120.201.35
            ...
          ports:
            - port: 9092
              protocol: TCP
  62. 62. So what’s left?
  63. 63. Future Work: Enforce Squarespace Standards
      ● Custom Admission Controller requires all services, deployments, etc. meet certain standards
        ○ Resource requests/limits
        ○ Owner annotations
        ○ Service labels
  64. 64. Future Work: Updating Common Dependencies
      ● Custom Initializers
        ○ Inject container dependencies into deployments (consul, fluentd)
        ○ Configure Prometheus instances for each namespace
      ● Trigger rescheduling of pods when dependencies need updating
      apiVersion: extensions/v1beta1
      kind: Deployment
      metadata:
        name: location
        namespace: core-services
        annotations:
          initializer.squarespace.net/consul: "true"
  65. 65. QUESTIONS
      Thank you!
      squarespace.com/careers
      Doug Jones @dougfjones
      Kevin Lynch @kevml
