Giacomo Tirabassi [InfluxData] | Istio at InfluxData | InfluxDays Virtual Experience London 2020

Giacomo Tirabassi
Istio at InfluxData

© 2020 InfluxData. All rights reserved.
Who am I?
Italian
SRE @ InfluxData (keep Cloud2 running)
Running Kubernetes since version 1.8 (weird flex but OK)
Travel, eat, cook, repeat

Agenda
1. What is a service mesh?
2. Why do we need a service mesh?
3. How do we run it?
4. Outcome
5. Roadmap

What is a service mesh?
Volume 1

Alternatives
● Istio (Google)
● Linkerd (CNCF, Buoyant)
● Consul Connect (HashiCorp)

What is a Service Mesh?
Extract functionalities from application into the platform
● Networking: retries, timeout, rate limits, circuit breakers, canary
● Security: mTLS or JWT for authentication, authorization policies
● Observability: automatic tracing, access logs, protocol-specific metrics
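
To make the "Networking" bullet concrete, here is a minimal Python sketch of the kind of retry logic an application would otherwise carry itself. With a service mesh, this behaviour moves out of the application and into the sidecar proxy. The function name and parameters are hypothetical and only illustrate the idea; this is not how Istio implements retries.

```python
import time


def call_with_retries(func, attempts=3, base_delay=0.1):
    """Call func(); on failure, retry with exponential backoff.

    Hypothetical helper: `attempts` is the total number of tries and
    `base_delay` is the wait (in seconds) before the first retry.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as err:  # a real client would catch narrower errors
            last_error = err
            if attempt < attempts - 1:
                # Wait base_delay, 2*base_delay, 4*base_delay, ...
                time.sleep(base_delay * (2 ** attempt))
    raise last_error
```

Every service in every language would need its own copy of this logic, which is the duplication a mesh removes.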

Kubernetes : Linux Process = Istio : HTTP Request

Why does it make sense?
Extract functionalities from application into the platform
● Unix philosophy: “do one thing and do it well”
● Polyglot platforms: library-based approaches like Hystrix don't work when
multiple languages are used
● Zero trust networks: just because it's in your VPC doesn't mean it's
secure, reduce blast radius

Let it sink in
Platform All Apps

Why do we need a service mesh?
Volume 2

Initial reasons
● Canary deployments
● Defining better SLAs and SLOs
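
As an illustration of the canary idea, the sketch below splits traffic between two versions by weight. In Istio this split is declared as weighted destinations in a VirtualService and enforced by the proxy; the Python function, version names, and weights here are hypothetical and only model the behaviour.

```python
import random


def pick_backend(weights, rng=random):
    """Pick a version name at random, proportionally to its weight.

    `weights` maps a version name to a non-negative weight, for example
    {"stable": 90, "canary": 10}. Names and numbers are illustrative.
    """
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs are reproducible
    hits = {"stable": 0, "canary": 0}
    for _ in range(10_000):
        hits[pick_backend({"stable": 90, "canary": 10}, rng)] += 1
    print(hits)
```

Shifting the weights step by step (for example 90/10, then 50/50, then 0/100) is what turns this into a gradual rollout.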

Moooar complexity
Service Mesh
CNI
Kubernetes
Containers
Linux OS
Cloud Provider

Migrating production services...

How do we install and run Istio?
Volume 3

Istio Deployment
From version 1.2 to 1.4
● We kept our fork of Helm installation in Jsonnet
● Configuration mess
Version 1.5 and 1.6
● Istio operator with istiod: single control plane component
● https://github.com/influxdata/helm-charts/tree/master/istio

Prometheus-less deployment
● With istio-mixer (metrics aggregator)
○ Telegraf sidecar for istio-mixer to scrape metrics
● Without istio-mixer
○ Sidecar with telegraf-operator for istiod
○ telegraf-operator adds a Telegraf sidecar to every pod that uses Istio, to
scrape `http://127.0.0.1:15090/stats/prometheus`
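
The endpoint above serves metrics in the Prometheus text format. The sketch below shows, under simplifying assumptions, what reading it involves; in the deployment described here Telegraf does this work, not custom code. The function names and the sample payload are made up for illustration, and the parser assumes one sample per line with no trailing timestamp.

```python
import urllib.request

# Illustrative payload only; real output from the sidecar is much longer.
SAMPLE = """\
# TYPE envoy_cluster_upstream_rq counter
envoy_cluster_upstream_rq{response_code="200",cluster_name="example"} 42
envoy_server_uptime 1234
"""


def parse_prometheus_text(text):
    """Return a list of (series, value) pairs from Prometheus text format.

    Simplified: skips comment and blank lines and assumes the value is the
    last space-separated token on the line (no timestamps).
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, _, value = line.rpartition(" ")
        samples.append((series, float(value)))
    return samples


def scrape(url="http://127.0.0.1:15090/stats/prometheus", timeout=5):
    """Fetch and parse the endpoint; only works from inside the pod."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_prometheus_text(resp.read().decode("utf-8"))


if __name__ == "__main__":
    for series, value in parse_prometheus_text(SAMPLE):
        print(series, value)
```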

Istio caused outage
● Running Istio 1.3 and defining a Kubernetes Service with port name
`http` and port number `443` broke all communication from applications in
the mesh to external services exposed over HTTPS
● Upstream bug: https://github.com/istio/istio/issues/16458

Istio caused outage
● Solved in 1.4 upstream
● We solved it with conftest
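
Conftest policies are written in Rego; the slides do not show the actual policy. As a stand-in, the Python sketch below expresses the same kind of rule: reject a Service manifest whose port is named `http` but uses port number `443`. The function name is hypothetical, and treating `http-<suffix>` names the same way is an assumption based on Istio's `<protocol>[-<suffix>]` port naming convention.

```python
def http_named_443_ports(manifest):
    """Return violation messages for a parsed Kubernetes manifest (a dict).

    Flags Service ports that are named like plain HTTP but listen on 443,
    the combination described in the outage above.
    """
    if manifest.get("kind") != "Service":
        return []
    service = manifest.get("metadata", {}).get("name", "<unnamed>")
    violations = []
    for port in manifest.get("spec", {}).get("ports", []):
        name = port.get("name") or ""
        looks_like_http = name == "http" or name.startswith("http-")
        if looks_like_http and port.get("port") == 443:
            violations.append(
                f"Service {service}: port 443 must not be named {name!r}"
            )
    return violations


if __name__ == "__main__":
    bad = {
        "kind": "Service",
        "metadata": {"name": "example"},
        "spec": {"ports": [{"name": "http", "port": 443}]},
    }
    print(http_named_443_ports(bad))
```

Running a check like this in CI blocks the problematic manifest before it reaches the cluster, regardless of which Istio version is deployed.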


Dashboards soon to be in
https://github.com/influxdata/community-templates

Roadmap (1 of 3)
● Switch to mixer-less monitoring
● More services enrolled in the mesh
● Ingress gateway
● Configure tracing (flip the switch)
● Enable access logs

Roadmap (2 of 3)
● Set the mTLS policy to STRICT, so that Envoy refuses any connection from
within the mesh that is not over mTLS
● Set outbound traffic policy to REGISTRY_ONLY, so that an app can reach only
endpoints inside the mesh and hosts declared in a ServiceEntry
● Start using an Egress Gateway for outbound alert traffic
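
The effect of REGISTRY_ONLY can be modelled in a few lines. The sketch below is a simplified model, not Envoy's implementation: the function names are hypothetical, the registry is a plain list of host names standing in for mesh services plus ServiceEntry hosts, and wildcard handling is reduced to a leading `*.` prefix. `ALLOW_ANY` and `REGISTRY_ONLY` are the two modes of Istio's outbound traffic policy.

```python
def host_matches(host, pattern):
    """Match an exact host, or a wildcard pattern such as '*.example.com'."""
    if pattern.startswith("*."):
        return host.endswith(pattern[1:])  # compares against '.example.com'
    return host == pattern


def is_egress_allowed(host, registry_hosts, mode="REGISTRY_ONLY"):
    """Decide whether an outbound request to `host` would be let through.

    `registry_hosts` stands in for everything the mesh knows about:
    in-mesh services and hosts declared in ServiceEntry resources.
    """
    if mode == "ALLOW_ANY":
        return True
    if mode != "REGISTRY_ONLY":
        raise ValueError(f"unknown outbound traffic policy mode: {mode}")
    return any(host_matches(host, pattern) for pattern in registry_hosts)


if __name__ == "__main__":
    registry = ["payments.default.svc.cluster.local", "*.example.com"]
    for host in ("api.example.com", "unknown.invalid"):
        print(host, is_egress_allowed(host, registry))
```

Under this model an unlisted host is refused by default, which is the blast-radius reduction the zero-trust slide argues for.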

Roadmap (3 of 3)
● Sidecar CRD for all workloads (reduce the number of cross-namespace
connections)
● Start reducing cross-service access with PeerAuthentication
● Cross-region failover with multi-cluster configuration
● Contribute to Kiali to work with InfluxDB

The End
Twitter / LinkedIn / GitHub
@gitirabassi
