Gianluca Arbezzano
Site Reliability Engineer @InfluxData
● https://gianarb.it
● @gianarb
What I like:
● I make dirty hacks that look awesome
● I grow my vegetables 🍅🌻🍆
● Travel for fun and work
@gianarb - gianluca@influxdb.com
© 2018 InfluxData. All rights reserved.
DevOps likes automation.
Automation likes code.
YAML is not code.
Inspired by true events
Kubernetes
1. You know! Your team knows and uses Docker for local development and testing.
2. Kubernetes! Everyone talks about Kubernetes.
3. Hire! You don't know why, but you hired a DevOps engineer who kind of knows k8s.
4. Excitement! You are moving everything and everyone to Kubernetes.
We need to get our hands dirty
● Spin up a cluster that you can break
● Deploy CI on Kubernetes
● Run your code in prod
At every step: bring developers into the loop
K8s as code: From YAML to code (golang)
1. You can use Go autocomplete as documentation and as a reference for every Kubernetes resource.
2. You feel less like a YAML engineer (great feeling, btw).
3. Code is better than YAML! You can reuse it, compile it, embed it in other projects.
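As a rough illustration of the three points above, here is a minimal sketch of a Deployment described in Go instead of YAML. The struct types are simplified, hypothetical stand-ins for the real types in k8s.io/api (they model only a handful of fields, and the manifest is not complete enough to apply to a cluster), and `AgentDeployment` is a name invented for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified, hypothetical stand-ins for the real Kubernetes API types.
// Only a few fields are modelled; selector and containers are left out.
type ObjectMeta struct {
	Name   string            `json:"name,omitempty"`
	Labels map[string]string `json:"labels,omitempty"`
}

type PodTemplate struct {
	Metadata ObjectMeta `json:"metadata"`
}

type DeploymentSpec struct {
	Replicas int         `json:"replicas"`
	Template PodTemplate `json:"template"`
}

type Deployment struct {
	APIVersion string         `json:"apiVersion"`
	Kind       string         `json:"kind"`
	Metadata   ObjectMeta     `json:"metadata"`
	Spec       DeploymentSpec `json:"spec"`
}

// AgentDeployment builds the resource in code: the labels are written once
// and reused, instead of being copy-pasted between YAML blocks.
func AgentDeployment(release string, replicas int) Deployment {
	labels := map[string]string{
		"app":       "drone",
		"release":   release,
		"component": "agent",
	}
	return Deployment{
		APIVersion: "apps/v1",
		Kind:       "Deployment",
		Metadata:   ObjectMeta{Name: release + "-agent", Labels: labels},
		Spec: DeploymentSpec{
			Replicas: replicas,
			Template: PodTemplate{Metadata: ObjectMeta{Labels: labels}},
		},
	}
}

func main() {
	out, err := json.MarshalIndent(AgentDeployment("ci", 2), "", "  ")
	if err != nil {
		panic(err)
	}
	// Print the resource as a JSON manifest.
	fmt.Println(string(out))
}
```

With the real client libraries the compiler and the editor autocomplete play the role that the documentation plays for YAML: a misspelled field name fails the build instead of failing at deploy time.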
K8s as code: From YAML to code (golang)
● Tiny CLI to make the migration to Go
● Some manual refactoring
● Continue to improve our CI to validate that the YAML and Go files are the same,
and that the resources in Kubernetes match the Go files.
● Maybe we will be able to remove the YAML at some point.
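One way such a CI check could look is sketched below. It assumes both sides have already been rendered to JSON (the Go standard library has no YAML parser, so the YAML-to-JSON step is left to another tool), and `SameManifest` is a hypothetical helper name, not part of any Kubernetes library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// SameManifest reports whether two JSON documents describe the same object,
// ignoring key order and whitespace.
func SameManifest(a, b []byte) (bool, error) {
	var x, y interface{}
	if err := json.Unmarshal(a, &x); err != nil {
		return false, fmt.Errorf("first manifest: %w", err)
	}
	if err := json.Unmarshal(b, &y); err != nil {
		return false, fmt.Errorf("second manifest: %w", err)
	}
	return reflect.DeepEqual(x, y), nil
}

func main() {
	// Assumed inputs: the manifest converted from YAML and the one
	// generated from the Go code.
	fromYAML := []byte(`{"kind":"Deployment","spec":{"replicas":2}}`)
	fromGo := []byte(`{"spec": {"replicas": 2}, "kind": "Deployment"}`)

	same, err := SameManifest(fromYAML, fromGo)
	if err != nil {
		panic(err)
	}
	fmt.Println("manifests match:", same)
}
```

The same comparison could be pointed at the object returned by the cluster, although server-populated fields such as status would need to be stripped first.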
GitOps
Your Git repository is the entry point for all your code changes.
Infrastructure is 'as code', so the place where you make it happen should be Git.
Read more on weave.works:
https://www.weave.works/technologies/gitops/
The secret of success
Don't be scared: write your own tools!
Why is Kubernetes
so powerful, complex
and widely adopted?
Why is AWS
so expensive?
What do you do
to justify these costs?
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: {{ template "drone.fullname" . }}-agent
  labels:
    app: {{ template "drone.name" . }}
    chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    release: "{{ .Release.Name }}"
    heritage: "{{ .Release.Service }}"
    component: agent
spec:
  replicas: {{ .Values.agent.replicas }}
  template:
    metadata:
      annotations:
        checksum/secrets: {{ include (print $.Template.BasePath "/secrets.yaml") . | sha256sum }}
{{- if .Values.agent.annotations }}
{{ toYaml .Values.agent.annotations | indent 8 }}
{{- end }}
      labels:
        app: {{ template "drone.name" . }}
        release: "{{ .Release.Name }}"
        component: agent
APIs are
the key to
your success!
Image credit: Pixabay
containerd.io
We use Docker as a
replacement for systemd
for process management
DIND - Docker in Docker
The host's Docker socket is mounted into the container, so the Docker CLI inside it talks to the host daemon:
$ docker run -it --rm -v /var/run/docker.sock:/var/run/docker.sock docker sh
$ docker info
Containers: 48
 Running: 1
 Paused: 0
 Stopped: 47
containerd version: 9f2e07b1fc1342d1c48fe4d7bbb94cb6d1bf278b.m
runc version: 871ba2e58e24314d1fab4517a80410191ba5ad01
init version: fec3683
Kernel Version: 4.20.13-arch1-1-ARCH
Operating System: Arch Linux
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.42GiB
Name: gianarb
UX for Ops.
Because everyone needs to feel
at home...
Instrumentation, observability
and monitoring
~ @gianarb - https://gianarb.it ~
The secret is all about how you
combine things together
Metrics
Logs
Traces
Often our
aggregations look
a bit twisted...
Distributed Tracing
Tracing is a way to correlate
logs using a set of IDs
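To make the idea concrete, here is a minimal sketch of correlating log lines from different services through a shared trace ID. The `LogLine` type, the service names and the IDs are all invented for the example; a real tracing system also tracks span IDs, parent relationships and timings.

```go
package main

import (
	"fmt"
	"sort"
)

// LogLine is a hypothetical structured log entry that carries a trace ID.
type LogLine struct {
	TraceID string
	Service string
	Message string
}

// GroupByTrace correlates log lines from different services by trace ID,
// keeping the arrival order of the lines inside each trace.
func GroupByTrace(lines []LogLine) map[string][]LogLine {
	traces := make(map[string][]LogLine)
	for _, l := range lines {
		traces[l.TraceID] = append(traces[l.TraceID], l)
	}
	return traces
}

func main() {
	lines := []LogLine{
		{TraceID: "a1", Service: "gateway", Message: "request received"},
		{TraceID: "b2", Service: "gateway", Message: "request received"},
		{TraceID: "a1", Service: "storage", Message: "query executed"},
		{TraceID: "a1", Service: "gateway", Message: "response sent"},
	}

	traces := GroupByTrace(lines)

	// Sort the IDs so the output order is stable.
	ids := make([]string, 0, len(traces))
	for id := range traces {
		ids = append(ids, id)
	}
	sort.Strings(ids)

	for _, id := range ids {
		fmt.Println("trace", id)
		for _, l := range traces[id] {
			fmt.Printf("  [%s] %s\n", l.Service, l.Message)
		}
	}
}
```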
Normal state vs Current state
Instrumentation code is a first-class citizen in your
codebase: OpenCensus
● Open source project sponsored by Google
● It is a spec plus a set of libraries in different languages to instrument your
application
● It collects metrics, traces and events.
OpenCensus
● Common interface to collect stats and traces from your app
● Different exporters to persist your data
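The split between one common interface and many exporters can be sketched as follows. This is not the OpenCensus API: `Measurement`, `Exporter`, `Recorder` and both exporter types are hypothetical names chosen to show the shape of the design only.

```go
package main

import (
	"fmt"
	"strings"
)

// Measurement is a single recorded value (hypothetical type).
type Measurement struct {
	Name  string
	Value float64
}

// Exporter persists measurements in some backend.
type Exporter interface {
	Export(m Measurement) error
}

// Recorder is the only thing application code talks to.
type Recorder struct {
	exporters []Exporter
}

// Register adds a backend without touching the instrumented code.
func (r *Recorder) Register(e Exporter) {
	r.exporters = append(r.exporters, e)
}

// Record fans the measurement out to every registered exporter.
func (r *Recorder) Record(name string, value float64) error {
	m := Measurement{Name: name, Value: value}
	for _, e := range r.exporters {
		if err := e.Export(m); err != nil {
			return fmt.Errorf("export %s: %w", name, err)
		}
	}
	return nil
}

// MemoryExporter keeps measurements in memory.
type MemoryExporter struct {
	Seen []Measurement
}

func (m *MemoryExporter) Export(x Measurement) error {
	m.Seen = append(m.Seen, x)
	return nil
}

// TextExporter renders measurements as "name value" lines.
type TextExporter struct {
	sb strings.Builder
}

func (t *TextExporter) Export(x Measurement) error {
	_, err := fmt.Fprintf(&t.sb, "%s %g\n", x.Name, x.Value)
	return err
}

func (t *TextExporter) String() string { return t.sb.String() }

func main() {
	mem := &MemoryExporter{}
	txt := &TextExporter{}

	rec := &Recorder{}
	rec.Register(mem)
	rec.Register(txt)

	if err := rec.Record("http_requests", 3); err != nil {
		panic(err)
	}
	fmt.Print(txt.String())
	fmt.Println("in memory:", len(mem.Seen))
}
```

Swapping or adding a storage backend means registering another exporter; the calls to `Record` in the application stay the same.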
# HELP http_requests_total The total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="post",code="200"} 1027 1395066363000
http_requests_total{method="post",code="400"} 3 1395066363000
# Escaping in label values:
msdos_file_access_time_seconds{path="C:\\DIR\\FILE.TXT",error="Cannot find file:\n\"FILE.TXT\""} 1.458255915e9
# Minimalistic line:
metric_without_timestamp_and_labels 12.47
# A weird metric from before the epoch:
something_weird{problem="division by zero"} +Inf -3982045
# A histogram, which has a pretty complex representation in the text format:
# HELP http_request_duration_seconds A histogram of the request duration.
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.05"} 24054
http_request_duration_seconds_bucket{le="0.1"} 33444
http_request_duration_seconds_bucket{le="0.2"} 100392
http_request_duration_seconds_bucket{le="0.5"} 129389
http_request_duration_seconds_bucket{le="1"} 133988
http_request_duration_seconds_bucket{le="+Inf"} 144320
http_request_duration_seconds_sum 53423
http_request_duration_seconds_count 144320
OpenMetrics
v2 Prometheus exposition format
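Producing this text format by hand is not hard, which is part of why it spread. Below is a minimal sketch that renders one counter family, including the label-value escaping (backslash, double quote, newline) the format requires. `Sample`, `escapeLabel` and `WriteCounter` are hypothetical helpers written for this example; timestamps, HELP-text escaping and the other metric types are left out.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Sample is one labelled counter value (hypothetical helper type).
type Sample struct {
	Labels map[string]string
	Value  float64
}

// escapeLabel escapes backslash, double quote and newline in a label value.
func escapeLabel(v string) string {
	return strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`).Replace(v)
}

// WriteCounter renders a counter family in the text exposition format.
// Label names are sorted so the output is deterministic.
func WriteCounter(name, help string, samples []Sample) string {
	var sb strings.Builder
	fmt.Fprintf(&sb, "# HELP %s %s\n", name, help)
	fmt.Fprintf(&sb, "# TYPE %s counter\n", name)
	for _, s := range samples {
		keys := make([]string, 0, len(s.Labels))
		for k := range s.Labels {
			keys = append(keys, k)
		}
		sort.Strings(keys)

		pairs := make([]string, 0, len(keys))
		for _, k := range keys {
			pairs = append(pairs, fmt.Sprintf(`%s="%s"`, k, escapeLabel(s.Labels[k])))
		}
		if len(pairs) > 0 {
			fmt.Fprintf(&sb, "%s{%s} %g\n", name, strings.Join(pairs, ","), s.Value)
		} else {
			fmt.Fprintf(&sb, "%s %g\n", name, s.Value)
		}
	}
	return sb.String()
}

func main() {
	fmt.Print(WriteCounter("http_requests_total", "The total number of HTTP requests.", []Sample{
		{Labels: map[string]string{"method": "post", "code": "200"}, Value: 1027},
		{Labels: map[string]string{"method": "post", "code": "400"}, Value: 3},
	}))
}
```

In a real service the official client libraries do this work and also handle histograms and summaries; the sketch only shows how little is needed to speak the format.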
func FetchMetricFamilies(url string, ch chan<- *dto.MetricFamily, certificate string, key string,
	skipServerCertCheck bool) error {
	defer close(ch)
	var transport *http.Transport
	if certificate != "" && key != "" {
		// A client certificate was provided: load it and present it to the server.
		cert, err := tls.LoadX509KeyPair(certificate, key)
		if err != nil {
			return err
		}
		tlsConfig := &tls.Config{
			Certificates:       []tls.Certificate{cert},
			InsecureSkipVerify: skipServerCertCheck,
		}
		tlsConfig.BuildNameToCertificate()
		transport = &http.Transport{TLSClientConfig: tlsConfig}
	} else {
		// No client certificate: only control whether the server certificate is verified.
		transport = &http.Transport{
			TLSClientConfig: &tls.Config{InsecureSkipVerify: skipServerCertCheck},
		}
	}
	// ... (the excerpt on the slide stops here)
https://github.com/prometheus/prom2json/blob/master/prom2json.go#L123
Summary:
★ Do not be scared to write your code!
★ Use the API
★ Instrumentation code is a first-class citizen
★ Keep calm and observe all together!
Credits and Links
● https://www.weave.works/technologies/gitops/
● http://gianarb.it
● https://thenewstack.io/why-you-cant-afford-to-ignore-distributed-tracing-for-observability/
● https://www.honeycomb.io/blog/
● https://gianarb.it/blog/infra-as-code-short-long-ttl-resource
● https://gianarb.it/blog/kubernetes-shared-informer
● https://github.com/OpenObservability/OpenMetrics
● https://promcon.io/2018-munich/slides/openmetrics-transforming-the-prometheus-exposition-format-into-a-global-standard.pdf
● https://medium.com/opentracing/merging-opentracing-and-opencensus-f0fe9c7ca6f0
Thanks
@gianarb

DevOps Fest 2019. Gianluca Arbezzano. DevOps never sleeps. What we learned from InfluxDB v1 to v2
