4. Today’s agenda
BlaBlaCar - Facts & Figures
Infrastructure Ecosystem - 100% containers powered carpooling
Backend High Availability Pillars - MariaDB as an example
Database as a Service - Building a frictionless infrastructure
What’s next?
6. Facts and Figures
60 million members
Founded in 2006
1 million tonnes less CO2 in the past year
30 million mobile app downloads (iPhone and Android)
15 million travellers / quarter
Currently in 22 countries: France, Spain, UK, Italy, Poland, Hungary, Croatia, Serbia, Romania,
Germany, Belgium, India, Mexico, The Netherlands, Luxembourg,
Portugal, Ukraine, Czech Republic, Slovakia, Russia, Brazil and Turkey.
9. Infrastructure Ecosystem
[Diagram labels]
Hardware: bare-metal servers, 1 type of hardware, 3 disk profiles
CoreOS, fleet cluster, etcd
ggn (“Distributed init system”)
dgr, Service Codebase, Container Registry, rkt PODs
Actions: build, run, store, host, create
Pod mysql-main1: mysqld, monitoring, nerve
Pod front1: php, nginx, nerve, monitoring, synapse
zookeeper Service Discovery: synapse, nerve
10. Infrastructure Ecosystem
[Diagram: same labels as slide 9, with kubernetes and helm added.]
11. Service Discovery
backend pod (node1): go-nerve does health checks and reports to zookeeper in service keys.
Zookeeper: /database/node1
client pod: go-synapse watches zookeeper service keys and reloads haproxy if changes are detected.
Applications hit their local haproxy to access backends.
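The flow on this slide can be sketched as follows. This is a minimal illustration, not the go-nerve or go-synapse code: the `registry` dict stands in for Zookeeper, the function names `nerve_report` and `synapse_render` are hypothetical, and the payload fields are borrowed from the znode shown on the Prometheus slide.

```python
import json

# Assumption: a plain dict stands in for Zookeeper (key path -> JSON payload).
registry = {}

def nerve_report(service_path, node, host, port, healthy):
    """Hypothetical stand-in for go-nerve: publish the health check result."""
    key = f"{service_path}/{node}"
    registry[key] = json.dumps(
        {"available": healthy, "host": host, "port": port, "name": node}
    )

def synapse_render(service_path, backend):
    """Hypothetical stand-in for go-synapse: build haproxy server lines
    from the available entries found under a service path."""
    lines = [f"backend {backend}"]
    prefix = service_path + "/"
    for key in sorted(registry):
        if not key.startswith(prefix):
            continue
        entry = json.loads(registry[key])
        if entry["available"]:
            lines.append(
                f"    server {entry['name']} {entry['host']}:{entry['port']} check"
            )
    return "\n".join(lines)

nerve_report("/database", "node1", "192.168.1.1", 3306, healthy=True)
nerve_report("/database", "node2", "192.168.1.2", 3306, healthy=True)
config_before = synapse_render("/database", "database")

# node2 fails its health check: the rendered config changes,
# which is the signal to reload haproxy.
nerve_report("/database", "node2", "192.168.1.2", 3306, healthy=False)
config_after = synapse_render("/database", "database")
reload_needed = config_before != config_after
print(config_after)
```

Comparing the rendered configuration before and after a change is one simple way to decide whether a reload is needed; the real tools may detect changes differently.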
14. Asynchronous vs. Synchronous
Asynchronous: Master with Slave, Slave, Slave
Synchronous: MariaDB Cluster (wsrep on every node)
MariaDB Cluster means
No Single Point of Failure
No Replication Lag
Automatic State Transfers
As fast as the slowest node
15. MySQL at BlaBlaCar?
MariaDB Cluster (wsrep on every node)
Our prerequisites are containers
Writes: writes go on one node
Reads: reads are balanced on the others
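The write/read split can be illustrated with a small sketch. The `Router` class is hypothetical and only models the routing rule stated on the slide (one node takes writes, the others share reads); the slides do not say how the split is actually configured in haproxy.

```python
from itertools import cycle

class Router:
    """Hypothetical model: first node takes writes, the rest share reads."""

    def __init__(self, nodes):
        if len(nodes) < 2:
            raise ValueError("need one writer and at least one reader")
        self.writer = nodes[0]
        # Round-robin is an assumption; any balancing policy would do.
        self._readers = cycle(nodes[1:])

    def pick(self, kind):
        if kind == "write":
            return self.writer
        if kind == "read":
            return next(self._readers)
        raise ValueError(f"unknown traffic kind: {kind}")

router = Router(["mysql-main1", "mysql-main2", "mysql-main3"])
write_target = router.pick("write")
read_targets = [router.pick("read") for _ in range(4)]
print(write_target, read_targets)
```

Because every Galera node can accept writes, the writer role here is a routing choice rather than a property of the node itself.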
20. If enableCheckStableCommand is set: the command is run at each increase and, if it returns != 0, the current weight restarts from 1.
Weight value is reached: the service is fully in production.
go-nerve: call API on /enable or /weight/:weight
Zookeeper: store current weight
go-synapse: update weight on HAProxy via socket
HAProxy: set weight <backend>/<server> <weight>
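The ramp-up rule can be sketched as follows. The function `ramp_up`, the fixed `step` size and the `max_rounds` guard are assumptions made for illustration; the slides do not say how large each increase is. Only the reset-to-1 behaviour and the `set weight <backend>/<server> <weight>` socket command come from the slide.

```python
def haproxy_command(backend, server, weight):
    """Format the HAProxy socket command shown on the slide."""
    return f"set weight {backend}/{server} {weight}"

def ramp_up(target, stable_check=None, step=100, max_rounds=100):
    """Return every weight applied until `target` is reached.

    stable_check mimics enableCheckStableCommand: it is called at each
    increase and a non-zero result restarts the weight from 1.
    """
    history = []
    weight = 1
    for _ in range(max_rounds):
        history.append(weight)
        if weight >= target:
            return history
        candidate = min(target, weight + step)
        if stable_check is not None and stable_check(candidate) != 0:
            weight = 1          # unstable: start over
        else:
            weight = candidate  # stable: keep the increase
    raise RuntimeError("target weight not reached")

# Hypothetical check that fails once, on its second call.
calls = {"n": 0}
def flaky_check(weight):
    calls["n"] += 1
    return 1 if calls["n"] == 2 else 0

weights = ramp_up(255, stable_check=flaky_check)
commands = [haproxy_command("database", "mysql-main1", w) for w in weights]
print(weights)
```

With these assumed values the node reaches 101, is sent back to 1 by the failed check, then climbs again to 255.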
23. Gracefully Disabling Pipeline
Call /disable on Nerve’s API: set weight to 0 = no more new sessions will go into the service.
If disableGracefullyDoneCommand is set: this command is run in a loop until it returns 0.
API call /disable returns: the service can be shut down without risk.
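A minimal sketch of the disabling sequence, under stated assumptions: `graceful_disable`, `set_weight` and `done_command` are hypothetical names, and the `max_polls` guard is a safety limit added here (the slide only says the command loops until it returns 0).

```python
import time

def graceful_disable(set_weight, done_command=None,
                     poll_seconds=0.0, max_polls=1000):
    """Set weight to 0, then poll done_command until it returns 0.

    Returns the number of polls that reported "not done yet".
    """
    set_weight(0)  # no new sessions are routed to the service
    polls = 0
    if done_command is not None:
        while done_command() != 0:
            polls += 1
            if polls >= max_polls:
                raise TimeoutError("service never became idle")
            time.sleep(poll_seconds)
    return polls  # at this point /disable would return

# Hypothetical service with three sessions left; one ends per poll.
applied_weights = []
sessions = {"open": 3}

def sessions_drained():
    if sessions["open"] > 0:
        sessions["open"] -= 1
        return 1  # still busy
    return 0      # safe to stop

polls = graceful_disable(applied_weights.append, sessions_drained)
print(applied_weights, polls)
```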
24. Backend High Availability Pillars
Be Quiet! Come gently into prod
Abolish Slavery: every node is the same
Die in Peace... Get out when you are ready
Graceful restart, Service Discovery (nerve/synapse), Weight system, Slow query tracking
Graceful restart, Service Discovery (nerve/synapse), Weight system
No master/slave, Automatic State Transfers, Service Discovery (nerve/synapse)
25. Database as a Service
Building a frictionless infrastructure
26. Easy deployment
Pull Request on a services repository
No technical parameters to override
The services are auto-initialized
27. Easy deployment with GGN
$ tree env/prod-dc1/services/mysql-main
env/prod-dc1/services/mysql-main
├── attributes
│   ├── galera.yml
│   ├── innodb.yml
│   └── nerve.yml
├── service-manifest.yml
└── unit.tmpl
1 directory, 5 files
$ cat env/prod-dc1/services/mysql-main/service-manifest.yml
containers:
  - aci.blbl.cr/pod-mysql:10.1-32
nodes:
  - hostname: "mysql-main1"
    ip: "192.168.1.1"
    fleet:
      - MachineMetadata=name=r11-srv1
  - hostname: "mysql-mysql-main2"
    ip: "192.168.1.2"
    fleet:
      - MachineMetadata=name=r12-srv2
  - hostname: "mysql-mysql-main3"
    ip: "192.168.1.3"
    fleet:
      - MachineMetadata=name=r13-srv3
# ggn prod-dc1 mysql-main update -y
Deploy the service with GGN (github.com/blablacar/ggn)
GGN generates systemd units from templates and sends them to the environment using fleet.
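The templating idea can be sketched as below. This is not GGN's implementation: the unit template, the `/usr/bin/rkt` path, the unit naming scheme and the `render_units` function are all assumptions, since the content of unit.tmpl is not shown in the slides. The manifest values are copied from the first two nodes above.

```python
from string import Template

# Hypothetical template; the real unit.tmpl is not shown in the slides.
UNIT_TEMPLATE = Template(
    "[Unit]\n"
    "Description=$service on $hostname\n"
    "\n"
    "[Service]\n"
    "ExecStart=/usr/bin/rkt run $container\n"
    "\n"
    "[X-Fleet]\n"
    "$fleet\n"
)

# Manifest mirrored as a Python dict to avoid a YAML dependency.
manifest = {
    "containers": ["aci.blbl.cr/pod-mysql:10.1-32"],
    "nodes": [
        {"hostname": "mysql-main1", "ip": "192.168.1.1",
         "fleet": ["MachineMetadata=name=r11-srv1"]},
        {"hostname": "mysql-mysql-main2", "ip": "192.168.1.2",
         "fleet": ["MachineMetadata=name=r12-srv2"]},
    ],
}

def render_units(service, manifest):
    """Render one unit per node; the unit names are an assumed convention."""
    units = {}
    for node in manifest["nodes"]:
        name = f"{service}_{node['hostname']}.service"
        units[name] = UNIT_TEMPLATE.substitute(
            service=service,
            hostname=node["hostname"],
            container=manifest["containers"][0],
            fleet="\n".join(node["fleet"]),
        )
    return units

units = render_units("mysql-main", manifest)
print(units["mysql-main_mysql-main1.service"])
```

The `[X-Fleet]` section with `MachineMetadata` is what lets fleet pin each unit to the intended machine.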
30. Prometheus relabeling
# [zk: localhost:2181(CONNECTED) 1] get /monitoring/mysql/main/mysql-main1_prometheus_192.168.1.2_ba0f1f8b
{"available":true,"host":"192.168.1.2","port":9104,"name":"mysql-main1","weight":255,"labels":{"host":"r11-srv1"}}
We push services info with Nerve into Zookeeper
And Prometheus does the magic
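As an illustration of how such a znode payload can become a scrape target, the sketch below converts it into a target group in the shape Prometheus uses for file-based service discovery (`targets` plus `labels`). The function `to_target_group` and the choice of an `instance` label are assumptions; the slides do not show the actual relabeling rules.

```python
import json

# Payload copied from the znode shown above.
znode_payload = (
    '{"available":true,"host":"192.168.1.2","port":9104,'
    '"name":"mysql-main1","weight":255,"labels":{"host":"r11-srv1"}}'
)

def to_target_group(payload):
    """Hypothetical mapping from a Nerve payload to a Prometheus target group.

    Returns None when the service is not available.
    """
    entry = json.loads(payload)
    if not entry.get("available"):
        return None
    labels = {"instance": entry["name"]}   # assumed label choice
    labels.update(entry.get("labels", {}))  # labels pushed by Nerve
    return {
        "targets": [f"{entry['host']}:{entry['port']}"],
        "labels": labels,
    }

target_group = to_target_group(znode_payload)
print(json.dumps(target_group, indent=2))
```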
33. Easy troubleshooting with “bbc” command
A set of bash scripts
Do the basic health checks quickly
Manage all backends the same way
Can be used by non-specialists
Plugged into the service discovery
Designed for our needs
34. bbc command examples
# bbc mysql list
pp-dc2 mysql-main
pp-dc2 mysql-user
pp-dc2 mysql-trip
pp-dc2 mysql-payment
prod-dc1 mysql-main
prod-dc1 mysql-user
prod-dc1 mysql-trip
prod-dc1 mysql-payment
[...]
# bbc mysql overview prod-dc1 mysql-main
=== Service Overview 'prod-dc1 mysql-main' ===
mysql-main1 (192.168.1.1) PING, PORT, Synced
---
mysql-main1 (3306) - enabled - weight = 255/255
mysql-main1_prometheus (9104) - enabled - weight = 255/255
mysql-main2 (192.168.1.2) PING, PORT, Synced
---
mysql-main2 (3306) - enabled - weight = 255/255
mysql-main2_prometheus (9104) - enabled - weight = 255/255
mysql-main3 (192.168.1.3) PING, PORT, Synced
---
mysql-main3 (3306) - enabled - weight = 255/255
mysql-main3_prometheus (9104) - enabled - weight = 255/255
# bbc mysql connect prod-dc1 mysql-main
env: prod-dc1
service: mysql-main
host: mysql-main1
ip: 192.168.1.1
Enter the username [ENTER]: team_data
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 2887129
Server version: 10.1.28-MariaDB-1~jessie mariadb.org binary distribution
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]>
# bbc mysql monitor prod-dc1 mysql-main mysql-main1
Weight: 255/255 Processes: 88 Slow: 0
Weight: 255/255 Processes: 75 Slow: 0
Weight: 255/255 Processes: 89 Slow: 0
Weight: 255/255 Processes: 99 Slow: 0
Weight: 255/255 Processes: 79 Slow: 0
Weight: 255/255 Processes: 65 Slow: 0
Weight: 255/255 Processes: 86 Slow: 0
Weight: 255/255 Processes: 93 Slow: 0
Weight: 255/255 Processes: 88 Slow: 0
Weight: 255/255 Processes: 96 Slow: 0
Weight: 255/255 Processes: 77 Slow: 0
Weight: 255/255 Processes: 73 Slow: 0
35. bbc command examples
# bbc postgresql overview prod-dc1 postgresql-corridoring
Service Overview 'prod-dc1 postgresql-corridoring'
-- USING BDR --
postgresql-corridoring1 (192.168.1.10) PING, PORT
postgresql-corridoring2 (192.168.1.11) PING, PORT
postgresql-corridoring3 (192.168.1.12) PING, PORT
postgresql-corridoring4 (192.168.1.13) PING, PORT
postgresql-corridoring5 (192.168.1.14) PING, PORT
# bbc postgresql list
pp-dc2 postgresql-airflow
pp-dc2 postgresql-corridoring
pp-dc2 postgresql-redash
pp-dc2 postgresql-trip-pricing
prod-dc1 postgresql-corridoring
prod-dc1 postgresql-redash
# bbc postgresql connect prod-dc1 postgresql-corridoring
env: prod-dc1
service: postgresql-corridoring - database : corridoring
host: postgresql-corridoring1
ip: 192.168.1.10
Enter the username [ENTER]: team_data
Password for user team_arch:
psql (9.6.6, server 9.4.12)
corridoring=#
# bbc redis overview prod-dc1 redis-main
=== Service 'prod-dc1' 'redis-main' ===
Redis elector master: redis-main1.prod.dc-1.blabla.com
redis-main1 (192.168.1.20):PING, PORT, role:master, clients:255
redis-main2 (192.168.1.21):PING, PORT, role:slave, clients:2, slaveof:192.168.1.20
redis-main3 (192.168.1.22):PING, PORT, role:slave, clients:2, slaveof:192.168.1.20
# bbc redis list
pp-dc2 redis-main
pp-dc2 redis-quota
pp-dc2 redis-translation
pp-dc2 redis-user
prod-dc1 redis-main
prod-dc1 redis-quota
# bbc redis connect prod-dc1 redis-main
env: prod-dc1
service: redis-main
host: redis-main1
ip: 192.168.1.20
role: slave
192.168.1.20:6379>
Today, 32 subcommands are available on bbc...
37. What does the future look like?
Moving to Kubernetes: from a simple “Distributed init system” to the standard for container orchestration.
Fleet is deprecated: Fleet is no longer developed and maintained by CoreOS.
38. Ownership: move backend ownership to the developer teams.
Moving to the cloud? Extend this idea of “expendable” services to hardware resources.
Docker? Kubernetes + rkt (rktnetes, rktlet) has poor adoption.