Automating the Entire PostgreSQL Lifecycle

PostgreSQL Lifecycle Automation

• Transform the way databases are operated.
• Look at PostgreSQL and see what automating its lifecycle means.
• Select tooling to work with.
• Draft an architectural outline.
• Show use case examples for handling the PostgreSQL lifecycle in a fully automated way.

–The anynines mission statement
“Automate the entire lifecycle at production grade of a growing number of data services across platforms, infrastructures at scale.”

• Infrastructure
• Platform
• (Data) Service
• (Data) Service Instance
• Service Binding
• Cluster = Streaming Replication PostgreSQL Cluster

“Database Administration is Subject to Change.”

• Application development platforms. CF, k8s, …
• Microservices
• More Apps
• More Data Service Types
• More Data Service Instances
Change

Scalability
Instant On-demand Self-Service
Production-Readiness

Automate the entire lifecycle of PostgreSQL to easily operate thousands of production-grade DBs across …

and integrate well with multiple platforms

• Provision database VM
• Install database software
• Configure database software
• Set up replication & cluster management
• Configure Monitoring
• Configure availability monitoring
• Configure alerting
• Configure collection of logs
• Configure collection of metrics
• Create database
• Create database users
• Adapt configuration of individual database servers
• e.g. enable extension
• Adapt configuration of individual databases
• Set up backup procedure
• Perform backups regularly
• Perform on-demand backups
• Perform schema migrations
• Query data
• Determine performance bottleneck
• Perform scale-out of database server
• Recover from master db failure
• Recover from standby failure
• Recover from AZ failure
• Recover from network partitioning (split brain)
• Recover from PostgreSQL process failure
• Perform upgrade of operating system
• Restore data from backup
• Apply patch-level upgrade
• Apply major version upgrade
• Perform data migration
• Destroy data
• Destroy database server
PostgreSQL Lifecycle

PostgreSQL is not easy to automate.

PostgreSQL Lifecycle Automation

Finding the Automation Strategy

Is there a single strategy that works at production grade across data services at scale?

On-Demand Provisioning of Dedicated Database Servers

Easy Deployment
$> cf create-service postgresql single-small single-postgres-1

single-postgres-1
Postgresql
VM#1
Service Instance #1
Service Broker
PostgreSQL Automation

Easy Deployment
$> cf create-service postgresql cluster-small postgres-cluster-2

single-postgres-1
Postgresql
VM#1
Service Instance #1
Service Broker
PostgreSQL Automation
postgres-cluster-2
Postgresql
VM#1
Postgresql
VM#2
Postgresql
VM#3
Service Instance #2

Data Services
Application Runtime
[Diagram: ten apps in the application runtime]

Data Service Automation Lifecycle
vs.
Data Service Instance Lifecycle

Data Service Automation Lifecycle

Test
a9s
Release
Build
Upstream
Release
Platform
Environment #1
Platform
Environment #2
Platform
Environment #n
Ship Automation Releases
into Platform Environments
Update Data Service Instances
a9s PostgreSQL
Release Repository
Open Source
PostgreSQL
Building new PostgreSQL automation releases
PostgreSQL CI/CD-Pipeline

Data Service Instance Lifecycle

Create
Modify
Update
Backup
Recovery
Terminate
PostgreSQL Lifecycle - 1st Iteration

Principles - Full Lifecycle Management
Iteratively increase the automation depth.

Create
Modify
Update
Backup
Recovery
Terminate
Enabling/disabling data service plugins
Scale-out
Scale down
Add/remove a log/metric endpoint
Major version update
Misc. config changes
Greenfield / Clone
Single / clustered
Pre-provisioned / on-demand
Trigger manual backup
Scheduled backup
Restore backup
Hard reset of service instance
Destroy service instance
Minor version update
Patchlevel version update
Cluster failure detection
Cluster failover
Self-healing by resurrection
PostgreSQL Lifecycle - nth Iteration

a9s Deployer
Templates Deployments
Bosh || k8s
a9s Service Broker
my-3node-postgres-cluster-2
Postgresql
VM#1
Postgresql
VM#2
Postgresql
VM#3
my-single-postgres-1
Postgresql
VM#1
Middleware Adapter
Open Service Broker API
a9s PostgreSQL SPI
Service Instance
Service Instance
Postgresql
VM#1
Postgresql
VM#2
Postgresql
VM#3
Service Instance
…
Cloud Controller
CF Client
create service
create service
create deployment from template xy with attributes {…}
deploy release abc & deployment manifest xyz
Execute deployments
create service-specific credentials
create binding
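
The "create deployment from template xy with attributes {…}" step can be pictured as filling a manifest template with plan-specific values. The following is a minimal sketch under stated assumptions, not the a9s Deployer itself: the template text, the `PLANS` table, the `vm_type` value, and the `render_manifest` helper are all hypothetical. Only the plan names and the three-node cluster size come from the slides.

```python
from string import Template

# Hypothetical, heavily shortened deployment manifest template.
# A real manifest needs many more fields; this only shows the substitution idea.
MANIFEST_TEMPLATE = Template(
    "name: $deployment\n"
    "instance_groups:\n"
    "- name: postgresql\n"
    "  instances: $instances\n"
    "  vm_type: $vm_type\n"
)

# Hypothetical mapping from service plan to template attributes.
PLANS = {
    "single-small": {"instances": 1, "vm_type": "small"},
    "cluster-small": {"instances": 3, "vm_type": "small"},
}


def render_manifest(deployment: str, plan: str) -> str:
    """Fill the template with the attributes of the requested plan."""
    attributes = PLANS[plan]  # raises KeyError for an unknown plan
    return MANIFEST_TEMPLATE.substitute(deployment=deployment, **attributes)


if __name__ == "__main__":
    print(render_manifest("postgres-cluster-2", "cluster-small"))
```

Keeping the plan attributes in data rather than in code means a new plan is a new table entry, not a new template.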

HTTP Verb / Action
• Service Catalog: GET /v2/catalog
  Deliver metadata about the data service.
• Create Service Instance: PUT /v2/service_instances/:id
  Provision the VM or cluster of VMs that represents a service instance, then install and configure the data service.
• Create Service Binding: PUT /v2/service_instances/:instance_id/service_bindings/:id
  Create a data service user and return credentials representing a service binding.
• Delete Service Binding: DELETE /v2/service_instances/:instance_id/service_bindings/:id
  Remove credentials associated with the service binding.
• Delete Service Instance: DELETE /v2/service_instances/:id
  Destroy the VMs and data associated with the service instance.
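
To make the mapping from HTTP verb and path to lifecycle action concrete, here is a toy in-memory dispatcher. It is a sketch under assumptions, not the a9s Service Broker: `InMemoryBroker` and its methods are hypothetical names, nothing is provisioned, and the status codes follow my reading of the Open Service Broker API (201 on create, 410 when deleting something that is already gone).

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InMemoryBroker:
    """Toy broker: records instances and bindings instead of deploying VMs."""
    instances: dict = field(default_factory=dict)  # instance_id -> plan_id
    bindings: dict = field(default_factory=dict)   # (instance_id, binding_id) -> credentials

    def handle(self, verb: str, path: str, body: Optional[dict] = None):
        """Route a request to the matching action; returns (status, payload)."""
        parts = path.strip("/").split("/")
        body = body or {}
        if verb == "GET" and parts == ["v2", "catalog"]:
            return 200, {"services": [{"name": "postgresql"}]}
        if parts[:2] != ["v2", "service_instances"]:
            return 404, {}
        if len(parts) == 3:
            return self._instance(verb, parts[2], body)
        if len(parts) == 5 and parts[3] == "service_bindings":
            return self._binding(verb, parts[2], parts[4])
        return 404, {}

    def _instance(self, verb: str, instance_id: str, body: dict):
        if verb == "PUT":
            # A real broker would trigger VM provisioning here.
            self.instances[instance_id] = body.get("plan_id")
            return 201, {}
        if verb == "DELETE":
            if instance_id not in self.instances:
                return 410, {}
            del self.instances[instance_id]
            # Dropping the instance also drops its bindings.
            self.bindings = {k: v for k, v in self.bindings.items() if k[0] != instance_id}
            return 200, {}
        return 405, {}

    def _binding(self, verb: str, instance_id: str, binding_id: str):
        if instance_id not in self.instances:
            return 404, {}
        key = (instance_id, binding_id)
        if verb == "PUT":
            # A real broker would create a database user here.
            creds = {"username": f"user-{binding_id}", "password": secrets.token_urlsafe(16)}
            self.bindings[key] = creds
            return 201, {"credentials": creds}
        if verb == "DELETE":
            if key not in self.bindings:
                return 410, {}
            del self.bindings[key]
            return 200, {}
        return 405, {}
```

A platform would call `handle("PUT", "/v2/service_instances/<id>", {"plan_id": "single-small"})` when a user runs `cf create-service`, and the binding route when an app is bound.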

Selecting the Automation Technology

Automation Technology Requirements

Automation Technology Candidates

VMs
Containers
Strong isolation.
Bad
Neighborhood
Protection.
Faster dev
cycles.
Less overhead.
Faster instance
startup.
Weak isolation.

“BOSH lets you orchestrate the lifecycle of large-scale deployments of stateful distributed systems to infrastructure.”

Automate once,
deploy everywhere.

BOSH CLI
$> bosh target http://bosh-on.vmware.com
$> bosh deploy
[Diagram: VMware, OpenStack, and AWS each have their own BOSH. The BOSH CLI targets the BOSH on VMware; six virtual machines each run some service or app alongside a BOSH Agent.]

BOSH CLI
$> bosh target http://bosh-on.aws.com
$> bosh deploy
[Diagram: VMware, AWS, and OpenStack each have their own BOSH. The BOSH CLI targets the BOSH on AWS; six virtual machines each run some service or app alongside a BOSH Agent.]

Store state on a remotely attached block device = persistent disk.
🔑

Infrastructure as a Service (IaaS), e.g. OpenStack
[Diagram: a virtual datacenter with a router, an infrastructure API, and storage. Three storage nodes, each backed by several HDDs, provide a storage volume to a virtual machine running an operating system.]

The data lifecycle has been decoupled from the VM lifecycle
⇒ The VM becomes disposable.
🔑

Ephemeral VM, persistent disk.
🔑

Horizontal Scaling
[Diagram: 28 virtual machines, each running some service alongside a BOSH Agent]

BOSH Deployments are Predictable

BOSH Deployments are Repeatable

High Availability & Cluster Management

• Applies to clustered Data Service Instances
• Asynchronous streaming replication
• Three (3) Nodes:
• One (1) Master Node
• Two (2) Standby Nodes
• Replication slots configured to avoid early recycling of master WAL segments
High Availability & Replication

• Extends PostgreSQL streaming replication
• Manages replication
• Performs failure detection
• Assists during failover by triggering leader election
• Facilitates monitoring of the replication health & performance
Repmgr

• Detects network partitioning & split-brain situations
• Periodically checks the repmgr database and verifies replication status and cluster status
• Fires an alarm if
• the master is not followed by a majority
• a standby is following the wrong master
Custom PostgreSQL Automation
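
The two alarm conditions above can be expressed as a small pure function. This is a sketch under assumptions, not the a9s implementation: `NodeStatus` and `check_cluster` are hypothetical names, the node data would in practice come from the repmgr database (whose schema is not modelled here), and "followed by a majority" is interpreted as "the master plus its followers form a strict majority of all nodes".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NodeStatus:
    name: str
    role: str                # "master" or "standby"
    upstream: Optional[str]  # node a standby replicates from; None for a master


def check_cluster(nodes: List[NodeStatus], expected_master: str) -> List[str]:
    """Return a list of alarm messages; an empty list means the cluster looks healthy."""
    alarms = []
    for node in nodes:
        if node.role != "master":
            continue
        followers = [n for n in nodes if n.role == "standby" and n.upstream == node.name]
        # Assumption: the master counts towards its own majority.
        if 2 * (1 + len(followers)) <= len(nodes):
            alarms.append(f"master {node.name} is not followed by a majority")
    for node in nodes:
        if node.role == "standby" and node.upstream != expected_master:
            alarms.append(f"standby {node.name} is following the wrong master ({node.upstream})")
    return alarms


if __name__ == "__main__":
    split = [
        NodeStatus("vm1", "master", None),
        NodeStatus("vm2", "master", None),   # promoted during a partition
        NodeStatus("vm3", "standby", "vm2"),
    ]
    for alarm in check_cluster(split, expected_master="vm1"):
        print(alarm)
```

With three nodes, a master that has lost both standbys trips the first check, and a standby attached to a second, unexpected master trips the second.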

Postgresql
VM#1
Postgresql
VM#2
Postgresql
VM#3
👑

Postgresql
VM#1
Postgresql
VM#3
👑

single-postgres-1
Postgresql
VM#1
Service Instance #1

Scalability
Vertical Scale-Out
$> cf update-service single-postgres-1 -p single-large

Single-node PostgreSQL service instance turned into a large PostgreSQL server.
single-postgres-1
Postgresql
VM#1

What has happened during the vertical scale-out?

[Diagram: before the plan change, a virtual machine with 4 GB RAM and 1 vCPU runs PostgreSQL and the BOSH Agent, with its data on a 10 GB persistent disk. Afterwards, a new virtual machine with 8 GB RAM and 2 vCPUs runs PostgreSQL and the BOSH Agent next to the old 4 GB RAM, 1 vCPU machine, with the data shown for both.]
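
One way to read the vertical scaling step, given the earlier "ephemeral VM, persistent disk" principle, is that the disk is detached from the small VM and reattached to a freshly created larger one. The model below is a sketch of that reading only: `PersistentDisk`, `VM`, and `scale_vertically` are hypothetical names, and no infrastructure API is called.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersistentDisk:
    size_gb: int
    attached_to: Optional[str] = None  # name of the VM currently holding the disk


@dataclass
class VM:
    name: str
    ram_gb: int
    vcpus: int
    disk: Optional[PersistentDisk] = None


def scale_vertically(old: VM, ram_gb: int, vcpus: int) -> VM:
    """Replace `old` with a differently sized VM that takes over its persistent disk."""
    disk = old.disk
    # Detach: the old VM is now disposable because it no longer holds any state.
    old.disk = None
    if disk is not None:
        disk.attached_to = None
    new = VM(name=old.name, ram_gb=ram_gb, vcpus=vcpus)
    # Reattach: the data survives the VM replacement untouched.
    if disk is not None:
        new.disk = disk
        disk.attached_to = new.name
    return new


if __name__ == "__main__":
    data = PersistentDisk(size_gb=10, attached_to="vm1")
    small = VM("vm1", ram_gb=4, vcpus=1, disk=data)
    large = scale_vertically(small, ram_gb=8, vcpus=2)
    print(large)
```

The sizes in the usage example (4 GB RAM and 1 vCPU growing to 8 GB RAM and 2 vCPUs, with a 10 GB disk) are the ones shown on the slides.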

Scalability
DB User & DB Access
$> cf bind-service my-app single-postgres-1

Backup & Restore for Application Developers

Backup & Restore of the user’s service instances

Backup & Restore for Platform Operators

• Disaster Recovery Plan for Data Services
• Backup & Restore for platform operators
• Backup & Restore of
• All service instances
• Individual service instances
Backup & Restore for Platform Operators

[Diagram: many service instances side by side: five three-node PostgreSQL clusters, my-3node-mongo-cluster-4 with three MongoDB VMs, and my-3node-redis-solo-7 with one Redis VM]

[Diagram: a PostgreSQL cluster of three VMs (Postgresql VM#1 to VM#3). A virtual machine with 4 GB RAM and 1 vCPU runs PostgreSQL with its data, the Replication Manager, a Log Agent, a Backup Agent, and the BOSH Agent.]

Data Filter Chain
a9s PostgreSQL
Service Instance #32
Data Stream Reader
Postgresql VM#2
Data Filter
Data Stream Writer
Data Filter
Backup Agent
Object Store,
e.g. AWS S3
Database
Object Store,
e.g. AWS S3
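
The reader, filter, filter, writer chain can be sketched with Python generators, where each filter wraps the stream produced by the previous stage. This is an illustration under assumptions, not the a9s Backup Agent: `read_stream`, `compress`, `Checksum`, and `run_chain` are hypothetical names, and the sink is an in-memory buffer rather than an object store. An encryption filter would slot into the same list, but it is left out because the Python standard library ships no cipher.

```python
import hashlib
import io
import zlib
from typing import BinaryIO, Callable, Iterable, Iterator, List

Filter = Callable[[Iterable[bytes]], Iterator[bytes]]


def read_stream(source: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Data stream reader: yield the source in fixed-size chunks."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            return
        yield chunk


def compress(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Filter: zlib-compress the stream without buffering it as a whole."""
    compressor = zlib.compressobj()
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:
            yield out
    yield compressor.flush()


class Checksum:
    """Pass-through filter that records a SHA-256 digest of the bytes flowing by."""

    def __init__(self) -> None:
        self._hash = hashlib.sha256()

    def __call__(self, chunks: Iterable[bytes]) -> Iterator[bytes]:
        for chunk in chunks:
            self._hash.update(chunk)
            yield chunk

    def hexdigest(self) -> str:
        return self._hash.hexdigest()


def run_chain(source: BinaryIO, filters: List[Filter], sink: BinaryIO) -> int:
    """Pipe source through all filters into sink; return the number of bytes written."""
    stream: Iterable[bytes] = read_stream(source)
    for apply_filter in filters:
        stream = apply_filter(stream)
    written = 0
    for chunk in stream:
        written += sink.write(chunk)  # data stream writer
    return written


if __name__ == "__main__":
    dump = io.BytesIO(b"postgres dump " * 1000)
    target = io.BytesIO()
    checksum = Checksum()
    size = run_chain(dump, [compress, checksum], target)
    print(size, checksum.hexdigest())
```

Because every stage is lazy, the backup never has to fit into memory, and the order of the filter list decides what each stage sees (here the checksum covers the compressed bytes).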

[Diagram: two three-node PostgreSQL clusters (Postgresql VM#1 to VM#3 each)]
Backup requested
a9s Backup Manager
a9s Backup API
⏰
Amazon S3
Backup scheduled
Tell backup agent to
perform backup
Store encrypted backup to storage
Backup🔐

• Dedicated service instances are mandatory.
• On-demand provisioning is essential.
• Choosing the right automation technology is key.

Full PostgreSQL lifecycle automation is feasible…

… and it is already happening!

Questions?
@anynines
@fischerjulian
