At a blistering pace and for a variety of reasons, companies are migrating their on-premises database infrastructures to cloud-based solutions - to save costs on hardware, simplify disaster recovery, or even to improve security. Zalando is no exception: more than two years ago we migrated our first production services to AWS.
In addition to fully managed database services like RDS and Aurora, Amazon offers a wide spectrum of EC2 instances at different performance and price points. Without much experience in running cloud databases it's not easy to make the right choice, and as a result you will either get poor database performance or overpay for over-provisioned resources.
In this talk I will compare different ways of running PostgreSQL on AWS, explain why we decided to run most of our databases on EC2 instances instead of RDS, how we chose EC2 instance types and EBS volumes, which AWS CloudWatch metrics must be monitored (and why), and what problems we hit and how to avoid them.
PGConf.ASIA 2019 Bali - Foreign Data Wrappers - Etsuro Fujita & Tatsuro Yamada (Equnix Business Solutions)
PGConf.ASIA 2019 Bali - 9 September 2019
Speaker: Etsuro Fujita & Tatsuro Yamada
Room: ACID
Title: Foreign Data Wrappers: A Powerful Technology for Data Integration
PGConf.ASIA 2019 Bali - How did PostgreSQL Write Load Balancing of Queries Us... (Equnix Business Solutions)
Atsushi Mitani from SRA Nishi-Nihon Inc. presented on how to perform write load balancing in PostgreSQL using transactions. He explained that write load distribution is important for systems with high write volumes. PostgreSQL can distribute write load using table partitioning with foreign data wrappers (FDW), which allows partitioning across database instances. Mitani created patches to automate the partitioning setup and load data in parallel to child tables to speed up benchmarking. Benchmark results showed that while increasing child databases improves performance without transactions, increasing parent databases is better with transactions to avoid lock queues. The optimal configuration depends on data size, queries, and hardware.
PGConf.ASIA 2019 Bali - Tune Your Linux Box, Not Just PostgreSQL - Ibrar Ahmed (Equnix Business Solutions)
This document discusses tuning Linux and PostgreSQL for performance. It recommends:
- Tuning Linux kernel parameters like huge pages, swappiness, and overcommit memory. Huge pages can improve TLB performance.
- Tuning PostgreSQL parameters like shared_buffers, work_mem, and checkpoint_timeout. Shared_buffers stores the most frequently accessed data.
- Other tips include choosing proper hardware, OS, and database based on workload. Tuning queries and applications can also boost performance.
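The settings named above can be sketched as concrete commands. This is a hedged, illustrative starting point, not the talk's exact recommendations: the numeric values are assumptions that must be sized to your RAM and workload.

```shell
# Kernel parameters mentioned in the summary (values are illustrative)
sudo sysctl -w vm.nr_hugepages=4400       # 2 MB huge pages to back shared_buffers; improves TLB hit rate
sudo sysctl -w vm.swappiness=1            # strongly discourage swapping out database memory
sudo sysctl -w vm.overcommit_memory=2     # fail allocations instead of invoking the OOM killer

# PostgreSQL parameters; ALTER SYSTEM writes postgresql.auto.conf
psql -U postgres <<'SQL'
ALTER SYSTEM SET shared_buffers = '8GB';        -- often ~25% of RAM as a starting point
ALTER SYSTEM SET huge_pages = 'try';            -- use the huge pages reserved above if available
ALTER SYSTEM SET work_mem = '64MB';             -- per sort/hash operation, per backend
ALTER SYSTEM SET checkpoint_timeout = '15min';  -- fewer, larger checkpoints
SQL
sudo systemctl restart postgresql               # shared_buffers/huge_pages need a restart
```

Note that `work_mem` is allocated per sort or hash node, so a high value combined with many connections can exhaust memory; it interacts with the overcommit settings above.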
PGConf.ASIA 2019 Bali - Setup a High-Availability and Load Balancing PostgreS... (Equnix Business Solutions)
PGConf.ASIA 2019 Bali - 10 September 2019
Speaker: Bo Peng
Room: SQL
Title: Setup a High-Availability and Load Balancing PostgreSQL Cluster - New Features of Pgpool-II 4.1
StatefulSets are used to run PostgreSQL pods across Kubernetes nodes for high availability. When a pod fails, the StatefulSet controller recreates it. However, if the entire node fails, the PostgreSQL pod will not fail over to another node by default. To fail over manually, the pod must be force deleted, after which it is rescheduled on a different Ready node. Such manual failovers are not recommended for production use.
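The manual failover described above boils down to a forced deletion. A minimal sketch, assuming a hypothetical StatefulSet pod named `postgres-0`; after a node failure the pod hangs in Terminating/Unknown because the kubelet can no longer confirm shutdown, and forcing the delete lets the controller recreate it elsewhere:

```shell
kubectl get pods -o wide                        # note the pod stuck on the failed node
kubectl delete pod postgres-0 --force --grace-period=0
kubectl get pods -o wide -w                     # watch postgres-0 reschedule on a Ready node
```

Forcing deletion skips the graceful-shutdown confirmation, which is exactly why it is risky in production: if the old node is merely partitioned rather than dead, two copies of the pod may briefly run at once.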
A look at some of the ways available to deploy Postgres in a Kubernetes cloud environment, either in small scale using simple configurations, or in larger scale using tools such as Helm charts and the Crunchy PostgreSQL Operator. A short introduction to Kubernetes will be given to explain the concepts involved, followed by examples from each deployment method and observations on the key differences.
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing - (Eric Kacz...) (Spark Summit)
The document discusses tuning the Garbage First (G1) garbage collector in Java 8 to reduce garbage collection pauses for large heaps used in Spark graph computing workloads. It was found that the default G1 settings resulted in lengthy full garbage collections over 100 seconds. After analyzing the garbage collection logs, the main issue was identified as the concurrent marking phase not completing before a full collection was needed. Increasing the number of concurrent marking threads from 8 to 20 addressed this by speeding up the concurrent phase. With this tuning, no full collections occurred and total stop-the-world pause time was reduced to under a minute, a significant improvement over the original implementation.
Operating PostgreSQL at Scale with Kubernetes (Jonathan Katz)
The maturation of containerization platforms has changed how people think about creating development environments and has eliminated many inefficiencies for deploying applications. These concepts and technologies have made their way into the PostgreSQL ecosystem as well, and tools such as Docker and Kubernetes have enabled teams to run their own “database-as-a-service” on the infrastructure of their choosing.
All this sounds great, but if you are new to the world of containers, it can be very overwhelming to find a place to start. In this talk, which centers around demos, we will see how you can get PostgreSQL up and running in a containerized environment with some advanced sidecars in only a few steps! We will also see how it extends to a larger production environment with Kubernetes, and what the future holds for PostgreSQL in a containerized world.
We will cover the following:
* Why containers are important and what they mean for PostgreSQL
* Create a development environment with PostgreSQL, pgadmin4, monitoring, and more
* How to use Kubernetes to create your own "database-as-a-service"-like PostgreSQL environment
* Trends in the container world and how it will affect PostgreSQL
At the conclusion of the talk, you will understand the fundamentals of how to use container technologies with PostgreSQL and be on your way to running a containerized PostgreSQL environment at scale!
Always upgrade! There are hundreds of fixes between each PostgreSQL release, and a significant number of them are security fixes! Logical replication makes major upgrades possible with minimal downtime and manageable drawbacks.
This webinar covered:
- PostgreSQL releases
- Upgrade options
- What is Pglogical?
- Major upgrades
Multi Master PostgreSQL Cluster on Kubernetes (Ohyama Masanori)
The document discusses running PostgreSQL on Kubernetes using Crunchy PostgreSQL Containers. It describes how to run a simple PostgreSQL pod on OpenShift by writing a pod manifest file that defines the container image, ports, environment variables, volume mounts, and other settings. Pulling the PostgreSQL container image from DockerHub and defining the pod configuration in YAML allows running PostgreSQL as a stateful application on Kubernetes.
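The pod manifest described above can be sketched as follows. This is an illustrative minimal example, not the actual manifest from the slides: the image tag, password handling, and volume choice are placeholder assumptions (check the Crunchy Container documentation for its specific image and environment variable names).

```shell
# Write a minimal PostgreSQL pod manifest and apply it
cat > postgres-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: postgres
  labels:
    app: postgres
spec:
  containers:
  - name: postgres
    image: postgres:11            # or a Crunchy PostgreSQL image from your registry
    ports:
    - containerPort: 5432
    env:
    - name: POSTGRES_PASSWORD
      value: changeme             # use a Secret for anything beyond a demo
    volumeMounts:
    - name: pgdata
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: pgdata
    emptyDir: {}                  # a PersistentVolumeClaim in real deployments
EOF
kubectl apply -f postgres-pod.yaml
```

An `emptyDir` volume disappears with the pod, which is why stateful production deployments pair the container with a PersistentVolumeClaim instead.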
Beyond unit tests: Deployment and testing for Hadoop/Spark workflows (DataWorks Summit)
As a Hadoop developer, do you want to quickly develop your Hadoop workflows? Do you want to test your workflows in a sandboxed environment similar to production? Do you want to write unit tests for your workflows and add assertions on top of it?
In just a few years, the number of users writing Hadoop/Spark jobs at LinkedIn has grown from tens to hundreds, and the number of jobs running every day has grown from hundreds to thousands. With the ever-increasing number of users and jobs, it becomes crucial to reduce the development time for these jobs. It is also important to test these jobs thoroughly before they go to production.
We’ve tried to address these issues by creating a testing framework for Hadoop/Spark jobs. The testing framework enables the users to run their jobs in an environment similar to the production environment and on the data which is sampled from the original data. The testing framework consists of a test deployment system, a data generation pipeline to generate the sampled data, a data management system to help users manage and search the sampled data and an assertion engine to validate the test output.
In this talk, we will discuss the motivation behind the testing framework before deep diving into its design. We will further discuss how the testing framework is helping the Hadoop users at LinkedIn to be more productive.
This document discusses Patroni, an open-source tool for managing high availability PostgreSQL clusters. It describes how Patroni uses a distributed configuration system like Etcd or Zookeeper to provide automated failover for PostgreSQL databases. Key features of Patroni include manual and scheduled failover, synchronous replication, dynamic configuration updates, and integration with backup tools like WAL-E. The document also covers some of the challenges of building automatic failover systems and how Patroni addresses issues like choosing a new master node and reattaching failed nodes.
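The manual and scheduled failover features mentioned above are driven through Patroni's `patronictl` command-line tool. A rough sketch of day-2 operations; the config path and the cluster name (`batman`, the name used in Patroni's own demos) are examples:

```shell
patronictl -c /etc/patroni.yml list batman         # show members, roles, and replication lag
patronictl -c /etc/patroni.yml switchover batman   # planned, graceful leader change
patronictl -c /etc/patroni.yml failover batman     # forced failover to a chosen candidate
patronictl -c /etc/patroni.yml edit-config batman  # dynamic configuration, applied cluster-wide via the DCS
```

The distinction matters: a switchover assumes a healthy leader and hands over cleanly, while a failover is for when the leader is already gone; both rely on the Etcd/ZooKeeper-backed state the summary describes.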
PgOpenCL is a new PostgreSQL procedural language that allows developers to write OpenCL kernels to harness the parallel processing power of GPUs. It introduces a new execution model where tables can be copied to arrays, passed to an OpenCL kernel for parallel operations on the GPU, and results copied back to tables. This unlocks the potential for dramatically improved performance on compute-intensive database operations like joins, aggregations, and sorting.
Patroni-based Citus High Availability Environment Deployment (hyeongchae lee)
The document discusses deploying a Citus high availability environment using Patroni. It begins with an introduction and agenda. It then covers service discovery with Consul, dynamic configuration using ConfD and Consul templates, high availability with Patroni, and distributed PostgreSQL with Citus. Key points include that Patroni allows customized PostgreSQL high availability solutions, Citus enables scaling out PostgreSQL across nodes, and the demo would show integrating these for a production-ready scalable and highly available database cluster.
PGConf APAC 2018 - PostgreSQL performance comparison in various clouds (PGConf APAC)
Speaker: Oskari Saarenmaa
Aiven PostgreSQL is available in five different public cloud providers' infrastructure in more than 60 regions around the world, including 18 in APAC. This has given us a unique opportunity to benchmark and compare performance of similar configurations in different environments.
We'll share our benchmark methods and results, comparing various PostgreSQL configurations and workloads across different clouds.
Spilo is a tool that provides high availability for PostgreSQL databases running on AWS. It uses Patroni and ETCD to handle replication, failover, and cluster state management. Teams at Zalando use Spilo to run over 150 PostgreSQL databases in a self-managed way on AWS, with each team responsible for their own databases. Spilo provides automation for deploying, replicating, and failing over PostgreSQL clusters on AWS, allowing for increased agility compared to managed database services.
Speaker: Alexander Kukushkin
Kubernetes is a solid leader among different cloud orchestration engines and its adoption rate is growing on a daily basis. Naturally people want to run both their applications and databases on the same infrastructure.
There are a lot of ways to deploy and run PostgreSQL on Kubernetes, but most of them are not cloud-native. Around one year ago Zalando started to run an HA setup of PostgreSQL on Kubernetes managed by Patroni. Those experiments were quite successful and produced a Helm chart for Patroni. That chart was useful, albeit with a single problem: Patroni depended on Etcd, ZooKeeper or Consul.
Few people look forward to deploying two applications instead of one and supporting them later on. In this talk I will introduce Kubernetes-native Patroni. I will explain how Patroni uses the Kubernetes API to run leader election and store the cluster state. I'm going to live-demo a deployment of an HA PostgreSQL cluster on Minikube and share our experience of running more than 130 clusters on Kubernetes.
Patroni is a Python open-source project developed by Zalando in cooperation with other contributors on GitHub: https://github.com/zalando/patroni
How to do a live demo with minikube:
1. git clone https://github.com/zalando/patroni
2. cd patroni
3. git checkout feature/demo
4. cd kubernetes
5. open demo.sh and edit line #4 (specify the minikube context)
6. docker build -t patroni .
7. optionally docker push patroni (if your cluster pulls images from a registry)
8. optionally edit patroni_k8s.yaml line #22 and put the name of the patroni image you built there
9. install tmux
10. run tmux in one terminal
11. run bash demo.sh in another terminal and press Enter from time to time
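The steps above can be collected into a single script. A sketch, not part of the original repository; the editor invocations stand in for the manual edits in steps 5 and 8:

```shell
#!/usr/bin/env bash
# Run `tmux` in a second terminal first; demo.sh drives its panes from there.
set -euo pipefail

git clone https://github.com/zalando/patroni
cd patroni
git checkout feature/demo
cd kubernetes

# Step 5: edit demo.sh line #4 to point at your minikube context
${EDITOR:-vi} demo.sh

docker build -t patroni .
# docker push patroni               # step 7, only if your cluster pulls from a registry
# ${EDITOR:-vi} patroni_k8s.yaml    # step 8, line #22: set the image name you built/pushed

bash demo.sh                        # then press Enter from time to time
```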
This document provides an overview of Kubernetes 101. It begins with asking why Kubernetes is needed and provides a brief history of the project. It describes containers and container orchestration tools. It then covers the main components of Kubernetes architecture including pods, replica sets, deployments, services, and ingress. It provides examples of common Kubernetes manifest files and discusses basic Kubernetes primitives. It concludes with discussing DevOps practices after adopting Kubernetes and potential next steps to learn more advanced Kubernetes topics.
How We Built QuestDB Cloud, a Kubernetes-based SaaS around QuestDB... (javier ramirez)
QuestDB is a high-performance open source database. Many people told us they would like to use it as a service, without having to manage the machines. So we set out to build a solution that would let us launch QuestDB instances with fully managed provisioning, monitoring, security, and upgrades.
Quite a few Kubernetes clusters later, we managed to launch our QuestDB Cloud offering. This talk is the story of how we got there. I will talk about tools such as Calico, Karpenter, CoreDNS, Telegraf, Prometheus, Loki and Grafana, but also about challenges such as authentication, billing, and multi-cloud, and about what you have to say no to in order to survive in the cloud.
Lessons Learned with Kubernetes in Production at PlayPass (Peter Vandenabeele)
Lessons learned with Kubernetes in production at PlayPass, presented at the 6th Docker Birthday Meetup in Antwerp. What went well, and what are some open issues. We also discussed some security measures after the presentation.
Historically, sharing a Linux server entailed all kinds of untenable compromises. In addition to the security concerns, there was simply no good way to keep one application from hogging resources and messing with the others. The classic “noisy neighbor” problem made shared systems the bargain-basement slums of the Internet, suitable only for small or throwaway projects.
Serious use-cases traditionally demanded dedicated systems. Over the past decade virtualization (in conjunction with Moore’s law) has democratized the availability of what amount to dedicated systems, and the result is hundreds of thousands of websites and applications deployed into VPS or cloud instances. It’s a step in the right direction, but still has glaring flaws.
Most of these websites are just piles of code sitting on a server somewhere. How did that code get there? How can it be scaled? Secured? Maintained? It's anybody's guess. There simply isn't enough SysAdmin talent in the world to manage all these apps with anything close to best practices without a better model.
Containers are a whole new ballgame. Unlike VMs, you skip the overhead of running an entire OS for every application environment. There’s also no need to provision a whole new machine to have a place to deploy, meaning you can spin up or scale your application with orders of magnitude more speed and accuracy.
Kubernetes @ Squarespace (SRE Portland Meetup, October 2017) - Kevin Lynch
In this presentation I talk about our motivation to converting our microservices to run on Kubernetes. I discuss many of the technical challenges we encountered along the way, including networking issues, Java issues, monitoring and alerting, and managing all of our resources!
Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units for easy management and discovery called Pods. ReplicaSets ensure that a specified number of pod replicas are running at any given time. Key components include Pods, Services for enabling network access to applications, and Deployments to update Pods and manage releases.
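The primitives named above fit together in a few commands. An illustrative sketch with a hypothetical deployment name and image; each command maps to one concept from the summary:

```shell
kubectl create deployment web --image=nginx --replicas=3   # Deployment -> ReplicaSet -> 3 Pods
kubectl get replicaset                                     # the ReplicaSet keeping 3 replicas running
kubectl expose deployment web --port=80                    # Service: stable network access to the Pods
kubectl set image deployment/web nginx=nginx:1.25          # Deployment manages the rolling update
kubectl rollout status deployment/web                      # watch the release progress
```

The layering is the point: you rarely create ReplicaSets directly; the Deployment owns them so that updates and rollbacks can swap one ReplicaSet for another while the Service keeps routing traffic.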
Kubernetes is making the promise of changing the datacenter from a group of computers into "a computer" itself. This presentation outlines the new features in the Kubernetes 1.1 and 1.2 releases.
This document discusses container technologies including App Container (appc) and rkt. It provides an overview of appc components like the image format, discovery, and executor. It then discusses rkt, an implementation of appc, describing its modular architecture with stages 0-2 and use of systemd and cgroups for isolation. It also touches on rkt security, networking, and integration with systemd and user namespaces.
Customize and Secure the Runtime and Dependencies of Your Procedural Language...VMware Tanzu
Customize and Secure the Runtime and Dependencies of Your Procedural Languages Using PL/Container
Greenplum Summit at PostgresConf US 2018
Hubert Zhang and Jack Wu
Container technologies use namespaces and cgroups to provide isolation between processes and limit resource usage. Docker builds on these technologies using a client-server model and additional features like images, containers, and volumes to package and run applications reliably and at scale. Kubernetes builds on Docker to provide a platform for automating deployment, scaling, and operations of containerized applications across clusters of hosts. It uses labels and pods to group related containers together and services to provide discovery and load balancing for pods.
Kubernetes has become the de facto standard platform for container orchestration. Its extensibility and many integrations have paved the way for a wide variety of data science and research tooling to be built on top of it.
From all-encompassing tools like Kubeflow, which makes it easy for researchers to build end-to-end machine learning pipelines, to specific orchestration of analytics engines such as Spark, Kubernetes has made the deployment and management of these things easy. This presentation will showcase some of the larger research tools in the ecosystem and go into how Kubernetes has enabled this easy form of application management.
PuppetConf 2017: Kubernetes in the Cloud w/ Puppet + Google Container Engine-...Puppet
Today there's a multitude of ways to get up and running with Kubernetes in the Cloud. In this talk we'll look at how easy it is to operationalize your K8s cluster deployments using the new gcontainer puppet module for Google Container Engine (GKE), Google’s managed Kubernetes service. We'll walk you through an end-to-end deployment of a demo application using the gcontainer puppet module and the kubernetes module. We'll also take a deep dive into the unique value proposition that GKE brings to Kubernetes deployments, including security, scaling, federation, automated container builds, an integrated private container registry and GPUs.
Kubernetes Basis: Pods, Deployments, and ServicesJian-Kai Wang
Kubernetes is a container management platform that makes containers scalable. In this repository, we address the issues of how to use Kubernetes with real-world cases. We start from the basic objects in Kubernetes: Pods, Deployments, and Services. This repository is also a tutorial for those with advanced containerization skills trying to step into Kubernetes. We also provide several YAML examples for those looking to quickly deploy services. Please enjoy it and let's start the journey to Kubernetes.
Scylla on Kubernetes: Introducing the Scylla OperatorScyllaDB
The document introduces the Scylla Operator for Kubernetes, which provides a management layer for Scylla on Kubernetes. It addresses some limitations of using StatefulSets alone to run Scylla, such as safe scale down operations and tracking member identity. The operator implements the controller pattern with custom resources to deploy and manage Scylla clusters on Kubernetes. It handles tasks like cluster creation and scale up/down while addressing issues like local storage failures.
Kubernetes @ Squarespace: Kubernetes in the DatacenterKevin Lynch
The document discusses Kubernetes adoption at Squarespace as their engineering organization grew. It describes the challenges of a monolithic architecture and how microservices addressed these challenges. It then discusses how Kubernetes helped solve operational challenges of provisioning and scaling microservices. Key Kubernetes concepts like pods, deployments, services and namespaces are explained. Monitoring, networking and security with Kubernetes are also covered.
[WSO2Con EU 2018] Deploying Applications in K8S and DockerWSO2
Within the last four years container technologies have become very popular. A lot of companies and developers are now using containers to ship their applications. Docker provides an easy-to-use packaging model to bundle an application. However, in many cases a single container is not enough to run an application: it takes multiple containers, scaled across multiple host machines, to make a production-grade deployment. Kubernetes is an open source system for automating deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units for easy management and discovery. This presentation discusses best practices for deploying applications on Docker and Kubernetes while introducing Docker and Kubernetes concepts.
Similar to PGConf.ASIA 2019 Bali - PostgreSQL on K8S at Zalando - Alexander Kukushkin (20)
Equnix Business Solutions (Equnix) is an IT Solution provider in Indonesia, providing comprehensive solution services especially on the infrastructure side for corporate business needs based on research and Open Source. Equnix has 3 (three) main services known as the Trilogy of Services: Support (Maintenance/Managed), World class level of Software Development, and Expert Consulting and Assessment for High Performance Transactions System. Equnix is customer oriented, not product or principal. Equal opportunity based on merit is our credo in managing HR development.
Data Leaks: the Work of Hackers or Criminals? How Do We Anticipate...Equnix Business Solutions
[EWTT2022] Database Implementation Strategy in a Microservice Architecture.pdfEqunix Business Solutions
Equnix Appliance: the Best Answer for Robust Computing Needs.pdfEqunix Business Solutions
PGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo HiroseEqunix Business Solutions
PGSpider is a high-performance SQL cluster engine developed by Toshiba Corporation. It allows distributed querying of heterogeneous data sources using standard SQL. PGSpider improves retrieval performance through parallel queries across nodes and supports multi-tenant querying to retrieve records from the same table across nodes. It utilizes techniques like pushdown of conditional expressions and aggregation functions to nodes to reduce network traffic.
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Communications Mining Series - Zero to Hero - Session 1DianaGray10
This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
Building RAG with self-deployed Milvus vector database and Snowpark Container...Zilliz
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
The UiPath Test Automation with generative AI and OpenAI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with OpenAI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
3. 3
Typical problems and horror stories
Brief introduction to Kubernetes
Spilo & Patroni
Postgres-Operator
AGENDA
4. 4
WE BRING FASHION TO PEOPLE IN 17 COUNTRIES
17 markets
7 fulfillment centers
26.4 million active customers
5.4 billion € net sales 2018
250 million visits per month
15,000 employees in Europe
5. 5
PostgreSQL at Zalando
● > 300 in the data centers
● > 160 databases on AWS, managed by the DB team
● > 1000 databases in other Kubernetes clusters
● > 190 run in the ACID team’s Kubernetes cluster
6. 6
What is Kubernetes?
● A set of open-source components running on one or more servers
● A container orchestration system
● An abstraction layer over your real or virtualized hardware
● An “infrastructure as code” system
● Automatic resource allocation
● Next step after Ansible/Chef/Puppet
7. 7
What is Kubernetes not?
● An operating system
● A magical way to make your infrastructure scalable
● An excuse to fire your DevOps team (someone has to
configure and manage it)
● A good solution for running 2-3 servers
10. 10
Stateful applications on Kubernetes
● PersistentVolumes
○ Abstracts the details of how storage is provisioned
○ Supports many different storage types via plugins:
■ EBS, AzureDisk, iSCSI, NFS, CEPH, Glusterfs and so on
● StatefulSets
○ Guaranteed number of Pods with stable (and unique) identifiers
○ Ordered deployment and scaling
○ Connecting Pods with corresponding persistent storage
(PersistentVolume+PersistentVolumeClaim)
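As a minimal sketch of how these pieces fit together (all names are illustrative, not the actual Spilo manifests), a StatefulSet with a volumeClaimTemplates section makes Kubernetes create one PersistentVolumeClaim per pod, so pod db-0 always reattaches to its own volume:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db                    # pods get stable names: db-0, db-1
spec:
  serviceName: db
  replicas: 2                 # guaranteed number of pods
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: postgres
        image: postgres:11    # illustrative image
        volumeMounts:
        - name: pgdata
          mountPath: /home/postgres/pgdata
  volumeClaimTemplates:       # one PVC per pod: pgdata-db-0, pgdata-db-1
  - metadata:
      name: pgdata
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
```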
11. 11
● All supported versions of PostgreSQL inside a single image
● Plenty of extensions (pg_partman, pg_cron, postgis, timescaledb, etc)
● Additional tools (pgq, pgbouncer, wal-e/wal-g)
● PGDATA on an external volume
● Patroni for HA
● Environment-variables based configuration
● Lightweight!
Spilo Docker image
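To illustrate the environment-variable based configuration, a container spec fragment might look like this (SCOPE, PGVERSION and WAL_S3_BUCKET follow Spilo's documented variable names; the image tag and values are made up):

```yaml
containers:
- name: postgres
  image: registry.opensource.zalan.do/acid/spilo-11:latest  # illustrative tag
  env:
  - name: SCOPE               # cluster name, also used by Patroni
    value: acid-minimal-cluster
  - name: PGVERSION           # which of the bundled PostgreSQL versions to run
    value: "11"
  - name: WAL_S3_BUCKET       # target bucket for wal-e/wal-g archiving
    value: example-db-backups
```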
12. 12
● Automatic failover solution for PostgreSQL streaming replication
● A Python daemon that manages one PostgreSQL instance
● Keeps the cluster state in a DCS (Etcd, Zookeeper, Consul, Kubernetes)
○ Uses DCS for leader elections
■ Makes PostgreSQL a first-class citizen on Kubernetes!
● Helps to automate a lot of things like:
○ A new cluster deployment
○ Scaling out and in
○ PostgreSQL configuration management
What is Patroni
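A minimal Patroni configuration sketch using Kubernetes as the DCS (member name, addresses, and the password are placeholders; the Patroni documentation covers the full option set):

```yaml
scope: acid-minimal-cluster   # cluster name, shared by all members
name: db-0                    # unique per member, typically the pod name
restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.0.5:8008
kubernetes:                   # store cluster state in Kubernetes objects
  labels:
    application: patroni
  use_endpoints: true
bootstrap:
  dcs:                        # cluster-wide settings, written once to the DCS
    ttl: 30
    loop_wait: 10
postgresql:
  listen: 0.0.0.0:5432
  connect_address: 10.0.0.5:5432
  data_dir: /home/postgres/pgdata
  authentication:
    superuser:
      username: postgres
      password: secret        # placeholder
```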
14. 14
● A few long YAML manifests to write
● Different parts of PostgreSQL configuration spread over multiple
manifests
● No easy way to work with a cluster as a whole (update, delete)
● Manual generation of DB objects, e.g. users and their passwords.
Manual deployment to Kubernetes
15. 15
● Encapsulates knowledge of a human operating the service
● Implements a controller application to act on custom resources
● A CRD (custom resource definition) to describe a domain-specific object
(i.e. a Postgres cluster)
https://coreos.com/blog/introducing-operators.html
Kubernetes operator pattern
16. 16
● Defines a custom Postgresql resource
● Watches instances of Postgresql, creates/updates/deletes corresponding
Kubernetes objects
● Allows updating running-cluster resources (memory, cpu, volumes) and postgres
configuration
● Creates databases, users and automatically generates passwords
● Auto-repairs, smart rolling updates (switchover to replicas before updating
the master)
Zalando Postgres-Operator
17. 17
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-minimal-cluster
spec:
  teamId: "ACID" # is used to provision human users
  volume:
    size: 1Gi
  numberOfInstances: 2
  users:
    zalando: # database owner
    - createrole
    - createdb
    foo_app_user: # role for application foo
  databases: # name->owner
    foo: zalando
  postgresql:
    version: "11"
Postgresql manifest
21. 21
Zalando Postgres-Operator features
● Rolling updates on Postgres cluster changes
● Volume resize without Pod restarts
● Cloning Postgres clusters
● Logical Backups to S3 Bucket
● Standby cluster from S3 WAL archive
● Configurable for non-cloud environments
● UI to create and edit Postgres cluster manifests
22. 22
Kubernetes rolling upgrade
● Rotates all worker nodes in the K8s cluster
○ Does it in a rolling manner, one-by-one
○ If you are unlucky, it will cause a number of failovers equal to the
number of pods in your Postgres cluster
● How does the Postgres-Operator deal with it?
○ Detects the to-be-decommissioned node by the lack of the ready label
and the SchedulingDisabled status
○ Moves all master pods:
■ Moves replicas to an already updated node
■ Triggers a switchover to those replicas
23. 23
Kubernetes rolling upgrade (start)
[Diagram: three availability zones; the to-be-decommissioned node runs the primaries of clusters A and B plus a replica of C, the other nodes run the remaining replicas and the primary of C, and an empty new node has been added. Legend: Node (to-be-decommissioned), Node (new), Terminated Pod, Active Pod]
24. 24
Kubernetes rolling upgrade (step 1)
[Diagram: replica pods of clusters A, B and C are being recreated on the new node; the corresponding old replica pods are terminated]
25. 25
Kubernetes rolling upgrade (step 1)
[Diagram: step 1 complete — the to-be-decommissioned node now runs only the primaries of clusters A and B; all replicas run on the updated and new nodes]
26. 26
Kubernetes rolling upgrade (switchover)
[Diagram: before the switchover, the primaries of clusters A and B still run on the to-be-decommissioned node; candidate replicas are ready on the updated nodes]
27. 27
Kubernetes rolling upgrade (switchover)
[Diagram: after the switchover, all primaries run on updated nodes; the to-be-decommissioned node holds only replicas]
28. 28
Kubernetes rolling upgrade (finish)
[Diagram: the to-be-decommissioned node is drained and can be terminated; all pods, including the remaining replicas, run on the updated and new nodes]
30. 30
Problems with AWS infrastructure
● AWS API Rate Limit Exceeded
○ Prevents or delays attaching/detaching persistent volumes (EBS)
to/from Pods
■ Delays recovery of failed Pods
○ Might delay a deployment of a new cluster
● Sometimes EC2 instances fail and are shut down by AWS
○ The shutdown might take ages
○ All EBS volumes remain attached until the instance is shut down
■ Pods can’t be rescheduled
31. 31
KUBE2IAM
● AWS IAM + Instance Profile is a very flexible, convenient, and powerful
technology
● AWS tools and libraries can use temporary credentials obtained from the
Instance Profile to access other AWS services, for example S3
○ We are using S3 for backups and continuous archiving (wal-e/wal-g)
● Instance Profile is configured per EC2 Instance (K8s Node)!
● kube2iam - provides different IAM roles for Pods running on a single Node
● kube2iam might stop serving a Pod which is being shut down
○ wal-e is unable to get credentials to access S3
■ Last WAL segments remain unarchived! :(
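With kube2iam, the IAM role is requested per pod via an annotation, roughly like this (the role name and pod spec are placeholders for illustration):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: db-0
  annotations:
    iam.amazonaws.com/role: example-postgres-s3-role  # IAM role this pod may assume
spec:
  containers:
  - name: postgres
    image: postgres:11  # illustrative image
```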
32. 32
Lack of Disk space
● Single volume for PGDATA, pg_wal and logs
● FATAL,53100,could not write to file
"pg_wal/xlogtemp.22993": No space left on device
○ Usually ends with Postgres shutting itself down
● Patroni tries to recover the primary which isn’t running
○ “start->promote->No space left->shutdown” loop
Disk space MUST be monitored!
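If kubelet metrics are scraped by Prometheus, a simple alerting rule on volume usage can serve as such a monitor (a sketch; the PVC name pattern, threshold and labels are examples):

```yaml
groups:
- name: postgres-disk
  rules:
  - alert: PostgresVolumeAlmostFull
    expr: |
      kubelet_volume_stats_available_bytes{persistentvolumeclaim=~"pgdata-.*"}
        / kubelet_volume_stats_capacity_bytes{persistentvolumeclaim=~"pgdata-.*"} < 0.10
    for: 5m                # less than 10% free for 5 minutes
    labels:
      severity: critical
    annotations:
      summary: "Less than 10% disk space left on {{ $labels.persistentvolumeclaim }}"
```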
33. 33
Root causes of disk space problems, or why
we don’t implement volume autoextend
● Excessive logging
○ slow queries, human access, application errors, connections/disconnections
● pg_wal growth
○ archive_command is slow/failing
○ Unconsumed changes on the replication slot
■ Replica is not streaming? Replica is slow?
■ Logical replication slot?
○ checkpoints taking too long due to throttled IOPS
● PGDATA growth
○ Table and index bloat!
■ Useless updates of unchanged data?
■ Autovacuum tuning? Zheap?
○ Natural growth of data
■ Lack of retention policies?
■ Broken cleanup jobs?
34. 34
ORM can cause wal-e to fail!
wal_e.main ERROR MSG: Attempted to archive a file that is too
large. HINT: There is a file in the postgres database
directory that is larger than 1610612736 bytes. If no such
file exists, please report this as a bug. In particular,
check pg_stat/pg_stat_statements.stat.tmp, which appears to
be 2010822591 bytes
Meanwhile in pg_stat_statements:
UPDATE foo SET bar = $1 WHERE id IN ($2, $3, $4, …, $10500);
UPDATE foo SET bar = $1 WHERE id IN ($2, $3, $4, …, $100500);
…. and so on
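The root cause: pg_stat_statements normalizes queries by their structure, so every distinct IN-list length becomes a separate entry and the stats file grows without bound. Passing the ids as a single array parameter collapses them into one entry — a sketch using the foo table from the example above:

```sql
-- Each IN-list length is a distinct normalized query:
UPDATE foo SET bar = $1 WHERE id IN ($2, $3, $4);

-- One array parameter => one pg_stat_statements entry, no matter how
-- many ids the application sends (e.g. $2 = '{1,2,3}'::bigint[]):
UPDATE foo SET bar = $1 WHERE id = ANY($2);
```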
35. 35
Exclusive backup issues
PANIC,XX000,"online backup was canceled, recovery cannot
continue",,,,,"xlog redo at D45/EB000028 for
XLOG/CHECKPOINT_SHUTDOWN: redo D45/EB000028; tli 237; prev
tli 237; fpw true; xid 0:105446371; oid 187558; multi 1;
offset 0; oldest xid 544 in DB 1; oldest multi 1 in DB 1;
oldest/newest commit timestamp xid: 0/0; oldest running xid
0; shutdown",,,,""
● There is no way to join such a failed primary back as a replica
without rebuilding (reinitializing) it!
○ wal-g supports non-exclusive backups, but not yet stable enough
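For reference, the two backup modes differ only in the third argument of pg_start_backup (PostgreSQL 9.6+). A sketch:

```sql
-- Exclusive backup: writes backup_label into PGDATA; if the primary
-- crashes mid-backup, that leftover file causes the PANIC above.
SELECT pg_start_backup('base', true, true);
SELECT pg_stop_backup();

-- Non-exclusive backup: state lives only in the calling session, so a
-- crash leaves nothing behind; the session must stay connected between
-- the two calls.
SELECT pg_start_backup('base', true, false);
SELECT * FROM pg_stop_backup(false);
```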
36. 36
Out-Of-Memory Killer
● Postgres log:
server process (PID 10810) was terminated by signal 9: Killed
● dmesg -T:
[Wed Jul 31 01:35:35 2019] Memory cgroup out of memory: Kill process 14208
(postgres) score 606 or sacrifice child
[Wed Jul 31 01:35:35 2019] Killed process 14208 (postgres) total-vm:2972124kB,
anon-rss:68724kB, file-rss:1304kB, shmem-rss:2691844kB
[Wed Jul 31 01:35:35 2019] oom_reaper: reaped process 14208 (postgres), now
anon-rss:0kB, file-rss:0kB, shmem-rss:2691844kB
37. 37
Out-Of-Memory Killer
● PIDs in the container (10810) and on the host (14208) are different
○ Sometimes it is hard to investigate what happened
● oom_score_adj trick doesn’t really make sense in the container
○ Only Patroni + PostgreSQL are running there
● It is not really clear how memory accounting in the container works:
○ memory: usage 8388392kB, limit 8388608kB, failcnt 1
○ cache:2173896KB rss:6019692KB rss_huge:0KB shmem:2173428KB
mapped_file:2173512KB dirty:132KB writeback:0KB swap:0KB
inactive_anon:15732KB active_anon:8177696KB inactive_file:320KB active_file:184KB
unevictable:0KB
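The container-vs-host PID confusion can be untangled from the host: since kernel 4.1, /proc/&lt;pid&gt;/status lists the process's PID in every namespace it is visible in. A minimal sketch:

```shell
# On the host, the NSpid line shows the host PID first, then the PID in
# each nested namespace, e.g. "NSpid: 14208 10810" for the example above.
# Querying our own process works anywhere as a demonstration:
grep '^NSpid' /proc/self/status
```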
38. 38
Kubernetes+Docker
● ERROR: could not resize shared memory
segment "/PostgreSQL.1384046013" to 8388608
bytes: No space left on device
● Started showing up with PostgreSQL 11 (“parallel hash join” allocates
dynamic shared memory in /dev/shm)
● Docker limits /dev/shm to 64MB by default
● How to fix?
○ Mount custom dshm tmpfs volume to /dev/shm
■ Or set enableShmVolume: true in the cluster
manifest
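enableShmVolume effectively adds a memory-backed emptyDir over /dev/shm. The manual equivalent looks roughly like this (a sketch of the relevant Pod spec parts only):

```yaml
spec:
  volumes:
    - name: dshm
      emptyDir:
        medium: Memory        # tmpfs, not limited to Docker's 64MB default
  containers:
    - name: postgres
      volumeMounts:
        - name: dshm
          mountPath: /dev/shm
```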
39. 39
Problems with PostgreSQL
● Logical decoding on the replica? Failover slots?
○ Patroni works around this with a hack: it does not allow
connections until the logical slot is created.
■ The consumer might still lose some events.
● “FATAL too many connections”
○ Prevents replica from starting streaming
■ Solved in PostgreSQL 12 (wal_senders no longer
count against max_connections)
○ Built-in connection pooler?
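Until PostgreSQL 12 the only mitigation is headroom in postgresql.conf. A sketch with illustrative values:

```
# Before PostgreSQL 12, wal senders occupy regular connection slots, so an
# application that exhausts max_connections also blocks replicas from
# streaming. Keep max_connections well above the app pool size plus
# max_wal_senders.
max_connections = 300
max_wal_senders = 10
# Since PostgreSQL 12, wal senders are accounted separately and no longer
# compete with client connections.
```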
40. 40
Human errors
● YAML formatting :)
● Inadequate resource requests and limits
○ Pod can’t be scheduled because no node has enough capacity
○ Processes are terminated by the oom-killer
● Postgres-Operator/Spilo ServiceAccount deleted by
employees
41. 41
Conclusion
● Postgres-Operator helps us to manage more than 1000
PostgreSQL clusters distributed in 80+ K8s accounts with
minimal effort.
○ It wouldn’t be possible without high level of
automation
● In the cloud and on K8s you have to be ready to deal with
entirely new problems and failure scenarios
○ Find the solution and implement it in the operator!