The maturation of containerization platforms has changed how people think about creating development environments and has eliminated many inefficiencies for deploying applications. These concept and technologies have made its way into the PostgreSQL ecosystem as well, and tools such as Docker and Kubernetes have enabled teams to run their own “database-as-a-service” on the infrastructure of their choosing.
In this talk, we will cover the following:
- Why containers are important and what they mean for PostgreSQL
- Setting up and managing a PostgreSQL along with pgadmin4 and monitoring
- Running PostgreSQL on Kubernetes with a Demo
- Trends in the container world and how it will affect PostgreSQL
An Introduction to Using PostgreSQL with Docker & KubernetesJonathan Katz
The maturation of containerization platforms has changed how people think about creating development environments and has eliminated many inefficiencies for deploying applications. These concept and technologies have made its way into the PostgreSQL ecosystem as well, and tools such as Docker and Kubernetes have enabled teams to run their own “database-as-a-service” on the infrastructure of their choosing.
In this talk, we will cover the following:
- Why containers are important and what they mean for PostgreSQL
- Setting up and managing a PostgreSQL container
- Extending your setup with a pgadmin4 container
- Container orchestration: What this means, and how to use Kubernetes to leverage database-as-a-service with PostgreSQL
- Trends in the container world and how it will affect PostgreSQL
Building a Complex, Real-Time Data Management ApplicationJonathan Katz
Congratulations: you've been selected to build an application that will manage whether or not the rooms for PGConf.EU are being occupied by a session!
On the surface, this sounds simple, but we will be managing the rooms of PGConf.EU, so we know that a lot of people will be accessing the system. Therefore, we need to ensure that the system can handle all of the eager users that will be flooding the PGConf.EU website checking to see what availability each of the PGConf.EU rooms has.
To do this, we will explore the following PGConf.EU features:
* Data types and their functionality, such as:
* Data/Time types
* Ranges
Indexes such as:
* GiST
* SP-Gist
* Common Table Expressions and Recursion
* Set generating functions and LATERAL queries
* Functions and the PL/PGSQL
* Triggers
* Logical decoding and streaming
We will be writing our application primary with SQL, though we will sneak in a little bit of Python and using Kafka to demonstrate the power of logical decoding.
At the end of the presentation, we will have a working application, and you will be happy knowing that you provided a wonderful user experience for all PGConf.EU attendees made possible by the innovation of PGConf.EU!
Learning postgresql, Chapter 1: Getting started with postgresql
Remarks
This section provides an overview of what postgresql is, and why a developer might want to use it.
It should also mention any large subjects within postgresql, and link out to the related topics. Since
the Documentation for postgresql is new, you may need to create initial versions of those related
topics.
Safely Protect PostgreSQL Passwords - Tell Others to SCRAMJonathan Katz
PostgreSQL 10 introduced SCRAM (Salted Challenge Response Authentication Mechanism), introduced in RFC 5802, as a way to securely authenticate passwords. The SCRAM algorithm lets a client and server validate a password without ever sending the password, whether plaintext or a hashed form of it, to each other, using a series of cryptographic methods.
At the end of this talk, you will understand how SCRAM works, how to ensure your PostgreSQL drivers supports it, how to upgrade your passwords to using SCRAM-SHA-256, and why you want to tell other PostgreSQL password mechanisms to SCRAM!
An Introduction to Using PostgreSQL with Docker & KubernetesJonathan Katz
The maturation of containerization platforms has changed how people think about creating development environments and has eliminated many inefficiencies for deploying applications. These concept and technologies have made its way into the PostgreSQL ecosystem as well, and tools such as Docker and Kubernetes have enabled teams to run their own “database-as-a-service” on the infrastructure of their choosing.
In this talk, we will cover the following:
- Why containers are important and what they mean for PostgreSQL
- Setting up and managing a PostgreSQL container
- Extending your setup with a pgadmin4 container
- Container orchestration: What this means, and how to use Kubernetes to leverage database-as-a-service with PostgreSQL
- Trends in the container world and how it will affect PostgreSQL
Building a Complex, Real-Time Data Management ApplicationJonathan Katz
Congratulations: you've been selected to build an application that will manage whether or not the rooms for PGConf.EU are being occupied by a session!
On the surface, this sounds simple, but we will be managing the rooms of PGConf.EU, so we know that a lot of people will be accessing the system. Therefore, we need to ensure that the system can handle all of the eager users that will be flooding the PGConf.EU website checking to see what availability each of the PGConf.EU rooms has.
To do this, we will explore the following PGConf.EU features:
* Data types and their functionality, such as:
* Data/Time types
* Ranges
Indexes such as:
* GiST
* SP-Gist
* Common Table Expressions and Recursion
* Set generating functions and LATERAL queries
* Functions and the PL/PGSQL
* Triggers
* Logical decoding and streaming
We will be writing our application primary with SQL, though we will sneak in a little bit of Python and using Kafka to demonstrate the power of logical decoding.
At the end of the presentation, we will have a working application, and you will be happy knowing that you provided a wonderful user experience for all PGConf.EU attendees made possible by the innovation of PGConf.EU!
Learning postgresql, Chapter 1: Getting started with postgresql
Remarks
This section provides an overview of what postgresql is, and why a developer might want to use it.
It should also mention any large subjects within postgresql, and link out to the related topics. Since
the Documentation for postgresql is new, you may need to create initial versions of those related
topics.
Safely Protect PostgreSQL Passwords - Tell Others to SCRAMJonathan Katz
PostgreSQL 10 introduced SCRAM (Salted Challenge Response Authentication Mechanism), introduced in RFC 5802, as a way to securely authenticate passwords. The SCRAM algorithm lets a client and server validate a password without ever sending the password, whether plaintext or a hashed form of it, to each other, using a series of cryptographic methods.
At the end of this talk, you will understand how SCRAM works, how to ensure your PostgreSQL drivers supports it, how to upgrade your passwords to using SCRAM-SHA-256, and why you want to tell other PostgreSQL password mechanisms to SCRAM!
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache AccumuloJoe Stein
In this talk we will walk through how Apache Kafka and Apache Accumulo can be used together to orchestrate a de-coupled, real-time distributed and reactive request/response system at massive scale. Multiple data pipelines can perform complex operations for each message in parallel at high volumes with low latencies. The final result will be inline with the initiating call. The architecture gains are immense. They allow for the requesting system to receive a response without the need for direct integration with the data pipeline(s) that messages must go through. By utilizing Apache Kafka and Apache Accumulo, these gains sustain at scale and allow for complex operations of different messages to be applied to each response in real-time.
Gruter TECHDAY 2014 Realtime Processing in TelcoGruter
Big Telco, Bigger real-time demands: Real-time processing in Telco
- Presented by Jung-ryong Lee, engineer manager at SK Telecom at Gruter TECHDAY 2014 Oct.29 Seoul, Korea
Designing your SaaS Database for Scale with PostgresOzgun Erdogan
If you’re building a SaaS application, you probably already have the notion of tenancy built in your data model. Typically, most information relates to tenants / customers / accounts and your database tables capture this natural relation.
With smaller amounts of data, it’s easy to throw more hardware at the problem and scale up your database. As these tables grow however, you need to think about ways to scale your multi-tenant (B2B) database across dozens or hundreds of machines.
In this talk, we're first going to talk about motivations behind scaling your SaaS (multi-tenant) database and several heuristics we found helpful on deciding when to scale. We'll then describe three design patterns that are common in scaling SaaS databases: (1) Create one database per tenant, (2) Create one schema per tenant, and (3) Have all tenants share the same table(s). Next, we'll highlight the tradeoffs involved with each design pattern and focus on one pattern that scales to hundreds of thousands of tenants. We'll also share an example architecture from the industry that describes this pattern in more detail.
Last, we'll talk about key PostgreSQL properties, such as semi-structured data types, that make building multi-tenant applications easy. We'll also mention Citus as a method to scale out your multi-tenant database. We'll conclude by answering frequently asked questions on multi-tenant databases and Q&A.
This ppt was used by Devrim at pgDay Asia 2017. He talked about some important facts about WAL - Transaction Logs or xlogs in PostgreSQL. Some of these can really come handy on a bad day
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuRedis Labs
Postgres and Redis Sitting in a Tree | In today’s world of polyglot persistence, it’s likely that companies will be using multiple data stores for storing and working with data based on the use case. Typically a company will
start with a relational database like Postgres and then add Redis for more high velocity use-cases. What if you could tie the two systems together to enable so much more?
Data warehouse on Kubernetes - gentle intro to Clickhouse Operator, by Robert...Altinity Ltd
San Diego Cloud Native Computing Meetup, January 23, 2020
Presented by Robert Hodges, Altinity CEO
Data services are the latest wave of applications to catch the Kubernetes bug, but how many people would guess that includes data warehouses? We proved it works by developing the ClickHouse Kubernetes operator, which is now in production use at companies like Mux.com. It's an open source operator to stand up and run ClickHouse, a popular Apache 2.0 data warehouse that can return queries on trillions of rows in seconds or less. This talk introduces ClickHouse and shows why it's a 'cloud friendly' DBMS. We'll go mano-a-mano with the ClickHouse operator, showing how you can spin up data warehouses in 60 seconds or less. We'll cover issues like storage management, monitoring and upgrade. In short, everything you need to know to try running your own ClickHouse data warehouses on Kubernetes.
At a blistering pace and for a variety of reasons, companies are migrating their on-premise database infrastructures to cloud-based solutions - to save costs on hardware, tame the impact of disaster recovery, or even to improve security. Zalando is not an exception: more than two years ago we migrated our first production services to AWS.
In addition to the fully managed database services like RDS and Aurora, Amazon offers a wide spectra of EC2 Instances with different types of performance and price. Without a lot of experience in running cloud databases it’s not easy to make the right choice, and as a result you will either have pure database performance or will overpay for over-provisioned resources.
In this talk I will compare different ways of running PostgreSQL on AWS, explain why we decided to run most of our databases on EC2 Instances instead of RDS, how we chose EC2 Instance types and EBS Volumes, which AWS CloudWatch metrics MUST be monitored (and why), and what problems we hit plus how to avoid them.
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...DataStax
At Knewton we operate across five different VPCs a total of 29 clusters, each ranging from 3 nodes to 24 nodes. For a team of three to maintain this is not herculean, however good tools to diagnose issues and gather information in a distributed manner are vital to moving quickly and minimizing engineering time spent.
The database team at Knewton has been successfully using a combination of Ansible and custom open sourced tools to maintain and improve the Cassandra deployment at Knewton. I will be talking about several of these tools and giving examples of how we are using them. Specifically I will discuss the cassandra-tracing tool, which analyzes the contents of the system_traces keyspace, and the cassandra-stat tool, which gives real-time output of the operations of a cassandra cluster. Distributed administration with ad-hoc Ansible will also be covered and I will walk through examples of using these commands to identify and remediate clusterwide issues.
About the Speaker
Jeffrey Berger Lead Database Engineer, Knewton
Dr. Jeffrey Berger is currently the lead database engineer at Knewton, an education tech startup in NYC. He joined the tech scene in NYC in 2013 and spent two years working with MongoDB, becoming a certified MongoDB administrator and a MongoDB Master. He received his Cassandra Administrator certification at Cassandra Summit 2015. He holds a Ph.D. in Theoretical Physics from Penn State and spent several years working on high energy nuclear interactions.
PostgreSQL is a well-known relational database. But in the last few years, it has gained capabilities that previously belonged only to "NoSQL" databases. In this talk, I describe several of PostgreSQL that give it such capabilities.
On July 6, 2021, MariaDB 10.6 became generally available (production ready). This presentation focuses on the most important aspects of it as well as the influence it has. Improvements to InnoDB, SYS Schema Adoption, and deprecated variables and engines are all part of this presentation.
Operating PostgreSQL at Scale with KubernetesJonathan Katz
The maturation of containerization platforms has changed how people think about creating development environments and has eliminated many inefficiencies for deploying applications. These concept and technologies have made its way into the PostgreSQL ecosystem as well, and tools such as Docker and Kubernetes have enabled teams to run their own “database-as-a-service” on the infrastructure of their choosing.
All this sounds great, but if you are new to the world of containers, it can be very overwhelming to find a place to start. In this talk, which centers around demos, we will see how you can get PostgreSQL up and running in a containerized environment with some advanced sidecars in only a few steps! We will also see how it extends to a larger production environment with Kubernetes, and what the future holds for PostgreSQL in a containerized world.
We will cover the following:
* Why containers are important and what they mean for PostgreSQL
* Create a development environment with PostgreSQL, pgadmin4, monitoring, and more
* How to use Kubernetes to create your own "database-as-a-service"-like PostgreSQL environment
* Trends in the container world and how it will affect PostgreSQL
At the conclusion of the talk, you will understand the fundamentals of how to use container technologies with PostgreSQL and be on your way to running a containerized PostgreSQL environment at scale!
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache AccumuloJoe Stein
In this talk we will walk through how Apache Kafka and Apache Accumulo can be used together to orchestrate a de-coupled, real-time distributed and reactive request/response system at massive scale. Multiple data pipelines can perform complex operations for each message in parallel at high volumes with low latencies. The final result will be inline with the initiating call. The architecture gains are immense. They allow for the requesting system to receive a response without the need for direct integration with the data pipeline(s) that messages must go through. By utilizing Apache Kafka and Apache Accumulo, these gains sustain at scale and allow for complex operations of different messages to be applied to each response in real-time.
Gruter TECHDAY 2014 Realtime Processing in TelcoGruter
Big Telco, Bigger real-time demands: Real-time processing in Telco
- Presented by Jung-ryong Lee, engineer manager at SK Telecom at Gruter TECHDAY 2014 Oct.29 Seoul, Korea
Designing your SaaS Database for Scale with PostgresOzgun Erdogan
If you’re building a SaaS application, you probably already have the notion of tenancy built in your data model. Typically, most information relates to tenants / customers / accounts and your database tables capture this natural relation.
With smaller amounts of data, it’s easy to throw more hardware at the problem and scale up your database. As these tables grow however, you need to think about ways to scale your multi-tenant (B2B) database across dozens or hundreds of machines.
In this talk, we're first going to talk about motivations behind scaling your SaaS (multi-tenant) database and several heuristics we found helpful on deciding when to scale. We'll then describe three design patterns that are common in scaling SaaS databases: (1) Create one database per tenant, (2) Create one schema per tenant, and (3) Have all tenants share the same table(s). Next, we'll highlight the tradeoffs involved with each design pattern and focus on one pattern that scales to hundreds of thousands of tenants. We'll also share an example architecture from the industry that describes this pattern in more detail.
Last, we'll talk about key PostgreSQL properties, such as semi-structured data types, that make building multi-tenant applications easy. We'll also mention Citus as a method to scale out your multi-tenant database. We'll conclude by answering frequently asked questions on multi-tenant databases and Q&A.
This ppt was used by Devrim at pgDay Asia 2017. He talked about some important facts about WAL - Transaction Logs or xlogs in PostgreSQL. Some of these can really come handy on a bad day
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuRedis Labs
Postgres and Redis Sitting in a Tree | In today’s world of polyglot persistence, it’s likely that companies will be using multiple data stores for storing and working with data based on the use case. Typically a company will
start with a relational database like Postgres and then add Redis for more high velocity use-cases. What if you could tie the two systems together to enable so much more?
Data warehouse on Kubernetes - gentle intro to Clickhouse Operator, by Robert...Altinity Ltd
San Diego Cloud Native Computing Meetup, January 23, 2020
Presented by Robert Hodges, Altinity CEO
Data services are the latest wave of applications to catch the Kubernetes bug, but how many people would guess that includes data warehouses? We proved it works by developing the ClickHouse Kubernetes operator, which is now in production use at companies like Mux.com. It's an open source operator to stand up and run ClickHouse, a popular Apache 2.0 data warehouse that can return queries on trillions of rows in seconds or less. This talk introduces ClickHouse and shows why it's a 'cloud friendly' DBMS. We'll go mano-a-mano with the ClickHouse operator, showing how you can spin up data warehouses in 60 seconds or less. We'll cover issues like storage management, monitoring and upgrade. In short, everything you need to know to try running your own ClickHouse data warehouses on Kubernetes.
At a blistering pace and for a variety of reasons, companies are migrating their on-premise database infrastructures to cloud-based solutions - to save costs on hardware, tame the impact of disaster recovery, or even to improve security. Zalando is not an exception: more than two years ago we migrated our first production services to AWS.
In addition to the fully managed database services like RDS and Aurora, Amazon offers a wide spectra of EC2 Instances with different types of performance and price. Without a lot of experience in running cloud databases it’s not easy to make the right choice, and as a result you will either have pure database performance or will overpay for over-provisioned resources.
In this talk I will compare different ways of running PostgreSQL on AWS, explain why we decided to run most of our databases on EC2 Instances instead of RDS, how we chose EC2 Instance types and EBS Volumes, which AWS CloudWatch metrics MUST be monitored (and why), and what problems we hit plus how to avoid them.
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...DataStax
At Knewton we operate across five different VPCs a total of 29 clusters, each ranging from 3 nodes to 24 nodes. For a team of three to maintain this is not herculean, however good tools to diagnose issues and gather information in a distributed manner are vital to moving quickly and minimizing engineering time spent.
The database team at Knewton has been successfully using a combination of Ansible and custom open sourced tools to maintain and improve the Cassandra deployment at Knewton. I will be talking about several of these tools and giving examples of how we are using them. Specifically I will discuss the cassandra-tracing tool, which analyzes the contents of the system_traces keyspace, and the cassandra-stat tool, which gives real-time output of the operations of a cassandra cluster. Distributed administration with ad-hoc Ansible will also be covered and I will walk through examples of using these commands to identify and remediate clusterwide issues.
About the Speaker
Jeffrey Berger Lead Database Engineer, Knewton
Dr. Jeffrey Berger is currently the lead database engineer at Knewton, an education tech startup in NYC. He joined the tech scene in NYC in 2013 and spent two years working with MongoDB, becoming a certified MongoDB administrator and a MongoDB Master. He received his Cassandra Administrator certification at Cassandra Summit 2015. He holds a Ph.D. in Theoretical Physics from Penn State and spent several years working on high energy nuclear interactions.
PostgreSQL is a well-known relational database. But in the last few years, it has gained capabilities that previously belonged only to "NoSQL" databases. In this talk, I describe several of PostgreSQL that give it such capabilities.
On July 6, 2021, MariaDB 10.6 became generally available (production ready). This presentation focuses on the most important aspects of it as well as the influence it has. Improvements to InnoDB, SYS Schema Adoption, and deprecated variables and engines are all part of this presentation.
Operating PostgreSQL at Scale with KubernetesJonathan Katz
The maturation of containerization platforms has changed how people think about creating development environments and has eliminated many inefficiencies for deploying applications. These concept and technologies have made its way into the PostgreSQL ecosystem as well, and tools such as Docker and Kubernetes have enabled teams to run their own “database-as-a-service” on the infrastructure of their choosing.
All this sounds great, but if you are new to the world of containers, it can be very overwhelming to find a place to start. In this talk, which centers around demos, we will see how you can get PostgreSQL up and running in a containerized environment with some advanced sidecars in only a few steps! We will also see how it extends to a larger production environment with Kubernetes, and what the future holds for PostgreSQL in a containerized world.
We will cover the following:
* Why containers are important and what they mean for PostgreSQL
* Create a development environment with PostgreSQL, pgadmin4, monitoring, and more
* How to use Kubernetes to create your own "database-as-a-service"-like PostgreSQL environment
* Trends in the container world and how it will affect PostgreSQL
At the conclusion of the talk, you will understand the fundamentals of how to use container technologies with PostgreSQL and be on your way to running a containerized PostgreSQL environment at scale!
Cloud-Native PostgreSQL is a Kubernetes Operator for Postgres written by EDB entirely from scratch in the Go language and relying exclusively on the Kubernetes API.
This webinar covered:
- About DevOps & Cloud Native
- Overview of Cloud Native Postgres
- Storage for Postgres workloads in Kubernetes
- Start Using Cloud-Native Postgres
- Demo
Robert Bates, SVP Sales Engineering of Crunchy Data explains how you can tackle Data Gravity, Kubernetes, and strategies/best practices to run, scale, and leverage stateful containers in production.
Database as a Service (DBaaS) on KubernetesObjectRocket
Learn about ObjectRocket's adventures in Kubernetes. We'll cover why we chose Kubernetes for our DBaaS platform, the challenges we faced, and how we overcame them. A presentation for DevWeek Austin 2018.
Transitioning From SQL Server to MySQL - Presentation from Percona Live 2016Dylan Butler
What if you were asked to support a database platform that you had never worked with before? First you would probably say no, but after you lost that fight, then what? That is exactly how I came to support MySQL. Over the last year my team has worked to learn MySQL, architect a production environment, and figure out how to support it alongside our other platforms (Microsoft SQL Server and Oracle). Along the way, I have also come to appreciate the unique offering of this platform and see it as an important part of our environment going forward.
To make things even more challenging, our first MySQL databases were the backend for a critical, web based application that needed to be highly available across multiple data centers. This meant that we did not have the luxury of standing up a simpler environment to start with and building confidence there. Our final architecture ended up using a five node Percona XtraDB Cluster spread across three data centers.
This session will focus on lessons learned along the way, as well as challenges related to supporting more than one database platforms. It should be interesting to anyone who is new to MySQL, anyone who is being asked to support more than one database platform, or anyone who wants to see how an outsider views the platform.
EDB Cloud Native Postgres includes database container images and a Kubernetes Operator that manage the lifecycle of a database from deployment to operations. This Kubernetes Operator for Postgres is written by EDB entirely from scratch in the Go language and relies exclusively on the Kubernetes API.
Attend this webinar to learn about:
- DevOps & Cloud Native
- Overview of Cloud Native Postgres
- Storage for Postgres workloads in Kubernetes
- Using Cloud Native Postgres
- Demo
Agenda:
What is Software Defined Storage?
What is Ceph?
What is Rook?
Storage for Kubernetes
Storage Classes
Storage on Kubernetes
Operator Pattern
Custom Resource Definition
Rook Operator
Rook architecture
Ceph on Kubernetes with Rook
Demo
Rook Framework for Storage solutions
How to Get Involved?
Beyond Postgres: Interesting Projects, Tools and forksSameer Kumar
This ppt was used at Aug Meetup of Postgres User Group Singapore. We talked about
• Cool tools and extensions like PostGIS
• Great projects like pgpool and pgbouncer
• Interesting forks of PostgreSQL like EnterpriseDB, GreenPlum etc
Interesting take-away from the session was -
• Ease Oracle Migration
• Load Balancing in PostgreSQL
• Spatial data in PostgreSQL
• Connection pooling and resource management
• Your next Data Warehouse project
Meetup page- http://www.meetup.com/PUGS-Postgres-Users-Group-Singapore/
Kubernetes (commonly referred to as "K8s") is an open-source system for automating deployment, scaling and management of containerized applications It aims to provide a "platform for automating deployment, scaling, and operations of application containers across clusters of hosts". We will see Kubernetes architecture, use cases, basics and live demo
Similar to Using PostgreSQL With Docker & Kubernetes - July 2018 (20)
Vectors are the new JSON in PostgreSQL (SCaLE 21x)Jonathan Katz
Vectors are a centuries old, well-studied mathematical concept, yet they pose many challenges around efficient storage and retrieval in database systems. The heightened ease-of-use of AI/ML has lead to a surge of interested of storing vector data alongside application data, leading to some unique challenges. PostgreSQL has seen this story before with JSON, when JSON became the lingua franca of the web. So how can you use PostgreSQL to manage your vector data, and what challenges should you be aware of?
In this session, we'll review what vectors are, how they are used in applications, and what users are looking for in vector storage and search systems. We'll then see how you can search for vector data in PostgreSQL, including looking at best practices for using pgvector, an extension that adds additional vector search capabilities to PostgreSQL. Finally, we'll review ongoing development in both PostgreSQL and pgvector that will make it easier and more performant to search vector data in PostgreSQL.
There are parallels between storing JSON data in PostgreSQL and storing vectors that are produced from AI/ML systems. This lightning talk briefly covers the similarities in use-cases in storing JSON and vectors in PostgreSQL, shows some of the use-cases developers have for querying vectors in Postgres, and some roadmap items for improving PostgreSQL as a vector database.
This talk explores PostgreSQL 15 enhancements (along with some history) and looks at how they improve developer experience (MERGE and SQL/JSON), optimize support for backups and compression, logical replication improvements, enhanced security and performance, and more.
Build a Complex, Realtime Data Management App with Postgres 14!Jonathan Katz
Congratulations: you've been selected to build an application that will manage reservations for rooms!
On the surface, this sounds simple, but you are building a system for managing a high traffic reservation web page, so we know that a lot of people will be accessing the system. Therefore, we need to ensure that the system can handle all of the eager users that will be flooding the website checking to see what availability each room has.
Fortunately, PostgreSQL is prepared for this! And even better, we will be using Postgres 14 to make the problem even easier!
We will explore the following PostgreSQL features:
* Data types and their functionality, such as:
* Data/Time types
* Ranges / Multirnages
Indexes such as:
* GiST
* Common Table Expressions and Recursion (though multiranges will make things easier!)
* Set generating functions and LATERAL queries
* Functions and the PL/PGSQL
* Triggers
* Logical decoding and streaming
We will be writing our application primary with SQL, though we will sneak in a little bit of Python and using Kafka to demonstrate the power of logical decoding.
At the end of the presentation, we will have a working application, and you will be happy knowing that you provided a wonderful user experience for all users made possible by the innovation of PostgreSQL!
Get Your Insecure PostgreSQL Passwords to SCRAMJonathan Katz
Passwords: they just seem to work. You connect to your PostgreSQL database and you are prompted for your password. You type in the correct character combination, and presto! you're in, safe and sound.
But what if I told you that all was not as it seemed. What if I told you there was a better, safer way to use passwords with PostgreSQL? What if I told you it was imperative that you upgraded, too?
PostgreSQL 10 introduced SCRAM (Salted Challenge Response Authentication Mechanism), introduced in RFC 5802, as a way to securely authenticate passwords. The SCRAM algorithm lets a client and server validate a password without ever sending the password, whether plaintext or a hashed form of it, to each other, using a series of cryptographic methods.
In this talk, we will look at:
* A history of the evolution of password storage and authentication in PostgreSQL
* How SCRAM works with a step-by-step deep dive into the algorithm (and convince you why you need to upgrade!)
* SCRAM channel binding, which helps prevent MITM attacks during authentication
* How to safely set and modify your passwords, as well as how to upgrade to SCRAM-SHA-256 (which we will do live!)
all of which will be explained by some adorable elephants and hippos!
At the end of this talk, you will understand how SCRAM works, how to ensure your PostgreSQL drivers supports it, how to upgrade your passwords to using SCRAM-SHA-256, and why you want to tell other PostgreSQL password mechanisms to SCRAM!
Developing and Deploying Apps with the Postgres FDWJonathan Katz
I couldn't wait to use the Postgres Foreign Data Wrapper (postgres_fdw) in a project; imagine being able to read and write data to many databases all from a single database! I finally found a project where it made sense to use this amazing technology.
I mapped out my architecture and began to code, and realized there were some things that did not work as expected: I could not call remote functions or insert into a table with a serial primary key and have it autoupdate. I found workarounds (which I will share), so the project went on.
We tested the setup, everything seemed to work well, and then we went to deploy to production. And then the real fun began.
Despite the title, I still love the Postgres FDW but wanted to provide some cautionary tales from a hybrid developer/DBA perspective on how to properly use them in your working environment. This talk will cover:
* Basic Postgres FDW setup in a development environment vs. production environment
* Handling some common FDW uses case that you think are trivial but are not
* Working with advanced Postgres constructs such as schemas and sequences with FDWs
* Putting it all together to make sure your production application is safe with your FDWs
* ...and when you really, really need to make a remote call and it is not supported by a FDW, how to do that too!
What's the great thing about a database? Why, it stores data of course! However, one feature that makes a database useful is the different data types that can be stored in it, and the breadth and sophistication of the data types in PostgreSQL is second-to-none, including some novel data types that do not exist in any other database software!
This talk will take an in-depth look at the special data types built right into PostgreSQL version 9.4, including:
* INET types
* UUIDs
* Geometries
* Arrays
* Ranges
* Document-based Data Types:
* Key-value store (hstore)
* JSON (text [JSON] & binary [JSONB])
We will also have some cleverly concocted examples to show how all of these data types can work together harmoniously.
Accelerating Local Search with PostgreSQL (KNN-Search)Jonathan Katz
KNN-GiST indexes were added in PostgreSQL 9.1 and greatly accelerate some common queries in the geospatial and textual search realms. This presentation will demonstrate the power of KNN-GiST indexes on geospatial and text searching queries, but also their present limitations through some of my experimentations. I will also discuss some of the theory behind KNN (k-nearest neighbor) as well as some of the applications this feature can be applied too.
To see a version of the talk given at PostgresOpen 2011, please visit http://www.youtube.com/watch?v=N-MD08QqGEM
Webscale PostgreSQL - JSONB and Horizontal Scaling StrategiesJonathan Katz
All data is relational and can be represented through relational algebra, right? Perhaps, but there are other ways to represent data, and the PostgreSQL team continues to work on making it easier and more efficient to do so!
With the upcoming 9.4 release, PostgreSQL is introducing the "JSONB" data type which allows for fast, compressed, storage of JSON formatted data, and for quick retrieval. And JSONB comes with all the benefits of PostgreSQL, like its data durability, MVCC, and of course, access to all the other data types and features in PostgreSQL.
How fast is JSONB? How do we access data stored with this type? What can it do with the rest of PostgreSQL? What can't it do? How can we leverage this new data type and make PostgreSQL scale horizontally? Follow along with our presentation as we try to answer these questions.
PostgreSQL comes built-in with a variety of indexes, some of which are further extensible to build powerful new indexing schemes. But what are all these index types? What are some of the special features of these indexes? What are the size & performance tradeoffs? How do I know which ones are appropriate for my application?
Fortunately, this talk aims to answer all of these questions as we explore the whole family of PostgreSQL indexes: B-tree, expression, GiST (of all flavors), GIN and how they are used in theory and practice.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
How world-class product teams are winning in the AI era by CEO and Founder, P...
Using PostgreSQL With Docker & Kubernetes - July 2018
1. An Introduction to Using PostgreSQL
with Docker & Kubernetes
JONATHAN S. KATZ
JULY 19, 2018
LOS ANGELES POSTGRESQL USER GROUP
2. About Crunchy Data
2
• Leading provider of trusted open source PostgreSQL and
PostgreSQL related technologies, support, and training
to enterprises
• We're hiring!
• crunchydata.com
• @crunchydb
3. • Director of Communications, Crunchy Data
• Previously: Engineering leadership in startups
• Longtime PostgreSQL community contributor
• Advocacy & various committees for PGDG
• @postgresql + .org content
• Director, PgUS
• Co-Organizer, NYCPUG
• Conference organization + speaking
• @jkatz05
About Me
3
4. • Containers: A Brief History
• Containers + PostgreSQL
• Setting up PostgreSQL with Containers
• Deploying! - Container Orchestration
• Look Ahead: Trends in the Container World
Outline
4
5. • Containers are processes that
encapsulate all the requirements to
execute an application
• Similar to virtual machines,
Sandbox for applications similar to
a virtual machine but with
increased density on a single host
What Are Containers?
5
Source: Docker
6. • Container Image - the file that describes how to build a container
• Container Engine - prepares for container to be executed by container
runtime by collecting container images, accepting user input, preparing
mount points, etc. Examples: docker, CRI-O, RKT, LXD
• Container Runtime - Takes information passed from container engine and
sets up containerized process. Open Containers Initiative (OCI) helping to
standardize on runc
• Container - The runtime instantiation of a Container Image, i.e. a process!
Container Glossary
6 Source: https://developers.redhat.com/blog/2018/02/22/container-terminology-practical-introduction/
7. • Lightweight
• compared to virtual machines, use less disk, RAM, CPU
• Sandboxed
• Container runtime is isolated from other processes
• Portability
• Containers can be run on different platforms as long as container engine is available
• Convenience
• Requirements for running applications bundled together
• Prevents messy dependency overlaps
Why Containers?
7
11. • Containers provide several advantages to running PostgreSQL:
• Setup & distribution for developer environments
• Ease of packaging extensions & minor upgrades
• Separate out secondary applications (monitoring, administration)
• Automation and scale for provisioning and creating replicas, backups
Containers & PostgreSQL
11
12. • Containers also introduce several challenges:
• Administrator needs to understand and select appropriate storage
options
• Configuration for individual database specifications and user access
• Managing 100s - 1000s of containers requires appropriate
orchestration (more on that later)
• Still a database within the container; standard DBA tuning applies
• However, these are challenges you will find in most database environments
Containers & PostgreSQL
12
13. • We will use the Crunchy Container Suite
• PostgreSQL (+ PostGIS): our favorite database; option to add our favorite geospatial extension
• pgpool + pgbouncer: connection pooling, load balancing
• pgbackrest: terabyte-scale backup management
• Monitoring: Prometheus + export
• Scheduling: "crunchy-dba"
• pgadmin4: UX-driven management
• Open source!
• Apache 2.0 license
• Support for Docker 1.12+, Kubernetes 1.5+
• Actively maintained and updated
Getting Started With Containers & PostgreSQL
13
https://github.com/CrunchyData/crunchy-containers
17. Demo: Adding Monitoring
17
cat << EOF > collect-env.list
DATA_SOURCE_NAME=postgresql://postgres:password@postgres:5432/postgres?sslmode=disable
EOF
docker run
--env-file=collect-env.list
--network=pgnetwork
--name=collect
--hostname=collect
--detach crunchydata/crunchy-collect:centos7-10.4-2.0.0
docker volume create --driver local --name=prometheus
cat << EOF > prometheus-env.list
COLLECT_HOST=collect
SCRAPE_INTERVAL=5s
SCRAPE_TIMEOUT=5s
EOF
docker run
--publish 9090:9090
--env-file=prometheus-env.list
--volume prometheus:/data
--network=pgnetwork
--name=prometheus
--hostname=prometheus
--detach crunchydata/crunchy-prometheus:centos7-10.4-2.0.0
docker volume create --driver local --name=grafana
cat << EOF > grafana-env.list
ADMIN_USER=jkatz
ADMIN_PASS=password
PROM_HOST=prometheus
PROM_PORT=9090
EOF
docker run
--publish 3000:3000
--env-file=grafana-env.list
--volume grafana:/data
--network=pgnetwork
--name=grafana
--hostname=grafana
--detach crunchydata/crunchy-grafana:centos7-10.4-2.0.0
1. Set up the metric collector
2. Set up prometheus to store metrics 3. Set up grafana to visualize
18. • Explored what / why / how of containers
• Set up a PostgreSQL 10 instance
• Set up pgadmin4 to manage our PostgreSQL instance
• Set up monitoring to analyze performance of our system
• Of course, the next question naturally is:
Recap
18
20. • "Open-source system for
automating deployment, scaling,
and management of containerized
applications."
• Manage the full lifecycle of a
container
• Assists with scheduling, scaling,
failover, high-availability, and more
Kubernetes: Container Orchestration
20 Source: https://kubernetes.io
21. • Value of Kubernetes increases
exponentially as number of containers
increases
• Due to statefulness of databases,
Kubernetes requires more knowledge to
successfully operate a standard database
workload:
• Avoid scheduling and availability issues
for longer-running database containers
• Data continues to exist even if
container does not
When to Use Kubernetes
21
22. • Node: A Kubernetes "worker" machine that is able to run pods
• Pod: One or more running containers; the "atomic" unit of Kubernetes
• Service: The access point to a set of Pods
• ReplicaSet: Ensures that a specified number of replica Pods are running at a given time
• Deployment: A controller that ensures all running Pods / ReplicaSets match the desired
state of the execution environment (total number of pods, resources, etc.)
• Persistent Volume (PV): A storage API that enables information to persist after a Pod has
terminated
• Persistent Volume Claim (PVC): Enables a PV to be mounted to a container, includes
information such as amount of storage. Used for dynamic provisioning
Kubernetes Glossary Important for PostgreSQL
22 Source: https://kubernetes.io/docs/reference/glossary/?fundamental=true&storage=true
23. • Kubernetes provide the gateway to run your own "database-as-a-service:"
• Mass apply databases commands:
• Updates
• Backups / Restores
• ACL rule changes
• Scale up / down replicas
• Failover
23
PostgreSQL in a Kubernetes World
24. • Kubernetes is "turnkey" for stateless applications
• e.g. web servers
• Databases do maintain state: permanent storage
• Persistent Volumes (PV)
• Persistent Volume Claims (PVC)
PostgreSQL in a Kubernetes World
24
25. • Utilizes Operator framework initially launched by CoreOS to
help capture nuances of managing complex applications that
maintain state, e.g. databases
• Allows an administrator to run PostgreSQL-specific
commands to manage database clusters, including:
• Creating / Deleting a cluster (your own DBaaS)
• Scaling up / down replicas
• Failover
• Apply user policies to PostgreSQL instances
• Define what container resources to use (RAM, CPU, etc.)
• Smart pod deployments to nodes
• REST API
Crunchy PostgreSQL Operator
25
https://github.com/CrunchyData/postgres-operator
26. • Automation: Complex, multi-step DBA tasks
reduced to one-line commands
• Standardization: Many customizations, same
workflow
• Ease-of-Use: Simple CLI; UI in beta
• Scale
• Provision & manage clusters quickly
amongst thousands of instances
• Load balancing, disaster recovery, security
policies, deployment specifications
• Security: Sandboxed environments, RBAC,
mass grant/revoke policies
Why Use An Operator With PostgreSQL?
26
30. • Containers are no longer "new" - orchestration technologies have matured
• Debate with containers + databases: storage & management
• No different than virtual machines + databases
• Databases are still databases: need expertise to manage
• Stateful Sets vs. Deployments
• Database deployment automation flexibility
• Deploy your architecture to any number of clouds
• Monitoring: A new frontier
Containerized PostgreSQL: Looking Ahead
30
31. • Containers + PostgreSQL gives you:
• Easy-to-setup development environments
• Your own production database-as-a-service
• Tools to automate management of over 1000s of instances in short-
order
Conclusion
31