P R E S E N T A T I O N
PostgreSQL Training
Part 1:
Terminology
Some of what I have here is copy-pasta from https://www.postgresql.org/docs/current/glossary.html
with some extra information added from their respective pages as well as some of my own knowledge
and research.
Command
You will see this used all over the documentation, but it's never explained.
A command is a string that is sent to the server in order for it to do something for you. In psql, commands
are separated by semicolons.
A command generally is used to:
● fetch data
● modify data
● administer the PostgreSQL instance.
SELECT * FROM my_table;
CREATE EXTENSION pg_stat_statements;
BEGIN; DELETE FROM my_table;
Object
Any object that can be created with a CREATE command.
Most objects are specific to one database and are commonly known as local SQL objects.
Local Object
Schema Local Objects: Name and type are unique within each schema
● Relations
● Routines
● Data types
CREATE TABLE; CREATE VIEW; CREATE INDEX
CREATE FUNCTION
CREATE TYPE
Non-schema Local Objects
Local Objects: Name and type are unique within each database
● Extensions
● Data type casts
● Foreign data wrappers
CREATE EXTENSION
CREATE CAST
CREATE FOREIGN DATA WRAPPER
Global Objects
Exist entirely outside of any specific database. Names are unique within the database cluster.
● Roles
● Tablespaces
● Replication origins
● Subscriptions for logical replication
● Databases
CREATE ROLE
CREATE TABLESPACE
SELECT pg_replication_origin_create('origin_name')
CREATE SUBSCRIPTION
CREATE DATABASE
Tablespace
A named location on the server file system.
Allows database admins to define locations in the file system where the files representing the database
objects can be stored.
This is very useful if you have databases of varying sizes or for optimizing performance. You can put a
bigger, or less needed database on a slower disk, and a very active database on a faster disk.
Initially, a database cluster contains a single usable tablespace which is used as the default for all SQL
objects, called pg_default .
Tablespace
Some examples:
CREATE TABLESPACE tablespace_name LOCATION 'directory';
Then you can create an SQL object in it:
CREATE DATABASE name TABLESPACE tablespace_name;
CREATE TABLE name (id integer) TABLESPACE tablespace_name;
CREATE INDEX name ON table_name (column_name) TABLESPACE tablespace_name;
Database
A named collection of local SQL objects.
You need to connect to a database when connecting to a cluster.
The SQL standard calls databases “catalogs”, but there is no difference in practice.
There are 2 ways to create a database:
1. CREATE DATABASE dbname [OWNER rolename] from an SQL environment
2. createdb [-O rolename] dbname from the shell
There are 2 ways to destroy a database:
1. DROP DATABASE dbname from an SQL environment
2. dropdb dbname from the shell
Database Cluster
A collection of databases and global SQL objects, and their common static and dynamic metadata.
In PostgreSQL, the term cluster is also sometimes used to refer to an instance.
Instance
A group of backend and auxiliary processes that communicate using a common shared memory area.
One postmaster process manages the instance.
One instance manages exactly one database cluster with all its databases.
Many instances can run on the same server as long as their TCP ports do not conflict.
Postmaster
The very first process of an instance.
It manages the other processes and creates backend processes on demand.
Backend
Process of an instance which acts on behalf of a client session and handles its requests.
One backend process will be forked for each client session.
Session
A state that allows a client and backend to interact, communicating over a connection.
Connection
An established line of communication between a client process and a backend process, supporting a
session.
Usually over a network (TCP), but it can also work over a local Unix-domain socket.
Query
A type of command sent by a client to a backend.
Most of the time, a query will be retrieving data or modifying the database.
Relation
The generic term for all objects in a database that have
1. A name
2. A list of attributes defined in a specific order
Includes:
● Tables
● Sequences
● Views
● Foreign Tables
● Materialized views
● Composite types
● Indexes
Heap
This is not the memory heap of the application.
It is the data for a relation.
The heap is stored in one or more file segments.
File Segment
A physical file which stores data for a given relation.
The maximum file size is set with the --with-segsize configure option at compile time; the default is 1 GB.
If a relation exceeds the size limit, it is split into multiple segments.
To know more than you ever needed to, see: https://www.postgresql.org/docs/current/storage-file-layout.html
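To make the splitting rule concrete, here is a minimal Python sketch of how a relation's data might map onto segment files. It assumes the default 1 GB segment size and the naming described in the storage file layout page (the first segment carries the bare file name, later ones get a .1, .2, ... suffix). The function name and the "16384" file name are made up for illustration.

```python
SEGMENT_SIZE = 1024 ** 3  # assumed default of 1 GB; the real limit is fixed at build time


def segment_files(relation_bytes, base_name):
    """Return the names of the segment files needed to hold relation_bytes of data."""
    # Integer ceiling division; even an empty relation has one file.
    count = max(1, -(-relation_bytes // SEGMENT_SIZE))
    # The first segment uses the bare name, later segments get a numeric suffix.
    return [base_name if i == 0 else f"{base_name}.{i}" for i in range(count)]


print(segment_files(2_500_000_000, "16384"))
```

A 2.5 GB relation therefore needs three files under these assumptions.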
Table
A relation that stores a collection of tuples having a common data structure.
TOAST
Stands for: The Oversized-Attribute Storage Technique
A mechanism by which large attributes of table rows are split and stored in a secondary table, called the
TOAST table.
Each relation with large attributes has its own TOAST table.
Long string storage is generally where you will find TOAST being used.
Column
An attribute found in a table or view.
Tuple
A collection of attributes in a fixed order.
That order may be defined by the relation where the tuple is contained.
When talking about a table, a tuple is generally referred to as a row.
View
A relation that is defined by a SELECT statement, but has no storage of its own.
Any time a query references a view, the definition of the view is substituted into the query.
This substitution happens before the query planner or optimizer.
Materialized
The property that some information has been pre-computed and stored for later use, rather than
computing it on-the-fly.
Materialized View
Like a read-only table: you cannot modify its rows directly.
Update the results with REFRESH MATERIALIZED VIEW.
You can create indexes on it with CREATE INDEX.
You can also use ALTER MATERIALIZED VIEW and DROP MATERIALIZED VIEW, much like you can with a table.
Transaction
A combination of commands that must act as a single atomic command.
They all succeed or all fail as a single unit.
Their effects are not visible to other sessions until the transaction is complete.
Each transaction has a Transaction ID, or XID. A transaction is assigned its XID when it first
causes a database modification.
Manually started with BEGIN, and ended with COMMIT or ROLLBACK.
Commit
Finalizes a transaction, which makes its changes visible to other transactions and assures their durability.
Rollback
Rolls back all of the changes made since the beginning of the transaction.
Index
A relation that contains data derived from a table or materialized view.
Its internal structure supports fast retrieval of and access to the original data.
Write-Ahead Log (WAL)
The journal that keeps track of the changes in the database cluster.
Consists of multiple WAL records, written sequentially to WAL files.
There is only 1 WAL per cluster.
WAL Record
A low-level, binary description of an individual change.
Replayed in the event of a database failure.
Appending to this sequential log is more efficient than modifying the data files directly on every change.
WAL is also the basis of Postgres replication: records are streamed to the replicas and replayed.
A change to the cluster is considered persistent when its WAL record is written to disk.
WAL File
A.K.A. WAL segment. A.K.A. WAL segment file.
If the system crashes, the files are read in order, eventually restoring the last state of the database.
Barman ships these WAL files to allow you to restore your database to any point in time by replaying
the WAL records until the requested time has been reached.
Each WAL file can be released after a checkpoint writes all the changes to the corresponding data files.
Releasing the file can be done by either:
● Deleting it
● Changing its name so that it will be used in the future. A.K.A. recycling.
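The replay idea behind crash recovery and point-in-time restore can be sketched in a few lines of Python. This is only a toy model: real WAL records are low-level binary page changes, while the WalRecord class, its fields, and the key/value "database" here are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WalRecord:
    lsn: int               # position in the log; records are replayed in this order
    timestamp: int         # hypothetical commit time, in seconds
    key: str
    value: Optional[str]   # None stands for a delete


def replay(records, until=None):
    """Rebuild state by applying records in log order, optionally stopping at a point in time."""
    state = {}
    for rec in sorted(records, key=lambda r: r.lsn):
        if until is not None and rec.timestamp > until:
            break  # requested point in time reached
        if rec.value is None:
            state.pop(rec.key, None)
        else:
            state[rec.key] = rec.value
    return state


wal = [
    WalRecord(1, 10, "a", "1"),
    WalRecord(2, 20, "b", "2"),
    WalRecord(3, 30, "a", None),
]
print(replay(wal))            # full replay, as after a crash
print(replay(wal, until=20))  # restore to the state at time 20
```

Replaying everything gives the last state, while stopping early gives the state as of the requested time, which is the behaviour the Barman description above relies on.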
Checkpoint
A point in the WAL sequence at which it is guaranteed that the heap and index data files have been
updated with all information from shared memory modified before that checkpoint.
A checkpoint record is written and flushed to WAL to mark that point.
A checkpoint is started:
● Every checkpoint_timeout seconds
● If max_wal_size is about to be exceeded
● When calling CHECKPOINT
Whichever comes first.
Multi-version concurrency control (MVCC)
A mechanism designed to allow several transactions to be reading and writing the same rows without
one process causing other processes to stall.
A read will not block a write and a write will not block a read.
How MVCC works
Postgres stores transaction information with each row: xmin and xmax.
These are used to determine if a row is visible to a transaction or not.
Simplified, a row is visible to a transaction if xmin < XID < xmax (a row that has not been deleted has no xmax yet).
The exact rule depends on the isolation level.
By default (READ COMMITTED), as soon as a transaction is committed, its changes become visible to the following statements of all other transactions.
The SERIALIZABLE isolation level works as described above: the transaction keeps the view of the data it started with.
What actually happens
A row is given:
● xmin when it is INSERTed
● xmax when it is marked as DELETEd.
Updating a row is like inserting a new row and deleting the old one.
You can query xmin and xmax from any row.
SELECT xmin, xmax FROM my_table;
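The bookkeeping above can be sketched in Python. This is a deliberately simplified model, not how Postgres implements visibility: it uses the slide's xmin < XID < xmax rule, treats a missing xmax as "never deleted", and ignores commit status, snapshots, and isolation levels. RowVersion, visible, and update are illustrative names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    value: str
    xmin: int                   # XID of the transaction that inserted this version
    xmax: Optional[int] = None  # XID of the transaction that deleted it, if any


def visible(row, xid):
    """Simplified rule from the slide: xmin < XID < xmax."""
    return row.xmin < xid and (row.xmax is None or xid < row.xmax)


def update(table, old, new_value, xid):
    """An UPDATE marks the old version as deleted and inserts a new version."""
    old.xmax = xid
    table.append(RowVersion(new_value, xmin=xid))


table = [RowVersion("v1", xmin=100)]
update(table, table[0], "v2", xid=105)

print([r.value for r in table if visible(r, 103)])  # a transaction older than the update
print([r.value for r in table if visible(r, 110)])  # a transaction newer than the update
```

Both versions of the row live in the table at once. Each transaction simply picks the version whose xmin/xmax range contains its own XID, which is why readers and writers do not block each other.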
Dead tuples (Dead rows)
When a tuple (row) is no longer visible to any transaction, it is considered dead.
Time for a little quest
Transaction Exhaustion (Wraparound)
Transaction Exhaustion
Transaction IDs are 32 bits wide, so there are 2^32 (about 4.3 billion) possible XIDs.
Relative to the current transaction, the XIDs are split into 2 parts:
● XIDs in the past
● XIDs in the future
You can get your current XID with:
SELECT txid_current();
Transaction Exhaustion
When txid_current reaches 2^32, the next transaction will wrap around back to 0.
All of a sudden, all rows appear to be in the future.
All deleted rows are not deleted anymore and all created rows are not created.
This is referred to as Not a good time .
But it actually works a bit differently...
Transaction Exhaustion
Basically:
● Past XIDs are txid_current - 2^31 to txid_current - 1.
● Future XIDs are txid_current + 1 to txid_current + 2^31 - 1.
All of this arithmetic is modulo 2^32.
So we have ~2 billion transactions.
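About 1 million transactions before the Not a good time (3 million in recent releases), Postgres stops allowing new transactions.
Long before that, once a table's oldest unfrozen XID is older than autovacuum_freeze_max_age, Postgres starts a VACUUM on it, even if autovacuum is not enabled.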
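The past/future split can be expressed as a circular comparison. The Python sketch below mirrors the usual trick of treating the 32-bit difference of two XIDs as a signed number. It ignores the few special reserved XIDs, and the function name is made up for illustration.

```python
def xid_precedes(a, b):
    """True if XID a is in the past of XID b, using modulo-2^32 arithmetic."""
    diff = (a - b) % 2**32
    # Read the 32-bit difference as a signed number: top bit set means negative.
    return diff >= 2**31


print(xid_precedes(5, 10))          # True: 5 is in the past of 10
print(xid_precedes(10, 5))          # False: 10 is in the future of 5
print(xid_precedes(2**32 - 10, 5))  # True: still in the past after the counter wrapped
```

Because only the difference matters, every XID always has about 2 billion XIDs behind it and about 2 billion ahead of it, wherever the counter currently is.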
Transaction Exhaustion
What Not a good time looks like in the logs:
WARNING: database "mydb" must be vacuumed within x transactions
HINT: To avoid database shutdown, execute a database-wide VACUUM in "mydb"
Vacuum
It has 2 jobs:
1. Remove dead tuples from tables or materialized views.
2. Freeze tuples.
VACUUM phases, as reported by SELECT datname, phase FROM pg_stat_progress_vacuum;
1. initializing
2. scanning heap
3. vacuuming indexes
4. vacuuming heap
5. cleaning up indexes
6. truncating heap
7. performing final cleanup
See: https://www.postgresql.org/docs/current/progress-reporting.html
Tuple Freezing
Each row has a frozen bit which, if set, means that no matter what xmin and xmax are set
to, this row is always in the past.
Tuple freezing is the process of setting this bit on all tuples that are in the past of all the current
transactions.
Each table has a relfrozenxid value that is the xmin of the oldest row that is not frozen.
datfrozenxid is the oldest relfrozenxid for the database.
So datfrozenxid + 2^31 - 1 million is actually Not a good time.
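As a rough worked example of that formula, the following Python sketch computes how many XIDs are left before the stop point. It assumes the 1 million safety margin quoted in these slides (the exact margin depends on the Postgres version), and the function name is hypothetical.

```python
SAFETY_MARGIN = 1_000_000  # margin quoted in the slides; assumed, varies by version


def xids_until_stop(datfrozenxid, current_xid):
    """XIDs left before datfrozenxid + 2^31 - SAFETY_MARGIN is reached."""
    # XIDs consumed since the oldest unfrozen one, computed modulo 2^32
    consumed = (current_xid - datfrozenxid) % 2**32
    return max(0, 2**31 - SAFETY_MARGIN - consumed)


print(xids_until_stop(0, 500_000_000))              # plenty of room left
print(xids_until_stop(0, 2_146_483_648))            # stop point reached
print(xids_until_stop(1_800_000_000, 2_300_000_000))  # freezing moved the limit forward
```

On a real server the "consumed" figure corresponds to what age(datfrozenxid) reports for each row of pg_database, so advancing datfrozenxid through freezing is what buys more room.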
Transaction Exhaustion and Tuple Freezing
Visualized
Each frame marks datfrozenxid, datfrozenxid + 2^31, txid_current and txid_current + 2^31 on a number line.
1. datfrozenxid = 0, txid_current = 0.
2. datfrozenxid = 0, txid_current = ~500M. Transactions start.
3. datfrozenxid = 0, txid_current = ~900M. Everything still seems normal.
4. datfrozenxid = 0, txid_current = ~1 400M. We’re starting to get close to the database shutting down.
5. datfrozenxid = 0, txid_current = ~2 000M. Not a good time happened. At around 2 billion transactions, the database stops accepting connections and starts VACUUM.
6. datfrozenxid = ~350M, txid_current = ~2 000M. VACUUM starts freezing tuples. We can only connect again when it’s done.
7. datfrozenxid = ~700M, txid_current = ~2 000M. VACUUM is still freezing tuples. Still no new transactions allowed.
8. datfrozenxid = ~1 000M, txid_current = ~2 000M. VACUUM is still freezing tuples. Still no new transactions allowed.
9. datfrozenxid = ~1 400M, txid_current = ~2 000M. VACUUM is still freezing tuples. Still no new transactions allowed.
10. datfrozenxid = ~1 800M, txid_current = ~2 000M. VACUUM is done. New transactions are allowed to start again.
11. datfrozenxid = ~1 800M, txid_current = ~2 300M. Database is working again.
I made a little animation to help understand:
https://tuple-freezing-demo.angusd.com

  • 32. Write-Ahead Log (WAL) The journal that keeps track of the changes in the database cluster. Consists of multiple WAL records, written sequentially to WAL files. There is only 1 WAL per cluster.
  • 33. WAL Record A low-level, binary description of an individual change. Replayed in the event of a database failure. Appending to this sequential log is more efficient than writing every modified data page to disk immediately. Also a method of Postgres replication: records are streamed to the replicas and replayed there. A change to the cluster is considered persistent when its WAL record is written to disk.
  • 34. WAL File A.K.A. WAL segment. A.K.A. WAL segment file. If the system crashes, the files are read in order, eventually restoring the last state of the database. Barman ships these WAL files to allow you to restore your database to any point in time by replaying the WAL records until the requested time has been reached. Each WAL file can be released after a checkpoint writes all the changes to the corresponding data files. Releasing the file can be done by either: ● Deleting it ● Changing its name so that it will be used in the future. A.K.A. recycling.
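The idea of replaying WAL records up to a requested time can be sketched with a toy model. This is only an illustration: real WAL records are low-level binary descriptions of page changes, not key/value pairs, and the `WalRecord` class, its fields, and the `replay` function are all hypothetical names invented here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class WalRecord:
    timestamp: int          # hypothetical: time the change was logged
    key: str                # hypothetical: which item was changed
    value: Optional[str]    # new value, or None to model a deletion


def replay(records: List[WalRecord], target_time: int) -> Dict[str, str]:
    """Rebuild state by replaying records in log order up to target_time."""
    state: Dict[str, str] = {}
    for rec in records:  # records are assumed to be in the order they were written
        if rec.timestamp > target_time:
            break  # stop once the requested point in time has been reached
        if rec.value is None:
            state.pop(rec.key, None)
        else:
            state[rec.key] = rec.value
    return state


if __name__ == "__main__":
    log = [
        WalRecord(1, "a", "1"),
        WalRecord(2, "b", "2"),
        WalRecord(3, "a", None),   # "a" is deleted at time 3
    ]
    print(replay(log, 2))  # state as of time 2
    print(replay(log, 3))  # state as of time 3
```

Replaying the whole log corresponds to crash recovery; stopping early corresponds to restoring to a point in time.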
  • 35. Checkpoint A point in the WAL sequence at which it is guaranteed that the heap and index data files have been updated with all information from shared memory modified before that checkpoint. A checkpoint record is written and flushed to WAL to mark that point. A checkpoint is started: ● Every checkpoint_timeout seconds ● If max_wal_size is about to be exceeded ● When calling CHECKPOINT Whichever comes first.
  • 36. Multi-version concurrency control (MVCC) A mechanism designed to allow several transactions to be reading and writing the same rows without one process causing other processes to stall. A read will not block a write and a write will not block a read.
  • 37. How MVCC works Postgres stores transaction information with each row: xmin and xmax . These are used to determine if a row is visible to a transaction or not. A row is visible to a transaction if xmin < XID < xmax . This depends on the isolation level. By default, as soon as a transaction is committed, the new visibility is applied to all transactions. The SERIALIZABLE isolation level works as described before.
  • 38. What actually happens A row is given: ● xmin when it is INSERTed ● xmax when it is marked as DELETEd. Updating a row is like inserting a new row and deleting the old one. You can query xmin and xmax from any row. SELECT xmin, xmax FROM table;
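The xmin/xmax bookkeeping can be sketched as a small in-memory model. This is a deliberate simplification: it ignores whether transactions have committed, isolation levels, and XID wraparound, and it assumes a transaction sees its own changes (so the check uses `xmin <= xid`). It also uses `None` for "not deleted", whereas a live row in Postgres shows an xmax of 0. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowVersion:
    value: str
    xmin: int                   # XID of the transaction that inserted this version
    xmax: Optional[int] = None  # XID that deleted it; None while still live


def insert(table: List[RowVersion], value: str, xid: int) -> None:
    table.append(RowVersion(value, xmin=xid))


def delete(table: List[RowVersion], value: str, xid: int) -> None:
    # The row is only marked as deleted; the old version stays in the table.
    for row in table:
        if row.value == value and row.xmax is None:
            row.xmax = xid


def update(table: List[RowVersion], old: str, new: str, xid: int) -> None:
    # An update is modelled as delete-old plus insert-new in the same transaction.
    delete(table, old, xid)
    insert(table, new, xid)


def visible(row: RowVersion, xid: int) -> bool:
    # Simplified rule: inserted at or before xid, and not yet deleted as of xid.
    return row.xmin <= xid and (row.xmax is None or row.xmax > xid)


def snapshot(table: List[RowVersion], xid: int) -> List[str]:
    return [row.value for row in table if visible(row, xid)]


if __name__ == "__main__":
    t: List[RowVersion] = []
    insert(t, "a", 100)
    update(t, "a", "b", 101)
    print(snapshot(t, 100), snapshot(t, 101), len(t))
```

After the update, both versions are physically present; which one a reader sees depends only on its XID. The old version is what later becomes a dead tuple.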
  • 39. Dead tuples (Dead rows) When a tuple (row) is no longer visible to any transaction, it is considered dead.
  • 40. Time for a little quest Transaction Exhaustion (Wraparound)
  • 41. Transaction Exhaustion Transaction IDs are 32 bits wide, so you can have a total of 2^32 transactions. The XID s are split into 2 parts: ● XID s in the past ● XID s in the future You can get your current XID with: SELECT txid_current() .
  • 44. Transaction Exhaustion When txid_current reaches 2^32 , the next transaction will wrap around back to 0. All of a sudden, all rows appear to be in the future. All deleted rows are not deleted anymore and all created rows are not created. This is referred to as Not a good time . But it actually works a bit differently...
  • 45. Transaction Exhaustion Basically: ● Past XID s are txid_current - 2^31 to txid_current - 1 . ● Future XID s are txid_current + 1 to txid_current + 2^31 - 1 . So we have ~2 billion transactions. About 1 million transactions before the Not a good time (3 million in recent versions), Postgres stops assigning new XID s, so no new writing transactions can start until VACUUM has frozen the old tuples. Well before that point, Postgres forces anti-wraparound autovacuum runs, even if autovacuum is not enabled.
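The past/future split is a circular comparison: the difference between two XIDs is taken modulo 2^32 and read as a signed 32-bit number. A minimal sketch of that comparison, ignoring the few XID values Postgres reserves for special purposes (the function name is made up here):

```python
XID_SPACE = 2 ** 32  # XIDs are 32-bit counters
HALF = 2 ** 31       # half of the space is "past", the other half "future"


def xid_precedes(a: int, b: int) -> bool:
    """Return True if XID a is in the past relative to XID b."""
    diff = (a - b) % XID_SPACE
    # Read as a signed 32-bit value, anything >= 2^31 is negative,
    # which means a lies behind b on the circle.
    return diff >= HALF


if __name__ == "__main__":
    print(xid_precedes(5, 10))          # plain case, no wraparound
    print(xid_precedes(2 ** 32 - 5, 3)) # counter wrapped between the two XIDs
```

Because only the difference matters, a very large XID issued just before the counter wrapped is still correctly treated as older than a small XID issued just after.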
  • 46. Transaction Exhaustion What Not a good time looks like in the logs: WARNING: database "mydb" must be vacuumed within x transactions HINT: To avoid database shutdown, execute a database-wide VACUUM in "mydb"
  • 47. Vacuum It has 2 jobs: 1. Remove dead tuples from tables or materialized views. 2. Freeze tuples. VACUUM steps: SELECT datname, phase FROM pg_stat_progress_vacuum . 1. initializing 2. scanning heap 3. vacuuming indexes 4. vacuuming heap 5. cleaning up indexes 6. truncating heap 7. performing final cleanup See: https://www.postgresql.org/docs/current/progress-reporting.html Postgres Reporting
  • 48. Tuple Freezing Each row has a frozen bit which, if set, means that no matter what the xmin and xmax are set to, this row is always in the past. Tuple freezing is the process of setting this bit on all tuples that are in the past of all the current transactions. Each table has a relfrozenxid value that is the xmin of the oldest row that is not frozen. datfrozenxid is the oldest relfrozenxid for the database. So datfrozenxid + 2^31 - 1 million is actually Not a good time .
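The arithmetic behind that last formula can be sketched as a small helper that reports how many XIDs remain before the stop point. The 1 million margin is the figure used in these slides and is treated as an assumption (it differs between Postgres versions); the function name is hypothetical. On a live system, the distance between the current XID and datfrozenxid is what `age(datfrozenxid)` reports for each row of `pg_database`.

```python
XID_SPACE = 2 ** 32
HALF = 2 ** 31
SAFETY_MARGIN = 1_000_000  # margin used in the slides; assumed, version-dependent


def xids_until_shutdown(datfrozenxid: int, current_xid: int,
                        margin: int = SAFETY_MARGIN) -> int:
    """XIDs left before datfrozenxid + 2^31 - margin is reached."""
    # Age of the oldest unfrozen XID, computed on the 32-bit circle.
    age = (current_xid - datfrozenxid) % XID_SPACE
    return HALF - margin - age


if __name__ == "__main__":
    # Nothing frozen yet and ~2 billion XIDs consumed: very little headroom.
    print(xids_until_shutdown(0, 2_000_000_000))
    # After VACUUM advanced datfrozenxid to ~1.8 billion: plenty of room again.
    print(xids_until_shutdown(1_800_000_000, 2_000_000_000))
```

Freezing does not reset the XID counter; it moves datfrozenxid forward, which is what restores the headroom.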
  • 49. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = 0, txid_current = 0. [Timeline figure; markers: datfrozenxid, datfrozenxid + 2^31]
  • 50. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = 0, txid_current = ~500M. Transactions start. [Timeline figure; markers: datfrozenxid, datfrozenxid + 2^31, txid_current, txid_current + 2^31]
  • 51. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = 0, txid_current = ~900M. Everything still seems normal. [Timeline figure, same markers]
  • 52. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = 0, txid_current = ~1 400M. We’re starting to get close to the database shutting down. [Timeline figure, same markers]
  • 53. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = 0, txid_current = ~2 000M. Not a good time happened. At around 2 billion transactions, the database stops accepting new transactions until VACUUM has frozen old tuples. [Timeline figure, same markers]
  • 54. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = ~350M, txid_current = ~2 000M. VACUUM starts freezing tuples. New transactions are only possible again when it’s done. [Timeline figure, same markers]
  • 55. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = ~700M, txid_current = ~2 000M. VACUUM is still freezing tuples. Still no new transactions allowed. [Timeline figure, same markers]
  • 56. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = ~1 000M, txid_current = ~2 000M. VACUUM is still freezing tuples. Still no new transactions allowed. [Timeline figure, same markers]
  • 57. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = ~1 400M, txid_current = ~2 000M. VACUUM is still freezing tuples. Still no new transactions allowed. [Timeline figure, same markers]
  • 58. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = ~1 800M, txid_current = ~2 000M. VACUUM is done. New transactions are allowed to start again. [Timeline figure, same markers]
  • 59. Transaction Exhaustion and Tuple Freezing Visualized datfrozenxid = ~1 800M, txid_current = ~2 300M. Database is working again. [Timeline figure, same markers]
  • 60. Transaction Exhaustion and Tuple Freezing Visualized I made a little animation to help understand: https://tuple-freezing-demo.angusd.com Tuple Freezing Demo