• Ryan Booz & Grant Fritchey
• PGConf.EU 2023
Ryan Booz
PostgreSQL & DevOps Advocate
@ryanbooz
/in/ryanbooz
www.softwareandbooz.com
youtube.com/@ryanbooz
Grant Fritchey
DevOps Advocate
Microsoft Data Platform MVP
AWS Community Builder
grant@scarydba.com
scarydba.com
@gfritchey
linkedin.com/in/grant-fritchey
1. Configurations
2. MVCC and VACUUM
3. Security and Privileges
4. Datatypes
5. Indexes
6. Functions & Procedures
7. SQL Considerations
8. Backups
9. Migration Tools
• Application layer
• What tooling changes are necessary?
• Do you rely on SDK/framework features that are different in PostgreSQL
versions?
• If you have a multi-tenant application, could the migration be used
to improve tenant separation?
• Does your application rely on query hints?
• Why did you have them and how will you approach PostgreSQL with these
queries?
• Are partitions a better option in PostgreSQL?
• How will you manage partitions?
• Do you need to normalize some relationships more?
• Should you de-normalize a relationship that's been difficult
to maintain in its current form?
• Would aggregated queries benefit from materialized views, or from
a method for incrementally maintained materialized views?
• Every non-serverless Postgres should be tuned
• Edit postgresql.conf
Helpful Tuning Resources
https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
https://www.youtube.com/watch?v=IFIXpm73qtk
https://postgresqlco.nf/
shared_buffers:
• data cache
• a common starting point is about 25% of total RAM
work_mem:
• max memory for each plan operation
• each JOIN, GROUP BY, or ORDER BY step (and each parallel worker) can
use up to this amount, so one query may use several multiples of it
maintenance_work_mem:
• memory for background tasks like VACUUM and ANALYZE
max_connections:
• maximum number of concurrent connections to the server
wal_compression:
• Off (default)
• When turned on, uses pglz
• PG15 can also use lz4 or zstd
autovacuum:
• On (default)
• Do not turn off
• Might have to tune individual tables
autoanalyze (run by the autovacuum daemon):
• On (default)
• Do not turn off
• Might have to tune individual tables
• SHOW displays the current value
• SET allows settings to be modified (when possible)
• Examples:
• SHOW ALL – list all settings
• SHOW max_connections – configured connection limit
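A minimal sketch of inspecting and changing settings from a SQL session. The value used with SET is an illustrative assumption, not a recommendation.

```sql
-- Show one setting, then change it for the current session only.
SHOW work_mem;
SET work_mem = '64MB';        -- illustrative value
RESET work_mem;               -- back to the configured default

-- pg_settings exposes the same information as SHOW ALL in queryable form.
-- The "context" column indicates whether a change needs a reload or a restart.
SELECT name, setting, unit, context
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'max_connections');
```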
For the default Read Committed Isolation Level:
• Each query only sees transactions that were committed before it
started (ID less than current transaction)
• At the beginning of a query, PostgreSQL records:
• The transaction ID (xid) of the current command
• All transaction IDs that are in process
• Multi-statement transactions also record any of their own
completed statements from earlier in transaction
• In PostgreSQL, transaction IDs are assigned when
the first command/statement is run within a
transaction, not when BEGIN is executed.
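The late assignment of transaction IDs can be observed directly. This sketch assumes PostgreSQL 13 or later, where the pg_current_xact_id functions exist (older releases have txid_current_if_assigned and txid_current). In current releases a permanent XID is normally assigned at the first statement that needs one, typically the first write.

```sql
BEGIN;
-- Nothing has needed an XID yet, so this returns NULL.
SELECT pg_current_xact_id_if_assigned();

-- Asking for the current XID forces one to be assigned.
SELECT pg_current_xact_id();

-- Now the same XID is reported as assigned.
SELECT pg_current_xact_id_if_assigned();
ROLLBACK;
```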
• XIDs are stored in-table and on-row
INSERT
xmin | xmax | User Data
48   | 0    | Hello!
• xmin = current transaction ID
• xmax = zero (invalid)
DELETE
xmin | xmax | User Data
48   | 78   | Hello!          (dead tuple)
• xmin is unchanged
• xmax = current transaction ID
UPDATE
xmin | xmax | User Data
48   | 78   | Hello!          (dead tuple)
78   | 0    | Hello World!
Original tuple:
• xmin is unchanged
• xmax = current transaction ID
New tuple:
• xmin = current transaction ID
• xmax = zero
UPDATE (with the same value)
xmin | xmax | User Data
48   | 78   | Hello!          (dead tuple)
78   | 0    | Hello!
Original tuple:
• xmin is unchanged
• xmax = current transaction ID
New tuple:
• xmin = current transaction ID
• xmax = zero
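The hidden system columns can be selected by name, which makes the behavior above easy to watch. A small sketch; the table name mvcc_demo is hypothetical.

```sql
CREATE TABLE mvcc_demo (id int PRIMARY KEY, user_data text);

INSERT INTO mvcc_demo VALUES (1, 'Hello!');
-- xmin holds the inserting transaction's ID; xmax is 0.
SELECT xmin, xmax, ctid, user_data FROM mvcc_demo;

UPDATE mvcc_demo SET user_data = 'Hello World!' WHERE id = 1;
-- The visible row is now a new tuple: different xmin and a different ctid.
-- The old version stays in the page as a dead tuple until VACUUM removes it.
SELECT xmin, xmax, ctid, user_data FROM mvcc_demo;

DROP TABLE mvcc_demo;
```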
[Diagram: a heap page with line pointers lp1 to lp4 and an index; older tuple versions are marked dead as Tuple 3 and Tuple 4 are added]
[Diagrams: scans over tuples laid out as xmin, xmax, ID, user data, with and without an index]
Sequential Scan
SELECT a, b, c FROM t;
Sequential Scan
SELECT a, b, c FROM t
WHERE ID=10;
Index Scan
SELECT a, b, c FROM t
WHERE ID=10;
• Serves two main purposes:
• Free up space on pages used by dead tuples
• Mark rows as "frozen" if they have an XID sufficiently older
than the most recent XID, allowing the XID to be reused
again in the future
• Additionally, if all rows on a page are frozen, the page can
be marked frozen to speed up scans
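Both purposes can be monitored from the statistics views and the catalog. A sketch; the table name t follows the slides' own examples.

```sql
-- Dead tuples waiting for cleanup, and when each table was last vacuumed.
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

-- Age (in transactions) of the oldest unfrozen XID in each table.
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY xid_age DESC
LIMIT 10;

-- Manual run on a single table, with progress details.
VACUUM (VERBOSE, ANALYZE) t;
```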
[Diagram: heap page before and after VACUUM; the line pointers of dead tuples become unused and their space can be reused]
Global Autovacuum and Autoanalyze Settings
"Scale Factor" = fraction of a table's rows that must be modified before the process runs
"Threshold" = fixed number of rows added on top of the scale-factor calculation
(this prevents very small tables from taking resources)
autovacuum_vacuum_scale_factor = 0.2 (20%)
autovacuum_vacuum_threshold = 50
autovacuum_analyze_scale_factor = 0.1 (10%)
autovacuum_analyze_threshold = 50
• Autovacuum triggers at 20% data modification
• Autoanalyze triggers at 10% data modification
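The two settings combine as threshold + scale_factor × row count. With the defaults, a table of 10,000 rows is vacuumed after about 50 + 0.2 × 10,000 = 2,050 dead rows. The query below is a rough sketch that applies the global settings only; it deliberately ignores any per-table overrides.

```sql
-- Approximate number of dead rows at which autovacuum will process each table.
SELECT s.relname,
       s.n_dead_tup,
       current_setting('autovacuum_vacuum_threshold')::int
         + current_setting('autovacuum_vacuum_scale_factor')::float8
           * greatest(c.reltuples, 0) AS vacuum_trigger_rows  -- reltuples can be -1 before the first ANALYZE
FROM pg_stat_user_tables AS s
JOIN pg_class AS c ON c.oid = s.relid
ORDER BY s.n_dead_tup DESC;
```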
• Analogous to TF 2371, but per table!
Analyze at 5% modification:
ALTER TABLE t SET (autovacuum_analyze_scale_factor=0.05);
Analyze threshold above percentage:
ALTER TABLE t SET (autovacuum_analyze_threshold=500);
Setting maintenance_work_mem correctly:
• maintenance_work_mem is the amount of RAM that a
maintenance process like autovacuum can use to process pages of
data. If it is not sufficient, vacuuming (and other essential
maintenance processes) will take more time and fall behind.
• If autovacuum isn't completing its work regularly, check this setting!
Default setting = 64MB
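A sketch of checking and raising the setting. The sizes are illustrative assumptions; ALTER SYSTEM needs superuser rights. Autovacuum workers use autovacuum_work_mem instead when it is set (its default of -1 falls back to maintenance_work_mem).

```sql
SHOW maintenance_work_mem;

-- Persist a new value for the whole cluster and reload the configuration.
ALTER SYSTEM SET maintenance_work_mem = '256MB';   -- illustrative value
SELECT pg_reload_conf();

-- Or raise it for one session before a one-off maintenance task.
SET maintenance_work_mem = '1GB';                  -- illustrative value
VACUUM (VERBOSE) t;
RESET maintenance_work_mem;
```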
Server/Host (Firewall, Ports)
• Cluster on port 5432, with its own pg_hba.conf
• Cluster on port 5433, with its own pg_hba.conf
• Cluster on port 5434, with its own pg_hba.conf
[Diagram: each cluster holds its own roles and databases]
ROLE
• Own database objects
• Tables, Functions, Etc.
• Have cluster-level privileges (attributes)
• Granted privileges to database objects
• Can possibly grant privileges to other roles
• Users and groups are semantically the same as roles
• By Convention:
• User = LOGIN
• Group = NOLOGIN
• Since PostgreSQL 8.1, CREATE (USER|GROUP) is an
alias for CREATE ROLE
• All roles always have membership in PUBLIC role
(different from public schema)
• This is where CONNECT is usually granted, but can
be provided on a role-by-role basis
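A sketch of the user/group convention and of moving CONNECT from PUBLIC to specific roles. The role names, the password, and the database name appdb are hypothetical.

```sql
-- "Group" role: cannot log in, only collects privileges.
CREATE ROLE app_readers NOLOGIN;

-- "User" role: can log in and is a member of the group.
CREATE ROLE alice LOGIN PASSWORD 'change-me';
GRANT app_readers TO alice;

-- Remove the CONNECT privilege that PUBLIC holds by default,
-- then grant it role by role.
REVOKE CONNECT ON DATABASE appdb FROM PUBLIC;
GRANT CONNECT ON DATABASE appdb TO app_readers;
```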
• Extensions and other add-on objects are typically installed into
the public schema (built-in functions live in pg_catalog)
• <=PG14 all roles had CREATE on this schema by
default
• Schemas are a common security boundary
• Search path assumes PUBLIC schema by default
• Current best practice is to never create user objects in
the PUBLIC schema
• This will require setting the "search_path" to avoid
schema qualified object reference
• Easily limit what users are allowed to do with
PostgreSQL supplied objects vs. user/application
objects
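One way to follow this practice is a dedicated application schema plus an explicit search_path. A sketch under assumed names (app_owner, app).

```sql
-- Dedicated owner role and schema for application objects.
CREATE ROLE app_owner NOLOGIN;
CREATE SCHEMA app AUTHORIZATION app_owner;

-- On PG14 and earlier, every role can create objects in public by default.
REVOKE CREATE ON SCHEMA public FROM PUBLIC;

-- Resolve unqualified names against the application schema first.
-- Takes effect for new sessions of this role.
ALTER ROLE app_owner SET search_path = app, public;

-- Check what the current session is using.
SHOW search_path;
```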
• Object creator = owner
• Initial object access = Principle of Least Privilege
• Unless specifically granted ahead of time, objects are
owned and "accessible" by the creator/superuser only
• Roles can specify default privileges to GRANT for
each object type that they create
ALTER DEFAULT PRIVILEGES
GRANT SELECT ON TABLES TO public;
cituscon=> \ddp
         Default access privileges
  Owner   | Schema | Type  |     Access privileges
----------+--------+-------+---------------------------
 postgres |        | table | =r/postgres              +
          |        |       | postgres=arwdDxt/postgres
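Default privileges are most useful when scoped to the role that creates objects. The sketch below assumes a hypothetical migration role migrator, an application role app_rw, and a schema app, all of which already exist.

```sql
-- Tables that "migrator" creates in schema "app" from now on
-- are automatically readable and writable by app_rw.
ALTER DEFAULT PRIVILEGES FOR ROLE migrator IN SCHEMA app
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO app_rw;

-- Sequences behind identity/serial columns need their own grant.
ALTER DEFAULT PRIVILEGES FOR ROLE migrator IN SCHEMA app
    GRANT USAGE, SELECT ON SEQUENCES TO app_rw;

-- Default privileges only affect objects created afterwards;
-- existing tables still need an explicit grant.
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA app TO app_rw;
```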
• Explicit GRANTs can be added each time
• Default privileges can be set by each role
• Very useful for things like migrations with an
"admin" application role
SQL Server         | PostgreSQL
-------------------+-------------------------------------
BIT                | BOOLEAN
VARCHAR/NVARCHAR   | TEXT
DATETIME/DATETIME2 | TIMESTAMP (kind of)
DATETIMEOFFSET     | TIMESTAMP WITH TIME ZONE
UUID               | CHAR(16), uuid-ossp extension, PG14
VARBINARY          | BYTEA
(none)             | ARRAYS
(none)             | RANGE
(none)             | ENUM
• TEXT is suitable for 99% of string data
• Query memory doesn't work the same way,
therefore TEXT doesn't impact the allocated
memory like VARCHAR(MAX) in SQL Server
• Space efficient like VARCHAR
• Use TEXT with length constraint rather than
VARCHAR(n)
• Allows easy modification to the column length
without costly table modification
• NULL characters can cause issues
CREATE TABLE bluebox.film (
film_id bigint NOT NULL,
title varchar(50) NOT NULL,
overview varchar(2000) NOT NULL,
release_date date,
original_language varchar(16),
rating public.mpaa_rating,
popularity real,
vote_count integer,
vote_average real,
budget bigint,
revenue bigint,
runtime integer,
fulltext tsvector GENERATED ALWAYS AS (to_tsvector('english'::regconfig,
((COALESCE(title, ''::text) || ' '::text) || COALESCE(overview,
''::text)))) STORED
);
CREATE TABLE bluebox.film (
film_id bigint NOT NULL,
title TEXT NOT NULL,
overview TEXT NOT NULL,
release_date date,
original_language TEXT,
rating public.mpaa_rating,
popularity real,
vote_count integer,
vote_average real,
budget bigint,
revenue bigint,
runtime integer,
fulltext tsvector GENERATED ALWAYS AS (to_tsvector('english'::regconfig,
((COALESCE(title, ''::text) || ' '::text) || COALESCE(overview,
''::text)))) STORED
);
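Where a length limit is still wanted, it can be kept as a named CHECK constraint on a TEXT column. A sketch with a hypothetical table film_demo; the limits 50 and 100 are illustrative.

```sql
CREATE TABLE film_demo (
    film_id bigint PRIMARY KEY,
    title   text NOT NULL,
    CONSTRAINT film_demo_title_len CHECK (char_length(title) <= 50)
);

-- Changing the limit later swaps the constraint; the column itself is untouched.
ALTER TABLE film_demo DROP CONSTRAINT film_demo_title_len;
ALTER TABLE film_demo
    ADD CONSTRAINT film_demo_title_len
    CHECK (char_length(title) <= 100) NOT VALID;

-- Validate existing rows as a separate step.
ALTER TABLE film_demo VALIDATE CONSTRAINT film_demo_title_len;
```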
SELECT e'Hello world!\000';
-- ERROR: invalid byte sequence for encoding "UTF8": 0x00
INSERT INTO actor (first_name, last_name)
VALUES (e'Hello world!\000','Smith');
• PostgreSQL has DATE, TIME, TIMESTAMP, and
TIMESTAMPTZ
• Most of the time you likely want TIMESTAMPTZ unless
you have a specific need for just DATE or TIME
• Date math is simple
• Use INTERVALs to modify by arbitrary units of time
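A few illustrative expressions showing date math with intervals; none of them need a table.

```sql
-- Add or subtract arbitrary units of time.
SELECT now() + INTERVAL '1 day'              AS tomorrow;
SELECT now() - INTERVAL '2 hours 30 minutes' AS earlier;

-- Truncate to a unit, e.g. the start of the current month.
SELECT date_trunc('month', now()) AS month_start;

-- Subtracting two dates yields a number of days.
SELECT DATE '2024-03-01' - DATE '2024-02-01' AS days_between;  -- 29 (2024 is a leap year)

-- Subtracting two timestamps yields an INTERVAL.
SELECT TIMESTAMPTZ '2024-01-02 12:00+00'
     - TIMESTAMPTZ '2024-01-01 00:00+00' AS elapsed;
```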
CREATE TABLE bluebox.rental (
rental_id INT NOT NULL,
rental_start datetime2,
rental_end datetime2,
inventory_id INT NOT NULL,
customer_id INT NOT NULL,
staff_id INT NOT NULL,
last_update datetime2 DEFAULT now() NOT NULL
);
CREATE TABLE bluebox.rental (
rental_id INT NOT NULL,
rental_start timestamptz,
rental_end timestamptz,
inventory_id INT NOT NULL,
customer_id INT NOT NULL,
staff_id INT NOT NULL,
last_update timestamptz DEFAULT now() NOT NULL
);
• Currently supports INT, NUMERIC, DATE, and TIMESTAMP(TZ)
• A possible replacement for typical schema design of
start_date/end_date
• Indexable with GiST indexes
• Unique range type operators for comparisons like overlap and
contains
• Does present some (potentially) annoying query modifications that
need to use LOWER/UPPER to grab different ends of the range
CREATE TABLE bluebox.rental (
rental_id INT NOT NULL,
rental_start datetime2,
rental_end datetime2,
inventory_id INT NOT NULL,
customer_id INT NOT NULL,
staff_id INT NOT NULL,
last_update datetime2 DEFAULT now() NOT NULL
);
CREATE TABLE bluebox.rental (
rental_id INT NOT NULL,
rental_period tstzrange,
inventory_id INT NOT NULL,
customer_id INT NOT NULL,
staff_id INT NOT NULL,
last_update timestamptz DEFAULT now() NOT NULL
);
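A sketch of working with a tstzrange column, using a hypothetical table rental_demo rather than the slides' table. The exclusion constraint is an addition not mentioned in the slides; it relies on the btree_gist extension to combine = with && in one GiST index.

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE rental_demo (
    rental_id     int PRIMARY KEY,
    inventory_id  int NOT NULL,
    rental_period tstzrange NOT NULL,
    -- The same inventory item cannot have overlapping rental periods.
    EXCLUDE USING gist (inventory_id WITH =, rental_period WITH &&)
);

INSERT INTO rental_demo VALUES
    (1, 10, tstzrange('2023-12-01 10:00+00', '2023-12-04 10:00+00', '[)'));

-- "Contains": rentals active at a point in time.
SELECT rental_id FROM rental_demo
WHERE rental_period @> '2023-12-02 00:00+00'::timestamptz;

-- "Overlaps": rentals touching a window.
SELECT rental_id FROM rental_demo
WHERE rental_period && tstzrange('2023-12-03 00:00+00', '2023-12-10 00:00+00');

-- LOWER/UPPER pull out the two ends of the range.
SELECT lower(rental_period) AS rental_start,
       upper(rental_period) AS rental_end,
       upper(rental_period) - lower(rental_period) AS duration
FROM rental_demo;
```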
• Any defined data type can have a corresponding
array type
• Indexable with GIN and GIST indexes
• Not useful as a foreign key replacement
CREATE TABLE bluebox.film (
film_id bigint NOT NULL,
title varchar(50) NOT NULL,
overview varchar(2000) NOT NULL,
release_date date,
original_language varchar(16),
rating public.mpaa_rating,
popularity real,
vote_count integer,
vote_average real,
budget bigint,
revenue bigint,
runtime integer
);
CREATE TABLE bluebox.genre (
genre_id int NOT NULL,
genre varchar(64) NOT NULL
);
CREATE TABLE bluebox.film_genre (
film_id int NOT NULL,
genre_id int NOT NULL
);
CREATE TABLE bluebox.film (
film_id bigint NOT NULL,
title TEXT NOT NULL,
overview TEXT NOT NULL,
release_date date,
original_language TEXT,
genre_ids int[],
rating public.mpaa_rating,
popularity real,
vote_count integer,
vote_average real,
budget bigint,
revenue bigint,
runtime integer,
fulltext tsvector GENERATED ALWAYS AS (to_tsvector('english'::regconfig,
((COALESCE(title, ''::text) || ' '::text) || COALESCE(overview,
''::text)))) STORED
);
Name |Value
-----------------+-----------------------
film_id |1726
title |Iron Man
overview |After being held captiv
release_date |2008-05-02
genre_ids |{28,878,12}
original_language|en
rating |PG-13
popularity |63.926
vote_count |24957
vote_average |7.6
budget |140000000
revenue |585174222
runtime |126
SELECT * FROM film;
Name |Value
-----------------+-----------------------
film_id |1726
title |Iron Man
overview |After being held captiv
release_date |2008-05-02
genre_ids |{28,878,12}
original_language|en
rating |PG-13
popularity |63.926
vote_count |24957
vote_average |7.6
budget |140000000
revenue |585174222
runtime |126
SELECT * FROM film
WHERE genre_ids @> '{28}';
Name |Value
-----------------+-----------------------
film_id |1726
title |Iron Man
overview |After being held captiv
release_date |2008-05-02
genre_ids |{28,878,12}
original_language|en
rating |PG-13
popularity |63.926
vote_count |24957
vote_average |7.6
budget |140000000
revenue |585174222
runtime |126
SELECT * FROM film
WHERE genre_ids && '{28,100}';
Name |Value
-----------------+-----------------------
SELECT * FROM film
WHERE genre_ids @> '{28,100}';
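The containment (@>) and overlap (&&) operators above can be backed by a GIN index, and the array can still be expanded back into rows when genre names are needed. A sketch that assumes the bluebox.film and bluebox.genre tables shown earlier; the index name is hypothetical.

```sql
-- GIN index to support @> and && searches on the array column.
CREATE INDEX film_genre_ids_idx ON bluebox.film USING gin (genre_ids);

-- Expand the array to join against the lookup table.
-- No foreign key can enforce that each array element exists in genre.
SELECT f.title, g.genre
FROM bluebox.film AS f
CROSS JOIN LATERAL unnest(f.genre_ids) AS gid
JOIN bluebox.genre AS g ON g.genre_id = gid
WHERE f.film_id = 1726;
```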
• Creation defaults to B-tree
• Support for:
• multi-column
• partial (filtered)
• covering indexes
• CONCURRENTLY option for create, reindex, and drop
• Multiple types supported
• Custom implementations of index types supported
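The index features listed above, sketched against a hypothetical orders_demo table. All index names are illustrative.

```sql
CREATE TABLE orders_demo (
    order_id    bigint PRIMARY KEY,
    customer_id bigint NOT NULL,
    status      text NOT NULL,
    total       numeric(12,2) NOT NULL,
    created_at  timestamptz NOT NULL DEFAULT now()
);

-- Multi-column index (B-tree by default).
CREATE INDEX orders_demo_cust_created_idx
    ON orders_demo (customer_id, created_at);

-- Partial (filtered) index: only rows matching the predicate are indexed.
CREATE INDEX orders_demo_open_idx
    ON orders_demo (created_at) WHERE status = 'open';

-- Covering index: "total" is stored in the index but is not part of the key.
CREATE INDEX orders_demo_cust_total_idx
    ON orders_demo (customer_id) INCLUDE (total);

-- CONCURRENTLY avoids blocking writes; it cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY orders_demo_status_idx ON orders_demo (status);
REINDEX INDEX CONCURRENTLY orders_demo_status_idx;   -- PostgreSQL 12+
DROP INDEX CONCURRENTLY orders_demo_status_idx;
```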
• PostgreSQL uses HEAP-based storage
• No support for CLUSTERED/INDEX ORGANIZED
tables
• This does impact query plans because of random
page access
• Indexing strategies may change based on this
• B-Tree
• GiST/SP-GiST
• GIN
• BRIN
• HASH
• Trigram (extension)
• Classic tree-based indexing
• Default if no index type is specified
• Used for all traditional predicate operators
• Not useful for searching inside composite values (e.g. arrays, JSONB)
• PostgreSQL 13+ provides deduplication
• Very fast pre-hash of column values
• Can only be used for equality (=) operator
• Don't use prior to PostgreSQL 10
• Were not transaction safe, crash safe, or replicated
• Generalized Search Tree Index
• Well suited for overlapping data
• Intermediate index pages are typically bitmaps
Common Uses
• Full-text search capabilities
• Geospatial
• Generalized INverted Index
• Useful when multiple values point to one row on disk
• Maintenance heavy
Common Uses
• Arrays
• JSONB
• Range types
• Block Range INdex
• Stores high/low for each range of pages on disk
• Block defaults to 128 pages
Common Uses
• Data with natural ordering on disk
• Timestamps, Zip Codes, odometer-like readings
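The index type is chosen with USING. The sketch below creates one index of each non-default type on a hypothetical readings_demo table; the column choices are assumptions meant to match the common uses listed above.

```sql
CREATE TABLE readings_demo (
    reading_id   bigint GENERATED ALWAYS AS IDENTITY,
    device_key   text NOT NULL,
    recorded_at  timestamptz NOT NULL,
    valid_during tstzrange,
    tags         text[],
    payload      jsonb
);

-- HASH: equality lookups only.
CREATE INDEX ON readings_demo USING hash (device_key);

-- GiST: overlapping data such as ranges.
CREATE INDEX ON readings_demo USING gist (valid_during);

-- GIN: many values pointing at one row (arrays, JSONB).
CREATE INDEX ON readings_demo USING gin (tags);
CREATE INDEX ON readings_demo USING gin (payload);

-- BRIN: naturally ordered data; 128 pages per range is the default.
CREATE INDEX ON readings_demo USING brin (recorded_at)
    WITH (pages_per_range = 128);
```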
• Don't Repeat Yourself
• "If you can dream it, there is a SQL function for it in
PostgreSQL"
• Operators, Window, Aggregate, Arrays, and more!
• Functions can be overloaded
• Written in any supported language or compiled
variant
• Numerous query planner parameters available to
influence function execution
• Three modes: IMMUTABLE/STABLE/VOLATILE
SQL Server
• Scalar
• Inline Table Valued
• Multi-statement Table Valued
• No tuning hints for query planner
• T-SQL or CLR (less common)
Postgres
• Scalar
• Table/Set Types
• Triggers
• Query planner tuning parameters
• Any procedural language, but
typically SQL or PL/pgSQL
ROWS – Hint row count for planner (1,000 default)
COST – Estimated execution cost (per row)
PARALLEL – SAFE/RESTRICTED/UNSAFE
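A sketch of how these planner parameters appear in a function definition, together with an overload. The function name recent_days and the chosen values are hypothetical.

```sql
-- Set-returning SQL function with volatility, parallel safety, and a row estimate.
CREATE FUNCTION recent_days(p_days int)
RETURNS SETOF date
LANGUAGE sql
STABLE            -- depends on current_date, so it cannot be IMMUTABLE
PARALLEL SAFE
ROWS 30           -- planner's row-count hint (default is 1000)
AS $$
    SELECT current_date - g
    FROM generate_series(0, p_days - 1) AS g;
$$;

-- Overload: same name, different argument list.
CREATE FUNCTION recent_days()
RETURNS SETOF date
LANGUAGE sql
STABLE
PARALLEL SAFE
ROWS 7
AS $$
    SELECT * FROM recent_days(7);
$$;

SELECT * FROM recent_days(3);
```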
• Introduced in PostgreSQL 11
• Nobody calls them "SPROC"
• Cannot return a result set
• One INOUT parameter possible
• No plan cache improvement (mostly)
• Stored procedures can create sub-transactions
• Commit transactions that survive later failure
• Helpful in data processing/ETL routines
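A sketch of a procedure that commits after each batch, so earlier batches survive a later failure. The table and procedure names are hypothetical, and the CALL must not run inside an explicit transaction block.

```sql
CREATE TABLE etl_log_demo (
    batch_no  int NOT NULL,
    loaded_at timestamptz NOT NULL DEFAULT now()
);

CREATE PROCEDURE load_batches_demo(p_batches int)
LANGUAGE plpgsql
AS $$
BEGIN
    FOR i IN 1..p_batches LOOP
        INSERT INTO etl_log_demo (batch_no) VALUES (i);
        COMMIT;   -- work done so far is durable even if a later batch fails
    END LOOP;
END;
$$;

CALL load_batches_demo(3);
```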
SQL Server
• Complex logic
• Multi-language
• Contained in calling
transaction scope
• With some work, can return
results
Postgres
• Complex logic
• Multi-language
• Can issue COMMIT/ROLLBACK
and keep processing
https://wiki.postgresql.org/wiki/Don't_Do_This
• TEXT instead of VARCHAR
• UTF-8 by default. No need for two separate kinds of
text fields
• ICU in recent versions for better collation control
• Use IDENTITY columns rather than SERIAL
• Semantically the same under the covers for now (a
sequence is still created), but more control using
SQL standard IDENTITY
• By default, IDENTITY can be inserted into
• Must use ALWAYS to prevent this
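A sketch of an IDENTITY column declared with ALWAYS, using a hypothetical customer_demo table. With BY DEFAULT instead, the explicit insert would be accepted without the OVERRIDING clause.

```sql
CREATE TABLE customer_demo (
    customer_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    full_name   text NOT NULL
);

-- Normal insert: the value comes from the underlying sequence.
INSERT INTO customer_demo (full_name) VALUES ('Ada');

-- Supplying a value explicitly is rejected with ALWAYS...
-- INSERT INTO customer_demo (customer_id, full_name) VALUES (100, 'Bob');

-- ...unless the statement overrides it on purpose (useful during migrations).
INSERT INTO customer_demo (customer_id, full_name)
OVERRIDING SYSTEM VALUE
VALUES (100, 'Bob');
```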
• Collations are case sensitive by default
• PostgreSQL folds unquoted identifiers to lower case
• By convention, prefer snake_case rather than
CamelCase
• Using camel case will require double quotes – e.g. "CamelCase"
• PostgreSQL timestamp resolution is microseconds
• Some servers have finer resolution; SQL Server's datetime2
goes down to 100 nanoseconds
• Practically, it means little to real calculations,
but if anything relies on that precision (incoming
data), something will have to be done.
• INSERT… ON CONFLICT
• DO UPDATE - Simple, constraint aware method for doing
"upserts"
• DO NOTHING - Error-free inserts when data is duplicated
• MERGE (as of PG15)
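The three forms side by side, on a hypothetical inventory_demo table. The MERGE statement requires PostgreSQL 15 or later.

```sql
CREATE TABLE inventory_demo (
    sku text PRIMARY KEY,
    qty int NOT NULL
);

-- Upsert: insert, or add to the existing quantity on a key conflict.
INSERT INTO inventory_demo (sku, qty) VALUES ('A-1', 5)
ON CONFLICT (sku) DO UPDATE
    SET qty = inventory_demo.qty + EXCLUDED.qty;

-- Skip duplicates without raising an error.
INSERT INTO inventory_demo (sku, qty) VALUES ('A-1', 5)
ON CONFLICT (sku) DO NOTHING;

-- MERGE: the same idea driven by a source row set.
MERGE INTO inventory_demo AS t
USING (VALUES ('A-1', 3), ('B-2', 7)) AS s (sku, qty)
ON t.sku = s.sku
WHEN MATCHED THEN
    UPDATE SET qty = t.qty + s.qty
WHEN NOT MATCHED THEN
    INSERT (sku, qty) VALUES (s.sku, s.qty);
```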
• PostgreSQL can create prepared statements with
parameters
• There is no such thing as named parameters
• Limited to 65,535
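At the SQL level, parameters are positional ($1, $2, ...). A sketch against the bluebox.film table from earlier; the statement name film_by_id is hypothetical.

```sql
-- Parameters are referenced by position, not by name.
PREPARE film_by_id (bigint) AS
    SELECT film_id, title
    FROM bluebox.film
    WHERE film_id = $1;

EXECUTE film_by_id(1726);

-- Prepared statements live until the session ends or they are deallocated.
DEALLOCATE film_by_id;
```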
• PostgreSQL executes SQL as… SQL, not a
procedural variant
• To use procedural features, something like
pl/pgSQL needs to be invoked specifically
• Inline SQL variables must be used in an anonymous
code block, but they can't return a table/set of data
• Move towards functions or stored procedures
• Any supported and installed language can be used,
most without pre-compiling (although compiled will
be faster)
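A sketch of an anonymous PL/pgSQL block with local variables, assuming the bluebox.film table from earlier. It can report through RAISE NOTICE but cannot return rows; logic that must return a set belongs in a function instead.

```sql
DO $$
DECLARE
    v_cutoff date := current_date - 30;   -- local variable, 30 days back
    v_count  bigint;
BEGIN
    SELECT count(*) INTO v_count
    FROM bluebox.film
    WHERE release_date >= v_cutoff;

    RAISE NOTICE 'films released since %: %', v_cutoff, v_count;
END;
$$;
```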
• You are only as good as your last restore
• Full Database
• Point in Time
• Extensions
• pg_dump – By default, creates a SQL script for
recreating the database & inserting data
o You can dump to a binary (custom) format, making for easier transport and
management
o If you use pg_dump with a binary format, you'll need to use pg_restore to
recover
• pg_dumpall
o Runs pg_dump on all the databases
• Write Ahead Log (WAL) Archive
o wal_level must be set to replica or logical
o archive_mode must be set to on or always
o Edit the archive_command to supply a path to your WAL
archive
o Must have a base backup
• Base Backup
o pg_basebackup
o All databases are copied to target location & WAL start
point is set
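The WAL archive settings can be applied from SQL with ALTER SYSTEM. A sketch only: the archive directory /mnt/server/archivedir is a placeholder path, and the cp-based command is a simple illustration rather than a production-grade archiver.

```sql
SHOW wal_level;

ALTER SYSTEM SET wal_level = 'replica';
ALTER SYSTEM SET archive_mode = 'on';
ALTER SYSTEM SET archive_command =
    'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f';

-- wal_level and archive_mode only take effect after a server restart.
-- Once archiving is running, check that segments are being archived:
SELECT archived_count, last_archived_wal, failed_count
FROM pg_stat_archiver;
```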
• First, run pg_switch_wal to close the current WAL file and move it to the archive
• Stop the service
• Remove all files & subdirectories in your data directory
• Remove all files from pg_wal
• Restore databases from Base Backup
• Change the postgresql.conf file to start recovery: set restore_command and add recovery_target_time
• Create recovery.signal in the cluster data directory
• Start the service
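The SQL-visible parts of the procedure above can be sketched as follows. The configuration values shown in comments are placeholders, not taken from the slides.

```sql
-- Before stopping the service: close the current WAL segment so it is archived.
SELECT pg_switch_wal();

-- Settings added to postgresql.conf before restart (placeholder values):
--   restore_command      = 'cp /mnt/server/archivedir/%f %p'
--   recovery_target_time = '2023-12-01 12:00:00+00'

-- After starting the service with recovery.signal in place:
SELECT pg_is_in_recovery();      -- true while WAL is still being replayed

-- If replay paused at the recovery target, inspect the data,
-- then end recovery and open the cluster for writes.
SELECT pg_wal_replay_resume();
```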
• Before/After scripts
• Logfile support
• Data casting
• Error support
• Continuous migrations
• ...and more
• Initially created by Dimitri Fontaine
• CLI application
• Many migration formats
• CSV, DBF, IXF
• SQLite, MySQL, SQL Server
• PostgreSQL to PostgreSQL
• Limited Redshift support
• Perl-based migration tool
• Open-source with long-standing usage and support
• Export all major objects and data to SQL
• Migration assessment tooling
https://ora2pg.darold.net/
• Wide-ranging conversion tool for AWS hosted
databases
• Converts schema into target database
• Data is usually handled separately with extraction
agents
https://docs.aws.amazon.com/SchemaConversionTool/latest/userguide/CHAP_Welcome.html
• Transparent SQL Server –> PostgreSQL migration
• Open-source
• Available in AWS Aurora Postgres
• Stackgres.io
• Not fully ready for complex applications
https://babelfishpg.org/
• Full Convert
• https://www.fullconvert.com/
• PostgreSQL Migration Toolkit
• https://www.convert-in.com/pgskit.htm
• Managed service for end-to-end migrations
• Primarily intended to migrate into AWS Aurora DB
• Ongoing change replication
• Pricing is complex like many hosted services
https://aws.amazon.com/dms/
Migrating To PostgreSQL

Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfFerryKemperman
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...Technogeeks
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
How to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdfHow to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdfLivetecs LLC
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 

Recently uploaded (20)

Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
How to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdfHow to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdf
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 

Migrating To PostgreSQL

  • 1. • Ryan Booz & Grant Fritchey • PGConf.EU 2023
  • 3. DevOps Advocate Microsoft Data Platform MVP AWS Community Builder grant@scarydba.com scarydba.com @gfritchey linkedin.com/in/grant-fritchey
  • 4. 1. Configurations 2. MVCC and VACUUM 3. Security and Privileges 4. Datatypes 5. Indexes 7. Functions & Procedures 8. SQL Considerations 9. Backups 10. Migration Tools
  • 5.
  • 6.
  • 7.
  • 8. • Application layer • What tooling changes are necessary? • Do you rely on SDK/framework features that are different in PostgreSQL versions? • If you have a multi-tenant application, could the migration be used to improve tenant separation? • Does your application rely on query hints? • Why did you have them and how will you approach PostgreSQL with these queries?
  • 9. • Are partitions a better option in PostgreSQL? • How will you manage partitions? • Do you need to normalize some relationships more? • Should you de-normalize a relationship that's been difficult to maintain in its current form? • Aggregated queries and materialized views – or methods for doing incremental materialized views?
  • 10.
  • 11.
  • 12. • Every non-serverless Postgres should be tuned • Edit postgresql.conf https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server https://www.youtube.com/watch?v=IFIXpm73qtk https://postgresqlco.nf/ Helpful Tuning Resources
  • 13. shared_buffers: • data cache • at least 25% of total RAM work_mem: • max memory for each plan operation • JOINS, GROUP BY, ORDER BY can all have workers maintenance_work_mem: • memory for background tasks like VACUUM and ANALYZE max_connections: • exactly as it says
  • 14. wal_compression: • Off (default) • When simply turned on, uses pglz • PG15 can use lz4 or zstd autovacuum: • On (default) • Do not turn off • Might have to tune individual tables autoanalyze: • On (default) • Do not turn off • Might have to tune individual tables
  • 15. • SHOW will display the current value • SET allows settings to be modified (when possible) • Examples: • SHOW ALL – page through all settings • SHOW max_connections – configured connection limit
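As a minimal sketch of the SHOW and SET commands above (the values are illustrative only, not tuning advice), a session might inspect and change a setting like this. The `pg_settings` view is a standard catalog view; its `context` column indicates whether a setting can be changed per session or needs a reload or restart.

```sql
-- Display the current value of one setting
SHOW work_mem;

-- Change it for the current session only (illustrative value)
SET work_mem = '64MB';
SHOW work_mem;

-- Inspect several settings at once, including where they can be changed
SELECT name, setting, unit, context
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'maintenance_work_mem', 'max_connections');

-- Return to the configured default for this session
RESET work_mem;
```

Settings whose `context` is `postmaster` (for example `shared_buffers`) cannot be changed with SET and need a server restart.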
  • 16.
  • 17.
  • 18. For the default Read Committed Isolation Level: • Each query only sees transactions that were committed before it started (ID less than current transaction) • At the beginning of a query, PostgreSQL records: • The transaction ID (xid) of the current command • All transaction IDs that are in process • Multi-statement transactions also record any of their own completed statements from earlier in transaction
  • 19. • In PostgreSQL, transaction IDs are assigned when the first command/statement is run within a transaction, not when BEGIN is executed. • XIDs are stored in-table and on-row
  • 20. xmin xmax User Data 48 0 Hello! INSERT • xmin = current transaction ID • xmax = zero (invalid)
  • 21. xmin xmax User Data 48 78 Hello! DELETE • xmin is unchanged • xmax = current transaction ID Dead Tuple
  • 22. xmin xmax User Data 48 78 Hello! 78 0 Hello World! UPDATE Original tuple: • xmin is unchanged • xmax = current transaction ID New tuple: • xmin = current transaction ID • xmax = zero Dead Tuple
  • 23. xmin xmax User Data 48 78 Hello! 78 0 Hello! UPDATE (with the same value) Original tuple: • xmin is unchanged • xmax = current transaction ID New tuple: • xmin = current transaction ID • xmax = zero Dead Tuple
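The xmin/xmax behaviour on the preceding slides can be observed directly, because PostgreSQL exposes both as hidden system columns. The sketch below uses a hypothetical temp table named `mvcc_demo`; the transaction IDs you see will differ from the 48/78 used on the slides.

```sql
-- Hypothetical scratch table for observing row versions
CREATE TEMP TABLE mvcc_demo (msg text);

INSERT INTO mvcc_demo VALUES ('Hello!');

-- xmin = inserting transaction, xmax = 0 for a live tuple; ctid is the physical location
SELECT xmin, xmax, ctid, msg FROM mvcc_demo;

-- An UPDATE writes a new tuple; the old one becomes a dead tuple
UPDATE mvcc_demo SET msg = 'Hello World!';

-- The visible row now has a new xmin and (typically) a new ctid
SELECT xmin, xmax, ctid, msg FROM mvcc_demo;

-- The transaction ID of the current statement, for comparison
SELECT txid_current();
```

The dead tuple left behind by the UPDATE is no longer visible to queries, which is why only one row is returned even though two versions exist on the page until VACUUM runs.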
  • 24.
  • 25.
  • 26.
  • 27. lp1 lp2 Tuple 2 Tuple 1 Index Tuple 2 lp3 Tuple 3
  • 30.
  • 31.
  • 32. xmin xmax ID User data… Sequential Scan SELECT a, b, c FROM t;
  • 33. xmin xmax ID User data… Sequential Scan SELECT a, b, c FROM t WHERE ID=10; Index
  • 34. xmin xmax ID User data… Index Scan SELECT a, b, c FROM t WHERE ID=10; Index
  • 35. • Serves two main purposes: • Free up space on pages used by dead tuples • Mark rows as "frozen" if they have an XID sufficiently older than the most recent XID, allowing the XID to be reused again in the future • Additionally, if all rows on a page are frozen, the page can be marked frozen to speed up scans
  • 36.
  • 40.
  • 41. Global Autovacuum and Autoanalyze Settings • "Scale Factor" = percentage of rows modified before the process runs • "Threshold" = fixed number of rows added to the percentage calculation (this prevents very small tables from taking resources) • autovacuum_vacuum_scale_factor = 0.2 (20%) • autovacuum_vacuum_threshold = 50 • autovacuum_analyze_scale_factor = 0.1 (10%) • autovacuum_analyze_threshold = 50
  • 42. • Autovacuum triggers at 20% data modification • Autoanalyze triggers at 10% data modification • Analogous to TF 2371, but per table!
  • 43. Vacuum at 5% modification: ALTER TABLE t SET (autovacuum_vacuum_scale_factor=0.05) Vacuum threshold added on top of the percentage: ALTER TABLE t SET (autovacuum_vacuum_threshold=500)
  • 44. Setting maintenance_work_mem correctly: • maintenance_work_mem is the amount of RAM that a maintenance process like autovacuum can use to process pages of data. If it is not sufficient, vacuuming (and other essential maintenance processes) will take more time and fall behind. • If autovacuum isn't completing its work regularly, check this setting! Default setting = 64MB
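To make the scale factor and threshold arithmetic concrete: autovacuum considers a table once its dead tuples exceed threshold + scale_factor × row count, so with the defaults a 1,000,000-row table qualifies at 50 + 0.2 × 1,000,000 = 200,050 dead tuples. The query below is a rough sketch that applies the global settings to every user table; it assumes no per-table overrides and relies on the planner's row estimate in `pg_class.reltuples`, which may be stale or -1 for tables that have never been analyzed.

```sql
-- Approximate point at which autovacuum would consider each table,
-- using the global settings only (per-table storage parameters are ignored here)
SELECT s.relname,
       s.n_dead_tup,
       current_setting('autovacuum_vacuum_threshold')::int
         + current_setting('autovacuum_vacuum_scale_factor')::float8
           * greatest(c.reltuples, 0) AS vacuum_trigger_point,
       s.last_autovacuum,
       s.last_autoanalyze
FROM pg_stat_user_tables AS s
JOIN pg_class AS c ON c.oid = s.relid
ORDER BY s.n_dead_tup DESC;
```

Tables where `n_dead_tup` sits close to or above `vacuum_trigger_point` for long periods are candidates for the per-table ALTER TABLE settings shown on the slide.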
  • 45.
  • 46.
  • 47.
  • 48.
  • 49.
  • 50. Cluster Cluster Cluster Server/Host (Firewall, Ports) Port: 5432 pg_hba.conf Port: 5433 pg_hba.conf Port: 5434 pg_hba.conf
  • 53. • Own database objects • Tables, Functions, Etc. • Have cluster-level privileges (attributes) • Granted privileges to database objects • Can possibly grant privileges to other roles
  • 54. • Semantically the same as roles • By Convention: • User = LOGIN • Group = NOLOGIN • PostgreSQL 8.2+ CREATE (USER|GROUP) is an alias
  • 55. • All roles always have membership in PUBLIC role (different from public schema) • This is where CONNECT is usually granted, but can be provided on a role-by-role basis
  • 56. • Standard PostgreSQL functions are typically installed into the PUBLIC schema • <=PG14 all roles had CREATE on this schema by default • Schemas are a common security boundary • Search path assumes PUBLIC schema by default
  • 57. • Current best practice is to never create user objects in the PUBLIC schema • This will require setting the "search_path" to avoid schema qualified object reference • Easily limit what users are allowed to do with PostgreSQL supplied objects vs. user/application objects
  • 58. • Object creator = owner • Initial object access = Principle of Least Privilege • Unless specifically granted ahead of time, objects are owned and "accessible" by the creator/superuser only • Roles can specify default privileges to GRANT for each object type that they create
  • 61. ALTER DEFAULT PRIVILEGES GRANT SELECT ON TABLES TO public; cituscon=> \ddp Default access privileges Owner | Schema | Type | Access privileges ----------+--------+-------+--------------------------- postgres | | table | =r/postgres + | | | postgres=arwdDxt/postgres
  • 62. • Explicit GRANTs can be added each time • Default privileges can be set by each role • Very useful for things like migrations with an "admin" application role
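The role, schema, and default-privilege ideas from the security slides could be combined roughly as follows. All role and schema names here (`app_readonly`, `app_admin`, `report_user`, `app`) are hypothetical, the passwords are placeholders, and the script assumes it is run by a superuser or a role allowed to create roles.

```sql
-- By convention: a "group" role cannot log in, a "user" role can
CREATE ROLE app_readonly NOLOGIN;
CREATE ROLE app_admin LOGIN PASSWORD 'change-me';
CREATE ROLE report_user LOGIN PASSWORD 'change-me' IN ROLE app_readonly;

-- Keep application objects out of the public schema
CREATE SCHEMA app AUTHORIZATION app_admin;
GRANT USAGE ON SCHEMA app TO app_readonly;

-- Every table app_admin creates in schema app from now on is readable by the group
ALTER DEFAULT PRIVILEGES FOR ROLE app_admin IN SCHEMA app
    GRANT SELECT ON TABLES TO app_readonly;

-- On PG14 and earlier, remove the default CREATE privilege on public
REVOKE CREATE ON SCHEMA public FROM PUBLIC;

-- Avoid schema-qualifying every object reference
ALTER ROLE report_user SET search_path = app;
```

Default privileges only affect objects created after the ALTER DEFAULT PRIVILEGES statement; existing tables still need an explicit GRANT.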
  • 63.
  • 64.
  • 65. SQL Server → PostgreSQL • BIT → BOOLEAN • VARCHAR/NVARCHAR → TEXT • DATETIME/DATETIME2 → TIMESTAMP (kind of) • DATETIMEOFFSET → TIMESTAMP WITH TIME ZONE • UUID: CHAR(16), uuid-ossp extension, PG14 • VARBINARY → BYTEA • PostgreSQL types with no SQL Server counterpart listed: ARRAYS, RANGE, ENUM
  • 66. • TEXT is suitable for 99% of string data • Query memory doesn't work the same way, therefore TEXT doesn't impact the allocated memory like VARCHAR(MAX) in SQL Server • Space efficient like VARCHAR
  • 67. • Use TEXT with length constraint rather than VARCHAR(n) • Allows easy modification to the column length without costly table modification • NULL characters can cause issues
  • 68. CREATE TABLE bluebox.film ( film_id bigint NOT NULL, title varchar(50) NOT NULL, overview varchar(2000) NOT NULL, release_date date, original_language varchar(16), rating public.mpaa_rating, popularity real, vote_count integer, vote_average real, budget bigint, revenue bigint, runtime integer, fulltext tsvector GENERATED ALWAYS AS (to_tsvector('english'::regconfig, ((COALESCE(title, ''::text) || ' '::text) || COALESCE(overview, ''::text)))) STORED );
  • 69. CREATE TABLE bluebox.film ( film_id bigint NOT NULL, title TEXT NOT NULL, overview TEXT NOT NULL, release_date date, original_language TEXT, rating public.mpaa_rating, popularity real, vote_count integer, vote_average real, budget bigint, revenue bigint, runtime integer, fulltext tsvector GENERATED ALWAYS AS (to_tsvector('english'::regconfig, ((COALESCE(title, ''::text) || ' '::text) || COALESCE(overview, ''::text)))) STORED );
  • 70. SELECT e'Hello world!\000'; -- ERROR: invalid byte sequence for encoding "UTF8": 0x00 INSERT INTO actor (first_name, last_name) VALUES (e'Hello world!\000','Smith');
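The advice to use TEXT with a length constraint rather than VARCHAR(n) might look like the sketch below. The table name `film_text_demo` and constraint name `film_title_len` are made up for illustration; the point is that the limit lives in a CHECK constraint that can be swapped without changing the column type.

```sql
-- Hypothetical table: TEXT column with the length rule in a named CHECK constraint
CREATE TABLE film_text_demo (
    film_id bigint PRIMARY KEY,
    title   text NOT NULL
        CONSTRAINT film_title_len CHECK (char_length(title) <= 50)
);

-- Raising the limit later: replace the constraint instead of altering the column type
ALTER TABLE film_text_demo DROP CONSTRAINT film_title_len;

ALTER TABLE film_text_demo
    ADD CONSTRAINT film_title_len CHECK (char_length(title) <= 100) NOT VALID;

-- Validate existing rows as a separate step
ALTER TABLE film_text_demo VALIDATE CONSTRAINT film_title_len;
```

Adding the constraint as NOT VALID and validating it separately is one way to keep the initial ALTER TABLE short on a large table.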
  • 71. • PostgreSQL has DATE, TIME, TIMESTAMP, and TIMESTAMPTZ • Most of the time you likely want TIMESTAMPTZ unless you have a specific need for just DATE or TIME • Date math is simple • Use INTERVALs to modify by arbitrary units of time
  • 72. CREATE TABLE bluebox.rental ( rental_id INT NOT NULL, rental_start datetime2, rental_end datetime2, inventory_id INT NOT NULL, customer_id INT NOT NULL, staff_id INT NOT NULL, last_update datetime2 DEFAULT now() NOT NULL );
  • 73. CREATE TABLE bluebox.rental ( rental_id INT NOT NULL, rental_start timestamptz, rental_end timestamptz, inventory_id INT NOT NULL, customer_id INT NOT NULL, staff_id INT NOT NULL, last_update timestamptz DEFAULT now() NOT NULL );
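A small sketch of the date math and INTERVAL usage mentioned above; the timestamps are arbitrary sample values.

```sql
SELECT now() + INTERVAL '3 days'              AS due_back,        -- add an arbitrary unit
       now() - INTERVAL '1 month 2 hours'     AS earlier,         -- mixed units in one interval
       date_trunc('day', now())               AS start_of_today,  -- truncate to a boundary
       TIMESTAMPTZ '2023-12-15 10:00+00'
         - TIMESTAMPTZ '2023-12-12 08:30+00'  AS rental_length;   -- subtraction yields an interval
```

Subtracting two TIMESTAMPTZ values returns an INTERVAL, which is the rough counterpart of reaching for DATEDIFF in T-SQL.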
  • 74. • Currently supports INT, NUMERIC, DATE, and TIMESTAMP(TZ) • A possible replacement for typical schema design of start_date/end_date • Indexable with GIST indexes • Unique range type operators for comparisons like overlap and contains • Does present some (potentially) annoying query modifications that need to use LOWER/UPPER to grab different ends of the range
  • 75. CREATE TABLE bluebox.rental ( rental_id INT NOT NULL, rental_start datetime2, rental_end datetime2, inventory_id INT NOT NULL, customer_id INT NOT NULL, staff_id INT NOT NULL, last_update datetime2 DEFAULT now() NOT NULL );
  • 76. CREATE TABLE bluebox.rental ( rental_id INT NOT NULL, rental_period tstzrange, inventory_id INT NOT NULL, customer_id INT NOT NULL, staff_id INT NOT NULL, last_update timestamptz DEFAULT now() NOT NULL );
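To illustrate the range operators and the LOWER/UPPER calls mentioned above, here is a sketch against a hypothetical, simplified `rental_range_demo` table (not the `bluebox.rental` table from the slides). The sample period is invented.

```sql
-- Hypothetical simplified table with a range column and a GiST index on it
CREATE TABLE rental_range_demo (
    rental_id     int PRIMARY KEY,
    rental_period tstzrange NOT NULL
);

CREATE INDEX rental_range_demo_period_idx
    ON rental_range_demo USING gist (rental_period);

INSERT INTO rental_range_demo
VALUES (1, tstzrange('2023-12-12 08:30+00', '2023-12-15 10:00+00', '[)'));

-- Contains: which rentals were active at a given instant?
SELECT rental_id
FROM rental_range_demo
WHERE rental_period @> TIMESTAMPTZ '2023-12-13 00:00+00';

-- Overlap: which rentals touch a given window?
SELECT rental_id
FROM rental_range_demo
WHERE rental_period && tstzrange('2023-12-14 00:00+00', '2023-12-20 00:00+00');

-- Getting the two ends back out requires lower()/upper()
SELECT lower(rental_period) AS rental_start,
       upper(rental_period) AS rental_end,
       upper(rental_period) - lower(rental_period) AS rental_length
FROM rental_range_demo;
```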
  • 77. • Any defined data type can have a corresponding array type • Indexable with GIN and GIST indexes • Not useful as a foreign key replacement
  • 78. CREATE TABLE bluebox.film ( film_id bigint NOT NULL, title varchar(50) NOT NULL, overview varchar(2000) NOT NULL, release_date date, original_language varchar(16), rating public.mpaa_rating, popularity real, vote_count integer, vote_average real, budget bigint, revenue bigint, runtime integer ); CREATE TABLE bluebox.genre ( genre_id int NOT NULL, genre varchar(64) NOT NULL ); CREATE TABLE bluebox.film_genre ( film_id int NOT NULL, genre_id int NOT NULL );
  • 79. CREATE TABLE bluebox.film ( film_id bigint NOT NULL, title TEXT NOT NULL, overview TEXT NOT NULL, release_date date, original_language TEXT, genre_ids int[], rating public.mpaa_rating, popularity real, vote_count integer, vote_average real, budget bigint, revenue bigint, runtime integer, fulltext tsvector GENERATED ALWAYS AS (to_tsvector('english'::regconfig, ((COALESCE(title, ''::text) || ' '::text) || COALESCE(overview, ''::text)))) STORED );
  • 80. Name |Value -----------------+----------------------- film_id |1726 title |Iron Man overview |After being held captiv release_date |2008-05-02 genre_ids |{28,878,12} original_language|en rating |PG-13 popularity |63.926 vote_count |24957 vote_average |7.6 budget |140000000 revenue |585174222 runtime |126 SELECT * FROM film;
  • 81. Name |Value -----------------+----------------------- film_id |1726 title |Iron Man overview |After being held captiv release_date |2008-05-02 genre_ids |{28,878,12} original_language|en rating |PG-13 popularity |63.926 vote_count |24957 vote_average |7.6 budget |140000000 revenue |585174222 runtime |126 SELECT * FROM film WHERE genre_ids @> '{28}';
  • 82. Name |Value -----------------+----------------------- film_id |1726 title |Iron Man overview |After being held captiv release_date |2008-05-02 genre_ids |{28,878,12} original_language|en rating |PG-13 popularity |63.926 vote_count |24957 vote_average |7.6 budget |140000000 revenue |585174222 runtime |126 SELECT * FROM film WHERE genre_ids && '{28,100}';
  • 83. Name |Value -----------------+----------------------- SELECT * FROM film WHERE genre_ids @> '{28,100}';
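The array queries on the last few slides can be reproduced with a cut-down, hypothetical `film_array_demo` table. The sketch also adds the GIN index that makes the `@>` and `&&` operators indexable, and an `unnest` query for turning the array back into rows.

```sql
-- Hypothetical cut-down version of the film table
CREATE TABLE film_array_demo (
    film_id   bigint PRIMARY KEY,
    title     text NOT NULL,
    genre_ids int[]
);

CREATE INDEX film_array_demo_genre_idx
    ON film_array_demo USING gin (genre_ids);

INSERT INTO film_array_demo VALUES (1726, 'Iron Man', '{28,878,12}');

-- Contains: the film must have genre 28
SELECT title FROM film_array_demo WHERE genre_ids @> '{28}';

-- Overlap: the film must have genre 28 OR genre 100
SELECT title FROM film_array_demo WHERE genre_ids && '{28,100}';

-- Contains both: genre 28 AND genre 100, so Iron Man does not match
SELECT title FROM film_array_demo WHERE genre_ids @> '{28,100}';

-- Expand the array into one row per genre, e.g. to join to a genre table
SELECT f.title, g.genre_id
FROM film_array_demo AS f
CROSS JOIN LATERAL unnest(f.genre_ids) AS g(genre_id);
```

Because an array element cannot carry a foreign key constraint, nothing here stops a `genre_id` that does not exist in a genre table, which is the trade-off the slides point out.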
  • 84.
  • 85. • Creation defaults to B-tree • Support for: • multi-column • partial (filtered) • covering indexes
  • 86. • CONCURRENT create, reindex, and drop • Multiple types supported • Custom implementations of index types supported
  • 87. • PostgreSQL uses HEAP-based storage • No support for CLUSTERED/INDEX ORGANIZED tables • This does impact query plans because of random page access • Indexing strategies may change based on this
  • 88. • B-Tree • GiST/SP-GiST • GIN • BRIN • HASH • Trigram (extension)
  • 89. • Classic tree-based indexing • Default if no index type is specified • Used for all traditional predicate operators • Cannot index composite types (e.g., Arrays, JSONB) • PostgreSQL 13+ provides deduplication
  • 90. • Very fast pre-hash of column values • Can only be used for equality (=) operator • Don't use prior to PostgreSQL 10 • Were not transaction safe, crash safe, or replicated
  • 91. • Generalized Search Tree Index • Well suited for overlapping data • Intermediate index pages are typically bitmaps Common Uses • Full-text search capabilities • Geospatial
  • 92. • Generalized INverted Index • Useful when multiple values point to one row on disk • Maintenance heavy Common Uses • Arrays • JSONB • Range types
  • 93. • Block Range INdex • Stores high/low for each range of pages on disk • Block defaults to 128 pages Common Uses • Data with natural ordering on disk • Timestamps, Zip Codes, odometer-like readings
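A sketch of how the index features and types above are spelled in DDL, using a hypothetical `idx_demo` table. Which of these indexes is actually worthwhile depends entirely on the workload; the statements only show the syntax.

```sql
-- Hypothetical table used only to show index syntax
CREATE TABLE idx_demo (
    id          bigint PRIMARY KEY,
    customer_id int NOT NULL,
    status      text NOT NULL,
    created_at  timestamptz NOT NULL DEFAULT now(),
    payload     jsonb
);

-- Default B-tree, multi-column
CREATE INDEX idx_demo_cust_created ON idx_demo (customer_id, created_at);

-- Partial (filtered) index
CREATE INDEX idx_demo_open ON idx_demo (created_at) WHERE status = 'open';

-- Covering index with a non-key column (PostgreSQL 11+)
CREATE INDEX idx_demo_cust_cover ON idx_demo (customer_id) INCLUDE (status);

-- BRIN for naturally ordered data such as insert timestamps
CREATE INDEX idx_demo_created_brin ON idx_demo USING brin (created_at);

-- GIN for JSONB (also used for arrays)
CREATE INDEX idx_demo_payload_gin ON idx_demo USING gin (payload);

-- Hash, equality lookups only
CREATE INDEX idx_demo_status_hash ON idx_demo USING hash (status);

-- Build without blocking writes; cannot run inside a transaction block
CREATE INDEX CONCURRENTLY idx_demo_status ON idx_demo (status);
```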
  • 94.
  • 95. • Don't Repeat Yourself • "If you can dream it, there is a SQL function for it in PostgreSQL" • Operators, Window, Aggregate, Arrays, and more!
  • 96. • Functions can be overloaded • Written in any supported language or compiled variant • Numerous query planner parameters available to influence function execution • Three modes: IMMUTABLE/STABLE/VOLATILE
  • 97. SQL Server • Scalar • Inline Table Valued • Multi-statement Table Valued • No tuning hints for query planner • T-SQL or CLR (less common) Postgres • Scalar • Table/Set Types • Triggers • Query planner tuning parameters • Any procedural language, but typically SQL or PL/pgSQL
  • 98. ROWS – Hint row count for planner (1,000 default) COST – Estimated execution cost (per row) PARALLEL – SAFE/RESTRICTED/UNSAFE
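The volatility modes and planner parameters from the last few slides are declared as part of CREATE FUNCTION. The two functions below are hypothetical examples; the COST and ROWS numbers are arbitrary and would need to reflect the real function.

```sql
-- Scalar function: same input always gives the same output, so IMMUTABLE
CREATE FUNCTION title_length(title text)
RETURNS int
LANGUAGE sql
IMMUTABLE
PARALLEL SAFE
COST 1
AS $$ SELECT char_length(title) $$;

-- Set-returning function: ROWS tells the planner to expect about 10 rows
-- instead of the default estimate of 1,000
CREATE FUNCTION first_n(n int)
RETURNS SETOF int
LANGUAGE sql
IMMUTABLE
ROWS 10
AS $$ SELECT generate_series(1, n) $$;

SELECT title_length('Iron Man');
SELECT * FROM first_n(5);
```

Declaring a function IMMUTABLE or STABLE when it is really VOLATILE can produce wrong results, so the mode should describe what the function does rather than what would make the plan faster.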
  • 99. • Introduced in PostgreSQL 11 • Nobody calls them "SPROC" • Cannot return a result set • One INOUT parameter possible • No plan cache improvement (mostly)
  • 100. • Stored procedures can create sub-transactions • Commit transactions that survive later failure • Helpful in data processing/ETL routines
  • 101. SQL Server • Complex logic • Multi-language • Contained in calling transaction scope • With some work, can return results Postgres • Complex logic • Multi-language • Can issue COMMIT/ROLLBACK and keep processing
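As a sketch of a procedure that commits as it goes (PostgreSQL 11+), the example below logs one row per batch and commits after each, so earlier batches survive a later failure. The table `batch_log` and procedure `load_in_batches` are hypothetical names.

```sql
-- Hypothetical log table
CREATE TABLE batch_log (
    batch_no  int,
    logged_at timestamptz DEFAULT now()
);

CREATE PROCEDURE load_in_batches(batches int)
LANGUAGE plpgsql
AS $$
BEGIN
    FOR i IN 1..batches LOOP
        INSERT INTO batch_log (batch_no) VALUES (i);
        COMMIT;  -- ends the current transaction; the next statement starts a new one
    END LOOP;
END;
$$;

-- Must be called outside an explicit transaction block for COMMIT to be allowed
CALL load_in_batches(3);

SELECT batch_no, logged_at FROM batch_log ORDER BY batch_no;
```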
  • 102.
  • 103.
  • 105. • TEXT instead of VARCHAR • UTF-8 by default. No need for two separate kinds of text fields • ICU in recent versions for better collation control
  • 106. • Use IDENTITY columns rather than SERIAL • Semantically the same under the covers for now (a sequence is still created), but more control using SQL standard IDENTITY • By default, IDENTITY can be inserted into • Must use ALWAYS to prevent this
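The difference between the two IDENTITY variants can be sketched with a hypothetical `identity_demo` table. With GENERATED ALWAYS an explicit value is rejected unless the INSERT says OVERRIDING SYSTEM VALUE; with GENERATED BY DEFAULT an explicit value is simply accepted.

```sql
-- Hypothetical table using the stricter ALWAYS variant
CREATE TABLE identity_demo (
    id   bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name text NOT NULL
);

-- Normal insert: id comes from the underlying sequence
INSERT INTO identity_demo (name) VALUES ('first');

-- This would fail, because the column is GENERATED ALWAYS:
-- INSERT INTO identity_demo (id, name) VALUES (10, 'explicit');

-- Deliberately supplying a value, e.g. when loading migrated data
INSERT INTO identity_demo (id, name)
OVERRIDING SYSTEM VALUE
VALUES (10, 'explicit');

SELECT id, name FROM identity_demo ORDER BY id;
```

After loading explicit values during a migration, the underlying sequence usually needs to be moved past the highest loaded id (for example with ALTER TABLE ... ALTER COLUMN ... RESTART WITH) to avoid later collisions.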
  • 107. • All collations are case sensitive • PostgreSQL folds identifiers to lower case unless they are double-quoted • By convention, prefer snake_case rather than CamelCase • Using camel case will require quotes – e.g. "CamelCase"
  • 108. • PostgreSQL resolution is microseconds • Some servers, like SQL Server, have finer resolution (datetime2 resolves to 100 nanoseconds) • Practically, it means little to our real calculations, but if anything relies on that precision (incoming data), something will have to be done.
  • 109. • INSERT… ON CONFLICT • DO UPDATE - Simple, constraint aware method for doing "upserts" • DO NOTHING - Error-free inserts when data is duplicated • MERGE (as of PG15)
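The three upsert options above could look like this against a hypothetical `upsert_demo` table. The MERGE statement needs PostgreSQL 15 or later.

```sql
-- Hypothetical inventory table
CREATE TABLE upsert_demo (
    sku text PRIMARY KEY,
    qty int NOT NULL
);

-- DO UPDATE: insert, or add to the existing quantity if the sku already exists
INSERT INTO upsert_demo (sku, qty) VALUES ('A-1', 5)
ON CONFLICT (sku) DO UPDATE
    SET qty = upsert_demo.qty + EXCLUDED.qty;

-- DO NOTHING: silently skip the duplicate instead of raising an error
INSERT INTO upsert_demo (sku, qty) VALUES ('A-1', 99)
ON CONFLICT (sku) DO NOTHING;

-- MERGE (PG15+): same idea in SQL-standard syntax
MERGE INTO upsert_demo AS t
USING (VALUES ('A-1', 2), ('B-2', 7)) AS s(sku, qty)
    ON t.sku = s.sku
WHEN MATCHED THEN
    UPDATE SET qty = t.qty + s.qty
WHEN NOT MATCHED THEN
    INSERT (sku, qty) VALUES (s.sku, s.qty);

SELECT sku, qty FROM upsert_demo ORDER BY sku;
```

`EXCLUDED` refers to the row that was proposed for insertion, which is how the DO UPDATE branch gets at the new values.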
  • 110. • PostgreSQL can create prepared statements with parameters • There is no such thing as named parameters • Limited to 65,535 parameters
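A minimal sketch of a SQL-level prepared statement, showing that parameters are positional (`$1`, `$2`, ...) rather than named. The statement name `setting_by_name` is arbitrary; most applications would prepare statements through their driver instead.

```sql
-- Parameters are referenced by position, not by name
PREPARE setting_by_name (text) AS
    SELECT name, setting, unit
    FROM pg_settings
    WHERE name = $1;

EXECUTE setting_by_name('work_mem');
EXECUTE setting_by_name('max_connections');

-- Prepared statements last for the session unless deallocated
DEALLOCATE setting_by_name;
```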
  • 111. • PostgreSQL executes SQL as… SQL, not a procedural variant • To use procedural features, something like pl/pgSQL needs to be invoked specifically • Inline SQL variables must be used in an anonymous code block, but they can't return a table/set of data
  • 112. • Move towards functions or stored procedures • Any supported and installed language can be used, most without pre-compiling (although compiled will be faster)
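For T-SQL habits such as declaring a variable in an ad hoc batch, the closest PostgreSQL equivalent is an anonymous DO block, sketched below. As the slides note, it cannot return a result set, so the example reports its result with RAISE NOTICE; the 30-minute cutoff is an arbitrary value.

```sql
-- Anonymous PL/pgSQL block: variables are allowed, returning rows is not
DO $$
DECLARE
    cutoff   timestamptz := now() - INTERVAL '30 minutes';
    old_conn bigint;
BEGIN
    SELECT count(*)
      INTO old_conn
      FROM pg_stat_activity
     WHERE backend_start < cutoff;

    RAISE NOTICE 'Connections older than 30 minutes: %', old_conn;
END;
$$;
```

When the logic needs to return rows, it belongs in a function (RETURNS TABLE or SETOF) rather than a DO block.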
  • 113.
  • 114.
  • 115. • You are only as good as your last restore
  • 116. • Full Database • Point in Time • Extensions
  • 117. • pg_dump – By default, creates a SQL script for recreating the database & inserting data o You can dump to binary format, making for easier transport and management o If you use pg_dump to binary, you'll need to use pg_restore to recover • pg_dumpall o pg_dump on all the databases
  • 118. • Write Ahead Log (WAL) Archive o wal_level must be set to replica or logical o archive_mode must be set to on or always o Edit the archive_command to supply a path to your WAL archive o Must have a base backup
  • 119. • Base Backup o pg_basebackup o All databases are copied to target location & WAL start point is set
  • 120. • First, run pg_switch_wal to close the current WAL file and move it to the archive • Stop the service • Remove all files & subdirectories in your data directory • Remove all files from pg_wal • Restore databases from Base Backup • Change the postgresql.conf file to start recovery, add recovery_target_time • Create recovery.signal in the cluster data directory • Start the service
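The WAL archiving prerequisites and the first restore step can be issued from SQL, as in the sketch below. The archive directory `/mnt/wal_archive` is a placeholder path, not something from the slides, and the statements assume a superuser session. The remaining restore steps (stopping the service, clearing the data directory, restoring the base backup, setting `recovery_target_time`, creating `recovery.signal`) happen at the operating system level and are not shown.

```sql
-- Archiving prerequisites; wal_level and archive_mode need a server restart to take effect
ALTER SYSTEM SET wal_level = 'replica';
ALTER SYSTEM SET archive_mode = 'on';

-- Placeholder archive location; %p is the WAL file path, %f its file name
ALTER SYSTEM SET archive_command =
    'test ! -f /mnt/wal_archive/%f && cp %p /mnt/wal_archive/%f';

-- archive_command alone can be picked up with a reload
SELECT pg_reload_conf();

-- Before a restore: close the current WAL segment so it gets archived
SELECT pg_switch_wal();

-- Check that archiving is keeping up
SELECT archived_count, last_archived_wal, failed_count
FROM pg_stat_archiver;
```

Practising the full restore on a scratch server is the only real check that the archive and base backup are usable, in line with "you are only as good as your last restore".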
  • 121.
  • 122. • Before/After scripts • Logfile support • Data casting • Error support • Continuous migrations • ...and more • Initially created by Dimitri Fontaine • CLI application • Many migration formats • CSV, DBF, IXF • SQLite, MySQL, SQL Server • PostgreSQL to PostgreSQL • Limited Redshift support
  • 123. • Perl-based migration tool • Open-source with long-standing usage and support • Export all major objects and data to SQL • Migration assessment tooling https://ora2pg.darold.net/
  • 124. • Wide-ranging conversion tool for AWS hosted databases • Converts schema into target database • Data is usually handled separately with extraction agents https://docs.aws.amazon.com/SchemaConversionTool/latest/userguide/CHAP_Welcome.html
  • 125. • Transparent SQL Server –> PostgreSQL migration • Open-source • Available in AWS Aurora Postgres • Stackgres.io • Not fully ready for complex applications https://babelfishpg.org/
  • 126. • Full Convert • https://www.fullconvert.com/ • PostgreSQL Migration Toolkit • https://www.convert-in.com/pgskit.htm
  • 127. • Managed service for end-to-end migrations • Primarily intended to migrate into AWS Aurora DB • Ongoing change replication • Pricing is complex like many hosted services https://aws.amazon.com/dms/
  • 128.

Editor's Notes

  1. Presenter: Udara Steve will cover the intro to database DevOps Anderson will do the workshop component Talks about Breaks, Lunch and after workshop if you are available to hang around we have some drinks available at the George Bar.
  2. The goal of this training is to provide lots of tips and learnings from many years working between different database migrations. The hope is that you can take away even one bit of information that makes a big difference in your understanding. I attended a similar training years ago for SQL Server, and while there was lots of good information provided, one minor slide at the end of the day talked about a trace flag that solved one of the major problems we were having. While I learned a lot, that one thing made it all worth it. Hopefully, one thing we share this afternoon will be that kind of help.