This document provides an overview of key differences between SQL Server and PostgreSQL databases. It covers topics such as extensions, cost, case sensitivity, operating systems, processor configuration, write-ahead logging (WAL), checkpoints, disabling writes, page corruptions, MVCC, vacuum, database snapshots, system databases, tables, indexes, statistics, triggers, functions, security, backups, replication, imports/exports, maintenance, and monitoring. The document aims to help SQL Server DBAs understand how to administer and work with PostgreSQL databases.
3. Extensions
SQL Server: Features are built in. You cannot create extensions to enable features.
Postgres: Extensions expose features that are available in other vendors' DBMSs:
• columnar index
• postgis (location-based queries)
• postpic (image processing)
• fdw: access data from external sources
  • MS-SQL / Sybase / Redshift
  • ODBC: any source
  • Twitter
  • Files
  • S3
• dblink: the counterpart of a linked server, for connecting to other Postgres databases
• etc.
• Procedures in Python / Perl, comparable to CLR procedures
4. Cost $$$
SQL Server: 3K / core. Support from Microsoft. On cloud it may cost more than on-premises because of per-core licenses.
Postgres: Open source; commercial versions are available. Enterprise support from EnterpriseDB.
5. Case Sensitivity
SQL Server: Case sensitive or case insensitive, depending on the collation.
Postgres: Data is always case sensitive; schema object names are case insensitive.
Alternatives for case-insensitive matching in Postgres:
• use the citext datatype extension
• use ILIKE instead of LIKE
• use Postgres' lower() function
• add an index on lower(last_name)
!! You can create an index on the lower-cased value, but if it is a unique index, make sure existing values do not collide once they are lower-cased.
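The unique-index warning above can be checked before the index is created. The following is a minimal sketch in Python, not a Postgres feature: the function name and the sample last names are hypothetical, and Python's str.lower() only approximates what Postgres' lower() does under a given collation.

```python
from collections import defaultdict

def lower_collisions(values):
    """Return groups of distinct values that become identical once lower-cased."""
    groups = defaultdict(set)
    for value in values:
        groups[value.lower()].add(value)
    # Only keys reached by more than one distinct original value are a problem.
    return {key: sorted(vals) for key, vals in groups.items() if len(vals) > 1}

# Hypothetical sample data standing in for a last_name column.
last_names = ["Smith", "SMITH", "Jones", "smith", "Brown"]
collisions = lower_collisions(last_names)
print(collisions)  # {'smith': ['SMITH', 'Smith', 'smith']}
```

Any group reported here would violate a unique index on lower(last_name) and has to be resolved first.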
6. Operating System
SQL Server: Windows, Linux
Postgres: Windows, Linux, macOS, Solaris, etc.
On cloud, EC2 on Linux is cheaper than on Windows $$
7. How many Instances
SQL Server: N + 1 instances, called named instances.
Postgres: It is possible to install more than one Postgres instance on the same server.
8. Processor Configuration
SQL Server: You can choose the number of processors to use:
> to balance against other applications that run on the same server
> to match the number of licenses purchased
Postgres: NA
Parallelism can be enabled with max_parallel_workers_per_gather > 0
++ more parameters control parallel execution at a more granular level
9. WAL
The sequence is similar in SQL Server and Postgres:
1. The user updates a record.
2. The transaction is written to the WAL cache, which is flushed to the WAL (VLFs in SQL Server; 16 MB segments in the xlog directory in Postgres, pg_xlog, renamed pg_wal in version 10).
3. The page is read from disk into shared buffers and changed there.
4. A checkpoint starts.
5. It flushes the changed blocks to the page cache.
6. It flushes them to disk (fsync).
7. The checkpoint completes.
If the server restarts (for example after a power failure) before a checkpoint, all transactions recorded in the x-log after the most recent checkpoint are recovered. This replay is the recovery time.
Q1. Does the x-log remove committed transactions whose data pages were written to disk by a checkpoint, in order to limit the size of the x-log?
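The recovery rule above (replay everything logged after the most recent checkpoint) can be pictured with a toy model. This is a sketch under heavy simplifications, not how either engine is implemented: the ToyDB class, its dictionaries and its integer LSNs are all hypothetical.

```python
class ToyDB:
    """Toy write-ahead log model: log first, flush pages at checkpoint, replay on recovery."""

    def __init__(self):
        self.disk = {}           # durable data pages
        self.buffers = {}        # in-memory pages, lost on a crash
        self.wal = []            # durable log records: (lsn, key, value)
        self.checkpoint_lsn = 0  # last log position covered by a checkpoint

    def update(self, key, value):
        lsn = len(self.wal) + 1
        self.wal.append((lsn, key, value))  # write the log record first
        self.buffers[key] = value           # then change the page in memory

    def checkpoint(self):
        self.disk.update(self.buffers)      # flush changed pages to disk
        self.checkpoint_lsn = len(self.wal)

    def crash(self):
        self.buffers = {}                   # memory is gone, disk and WAL survive

    def recover(self):
        """Replay log records written after the last checkpoint; return how many."""
        replayed = 0
        for lsn, key, value in self.wal:
            if lsn > self.checkpoint_lsn:
                self.disk[key] = value
                replayed += 1
        return replayed

db = ToyDB()
db.update("a", 1)
db.checkpoint()
db.update("a", 2)
db.update("b", 3)
db.crash()
replayed = db.recover()
print(replayed, db.disk)  # 2 {'a': 2, 'b': 3}
```

The more log written since the last checkpoint, the more records recovery has to replay, which is why checkpoint frequency drives recovery time.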
10. Control the size of log
By truncating the log after a checkpoint.
Behaviour | SQL Server (recovery model) | Postgres
Truncate the log: remove all committed transactions after a checkpoint | "Simple Recovery Model" | wal_level = minimal
Backups need the committed log records, so the log is truncated only once a log backup is in place | "Full Recovery Model" | wal_level = archive (replica in version 9.6 and later) + archive_mode = on
Do not log each transaction in bulk operations, but take an extent-level backup at the beginning and end of the bulk load | "Bulk-Logged Recovery Model" | NA
11. Checkpoint configurations
Checkpoints play a very important role in recovery time.
Goal | SQL Server | Postgres
Delay checkpoints | Configure "recovery time" | More control through 3 parameters (1 to 3 below)
Reduce I/O contention between user requests and checkpoints | NA | Stretch checkpoint writes out instead of dumping all the data at once (parameter 3)
Commit only after the transaction's WAL is flushed to disk | NA | synchronous_commit
Postgres parameters:
1. checkpoint_timeout = frequency of checkpoints
2. max_wal_size = amount of WAL that triggers a checkpoint
3. checkpoint_completion_target = a fraction of checkpoint_timeout, default 0.5 (0.9 from version 14). If checkpoint_timeout = 5 minutes, Postgres tries to write all pages to the disk cache within 2.5 minutes. Stretching the writes makes each checkpoint slower, but balances I/O between checkpoints and user writes.
synchronous_commit = on / off / remote_write (among other values): waiting comes at the cost of delays in each transaction. Default: on, also in AWS RDS.
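The checkpoint_completion_target arithmetic above is a single multiplication. A minimal sketch, assuming the target is applied as a plain fraction of checkpoint_timeout as the slide describes; the function name is hypothetical.

```python
def checkpoint_write_window(checkpoint_timeout_s, completion_target):
    """Seconds over which a checkpoint tries to spread its page writes."""
    if not 0.0 <= completion_target <= 1.0:
        raise ValueError("completion_target must be between 0 and 1")
    return checkpoint_timeout_s * completion_target

# The example from the text: checkpoint_timeout = 5 minutes, target = 0.5.
window = checkpoint_write_window(5 * 60, 0.5)
print(window / 60)  # 2.5 (minutes)
```

A higher target spreads the same writes over a longer window, which lowers the I/O peak but keeps each checkpoint running longer.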
12. Disabling write to Data Page
Level | SQL Server | Postgres
Cluster level | NA | fsync = off
Note: fsync = off can cause severe data loss, so be sure before configuring this setting. For performance, choose synchronous_commit = off rather than this.
Use it only for certain cases, such as building read-only instances from backups and other testing purposes.
13. Disabling write to WAL
Level | SQL Server | Postgres
Table level | NA | Unlogged table: create unlogged table….
Unlogged tables persist until they are dropped and survive a clean restart, but their contents are truncated after a crash or unclean shutdown.
They are not the same as temp tables, which are session based.
Use them only for certain cases, mostly staging tables.
14. Page Corruptions
SQL Server: Can only detect corruption, by configuring:
• "Torn Page Detection"
• "Checksum"
Postgres: Can prevent torn pages through "full_page_writes = on": on the first update to a page after a checkpoint, the entire page is written to the WAL as a full-page image (backup block).
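Checksum-based detection can be pictured with a toy sketch. Everything here is an assumption for illustration: zlib.crc32 stands in for the real page checksum algorithms, which differ in both products, and the 8192-byte page and helper names are hypothetical.

```python
import zlib

PAGE_SIZE = 8192  # assumed page size for the illustration

def write_page(payload: bytes):
    """Pad the payload to a full page and return (page, checksum)."""
    page = payload.ljust(PAGE_SIZE, b"\x00")
    return page, zlib.crc32(page)

def is_intact(page: bytes, checksum: int) -> bool:
    """Recompute the checksum on read and compare it with the stored one."""
    return zlib.crc32(page) == checksum

page, checksum = write_page(b"row data")

# Simulate corruption by flipping a single bit somewhere in the page.
damaged = bytearray(page)
damaged[100] ^= 0x01

print(is_intact(page, checksum))            # True
print(is_intact(bytes(damaged), checksum))  # False
```

Detection only tells you the page is bad; restoring it still needs a backup or, in Postgres, the full-page image kept in the WAL.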
15. MVCC
SQL Server: Through snapshot isolation.
Postgres: Default.
Writers don't block readers, and readers don't block writers.
Migrating from SQL Server to Postgres: make sure to fully test your application against MVCC behaviour before the migration.
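Why readers and writers do not block each other can be shown with a toy row-version model. This is a deliberately simplified sketch: the class is hypothetical, every transaction is assumed to commit instantly, and real visibility rules also consider commit status and in-progress transactions.

```python
class VersionedRow:
    """Toy MVCC row: each update adds a version instead of overwriting in place."""

    def __init__(self, value, txid):
        # Each version is [xmin, xmax, value]: created by xmin, superseded by xmax.
        self.versions = [[txid, None, value]]

    def update(self, value, txid):
        self.versions[-1][1] = txid               # close the current version
        self.versions.append([txid, None, value]) # add the new version

    def read(self, snapshot_txid):
        """Return the version visible to a snapshot taken at snapshot_txid."""
        for xmin, xmax, value in self.versions:
            if xmin <= snapshot_txid and (xmax is None or xmax > snapshot_txid):
                return value
        return None  # the row did not exist yet for this snapshot

row = VersionedRow("v1", txid=10)
reader_snapshot = 15           # a reader starts before the update
row.update("v2", txid=20)      # the writer proceeds without waiting for the reader

print(row.read(reader_snapshot))  # v1
print(row.read(25))               # v2
```

The old version "v1" stays around for the earlier reader, which is also why Postgres later needs vacuum to remove versions nobody can see any more.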
16. Vacuum
Purpose:
• prevent reaching the transaction ID wraparound limit
• release space by physically removing deleted records
SQL Server: NA
Postgres: Auto-vacuum and manual vacuum; configurable at database level and table level. Default: on, also in AWS RDS.
17. Database Snapshots
SQL Server: A database snapshot is a read-only, static view of a SQL Server database (the source database). The database snapshot is transactionally consistent with the source database as of the moment of the snapshot's creation. It is not a replacement for a backup, nor for cloning.
Postgres: NA
AWS Aurora (Postgres): database clones and replicas act like database snapshots in SQL Server.
18. System Databases
SQL Server system database | Postgres
Master | Not exactly like master, but the default database is postgres. System information is stored in 3 files: postgresql.conf (all configuration), pg_hba.conf (authentication methods: md5, SSL, LDAP, etc.) and the ident file, pg_ident.conf (maps remote logins to database users)
  select * from pg_settings (shows where these files are)
  select * from pg_config (config info)
Model | template databases
Msdb |
Tempdb, for the entire instance | Temp schemas within each database; physically they are stored in one file location, PGDATA/base/pgsql_tmp
NA | Control the temp file limit for sessions with temp_file_limit = int: the maximum disk space a session may use for internal temporary files
19. Database
Feature | SQL Server | Postgres
Hierarchy | Server > Database > Schema > Tables | Same
Transaction logs | Each database has its own transaction log | The entire cluster (all databases) has a common transaction log
Table physical storage | All tables in a database are stored in one or more files | Each table is stored in its own physical file. ** When you truncate a table, the space is released to the OS immediately
File groups | Supported | Supported (tablespaces)
Cross-database queries | Allowed by specifying 3-part notation: db_name.schema_name.tbl_name | NA. The alternative is to use dblink or the postgres_fdw extension
20. Tables
Feature | SQL Server | Postgres
Partition | Supported | Supported
Inheritance | NA | Supported
Read data from file | NA | Supported: create FOREIGN table (..) options (file path..)
Temp tables | Supported | Supported
21. Indexes
Feature | SQL Server | Postgres
Clustered index | Yes | Implemented differently than in SQL Server. CLUSTER rewrites the table once, sorted by the clustering index. This helps range-based queries, but new data is not kept in order: you need to re-cluster to sort it, which holds a table lock. ALTER TABLE T1 CLUSTER ON idx1; marks the index to use for later CLUSTER runs
Non-clustered index | Yes | Indexes are always stored on separate pages from the table
Types | B-tree | B-tree, Hash, GiST, GIN and BRIN
Filtered / partial index | Yes | Yes
With INCLUDE | Yes | Yes (starting with version 11)
Composite index | Yes | Yes
Index on view | Yes, with auto refresh | Yes, but the view has to be refreshed to get new data
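22. Statistics
Feature | SQL Server | Postgres
Settings | Auto: after the first 500 rows + 20% of the rows change | More granular configuration through the autovacuum_analyze parameters
Manual | Yes: update statistics | analyze <table name>
Postgres autovacuum_analyze parameters:
autovacuum_analyze_scale_factor = fraction of the table's rows, added to the threshold
autovacuum_analyze_threshold = base number of changed rows
Statistics are refreshed once the changed rows exceed threshold + scale_factor x rows in the table.
The settings can also be made at table level:
ALTER TABLE a_lot_of_inserts SET (
autovacuum_analyze_threshold = 500 ,
autovacuum_analyze_scale_factor = 0.01 );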
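To see what the table-level settings above change, the trigger point can be computed directly. A minimal sketch, assuming the rule threshold + scale_factor x rows and the Postgres defaults of 50 and 0.1; the function names and the one-million-row table are hypothetical, and the SQL Server function simply encodes the "500 rows + 20%" rule from the slide.

```python
def postgres_autoanalyze_trigger(table_rows, threshold=50, scale_factor=0.1):
    """Changed rows needed before autovacuum re-analyzes the table."""
    return threshold + scale_factor * table_rows

def sql_server_auto_stats_trigger(table_rows):
    """The slide's '500 rows + 20%' rule for automatic statistics updates."""
    return 500 + 0.20 * table_rows

rows = 1_000_000  # hypothetical table size

default_trigger = postgres_autoanalyze_trigger(rows)
tuned_trigger = postgres_autoanalyze_trigger(rows, threshold=500, scale_factor=0.01)

print(round(default_trigger))                      # 100050
print(round(tuned_trigger))                        # 10500
print(round(sql_server_auto_stats_trigger(rows)))  # 200500
```

With the table-level values from the ALTER TABLE example, a one-million-row table is re-analyzed after roughly 10,500 changed rows instead of about 100,000.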
23. Other objects
Feature | SQL Server | Postgres
Views | Yes | Yes
Materialized view | Yes, with auto refresh | Yes, but it needs to be refreshed manually: "refresh materialized view my_view"
Trigger on view | Supported | Supported
DDL, DML and event-based triggers | Supported | Supported
Functions | Supported | Supported
Procedures | Yes | Yes, starting with version 11
Datatypes | check this link.
AWS / MS Visual Studio offers SCT, which helps to compare the differences between SQL Server and Postgres side by side.
24. Security
Feature | SQL Server | Postgres
Logins | Yes | Logins, users and roles are synonymous in Postgres
Database users | Login to database user mapping | Grant connect on the database to the user
Roles | A role is a set of permissions mapped to users | A user without login permission acts as a role: assign it a set of permissions and add users to it
Server roles | sysadmin, securityadmin, dbcreator, bulkadmin | superuser, create role, create database
Privileges | Grant, Revoke, Deny | Grant, Revoke, Drop
25. Backups
Feature | SQL Server | Postgres
Logical (statement-level) backup | NA. The alternative is to generate statements through the GUI, but to get a consistent state the database has to be in read-only mode | pg_dump (consistent)
Physical backup | Full, Differential, Log | 1. File system backup. To capture transactions during the backup: pg_start_backup, copy files, pg_stop_backup (renamed pg_backup_start / pg_backup_stop in version 15). 2. File system snapshot (if the storage supports it)
Continuous archiving | Full, followed by log backups | File system backup, followed by archiving x-log files
AWS RDS: snapshots, with incremental snapshots
26. Replication
Feature | SQL Server | Postgres
Transactional / streaming replication | Transactional replication | Streaming replication
Merge replication | Yes |
Log shipping | Yes | Yes
Log-ship a single database | Yes | Entire cluster (all databases)
Replication of a single database | Yes. Publisher-subscriber model | Yes. Publisher-subscriber model
Synchronous replication | Yes (2-phase commit) | Yes
DR / HA with auto failover | Always On (DR / HA): sync / async / read-only | pg_auto_failover (will cover in next presentation)
AWS RDS: supports read replicas
AWS Aurora: read replicas point to the same storage and are available in minutes
27. Import/export
Feature | SQL Server | Postgres
Read external files without loading | NA | file_fdw module: create FOREIGN table (…) options (file path)
Load CSV | bcp (command line), BULK INSERT (T-SQL) | COPY command
Load data and redirect bad rows to a separate file | SSIS, a complete ETL package tool | pgloader (external tool; apt-get install pgloader). Advantage: it redirects bad rows to a separate file and logs the status; you need to write scripts to map the transformations. It has more options than bcp
28. Maintenance - 1
Feature | SQL Server | Postgres
Vacuum | NA | select relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze from pg_stat_user_tables; vacuum (full) <table name>
TXID exhaustion | NA | Track percent_towards_wraparound and percent_towards_emergency_autovac; see https://info.crunchydata.com/blog/managing-transaction-id-wraparound-in-postgresql. Recommended: create an alert when these reach a certain threshold
Backups | Scheduled through Agent | The archive command copies WAL segments to a directory, which then needs to be copied to a secondary location through crontab
Monitoring backups | select * from msdb..backupset | SELECT * FROM pg_stat_archiver; ** other 3rd-party tools are available
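The two wraparound percentages named above can be approximated from a database's transaction ID age (for example age(datfrozenxid) from pg_database). This is a rough sketch, not the linked article's query: the 2**31 - 1 limit is a simplifying assumption, 200,000,000 is the default autovacuum_freeze_max_age, and the function names, sample age and alert threshold are hypothetical.

```python
XID_LIMIT = 2**31 - 1                 # assumed wraparound horizon (simplified)
DEFAULT_FREEZE_MAX_AGE = 200_000_000  # default autovacuum_freeze_max_age

def percent_towards_wraparound(xid_age):
    """How far the oldest unfrozen transaction ID is towards the assumed limit."""
    return 100.0 * xid_age / XID_LIMIT

def percent_towards_emergency_autovac(xid_age, freeze_max_age=DEFAULT_FREEZE_MAX_AGE):
    """How far the age is towards forcing an anti-wraparound autovacuum."""
    return 100.0 * xid_age / freeze_max_age

def needs_alert(xid_age, threshold_percent=75.0):
    """Hypothetical alert rule: warn once the emergency percentage passes the threshold."""
    return percent_towards_emergency_autovac(xid_age) >= threshold_percent

xid_age = 100_000_000  # hypothetical value read from age(datfrozenxid)

print(round(percent_towards_wraparound(xid_age), 1))  # 4.7
print(percent_towards_emergency_autovac(xid_age))     # 50.0
print(needs_alert(xid_age))                           # False
```

Alerting on the emergency-autovacuum percentage gives an earlier warning than waiting for the wraparound percentage itself to climb.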
29. Maintenance - 2
Feature | SQL Server | Postgres
Index fragmentation | sys.dm_db_index_physical_stats | https://wiki.postgresql.org/wiki/Index_Maintenance
Reindex | alter index .. rebuild | reindex index <index name>; vacuum full removes fragmentation as well
Online reindex | Option (online) | The alternative is "create index concurrently…"
Index defrag | Yes. Arranges pages contiguously instead of rebuilding the whole table | NA
Monitor stats | sys.stats | select relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze from pg_stat_user_tables;
Update stats | Auto and manual | Auto and manual: analyze <table name>
30. Monitoring and maintenance
Feature | SQL Server | Postgres
Queries | sys.dm_exec_requests | pg_stat_statements
Blocking | sys.sysprocesses | pg_stat_activity
I/O activity on tables and indexes | sys.dm_db_index_operational_stats, sys.dm_db_index_usage_stats | pg_statio_all_tables, pg_statio_all_indexes
Kill blocking PID | kill <spid> | pg_cancel_backend(pid) cancels the running query; pg_terminate_backend(pid) terminates the connection
SQL Server: 3K / 2 core. Postgres: enterprise support is available from 2ndQuadrant, and commercial products are available from EnterpriseDB.
Postgres is an ORDBMS: it treats each entity as an object defined with an OID, allows you to define custom datatypes, and allows you to implement inheritance and polymorphism.
There is a utility called taskset (but it is not recommended).