Quite often "new" people are only "new" to Postgres. This is my summary of dos and don'ts when it comes to teaching Postgres and what to take note of, with emphasis on teaching.
4. What will I tell you?
● About me (done)
● Show of hands
● Who „new people” might be
– And usually – in my case – are
● About teaching
– Comfort zone, learners, stepping back
● Chosen approaches, features, gotchas and the like
● Why, why, why
● And yes, this’ll be about Postgres, but in an unusual way
14. Surprisingly
● Often your colleagues
● Sometimes older
● Sometimes more senior
● Experienced
● With success under their belts
● Basically: FORMED already
– Or MADE, if you will
15. Developers are problem solvers
● Your colleagues have certain problems
● Is Postgres the solution?
– Or „a solution” at least?
● And what is the learning curve like?
– Time included
16. Developers are not SQL people!
● Not many know JOINs very well
● Not many know how indexes work
● Not many know indexes' weaknesses
● CTEs, window functions, procedures, cursors…
● They „omit” this
● Comfort zone is nice
18. Do not abandon them
● Docs
● Materials
● Tools
● Links to good content
● Pictures, pictures, pictures
● They can edit / comment (Wiki)
● Your (colleagues) time
20. What is YOUR problem?
● DBA wanting respite for your DB?
● Malpractice in SQL queries?
● Why don’t they use XYZ feature?
● From tomorrow on, teach them some SQL
● Migration from X to Postgres
● Guidelines creation
23. Xun Kuang once said
不闻不若闻之, 闻之不若见之, 见之不若知之, 知之不若行之
“Not having heard something is not as good as
having heard it; having heard it is not as good as
having seen it; having seen it is not as good as
knowing it; knowing it is not as good as putting it
into practice.”
Xunzi book 8: Ruxiao, chapter 11
24. Xun Kuang paraphrase would be
不闻不若闻之, 闻之不若见之, 见之不若知之, 知之不若行之
“Not having heard something < having heard it;
having heard it < having seen it;
having seen it < knowing it;
knowing it < putting it into practice.”
Xunzi book 8: Ruxiao, chapter 11
25. How do they learn?
● "Practice makes perfect"
– Except it doesn’t
● Learning styles
● Docs still relevant
– If well-placed, accessible and easy to get into
26. Repetitio est mater studiorum
● Crash course
● Workshop
● Problem solving on their own
● Docs to help
● Code reviews
30. In short
● History – battle-tested, feature-rich, used
● Basics – moving around, commands, etc.
● Prepare your bait accordingly
– My faves
– Advanced features
– NoSQL angle
– …
● Don’t just drink the Kool-Aid!
31. Battle-tested
● Maturing since 1987
● Comes in many flavours (forks)
● Largest cluster – 2 PB at Yahoo
● Skype, NASA, Instagram
● Stable:
– Many years on one version
– Good version support
– Every year something new
– Follows ANSI SQL standards
https://www.postgresql.org/about/users/
35. Great angles
● Procedures: Java, Perl, Python, CTEs...
● Enterprise / NoSQL - handles XMLs and JSONs
● Index power – spatial or geo or your own
● CTEs and FDWs => great ETL or µservice
● Pure dev: error reporting / logging, MVCC (dirty read gone), own index, plenty of data types, Java/Perl/… inside
● Solid internals: processes, sec built-in,
38. Parser
● Syntax checks, e.g. FRIM is not a keyword
– SELECT * FRIM myTable;
● Catalog lookup
– myTable may not exist
● In the end a query tree is built
– Query tokenization: SELECT (keyword) employeeName (field id) count (function call)...
40. Planner
● Where the plan tree is built
● Where the best execution plan is decided upon
– Seq or index scan? Index or bitmap index?
– Which join order?
– Which join strategy (nested loop, hash, merge)?
– Inner or outer?
– Aggregation: plain, hashed, sorted…
● Heuristic search, if enumerating all plans is too costly
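To get a feel for why exhaustive planning stops being practical, here is a rough Python sketch. It counts only left-deep join orders for n tables (n!), which is a simplifying assumption for illustration: the real planner considers other plan shapes and prunes as it goes, so treat the numbers as an order-of-magnitude hint, not as what Postgres actually enumerates.

```python
import math


def left_deep_join_orders(table_count: int) -> int:
    """Count left-deep join orders for table_count tables (simply n!)."""
    if table_count < 1:
        raise ValueError("need at least one table")
    return math.factorial(table_count)


# The count explodes quickly as tables are added to the query.
for n in (2, 4, 8, 12):
    print(n, left_deep_join_orders(n))
```

Twelve joined tables already give 479,001,600 orderings under this simplification, before choosing a scan type or join strategy for each step.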
43. Explaining EXPLAIN - what
EXPLAIN SELECT * FROM tenk1;
QUERY PLAN
------------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244)
● Startup cost – estimated cost expended before the output phase can begin
● Total cost – in arbitrary units (by convention, sequential page fetches); may change, assumes the node is run to completion
● Rows – estimated number of rows output by the node (LIMIT etc. may stop it early)
● Width – estimated average width of rows output by that node (in bytes)
44. Explaining EXPLAIN - how
EXPLAIN SELECT * FROM tenk1;
QUERY PLAN
------------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244)
SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1'; -- 358 | 10000
● No WHERE, no index
● Cost = disk pages read * seq_page_cost + rows scanned * cpu_tuple_cost
● 358 * 1.0 + 10000 * 0.01 = 458 (default values)
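The arithmetic above can be replayed in a few lines of Python. This is only a sketch of the formula on the slide for a plain sequential scan with no WHERE clause; the function name is made up for illustration, and the defaults are the stock values of seq_page_cost and cpu_tuple_cost.

```python
def seq_scan_cost(relpages, reltuples, seq_page_cost=1.0, cpu_tuple_cost=0.01):
    """Estimated total cost of an unfiltered sequential scan."""
    return relpages * seq_page_cost + reltuples * cpu_tuple_cost


# relpages and reltuples as read from pg_class for tenk1 on the slide.
total = seq_scan_cost(relpages=358, reltuples=10000)
print(f"{total:.2f}")  # 458.00
```

Changing either cost parameter in the call shows how the same table can get a different estimate on a differently tuned server.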
45. Analyzing EXPLAIN ANALYZE
EXPLAIN ANALYZE SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=4.65..118.62 rows=10 width=488) (actual time=0.128..0.377 rows=10 loops=1)
   ->  Bitmap Heap Scan on tenk1 t1  (cost=4.36..39.47 rows=10 width=244) (actual time=0.057..0.121 rows=10 loops=1)
         Recheck Cond: (unique1 < 10)
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..4.36 rows=10 width=0) (actual time=0.024..0.024 rows=10 loops=1)
               Index Cond: (unique1 < 10)
   ->  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.29..7.91 rows=1 width=244) (actual time=0.021..0.022 rows=1 loops=10)
         Index Cond: (unique2 = t1.unique2)
 Planning time: 0.181 ms
 Execution time: 0.501 ms
● Actually runs the query
● More info: actual times, rows removed by filter, sort method used, disk/memory used...
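One detail worth showing learners: the actual time and rows on a node are per-loop averages, so a node with loops=10 has to be multiplied out before comparing it with its parent. The Python sketch below pulls the numbers out of one plan line with a regular expression. The regex and function name are illustrative assumptions written for the text format shown on this slide, not a general EXPLAIN parser.

```python
import re

# Matches the estimate and the "actual" section of one text-format plan node.
NODE_RE = re.compile(
    r"rows=(?P<est_rows>\d+)\s+width=\d+\)\s+"
    r"\(actual time=(?P<start>\d+\.\d+)\.\.(?P<end>\d+\.\d+)\s+"
    r"rows=(?P<act_rows>\d+)\s+loops=(?P<loops>\d+)\)"
)


def summarize_node(plan_line):
    """Return estimated rows, actual rows and time summed over all loops."""
    match = NODE_RE.search(plan_line)
    if match is None:
        raise ValueError("not an EXPLAIN ANALYZE node line")
    loops = int(match["loops"])
    return {
        "estimated_rows": int(match["est_rows"]),
        "actual_rows_total": int(match["act_rows"]) * loops,
        "total_ms": float(match["end"]) * loops,
    }


line = (
    "-> Index Scan using tenk2_unique2 on tenk2 t2 "
    "(cost=0.29..7.91 rows=1 width=244) "
    "(actual time=0.021..0.022 rows=1 loops=10)"
)
summary = summarize_node(line)
print(summary["actual_rows_total"], f"{summary['total_ms']:.3f}")  # 10 0.220
```

Seen this way, the inner index scan accounts for roughly 0.22 ms of the nested loop's 0.377 ms, which is easy to miss when reading only the per-loop figure.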
49. My Faves
● Error reporting
● PL/xSQL – feel free to use Perl, Python, Ruby, Java, LISP...
● Data types
– XML and JSON handling
● Foreign Data Wrappers (FDW)
● Windowing functions
● Common table expressions (CTE) and recursive queries
● Power of Indexes
56. Will DB eat your cake?
● Thanks @anandology
Consider password VARCHAR(8)
57. Logging, ‘gotchas’
● Default is to stderr only
● Set on the command line or in the config file, not through SET
● Where is it?
● How to log queries… or turning logging_collector on
58. Where is it?
● Default
– data/pg_log
● Launchers can set it (Mac Homebrew/plist)
● Version and config dependent
60. Logging, turn it on
● Default is to stderr only
● In PG:
logging_collector = on
log_filename = strftime-patterned filename
[log_destination = [stderr|syslog|csvlog] ]
log_statement = [none|ddl|mod|all] # all
log_min_error_statement = ERROR
log_line_prefix = '%t %c %u ' # time sessionid user
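To show what a "strftime-patterned filename" turns into, here is a small Python sketch. It assumes the stock log_filename pattern and a made-up timestamp. Python's strftime is used only as a stand-in for the server's own expansion, which should agree for these common escapes.

```python
from datetime import datetime

# Stock default pattern for log_filename in postgresql.conf.
DEFAULT_LOG_FILENAME = "postgresql-%Y-%m-%d_%H%M%S.log"


def expand_log_filename(pattern, moment):
    """Expand the strftime escapes in a log_filename pattern."""
    return moment.strftime(pattern)


# Hypothetical moment at which the log file is opened.
opened_at = datetime(2024, 1, 2, 3, 4, 5)
print(expand_log_filename(DEFAULT_LOG_FILENAME, opened_at))
# postgresql-2024-01-02_030405.log
```

Knowing the expanded name makes the "where is it?" question easier to answer when people go looking in the log directory.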
64. PL/pgSQL
● Stored procedure dilemma
– Where to keep your logic?
– How your logic is NOT in your SCM
● Over a dozen options:
– Perl, Python, Ruby,
– pgSQL, Java,
– TCL, LISP…
● DevOps, SysAdmins, DBAs… ETLs etc.
66. Perl function example
CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
my ($x, $y) = @_;
if (not defined $x) {
return undef if not defined $y;
return $y;
}
return $x if not defined $y;
return $x if $x > $y;
return $y;
$$ LANGUAGE plperl;
67. XML or JSON support
● Parsing and retrieving XML (functions)
● Valid JSON checks (type)
● Careful with encoding!
– PG allows only one server encoding per database
– Specify it to UTF-8 or weep
● Document database instead of OO or rel
– JSON, JSONB, HSTORE – noSQL fun welcome!
70. HSTORE?
CREATE TABLE example (
id serial PRIMARY KEY,
data hstore);
INSERT INTO example (data) VALUES
('name => "John Smith", age => 28, gender => "M"'),
('name => "Jane Smith", age => 24');
SELECT id, data->'name'
FROM example;
-- hstore values are text, so cast before comparing numerically
SELECT id, data->'age'
FROM example
WHERE (data->'age')::int >= 25;
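A gotcha worth demonstrating here: hstore stores keys and values as text, so comparing data->'age' against a string is a text comparison unless the value is cast to a number. The Python sketch below mimics the effect with plain strings. The ages "9" and "100" are extra hypothetical values added to make the difference visible. Python compares strings by code point while Postgres uses the column's collation, which for digit-only strings is assumed here to give the same order.

```python
# Ages as they would come back from hstore: always text.
ages_as_text = ["28", "24", "9", "100"]

# Text comparison, character by character.
text_matches = [age for age in ages_as_text if age >= "25"]

# Numeric comparison after an explicit conversion.
int_matches = [age for age in ages_as_text if int(age) >= 25]

print(text_matches)  # ['28', '9']
print(int_matches)   # ['28', '100']
```

With only the two rows from the slide both comparisons happen to agree, which is why this kind of bug tends to surface later with real data.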
71. XML and JSON datatype
CREATE TABLE test (
...,
xml_file xml,
json_file json,
...
);
72. XML functions example
XMLROOT (
  XMLELEMENT (
    NAME gazonk,
    XMLATTRIBUTES (
      'val' AS name,
      1 + 1 AS num
    ),
    XMLELEMENT (
      NAME qux,
      'foo'
    )
  ),
  VERSION '1.0',
  STANDALONE YES
)
<?xml version='1.0' standalone='yes' ?>
<gazonk name='val' num='2'>
  <qux>foo</qux>
</gazonk>
xml '<foo>bar</foo>'
'<foo>bar</foo>'::xml
76. Check out processes
● pgrep -l postgres
● htop > filter: postgres
● Whatever you like / usually use
● Careful with kill -9 on connections
– kill -15 better
79. Before
● Who are they?
● What is your problem?
● How large is their comfort zone, and how to push them out of it?
● Materials, docs, workshop preparation
● How much time for training?
● How much time after?
● How many people will it be?
● What indicates that the problem is solved?
80. During
● Establish the goal
– And – if possible – learning styles
● Promise support (and tell how!)
– Push out from comfort zone!
● Ask for hard work and stupid questions
● Show documentation, do live tour
● Do the workshop
● Involve, find best ones
– You will have them help you later
● Expect questions, make them ask
– Again, push out from comfort zone!
81. After
● Where are the docs?
– Are they using them?
● Answer the questions
– Again, and again
● Code reviews
– Deliver on support promise!
– Involve promising students
● Is the problem gone / better?
82. Don’t omit the basics
● Joins
● Indexes – how they work
● Query path (EXPLAIN, EXPLAIN ANALYZE)
● Moving around (psql)
● Setup and getting to DB
83. Postgres is cool
● Goodies like error reporting or log line prefix
● Processes thought out
● Good for µservices and enterprise
● Not only SQL (XML, JSON, Perl, Python...)
● Ask DB
● Indexes
● Powerful: CTEs, recursive queries, FDWs...
● Battle tested and always high
84. Teaching Postgres – Tomasz Borek
Teaching Postgres
to new people
@LAFK_pl
Consultant @