© 2016 NetCracker Technology Corporation Confidential
PostgreSQL and JDBC: striving for top performance
Vladimir Sitnikov
PgConf 2016
About me
• Vladimir Sitnikov, @VladimirSitnikv
• Performance architect at NetCracker
• 10 years of experience with Java/SQL
• PgJDBC committer
Explain (analyze, buffers) PostgreSQL and JDBC
•Data fetch
•Data upload
•Performance
•Pitfalls
Intro
Fetch of a single row via primary key lookup takes
20ms. Localhost. Database is fully cached
A. Just fine
B. It should be 1sec
C. Kidding? Aim is 1ms!
D. 100us
Lots of small queries are a problem
Suppose a single query takes 10ms, then 100 of
them would take a whole second *
* Your Captain
PostgreSQL frontend-backend protocol
•Simple query
• 'Q' + length + query_text
•Extended query
•Parse, Bind, Execute commands
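As a rough illustration of the Simple query framing listed above, the sketch below encodes a query as a type byte 'Q', a 4-byte big-endian length that counts itself and the payload but not the type byte, and the NUL-terminated query text. The class and method names are invented for this example, UTF-8 is assumed as the client encoding, and this is not PgJDBC code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class SimpleQueryMessage {
    // Encodes 'Q' + int32 length + query text + NUL terminator.
    static byte[] encode(String sql) {
        byte[] text = sql.getBytes(StandardCharsets.UTF_8); // assumption: UTF-8 client encoding
        int length = 4 + text.length + 1; // the length field itself + text + NUL
        ByteBuffer buf = ByteBuffer.allocate(1 + length);
        buf.put((byte) 'Q');
        buf.putInt(length); // ByteBuffer writes big-endian by default
        buf.put(text);
        buf.put((byte) 0);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] msg = encode("select 1");
        System.out.println(msg.length + " bytes, type " + (char) msg[0]);
    }
}
```

The extended protocol splits the same work into separate Parse, Bind, and Execute messages, which is what makes statement reuse possible.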
PostgreSQL frontend-backend protocol
Super extended query
https://github.com/pgjdbc/pgjdbc/pull/478
backend protocol wanted features
PostgreSQL frontend-backend protocol
Simple query
•Works well for one-time queries
•Does not support binary transfer
PostgreSQL frontend-backend protocol
Extended query
•Eliminates planning time
•Supports binary transfer
PreparedStatement
Connection con = ...;
PreparedStatement ps =
con.prepareStatement("SELECT...");
...
ps.close();
Smoker’s approach to PostgreSQL
PARSE S_1 as ...; // con.prepareStmt
BIND/EXEC
DEALLOCATE // ps.close()
PARSE S_2 as ...;
BIND/EXEC
DEALLOCATE // ps.close()
Healthy approach to PostgreSQL
PARSE S_1 as ...;   <- once in a lifetime
BIND/EXEC           <- REST call
BIND/EXEC
BIND/EXEC           <- one more REST call
BIND/EXEC
BIND/EXEC
...
DEALLOCATE          <- "never" is the best
Happiness closes no statements
Conclusion №1: in order to get top performance,
you should not close statements
ps = con.prepareStatement(...);
ps.executeQuery();
ps = con.prepareStatement(...);
ps.executeQuery();
...
Unclosed statements in practice
@Benchmark
public Statement leakStatement() {
return con.createStatement();
}
pgjdbc < 9.4.1202, -Xmx128m, OracleJDK 1.8u40
# Warmup Iteration 1: 1147,070 ns/op
# Warmup Iteration 2: 12101,537 ns/op
# Warmup Iteration 3: 90825,971 ns/op
# Warmup Iteration 4: <failure>
java.lang.OutOfMemoryError: GC overhead limit exceeded
Unclosed statements in practice
@Benchmark
public Statement leakStatement() {
return con.createStatement();
}
pgjdbc >= 9.4.1202, -Xmx128m, OracleJDK 1.8u40
# Warmup Iteration 1: 30 ns/op
# Warmup Iteration 2: 27 ns/op
# Warmup Iteration 3: 30 ns/op
...
Statements in practice
• In practice, applications always close their
statements
• PostgreSQL has no shared query cache
• Nobody wants to spend excessive time on
planning
Server-prepared statements
What can we do about it?
• Wrap all the queries in PL/PgSQL
• It helps, but we had a huge number ("100500") of SQL statements
• Teach JDBC to cache queries
Query cache in PgJDBC
• Query cache was implemented in 9.4.1202 (2015-08-27),
see https://github.com/pgjdbc/pgjdbc/pull/319
• Is transparent to the application
• We did not bother considering PL/PgSQL again
• Server-prepare is activated after 5 executions
(prepareThreshold)
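To make the threshold idea concrete, here is a toy model, not PgJDBC's actual implementation: it counts executions per SQL text in a small LRU map and reports when a statement would switch to a server-prepared one. The class name, the method name, and the choice that the fifth call is the one that reports true are all assumptions made for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PrepareThresholdSketch {
    private final int prepareThreshold;
    private final Map<String, Integer> executions;

    PrepareThresholdSketch(int prepareThreshold, int maxCachedQueries) {
        this.prepareThreshold = prepareThreshold;
        // An access-ordered LinkedHashMap gives a simple LRU eviction policy.
        this.executions = new LinkedHashMap<String, Integer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
                return size() > maxCachedQueries;
            }
        };
    }

    // Records one execution of the SQL text and returns true once the
    // statement has been executed prepareThreshold times or more.
    boolean executeAndCheckServerPrepared(String sql) {
        int count = executions.merge(sql, 1, Integer::sum);
        return count >= prepareThreshold;
    }
}
```

The cache is keyed by the SQL text, so the application can keep closing and re-creating statements while the counter survives, which is what makes the cache transparent.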
Where are the numbers?
• Of course, planning time depends on the query
complexity
• We observed 20 ms+ planning time for OLTP
queries: a 10 KiB query with a 170-line EXPLAIN
• Result: planning time drops to ~0 ms
Overheads
Generated queries are bad
• If a query is generated
• It results in a brand new java.lang.String object
• Thus you have to recompute its hashCode
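A small sketch of the point above: a query assembled at run time is a new String object on every call, so its cached hash is lost and any cache keyed by SQL text has to hash and compare the full text again. The table name users and the class name are hypothetical.

```java
class GeneratedQueryDemo {
    // Compile-time constant: a single String instance whose hash is computed once and cached.
    static final String BY_ID = "select * from users where id = ?";

    // Same text assembled at run time: every call returns a new String object.
    static String generate(String table) {
        return "select * from " + table + " where id = ?";
    }

    public static void main(String[] args) {
        String generated = generate("users");
        System.out.println(generated.equals(BY_ID)); // same text
        System.out.println(generated == BY_ID);      // not the same object
    }
}
```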
Parameter types
If the type of a bind value changes, the
server-prepared statement has to be recreated
ps.setInt(1, 42);
...
ps.setNull(1, Types.VARCHAR);
It leads to DEALLOCATE -> PREPARE
Keep data type the same
Conclusion №1
• Even NULL values should be properly typed
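One way to follow this advice is a small helper that always binds the same JDBC type, whether or not the value is NULL. The helper below is a sketch with an invented name; it assumes the parameter is an integer column.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;

class TypedNulls {
    // Binds an Integer parameter and keeps the declared type INTEGER even for NULL,
    // so the parameter type does not flip between executions.
    static void setNullableInt(PreparedStatement ps, int index, Integer value) throws SQLException {
        if (value == null) {
            ps.setNull(index, Types.INTEGER);
        } else {
            ps.setInt(index, value);
        }
    }
}
```

Calling setNullableInt(ps, 1, null) after setNullableInt(ps, 1, 42) keeps the bind type stable, unlike the setNull(1, Types.VARCHAR) call shown on the slide.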
Unexpected degradation
With prepared statements, the response time
gets 5,000 times slower. How is that possible?
A. Bug
B. Feature
C. Feature
D. Bug
Unexpected degradation
https://gist.github.com/vlsi -> 01_plan_flipper.sql
select *
from plan_flipper -- <- table
where skewed = 0 -- 1M rows
and non_skewed = 42 -- 20 rows
Unexpected degradation
https://gist.github.com/vlsi -> 01_plan_flipper.sql
0.1 ms   <- 1st execution
0.05 ms  <- 2nd execution
0.05 ms  <- 3rd execution
0.05 ms  <- 4th execution
0.05 ms  <- 5th execution
250 ms   <- 6th execution
Unexpected degradation
• Who is to blame?
• PostgreSQL may switch to a generic plan after 5
executions of a server-prepared statement
• What can we do about it?
• Add +0, OFFSET 0, and so on
• Pay attention to plan validation
• Discuss the phenomenon on pgsql-hackers
Unexpected degradation
https://gist.github.com/vlsi -> 01_plan_flipper.sql
We just use +0 to forbid index on a bad column
select *
from plan_flipper
where skewed+0 = 0   -- ~ /*+no_index*/
and non_skewed = 42
Explain explain explain explain
The rule of 6 explains:
prepare x(int) as select ...;
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 10 sec
Bugs everywhere
Decision problem
There’s a schema A with table X, and a schema B
with table X. What is the result of select * from
X?
A.X
B.X
C. Error
D. All of the above
Search_path
There’s a schema A with table X, and a schema B
with table X. What is the result of select * from
X?
• search_path determines the schema used
• server-prepared statements are not prepared for
search_path changes -> crazy things might happen
Search_path can go wrong
• 9.1 will just use old OIDs and execute the “previous”
query
• 9.2-9.5 might fail with the "cached plan must not
change result type" error
Search_path
What can we do about it?
• Keep search_path constant
• Discuss it in pgsql-hackers
• SET search_path + server-prepared statements =
"cached plan must not change result type"
• PL/pgSQL has exactly the same issue
To fetch or not to fetch
You are to fetch 1M rows 1KiB each, -Xmx128m
while (resultSet.next())
resultSet.getString(1);
A. No problem
B. OutOfMemory
C. Must use LIMIT/OFFSET
D. autoCommit(false)
To fetch or not to fetch
• PgJDBC fetches all rows by default
• To fetch in batches, you need
Statement.setFetchSize and
connection.setAutoCommit(false)
• Default value is configurable via
defaultRowFetchSize (9.4.1202+)
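The two calls named above can be combined as in the following sketch. The class and method names are invented; the SQL text and fetch size are whatever the caller passes. Setting defaultRowFetchSize on the connection would replace the per-statement setFetchSize call.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class BatchedFetch {
    // Reads the first column of every row while asking the driver
    // to fetch fetchSize rows at a time instead of the whole result.
    static long countRows(Connection con, String sql, int fetchSize) throws SQLException {
        con.setAutoCommit(false);        // batched fetch needs an open transaction
        long rows = 0;
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setFetchSize(fetchSize);  // rows per round trip
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rs.getString(1);
                    rows++;
                }
            }
        }
        con.commit();
        return rows;
    }
}
```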
fetchSize vs fetch time
2000 rows, select int4, int4, int4, int4 (time in ms, lower is faster)
fetchSize   time, ms
10          6.48
50          2.28
100         1.76
1000        1.04
2000        0.97
FetchSize is good for stability
Conclusion №2:
• For stability & performance reasons set
defaultRowFetchSize >= 100
Data upload
For data uploads, use
• INSERT() VALUES()
• INSERT() SELECT ?, ?, ?
• INSERT() VALUES() -> executeBatch
• INSERT() VALUES(), (), () -> executeBatch
• COPY
Healthy batch INSERT
PARSE S_1 as ...;
BIND/EXEC
BIND/EXEC
BIND/EXEC
BIND/EXEC
BIND/EXEC
...
DEALLOCATE
TCP strikes back
JDBC is busy with sending
queries, thus it has not started
fetching responses yet
DB cannot fetch more queries since it
is busy with sending responses
Batch INSERT in real life
PARSE S_1 as ...;
BIND/EXEC
BIND/EXEC
SYNC        <- flush & wait for the response
BIND/EXEC
BIND/EXEC
SYNC        <- flush & wait for the response
...
TCP deadlock avoidance
• PgJDBC adds SYNC to your nice batch operations
• The more SYNCs, the slower it performs
Horror stories
A single line patch makes insert batch 10 times faster:
https://github.com/pgjdbc/pgjdbc/pull/380
- static int QUERY_FORCE_DESCRIBE_PORTAL = 128;
+ static int QUERY_FORCE_DESCRIBE_PORTAL = 512;
...
// 128 has already been used
static int QUERY_DISALLOW_BATCHING = 128;
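Why a shared value hurts can be seen in a few lines. The sketch below only reuses the constant name and the two values from the patch; the flag check itself is an assumption about how such a bit mask is typically tested, not a copy of the driver's code.

```java
class FlagCollision {
    static final int QUERY_DISALLOW_BATCHING = 128;

    // True when the "disallow batching" bit is set in the given flags.
    static boolean batchingDisallowed(int flags) {
        return (flags & QUERY_DISALLOW_BATCHING) != 0;
    }

    public static void main(String[] args) {
        int before = 128; // old QUERY_FORCE_DESCRIBE_PORTAL: same bit as QUERY_DISALLOW_BATCHING
        int after = 512;  // patched value: a bit of its own
        System.out.println(batchingDisallowed(before));
        System.out.println(batchingDisallowed(after));
    }
}
```

With the old value, asking for a portal description also looked like a request to disallow batching, which is consistent with the slowdown the slide describes.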
Trust but always measure
• Java 1.8u40+
• Core i7 2.6 GHz
• Java Microbenchmark Harness (JMH)
• PostgreSQL 9.5
Queries under test: INSERT
pgjdbc/ubenchmark/InsertBatch.java
insert into batch_perf_test(a, b, c)
values(?, ?, ?)
Queries under test: INSERT
pgjdbc/ubenchmark/InsertBatch.java
insert into batch_perf_test(a, b, c)
values (?, ?, ?), (?, ?, ?), (?, ?, ?),
(?, ?, ?), (?, ?, ?), (?, ?, ?),
(?, ?, ?), (?, ?, ?), (?, ?, ?), ...;
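A multi-row statement like the one above is usually assembled by a small helper. The sketch below is one possible version with an invented class name; it hard-codes the table and the three columns from the slide.

```java
class MultiValuesInsert {
    // Builds "insert into batch_perf_test(a, b, c) values (?, ?, ?), (?, ?, ?), ..."
    // with one "(?, ?, ?)" group per row.
    static String sql(int rows) {
        if (rows < 1) {
            throw new IllegalArgumentException("rows must be >= 1");
        }
        StringBuilder sb = new StringBuilder("insert into batch_perf_test(a, b, c) values ");
        for (int i = 0; i < rows; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append("(?, ?, ?)");
        }
        return sb.toString();
    }
}
```

Since generated query text defeats statement caching, it seems sensible to build the string once per batch size and reuse it rather than regenerate it for every batch.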
Queries under test: COPY
pgjdbc/ubenchmark/InsertBatch.java
COPY batch_perf_test FROM STDIN
1 s1 1
2 s2 2
3 s3 3
...
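The rows fed to COPY ... FROM STDIN in its default text format are tab-separated lines. A minimal sketch of producing them follows; the class is invented for illustration, and it only covers the common escapes (backslash, tab, newline, carriage return) plus \N for NULL. In PgJDBC the resulting text would typically be passed to the driver's COPY support (org.postgresql.copy.CopyManager), which is not shown here.

```java
class CopyTextFormat {
    // Escapes one field for COPY's default text format; null becomes \N.
    static String field(String value) {
        if (value == null) {
            return "\\N";
        }
        return value.replace("\\", "\\\\") // backslash first, so later escapes are not doubled
                    .replace("\t", "\\t")
                    .replace("\n", "\\n")
                    .replace("\r", "\\r");
    }

    // Joins the fields with tabs and terminates the row with a newline.
    static String row(Object... fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append('\t');
            }
            sb.append(field(fields[i] == null ? null : fields[i].toString()));
        }
        return sb.append('\n').toString();
    }
}
```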
Queries under test: hand-made structs
pgjdbc/ubenchmark/InsertBatch.java
insert into batch_perf_test
select * from
unnest('{"(1,s1,1)","(2,s2,2)",
"(3,s3,3)"}'::batch_perf_test[])
You’d better use batch, your C.O.
[Chart: time in ms (lower is faster) vs. the number of inserted rows (16, 128, 1024) for Insert, Batch, Struct, and Copy; columns int4, varchar, int4; y-axis 0 to 150 ms; data labels 2, 16, 128]
COPY is good
[Chart: time in ms (lower is faster) vs. the number of inserted rows (16, 128, 1024) for Batch, Struct, and Copy; columns int4, varchar, int4; y-axis 0 to 2.5 ms]
COPY is bad for small batches
[Chart: time in ms to insert 1024 rows (lower is faster) vs. batch size in rows (1, 4, 8, 16, 128) for Batch, Struct, and Copy; y-axis 0 to 25 ms]
Final thoughts
• PreparedStatement is our hero
• Remember to EXPLAIN ANALYZE at least six
times, a blue moon is a plus
• Don’t forget +0 and OFFSET 0
Questions?
Vladimir Sitnikov,
PgConf 2016
