
High-Performance JDBC Voxxed Bucharest 2016

The High-Performance JDBC presentation held at Voxxed Bucharest 2016.

Published in: Software

  1. 1. High-Performance JDBC VLAD MIHALCEA
  2. 2. About me • @Hibernate Developer • vladmihalcea.com • @vlad_mihalcea • vladmihalcea
  3. 3. Performance Facts “More than half of application performance bottlenecks originate in the database” AppDynamics - http://www.appdynamics.com/database/
  4. 4. Data access layers
  5. 5. Poor man’s JDBC • High response time • Low throughput Photo by Amit Patel CC BY 2.0 https://www.flickr.com/photos/amitp/6069412747/
  6. 6. State of the art JDBC • Low response time • High throughput Photo by zoetnet CC BY 2.0 https://www.flickr.com/photos/zoetnet/14288129197/
  7. 7. Response time • connection acquisition time • statements submission time • statements execution time • result set fetching time • idle time prior to releasing the database connection $T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$
  8. 8. Connection management $T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$
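As a hedged illustration of the response time formula, the sketch below sums the five components for a single transaction. The class name `ResponseTime` and the parameter names are assumptions made for this example; they do not come from the presentation.

```java
import java.time.Duration;

// Illustrative helper (hypothetical name): adds up the five response time components.
final class ResponseTime {

    private ResponseTime() {
    }

    // T = t_acq + t_req + t_exec + t_res + t_idle
    static Duration total(Duration acquisition, Duration submission, Duration execution,
                          Duration fetching, Duration idle) {
        return acquisition.plus(submission).plus(execution).plus(fetching).plus(idle);
    }
}
```

Every later slide targets one of these terms, so measuring them separately shows which optimization is worth applying first.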
  9. 9. Connection acquisition overhead
      Metric  DB_A (ms)  DB_B (ms)  DB_C (ms)  DB_D (ms)  HikariCP (ms)
      min        11.174      5.441     24.468      0.860       0.001230
      max       129.400     26.110     74.634     74.313       1.014051
      mean       13.829      6.477     28.910      1.590       0.003458
      p99        20.432      9.944     54.952      3.022       0.010263
  10. 10. Connection pooling • Logical vs physical connections • Lease vs create • Release vs close
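The lease/release idea can be sketched with a toy generic pool. This is a minimal illustration under stated assumptions, not how HikariCP or any other real pool is implemented: the class `ToyPool`, its method names, and the timeout handling are all hypothetical. It performs no connection validation, and a factory failure would leak a reserved slot.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Toy pool (hypothetical): "physical" resources are created lazily up to maxSize, then reused.
final class ToyPool<T> {
    private final BlockingQueue<T> idle;
    private final Supplier<T> physicalFactory;
    private final int maxSize;
    private final AtomicInteger created = new AtomicInteger();

    ToyPool(int maxSize, Supplier<T> physicalFactory) {
        this.maxSize = maxSize;
        this.idle = new ArrayBlockingQueue<>(maxSize);
        this.physicalFactory = physicalFactory;
    }

    // Lease: reuse an idle resource, create one if below maxSize, otherwise wait.
    T lease(long timeoutMillis) throws InterruptedException {
        T resource = idle.poll();
        if (resource != null) {
            return resource;
        }
        if (tryReserveSlot()) {
            return physicalFactory.get();
        }
        resource = idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (resource == null) {
            throw new IllegalStateException("Pool exhausted");
        }
        return resource;
    }

    // Release: hand the resource back to the pool instead of closing it.
    void release(T resource) {
        idle.offer(resource);
    }

    // Number of physical resources created so far.
    int physicalCount() {
        return created.get();
    }

    private boolean tryReserveSlot() {
        while (true) {
            int current = created.get();
            if (current >= maxSize) {
                return false;
            }
            if (created.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }
}
```

With a JDBC `Connection` as `T`, `lease` replaces the expensive physical connection creation and `release` replaces `close`.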
  11. 11. Connection pool sizing
  12. 12. FlexyPool • Java EE • Bitronix / Atomikos • Apache DBCP / DBCP2 • C3P0 • HikariCP • Tomcat CP • Vibur DBCP https://github.com/vladmihalcea/flexy-pool
  13. 13. FlexyPool • concurrent connections histogram • concurrent connection requests histogram • connection acquisition time histogram • connection lease time histogram • maximum pool size histogram • retry attempts histogram https://github.com/vladmihalcea/flexy-pool
  14. 14. FlexyPool – Concurrent connection requests [Chart: connection requests (max, mean, p50, p95, p99) vs. sample time (index × 15 s)]
  15. 15. FlexyPool – Pool size growth [Chart: max pool size (max, mean, p50, p95, p99) vs. sample time (index × 15 s)]
  16. 16. FlexyPool – Connection acquisition time [Chart: connection acquisition time in ms (max, mean, p50, p95, p99) vs. sample time (index × 15 s)]
  17. 17. FlexyPool – Connection lease time [Chart: connection lease time in ms (max, mean, p50, p95, p99) vs. sample time (index × 15 s)]
  18. 18. Statement Batching
      statement.addBatch(
          "INSERT INTO post (title, version, id) " +
          "VALUES ('Post no. 1', 0, 1)");
      statement.addBatch(
          "INSERT INTO post_comment (post_id, review, version, id) " +
          "VALUES (1, 'Post comment 1.1', 0, 1)");
      int[] updateCounts = statement.executeBatch();
      $T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$
  19. 19. Statement Batching (5k rows) [Chart: time (ms) vs. batch size (1 to 1000) for DB_A, DB_B, DB_C, DB_D]
  20. 20. Oracle Statement batching • For Statement and CallableStatement, the Oracle JDBC Driver doesn’t actually support batching, each statement being executed separately.
  21. 21. MySQL Statement batching • By default, the MySQL JDBC driver doesn’t send the batched statements in a single request. • The rewriteBatchedStatements connection property adds all batched statements to a String buffer.
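A small sketch of enabling the property through the JDBC URL follows. `rewriteBatchedStatements` is the Connector/J property named on the slide; the helper class `MySqlUrls` and the example URL in the usage note are assumptions for illustration.

```java
// Hypothetical helper: appends rewriteBatchedStatements=true to a MySQL JDBC URL.
final class MySqlUrls {

    private MySqlUrls() {
    }

    static String withRewriteBatchedStatements(String baseUrl) {
        // Use '&' when the URL already carries query parameters, '?' otherwise.
        String separator = baseUrl.contains("?") ? "&" : "?";
        return baseUrl + separator + "rewriteBatchedStatements=true";
    }
}
```

For example, `MySqlUrls.withRewriteBatchedStatements("jdbc:mysql://localhost:3306/test")` would yield a URL that can be passed to `DriverManager.getConnection`.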
  22. 22. Batch PreparedStatements PreparedStatement postStatement = connection.prepareStatement( "INSERT INTO Post (title, version, id) VALUES (?, ?, ?)"); postStatement.setString(1, String.format("Post no. %1$d", 1)); postStatement.setInt(2, 0); postStatement.setLong(3, 1); postStatement.addBatch(); postStatement.setString(1, String.format("Post no. %1$d", 2)); postStatement.setInt(2, 0); postStatement.setLong(3, 2); postStatement.addBatch(); int[] updateCounts = postStatement.executeBatch();
  23. 23. Batch PreparedStatements • SQL Injection Prevention • Better performance • Hibernate can batch statements automatically
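The two-row PreparedStatement example generalizes to a loop that flushes the batch every `batchSize` rows. The sketch below is one possible shape, assuming the statement was prepared with the `INSERT INTO Post (title, version, id) VALUES (?, ?, ?)` SQL from the previous slide; the class name `PostBatchInserter`, the method name, and the sequential id assignment are hypothetical.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper: inserts one Post row per title, flushing every batchSize rows.
final class PostBatchInserter {

    private PostBatchInserter() {
    }

    // Returns the number of executeBatch() calls that were issued.
    static int insertTitles(PreparedStatement postStatement, List<String> titles, int batchSize)
            throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int flushes = 0;
        int pending = 0;
        long id = 1; // assumption: ids are assigned sequentially starting at 1
        for (String title : titles) {
            postStatement.setString(1, title);
            postStatement.setInt(2, 0);
            postStatement.setLong(3, id++);
            postStatement.addBatch();
            if (++pending == batchSize) {
                postStatement.executeBatch();
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) {
            // Flush the last, partially filled batch.
            postStatement.executeBatch();
            flushes++;
        }
        return flushes;
    }
}
```

Flushing periodically bounds the memory held by the pending batch while still sending rows in groups rather than one statement at a time.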
  24. 24. Insert PreparedStatement batching (5k rows) [Chart: time (ms) vs. batch size (1 to 1000) for DB_A, DB_B, DB_C, DB_D]
  25. 25. Update PreparedStatement batching (5k rows) [Chart: time (ms) vs. batch size (1 to 1000) for DB_A, DB_B, DB_C, DB_D]
  26. 26. Delete PreparedStatement batching (5k rows) [Chart: time (ms) vs. batch size (1 to 1000) for DB_A, DB_B, DB_C, DB_D]
  27. 27. Statement caching $T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$
  28. 28. Statement caching gain (one minute interval)
      Database system  No caching throughput (SPM)  Caching throughput (SPM)  Percentage gain
      DB_A             419 833                      507 286                   20.83%
      DB_B             194 837                      303 100                   55.56%
      DB_C             116 708                      166 443                   42.61%
      DB_D              15 522                       15 550                    0.18%
  29. 29. Oracle server-side statement caching • Hard parse • Soft parse • Bind peeking • Adaptive cursor sharing (since 11g)
  30. 30. SQL Server server-side statement caching • Execution plan cache • Parameter sniffing • Force recompile SELECT * FROM task WHERE status = ? OPTION(RECOMPILE);
  31. 31. PostgreSQL server-side statement caching • Prior to 9.2 – execution plan caching • 9.2 – optimization and planning are deferred • The prepareThreshold connection property
  32. 32. MySQL server-side statement caching • No execution plan cache • Since Connector/J 5.0.5, PreparedStatements are emulated by default • To activate server-side prepared statements: • useServerPrepStmts • cachePrepStmts
  33. 33. Client-side statement caching • Recycling Statement, PreparedStatement or CallableStatement objects • Reusing database cursors
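Recycling statement objects amounts to keeping a bounded, per-connection map from SQL text to an already prepared statement. The following is a minimal LRU sketch of that idea, not the implementation used by any particular driver: `LruStatementCache`, the eviction callback, and the generic statement type `S` are assumptions. It is not thread-safe, on the assumption that a connection is used by one thread at a time.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical connection-level cache: maps SQL text to a reusable statement object.
final class LruStatementCache<S> {
    private final Map<String, S> cache;

    LruStatementCache(int maxSize, Consumer<S> onEvict) {
        // accessOrder = true makes iteration order least-recently-used first.
        this.cache = new LinkedHashMap<String, S>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, S> eldest) {
                boolean evict = size() > maxSize;
                if (evict) {
                    // Give the caller a chance to close the evicted statement.
                    onEvict.accept(eldest.getValue());
                }
                return evict;
            }
        };
    }

    // Returns the cached statement for this SQL, preparing it only on a cache miss.
    S getOrPrepare(String sql, Function<String, S> prepare) {
        S statement = cache.get(sql);
        if (statement == null) {
            statement = prepare.apply(sql);
            cache.put(sql, statement);
        }
        return statement;
    }

    int size() {
        return cache.size();
    }
}
```

With JDBC, `S` would be `PreparedStatement`, `prepare` would wrap `connection.prepareStatement(sql)`, and `onEvict` would close the evicted statement.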
  34. 34. Oracle implicit client-side statement caching • Connection-level cache • PreparedStatement and CallableStatement only connectionProperties.put( "oracle.jdbc.implicitStatementCacheSize", Integer.toString(cacheSize) ); dataSource.setConnectionProperties( connectionProperties );
  35. 35. Oracle implicit client-side statement caching • Can be disabled on a per statement basis if (statement.isPoolable()) { statement.setPoolable(false); }
  36. 36. Oracle explicit client-side statement caching • Caches both metadata and execution state with data OracleConnection oracleConnection = (OracleConnection) connection; oracleConnection.setExplicitCachingEnabled(true); oracleConnection.setStatementCacheSize(cacheSize);
  37. 37. Oracle explicit client-side statement caching • Vendor-specific API PreparedStatement statement = oracleConnection. getStatementWithKey(SELECT_POST_KEY); if (statement == null) statement = connection.prepareStatement(SELECT_POST); try { statement.setInt(1, 10); statement.execute(); } finally { ((OraclePreparedStatement) statement). closeWithKey(SELECT_POST_KEY); }
  38. 38. SQL Server client-side statement caching • Microsoft JDBC Driver 4.2 disableStatementPooling • jTDS 1.3.1 – JDBC 3.0 JtdsDataSource jtdsDataSource = (JtdsDataSource) dataSource; jtdsDataSource.setMaxStatements(cacheSize);
  39. 39. PostgreSQL Server client-side statement caching • PostgreSQL JDBC Driver 9.4-1202 makes client-side statement caching connection-bound instead of statement-bound • Configurable: • preparedStatementCacheQueries (default is 256) • preparedStatementCacheSizeMiB (default is 5MB) • Statement.setPoolable(false) is not supported
  40. 40. MySQL Server client-side statement caching • Configurable: • cachePrepStmts (default is false) Required for server-side statement caching as well • prepStmtCacheSize (default is 25) • prepStmtCacheSqlLimit (default is 256) • Statement.setPoolable(false) works for client-side statements only
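The MySQL properties listed on the slide can be collected into a `java.util.Properties` object and passed to the driver. This is a configuration sketch only: the class name is hypothetical, and the cache size values (250 and 2048) are illustrative choices rather than recommendations from the presentation.

```java
import java.util.Properties;

// Hypothetical helper: Connector/J statement caching settings.
final class MySqlStatementCacheConfig {

    private MySqlStatementCacheConfig() {
    }

    static Properties statementCachingProperties() {
        Properties props = new Properties();
        // Use server-side prepared statements instead of client-side emulation.
        props.setProperty("useServerPrepStmts", "true");
        // Enable the statement cache (the default is false).
        props.setProperty("cachePrepStmts", "true");
        // Illustrative values; the defaults are 25 statements and 256 characters.
        props.setProperty("prepStmtCacheSize", "250");
        props.setProperty("prepStmtCacheSqlLimit", "2048");
        return props;
    }
}
```

Add the `user` and `password` entries to the same object before calling `DriverManager.getConnection(url, props)`.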
  41. 41. ResultSet fetch size • ResultSet - application-level cursor $T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$ statement.setFetchSize(fetchSize);
  42. 42. Oracle ResultSet fetch size • Default fetch size is 10 • Oracle 10g and 11g JDBC Driver maximum ResultSet size memory preallocation • VARCHAR2(4000) – allocates 8000 bytes (even for 1 character) • Memory buffers are recycled only when using Statement caching • Oracle 12c allocates memory on demand • VARCHAR2(4000) – 15 bytes + the actual row column size
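To see why the fetch size matters, it helps to estimate how many database round trips a result set needs. The arithmetic below assumes the driver fetches exactly `fetchSize` rows per round trip and ignores any extra call a driver might make to detect the end of the result set; `FetchSizeMath` is a hypothetical name.

```java
// Hypothetical helper: estimates round trips needed to read a result set.
final class FetchSizeMath {

    private FetchSizeMath() {
    }

    // Ceiling division: rowCount rows read in chunks of fetchSize rows.
    static long roundTrips(long rowCount, int fetchSize) {
        if (rowCount < 0 || fetchSize <= 0) {
            throw new IllegalArgumentException("rowCount must be >= 0 and fetchSize > 0");
        }
        return (rowCount + fetchSize - 1) / fetchSize;
    }
}
```

Under these assumptions, reading 10,000 rows with a fetch size of 10 takes 1,000 round trips, while a fetch size of 1,000 takes only 10.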
  43. 43. SQL Server ResultSet fetch size • Adaptive buffering • Only for the default read-only and forward-only ResultSet • Updatable cursors use fixed data blocks
  44. 44. PostgreSQL ResultSet fetch size • Fetch all – one database roundtrip • Custom fetch size – database cursor
  45. 45. MySQL ResultSet fetch size • Fetch all – one database roundtrip • Streaming – only one record at a time
  46. 46. ResultSet fetch size (10k rows) [Chart: time (ms) vs. fetch size (1 to 10000) for DB_A, DB_B, DB_C, DB_D]
  47. 47. ResultSet size • Avoid fetching data that is not required • Hibernate addresses the max-size vendor-specific SQL statement syntax
  48. 48. SQL:2008 ResultSet size limit • Oracle 12c, SQL Server 2012 and PostgreSQL 8.4 SELECT pc.id AS pc_id, p.title AS p_title FROM post_comment pc INNER JOIN post p ON p.id = pc.post_id ORDER BY pc_id OFFSET ? ROWS FETCH FIRST (?) ROWS ONLY;
  49. 49. Oracle ResultSet size limit SELECT * FROM ( SELECT pc.id AS pc_id, p.title AS p_title FROM post_comment pc INNER JOIN post p ON p.id = pc.post_id ORDER BY pc_id ) WHERE ROWNUM <= ?
  50. 50. SQL Server ResultSet size limit SELECT TOP (?) pc.id AS pc_id, p.title AS p_title FROM post_comment pc INNER JOIN post p ON p.id = pc.post_id ORDER BY pc_id
  51. 51. PostgreSQL and MySQL ResultSet size limit SELECT pc.id AS pc_id, p.title AS p_title FROM post_comment pc INNER JOIN post p ON p.id = pc.post_id ORDER BY pc_id LIMIT ?
  52. 52. Statement max rows • Vendor-independent syntax • Might not influence the execution plan • According to the documentation: “If the limit is exceeded, the excess rows are silently dropped.” statement.setMaxRows(maxRows);
  53. 53. Max size: 1 million vs 100 rows [Chart: time (ms) for fetch all, fetch max rows, and fetch limit on DB_A, DB_B, DB_C, DB_D]
  54. 54. Fetching too many columns • Fetching all columns (ORM tools) SELECT * FROM post_comment pc INNER JOIN post p ON p.id = pc.post_id INNER JOIN post_details pd ON p.id = pd.id
  55. 55. Fetching too many columns • Fetching a custom SQL projection SELECT pc.version FROM post_comment pc INNER JOIN post p ON p.id = pc.post_id INNER JOIN post_details pd ON p.id = pd.id
  56. 56. Fetching too many columns performance impact [Chart: time (ms) for all columns vs. custom projection on DB_A, DB_B, DB_C, DB_D]
  57. 57. Processing Logic • Hibernate defers connection acquisition • Release connection as soon as possible $T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$
  58. 58. Questions and Answers $T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$ • Response time • Connection management • Batch updates • Statement caching • ResultSet fetching • https://leanpub.com/high-performance-java-persistence
