High Performance Hibernate JavaZone 2016

The High-Performance Hibernate presentation from JavaZone 2016

Published in: Software

  1. 1. High-Performance Hibernate VLAD MIHALCEA
  2. 2. About me • @Hibernate Developer • vladmihalcea.com • @vlad_mihalcea • vladmihalcea
  3. 3. Agenda • Performance and Scaling • Connection providers • Identifier generators • Relationships • Batching • Fetching • Caching
  4. 4. Performance Facts “More than half of application performance bottlenecks originate in the database” AppDynamics - http://www.appdynamics.com/database/
  5. 5. Google Ranking “Like us, our users place a lot of value in speed — that's why we've decided to take site speed into account in our search rankings.” https://webmasters.googleblog.com/2010/04/using-site-speed-in-web-search-ranking.html
  6. 6. Performance and Revenue “It has been reported that every 100ms of latency costs Amazon 1% of profit.” http://radar.oreilly.com/2008/08/radar-theme-web-ops.html
  7. 7. Response Time and Throughput • n - number of completed transactions • t - time interval $T_{avg} = \frac{t}{n} = \frac{1\,\text{s}}{100} = 10\,\text{ms}$ $X = \frac{n}{t} = \frac{100}{1\,\text{s}} = 100\,\text{TPS}$
  8. 8. Response Time and Throughput $X = \frac{1}{T_{avg}}$ “The lower the Response Time, The higher the Throughput”
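The two formulas above can be checked with a few lines of arithmetic. The following is only a minimal sketch; the class and method names are made up for illustration and do not come from the presentation.

```java
final class ThroughputSketch {

    /** Average response time in milliseconds: T_avg = t / n. */
    public static double averageResponseTimeMillis(double intervalSeconds, long completedTransactions) {
        return intervalSeconds * 1000.0 / completedTransactions;
    }

    /** Throughput in transactions per second: X = n / t. */
    public static double throughputTps(double intervalSeconds, long completedTransactions) {
        return completedTransactions / intervalSeconds;
    }

    public static void main(String[] args) {
        double t = 1.0; // time interval, in seconds
        long n = 100;   // completed transactions in that interval
        System.out.println(averageResponseTimeMillis(t, n) + " ms");
        System.out.println(throughputTps(t, n) + " TPS");
    }
}
```

With the slide's numbers (100 transactions in 1 s) this gives 10 ms and 100 TPS, and halving the average response time doubles the throughput.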
  9. 9. The anatomy of a database transaction
  10. 10. Response Time • connection acquisition time • statement submit time • statement execution time • result set fetching time • idle time prior to releasing database connection $T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$
  11. 11. Agenda • Performance and Scaling • Connection providers • Identifier generators • Relationships • Batching • Fetching • Caching
  12. 12. Connection Management $T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$
      Metric   DB_A (ms)   DB_B (ms)   DB_C (ms)   DB_D (ms)   HikariCP (ms)
      min      11.174      5.441       24.468      0.860       0.001230
      max      129.400     26.110      74.634      74.313      1.014051
      mean     13.829      6.477       28.910      1.590       0.003458
      p99      20.432      9.944       54.952      3.022       0.010263
  13. 13. Connection Providers
  14. 14. DataSourceConnectionProvider
  15. 15. Connection Provisioning
  16. 16. FlexyPool Metrics: • concurrent connections • concurrent connection requests • connection acquisition time • connection lease time histogram • maximum pool size • overflow pool size • retry attempts • total connection acquisition time Supported environments and pools: • Java EE • Bitronix / Atomikos • Apache DBCP / DBCP2 • C3P0 • BoneCP • HikariCP • Tomcat CP • Vibur DBCP https://github.com/vladmihalcea/flexy-pool
  17. 17. FlexyPool – Concurrent connection requests [Chart: connection requests (max, mean, p50, p95, p99) over sample time (index × 15 s)]
  18. 18. FlexyPool – Pool size growth [Chart: max pool size (max, mean, p50, p95, p99) over sample time (index × 15 s)]
  19. 19. FlexyPool – Connection acquisition time [Chart: connection acquisition time in ms (max, mean, p50, p95, p99) over sample time (index × 15 s)]
  20. 20. FlexyPool – Connection lease time [Chart: connection lease time in ms (max, mean, p50, p95, p99) over sample time (index × 15 s)]
  21. 21. Agenda • Performance and Scaling • Connection providers • Identifier generators • Relationships • Batching • Fetching • Caching
  22. 22. JPA Identifier Generators $T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$ • IDENTITY • SEQUENCE • TABLE • AUTO
  23. 23. IDENTITY • In Hibernate, IDENTITY generator disables JDBC batch inserts • MySQL 5.7 does not offer support for database SEQUENCE
  24. 24. SEQUENCE • Oracle, PostgreSQL, and even SQL Server 2012 • May use roundtrip optimizers: hi/lo, pooled, pooled-lo • By default, Hibernate 5 uses the enhanced sequence generators <property name="hibernate.id.new_generator_mappings" value="true"/>
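To see why the roundtrip optimizers help, here is a simplified block allocator written in plain Java. It is a sketch of the general idea only (one sequence call reserves a whole block of identifiers), not Hibernate's hi/lo, pooled, or pooled-lo implementation; the class name and the in-memory counter standing in for the database sequence are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Simplified block allocator in the spirit of the sequence optimizers. */
final class BlockIdAllocatorSketch {

    // Stands in for the database sequence (hypothetical, in-memory only).
    private final AtomicLong databaseSequence = new AtomicLong();
    private final int incrementSize;
    private long nextId;     // next identifier to hand out
    private long blockEnd;   // exclusive upper bound of the current block
    private int roundTrips;  // number of simulated sequence calls

    public BlockIdAllocatorSketch(int incrementSize) {
        this.incrementSize = incrementSize;
    }

    public long nextId() {
        if (nextId == blockEnd) {
            // One simulated round trip reserves incrementSize identifiers.
            long blockStart = databaseSequence.getAndAdd(incrementSize) + 1;
            roundTrips++;
            nextId = blockStart;
            blockEnd = blockStart + incrementSize;
        }
        return nextId++;
    }

    public int roundTrips() {
        return roundTrips;
    }
}
```

Allocating 50 identifiers with an increment size of 10 costs 5 simulated sequence calls in this sketch, whereas an increment size of 1 costs 50.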
  25. 25. SEQUENCE - Pooled optimizer (50 rows) [Chart: time (ms) by sequence increment size (1, 5, 10, 50)]
  26. 26. TABLE • Uses row-level locks and a separate transaction/connection • May use roundtrip optimizers: hi/lo, pooled, pooled-lo • By default, Hibernate 5 uses the enhanced sequence generators <property name="hibernate.id.new_generator_mappings" value="true"/>
  27. 27. TABLE - Pooled optimizer (50 rows) [Chart: time (ms) by table increment size (1, 5, 10, 50)]
  28. 28. IDENTITY vs TABLE (100 rows) • IDENTITY makes no use of batch inserts • TABLE generator using a pooled optimizer with an increment size of 100
  29. 29. IDENTITY vs TABLE (100 rows) [Chart: time (ms) by thread count (1, 2, 4, 8, 16); series: Identity, Table]
  30. 30. AUTO: IDENTITY vs TABLE? • Prior to Hibernate 5, AUTO would resolve to IDENTITY if the database supports such a feature • Hibernate 5 uses TABLE generator if the database does not support sequences
  31. 31. SEQUENCE vs TABLE (100 rows) • Both benefiting from JDBC batch inserts • Both using a pooled optimizer with an increment size of 100
  32. 32. SEQUENCE vs TABLE (100 rows) [Chart: time (ms) by thread count (1, 2, 4, 8, 16); series: Sequence, Table]
  33. 33. Agenda • Performance and Scaling • Connection providers • Identifier generators • Relationships • Batching • Fetching • Caching
  34. 34. Relationships $T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$
  35. 35. Agenda • Performance and Scaling • Connection providers • Identifier generators • Relationships • Batching • Fetching • Caching
  36. 36. Batching $T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$ • SessionFactory setting • Session-level configuration since Hibernate 5.2
  37. 37. Batching - SessionFactory <property name="hibernate.jdbc.batch_size" value="5"/> • Switching from non-batching to batching
  38. 38. Batching - Session
      // entityCount and batchSize are defined by the surrounding test
      doInJPA( this::entityManagerFactory, entityManager -> {
          entityManager.unwrap( Session.class )
              .setJdbcBatchSize( 10 );
          for ( long i = 0; i < entityCount; ++i ) {
              // flush and clear once a full batch has been persisted
              if ( i > 0 && i % batchSize == 0 ) {
                  entityManager.flush();
                  entityManager.clear();
              }
              Person person = new Person( i, String.format( "Person %d", i ) );
              entityManager.persist( person );
          }
      } );
  39. 39. Batching DEBUG [main]: n.t.d.l.SLF4JQueryLoggingListener – Name:DATA_SOURCE_PROXY, Time:1, Success:True, Type:Prepared, Batch:True, QuerySize:1, BatchSize:10, Query: ["insert into Person (name, id) values (?, ?)"], Params:[ (Person 1, 1), (Person 2, 2), (Person 3, 3), (Person 4, 4), (Person 5, 5), (Person 6, 6), (Person 7, 7), (Person 8, 8), (Person 9, 9), (Person 10, 10) ]
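At the JDBC level, a batched insert like the one in the log above corresponds roughly to the `addBatch()` / `executeBatch()` pattern. The sketch below uses only the standard `java.sql` API; the class name, the `batchCount` helper, and the way identifiers are assigned are assumptions made for illustration, not code from the presentation.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class JdbcBatchInsertSketch {

    /** Number of executeBatch() calls needed for the given row count. */
    public static int batchCount(int rows, int batchSize) {
        return (rows + batchSize - 1) / batchSize;
    }

    /** Inserts the names with ids 1..n and returns the number of batches sent. */
    public static int insertPersons(Connection connection, List<String> names, int batchSize)
            throws SQLException {
        int batches = 0;
        try (PreparedStatement statement = connection.prepareStatement(
                "insert into Person (name, id) values (?, ?)")) {
            int pending = 0;
            long id = 1;
            for (String name : names) {
                statement.setString(1, name);
                statement.setLong(2, id++);
                statement.addBatch();            // queue the row instead of sending it
                if (++pending == batchSize) {
                    statement.executeBatch();    // one round trip for the whole batch
                    pending = 0;
                    batches++;
                }
            }
            if (pending > 0) {
                statement.executeBatch();        // send the final, partial batch
                batches++;
            }
        }
        return batches;
    }
}
```

For the 5k-row benchmarks, a batch size of 10 means 500 batch executions instead of 5,000 individual statements, which is where the round-trip savings come from.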
  40. 40. Insert PreparedStatement batching (5k rows) [Chart: time (ms) by batch size (1 to 1000); series: DB_A, DB_B, DB_C, DB_D]
  41. 41. Update PreparedStatement batching (5k rows) [Chart: time (ms) by batch size (1 to 1000); series: DB_A, DB_B, DB_C, DB_D]
  42. 42. Delete PreparedStatement batching (5k rows) [Chart: time (ms) by batch size (1 to 1000); series: DB_A, DB_B, DB_C, DB_D]
  43. 43. Batching - Cascading <property name="hibernate.order_inserts" value="true"/> <property name="hibernate.order_updates" value="true"/>
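Statement ordering matters because a JDBC batch can only group executions of the same statement: when parent and child inserts alternate, every switch ends the current batch. The following plain-Java sketch counts batches for an interleaved and a grouped statement list. It is a simplified model under stated assumptions (the alphabetical sort merely groups identical statements and is not how Hibernate orders inserts), and all names are illustrative.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class StatementOrderingSketch {

    /** Counts batches: a batch ends when the SQL changes or batchSize is reached. */
    public static int countBatches(List<String> statements, int batchSize) {
        int batches = 0;
        int currentSize = 0;
        String current = null;
        for (String sql : statements) {
            if (!sql.equals(current) || currentSize == batchSize) {
                batches++;          // start a new batch
                current = sql;
                currentSize = 0;
            }
            currentSize++;
        }
        return batches;
    }

    /** Groups identical statements together; a stand-in for statement ordering. */
    public static List<String> ordered(List<String> statements) {
        List<String> copy = new ArrayList<>(statements);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }
}
```

With three posts that each cascade to one comment, the interleaved list produces six single-statement batches, while the grouped list produces two.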
  44. 44. Batching – @Version <property name="hibernate.jdbc.batch_versioned_data" value="true"/> • Enabled by default in Hibernate 5 • Disabled in Hibernate 3.x, 4.x, and for Oracle 8i, 9i, and 10g dialects
  45. 45. Agenda • Performance and Scaling • Connection providers • Identifier generators • Relationships • Batching • Fetching • Caching
  46. 46. Fetching $T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$ • JDBC fetch size • JDBC ResultSet size • DTO vs Entity queries • Fetching relationships
  47. 47. Fetching – JDBC Fetch Size • Oracle – Default fetch size is 10 • SQL Server – Adaptive buffering • PostgreSQL, MySQL – Fetch the whole ResultSet at once • SessionFactory setting: <property name="hibernate.jdbc.fetch_size" value="100"/>
  48. 48. Fetching - JDBC fetch size • Query-level hint:
      List<PostCommentSummary> summaries = entityManager.createQuery(
              "select new PostCommentSummary( " +
              " p.id, p.title, c.review ) " +
              "from PostComment c " +
              "join c.post p")
          .setHint(QueryHints.HINT_FETCH_SIZE, fetchSize)
          .getResultList();
  49. 49. Fetching – JDBC Fetch Size (10k rows) [Chart: time (ms) by fetch size (1, 10, 100, 1000, 10000); series: DB_A, DB_B, DB_C, DB_D]
  50. 50. Fetching – Pagination • JPA / Hibernate API works for both entity and native queries
      List<PostCommentSummary> summaries = entityManager.createQuery(
              "select new PostCommentSummary( " +
              " p.id, p.title, c.review ) " +
              "from PostComment c " +
              "join c.post p")
          .setFirstResult(pageStart)
          .setMaxResults(pageSize)
          .getResultList();
  51. 51. Fetching – 100k vs 100 rows [Chart: time (ms) for "Fetch all" vs "Fetch limit"; series: DB_A, DB_B, DB_C, DB_D]
  52. 52. Fetching – Pagination • Hibernate uses OFFSET pagination • Keyset pagination scales better when navigating large result sets • http://use-the-index-luke.com/no-offset
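The difference between OFFSET and keyset pagination can be illustrated without a database. In the sketch below a `TreeMap` stands in for an index on the primary key: the offset variant has to walk past all skipped entries, while the keyset variant seeks straight to the first key after the last one seen. The class and method names are hypothetical and the model ignores everything a real query planner does.

```java
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class PaginationSketch {

    /** OFFSET pagination: walks past `offset` entries before collecting the page. */
    public static List<String> offsetPage(NavigableMap<Long, String> index, int offset, int pageSize) {
        return index.values().stream()
                .skip(offset)
                .limit(pageSize)
                .collect(Collectors.toList());
    }

    /** Keyset pagination: seeks directly to the first key greater than lastSeenId. */
    public static List<String> keysetPage(NavigableMap<Long, String> index, long lastSeenId, int pageSize) {
        return index.tailMap(lastSeenId, false).values().stream()
                .limit(pageSize)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> index = new TreeMap<>();
        for (long id = 1; id <= 100; id++) {
            index.put(id, "comment-" + id);
        }
        // Third page with a page size of 10: both variants return rows 21..30.
        System.out.println(offsetPage(index, 20, 10));
        System.out.println(keysetPage(index, 20L, 10));
    }
}
```

Both calls return the same page; the keyset version needs the last identifier of the previous page rather than a row count, which is why it suits "next page" navigation better than jumping to an arbitrary page number.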
  53. 53. Fetching – Entity vs Projection • Selecting all columns vs a custom projection
      SELECT *
      FROM post_comment pc
      INNER JOIN post p ON p.id = pc.post_id
      INNER JOIN post_details pd ON p.id = pd.id

      SELECT pc.version
      FROM post_comment pc
      INNER JOIN post p ON p.id = pc.post_id
      INNER JOIN post_details pd ON p.id = pd.id
  54. 54. Fetching – Entity vs Projection [Chart: time (ms) for "All columns" vs "Custom projection"; series: DB_A, DB_B, DB_C, DB_D]
  55. 55. Fetching – DTO Projections • Read-only views • Tree structures (Recursive CTE) • Paginated Tables • Analytics (Window functions)
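The `select new PostCommentSummary( p.id, p.title, c.review )` queries shown earlier rely on a DTO class with a matching constructor. The presentation does not show that class, so the version below is an assumed shape: the field types (`Long`, `String`, `String`) and the getter names are guesses made for illustration.

```java
/** Immutable DTO holding only the columns the projection selects. */
final class PostCommentSummary {

    private final Long id;        // post identifier (type assumed)
    private final String title;   // post title
    private final String review;  // comment review text

    // The constructor arguments must match the order used in the JPQL
    // constructor expression: p.id, p.title, c.review.
    public PostCommentSummary(Long id, String title, String review) {
        this.id = id;
        this.title = title;
        this.review = review;
    }

    public Long getId() {
        return id;
    }

    public String getTitle() {
        return title;
    }

    public String getReview() {
        return review;
    }
}
```

Because the class is a plain object rather than a managed entity, it fits the read-only use cases listed above.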
  56. 56. Fetching – Entity Queries • Writing data • Web flows / Multi-request logical transactions • Application-level repeatable reads • Detached entities / PersistenceContextType.EXTENDED • Optimistic concurrency control (e.g. version, dirty properties)
  57. 57. Fetching – Relationships
      Association   FetchType
      @ManyToOne    EAGER
      @OneToOne     EAGER
      @OneToMany    LAZY
      @ManyToMany   LAZY
      • LAZY associations can be fetched eagerly • EAGER associations cannot be fetched lazily
  58. 58. Fetching – Best Practices • Default to FetchType.LAZY • Fetch directive in JPQL/Criteria API queries • Entity graphs / @FetchProfile • LazyInitializationException
  59. 59. Fetching – Open Session in View Anti-Pattern
  60. 60. Fetching – Temporary Session Anti-Pattern • “Band aid” for LazyInitializationException • One temporary Session/Connection for every lazily fetched association <property name="hibernate.enable_lazy_load_no_trans" value="true"/>
  61. 61. Agenda • Performance and Scaling • Connection providers • Identifier generators • Relationships • Batching • Fetching • Caching
  62. 62. Caching $T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$
  63. 63. Caching – Why 2nd-Level Caching
  64. 64. Caching – Why 2nd-Level Caching “There are only two hard things in Computer Science: cache invalidation and naming things.” Phil Karlton
  65. 65. Caching – Strategies
      Strategy               Cache type      Particularity
      READ_ONLY              READ-THROUGH    Immutable
      NONSTRICT_READ_WRITE   READ-THROUGH    Invalidation / Inconsistency risk
      READ_WRITE             WRITE-THROUGH   Soft Locks
      TRANSACTIONAL          WRITE-THROUGH   JTA
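The read-through and write-through cache types named above can be contrasted with a small in-memory model. This is a generic sketch, not Hibernate's second-level cache: a `Map` stands in for the database, soft locks and JTA are not modelled, and every name in it is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class CacheSketch<K, V> {

    private final Map<K, V> cache = new HashMap<>();
    private final Map<K, V> database;  // stands in for the database table
    private int databaseReads;         // how often a read had to hit the database

    public CacheSketch(Map<K, V> database) {
        this.database = database;
    }

    /** Read-through: a cache miss loads from the database and populates the cache. */
    public V read(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        databaseReads++;
        V loaded = database.get(key);
        if (loaded != null) {
            cache.put(key, loaded);
        }
        return loaded;
    }

    /** Write-through: the database and the cache entry are updated together. */
    public void writeThrough(K key, V value) {
        database.put(key, value);
        cache.put(key, value);
    }

    /** Invalidation: the database is updated and the cache entry is evicted. */
    public void writeAndInvalidate(K key, V value) {
        database.put(key, value);
        cache.remove(key);
    }

    public int databaseReads() {
        return databaseReads;
    }
}
```

After `writeThrough` the next read is served from the cache, whereas after `writeAndInvalidate` the next read goes back to the database, which is the trade-off the table summarizes.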
  66. 66. Caching – Collection Cache • It complements entity caching • It stores only entity identifiers • Read-Through • Invalidation-based (Consistency over Performance)
  67. 67. Caching – Read-Write Aggregates
  68. 68. Questions and Answers https://leanpub.com/high-performance-java-persistence • Performance and Scaling • Connection providers • Identifier generators • Relationships • Batching • Fetching • Caching
