-
1.
Performance Tuning
@Sander_Mak
branchandbound.net
-
3.
Hibernate sucks!
... because it’s slow
‘The problem is sort of cultural [..] developers
use Hibernate because they are uncomfortable
with SQL and with RDBMSes. You should be
very comfortable with SQL and JDBC before you
start using Hibernate - Hibernate builds on
JDBC, it doesn’t replace it. That is the cost of
extra abstraction [..] save yourself effort,
pay attention to the database at all stages
of development.’
- Gavin King (creator)
-
4.
‘Most of the performance problems we have come
up against have been solved not by code
optimizations, but by adding new functionality.’
- Gavin King (creator)
-
5.
‘You can't communicate complexity,
only an awareness of it.’
- Alan J. Perlis (1st Turing Award winner)
-
6.
Outline
Optimization
Lazy loading
Examples ‘from the trenches’
Search queries
Large collections
Batching
Odds & Ends
-
7.
Optimization
-
9.
Optimization is hard
Performance blame
Framework vs. You
When to optimize?
Preserve correctness at all times
Unit tests ok, but not enough
Automated integration tests
Premature optimization is the root of all evil
- Donald Knuth
-
10.
Optimization guidelines
Measurement
Ensure stable, production-like environment
Measure time and space
Time: isolate timings in different layers
Space: more heap -> longer GC -> slower
Try to measure in RDBMS as well
IO statistics (hot cache or disk thrashing?)
Query plans
Make many measurements -> automation
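A minimal sketch of what automated, repeated measurement could look like. The class name `TimingHarness` and the stand-in task are hypothetical; in practice the task would wrap the DAO or query call under test, and a profiler remains the better tool for attributing time to layers.

```java
import java.util.Arrays;

/** Minimal timing harness: repeats a task and reports the median wall-clock time. */
final class TimingHarness {

    /** Runs the task `warmups` times untimed, then `runs` times timed; returns the median in nanoseconds. */
    static long medianNanos(Runnable task, int warmups, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmups; i++) {
            task.run(); // let the JIT and caches settle before measuring
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        return median(samples);
    }

    /** Median of the samples; the input array is not modified. */
    static long median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a DAO call; replace with the query under test.
        Runnable fakeDaoCall = () -> {
            long sum = 0;
            for (int i = 0; i < 100_000; i++) {
                sum += i;
            }
        };
        System.out.println("median ns: " + medianNanos(fakeDaoCall, 5, 21));
    }
}
```

The median is used rather than the mean so that a single GC pause or cold-cache run does not dominate the result.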
-
11.
Optimization guidelines
Practical
Profiler on DAO/Session.query() methods
VisualVM etc. for heap usage
Many commercial tools also have built-in JDBC profiling
Hibernate JMX
<property name="hibernate.generate_statistics">true</property>
RDBMS monitoring tools
-
12.
Analyzing Hibernate
Log SQL: <property name="show_sql">true</property>
<property name="format_sql">true</property>
Log4J configuration:
org.hibernate.SQL -> DEBUG
org.hibernate.type -> TRACE (see bound params)
Or use P6Spy/Log4JDBC on JDBC connection
-
13.
Analyzing Hibernate
2011-07-28 09:57:12,061 DEBUG org.hibernate.SQL - insert into BASKET_LINE_ALLOC (LAST_UPDATED, QUANTITY, CUSTOMER_REF, NOTES, BRANCH_ID, FUND_ID, TEMPLATE_ID, BASKET_LINE_ALLOC_ID) values (?, ?, ?, ?, ?, ?, ?, ?)
2011-07-28 09:57:12,081 DEBUG org.hibernate.type.TimestampType - binding '2006-07-28 09:57:12' to parameter: 1
2011-07-28 09:57:12,081 DEBUG org.hibernate.type.IntegerType - binding '3' to parameter: 2
2011-07-28 09:57:12,082 DEBUG org.hibernate.type.StringType - binding '' to parameter: 3
2011-07-28 09:57:12,082 DEBUG org.hibernate.type.StringType - binding '' to parameter: 4
2011-07-28 09:57:12,082 DEBUG org.hibernate.type.LongType - binding '511' to parameter: 5
2011-07-28 09:57:12,082 DEBUG org.hibernate.type.LongType - binding '512' to parameter: 6
2011-07-28 09:57:12,082 DEBUG org.hibernate.type.LongType - binding null to parameter: 7
2011-07-28 09:57:12,082 DEBUG org.hibernate.type.LongType - binding '180030' to parameter: 8
Hibernate: INSERT INTO mkyong.stock_transaction (CHANGE, CLOSE, DATE, OPEN, STOCK_ID, VOLUME) VALUES (?, ?, ?, ?, ?, ?)
2011-07-28 13:33:07,253 DEBUG FloatType:133 - binding '10.0' to parameter: 1
2011-07-28 13:33:07,253 DEBUG FloatType:133 - binding '1.1' to parameter: 2
2011-07-28 13:33:07,253 DEBUG DateType:133 - binding '30 December 2009' to parameter: 3
2011-07-28 13:33:07,269 DEBUG FloatType:133 - binding '1.2' to parameter: 4
2011-07-28 13:33:07,269 DEBUG IntegerType:133 - binding '11' to parameter: 5
2011-07-28 13:33:07,269 DEBUG LongType:133 - binding '1000000' to parameter: 6
-
14.
Lazy loading
-
15.
Lazy loading
One entity to rule them all
Mostly sane defaults:
@OneToMany, @ManyToMany: LAZY
@OneToOne, @ManyToOne: EAGER (due to JPA spec.)
Extra-lazy: Hibernate specific
[Diagram: Request, User, Authorization, Global Company and Company, linked by 1..1, 1..* and *..1 associations]
-
16.
LazyInitializationException
-
17.
Lazy loading N+1 Selects problem
Select list of N users
HQL: SELECT u FROM User u
SQL 1 query:
SELECT * FROM User u LEFT JOIN Company c WHERE u.worksForCompany = c.id
Authorizations necessary:
N select queries on Authorization executed!
Solution:
FETCH JOINS
@Fetch(FetchMode.JOIN)
@FetchProfile (enable per session)
don’t call .size()
-
18.
Lazy loading N+1 Selects problem
Select list of N users
HQL: SELECT u FROM User u
Authorizations necessary:
SQL N queries:
SELECT * FROM Authorization WHERE userId = N
N select queries on Authorization executed!
Solution:
FETCH JOINS
@Fetch(FetchMode.JOIN)
@FetchProfile (enable per session)
don’t call .size()
-
19.
Lazy loading N+1 Selects problem
Select list of N users
HQL: SELECT u FROM User u LEFT OUTER JOIN FETCH u.authorizations
Authorizations necessary:
N select queries on Authorization executed!
SQL 1 query:
SELECT * FROM User u LEFT JOIN Company c LEFT OUTER JOIN Authorization ON .. WHERE u.worksForCompany = c.id
Solution:
FETCH JOINS
@Fetch(FetchMode.JOIN)
@FetchProfile (enable per session)
don’t call .size()
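The difference in round trips can be illustrated with a toy model. This sketch involves no Hibernate at all: the in-memory map stands in for the User and Authorization tables, and every method name here is hypothetical. It only counts how many simulated queries each access pattern issues.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the N+1 selects problem; counts simulated round trips only. */
final class NPlusOneDemo {
    static int queryCount = 0;

    // Hypothetical in-memory "tables": user id -> authorizations of that user.
    static final Map<Integer, List<String>> AUTHORIZATIONS = new LinkedHashMap<>();
    static {
        AUTHORIZATIONS.put(1, Arrays.asList("read"));
        AUTHORIZATIONS.put(2, Arrays.asList("read", "write"));
        AUTHORIZATIONS.put(3, Arrays.asList("admin"));
    }

    /** Simulates "SELECT * FROM User": one round trip. */
    static List<Integer> selectUsers() {
        queryCount++;
        return new ArrayList<>(AUTHORIZATIONS.keySet());
    }

    /** Simulates "SELECT * FROM Authorization WHERE userId = ?": one round trip per user. */
    static List<String> selectAuthorizations(int userId) {
        queryCount++;
        return AUTHORIZATIONS.get(userId);
    }

    /** Simulates a single joined query returning users with their authorizations. */
    static Map<Integer, List<String>> selectUsersJoinAuthorizations() {
        queryCount++;
        return new LinkedHashMap<>(AUTHORIZATIONS);
    }

    /** Lazy access pattern: 1 query for the users plus 1 per user. */
    static int lazyLoadAll() {
        queryCount = 0;
        for (int userId : selectUsers()) {
            selectAuthorizations(userId);
        }
        return queryCount;
    }

    /** Fetch-join access pattern: everything in 1 query. */
    static int fetchJoinAll() {
        queryCount = 0;
        selectUsersJoinAuthorizations();
        return queryCount;
    }

    public static void main(String[] args) {
        System.out.println("lazy: " + lazyLoadAll() + " queries, fetch join: " + fetchJoinAll() + " query");
    }
}
```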
-
20.
Lazy loading
Some guidelines
Laziness by default = mostly good
However, architectural impact:
Session lifetime (‘OpenSessionInView’ pattern)
Extended Persistence Context
Proxy usage (runtime code generation)
Eagerness can be forced with HQL/JPQL/Criteria
But eagerness cannot be reverted
exception: Session.load()/EntityManager.getReference()
-
21.
Search queries
-
22.
Search queries
[Diagram: Search, Result list and Detail screens over the User, Authorization, Global Company and Company entities (1..* and *..1 associations)]
-
24.
Search queries
Obvious solution:
Too much information!
Use summary objects:
UserSummary = POJO
not attached, only necessary fields (no relations)
Or: drop down to JDBC to fetch id + fields
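A summary object might look like the sketch below. The `name` field, the `:pattern` parameter and the `com.example` package are assumptions for illustration, not taken from the slides. Both query strings rely on standard HQL/JPQL features (scalar projection and constructor expression) and are shown as constants only, not executed here.

```java
/** Detached summary object: only the fields the result list needs, no relations. */
final class UserSummary {
    private final Long id;
    private final String name;

    public UserSummary(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    Long getId() { return id; }

    String getName() { return name; }

    /** Maps one row of a scalar projection (id, name) to a summary object. */
    static UserSummary fromRow(Object[] row) {
        return new UserSummary(((Number) row[0]).longValue(), (String) row[1]);
    }
}

/** Query strings only; field and package names are hypothetical. */
final class UserSummaryQueries {
    // Scalar projection: each result row comes back as an Object[] {id, name}.
    static final String SCALAR =
            "SELECT u.id, u.name FROM User u WHERE u.name LIKE :pattern";

    // Constructor expression: assumes UserSummary lives in the package com.example.
    static final String CONSTRUCTOR =
            "SELECT NEW com.example.UserSummary(u.id, u.name) FROM User u WHERE u.name LIKE :pattern";
}
```

Because the summary is never attached to a session, it cannot trigger lazy loading or dirty checking later on.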
-
26.
Search queries
Alternative:
Taking it further:
Pagination in queries, not in app. code
Extra count query may be necessary (totals)
Ordering necessary for paging!
Subtle:
“The effect of applying setMaxResults or setFirstResult to a query involving fetch joins over collections is undefined”
Cause: the emitted join possibly returns several rows per entity. So LIMIT, rownums, TOP cannot be used!
Instead Hibernate must fetch all rows
WARNING: firstResult/maxResults specified with collection fetch; applying in memory!
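The arithmetic behind query-level pagination is small enough to sketch. `PageRequest` is a hypothetical helper, not a Hibernate or JPA class: its values are what one would pass to `setFirstResult` and `setMaxResults`, and the total from the extra count query feeds `totalPages`.

```java
/** Hypothetical helper that turns a page index into query offsets. */
final class PageRequest {
    final int pageIndex; // zero-based page number
    final int pageSize;

    PageRequest(int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        this.pageIndex = pageIndex;
        this.pageSize = pageSize;
    }

    /** Value to pass to setFirstResult; fails loudly on int overflow. */
    int firstResult() {
        return Math.multiplyExact(pageIndex, pageSize);
    }

    /** Value to pass to setMaxResults. */
    int maxResults() {
        return pageSize;
    }

    /** Number of pages for a total obtained from the extra count query. */
    static long totalPages(long totalRows, int pageSize) {
        return (totalRows + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        PageRequest third = new PageRequest(2, 25);
        System.out.println("firstResult=" + third.firstResult() + ", maxResults=" + third.maxResults());
        System.out.println("pages for 101 rows: " + totalPages(101, 25));
    }
}
```

The offsets are only meaningful when the query has a stable ORDER BY, and, per the warning above, when it does not fetch-join a collection.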
-
27.
Analytics & reporting
ORM less relevant: no entities, but complex
aggregations
Simple sum/avg/counts possible over entities
Specialized db calls for complex reports:
Partitioning/windowing etc.
Integrate using native query
Or create a database view and map entity
-
28.
Large collections
-
29.
Large collections
[Diagram]
Frontend: Request, CompanyGroup 1..* Company
Backend: CompanyGroup 1..* CompanyInGroup (meta-data) *..1 Company
Company: read-only entity, backed by expensive view
-
33.
Large collections
Opening large groups sluggish
Improved performance: @BatchSize on the collection
Fetches many uninitialized collections in 1 query
Also possible on entity:
[Diagram: Request, CompanyGroup 1..* Company]
Better solution in hindsight: fetch join
-
34.
Large collections
Extra lazy collection fetching
Efficient:
companies.size() -> count query
companies.contains() -> select 1 where ...
companies.get(n) -> select * where index = n
-
35.
Large collections
Saving large group slow: >15 sec.
Problem: Hibernate inserts row by row
Query creation overhead, network latency
Solution: <property name="hibernate.jdbc.batch_size">100</property>
Enables JDBC batched statements
Caution: global property
Also: <property name="hibernate.order_inserts">true</property>
<property name="hibernate.order_updates">true</property>
[Diagram: Request, CompanyGroup 1..* Company]
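What the batch_size property enables underneath is plain JDBC statement batching. The sketch below uses only the standard `java.sql` API; the table and column names (`CompanyInGroup`, `group_id`, `company_id`) are assumptions, and the method is meant to show the shape of the technique rather than what Hibernate generates.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Plain JDBC batching: many rows per round trip instead of one. */
final class JdbcBatchInsert {

    /** Inserts one row per company id; returns the number of executeBatch() round trips. */
    static int insertMemberships(Connection connection, long groupId, List<Long> companyIds, int batchSize)
            throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int roundTrips = 0;
        // Hypothetical table and columns, for illustration only.
        String sql = "INSERT INTO CompanyInGroup (group_id, company_id) VALUES (?, ?)";
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            int pending = 0;
            for (long companyId : companyIds) {
                statement.setLong(1, groupId);
                statement.setLong(2, companyId);
                statement.addBatch();
                if (++pending == batchSize) {
                    statement.executeBatch();
                    roundTrips++;
                    pending = 0;
                }
            }
            if (pending > 0) {
                statement.executeBatch(); // the final, partially filled batch must be sent too
                roundTrips++;
            }
        }
        return roundTrips;
    }

    /** Round trips needed for a given row count: ceil(rows / batchSize). */
    static int expectedRoundTrips(int rows, int batchSize) {
        return (rows + batchSize - 1) / batchSize;
    }
}
```

With a batch size of 100, 1000 rows need 10 round trips instead of 1000, which is where the saving in network latency comes from.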
-
38.
Large collections
[Diagram: Process, CreateGroup Service (SOAP), Business Service; CompanyGroup 1..* CompanyInGroup (meta-data) *..1 Company]
CreateGroup: ~10 min. for thousands of companies
@BatchSize on Company improved demarshalling
JDBC batch_size property: marginal improvement
INFO: INSERT INTO CompanyInGroup VALUES (?,...,?)
INFO: SELECT @identity
INFO: INSERT INTO CompanyInGroup VALUES (?,...,?)
INFO: SELECT @identity
.. 1000 times
Insert/select interleaved: due to gen. id
-
39.
Large collections
Solution: generate id in app. (not always feasible)
Running in ~3 minutes with batched inserts
Next problem: heap usage spiking
Use StatelessSession
✦ Bypass first-level cache
✦ No automatic dirty checking
✦ Bypass Hibernate event model and interceptors
✦ No cascading of operations
✦ Collections on entities are ignored
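One way to generate ids in the application is a random UUID. The slides do not say which strategy was actually used, so this is an assumption, and the class and field names are hypothetical. The point is that the id is known before any INSERT is sent, so no SELECT is needed after each row and the inserts can be batched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Row object whose identifier is assigned in the application, not by the database. */
final class CompanyInGroupRow {
    final UUID id;
    final long groupId;
    final long companyId;

    CompanyInGroupRow(long groupId, long companyId) {
        this.id = UUID.randomUUID(); // known before the INSERT, no read-back round trip
        this.groupId = groupId;
        this.companyId = companyId;
    }

    /** Builds all membership rows for one group, each with its own id. */
    static List<CompanyInGroupRow> forGroup(long groupId, List<Long> companyIds) {
        List<CompanyInGroupRow> rows = new ArrayList<>(companyIds.size());
        for (long companyId : companyIds) {
            rows.add(new CompanyInGroupRow(groupId, companyId));
        }
        return rows;
    }
}
```

As the slide notes, this is not always feasible, for example when the schema mandates database-generated numeric keys.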
-
42.
Large collections
Now <1 min., everybody happy!
Data loss detected!
-
43.
Large collections
Data loss detected!
StatelessSession and JDBC batch_size bug
HHH-4042: Closed, won’t fix
-
44.
Odds & Ends
-
45.
Dirty little secret
validated(item) performs read-only queries
Dirty collection after each iteration
Batching fails
FlushMode.AUTO
Loops always suspect: relational, set-based thinking
SQL log:
select currentItem from Catalog where ..
select spendingLimit from User where ..
insert into Item values (?, ?, ?)
select currentItem from Catalog where ..
select spendingLimit from User where ..
insert into Item values (?, ?, ?)
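The effect on batching can be shown with a toy model of automatic flushing. This is not Hibernate code: `AutoFlushModel` is a hypothetical class that simply assumes any query flushes pending changes first, which is a simplification of FlushMode.AUTO. It compares validating inside the loop with validating everything up front.

```java
/** Toy model: a query flushes pending inserts first, as a simplified stand-in for FlushMode.AUTO. */
final class AutoFlushModel {
    private int pendingInserts = 0;
    private int flushes = 0;

    /** Queues an insert without sending it. */
    void persist() {
        pendingInserts++;
    }

    /** A read-only query; in this model it forces pending inserts out first. */
    void query() {
        flush();
    }

    /** Sends all pending inserts as one batch, if there are any. */
    void flush() {
        if (pendingInserts > 0) {
            flushes++;
            pendingInserts = 0;
        }
    }

    /** Validate and add inside one loop: every iteration flushes the previous insert. */
    static int interleaved(int items) {
        AutoFlushModel session = new AutoFlushModel();
        for (int i = 0; i < items; i++) {
            session.query();   // select currentItem from Catalog
            session.query();   // select spendingLimit from User
            session.persist(); // insert into Item
        }
        session.flush(); // commit
        return session.flushes;
    }

    /** Validate everything first, then add: all inserts go out in a single flush. */
    static int validateFirst(int items) {
        AutoFlushModel session = new AutoFlushModel();
        for (int i = 0; i < items; i++) {
            session.query();
            session.query();
        }
        for (int i = 0; i < items; i++) {
            session.persist();
        }
        session.flush(); // commit
        return session.flushes;
    }

    public static void main(String[] args) {
        System.out.println("interleaved: " + interleaved(1000) + " flushes");
        System.out.println("validate first: " + validateFirst(1000) + " flush");
    }
}
```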
-
48.
Query hints
Speed up read-only service calls:
Hibernate Query.setHint():
Also: never use 2nd level cache just ‘because we can’
@org.hibernate.annotations.Immutable
-
50.
Large updates
Naive approach:
Entities are not always necessary:
Changes are not reflected in persistence context
With optimistic concurrency: VERSIONED keyword
Consider use of stored procedures
-
51.
Cherish your database
Data and schema outlive your application
Good indexes make a world of difference
Stored procedures etc. are not inherently evil
Do not let Hibernate dictate your schema
Befriend a DBA instead!
There are other solutions (there I said it)
MyBatis
Squeryl (Scala)
-
52.
Thanks for listening!
@Sander_Mak
Join me later today:
Elevate your webapps
with Scala & Lift!
17:00 Room C
branchandbound.net
Sander Mak - lead developer Java - Info Support
Dutch accent
Hibernate experience, not a committer who knows everything

Q: who has heard / said / thought this?
Leaky abstraction -> not going to defend ORM, many advantages
1) mapping problems -> impedance mismatch
2) performance problems -> stop treating Hibernate as a black box!

Tangle of 37
Red = bad -> cyclic dependency
Hibernate implementation complex, but battle-tested
JPA tutorials paint a rosy picture: using Hibernate can be quite hard!

Examples use Hibernate/JPA API interchangeably: start with JPA, you will need Hibernate specifics

Tuning performance is a bit like refactoring: don’t change the semantics, just the how.
Preserving correctness: unit tests! However, the more you reach the edges of the DBMS, the more easily you will hit an obscure bug in the query optimizer, caching strategy etc.

Know your RDBMS! Database independence is nice when porting is necessary, but focus on the particular DB for the production situation (document!), that is what counts!
(once you get into nitty-gritty opt. details, you will have to know the RDBMS intimately)
Indexes, covering indexes, locking strategies, ‘vacuuming’/‘transaction logs’/reset statistics

Hardware vs. virtualized, real data volumes, simulate real workloads

SQL Server mgmt studio

Question I hear most often: how to see parameter values

Beware: you might retrieve your whole database in one go...
Code example: will load Company eager, Auths. lazy
Extra-lazy: discuss later with large collections

First encounter with lazy loading :)
Extended persistence contexts, OpenSessionInView pattern and other band-aids

Eager vs. lazy is a contract specifying WHEN relations are retrieved, not HOW. For the HOW you can define fetching strategies.
Also possible to define fetch join on Criteria queries

Lazy as default, tune eager loading in queries specifically for your use case (DAO pattern not that bad after all)

The same considerations apply to reporting queries: solve as much as possible in the query and do not return entities when they are not needed

Hibernate is specifically not for bulk manipulation: use stored procs for that. But when is something bulk? Collections with thousands of elements are routine in OLTP applications.

Example: 20 groups with uninitialized collections, access first collection: all are initialized with 1 query.
Measured: opening first group was slightly slower, general user experience better

A stateless session is ideal for fire-and-forget service calls, less so in a user-facing application where keeping the persistence context consistent matters.

Only ‘full batches’ were performed

Each item is flushed automatically before doing the lookup queries, because the collection becomes dirty after adding an element

Hibernate does some optimizing for read-only entities:
It saves execution time by not dirty-checking simple properties or single-ended associations.
It saves memory by deleting the database snapshot
A cache increases load on memory, possibly more GC pauses for the app. if co-located with the application

Interesting: all instances of the entity are evicted from the second level cache with such a query, even if the WHERE clause limits the affected entities
Also, no events are fired as Hibernate normally would do.