Advanced Hibernate Notes


Why use ORM?

(i) HQL offers joins and aggregate functions. HQL is expressed in terms of domain object properties rather than DB columns and is completely decoupled from the DB schema. The power of SQL is leveraged at the domain object level.
(ii) Unlike iBatis, Hibernate abstracts the underlying DB and data model.
(iii) Hibernate performs change detection via snapshot comparisons. Hibernate creates runtime proxies for persistent objects through dynamic bytecode generation using Javassist. We can change it to CGLIB.
(iv) Unlike JDO, Hibernate does not need to modify persistent objects to observe their state.
(v) Unlike EJB, Hibernate can be run as a standalone tool outside a JEE container.

SessionFactory and Session

The purpose of the Hibernate SessionFactory (called EntityManagerFactory in JEE/JPA) is to create Sessions, initialize JDBC connections and pool them (using a pluggable provider like C3P0).
A SessionFactory is an immutable cache of compiled mappings (plus associations / inheritance / aggregations) for a single database.
It is built from a Configuration holding mapping information, cache information and a lot of other information, usually provided by means of a hibernate.cfg.xml file or through a Spring bean configuration.
A Session is a unit of work at its lowest level, representing a transaction in database lingo. A Session is not thread safe and is usually maintained as a ThreadLocal value.
When a Session is created and operations are done on Hibernate entities, e.g. setting an attribute of an entity, Hibernate does not update the underlying table immediately. Instead Hibernate keeps track of the state of an entity, whether it is dirty or not, and flushes the pending updates to the database at the end of a unit of work (typically at commit). This is what Hibernate calls the first level cache.

The 1st level cache

Definition: The first level cache is where Hibernate keeps track of the possible dirty states of the entities loaded and touched by the ongoing Session.
The ongoing Session represents a unit of work; it is always used and cannot be turned off. The purpose of the first level cache is to prevent too many SQL queries or updates being made to the database, and instead batch them together at the end of the Session. When you think about the 1st level cache, think Session.

How does Hibernate implement lazy initialization for single-ended associations?

>> Hibernate3 generates proxies (at startup) for all target entities, i.e. persistent classes, using runtime bytecode enhancement (via CGLIB). It then enables them for many-to-one and one-to-one associations.
>> Hibernate uses a subclass of the original class, and the proxied class must implement a default constructor with at least package visibility ... so all persistent classes should have a default constructor.

How to improve performance using natural-id?

    <class name="User">
        <cache usage="read-write"/>
        <id ......... >
        <natural-id>
            <property name="userName"/>   .. should not be mutable ...
            .......
        </natural-id>

    session.createCriteria(User.class)
        .add(Restrictions.naturalId().set("userName", "Bob"))
        .setCacheable(true)
        .uniqueResult();

*** Since we have declared that the fields used are natural keys, the Hibernate query cache is smart enough to understand that it can bypass the up-to-date check and depend on the assembling logic to handle it properly.

How can I retrieve info about a collection without initializing it?

>> Fetch the size of a collection (count(*) comes back as an Integer or a Long depending on the Hibernate version, so the cast goes through Number):

    ((Number) s.createFilter(collection, "select count(*)").list().get(0)).intValue()

>> Retrieve a subset of a collection:

    s.createFilter(lazyCollection, "").setFirstResult(0).setMaxResults(10).list();

How to benefit from Query-Caching?

>> Check the query cache for the query.
>> If results are found, check whether they are the latest (that is, there is no entry in the update timestamps table, or only one which predates the cached result).
>> If they are up to date, assemble the objects (assembling involves creating the object from its primary key, or from a group of column values, or other strategies). If they are not, query the database again.

The query cache works something like this:

    | ["from Person as p where ... and p.firstName=?", [1, "Joey"]] -> [2] |

The combination of the query and the values provided as parameters to that query is used as the key, and the value is the list of identifiers for that query.

How does the query cache work with the 2nd-level cache?

Query cache: the intention is to cache the results against the query (the SQL along with the parameters and their values) .....
We set hibernate.cache.use_query_cache = true
Then Hibernate creates 2 memory regions:
    Region1 : one holding cached query result sets
    Region2 : the other holding timestamps of the most recent updates to queryable tables ...
>> The query cache holds only identifier values and results of value type ... the state of the actual entities is fetched from the 2nd-level cache ...
*** >> The safest invalidation logic is to maintain update timestamps for each table. When any value is looked up from the query cache, we also check whether any of the tables involved in the query have been updated since the results were cached; if they have, the safest thing to do is to query the DB again.
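The lookup-and-invalidate logic described above can be sketched in plain Java. This is only an illustration under stated assumptions, not Hibernate's implementation: the class QueryCacheSketch, its methods, the logical clock and the table name "PERSON" are all hypothetical. It keeps one map for query results (query string plus parameter values mapped to a list of identifiers) and one map for per-table update timestamps, and treats a cached result as stale when any involved table was updated after the result was stored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of the two query-cache regions; not Hibernate code.
final class QueryCacheSketch {

    /** A cached result: entity identifiers plus the logical time they were stored. */
    static final class Entry {
        final List<Long> ids;
        final long cachedAt;

        Entry(List<Long> ids, long cachedAt) {
            this.ids = ids;
            this.cachedAt = cachedAt;
        }
    }

    // Region 1: (query string + parameter values) -> identifiers
    private final Map<List<Object>, Entry> queryRegion = new HashMap<>();
    // Region 2: table name -> logical time of the most recent update
    private final Map<String, Long> updateTimestamps = new HashMap<>();
    // Logical clock standing in for real timestamps (an assumption of this sketch)
    private long clock = 0;

    /** Builds the cache key from the query and its parameter values. */
    static List<Object> key(String hql, Object... params) {
        List<Object> k = new ArrayList<>();
        k.add(hql);
        k.addAll(Arrays.asList(params));
        return k;
    }

    void put(List<Object> key, List<Long> ids) {
        queryRegion.put(key, new Entry(ids, ++clock));
    }

    /** Records that a table was modified. */
    void tableUpdated(String table) {
        updateTimestamps.put(table, ++clock);
    }

    /** Returns the cached identifiers, or null when there is no entry or it is stale. */
    List<Long> get(List<Object> key, String... tablesInQuery) {
        Entry e = queryRegion.get(key);
        if (e == null) {
            return null; // cache miss
        }
        for (String table : tablesInQuery) {
            Long updated = updateTimestamps.get(table);
            if (updated != null && updated > e.cachedAt) {
                return null; // table changed after caching: query the DB again
            }
        }
        return e.ids; // still valid: assemble entities from these ids
    }

    public static void main(String[] args) {
        QueryCacheSketch cache = new QueryCacheSketch();
        List<Object> k = key("from Person as p where p.firstName=?", "Joey");
        cache.put(k, Arrays.asList(2L));
        System.out.println("before update: " + cache.get(k, "PERSON"));
        cache.tableUpdated("PERSON");
        System.out.println("after update:  " + cache.get(k, "PERSON"));
    }
}
```

Note that the cached value is only a list of identifiers, so a hit still needs the entity state from somewhere else, which is where the 2nd-level cache comes in.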
Put very simply, this timestamp check is what Hibernate does. It maintains the timestamps in the update timestamps cache. The query results are cached in the query cache with the query as the key. For queries that return results of only one type, it stores only the primary keys.

>> A lot of heavily used data is cached at the second level. However, most of this data is looked up using its natural key, whereas the second level cache stores it using its primary key as the key. So we can use the query cache to save this lookup.

Points regarding 2nd-level caching

*** The 2nd level cache is a process-scoped cache that is associated with one SessionFactory. It survives Sessions and can be reused in a new Session created by the same SessionFactory (of which there is usually one per application).

**** The Hibernate cache does not store instances of an entity. Instead Hibernate uses something called dehydrated state: it dehydrates query results and persistent objects into their primitive components and identifiers. It stores this decomposed data in the L2 and query results caches, and on a cache hit it rehydrates/recomposes them into the requested persistent object. Conceptually you can think of it as a Map which contains the id as key and an array as value. For a cache region it looks something like this:

    { id -> { attribute1, attribute2, attribute3 } }
    { 1 -> { "a name", 20, null } }
    { 2 -> { "another name", 30, 4 } }

If the entity holds a collection of other entities, then the other entity also needs to be cached. In this case it could look something like:

    { id -> { attribute1, attribute2, attribute3, Set{item1..n} } }
    { 1 -> { "a name", 20, null, {1,2,5} } }
    { 2 -> { "another name", 30, 4, {4,8} } }

The actual implementation of the 2nd level cache is not done by Hibernate (there is a simple Hashtable cache available, though it is not intended for production). Hibernate instead has a plugin concept for caching providers, which is used by e.g. EHCache.

In well-designed Hibernate domain models, we should avoid direct many-to-many collections and instead use one-to-many associations with inverse=true .... For these associations, the update is handled by the many-to-one end of the association ...

How to get rid of out-of-memory with Hibernate cache? >> Lessons Learned
If you use Hibernate query caching and actually want to use memory for caching useful results, wasting as little as possible on overhead, follow some simple advice:

  • Write your HQL queries to use identifiers in any substitutable parameters: WHERE clauses, IN lists, etc. Using full objects results in the objects being kept on the heap for the life of the cache entry.
  • Write your Criteria restrictions to use identifiers as well.
  • Use the smart query cache implementation to eliminate duplicate objects used in query keys. This helps a little if you use the ids in your HQL/Criteria, but if you still must use objects then it helps a lot.

    final Order order = ...;

    ** Don't do this
    final String hql = "from Product as product where product.order = ?";
    final Query q = session.createQuery(hql);
    q.setParameter(0, order);
    q.setCacheable(true);

    ** Do this
    final String hql = "from Product as product where product.order.id = ?";
    final Query q = session.createQuery(hql);
    q.setParameter(0, order.getId());
    q.setCacheable(true);

Pitfalls of fetch-mode=join

**** Duplicate values:
*** A join returns duplicate rows, so the same root entity can appear several times in the result.
Soln. Call setResultTransformer() on your criteria object prior to execution, with the argument CriteriaSpecification.DISTINCT_ROOT_ENTITY.
*** If there are any mapped collections whose elements also contain mapped collections, the results of the query will not only be incorrect, the objects themselves will be incorrect: each mapped collection may contain repeated elements.

N+1 problem:
Associations should be kept lazy if we want to avoid the infamous N+1 problem, where loading N parent rows triggers one additional select per parent; with lazy loading, associated data is only selected when it is actually accessed. Otherwise, we can turn on eager fetching if the data associated with the main query is small.
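The two fetching trade-offs above (one extra select per parent versus duplicated roots from a join) can be illustrated with a small plain-Java stand-in. This is a sketch under stated assumptions and uses no Hibernate API: FakeDb, Order, statementsForLazyTraversal and distinctRoots are hypothetical names, the "database" only counts the statements it receives, and the join has inner-join semantics for simplicity. The de-duplication step keeps the first occurrence of each root object, which is conceptually what a distinct-root transformer does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical toy model; it only counts statements and builds result rows.
final class JoinFetchSketch {

    static final class Order {
        final long id;
        final List<String> items;

        Order(long id, List<String> items) {
            this.id = id;
            this.items = items;
        }
    }

    /** Stand-in for a database that counts the statements it receives. */
    static final class FakeDb {
        int statements = 0;
        final List<Order> orders = new ArrayList<>();

        List<Order> selectOrders() {
            statements++;
            return new ArrayList<>(orders);
        }

        List<String> selectItems(Order order) {
            statements++;
            return order.items;
        }

        /** One statement, but one row per (order, item) pair, so roots repeat. */
        List<Order> selectOrdersJoinItems() {
            statements++;
            List<Order> rows = new ArrayList<>();
            for (Order order : orders) {
                for (int i = 0; i < order.items.size(); i++) {
                    rows.add(order);
                }
            }
            return rows;
        }
    }

    /** Touches every order's items one by one: 1 select for orders + 1 per order. */
    static int statementsForLazyTraversal(FakeDb db) {
        int before = db.statements;
        for (Order order : db.selectOrders()) {
            db.selectItems(order).size();
        }
        return db.statements - before;
    }

    /** Keeps the first occurrence of each root object, compared by identity. */
    static <T> List<T> distinctRoots(List<T> rows) {
        Set<T> seen = Collections.newSetFromMap(new IdentityHashMap<T, Boolean>());
        List<T> distinct = new ArrayList<>();
        for (T row : rows) {
            if (seen.add(row)) {
                distinct.add(row);
            }
        }
        return distinct;
    }

    public static void main(String[] args) {
        FakeDb db = new FakeDb();
        for (long id = 1; id <= 3; id++) {
            db.orders.add(new Order(id, Arrays.asList("a", "b")));
        }
        System.out.println("lazy traversal statements: " + statementsForLazyTraversal(db));
        List<Order> rows = db.selectOrdersJoinItems();
        System.out.println("join rows: " + rows.size()
                + ", distinct roots: " + distinctRoots(rows).size());
    }
}
```

With three orders of two items each, the per-order traversal issues four statements, while the join issues one statement but returns six rows that collapse to three distinct roots.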
Other Tricks

If we specify unsaved-value = null for the primary key, then Hibernate will work out when to do an insert and when to do an update.
If the session would otherwise be closed before lazily loaded data is accessed, and we do not want to load lots of data up front, we can implement the Open Session in View (anti)pattern. The Session then remains open long enough to load lazy associations.

Avoiding out-of-memory in cache

**** When making new objects persistent, you must flush() and then clear() the session regularly, to control the size of the first-level cache.

    Session session = sessionFactory.openSession();
    Transaction tx = session.beginTransaction();
    for ( int i=0; i<100000; i++ ) {
        Customer customer = new Customer(.....);
        session.save(customer);
        if ( i % 20 == 0 ) { //20, same as the JDBC batch size
            //flush a batch of inserts and release memory:
            session.flush();
            session.clear();
        }
    }
    tx.commit();
    session.close();

Optimization for Large Data Sets

    Query query = session.createQuery(....);
    query.setFirstResult(firstRow);   // zero-based index of the first row of the page
    query.setMaxResults(pageSize);    // maximum number of rows to return

Lightweight Data Pattern: Fetch only what you need

Correct Transactional Boundary
Set @Transactional(readOnly = true) to avoid unnecessary dirty-checking by the Hibernate engine.

Use Hibernate Validator in Domain Model

Usage of Proper Caching Strategy
Either use <cache usage="read-write" /> or @Cache(usage = CacheConcurrencyStrategy.READ_WRITE).
Refer to a properly configured cache region from the query:

    query.setCacheRegion("query.OrderCache");
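For the pagination calls under "Optimization for Large Data Sets", the value passed to setFirstResult is a zero-based row offset, not a page number. A minimal sketch of that arithmetic follows, assuming zero-based page indexes; PagingSketch, firstResult and page are hypothetical helper names, and page() only mimics the first-result/max-results semantics on an in-memory list rather than calling Hibernate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating offset/limit arithmetic; not Hibernate code.
final class PagingSketch {

    /** Row offset for a zero-based page index, i.e. the value for setFirstResult. */
    static int firstResult(int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        return Math.multiplyExact(pageIndex, pageSize); // fails loudly on overflow
    }

    /** Mimics first-result / max-results on an in-memory list. */
    static <T> List<T> page(List<T> rows, int pageIndex, int pageSize) {
        int from = Math.min(firstResult(pageIndex, pageSize), rows.size());
        int to = (int) Math.min((long) from + pageSize, rows.size());
        return new ArrayList<>(rows.subList(from, to));
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 45; i++) {
            rows.add(i);
        }
        // Third page (index 2) of size 20 holds the last five rows.
        System.out.println(page(rows, 2, 20));
    }
}
```

In a real query the same two numbers would be passed as query.setFirstResult(firstResult(pageIndex, pageSize)) and query.setMaxResults(pageSize).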