JDBC Best Practices - DB2 / IDUG - Orlando, May 10, 2004

JDBC Best Practices for DB2 Programmers
Derek C. Ashmore, Delta Vortex Technologies
Session B1 – 10:30am, May 10, 2004

JDBC Best Practices for DB2 Programmers
• Presentation Summary
  – JDBC Coding for Performance
  – JDBC Coding for Maintainability
  – JDBC Coding for Portability
  – Future Directions

Who Am I?
• Derek C. Ashmore
• Author of The J2EE™ Architect’s Handbook
  – Downloadable at: http://www.dvtpress.com/javaarch
• Over 6 years of Java-related experience
• Over 10 years of database design/administration experience
• Can be reached at firstname.lastname@example.org

Why focus on JDBC?
• JDBC is the most commonly used database access method.
  – Not a technical judgment, just an observation.
• Why is JDBC the most common access method?
  – It was the first access method available.
  – It works.
  – It satisfies developers’ needs.
  – Most DBMSs support it.
  – JDBC skills are easy to find in the market.
• JDBC tutorial at: http://java.sun.com/docs/books/tutorial/jdbc/

JDBC Coding for Performance
• Use PreparedStatements with host variable markers instead of Statements.
• Beware that selects issue shared locks by default.
• Consider query fetch sizing.
• Utilize connection pooling features.

Use PreparedStatements
• Use PreparedStatements with parameter markers instead of Statements
  – Use "select name from Customer where id = ?"
  – Instead of "... where id = 'H23'"
• Statements are less typing, but:
  – Extra String processing is needed to assemble the where clause.
  – Embedding literal values circumvents dynamic caching.
  – This prevents reuse of the query access path by the database.
  – As a result, Statements will be slower.
• Dynamic caching improves performance of dynamic SQL
  – DB2/UDB (all environments) caches plans associated with dynamic SQL statements.
  – Like statements will reuse that dynamically stored plan.
  – Dynamic SQL gets many of the performance benefits that static SQL has.

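A minimal sketch of the parameter-marker approach described on this slide. The class and method names are illustrative assumptions; the Customer table and its name and id columns come from the slide's example query. try-with-resources (Java 7 and later) is used for brevity; on older JVMs the same cleanup would go in a finally block.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper class, for illustration only.
final class CustomerLookup {

    // The SQL text is identical for every customer id, so the database can
    // match it against its dynamic statement cache and reuse the access path.
    static final String CUSTOMER_NAME_SQL =
        "select name from Customer where id = ?";

    /** Returns the customer's name, or null when no row matches the id. */
    static String findName(Connection conn, String customerId) throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(CUSTOMER_NAME_SQL)) {
            stmt.setString(1, customerId); // bind the value to the marker
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next() ? rs.getString("name") : null;
            }
        }
    }
}
```

A call such as CustomerLookup.findName(conn, "H23") sends the same SQL text as a call for any other id; only the bound value differs.
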
Beware of Shared Locks
• Beware of shared locking with select statements
  – Common myth: reading is harmless.
  – Cursor Stability is the default isolation level, which means selects take shared locks.
  – When only reading: commit as early as possible (or use autocommit).
  – Use "commit" or JDBC’s autocommit feature for selects, but don’t use both.
    • Using both issues the commit twice, which lowers performance.

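One way to follow the "commit early, but only once" advice is sketched below. The class and method names are hypothetical and the query is a made-up example against the Customer table. The explicit commit is issued only when autocommit is off, so the commit is never sent twice.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical read-only query helper, for illustration only.
final class ReadThenCommit {

    static final String NAMES_SQL = "select name from Customer";

    /** Reads all names, then ends the transaction so shared locks are released. */
    static List<String> readNames(Connection conn) throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement stmt = conn.prepareStatement(NAMES_SQL);
             ResultSet rs = stmt.executeQuery()) {
            while (rs.next()) {
                names.add(rs.getString("name"));
            }
        }
        // With autocommit on, the driver commits for us; an explicit commit
        // is issued only when autocommit is off.
        if (!conn.getAutoCommit()) {
            conn.commit();
        }
        return names;
    }
}
```
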
Consider Setting the Query Fetch Size
• Instruct the database to return rows in batches of 10 to 100.
• Example query fetch sizing
  – statement.setFetchSize(10)
• Higher isn’t always better
  – Can degrade performance if used for small ResultSets.
• Results in fewer network round-trips
  – Most benefit comes from batches of 10 to 100, with diminishing returns after that.
    • Reducing network trips from 100,000 to 1,000 is a larger benefit than reducing them further from 1,000 to 100.
    • The larger the batch, the more memory required.
• Needs to be tested on a case-by-case basis.

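The diminishing-returns arithmetic on this slide can be sketched as follows. The class and method names are hypothetical, and roundTrips assumes an idealized model of exactly one network trip per batch of rows; real drivers may prefetch differently or ignore the hint, so this does not replace measurement.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper, for illustration only.
final class FetchSizing {

    /** Idealized round-trip count: one network trip per batch of rows. */
    static long roundTrips(long rowCount, int fetchSize) {
        return (rowCount + fetchSize - 1) / fetchSize; // ceiling division
    }

    /** Counts the rows of a query while asking the driver for batched fetches. */
    static int countRows(Connection conn, String sql, int fetchSize) throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setFetchSize(fetchSize); // a hint; the driver may ignore it
            try (ResultSet rs = stmt.executeQuery()) {
                int rows = 0;
                while (rs.next()) {
                    rows++;
                }
                return rows;
            }
        }
    }
}
```

Under this model, 100,000 rows cost 100,000 trips at a fetch size of 1, 1,000 trips at 100, and 100 trips at 1,000: the first step saves 99,000 trips, the second only 900.
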
Utilize Connection Pooling
• Connection pools eliminate wait time for database connections by creating them ahead of time.
  – I’ve seen enough J2EE apps managing connection creation directly to warrant listing this practice.
  – Creating a connection takes 30 to 50 ms, depending on platform.
  – Allows for capacity planning of database resources.
  – Provides automatic recovery from database or network outages.
• Issuing close() on a pooled connection merely returns it to the pool for use by another request.

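A minimal sketch of borrowing from a pool, assuming the application is handed a javax.sql.DataSource that is backed by a connection pool (how that DataSource is configured depends on the container and is not shown). The class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical helper, for illustration only.
final class PooledAccess {

    /** Borrows a connection, checks it, and hands it back to the pool. */
    static boolean isDatabaseReachable(DataSource pool) throws SQLException {
        try (Connection conn = pool.getConnection()) {
            return conn.isValid(2); // wait at most 2 seconds for the check
        } // close() runs here; with a pooled DataSource it returns the connection to the pool
    }
}
```
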
JDBC Coding for Maintainability
• Close JDBC objects in a "finally" block.
• Consolidate SQL string formation.
• Always specify column names in select and insert statements.

Close All JDBC Objects
• Most JDBC objects require a statement handle from the DB2 Client.
  – Includes PreparedStatement, Statement, and ResultSet.
  – This is a finite resource (e.g. either 600 or 1300, depending upon version).
• Close all JDBC objects in a finally block
  – Stranded JDBC objects consume scarce database resources.
    • They cause errors down the line.
    • DB2/UDB (all environments) with DB2 Client: statement handles are consumed.
• Close JDBC objects in the method that creates them.
  – Easier to make this a habit.
  – Easier to visually identify objects not being closed.
• The garbage collector eventually "closes" these objects, but there is no guarantee that this happens soon enough to stay under the resource limit.
  – You may not see problems until stress testing or production.

Closure Issues
• Closing JDBC objects is inconvenient
  – close() throws a SQLException.
  – This leads to nested try/catch logic in the finally block.
  – A lot to type.
• Utility support can make this easier
  – Use a generic close utility that logs any SQLException received, but doesn’t throw an exception.
  – Gets the "close" down to one line.
  – CementJ – http://sourceforge.net/projects/cementj
    • org.cementj.util.DatabaseUtility

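The kind of close utility described on this slide might look like the sketch below. JdbcCloser and CustomerCount are hypothetical names written for this example; this is not the actual API of CementJ's DatabaseUtility. The query and its "total" alias are also assumptions.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical utility in the spirit of the slide; not CementJ's API.
final class JdbcCloser {
    private static final Logger LOG = Logger.getLogger(JdbcCloser.class.getName());

    /** Closes the object if it is non-null; logs a failure instead of throwing. */
    static void closeQuietly(AutoCloseable jdbcObject) {
        if (jdbcObject == null) {
            return;
        }
        try {
            jdbcObject.close();
        } catch (Exception e) {
            LOG.log(Level.WARNING, "Failed to close JDBC object", e);
        }
    }
}

// Hypothetical caller showing the one-line closes in a finally block.
final class CustomerCount {
    static final String COUNT_SQL = "select count(*) as total from Customer";

    static int count(Connection conn) throws SQLException {
        PreparedStatement stmt = null;
        ResultSet rs = null;
        try {
            stmt = conn.prepareStatement(COUNT_SQL);
            rs = stmt.executeQuery();
            return rs.next() ? rs.getInt("total") : 0;
        } finally {
            // Objects are closed in the method that created them.
            JdbcCloser.closeQuietly(rs);
            JdbcCloser.closeQuietly(stmt);
        }
    }
}
```

The connection is left open here because the caller created it; following the same habit, the caller closes it.
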
Penalty for Object Leaks
• Applies if you’re using DB2/Client (which most do).
• Each JDBC object acquires a statement handle within DB2/Client.
• Handles are limited to between 600 and 1300 (depending on version of DB2/Client).

Closure Issues (cont’d)
• Finding stranded JDBC objects is problematic
  – This is especially difficult if you need to identify leaks in an application you didn’t write.
  – Use P6Spy with an extension library.
  – P6Spy is a JDBC profiler that logs SQL statements and their execution time.
  – I’ve extended P6Spy so that it will identify all stranded objects and list the SQL statements associated with them.
  – P6Spy available at http://www.p6spy.com/
  – P6Spy extension at the "Resources" link from www.dvtpress.com/javaarch

Consolidate SQL String Formation
• Some developers dynamically build the SQL string with scattered concatenation logic

    String sqlStmt = "select col1, col2 from tab1";
    <<< more application code >>>
    sqlStmt = sqlStmt + " where col2 > 200";
    <<< more application code >>>
    sqlStmt = sqlStmt + " and col3 < 5";

• A small number of applications genuinely need this, but most can consolidate the logic.
• Disadvantages
  – Harder to use dynamic caching
  – Harder to read
  – More String processing
  – More memory allocation

Consolidate SQL String Example
• Using "static" variables for SQL text
  – Reduces string processing and memory allocation, because the string is formed once, when the class is first referenced.
  – Consolidates SQL text so that it’s easier to read.
• Example

    public static final String CUST_SQL = "select name from Cust where id = ?";
    ......
    pStmt = conn.prepareStatement(CUST_SQL);

Specify Column Names
• Always specify column names in select and insert statements.
  – Code won’t break if the DBA changes the column order.
  – Clearer for maintenance purposes.
    • Imagine a select or insert statement involving 20-30 columns.
  – Without names, it is hard to tell which value pertains to which column.
• Specify the column name instead of the offset when using ResultSets
  – Use resultSet.getString("col1");
  – Instead of resultSet.getString(3);

JDBC Coding for Portability
• Limit use of platform-specific features.
• Reference java.sql or javax.sql classes only
  – Avoid DB2-specific classes.

Limit Use of Platform-Specific Features
• Portability == the ability to switch DBMSs.
• Use of platform-specific features creates portability obstacles
  – Your code might live longer than you think (Y2K).
• Only use them when there is a clear benefit, not out of habit.
• Examples
  – Stored procedures using a proprietary language
  – Proprietary column functions
    • ENCRYPT
    • NULLIF
  – Proprietary operators
    • CASE
    • OLAP (e.g. RANK)

Reference java.sql or javax.sql Classes Only
• Avoid vendor-specific class implementations unless required for performance
  – Usually not necessary now
    • Was necessary in the early days, before formal support for:
      – Fetch sizing/array processing
      – Statement batching
  – Creates a portability issue
    • Harder to switch DBMSs
  – Creates a maintenance issue
    • The JDBC interfaces are familiar
    • Proprietary objects may not be

Latest Developments
• JDBC 3.0 Specification
  – Returns the generated PK value on insert.
  – ResultSet holdability: ResultSets can remain open across commits.
  – Supports multiple open ResultSets, for stored procedure fans.
  – Standardizes connection pooling.
  – Adds PreparedStatement pooling.
  – Savepoint support.

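As an illustration of the first JDBC 3.0 item, the sketch below retrieves a generated key after an insert. It assumes a Customer table whose key column is generated by the database, which is not stated in the slides, and the class and method names are hypothetical. Driver support for this feature varies, so treat it as a starting point to verify against your driver.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, for illustration only.
final class GeneratedKeyInsert {

    // Column names are listed explicitly, as recommended earlier in the deck.
    static final String INSERT_SQL = "insert into Customer (name) values (?)";

    /** Inserts a customer and returns the key the database generated for it. */
    static long insertCustomer(Connection conn, String name) throws SQLException {
        try (PreparedStatement stmt =
                 conn.prepareStatement(INSERT_SQL, Statement.RETURN_GENERATED_KEYS)) {
            stmt.setString(1, name);
            stmt.executeUpdate();
            try (ResultSet keys = stmt.getGeneratedKeys()) {
                if (!keys.next()) {
                    throw new SQLException("No generated key returned");
                }
                return keys.getLong(1); // first (and here only) generated column
            }
        }
    }
}
```
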
Future Directions
• JDBC is a maturing spec
  – Expect the frequency of change to slow considerably.
• Use of object-relational mapping toolsets is increasing
  – Hibernate (www.hibernate.org)
  – JDO (www.jdocentral.com)
• Despite technical advances, entity beans are close to becoming a part of history.

Stored Procedure Use
• Aren’t stored procedures better performing?
  – Depends on platform
    • Sybase – yes, Oracle/DB2 – not always
  – As a general rule, CPU-intensive actions are bad as stored procedures.
  – SQL in stored procedures is statically bound
    • This used to be more significant before dynamic caching.
  – As a rule, stored procedures help performance by reducing the number of network transmissions.
    • Conditional selects or updates
    • As a batch update surrogate (combining larger numbers of SQL statements)
• Ask: How many network transmissions will be saved by making this a stored procedure? If the answer is "0", performance is not likely to be improved.

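A sketch of the "conditional update" case, where a stored procedure can save network transmissions: instead of selecting a row, deciding in Java, and then updating, the application makes one call and the decision runs in the database. ADJUST_CREDIT, its three parameters, and the class name are made up for this example; only the java.sql calls are standard JDBC.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical caller of a hypothetical stored procedure.
final class CreditAdjuster {

    // JDBC escape syntax for a procedure call; ADJUST_CREDIT is a made-up name.
    static final String CALL_SQL = "{call ADJUST_CREDIT(?, ?, ?)}";

    /** Calls the procedure once and returns its output parameter. */
    static int adjustCredit(Connection conn, String customerId, int amount)
            throws SQLException {
        try (CallableStatement call = conn.prepareCall(CALL_SQL)) {
            call.setString(1, customerId);               // IN: which customer
            call.setInt(2, amount);                      // IN: requested change
            call.registerOutParameter(3, Types.INTEGER); // OUT: resulting limit
            call.execute();
            return call.getInt(3);
        }
    }
}
```

If the same work would have taken a select followed by an update from Java, the call saves a transmission; if it would have been a single statement anyway, the answer to the slide's question is "0".
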
Questions
• JDBC Best Practices for DB2 Programmers
• Session B1
• Derek C. Ashmore
• Email: email@example.com