The document discusses common database-related performance problems and their solutions. It covers issues such as lock contention, missing indexes, slow queries, and the N+1 query problem. It shows how to reduce lock contention using techniques such as the Hi/Lo ID-generation algorithm and asynchronous updates, and it also discusses database connection management and transaction isolation levels. Payment and URL-shortener systems are used as examples to illustrate strategies for improving database performance.
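As a rough illustration of the Hi/Lo idea mentioned above, here is a minimal sketch (hypothetical names; an in-memory counter stands in for the real database sequence): callers only hit the shared sequence once per block of IDs, which is how the algorithm reduces contention on the sequence.

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal Hi/Lo sketch: the database sequence is only hit once per block
// of `blockSize` ids, which reduces contention on the shared sequence.
// `fetchNextHi` is a stand-in for a real `SELECT nextval(...)` round trip.
class HiLoGenerator {
    private final int blockSize;
    private long hi = -1;      // current "hi" block, fetched from the "database"
    private int lo;            // position inside the current block
    private final AtomicLong sequence = new AtomicLong(); // simulated DB sequence

    HiLoGenerator(int blockSize) {
        this.blockSize = blockSize;
        this.lo = blockSize;   // force a fetch on the first call
    }

    private long fetchNextHi() {
        return sequence.getAndIncrement(); // imagine: one DB round trip
    }

    synchronized long nextId() {
        if (lo >= blockSize) {             // current block exhausted
            hi = fetchNextHi();
            lo = 0;
        }
        return hi * blockSize + lo++;      // blockSize ids per sequence fetch
    }
}
```

With a block size of 10, the first ten `nextId()` calls are served from a single sequence fetch.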
JDBC has been the de facto standard for accessing relational databases for a long time. Times are, however, changing. In cloud environments the pay-per-use model is popular: if you can use resources more efficiently, you can save money! In addition, when running applications at cloud scale, the number of concurrent requests that hit your services can skyrocket. Can JDBC handle such concurrency efficiently? No. The time has come to look beyond JDBC!
For services, reactive frameworks are becoming more popular. These frameworks can make more efficient use of resources due to their non-blocking nature, especially at high concurrency. Now, with R2DBC, relational databases can also be accessed using a reactive API! This means more efficient use of CPU and memory, and better response times and throughput at high concurrency.
A tempting story, but of course there are many questions:
- Is R2DBC mature enough to implement?
- Which R2DBC drivers are available?
- Is framework support available?
- What do you need to do in order to implement R2DBC?
- Does it improve performance enough to make the switch worthwhile?
- Do I need to have a completely non-blocking stack to benefit from using R2DBC?
To answer these questions and more, I've created several implementations using R2DBC and JDBC with Spring Web MVC and Spring WebFlux and put them to the test. I looked at how to implement R2DBC and measured resource usage, throughput, and response times. Interested in the results? Hint: R2DBC is pretty cool!
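The throughput argument above comes down to thread economics: a blocking JDBC call pins a thread for the whole database round trip, while a non-blocking driver frees it. A toy sketch using only the JDK, with CompletableFuture as a rough stand-in for an R2DBC publisher (this is not the actual R2DBC API):

```java
import java.util.List;
import java.util.concurrent.*;
import java.util.stream.*;

// Toy illustration of blocking vs non-blocking data access.
// `blockingQuery` pins the calling thread for the whole "query"; the async
// variant registers a callback and returns immediately, so a tiny thread
// pool can keep many in-flight queries going at once.
class BlockingVsAsync {
    static String blockingQuery(int id) throws InterruptedException {
        Thread.sleep(50);                 // stand-in for a slow DB round trip
        return "row-" + id;
    }

    static CompletableFuture<String> asyncQuery(int id, ScheduledExecutorService timer) {
        CompletableFuture<String> f = new CompletableFuture<>();
        // complete later without holding any caller thread in the meantime
        timer.schedule(() -> { f.complete("row-" + id); }, 50, TimeUnit.MILLISECONDS);
        return f;
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        // 100 concurrent "queries" complete in roughly one round-trip time,
        // even though only a single timer thread is involved.
        List<CompletableFuture<String>> futures =
            IntStream.range(0, 100).mapToObj(i -> asyncQuery(i, timer)).collect(Collectors.toList());
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        System.out.println(futures.get(0).join()); // prints "row-0"
        timer.shutdown();
    }
}
```

Doing the same with `blockingQuery` would need 100 threads to achieve the same latency, which is exactly the resource cost the reactive approach avoids.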
Training Webinar: Enterprise application performance with distributed caching (OutSystems)
2nd Session - Distributed Caching:
- What is Distributed Caching
- Performance hurdles solved by Distributed Caching
- When to use Distributed Caching
- Patterns to Populate a Distributed Cache
- How to use Distributed Caching in OutSystems
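One common pattern for populating a distributed cache is cache-aside (lazy loading). A minimal sketch, with a ConcurrentHashMap standing in for the distributed cache client and a hypothetical `loadFromDb` loader:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Cache-aside ("lazy loading") sketch: on a read, try the cache first and
// only go to the database on a miss, populating the cache on the way back.
// A ConcurrentHashMap stands in for a distributed cache client here.
class CacheAside {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> loadFromDb; // hypothetical DB loader

    CacheAside(Function<String, String> loadFromDb) {
        this.loadFromDb = loadFromDb;
    }

    String get(String key) {
        // 1. try the cache; 2. on a miss, load from the database and populate
        return cache.computeIfAbsent(key, loadFromDb);
    }

    void invalidate(String key) {
        cache.remove(key); // on writes, evict so the next read reloads fresh data
    }
}
```

Repeated reads of the same key hit the database only once until the entry is invalidated.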
Free Online training: https://www.outsystems.com/learn/courses/
Follow us on Twitter http://www.twitter.com/OutSystemsDev
Like us on Facebook http://www.Facebook.com/OutSystemsDev
End-to-End Reactive Data Access Using R2DBC with RSocket and Proteus (VMware Tanzu)
The lack of asynchronous relational database drivers in Java has been a barrier to writing scalable, data-driven applications for many. R2DBC seeks to change this with a new API designed from the ground up for reactive programming against relational databases; its intent is to support reactive data access built on natively asynchronous, non-blocking SQL database drivers.
How does this change the game for data access in the cloud? Used in conjunction with RSocket and Proteus, it is now possible to write applications benefiting from reactive streaming end-to-end, from the browser all the way to the database. No more fiddling with paging APIs, polling for updates, or writing complex logic to merge data from multiple sources: reactive streams can handle all of this for you!
RSocket is an open-source, reactive networking protocol that is a collaborative development initiative of Netifi with Pivotal, Facebook, and others. Proteus is a freely available broker for RSocket that is designed to handle the challenges of communication between complex networks of services—both within the data center and over the internet—extending to mobile devices and browsers.
Attend this webinar to learn how to use Pivotal Cloud Foundry with R2DBC and Proteus to build reactive microservices that return large amounts of data in a streaming fashion over RSocket.
Speakers: Ryland Degnan, co-founder and CTO of Netifi and Dan Baskette, Pivotal host
Lessons PostgreSQL learned from commercial databases, and didn’t (PGConf APAC)
These are the slides used by Illay for his presentation at pgDay Asia 2016, "Lessons PostgreSQL learned from commercial databases, and didn’t". The talk takes you through some of the things PostgreSQL has done really well, and some things that PostgreSQL can learn from other databases.
Grokking Techtalk #37: Data intensive problem (Grokking VN)
At some point in your software engineering career, you will have to deal with data, and your success depends on how much data your software can handle. Starting from a simple problem that requires processing a large amount of data, this talk presents how to approach this kind of issue and how to design and choose an efficient solution.
About speaker:
Hồ is a Senior Software Engineer at AXON, where he helps design and develop complex distributed systems, including image and video encoding and a distributed file-conversion system. Besides coding, Hồ likes to read manga and meet friends in his free time.
Comparing high availability solutions with percona xtradb cluster and percona... (Marco Tusa)
Percona XtraDB Cluster (PXC) is currently the most popular HA solution in the MySQL ecosystem, and Galera-based solutions such as PXC have been the only viable option when looking for a high grade of HA using synchronous replication.
But Oracle has worked intensively on making Group Replication more solid and easy to use.
It is time to identify if Group Replication and attached solutions, like InnoDB cluster, can compete or even replace solutions based on Galera.
This presentation will focus on comparing the two solutions and how they behave when serving basic HA problems.
Attendees will be able to get a clearer understanding of which solutions will serve them better, and in which cases.
Postgres Vision 2018: WAL: Everything You Want to Know (EDB)
The Write-Ahead Log (WAL) in PostgreSQL is a central feature of the database and it's relied upon to achieve critical functions, like backup, replication, and others. In this presentation delivered at Postgres Vision 2018, Devrim Gündüz, Principal Systems Engineer at EnterpriseDB, explains WAL and what the average database administrator needs to know.
Tungsten Connector / Proxy is truly the secret sauce for the Tungsten Clustering solution. Watch this webinar to learn how the Tungsten Connector enables zero-downtime MySQL maintenance via the manual switch operation, and gain an understanding of the various configuration options for doing local reads in remote composite clusters.
AGENDA
- Review the cluster architecture
- Understand the role of the Connector
- Describe Connector deployment best practices (app, dedicated with lb, db with lb)
- Explore zero-downtime MySQL maintenance using the manual role switch procedure
- Learn about Connector routing patterns inside a composite cluster
- Illustrate a manual site switch
- Explain read affinity and the vast performance improvement of local reads
- Examine Connector multi-cluster support
PostgreSQL Enterprise Class Features and Capabilities (PGConf APAC)
These are the slides used by Venkar from Fujitsu for his presentation at pgDay Asia 2016. He spoke about some of the Enterprise Class features of PostgreSQL database.
Ten query tuning techniques every SQL Server programmer should know (Kevin Kline)
From the noted database expert and author of 'SQL in a Nutshell': SELECT statements have a reputation for being very easy to write, but hard to write very well. This session will take you through ten of the most problematic patterns and anti-patterns when writing queries and how to deal with them all. Loaded with live demonstrations and useful techniques, this session will teach you how to take your SQL Server queries from mundane to masterful.
Tempto is a product test framework that allows developers to write and execute tests for SQL databases running on Hadoop. Individual test requirements, such as data generation, HDFS copy/storage of generated data, and schema creation, are expressed declaratively and are automatically fulfilled by the framework. Developers can write tests in Java (using a TestNG-like paradigm and AssertJ-style assertions) or by providing query files with expected results. We will show how we use it for Presto product tests.
Benchto is a benchmark framework that provides an easy and manageable way to define, run, and analyze macro benchmarks in a clustered environment. Understanding the behavior of distributed systems is hard and requires good visibility into the state of the cluster and the internals of the tested system. This project was developed for repeatable benchmarking of Hadoop SQL engines, most importantly Presto.
Grokking TechTalk 9 - Building a realtime & offline editing service from scra... (Grokking VN)
https://www.youtube.com/watch?v=_Wqy1B8PXD4&feature=youtu.be
Talk presented by Vu Nguyen, CTO @ Liti Book (Vietnamese)
Brief intro: In this talk, I would like to share how we built a system for LitiBook that can handle (1) real-time editing, (2) offline editing, (3) synchronizing between devices, and (4) conflicts between different editing sessions. There are not many applications out there that can do all of the above (Evernote does not resolve conflicts; Hackpad, Trello, and Asana do not support offline). So the challenge is really interesting.
About speaker: Vu Nguyen is a young and passionate engineer who founded Liti Book with his friend. Liti Book aims to develop the next generation of productivity tools, supporting more collaboration, more real-time editing, ...
www.grokking.org
ProxySQL - High Performance and HA Proxy for MySQL (René Cannaò)
A high-availability proxy designed to solve real issues of MySQL setups, from small to very large production environments.
Presentation at Percona Live Amsterdam 2015
Introduction to Prometheus Monitoring, Singapore Meetup (Arseny Chernov)
Presented at inaugural Singapore Prometheus Meetup, videos on https://www.meetup.com/Singapore-Prometheus-Meetup/events/240844291/
Links to original slides from various blogposts provided.
Zero Downtime Architectures based on the JEE platform. Almost every big enterprise with an online business tries to design its applications so that they are always online. But is that also the case when we upgrade the database cluster? When we switch the whole data center? Based on a customer project, we present common architecture principles that enable you to do all this without any service interruption and, most importantly, without any stress.
How to get the maximum performance from your AEP server. This talk will discuss ways to improve the execution time of short-running jobs and how to properly configure the server depending on the expected number of users as well as the average size and duration of individual jobs. Included will be examples of making use of job pooling, database connection sharing, and parallel subprotocol tuning. Determining when to make use of clustered, grid, or load-balanced configurations, along with memory and CPU sizing guidelines, will also be discussed.
Dive deep into specific OSS packages and examine the top issues in the enterprise with two of our most qualified OSS architects. Bill Crowell and Vince Cox walk through their day-to-day work in OSS packages, ways to fix reported issues, and why you can't expect in-house developers to handle issues in OSS packages.
The venerable Servlet Container still has some performance tricks up its sleeve - this talk will demonstrate Apache Tomcat's stability under high load, describe some do's (and some don'ts!), explain how to performance test a Servlet-based application, troubleshoot and tune the container and your application and compare the performance characteristics of the different Tomcat connectors. The presenters will share their combined experience supporting real Tomcat applications for over 20 years and show how a few small changes can make a big, big difference.
Web Component Development Using Servlet & JSP Technologies (EE6) - Chapter 1... (WebStackAcademy)
Servlet technology is used to create web applications. It uses the Java language, so web applications built with servlets are secure, scalable, and robust.
Web applications are helper applications that reside at the web server and build dynamic web pages. A dynamic page could be anything, like a page that randomly chooses a picture to display, or even a page that displays the current time.
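The "page that displays the current time" example can be sketched with the JDK's built-in HTTP server as a lightweight stand-in for a servlet container (a real servlet would put the same logic in HttpServlet#doGet):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.time.LocalTime;

// A dynamic "current time" page, served by the JDK's built-in HTTP server.
// The handler plays the role a servlet's doGet method would: it computes the
// response body on every request, which is what makes the page "dynamic".
class TimePage {
    static HttpServer start(int port) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/time", exchange -> {
            byte[] body = ("It is now " + LocalTime.now()).getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) { os.write(body); }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        start(8080); // then visit http://localhost:8080/time
    }
}
```

Every request recomputes the body, so two visits a second apart show different times, which is the defining property of a dynamic page.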
Training Webinar: Detect Performance Bottlenecks of Applications (OutSystems)
In this webinar we look at how to detect and troubleshoot server-side performance bottlenecks.
Free Online training: https://www.outsystems.com/learn/courses/
Follow us on Twitter http://www.twitter.com/OutSystemsDev
Like us on Facebook http://www.Facebook.com/OutSystemsDev
In order to obtain the best possible performance from your AEP server, the core architecture provides methods to reuse job processes multiple times. This talk will cover how the mechanism functions, what performance improvements you might expect, what potential problems you might encounter, how to use pooling in protocols and applications, and how administrators or package developers can configure and debug specialized job pools for their particular applications.
Migrate from Oracle to Aurora PostgreSQL: Best Practices, Design Patterns, & ... (Amazon Web Services)
In this session, we show you how to set up the source Oracle database environment, the target PostgreSQL environment, and the parameter group configuration. We also recommend database parameters to disable foreign keys and triggers. Finally, we discuss best practices for using AWS Database Migration Service (AWS DMS) and the AWS Schema Conversion Tool (AWS SCT), and show you how to choose the instance type and configure AWS DMS.
Introduction to the Gatling performance-testing tool and how we used it to test Zonky's REST API. Includes an example of running distributed performance tests in AWS Fargate with real-time monitoring using a Logstash/Elasticsearch/Kibana stack.
Slow things down to make them go faster [FOSDEM 2022] (Jimmy Angelakos)
Talk from FOSDEM 2022
It's easy to get misled into overconfidence based on the performance of powerful servers, given today's monster core counts and RAM sizes. However, the reality of high concurrency usage is often disappointing, with less throughput than one would expect. Because of its internals and its multi-process architecture, PostgreSQL is very particular about how it likes to deal with high concurrency and in some cases it can slow down to the point where it looks like it's not performing as it should. In this talk we'll take a look at potential pitfalls when you throw a lot of work at your database. Specifically, very high concurrency and resource contention can cause problems with lock waits in Postgres. Very high transaction rates can also cause problems of a different nature. Finally, we will be looking at ways to mitigate these by examining our queries and connection parameters, leveraging connection pooling and replication, or adapting the workload.
Topics:
1. Understand what we mean by high concurrency.
2. Understand ACID & MVCC in Postgres.
3. Understand how high concurrency affects Postgres performance.
4. Understand how locks/latches affect Postgres performance.
5. Understand how high transaction rates can affect Postgres.
6. Mitigation strategies for high concurrency scenarios.
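One mitigation from the list above, connection pooling/limiting, can be sketched in plain Java: cap the concurrency the database sees and queue excess work on the application side instead. This is a hypothetical stand-in with no real Postgres involved; production deployments would use a pooler such as PgBouncer or HikariCP.

```java
import java.util.concurrent.Semaphore;

// Minimal connection-limiting sketch: instead of letting N clients open
// N Postgres connections (and fight over locks, latches, and CPU), a
// semaphore caps concurrency at `maxConnections` and queues the rest
// application-side, which keeps the database in its efficient range.
class BoundedDbAccess {
    private final Semaphore permits;

    BoundedDbAccess(int maxConnections) {
        this.permits = new Semaphore(maxConnections, true); // fair FIFO queueing
    }

    <T> T withConnection(java.util.function.Supplier<T> query) throws InterruptedException {
        permits.acquire();            // wait for a free "connection"
        try {
            return query.get();       // stand-in for the actual SQL round trip
        } finally {
            permits.release();        // always return the permit to the pool
        }
    }
}
```

The counterintuitive effect the talk title alludes to: making some clients wait briefly for a permit often yields higher total throughput than letting everyone hit the database at once.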
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase (Michael Stack)
Pradeep S, Mallikarjun V of Flipkart
Track 1: Internals
https://open.mi.com/conference/hbasecon-asia-2019
THE COMMUNITY EVENT FOR APACHE HBASE™
July 20th, 2019 - Sheraton Hotel, Beijing, China
https://hbase.apache.org/hbaseconasia-2019/
Grails has great performance characteristics but as with all full stack frameworks, attention must be paid to optimize performance. In this talk Lari will discuss common missteps that can easily be avoided and share tips and tricks which help profile and tune Grails applications.
PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story (vanphp)
Bill Monkman, Lead Engineer at Hootsuite, presenting on how Hootsuite went from zero to hundreds of millions of requests per day with its PHP codebase, and how dealing with that growth has shaped its future direction. Tips, optimizations, and horror stories from a rapidly-scaling PHP startup.
Video: https://www.youtube.com/watch?v=TZGeBAIMPII
Life In The FastLane: Full Speed XPages (Ulrich Krause)
Using XPages out of the box lets you build good-looking and well-performing applications. However, as XPages applications become bigger and more complex, performance can become an issue, and when it comes to scalability and speed optimization, there are a couple of things to take into consideration.
Learn how to use partial refresh and partial execution mode, and how to monitor their execution using a JSF LifeCycle monitor to avoid multiple re-calculations of controls. We will show tools that let you profile your code, readily available from OpenNTF, along with a demonstration of how to use them to improve the speed of your code.
Still writing SSJS and encountering a significant slowdown when using script libraries? See how you can improve the speed of your application by using Java instead of JS, JSON, and even @Formulas.
GraphSummit Paris - The art of the possible with Graph Technology (Neo4j)
Sudhir Hasbe, Chief Product Officer, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Enhancing Research Orchestration Capabilities at ORNL (Globus)
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
How Recreation Management Software Can Streamline Your Operations (wottaspaceseo)
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
OpenMetadata Community Meeting - 5th June 2024 (OpenMetadata)
The OpenMetadata Community Meeting was held on June 5th, 2024. In this meeting, we discussed the data quality capabilities that are integrated with the Incident Manager, providing a complete solution for your data observability needs. Watch the end-to-end demo of the data quality features.
* How to run your own data quality framework
* What is the performance impact of running data quality frameworks
* How to run the test cases in your own ETL pipelines
* How the Incident Manager is integrated
* Get notified with alerts when test cases fail
Watch the meeting recording here - https://www.youtube.com/watch?v=UbNOje0kf6E
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet they often turn into annoying tasks riddled with frustration, hostility, unclear feedback, and a lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll have the knowledge on how to organize and improve your code review proces
Understanding Globus Data Transfers with NetSageGlobus
NetSage is an open privacy-aware network measurement, analysis, and visualization service designed to help end-users visualize and reason about large data transfers. NetSage traditionally has used a combination of passive measurements, including SNMP and flow data, as well as active measurements, mainly perfSONAR, to provide longitudinal network performance data visualization. It has been deployed by dozens of networks world wide, and is supported domestically by the Engagement and Performance Operations Center (EPOC), NSF #2328479. We have recently expanded the NetSage data sources to include logs for Globus data transfers, following the same privacy-preserving approach as for Flow data. Using the logs for the Texas Advanced Computing Center (TACC) as an example, this talk will walk through several different example use cases that NetSage can answer, including: Who is using Globus to share data with my institution, and what kind of performance are they able to achieve? How many transfers has Globus supported for us? Which sites are we sharing the most data with, and how is that changing over time? How is my site using Globus to move data internally, and what kind of performance do we see for those transfers? What percentage of data transfers at my institution used Globus, and how did the overall data transfer performance compare to the Globus users?
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of the work the team is investigating ways to speedup the time to solution for many different parts of the DIII-D workflow including how they run jobs on HPC systems. One of these routes is looking at Globus Compute as a way to replace the current method for managing tasks and we describe a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
How to Position Your Globus Data Portal for Success Ten Good PracticesGlobus
Science gateways allow science and engineering communities to access shared data, software, computing services, and instruments. Science gateways have gained a lot of traction in the last twenty years, as evidenced by projects such as the Science Gateways Community Institute (SGCI) and the Center of Excellence on Science Gateways (SGX3) in the US, The Australian Research Data Commons (ARDC) and its platforms in Australia, and the projects around Virtual Research Environments in Europe. A few mature frameworks have evolved with their different strengths and foci and have been taken up by a larger community such as the Globus Data Portal, Hubzero, Tapis, and Galaxy. However, even when gateways are built on successful frameworks, they continue to face the challenges of ongoing maintenance costs and how to meet the ever-expanding needs of the community they serve with enhanced features. It is not uncommon that gateways with compelling use cases are nonetheless unable to get past the prototype phase and become a full production service, or if they do, they don't survive more than a couple of years. While there is no guaranteed pathway to success, it seems likely that for any gateway there is a need for a strong community and/or solid funding streams to create and sustain its success. With over twenty years of examples to draw from, this presentation goes into detail for ten factors common to successful and enduring gateways that effectively serve as best practices for any new or developing gateway.
May Marketo Masterclass, London MUG May 22 2024.pdfAdele Miller
Can't make Adobe Summit in Vegas? No sweat because the EMEA Marketo Engage Champions are coming to London to share their Summit sessions, insights and more!
This is a MUG with a twist you don't want to miss.
Developing Distributed High-performance Computing Capabilities of an Open Sci...Globus
COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and its broad response from the scientific community has forged new relationships among public health practitioners, mathematical modelers, and scientific computing specialists, while revealing critical gaps in exploiting advanced computing systems to support urgent decision making. Informed by our team’s work in applying high-performance computing in support of public health decision makers during the COVID-19 pandemic, we present how Globus technologies are enabling the development of an open science platform for robust epidemic analysis, with the goal of collaborative, secure, distributed, on-demand, and fast time-to-solution analyses to support public health.
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeAftab Hussain
Understanding variable roles in code has been found to be helpful by students
in learning programming -- could variable roles help deep neural models in
performing coding tasks? We do an exploratory study.
- These are slides of the talk given at InteNSE'23: The 1st International Workshop on Interpretability and Robustness in Neural Software Engineering, co-located with the 45th International Conference on Software Engineering, ICSE 2023, Melbourne Australia
Top 7 Unique WhatsApp API Benefits | Saudi ArabiaYara Milbes
Discover the transformative power of the WhatsApp API in our latest SlideShare presentation, "Top 7 Unique WhatsApp API Benefits." In today's fast-paced digital era, effective communication is crucial for both personal and professional success. Whether you're a small business looking to enhance customer interactions or an individual seeking seamless communication with loved ones, the WhatsApp API offers robust capabilities that can significantly elevate your experience.
In this presentation, we delve into the top 7 distinctive benefits of the WhatsApp API, provided by the leading WhatsApp API service provider in Saudi Arabia. Learn how to streamline customer support, automate notifications, leverage rich media messaging, run scalable marketing campaigns, integrate secure payments, synchronize with CRM systems, and ensure enhanced security and privacy.
Graspan: A Big Data System for Big Code AnalysisAftab Hussain
We built a disk-based parallel graph system, Graspan, that uses a novel edge-pair centric computation model to compute dynamic transitive closures on very large program graphs.
We implement context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases such as Linux shows that their Graspan implementations scale to millions of lines of code and are much simpler than their original implementations.
These analyses were used to augment the existing checkers; these augmented checkers found 132 new NULL pointer bugs and 1308 unnecessary NULL tests in Linux 4.4.0-rc5, PostgreSQL 8.3.9, and Apache httpd 2.2.18.
- Accepted in ASPLOS ‘17, Xi’an, China.
- Featured in the tutorial, Systemized Program Analyses: A Big Data Perspective on Static Analysis Scalability, ASPLOS ‘17.
- Invited for presentation at SoCal PLS ‘16.
- Invited for poster presentation at PLDI SRC ‘16.
2. www.luxoft.com
Application Performance: Database-related Problems
● Application performance;
● Common performance problems and their solutions;
● Database-related problems;
● Lock contention;
● Locking mechanism;
● Transaction isolation level;
● URL shortener example;
● Hi/Lo algorithms;
● Payment system example.
3.
Application Performance
● Key performance metrics:
- Request processing time;
- Throughput;
● Poor performance:
- Long time to process single requests;
- Low number of requests processed per second.
10.
Query Execution Time Is Too Long
● Missing indexes;
● Slow SQL queries (subqueries, too many JOINs, etc.);
● Slow SQL queries generated by ORM;
● Suboptimal JDBC fetch size;
● Lack of proper data caching;
● Lock contention.
11.
Missing Indexes
To find out which indexes to create, look at the query execution plan:
EXPLAIN PLAN FOR
SELECT isbn FROM book;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY());
12.
TABLE ACCESS FULL
● In a full table scan, every row of the table is read sequentially and
the columns are checked against the query condition;
● In most cases, a full table scan is the slowest way to read a table;
● Create the missing indexes so the database can search by index
instead of performing a full table scan.
13.
Slow SQL Queries
● Slow SQL queries (subqueries, too many JOINs, etc.):
Solution: Rewrite the query.
● Slow SQL queries generated by ORM:
- JPQL/HQL and Criteria API queries are translated to SQL;
Solutions:
- Rewrite the JPQL/HQL or Criteria API queries;
- Replace them with plain SQL queries.
14.
Suboptimal JDBC Fetch Size
JDBC allows you to specify the number of rows fetched in each
database round-trip for a query; this number is referred to as the
fetch size.
Solutions:
● java.sql.Statement.setFetchSize(rows)
● hibernate.jdbc.fetch_size property
16.
Lock Contention
Operations wait a long time to acquire locks because of high lock
contention.
Solution:
Revise the application logic and implementation:
● Update asynchronously;
● Replace updates with inserts (inserts do not block).
17.
Too Many Queries per Single Business Function
● Insert/update queries executed in a loop;
● The "SELECT N+1" problem;
● Solution: Reduce the number of calls hitting the database.
18.
Insert/Update Queries Executed in a Loop
● Use JDBC batch (keep batch size less than 1000);
● hibernate.jdbc.batch_size property;
● Periodically flush changes and clear Session/EntityManager
to control first-level cache size.
19.
JDBC Batch Processing
// JDBC batch: queue several parameter sets, then execute them in one round-trip
PreparedStatement preparedStatement = connection.prepareStatement(
        "UPDATE book SET title=? WHERE isbn=?");
preparedStatement.setString(1, "Patterns of Enterprise Application Architecture");
preparedStatement.setString(2, "007-6092019909");
preparedStatement.addBatch();
preparedStatement.setString(1, "Enterprise Integration Patterns");
preparedStatement.setString(2, "978-0321200686");
preparedStatement.addBatch();
int[] affectedRecords = preparedStatement.executeBatch();

// Hibernate: flush and clear periodically to keep the first-level cache small
for (int i = 1; i <= 100000; i++) {
    Book book = new Book(.....);
    session.save(book);
    if (i % 20 == 0) { // 20, same as the JDBC batch size
        // flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}
20.
"SELECT N+1" Problem
● The first query selects the root entities only; each associated
collection is then loaded with an additional query.
● So the persistence provider generates N+1 SQL queries, where N is
the number of root entities in the result list of the user's query.
21.
"SELECT N+1" Problem
Solutions:
● Use a different fetching strategy or an entity graph;
● Make child entities aggregate roots and use DAO methods to
fetch them:
- Replace bidirectional one-to-many mappings with unidirectional ones;
● Enable the second-level and query caches.
23.
Database Connection Management Problems
● The application uses too many DB connections:
- The application does not close connections after use.
Solution: Close every connection after use.
- The DB cannot handle as many connections as the application opens.
Solution: Use connection pooling.
● The application waits too long to get a connection from the pool.
Solution: Increase the pool size.
24.
JVM Performance Problems
Excessive JVM garbage collection slows the application down.
Solutions:
● Analyze garbage collector logs:
- Send GC data to a log file, enable GC log rotation:
-Xloggc:gc.log -XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=1M
-XX:+PrintGCTimeStamps
● Tune GC:
- Use Garbage-First Collector: -XX:+UseG1GC
25.
Application Specific Performance Problems
Resource-consuming computations:
● Algorithms with complexity O(N^2) or O(2^N);
● Asymmetric RSA encryption;
● Bcrypt hashing during authentication;
● Etc.
Solution: Horizontal scalability. Increase the number of instances
capable of processing requests and balance the load (create a cluster).
26.
Network-related Problems
● Network latency;
● Timeouts not configured:
- mail.smtp.connectiontimeout - socket connection timeout; the
default is an infinite timeout.
- mail.smtp.timeout - socket read timeout; the default is an infinite timeout.
27.
Reducing Lock Contention
● Database-related problems
- Query execution time is too long
• Lock contention
Solutions:
● Use the Hi/Lo algorithm;
● Update asynchronously;
● Replace updates with inserts.
28.
Locking Mechanism
Locks are mechanisms that prevent destructive interaction
between transactions accessing the same resource.
In general, multi-user databases use some form of data locking to
solve the problems associated with:
● data concurrency,
● consistency,
● integrity.
29.
Isolation Levels vs Locks
● Transaction isolation level does not affect the locks that are
acquired to protect data modifications.
● A transaction always gets an exclusive lock on any data it
modifies and holds that lock until the transaction completes,
regardless of the isolation level set for that transaction.
● For read operations transaction isolation levels primarily define
the level of protection from the effects of modifications made
by other transactions.
30.
Preventable Read Phenomena
● Dirty reads - A transaction reads data that has been written by
another transaction that has not been committed yet.
● Nonrepeatable reads - A transaction rereads data it has
previously read and finds that another committed transaction
has modified or deleted the data.
● Phantom reads - A transaction reruns a query returning a set
of rows that satisfies a search condition and finds that another
committed transaction has inserted additional rows that satisfy
the condition.
32.
Isolation Levels vs Read Phenomena
                  Dirty reads   Nonrepeatable reads   Phantom reads
Read uncommitted  Possible      Possible              Possible
Read committed    Not possible  Possible              Possible
Repeatable read   Not possible  Not possible          Possible
Serializable      Not possible  Not possible          Not possible
36.
URL Shortener Example
Requirements:
● Receives a URL and returns a "shortened" version;
● E.g. post "http://github.com" to "http://url-shortener/s/" and get
back "http://url-shortener/s/2Bi";
● The shortened URL can be resolved to the original URL. E.g.
"http://url-shortener/s/2Bi" will return "http://github.com";
● Shortened URLs that have not been accessed for longer than some
specified amount of time should be deleted.
37.
URL Shortener Example
● Each time a URL is submitted, a new record is inserted into the
database;
● Insert operations do not introduce locks in the database;
● A database sequence is used for primary key generation;
● The Hi/Lo algorithm reduces the number of database hits and
improves performance.
38.
URL Shortener Example
● The original URL's primary key is converted to radix 62:
- The radix-62 alphabet contains digits and lower- and upper-case
letters: 10000 in radix 10 = 2Bi in radix 62;
● The string identifying the original URL is converted back to radix 10
to get the primary key value, and the original URL can be found by ID.
39.
URL Shortener Example
E.g. the URL "http://github.com/" shortened to "http://url-shortener/s/2Bi":
● Insert a new record with id 10000 for the original URL
"http://github.com/", representing the "shortened" URL;
● Convert id 10000 to radix 62: 2Bi.
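The conversion above can be sketched in plain Java. The alphabet order (digits, then lower-case, then upper-case letters) is an assumption chosen so that id 10000 maps to "2Bi" as in the example; the class and method names are illustrative.

```java
// Radix-62 conversion between primary keys and short-URL path segments.
// Alphabet order is an assumption consistent with the 10000 -> "2Bi" example.
class Base62 {
    private static final String ALPHABET =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Convert a primary key to its radix-62 string (the short-URL path segment).
    static String encode(long id) {
        if (id == 0) return "0";
        StringBuilder sb = new StringBuilder();
        while (id > 0) {
            sb.append(ALPHABET.charAt((int) (id % 62)));
            id /= 62;
        }
        return sb.reverse().toString();
    }

    // Convert a radix-62 string back to the primary key used for the DB lookup.
    static long decode(String s) {
        long id = 0;
        for (char c : s.toCharArray()) {
            id = id * 62 + ALPHABET.indexOf(c);
        }
        return id;
    }
}
```

With this alphabet, `encode(10000)` yields "2Bi" and `decode("2Bi")` yields 10000, matching the worked example.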
40.
URL Shortener Example
● Each time a shortened URL is resolved, the last-view timestamp is
updated in the database and the total-views column is incremented;
● These updates should be asynchronous so that lock contention does
not reduce performance;
● The absence of update operations on the resolve path gives the
application better scalability and throughput.
41.
Update Asynchronously
● When a URL is resolved, a JMS message is sent to a queue;
● The application consumes messages from the queue and updates
the records in the database;
● During URL resolving there are no update operations.
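The queue-based pattern above can be sketched in-process. Here a `BlockingQueue` stands in for the JMS queue and a `ConcurrentHashMap` for the database table; all names are illustrative, and a real deployment would use a JMS listener issuing SQL updates instead.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// In-process sketch of asynchronous view-count updates:
// resolving enqueues an event; a consumer applies updates later.
class AsyncViewCounter {
    private final BlockingQueue<Long> viewEvents = new LinkedBlockingQueue<>();
    private final Map<Long, Long> viewCounts = new ConcurrentHashMap<>();

    // Resolving a URL only enqueues an event -- no database update, no lock.
    void onUrlResolved(long urlId) {
        viewEvents.add(urlId);
    }

    // Background consumer: drains the queue and applies the updates
    // (in the real system this is the JMS listener updating the DB).
    void drain() {
        Long id;
        while ((id = viewEvents.poll()) != null) {
            viewCounts.merge(id, 1L, Long::sum);
        }
    }

    long viewsOf(long urlId) {
        return viewCounts.getOrDefault(urlId, 0L);
    }
}
```

The key property is that `onUrlResolved` never touches the "table", so the hot resolve path takes no database locks.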
43.
Hi/Lo Algorithms
● JPA mapping:
@SequenceGenerator(name = "MY_SEQ", sequenceName = "MY_SEQ",
allocationSize = 50)
allocationSize = N - fetch the next value from the database once every
N persist calls and locally (in-memory) increment the value in between.
● Sequence DDL:
CREATE SEQUENCE MY_SEQ INCREMENT BY 50 START WITH 50;
INCREMENT BY should match allocationSize;
START WITH should be greater than or equal to allocationSize.
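The allocation behavior described above can be sketched as a small in-memory generator. This is a simplification -- real JPA providers implement several hi/lo variants -- and the class name and the counter simulating the sequence are assumptions.

```java
// Simplified in-memory sketch of Hi/Lo allocation with allocationSize = 50:
// one simulated sequence fetch buys a whole block of ids.
class HiLoGenerator {
    private final int allocationSize;
    private long sequenceValue = 0;   // "sequence" starting at 0, stepping by allocationSize
    private long nextId = 1;
    private long hi = 0;              // upper bound of the current id block
    private int sequenceFetches = 0;  // simulated database round-trips

    HiLoGenerator(int allocationSize) {
        this.allocationSize = allocationSize;
    }

    // Simulated database round-trip (SELECT MY_SEQ.NEXTVAL in the real system).
    private long nextSequenceValue() {
        sequenceFetches++;
        sequenceValue += allocationSize;
        return sequenceValue;
    }

    long nextId() {
        if (nextId > hi) {             // current block exhausted
            hi = nextSequenceValue();  // one DB hit buys allocationSize ids
            nextId = hi - allocationSize + 1;
        }
        return nextId++;
    }

    int sequenceFetches() { return sequenceFetches; }
}
```

In this sketch, generating 100 ids costs only two simulated sequence fetches, which is the whole point of the allocation.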
44.
Payment System Example
Requirements:
● Users can add funds to their accounts (add funds);
● Users can pay shops with funds from their accounts (payment);
● Users and shops can withdraw money from their accounts
(withdraw funds);
● The account balance must always be up to date.
47.
Simple solution 1 - Queries
-- Add funds / payment / withdrawal: adjust the stored balance in place
UPDATE ACCOUNT_BALANCE SET
BALANCE = BALANCE + :amount
WHERE ACCOUNT_ID = :account

-- Read the current balance
SELECT ACCOUNT_ID,
BALANCE
FROM ACCOUNT_BALANCE
WHERE ACCOUNT_ID = :account
48.
Simple solution 1 - Problems
● Update operations introduce locks;
● During the Christmas holidays users can make hundreds of
payments simultaneously;
● Due to lock contention, payments will be slow;
● The system has low throughput.
49.
Simple solution 2
● Do not store account balance at all;
● Store details of each transaction;
● Calculate balance dynamically based on transaction log;
● Advantages:
- Still simple enough;
- No update operations at all.
51.
Simple solution 2 - Queries
INSERT INTO TRANSACTION_LOG(TX_ID, TX_TYPE, TX_DATE,
ACCOUNT_ID, TX_AMOUNT)
VALUES(:id, :type, :date, :account, :amount)

SELECT ACCOUNT_ID,
SUM(TX_AMOUNT) AS BALANCE
FROM TRANSACTION_LOG
WHERE ACCOUNT_ID = :account
GROUP BY ACCOUNT_ID
52.
Simple solution 2 - Problems
● Users can make thousands of transactions per day;
● During the Christmas holidays users can make thousands of
payments per hour;
● The number of transactions grows continuously;
● The more records in the TRANSACTION_LOG table, the slower the
balance query becomes.
53.
Better solution
● Store yesterday's balance in a table;
● Update the account balance once a day in the background;
● Store the details of each transaction;
● Calculate the balance dynamically from yesterday's stored balance
plus today's transactions from the transaction log.
54.
Better solution - Data model
Table ACCOUNT_BALANCE:
ACCOUNT_ID | BALANCE_DATE | BALANCE

Table TRANSACTION_LOG:
TX_ID | TX_TYPE | TX_DATE | ACCOUNT_ID | TX_AMOUNT
55.
Better solution - Queries
INSERT INTO TRANSACTION_LOG(TX_ID, TX_TYPE, TX_DATE,
ACCOUNT_ID, TX_AMOUNT)
VALUES(:id, :type, :date, :account, :amount)
-- Executed once a day at midnight
UPDATE ACCOUNT_BALANCE SET
BALANCE = BALANCE + :transactionLogSum,
BALANCE_DATE = :lastTransactionLogDate
WHERE ACCOUNT_ID = :account
56.
Better solution - Queries
SELECT ACCOUNT_ID,
BALANCE_DATE,
BALANCE AS CACHED_BALANCE
FROM ACCOUNT_BALANCE
WHERE ACCOUNT_ID = :account

SELECT ACCOUNT_ID,
MAX(TX_DATE) AS LAST_TX_LOG_DATE,
SUM(TX_AMOUNT) AS TX_LOG_SUM
FROM TRANSACTION_LOG
WHERE ACCOUNT_ID = :account
AND TX_DATE > :balanceDate
GROUP BY ACCOUNT_ID
-- BALANCE = CACHED_BALANCE + TX_LOG_SUM
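The final comment line can be illustrated with a small Java sketch: the current balance is the cached balance as of BALANCE_DATE plus the sum of transactions dated strictly after it. The record and method names are illustrative, not from the slides.

```java
import java.time.LocalDate;
import java.util.List;

// Sketch of the balance computation above: cached balance as of BALANCE_DATE
// plus the sum of transaction amounts with TX_DATE > :balanceDate.
class BalanceCalculator {
    // Stands in for a TRANSACTION_LOG row; amounts in cents to avoid floating point.
    record Transaction(LocalDate txDate, long amountCents) {}

    static long currentBalance(long cachedBalanceCents,
                               LocalDate balanceDate,
                               List<Transaction> transactionLog) {
        long txLogSum = transactionLog.stream()
                .filter(tx -> tx.txDate().isAfter(balanceDate)) // TX_DATE > :balanceDate
                .mapToLong(Transaction::amountCents)
                .sum();
        return cachedBalanceCents + txLogSum; // CACHED_BALANCE + TX_LOG_SUM
    }
}
```

Because only transactions newer than the cached date are summed, the work per query stays bounded no matter how large the log grows.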
57.
Better solution - Advantages
● No updates during payment operations - no locks;
● No locks - better throughput;
● The number of rows summed per query is bounded (at most one
day of transactions);
● Near-constant query execution time.