1) The document discusses an industrial experience report on detecting problems in database access code of large scale Java systems. It focuses on problems related to how developers use frameworks like Hibernate, Spring, and JDBC rather than language-level problems.
2) Existing static analysis tools provide limited support for detecting bugs related to the configuration and use of these frameworks. The paper presents "DBChecker", a tool that looks for functional and performance bug patterns related to incorrect usage of ORM frameworks and transaction management frameworks.
3) When adopting DBChecker in practice, the tool prioritizes detected problems based on the database tables impacted, and developers were educated on the tool through workshops to improve understanding of detected issues. Handling
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
ICSE2016 - Detecting Problems in Database Access Code of Large Scale Systems - An Industrial Experience Report
1. Detecting Problems in the Database
Access Code of Large Scale Systems
An industrial Experience Report
1
Mohamed Nasser, Parminder Flora
Tse-Hsun(Peter) Chen Ahmed E. HassanWeiyi Shang
2. Existing static analysis tools focus on
language-related problems
2
Coverity PMD Google error-prone
Facebook InferFindBugs
However, many problems are related to
how developers use different frameworks
3. Over 67% of Java developers use
Object-Relational Mapping
(Hibernate) to access databases
3
Existing static analysis tools provide
mostly rudimentary support for JDBC!
22%67%
4. Over 40% of Java web application
developers use Spring
4
Developers use Spring to manage database
transactions in web applications
None of the static analysis tools support
Spring!
5. There is a huge need for framework-
specific tools
5
Developers leverage MANY frameworks,
but existing tools only support detecting
language-related problems.
6. An example class with Java ORM code
6
@Entity
@Table(name = “user”)
@DynamicUpdate
public class User{
@Column(name=“id”)
private int id;
@Column(name=“name”)
String userName;
@OneToMany(fetch=FetchType.EAGER)
List<Team> teams;
public void setName(String n){
userName = n;
}
… other getter and setter methods
User.java
User class is
mapped to “user”
table in DB
id is mapped to the
column “id” in the
user table
A user can belong
to multiple teams
Eagerly retrieve
associated teams
when retrieving a
user object
Performance-
related configs
7. Accessing the database using ORM
7
User u = findUserByID(1);
ORM
database
select u from user
where u.id = 1;
u.setName(“Peter”);
update user set
name=“Peter”
where user.id = 1;
Objects SQLs
8. Transaction management using Spring
8
@Transaction(Propogation.REQUIRED)
getUser(){
…
updateUserGroup(u)
…
}
By using ORM and Spring, developers
can focus more on the business logic
and functionality
Create a DB
transaction
Entire business logic will
be executed with the
same DB transaction
10. Overview of the presentation
10
Bug patterns Lessons learned when
adopting the tool in practice
11. Overview of the presentation
11
Bug patterns Lessons learned when
adopting the tool in practice
More patterns and learned
lessons in the paper
12. ORM excessive data bug pattern
Class User{
@EAGER
List<Team> teams;
}
User u = findUserById(1);
u.getName();
EOF
12
Objects
SQL
Eagerly retrieve
teams from DB
User Table Team Table
join Team data is never
used!
13. Detecting excessive data
using static analysis
13
First find all the objects that
eagerly retrieve data from DB
Class User{
@EAGER
List<Team> teams;
}
Identify all the data usages of
ORM-managed objects
User user = findUserByID(1);
Check if the eagerly retrieved
data is ever used
user.getName();
user team
user team
14. Nested transaction bug pattern
14
@Transaction(Propogation.
REQUIRED)
getUser(){
updateUserGroup(u)
…
}
Create a DB
transaction
@Transaction(Propogation.
REQUIRES_NEW)
Create a child transaction, and suspend
parent transaction until child is finished
Misconfigurations can cause unexpected
transaction timeout, deadlock, or other
performance-related problems
15. Detecting nested transaction bug
pattern
15
@Transaction(Propogation.
REQUIRED)
getUser(){
…
updateUserGroup(u)
…
}
Parse all transaction
configurations
Identify all methods with the
annotation
Propogation.REQUIRED
Propogation.REQUIRS_NEW
calls
Traverse the call graph to identify
potential misconfigurations
16. Limitation of current static analysis
tools
16
Annotations are lost
when converting source
code to byte code
Do not consider how
developers configure
frameworks
@Transaction(Propo
gation.REQUIRED)
@EAGER
Many problems
are related to
framework
configurations
Many
configurations are
set through
annotations
17. Overview of the presentation
17
Bug patterns
Lessons learned when
adopting the tool in practice
Most discussed bug
patterns are related to
incorrect usage of
frameworks
18. Overview of the presentation
18
Bug patterns
Lessons learned when
adopting the tool in practice
Most discussed bug
patterns are related to
incorrect usage of
frameworks
19. Handling a large number of detection
results
19
• Developers have limited time to fix detected problems
• Most existing static analysis frameworks do not prioritize
the detected instances for the same bug pattern
20. 20
Prioritizing based on DB tables
User
Time zone
• Problems related to large or
frequently-accessed tables are
ranked higher (more likely to be
performance bottlenecks)
• Problems related to highly
dependable tables are ranked
higher
21. Developers have different
backgrounds
21
• Not all developers are familiar with these frameworks and
databases
• Developers may not take the problems seriously if they
don’t understand the impact
22. Educating developers about
the detected problems
22
• We hosted several workshops
to educate developers about
the impact and cause of the
problems
• Walk developers through
examples of detected
problems
• May learn new bug patterns
from developers
23. Overview of the presentation
23
Bug patterns
Lessons learned when
adopting the tool in practice
Most discussed bug
patterns are related to
incorrect usage of
frameworks
We prioritize problems
based on DB tables, and
educate developers about
the problems