Amin Milani Fard
Directed Model Inference for Testing
and Analysis of Web Applications
University of British Columbia
Oct 2015
http://www.knowdiff.net/
Modern Web Applications
2
3
4
Manual Testing and Maintenance
5
Incomplete
6
Test Model
Generation
Test Case
Generation
Unit Test Fixture
Generation
Code
Maintenance
P1. Test Model Generation
• In model-based testing, models of program behaviour are
used to generate test cases.
• Dynamic analysis and exploration (crawling) derive the test
models used by many automated testing techniques.
7
Most industrial web applications have a huge state-space.
Exhaustive crawling (BFS, DFS, or random search) can
cause the state explosion problem.
Given a limited time, exhaustive crawlers can become mired
in specific parts of the application, yielding poor coverage.
8
Covering the whole app is infeasible in a limited time, so …
RQ1. How can we derive test models for web applications
more effectively compared to exhaustive exploration
methods?
9
Amin Milani Fard, Ali Mesbah
Feedback-Directed Exploration of Web
Applications to Derive Test Models
24th IEEE International Symposium on
Software Reliability Engineering (ISSRE), 2013
Test Model Generation
We consider 4 properties for a test model:
• Functionality Coverage: The amount of code covered
• Navigational Coverage: The coverage of different navigational branches
• Page Structural Coverage: The coverage of heterogeneous DOM structures
• Size: The number of edges in the state-flow graph (SFG)
How can we infer a model satisfying all these?
11
Feedback-directed Exploration
• FeedEx uses the feedback obtained to predict
(1) which states should be expanded next
(2) in which order events should be executed
• Repeat until a time/state limit
• Take the crawler to state s with highest state score
• Execute the fittest event on s based on event score
12
State score is a combination of
• Code Coverage Impact: The amount of code coverage
increase
• Path Diversity: The dissimilarity (non-overlap) between explored paths
• DOM Diversity: The dissimilarity between DOM trees
Event Score
• Event Productivity Ratio: Unexecuted events first. Penalize
events that result in an already discovered state, e.g. self-
loops.
13
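The scoring just described can be sketched in a few lines; this is a minimal illustration, and the field names, equal weights, and productivity formula are my own simplifications rather than FeedEx's actual implementation:

```javascript
// Hypothetical sketch of FeedEx's state and event scoring (field names,
// equal weights, and the productivity formula are illustrative only).
function stateScore(s) {
  // Equal-weight combination of the three feedback signals, each in [0, 1].
  return (s.coverageImpact + s.pathDiversity + s.domDiversity) / 3;
}

function eventScore(e) {
  // Unexecuted events score highest; executed ones are ranked by how often
  // they produced a new state (penalizing self-loops and rediscoveries).
  return e.executions === 0 ? 1 : e.newStates / e.executions;
}

function pickBest(items, score) {
  return items.reduce((best, x) => (score(x) > score(best) ? x : best));
}

// One iteration of the loop: go to the fittest state, fire its fittest event.
const states = [
  { id: "s1", coverageImpact: 0.2, pathDiversity: 0.9, domDiversity: 0.4 },
  { id: "s2", coverageImpact: 0.8, pathDiversity: 0.7, domDiversity: 0.9 },
];
const next = pickBest(states, stateScore);

const events = [
  { id: "e1", executions: 3, newStates: 0 }, // always self-looped so far
  { id: "e2", executions: 0, newStates: 0 }, // never executed
];
const fittest = pickBest(events, eventScore);
```

The crawler would repeat this selection step until the time or state budget is exhausted.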
Evaluation and Results
• Objects: 6 open-source JavaScript web apps
• Fixed time: 5 minutes, no limitations on: depth, number of states
17
[Bar charts: Coverage (0-60%), DOM Diversity, Path Diversity, and Test Model Size / Test Suite Size (0-750), comparing DFS, BFS, RND, and FeedEx]
FeedEx improvements: 10-28%, 7-4000%, and 23-130% on the coverage and diversity metrics; 38-86% and 42-61% on model and suite size
18
DOM-based Testing
19
Crawling automates testing by exploring more states, but is limited in:
• Proper input values
• Choosing paths to explore
• Generating effective assertions
RQ2. Can we utilize the knowledge in existing tests to generate new
tests?
20
P2. Test Case Generation
Amin Milani Fard, Mehdi Mirzaaghaei, Ali Mesbah
Leveraging Existing Tests in
Automated
Test Generation for Web Applications
29th IEEE/ACM International Conference on
Automated Software Engineering (ASE), 2014
Combining Manual and Automated Tests
- Input data and sequence
- DOM elements to be
asserted
23
- Automated crawling
- Automated test case
generation
24
Generated
test cases
Testilizer idea
Extended State-Flow
Graph
Initial State-Flow Graph
Human-written
test cases
Exploring Alternative Paths
25
• Remain close to the manual-test paths
Initial State-Flow
Graph
Extended State-Flow
Graph
Regenerating Assertions
27
(1) Reusing the same assertion
(2) Regenerating assertions for exact DOM element/region
match
(3) Generating assertions for similar DOM region match
Assertion Reuse
An assertion on a shared state can be reused for a new test
case.
28
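A minimal sketch of this reuse, assuming a hypothetical map from shared states to the assertions that human-written tests attached to them:

```javascript
// Assertions the human tester attached to state s1 (shape is illustrative).
const manualAssertions = {
  s1: [dom => dom.includes('<div id="cart">')],
};

// Any generated test whose path passes through a shared state can
// re-apply that state's assertions unchanged.
function reusableAssertions(pathStates) {
  return pathStates.flatMap(s => manualAssertions[s] || []);
}

const generatedPath = ["index", "s1", "s5"]; // an alternative path through s1
const checks = reusableAssertions(generatedPath);
const allPass = checks.every(check => check('<div id="cart">...</div>'));
```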
Assertion Regeneration
Repetition-based assertion regeneration
1. exact element-based assertions
2. exact region-based assertions
29
Checked Element
Checked Element Region
Generating Similar Region Assertions
Inexact repetitions of a checked element/region can also
be important for testing.
A classification problem:
• Is a block-level DOM element important enough to be checked by an assertion?
30
<div id="header">
<div id="nav">
<div id="footer">
<div id="article"> <div
id="sidebar"
><div id="section">
<div id="header">
<div id="nav">
<div id="footer">
<span id="main"> <span
id="menu
e">
<span id="content">
33
• 150% imp over the original test suite
• 170% imp over the random assertion (RND) & exploration (RAND)
• 37% imp over the random assertion (RND)
While code coverage was not our main goal
• 30% imp over the original test suite
• 18% imp over the random exploration
0%
7%
13%
20%
26%
33%
Fault Detection Rate
ORIG RAND + RND EXND + RND Testilizer
Evaluation and Results
P3. Unit Test Fixture Generation
If an expected DOM element (test fixture) is not present, a
JavaScript unit test may throw an exception or produce an
incorrect result.
36
Test fixtures define the state of the test
environment before a test runs.
38
Proper DOM-based fixtures are required to achieve high
coverage.
RQ3. How can we automate fixture generation for unit
testing?
Challenges
(1) DOM-related variables
(2) Hierarchical DOM relations
40
Amin Milani Fard, Ali Mesbah, Eric Wohlstadter
Generating Fixtures for JavaScript
Unit Testing
30th IEEE/ACM International Conference on
Automated Software Engineering (ASE), 2015
Apply the new
fixture on DOM
Unit Test
42
ConFix idea
Instrument the code Instrumented Code
JavaScript Code
+
Function under
test (FUT)
Solve constraints and
generate a fixture
Collect exec trace
and deduce DOM-
based constraints
Execute the FUT
Instrumentation
trace = [];
function confixWrapper(statementType, statement, varList, varValueList, enclosingFunction, actualStatement) {
  trace.push({statementType: statementType, statement: statement, varList: varList, varValueList: varValueList, enclosingFunction: enclosingFunction, actualStatement: actualStatement});
  return actualStatement;
}
function getConfixTrace() {
  return trace;
}
function dg(x) {
  return confixWrapper("return", "return confixWrapper(\"functionCall\", \"document.getElementById(x)\", [\"x\"], [x], \"dg\", document.getElementById(x));", [""], [], "dg", confixWrapper("functionCall", "document.getElementById(x)", ["x"], [x], "dg", document.getElementById(x)));
}
function sumTotalPrice() {
  sum = confixWrapper("infix", "sum = 0", [""], [], "sumTotalPrice", 0);
  itemList = confixWrapper("infix", "itemList = confixWrapper(\"functionCall\", \"dg('items')\", [\"items\"], ['items'], \"sumTotalPrice\", dg('items'))", [""], [], "sumTotalPrice", confixWrapper("functionCall", "dg('items')", ["items"], ['items'], "sumTotalPrice", dg('items')));
  if (confixWrapper("condition", "itemList.children.length === 0", [""], [], "sumTotalPrice", itemList.children.length === 0))
    confixWrapper("functionCall", "dg('message')", ["message"], ['message'], "sumTotalPrice", dg('message')).innerHTML = confixWrapper("infix", "confixWrapper(\"functionCall\", \"dg('message')\", [\"message\"], ['message'], \"sumTotalPrice\", dg('message')).innerHTML = \"Item list is empty!\"", [""], [], "sumTotalPrice", "Item list is empty!");
  else {
    for (i = confixWrapper("infix", "i = 0", [""], [], "sumTotalPrice", 0); confixWrapper("loopCondition", "i < itemList.children.length", ["i", "itemList"], [i, itemList], "sumTotalPrice", i < itemList.children.length); i++) {
      p = confixWrapper("infix", "p = itemList.children[i].value", ["itemList.children[i]"], [itemList.children[i]], "sumTotalPrice", itemList.children[i].value);
      if (confixWrapper("condition", "p > 0", [""], [], "sumTotalPrice", p > 0))
        sum += p;
      else
        confixWrapper("functionCall", "dg('message')", ["message"], ['message'], "sumTotalPrice", dg('message')).innerHTML += " Wrong value for item " + i;
    }
    confixWrapper("functionCall", "dg('total')", ["total"], ['total'], "sumTotalPrice", dg('total')).value = confixWrapper("infix", "confixWrapper(\"functionCall\", \"dg('total')\", [\"total\"], ['total'], \"sumTotalPrice\", dg('total')).value = sum", [""], [], "sumTotalPrice", sum);
  }
}
44
45
(1) Extract DOM-based constraints
(2) Transform them into XPath expressions
(3) Solve them with an existing XPath solver
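As a rough illustration of these steps: two constraints deduced from the sumTotalPrice trace above could map to XPath like this (the constraint encoding and translation are my own simplification of ConFix):

```javascript
// Translate deduced DOM constraints into XPath (simplified illustration).
function toXPath(constraint) {
  if (constraint.kind === "getElementById") {
    // dg('total') executed, so an element with that id must exist.
    return `//*[@id='${constraint.id}']`;
  }
  if (constraint.kind === "childCount") {
    // The else branch was taken, so #items needs more than `min` children.
    return `//*[@id='${constraint.id}'][count(*) > ${constraint.min}]`;
  }
  throw new Error("unknown constraint kind");
}

const constraints = [
  { kind: "getElementById", id: "total" },
  { kind: "childCount", id: "items", min: 0 },
];
const xpaths = constraints.map(toXPath);
// An XPath solver then yields a fixture DOM satisfying all expressions,
// e.g. <div id="items"><input value="5"></div><input id="total">
```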
46
47
Evaluation and Results
• Up to 67% imp in statement
coverage
• Up to 300% imp in branch
coverage
P4. Code Maintenance
• JavaScript is challenging to maintain.
• Code smells adversely influence
program comprehension and
maintainability.
• Code smell detection is time-consuming.
49
RQ4. Which JavaScript code smells are prevalent in practice
and how can we support automated code refactoring?
Amin Milani Fard, Ali Mesbah
JSNose: Detecting JavaScript Code
Smells
13th IEEE International Conference on Source Code
Analysis and Manipulation (SCAM), 2013
JavaScript Code Smell Detection
51
Static and dynamic code analysis is
used to monitor and infer information
about objects, functions, and code
blocks.
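One such check, flagging excessive global variables by diffing snapshots of the global object, could look like this (the threshold and report shape are illustrative, not JSNose's actual analysis):

```javascript
// Report globals introduced by the application under analysis.
function globalVariableSmell(before, after, threshold) {
  const added = Object.keys(after).filter(k => !(k in before));
  return { globals: added, smelly: added.length > threshold };
}

// Snapshots of (a stand-in for) the window object, before and after load.
const beforeLoad = { location: 1, document: 1 };
const afterLoad = { location: 1, document: 1, a: 1, b: 1, c: 1, d: 1 };
const report = globalVariableSmell(beforeLoad, afterLoad, 3);
```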
54
Evaluation and Results
• Evaluated 11 JavaScript/Ajax web applications.
• Effective code smell detection (93% precision, 98% recall)
• Lazy object, long method/function, closure smells,
JS/HTML/CSS coupling, and excessive global variables are
the most prevalent smells.
• Strong, significant positive correlation between {LOC, #
functions, # JS files, CC} and the types of smells, and a
weaker correlation with the number of smell instances.
55
Some topics to work on
• Applying learned tests on similar applications
• Generating unit tests based on client-side code similarity
and DOM-based tests based on DOM state similarity
• JavaScript code refactoring
• Suggesting and applying refactoring for code smells
• Understanding test failures and root causes
• Generating integrated tests using existing unit tests
56
Test Model
Generation
Test Case
Generation
Unit Test Fixture
Generation
Code
Maintenance


Editor's Notes

  • #3 Modern web applications have changed our life in the last decade.
  • #4 Modern web apps are composed of multiple languages such as HTML, JS, CSS, and server-side code which get rendered by a web browser to show the application’s runtime Document Object Model (DOM). To avoid dealing with these interactions separately, developers test the correct behaviour of a web app through its manifested DOM, using testing frameworks such as Selenium or CasperJS.
  • #5 Because of the considerable impact of these applications on social and economic activities, it is important to ensure their quality through testing and maintenance.
  • #6 Manual testing and maintenance is time consuming and the result is incomplete. Our goal in this thesis is to provide automated techniques and tools to reduce this manual effort.
  • #7 Towards this goal we consider 4 prongs that are complementary to each other.
  • #19 Modern web apps are composed of multiple languages such as HTML, JS, CSS, and server-side code which get rendered by a web browser to show the application’s runtime Document Object Model (DOM). To avoid dealing with these interactions separately, developers test the correct behaviour of a web app through its manifested DOM, using testing frameworks such as Selenium or CasperJS.
  • #24 Using the human domain knowledge in tests: provide valid input data know what DOM elements should be asserted and how. We propose to 1) mine the human knowledge existing in manually-written test cases 2) combine that knowledge with the power of automated crawling 3) extend the test suite for uncovered/unchecked portions of the app
  • #28 Regenerating assertions is a major contribution of this work. These assertions basically check the existence of a DOM element or its attribute and textual values. Based on the concept of matching, partial matching, and similar matching, we generate new assertions by leveraging DOM patterns in test suites.
  • #29 s1 is shared among many other paths from Index to s5. Assertions on s1 can be exactly reused for generated test cases exercising alternative paths.
  • #30 Full exact match vs partial exact match region-based assertions
  • #33 Avoiding redundant assertions: assertions that are subsumed by other assertions are redundant and safely eliminated. Prioritizing assertions: we prioritize assertions in each state in the following order: the original assertions, the reused assertions, the exact element/region assertions, the similar region assertions.
  • #35 Compared test suites generated by (1) a human tester (ORIG), (2) Testilizer: traversing the extended SFG and generating assertions (EXND+AR), (3) traversing the extended SFG with random assertion generation (EXND+RND), and (4) random exploration with random assertion generation (RAND+RND).
  • #37 Why JS Unit testing? Unit tests can detect code-level bugs that do not propagate to the DOM.
  • #38 Why JS Unit testing? Unit tests can detect code-level bugs that do not propagate to the DOM.
  • #48 To evaluate our approach we answer three research questions.
  • #57 Towards this goal we consider 4 prongs that are complementary to each other.