Software quality assurance days
25th International Conference on software
quality issues
sqadays.com
St. Petersburg, May 31 – June 1, 2019
Expanding the idea of static analysis from code check
to other development processes
About the speaker
• Maxim Stefanov (stefanov@viva64.com)
• C++/Java developer at PVS-Studio
• Activities:
• Contributing to the development of the C++ analyzer core
• Contributing to the development of the Java analyzer
Content
• General concepts of static analysis
• Classic look
• Alternative look
• Conclusion
Static analysis is...
Static analysis inputs
• Program code
• Data in JSON, YAML, XML format
• Documentation / article
• Schematic, 3D model (in design)
• …
Classic look: code base analysis
Code analysis is necessary for…
• Searching for bugs, vulnerabilities, bottlenecks
• Code formatting
• Various metrics calculation
• Code compliance verification
• …
Abstract syntax tree
while (b != 0)
{
    if (a > b)
    {
        a = a - b;
    }
    else
    {
        b = b - a;
    }
}
return a;
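A minimal sketch of how the loop above might look as a tree. The `AstNode` class and its labels are hypothetical and hand-built for illustration; a real analyzer gets this structure from its parser and stores far more (types, source positions).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, minimal AST node: a label plus ordered children.
class AstNode {
    final String label;
    final List<AstNode> children;

    AstNode(String label, AstNode... children) {
        this.label = label;
        this.children = new ArrayList<>(Arrays.asList(children));
    }

    // Total number of nodes in the subtree rooted at this node.
    int size() {
        int total = 1;
        for (AstNode child : children) {
            total += child.size();
        }
        return total;
    }

    // Indented textual dump, one node per line.
    String dump(String indent) {
        StringBuilder sb = new StringBuilder(indent).append(label).append('\n');
        for (AstNode child : children) {
            sb.append(child.dump(indent + "  "));
        }
        return sb.toString();
    }

    // Hand-built tree for the while/if/else fragment followed by "return a;".
    static AstNode gcdExample() {
        AstNode condition = new AstNode("!=", new AstNode("b"), new AstNode("0"));
        AstNode thenBranch = new AstNode("=", new AstNode("a"),
                new AstNode("-", new AstNode("a"), new AstNode("b")));
        AstNode elseBranch = new AstNode("=", new AstNode("b"),
                new AstNode("-", new AstNode("b"), new AstNode("a")));
        AstNode branch = new AstNode("if",
                new AstNode(">", new AstNode("a"), new AstNode("b")),
                thenBranch, elseBranch);
        AstNode loop = new AstNode("while", condition, branch);
        return new AstNode("block", loop, new AstNode("return", new AstNode("a")));
    }

    public static void main(String[] args) {
        System.out.print(gcdExample().dump(""));
    }
}
```

Braces, parentheses and semicolons do not appear as nodes: that is what makes the tree "abstract".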
Control flow graph
(1) while (x < 50)
    {
(2)     if (a / b > 5)
        {
(3)         a = a - b;
        }
        else
        {
(4)         b = b - a;
(5)     }
(6)     x++;
    }
(7) doSomething();
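A minimal sketch of the numbered fragment above as a graph of successor lists. The edges are an assumption read off the code (node 5 is taken to be the point where the two branches meet); the slide's picture is not available here. The reachability walk is one basic operation such a graph supports.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CfgSketch {
    // Successor lists keyed by the node numbers used in the fragment (assumed edges).
    static Map<Integer, List<Integer>> exampleGraph() {
        Map<Integer, List<Integer>> graph = new LinkedHashMap<>();
        graph.put(1, Arrays.asList(2, 7)); // loop condition: enter the body or leave the loop
        graph.put(2, Arrays.asList(3, 4)); // if condition: then-branch or else-branch
        graph.put(3, Arrays.asList(5));    // then-branch joins at (5)
        graph.put(4, Arrays.asList(5));    // else-branch joins at (5)
        graph.put(5, Arrays.asList(6));
        graph.put(6, Arrays.asList(1));    // back edge to the loop condition
        graph.put(7, Collections.<Integer>emptyList()); // exit
        return graph;
    }

    static int edgeCount(Map<Integer, List<Integer>> graph) {
        int edges = 0;
        for (List<Integer> successors : graph.values()) {
            edges += successors.size();
        }
        return edges;
    }

    // Depth-first walk collecting every node reachable from the start node.
    static Set<Integer> reachableFrom(Map<Integer, List<Integer>> graph, int start) {
        Set<Integer> seen = new LinkedHashSet<>();
        Deque<Integer> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            int node = work.pop();
            if (seen.add(node)) {
                for (int next : graph.getOrDefault(node, Collections.<Integer>emptyList())) {
                    work.push(next);
                }
            }
        }
        return seen;
    }
}
```

A node that `reachableFrom` never visits from the entry node would correspond to unreachable code.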
And…
• Parse tree
• Abstract semantic graph (semantic model)
• …
Search for bugs, vulnerabilities and
bottlenecks
• AST-based defect pattern search
• Defect search based on the semantic model
• Search based on data flow analysis
• …
In detail:
Code formatting
Why care about code formatting?
• Compliance with the company's coding rules
• Easier to read
• Easier to maintain and debug
• Higher probability of detecting defects
• …
Before formatting
Action doIfSkilledSpeaker(double rating, int experience)
{
int index = 0; while(index < speakers.length) { Speaker sp = speakers[index];
if (sp.getRating()>=rating || sp.getExperience() > experience) { if(sp.isGirl()) {
return LISTEN; } else if (doThink()) {
return LISTEN;
} } }
return GO_HOME;
}
After formatting
Action getNextInterestingSpeaker(double rating, int experience)
{
    int index = 0;
    while (index < speakers.length)
    {
        Speaker sp = speakers[index++];
        if (sp.getRating() >= rating || sp.getExperience() > experience)
        {
            if (sp.isGirl())
            {
                return LISTEN;
            }
            else if (doThink())
            {
                return LISTEN;
            }
        }
    }
    return GO_HOME;
}
Before formatting
if (A)
    if (B) doSomething();
else
    doSomething(someObject);
After formatting (with braces)
if (A)
{
    if (B)
    {
        doSomething();
    }
    else
    {
        doSomething(someObject);
    }
}
After formatting (without braces)
if (A)
    if (B) doSomething();
    else
        doSomething(someObject);
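A small sketch of why this case matters: in Java, as in C and C++, an `else` belongs to the nearest unmatched `if`, whatever the indentation suggests. The class and the returned strings are hypothetical, used only to make the chosen branch visible.

```java
class DanglingElse {
    // The indentation suggests that "else" pairs with "if (a)". It does not.
    static String misleading(boolean a, boolean b) {
        String result = "nothing";
        if (a)
            if (b) result = "inner-then";
        else
            result = "else";
        return result;
    }

    // Same behavior, with braces and indentation that show the real structure.
    static String braced(boolean a, boolean b) {
        String result = "nothing";
        if (a) {
            if (b) {
                result = "inner-then";
            } else {
                result = "else";
            }
        }
        return result;
    }
}
```

With `a == false` both methods leave `result` untouched, although a reader who trusts the indentation of `misleading` would expect the `else` branch to run.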
Metrics calculation.
Why measure software?
• Determining the quality of an existing product or process
• Predicting product/process quality
• Improving product/process quality
Metrics calculation
• Quantitative metrics
• Program complexity metrics
• Program size metrics
• Program control flow complexity metrics
• Data flow complexity metrics
• Object-oriented metrics
• ...
Quantitative metrics
• SLOC: source lines of code (physical, logical)
• Number and percentage of comments
• Average number of lines per function (class, file)
• Percentage of duplicated code
• ...
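A minimal sketch of the simplest of these counts. The `LineMetrics` class is hypothetical and deliberately naive: it only recognizes full-line `//` comments, so block comments and trailing comments are not counted. A real tool would work on tokens rather than on raw lines.

```java
class LineMetrics {
    final int physicalLines;
    final int blankLines;
    final int commentLines;

    LineMetrics(int physicalLines, int blankLines, int commentLines) {
        this.physicalLines = physicalLines;
        this.blankLines = blankLines;
        this.commentLines = commentLines;
    }

    // Counts physical lines, blank lines and full-line "//" comments.
    static LineMetrics measure(String source) {
        int physical = 0;
        int blank = 0;
        int comment = 0;
        for (String line : source.split("\r?\n")) {
            physical++;
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                blank++;
            } else if (trimmed.startsWith("//")) {
                comment++;
            }
        }
        return new LineMetrics(physical, blank, comment);
    }

    // Share of full-line comments among all physical lines, in percent.
    double commentPercentage() {
        return physicalLines == 0 ? 0.0 : 100.0 * commentLines / physicalLines;
    }
}
```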
Quantitative metrics:
defect metrics
• Defect density = «number of defects in a given module» / «total number of defects in the software»
• Regression coefficient = «number of defects in old functionality» / «total number of defects, including new functionality»
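Both ratios above are simple divisions; a sketch with a hypothetical `DefectMetrics` helper. The numbers in the usage note are made up for illustration.

```java
class DefectMetrics {
    // Defects in one module divided by all defects in the software.
    static double defectDensity(int moduleDefects, int totalDefects) {
        return ratio(moduleDefects, totalDefects);
    }

    // Defects in old functionality divided by all defects, including new functionality.
    static double regressionCoefficient(int oldFunctionalityDefects, int totalDefects) {
        return ratio(oldFunctionalityDefects, totalDefects);
    }

    private static double ratio(int part, int total) {
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive");
        }
        return (double) part / total;
    }
}
```

For example, a module with 5 of a project's 40 defects would have a defect density of 0.125 under this definition.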
McCabe Metrics:
Cyclomatic Complexity
M = E − N + 2P
where:
M – cyclomatic complexity,
E – number of edges in the graph,
N – number of nodes in the graph,
P – number of connected components.
Cyclomatic complexity
Graph representations of some language constructs
Cyclomatic complexity: example
int someMethod(...) {
    int bot = 0;
    int top = n - 1;
    int mid, cmp;
    while (bot <= top) {
        mid = (bot + top) / 2;
        if (table[mid] == item) {
            return mid;
        }
        else if (compare) {
            bot = mid + 1;
        }
        else {
            top = mid - 1;
        }
    }
    return -1;
}
Cyclomatic complexity: example
E = 14
N = 12
P = 1
M = 14 − 12 + 2 · 1 = 4
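The arithmetic above as a small sketch. The `Cyclomatic` class is hypothetical. The second method uses the common shortcut "binary decision points + 1", which should agree with the graph formula for a single connected graph where all returns lead to one exit node. The example method has three decision points (`while`, `if`, `else if`).

```java
class Cyclomatic {
    // M = E - N + 2P, computed from the control flow graph's counts.
    static int fromGraph(int edges, int nodes, int connectedComponents) {
        return edges - nodes + 2 * connectedComponents;
    }

    // Shortcut for one method with only binary decisions: decisions + 1.
    static int fromDecisionPoints(int decisionPoints) {
        return decisionPoints + 1;
    }

    public static void main(String[] args) {
        System.out.println(fromGraph(14, 12, 1));  // values from the slide
        System.out.println(fromDecisionPoints(3)); // while, if, else if
    }
}
```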
Cyclomatic complexity: what for
Excessively high cyclomatic complexity makes code harder to:
• understand, maintain and debug,
• test.
Summing up the metrics. Note
• Using metrics as punishment is dangerous
• Using metrics for information and support is useful
• Metrics are more informative when tracked over time
Source code obfuscation/deobfuscation
Obfuscation is needed to protect code against analysis by both humans and
machines, and for software with heightened crack-resistance requirements,
for example:
• DRM key protection
• Game protection against extraneous code (cheats and bots)
• Source protection during transfer/sale
Source code obfuscation/deobfuscation
Before:
int COUNT = 100;
float TAX_RATE = 0.2;
for (int i = 0; i < COUNT; i++)
{
    tax[i] = orig_price[i] * TAX_RATE;
    price[i] = orig_price[i] + tax[i];
}
After:
for(int a=0;a<100;a++){b[a]=c[a]*0.2;d[a]=c[a]+b[a];}
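A rough sketch of one step of the transformation above, identifier renaming. The `RenameObfuscator` class and its name table are hypothetical. It works on raw text with a regular expression, so string literals and comments are not treated specially, and it does not inline constants or strip whitespace as the slide's "After" version does. A real obfuscator would rename symbols through the AST or semantic model.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RenameObfuscator {
    // Whole-word identifiers only, so "price" inside "orig_price" is not touched.
    private static final Pattern IDENTIFIER = Pattern.compile("\\b[A-Za-z_]\\w*\\b");

    // Replaces every identifier found in the table; everything else is copied as is.
    static String rename(String source, Map<String, String> names) {
        Matcher matcher = IDENTIFIER.matcher(source);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String replacement = names.getOrDefault(matcher.group(), matcher.group());
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    // Name table matching the renaming visible in the slide's example.
    static Map<String, String> exampleNames() {
        Map<String, String> names = new LinkedHashMap<>();
        names.put("i", "a");
        names.put("tax", "b");
        names.put("orig_price", "c");
        names.put("price", "d");
        return names;
    }
}
```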
Validation check
Common data interchange formats: JSON, XML, YAML
Corresponding tools: JSONLint, XMLLint, YAMLLint
These tools analyze the data and report:
• Syntax errors
• Extra commas, brackets, spaces, …
• Key duplication
• Key definition order
• …
Validation check
{
  "warnings": [{
    "title": "Some Title",
    "code": "V6050",
    "cwe": 0,
    "level": 1,
    "title": "Some message.", // Duplicate key 'title'
    "falseAlarm": false
  }]
}
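A minimal sketch of a duplicate-key check like the one illustrated above, written without a JSON library. The `DuplicateKeyCheck` class is hypothetical, not how JSONLint works internally. It assumes otherwise well-formed JSON, compares keys as raw text (escape sequences are not decoded), and treats a string as a key when it is followed by a colon.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DuplicateKeyCheck {
    // The slide's example, without the explanatory comment.
    static final String EXAMPLE =
            "{\"warnings\":[{\"title\":\"Some Title\",\"code\":\"V6050\","
            + "\"title\":\"Some message.\",\"falseAlarm\":false}]}";

    // Returns the duplicated keys in the order they are found.
    static List<String> findDuplicateKeys(String json) {
        List<String> duplicates = new ArrayList<>();
        // One key set per open container; sets pushed for arrays simply stay empty.
        Deque<Set<String>> containers = new ArrayDeque<>();
        int i = 0;
        while (i < json.length()) {
            char c = json.charAt(i);
            if (c == '{' || c == '[') {
                containers.push(new HashSet<String>());
                i++;
            } else if (c == '}' || c == ']') {
                if (!containers.isEmpty()) {
                    containers.pop();
                }
                i++;
            } else if (c == '"') {
                // Find the closing quote, skipping escaped characters.
                int end = i + 1;
                while (end < json.length() && json.charAt(end) != '"') {
                    end += (json.charAt(end) == '\\') ? 2 : 1;
                }
                String text = json.substring(i + 1, Math.min(end, json.length()));
                // A string followed by ':' is an object key.
                int next = end + 1;
                while (next < json.length() && Character.isWhitespace(json.charAt(next))) {
                    next++;
                }
                boolean isKey = next < json.length() && json.charAt(next) == ':';
                if (isKey && !containers.isEmpty() && !containers.peek().add(text)) {
                    duplicates.add(text);
                }
                i = end + 1;
            } else {
                i++;
            }
        }
        return duplicates;
    }
}
```

The same key in a nested object is not reported, because each object gets its own key set.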
Publisher: an in-house development
Task: checking articles and documentation for correctness
Input data: articles and documentation
Output data:
List of errors:
• Link matching
• Image verification
• Correctness of code fragments
• Dates and authors of documentation and articles
• ...
Example: checking pictures for alpha channels
[Figure: expectation vs. reality]
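A minimal sketch of how such an image check could look in Java. This is an assumption about the kind of check involved, not the in-house tool's actual code. The `AlphaChannelCheck` class is hypothetical; `BufferedImage`, `ColorModel.hasAlpha` and `getRGB` are standard `java.awt.image` APIs. In practice the image would probably be loaded with `javax.imageio.ImageIO.read`.

```java
import java.awt.image.BufferedImage;

class AlphaChannelCheck {
    // True if the image format carries an alpha channel at all.
    static boolean hasAlphaChannel(BufferedImage image) {
        return image.getColorModel().hasAlpha();
    }

    // True if at least one pixel is not fully opaque.
    static boolean hasTransparentPixels(BufferedImage image) {
        if (!hasAlphaChannel(image)) {
            return false;
        }
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                int alpha = image.getRGB(x, y) >>> 24; // top byte of ARGB
                if (alpha != 0xFF) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Separating the two questions lets a checker distinguish an image that merely has an alpha channel from one that really contains transparent areas.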
Example: link validation
The error: Russian-language documentation links to an
English-language source.
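A minimal sketch of such a link check. It rests on a loud assumption: localized pages are distinguished by a language segment in the URL path, such as `/ru/` or `/en/`. The `LinkLanguageCheck` class, the language list and the `example.com` URLs are hypothetical.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

class LinkLanguageCheck {
    // Returns the links whose path contains a known language segment
    // that differs from the language of the document being checked.
    static List<String> foreignLanguageLinks(String docLanguage,
                                             List<String> knownLanguages,
                                             List<String> urls) {
        List<String> flagged = new ArrayList<>();
        for (String url : urls) {
            String path = URI.create(url).getPath();
            if (path == null) {
                continue; // e.g. "mailto:" links have no path
            }
            for (String segment : path.split("/")) {
                if (knownLanguages.contains(segment) && !segment.equals(docLanguage)) {
                    flagged.add(url);
                    break;
                }
            }
        }
        return flagged;
    }
}
```

For a document in `ru`, a link such as `https://example.com/en/docs/intro/` would be flagged, while `https://example.com/ru/docs/intro/` would pass.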
KOMPAS-Expert:
static analysis inspiration
• Input data: drawing and 3D model
• Output data:
List of errors:
• Compliance with design standards
• Compliance with enterprise restriction lists
• Compliance with KOMPAS-3D work rules
Conclusion:
• Static analysis has been gaining momentum in recent years
• Its use will only grow in the near future
