SlideShare a Scribd company logo
1 of 57
Does static analysis need
machine learning?
Anti-Talk
Victoria Khanieva
PVS-Studio
Speaker
2
Victoria Khanieva
• С++ developer in PVS-Studio
• Supported the MISRA standard
• Wrote articles in checks of open-source
projects
khanieva@viva64.com
www.viva64.com
 Introduction to static analysis
 Existing solutions and approaches they implement
 Problems and pitfalls when creating an analyzer:
 When learning «manually»
 When learning on a real large code base
 Most promising approaches
Agenda
3
About the analysis
4
 Code review
Types of code analysis
5
 Code review
 Dynamic analysis
Types of code analysis
6
 Code review
 Dynamic analysis
 Static analysis
Types of code analysis
7
 How to reveal errors and flaws in the source code
of programs.
 Detect errors in programs
 Get tips on code formatting
 Count metrics
 ….
Static analysis
8
void createCube(float halfExtentsX,
float halfExtentsY,
float halfExtentsZ,
....){
....
m_model->addVertex(halfExtentsX,
halfExtentsY,
halfExtentsY,
....);
....
}
Diagnostics
9
void createCube(float halfExtentsX,
float halfExtentsY,
float halfExtentsZ,
....){
....
m_model->addVertex(halfExtentsX,
halfExtentsY,
halfExtentsY,
....);
....
}
Diagnostics
10
V751 Parameter 'halfExtentsZ' is not used inside function body.
TinyRenderer.cpp 375
You'd think…
11
When ML is useful
12
 Useful: Scanning photos and videos
When ML is useful
13
 Useful: Scanning photos and videos
 Unuseful: Calculator
When ML is useful
14
 Useful: Scanning photos and videos
 Unuseful: Calculator
Possible result
15
Existing solutions
16
Existing solutions
17
Existing solutions
18
 Java, JS, TS, Python, C, C++
 Code review and audit
 You can check out demos on an open-source project
 Related posts
DeepCode
19
Link
DeepCode
20
 Java, C, C++, Objective-C
 By Facebook
 Open-source code
 You can try Infer on your projects
 Based on the Хоара and separation logic,
bi-abduction, and the abstract interpretation
theory
Infer
21
Link
 Handles Infer results
 Suggests possible edits
SapFix
22
 Platform to analyze code quality
 System of edits suggestion
 Searches for dependencies
between functions and methods
by NLP
Embold
23
 Open-source
 Related posts
 Repository with dataset for learning
 Code-style detection
 Platform for collecting metrics and statistics
Source{d}
24
Link
Fixing code style in Source{d}
25
Based on the article
“STYLE-ANALYZER: fixing
code style inconsistencies
with interpretable
unsupervised algorithms”
Link
 By Mozilla+Ubisoft
 Searches for suspicious commits
 Based on the publication: “CLEVER: Combining Code
Metrics with Clone Detection for Just-In-Time Fault
Prevention and Resolution in Large Industrial Projects”
Clever-Commit
26
Link
 Java
 By Amazon
 Recommendations on best practices from the
documentation and code base
CodeGuru
27
28
 Analyze code to search for errors
 Analyze code to search for deviations from best
practices
 Analyze artifacts’ code
 Collect metrics and data on code
 Suggest code-style fixes
Main directions
29
 Selected base of open-source repositories
 Dataset selected manually
 Own project base
Ways to learn
30
Problems and pitfalls
31
* in the view of a classic static analyzer developer
How it may look like:
• if (X && A == A)
• if (A + 1 == A + 1)
• if (A[i] == A[i])
• if ((A) == (A))
• …
«Manual» dataset selection
32
We need to find:
if (A == A)
Example from DeepCode
33
«Manual» learning
34
We need to find:
int y = x / 0;
In practice
35
How it may look like:
template <class T> class numeric_limits {
....
}
namespace boost {
....
}
namespace boost {
namespace hash_detail {
template <class T> void dsizet(size_t x) {
size_t length = x / (limits<int>::digits - 31);
}
}
}
@Override
public String getText(Mode mode) {
StringBuilder sb = new StringBuilder();
....
if (filter.getMessage()
.toLowerCase(Locale.ENGLISH)
.startsWith("Each ")) {
sb.append(" has base power and toughness ");
} else {
sb.append(" have base power and toughness ");
}
....
return sb.toString();
}
Data flow analysis
36
Data flow analysis
37
uint32_t* BnNew() {
uint32_t* result = new uint32_t[kBigIntSize];
memset(result, 0, kBigIntSize * sizeof(uint32_t));
return result;
}
std::string AndroidRSAPublicKey(crypto::RSAPrivateKey* key) {
....
uint32_t* n = BnNew();
....
RSAPublicKey pkey;
....
if (pkey.n0inv == 0)
return kDummyRSAPublicKey; // <=
....
}
 «So many projects on GitHub! The analyzer will learn from their
repositories and commits» turns into commits’ collection and
markup.
 If a manually collected learning base is unreliable, what to
expect from an automatically collected one?
Learning on many projects
38
 Check out the commit with the word «fix»:
Learning on many projects
39
 Analyzer has to be up-to-date in terms of the checked
language
 Most projects use outdated standards
 Most projects don’t use new constructions
Outdated code
40
New construction:
std::vector<int> numbers;
....
for (int num : numbers)
foo(num);
New error pattern:
for (int num : numbers)
numbers.push_back(num * 2);
Example
41
Documentation
42
 Code example:
char check(const uint8 *hash_stage2)
{
....
return memcmp(hash_stage2, hash_stage2_reassured,
SHA1_HASH_SIZE);
}
 The analyzer hypothetically suggests to fix as follows:
int check(const uint8 *hash_stage2)
{
....
return memcmp(hash_stage2, hash_stage2_reassured,
SHA1_HASH_SIZE);
}
Why documentation matters
43
Classic approach: documentation
44
Code example:
ObjectOutputStream out = new ObjectOutputStream(....);
SerializedObject obj = new SerializedObject();
obj.state = 100;
out.writeObject(obj);
obj.state = 200;
out.writeObject(obj);
out.close();
Why documentation matters
45
The analyzer suggests:
ObjectOutputStream out = new ObjectOutputStream(....);
SerializedObject obj = new SerializedObject();
obj.state = 100;
out.writeObject(obj);
obj = new SerializedObject(); // Add this line
obj.state = 200;
out.writeObject(obj);
out.close();
Why documentation matters
46
What happens without the edit:
ObjectOutputStream out = new ObjectOutputStream(....);
SerializedObject obj = new SerializedObject();
obj.state = 100;
out.writeObject(obj); // stores the object with the state = 100
obj.state = 200;
out.writeObject(obj); // stores the object with the state = 100
out.close();
Why documentation matters
47
Unambiguous behavior
48
Unambiguous behavior
49
Unambiguous behavior
50
std::vector<int> numbers;
....
for (int num : numbers)
{
if (num < 5)
{
numbers.push_back(0);
break; // or, for example, return
}
}
False positives
51
 Reason for getting a warning may be unclear.
Reason for NOT getting a warning may be unclear as well.
 How to fix?
 Additional learning (will it help?)
 Mechanism to hide warnings (not universal)
False positives
52
In case of successful analyzer learning
53
 Code style by specific symbols
 Collecting additional metrics and information
Promising directions
54
 Best-practices for a specific framework/code base/platform
Promising directions
55
56
https://pvs-studio.com/en/pvs-studio/download/
Download a PVS-Studio one-month trial version and
check your projects using a classic static analysis:
Q&A
viva64.com
57
khanieva@viva64.com

More Related Content

What's hot

Code Coverage and Test Suite Effectiveness: Empirical Study with Real Bugs in...
Code Coverage and Test Suite Effectiveness: Empirical Study with Real Bugs in...Code Coverage and Test Suite Effectiveness: Empirical Study with Real Bugs in...
Code Coverage and Test Suite Effectiveness: Empirical Study with Real Bugs in...Pavneet Singh Kochhar
 
Presentation slides: "How to get 100% code coverage"
Presentation slides: "How to get 100% code coverage" Presentation slides: "How to get 100% code coverage"
Presentation slides: "How to get 100% code coverage" Rapita Systems Ltd
 
A year of SonarQube and TFS/VSTS
A year of SonarQube and TFS/VSTSA year of SonarQube and TFS/VSTS
A year of SonarQube and TFS/VSTSMatteo Emili
 
Parasoft .TEST, Write better C# Code Using Data Flow Analysis
Parasoft .TEST, Write better C# Code Using  Data Flow Analysis Parasoft .TEST, Write better C# Code Using  Data Flow Analysis
Parasoft .TEST, Write better C# Code Using Data Flow Analysis Engineering Software Lab
 
ISTQB Certified Tester Exam Question Papers and Answers - 11
ISTQB Certified Tester Exam Question Papers and Answers - 11ISTQB Certified Tester Exam Question Papers and Answers - 11
ISTQB Certified Tester Exam Question Papers and Answers - 11BDDazza
 
Code Coverage Revised : EclEmma on JaCoCo
Code Coverage Revised : EclEmma on JaCoCoCode Coverage Revised : EclEmma on JaCoCo
Code Coverage Revised : EclEmma on JaCoCoEvgeny Mandrikov
 
How to become a testing expert
How to become a testing expertHow to become a testing expert
How to become a testing expertgaoliang641
 
Code coverage & tools
Code coverage & toolsCode coverage & tools
Code coverage & toolsRajesh Kumar
 
Top 5 Code Coverage Tools in DevOps
Top 5 Code Coverage Tools in DevOpsTop 5 Code Coverage Tools in DevOps
Top 5 Code Coverage Tools in DevOpsscmGalaxy Inc
 
How to improve code quality for iOS apps?
How to improve code quality for iOS apps?How to improve code quality for iOS apps?
How to improve code quality for iOS apps?Kate Semizhon
 
Videos about static code analysis
Videos about static code analysisVideos about static code analysis
Videos about static code analysisPVS-Studio
 
Java Code Quality Tools
Java Code Quality ToolsJava Code Quality Tools
Java Code Quality ToolsAnju ML
 
Track code quality with SonarQube - short version
Track code quality with SonarQube - short versionTrack code quality with SonarQube - short version
Track code quality with SonarQube - short versionDmytro Patserkovskyi
 
Building a high quality+ products with SCA
Building a high quality+ products with SCABuilding a high quality+ products with SCA
Building a high quality+ products with SCASuman Sourav
 
Continuous code quality_in_java
Continuous code quality_in_javaContinuous code quality_in_java
Continuous code quality_in_javaManimekalai48
 
Test driven development_continuous_integration
Test driven development_continuous_integrationTest driven development_continuous_integration
Test driven development_continuous_integrationhaochenglee
 

What's hot (20)

Debugging C# Applications
Debugging C# ApplicationsDebugging C# Applications
Debugging C# Applications
 
Code Coverage and Test Suite Effectiveness: Empirical Study with Real Bugs in...
Code Coverage and Test Suite Effectiveness: Empirical Study with Real Bugs in...Code Coverage and Test Suite Effectiveness: Empirical Study with Real Bugs in...
Code Coverage and Test Suite Effectiveness: Empirical Study with Real Bugs in...
 
Presentation slides: "How to get 100% code coverage"
Presentation slides: "How to get 100% code coverage" Presentation slides: "How to get 100% code coverage"
Presentation slides: "How to get 100% code coverage"
 
A year of SonarQube and TFS/VSTS
A year of SonarQube and TFS/VSTSA year of SonarQube and TFS/VSTS
A year of SonarQube and TFS/VSTS
 
Parasoft .TEST, Write better C# Code Using Data Flow Analysis
Parasoft .TEST, Write better C# Code Using  Data Flow Analysis Parasoft .TEST, Write better C# Code Using  Data Flow Analysis
Parasoft .TEST, Write better C# Code Using Data Flow Analysis
 
ISTQB Certified Tester Exam Question Papers and Answers - 11
ISTQB Certified Tester Exam Question Papers and Answers - 11ISTQB Certified Tester Exam Question Papers and Answers - 11
ISTQB Certified Tester Exam Question Papers and Answers - 11
 
Parasoft fda software compliance part2
Parasoft fda software compliance   part2Parasoft fda software compliance   part2
Parasoft fda software compliance part2
 
Code Coverage Revised : EclEmma on JaCoCo
Code Coverage Revised : EclEmma on JaCoCoCode Coverage Revised : EclEmma on JaCoCo
Code Coverage Revised : EclEmma on JaCoCo
 
How to become a testing expert
How to become a testing expertHow to become a testing expert
How to become a testing expert
 
Static Code Analysis
Static Code AnalysisStatic Code Analysis
Static Code Analysis
 
Code coverage & tools
Code coverage & toolsCode coverage & tools
Code coverage & tools
 
Top 5 Code Coverage Tools in DevOps
Top 5 Code Coverage Tools in DevOpsTop 5 Code Coverage Tools in DevOps
Top 5 Code Coverage Tools in DevOps
 
How to improve code quality for iOS apps?
How to improve code quality for iOS apps?How to improve code quality for iOS apps?
How to improve code quality for iOS apps?
 
Videos about static code analysis
Videos about static code analysisVideos about static code analysis
Videos about static code analysis
 
Java Code Quality Tools
Java Code Quality ToolsJava Code Quality Tools
Java Code Quality Tools
 
Track code quality with SonarQube - short version
Track code quality with SonarQube - short versionTrack code quality with SonarQube - short version
Track code quality with SonarQube - short version
 
Building a high quality+ products with SCA
Building a high quality+ products with SCABuilding a high quality+ products with SCA
Building a high quality+ products with SCA
 
Secure Coding in C/C++
Secure Coding in C/C++Secure Coding in C/C++
Secure Coding in C/C++
 
Continuous code quality_in_java
Continuous code quality_in_javaContinuous code quality_in_java
Continuous code quality_in_java
 
Test driven development_continuous_integration
Test driven development_continuous_integrationTest driven development_continuous_integration
Test driven development_continuous_integration
 

Similar to Does static analysis need machine learning

Static code analysis: what? how? why?
Static code analysis: what? how? why?Static code analysis: what? how? why?
Static code analysis: what? how? why?Andrey Karpov
 
How to create a high quality static code analyzer
How to create a high quality static code analyzerHow to create a high quality static code analyzer
How to create a high quality static code analyzerAndrey Karpov
 
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source SoftwareThe Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source SoftwareAndrey Karpov
 
The operation principles of PVS-Studio static code analyzer
The operation principles of PVS-Studio static code analyzerThe operation principles of PVS-Studio static code analyzer
The operation principles of PVS-Studio static code analyzerAndrey Karpov
 
Tool Up Your LAMP Stack
Tool Up Your LAMP StackTool Up Your LAMP Stack
Tool Up Your LAMP StackLorna Mitchell
 
Reducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code AnalysisReducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code AnalysisSebastiano Panichella
 
Static Code Analysis for Projects, Built on Unreal Engine
Static Code Analysis for Projects, Built on Unreal EngineStatic Code Analysis for Projects, Built on Unreal Engine
Static Code Analysis for Projects, Built on Unreal EngineAndrey Karpov
 
Static analysis: Around Java in 60 minutes
Static analysis: Around Java in 60 minutesStatic analysis: Around Java in 60 minutes
Static analysis: Around Java in 60 minutesAndrey Karpov
 
Mining Source Code Improvement Patterns from Similar Code Review Works
Mining Source Code Improvement Patterns from Similar Code Review WorksMining Source Code Improvement Patterns from Similar Code Review Works
Mining Source Code Improvement Patterns from Similar Code Review WorksYuki Ueda
 
Mining Source Code Improvement Patterns from Similar Code Review Works
Mining Source Code Improvement Patterns from Similar Code Review WorksMining Source Code Improvement Patterns from Similar Code Review Works
Mining Source Code Improvement Patterns from Similar Code Review Works奈良先端大 情報科学研究科
 
Lessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixLessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixJustin Basilico
 
A453 programming task 1
A453 programming task 1A453 programming task 1
A453 programming task 1Tom Dale
 
Transferring Software Testing Tools to Practice
Transferring Software Testing Tools to PracticeTransferring Software Testing Tools to Practice
Transferring Software Testing Tools to PracticeTao Xie
 
Code Analysis-run time error prediction
Code Analysis-run time error predictionCode Analysis-run time error prediction
Code Analysis-run time error predictionNIKHIL NAWATHE
 
GPCE16: Automatic Non-functional Testing of Code Generators Families
GPCE16: Automatic Non-functional Testing of Code Generators FamiliesGPCE16: Automatic Non-functional Testing of Code Generators Families
GPCE16: Automatic Non-functional Testing of Code Generators FamiliesMohamed BOUSSAA
 
2014 01-ticosa
2014 01-ticosa2014 01-ticosa
2014 01-ticosaPharo
 

Similar to Does static analysis need machine learning (20)

Static code analysis: what? how? why?
Static code analysis: what? how? why?Static code analysis: what? how? why?
Static code analysis: what? how? why?
 
How to create a high quality static code analyzer
How to create a high quality static code analyzerHow to create a high quality static code analyzer
How to create a high quality static code analyzer
 
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source SoftwareThe Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
 
The operation principles of PVS-Studio static code analyzer
The operation principles of PVS-Studio static code analyzerThe operation principles of PVS-Studio static code analyzer
The operation principles of PVS-Studio static code analyzer
 
Tool up your lamp stack
Tool up your lamp stackTool up your lamp stack
Tool up your lamp stack
 
Tool Up Your LAMP Stack
Tool Up Your LAMP StackTool Up Your LAMP Stack
Tool Up Your LAMP Stack
 
Reducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code AnalysisReducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code Analysis
 
Static Code Analysis for Projects, Built on Unreal Engine
Static Code Analysis for Projects, Built on Unreal EngineStatic Code Analysis for Projects, Built on Unreal Engine
Static Code Analysis for Projects, Built on Unreal Engine
 
Static analysis: Around Java in 60 minutes
Static analysis: Around Java in 60 minutesStatic analysis: Around Java in 60 minutes
Static analysis: Around Java in 60 minutes
 
Mining Source Code Improvement Patterns from Similar Code Review Works
Mining Source Code Improvement Patterns from Similar Code Review WorksMining Source Code Improvement Patterns from Similar Code Review Works
Mining Source Code Improvement Patterns from Similar Code Review Works
 
Mining Source Code Improvement Patterns from Similar Code Review Works
Mining Source Code Improvement Patterns from Similar Code Review WorksMining Source Code Improvement Patterns from Similar Code Review Works
Mining Source Code Improvement Patterns from Similar Code Review Works
 
Lessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixLessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at Netflix
 
CodeChecker Overview Nov 2019
CodeChecker Overview Nov 2019CodeChecker Overview Nov 2019
CodeChecker Overview Nov 2019
 
Pragmatic Code Coverage
Pragmatic Code CoveragePragmatic Code Coverage
Pragmatic Code Coverage
 
A453 programming task 1
A453 programming task 1A453 programming task 1
A453 programming task 1
 
Transferring Software Testing Tools to Practice
Transferring Software Testing Tools to PracticeTransferring Software Testing Tools to Practice
Transferring Software Testing Tools to Practice
 
Code Analysis-run time error prediction
Code Analysis-run time error predictionCode Analysis-run time error prediction
Code Analysis-run time error prediction
 
STAMP
STAMPSTAMP
STAMP
 
GPCE16: Automatic Non-functional Testing of Code Generators Families
GPCE16: Automatic Non-functional Testing of Code Generators FamiliesGPCE16: Automatic Non-functional Testing of Code Generators Families
GPCE16: Automatic Non-functional Testing of Code Generators Families
 
2014 01-ticosa
2014 01-ticosa2014 01-ticosa
2014 01-ticosa
 

More from Andrey Karpov

60 антипаттернов для С++ программиста
60 антипаттернов для С++ программиста60 антипаттернов для С++ программиста
60 антипаттернов для С++ программистаAndrey Karpov
 
60 terrible tips for a C++ developer
60 terrible tips for a C++ developer60 terrible tips for a C++ developer
60 terrible tips for a C++ developerAndrey Karpov
 
Ошибки, которые сложно заметить на code review, но которые находятся статичес...
Ошибки, которые сложно заметить на code review, но которые находятся статичес...Ошибки, которые сложно заметить на code review, но которые находятся статичес...
Ошибки, которые сложно заметить на code review, но которые находятся статичес...Andrey Karpov
 
PVS-Studio in 2021 - Error Examples
PVS-Studio in 2021 - Error ExamplesPVS-Studio in 2021 - Error Examples
PVS-Studio in 2021 - Error ExamplesAndrey Karpov
 
PVS-Studio в 2021 - Примеры ошибок
PVS-Studio в 2021 - Примеры ошибокPVS-Studio в 2021 - Примеры ошибок
PVS-Studio в 2021 - Примеры ошибокAndrey Karpov
 
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...Andrey Karpov
 
Best Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' MistakesBest Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' MistakesAndrey Karpov
 
Typical errors in code on the example of C++, C#, and Java
Typical errors in code on the example of C++, C#, and JavaTypical errors in code on the example of C++, C#, and Java
Typical errors in code on the example of C++, C#, and JavaAndrey Karpov
 
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)Andrey Karpov
 
Game Engine Code Quality: Is Everything Really That Bad?
Game Engine Code Quality: Is Everything Really That Bad?Game Engine Code Quality: Is Everything Really That Bad?
Game Engine Code Quality: Is Everything Really That Bad?Andrey Karpov
 
C++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerC++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerAndrey Karpov
 
Safety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Safety on the Max: How to Write Reliable C/C++ Code for Embedded SystemsSafety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Safety on the Max: How to Write Reliable C/C++ Code for Embedded SystemsAndrey Karpov
 
The Great and Mighty C++
The Great and Mighty C++The Great and Mighty C++
The Great and Mighty C++Andrey Karpov
 
Zero, one, two, Freddy's coming for you
Zero, one, two, Freddy's coming for youZero, one, two, Freddy's coming for you
Zero, one, two, Freddy's coming for youAndrey Karpov
 
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOps
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOpsPVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOps
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOpsAndrey Karpov
 
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...Andrey Karpov
 
Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...
Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...
Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...Andrey Karpov
 
PVS-Studio in the Clouds: CircleCI
PVS-Studio in the Clouds: CircleCIPVS-Studio in the Clouds: CircleCI
PVS-Studio in the Clouds: CircleCIAndrey Karpov
 
PVS-Studio in the Clouds: Azure DevOps
PVS-Studio in the Clouds: Azure DevOpsPVS-Studio in the Clouds: Azure DevOps
PVS-Studio in the Clouds: Azure DevOpsAndrey Karpov
 

More from Andrey Karpov (20)

60 антипаттернов для С++ программиста
60 антипаттернов для С++ программиста60 антипаттернов для С++ программиста
60 антипаттернов для С++ программиста
 
60 terrible tips for a C++ developer
60 terrible tips for a C++ developer60 terrible tips for a C++ developer
60 terrible tips for a C++ developer
 
Ошибки, которые сложно заметить на code review, но которые находятся статичес...
Ошибки, которые сложно заметить на code review, но которые находятся статичес...Ошибки, которые сложно заметить на code review, но которые находятся статичес...
Ошибки, которые сложно заметить на code review, но которые находятся статичес...
 
PVS-Studio in 2021 - Error Examples
PVS-Studio in 2021 - Error ExamplesPVS-Studio in 2021 - Error Examples
PVS-Studio in 2021 - Error Examples
 
PVS-Studio в 2021 - Примеры ошибок
PVS-Studio в 2021 - Примеры ошибокPVS-Studio в 2021 - Примеры ошибок
PVS-Studio в 2021 - Примеры ошибок
 
PVS-Studio в 2021
PVS-Studio в 2021PVS-Studio в 2021
PVS-Studio в 2021
 
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
 
Best Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' MistakesBest Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' Mistakes
 
Typical errors in code on the example of C++, C#, and Java
Typical errors in code on the example of C++, C#, and JavaTypical errors in code on the example of C++, C#, and Java
Typical errors in code on the example of C++, C#, and Java
 
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
 
Game Engine Code Quality: Is Everything Really That Bad?
Game Engine Code Quality: Is Everything Really That Bad?Game Engine Code Quality: Is Everything Really That Bad?
Game Engine Code Quality: Is Everything Really That Bad?
 
C++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerC++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical Reviewer
 
Safety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Safety on the Max: How to Write Reliable C/C++ Code for Embedded SystemsSafety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Safety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
 
The Great and Mighty C++
The Great and Mighty C++The Great and Mighty C++
The Great and Mighty C++
 
Zero, one, two, Freddy's coming for you
Zero, one, two, Freddy's coming for youZero, one, two, Freddy's coming for you
Zero, one, two, Freddy's coming for you
 
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOps
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOpsPVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOps
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOps
 
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
 
Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...
Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...
Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...
 
PVS-Studio in the Clouds: CircleCI
PVS-Studio in the Clouds: CircleCIPVS-Studio in the Clouds: CircleCI
PVS-Studio in the Clouds: CircleCI
 
PVS-Studio in the Clouds: Azure DevOps
PVS-Studio in the Clouds: Azure DevOpsPVS-Studio in the Clouds: Azure DevOps
PVS-Studio in the Clouds: Azure DevOps
 

Recently uploaded

Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Hr365.us smith
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....kzayra69
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 

Recently uploaded (20)

Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 

Does static analysis need machine learning

  • 1. Does static analysis need machine learning? Anti-Talk Victoria Khanieva PVS-Studio
  • 2. Speaker 2 Victoria Khanieva • С++ developer in PVS-Studio • Supported the MISRA standard • Wrote articles in checks of open-source projects khanieva@viva64.com www.viva64.com
  • 3.  Introduction to static analysis  Existing solutions and approaches they implement  Problems and pitfalls when creating an analyzer:  When learning «manually»  When learning on a real large code base  Most promising approaches Agenda 3
  • 5.  Code review Types of code analysis 5
  • 6.  Code review  Dynamic analysis Types of code analysis 6
  • 7.  Code review  Dynamic analysis  Static analysis Types of code analysis 7
  • 8.  How to reveal errors and flaws in the source code of programs.  Detect errors in programs  Get tips on code formatting  Count metrics  …. Static analysis 8
  • 9. void createCube(float halfExtentsX, float halfExtentsY, float halfExtentsZ, ....){ .... m_model->addVertex(halfExtentsX, halfExtentsY, halfExtentsY, ....); .... } Diagnostics 9
  • 10. void createCube(float halfExtentsX, float halfExtentsY, float halfExtentsZ, ....){ .... m_model->addVertex(halfExtentsX, halfExtentsY, halfExtentsY, ....); .... } Diagnostics 10 V751 Parameter 'halfExtentsZ' is not used inside function body. TinyRenderer.cpp 375
  • 12. When ML is useful 12  Useful: Scanning photos and videos
  • 13. When ML is useful 13  Useful: Scanning photos and videos  Unuseful: Calculator
  • 14. When ML is useful 14  Useful: Scanning photos and videos  Unuseful: Calculator
  • 19.  Java, JS, TS, Python, C, C++  Code review and audit  You can check out demos on an open-source project  Related posts DeepCode 19 Link
  • 21.  Java, C, C++, Objective-C  By Facebook  Open-source code  You can try Infer on your projects  Based on the Хоара and separation logic, bi-abduction, and the abstract interpretation theory Infer 21 Link
  • 22.  Handles Infer results  Suggests possible edits SapFix 22
  • 23.  Platform to analyze code quality  System of edits suggestion  Searches for dependencies between functions and methods by NLP Embold 23
  • 24.  Open-source  Related posts  Repository with dataset for learning  Code-style detection  Platform for collecting metrics and statistics Source{d} 24 Link
  • 25. Fixing code style in Source{d} 25 Based on the article “STYLE-ANALYZER: fixing code style inconsistencies with interpretable unsupervised algorithms” Link
  • 26.  By Mozilla+Ubisoft  Searches for suspicious commits  Based on the publication: “CLEVER: Combining Code Metrics with Clone Detection for Just-In-Time Fault Prevention and Resolution in Large Industrial Projects” Clever-Commit 26 Link
  • 27.  Java  By Amazon  Recommendations on best practices from the documentation and code base CodeGuru 27
  • 28. 28
  • 29.  Analyze code to search for errors  Analyze code to search for deviations from best practices  Analyze artifacts’ code  Collect metrics and data on code  Suggest code-style fixes Main directions 29
  • 30.  Selected base of open-source repositories  Dataset selected manually  Own project base Ways to learn 30
  • 31. Problems and pitfalls 31 * in the view of a classic static analyzer developer
  • 32. How it may look like: • if (X && A == A) • if (A + 1 == A + 1) • if (A[i] == A[i]) • if ((A) == (A)) • … «Manual» dataset selection 32 We need to find: if (A == A)
  • 35. We need to find: int y = x / 0; In practice 35 How it may look like: template <class T> class numeric_limits { .... } namespace boost { .... } namespace boost { namespace hash_detail { template <class T> void dsizet(size_t x) { size_t length = x / (limits<int>::digits - 31); } } }
  • 36. @Override public String getText(Mode mode) { StringBuilder sb = new StringBuilder(); .... if (filter.getMessage() .toLowerCase(Locale.ENGLISH) .startsWith("Each ")) { sb.append(" has base power and toughness "); } else { sb.append(" have base power and toughness "); } .... return sb.toString(); } Data flow analysis 36
  • 37. Data flow analysis 37 uint32_t* BnNew() { uint32_t* result = new uint32_t[kBigIntSize]; memset(result, 0, kBigIntSize * sizeof(uint32_t)); return result; } std::string AndroidRSAPublicKey(crypto::RSAPrivateKey* key) { .... uint32_t* n = BnNew(); .... RSAPublicKey pkey; .... if (pkey.n0inv == 0) return kDummyRSAPublicKey; // <= .... }
  • 38.  «So many projects on GitHub! The analyzer will learn from their repositories and commits» turns into commits’ collection and markup.  If a manually collected learning base is unreliable, what to expect from an automatically collected one? Learning on many projects 38
  • 39.  Check out the commit with the word «fix»: Learning on many projects 39
  • 40.  Analyzer has to be up-to-date in terms of the checked language  Most projects use outdated standards  Most projects don’t use new constructions Outdated code 40
  • 41. New construction: std::vector<int> numbers; .... for (int num : numbers) foo(num); New error pattern: for (int num : numbers) numbers.push_back(num * 2); Example 41
  • 43.  Code example: char check(const uint8 *hash_stage2) { .... return memcmp(hash_stage2, hash_stage2_reassured, SHA1_HASH_SIZE); }  The analyzer hypothetically suggests to fix as follows: int check(const uint8 *hash_stage2) { .... return memcmp(hash_stage2, hash_stage2_reassured, SHA1_HASH_SIZE); } Why documentation matters 43
  • 45. Code example: ObjectOutputStream out = new ObjectOutputStream(....); SerializedObject obj = new SerializedObject(); obj.state = 100; out.writeObject(obj); obj.state = 200; out.writeObject(obj); out.close(); Why documentation matters 45
  • 46. The analyzer suggests: ObjectOutputStream out = new ObjectOutputStream(....); SerializedObject obj = new SerializedObject(); obj.state = 100; out.writeObject(obj); obj = new SerializedObject(); // Add this line obj.state = 200; out.writeObject(obj); out.close(); Why documentation matters 46
  • 47. What happens without the edit: ObjectOutputStream out = new ObjectOutputStream(....); SerializedObject obj = new SerializedObject(); obj.state = 100; out.writeObject(obj); // stores the object with the state = 100 obj.state = 200; out.writeObject(obj); // stores the object with the state = 100 out.close(); Why documentation matters 47
  • 51. std::vector<int> numbers; .... for (int num : numbers) { if (num < 5) { numbers.push_back(0); break; // or, for example, return } } False positives 51
  • 52.  Reason for getting a warning may be unclear. Reason for NOT getting a warning may be unclear as well.  How to fix?  Additional learning (will it help?)  Mechanism to hide warnings (not universal) False positives 52
  • 53. In case of successful analyzer learning 53
  • 54.  Code style by specific symbols  Collecting additional metrics and information Promising directions 54
  • 55.  Best-practices for a specific framework/code base/platform Promising directions 55
  • 56. 56 https://pvs-studio.com/en/pvs-studio/download/ Download a PVS-Studio one-month trial version and check your projects using a classic static analysis: