An ideal static analyzer, or why ideals are unachievable
Author: Evgeniy Ryzhkov
Date: 15.03.2012
Inspired by Eugene Kaspersky's post about an ideal antivirus, I decided to write a similar post
about an ideal static analyzer and, along the way, to consider how far our PVS-Studio is from that ideal.
An ideal static analyzer's characteristics
If you are not familiar with the notion of static code analysis, please follow the link. Let me
enumerate the characteristics right away:
• 100% detection of all the types of programming errors;
• 0% false positives;
• high performance - "whooosh, and the code is analyzed completely almost in the same
instant";
• integration with my favorite (i.e. every) IDE; ability to work under my favorite (i.e. every)
operating system; analysis of code in my favorite (i.e. any) programming language;
• free (freeware, open source);
• high-quality and prompt customer support.
Of course, this ideal can never be achieved, but it shows the direction in which companies
developing solutions in this field can head.
100% detection of all the types of programming errors
You should understand that no static analyzer will ever provide 100% error detection. Why?
If only because some error types are better detected by dynamic analyzers, and it is pointless to
try to compete with them in this area. Likewise, dynamic analyzers cannot compete with static ones
on some rule types.
It's difficult to achieve 100% error detection even for diagnostics characteristic of static analysis.
First, any living programming language keeps developing, acquiring new syntax and therefore new ways
of making an error. Second, over time people may use even old syntax in unusual ways that the
analyzer's developers did not anticipate.
Finally, a static analyzer has no knowledge of what a program SHOULD contain; it doesn't
have AI. If a program says "A is equal to B" while the correct statement is "A is not equal to B",
static analysis won't help you with that.
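A small Python sketch can illustrate this limitation. The toy rule and all names below are assumptions made for illustration, not a description of how PVS-Studio works. A checker built on the standard `ast` module can flag an expression compared with itself, because that pattern is suspicious whatever the intent, but it has no basis for choosing between `a == b` and `a != b`.

```python
import ast
from typing import List


def find_self_comparisons(source: str) -> List[int]:
    """Return line numbers where an expression is compared with itself."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Only simple two-operand comparisons such as `x == x` are considered.
        if isinstance(node, ast.Compare) and len(node.comparators) == 1:
            left = ast.dump(node.left)
            right = ast.dump(node.comparators[0])
            if left == right:
                hits.append(node.lineno)
    return sorted(hits)


# Suspicious regardless of intent: the rule can catch it.
SUSPICIOUS = "def same(a, b):\n    return a == a\n"
# What the programmer meant to write.
INTENDED = "def differ(a, b):\n    return a != b\n"
# What the programmer actually wrote: valid code, wrong meaning.
MISTAKEN = "def differ(a, b):\n    return a == b\n"

print(find_self_comparisons(SUSPICIOUS))
print(find_self_comparisons(INTENDED))
print(find_self_comparisons(MISTAKEN))
```

The intended and the mistaken versions look equally plausible to the rule, so only the first snippet is reported.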
That's why the only real way out is to keep creating new diagnostic rules. This will never give you
100% error detection, but it will keep you close to it all the time.
0% false positives
Any static analyzer produces false positives because, in the end, only the programmer KNOWS what
exactly the code IS SUPPOSED to do. An analyzer sees only what the code really DOES and tries to
UNDERSTAND what it SHOULD do.
Returning to the previous section about "100% detection of all the errors", one can make a naïve
suggestion: "Why, let's detect everything that moves and we'll be happy!" That is, let's report
everything that looks even slightly like an error. But this approach is wrong because the number of
false positives will go through the roof. And there is an opinion that when users see 10 false
positives in a row, they close the tool and never come back to it.
We have the following ways to reduce the number of false positives:
1. Constantly revising existing rules to refine their formulations. For example, if a rule was
triggered 100 times in a test project and 50 of those were false positives, refining the rule can
reduce the number of false positives to 10. You may lose 1 or 2 real warnings along the way, but
that is the eternal matter of compromise.
2. Dropping rules that are no longer relevant. If you only add new rules and never remove (turn
off) obsolete ones, some of your diagnostics lose their relevance with time.
3. Having useful tools to handle false positives. For instance, PVS-Studio provides a mechanism to
suppress false positives. Once you mark a message as a false report, you won't see it next time.
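The trade-off in the first item can be spelled out with a little arithmetic. The Python sketch below rests on assumptions: it reads the example as 100 warnings of which 50 are real, and takes refinement to leave 10 false positives while losing 2 real warnings. The helper names are made up for illustration.

```python
def precision(real_warnings: int, false_positives: int) -> float:
    """Share of reported warnings that point at real defects."""
    return real_warnings / (real_warnings + false_positives)


def share_kept(real_after: int, real_before: int) -> float:
    """Share of the real warnings that survived the rule refinement."""
    return real_after / real_before


# Before refinement: 100 warnings, 50 real and 50 false.
before = precision(real_warnings=50, false_positives=50)
# After refinement (assumed): 48 real warnings remain, 10 false positives.
after = precision(real_warnings=48, false_positives=10)
kept = share_kept(real_after=48, real_before=50)

print(f"precision before: {before:.2f}")
print(f"precision after:  {after:.2f}")
print(f"real warnings kept: {kept:.2f}")
```

Under these assumptions, the user has far fewer useless messages to read, at the cost of a small share of the real warnings.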
High performance
Everybody wants software to work fast, but that's not always possible. Code analysis usually
requires more resources than compilation, because the compiler checks only for very crude errors,
while the analyzer aims at fine-grained analysis. Of course, it needs more data for that. The more
data there is, the deeper the analysis is and the more interesting the errors that can be found.
An obvious way to enhance performance is to use several processor cores when
analyzing the code. This is rather easy to implement in static analyzers: each file is checked
separately and the results are then simply combined.
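The "check each file separately, then combine" scheme might be sketched in Python as follows. The per-file check is a deliberately trivial stand-in, and all names are hypothetical. The sketch uses a thread pool to stay short; for CPU-bound analysis in CPython, a process pool (`concurrent.futures.ProcessPoolExecutor`) would be needed to actually load several cores.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def analyze_file(name: str, text: str) -> List[str]:
    """Toy per-file check: report lines that contain a TODO marker."""
    return [
        f"{name}:{number}: unfinished code (TODO)"
        for number, line in enumerate(text.splitlines(), start=1)
        if "TODO" in line
    ]


def analyze_project(files: Dict[str, str], workers: int = 4) -> List[str]:
    """Check every file independently and merge the warnings in file order."""
    names = sorted(files)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the order of `names`, whichever task ends first.
        per_file = pool.map(lambda n: analyze_file(n, files[n]), names)
        return [warning for warnings in per_file for warning in warnings]


project = {
    "b.c": "int x; // TODO remove\n",
    "a.c": "int main(void) {\n  // TODO handle errors\n  return 0;\n}\n",
}
for warning in analyze_project(project):
    print(warning)
```

No file needs the results of another one, which is what makes the merge step a plain concatenation.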
Less obvious is an attempt to check just a code fragment instead of a whole compilation unit (a file).
This is a very complicated task and, in the general case, quite difficult to solve (for any language).
You have to find and "calculate" data types, analyze the classes being used and so on. The cost of
"extracting" the part you need to analyze might be even higher than that of simply analyzing the
whole code.
Integration with my favorite (i.e. every) IDE; ability to work under my
favorite (i.e. every) operating system; analysis of code in my favorite (i.e.
any) programming language
Support for a particular operating system, development environment or analyzable
programming language is important when choosing between static analysis tools. To my great surprise,
programmers, the main users of static analyzers, often do not appreciate how difficult it is to
support the whole zoo of operating systems they want. But let's discuss everything in
order.
Supported (analyzable) programming languages
Programming errors detected by code analyzers can certainly occur in every programming language, and
these errors have common features: in every language programmers forget to initialize variables,
make typos and so on. But parsing and analyzing a program is VERY different
from language to language.
If an analyzer is announced as supporting several programming languages, it most likely
contains several analysis modules too. This may even be hidden from users! I'm
writing this so that people understand that the question "Why don't you make the SAME but for
C#/PHP/Java?" implies a great deal of work.
Supported operating systems
It's very naïve to think that a code analyzer "just" handles text and can therefore work in any
operating system. Of course, different programming languages are "tied" to the environment to
different extents: some more so, like C++; others less so, like PHP.
Where does this difference come from? The point is that several compilers exist for large and
powerful languages like C++, each with its own differences and subtleties in the language syntax.
Code written for Windows-based compilers is only slightly, yet noticeably, different from code
written for Linux-based compilers. This difference is not crucial from the user code's viewpoint, but
it may be important for a static analyzer: if the code being analyzed contains keywords
specific to a particular compiler, the analyzer needs to be "taught" them. In this sense, supporting
one more compiler and supporting one more operating system are, generally speaking, equivalent tasks.
Note that this task is easier for languages simpler than C++.
Thus, supported operating systems include not only the platforms the analyzer's executable runs on,
but also the platforms whose code the analyzer can "understand".
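What "teaching" an analyzer compiler-specific keywords means can be hinted at with a deliberately simplified Python sketch. The function and table names are hypothetical, and a real front end does far more than scan for tokens. The keywords themselves are real extensions: `__declspec`, `__forceinline` and `__int64` come from Microsoft Visual C++, while `__attribute__`, `__restrict__` and `__typeof__` come from GCC.

```python
import re
from typing import List, Set

# Extension keywords per compiler dialect (small illustrative subsets).
EXTENSIONS = {
    "msvc": {"__declspec", "__forceinline", "__int64"},
    "gcc": {"__attribute__", "__restrict__", "__typeof__"},
}


def unknown_keywords(code: str, dialects: Set[str]) -> List[str]:
    """List double-underscore tokens the analyzer has not been taught."""
    known: Set[str] = set()
    for dialect in dialects:
        known |= EXTENSIONS[dialect]
    found = re.findall(r"\b__\w+\b", code)
    return sorted({token for token in found if token not in known})


windows_code = "__declspec(dllexport) __forceinline int f();"
linux_code = "static int g(int * __restrict__ p) __attribute__((noinline));"

# An analyzer taught only the MSVC dialect handles the first line
# but stumbles over the second one.
print(unknown_keywords(windows_code, {"msvc"}))
print(unknown_keywords(linux_code, {"msvc"}))
print(unknown_keywords(linux_code, {"msvc", "gcc"}))
```

Each new compiler adds another table of this kind, plus the parsing rules behind every keyword, which is where the real effort goes.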
Supported IDE
There are a lot of development tools for different languages. What is important for users is this:
• a static analyzer should be able to integrate with their favorite development environment;
• the tool can be run in automatic mode at night;
• the analyzer should be able to integrate into continuous integration systems.
The last two points are often called "command-line version support", but this has nothing to do with
the command line as such. Nowadays no one finds it interesting to watch white letters on a black
screen; people want a conveniently organized report which can be converted into a text file and sent
via e-mail or written into the build system's log.
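As a rough Python sketch of such unattended use, the hypothetical helper below turns a list of diagnostics into a plain-text report suitable for a build log, plus a status value a nightly job could use as its exit code. The record layout, the rule codes and the "non-zero means warnings were found" convention are assumptions, not the format of any particular tool.

```python
from typing import List, NamedTuple, Tuple


class Diagnostic(NamedTuple):
    path: str
    line: int
    code: str
    message: str


def render_report(diagnostics: List[Diagnostic]) -> Tuple[str, int]:
    """Return a plain-text report and a status (0 = clean, 1 = warnings found)."""
    ordered = sorted(diagnostics, key=lambda d: (d.path, d.line))
    lines = [f"{d.path}:{d.line}: {d.code} {d.message}" for d in ordered]
    lines.append(f"{len(ordered)} warning(s)")
    return "\n".join(lines), (1 if ordered else 0)


found = [
    Diagnostic("b.cpp", 7, "R002", "variable is used uninitialized"),
    Diagnostic("a.cpp", 3, "R001", "identical operands of '=='"),
]
report, status = render_report(found)
print(report)
print("status:", status)
```

The same text can be written to a file, attached to an e-mail or appended to the log of the continuous integration system.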
Supporting different IDEs is a difficult, effort-intensive task, as each IDE imposes certain
restrictions on its plugins. These restrictions often vary from one system to another.
Free (freeware, open source) and high-quality customer support
I've combined two sections into one because they are closely connected.
Static code analysis tools belong to the type of software for which quality and continuous support
are very important. Yes, there are a few tools distributed for free, but I believe they will never
catch up with the market leaders (Coverity, Klocwork, Parasoft).
Generally speaking, a static analysis tool can become free and open-source if the company developing
it is acquired by some giant like Google, Microsoft or Intel, but this is a special case.
Static analysis tools are usually sold under an annually renewable license. Some users
might not like it, but I will try to explain why this scheme is the best. And please forgive me if you
came to the "Free" section and are now reading about licensing schemes.
As I've already said, customer support is very important for static analysis tools. In the field of
static analysis, support means, first of all, handling cases when the analyzer cannot parse user code
(because of complex C++ templates, non-standard compiler extensions, etc.). In these cases you need
to improve the analyzer promptly (within several days) so that it can parse the customer's code. User
support also includes help with integrating the tool into the customer's development process. Finally,
implementing customer requests that make the tool more convenient to use is also necessary.
All this costs money. That's why you cannot sell a license once and support your users for free for the
rest of your life.
One could sell new major releases, for example, versions v3, v4, v5... The bad thing about this
scheme is that it makes the developer "hold back" cool new capabilities of the tool until the next
major version instead of releasing them as soon as they are ready.
Thus, annual license renewal appears to be the best way. Some developer companies set
the renewal price at 100% of the initial price, while others set a lower price (offering a discount
for renewal). The latter case can be explained this way: the first year's price includes the
additional cost of teaching the customer to work with the tool.
So, it appears that a quality tool with quality support cannot be free, unless it is being developed
by a giant company, but in that case you can forget about targeted individual customer support.
Conclusion
In this article I've tried to show what characteristics an ideal static code analysis tool should
possess, that is, how users want it to look. And it is users, of course, who decide how closely this
or that tool corresponds to this ideal.

More Related Content

What's hot

Automation testing: how tools are important?
Automation testing: how tools are important?Automation testing: how tools are important?
Automation testing: how tools are important?MD ISLAM
 
Meetup of test mini conference on ai in testing
Meetup of test mini conference  on ai in testingMeetup of test mini conference  on ai in testing
Meetup of test mini conference on ai in testingKai Lepler
 
Jeremias Rößler
Jeremias RößlerJeremias Rößler
Jeremias RößlerCodeFest
 
How to apply AI to Testing
How to apply AI to TestingHow to apply AI to Testing
How to apply AI to TestingSAP SE
 
27 000 Errors in the Tizen Operating System
27 000 Errors in the Tizen Operating System27 000 Errors in the Tizen Operating System
27 000 Errors in the Tizen Operating SystemPVS-Studio
 
Tool Support For Testing (Types of Test Tool)
Tool Support For Testing (Types of Test Tool)Tool Support For Testing (Types of Test Tool)
Tool Support For Testing (Types of Test Tool)sarahrambe
 
.Net Debugging Techniques
.Net Debugging Techniques.Net Debugging Techniques
.Net Debugging TechniquesBala Subra
 
Automating Pragmatically - Testival 20190604
Automating Pragmatically - Testival 20190604Automating Pragmatically - Testival 20190604
Automating Pragmatically - Testival 20190604Alan Richardson
 
Comparing PVS-Studio with other code analyzers
Comparing PVS-Studio with other code analyzersComparing PVS-Studio with other code analyzers
Comparing PVS-Studio with other code analyzersPVS-Studio
 
QA Fest 2017. Jeremias Rößler. Applying AI to testing
QA Fest 2017. Jeremias Rößler. Applying AI to testingQA Fest 2017. Jeremias Rößler. Applying AI to testing
QA Fest 2017. Jeremias Rößler. Applying AI to testingQAFest
 
The limits of unit testing by Craig Stuntz
The limits of unit testing by Craig StuntzThe limits of unit testing by Craig Stuntz
The limits of unit testing by Craig StuntzQA or the Highway
 
Testing without assertions - #HUSTEF2019
Testing without assertions - #HUSTEF2019Testing without assertions - #HUSTEF2019
Testing without assertions - #HUSTEF2019SAP SE
 
We continue checking Microsoft projects: analysis of PowerShell
We continue checking Microsoft projects: analysis of PowerShellWe continue checking Microsoft projects: analysis of PowerShell
We continue checking Microsoft projects: analysis of PowerShellPVS-Studio
 
Initial thoughts on live user tests for games
Initial thoughts on live user tests for gamesInitial thoughts on live user tests for games
Initial thoughts on live user tests for gamesJohan Hoberg
 
Practical Test Automation Deep Dive
Practical Test Automation Deep DivePractical Test Automation Deep Dive
Practical Test Automation Deep DiveAlan Richardson
 
Automating Strategically or Tactically when Testing
Automating Strategically or Tactically when TestingAutomating Strategically or Tactically when Testing
Automating Strategically or Tactically when TestingAlan Richardson
 

What's hot (17)

Automation testing: how tools are important?
Automation testing: how tools are important?Automation testing: how tools are important?
Automation testing: how tools are important?
 
Why Unit Testingl
Why Unit TestinglWhy Unit Testingl
Why Unit Testingl
 
Meetup of test mini conference on ai in testing
Meetup of test mini conference  on ai in testingMeetup of test mini conference  on ai in testing
Meetup of test mini conference on ai in testing
 
Jeremias Rößler
Jeremias RößlerJeremias Rößler
Jeremias Rößler
 
How to apply AI to Testing
How to apply AI to TestingHow to apply AI to Testing
How to apply AI to Testing
 
27 000 Errors in the Tizen Operating System
27 000 Errors in the Tizen Operating System27 000 Errors in the Tizen Operating System
27 000 Errors in the Tizen Operating System
 
Tool Support For Testing (Types of Test Tool)
Tool Support For Testing (Types of Test Tool)Tool Support For Testing (Types of Test Tool)
Tool Support For Testing (Types of Test Tool)
 
.Net Debugging Techniques
.Net Debugging Techniques.Net Debugging Techniques
.Net Debugging Techniques
 
Automating Pragmatically - Testival 20190604
Automating Pragmatically - Testival 20190604Automating Pragmatically - Testival 20190604
Automating Pragmatically - Testival 20190604
 
Comparing PVS-Studio with other code analyzers
Comparing PVS-Studio with other code analyzersComparing PVS-Studio with other code analyzers
Comparing PVS-Studio with other code analyzers
 
QA Fest 2017. Jeremias Rößler. Applying AI to testing
QA Fest 2017. Jeremias Rößler. Applying AI to testingQA Fest 2017. Jeremias Rößler. Applying AI to testing
QA Fest 2017. Jeremias Rößler. Applying AI to testing
 
The limits of unit testing by Craig Stuntz
The limits of unit testing by Craig StuntzThe limits of unit testing by Craig Stuntz
The limits of unit testing by Craig Stuntz
 
Testing without assertions - #HUSTEF2019
Testing without assertions - #HUSTEF2019Testing without assertions - #HUSTEF2019
Testing without assertions - #HUSTEF2019
 
We continue checking Microsoft projects: analysis of PowerShell
We continue checking Microsoft projects: analysis of PowerShellWe continue checking Microsoft projects: analysis of PowerShell
We continue checking Microsoft projects: analysis of PowerShell
 
Initial thoughts on live user tests for games
Initial thoughts on live user tests for gamesInitial thoughts on live user tests for games
Initial thoughts on live user tests for games
 
Practical Test Automation Deep Dive
Practical Test Automation Deep DivePractical Test Automation Deep Dive
Practical Test Automation Deep Dive
 
Automating Strategically or Tactically when Testing
Automating Strategically or Tactically when TestingAutomating Strategically or Tactically when Testing
Automating Strategically or Tactically when Testing
 

Viewers also liked

Analyzing the Dolphin-emu project
Analyzing the Dolphin-emu projectAnalyzing the Dolphin-emu project
Analyzing the Dolphin-emu projectPVS-Studio
 
Myths about static analysis. The fifth myth - a small test program is enough ...
Myths about static analysis. The fifth myth - a small test program is enough ...Myths about static analysis. The fifth myth - a small test program is enough ...
Myths about static analysis. The fifth myth - a small test program is enough ...PVS-Studio
 
Studying methods of attracting people to a software product's website
Studying methods of attracting people to a software product's websiteStudying methods of attracting people to a software product's website
Studying methods of attracting people to a software product's websitePVS-Studio
 
Wade not in unknown waters. Part one.
Wade not in unknown waters. Part one.Wade not in unknown waters. Part one.
Wade not in unknown waters. Part one.PVS-Studio
 
Checking Intel IPP Samples for Windows - Continuation
Checking Intel IPP Samples for Windows - ContinuationChecking Intel IPP Samples for Windows - Continuation
Checking Intel IPP Samples for Windows - ContinuationPVS-Studio
 
Farewell to #define private public
Farewell to #define private publicFarewell to #define private public
Farewell to #define private publicPVS-Studio
 

Viewers also liked (6)

Analyzing the Dolphin-emu project
Analyzing the Dolphin-emu projectAnalyzing the Dolphin-emu project
Analyzing the Dolphin-emu project
 
Myths about static analysis. The fifth myth - a small test program is enough ...
Myths about static analysis. The fifth myth - a small test program is enough ...Myths about static analysis. The fifth myth - a small test program is enough ...
Myths about static analysis. The fifth myth - a small test program is enough ...
 
Studying methods of attracting people to a software product's website
Studying methods of attracting people to a software product's websiteStudying methods of attracting people to a software product's website
Studying methods of attracting people to a software product's website
 
Wade not in unknown waters. Part one.
Wade not in unknown waters. Part one.Wade not in unknown waters. Part one.
Wade not in unknown waters. Part one.
 
Checking Intel IPP Samples for Windows - Continuation
Checking Intel IPP Samples for Windows - ContinuationChecking Intel IPP Samples for Windows - Continuation
Checking Intel IPP Samples for Windows - Continuation
 
Farewell to #define private public
Farewell to #define private publicFarewell to #define private public
Farewell to #define private public
 

Similar to An ideal static analyzer, or why ideals are unachievable

Static analysis is most efficient when being used regularly. We'll tell you w...
Static analysis is most efficient when being used regularly. We'll tell you w...Static analysis is most efficient when being used regularly. We'll tell you w...
Static analysis is most efficient when being used regularly. We'll tell you w...PVS-Studio
 
Static analysis is most efficient when being used regularly. We'll tell you w...
Static analysis is most efficient when being used regularly. We'll tell you w...Static analysis is most efficient when being used regularly. We'll tell you w...
Static analysis is most efficient when being used regularly. We'll tell you w...Andrey Karpov
 
What do static analysis and search engines have in common? A good "top"!
What do static analysis and search engines have in common? A good "top"!What do static analysis and search engines have in common? A good "top"!
What do static analysis and search engines have in common? A good "top"!PVS-Studio
 
Machine Learning in Static Analysis of Program Source Code
Machine Learning in Static Analysis of Program Source CodeMachine Learning in Static Analysis of Program Source Code
Machine Learning in Static Analysis of Program Source CodeAndrey Karpov
 
Static Analysis: From Getting Started to Integration
Static Analysis: From Getting Started to IntegrationStatic Analysis: From Getting Started to Integration
Static Analysis: From Getting Started to IntegrationAndrey Karpov
 
The Development History of PVS-Studio for Linux
The Development History of PVS-Studio for LinuxThe Development History of PVS-Studio for Linux
The Development History of PVS-Studio for LinuxPVS-Studio
 
Difficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usabilityDifficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usabilityAndrey Karpov
 
Difficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usabilityDifficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usabilityPVS-Studio
 
Difficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usabilityDifficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usabilityPVS-Studio
 
Konstantin Knizhnik: static analysis, a view from aside
Konstantin Knizhnik: static analysis, a view from asideKonstantin Knizhnik: static analysis, a view from aside
Konstantin Knizhnik: static analysis, a view from asidePVS-Studio
 
Testing parallel programs
Testing parallel programsTesting parallel programs
Testing parallel programsPVS-Studio
 
If the coding bug is banal, it doesn't meant it's not crucial
If the coding bug is banal, it doesn't meant it's not crucialIf the coding bug is banal, it doesn't meant it's not crucial
If the coding bug is banal, it doesn't meant it's not crucialPVS-Studio
 
Software design.edited (1)
Software design.edited (1)Software design.edited (1)
Software design.edited (1)FarjanaAhmed3
 
PVS-Studio confesses its love for Linux
PVS-Studio confesses its love for LinuxPVS-Studio confesses its love for Linux
PVS-Studio confesses its love for LinuxPVS-Studio
 
Searching for bugs in Mono: there are hundreds of them!
Searching for bugs in Mono: there are hundreds of them!Searching for bugs in Mono: there are hundreds of them!
Searching for bugs in Mono: there are hundreds of them!PVS-Studio
 
PVS-Studio advertisement - static analysis of C/C++ code
PVS-Studio advertisement - static analysis of C/C++ codePVS-Studio advertisement - static analysis of C/C++ code
PVS-Studio advertisement - static analysis of C/C++ codePVS-Studio
 
debugging (1).ppt
debugging (1).pptdebugging (1).ppt
debugging (1).pptjerlinS1
 
An important characteristic of a test suite that is computed by a dynamic ana...
An important characteristic of a test suite that is computed by a dynamic ana...An important characteristic of a test suite that is computed by a dynamic ana...
An important characteristic of a test suite that is computed by a dynamic ana...jeyasrig
 

Similar to An ideal static analyzer, or why ideals are unachievable (20)

Static analysis is most efficient when being used regularly. We'll tell you w...
Static analysis is most efficient when being used regularly. We'll tell you w...Static analysis is most efficient when being used regularly. We'll tell you w...
Static analysis is most efficient when being used regularly. We'll tell you w...
 
Static analysis is most efficient when being used regularly. We'll tell you w...
Static analysis is most efficient when being used regularly. We'll tell you w...Static analysis is most efficient when being used regularly. We'll tell you w...
Static analysis is most efficient when being used regularly. We'll tell you w...
 
What do static analysis and search engines have in common? A good "top"!
What do static analysis and search engines have in common? A good "top"!What do static analysis and search engines have in common? A good "top"!
What do static analysis and search engines have in common? A good "top"!
 
Machine Learning in Static Analysis of Program Source Code
Machine Learning in Static Analysis of Program Source CodeMachine Learning in Static Analysis of Program Source Code
Machine Learning in Static Analysis of Program Source Code
 
Static Analysis: From Getting Started to Integration
Static Analysis: From Getting Started to IntegrationStatic Analysis: From Getting Started to Integration
Static Analysis: From Getting Started to Integration
 
test
testtest
test
 
The Development History of PVS-Studio for Linux
The Development History of PVS-Studio for LinuxThe Development History of PVS-Studio for Linux
The Development History of PVS-Studio for Linux
 
test
testtest
test
 
Difficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usabilityDifficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usability
 
Difficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usabilityDifficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usability
 
Difficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usabilityDifficulties of comparing code analyzers, or don't forget about usability
Difficulties of comparing code analyzers, or don't forget about usability
 
Konstantin Knizhnik: static analysis, a view from aside
Konstantin Knizhnik: static analysis, a view from asideKonstantin Knizhnik: static analysis, a view from aside
Konstantin Knizhnik: static analysis, a view from aside
 
Testing parallel programs
Testing parallel programsTesting parallel programs
Testing parallel programs
 
If the coding bug is banal, it doesn't meant it's not crucial
If the coding bug is banal, it doesn't meant it's not crucialIf the coding bug is banal, it doesn't meant it's not crucial
If the coding bug is banal, it doesn't meant it's not crucial
 
Software design.edited (1)
Software design.edited (1)Software design.edited (1)
Software design.edited (1)
 
PVS-Studio confesses its love for Linux
PVS-Studio confesses its love for LinuxPVS-Studio confesses its love for Linux
PVS-Studio confesses its love for Linux
 
Searching for bugs in Mono: there are hundreds of them!
Searching for bugs in Mono: there are hundreds of them!Searching for bugs in Mono: there are hundreds of them!
Searching for bugs in Mono: there are hundreds of them!
 
PVS-Studio advertisement - static analysis of C/C++ code
PVS-Studio advertisement - static analysis of C/C++ codePVS-Studio advertisement - static analysis of C/C++ code
PVS-Studio advertisement - static analysis of C/C++ code
 
debugging (1).ppt
debugging (1).pptdebugging (1).ppt
debugging (1).ppt
 
An important characteristic of a test suite that is computed by a dynamic ana...
An important characteristic of a test suite that is computed by a dynamic ana...An important characteristic of a test suite that is computed by a dynamic ana...
An important characteristic of a test suite that is computed by a dynamic ana...
 

Recently uploaded

Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
An ideal static analyzer, or why ideals are unachievable

Author: Evgeniy Ryzhkov
Date: 15.03.2012

Inspired by Eugene Kaspersky's post about an ideal antivirus, I decided to write a similar post about an ideal static analyzer, and along the way to consider how far our PVS-Studio is from that ideal.

An ideal static analyzer's characteristics

If you are not familiar with the notion of static code analysis, please follow the link. Let me enumerate the characteristics right away:
• 100% detection of all the types of programming errors;
• 0% false positives;
• high performance - "whooosh, and the code is analyzed completely almost at the same instant";
• integration with my favorite (i.e. every) IDE; ability to work under my favorite (i.e. every) operating system; analysis of code in my favorite (i.e. any) programming language;
• free (freeware, open source);
• high-quality and prompt customer support.

Of course, this ideal can never be achieved, but it shows the direction in which companies developing solutions in this sphere can head.

100% detection of all the types of programming errors

You should understand that no static analyzer will ever provide 100% error detection. Why? If only because some error types are better detected by dynamic analyzers, and it's ridiculous to try to compete with them in this area. Likewise, dynamic analyzers cannot compete with static ones on some rule types.

It's difficult to obtain 100% error detection even for the diagnostics that are characteristic of static analysis. First, any living programming language is constantly developing, acquiring new syntax and therefore new ways of making an error. Second, even old syntax can over time be used in unusual ways that analyzer developers did not think of.

Finally, a static analyzer has no knowledge of what a program SHOULD contain; it doesn't have AI. If a program says "A is equal to B" while the correct statement is "A is not equal to B", static analysis won't help you with that.

That's why the only real way out is to constantly create new diagnostic rules. It will never give you 100% error detection, but it will keep you close to it all the time.
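To make the notion of a diagnostic rule concrete, here is a minimal sketch in Python. The article names no implementation language, so the language, the rule and the name `find_self_comparisons` are all assumptions made for illustration; this is not how PVS-Studio works internally. The hypothetical rule flags comparisons whose two operands are identical, such as `a == a`. It has no way of knowing whether `a == b` should have been `a != b`, which is the limitation described above.

```python
import ast

def find_self_comparisons(source: str) -> list[int]:
    """Return line numbers of comparisons like `x == x` (hypothetical rule)."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Only simple two-operand comparisons are considered in this sketch.
        if isinstance(node, ast.Compare) and len(node.comparators) == 1:
            left = ast.dump(node.left)
            right = ast.dump(node.comparators[0])
            if left == right:
                hits.append(node.lineno)
    return sorted(hits)

SAMPLE = """\
if a == a:      # suspicious: both operands are the same
    pass
if a == b:      # possibly the wrong intent, but the rule cannot know
    pass
"""

print(find_self_comparisons(SAMPLE))
```

Only the first comparison is reported. The second one may well be the real bug, but nothing in the code itself says so.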
0% false positives

Any static analyzer produces false positives because, in the end, only the programmer KNOWS what exactly the code IS SUPPOSED to do. An analyzer sees what the code really DOES and tries to UNDERSTAND what it SHOULD do.

Returning to the previous section about "100% detection of all the errors", one can make a naïve suggestion: "Why, let's detect everything that moves and we'll be happy!" That is, let's report everything that looks even slightly like an error. But this approach is wrong because the number of false positives will get out of hand. And there is an opinion that when a user sees 10 false positives in a row, he/she closes the tool and does not come back to it.

We have the following ways to reduce the number of false positives:
1. Constantly reworking existing rules to refine their formulations. For example, if a rule was triggered 100 times in a test project and 50 of those were false positives, refining the rule can reduce this number to 10. You may lose 1 or 2 real warnings in the process, but that is the eternal issue of making compromises.
2. Retiring rules that are no longer relevant. If you only add new rules and never remove (turn off) obsolete ones, some of your diagnostics lose their relevance with time.
3. Having useful tools to handle false positives. For instance, PVS-Studio provides a mechanism to suppress false positives. Once you mark a message as a false report, you won't see it next time.

High performance

Everybody wants software to work fast, but it's not always possible. Code analysis usually requires more resources than compilation, because the compiler checks only for very crude errors, while the analyzer's aim is to perform fine analysis. Of course, it needs more data for that. The more data it has, the deeper the analysis is and the more interesting the errors it can find.

An obvious way to enhance performance is to use several processor cores when analyzing the code. It's rather easy to implement in static analyzers: each file is checked separately and the results are simply combined afterwards.

Less obvious is an attempt to check just a code fragment instead of a whole compilation unit (a file). This is a very complicated task and, in the general case, quite difficult to solve (for any language). You have to find and "calculate" data types, analyze the classes being used and so on. The cost of "extracting" the part you need to analyze might be even higher than that of analyzing the whole code.

Integration with my favorite (i.e. every) IDE; ability to work under my favorite (i.e. every) operating system; analysis of code in my favorite (i.e. any) programming language

Support for a particular operating system, development environment or analyzable programming language is an important factor in choosing between static analysis tools. To my great surprise, programmers, being the main users of static analyzers, often do not understand how difficult it is to support the whole zoo of operating systems they want. But let's discuss it in due order.
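Returning briefly to the multi-core point from the "High performance" section, the "check each file separately, then combine the results" scheme can be sketched as follows. This is an illustration under stated assumptions, not PVS-Studio's implementation: `analyze_file` is a hypothetical stand-in for a real per-file check, and a thread pool is used only to keep the sketch self-contained. A CPU-bound analyzer would more likely distribute files across processes.

```python
from concurrent.futures import ThreadPoolExecutor

def analyze_file(name: str, text: str) -> list[str]:
    """Hypothetical per-file check: report lines containing 'TODO'."""
    return [
        f"{name}:{lineno}: unresolved TODO"
        for lineno, line in enumerate(text.splitlines(), start=1)
        if "TODO" in line
    ]

def analyze_project(files: dict[str, str], workers: int = 4) -> list[str]:
    """Check every file independently, then merge the per-file reports."""
    names = sorted(files)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each file is a separate task; no task needs another task's result.
        # A CPU-bound analyzer would likely use ProcessPoolExecutor here.
        reports = pool.map(lambda n: analyze_file(n, files[n]), names)
        # map() yields results in input order, so the merged report is stable.
        return [warning for report in reports for warning in report]

PROJECT = {
    "b.c": "int x;\n// TODO: init x\n",
    "a.c": "// TODO: remove\nint main(void) { return 0; }\n",
}

for warning in analyze_project(PROJECT):
    print(warning)
```

Because no file's result depends on another's, the only coordination needed is the final merge.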
Supported (analyzable) programming languages

Programming errors detected by code analyzers can surely occur in every programming language, and these errors have common features: in every language programmers forget to initialize variables, hit the wrong key when typing a program and so on. But parsing and analyzing a program is VERY different from language to language. If an analyzer is announced to support several programming languages, it most likely contains several analysis modules too. This may even be hidden from users! I'm writing this so that people understand that the phrase "Why don't you make the SAME but for C#/PHP/Java?" implies a great deal of work.

Supported operating systems

It's very naïve to think that a code analyzer "just" handles text and therefore can work under any operating system. Of course, different programming languages are "tied" to the environment to various extents: some more, like C++; others less, like PHP. Where does this difference come from?

The point is that large and powerful languages like C++ have several compilers, each with its own differences and subtleties in the language syntax. Code written for Windows-based compilers is slightly yet noticeably different from code written for Linux-based compilers. Though this difference is not very crucial from the user code's viewpoint, it can be important from the viewpoint of a static analyzer: if the code being analyzed contains keywords that are specific to one particular compiler, the analyzer needs to be "taught" them. In this sense, supporting one more compiler and supporting one more operating system are, generally speaking, equal tasks. Note that this task is easier for languages simpler than C++.

Thus, supported operating systems include not only the platforms the analyzer's executable file runs on, but also the platforms whose code the analyzer can "understand".
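As a rough illustration of what "teaching" an analyzer compiler-specific keywords means, consider the sketch below. The tables are deliberately tiny and incomplete, and the names `DIALECT_KEYWORDS` and `unknown_extensions` are hypothetical; a real analyzer would handle such extensions in its parser, not with a regular expression. The keywords themselves (`__declspec`, `__int64`, `__attribute__` and so on) are genuine Visual C++ and GCC extensions.

```python
import re

# Illustrative, deliberately incomplete keyword tables (an assumption of this sketch).
DIALECT_KEYWORDS = {
    "msvc": {"__declspec", "__forceinline", "__int64"},
    "gcc": {"__attribute__", "__restrict__", "__inline__"},
}

def unknown_extensions(code: str, dialect: str) -> list[str]:
    """Return double-underscore identifiers that the chosen dialect table lacks."""
    known = DIALECT_KEYWORDS[dialect]
    found = set(re.findall(r"\b__\w+", code))
    return sorted(found - known)

WINDOWS_CODE = "__declspec(dllexport) __int64 total(void);"

# With the Visual C++ table, every extension in the sample is already known.
print(unknown_extensions(WINDOWS_CODE, "msvc"))
# With the GCC table, the same code contains keywords the analyzer must be taught.
print(unknown_extensions(WINDOWS_CODE, "gcc"))
```

The same declaration is unremarkable for one dialect and full of unknown tokens for the other, which is why each additional compiler costs real development effort.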
Supported IDE

There are a lot of development tools for different languages. What is important for users is this:
• a static analyzer should be able to integrate with their favorite development environment;
• the tool should be able to run in automatic mode at night;
• the analyzer should be able to integrate into continuous integration systems.

The last two points are often called "support of command line version", but the command line itself is beside the point. Nowadays no one actually wants to watch white letters on a black screen instead of a conveniently organized report that can be converted into a text file and sent via e-mail or written into the build system's log.

Supporting different IDEs is a difficult, effort-intensive task, as each IDE imposes certain restrictions on its plugins. These restrictions often vary from one IDE to another.

Free (freeware, open source) and high-quality customer support

I've united two sections into one because they are closely connected.
Static code analysis tools belong to the kind of software for which quality and continuous support are very important. Yes, there are a few tools distributed for free, but I believe they will never catch up with the market leaders (Coverity, Klocwork, Parasoft). Generally speaking, a static analysis tool can become free and open-source if the developer company is purchased by some giant like Google, Microsoft or Intel, but this is a special case.

Static analysis tools are usually sold under an annually renewable license. Some users might not like it, but I will try to explain why this scheme is the best. And please forgive me if you came to the "Free" section and are now reading about licensing schemes.

As I've already said, customer support is very important for static analysis tools. In the field of static analysis, support means, first of all, cases when the analyzer cannot parse user code (because of complex C++ templates, non-standard compiler extensions, etc.). In these cases you need to promptly (within several days) improve the analyzer so that it can parse the customer's code. User support also includes help with integrating the tool into the customer's development process. Implementing customer requests that make the tool more convenient to use is necessary as well.

All this costs money. That's why you cannot sell a license once and support your users for free for the rest of your life. One could sell new major releases, for example, versions v3, v4, v5... What is bad about this scheme is that it makes the developer "hold back" new cool capabilities of the tool until the next major version instead of releasing them as soon as they are ready.

Thus, it appears that annual license renewal is the best way. Some developer companies set the renewal price at 100% of the initial price, while others set a lower price (a discount for renewal). The latter can be explained this way: the first year's price includes the additional cost of teaching the customer to work with the tool.

So, it appears that a quality tool with quality support cannot be free unless it is developed by a giant company, and in that case you can forget about targeted individual customer support.

Conclusion

In this article I've tried to show what characteristics an ideal static code analysis tool should possess, that is, how users want it to look. And it is users, of course, who decide how closely this or that tool corresponds to this ideal.