Konstantin Knizhnik: static analysis, a view from aside


Author: Andrey Karpov
Date: 10.01.2009

Abstract

This article is an interview with Konstantin Knizhnik conducted by Andrey Karpov, an employee of the company "Program Verification Systems". The interview discusses static code analysis, the relevance of the solutions offered in this field, and the prospects of using static analysis in application development.

Introduction

OOO "Program Verification Systems", which develops tools for program testing and verification, asked Konstantin Knizhnik, a specialist in static code analysis methodology, to answer some questions. The interview was conducted and written up as this article by Andrey Karpov, an employee of OOO "Program Verification Systems".

The interview touches upon static code analysis and the relevance of the solutions offered in this field. The prospects of using static analysis in the development of parallel applications are also discussed, and an outside evaluation is given of the static analysis tools Viva64 and VivaMP developed by OOO "Program Verification Systems". In addition, some general issues of program verification are discussed which, we hope, will interest readers exploring this area of application testing.

The questions are asked by:

Andrey Karpov, Candidate of Physico-mathematical Sciences, technical director of "Program Verification Systems"; develops the static code analysis tools Viva64 and VivaMP for testing 64-bit and parallel applications.
Author of several articles on static code analysis.

The questions are answered by:

Konstantin Knizhnik, Candidate of Physico-mathematical Sciences, author of several articles on static program code analysis and developer of Java application verifiers; he has taken part, and continues to take part, in many interesting projects, for example at WebAlta.

The interview's text

Is it true that you have studied static analysis and even took part in creating a static code analyzer for Java applications?

I did indeed study static analysis and program verification. It began in 1997, when I wrote jlint, a small program analogous to clint but for Java.

Tell us about jlint in detail please.

The program consisted of two parts. The first was a very simple static analyzer for languages with C-like syntax. As is well known, C syntax has many spots that lead to hard-to-detect errors, for example "=" instead of "==", or an empty loop body caused by a misplaced ";". I won't enumerate further, since I suppose these problems are familiar enough.

The second part, jlint itself, is a more or less self-contained semantic analyzer for Java. I didn't want to get involved in writing my own Java parser back then, so I decided to read the already compiled bytecode and analyze it. The checks include null reference dereferences, incompatible casts, expressions that are always true or always false, loss of precision and the like.

The most interesting feature of jlint was its ability to detect potential deadlocks in a program. Java has a very simple locking mechanism: a synchronized method or a synchronized(expr) construct. Based on an analysis of these constructs, an attempt was made to build the lock graph (whose nodes are the locked resources) and to find cycles in it. Of course it is impossible to build a precise graph, so we tried to build an approximate one using classes instead of concrete instances. Unfortunately, this check didn't work well on real projects and produced a lot of false positives.

How does the story of jlint relate to your work at TogetherSoft and Borland on similar projects? What were your responsibilities at these companies?

Some years later my jlint was noticed by the founder of TogetherSoft, the company that produced a UML modeling tool and was heading toward developing a complete IDE for Java. So I began working at TogetherSoft. At first, though, I developed an OODBMS, then a version control system, and only after that did I take part in developing the Java verifier.

This time everything was serious: complete syntactic parsing, data flow analysis and other features were implemented.
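
The lock-graph check described for jlint earlier can be illustrated with a minimal sketch. It assumes that an earlier pass has already recorded which class-level lock is acquired while another is held; the class `LockGraph` and its methods are hypothetical names chosen for this illustration, not jlint's actual code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not jlint's actual data structures.
class LockGraph {
    // Edge A -> B: some code acquires a lock on class B while holding one on class A.
    private final Map<String, List<String>> edges = new HashMap<>();

    void addNestedLock(String held, String acquired) {
        edges.computeIfAbsent(held, k -> new ArrayList<>()).add(acquired);
    }

    // A cycle means that locks may be taken in conflicting orders somewhere.
    boolean hasPotentialDeadlock() {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String node : edges.keySet()) {
            if (dfs(node, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search that reports a node revisited on the current path.
    private boolean dfs(String node, Set<String> done, Set<String> onPath) {
        if (onPath.contains(node)) {
            return true;               // back edge: cycle found
        }
        if (done.contains(node)) {
            return false;              // already fully explored, no cycle through it
        }
        onPath.add(node);
        List<String> successors = edges.getOrDefault(node, Collections.emptyList());
        for (String next : successors) {
            if (dfs(next, done, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }
}
```

Because the nodes are classes rather than instances, two locks on different objects of the same class collapse into one node. That is one plausible reason why such an approximation over-reports on real projects.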

The number of audits ran to several hundred. Some of them were, of course, rather simple, but others called for rudiments of artificial intelligence. For example, the verifier detected mix-ups of index variables in nested loops, or recognized standard sequences for working with a resource (open, read/write, close) and looked for places where these sequences were broken.

In general, a lot was done, including rather original things. It is certainly funny when you run the verifier on source code such as the JDK and get about a dozen critical null dereference errors. In most cases, of course, this happens in error handlers, that is, in places that are never executed in practice.

Interestingly, there were no Java verifiers on the market then. So there was an opportunity for great success, but somehow we didn't manage to use it.

TogetherSoft was sold to Borland. We had to add support for languages such as C++, Delphi and Visual Basic, and provide integration with JBuilder and Eclipse. In the end our verifier did reach users, but in a very poor form (mainly because we had to work on an AST provided to us by other subsystems, which was too slow and didn't contain the information the verifier needed). By that time it was too late, for there were verifiers for nearly all the popular Java IDEs. And although few of them tried to perform such a deep analysis (in most cases everything came down to syntax-level checks), these differences were not so easy to notice.

Then Borland was hit by a crisis, and for several years now I have been working in an entirely different field.

What are you doing now?

At present I am taking part in several projects at once. At WebAlta, for example, I work on search engines. Besides WebAlta, I participate in creating an OODBMS for plug-in systems. There are some other projects as well.

And what about the further fate of jlint?

As my work at TogetherSoft and then at Borland was directly connected with program verification, I had to give up my jlint. By the way, you can download the existing jlint from my site: http://www.garret.ru/lang.html.

As a specialist in static analysis, what observations and advice could you share with us? What should "Program Verification Systems" take into consideration as it continues to develop the static analyzers Viva64 and VivaMP?

Well, I'll try to briefly list the main conclusions I have drawn over the years devoted to the problem of program verification.

1. Most errors in real large projects are found by the most "stupid" audits. Take a missing break in a switch, for example. It is very useful when such audits work in the development environment in interactive mode (so that they mark unsafe places immediately).

2. Messages should be divided according to confidence level (May-Be-Null, Is-Null, ...), with the possibility of turning off individual groups.

3. To perform a full analysis you need a symbolic calculator, to understand that i+1 > i. Of course, I am aware of the overflow because of which this condition is not always true, but the verifier's task is to search for such suspicious places.

4. The worse a language is designed, the more work there is for the verifier. For example, C syntax causes a lot of programmer errors, and any C/C++ programmer has faced this problem more than once. In Java many of these defects were corrected, but still not all of them. Many of our audits were busy trying to detect enums (which were absent from Java until recently) according to various heuristic rules and to provide them with something like static type checking. Of course, all this turned out to be useless in C#.

5. The most important thing in a verifier is to maintain a reasonable balance between suspiciousness and "talkativeness". If I get several thousand messages on a small project, I simply won't be able to check them all. So we need a division into degrees of criticality. But if we take a rather critical error, for example a may-be-null, which can be caused by code such as the following:

    if (x == null) {
        DoSomething();
    }
    x.InvokeSomeMethod();

we must understand that, having checked the first several suspicious places without finding an error in them, a programmer won't look at the remaining messages.

That is why the rules "It's better to say nothing than to say nonsense" and "If not sure, keep silent" are very relevant for a verifier.

What do you think about the practicability and usefulness of creating static analyzers such as VivaMP for verifying parallel programs?

New languages (such as Java and C#), which have no explicit memory release and no address arithmetic, have made program behavior nearly deterministic. They have saved millions of man-hours once spent on debugging ("where is the memory leaking", "who deletes this variable" etc.) and have made it possible to do without tools like BoundsChecker, whose task was to fight the abuse of C/C++ capabilities.

But unfortunately parallel programming, that is, the creation of multithreaded applications, without which we cannot solve even a simple task nowadays, deprives us of this determinism. It throws us back to the times when a programmer had to spend too much time on debugging and run tests around the clock, not to become convinced that there were no errors (for a test can only show their presence, not prove their absence), but mostly to clear his own and the team leader's conscience.

Moreover, while earlier (and even now in C++) writing a parallel program demanded great effort, in C#/Java it is much easier to create a new thread.
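
How little code a new thread takes in Java, and how easily a data race slips in with it, can be shown with a short sketch. The class `CounterRace` and its fields are hypothetical names used only for this illustration; the example compares an unsynchronized counter with `java.util.concurrent.atomic.AtomicInteger`.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; the names are illustrative only.
class CounterRace {
    static int unsafeCount = 0;                          // plain field, no synchronization
    static final AtomicInteger safeCount = new AtomicInteger();

    // Starts `threads` threads; each increments both counters `perThread` times.
    static void run(int threads, int perThread) {
        unsafeCount = 0;
        safeCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    unsafeCount++;                       // read-modify-write: a data race
                    safeCount.incrementAndGet();         // atomic increment
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();                                // wait for every worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }
}
```

After `CounterRace.run(4, 100000)` the atomic counter holds exactly 400000, while the plain counter can end up lower because concurrent increments overwrite each other. Nothing in the source text marks the unsafe line as dangerous.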
This seeming simplicity creates the illusion that parallel programming is very simple, but unfortunately it is not. As far as I know, there are no parallelism models that do for parallel programs what garbage collection did for ordinary ones (leaving aside purely functional languages, where execution can be parallelized automatically without distracting the programmer).

If we cannot solve a problem at the level of proper language design, we have to provide support with static and dynamic verifiers. That is why I consider detecting deadlocks and race conditions in programs one of the most important tasks of modern verifiers.

You have read our articles "20 issues of porting C++ code on a 64-bit platform", "32 OpenMP traps for C++ developers" and others. Could you evaluate and comment on them?

Thank you very much for the links to these articles, I liked them. I'll send these links to my colleagues. I think this information will be very useful for many of them, especially the young and inexperienced ones. When I worked at Digital Equipment I came across the first 64-bit systems (DEC Alpha). The projects were mainly connected with porting (from VMS/VAX to Digital Unix/Alpha and from RSX/PDP to OpenVMS/Alpha). So we faced for ourselves all those problems of porting to a platform with a different word size that you describe in your articles. The case was even more complicated because Alpha required strict data alignment.

Have you looked at the demo versions of Viva64 and VivaMP? What would you advise to make them more popular? Which ways of promoting them on the market could be successful, in your opinion?

I haven't yet looked at the Viva64 and VivaMP tools themselves, but I promise that I will. But from my experience of working on the verifier at TogetherSoft/Borland I can say, or even warn, that, as with any commercial product, a verifier consists of roughly 10% interesting ideas and algorithms and 90% rather boring things without which, unfortunately, a commercial product cannot exist:

  • Integration with many (ideally all) popular development tools. This is rather difficult, as it concerns not only the interface but also skilful handling of the program's original internal representation (AST) so that the verifier integrates fully into the IDE.
  • Standalone mode (one's own parser, report generator etc.).
  • Ability to work incrementally.
  • Autofixes (the ability to correct simple errors automatically).
  • Generation of various reports and diagrams, export to Excel etc.
  • Integration with automated build systems.
  • Examples (for each message there should be one simple and clear example in each of the supported languages).
  • Documentation. For each message there should be interactive help explaining why the message was shown. Besides, you need a user's guide.
  • Detailed and convenient "target designation". For example, if we say that a deadlock can occur here, we should show the user the whole possible path by which the deadlock arises.
  • Scaling. The verifier should be able to process a project of several million lines in a reasonable time.
  • You need a good site with a forum, a blog by one of the leading developers, and regularly updated topical information. For example, at one time C-lint published an advertisement in every issue of Dr. Dobb's magazine with an example of a C-lint error. It looked like a puzzle and really drew the public's attention to the product.

All in all, as everywhere, it turns out that it is rather easy to write a product (one person can do it in several months). But to turn it into a real commercial product you need much more effort, and of a not very creative or interesting kind. And to be able to sell it and learn how to make money from it, you need quite a different talent.

Thank you very much for the conversation and for your interesting and informative answers.

Thank you. I was glad to talk to you too.

Conclusion

We would like to thank Konstantin once again for the interview and to ask for his permission to publish this material on the Internet. We find many of his pieces of advice very useful and will certainly try to implement them in our static analysis products.

References

 1. Konstantin Knyzhnik's homepage. http://www.viva64.com/go.php?url=146
 2. JLint description. http://www.viva64.com/go.php?url=147
 3. OOO "Program Verification Systems" site. http://www.viva64.com
 4. Konstantin Knyzhnik. Creation of multithread applications in Java (RU). http://www.viva64.com/go.php?url=148
 5. Askar Rahimberdiev, Konstantin Knyzhnik, Igor Abramov. The static analyzer of errors in java-programs (RU). http://www.viva64.com/go.php?url=149
 6. I.V. Abramov, S.E. Gorelkin, E.A. Gorelkina, K.A. Knyzhnik, A.M. Rahimberdiev. Experience of developing a static analyzer for searching errors in Java-programs. // Informational technologies and programming: Intercollege article collection. Issue 2 (7). M.: MSIU, 2003. 62 pp.
 7. Knizhnik, Konstantin. "Reflection for C++." The Garret Group. 4 Nov. 2005. http://www.viva64.com/go.php?url=150
 8. Andrey Karpov, Evgeniy Ryzhkov. 20 issues of porting C++ code on the 64-bit platform. http://www.viva64.com/art-1-2-599168895.html
 9. Alexey Kolosov, Andrey Karpov, Evgeniy Ryzhkov. 32 OpenMP traps for C++ developers. http://www.viva64.com/art-3-2-1023467288.html