Compiler Ggcc


The presentation will start by summarizing some results of the Eureka/ITEA project GGCC (Global GNU Compiler Collection) where Julio collaborated in the design of an open platform for coding rule validation.
It then elaborates on the different connections between formal techniques, in a broad sense, and open source software development.

Finally, I will discuss how these examples lead naturally to the emerging concept of a semantic forge.

    Compiler Ggcc Presentation Transcript

    • The Eureka/ITEA Global GCC Project
      Julio Mariño (joint work with Guillem Marpons and others)
      Babel Research Group, Universidad Politécnica de Madrid
      FOSSA09, Grenoble
      Mariño et al. (UPM), Global GCC, FOSSA, November 2009
    • Overview
      1 Project Overview
      2 Coding Rule Validation
        • Structural Rule Validation
        • Domain-specific language: CRISP
      3 The need for static analysis
      4 Lessons learned
      5 The way ahead
    • Context: The Global GCC Project (2006–2008)
      • ITEA-labeled consortium of industrial and research partners
        • Industrial: Mandriva, Bertin, Telefonica I+D, small/medium-sized companies
        • Research labs: INRIA, CEA-LIST, UPM
      • Goal: make the GNU Compiler Collection (GCC) more attractive to the (European) software industry by transferring academic results in three areas:
        • Project-wide static analysis
        • Global optimization
        • Minimising programming hazards by means of coding rules
      • Global GCC knowledge base: integrates heterogeneous information provided by the different components of GGCC
      • http://www.ggcc.info
    • Coding Rules: Definition
      • Coding rules constrain the admissible constructs of a language to help produce more reliable and maintainable code.
      • Standard coding rule sets do exist, e.g.:
        • High-Integrity C++ (HICPP): general C++ applications
        • MISRA-C (C language): automotive industry / embedded systems
      • Many organisations need to write their own rule sets or adapt existing ones.
    • Coding Rules: Some Actual Examples
      • “Do not call the malloc() function” (MISRA-C 20.4)
      • “Do not use the ‘inline’ keyword for member functions” (HICPP 3.1.7)
      • “Expressions that are effectively Boolean should not be used as operands to operators other than (&&, || and !)” (MISRA-C 12.6)
      • “If a virtual function in a base class is not overridden in any derived class, then make it non virtual” (HICPP 3.3.6)
      • “All automatic variables shall have been assigned a value before being used” (MISRA-C 9.1)
      • “Behaviour should be implemented by only one member function in a class” (HICPP 3.1.9)
    • Rule Conformance Checking
      Problems with current approaches:
      • Rules are specified in natural language:
        • Ambiguity
        • Automatic checking hindered
      • Closed tools:
        • Lack of extensibility
      Proposed solution:
      • Define a logic-based language that allows for precisely specifying rule sets such as MISRA-C or HICPP
      • Use logic programming to get an automatic rule conformance checking procedure
      • Integrate information provided by different program analyses
    • Other Tools
      • Proprietary tools:
        • Compilers: IAR Systems (C)
        • QA: Parasoft, Klocwork, Coverity, Semmle Code (Java)
      • Free software:
        • Checkstyle (Java)
        • Gendarme (ECMA CIL, Mono and .Net)
      • Drawbacks:
        • Lack of appropriate extensibility mechanisms
        • Ambiguity in natural language
        • Interoperability is difficult
    • Motivation: C++ “Strange” Behavior

      class A {
      public:
        A();
        virtual void func();
      };

      A::A() {
        func();
      }

      class B : public A {
      public:
        B() : A() {}
        virtual void func();
      };

      B* d = new B();  // A::func or B::func?

      Coding Rule: “Do not invoke virtual methods of the declared class in a constructor or destructor.”
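The question in the comment can be answered by running a small variant of the slide's classes. The sketch below is an illustration, not the authors' code: the `log` member and the helper `construct_b_trace` are hypothetical additions used only to record which override runs. While the `A` constructor executes, the object's dynamic type is still `A`, so the call resolves to `A::func`.

```cpp
#include <string>

// Hypothetical variant of the slide's classes A and B. The `log` string is an
// illustration device that records which override of func() actually runs.
struct A {
    std::string log;
    A() { func(); }                 // virtual call made inside a constructor
    virtual ~A() = default;
    virtual void func() { log += "A::func;"; }
};

struct B : A {
    B() : A() {}
    void func() override { log += "B::func;"; }
};

// Returns the trace of func() calls made while a B object is constructed.
std::string construct_b_trace() {
    B b;
    return b.log;
}
```

Calling `construct_b_trace()` yields `"A::func;"`: the override in `B` is never reached during construction, which is the surprise the coding rule guards against.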
    • C++ “strange” behavior (2)

      class Base {};

      class Derived : public Base {
      public:
        ~Derived() {}
      };

      void foo() {
        Derived* d = new Derived;
        delete d;  // correctly calls derived destructor
      }

      void boo() {
        Derived* d = new Derived;
        Base* b = d;
        delete b;  // problem! does not call derived destructor!
      }

      Rule HICPP 3.3.2: “Write a ‘virtual’ destructor for base classes.”
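A minimal sketch of the fix that rule HICPP 3.3.2 asks for, under the assumption that the base class is given a virtual destructor. The log parameter and the helper `delete_through_base` are hypothetical and exist only to make the destruction order observable. (In standard C++, deleting a derived object through a base pointer whose destructor is not virtual is undefined behavior, so only the corrected version is exercised here.)

```cpp
#include <string>

// Hypothetical variant of the slide's Base/Derived with the rule applied:
// Base declares a virtual destructor. Each destructor appends to a shared log.
struct Base {
    explicit Base(std::string& log) : log_(log) {}
    virtual ~Base() { log_ += "~Base;"; }
protected:
    std::string& log_;
};

struct Derived : Base {
    explicit Derived(std::string& log) : Base(log) {}
    ~Derived() override { log_ += "~Derived;"; }
};

// Mirrors boo() from the slide: the object is deleted through a Base pointer.
std::string delete_through_base() {
    std::string log;
    Base* b = new Derived(log);
    delete b;   // dispatches to ~Derived first, then ~Base
    return log;
}
```

With the virtual destructor in place, `delete_through_base()` returns `"~Derived;~Base;"`, i.e. both destructors run in the expected order.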
    • Example: Rule Formalisation
      Rule HICPP 3.3.15: “Ensure base classes common to more than one derived class are virtual”

      violate_hicpp_3_3_15(a, b, c, d) ←
          b ≠ c ∧ direct_base_of(a, b) ∧ direct_base_of(a, c)
          ∧ base_of(b, d) ∧ base_of(c, d) ∧ ¬virtual_base_of(a, c)

      Rules are specified in an enriched LP language with disequality, quantifiers, constructive negation and sorts.
    • Example: Extraction of Program Information and Search of Violations
      Rule HICPP 3.3.15 in Prolog:

      violate_hicpp_3_3_15(A, B, C, D) :-
          class(B), class(C), B \= C, class(D), class(A),
          direct_base_of(A, B), direct_base_of(A, C),
          base_of(B, D), base_of(C, D),
          \+ virtual_base_of(A, C).

      Extracted program facts:

      class('::Animal').
      class('::WingedAnimal').
      class('::Mammal').
      class('::Bat').
      direct_base_of('::Animal', '::Mammal').
      direct_base_of('::Animal', '::WingedAnimal').
      direct_base_of('::Mammal', '::Bat').
      direct_base_of('::WingedAnimal', '::Bat').
      virtual_base_of('::Animal', '::Mammal').
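To make the search for violations concrete, here is a rough C++ re-implementation of the query over the same facts. It is only a sketch: the project itself runs the rule in Prolog, and the slide does not define `base_of`, so the code assumes it is the transitive closure of `direct_base_of`. All type and function names (`Edge`, `Violation`, `transitive_closure`, `violations_3_3_15`, `slide_example`) are hypothetical.

```cpp
#include <set>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

using Edge = std::pair<std::string, std::string>;   // (base, derived)
using Violation = std::tuple<std::string, std::string, std::string, std::string>;

// Assumption: base_of is the transitive closure of direct_base_of.
std::set<Edge> transitive_closure(const std::set<Edge>& direct) {
    std::set<Edge> closure = direct;
    bool changed = true;
    while (changed) {
        changed = false;
        const std::set<Edge> snapshot = closure;
        for (const Edge& x : snapshot)
            for (const Edge& y : snapshot)
                if (x.second == y.first &&
                    closure.insert(Edge(x.first, y.second)).second)
                    changed = true;
    }
    return closure;
}

// Enumerates all (A, B, C, D) that satisfy the body of the rule.
std::vector<Violation> violations_3_3_15(const std::set<Edge>& direct_base_of,
                                         const std::set<Edge>& virtual_base_of) {
    const std::set<Edge> base_of = transitive_closure(direct_base_of);
    std::vector<Violation> out;
    for (const Edge& ab : direct_base_of) {
        for (const Edge& ac : direct_base_of) {
            if (ab.first != ac.first) continue;      // same base A
            if (ab.second == ac.second) continue;    // B \= C
            if (virtual_base_of.count(ac)) continue; // \+ virtual_base_of(A, C)
            for (const Edge& bd : base_of)
                if (bd.first == ab.second &&
                    base_of.count(Edge(ac.second, bd.second)))
                    out.emplace_back(ab.first, ab.second, ac.second, bd.second);
        }
    }
    return out;
}

// The facts listed on the slide.
std::vector<Violation> slide_example() {
    std::set<Edge> direct;
    direct.insert(Edge("::Animal", "::Mammal"));
    direct.insert(Edge("::Animal", "::WingedAnimal"));
    direct.insert(Edge("::Mammal", "::Bat"));
    direct.insert(Edge("::WingedAnimal", "::Bat"));
    std::set<Edge> virt;
    virt.insert(Edge("::Animal", "::Mammal"));
    return violations_3_3_15(direct, virt);
}
```

On the slide's facts this finds a single violation, with A = `::Animal`, B = `::Mammal`, C = `::WingedAnimal`, D = `::Bat`: `::Animal` is a virtual base of `::Mammal` but not of `::WingedAnimal`.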
    • Proposed Approach
      1 Formalize rules in a logic-based specification language that is executable: CRISP
      2 Use GCC ?? for gathering necessary program information
    • Our Rule Checking Procedure
      1 Coding rule(s) written once in the logic-based formalism
      2 Extract program information (+ analysis information if available) using GCC, and store it in Prolog
      3 Search (using a Prolog engine) for a counterexample
      Diagram: Coding rules (in English) → Coding rules formalized in CRISPC++ → Coding rule compiler → Coding rules compiled into Prolog; C++ project source files → g++’ (project build) → Project facts in Prolog; both feed the Ciao Prolog engine → Rule violations report
    • CRISP Building Blocks 1: Sorts
      • Variable, DataMember, LocalVariable
      • Function, MemberFunction, Constructor
      • Type, PointerType, Record
      • Scope, Namespace, Record, CompoundStatement
      • Operator
      • ArgumentTypeInFunctionType
      • ClassMember
      • Thing
    • CRISP Building Blocks 2: (Binary) Relations
      • Function calls Function
      • Record hasImmediateBase Record
      • Variable hasType NonFunctionType
      • Function hasType FunctionType
      • Thing isDefinedIn Scope
      • Scope isNestedIn Scope
      • Record hasMember MemberFunction
      • Record hasMember DataMember
      • Record hasBase Record
      • Record isPrivateBaseOf Record
      • Record isVirtualBaseOf Record
      • PointerType hasPointedType Type
      • FunctionType hasReturnType Type
      • Record hasFriend Record
      • Record hasFriend MemberFunction
      • ClassMember hasVisibility Visibility
    • Example of Rule Formalization
      Rule HICPP 3.3.13: “Do not invoke virtual methods of the declared class in a constructor or destructor.”

      rule HICPP 3.3.13
        violated by Caller : MemberFunction; Callee : VirtualFunction
        when exists R : Record such that
          ( R hasMember Caller and R hasMember Callee
            and ( Caller is Constructor or Caller is Destructor )
            and Caller calls+ Callee ) .
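The `calls+` in the rule denotes one or more call steps, so indirect calls count as well. The following C++ sketch evaluates the rule over a small hand-written fact base; it is an illustration of the rule's meaning, not the CRISP implementation, and the `Function` record, the tables, and the example facts are all hypothetical.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified fact about one member function.
struct Function {
    std::string record;     // the class R with "R hasMember F"
    bool is_ctor_or_dtor;   // F is Constructor or F is Destructor
    bool is_virtual;        // F is VirtualFunction
};

using FunctionTable = std::map<std::string, Function>;
using CallGraph = std::map<std::string, std::set<std::string>>;
using CallPair = std::pair<std::string, std::string>;   // (Caller, Callee)

// All functions reachable from `start` through one or more calls (calls+).
std::set<std::string> reachable(const CallGraph& calls, const std::string& start) {
    std::set<std::string> seen;
    std::vector<std::string> stack{start};
    while (!stack.empty()) {
        const std::string f = stack.back();
        stack.pop_back();
        const auto it = calls.find(f);
        if (it == calls.end()) continue;
        for (const std::string& g : it->second)
            if (seen.insert(g).second) stack.push_back(g);
    }
    return seen;
}

// Pairs (Caller, Callee) that violate the rule.
std::vector<CallPair> violations_3_3_13(const FunctionTable& functions,
                                        const CallGraph& calls) {
    std::vector<CallPair> out;
    for (const auto& caller : functions) {
        if (!caller.second.is_ctor_or_dtor) continue;
        for (const std::string& callee_name : reachable(calls, caller.first)) {
            const auto callee = functions.find(callee_name);
            if (callee == functions.end()) continue;
            if (callee->second.is_virtual &&
                callee->second.record == caller.second.record)
                out.emplace_back(caller.first, callee_name);
        }
    }
    return out;
}

// Made-up facts: the constructor reaches the virtual function indirectly.
std::vector<CallPair> indirect_call_example() {
    FunctionTable functions;
    functions["A::A"] = Function{"A", true, false};
    functions["A::init"] = Function{"A", false, false};
    functions["A::func"] = Function{"A", false, true};
    CallGraph calls;
    calls["A::A"].insert("A::init");
    calls["A::init"].insert("A::func");
    return violations_3_3_13(functions, calls);
}
```

In the example the constructor `A::A` never calls `A::func` directly, yet the pair (`A::A`, `A::func`) is reported because the call goes through `A::init`.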
    • Formalization of Rule HICPP 3.3.2
      Rule HICPP 3.3.2: “Write a ‘virtual’ destructor for base classes.”

      rule HICPP 3.3.2
        violated by C : Record
        when exists C’ such that C’ hasBase C
          and not exists VD : Destructor such that
            ( VD isDefinedIn C and VD is VirtualFunction ) .
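Read operationally, the rule flags every class that is used as a base by some other class and does not itself define a virtual destructor. A minimal C++ sketch of that check follows, assuming a simplified fact record; `Record`, `violations_3_3_2` and `slide_records` are hypothetical names, and, like the formalization, the check only looks at destructors defined in the class itself.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified fact about one class.
struct Record {
    std::string name;
    std::vector<std::string> bases;   // "C' hasBase C" for every listed C
    bool has_virtual_destructor;      // a virtual destructor is defined in the class
};

// Names of the classes that violate the rule, in input order.
std::vector<std::string> violations_3_3_2(const std::vector<Record>& records) {
    std::set<std::string> used_as_base;
    for (const Record& r : records)
        used_as_base.insert(r.bases.begin(), r.bases.end());

    std::vector<std::string> out;
    for (const Record& r : records)
        if (used_as_base.count(r.name) && !r.has_virtual_destructor)
            out.push_back(r.name);
    return out;
}

// Facts for the Base/Derived pair from the earlier destructor example.
std::vector<Record> slide_records() {
    std::vector<Record> records;
    records.push_back(Record{"Base", {}, false});
    records.push_back(Record{"Derived", {"Base"}, false});
    return records;
}
```

For these facts only `Base` is reported: `Derived` has no virtual destructor either, but no class derives from it.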
    • Auxiliary Sorts and Relations

      relation F : Function overloads F’ : Function
        when exists S : Scope ; N : String such that
          ( F isDefinedIn S and F’ isDefinedIn S
            and F hasUnqualifiedName N and F’ hasUnqualifiedName N
            and F ≠ F’ ) .

      sort M : ClassMember is PrivateClassMember
        when exists V : Visibility such that
          ( M hasVisibility V and V is ‘private’ ) .
    • Experimental Results

      Project    KLOC   Load time   # Violations (checking time)
                                    3.3.1      3.3.2        3.3.11       3.3.15
      Bacula       20      0.24     0 (0.0)    3 (0.0)      0 (0.0)      0 (0.0)
      CLAM         46      1.62     1 (0.0)    15 (0.5)     115 (0.1)    0 (0.2)
      Firebird    439      2.61     16 (0.0)   60 (1.0)     115 (0.2)    0 (0.3)
      IT++         39      0.42     0 (0.0)    6 (0.0)      12 (0.0)     0 (0.0)
      OGRE        209      3.05     0 (0.0)    15 (0.9)     79 (0.2)     0 (0.3)
      Orca         89      1.17     1 (0.0)    12 (0.4)     0 (0.1)      0 (0.2)
      Qt          595     10.42     15 (0.0)   75 (10.5)    1155 (1.3)   4 (1.2)

      All times expressed in seconds.
    • Work in Progress
      1 Implement / enrich the CRISP language
      2 Implement more rules with information given by other tools
      3 Open our abstract representation of programs to external tools
    • Implement / enrich the CRISP language
      • Quantification and true negation needed
        • Both performed over certain domains (sorts)
        • Infinite domains may appear with templates / generics
        • We have an implementation of constructive intensional negation
      • Goals automatically reordered
      • Extend CRISP to other languages: Java, Ada, C, Fortran, ...
    • Integration of Information from External Analyzers
      Diagram (first stage): Coding rules (in English) → Coding rules formalized in CRISPC++ → Coding rule compiler → Coding rules compiled into Prolog; C++ project source files → g++’ (project build) → Project facts in Prolog; Ciao Prolog engine → Rule violations report
      Diagram (second stage): an External Analyzer is added next to g++’ (project build); their results go through a Translation step into a Knowledge Base about the compiled program, which the Ciao Prolog engine queries to produce the Rule violations report
    • Example of New Relation that Needs Specific Analysis

      relation F : MemberFunction maySelfCall G : MemberFunction
        when ( exists C : Record ; L : ProgramLocation such that
                 ( C hasMember F and C hasMember G and F = G
                   and F hasProgramLocation L and G calledOn L
                   and L mayAlias ’this’ ) )
             or F mustSelfCall G .
    • Example of Rule that Needs Specific Analysis (1)
      Rule HICPP 3.4.2: “Do not return non-const handles to class data from const member functions”

      rule HICPP 3.4.2
        violated by F : ConstMemberFunction
        when exists C : Record; L : ProgramLocation;
                    A : PrivateDataMember; P : PointerType such that
          ( A hasType P and not P is ConstType
            and C hasMember A and C hasMember F
            and F returns L and L mayAlias A ) .
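A short C++ illustration of the kind of code this rule is meant to catch. The class `Counter` and the helper `mutate_through_const` are made up for this sketch: `handle()` is a const member function that returns a private, non-const pointer member, so a caller can change the object's data even through a `const` object.

```cpp
// Hypothetical class: value_ is a private data member of non-const pointer type.
class Counter {
public:
    explicit Counter(int start) : value_(new int(start)) {}
    ~Counter() { delete value_; }
    Counter(const Counter&) = delete;
    Counter& operator=(const Counter&) = delete;

    int* handle() const { return value_; }        // violates HICPP 3.4.2
    const int* view() const { return value_; }    // conforming alternative
    int get() const { return *value_; }

private:
    int* value_;
};

// Shows that data of a const Counter can be changed through handle().
int mutate_through_const(int start) {
    const Counter c(start);
    *c.handle() += 1;   // compiles: only the pointer is const, not the pointee
    return c.get();
}
```

`mutate_through_const(41)` returns 42. Detecting this statically requires knowing that the returned location may alias the data member, which is why the rule depends on the `mayAlias` relation supplied by an external analysis.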
    • Example of Rule that Needs Specific Analyses (2)
      Rule HICPP 3.2.5: “Ensure destructors release all objects owned by the object”

      rule HICPP 3.2.5
        violated by D : Destructor
        when exists C : Record; A : DataMember;
                    F : MemberFunction; L : ProgramLocation such that
          ( C hasMember D and C hasMember A
            and not D releases A
            and L isFreshLocationIn F and A mayPointTo L
            and not exists G : MemberFunction such that
              ( C hasMember G and not A mustBeLinkedFromHeapIn G ) ) .
    • New Relations
      • ProgramLocation mayPointTo AbstractMemoryLocation
      • ProgramLocation mustPointTo AbstractMemoryLocation
      • ProgramLocation mayAlias ProgramLocation
      • ProgramLocation mustAlias ProgramLocation
    • Lessons learned: go out & meet people
      • Industrial projects are different, but there is a whole world of problems to solve out there.
      • Take advantage of European instruments to get in contact with the industry; the overall impression with ITEA was quite positive.
      • Do not try to include your own research agenda in the proposal, that will not work!
      • ...but it can work in the opposite direction:
        • DESAF10S (2010–2012), Spanish Ministry of Science and Innovation
        • PROMETIDOS (2010–2013), Madrid Regional Government / European Social Fund
        • A PhD on its way!
    • Lessons learned: be open, in several ways if possible
      • Adding the open source label to your project proposal may be beneficial, but try to avoid the obvious, naive arguments.
      • Global GCC exemplified the benefits of openness in several aspects:
        • The GCC suite itself, as a vehicle for efficient transfer of advanced compilation techniques to European industry, reducing its dependence on external proprietary solutions.
        • Our proposal for an extensible platform for coding rule specification and validation is itself open source, in the sense that specs are code that can be shared and enhanced by a new market of potential users.
        • This is only possible thanks to a variety of existing static analysers and tools (e.g. CIAO) from academia that are already distributed under open source licenses.
    • Lessons learned: keep your ears open for unexpected applications
      • Coding rules for COBOL and beyond...
      • Tools for semi-automatic refactoring
      • Better source code searches at Google
      • SAFE-GCC: NXP, Trimedia...
    • Lessons learned: some negative bits...
      • The GNU Compiler Collection itself may be a problem, sometimes, due to an obsolete architecture
      • Issues with copyright transfer to the FSF
      • Multiplicity of languages has been a problem as well (i.e. multiple front-ends)
      • Do not try to solve all the problems of our planet... Get focused!
      • Read the small print: national issues concerning European projects, etc.
    • The way ahead: current state of affairs
      Preliminary conclusions:
      • Clean (declarative) semantics given to potentially ambiguous coding rules by means of (extended) logic programming
      • A number of rules implemented using plain Prolog
      • Rule violations found in highly regarded C++ projects!
      • Checker: little resource (memory and time) consumption
      Future work:
      • Complete the definition of a highly expressive language aimed at specifying rules, and of a translation scheme into efficient Prolog
      • Connect the framework with other parts of the GGCC project
      • Improve performance of the overall checking procedure
      http://www.ggcc.info
    • The way ahead: a research agenda
      • Focus on tools
      • Do not miss reliability of open software as a real issue!
      • Bring semantics to open source software development:
        • type systems
        • description logics (ontologies, etc.)
        • static program analysis (abstract interpretation, model checking, etc.)
        • programming language design (DSLs, concurrency...)
      • The future is... SF
        • searching sources based on types (Foogle)
        • ontology powered semantic desktops (Nepomuk)
        • coherent management of packages (Mancoosi)
        • automatic discovery and composition of sw (AMOS, EZweb)
        • safe composition of components
        • etc.