Your SlideShare is downloading. ×
0
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Parsing and Type checking all 2^10000 configurations of the Linux kernel

683

Published on

In many projects, lexical preprocessors are used to manage different configurations of the project with conditional compilation. Unfortunately, while being a simple way to implement variability, …

In many projects, lexical preprocessors are used to manage different configurations of the project with conditional compilation. Unfortunately, while being a simple way to implement variability, conditional compilation and lexical preprocessor macros hinder automatic analysis, even though such analysis is urgently needed to combat variability-induced complexity. To analyze code with its variability, we need to parse it without preprocessing it. However, current parsing solutions use unsound heuristics, support only a subset of the language, or suffer from exponential explosion. We introduce a novel variability-aware parser that can parse unpreprocessed code without heuristics. On top of parsing, which detects syntax errors in all configurations, we have constructed a variability-aware type system and module system that additionally detect other compiler-time errors. With this infrastructure, we are in the process of checking the entire Linux kernel with 10000 compile-time configuration options and, hence, up to 2^10000 configurations for syntax and type errors.

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
683
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
7
Comments
0
Likes
0
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Parsing and Type checking all 210000 configurations of the Linux kernel 10,000 features, 6 million lines of C code Christian Kästner
  • 2. Feature-OrientedProduct Lines
  • 3. Database Engine
  • 4. PrinterFirmware
  • 5. LinuxKernel
  • 6. 7
  • 7. Software Product LinesBoeingBosch Group in IndustryCummins, Inc.EricssonGeneral DynamicsGeneral MotorsHewlett PackardLockheed MartinLucentNASANokiaPhilipsSiemens…
  • 8. Variability ~ Complexity
  • 9. 33 features optional, independenta unique configuration for everyperson on this planet
  • 10. 320 features optional, independent more configurations than estimated atoms in the universe
  • 11. Correctness?
  • 12. Product-Line Implementation
  • 13. static int _rep_queue_filedone( Excerpt from DB_ENV *dbenv, Oracle’s Berkeley DB REP *rep, __rep_fileinfo_args *rfp) {#ifdef NO_QUEUE COMPQUIET(rep, NULL); COMPQUIET(rfp, NULL); return (__db_no_queue_am(dbenv));#else db_pgno_t first, last; u_int32_t flags; int empty, ret, t_ret;#ifdef DIAGNOSTIC DB_MSGBUF mb;#endif // over 100 lines of add. code}#endif Conditional Compilation
  • 14. static int _rep_queue_filedone( Excerpt from DB_ENV *dbenv, Oracle’s Berkeley DB REP *rep, __rep_fileinfo_args *rfp) {#ifdef NO_QUEUE COMPQUIET(rep, NULL); COMPQUIET(rfp, NULL); return (__db_no_queue_am(dbenv));#else db_pgno_t first, last; u_int32_t flags; int empty, ret, t_ret; #ifdef X#ifdef DIAGNOSTIC DB_MSGBUF mb;void foo();#endif #endif // over 100 lines of add. code} void bar() {#endif foo(); } Conditional Compilation
  • 15. Objections / Criticism Designed in the 70th and hardly evolved since “#ifdef considered harmful” “#ifdef hell” “maintenance becomes a ‘hit or miss’ process” “is difficult to determine if the code being viewed is actually compiled into the system” “incomprehensible source texts” “programming errors are easy “CPP makes maintenance difficult” to make and difficult to detect” “source code rapidly“preprocessor diagnostics are poor” becomes a maze”
  • 16. [ICSE’08,ASE’08,Tools‘09,GPCE’09,EASE‘11,..] ViewsVisual representation Disciplined mappingConsistency checking Refactorings … open source, fosd.net/cide Virtual Separation of Concerns
  • 17. 10,000 features, 6 million lines of C code
  • 18. [ICSE’10, AOSD‘11]40 Open-Source C Projects apache, berkely db, cherokee, clamav, dia, emacs, freebsd, gcc,Number of features ghostscript, gimp, glibc, gnumeric, gnuplot, irssi, libxml, lighttpd, linux, lynx, minix, mplayer, mpsolve, openldap, opensolaris, openvpn, parrot, php, pidgin, postgresql, privoxy, python, sendmail, sqlite, subversion, sylpheed, tcl, vim, xfig, xine-lib, xorg-server, xterm Lines of code
  • 19. Correctness?
  • 20. PrinterFirmware
  • 21. Checking Products 2000 Features 100 Printers 30 New Printers per YearPrinterFirmware
  • 22. Checking Products 10000 Features 210000 ConfigurationsLinuxKernel
  • 23. Checking Product LineImplementation with 10000 Features + GeneratorLinuxKernel
  • 24. Variability-Aware Analysis Parser Type System Static Analysis Bug Finding Testing Model Checking Theorem Proving …
  • 25. Product GenerationVariability-Aware ConventionalAnalysis Analysis
  • 26. We aim for a sound and complete approach
  • 27. Variability-Aware Analysis Parser Type System Static Analysis Bug Finding Testing Model Checking Theorem Proving …
  • 28. References Conflicts
  • 29. Presence Conditions WORLD BYEtrue true
  • 30. Reachability: pc(caller) -> pc(target) Conflicts: ¬(pc(def1) ˄ pc(def2)) ¬ (WORLD ˄ BYE) true -> (WORLD v BYE)true -> true
  • 31. Reachability: pc(caller) -> pc(target) Conflicts: ¬(pc(def1) ˄ pc(def2)) ¬ (WORLD ˄ BYE) Found 2 type errors: - [WORLD & BYE] file hello.c:8:8v BYE) true -> (WORLD redefinition of msgtrue [!WORLD & !BYE] file hello.c:11:8 - -> true msg undeclared
  • 32. P Variability Model: WORLD BYE VM ->¬ (WORLD ˄ BYE) VM -> (true -> (WORLD v BYE))VM -> (true -> true)
  • 33. AST with Variability Information true -> true greet.c …printf VWORLD VBYE main msg ε msg ε printf¬ (WORLD ˄ BYE) true -> (WORLD v BYE) msg WORLD BYE 35 Extended Lookup Mechanism
  • 34. [ASE’08, TOSEM‘11]Formalization: CFJTheorem (Product Generation Preserves Typing): Allproducts that are generated for valid feature selectionsfrom a well-typed product line are well typed.
  • 35. Product GenerationVariability-Aware ConventionalAnalysis Analysis
  • 36. Surface Complexity Inherent Complexity SAT Problem
  • 37. [TOSEM‘11]Product Line LOC Features Products Time per Time f. Product entire (sec) Product Line (sec)MobileMedia 5700 14 2784 0.3 2Mobile RSS 20 000 14 2048 1 8ReaderLampiro 45 000 11 2048 2 19Berkeley DB 70 000 42 3.6 billion 3 21
  • 38. [ICSE’10, AOSD‘11]40 Open-Source C Projects apache, berkely db, cherokee, clamav, dia,variable code in C files (in %) emacs, freebsd, gcc, ghostscript, gimp, glibc, gnumeric, gnuplot, irssi, libxml, lighttpd, linux, lynx, minix, mplayer, mpsolve, openldap, opensolaris, openvpn, parrot, php, pidgin, postgresql, privoxy, python, sendmail, sqlite, subversion, sylpheed, tcl, vim, xfig, xine-lib, xorg-server, xterm Lines of code
  • 39. Parse Type Check greet. c … VWORL printf VBYE main.c D msg ε msg ε printf Linker checks msg greet. c … VWORL printf VBYE main.c D msg ε msg ε printf msg greet. c … VWORL printf VBYE main D .c msg ε msg ε printf msg
  • 40. ChallengesReal-world C codeC preprocessorHuge sizeModule system / linker checks
  • 41. Variability-Aware Analysis Parser Type System Static Analysis Bug Finding Testing Model Checking Theorem Proving …
  • 42. [OOPSLA‘11] greet.c …printf VWORLD VBYE main msg ε msg ε printf msg AST with Variability Information
  • 43. Parsing C without Preprocessing
  • 44. Macro expansion Undisciplinedneeded for parsing annotations Alternative macros ? ? ? greet.c + printf VWORLD VBYE + main + msg ε + msg ε printf msg
  • 45. Previous SolutionsDisciplined Subset Requires Code PreparationHeuristics and Partial Analysis Inaccurate, False PositivesBrute Force Infeasible Effort
  • 46. TypeChef ( + 2 Variability-Aware * Variability-Aware 3 * VA Lexer ) + Parser 4A 5¬A 2 3 4 5 Variability-Aware Analysishttps://github.com/ckaestne/TypeChef
  • 47. [OOPSLA‘11] 4A (¬A 4¬A˄B +¬A˄B 6¬A )¬A true( 3 + ) 4¬A˄B +¬A˄B 6¬A 4A (¬A )¬A + 4¬A˄B +¬A˄B 6¬A3 VA 4 VB Library of + 6 Variability-Aware Parser Combinators 4 6 in Scala
  • 48. 7665 C files (x86) 0 353 included header files per C file 08590 distinct macros per C file 0 72 % conditional 0 30 seconds per file (median) 0 2.6.33.3 X86 0 syntax errors
  • 49. Type Checking 20 seconds per file 2.6.33.3 X86
  • 50. 511 files260.000 lines of C code 811 features 51 minutes parsing 6 minutes type checking 4 seconds linker checkingType Checking BusyBox
  • 51. //… skipped 260 linesstruct globals { double cur_time; //… skipped 11 lines#if ENABLE_FEATURE_NTPD_SERVER int listen_fd;#endif unsigned verbose; //… skipped 73 lines};//… skipped 1761 linesint ntpd_main(int argc UNUSED_PARAM, char **argv){#undef G struct globals G; //… skipped 81 lines if (i > (ENABLE_FEATURE_NTPD_SERVER && G.listen_fd != -1)) { … ntpd.c: 2128 } …} [CONFIG_NTPD && !CONFIG_FEATURE_NTPD_SERVER] field listen_fd unknown in struct globals
  • 52. FUTURE DIRECTIONS
  • 53. Correctness?
  • 54. Product GenerationVariability-Aware ConventionalAnalysis Analysis
  • 55. Variability-Aware Analysis Parser Type System Static Analysis Bug Finding Testing Model Checking Theorem Proving …
  • 56. Product-Line Evolution615 trillion config. 553 quintillion config.
  • 57. [GPCE’09; Grant Prop.]Reengineering Variability#ifdef plug-insparameters feature modulesbranches in VCS aspectsclones disciplined annotationsdomain knowledge runtime variability DisciplinedLegacy System refactoring Variability Implementation
  • 58. Compositional Approaches [ICSE’09, J.ASE’10, SCP’10, TSE’12]Base / Platformclass Stack { void push(Object o) { elementData[size++] = o; } ...} class Stack { void push(Object o) { Lock l = lock(o);Feature: Queue elementData[size++] = o;refines class Stack { Composition } l.unlock(); ... void push(Object o) { Lock l = lock(o); } Super.push(o); l.unlock(); } ...} Module ComponentsFeature: Diagnostic Frameworks, Plug-insaspect Diagnostics { ... Feature-Modules / Mixin Layers / …} Aspects / Subjects, Hyper/J, Deltas
  • 59. Predicting [SPLC’11, SQJ’11, ICSE’12]Nonfunctional Properties
  • 60. [AOSD’11 ESEM’11, EASE’11]Empirical methods& human factors
  • 61. Domain- SpecificLanguages: SugarJ [GPCE’11, OOPSLA’11] Runtime Updates for Java [APSEC’08, J. SP&E’11]
  • 62. Parsing and Type Checking all 210000 Configurations of the Linux Kernel greet.c 4 ( 4¬A˄B +¬A˄B 6¬A )¬A true … ( 3 + ) printf VWORLD VBYE main 4¬A˄B +¬A˄B 6¬A 4 ( )¬A 4¬A˄B +¬A˄B 6¬A msg ε msg ε printf msghttps://github.com/ckaestne/TypeChef
  • 63. TypeChef (Analyzing Real-World C Code) Virtual Separation of Concerns (Tool Support for Annotation-Based Variability Implementation) Aspect-Oriented Decomposition of Berkeley DB AOP Compiler Extensions M.Sc. Ph.D. Thesis Post Doc Thesis1982… 2006 2007 2008 2009 2010 2011 2012 Austin, Texas Magdeburg, Germany Marburg, Germany

×