Program Understanding: What Programmers Really Want
Einar W. Høst
May 5th 2011
Agenda
 Introduction: Understanding program understanding
 Description: How analysis can help
 Examples: Look at some well-known analyses
 Evaluation: Consider limitations of current tools
Program Analysis / Program Understanding
Why is program understanding important?
It’s what programmers do!
“To program is to understand”                       - Kristen Nygaard
Why is program understanding hard?
Size.
Complexity.
Heterogeneity.
Side-effects.
The concept assignment problem.
Intangibility.
=>
Disorientation.
“Few are the programmers who can explain their code well enough so that reading it is not incredibly frustrating.”                                      - Jef Raskin
How can program analysis help?
Program Understanding?
$s.=($f=$ARGV[$_%2])x(substr$s,$_,1or$f)for 0..500;
Program Understanding!
$s.=($f=$ARGV[$_%2])x(substr$s,$_,1or$f)for 0..500;
Through the magic lens of a Tool
What do programmers want?
“Help me in finding what needs to be changed...
...make the change...
...and get the !$&?/% out!”
Just-in-time program understanding.
Complete understanding is
 Not realistic
 Not cost-effective
 Not necessary
Understand the program well enough to implement a new feature
Understand the program well enough to fix a bug
Understand the program well enough to improve performance
Without change, there is no need for understanding!
Program understanding at the code level
 Understanding execution behavior
 Understanding control flow
 Understanding data flow
 Understanding dependencies
 Understanding side-effects
 Understanding change impact
 Reasoning about design
Program understanding at the task level
 Understanding what is relevant
Comprehension strategies
Top-down: Reconstruct knowledge about the program domain and map it to the source code.
Bottom-up: Read code statements and mentally group these statements into higher-level abstractions.
Systematic: Read code in detail, follow control flow, gain global understanding.
Opportunistic: Scan code, looking for clues indicating relevance to the task at hand.
Inquiries: conjecture, question, search, read.
Integrated: Understanding is formed by switching between different strategies as needed.
Tools for program understanding
Analysis / Presentation
Analysis tools
 Call graphs
 Program slicing
 Feature location
 Effects of change
 Software metrics
Presentation tools
 Visualisations
 Context-awareness
What is a program?
print "This is not a program."
print("This is not a program.")
The program as text
The program as binary
The program as versioned
The program as process
The program ecosystem
 Source code / Compiled binary
 Execution environment (runtime)
 Test suite
 Version control system
 Issue tracking system
 Integrated development environment
 The programmer
 External sources
What questions can we ask about a program?
Behavior-centric
 What does this piece of code do?
 What happens if I change this?
 Where is this code used?
 Which parts of the program are affected by change?
Requires program code
Evolution-centric
 Who wrote this piece of code?
 Who can help me understand this?
 Why was this feature implemented this way?
 What are the most bug-prone parts of the program?
 Where do changes happen most often?
Requires program history
Interaction-centric
 How was this piece of code written?
 Which tasks are hard to accomplish?
 Which parts of the program are hard to change?
Requires coding-session history
What is program analysis?
Traditional program analysis
All you need is code
All you need is code
All you need is code, code
Code is all you need
Static / Dynamic
Static: All possible executions
Static: Sound + conservative
Static: Trades precision for soundness
Dynamic: Sample executions
Dynamic: Efficient + precise
Dynamic: Trades completeness for efficiency
Static + Dynamic: Synergy
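The dynamic side of this trade-off can be illustrated with a minimal sketch that records which functions actually run for one sample input. The functions `f`, `g`, `h` and the helper `record_calls` are hypothetical names chosen for illustration; the sketch assumes CPython's `sys.setprofile` hook.

```python
import sys

def f(x):
    return x + 1

def h(x):
    return -x

def g(x):
    # Both branches are possible in general; a single run takes only one.
    return f(x) if x > 0 else h(x)

def record_calls(func, *args):
    """Run func(*args) and return the names of Python functions called."""
    seen = []

    def profiler(frame, event, arg):
        # "call" fires for Python-level function calls only.
        if event == "call":
            seen.append(frame.f_code.co_name)

    sys.setprofile(profiler)
    try:
        func(*args)
    finally:
        sys.setprofile(None)
    return seen

positive_run = record_calls(g, 1)    # observes g and f, never h
negative_run = record_calls(g, -1)   # observes g and h, never f
```

Each run is precise about what happened but says nothing about the branch it did not take. A static analysis of `g` would have to report both `f` and `h` as possible callees, which is sound but less precise for any particular execution.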
Program analysis for program understanding
Call graph construction
Essential idea: Show calling relationships between parts of a program.
Sample graph: [figure: call graph over the nodes main, f, g, h]
The ideal call graph: The relation describing exactly those calls made from one entity to another in any possible execution of the program.
Benefit: Make the program seem less fragmented by showing the control flow links between program parts.
Limitation: The graph becomes too large and complex to be comprehensible.
Limitation: “The sight of gcc's call graph frightened my students so much that they requested a different project”                                      - Arun Lakhotia
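As a rough illustration of static call graph construction, the following sketch extracts direct calls from Python source using the standard `ast` module. The node names `main`, `f`, `g`, `h` come from the sample graph, but the edges between them are invented here, and `build_call_graph` is a hypothetical helper. The sketch only sees calls to plain names, so it approximates the ideal call graph rather than computing it.

```python
import ast

# Hypothetical program; the calling relationships are assumptions.
SOURCE = """
def h():
    pass

def g():
    h()

def f():
    g()
    h()

def main():
    f()
    g()
"""

def build_call_graph(source):
    """Map each function name to the sorted names it calls directly."""
    tree = ast.parse(source)
    graph = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            callees = set()
            for inner in ast.walk(node):
                # Only direct calls such as name(...); method calls,
                # callbacks and dynamic dispatch are not resolved.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    callees.add(inner.func.id)
            graph[node.name] = sorted(callees)
    return graph

call_graph = build_call_graph(SOURCE)
```

Even for four functions the result already has five edges, which hints at why graphs for real programs quickly become too large to read.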
Program slicing
Essential idea: Find a reduced program exhibiting the same behavior of interest as the full program.
The reduced program is called a program slice.
Slicing criterion: C = <x, V>
 x : a statement in program P
 V : a subset of variables in P
Criterion: <12, z>
Program:
  1 begin
  2   read(x,y)
  3   total := 0.0
  4   sum := 0.0
  5   if x <= 1
  6   then sum := y
  7   else begin
  8     read(z)
  9     total := x*y
 10   end
 11   write(total, sum)
 12 end.
Slice:
  1 begin
  2   read(x,y)
  5   if x <= 1
  6   then
  7   else
  8     read(z)
 12 end.
Criterion: <12, x>
Program:
  1 begin
  2   read(x,y)
  3   total := 0.0
  4   sum := 0.0
  5   if x <= 1
  6   then sum := y
  7   else begin
  8     read(z)
  9     total := x*y
 10   end
 11   write(total, sum)
 12 end.
Slice:
  1 begin
  2   read(x,y)
 12 end.
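The two slices above can be reproduced with a small dependence-based sketch. It is a conservative, flow-insensitive approximation, not a full slicing algorithm: a statement is kept if it defines a needed variable or controls a kept statement, and definitions are never killed. The `Stmt` record, the hand-written `PROGRAM` table and `backward_slice` are hypothetical names, and the def/use sets were transcribed by hand from the sample program.

```python
from typing import NamedTuple, Optional

class Stmt(NamedTuple):
    line: int
    defs: frozenset                  # variables written by the statement
    uses: frozenset                  # variables read by the statement
    control: Optional[int] = None    # line of the predicate guarding it

# Hand-transcribed model of the sample program (lines 2..11).
PROGRAM = [
    Stmt(2, frozenset({"x", "y"}), frozenset()),
    Stmt(3, frozenset({"total"}), frozenset()),
    Stmt(4, frozenset({"sum"}), frozenset()),
    Stmt(5, frozenset(), frozenset({"x"})),
    Stmt(6, frozenset({"sum"}), frozenset({"y"}), control=5),
    Stmt(8, frozenset({"z"}), frozenset(), control=5),
    Stmt(9, frozenset({"total"}), frozenset({"x", "y"}), control=5),
    Stmt(11, frozenset(), frozenset({"total", "sum"})),
]

def backward_slice(program, line, variables):
    """Return sorted line numbers relevant to `variables` at `line`."""
    needed = set(variables)
    in_slice = set()
    changed = True
    while changed:                       # iterate to a fixpoint
        changed = False
        for stmt in program:
            if stmt.line >= line:
                continue
            if stmt.line not in in_slice and stmt.defs & needed:
                in_slice.add(stmt.line)
                changed = True
            if stmt.line in in_slice:
                # A kept statement needs its inputs and its guarding predicate.
                if not stmt.uses <= needed:
                    needed |= stmt.uses
                    changed = True
                if stmt.control is not None and stmt.control not in in_slice:
                    in_slice.add(stmt.control)
                    changed = True
    return sorted(in_slice)

slice_z = backward_slice(PROGRAM, 12, {"z"})
slice_x = backward_slice(PROGRAM, 12, {"x"})
```

The results list only the computational statements; the structural lines (`begin`, `then`, `else`, `end.`) shown in the slices are added back when the reduced program is printed.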
Forward conditioning: What would the program look like if we assume an initial state satisfying C?
Forward conditioning: Deletes statements that will not be executed given the initial state.
Backward conditioning: What would the program look like if we assume an eventual state satisfying C?
Backward conditioning: Deletes statements which cannot lead to the eventual state.
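Forward conditioning can be sketched on the same sample program under a strong simplification: the initial state is given as concrete input values rather than as a general condition, and the fixed variables are assumed never to be reassigned. The tuple encoding of `if` statements and the helper `condition_forward` are hypothetical.

```python
# ("if", var, limit, then_block, else_block) models "if var <= limit".
PROGRAM = [
    "read(x,y)",
    "total := 0.0",
    "sum := 0.0",
    ("if", "x", 1, ["sum := y"], ["read(z)", "total := x*y"]),
    "write(total, sum)",
]

def condition_forward(block, known):
    """Drop branches that cannot execute when the values in `known` hold."""
    result = []
    for stmt in block:
        if isinstance(stmt, tuple) and stmt[0] == "if":
            _, var, limit, then_block, else_block = stmt
            if var in known:
                # The predicate is decided, so only one branch survives.
                taken = then_block if known[var] <= limit else else_block
                result.extend(condition_forward(taken, known))
            else:
                # Undecided predicate: keep the test, condition both branches.
                result.append(("if", var, limit,
                               condition_forward(then_block, known),
                               condition_forward(else_block, known)))
        else:
            result.append(stmt)
    return result

small_x = condition_forward(PROGRAM, {"x": 0})
large_x = condition_forward(PROGRAM, {"x": 5})
```

With `x` fixed at 0 the `else` branch disappears, and with `x` fixed at 5 the `then` branch does. Backward conditioning would instead start from a required final state, which this sketch does not attempt.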
Benefit: The reduced program is smaller.
Limitation: The reduced program looks foreign.
Limitation: Hard to integrate into the programmer’s workflow.
Concept analysis
Essential idea: Identify groupings of objects that have common attributes.
Formal context: C = (O, A, R)
 O : Set of objects
 A : Set of attributes
 R : Relation R ⊆ O × A
Common attributes: σ(O) = {a ∈ A | ∀o ∈ O : (o, a) ∈ R}
Common objects: τ(A) = {o ∈ O | ∀a ∈ A : (o, a) ∈ R}
Definition of concept: A pair (O, A) is a concept if A = σ(O) and O = τ(A)
Feature location: O => Subprograms, A => Features
Feature location: (s, f) ∈ R if subprogram s is invoked when feature f is invoked
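The definitions of σ, τ and concept can be tried out on a tiny feature-location context. In this sketch σ and τ are applied to subsets of the objects and attributes. The subprogram names, feature names and the relation `R` are invented for illustration, and the brute-force enumeration in `concepts` is only reasonable for very small contexts.

```python
from itertools import combinations

OBJECTS = {"parse", "render", "save"}        # hypothetical subprograms
ATTRIBUTES = {"open", "export"}              # hypothetical features
R = {("parse", "open"), ("render", "open"),
     ("render", "export"), ("save", "export")}

def sigma(objs):
    """Attributes shared by every object in objs."""
    return frozenset(a for a in ATTRIBUTES if all((o, a) in R for o in objs))

def tau(attrs):
    """Objects that have every attribute in attrs."""
    return frozenset(o for o in OBJECTS if all((o, a) in R for a in attrs))

def concepts():
    """Enumerate all (objects, attributes) pairs that are concepts."""
    found = set()
    for size in range(len(OBJECTS) + 1):
        for subset in combinations(sorted(OBJECTS), size):
            attrs = sigma(subset)
            # (tau(sigma(X)), sigma(X)) is a concept for any object set X.
            found.add((tau(attrs), attrs))
    return found

all_concepts = concepts()
```

In this context `render` is the only subprogram invoked by both features, so it forms a concept with the full attribute set, while `parse` and `save` each appear only in the concept of their single feature.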
Benefit: Identify parts of the program relevant for a feature.
Benefit: Recovery of components and generation of high-level architecture views.
Limitation: Imperfect high-level descriptions have limited application for concrete tasks.
Limitation: Requires effort from the programmer, with unclear benefits.
Limitation: Hard to integrate into the programmer’s workflow.
Change impact analysis
Essential idea: Identify the potential consequences of a program change.
Essential idea: Estimate what must be modified to accomplish a change of behavior.
Challenges in OO languages: Subtyping and dynamic dispatch mean that change impact can be non-local and unexpected.
Atomic changes
Benefit: Provide a boundary around the effects of a program edit.
Benefit: Provide confidence that an edit does not have unexpected effects outside the boundary.
Limitation: Limited by the precision of the change impact analysis.
Limitation: Still need assurance that no unexpected effects occur in the impacted part of the program.
Limitation: Hard to integrate into the programmer’s workflow.
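One simple way to approximate such a boundary is to treat every transitive caller of an edited function as potentially impacted. The sketch below assumes a call graph is already available as a dictionary; the graph `CALLS` and the helper `impacted_by` are hypothetical. It ignores data flow and dynamic dispatch, so it is an illustration of the idea rather than a precise analysis.

```python
from collections import deque

# Hypothetical call graph: caller -> list of callees.
CALLS = {
    "main": ["f", "g"],
    "f": ["g", "h"],
    "g": ["h"],
    "h": [],
}

def impacted_by(calls, changed):
    """Return the changed function plus everything that can reach it."""
    # Invert the graph so we can walk from callee to callers.
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    impacted = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, ()):
            if caller not in impacted:
                impacted.add(caller)
                queue.append(caller)
    return impacted

impact_of_g = impacted_by(CALLS, "g")
impact_of_main = impacted_by(CALLS, "main")
```

An edit to `g` puts `f` and `main` inside the boundary but leaves `h` outside it. The programmer still has to check that nothing unexpected happens within the impacted set.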
Program metrics
Essential idea: Quantify aspects of the program presumed to be relevant.
Code-centric metrics
 Lines of code
 Depth of inheritance tree
 Internal cohesion
 Coupling to other elements
 Cyclomatic complexity
 Halstead complexity
 ...
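Two of the code-centric metrics are easy to approximate for Python source. The sketch below counts non-blank lines and estimates cyclomatic complexity as the number of decision points plus one, using the standard `ast` module. The choice of which node types count as decisions is an assumption of this sketch, and `SAMPLE`, `lines_of_code` and `cyclomatic_complexity` are hypothetical names.

```python
import ast

# Node types treated as decision points in this approximation.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)

SAMPLE = """
def classify(values):
    total = 0
    for v in values:
        if v > 0 and v % 2 == 0:
            total += v
    return total
"""

def lines_of_code(source):
    """Count non-blank lines."""
    return sum(1 for line in source.splitlines() if line.strip())

def cyclomatic_complexity(source):
    """Decision points + 1, counted over the whole source text."""
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # Each extra operand of and/or adds a short-circuit branch.
            decisions += len(node.values) - 1
    return decisions + 1

loc = lines_of_code(SAMPLE)
complexity = cyclomatic_complexity(SAMPLE)
```

For the sample, the `for`, the `if` and the `and` each contribute one decision point. The numbers say where to look, not what to do about it.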
Other metrics
 Test coverage
 Bug density
 Change rate
 ...
Benefit: Answer questions regarding program quality.
Benefit: Identify potential problem areas in the program.
Limitation: Metrics are indirect indicators of quality.
Limitation: Task-generating, not task-solving.
Software visualisation
Goal: Avoid overwhelming the programmer by compressing information and using graphics.
