SlideShare a Scribd company logo

Source code comprehension on evolving software

Sung Kim
Sung Kim

Yida's PQE

1 of 26
Download to read offline
Source Code Comprehension on Evolving Software:
A Literature Survey
Yida Tao
Supervisor: Sunghun Kim
1
Motivation
Code Change Comprehension
Tao et al., FSE’12
Code change comprehension is
• Frequently required
• In major development activities, in
particular the code-review process
• How do software engineers understand code changes? An exploratory study in industry. Tao et al., FSE’12
• Expectations, outcomes, and challenges of modern code review. Bacchelli and Bird, ICSE’13
Bacchelli & Bird, ICSE’13
• “…review and understand code they
have not seen before may be more
common that a developer working on
new code”
• “From interviews, no other code
review challenge emerged as clearly as
understanding the submitted change”
2
Outline
Program Differencing
Describing code changes
Code Change Summarization
Explaining code changes
Querying and Filtering
Customization
Code Change Comprehension
3
Program Differencing
4
 Text Differencing
 Syntactic Differencing
 Semantic Differencing
Text Differencing
 Flat representation of a program
 Sequence of strings
 Unix diff
 Only output added/deleted lines, can not detect modified lines
 Hard to determine when a code fragment is moved upward or downward
 Ldiff (Canfora et al., ICSE’09)
 An enhanced line differencing tool
 Limitations
 Changes to *characters*
 No syntactic-structure information
5
Syntactic Differencing
 Structured representation of a program
 Abstract syntax tree; XML
 ChangeDistiller (Fluri et al., TSE’07)
 Tree differencing
 Node: bigram string similarity
 Control structure: subtree similarity
 Output: tree edit script (insert, delete, move, update)
 XML differecing
 srcXML (Maletic & Collard, ICSM’04): embeds abstract syntax and structure
within the source code
 diffX (Al-Ekram et al., CASCON '05)
 Limitation
 Cannot describe how the behavior of a program is changed
 Still report differences for behavior-preserving changes
6

Recommended

A Survey on Automatic Software Evolution Techniques
A Survey on Automatic Software Evolution TechniquesA Survey on Automatic Software Evolution Techniques
A Survey on Automatic Software Evolution TechniquesSung Kim
 
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)Sung Kim
 
Crowd debugging (FSE 2015)
Crowd debugging (FSE 2015)Crowd debugging (FSE 2015)
Crowd debugging (FSE 2015)Sung Kim
 
Partitioning composite code changes to facilitate code review
Partitioning composite code changes to facilitate code reviewPartitioning composite code changes to facilitate code review
Partitioning composite code changes to facilitate code reviewYida Tao
 
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence LearningDeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence LearningSung Kim
 
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)Sung Kim
 
A Mono- and Multi-objective Approach for Recommending Software Refactoring
A Mono- and Multi-objective Approach for Recommending Software RefactoringA Mono- and Multi-objective Approach for Recommending Software Refactoring
A Mono- and Multi-objective Approach for Recommending Software RefactoringAli Ouni
 

More Related Content

What's hot

ICSME 2016: Search-Based Peer Reviewers Recommendation in Modern Code Review
ICSME 2016: Search-Based Peer Reviewers Recommendation in Modern Code ReviewICSME 2016: Search-Based Peer Reviewers Recommendation in Modern Code Review
ICSME 2016: Search-Based Peer Reviewers Recommendation in Modern Code ReviewAli Ouni
 
Software Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSoftware Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSung Kim
 
Using HPC Resources to Exploit Big Data for Code Review Analytics
Using HPC Resources to Exploit Big Data for Code Review AnalyticsUsing HPC Resources to Exploit Big Data for Code Review Analytics
Using HPC Resources to Exploit Big Data for Code Review AnalyticsThe University of Adelaide
 
Recommending Software Refactoring Using Search-based Software Enginnering
Recommending Software Refactoring Using Search-based Software EnginneringRecommending Software Refactoring Using Search-based Software Enginnering
Recommending Software Refactoring Using Search-based Software EnginneringAli Ouni
 
Review Participation in Modern Code Review: An Empirical Study of the Android...
Review Participation in Modern Code Review: An Empirical Study of the Android...Review Participation in Modern Code Review: An Empirical Study of the Android...
Review Participation in Modern Code Review: An Empirical Study of the Android...The University of Adelaide
 
Investigating Code Review Practices in Defective Files
Investigating Code Review Practices in Defective FilesInvestigating Code Review Practices in Defective Files
Investigating Code Review Practices in Defective FilesThe University of Adelaide
 
The Road Not Taken: Estimating Path Execution Frequency Statically
The Road Not Taken: Estimating Path Execution Frequency StaticallyThe Road Not Taken: Estimating Path Execution Frequency Statically
The Road Not Taken: Estimating Path Execution Frequency StaticallyRay Buse
 
Revisiting Code Ownership and Its Relationship with Software Quality in the S...
Revisiting Code Ownership and Its Relationship with Software Quality in the S...Revisiting Code Ownership and Its Relationship with Software Quality in the S...
Revisiting Code Ownership and Its Relationship with Software Quality in the S...The University of Adelaide
 
Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...
Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...
Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...Feng Zhang
 
Ph.D. Thesis Defense: Studying Reviewer Selection and Involvement in Modern ...
Ph.D. Thesis Defense:  Studying Reviewer Selection and Involvement in Modern ...Ph.D. Thesis Defense:  Studying Reviewer Selection and Involvement in Modern ...
Ph.D. Thesis Defense: Studying Reviewer Selection and Involvement in Modern ...The University of Adelaide
 
Improving Code Review Effectiveness Through Reviewer Recommendations
Improving Code Review Effectiveness Through Reviewer RecommendationsImproving Code Review Effectiveness Through Reviewer Recommendations
Improving Code Review Effectiveness Through Reviewer RecommendationsThe University of Adelaide
 
Leveraging HPC Resources to Improve the Experimental Design of Software Analy...
Leveraging HPC Resources to Improve the Experimental Design of Software Analy...Leveraging HPC Resources to Improve the Experimental Design of Software Analy...
Leveraging HPC Resources to Improve the Experimental Design of Software Analy...Chakkrit (Kla) Tantithamthavorn
 
Personalized Defect Prediction
Personalized Defect PredictionPersonalized Defect Prediction
Personalized Defect PredictionSung Kim
 
A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A...
A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A...A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A...
A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A...Ali Ouni
 
Synthesizing Knowledge from Software Development Artifacts
Synthesizing Knowledge from Software Development ArtifactsSynthesizing Knowledge from Software Development Artifacts
Synthesizing Knowledge from Software Development ArtifactsJeongwhan Choi
 
The Use of Development History in Software Refactoring Using a Multi-Objectiv...
The Use of Development History in Software Refactoring Using a Multi-Objectiv...The Use of Development History in Software Refactoring Using a Multi-Objectiv...
The Use of Development History in Software Refactoring Using a Multi-Objectiv...Ali Ouni
 

What's hot (20)

ICSME 2016: Search-Based Peer Reviewers Recommendation in Modern Code Review
ICSME 2016: Search-Based Peer Reviewers Recommendation in Modern Code ReviewICSME 2016: Search-Based Peer Reviewers Recommendation in Modern Code Review
ICSME 2016: Search-Based Peer Reviewers Recommendation in Modern Code Review
 
Cser13.ppt
Cser13.pptCser13.ppt
Cser13.ppt
 
Software Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSoftware Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled Datasets
 
Using HPC Resources to Exploit Big Data for Code Review Analytics
Using HPC Resources to Exploit Big Data for Code Review AnalyticsUsing HPC Resources to Exploit Big Data for Code Review Analytics
Using HPC Resources to Exploit Big Data for Code Review Analytics
 
Recommending Software Refactoring Using Search-based Software Enginnering
Recommending Software Refactoring Using Search-based Software EnginneringRecommending Software Refactoring Using Search-based Software Enginnering
Recommending Software Refactoring Using Search-based Software Enginnering
 
Review Participation in Modern Code Review: An Empirical Study of the Android...
Review Participation in Modern Code Review: An Empirical Study of the Android...Review Participation in Modern Code Review: An Empirical Study of the Android...
Review Participation in Modern Code Review: An Empirical Study of the Android...
 
Icsm19.ppt
Icsm19.pptIcsm19.ppt
Icsm19.ppt
 
Investigating Code Review Practices in Defective Files
Investigating Code Review Practices in Defective FilesInvestigating Code Review Practices in Defective Files
Investigating Code Review Practices in Defective Files
 
The Road Not Taken: Estimating Path Execution Frequency Statically
The Road Not Taken: Estimating Path Execution Frequency StaticallyThe Road Not Taken: Estimating Path Execution Frequency Statically
The Road Not Taken: Estimating Path Execution Frequency Statically
 
Revisiting Code Ownership and Its Relationship with Software Quality in the S...
Revisiting Code Ownership and Its Relationship with Software Quality in the S...Revisiting Code Ownership and Its Relationship with Software Quality in the S...
Revisiting Code Ownership and Its Relationship with Software Quality in the S...
 
Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...
Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...
Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...
 
Ph.D. Thesis Defense: Studying Reviewer Selection and Involvement in Modern ...
Ph.D. Thesis Defense:  Studying Reviewer Selection and Involvement in Modern ...Ph.D. Thesis Defense:  Studying Reviewer Selection and Involvement in Modern ...
Ph.D. Thesis Defense: Studying Reviewer Selection and Involvement in Modern ...
 
Improving Code Review Effectiveness Through Reviewer Recommendations
Improving Code Review Effectiveness Through Reviewer RecommendationsImproving Code Review Effectiveness Through Reviewer Recommendations
Improving Code Review Effectiveness Through Reviewer Recommendations
 
Leveraging HPC Resources to Improve the Experimental Design of Software Analy...
Leveraging HPC Resources to Improve the Experimental Design of Software Analy...Leveraging HPC Resources to Improve the Experimental Design of Software Analy...
Leveraging HPC Resources to Improve the Experimental Design of Software Analy...
 
Personalized Defect Prediction
Personalized Defect PredictionPersonalized Defect Prediction
Personalized Defect Prediction
 
A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A...
A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A...A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A...
A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A...
 
Icsm20.ppt
Icsm20.pptIcsm20.ppt
Icsm20.ppt
 
Msr17a.ppt
Msr17a.pptMsr17a.ppt
Msr17a.ppt
 
Synthesizing Knowledge from Software Development Artifacts
Synthesizing Knowledge from Software Development ArtifactsSynthesizing Knowledge from Software Development Artifacts
Synthesizing Knowledge from Software Development Artifacts
 
The Use of Development History in Software Refactoring Using a Multi-Objectiv...
The Use of Development History in Software Refactoring Using a Multi-Objectiv...The Use of Development History in Software Refactoring Using a Multi-Objectiv...
The Use of Development History in Software Refactoring Using a Multi-Objectiv...
 

Viewers also liked

Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)Sung Kim
 
The Anatomy of Developer Social Networks
The Anatomy of Developer Social NetworksThe Anatomy of Developer Social Networks
The Anatomy of Developer Social NetworksSung Kim
 
How Do Software Engineers Understand Code Changes? FSE 2012
How Do Software Engineers Understand Code Changes? FSE 2012How Do Software Engineers Understand Code Changes? FSE 2012
How Do Software Engineers Understand Code Changes? FSE 2012Sung Kim
 
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)Sung Kim
 
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...Sung Kim
 
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...Sung Kim
 
A Survey on Dynamic Symbolic Execution for Automatic Test Generation
A Survey on  Dynamic Symbolic Execution  for Automatic Test GenerationA Survey on  Dynamic Symbolic Execution  for Automatic Test Generation
A Survey on Dynamic Symbolic Execution for Automatic Test GenerationSung Kim
 
Automatic patch generation learned from human written patches
Automatic patch generation learned from human written patchesAutomatic patch generation learned from human written patches
Automatic patch generation learned from human written patchesSung Kim
 
A Survey on Automatic Test Generation and Crash Reproduction
A Survey on Automatic Test Generation and Crash ReproductionA Survey on Automatic Test Generation and Crash Reproduction
A Survey on Automatic Test Generation and Crash ReproductionSung Kim
 
STAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash ReproductionSTAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash ReproductionSung Kim
 
Transfer defect learning
Transfer defect learningTransfer defect learning
Transfer defect learningSung Kim
 
Tensor board
Tensor boardTensor board
Tensor boardSung Kim
 
Survey on Software Defect Prediction
Survey on Software Defect PredictionSurvey on Software Defect Prediction
Survey on Software Defect PredictionSung Kim
 
Time series classification
Time series classificationTime series classification
Time series classificationSung Kim
 

Viewers also liked (14)

Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)
 
The Anatomy of Developer Social Networks
The Anatomy of Developer Social NetworksThe Anatomy of Developer Social Networks
The Anatomy of Developer Social Networks
 
How Do Software Engineers Understand Code Changes? FSE 2012
How Do Software Engineers Understand Code Changes? FSE 2012How Do Software Engineers Understand Code Changes? FSE 2012
How Do Software Engineers Understand Code Changes? FSE 2012
 
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
 
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
 
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
 
A Survey on Dynamic Symbolic Execution for Automatic Test Generation
A Survey on  Dynamic Symbolic Execution  for Automatic Test GenerationA Survey on  Dynamic Symbolic Execution  for Automatic Test Generation
A Survey on Dynamic Symbolic Execution for Automatic Test Generation
 
Automatic patch generation learned from human written patches
Automatic patch generation learned from human written patchesAutomatic patch generation learned from human written patches
Automatic patch generation learned from human written patches
 
A Survey on Automatic Test Generation and Crash Reproduction
A Survey on Automatic Test Generation and Crash ReproductionA Survey on Automatic Test Generation and Crash Reproduction
A Survey on Automatic Test Generation and Crash Reproduction
 
STAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash ReproductionSTAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash Reproduction
 
Transfer defect learning
Transfer defect learningTransfer defect learning
Transfer defect learning
 
Tensor board
Tensor boardTensor board
Tensor board
 
Survey on Software Defect Prediction
Survey on Software Defect PredictionSurvey on Software Defect Prediction
Survey on Software Defect Prediction
 
Time series classification
Time series classificationTime series classification
Time series classification
 

Similar to Source code comprehension on evolving software

A Source Code Similarity System For Plagiarism Detection
A Source Code Similarity System For Plagiarism DetectionA Source Code Similarity System For Plagiarism Detection
A Source Code Similarity System For Plagiarism DetectionJames Heller
 
PhD Proposal talk
PhD Proposal talkPhD Proposal talk
PhD Proposal talkRay Buse
 
Code Craftsmanship Checklist
Code Craftsmanship ChecklistCode Craftsmanship Checklist
Code Craftsmanship ChecklistRyan Polk
 
Implementing Refactorings in IntelliJ IDEA
Implementing Refactorings in IntelliJ IDEAImplementing Refactorings in IntelliJ IDEA
Implementing Refactorings in IntelliJ IDEAintelliyole
 
Open Services for Lifecycle Collaboration (OSLC) - Extending REST APIs to Con...
Open Services for Lifecycle Collaboration (OSLC) - Extending REST APIs to Con...Open Services for Lifecycle Collaboration (OSLC) - Extending REST APIs to Con...
Open Services for Lifecycle Collaboration (OSLC) - Extending REST APIs to Con...Axel Reichwein
 
A Comparative Study of Forward and Reverse Engineering
A Comparative Study of Forward and Reverse EngineeringA Comparative Study of Forward and Reverse Engineering
A Comparative Study of Forward and Reverse Engineeringijsrd.com
 
Questions Every software engineer should answer
Questions Every software engineer should answerQuestions Every software engineer should answer
Questions Every software engineer should answerErtan Deniz
 
Analyzing Changes in Software Systems From ChangeDistiller to FMDiff
Analyzing Changes in Software Systems From ChangeDistiller to FMDiffAnalyzing Changes in Software Systems From ChangeDistiller to FMDiff
Analyzing Changes in Software Systems From ChangeDistiller to FMDiffMartin Pinzger
 
Aspect Oriented Programming
Aspect Oriented ProgrammingAspect Oriented Programming
Aspect Oriented ProgrammingRodger Oates
 
Populating a Release History Database (ICSM 2013 MIP)
Populating a Release History Database (ICSM 2013 MIP)Populating a Release History Database (ICSM 2013 MIP)
Populating a Release History Database (ICSM 2013 MIP)Martin Pinzger
 
Requirements Analysis and Management using Innoslate
Requirements Analysis and Management using InnoslateRequirements Analysis and Management using Innoslate
Requirements Analysis and Management using InnoslateElizabeth Steiner
 
Requirement Management.ppt
Requirement Management.pptRequirement Management.ppt
Requirement Management.pptSoham De
 
Software engineering lecture notes
Software engineering   lecture notesSoftware engineering   lecture notes
Software engineering lecture notesGarima Singh
 
A study of code change patterns for adaptive maintenance with AST analysis
A study of code change patterns for  adaptive maintenance with AST analysis A study of code change patterns for  adaptive maintenance with AST analysis
A study of code change patterns for adaptive maintenance with AST analysis IJECEIAES
 
15 implementing architectures
15 implementing architectures15 implementing architectures
15 implementing architecturesMajong DevJfu
 
Process Aspects and Social Dynamics of Contemporary Code Review: Insights fro...
Process Aspects and Social Dynamics of Contemporary Code Review: Insights fro...Process Aspects and Social Dynamics of Contemporary Code Review: Insights fro...
Process Aspects and Social Dynamics of Contemporary Code Review: Insights fro...JeffCarver32
 

Similar to Source code comprehension on evolving software (20)

A Source Code Similarity System For Plagiarism Detection
A Source Code Similarity System For Plagiarism DetectionA Source Code Similarity System For Plagiarism Detection
A Source Code Similarity System For Plagiarism Detection
 
PhD Proposal talk
PhD Proposal talkPhD Proposal talk
PhD Proposal talk
 
Code Craftsmanship Checklist
Code Craftsmanship ChecklistCode Craftsmanship Checklist
Code Craftsmanship Checklist
 
Implementing Refactorings in IntelliJ IDEA
Implementing Refactorings in IntelliJ IDEAImplementing Refactorings in IntelliJ IDEA
Implementing Refactorings in IntelliJ IDEA
 
Unit iv
Unit ivUnit iv
Unit iv
 
Open Services for Lifecycle Collaboration (OSLC) - Extending REST APIs to Con...
Open Services for Lifecycle Collaboration (OSLC) - Extending REST APIs to Con...Open Services for Lifecycle Collaboration (OSLC) - Extending REST APIs to Con...
Open Services for Lifecycle Collaboration (OSLC) - Extending REST APIs to Con...
 
A Comparative Study of Forward and Reverse Engineering
A Comparative Study of Forward and Reverse EngineeringA Comparative Study of Forward and Reverse Engineering
A Comparative Study of Forward and Reverse Engineering
 
Questions Every software engineer should answer
Questions Every software engineer should answerQuestions Every software engineer should answer
Questions Every software engineer should answer
 
Analyzing Changes in Software Systems From ChangeDistiller to FMDiff
Analyzing Changes in Software Systems From ChangeDistiller to FMDiffAnalyzing Changes in Software Systems From ChangeDistiller to FMDiff
Analyzing Changes in Software Systems From ChangeDistiller to FMDiff
 
Se lec-uosl-8
Se lec-uosl-8Se lec-uosl-8
Se lec-uosl-8
 
Aspect Oriented Programming
Aspect Oriented ProgrammingAspect Oriented Programming
Aspect Oriented Programming
 
SE2018_Lec 17_ Coding
SE2018_Lec 17_ CodingSE2018_Lec 17_ Coding
SE2018_Lec 17_ Coding
 
Populating a Release History Database (ICSM 2013 MIP)
Populating a Release History Database (ICSM 2013 MIP)Populating a Release History Database (ICSM 2013 MIP)
Populating a Release History Database (ICSM 2013 MIP)
 
Requirements Analysis and Management using Innoslate
Requirements Analysis and Management using InnoslateRequirements Analysis and Management using Innoslate
Requirements Analysis and Management using Innoslate
 
Requirement Management.ppt
Requirement Management.pptRequirement Management.ppt
Requirement Management.ppt
 
Software engineering lecture notes
Software engineering   lecture notesSoftware engineering   lecture notes
Software engineering lecture notes
 
SE2_Lec 18_ Coding
SE2_Lec 18_ CodingSE2_Lec 18_ Coding
SE2_Lec 18_ Coding
 
A study of code change patterns for adaptive maintenance with AST analysis
A study of code change patterns for  adaptive maintenance with AST analysis A study of code change patterns for  adaptive maintenance with AST analysis
A study of code change patterns for adaptive maintenance with AST analysis
 
15 implementing architectures
15 implementing architectures15 implementing architectures
15 implementing architectures
 
Process Aspects and Social Dynamics of Contemporary Code Review: Insights fro...
Process Aspects and Social Dynamics of Contemporary Code Review: Insights fro...Process Aspects and Social Dynamics of Contemporary Code Review: Insights fro...
Process Aspects and Social Dynamics of Contemporary Code Review: Insights fro...
 

More from Sung Kim

Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)Sung Kim
 
MSR2014 opening
MSR2014 openingMSR2014 opening
MSR2014 openingSung Kim
 
Defect, defect, defect: PROMISE 2012 Keynote
Defect, defect, defect: PROMISE 2012 Keynote Defect, defect, defect: PROMISE 2012 Keynote
Defect, defect, defect: PROMISE 2012 Keynote Sung Kim
 
Predicting Recurring Crash Stacks (ASE 2012)
Predicting Recurring Crash Stacks (ASE 2012)Predicting Recurring Crash Stacks (ASE 2012)
Predicting Recurring Crash Stacks (ASE 2012)Sung Kim
 
Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...
Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...
Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...Sung Kim
 
Software Development Meets the Wisdom of Crowds
Software Development Meets the Wisdom of CrowdsSoftware Development Meets the Wisdom of Crowds
Software Development Meets the Wisdom of CrowdsSung Kim
 
BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)
BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)
BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)Sung Kim
 
Self-defending software: Automatically patching errors in deployed software ...
Self-defending software: Automatically patching  errors in deployed software ...Self-defending software: Automatically patching  errors in deployed software ...
Self-defending software: Automatically patching errors in deployed software ...Sung Kim
 
ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)
ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)
ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)Sung Kim
 

More from Sung Kim (9)

Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)
 
MSR2014 opening
MSR2014 openingMSR2014 opening
MSR2014 opening
 
Defect, defect, defect: PROMISE 2012 Keynote
Defect, defect, defect: PROMISE 2012 Keynote Defect, defect, defect: PROMISE 2012 Keynote
Defect, defect, defect: PROMISE 2012 Keynote
 
Predicting Recurring Crash Stacks (ASE 2012)
Predicting Recurring Crash Stacks (ASE 2012)Predicting Recurring Crash Stacks (ASE 2012)
Predicting Recurring Crash Stacks (ASE 2012)
 
Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...
Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...
Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...
 
Software Development Meets the Wisdom of Crowds
Software Development Meets the Wisdom of CrowdsSoftware Development Meets the Wisdom of Crowds
Software Development Meets the Wisdom of Crowds
 
BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)
BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)
BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)
 
Self-defending software: Automatically patching errors in deployed software ...
Self-defending software: Automatically patching  errors in deployed software ...Self-defending software: Automatically patching  errors in deployed software ...
Self-defending software: Automatically patching errors in deployed software ...
 
ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)
ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)
ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)
 

Recently uploaded

killingcamp longest common subsequence.pdf
killingcamp longest common subsequence.pdfkillingcamp longest common subsequence.pdf
killingcamp longest common subsequence.pdfssuser82c38d
 
CSS Notes in PDF, Easy to understand. For beginner to advanced. ...
CSS Notes in PDF, Easy to understand. For beginner to advanced.              ...CSS Notes in PDF, Easy to understand. For beginner to advanced.              ...
CSS Notes in PDF, Easy to understand. For beginner to advanced. ...syedfaisal759877
 
Design pattern talk by Kaya Weers - 2024
Design pattern talk by Kaya Weers - 2024Design pattern talk by Kaya Weers - 2024
Design pattern talk by Kaya Weers - 2024Kaya Weers
 
Open Source vs Closed Source LLMs. Pros and Cons
Open Source vs Closed Source LLMs. Pros and ConsOpen Source vs Closed Source LLMs. Pros and Cons
Open Source vs Closed Source LLMs. Pros and ConsSprings
 
Passbolt Introduction and Usage for secret managment
Passbolt Introduction and Usage for secret managmentPassbolt Introduction and Usage for secret managment
Passbolt Introduction and Usage for secret managmentThierry Gayet
 
Automation for Bonterra Impact Management (fka Apricot)
Automation for Bonterra Impact Management (fka Apricot)Automation for Bonterra Impact Management (fka Apricot)
Automation for Bonterra Impact Management (fka Apricot)Jeffrey Haguewood
 
Role of DevOps in SaaS product Development.pdf.pptx
Role of DevOps in SaaS product Development.pdf.pptxRole of DevOps in SaaS product Development.pdf.pptx
Role of DevOps in SaaS product Development.pdf.pptxMindInventory
 
killing camp 주차장 나누기-2 topology sort.pdf
killing camp 주차장 나누기-2 topology sort.pdfkilling camp 주차장 나누기-2 topology sort.pdf
killing camp 주차장 나누기-2 topology sort.pdfssuser82c38d
 
How AI is preventing account fraud at web scale
How AI is preventing account fraud at web scaleHow AI is preventing account fraud at web scale
How AI is preventing account fraud at web scaleAmir Moghimi
 
Globus for System Administrators
Globus for System AdministratorsGlobus for System Administrators
Globus for System AdministratorsGlobus
 
What are the Reasons for Tracking the Attendance of the Employees?
What are the Reasons for Tracking the Attendance of the Employees?What are the Reasons for Tracking the Attendance of the Employees?
What are the Reasons for Tracking the Attendance of the Employees?NYGGS Automation Suite
 
Agile & Scrum, Certified Scrum Master! Crash Course
Agile & Scrum,  Certified Scrum Master! Crash CourseAgile & Scrum,  Certified Scrum Master! Crash Course
Agile & Scrum, Certified Scrum Master! Crash CourseRohan Chandane
 
No more Dockerfiles? Buildpacks to help you ship your image!
No more Dockerfiles? Buildpacks to help you ship your image!No more Dockerfiles? Buildpacks to help you ship your image!
No more Dockerfiles? Buildpacks to help you ship your image!Anthony Dahanne
 
LLMOps with Azure Machine Learning prompt flow
LLMOps with Azure Machine Learning prompt flowLLMOps with Azure Machine Learning prompt flow
LLMOps with Azure Machine Learning prompt flowNaoki (Neo) SATO
 
Machine Learning Basics for Dummies (no math!)
Machine Learning Basics for Dummies (no math!)Machine Learning Basics for Dummies (no math!)
Machine Learning Basics for Dummies (no math!)Dmitry Zinoviev
 
Cybersecurity Measures For Remote Workers.pdf
Cybersecurity Measures For Remote Workers.pdfCybersecurity Measures For Remote Workers.pdf
Cybersecurity Measures For Remote Workers.pdfCIOWomenMagazine
 
Joseph Yoder : Being Agile about Architecture
Joseph Yoder : Being Agile about ArchitectureJoseph Yoder : Being Agile about Architecture
Joseph Yoder : Being Agile about ArchitectureHironori Washizaki
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flinkconfluent
 

Recently uploaded (20)

killingcamp longest common subsequence.pdf
killingcamp longest common subsequence.pdfkillingcamp longest common subsequence.pdf
killingcamp longest common subsequence.pdf
 
CSS Notes in PDF, Easy to understand. For beginner to advanced. ...
CSS Notes in PDF, Easy to understand. For beginner to advanced.              ...CSS Notes in PDF, Easy to understand. For beginner to advanced.              ...
CSS Notes in PDF, Easy to understand. For beginner to advanced. ...
 
Design pattern talk by Kaya Weers - 2024
Design pattern talk by Kaya Weers - 2024Design pattern talk by Kaya Weers - 2024
Design pattern talk by Kaya Weers - 2024
 
Open Source vs Closed Source LLMs. Pros and Cons
Open Source vs Closed Source LLMs. Pros and ConsOpen Source vs Closed Source LLMs. Pros and Cons
Open Source vs Closed Source LLMs. Pros and Cons
 
Passbolt Introduction and Usage for secret managment
Passbolt Introduction and Usage for secret managmentPassbolt Introduction and Usage for secret managment
Passbolt Introduction and Usage for secret managment
 
2024 Trends Transforming Enterprise Resource Planning
2024 Trends Transforming Enterprise Resource Planning2024 Trends Transforming Enterprise Resource Planning
2024 Trends Transforming Enterprise Resource Planning
 
Automation for Bonterra Impact Management (fka Apricot)
Automation for Bonterra Impact Management (fka Apricot)Automation for Bonterra Impact Management (fka Apricot)
Automation for Bonterra Impact Management (fka Apricot)
 
eLearning Content Development Company Code and Pixels.pdf
eLearning Content Development Company Code and Pixels.pdfeLearning Content Development Company Code and Pixels.pdf
eLearning Content Development Company Code and Pixels.pdf
 
Role of DevOps in SaaS product Development.pdf.pptx
Role of DevOps in SaaS product Development.pdf.pptxRole of DevOps in SaaS product Development.pdf.pptx
Role of DevOps in SaaS product Development.pdf.pptx
 
killing camp 주차장 나누기-2 topology sort.pdf
killing camp 주차장 나누기-2 topology sort.pdfkilling camp 주차장 나누기-2 topology sort.pdf
killing camp 주차장 나누기-2 topology sort.pdf
 
How AI is preventing account fraud at web scale
How AI is preventing account fraud at web scaleHow AI is preventing account fraud at web scale
How AI is preventing account fraud at web scale
 
Globus for System Administrators
Globus for System AdministratorsGlobus for System Administrators
Globus for System Administrators
 
What are the Reasons for Tracking the Attendance of the Employees?
What are the Reasons for Tracking the Attendance of the Employees?What are the Reasons for Tracking the Attendance of the Employees?
What are the Reasons for Tracking the Attendance of the Employees?
 
Agile & Scrum, Certified Scrum Master! Crash Course
Agile & Scrum,  Certified Scrum Master! Crash CourseAgile & Scrum,  Certified Scrum Master! Crash Course
Agile & Scrum, Certified Scrum Master! Crash Course
 
No more Dockerfiles? Buildpacks to help you ship your image!
No more Dockerfiles? Buildpacks to help you ship your image!No more Dockerfiles? Buildpacks to help you ship your image!
No more Dockerfiles? Buildpacks to help you ship your image!
 
LLMOps with Azure Machine Learning prompt flow
LLMOps with Azure Machine Learning prompt flowLLMOps with Azure Machine Learning prompt flow
LLMOps with Azure Machine Learning prompt flow
 
Machine Learning Basics for Dummies (no math!)
Machine Learning Basics for Dummies (no math!)Machine Learning Basics for Dummies (no math!)
Machine Learning Basics for Dummies (no math!)
 
Cybersecurity Measures For Remote Workers.pdf
Cybersecurity Measures For Remote Workers.pdfCybersecurity Measures For Remote Workers.pdf
Cybersecurity Measures For Remote Workers.pdf
 
Joseph Yoder : Being Agile about Architecture
Joseph Yoder : Being Agile about ArchitectureJoseph Yoder : Being Agile about Architecture
Joseph Yoder : Being Agile about Architecture
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flink
 

Source code comprehension on evolving software

  • 1. Source Code Comprehension on Evolving Software: A Literature Survey Yida Tao Supervisor: Sunghun Kim 1
  • 2. Motivation Code Change Comprehension Tao et al., FSE’12 Code change comprehension is • Frequently required • In major development activities, in particular the code-review process • How do software engineers understand code changes? An exploratory study in industry. Tao et al., FSE’12 • Expectations, outcomes, and challenges of modern code review. Bacchelli and Bird, ICSE’13 Bacchelli & Bird, ICSE’13 • “…review and understand code they have not seen before may be more common that a developer working on new code” • “From interviews, no other code review challenge emerged as clearly as understanding the submitted change” 2
  • 3. Outline Program Differencing Describing code changes Code Change Summarization Explaining code changes Querying and Filtering Customization Code Change Comprehension 3
  • 4. Program Differencing 4  Text Differencing  Syntactic Differencing  Semantic Differencing
  • 5. Text Differencing  Flat representation of a program  Sequence of strings  Unix diff  Only output added/deleted lines, can not detect modified lines  Hard to determine when a code fragment is moved upward or downward  Ldiff (Canfora et al., ICSE’09)  An enhanced line differencing tool  Limitations  Changes to *characters*  No syntactic-structure information 5
  • 6. Syntactic Differencing  Structured representation of a program  Abstract syntax tree; XML  ChangeDistiller (Fluri et al., TSE’07)  Tree differencing  Node: bigram string similarity  Control structure: subtree similarity  Output: tree edit script (insert, delete, move, update)  XML differecing  srcXML (Maletic & Collard, ICSM’04): embeds abstract syntax and structure within the source code  diffX (Al-Ekram et al., CASCON '05)  Limitation  Cannot describe how the behavior of a program is changed  Still report differences for behavior-preserving changes 6
  • 7. Semantic Differencing  Semantic diff (Jackson and Ladd, ICSM’94)  Method-level  Variable dependencies comparison 7 ==
  • 8. Semantic Differencing (cont.)  JDiff (Apiwattanapong et al. ASE’04, 06)  Extended control-flow graph (ECFG)  Dynamic binding, class hierarchy, exception handling, etc. 8
  • 9. Semantic Differencing (cont.)  Differential symbolic execution (Person et al., FSE’08)  “Executing” a program using symbolic values 9
  • 10. Outline Program Differencing Text Differencing Syntactic differencing Semantic differencing Code Change Comprehension Code Change Summarization Explaining code changes Querying and Filtering Customization 10
  • 11. Code Change Summarization  LSdiff (Kim and Notkin, ICSE’09)  Group related changes  Detect potential inconsistencies in a code change 11
  • 12. Code Change Summarization (cont.)  DeltaDoc (Buse and Weimer, ASE’10)  Symbolic execution: obtain path predicates for each statement in both versions  Identify statements that are added, deleted, or have a changed predicates  Summarization 12
  • 13. Code Change Summarization (cont.)  Multi-document summarization (Rastkar and Murphy, ICSE’13)  Linking evolutionary documents (commit log, issue tracking entries)  Finding the most informative sentences to extract to form a summary  Similarity between a sentence and the title of the enclosing document  Overlap between a sentence and the adjacent document 13
  • 14. Code Change Summarization (cont.)  Challenges  Evolutionary documents  Linkage might not be found (Bachman et al., FSE’10, Wu et al., FSE’11)  Human-written document may be unavailable or uninformative (Buse and Weimer, ASE’10, Tao et al., FSE’12)  Automatically generated document  Verbosity  Uninteresting changes are identified, e.g., “all types that declared toString() added constructors” (Kim and Notkin, ICSE’09) 14 LSdiff DeltaDoc
  • 15. Outline Program Differencing Text Differencing Syntactic differencing Semantic differencing Code Change Summarization Rules and exceptions Control-flow changes Evolutionary documentation Code Change Comprehension Querying and Filtering Customization 15
  • 16. Querying and Filtering  Specifying and detecting meaningful changes (Yu et al., ASE’11)  Normalize the program (user-specified) before differencing  Non-trivial to construct the query 16
  • 17. Querying and Filtering (cont.)  Filtering non-essential changes (Kawrykow and Robillard, ICSE’11)  Non-essential changes: rename-induced modifications, local variable extraction, trivial keyword modification, whitespace and documentation updates  ChangeDistiller (Fluri et al., TSE’07) + Partial program analysis (Dagenais and Robillard, ICSE’08)  Goal: improving mining and recommendation accuracy instead of developers’ comprehension 17
  • 18. Outline Program Differencing Text Differencing Syntactic differencing Semantic differencing Code Change Summarization Rules and exceptions Control-flow changes Evolutionary documentation Querying and Filtering Meaningful changes Non-essential changes Code Change Comprehension 18
  • 19. Research Directions Program Differencing Text Differencing Syntactic differencing Semantic differencing Code Change Summarization Rules and exceptions Control-flow changes Evolutionary documentation Querying and Filtering Meaningful changes Non-essential changes Source Code Changes Work-item-based changes? 19
  • 20. Work-item-based Changes  Multiple work-items in a single code change (e.g., a bug fix + code cleanup + a new feature)  Very difficult to understand (Tao et al., FSE’12) 20 JFreeChart revision 1083 Trivial keyword removal Bug fix Formatting
  • 21. Work-item-based Change Detection  Multiple work-items in a single code change (e.g., a bug fix + code cleanup + a new feature)  Very difficult to understand (Tao et al., FSE’12)  Change decomposition  Program slicing (entity dependencies)  Pattern matching (similarities)  A single work-item spreads across multiple code changes (e.g., 5 changes to finally fix a bug completely)  Change aggregation  Linkage to the same issue  Heuristics like time duration, commit authors, program dependencies, etc. 21
  • 22. Research Directions Program Differencing Text Differencing Syntax differencing Semantic differencing Code Change Summarization Rules and exceptions Control-flow changes Evolutionary documentation Querying and Filtering Meaningful changes Non-essential changes Code Change Comprehension Work-item change detection Change decomposition Change aggregation 22
  • 23. Research Directions Program Differencing Text Differencing Syntax differencing Semantic differencing Code Change Summarization Rules and exceptions Control-flow changes Evolutionary documentation Querying and Filtering Meaningful changes Non-essential changes Work-item-specific changes Code Change Comprehension Work-item change detection Change decomposition Change aggregation 23
  • 24. Research Directions Program Differencing Text Differencing Syntax differencing Semantic differencing Code Change Summarization Rules and exceptions Control-flow changes Evolutionary documentation Querying and Filtering Meaningful changes Non-essential changes Work-item-specific changes Code Change Comprehension Concrete Execution Work-item change detection Change decomposition Change aggregation 24
  • 25. Explaining code changes with executions of co- changed test cases 25  Test cases  Best documentation for source code  Test cases co-changed with source code  Documentation for code changes?  Mostly synchronous co-evolution of production and test code (Zaidman et al., Empirical Software Engineering’11)  Differential test executions  Co-changed test cases T  Executing T on the old version P and new version P’  Comparing executions to explained change behaviors From StackExchange http://programmers.stackexchange.com/questions/154439/quality-of-code-in- unit-tests?newsletter=1&nlcode=67628%7c1a35 • “Unit tests are one of the best sources of documentation for your system, and arguably the most reliable form” • “Unit tests are often the first thing you look at when trying to grasp what some piece of code does” • “They can also serve as a starting point for people new to the code base”
  • 26. Research Directions Program Differencing Text Differencing Syntax differencing Semantic differencing Code Change Summarization Rules and exceptions Control-flow changes Evolutionary documentation Querying and Filtering Meaningful changes Non-essential changes Work-item-specific changes Code Change Comprehension Concrete Execution • Co-changed test cases • Differential test execution Work-item change detection Change decomposition Change aggregation 26

Editor's Notes

  1. We know that software is continuously evolving since developers practically change source code all the time. One of the consequences is, developers also have to understand these code changes, which I refer to as CCC through this talk. Last year, we conducted an exploratory study in MS, where we sent surveys and conducted interviews with MS developers for their practices on CCC. This work is published in FSE. In this work, we found first, CCC is frequently required. The majority of developers understand code changes several times each day In this year’s ICSE, B in their empirical study on modern code review, they also expressed the similar findings that CCC is more common than understanding the entire program, but CCC is also the most challenging part. These motivate our work since CCC is a challenging activity but it’s also fundamental to developers’ daily practices.
  2. So in the literature survey, I identify 3 major categories related to CCC. First is program differencing. This line of work try to help developers by describing code changes Second is …. Studies in this category take one step further to try to reasoning and explain code changes Third is. This is sort of “customized” CCC.
  3. Unix diff is the most well-known example in this category. But it’s also well-recognized for two major limitations. Ldiff: diff: Longest common subsequence All possible hunk pairs -> similarity (vector space cosine similarity) -> pick the topmost pairs Line matching -> Levenhstein edit distance -> above threshold is marked as changed Unmatched lines are new hunks -> iterate step 2 Since these techniques treat program as normal text, they report program difference as changes to characters. But from a developer’s point of view, the syntactic, or structure information about the source code is lost. This motivates another line of work, which we call “syntax differencing”
  4. This line of work uses structured representation of a program. Changedistiller, which represents a program as an abstract syntax tree and applies tree differencing algorithm. In addition to AST, studies also represent code in XML, which can also embed …Then we can apply XML differencing algorithms, like diffX proposed in, to compute program differences. In cases when developers perform behavior-preserving modifications such as switch the order of if-else, it will still report the differences although from developer’s perspective, they might not think it is an important change.
  5. Therefore, the next line of work focuses on semantic differencing of two program versions. Semantic diff operates on method level, and compares variable dependencies to derive behavioral changes. In the old version of method add, if x not equal to HI, add it to TOT, otherwise, add DEF to total. From this code, we can derive a list of dependencies, for example, … In the new version, developers simply want to switch the order of if-else but mistakenly uses assignment instead of equals. Therefore, when the technique computes variable dependencies and compare it to previous ones, it will report that.. These behavioral differences are certainly not expected because when x is assigned to HI, the initial value of x is always lost. In such cases, semantic diff is certainly better than syntactic diff since it can raise developers’ attention on program’s unexpected behavioral change.
  6. Another work, Jdiff, which is published in, is about semantic differencing for oo program. Simply applying syntactic differencing, we’ll only know that m1 is added, and . But developers may be more interested in how the behavior of program is changed. if the dynamic type of a is B, the call a.m1 in new version actually invokes m1 in B. The exception thrown will be caught by different catch blocks after the change. Jdiff extends CFG to combine…ECFG considers dynamic binding and exception handling for the previous example, and graph differencing algorithm can be applied to reveal the difference.
  7. Some studies also use symbolic execution to characterize programs’ behavior. This technique…instead of actual values. For example, a symbolic execution for this code fragment is like, if this condition is satisfied, return; otherwise, if…, return… XXX proposed differential symbolic execution that compares the SE of two program versions. The output is like this. Under which condition, two different versions produces different results.
  8. Now I’ve covered 3 categories in program differencing. These work basically try to help CCC by describing what the code change is. The next line of work, which I call “CCS”, takes a further step to try to explain code changes.
  9. Program is presented as a set of predicates that describe code elements, containment relationships, and structural dependencies, which are called “facts”. Then Lsdiff computes changed facts between two program versions. Inferring rules from the list of change facts Also inferring exceptions to the rules. Example: all Car’s subtypes’ start methods added calls to the Key.chk method except for the subtype Kia
  10. Finally, DeltaDoc uses some transformation heuristics to summarize these statements’ differences to human-readable documentation.
  11. The studies we’ve seen so far all extract information from source code itself. However, other software artifacts, such as commit log, can also be helpful for understanding code changes since from these artifacts, we might found useful natural language sentences related to the code changes. Motivated by this observation, …proposed… Each sentence has some features, for example. To locate the most informative or relevant sentences, they are ranked by their feature values. Here is an example of their output. For this change, its summary contains a list of relevant sentences extracted from its evolutionary documents.
  12. The major challenges of using evolutionary documents is first, linkage between these documents might not exist so we may not even be able to find documents relevant to a code change. This problem is known as the “missing link” and is studied recently. In addition, document may not… In such cases, we can not rely on them to extract informative change summaries. As for I introduced before, the biggest problem is verbosity. This is rules and exceptions generated by Lsdiff to describe a code change. This is the number of lines in the change documentation. Compared to human-written commit log, which is the black bar, documentation generated by DeltaDoc is still very long. Another challenge is, some uninterested changes can be identified automatically. For example, a rule reported by Lsdiff says…, which in the user study, participants complain that such a rule is not useful.
  13. Therefore, there are studies that customize CCC so that developers can query their interested changes and filtering out irrelevant changes.
  14. Non-essential changes include …, which is less likely to be of developers’ interest. They use ChangeDistiller to detect changes, and apply PPA to resolve type bindings for partial programs (i.e., code changes) However, the goal of this work is to…
  15. In general, studies in this category focuses on querying meaningful changes and filtering out non-essential changes.