The Interplay between Semantic
Coupling and Co-Change of
Software Classes (journal first)
Nemitari Ajienka – Edge Hill University (UK)
Andrea Capiluppi – Brunel University London (UK)
Steve Counsell – Brunel University London (UK)
ICSE2018 - Gothenburg - A Capiluppi
Outline
Rationale
Definitions: Semantic coupling and Co-change
Experimental set-up
Results
Conclusion
Rationale – Software changes: origin and impact
[Generated by Doxygen and Graphviz]
Certain classes tend to change more than others
Identify patterns or metrics of those classes
Definitions
Semantic coupling
– Degree of relationship between classes’ semantic content
Co-change (Logical coupling)
– Based on historical data
– Classes changed in the same timeframe (day? week? commit?)
Logical and Semantic Couplings
Semantic coupling: operationalisation
Research questions
RQ1: Is there a linear relationship between semantic and logical coupling?
– Are very similar classes (semantically) bound to co-evolve more often?
RQ2: Is there a directional relationship between semantic and logical coupling?
– If A and B are co-evolving, does it mean that they are semantically linked, or
– If A and B are semantically similar, will they co-evolve?
Experimental set-up
Data collection
Population: GoogleCode projects
– 2,599,222 projects
Sampling
– Only Java projects
– 95% confidence level, 5% confidence interval
– 380 projects
– All revisions+metadata downloaded
Pruning
– Removed projects with fewer than 20 revisions
– 79 non-trivial Java projects
– Avg: 117 revisions
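A sample size for a given confidence level and margin can be estimated with Cochran's formula plus a finite-population correction. The sketch below is only an illustration: the slides do not say which formula or which population size (all projects, or Java projects only) produced the figure of 380, so it should not be expected to reproduce that number exactly. The function name and default parameters are assumptions.

```python
import math

def sample_size(population, z=1.96, margin=0.05, p=0.5):
    """Cochran's sample size with finite-population correction.

    z=1.96 corresponds to a 95% confidence level, margin=0.05 to a
    +/-5% interval, and p=0.5 is the most conservative proportion.
    All defaults are illustrative assumptions, not taken from the study.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2   # infinite-population estimate
    n = n0 / (1 + (n0 - 1) / population)        # finite-population correction
    return math.ceil(n)

# Population size as reported on the slide for all GoogleCode projects.
print(sample_size(2_599_222))
```

A smaller population (for example, Java projects only) yields a smaller required sample, which is one possible reason for a figure below the infinite-population estimate.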
Characteristics of projects in sample <excerpt>
Logical and Semantic Couplings
Co-evolution data (logical coupling)
Per project
Per revision
Per pair of OO classes
“What is the likelihood that classes A and B co-evolve, based on historical data?”
– Low, medium, high likelihood
Logical coupling: operationalisation
Support
– Class A modified in 3 transactions
– 2 also included changes to C
– Support for A → C is 2
Confidence
– Confidence for A → C (“C depends on A”) is 2/3 = 0.67
– Confidence for C → A (“A depends on C”) is 2/4 = 0.5
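The support and confidence figures above can be reproduced with a few lines of code. This is a minimal sketch, assuming each transaction is represented as a set of changed class names. The transaction list is hypothetical and chosen only so that A appears in 3 transactions, C in 4, and both together in 2.

```python
def support(transactions, a, b):
    """Number of transactions in which both classes were changed."""
    return sum(1 for t in transactions if a in t and b in t)

def confidence(transactions, a, b):
    """Confidence of the rule a -> b: support(a, b) / transactions touching a."""
    changed_a = sum(1 for t in transactions if a in t)
    return support(transactions, a, b) / changed_a if changed_a else 0.0

# Hypothetical change history matching the numbers on the slide.
transactions = [
    {"A", "C"},
    {"A", "C"},
    {"A"},
    {"C"},
    {"C"},
]

print(support(transactions, "A", "C"))               # 2
print(round(confidence(transactions, "A", "C"), 2))  # 0.67
print(confidence(transactions, "C", "A"))            # 0.5
```

Support is symmetric, while confidence depends on the direction of the rule, which is why A → C and C → A give different values.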
Semantic coupling: operationalisation
Per project
All revisions
Pair of classes
UrSQLController vs UrSQLEntry
– N-gram similarity of 0.6 for n-grams of n=4
Vector Space Model (VSM): text corpora (full code)
N-gram technique: small sentences (class identifiers)
Disco word synonyms: small sentences (class identifiers)
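To give a feel for identifier-based similarity, here is a minimal sketch of one common n-gram measure: the Dice coefficient over sets of character 4-grams. The slides do not state which n-gram formula or tokenisation the study used, so this is an assumed stand-in and its score for UrSQLController vs UrSQLEntry will not necessarily match the 0.6 reported above.

```python
def char_ngrams(text, n=4):
    """Set of lower-cased character n-grams of a string."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def ngram_similarity(a, b, n=4):
    """Dice coefficient between the n-gram sets of two identifiers (0..1)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0  # identifiers shorter than n share no n-grams
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))

# Class names taken from the slide; the measure itself is an assumption.
print(ngram_similarity("UrSQLController", "UrSQLEntry"))
```

Because only class identifiers are compared, this kind of measure is cheap to compute relative to building a vector space over full source files.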
Results
RQ1: linear relationship between logical and semantic coupling
Chi-squared test
Spearman’s rank correlation (ρ)
Per project, per pair of classes, in all revisions:
– All confidence metrics (logical coupling)
– All coupling strengths between pairs
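As an illustration of the correlation step, the sketch below computes Spearman's ρ as the Pearson correlation of average ranks, using only the standard library. The input lists stand for per-pair confidence values and semantic coupling strengths, and the sample numbers are hypothetical. This is not the authors' analysis script.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (var_x * var_y) ** 0.5

# Hypothetical per-pair values: logical confidence vs semantic strength.
logical = [0.67, 0.50, 0.10, 0.90]
semantic = [0.20, 0.60, 0.40, 0.30]
print(spearman_rho(logical, semantic))
```

A value of ρ close to 0 for a project would be consistent with the absence of a relationship between the two coupling strengths.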
RQ1 results
No linear relationship between the strengths of logical and semantic dependencies
Can’t infer co-evolution frequency based on semantic strength
Using semantic coupling to predict co-change has low precision
RQ2: directional relationship between logical and semantic coupling
Co-changed Semantic Dependencies (CSD, in %)
– Percentage of semantic dependencies that also co-change
Semantic Logical Dependencies (SLD, in %)
– Percentage of logical dependencies that are also semantically related
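Both percentages are ratios over the same intersection, taken against different denominators. A minimal sketch, assuming dependencies are stored as sets of unordered class pairs (the pair data and helper names are hypothetical):

```python
def pair(a, b):
    """Unordered class pair, so (A, B) and (B, A) count as the same dependency."""
    return frozenset((a, b))

def percentage_also_in(dependencies, other):
    """Percentage of `dependencies` that also appear in `other`."""
    if not dependencies:
        return 0.0
    return 100.0 * len(dependencies & other) / len(dependencies)

# Hypothetical dependency sets for one project.
semantic = {pair("A", "B"), pair("A", "C")}
logical = {pair("A", "B"), pair("A", "C"), pair("B", "D"), pair("C", "D")}

csd = percentage_also_in(semantic, logical)  # semantic deps that also co-change
sld = percentage_also_in(logical, semantic)  # logical deps that are also semantic
print(csd, sld)  # 100.0 50.0
```

In this toy project every semantic dependency also co-changes (CSD = 100%), while only half of the co-changing pairs are semantically related (SLD = 50%), which shows how the two measures can diverge.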
RQ2: results
The numbers of semantic and logical dependencies are of a similar order of magnitude
In most projects, 100% of semantic dependencies are also logical dependencies
If two classes are semantically coupled, there is a high chance that they will co-change in the future
Serendipitous findings
Semantic coupling
– Use full source code or just identifiers?
– Which is more efficient?
Chi-squared test of independence
– VSM
– N-Gram + Disco
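A chi-squared test of independence compares observed counts in a contingency table with the counts expected if the two classifications were unrelated. The sketch below computes the statistic for any table with non-zero row and column totals, and a p-value for the 2x2 case only (one degree of freedom). The table values are hypothetical and are not results from the study.

```python
import math

def chi_squared_statistic(table):
    """Pearson chi-squared statistic; assumes all row/column totals are non-zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

def p_value_2x2(stat):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

# Hypothetical counts of class pairs.
# Rows: coupled / not coupled according to VSM (full corpora).
# Columns: coupled / not coupled according to an identifier-based technique.
table = [[30, 10],
         [10, 30]]

stat = chi_squared_statistic(table)
print(stat, p_value_2x2(stat))
```

A small p-value would indicate that the two techniques' verdicts are not independent, that is, they tend to agree on which pairs are coupled.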
Results: class corpora or identifiers?
Class corpora and identifiers are related: if one shows semantic coupling, so does the other
– Identifier-based techniques are much more effective
– N-gram more efficient than Disco
Take-away messages
Very similar classes (highly semantically coupled) do not co-change more often
Semantically linked classes are very likely to co-evolve
Using identifiers instead of full corpora is an efficient and effective way of measuring semantic coupling
Work shared at https://goo.gl/eLuDbB
Thank you
