Linking E-Mails and Source Code Artifacts

Slides of the presentation given at ICSE 2010 (http://www.sbs.co.za/ICSE2010/) on the paper (http://www.inf.usi.ch/faculty/lanza/Downloads/Bacc2010b.pdf).


Linking E-Mails and Source Code Artifacts: Presentation Transcript

  • Linking E-Mails and Source Code Artifacts. Alberto Bacchelli, Michele Lanza (REVEAL @ Faculty of Informatics, University of Lugano); Romain Robbes (PLEIAD @ DCC, University of Chile)
  • E-mails are precious for software engineering
  • E-mails are the “bread and butter of project communication” - Karl Fogel, creator of the Subversion project [Chart: number of e-mails over time, Jun-95 to Mar-10; count axis from 0 to 16,000]
  • E-mails are widely used and highly effective [Chart: effectiveness (0 to 6) versus frequency of usage (0% to 30%) for Unplanned Meetings, Planned Meetings, External Documents, Internal Documents, Web, E-mails, IM, Bug Database, Phone, Other]. Source: Maintaining mental models: a study of developer work habits, LaToza, Venolia, DeLine [ICSE 2006]
  • E-mails are people-centric information used to exchange knowledge
  • Recovering Traceability Links - State of the Art: Probabilistic Model and Vector Space Model (Antoniol, Canfora, Casazza, De Lucia, Merlo, TSE 2002); Latent Semantic Indexing (Marcus and Maletic, ICSE 2003)
  • Recovering Traceability Links Vector Space Model Latent Semantic Indexing
  • Without robust, well-designed time-tested, and, eventually well-established and accepted benchmarks, research on application of IR methods to problems in Software Engineering will not reach its full potential. - Alex Dekhtyar and Jane Huffman Hayes, ICSM 2006
  • Without benchmarks, Software Engineering will not reach its full potential.
  • Benchmarking the Link System

    System    Language      Releases  Entities  E-Mails
    ArgoUML   Java                11    18,252      355
    Augeas    C                   17     8,042      281
    Away3D    ActionScript         9     2,351      370
    Freenet   Java                30    37,878      379
    Habari    PHP5                12     1,105      374
    JMeter    Java                20    11,105      380
  • The Miler Web Application
  • The Miler Web Application release history
  • Benchmarking the Link System (same table as above): http://miler.inf.usi.ch
  • Vector Space Model: term-by-document matrix of raw term counts (terms t1..tC, documents D1..DN)

           D1    D2    D3    ...   DN
    t1     0     1     2           0
    t2     0     0     1           4
    ...
    tC     1     2     0           0

  • Vector Space Model: the documents are e-mails (E1..EN); each cell holds a term frequency

           E1    E2    E3    ...   EN
    t1     0     0.3   0.3         0
    t2     0     0     0.01        0.5
    ...
    tC     0.02  0.4   0           0

  • Vector Space Model: weights after applying inverse document frequency, with a query column Q

           E1    E2    E3    ...   EN    Q
    t1     0     0.01  0.01        0     0
    t2     0     0     0.01        0.5   0.2
    ...
    tC     0.02  0.4   0           0     0.01

  • Vector Space Model: e-mails returned for Q: E1, E3, E7
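The matrix slides above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the authors' implementation: the slides name term frequency and inverse document frequency but give no exact weighting, so the normalised tf, the logarithmic idf, the cosine similarity, and every function name here are assumed.

```python
import math
from collections import Counter


def tf_idf_vectors(documents):
    """Build one tf-idf vector (a dict) per tokenised document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    idf = {term: math.log(n_docs / df) for term, df in doc_freq.items()}
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in counts.items()})
    return vectors, idf


def weight_query(tokens, idf):
    """Weight the query with the same idf values; unknown terms are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values())
    return {t: (c / total) * idf[t] for t, c in counts.items()} if total else {}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_tokens, emails, threshold):
    """Return (e-mail index, score) pairs at or above threshold, best first."""
    vectors, idf = tf_idf_vectors(emails)
    query = weight_query(query_tokens, idf)
    scored = [(i, cosine(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(i, s) for i, s in scored if s >= threshold]


# Hypothetical, already tokenised e-mails.
emails = [
    ["generator2", "deprecated", "codegenerator"],
    ["release", "notes"],
    ["codegenerator", "module", "release"],
]
print(retrieve(["codegenerator"], emails, threshold=0.1))
```

The threshold plays the role of the cut-off explored in the plots that follow: raising it returns fewer e-mails per query.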
  • VSM on JMeter - Choosing query type and threshold [Plot: F-measure (0 to 0.4) versus threshold (0.01 to 0.91) for three query types: entire content, classname&package, classname]
  • VSM on JMeter - Best configuration results [Plot: precision, recall, and f-measure (0 to 1.0) versus threshold (0.01 to 0.91)]
  • VSM - Best configuration results [Plot: F-measure (0 to 0.4) versus threshold (0.01 to 0.91) for ArgoUML, Freenet, JMeter, Away3D, Habari, Augeas]
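The threshold plots can be read as a sweep like the one sketched below: at each threshold, the e-mails scoring at or above it are compared with the benchmark links. The function names, scores, and benchmark set are hypothetical, and F is assumed to be the usual harmonic mean of precision and recall.

```python
def precision_recall_f(retrieved, relevant):
    """Precision, recall, and F-measure of a retrieved set against a benchmark set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def sweep_thresholds(scores, relevant, thresholds):
    """Evaluate every threshold; scores maps an e-mail id to its similarity."""
    rows = []
    for threshold in thresholds:
        retrieved = [mail for mail, score in scores.items() if score >= threshold]
        rows.append((threshold, *precision_recall_f(retrieved, relevant)))
    return rows


# Thresholds as on the plot axes: 0.01, 0.11, ..., 0.91.
thresholds = [round(0.01 + 0.1 * i, 2) for i in range(10)]

# Hypothetical similarity scores and benchmark links for one source entity.
scores = {"e1": 0.9, "e2": 0.5, "e3": 0.2, "e4": 0.05}
relevant = {"e1", "e3"}

rows = sweep_thresholds(scores, relevant, thresholds)
best = max(rows, key=lambda row: row[3])  # row with the highest F-measure
print(best)
```

A low threshold favours recall and a high one favours precision, which is why the F-measure is used to pick the configuration.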
  • Latent Semantic Indexing ‣ Synonymy: NSUML = NSUMLModelFacade ‣ Polysemy: dialog = Dialog
  • Latent Semantic Indexing: Singular Value Decomposition reduces the term-by-e-mail matrix (terms t1..tC) to a topic-by-e-mail matrix (topics tpc1..tpcK)

    Before                          After
           E1   E2   ...  EN               E1    E2    ...  EN
    t1     0    1         0         tpc1   0     0.02       0
    t2     0    0         4         tpc2   0     0          0.4
    ...                             ...
    tC     1    2         0         tpcK   0.1   0.2        0
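The synonymy example can be made concrete with a toy sketch. It assumes three terms and three e-mails, and approximates the leading singular vectors by power iteration with deflation so that it needs only the standard library; a real implementation would use a numerical library's SVD routine. All names and data are hypothetical, and this is not the authors' implementation.

```python
import math


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def mat_vec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    norm_u, norm_v = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (norm_u * norm_v) if norm_u and norm_v else 0.0


def leading_topics(term_doc, k, iterations=200):
    """Approximate the k leading left singular vectors of term_doc.

    Power iteration on A * A^T with deflation; assumes rank(A) >= k.
    """
    doc_term = transpose(term_doc)
    topics = []
    for i in range(k):
        v = [1.0 / (j + i + 1) for j in range(len(term_doc))]  # fixed start vector
        for _ in range(iterations):
            w = mat_vec(term_doc, mat_vec(doc_term, v))  # w = A A^T v
            for u in topics:  # remove the topics already found
                proj = dot(w, u)
                w = [x - proj * y for x, y in zip(w, u)]
            norm = math.sqrt(dot(w, w))
            if norm == 0.0:
                raise ValueError("matrix rank is lower than k")
            v = [x / norm for x in w]
        topics.append(v)
    return topics


def to_topic_space(term_vector, topics):
    """Project a term vector onto the topic axes."""
    return [dot(term_vector, topic) for topic in topics]


# Rows: terms nsuml, nsumlmodelfacade, dialog. Columns: e-mails E1, E2, E3.
term_doc = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
emails = transpose(term_doc)      # one term vector per e-mail
query = [1, 0, 0]                 # the query mentions only "nsuml"
topics = leading_topics(term_doc, k=2)

raw_e2 = cosine(query, emails[1])  # E2 contains only "nsumlmodelfacade"
lsi_e2 = cosine(to_topic_space(query, topics), to_topic_space(emails[1], topics))
lsi_e3 = cosine(to_topic_space(query, topics), to_topic_space(emails[2], topics))
print(raw_e2, lsi_e2, lsi_e3)
```

In the raw term space the query and E2 share no term, so their similarity is zero. After the reduction to two topics they fall on the same topic axis because E1 uses both names together, which is the synonymy effect named on the slide.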
  • LSI - Choosing the number of topics and query type [Plot: F-measure (0 to 0.4) versus number of topics (10 to 350) for three query types: entire content, classname&package, classname]
  • LSI on JMeter - Best configuration results [Plot: precision, recall, and f-measure (0 to 0.8) versus threshold (0.01 to 0.91)]
  • LSI - Best configuration results [Plot: F-measure (0 to 0.60) versus threshold (0.01 to 0.91) for ArgoUML, Freenet, JMeter, Away3D, Habari, Augeas]
  • What replaces PluggableImport and Generator2? (and other language module questions)
    Tom Morris tfmo...@gmail.com, September 23, 2006 - 13:12:51

    We're trying to implement support in ArgoEclipse for reverse engineering which means that we need to deal with the PluggableImport interface. It doesn't really make sense to modify that interface because it is deprecated, but I can't figure out what replaces it. The comments say to register with org.argouml.uml.reveng.Import but that class has no registration method. Additionally, it itself depends on the deprecated PluggableImport interface.

    On the code generation side of things, Generator2 has been deprecated in favor of CodeGenerator, but they don't appear to have equivalent functionality, so I don't understand how this is meant to work.

    Are there examples of modules which have been converted to the new structure? Is there a design discussion somewhere which describes how to convert old style modules to new style modules?

    Who's working on this stuff? I'm happy to help if I can get an idea of what the design direction is.

    Tom
  • Text Matching: is the entity name a dictionary word? If no, match the name (case sensitive); if yes, match with a regular expression
  • Text Matching - Regular Expression. Elements shown: a separator before the name (".", "/", or space); the package, followed by "." or "/"; the Classname; after the name, either a space or "." plus a file extension (java, class, as, php, c)
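One possible reading of the regular expression slide is sketched below. The exact pattern cannot be recovered from the transcript, so the structure (separator, optional package, class name, then a space or a file extension) and the name `build_entity_pattern` are assumptions.

```python
import re

# File extensions listed on the slide.
EXTENSIONS = ("java", "class", "as", "php", "c")


def build_entity_pattern(class_name, package=None):
    """Compile a case-sensitive pattern for one source code entity."""
    # The mention must start the text or follow a space, ".", or "/".
    before = r"(?:^|[\s./])"
    if package:
        # Accept both org.example.Name and org/example/Name spellings.
        parts = [re.escape(part) for part in package.split(".")]
        before += r"[./]".join(parts) + r"[./]"
    # The name must be followed by whitespace, the end of the text,
    # or "." plus one of the known extensions.
    after = r"(?=$|\s|\.(?:" + "|".join(EXTENSIONS) + r")\b)"
    return re.compile(before + re.escape(class_name) + after)


qualified = build_entity_pattern("Import", "org.argouml.uml.reveng")
text = "The comments say to register with org.argouml.uml.reveng.Import but that class has no registration method."
print(bool(qualified.search(text)))
```

Requiring the package prefix is what keeps a short name such as Import from matching inside PluggableImport or in ordinary prose. A name directly followed by sentence punctuation is not matched by this sketch.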
  • Text Matching: dictionary word? Example: Dialog versus DialogTree
  • Text Matching: is the entity name CamelCase? If yes, match the name (case sensitive); if no, match with a regular expression
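The decision on this slide could be sketched as follows. The CamelCase test (a lowercase letter directly followed by an uppercase one) and the stricter fallback pattern (the name must carry a file extension) are simplifying assumptions, as are the function names.

```python
import re


def is_camel_case(name):
    """True for compound names such as DialogTree, False for Dialog or NSUML."""
    return re.search(r"[a-z][A-Z]", name) is not None


def mentions(email_text, class_name):
    """Decide whether an e-mail refers to the given class."""
    escaped = re.escape(class_name)
    if is_camel_case(class_name):
        # Compound names rarely occur by accident: a plain
        # case-sensitive whole-word match is assumed to be enough.
        pattern = r"\b" + escaped + r"\b"
    else:
        # Names that may be ordinary words need stronger evidence;
        # here the name must be followed by a source file extension.
        pattern = r"(?:^|[\s./])" + escaped + r"\.(?:java|class|as|php|c)\b"
    return re.search(pattern, email_text) is not None


print(mentions("we need to deal with the PluggableImport interface", "PluggableImport"))
print(mentions("a Dialog pops up when saving", "Dialog"))
```

The split reflects the Dialog versus DialogTree example: the first can be an ordinary word in an e-mail, while the second almost certainly names a class.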
  • Text Matching: results [Scatter plot: precision (P) versus recall (R), both axes 0 to 0.8]

    System    Language      Precision  Recall  F
    ArgoUML   Java               0.61    0.64  0.63
    Freenet   Java               0.59    0.59  0.59
    JMeter    Java               0.59    0.65  0.62
    Away3D    ActionScript       0.41    0.72  0.52
    Habari    PHP5               0.49    0.38  0.43
    Augeas    C                  0.15    0.64  0.24
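For reference, the F column is consistent with the standard F-measure, the harmonic mean of precision P and recall R. The slides do not spell out the formula, so this is an assumption; the two worked rows use the values from the table.

```latex
F = \frac{2 \, P \, R}{P + R}

% Freenet: P = 0.59, R = 0.59
F = \frac{2 \cdot 0.59 \cdot 0.59}{0.59 + 0.59} = 0.59

% Augeas: P = 0.15, R = 0.64
F = \frac{2 \cdot 0.15 \cdot 0.64}{0.15 + 0.64} = \frac{0.192}{0.79} \approx 0.24
```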
  • Overall results: Freenet [Plot: precision versus recall, both axes 0 to 0.8, for LSI, VSM, and Text Matching]
  • Overall results [Six panels, one per system (ArgoUML, Freenet, JMeter, Away3D, Habari, Augeas): precision versus recall, both axes 0 to 0.8, for LSI, VSM, and Text Matching]