5. If we could automatically identify
cross-language relations we could:
• Recognize them
So I am aware that this ID is
related to something else
• Support refactoring
If I change one, the others are
updated
• Validate them
See broken relations as errors
• Navigate them
Click to see the other side of
the relation
14. Context of a node:
all the descendants
+
the siblings and their descendants
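The context definition above can be sketched in a few lines. This is only an illustration under assumptions: the `Node` class, the `add` helper and the toy tree are hypothetical and not taken from the authors' tool, which works on real parse trees.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    """Hypothetical tree node: a value plus child nodes."""
    value: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)

    def add(self, child: "Node") -> "Node":
        """Attach `child` under this node and return it."""
        child.parent = self
        self.children.append(child)
        return child


def descendants(node: Node) -> Iterator[Node]:
    """Yield every node strictly below `node`, depth first."""
    for child in node.children:
        yield child
        yield from descendants(child)


def context(node: Node) -> List[Node]:
    """Descendants of `node`, plus its siblings and their descendants."""
    result = list(descendants(node))
    if node.parent is not None:
        for sibling in node.parent.children:
            if sibling is not node:
                result.append(sibling)
                result.extend(descendants(sibling))
    return result


# Toy tree (made up): form -> [input -> [id=user], label -> [for=user]]
root = Node("form")
inp = root.add(Node("input"))
inp.add(Node("id=user"))
lab = root.add(Node("label"))
lab.add(Node("for=user"))

context_values = [n.value for n in context(inp)]
root_context_values = [n.value for n in context(root)]
```

For the `input` node the context holds its own child plus the sibling `label` subtree. The root has no siblings, so its context is just its descendants.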
16. How to compare contexts:
1) Take all the values in the context (IDs, strings, numbers)
+
2) Employ different metrics
Some metrics we use:
• Number of shared values
• Min and max number of different values
• Tversky Index
$TV(X, Y) = \frac{|X \cap Y|}{|X \cap Y| + \alpha |X - Y| + \beta |Y - X|}$
• Jaro, Jaccard, tf-idf and others
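As a rough sketch of how two of these metrics could be computed on the values of two contexts, treated as sets. The function names, the default weights and the example values are assumptions made for illustration. Returning 0.0 for two empty contexts is also just a convention chosen here.

```python
from typing import Hashable, Set


def tversky(x: Set[Hashable], y: Set[Hashable],
            alpha: float = 0.5, beta: float = 0.5) -> float:
    """Tversky index: |X∩Y| / (|X∩Y| + alpha*|X-Y| + beta*|Y-X|)."""
    shared = len(x & y)
    denominator = shared + alpha * len(x - y) + beta * len(y - x)
    # Assumed convention: two empty contexts have similarity 0.0.
    return shared / denominator if denominator else 0.0


def jaccard(x: Set[Hashable], y: Set[Hashable]) -> float:
    """Jaccard index, which is the Tversky index with alpha = beta = 1."""
    return tversky(x, y, alpha=1.0, beta=1.0)


# Made-up values extracted from two contexts.
ctx_a = {"user", "login", "42"}
ctx_b = {"user", "login", "submit", "7"}

shared_values = len(ctx_a & ctx_b)            # number of shared values
tv = tversky(ctx_a, ctx_b, alpha=0.5, beta=0.5)
jc = jaccard(ctx_a, ctx_b)
```

With different values for alpha and beta the index becomes asymmetric, which can help when one context is expected to be much larger than the other.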
17. How to combine those metrics:
A Random Tree tells us
We built a golden set of 1200 candidate relations
(around 140 real relations; the others merely share an ID)
We train the Random Tree on the golden set
The Random Tree finds the best way to combine those
metrics to decide whether a pair is related or not
Output of the Random Tree
A rule to decide whether two nodes with the same ID are
connected
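To give a feel for what "learning a rule from a golden set" means, here is a deliberately simplified sketch. It is not the Random Tree used by the authors: it learns a one-level decision tree (a decision stump), that is, a single `metric >= threshold` test. The metric names and the five labelled pairs are invented for illustration and do not come from the real golden set.

```python
from typing import Dict, List, Tuple

# (metric values for a candidate pair, True if it is a real relation)
Sample = Tuple[Dict[str, float], bool]


def learn_stump(samples: List[Sample]) -> Tuple[str, float, int]:
    """Return (metric, threshold, correct) for the best rule `metric >= threshold`."""
    best: Tuple[str, float, int] = ("", 0.0, -1)
    for metric in sorted(samples[0][0]):
        # Try every observed value of the metric as a threshold.
        for threshold in sorted({features[metric] for features, _ in samples}):
            correct = sum(
                (features[metric] >= threshold) == label
                for features, label in samples
            )
            if correct > best[2]:
                best = (metric, threshold, correct)
    return best


# Made-up toy data, NOT the authors' golden set.
toy_golden_set: List[Sample] = [
    ({"shared_values": 3, "tversky": 0.80}, True),
    ({"shared_values": 2, "tversky": 0.60}, True),
    ({"shared_values": 2, "tversky": 0.20}, False),
    ({"shared_values": 0, "tversky": 0.00}, False),
    ({"shared_values": 1, "tversky": 0.10}, False),
]

metric, threshold, correct = learn_stump(toy_golden_set)


def is_related(features: Dict[str, float]) -> bool:
    """Apply the learned rule to the metrics of a new candidate pair."""
    return features[metric] >= threshold
```

A real tree learner nests several such tests, so the final rule can combine more than one metric.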
19. What we have
• A tool that automatically spots cross-language relations
with precision and recall > 90% (on a first in-house
dataset)
What now?
• We want to build a larger golden set
• We want to integrate support in editors
Code available at:
https://github.com/orgs/CrossLanguageProject
20. Spotting Automatically
Cross-Language Relations
Federico Tomassetti, Giuseppe Rizzo, Marco Torchiano
CSMR 2014, Antwerpen, Belgium
Preprint at:
http://www.di.unito.it/~rizzo/publications/Tomassetti_Rizzo-CSMRWCRE2014.pdf
www.slideshare.net/FTomassetti