"A Tree Kernel based approach for clone detection"
Slides presented at the International Conference of Software Maintenance (ICSM) 2010 - TImsoara, Romania
Principles of functional progrmming in scalaehsoon
a short outline on necessity of functional programming and principles of functional programming in Scala.
In the article some keyword are used but not explained (to keep the article short and simple), the interested reader can look them up in internet.
Classification using Apache SystemML by Prithviraj SenArvind Surve
This deck will cover various algorithms at high level. Those algorithms include "Supervised Learning and Classification", "Training Discriminative Classifiers", "Representer Theorem", "Support Vector Machines", "Logistic Regression", "Generative Classifiers: Naive Bayes", "Deep Learning" and "Tree Ensembles"
Principles of functional progrmming in scalaehsoon
a short outline on necessity of functional programming and principles of functional programming in Scala.
In the article some keyword are used but not explained (to keep the article short and simple), the interested reader can look them up in internet.
Classification using Apache SystemML by Prithviraj SenArvind Surve
This deck will cover various algorithms at high level. Those algorithms include "Supervised Learning and Classification", "Training Discriminative Classifiers", "Representer Theorem", "Support Vector Machines", "Logistic Regression", "Generative Classifiers: Naive Bayes", "Deep Learning" and "Tree Ensembles"
This presentation describes key concepts in Java. I call it The Java Quicky.
This is part of a series of presentations to cover the Java programming language and its new offerings and versions in depth.
This is my attempt to compose a brief and cursory introduction to concepts in Java programming language. I call it Java Quicky.
I plan to extend and enhance it over time.
Java Serialization is often considered a dark art of Java programmers. This session will lift the veil and show what serialization is and isn't, how you can use it for profit and evil. After this session no NotSerializableException will be unconquerable.
In this core java training session, you will learn Collections. Topics covered in this session are:
• Recap of Arrays
• Introduction to Collections API
• Lists – ArrayList, Vector, LinkedList
For more information about this course visit on this link: https://www.mindsmapped.com/courses/software-development/learn-java-fundamentals-hands-on-training-on-core-java-concepts/
"Number Crunching in Python": slides presented at EuroPython 2012, Florence, Italy
Slides have been authored by me and by Dr. Enrico Franchi.
Scientific and Engineering Computing, Numpy NDArray implementation and some working case studies are reported.
Web frameworks: web development done right.
Lecture of Web Technologies, University of Naples Federico II
Main topics:
- Evolution of Web Technolgies
- Web frameworks and Design Principles
- Django and Google App engine: web frameworks in Python
Machine Learning for Software MaintainabilityValerio Maggio
"Machine Learning for Software Maintainability":
Slides presented at the Joint Workshop on Intelligent Methods for Software System Engineering (JIMSE) 2012
This presentation describes key concepts in Java. I call it The Java Quicky.
This is part of a series of presentations to cover the Java programming language and its new offerings and versions in depth.
This is my attempt to compose a brief and cursory introduction to concepts in Java programming language. I call it Java Quicky.
I plan to extend and enhance it over time.
Java Serialization is often considered a dark art of Java programmers. This session will lift the veil and show what serialization is and isn't, how you can use it for profit and evil. After this session no NotSerializableException will be unconquerable.
In this core java training session, you will learn Collections. Topics covered in this session are:
• Recap of Arrays
• Introduction to Collections API
• Lists – ArrayList, Vector, LinkedList
For more information about this course visit on this link: https://www.mindsmapped.com/courses/software-development/learn-java-fundamentals-hands-on-training-on-core-java-concepts/
"Number Crunching in Python": slides presented at EuroPython 2012, Florence, Italy
Slides have been authored by me and by Dr. Enrico Franchi.
Scientific and Engineering Computing, Numpy NDArray implementation and some working case studies are reported.
Web frameworks: web development done right.
Lecture of Web Technologies, University of Naples Federico II
Main topics:
- Evolution of Web Technolgies
- Web frameworks and Design Principles
- Django and Google App engine: web frameworks in Python
Machine Learning for Software MaintainabilityValerio Maggio
"Machine Learning for Software Maintainability":
Slides presented at the Joint Workshop on Intelligent Methods for Software System Engineering (JIMSE) 2012
Lecture of Software Engineering II
University of Naples Federico II
Main Topics:
- What is TDD
- TDD and XP
- TDD Mantra
- TDD Principles and Practices
LINSEN an efficient approach to split identifiers and expand abbreviationsValerio Maggio
"Linsen an efficient approach to split identifiers and expand abbreviations"
Slides presented at the International Conference of Software Maintenance (ICSM) 2012, Riva del Garda (TN), Italy
Unsupervised Machine Learning for clone detectionValerio Maggio
"Unsupervised Machine Learning for clone detection" highlights the main topics of using Unsupervised Machine Learning techniques (Kernel methods and data clustering) for the code clones detection task.
Software Engineering Class on (main) Scaffolding techniques using JMock Java Framework.
Slides report some working examples and highlight differences among Mock Objects, Test Double and Test Stubs.
Lecture of Software Engineering II - University of Naples Federico II Main Topics:
- Testing Taxonimes
- Unit Testing with JUnit 4.x
- Testing Scaffolding and Mocking
- JMock
Improving Software Maintenance using Unsupervised Machine Learning techniquesValerio Maggio
"Improving Software Maintenance using Unsupervised Machine Learning techniques": Ph.D. defence presentation.
Unsupervised Machine Learning techniques have been used to face different software maintenance issues such as Software Modularisation and Clone detection.
Refactoring: Improve the design of existing codeValerio Maggio
Refactoring: Improve the design of existing code
Software Engineering class on main refactoring techniques and bad smells reported in the famous Fawler's book on this topic!
Multi-Armed Bandits: Intro, examples and tricksIlias Flaounas
In this talk Ilias will discuss some variations of the Multi-Armed Bandits (MABs), a less popular although important area of Machine Learning. MABs enable us to build adaptive systems capable of finding solutions for tasks based on the interactions with their environment. MABs solve a task by acquiring useful knowledge at every step of an iterative process while they balance the exploration-exploitation dilemma. They are used to tackle practical problems like selecting appropriate online ads and personalized content for presentation to users; assigning people to cohorts in controlled trials; supporting decision making and more. To solve these kinds of problems solutions need to be identified as fast as possible since accepting errors can be costly. Ilias will discuss some examples from industry and academia as well as some of the related work at Atlassian.
very useful for cluster analysis. supportive for engineering student as well as it students. also provide example for every topic helps in numerical problems. good material for reading.
- POSTECH EECE695J, "딥러닝 기초 및 철강공정에의 활용", 2017-11-10
- Contents: introduction to reccurent neural networks, LSTM, variants of RNN, implementation of RNN, case studies
- Video: https://youtu.be/pgqiEPb4pV8
Deep dive into the mathematics and algorithms of neural nets. Covers the sigmoid activation function, cross-entropy loss function, gradient descent and the derivatives used in back propagation.
Table of contents
Download R script from my Google drive:
20190828_R-user-group_string-manipulation.R
What is it like to manipulate string?
What are special characters?
How to specify a pattern?
Scenarios that you will handle string
● Manipulating output from a R object
● Subsetting files through their names or paths
● Subsetting groups
Summary
Siggraph2016 - The Devil is in the Details: idTech 666Tiago Sousa
A behind-the-scenes look into the latest renderer technology powering the critically acclaimed DOOM. The lecture will cover how technology was designed for balancing a good visual quality and performance ratio. Numerous topics will be covered, among them details about the lighting solution, techniques for decoupling costs frequency and GCN specific approaches.
A Comparative Analysis of Feature Selection Methods for Clustering DNA SequencesCSCJournals
Large-scale analysis of genome sequences is in progress around the world, the major application of which is to establish the evolutionary relationship among the species using phylogenetic trees. Hierarchical agglomerative algorithms can be used to generate such phylogenetic trees given the distance matrix representing the dissimilarity among the species. ClustalW and Muscle are two general purpose programs that generates distance matrix from the input DNA or protein sequences. The limitation of these programs is that they are based on Smith-Waterman algorithm which uses dynamic programming for doing the pair-wise alignment. This is an extremely time consuming process and the existing systems may even fail to work for larger input data set. To overcome this limitation, we have used the frequency of codons usage as an approximation to find dissimilarity among species. The proposed technique further reduces the complexity by extracting only the significant features of the species from the mtDNA sequences using the techniques like frequent codons, codons with maximum range value or PCA technique. We have observed that the proposed system produces nearly accurate results in a significantly reduced running time.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™UiPathCommunity
In questo evento online gratuito, organizzato dalla Community Italiana di UiPath, potrai esplorare le nuove funzionalità di Autopilot, il tool che integra l'Intelligenza Artificiale nei processi di sviluppo e utilizzo delle Automazioni.
📕 Vedremo insieme alcuni esempi dell'utilizzo di Autopilot in diversi tool della Suite UiPath:
Autopilot per Studio Web
Autopilot per Studio
Autopilot per Apps
Clipboard AI
GenAI applicata alla Document Understanding
👨🏫👨💻 Speakers:
Stefano Negro, UiPath MVPx3, RPA Tech Lead @ BSP Consultant
Flavio Martinelli, UiPath MVP 2023, Technical Account Manager @UiPath
Andrei Tasca, RPA Solutions Team Lead @NTT Data
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Generative AI Deep Dive: Advancing from Proof of Concept to Production
A Tree Kernel based approach for clone detection
1. A Tree Kernel Based
Approach for
Clone Detection
1) University of Naples Federico II
2) University of Basilicata
Anna Corazza1
, Sergio Di Martino1
,
Valerio Maggio1
, Giuseppe Scanniello2
2. Outline
►Background
○ Clone detection definition
○ State of the Art Techniques Taxonomy
►Our Abstract Syntax Tree based Proposal
○ A Tree Kernel based approach for clone detection
►A preliminary evaluation
3. Code Clones
► Two code fragments form a clone if they are similar enough
according to a given measure of similarity (I.D. Baxter, 1998)
3. R. Tiarks, R. Koschke, and R. Falke,
An assessment of type-3 clones as detected by state-of-the-art tools
1
4. Code Clones
► Two code fragments form a clone if they are similar enough
according to a given measure of similarity (I.D. Baxter, 1998)
► Similarity based on Program Text or on “Semantics”
3. R. Tiarks, R. Koschke, and R. Falke,
An assessment of type-3 clones as detected by state-of-the-art tools
1
5. Code Clones
► Two code fragments form a clone if they are similar enough
according to a given measure of similarity (I.D. Baxter, 1998)
► Similarity based on Program Text or on “Semantics”
► Program Text can be further distinguished by their degree of similarity1
○ Type 1 Clone: Exact Copy
○ Type 2 Clone: Parameter Substituted Clone
○ Type 3 Clone: Modified/Structure Substituted Clone
1. R. Tiarks, R. Koschke, and R. Falke,
An assessment of type-3 clones as detected by state-of-the-art tools
1
6. State of the Art Techniques
► Classified in terms of Program Text representation2
○ String, token, syntax tree, control structures, metric vectors
► String/Token based Techniques
► Abstract Syntax Tree (AST) Techniques
► ...
2
2. Roy, Cordy, Koschke Comparison and Evaluation of Clone Detection Tools and Technique 2009
7. State of the Art Techniques
► String/Token based Techniques
► Abstract Syntax Tree (AST) Techniques
► ...
► Combined Techniques (a.k.a. Hybrid)
○Combine different representations
○Combine different techniques
○Combine different sources of information
●Tree Kernel based approach (Our approach :)
2
9. The Goal
► Define an AST based technique able to detect up to Type 3
Clones
3
10. The Goal
► Define an AST based technique able to detect up to Type 3
Clones
► The Key Ideas:
○ Improve the amount of information carried by ASTs by adding (also)
lexical information
○ Define a proper measure to compute similarities among (sub)trees,
exploiting such information
3
11. The Goal
► Define an AST based technique able to detect up to Type 3
Clones
► The Key Ideas:
○ Improve the amount of information carried by ASTs by adding (also)
lexical information
○ Define a proper measure to compute similarities among (sub)trees,
exploiting such information
► As a measure we propose the use of a
(Tree) Kernel Function
3
12. Kernels for Structured Data
► Kernels are a class of functions with many appealing features:
○ Are based on the idea that a complex object can be described in terms of
its constituent parts
○ Can be easily tailored to a specific domain
► There exist different classes of Kernels:
○ String Kernels
○ Graph Kernels
○ …
○ Tree Kernels
● Applied to NLP Parse Trees (Collins and Duffy 2004)
4
13. Defining a new Tree Kernel
► The definition of a new Tree Kernel requires the
specification of:
(1) A set of features to annotate nodes of
compared trees
5
14. Defining a new Tree Kernel
► The definition of a new Tree Kernel requires the
specification of:
(1) A set of features to annotate nodes of compared
trees
(2) A (primitive) Kernel Function to measure the
similarity of each pair of nodes
5
15. Defining a new Tree Kernel
► The definition of a new Tree Kernel requires the
specification of:
(1) A set of features to annotate nodes of compared
trees
(2) A (primitive) Kernel Function to measure the
similarity of each pair of nodes
(3) A proper Kernel Function to compare subparts of
trees
5
16. (1) The defined features
► We annotate each node of AST by 4 features:
6
17. (1) The defined features
► We annotate each node of AST by 4 features:
○ Instruction Class
● i.e. LOOP, CONDITIONAL CONTROL, CONTROL FLOW
CONTROL,...
6
18. (1) The defined features
► We annotate each node of AST by 4 features:
○ Instruction Class
● i.e. LOOP, CONDITIONAL CONTROL, CONTROL FLOW
CONTROL,...
○ Instruction
● i.e. FOR, WHILE, IF, RETURN, CONTINUE,...
6
19. (1) The defined features
► We annotate each node of AST by 4 features:
○ Instruction Class
● i.e. LOOP, CONDITIONAL CONTROL, CONTROL FLOW CONTROL,...
○ Instruction
● i.e. FOR, WHILE, IF, RETURN, CONTINUE,...
○ Context
● Instruction class of statement in which node is
enclosed
6
20. (1) The defined features
► We annotate each node of AST by 4 features:
○ Instruction Class
● i.e. LOOP, CONDITIONAL CONTROL, CONTROL FLOW CONTROL,...
○ Instruction
● i.e. FOR, WHILE, IF, RETURN, CONTINUE,...
○ Context
● Instruction class of statement in which node is enclosed
○ Lexemes
● Lexical information within the code
6
21. Context Feature
► Rationale: two nodes are more similar if they appear in the same
Instruction class
for (int i=0; i<10; i++)
x += i+2;
if (i<10)
x += i+2;
while (i<10)
x += i+2;
7
22. Context Feature
► Rationale: two nodes are more similar if they appear in the same
Instruction class
for (int i=0; i<10; i++)
x += i+2;
if (i<10)
x += i+2;
while (i<10)
x += i+2;
7
23. Context Feature
► Rationale: two nodes are more similar if they appear in the same
Instruction class
for (int i=0; i<10; i++)
x += i+2;
if (i<10)
x += i+2;
while (i<10)
x += i+2;
7
24. Context Feature
► Rationale: two nodes are more similar if they appear in the same
Instruction class
for (int i=0; i<10; i++)
x += i+2;
if (i<10)
x += i+2;
while (i<10)
x += i+2;
7
25. Context Feature
► Rationale: two nodes are more similar if they appear in the same
Instruction class
for (int i=0; i<10; i++)
x += i+2;
if (i<10)
x += i+2;
while (i<10)
x += i+2;
7
26. Lexemes Feature
► For leaf nodes:
○ It is the lexeme associated to the node
► For internal nodes:
○ It is the set of lexemes that recursively comes from
subtrees with minimum height
8
33. (2) Applying features in a Kernel
We exploits these features to compute similarity among pairs of
nodes, as follows:
► Instruction Class filters comparable nodes
○ We compare only nodes with the same Instruction Class
► Instruction, Context and Lexemes are used to define a value of
similarity between compared nodes
10
34. (Primitive) Kernel Function between nodes
1.0 If two nodes have the same values of
features
0.8 If two nodes differ in lexemes
(same instruction and context)
0.7 If two nodes share lexemes and are
the same instruction
0.5 If two nodes share lexemes and are
enclosed in the same context
0.25 If two nodes have at least one feature
in common
0.0 no match
s(n1,n2)=
11
35. (3) Tree Kernel: Kernel on entire Tree Structures
►We apply nodes comparison recursively to compute
similarity between subtrees
►We aim to identify the maximum isomorphic
tree/subtree
12
38. Evaluation Description
► We considered a small Java software system
○ We choose to identify clones at method level
► We checked system against the presence of up to Type 3 clones
○ Removed all detected clones through refactoring operations
► We manually and randomly injected a set of artificially created clones
○ One set for each type of clones
► We applied our prototype and CloneDigger* to mutated systems
► We evaluated performances in terms of Precision, Recall and F1
*http://clonedigger.sourceforge.net/
14
39. Results (1)
► Type 1 and Type 2 Clones:
○ We were able to detect all clones without any false
positive
○ This was obtained also by CloneDigger
○ Both tools expressed the potential of AST-based
approaches
15
40. Results (2)
► Type 3 clones:
○ We classified results as “true Type 3 clones” according to
different thresholds on similarity values
○ We measured performance on different thresholds
We get best results with
threshold equals to 0.70
16
41. Conclusions and Future Works
► Measure performance on real systems and projects
○ Bellon's Benchmark
○ Investigate best results with 0.7 as threshold
○ Measure Time Performances
► Improve the scalability of the approach
○ Avoid to compare all pairs
► Improve similarity computation
○ Avoid manual weighting features
► Extend Supported Languages
○ Now we support Java, C, Python
17