This document presents a method for detecting when client code imitates library APIs, that is, re-implements behavior that a library already provides. It represents code as data dependency graphs (DDGs) and defines the concepts of traces, subtraces, and trace subsumption. The algorithm generates DDGs for the library and client methods under different method-call inlinings and reports a potential imitation if the client DDG trace-subsumes the library DDG. The method was evaluated on 10 open-source Java projects, achieving 82% precision for imported libraries and 75% precision for static libraries.
Components - Graph Based Detection of Library API Imitations
1. Graph-based Detection of Library API Imitations
Chengnian Sun, Siau-Cheng Khoo, Shao Jie Zhang
National University of Singapore
1 October 6, 2011
2. Motivation – Software Libraries
Common practice to employ 3rd-party software libraries
Providing certain functionalities / hiding implementation details
Improving productivity
Well tested
Enhancing program quality
Application Programming Interfaces (APIs)
Exported by libraries
Ways for programmers to interact with libraries
3. Motivation – Problem
APIs are not always effectively used by programmers
Imitation: client code re-implements the behavior of library APIs
Reasons
Unfamiliarity with the library
Library evolution
Cost
Wasted resources, time, and energy
Error-prone code and software maintenance issues
6. Motivation – Example from JBoss
Imitation (1): method.getInterceptors() == null || method.getInterceptors().length < 1
API: return (interceptors != null && interceptors.length > 0)
7. Motivation – Example from JBoss
Imitation (1): method.getInterceptors() == null || method.getInterceptors().length < 1
Refactor to: !method.hasAdvices()
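The equivalence between the imitation and the refactored call can be checked with a small sketch. The class `MethodInfo` below is a hypothetical stand-in, not the actual JBoss class; only `getInterceptors()`, `hasAdvices()`, and the two expressions quoted on the slides come from the source.

```java
final class ImitationExample {
    // Hypothetical stand-in for the JBoss type; only the members named on the slides are modelled.
    static final class MethodInfo {
        private final Object[] interceptors;

        MethodInfo(Object[] interceptors) {
            this.interceptors = interceptors;
        }

        Object[] getInterceptors() {
            return interceptors;
        }

        // Library API body as quoted on the slide.
        boolean hasAdvices() {
            return (interceptors != null && interceptors.length > 0);
        }
    }

    // Client-side imitation as quoted on the slide.
    static boolean imitation(MethodInfo method) {
        return method.getInterceptors() == null || method.getInterceptors().length < 1;
    }

    // Refactored form suggested on the slide.
    static boolean refactored(MethodInfo method) {
        return !method.hasAdvices();
    }
}
```

For a null array, an empty array, and a non-empty array, `imitation` and `refactored` return the same value, which is why the client expression can be replaced by the API call.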
10. Motivation
A library API imitation can be
Not exactly the same
Inter-procedural
Goal: to accurately detect such imitations
11. Detection of Library API Imitations
Motivation
Definitions
Data Dependency Graph
Trace & Subtrace
Trace Subsumption
Potential Imitation
Algorithms
Pre- & Post-processing
Case Studies
Conclusion
12. Definitions – Overview
Employing Data Dependency Graphs (DDGs) to represent code
Semantic representation
Capturing data flows within a method
Carrying a portion of control flow information
A library DDG is trace-subsumed by a client DDG ⇒ potential API imitation
Trace subsumption is a relaxation of sub-graph isomorphism
More efficient
Tolerant of minor differences
13. Definitions – Data Dependency Graph
DDG – a graphical representation of a method
Vertices: basic statements (three-address form)
Edges v → u: direction represents data dependency
Vertex u is data dependent on vertex v if there is a variable var that is
defined at v,
used at u,
and there is an execution path P from v to u along which var is not redefined.
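The dependency rule above can be illustrated for the simplest case, straight-line code, where the only execution path is the statement order. The following is a minimal sketch under that assumption; the `Stmt` type, the labels `S1` to `S4`, and the string encoding of edges are hypothetical choices made for illustration, not the representation used by the authors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DdgSketch {
    // One three-address statement: 'def' is the variable it defines, 'uses' the variables it reads.
    static final class Stmt {
        final String label;
        final String def;
        final List<String> uses;

        Stmt(String label, String def, String... uses) {
            this.label = label;
            this.def = def;
            this.uses = Arrays.asList(uses);
        }
    }

    // Returns edges "v->u", meaning statement u is data dependent on statement v.
    static List<String> edges(List<Stmt> straightLineCode) {
        Map<String, String> lastDef = new HashMap<>(); // variable -> label of its latest definition
        List<String> result = new ArrayList<>();
        for (Stmt s : straightLineCode) {
            for (String var : s.uses) {
                String definer = lastDef.get(var);
                if (definer != null) {
                    result.add(definer + "->" + s.label);
                }
            }
            lastDef.put(s.def, s.label); // a redefinition hides earlier definitions of the same variable
        }
        return result;
    }

    // S1: a = x + 1;  S2: b = a * 2;  S3: a = 5;  S4: c = a + b
    static List<String> demo() {
        return edges(Arrays.asList(
                new Stmt("S1", "a", "x"),
                new Stmt("S2", "b", "a"),
                new Stmt("S3", "a"),
                new Stmt("S4", "c", "a", "b")));
    }
}
```

In the demo there is no edge from S1 to S4, because `a` is redefined at S3 before S4 uses it. Code with branches or loops would need a reaching-definitions analysis over the control flow graph instead of this single pass.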
16. Definitions – Trace & Subtrace
A trace in a data dependency graph
A path of vertices, <v_1, v_2, …, v_m>
The first vertex is an entry of the graph
Given two traces T1 = <v_1, v_2, …, v_m> and T2 = <u_1, u_2, …, u_n>, T1 is a subtrace of T2 (T1 ≤ T2) if there exists an integer i, 0 ≤ i ≤ n – m, such that
match(v_1, u_{1+i}), match(v_2, u_{2+i}), …, match(v_m, u_{m+i})
Example: T1 = <C, D, E> is a subtrace of T2 = <A, B, C, D, E, F> with i = 2
Subtrace is a generalization of the substring relation.
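The subtrace definition translates directly into a sliding comparison. The sketch below is one possible reading: the `match` predicate is passed in as a parameter because the slides do not spell out how two vertices are matched, and the helper `letters` exists only to build the example traces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

final class SubtraceSketch {
    // True if t1 is a subtrace of t2: some offset i with 0 <= i <= n - m
    // aligns every vertex of t1 with a matching vertex of t2.
    static <V> boolean isSubtrace(List<V> t1, List<V> t2, BiPredicate<V, V> match) {
        int m = t1.size();
        int n = t2.size();
        for (int i = 0; i <= n - m; i++) {
            boolean allMatch = true;
            for (int k = 0; k < m && allMatch; k++) {
                allMatch = match.test(t1.get(k), t2.get(k + i));
            }
            if (allMatch) {
                return true;
            }
        }
        return false;
    }

    // Helper for examples: "CDE" becomes the trace <C, D, E>.
    static List<String> letters(String s) {
        List<String> out = new ArrayList<>();
        for (char c : s.toCharArray()) {
            out.add(String.valueOf(c));
        }
        return out;
    }
}
```

With label equality as `match`, `<C, D, E>` is a subtrace of `<A, B, C, D, E, F>` at offset 2, while `<C, E>` is not, since the matched vertices must be contiguous.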
17. Definitions – Trace Subsumption
A data dependency graph Glib (library)
A data dependency graph Gclt (client)
Gclt trace subsumes Glib if and only if, for each trace Tlib of Glib, there exists at least one trace Tclt of Gclt such that Tlib is a subtrace of Tclt (Tlib ≤ Tclt)
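A direct, brute-force reading of this definition is sketched below. It assumes the graphs are acyclic, matches vertices by label equality, and enumerates only entry-to-sink traces, which suffices because every shorter trace is a prefix of one of them. The `Ddg` class and the demo graphs are hypothetical. This is not the depth-first checking procedure described later in the slides, and the number of traces can grow exponentially.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class TraceSubsumptionSketch {
    // A DDG as labelled vertices 0..n-1 with successor lists; assumed acyclic in this sketch.
    static final class Ddg {
        final String[] labels;
        final int[][] succ;

        Ddg(String[] labels, int[][] succ) {
            this.labels = labels;
            this.succ = succ;
        }
    }

    // All entry-to-sink traces, each as a list of vertex labels.
    static List<List<String>> traces(Ddg g) {
        boolean[] hasPred = new boolean[g.labels.length];
        for (int[] out : g.succ) {
            for (int target : out) {
                hasPred[target] = true;
            }
        }
        List<List<String>> result = new ArrayList<>();
        for (int v = 0; v < g.labels.length; v++) {
            if (!hasPred[v]) {
                extend(g, v, new ArrayList<String>(), result);
            }
        }
        return result;
    }

    private static void extend(Ddg g, int v, List<String> path, List<List<String>> out) {
        path.add(g.labels[v]);
        if (g.succ[v].length == 0) {
            out.add(new ArrayList<>(path));
        }
        for (int next : g.succ[v]) {
            extend(g, next, path, out);
        }
        path.remove(path.size() - 1);
    }

    // True if every library trace occurs contiguously inside some client trace.
    static boolean traceSubsumes(Ddg client, Ddg library) {
        List<List<String>> clientTraces = traces(client);
        for (List<String> libTrace : traces(library)) {
            boolean found = false;
            for (List<String> cltTrace : clientTraces) {
                if (Collections.indexOfSubList(cltTrace, libTrace) >= 0) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }

    // Library: A -> B, A -> D
    static Ddg demoLibrary() {
        return new Ddg(new String[] {"A", "B", "D"}, new int[][] {{1, 2}, {}, {}});
    }

    // Client: X -> A, A -> B, A -> D, D -> E
    static Ddg demoClient() {
        return new Ddg(new String[] {"X", "A", "B", "D", "E"},
                new int[][] {{1}, {2, 3}, {}, {4}, {}});
    }
}
```

In the demo, the library traces `<A, B>` and `<A, D>` both occur inside the client traces `<X, A, B>` and `<X, A, D, E>`, so the client trace subsumes the library. The reverse does not hold.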
18. Definitions – Potential Imitation
A client method Clt potentially imitates a library method Lib if there exist
a DDG Gclt of Clt, resulting from inlining zero or more method calls into Clt, and
a DDG Glib of Lib, resulting from inlining zero or more method calls into Lib,
such that Gclt trace subsumes Glib
19. Detection of Library API Imitations
Motivation
Definitions
Algorithms
Overall Algorithm
Trace Subsumption Checking
Pre- & Post-processing
Case Studies
Conclusion
20. Algorithms – Overall Algorithm
Input
A library API Lib
A client method Clt
A set S of all method calls in both Lib and Clt
Output
true if Clt potentially imitates Lib
Body
for each subset s of S {
    Lib’ = a copy of Lib with calls in s inlined
    Clt’ = a copy of Clt with calls in s inlined
    if the DDG of Clt’ trace subsumes the DDG of Lib’
        return true
}
return false
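The subset loop above can be sketched with the inlining step and the subsumption check passed in as functions, since the slides do not define either at code level. Everything named here (`potentiallyImitates`, the type parameters `M` and `C`, the string-based toy model in `demo`) is a hypothetical illustration. The bitmask enumeration assumes fewer than 31 calls and makes the exponential cost of trying every subset visible.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiFunction;
import java.util.function.BiPredicate;

final class OverallAlgorithmSketch {
    // M is a placeholder for a method representation, C for a method call.
    static <M, C> boolean potentiallyImitates(
            M lib, M clt, List<C> calls,
            BiFunction<M, Set<C>, M> inline,      // returns a copy with the given calls inlined
            BiPredicate<M, M> traceSubsumes) {    // (client, library) -> client subsumes library
        int n = calls.size();                     // sketch assumes n < 31
        for (int mask = 0; mask < (1 << n); mask++) {
            Set<C> s = new HashSet<>();
            for (int k = 0; k < n; k++) {
                if ((mask & (1 << k)) != 0) {
                    s.add(calls.get(k));
                }
            }
            M libCopy = inline.apply(lib, s);
            M cltCopy = inline.apply(clt, s);
            if (traceSubsumes.test(cltCopy, libCopy)) {
                return true;
            }
        }
        return false;
    }

    // Toy model: methods are strings, inlining is textual replacement,
    // and substring containment stands in for trace subsumption.
    static boolean demo(String client) {
        final Map<String, String> bodies = new HashMap<>();
        bodies.put("f()", "a;b");
        BiFunction<String, Set<String>, String> inline = (method, s) -> {
            String result = method;
            for (String call : s) {
                result = result.replace(call, bodies.get(call));
            }
            return result;
        };
        return potentiallyImitates("a;b", client, Arrays.asList("f()"), inline,
                (c, l) -> c.contains(l));
    }
}
```

In the toy model, the client `x;f();y` matches the library `a;b` only after `f()` is inlined, which is the situation the inter-procedural part of the algorithm is meant to cover.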
21. Algorithms – Trace Subsumption
Input
A DDG of a library API Glib
A DDG of a client method Gclt
Output
true if Gclt trace subsumes Glib
Depth-first search with step-by-step checking
23. Algorithms – An Example
Locating all vertices in client matching each entry of the library
Stack: (A, {A, A})
Current:
24. Algorithms – An Example
Locating client vertices matching library A’s successor D
Stack:
Current: (A, {A, A})
25. Algorithms – An Example
Locating client vertices matching library A’s successor D
Stack: (D, {D})
Current: (A, {A, A})
26. Algorithms – An Example
Locating client vertices matching library A’s successor B
Stack: (D, {D})
Current: (A, {A, A})
27. Algorithms – An Example
Locating client vertices matching library A’s successor B
Stack: (B, {B}), (D, {D})
Current: (A, {A, A})
28. Algorithms – An Example
Locating client vertices matching B’s successor {} in library
Stack: (D, {D})
Current: (B, {B})
29. Algorithms – An Example
Locating client vertices matching library D’s successor M
Stack:
Current: (D, {D})
30. Detection of Library API Imitations
Motivation
Definitions
Algorithms
Pre-processing & Post-validation
Case Studies
Conclusion
31. Pre-processing Libraries
Remove nullness checks
if (a == null) {
    return Constant;
} else {
    a.XXX();
}
Remove assertions
if (…)
    throw new Exception();
…
Remove exception handlers
try {
} catch (…) {}
32. Post-validating Reported Imitations
Reject the following two cases
Unmatched Inlined Vertices in Client
Matching All References to Library Locals
34. Case Studies
Evaluation measure
Subjects – 10 open-source Java projects
Testbed:
Intel Core 2 Quad CPU 3.00GHz and 8GB memory
36. Case Studies – Two Experiments
Detecting Imitations of Imported Libraries
Testing all method pairs (lib, clt), where the declaring class of lib is already imported in the client class
Precision = 313 / 383 = 82%
Runtime = 314 seconds
Detecting Imitations of Static Libraries
Testing all method pairs (lib, clt), where lib is a public static method
Precision = 116 / 155 = 75%
Runtime = 396 seconds
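The reported percentages follow from the fractions by rounding to the nearest whole percent. The sketch below reproduces that arithmetic; it assumes the usual reading of precision, where the numerator counts reported imitations confirmed as genuine and the denominator counts all reported imitations. The method names are hypothetical.

```java
final class PrecisionSketch {
    // Precision as a fraction: confirmed reports divided by all reports.
    static double precision(int confirmed, int reported) {
        return (double) confirmed / reported;
    }

    // Precision rounded to the nearest whole percent.
    static long percent(int confirmed, int reported) {
        return Math.round(100.0 * confirmed / reported);
    }
}
```

`percent(313, 383)` gives 82 and `percent(116, 155)` gives 75, matching the two experiments.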
38. Conclusion
Employing 3rd-party software libraries is common practice
Client code sometimes re-implements the behavior of existing APIs
An algorithm based on data dependency graphs to detect complex imitations
Precision of 82% (imported libraries) and 75% (static libraries)