Call graphs are widely used, in particular for advanced control- and data-flow analyses. Even though many call graph algorithms with different precision and scalability properties have been proposed, a comprehensive understanding of sources of unsoundness, their relevance, and the capabilities of existing call graph algorithms in this respect is missing. To address this problem, we propose Judge, a toolchain that helps with understanding sources of unsoundness and improving the soundness of call graphs. In several experiments, we use Judge and an extensive test suite related to sources of unsoundness to (a) compute capability profiles for the call graph implementations of Soot, WALA, DOOP, and OPAL, (b) determine the prevalence of language features and APIs that affect soundness in modern Java bytecode, (c) compare the call graphs of Soot, WALA, DOOP, and OPAL, highlighting important differences in their implementations, and (d) evaluate the effort necessary to achieve project-specific, reasonably sound call graphs. We show that soundness-relevant features and APIs are frequently used and that support for them differs vastly, up to the point where comparing call graphs computed by the same base algorithm (e.g., RTA) but different frameworks is bogus. We also show that Judge can support users in establishing the soundness of call graphs with reasonable effort.
Judge: Identifying, Understanding, and Evaluating Sources of Unsoundness in Call Graphs
1. Judge: Identifying, Understanding, and Evaluating Sources of Unsoundness in Call Graphs
Michael Reif, Florian Kübler, Michael Eichberg, Dominik Helm, and Mira Mezini
Software Technology Group
TU Darmstadt
@Reifmi
2. Why We Shouldn’t Take Call Graphs for Granted
• Call graphs are a central data structure for numerous static analyses
• Call graphs directly impact a client analysis’ result
• The chosen algorithm predetermines an analysis’ precision and recall
• Programming languages evolve (APIs and features are added) and frameworks might not
3. State-of-the-art Call-graph Generators for Java
• Many different static analysis frameworks are available
• All can compute a different set of call graphs
• All frameworks use different approaches and make unknown trade-offs or implementation choices
• Are they actually comparable?
OPAL
5. Judge’s Overview
[Diagram: test-suite pipeline]
• AllTestCases: ⟨Test Fixtures Category⟩.md files, each listing Test Case 1 (TC1) … Test Case 3 (TCN)
• compile test cases: ⟨Test Fixtures⟩.md → ⟨Test Case⟩.jar and ⟨Advanced Test Case⟩.jar (TC1.jar, TC2.jar, …)
• compute CG: ⟨Test Case⟩.jar → ⟨CG⟩.json. Done for each CG per supported static analysis framework.
• compute profile using CG and expected call targets: ⟨CG⟩.json → ⟨CG Algorithm Profile⟩.tsv
6. Judge’s Overview
TC1.jarTC2.jar⟨Test Case⟩
.jar
⟨Advanced
Test Case⟩
.jar
compile test cases
AllTestCases
<Test Fixtures
Category>.md
Test Case 1(TC1)
…
Test Case 3 (TCN)
⟨Test Fixtures⟩.md
Test Case 1
…
Test Case 3
⟨CG⟩
.json
compute CG
Done for each CG per supported
static analysis framework.
⟨CG Algorithm Profile⟩
.tsvcompute profile using CG and expected call targets
⟨Project⟩
.jar
⟨Features &
Locations⟩
.json
⟨CG⟩
.json
compute CG
run Hermes
Infrastructure used for computing the prevalence of features in
real projects.
7. Judge’s Overview
TC1.jarTC2.jar⟨Test Case⟩
.jar
⟨Advanced
Test Case⟩
.jar
compile test cases
AllTestCases
<Test Fixtures
Category>.md
Test Case 1(TC1)
…
Test Case 3 (TCN)
⟨Test Fixtures⟩.md
Test Case 1
…
Test Case 3
⟨CG⟩
.json
compute CG
Done for each CG per supported
static analysis framework.
⟨CG Algorithm Profile⟩
.tsvcompute profile using CG and expected call targets
⟨Project⟩
.jar
⟨Features &
Locations⟩
.json
⟨CG⟩
.json
compute CG
run Hermes
Infrastructure used for computing the prevalence of features in
real projects.
⟨Potential
Sources of
Unsoundness⟩
.tsv
compute suitability of CG algo.
use the
respective
CG profile
8. Test Suite
• Each category has:
  • a description
  • multiple test cases
• Each test case has:
  • a scenario description
  • a unique id
  • the test code
  • expected calls
• Available annotations:
  • CallSite
  • IndirectCall
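A test case whose expected call is stated through an annotation could look like the following minimal sketch. Only the annotation name CallSite comes from the slide; the annotation declaration below is a hypothetical stand-in written for this example, and its element names (name, resolvedTargets) are assumptions rather than the test suite's actual definitions.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stand-in for the suite's CallSite annotation (element names are assumptions).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface CallSite {
    String name();              // name of the method called at the call site
    String[] resolvedTargets(); // types whose implementation a sound call graph must include
}

interface Shape { double area(); }

class Circle implements Shape {
    public double area() { return Math.PI; }
}

// Test code: one virtual call with exactly one possible target.
class VirtualCallTestCase {
    @CallSite(name = "area", resolvedTargets = {"Circle"})
    static double run() {
        Shape s = new Circle();
        return s.area(); // expected call: Circle.area
    }
}

// Reads the expectation back via reflection, as a checking tool might do.
class ExpectedCalls {
    static String describe(Class<?> testCase, String method) {
        try {
            CallSite cs = testCase.getDeclaredMethod(method).getAnnotation(CallSite.class);
            return cs.name() + " -> " + String.join(",", cs.resolvedTargets());
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException("no such test method: " + method, e);
        }
    }
}
```

Keeping the expectation next to the test code means the expected calls travel with the compiled test case and can be compared against whatever call graph a framework produces.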
11. Computing the Algorithms’ Profile
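The profile step (compute profile using CG and expected call targets) can be pictured with the sketch below. It is an illustration under assumptions, not Judge's implementation: call edges are plain "caller->callee" strings, a test case counts as supported when every expected edge appears in the computed call graph, and the class and method names are made up for this example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class ProfileSketch {
    // Maps each test case id to true if the computed call graph contains
    // every expected call edge of that test case (assumed pass criterion).
    static Map<String, Boolean> computeProfile(Map<String, Set<String>> expected,
                                               Map<String, Set<String>> computed) {
        Map<String, Boolean> profile = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : expected.entrySet()) {
            Set<String> cg = computed.getOrDefault(e.getKey(), Collections.<String>emptySet());
            profile.put(e.getKey(), cg.containsAll(e.getValue()));
        }
        return profile;
    }

    // Renders the profile as tab-separated lines: <test case id> TAB <supported|unsupported>.
    static String toTsv(Map<String, Boolean> profile) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Boolean> e : profile.entrySet()) {
            sb.append(e.getKey()).append('\t')
              .append(e.getValue() ? "supported" : "unsupported").append('\n');
        }
        return sb.toString();
    }
}
```

Extra edges in the computed call graph do not fail a test case in this sketch; it only checks that no expected call target is missing, which is the soundness side of the comparison.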
12. Finding Features in Real Code
• We used Hermes [1], a static analysis code query infrastructure
• Each query is an analysis that checks whether a specific feature is found in a given code base
• We developed 15 Hermes queries to derive 107 Hermes features and mapped the derived features to the test case ids
• All queries perform a most-conservative intra-procedural analysis
[1] Reif, Michael, et al. Hermes: Assessment and Creation of Effective Test Corpora. SOAP ’17. ACM, 43–48.
14. Potential Sources of Unsoundness
[Figure: features and their extensions, computed using feature queries / Hermes, are mapped to methods in application code and library code and labeled as source of unsoundness or conditional source of unsoundness for project (p)]
Features (based on test cases)        Supported by CG(a)   Extensions count
BPC2 (Polymorphic Call)               ✓                    3
TR1 (Reflection)                      ✘                    2
…                                     …                    …
Lambda3 (Invokedynamic - Java ≤ 10)   ✓                    1
Lambda8 (Invokedynamic - Scala)       ✘                    0
Methods (name)   Reached by CG(a)
m1               ✓
m2               ✘
m3               ✓
m4               ✓
…                …
mu               ✓
mx               ✘
my               ✓
mz               ✘
• Sources of unsoundness definitely make the call graph unsound
• Conditional sources of unsoundness might introduce unsoundness
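The distinction between the two kinds can be sketched as follows. The rule used here is an assumption based on the columns "Supported by CG(a)" and "Reached by CG(a)": an unsupported feature inside a method the call graph reaches is treated as a source of unsoundness, and an unsupported feature inside a method the call graph does not reach is treated as a conditional source. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class UnsoundnessSketch {
    enum Kind { SOURCE, CONDITIONAL_SOURCE }

    // One occurrence of a feature (e.g. "TR1") inside a method (e.g. "m2").
    static final class Occurrence {
        final String feature;
        final String method;
        Occurrence(String feature, String method) {
            this.feature = feature;
            this.method = method;
        }
    }

    // Returns one tab-separated line per occurrence of an unsupported feature.
    static List<String> classify(List<Occurrence> occurrences,
                                 Set<String> supportedFeatures,
                                 Set<String> reachedMethods) {
        List<String> findings = new ArrayList<>();
        for (Occurrence o : occurrences) {
            if (supportedFeatures.contains(o.feature)) {
                continue; // the call graph algorithm handles this feature
            }
            // Assumed rule: reached method => definite, unreached method => conditional.
            Kind kind = reachedMethods.contains(o.method) ? Kind.SOURCE : Kind.CONDITIONAL_SOURCE;
            findings.add(o.feature + "\t" + o.method + "\t" + kind);
        }
        return findings;
    }
}
```

The inputs correspond to the two tables on the slide: the supported features come from the call graph algorithm's profile, and the reached methods come from the call graph computed for the project.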
15. Research Questions
• RQ1: How prevalent are the language and API features?
• RQ2: How do the frameworks compare to each other?
• RQ3: Which framework is best suited for which kind of code base?
• RQ4: How much effort is necessary to get a sound call graph?
16. Prevalent Language Features and APIs (RQ1)
• All API and language features supported by Java up to version 7 are used widely across all code bases
• Support for Java 8 is a must, unless analyzing Android or Clojure code
• Supporting classical Reflection and Serialization is strongly recommended, independent of the source code’s age
• Support for many features is only required in specific scenarios
19. The Call Graphs’ Feature Support (RQ2)
• Standard Java features are well-supported
• Java 8 features are partially supported
• The JVM is not fully covered
• Reflection API partially supported
• Some APIs and language features are unsupported
32. Performance Results (RQ2)
• Average runtimes differ widely
• Reachable methods vary by more than 20x, even for implementations of the same algorithm
33. RTA-Example
void program(boolean condition) {
    Collection c1 = new LinkedList();
    Collection c2;
    if (condition) {
        c2 = new ArrayList();
    } else {
        c2 = new Vector();
    }
    c2.add(null); // virtual call resolved from the instantiated types
    Collection c3 = new HashSet();
}
• RTA [2] depends on the program’s instantiated types
• Soot, WALA, and OPAL behave completely differently
• Type sets shown for the example:
  { LinkedList, ArrayList, Vector, HashSet }
  { LinkedList, ArrayList, Vector }
  { ArrayList, Vector }
[2] D. Bacon and P. Sweeney. Fast static analysis of C++ virtual function calls. OOPSLA '96. ACM, 324-341.
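Why the set of instantiated types matters can be seen in a minimal sketch of the RTA resolution step: every instantiated type that is a subtype of the receiver's declared type is a possible receiver. The class RtaSketch and its method are illustrative names, not code from any of the frameworks, and the sketch makes no claim about which framework produces which set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.TreeSet;
import java.util.Vector;

class RtaSketch {
    // RTA-style resolution: keep each instantiated type that is a subtype of
    // the declared receiver type. Names are sorted for stable output.
    static TreeSet<String> possibleReceivers(Class<?> declaredType, List<Class<?>> instantiated) {
        TreeSet<String> result = new TreeSet<>();
        for (Class<?> type : instantiated) {
            if (declaredType.isAssignableFrom(type)) {
                result.add(type.getSimpleName());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The three instantiated-type sets listed on the slide.
        List<Class<?>> four = Arrays.<Class<?>>asList(
                LinkedList.class, ArrayList.class, Vector.class, HashSet.class);
        List<Class<?>> three = Arrays.<Class<?>>asList(
                LinkedList.class, ArrayList.class, Vector.class);
        List<Class<?>> two = Arrays.<Class<?>>asList(ArrayList.class, Vector.class);

        // Receivers considered for c2.add(null), where c2 is declared as Collection.
        System.out.println(possibleReceivers(Collection.class, four));
        System.out.println(possibleReceivers(Collection.class, three));
        System.out.println(possibleReceivers(Collection.class, two));
    }
}
```

The resolution step is identical in all three calls; only the input set changes, and with it the number of add implementations that end up in the call graph.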
45. Project-specific Evaluation (RQ3)
• Soot supports CSR, but it is expensive
• OPAL supports most features but has the smallest call graph
• OPAL covers only 47 methods from Xalan (~0.3%)
• Very few call sites have a huge impact
46. Is it worth it to do the work manually? (RQ4)
• GOAL: Get a reasonably sound call graph
• JVM profiling and TamiFlex [3] as ground truth
• Workflow: Apply Judge, Inspect Results, Add Entry Points
• Analyzed 10 reflective call sites
• Added 50 entry points
• Manual analysis took roughly 90 minutes
• The call graph then covered 91% of all methods contained in the profile and 121 of the 198 methods reported by TamiFlex
[3] Bodden, Eric, et al. Taming Reflection: Static Analysis in the Presence of Reflection and Custom Class Loaders. (2010).
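For scale, 121 of 198 methods is about 61%. The sketch below shows how such a coverage figure can be computed from a ground-truth set of methods and the set of methods reachable in a call graph; the class and method names are made up for this example, and methods are represented as plain strings.

```java
import java.util.Set;

class CoverageSketch {
    // Fraction of ground-truth methods that the call graph also reaches.
    static double coverage(Set<String> groundTruth, Set<String> reachable) {
        if (groundTruth.isEmpty()) {
            return 1.0; // nothing to cover
        }
        long hits = groundTruth.stream().filter(reachable::contains).count();
        return (double) hits / groundTruth.size();
    }

    // Percentage rounded to the nearest integer.
    static long percent(long covered, long total) {
        return Math.round(100.0 * covered / total);
    }
}
```

Methods that the call graph reaches but the ground truth never observed do not lower the value, since this measure looks only at missing methods.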