Speeding-up Software Testing With Computational IntelligenceAnnibale Panichella
Software testing is a crucial activity to assess the correct behavior of a program. However, it is also costly since it consumes a large ratio of software development time. For this reason, researchers have investigated techniques to automate the process of creating test cases. The key idea is to use meta-heuristics (e.g., Genetic Algorithms) to automatically generate test cases that reveal software failures. In this talk, I will present a case study in the automotive and showing the larger effectiveness and efficiency of meta-heuristics compared to manual testing.
Speeding-up Software Testing With Computational IntelligenceAnnibale Panichella
Software testing is a crucial activity to assess the correct behavior of a program. However, it is also costly since it consumes a large ratio of software development time. For this reason, researchers have investigated techniques to automate the process of creating test cases. The key idea is to use meta-heuristics (e.g., Genetic Algorithms) to automatically generate test cases that reveal software failures. In this talk, I will present a case study in the automotive and showing the larger effectiveness and efficiency of meta-heuristics compared to manual testing.
Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...Feng Zhang
Defect prediction on projects with limited historical data has attracted great interest from both researchers and practitioners. Cross-project defect prediction has been the main area of progress by reusing classifiers from other projects. However, existing approaches require some degree of homogeneity (e.g., a similar distribution of metric values) between the training projects and the target project. Satisfying the homogeneity requirement often requires significant effort (currently a very active area of research).
An unsupervised classifier does not require any training data, therefore the heterogeneity challenge is no longer an issue. In this paper, we examine two types of unsupervised classifiers: a) distance-based classifiers (e.g., k-means); and b) connectivity-based classifiers. While distance-based unsupervised classifiers have been previously used in the defect prediction literature with disappointing performance, connectivity-based classifiers have never been explored before in our community.
We compare the performance of unsupervised classifiers versus supervised classifiers using data from 26 projects from three publicly available datasets (i.e., AEEEM, NASA, and PROMISE). In the cross-project setting, our proposed connectivity-based classifier (via spectral clustering) ranks as one of the top classifiers among five widely-used supervised classifiers (i.e., random forest, naive Bayes, logistic regression, decision tree, and logistic model tree) and five unsupervised classifiers (i.e., k-means, partition around medoids, fuzzy C-means, neural-gas, and spectral clustering). In the within-project setting (i.e., models are built and applied on the same project), our spectral classifier ranks in the second tier, while only random forest ranks in the first tier. Hence, connectivity-based unsupervised classifiers offer a viable solution for cross and within project defect predictions.
Data collection for software defect predictionAmmAr mobark
It is one of the important stages that software companies need it, it will be after produce the program and published, to know the reactions of the users and their impressions about the program and work on developing and improving it.
The Contents
* BACKGROUND AND RELATED WORK
* EXPERIMENTAL PLANNING
-Research Goal -Research Questions -Experimental Subjects
-Experimental Material -Tasks and Methods
-Experimental Design
عمار عبد الكريم صاحب مبارك
AmmAr Abdualkareem sahib mobark
Findability through Traceability - A Realistic Application of Candidate Tr...Markus Borg
Conference presentation from ENASE 2012 in Wroclaw, Poland.
Abstract: Since software development is of dynamic nature, the impact analysis is an inevitable work task. Traceability is known as one factor that supports this task, and several researchers have proposed traceability recovery tools to propose trace links in an existing system. However, these semi-automatic tools have not yet proven useful in industrial applications. Based on an established automation model, we analyzed the potential value of such a tool. We based our analysis on a pilot case study of an impact analysis process in a safety-critical development context, and argue that traceability recovery should be considered an investment in ndability. Moreover, several risks involved in an increased level of impact analysis automation are already plaguing the state-
of-practice workflow. Consequently, deploying a traceability recovery tool involves a lower degree
of change than has previously been acknowledged.
It Does What You Say, Not What You Mean: Lessons From A Decade of Program RepairClaire Le Goues
In this talk we present lessons learned, good ideas, and thoughts on the future, with an eye toward informing junior researchers about the realities and opportunities of a long-running project. We highlight some notions from the original paper that stood the test of time, some that were not as prescient, and some that became more relevant as industrial practice advanced. We place the work in context, highlighting perceptions from software engineering and evolutionary computing, then and now, of how program repair could possibly work. We discuss the importance of measurable benchmarks and reproducible research in bringing scientists together and advancing the area. We give our thoughts on the role of quality requirements and properties in program repair. From testing to metrics to scalability to human factors to technology transfer, software repair touches many aspects of software engineering, and we hope a behind-the-scenes exploration of some of our struggles and successes may benefit researchers pursuing new projects.
Developers often wonder how to implement a certain functionality
(e.g., how to parse XML files) using APIs. Obtaining
an API usage sequence based on an API-related natural
language query is very helpful in this regard. Given a query,
existing approaches utilize information retrieval models to
search for matching API sequences. These approaches treat
queries and APIs as bags-of-words and lack a deep understanding
of the semantics of the query.
We propose DeepAPI, a deep learning based approach to
generate API usage sequences for a given natural language
query. Instead of a bag-of-words assumption, it learns the
sequence of words in a query and the sequence of associated
APIs. DeepAPI adapts a neural language model named
RNN Encoder-Decoder. It encodes a word sequence (user
query) into a fixed-length context vector, and generates an
API sequence based on the context vector. We also augment
the RNN Encoder-Decoder by considering the importance
of individual APIs. We empirically evaluate our approach
with more than 7 million annotated code snippets collected
from GitHub. The results show that our approach generates
largely accurate API sequences and outperforms the related
approaches.
Resilience Engineering: A field of study, a community, and some perspective s...John Allspaw
These are slides from my talk on March 28, 2018 at the LA SCALE tech Meetup, graciously hosted at TicketMaster's office. (https://www.meetup.com/scalela/events/248904126/)
Cross-project Defect Prediction Using A Connectivity-based Unsupervised Class...Feng Zhang
Defect prediction on projects with limited historical data has attracted great interest from both researchers and practitioners. Cross-project defect prediction has been the main area of progress by reusing classifiers from other projects. However, existing approaches require some degree of homogeneity (e.g., a similar distribution of metric values) between the training projects and the target project. Satisfying the homogeneity requirement often requires significant effort (currently a very active area of research).
An unsupervised classifier does not require any training data, therefore the heterogeneity challenge is no longer an issue. In this paper, we examine two types of unsupervised classifiers: a) distance-based classifiers (e.g., k-means); and b) connectivity-based classifiers. While distance-based unsupervised classifiers have been previously used in the defect prediction literature with disappointing performance, connectivity-based classifiers have never been explored before in our community.
We compare the performance of unsupervised classifiers versus supervised classifiers using data from 26 projects from three publicly available datasets (i.e., AEEEM, NASA, and PROMISE). In the cross-project setting, our proposed connectivity-based classifier (via spectral clustering) ranks as one of the top classifiers among five widely-used supervised classifiers (i.e., random forest, naive Bayes, logistic regression, decision tree, and logistic model tree) and five unsupervised classifiers (i.e., k-means, partition around medoids, fuzzy C-means, neural-gas, and spectral clustering). In the within-project setting (i.e., models are built and applied on the same project), our spectral classifier ranks in the second tier, while only random forest ranks in the first tier. Hence, connectivity-based unsupervised classifiers offer a viable solution for cross and within project defect predictions.
Data collection for software defect predictionAmmAr mobark
It is one of the important stages that software companies need it, it will be after produce the program and published, to know the reactions of the users and their impressions about the program and work on developing and improving it.
The Contents
* BACKGROUND AND RELATED WORK
* EXPERIMENTAL PLANNING
-Research Goal -Research Questions -Experimental Subjects
-Experimental Material -Tasks and Methods
-Experimental Design
عمار عبد الكريم صاحب مبارك
AmmAr Abdualkareem sahib mobark
Findability through Traceability - A Realistic Application of Candidate Tr...Markus Borg
Conference presentation from ENASE 2012 in Wroclaw, Poland.
Abstract: Since software development is of dynamic nature, the impact analysis is an inevitable work task. Traceability is known as one factor that supports this task, and several researchers have proposed traceability recovery tools to propose trace links in an existing system. However, these semi-automatic tools have not yet proven useful in industrial applications. Based on an established automation model, we analyzed the potential value of such a tool. We based our analysis on a pilot case study of an impact analysis process in a safety-critical development context, and argue that traceability recovery should be considered an investment in ndability. Moreover, several risks involved in an increased level of impact analysis automation are already plaguing the state-
of-practice workflow. Consequently, deploying a traceability recovery tool involves a lower degree
of change than has previously been acknowledged.
It Does What You Say, Not What You Mean: Lessons From A Decade of Program RepairClaire Le Goues
In this talk we present lessons learned, good ideas, and thoughts on the future, with an eye toward informing junior researchers about the realities and opportunities of a long-running project. We highlight some notions from the original paper that stood the test of time, some that were not as prescient, and some that became more relevant as industrial practice advanced. We place the work in context, highlighting perceptions from software engineering and evolutionary computing, then and now, of how program repair could possibly work. We discuss the importance of measurable benchmarks and reproducible research in bringing scientists together and advancing the area. We give our thoughts on the role of quality requirements and properties in program repair. From testing to metrics to scalability to human factors to technology transfer, software repair touches many aspects of software engineering, and we hope a behind-the-scenes exploration of some of our struggles and successes may benefit researchers pursuing new projects.
Developers often wonder how to implement a certain functionality
(e.g., how to parse XML files) using APIs. Obtaining
an API usage sequence based on an API-related natural
language query is very helpful in this regard. Given a query,
existing approaches utilize information retrieval models to
search for matching API sequences. These approaches treat
queries and APIs as bags-of-words and lack a deep understanding
of the semantics of the query.
We propose DeepAPI, a deep learning based approach to
generate API usage sequences for a given natural language
query. Instead of a bag-of-words assumption, it learns the
sequence of words in a query and the sequence of associated
APIs. DeepAPI adapts a neural language model named
RNN Encoder-Decoder. It encodes a word sequence (user
query) into a fixed-length context vector, and generates an
API sequence based on the context vector. We also augment
the RNN Encoder-Decoder by considering the importance
of individual APIs. We empirically evaluate our approach
with more than 7 million annotated code snippets collected
from GitHub. The results show that our approach generates
largely accurate API sequences and outperforms the related
approaches.
Resilience Engineering: A field of study, a community, and some perspective s...John Allspaw
These are slides from my talk on March 28, 2018 at the LA SCALE tech Meetup, graciously hosted at TicketMaster's office. (https://www.meetup.com/scalela/events/248904126/)
These slides contain an introduction to Symbolic execution and an introduction to KLEE.
I made this for a small demo/intro for my research group's meeting.
This talk introduces the core concepts of functional programming, why it matters now and how these principles are adopted for programming.
Beyond the usage of the functional principles for programming in the small, their applicability to the design of components and further on to the architecture of software systems is also explored along with relevant examples.
The code samples used in the talk are in Haskell / Java 8 / Scala / Elm and JavaScript, but deep understanding of any of these languages are not a pre-requisite.
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...MLconf
Graph Representation Learning with Deep Embedding Approach:
Graphs are commonly used data structure for representing the real-world relationships, e.g., molecular structure, knowledge graphs, social and communication networks. The effective encoding of graphical information is essential to the success of such applications. In this talk I’ll first describe a general deep learning framework, namely structure2vec, for end to end graph feature representation learning. Then I’ll present the direct application of this model on graph problems on different scales, including community detection and molecule graph classification/regression. We then extend the embedding idea to temporal evolving user-product interaction graph for recommendation. Finally I’ll present our latest work on leveraging the reinforcement learning technique for graph combinatorial optimization, including vertex cover problem for social influence maximization and traveling salesman problem for scheduling management.
EVERYTHING ABOUT STATIC CODE ANALYSIS FOR A JAVA PROGRAMMERAndrey Karpov
Theory
Code quality (bugs, vulnerabilities)
Methodologies of code protection against defects
Code Review
Static analysis and everything related to it
Tools
Existing tools of static analysis
SonarQube
PVS-Studio for Java what is it?
Several detected examples of code with defects
More about static analysis
Conclusions
Parameterizing and Assembling IR-based Solutions for SE Tasks using Genetic A...Annibale Panichella
Information Retrieval (IR) approaches are nowadays used to support various software engineering tasks, such as feature location, traceability link recovery, clone detection, or refactoring. However, previous studies showed that inadequate instantiation of an IR technique and underlying process could significantly affect the performance of such approaches in terms of precision and recall. This paper proposes the use of Genetic Algorithms (GAs) to automatically configure and assemble an IR process for software engineering tasks. The approach (named GA-IR) determines the (near) optimal solution to be used for each stage of the IR process, ie term extraction, stop word removal, stemming, indexing and an IR algebraic method calibration. We applied GA-IR on two different software engineering tasks, namely traceability link recovery and identification of duplicate bug reports. The results of the study indicate that GA-IR outperforms approaches previously published in the literature, and that it does not significantly differ from an ideal upper bound that could be achieved by a supervised and combinatorial approach.
Tools for automatic detection of static errors. While common tools work with source-code, this approach analyses Java byte-code. It discovers possible Java Linkage Errors such as NoSuchMethodError.
Wikipedia is the largest user-generated knowledge base. We propose a structured query mechanism, entity-relationship query, for searching entities in Wikipedia corpus by their properties and inter-relationships. An entity-relationship query consists of arbitrary number of predicates on desired entities. The semantics of each predicate is specified with keywords. Entity-relationship query searches entities directly over text rather than pre-extracted structured data stores. This characteristic brings two benefits: (1) Query semantics can be intuitively expressed by keywords; (2) It avoids information loss that happens during extraction. We present a ranking framework for general entity-relationship queries and a position-based Bounded Cumulative Model for accurate ranking of query answers. Experiments on INEX benchmark queries and our own crafted queries show the effectiveness and accuracy of our ranking method.
Similar to DLint: dynamically checking bad coding practices in JavaScript (ISSTA'15 Slides) (20)
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...informapgpstrackings
Keep tabs on your field staff effortlessly with Informap Technology Centre LLC. Real-time tracking, task assignment, and smart features for efficient management. Request a live demo today!
For more details, visit us : https://informapuae.com/field-staff-tracking/
Accelerate Enterprise Software Engineering with PlatformlessWSO2
Key takeaways:
Challenges of building platforms and the benefits of platformless.
Key principles of platformless, including API-first, cloud-native middleware, platform engineering, and developer experience.
How Choreo enables the platformless experience.
How key concepts like application architecture, domain-driven design, zero trust, and cell-based architecture are inherently a part of Choreo.
Demo of an end-to-end app built and deployed on Choreo.
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...Juraj Vysvader
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I didn't get rich from it but it did have 63K downloads (powered possible tens of thousands of websites).
Cyaniclab : Software Development Agency Portfolio.pdfCyanic lab
CyanicLab, an offshore custom software development company based in Sweden,India, Finland, is your go-to partner for startup development and innovative web design solutions. Our expert team specializes in crafting cutting-edge software tailored to meet the unique needs of startups and established enterprises alike. From conceptualization to execution, we offer comprehensive services including web and mobile app development, UI/UX design, and ongoing software maintenance. Ready to elevate your business? Contact CyanicLab today and let us propel your vision to success with our top-notch IT solutions.
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Globus
Large Language Models (LLMs) are currently the center of attention in the tech world, particularly for their potential to advance research. In this presentation, we'll explore a straightforward and effective method for quickly initiating inference runs on supercomputers using the vLLM tool with Globus Compute, specifically on the Polaris system at ALCF. We'll begin by briefly discussing the popularity and applications of LLMs in various fields. Following this, we will introduce the vLLM tool, and explain how it integrates with Globus Compute to efficiently manage LLM operations on Polaris. Attendees will learn the practical aspects of setting up and remotely triggering LLMs from local machines, focusing on ease of use and efficiency. This talk is ideal for researchers and practitioners looking to leverage the power of LLMs in their work, offering a clear guide to harnessing supercomputing resources for quick and effective LLM inference.
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of the work the team is investigating ways to speedup the time to solution for many different parts of the DIII-D workflow including how they run jobs on HPC systems. One of these routes is looking at Globus Compute as a way to replace the current method for managing tasks and we describe a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar
The European Union Agency for Law Enforcement Cooperation (Europol) has suffered an alleged data breach after a notorious threat actor claimed to have exfiltrated data from its systems. Infamous data leaker IntelBroker posted on the even more infamous BreachForums hacking forum, saying that Europol suffered a data breach this month.
The alleged breach affected Europol agencies CCSE, EC3, Europol Platform for Experts, Law Enforcement Forum, and SIRIUS. Infiltration of these entities can disrupt ongoing investigations and compromise sensitive intelligence shared among international law enforcement agencies.
However, this is neither the first nor the last activity of IntekBroker. We have compiled for you what happened in the last few days. To track such hacker activities on dark web sources like hacker forums, private Telegram channels, and other hidden platforms where cyber threats often originate, you can check SOCRadar’s Dark Web News.
Stay Informed on Threat Actors’ Activity on the Dark Web with SOCRadar!
Developing Distributed High-performance Computing Capabilities of an Open Sci...Globus
COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and its broad response from the scientific community has forged new relationships among public health practitioners, mathematical modelers, and scientific computing specialists, while revealing critical gaps in exploiting advanced computing systems to support urgent decision making. Informed by our team’s work in applying high-performance computing in support of public health decision makers during the COVID-19 pandemic, we present how Globus technologies are enabling the development of an open science platform for robust epidemic analysis, with the goal of collaborative, secure, distributed, on-demand, and fast time-to-solution analyses to support public health.
Large Language Models and the End of ProgrammingMatt Welsh
Talk by Matt Welsh at Craft Conference 2024 on the impact that Large Language Models will have on the future of software development. In this talk, I discuss the ways in which LLMs will impact the software industry, from replacing human software developers with AI, to replacing conventional software with models that perform reasoning, computation, and problem-solving.
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Anthony Dahanne
Les Buildpacks existent depuis plus de 10 ans ! D’abord, ils étaient utilisés pour détecter et construire une application avant de la déployer sur certains PaaS. Ensuite, nous avons pu créer des images Docker (OCI) avec leur dernière génération, les Cloud Native Buildpacks (CNCF en incubation). Sont-ils une bonne alternative au Dockerfile ? Que sont les buildpacks Paketo ? Quelles communautés les soutiennent et comment ?
Venez le découvrir lors de cette session ignite
In software engineering, the right architecture is essential for robust, scalable platforms. Wix has undergone a pivotal shift from event sourcing to a CRUD-based model for its microservices. This talk will chart the course of this pivotal journey.
Event sourcing, which records state changes as immutable events, provided robust auditing and "time travel" debugging for Wix Stores' microservices. Despite its benefits, the complexity it introduced in state management slowed development. Wix responded by adopting a simpler, unified CRUD model. This talk will explore the challenges of event sourcing and the advantages of Wix's new "CRUD on steroids" approach, which streamlines API integration and domain event management while preserving data integrity and system resilience.
Participants will gain valuable insights into Wix's strategies for ensuring atomicity in database updates and event production, as well as caching, materialization, and performance optimization techniques within a distributed system.
Join us to discover how Wix has mastered the art of balancing simplicity and extensibility, and learn how the re-adoption of the modest CRUD has turbocharged their development velocity, resilience, and scalability in a high-growth environment.
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Globus
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data, and applying computations on a different system. As a part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on-demand, capable of applying many data reduction and data analysis to the large ESGF data archives, transferring only the resultant analysis (ex. visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTier1 app
Even though at surface level ‘java.lang.OutOfMemoryError’ appears as one single error; underlyingly there are 9 types of OutOfMemoryError. Each type of OutOfMemoryError has different causes, diagnosis approaches and solutions. This session equips you with the knowledge, tools, and techniques needed to troubleshoot and conquer OutOfMemoryError in all its forms, ensuring smoother, more efficient Java applications.
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet, they often turn into annoying tasks riddled with frustration, hostility, unclear feedback and lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll have the knowledge on how to organize and improve your code review proces
Software Engineering, Software Consulting, Tech Lead.
Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Security,
Spring Transaction, Spring MVC,
Log4j, REST/SOAP WEB-SERVICES.
Unleash Unlimited Potential with One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
Understanding Globus Data Transfers with NetSageGlobus
NetSage is an open privacy-aware network measurement, analysis, and visualization service designed to help end-users visualize and reason about large data transfers. NetSage traditionally has used a combination of passive measurements, including SNMP and flow data, as well as active measurements, mainly perfSONAR, to provide longitudinal network performance data visualization. It has been deployed by dozens of networks world wide, and is supported domestically by the Engagement and Performance Operations Center (EPOC), NSF #2328479. We have recently expanded the NetSage data sources to include logs for Globus data transfers, following the same privacy-preserving approach as for Flow data. Using the logs for the Texas Advanced Computing Center (TACC) as an example, this talk will walk through several different example use cases that NetSage can answer, including: Who is using Globus to share data with my institution, and what kind of performance are they able to achieve? How many transfers has Globus supported for us? Which sites are we sharing the most data with, and how is that changing over time? How is my site using Globus to move data internally, and what kind of performance do we see for those transfers? What percentage of data transfers at my institution used Globus, and how did the overall data transfer performance compare to the Globus users?
DLint: dynamically checking bad coding practices in JavaScript (ISSTA'15 Slides)
1. Liang Gong1, Michael Pradel2, Manu Sridharan3 and Koushik Sen1
1 UC Berkeley 2 TU Darmstadt 3 Samsung Research America
DLint: Dynamically Checking
Bad Coding Practices in JavaScript
2. Why JavaScript?
…2
• The RedMonk Programming Language Ranking (1st)
– Based on GitHub and StackOverflow
• Assembly language for the web
• Web applications, DSL, Desktop apps, Mobile apps
3. 3
• Designed and Implemented in 10 days
• Not all decisions were well-thought
• Problematic language features
– Error prone
– Poor performance
– Prone to security vulnerabilities
• Problematic features are still around
– Backward compatibility
Problematic JavaScript
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
5. • Good coding practices
• Informal rules
• Improve code quality
• Better quality means:
• Fewer correctness issues
• Better performance
• Better usability
• Better maintainability
• Fewer security loopholes
• Fewer surprises
• …
5
What are coding practices?
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
6. 6
var sum = 0, value;
var array = [11, 22, 33];
for (value in array) {
sum += value;
}
> sum ?
Rule: avoid using for..in over arrays
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
7. 7
var sum = 0, value;
var array = [11, 22, 33];
for (value in array) {
sum += value;
}
> sum ?
11 + 22 + 33 => 66
array index
(not array value)
0 + 1 + 2 => 3 array index : string
0+"0"+"1"+"2" => "0012"
Rule: avoid using for..in over arrays
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
8. • Cross-browser issues
• Result depends on the Array prototype object
8
var sum = 0, value;
var array = [11, 22, 33];
for (value in array) {
sum += value;
}
> sum ?
11 + 22 + 33 => 66
array index
(not array value)
0 + 1 + 2 => 3 array index : string
0+"0"+"1"+"2" => "0012"
> "0012indexOftoString..."
Rule: avoid using for..in over arrays
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
9. 9
var sum = 0, value;
var array = [11, 22, 33];
for (value in array) {
sum += value;
}
> sum ?
for (i=0; i < array.length; i++) {
sum += array[i];
}
function addup(element, index, array) {
sum += element;
}
array.forEach(addup);
Rule: avoid using for..in over arrays
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
10. 10
var sum = 0, value;
var array = [11, 22, 33];
for (value in array) {
sum += value;
}
> sum ?
for (i=0; i < array.length; i++) {
sum += array[i];
}
function addup(element, index, array) {
sum += element;
}
array.forEach(addup);
Rule: avoid using for..in over arrays
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
11. Coding Practices and Lint Tools
• Existing Lint-like checkers
– Inspect source code
– Rule-based checking
– Detect common mistakes
• Limitations:
– Approximates behavior
– Unknown aliases
– Lint tools favor precision over soundness
• Difficulty: Precise static program analysis
11
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
12. 12
• Dynamic Linter checking code quality rules for JS
• Open-source, robust, and extensible framework
• Formalized and implemented 28 rules
– Counterparts of static rules
– Additional rules
• Empirical study
– Compare static and dynamic checking
DLint
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
13. a.f = b.g PutField(Read("a", a), "f", GetField(Read("b", b), "g"))
if (a.f()) ... if (Branch(Method(Read("a", a), "f")()) ...
x = y + 1 x = Write("x", Binary(‘+’,Read("y", y), Literal(1))
analysis.Literal(c) analysis.Branch(c)
analysis.Read(n, x) analysis.Write(n, x)
analysis.PutField(b, f, v) analysis.Binary(op, x, y)
analysis.Function(f, isConstructor) analysis.GetField(b,f)
analysis.Method(b, f, isConstructor) analysis.Unary(op, x)
...
13
Jalangi: A Dynamic Analysis Framework for JavaScript
[Sen et al. ESEC/FSE'2013]
16. 16
Language Misuse
Avoid setting properties of primitives,
which has no effect.
var fact = 42;
fact.isTheAnswer = true;
console.log(fact.isTheAnswer);
> undefined
DLint Checker Predicate:
propWrite(base,*,*) Λ isPrim(base)
17. 17
Avoid producing NaN (Not a Number).
var x = 23 – "five";
> NaN
DLint Checker Predicate:
unOp(*, val, NaN) Λ val ≠ NaN
binOp(*, left, right, NaN) Λ left ≠ NaN
Λ right ≠ NaN
call(*, *, args,NaN, *) Λ NaN ≠ args
Uncommon Values
18. 18
Avoid concatenating undefined to string.
var value;
...
var str = "price: ";
...
var result = str + value;
> "price: undefined"
DLint Checker Predicate:
binOp(+, left, right, res)
Λ (left = undefined ∨ right = undefined)
Λ isString(res)
Uncommon Values
19. 19
Beware that all wrapped primitives
coerce to true.
var b = false;
if (new Boolean(b)) {
console.log("true");
}
> true
DLint Checker Predicate:
cond(val) Λ isWrappedBoolean(val)
Λ val.valueOf() = false
API Misuse
20. Avoid accessing the undefined property.
var x; // undefined
var y = {};
y[x] = 23; // { undefined: 23 }
propWrite(*, "undefined",*)
propRead(*, "undefined", *)
20
Type Related Checker
24. • Modify SpiderMonkey to intercept JavaScript files
• Transpile JavaScript code with Jalangi
• DLint monitors runtime states and finds issues
• Reports root cause and code location
24
JS program
Instrumented
JS program
Behavior
Information
Final Report
DLint Checkers
Jalangi
DLint Overview
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
25. 25
Evaluation
Research Questions
• DLint warnings vs. JSHint warnings?
• Additional warnings from DLint?
• Coding convention vs. page popularity?
Experimental Setup
• 200 web sites (top 50 + others)
• Comparison to JSHint
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
26. 26
• Some sites: One approach finds all
• Most sites: Better together
0% 20% 40% 60% 80% 100%
% of warnings on each site
Websites
JSHint Unique Common DLint Unique
% of Warnings: DLint vs. JSHint
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
27. 27
• 53 warnings per page
• 49 are missed by JSHint
0
1
2
3
4
I5 T6 L3 T5 A2 V2 L4 A5 T1 L2 L1 A6 A8 A3 T2 A4 I1 I4 V3 L5 I2 T4 L6 A7 T3
Logarithmicaverage.#
warning/site(base2)
DLint checkers
8
4
2
1
0
Checker shares warning with JSHint
Checker shares no warnings with JSHint
Additional Warnings Reported by DLint
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
28. 28
Correlation between Alexa popularity
and number of DLint warnings: 0.6
0
0.0025
0.005
0.0075
0.01
0.0125
0.015
0.0175
0.02
#DLintWarn./#Cov.Op.
Websites traffic ranking
Quartile 1-3
Mean
0
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0.16
#JSHintWarning/LOC
Websites traffic ranking
Quartile 1-3
Mean
Coding Convention vs. Page Popularity
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
29. 29
74
359
191
416
1401
62
202
2335
4933
79
205
62
125
167
6
15
26
8
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Do not use Number, Boolean, String as a constructor.
The Function constructor is a form of eval.
The object literal notation {} is preferable.
The array literal notation [] is preferable.
document.write can be a form of eval.
Do not override built-in variables.
Implied eval (string instead of function as argument).
eval can be harmful.
Missing 'new' prefix when invoking a constructor.
Fount by JSHint Only Common with DLint
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
30. 30
E.g., 181 calls of eval(), Function(), etc. missed by JSHint
31
77
181
74
833
79
187
413
6
8
0% 20% 40% 60% 80% 100%
WrappedPrimitives
Literals
DoubleEvaluation
ArgumentsVariable
ConstructorFunctions
A8L4A1L1T5
Found by Dlint Only Common with JSHint
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
40. 40
var decode64 = "";
if (dataLength > 3 && dataLength % 4 == 0) {
while (index < dataLength) {
decode64 += String.fromCharCode(...);
}
if (sixbits[3] >= 0x40) {
decode64.length -= 1;
}
}
From Google Octane’s Game Boy Emulator benchmark:
Rule: avoid setting field on primitive values
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
41. 41
var decode64 = "";
if (dataLength > 3 && dataLength % 4 == 0) {
while (index < dataLength) {
decode64 += String.fromCharCode(...);
}
if (sixbits[3] >= 0x40) {
decode64.length -= 1;
}
}
From Google Octane’s Game Boy Emulator benchmark:
No effect because decode64 is a primitive string.
Rule: avoid setting field on primitive values
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
44. www.google.com/chrome,
included code from Modernizr:
https://github.com/Modernizr/Modernizr/pull/1419
44
Rule: avoid using for..in on arrays
for (i in props) { // props is an array
prop = props[i];
before = mStyle.style[prop];
...
}
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
45. 45
Takeaways
Dynamic lint-like checking for JavaScript
• Static checkers are not sufficient, DLint complements
• DLint is an open-source, robust, and extensible tool
– Works on real-world websites
– Found 19 clear bugs on most popular websites
More information:
• Paper: “DLint: Dynamically Checking Bad Coding Practices in JavaScript”
• Source Code: https://github.com/Berkeley-Correctness-Group/DLint
• Google “DLint Berkeley”
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.
46. 46
Takeaways
Thanks!
Dynamic lint-like checking for JavaScript
• Static checkers are not sufficient, DLint complements
• DLint is an open-source, robust, and extensible tool
– Works on real-world websites
– Found 19 clear bugs on most popular websites
More information:
• Paper: “DLint: Dynamically Checking Bad Coding Practices in JavaScript”
• Source Code: https://github.com/Berkeley-Correctness-Group/DLint
• Google “DLint Berkeley”
Liang Gong, Electric Engineering & Computer Science, University of California, Berkeley.