Professor Abhik Roychoudhury discusses automated program repair through his research project TSUNAMi. The key points discussed are:
1) TSUNAMi is a national research project in Singapore from 2015-2020 focused on developing trustworthy systems from untrusted components through techniques like vulnerability discovery, binary hardening, verification, and data protection.
2) Automated program repair aims to automatically detect and fix vulnerabilities in software. This involves techniques like syntactic and semantic repair as well as specification inference to understand intended program behavior.
3) Challenges in automated program repair include weak specifications of intended behavior, large search spaces for candidate patches, and limited applicability of existing techniques.
Clustering is an unsupervised learning technique used to group unlabeled data points together based on similarities. It aims to maximize similarity within clusters and minimize similarity between clusters. There are several clustering methods including partitioning, hierarchical, density-based, grid-based, and model-based. Clustering has many applications such as pattern recognition, image processing, market research, and bioinformatics. It is useful for extracting hidden patterns from large, complex datasets.
This document discusses syntax-directed translation, which refers to a method of compiler implementation where the source language translation is completely driven by the parser. The parsing process and parse trees are used to direct semantic analysis and translation of the source program. Attributes and semantic rules are associated with the grammar symbols and productions to control semantic analysis and translation. There are two main representations of semantic rules: syntax-directed definitions and syntax-directed translation schemes. Syntax-directed translation schemes embed program fragments called semantic actions within production bodies and are more efficient than syntax-directed definitions as they indicate the order of evaluation of semantic actions. Attribute grammars can be used to represent syntax-directed translations.
A MapReduce job usually splits the input data-set into independent chunks which are processed by the map tasks in a completely parallel manner. The framework sorts the outputs of the maps, which are then input to the reduce tasks. Typically both the input and the output of the job are stored in a file-system.
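The split → map → shuffle/sort → reduce pipeline described above can be sketched in plain, single-process Python (an illustration of the dataflow, not the Hadoop API) using the classic word-count job:

```python
from collections import defaultdict

def map_phase(chunks, map_fn):
    # Each input chunk is processed independently, as the framework
    # would do in parallel across map tasks.
    pairs = []
    for chunk in chunks:
        pairs.extend(map_fn(chunk))
    return pairs

def shuffle_sort(pairs):
    # The framework groups and sorts map output by key before reducing.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)

def reduce_phase(groups, reduce_fn):
    return {key: reduce_fn(key, values) for key, values in sorted(groups.items())}

# Classic word count
def wc_map(chunk):
    return [(word, 1) for word in chunk.split()]

def wc_reduce(key, values):
    return sum(values)

chunks = ["the quick brown fox", "the lazy dog", "the fox"]
result = reduce_phase(shuffle_sort(map_phase(chunks, wc_map)), wc_reduce)
print(result)  # counts per word, e.g. 'the' -> 3
```

In a real deployment the three phases run on different machines, with the file system (e.g. HDFS) holding the job's input and output.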
This document discusses spatial data mining and its applications. Spatial data mining involves extracting knowledge and relationships from large spatial databases. It can be used for applications like GIS, remote sensing, medical imaging, and more. Some challenges include the complexity of spatial data types and large data volumes. The document also covers topics like spatial data warehouses, dimensions and measures in spatial analysis, spatial association rule mining, and applications in fields such as earth science, crime mapping, and commerce.
Slot and filler structures represent knowledge through attributes (slots) and their associated values (fillers). Weak slot and filler structures provide little domain knowledge. Frames are a type of weak structure where a frame contains slots describing an entity. Semantic networks also represent knowledge with nodes and labeled links, allowing inheritance of properties through generalization hierarchies. Both frames and semantic networks enable quick retrieval of attribute values and easy description of object relations, but semantic networks additionally allow representation of non-binary predicates and partitioned reasoning about quantified statements.
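As an illustration (not from the document), slot lookup with inheritance through a generalization hierarchy can be sketched in a few lines of Python; the frames and fillers below are the usual textbook examples:

```python
class Frame:
    def __init__(self, name, parent=None, **slots):
        self.name, self.parent, self.slots = name, parent, slots

    def get(self, slot):
        # Inheritance: if the slot has no local filler, search up
        # the generalization hierarchy until one is found.
        if slot in self.slots:
            return self.slots[slot]
        if self.parent is not None:
            return self.parent.get(slot)
        return None

mammal = Frame("Mammal", legs=4, blood="warm")
elephant = Frame("Elephant", parent=mammal, color="grey")
clyde = Frame("Clyde", parent=elephant)

print(clyde.get("color"))  # local to Elephant: 'grey'
print(clyde.get("legs"))   # inherited from Mammal: 4
```

A subclass frame can also override an inherited filler simply by supplying its own value for the slot, which shadows the ancestor's.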
Best Practices for Streaming IoT Data with MQTT and Apache Kafka® (Confluent)
This document discusses best practices for streaming IoT data with MQTT and Apache Kafka. It begins with an example use case of connecting vehicles in an automotive company. It then outlines an architecture showing how sensor data from vehicles can be ingested via MQTT into Kafka and processed using tools like Kafka Streams, TensorFlow, and Elasticsearch. The document also covers a live demo of streaming data from 100,000 simulated connected vehicles. It concludes with best practices for choosing the right tools, separating concerns, handling different data types, and starting projects at a small scale while planning for future growth.
Your Roadmap for an Enterprise Graph Strategy (Neo4j)
This document provides a roadmap for developing an enterprise graph strategy with the following key steps:
1. Design and build a proof-of-concept graph using a small local dataset to demonstrate graph capabilities.
2. Present use cases and example queries to business stakeholders to validate the graph model and gather feedback.
3. Design the production graph schema and build APIs/services to integrate data from multiple sources.
4. Deploy the graph in the cloud and develop applications and reports to mobilize enterprise data using the graph.
Outlier analysis is used to identify outliers, which are data objects that are inconsistent with the general behavior or model of the data. There are two main types of outlier detection - statistical distribution-based detection, which identifies outliers based on how far they are from the average statistical distribution, and distance-based detection, which finds outliers based on how far they are from other data objects. Outlier analysis is useful for tasks like fraud detection, where outliers may indicate fraudulent activity that is different from normal patterns in the data.
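A minimal sketch of both detection styles, with made-up numbers: the first flags points far from the mean in units of standard deviation; the second implements the distance-based DB(radius, min_neighbors) criterion.

```python
import statistics

def zscore_outliers(data, threshold=3.0):
    # Statistical-distribution-based: flag points whose distance from
    # the mean exceeds `threshold` standard deviations.
    mu = statistics.mean(data)
    sigma = statistics.stdev(data)
    return [x for x in data if abs(x - mu) / sigma > threshold]

def distance_outliers(data, radius, min_neighbors):
    # Distance-based: a point is an outlier if fewer than min_neighbors
    # other points lie within `radius` of it.
    outliers = []
    for i, x in enumerate(data):
        neighbors = sum(1 for j, y in enumerate(data)
                        if j != i and abs(x - y) <= radius)
        if neighbors < min_neighbors:
            outliers.append(x)
    return outliers

data = [10, 11, 12, 10, 11, 12, 11, 10, 95]
print(zscore_outliers(data, threshold=2))                  # [95]
print(distance_outliers(data, radius=3, min_neighbors=2))  # [95]
```

In a fraud-detection setting, the flagged values would be transactions that deviate from a customer's normal spending pattern.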
The document discusses the expert system shell CLIPS (C Language Integrated Production System). It describes what an expert system is, the typical structure of an expert system including the knowledge base and inference engine, and how CLIPS allows defining facts, rules, templates, functions, and object-oriented programming concepts like classes and instances. It also covers how CLIPS provides mechanisms for pattern matching, rule execution, and message passing between rules and objects.
The document discusses various clustering approaches including partitioning, hierarchical, density-based, grid-based, model-based, frequent pattern-based, and constraint-based methods. It focuses on partitioning methods such as k-means and k-medoids clustering. K-means clustering aims to partition objects into k clusters by minimizing total intra-cluster variance, representing each cluster by its centroid. K-medoids clustering is a more robust variant that represents each cluster by its medoid or most centrally located object. The document also covers algorithms for implementing k-means and k-medoids clustering.
This document provides an overview of machine learning topics including the current state and open problems in machine learning, notable conferences and publications, and introductions to deep learning, reinforcement learning, and an overview of ICML 2013. It discusses the history and evolution of machine learning from the 1960s to present, highlighting seminal works. For deep learning, it outlines concepts like autoencoders, pre-training, and results from applying deep learning to tasks like image recognition. Reinforcement learning concepts like Markov decision processes and examples like pole balancing are briefly covered.
This document provides an overview of decision trees, including:
- Decision trees classify records by sorting them down the tree from root to leaf node, where each leaf represents a classification outcome.
- Trees are constructed top-down by selecting the most informative attribute to split on at each node, usually based on information gain.
- Trees can handle both numerical and categorical data and produce classification rules from paths in the tree.
- Examples of decision tree algorithms like ID3 that use information gain to select the best splitting attribute are described. The concepts of entropy and information gain are defined for selecting splits.
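The entropy and information-gain measures used by ID3 can be computed directly; the tiny weather-style dataset below is hypothetical:

```python
import math
from collections import Counter

def entropy(labels):
    # Entropy(S) = -sum_c p_c * log2(p_c) over class proportions p_c
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, attribute_index):
    # Gain(S, A) = Entropy(S) - sum_v (|S_v| / |S|) * Entropy(S_v)
    total = len(labels)
    gain = entropy(labels)
    for v in set(row[attribute_index] for row in rows):
        subset = [label for row, label in zip(rows, labels)
                  if row[attribute_index] == v]
        gain -= (len(subset) / total) * entropy(subset)
    return gain

# Toy records: (outlook, windy) -> play?
rows = [("sunny", True), ("sunny", False), ("rain", True), ("rain", False)]
labels = ["no", "yes", "no", "yes"]

print(information_gain(rows, labels, 0))  # outlook: 0.0 (uninformative)
print(information_gain(rows, labels, 1))  # windy: 1.0 (fully determines class)
```

ID3 would therefore choose `windy` as the root split, since it has the higher information gain.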
This document discusses rule-based classification. It describes how rule-based classification models use if-then rules to classify data. It covers extracting rules from decision trees and directly from training data. Key points include using sequential covering algorithms to iteratively learn rules that each cover positive examples of a class, and measuring rule quality based on both coverage and accuracy to determine the best rules.
Presented at Spacewalk 2023
Presented by Christian Posta, solo.io
Title: The Future of Service Mesh
Abstract: Service mesh is a powerful way to solve cross-cutting application-networking concerns, such as load balancing, service resilience, observability, and security. Adopting a mesh for your services can save hundreds of hours of developer time and reduce the burden placed on operations. In this talk we'll explore some common use cases for service mesh, look at some case studies, and then dig into innovation happening in this space such as "sidecar-less" service mesh.
Mining Frequent Patterns, Association and Correlations (Justin Cletus)
This document summarizes Chapter 6 of the book "Data Mining: Concepts and Techniques" which discusses frequent pattern mining. It introduces basic concepts like frequent itemsets and association rules. It then describes several scalable algorithms for mining frequent itemsets, including Apriori, FP-Growth, and ECLAT. It also discusses optimizations to Apriori like partitioning the database and techniques to reduce the number of candidates and database scans.
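A compact (and deliberately unoptimized) sketch of Apriori's level-wise search, using a made-up basket dataset; it shows both the candidate generation and the subset-based pruning:

```python
from itertools import combinations

def apriori(transactions, min_support):
    # Level-wise search: frequent (k-1)-itemsets generate k-candidates,
    # and any candidate with an infrequent subset is pruned (Apriori property).
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / len(transactions)

    items = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    current = {s for s in items if support(s) >= min_support}
    k = 1
    while current:
        frequent.update({s: support(s) for s in current})
        k += 1
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        current = {c for c in candidates
                   if all(frozenset(sub) in frequent
                          for sub in combinations(c, k - 1))
                   and support(c) >= min_support}
    return frequent

transactions = [{"bread", "milk"}, {"bread", "diapers", "beer"},
                {"milk", "diapers", "beer"}, {"bread", "milk", "diapers"}]
freq = apriori(transactions, min_support=0.5)
```

Each level requires one scan over the data per candidate check here; the chapter's optimizations (partitioning, hashing, transaction reduction) exist precisely to cut these scans down.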
This document discusses machine learning concepts including supervised vs. unsupervised learning, clustering algorithms, and specific clustering methods like k-means and k-nearest neighbors. It provides examples of how clustering can be used for applications such as market segmentation and astronomical data analysis. Key clustering algorithms covered are hierarchical methods, partitioning methods, k-means, which groups data by assigning objects to the closest cluster center, and k-nearest neighbors, which classifies new data based on its closest training examples.
Syntax directed translation allows semantic information to be associated with a formal language by attaching attributes to grammar symbols and defining semantic rules. There are several types of attributes including synthesized and inherited. Syntax directed definitions specify attribute values using semantic rules associated with grammar productions. Evaluation of attributes requires determining an order such as a topological sort of a dependency graph. Syntax directed translation schemes embed program fragments called semantic actions within grammar productions. Actions can be placed inside or at the ends of productions. Various parsing strategies like bottom-up can be used to execute the actions at appropriate times during parsing.
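As a sketch of a translation scheme, the hand-written recursive-descent parser below executes its semantic actions (computing a synthesized `val` attribute) at the moment the corresponding production is recognized; it handles only single-digit operands and is an illustration, not the book's notation:

```python
def evaluate(expr):
    # Grammar with embedded semantic actions (synthesized attribute `val`):
    #   E -> T { E.val = T.val } ( '+' T { E.val += T.val } )*
    #   T -> digit { T.val = int(digit) } ( '*' digit { T.val *= int(digit) } )*
    tokens = expr.replace(" ", "")
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def term():
        nonlocal pos
        val = int(tokens[pos]); pos += 1          # action: T.val = int(digit)
        while peek() == "*":
            pos += 1
            val *= int(tokens[pos]); pos += 1     # action fires during the parse
        return val

    def expr_rule():
        nonlocal pos
        val = term()
        while peek() == "+":
            pos += 1
            val += term()                          # action: E.val += T.val
        return val

    return expr_rule()

print(evaluate("2+3*4"))  # 14 -- '*' binds tighter because T handles it
```

Because every attribute here is synthesized, the actions can simply run as the parser returns from each nonterminal; inherited attributes would constrain where in the production body an action may be placed.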
Here are the key points from the AT&T presentation on their "Network AI" framework:
- AT&T is developing an open source framework called "Network AI" to drive their software-defined network transformation.
- The goal is to apply AI/machine learning techniques to continuously optimize their network performance. This will be done by collecting massive amounts of network data and using it to train ML models.
- As part of this effort, AT&T is contributing several open source projects to the Linux Foundation like Airship, Akraino, and Acumos. Airship provides tools for deploying OpenStack and Kubernetes on the edge, Akraino is an edge computing framework, and Acumos is a platform for developing and sharing AI models.
Building an Authorization Solution for Microservices Using Neo4j and OPA (Neo4j)
1. The document discusses building an authorization solution for microservices using Neo4j and OPA.
2. It describes modeling authorization data in a graph database for role-based access control and efficient authorization queries.
3. The proposed solution uses OPA as a centralized decision engine to evaluate authorization policies for microservices in a scalable way.
Unit IV: Uncertainty and Statistical Reasoning in AI (K. Sundar, AP/CSE, VEC)
This document discusses uncertainty and statistical reasoning in artificial intelligence. It covers probability theory, Bayesian networks, and certainty factors. Key topics include probability distributions, Bayes' rule, building Bayesian networks, different types of probabilistic inferences using Bayesian networks, and defining and combining certainty factors. Case studies are provided to illustrate each algorithm.
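Bayes' rule itself is a one-liner; the diagnostic-test numbers below are the standard hypothetical example, not figures from the document:

```python
def bayes(prior, likelihood, false_positive_rate):
    # P(H|E) = P(E|H) P(H) / [ P(E|H) P(H) + P(E|~H) P(~H) ]
    numerator = likelihood * prior
    evidence = numerator + false_positive_rate * (1 - prior)
    return numerator / evidence

# Hypothetical numbers: 1% disease prevalence, 99% test sensitivity,
# 5% false positive rate.
posterior = bayes(prior=0.01, likelihood=0.99, false_positive_rate=0.05)
print(round(posterior, 3))  # 0.167 -- a positive test is still mostly false alarms
```

This is the kind of single-edge inference a Bayesian network generalizes: with many variables, the network's structure lets the same computation be chained along paths of conditional dependencies.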
This presentation introduces clustering analysis and the k-means clustering technique. It defines clustering as an unsupervised method to segment data into groups with similar traits. The presentation outlines different clustering types (hard vs soft), techniques (partitioning, hierarchical, etc.), and describes the k-means algorithm in detail through multiple steps. It discusses requirements for clustering, provides examples of applications, and reviews advantages and disadvantages of k-means clustering.
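The assignment/update loop of k-means can be sketched as follows; the 2-D points are made up, and the initialization is a deterministic toy choice (real implementations use random or k-means++ seeding):

```python
def kmeans(points, k, iterations=100):
    # Toy initialization: first k points as centroids.
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the nearest centroid's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sum(
                (a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments will no longer change
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (1.5, 2), (1, 0), (8, 8), (8.5, 9), (9, 8)]
centroids, clusters = kmeans(points, k=2)
print(sorted(centroids))  # one centroid near each of the two groups
```

Each iteration can only lower the total intra-cluster variance, which is why the loop terminates; the result can still be a local minimum, which is the usual argument for trying several initializations.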
This document provides an overview of microservices architecture, including concepts, characteristics, infrastructure patterns, and software design patterns relevant to microservices. It discusses when microservices should be used versus monolithic architectures, considerations for sizing microservices, and examples of pioneers in microservices implementation like Netflix and Spotify. The document also covers domain-driven design concepts like bounded context that are useful for decomposing monolithic applications into microservices.
This document discusses unsupervised learning approaches including clustering, blind signal separation, and self-organizing maps (SOM). Clustering groups unlabeled data points together based on similarities. Blind signal separation separates mixed signals into their underlying source signals without information about the mixing process. SOM is an algorithm that maps higher-dimensional data onto lower-dimensional displays to visualize relationships in the data.
This presentation introduces Data Preprocessing in the field of Data Mining. Images, examples, and other material are adapted from "Data Mining: Concepts and Techniques" by Jiawei Han, Micheline Kamber, and Jian Pei.
The Fourth Industrial Revolution (also known as Industry 4.0) is the ongoing automation of traditional manufacturing and industrial practices, using modern smart technology.
Event Streaming with Apache Kafka plays a massive role in processing massive volumes of data in real-time in a reliable, scalable, and flexible way integrating with various legacy and modern data sources and sinks.
In this presentation, I want to give you an overview of existing use cases for event streaming technology in a connected world across supply chains, industries and customer experiences that come along with these interdisciplinary data intersections:
• The Automotive Industry (and it’s not only Connected Cars)
• Mobility Services across verticals (transportation, logistics, travel industry, retailing, …)
• Smart Cities (including citizen health services, communication infrastructure, …)
The characteristics and requirements of these industries and sectors are not new. They require data integration, data correlation, and real decoupling, to name a few, but are now facing massively increased volumes of data.
Real-time messaging solutions have existed for many years. Hundreds of platforms exist for data integration (including ETL and ESB tooling or specific IIoT platforms). Proprietary monoliths have monitored plants, telco networks, and other infrastructures in real time for decades. But now Kafka combines all the above characteristics in an open, scalable, and flexible infrastructure to operate mission-critical workloads at scale in real time, and it is taking over the world of connecting data.
Dynamic Itemset Counting (DIC) is an algorithm for efficiently mining frequent itemsets from transactional data that improves upon the Apriori algorithm. DIC allows itemsets to begin being counted as soon as it is suspected they may be frequent, rather than waiting until the end of each pass like Apriori. DIC uses different markings like solid/dashed boxes and circles to track the counting status of itemsets. It can generate frequent itemsets and association rules using conviction in fewer passes over the data compared to Apriori.
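The conviction measure DIC uses for rule generation, conv(X → Y) = (1 - supp(Y)) / (1 - conf(X → Y)), can be computed directly; the transactions below are made up:

```python
def support(itemset, transactions):
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def conviction(antecedent, consequent, transactions):
    # conv(X -> Y) = (1 - supp(Y)) / (1 - conf(X -> Y)).
    # Values above 1 mean the rule is violated less often than it would
    # be if X and Y were independent; an exact rule gives infinity.
    supp_y = support(consequent, transactions)
    conf = (support(antecedent | consequent, transactions)
            / support(antecedent, transactions))
    if conf == 1:
        return float("inf")
    return (1 - supp_y) / (1 - conf)

transactions = [frozenset(t) for t in
                [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"c", "b"}, {"c"}]]
v = conviction(frozenset({"a"}), frozenset({"b"}), transactions)
print(v)  # 1.2 -- mildly better than independence
```

Unlike confidence, conviction accounts for the base rate of the consequent, which is why DIC-style rule generation prefers it for directional rules.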
Keynote given at the Asia Pacific Software Engineering Conference (APSEC), December 2020, on Automated Program Repair technologies and their applications.
Automated Program Repair, Distinguished Lecture at MPI-SWS (Abhik Roychoudhury)
MPI-SWS Distinguished Lecture 2019. The talk focuses on fuzzing, symbolic execution as background technologies and compares their relative power. Then the use of such technologies for automated program repair is investigated.
The document discusses the expert system shell CLIPS (C Language Integrated Production System). It describes what an expert system is, the typical structure of an expert system including the knowledge base and inference engine, and how CLIPS allows defining facts, rules, templates, functions, and object-oriented programming concepts like classes and instances. It also covers how CLIPS provides mechanisms for pattern matching, rule execution, and message passing between rules and objects.
The document discusses various clustering approaches including partitioning, hierarchical, density-based, grid-based, model-based, frequent pattern-based, and constraint-based methods. It focuses on partitioning methods such as k-means and k-medoids clustering. K-means clustering aims to partition objects into k clusters by minimizing total intra-cluster variance, representing each cluster by its centroid. K-medoids clustering is a more robust variant that represents each cluster by its medoid or most centrally located object. The document also covers algorithms for implementing k-means and k-medoids clustering.
This document provides an overview of machine learning topics including the current state and open problems in machine learning, notable conferences and publications, and introductions to deep learning, reinforcement learning, and an overview of ICML 2013. It discusses the history and evolution of machine learning from the 1960s to present, highlighting seminal works. For deep learning, it outlines concepts like autoencoders, pre-training, and results from applying deep learning to tasks like image recognition. Reinforcement learning concepts like Markov decision processes and examples like pole balancing are briefly covered.
This document provides an overview of decision trees, including:
- Decision trees classify records by sorting them down the tree from root to leaf node, where each leaf represents a classification outcome.
- Trees are constructed top-down by selecting the most informative attribute to split on at each node, usually based on information gain.
- Trees can handle both numerical and categorical data and produce classification rules from paths in the tree.
- Examples of decision tree algorithms like ID3 that use information gain to select the best splitting attribute are described. The concepts of entropy and information gain are defined for selecting splits.
This document discusses rule-based classification. It describes how rule-based classification models use if-then rules to classify data. It covers extracting rules from decision trees and directly from training data. Key points include using sequential covering algorithms to iteratively learn rules that each cover positive examples of a class, and measuring rule quality based on both coverage and accuracy to determine the best rules.
Presented at Spacewalk 2023
Presented by Christian Posta, solo.io
Title: The Future of Service Mesh
Abstract: Service mesh is a powerful way to solve cross-cutting application-networking concerns, such as load balancing, service resilience, observability, and security. Adopting a mesh for your services can save hundreds of hours of developer time and reduce the burden placed on operations. In this talk we'll explore some common use cases for service mesh, look at some case studies, and then dig into innovation happening in this space such as "sidecar-less" service mesh.
Mining Frequent Patterns, Association and CorrelationsJustin Cletus
This document summarizes Chapter 6 of the book "Data Mining: Concepts and Techniques" which discusses frequent pattern mining. It introduces basic concepts like frequent itemsets and association rules. It then describes several scalable algorithms for mining frequent itemsets, including Apriori, FP-Growth, and ECLAT. It also discusses optimizations to Apriori like partitioning the database and techniques to reduce the number of candidates and database scans.
This document discusses machine learning concepts including supervised vs. unsupervised learning, clustering algorithms, and specific clustering methods like k-means and k-nearest neighbors. It provides examples of how clustering can be used for applications such as market segmentation and astronomical data analysis. Key clustering algorithms covered are hierarchy methods, partitioning methods, k-means which groups data by assigning objects to the closest cluster center, and k-nearest neighbors which classifies new data based on its closest training examples.
Syntax directed translation allows semantic information to be associated with a formal language by attaching attributes to grammar symbols and defining semantic rules. There are several types of attributes including synthesized and inherited. Syntax directed definitions specify attribute values using semantic rules associated with grammar productions. Evaluation of attributes requires determining an order such as a topological sort of a dependency graph. Syntax directed translation schemes embed program fragments called semantic actions within grammar productions. Actions can be placed inside or at the ends of productions. Various parsing strategies like bottom-up can be used to execute the actions at appropriate times during parsing.
Here are the key points from the AT&T presentation on their "Network AI" framework:
- AT&T is developing an open source framework called "Network AI" to drive their software-defined network transformation.
- The goal is to apply AI/machine learning techniques to continuously optimize their network performance. This will be done by collecting massive amounts of network data and using it to train ML models.
- As part of this effort, AT&T is contributing several open source projects to the Linux Foundation like Airship, Akraino, and Acumos. Airship provides tools for deploying OpenStack and Kubernetes on the edge, while Akraino is an edge computing framework. Acumos allows for developing and
Building an Authorization Solution for Microservices Using Neo4j and OPANeo4j
1. The document discusses building an authorization solution for microservices using Neo4j and OPA.
2. It describes modeling authorization data in a graph database for role-based access control and efficient authorization queries.
3. The proposed solution uses OPA as a centralized decision engine to evaluate authorization policies for microservices in a scalable way.
Unit IV UNCERTAINITY AND STATISTICAL REASONING in AI K.Sundar,AP/CSE,VECsundarKanagaraj1
This document discusses uncertainty and statistical reasoning in artificial intelligence. It covers probability theory, Bayesian networks, and certainty factors. Key topics include probability distributions, Bayes' rule, building Bayesian networks, different types of probabilistic inferences using Bayesian networks, and defining and combining certainty factors. Case studies are provided to illustrate each algorithm.
This presentation introduces clustering analysis and the k-means clustering technique. It defines clustering as an unsupervised method to segment data into groups with similar traits. The presentation outlines different clustering types (hard vs soft), techniques (partitioning, hierarchical, etc.), and describes the k-means algorithm in detail through multiple steps. It discusses requirements for clustering, provides examples of applications, and reviews advantages and disadvantages of k-means clustering.
This document provides an overview of microservices architecture, including concepts, characteristics, infrastructure patterns, and software design patterns relevant to microservices. It discusses when microservices should be used versus monolithic architectures, considerations for sizing microservices, and examples of pioneers in microservices implementation like Netflix and Spotify. The document also covers domain-driven design concepts like bounded context that are useful for decomposing monolithic applications into microservices.
This document discusses unsupervised learning approaches including clustering, blind signal separation, and self-organizing maps (SOM). Clustering groups unlabeled data points together based on similarities. Blind signal separation separates mixed signals into their underlying source signals without information about the mixing process. SOM is an algorithm that maps higher-dimensional data onto lower-dimensional displays to visualize relationships in the data.
This presentation gives the idea about Data Preprocessing in the field of Data Mining. Images, examples and other things are adopted from "Data Mining Concepts and Techniques by Jiawei Han, Micheline Kamber and Jian Pei "
The Fourth Industrial Revolution (also known as Industry 4.0) is the ongoing automation of traditional manufacturing and industrial practices, using modern smart technology.
Event Streaming with Apache Kafka plays a massive role in processing massive volumes of data in real-time in a reliable, scalable, and flexible way integrating with various legacy and modern data sources and sinks.
In this presentation, I want to give you an overview of existing use cases for event streaming technology in a connected world across supply chains, industries and customer experiences that come along with these interdisciplinary data intersections:
• The Automotive Industry (and it’s not only Connected Cars)
• Mobility Services across verticals (transportation, logistics, travel industry, retailing, …)
• Smart Cities (including citizen health services, communication infrastructure, …)
All these industries and sectors do not have new characteristics and requirements. They require data integration, data correlation or real decoupling, just to name a few, but are now facing massively increased volumes of data.
Real-time messaging solutions have existed for many years. Hundreds of platforms exist for data integration (including ETL and ESB tooling or specific IIoT platforms). Proprietary monoliths monitor plants, telco networks, and other infrastructures for decades in real-time. But now, Kafka combines all the above characteristics in an open, scalable, and flexible infrastructure to operate mission-critical workloads at scale in real-time. And is taking over the world of connecting data.
Dynamic Itemset Counting (DIC) is an algorithm for efficiently mining frequent itemsets from transactional data that improves upon the Apriori algorithm. DIC allows itemsets to begin being counted as soon as it is suspected they may be frequent, rather than waiting until the end of each pass like Apriori. DIC uses different markings like solid/dashed boxes and circles to track the counting status of itemsets. It can generate frequent itemsets and association rules using conviction in fewer passes over the data compared to Apriori.
Keynote given at the Asia Pacific Software Engineering Conference (APSEC), December 2020, on Automated Program Repair technologies and their applications.
Automated Program Repair, Distinguished Lecture at MPI-SWS, by Abhik Roychoudhury
MPI-SWS Distinguished Lecture 2019. The talk focuses on fuzzing, symbolic execution as background technologies and compares their relative power. Then the use of such technologies for automated program repair is investigated.
Keynote in KLEE workshop on Symbolic Execution 2018
Systematic greybox fuzzing inspired by ideas from symbolic execution, work at NUS
Covers new usage of symbolic execution in automated program repair, work at NUS
This document summarizes a talk given on receiving the Most Influential Paper award at ICSE 2023, on program repair and auto-coding. It discusses:
1. The 2013 SemFix paper which introduced an automated repair method using symbolic execution, constraint solving, and program synthesis to generate patches without formal specifications.
2. How subsequent work incorporated learning and inference techniques to glean specifications from tests to guide repair when specifications were not available.
3. The impact of machine learning approaches on automated program repair, including learning from large code change datasets to predict edits, and opportunities for continued improvement in localization and accuracy.
The document discusses several techniques for using symbolic execution for software debugging, including regression debugging, cause clue clauses, error invariants, and angelic debugging. Regression debugging involves comparing execution paths of a failing test case in a new buggy program version to paths in an older stable version to find differences that may indicate the root cause. Other techniques use symbolic execution to extract specifications from passing tests, internal program properties, or previous versions to infer the intended behavior and identify inconsistencies in a failing run.
Greybox fuzzing methods to find security vulnerabilities in software systems are discussed in this talk. We discuss how fuzz testing methods can be inspired by ideas from symbolic execution and model checking to go beyond conventional fuzzing methods, without sacrificing the efficiency of fuzzing.
Overview of Fuzz Testing and the latest advances in the field are discussed. Fuzz testing is a popular method to find security vulnerabilities in software systems.
This document summarizes an expert talk on fuzz testing and greybox fuzzing. It discusses various fuzz testing techniques like black-box, white-box, and greybox fuzzing. It explains the greybox fuzzing algorithm and how techniques like directed and structured fuzzing can enhance it. It also discusses applications of fuzzing like finding crashes and vulnerabilities, and integration into tools like OSS-Fuzz. Overall, the document outlines the state-of-the-art in fuzz testing and opportunities to improve greybox fuzzing through techniques inspired by symbolic execution and model checking.
The document discusses trustworthy systems and trusted AI. It provides background on the Singapore Cybersecurity Consortium and its vision of trustworthy systems. It then summarizes ongoing work, including capabilities for security testing, formal verification of systems, and research on defending against Spectre attacks and fuzz testing. It also discusses model training and robustness, fuzzing for deep neural networks, and research on self-healing systems through specification inference and genetic programming.
Introductory talk given to PhD students starting research at NUS PhD open day 2020. Covers research in Computer Science, and some experience in research on trustworthy software systems.
This document summarizes a talk given by Prof. Abhik Roychoudhury about skills needed for a PhD. He discusses obvious skills like analyzing papers and identifying research trends. Less obvious skills include choosing impactful problems and determining if one has the right background. The least obvious skill is determining what constitutes a research contribution, which is qualitative rather than quantitative. The talk provides examples of different types of contributions and emphasizes choosing an interesting research area and topic, considering its relevance over time, potential for translation, and avoiding negative perceptions from the community.
The document summarizes Abhik Roychoudhury's presentation on automated program repair at the ISSTA Summer School 2019. It provides background on Roychoudhury as a professor at the National University of Singapore who works in program analysis and software security. It then outlines some of the key challenges in automated program repair, including the large search space, overfitting patches to test cases, and the scalability of repair techniques. Symbolic execution and random search are discussed as approaches to guiding the repair search process. Specific techniques like cause clue clauses for debugging and generating patch candidates by editing statements are also summarized.
This document summarizes a keynote presentation on timing analysis and testing. It discusses several topics:
- Timing analysis techniques including worst-case execution time analysis, detailed architectural modeling, and the Chronos timing analysis tool.
- Cache analysis including identifying thrashing scenarios, instrumenting assertions, and using symbolic execution to generate tests that expose cache performance issues.
- Applications to multi-core timing analysis, analyzing cache side channels, and generating tests or attack scenarios rather than just worst-case execution bounds.
The document advocates leveraging advances in constraint solving and symbolic execution to develop additional timing analysis applications beyond traditional worst-case execution time analysis.
The document discusses future directions for mobile software with a focus on energy and performance. Some key points:
- Energy and performance are not synonymous and energy measurements are needed to understand energy efficiency.
- Energy bugs and hotspots can be detected by analyzing energy consumption and hardware utilization traces. Refactoring code based on energy guidelines can help fix inefficiencies.
- User reviews and field failures related to battery drain can provide insights and be used to generate tests to localize defects.
- Emerging areas like drone disaster management may benefit from distributed energy management across tasks based on priority and a virtual marketplace model.
The document summarizes information about program repair and semantic repair. It discusses how most software has bugs that are often not fixed for months after being reported. It then describes reasons for program repair including generating patches as better bug reports and automating simple one-line fixes. It notes challenges with repair like weak test cases and large search spaces. It proposes using specifications, dynamic invariants, or test-driven repair as correctness criteria. It characterizes general purpose repair using generate-and-test or specification inference and discusses associated technical challenges. Finally, it discusses interactive and semantics-based repair.
This document discusses binary analysis for vulnerability detection. It describes research conducted at the National University of Singapore on binary analysis techniques like fuzz testing, comprehension, debugging, and patching. It outlines projects with DSO National Labs and the National Research Foundation of Singapore. The research aims to enhance capabilities in detecting vulnerabilities and securing software through automated binary analysis and techniques like fuzzing.
This document discusses using symbolic reasoning and dynamic symbolic execution to help with program debugging, repair, and regression testing. It presents an approach where inputs are grouped based on producing the same symbolic output to more efficiently test programs and debug issues. Relevant slice conditions are computed to precisely capture input-output relationships and group related paths. This technique aims to find a notion of "similarity" between inputs and executions that is coarser than just considering program paths. The approach is demonstrated on example programs and shown to reduce debugging time compared to only considering program paths.
SEMFIX is a program repair technique that uses semantic analysis via symbolic execution. It takes a failing test suite as input, ranks suspicious statements using statistical fault localization, symbolically executes tests to extract specifications of suspicious statements, and uses program synthesis to generate fixes by solving constraints from symbolic execution. The technique aims to infer the intended meaning of code and automatically generate fixes without human guidance.
Why does P’ behave differently than P for input t?
Programmer: P' was changed to handle negative inputs correctly. It was not supposed to change behavior for non-negative inputs like t.
2. TSUNAMi
• National Research Foundation Project
• 2015 - 2020
• Trustworthiness in COTS-integrated platforms
• Trustworthy systems from un-trusted components
http://www.comp.nus.edu.sg/~tsunami
Vulnerability Discovery, Binary Hardening, Verification, Data Protection
Research Outputs – Publications, Tools, Academic Collaboration, Exchanges, Seminars & Workshops
Education – NUS (new module)
Industry Collaboration – Singapore Cyber-security Consortium
Agency Collaboration – DSTA, etc.
Enhancing Local Capabilities
25th ASWEC Keynote Adelaide
3. 2016 event
A team of hackers won $2 million by building a machine that could hack better than they could: the DARPA Cyber Grand Challenge.
Automation of Security ~ detecting and fixing of vulnerabilities
4. In the rest of the talk
• Capsule version
• Executive summary
• Slightly detailed version
• A vision for trustworthy software
Technology is driving us to distraction
5. Repair - Why?
• Maintaining Legacy Software
• Debugging Aid
• Education, Grading in MOOCs
• Security Patches
• Self-healing systems, Drones

Buggy Program + Correctness Criterion → Repair → Patched Program
6. Troubles with repair
• Weak description of intended behavior / correctness criterion, e.g. tests
• Possibility to use the "bugs as deviant behavior" philosophy
• Weak applicability of repair techniques, e.g. only overflow errors
• Large search space of candidate patches for general-purpose repair tools
• Patch suggestions and interactive repair
8. Example

1 int triangle(int a, int b, int c){
2   if (a <= 0 || b <= 0 || c <= 0)
3     return INVALID;
4   if (a == b && b == c)
5     return EQUILATERAL;
6   if (a == b || b != c) // bug!
7     return ISOSCELES;
8   return SCALENE;
9 }

Test id   a   b   c   Oracle       Result
1        -1  -1  -1   INVALID      pass
2         1   1   1   EQUILATERAL  pass
3         2   2   3   ISOSCELES    pass
4         2   3   2   ISOSCELES    fail
5         3   2   2   ISOSCELES    fail
6         2   3   4   SCALENE      fail

Correct fix: (a == b || b == c || a == c)

Traverse all mutations of line 6, and check. (Hard to generate the correct fix this way, since a == c never appears elsewhere in the program.)
OR
Generate the constraint
f(2,2,3) ∧ f(2,3,2) ∧ f(3,2,2) ∧ ¬f(2,3,4)
and get the solution
f(a,b,c) = (a == b || b == c || a == c)
9. Comparison

Syntactic Program Repair
1. Where to fix, which line?
2. Generate patches in the candidate line.
3. Validate the candidate patches against the correctness criterion.

Semantic Program Repair
1. Where to fix, which line(s)?
2. What values should be returned by those lines, e.g. <inp == 1, ret == 0>?
3. What are the expressions which will return such values?

Syntax-based schematic:
for e in Search-space {
    Validate e against Tests
}

Semantics-based schematic:
for t in Tests {
    generate repair constraint Ψt
}
Synthesize e from ∧t Ψt
11. The Heartbleed bug

if (hbtype == TLS1_HB_REQUEST) {
    ...
    memcpy(bp, pl, payload);
    ...
}
(a) The buggy part of the Heartbleed-vulnerable OpenSSL

if (hbtype == TLS1_HB_REQUEST
    && payload + 18 < s->s3->rrec.length) {
    ...
}
(b) A fix generated automatically

if (1 + 2 + payload + 16 > s->s3->rrec.length)
    return 0;
...
if (hbtype == TLS1_HB_REQUEST) {
    ...
}
else if (hbtype == TLS1_HB_RESPONSE) {
    ...
}
return 0;
(c) The developer-provided repair

The Heartbleed Bug is a serious vulnerability in the popular OpenSSL cryptographic software library. This weakness allows stealing the information protected, under normal conditions, by the SSL/TLS encryption used to secure the Internet. SSL/TLS provides communication security and privacy over the Internet for applications such as web, email, instant messaging (IM) and some virtual private networks (VPNs).
--- Source: heartbleed.com
12. Application in education
Use program repair in intelligent tutoring systems to give students individual attention.
Detailed study at IIT-Kanpur, India by Prof. Amey Karkare and team (FSE 2017).
13. In the rest of the talk
• Capsule version
• Executive summary
• Slightly detailed version
• A vision for trustworthy software
14. Trustworthy Software: Space of Problems
• Fuzz Testing
  • Feed semi-random inputs to find hangs and crashes
• Continuous fuzzing
  • Incrementally find new "problems" in software
• Crash reproduction
  • Re-construct a reported crash when the crashing input is not included, due to privacy
• Reaching nooks and corners
• Localizing reported observable errors
• Patching reported errors from input-output examples
15. Space of Techniques

Search
• Random
• Biased-random
• Genetic (AFL Fuzzer)
• Low set-up overhead
• Fast, less accurate
• Use objective function to steer

Symbolic Execution
• Dynamic Symbolic Execution / Concolic Execution
• Cluster paths based on symbolic expressions of variables
• High set-up overhead
• Slow, more accurate
• Use logical formula to steer
16. Search & SE: Interplay

(Random) Search
• Trade-offs: less systematic; easy set-up, execute up to a time budget
• Enhance the effectiveness of search, with symbolic execution as inspiration

Symbolic Execution
• Trade-offs: systematic; more involved set-up, solver calls
• Explore capabilities of symbolic execution beyond directed search
17. Symbolic Execution Tree

int test_me(int Climb, int Up){
    int sep, upward;
    if (Climb > 0) {
        sep = Up;
    } else {
        sep = add100(Up);
    }
    if (sep > 150) {
        upward = 1;
    } else {
        upward = 0;
    }
    if (upward < 0) {
        abort();
    } else {
        return upward;
    }
}

[Tree diagram: branching first on Climb > 0, then on Up > 150. The leaves testing "1 < 0" (the abort branch) are infeasible; the feasible leaves carry witness inputs such as Climb == 1, Up == 200 and Climb == 1, Up == 100.]
18. Typical Use of Symbolic Execution

A spectrum from program analysis to random testing for bug finding:
• Concolic execution: supporting real executions [Directed Automated Random Testing]
• Symbolic execution tree construction, e.g. KLEE [Modeling system environment]
• Grey-box fuzz testing for systematic path exploration, inspired by concolic execution: AFLFast
19. AFLFast [CCS16]

Schematic:
• if (condition1)
•     return   // short path, frequented by many inputs
• else if (condition2)
•     exit     // short paths, frequented by many inputs
• else ...

Use a grey-box fuzzer which keeps track of the path id for a test. Estimate whether a discovered path is high probability or not. Give higher weight to discovered low-probability paths, to gravitate to those -> discover new paths with minimal effort.

Integrated into the distribution of the AFL fuzzer within a year of publication (CCS16); AFL is used on a daily basis by corporations for finding vulnerabilities.
20. Another use of Symbolic Execution [CCS17]

Reachability Analysis: reachability of a location in the program.
- Traverse the symbolic execution tree using search strategies
- Encode it as an optimization problem inside the genetic search of grey-box fuzzing: AFLGo
22. Novel use of symbolic execution

In the absence of formal specifications, analyze the buggy program and its artifacts, such as execution traces, via various heuristics to glean a specification about how it can pass tests and what could have gone wrong!

Specification Inference (TODAY!)
(applications: localization, synthesis, repair)
23. Automated Program Repair
• Weak description of intended behavior / correctness criterion, e.g. tests
• Weak applicability of repair techniques, e.g. only overflow errors
• Large search space of candidate patches for general-purpose repair tools
24. Specification Inference

[Diagram: a test input is run through the program by concrete execution, yielding concrete values; the expected output of the program drives symbolic execution, whose output is a value-set or constraint.]

ICSE13, FSE18
25. Example

int is_upward(int inhibit, int up_sep, int down_sep){
    int bias;
    if (inhibit)
        bias = down_sep;  // bias = up_sep + 100
    else bias = up_sep;
    if (bias > down_sep)
        return 1;
    else return 0;
}

inhibit  up_sep  down_sep  Observed output  Expected output  Result
1        0       100       0                0                pass
1        11      110       0                1                fail
0        100     50        1                1                pass
1        -20     60        0                1                fail
0        0       10        0                0                pass
26. Example

int is_upward(int inhibit, int up_sep, int down_sep){
    int bias;
    if (inhibit)
        bias = f(inhibit, up_sep, down_sep);  // X
    else bias = up_sep;
    if (bias > down_sep)
        return 1;
    else return 0;
}

Symbolic execution with inhibit == 1, up_sep == 11, down_sep == 110:

f(t) == X  ∧  ⋁ over paths j of ( pc_j ∧ out_j == expected_out(t) )

Repair constraint:
f(1,11,110) == X  ∧  ( (X > 110 ∧ 1 == 1) ∨ (X ≤ 110 ∧ 0 == 1) )
27. What it should have been

int is_upward(int inhibit, int up_sep, int down_sep){
    int bias;
    if (inhibit)
        bias = f(inhibit, up_sep, down_sep);
    else bias = up_sep;
    if (bias > down_sep)
        return 1;
    else return 0;
}

Symbolic execution with inhibit == 1, up_sep == 11, down_sep == 110 gives:
f(1,11,110) > 110
28. Fix the suspect
- Accumulated constraints
  - f(1, 11, 110) > 110
  - f(1, 0, 100) ≤ 100
  - …
- Find an f satisfying these constraints
  - by fixing the set of operators appearing in f
- Candidate methods
  - Search over the space of expressions
  - Program synthesis with a fixed set of operators
  - Can also be achieved by second-order constraint solving
- Generated fix
  - f(inhibit, up_sep, down_sep) = up_sep + 100
29. (Second-order) Synthesis
Term = Var | Constant | Term + Term | Term − Term | Constant * Term

Buggy program:
  scanf("%d", &x);
  for (i = 0; i < 10; i++) {
    int t = ρ(i, x);   // ρ is the unknown to synthesize
    if (t > 0) printf("1");
    else printf("0");
  }

Sample test: P(5) → "1110000000", expected "1111111000"

Synthesis specification: ⋀_i output_i == expected_i. Solve for ρ directly.

[Diagram: a search tree over candidate constraints ρ(0,5) > 0, ρ(1,5) > 0, ρ(2,5) > 0, …, branching Yes/No until UNSAT is derived.]

Existing component-based synthesis encodings provide inefficient unsatisfiability proofs because they use a linear integer arithmetic encoding. We designed a new encoding based on propositional logic for efficient unsatisfiability proofs.
30. DirectFix [ICSE2015]
[Pipeline: Tests → Debugging (DSE) → Synthesis (MaxSMT solver)]
Conjure a function which represents a minimal change to the buggy program.
31. Example
Buggy program:
  if (x > y)
    if (x > z) out = 10;
    else out = 20;
  else out = 30;
  return out;

SemFix repair (new expression at one location):
  if (x > y)
    if (x > z) out = 10;
    else out = 20;
  else out = 30;
  return ((x == y) ? ((x == z) ? 10 : 20) : out);

DirectFix repair (minimal change to the conditions):
  if (x >= y)
    if (x >= z) out = 10;
    else out = 20;
  else out = 30;
  return out;

Test cases: all possible orderings of x, y, z

          | #Pgm | Equiv | Same Loc | Diff
SemFix    |  44  |  17%  |   46%    | 6.36
DirectFix |  44  |  53%  |   95%    | 2.31
37. SemGraft Results

Linux Busybox as reference:
Program | Commit  | Bug                | Angelix   | SemGraft
sed     | c35545a | Handle empty match | Correct   | Correct
seq     | f7d1c59 | Wrong output       | Correct   | Correct
sed     | 7666fa1 | Wrong output       | Incorrect | Correct
sort    | d1ed3e6 | Wrong output       | Incorrect | Correct
seq     | d86d20b | Doesn't accept 0   | Incorrect | Correct
sed     | 3a9365e | Handle s///        | Incorrect | Correct

GNU Coreutils as reference:
Program | Commit  | Bug                   | Angelix   | SemGraft
mkdir   | f7d1c59 | Segmentation fault    | Incorrect | Correct
mkfifo  | cdb1682 | Segmentation fault    | Incorrect | Correct
mknod   | cdb1682 | Segmentation fault    | Incorrect | Correct
copy    | f3653f0 | Failed to copy a file | Correct   | Correct
md5sum  | 739cf4e | Segmentation fault    | Correct   | Correct
cut     | 6f374d7 | Wrong output          | Incorrect | Correct
38. Specification Inference [FSE18]
[Diagram, as on slide 24: a test input feeds concrete values into a concrete execution of the program; the expected output of the program drives symbolic execution over paths π1, π2, …, πn; the output is a value-set or constraint x = (…).]
[Diagram: environment modeling — a library model, together with tests or a property, turns a buggy program into a patched program via repair.]
39. Application in Library Modeling
- Modeling libraries for symbolic execution of the application program.
- Do not manually provide library models for symbolic analysis.
- Instead, they can be partially synthesized.

Application:
  void main(int argc, char *argv[]) {
    int a = atoi(argv[1]);
    printf("%d\n", 16 / a);
  }

Library sketch:
  int atoi_sketch(char *arr) {
    int acc;
    for (i = 0; i < strlen(arr); i++)
      acc = ρ(acc, arr[i]);   // ρ is the unknown to synthesize
    return acc;
  }

Tests: P("4") → "4"; P("16") → "1"
Second-order reasoning yields ρ = λx y. 10x + y − 48.
Use this model and symbolic execution to find crashing inputs, e.g. "0".
40. Application in Education [FSE17]
Use program repair in intelligent tutoring systems to give students individual attention. Detailed study at IIT Kanpur (FSE17, and ongoing).
The fact that student programs are often significantly incorrect makes it difficult to fix those programs.
[Diagram: partial repair. Buggy submissions failing several tests (e.g. P:8 F:2, P:5 F:5, P:6 F:4) are run through APR. Symbolic execution first inserts a guarded statement
  + if (true) {
      S;
  + }
and a search then refines the guard:
  + if (E) {
      S;
  + }
A partial repair raises the pass count (e.g. P:9 F:1) even when a full repair (P:10 F:0) is out of reach. P: # of passing tests, F: # of failing tests.]
41. Application in Education
- 43 buggy student submissions from the dataset
- Across 8 unique problems
- 37 TA graders volunteered for the study
- Each TA grades all 43 submissions
- With repair hints for half the submissions
42. Acknowledgments
42
Discussions:
Umair Ahmed & Amey Karkare (IIT-K)
Marcel Boehme (Monash),
Cristian Cadar (Imperial)
Satish Chandra (Facebook),
Kenneth Cheong & Chia Yuan Cho (DSO),
Claire Le Goues (CMU),
Lars Grunske & Yannic Noller (Humboldt)
Sergey Mechtaev (University College London)
Martin Monperrus (KTH)
HDT Nguyen and Dawei Qi
Manh-Dung Nguyen & Van-Thuan Pham (NUS)
Michael Pradel (Darmstadt)
Mukul Prasad & Hiroaki Yoshida (Fujitsu),
Shin Hwei Tan (SUSTech)
Jooyong Yi (Innopolis)
Relevant papers:
http://www.comp.nus.edu.sg/~abhik/projects/Repair/index.html
http://www.comp.nus.edu.sg/~abhik/projects/Fuzz/
43. SG Cyber-security Consortium
40 companies, 10 platinum member companies (>100M)
- JOINT R&D: seed funding (industry-academia pairs), infrastructure sharing
- TECHNOLOGY TALKS: latest technologies and trends, project showcases
- WILD & CRAZY IDEAS (WACI) DAY: research ideas, problem statements
- SPECIAL INTEREST GROUPS: knowledge and idea exchange, R&D partnership exploration
- CYBERSECURITY CAMP: workshops, industry talks, hackathons
- CYBERSECURITY LEAN LAUNCHPAD: business + technical discussions
- SCYFI RESEARCH SHOWCASE: research presentations, talks
Members engage via training, via discussions and advice, and via research collaboration.
44. Special Interest Groups (SIGs)
Each SIG is led by a member organization:
- Threat Intelligence and Incident Response
- Data Protection and Privacy
- System and Software Security
- Mobile Security
- Cybercrime and Investigation
- Cyber-Physical System (CPS) and IoT Security
- Future possibilities
Regular meetings organized in between major events; discussions leading to the formulation of grant calls.
45. Deployment
- Grey-box Fuzzing
  - Enhanced and (more) systematic path coverage and directedness in grey-box fuzzing.
  - AFLFast integrated into the main AFL distribution after discussions.
- Automated Program Repair
  - Infer specifications about repairing a program to meet correctness requirements.
  - Angelix tool used for scalable program repair by ~80 groups.
  - Automated grading, and hints for programming assignments from partial repairs.
- Moving forward
  - Fuzzing and repair tightly integrated for self-healing software.
  - Software can morph to respond to changes in the environment.
  - Built-in Self-Test (BIST) for autonomous software systems.
  - Global as well as local usage via the SG Cyber-security Consortium.
46. BIST for Software
Built-in self-test is common in critical systems such as avionics and military systems; here it would be supplemented by repair.
Can autonomous software test and repair itself autonomously to cater for corner cases? Can autonomous software repair itself subject to changes in its environment?