The document proposes SVMDetect, a novel approach using support vector machines (SVM) to detect anti-patterns in software systems. SVMDetect is evaluated on 3 open-source projects and is shown to have higher accuracy than the state-of-the-art rule-based approach DETEX, both in detecting anti-patterns on subsets of classes and in finding more occurrences overall. The study demonstrates that an SVM-based method can overcome limitations of previous anti-pattern detection techniques.
SVMDetect is a machine learning approach that uses support vector machines (SVM) to detect anti-patterns in software systems. An empirical study found that SVMDetect had higher precision and recall than the state-of-the-art rule-based approach DETEX when detecting four common anti-patterns (Blob, Functional Decomposition, Spaghetti Code, Swiss Army Knife) in subsets and entire systems of ArgoUML, Azureus and Xerces Java projects. The results indicate that an SVM-based approach can overcome limitations of previous anti-pattern detection methods.
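The core idea can be sketched with a toy classifier: train an SVM on per-class metric vectors labeled as anti-pattern or not, then classify unseen classes. This is a minimal illustration in the spirit of SVMDetect, not its actual implementation; the metric choices, values, and labels below are invented.

```python
# Hypothetical sketch of an SVM-based anti-pattern detector: metric names,
# values, and labels are invented for illustration, not taken from the paper.
from sklearn.svm import SVC

# Each row: [number of methods, number of attributes, coupling] for one class.
# Label 1 = Blob anti-pattern, 0 = ordinary class (toy, hand-made examples).
X_train = [
    [60, 40, 25], [55, 35, 30], [70, 50, 28],  # Blob-like classes
    [8, 4, 3], [12, 6, 5], [5, 3, 2],          # ordinary classes
]
y_train = [1, 1, 1, 0, 0, 0]

clf = SVC(kernel="rbf", gamma="scale")
clf.fit(X_train, y_train)

# Classify an unseen "god class" and an unseen small class.
predictions = clf.predict([[65, 45, 27], [10, 5, 4]])
print(predictions.tolist())
```

In practice, the training labels would come from manually validated anti-pattern occurrences, and the feature vector would hold many more structural metrics.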
Using the Machine to predict Testability (Miguel Lopez)
This document discusses using machine learning to predict testability based on source code metrics. It begins with an introduction to the presenting organization and definitions of testability and machine learning concepts. It then shows how decision trees and other machine learning approaches could be used to predict testability levels (high, medium, low) based on source code metrics like number of interfaces, abstractness, and coupling. As an example, metrics from 9 Java packages were analyzed to build and test a predictive model in the Weka machine learning software. However, the document notes the initial model is simplistic and could be improved by incorporating more metrics related to factors in the testability fishbone diagram.
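The decision-tree idea described above can be sketched as follows. The presentation used Weka; scikit-learn stands in here, and the package metrics, values, and labels are invented for illustration, not taken from the document's 9-package dataset.

```python
# A minimal sketch (not the presentation's actual model) of predicting a
# testability level from package metrics with a decision tree.
from sklearn.tree import DecisionTreeClassifier

# Features per package: [number of interfaces, abstractness, coupling]
X = [
    [12, 0.8, 2], [10, 0.7, 3],   # packages labeled high testability
    [5, 0.4, 10], [6, 0.5, 8],    # medium
    [1, 0.1, 30], [0, 0.0, 25],   # low
]
y = ["high", "high", "medium", "medium", "low", "low"]

tree = DecisionTreeClassifier(random_state=0)
tree.fit(X, y)

# Predict the testability level of two unseen packages.
levels = tree.predict([[11, 0.75, 2], [0, 0.05, 28]]).tolist()
print(levels)
```

As the document itself notes, a model this simple is only a starting point; the fishbone-diagram factors would translate into additional features.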
This document summarizes Pooja's seminar presentation on machine learning. It introduces machine learning and compares it to traditional programming. It describes the main types of machine learning: supervised learning which uses labeled data to make predictions, unsupervised learning which finds patterns in unlabeled data, and reinforcement learning where an agent learns from feedback. The document discusses concepts like classification, regression, and feedback in machine learning systems. It also outlines some applications and concludes that machine learning can improve lives by advancing technology.
The Presentation answers various questions such as what is machine learning, how machine learning works, the difference between artificial intelligence, machine learning, deep learning, types of machine learning, and its applications.
These are slides from a talk I gave at the British Computer Society's SIGIST Conference in June 2013. The talk attempts to provoke the audience into thinking beyond the industry's current standard approaches to testing.

IRJET - A Survey on Machine Learning Algorithms, Techniques and Applications (IRJET Journal)
This document discusses machine learning algorithms, techniques, and applications. It begins with an introduction to machine learning and different types of learning including supervised learning, unsupervised learning, reinforcement learning, and others. It then groups various machine learning algorithms based on similarities and compares the performance of popular algorithms like Naive Bayes, support vector machines, and decision trees. The document concludes that machine learning researchers aim to design more efficient algorithms that can perform better across different domains.
This document summarizes the key points from the first lecture of a course on pattern recognition. It introduces common applications of pattern recognition, related fields like machine learning and artificial intelligence. It emphasizes that understanding the properties of the data and incorporating prior knowledge are important for developing effective pattern recognition systems. The course will focus on decision theory, statistical learning methods, and evaluating algorithms rather than on specific feature extraction techniques.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
A Mono- and Multi-objective Approach for Recommending Software Refactoring (Ali Ouni)
This document outlines Ali Ouni's Ph.D. defense presentation on recommending software refactoring using mono-objective and multi-objective approaches. The presentation includes the following key points:
1. It provides context on the need for automated software refactoring recommendation systems to address challenges in manually refactoring code.
2. It describes Ouni's research methodology which involves detecting code smells, generating refactoring recommendations using mono-objective and multi-objective search-based techniques, and evaluating the approaches.
3. It covers code smell detection including generating detection rules using genetic programming from code smell examples, and evaluating the detection approach on several systems.
4. It outlines the presentation including discussing
Recommending Software Refactoring Using Search-based Software Engineering (Ali Ouni)
This document discusses recommending software refactoring using search-based software engineering. It proposes a three-part approach: 1) using genetic programming to generate rules for detecting code smells, 2) applying mono-objective search algorithms like genetic algorithms to recommend refactorings to address code smells, and 3) using a multi-objective algorithm like NSGA-II to recommend refactorings that optimize multiple objectives like quality metrics and design patterns. The approach is evaluated on several systems, achieving over 90% precision on average in detecting three types of code smells.
AsianPLoP'14: How and Why Design Patterns Impact Quality and Future Challenges (Ptidej Team)
The document discusses schemas, design patterns, and how patterns relate to schemas and learning. It notes that schemas are mental representations that allow people to recognize patterns. Design patterns aim to capture best practices and reusable solutions to common problems. When patterns are reused, it activates existing schemas and can lead to better design through experiences being shared. The document explores how pattern reuse relates to theories of schema and learning, and how patterns can be codified and shared between designers based on these theories.
Following the previous lectures, which focus on the practical uses of good and bad practices, we frame these practices in the context of developers' cognitive characteristics. We first recall that developers use sight first and foremost to acquire and reason about their systems. Then, we consider developers' use of patterns and discuss the visual system, memory models, and mental models.
This document summarizes an empirical study that analyzed static relationships between anti-patterns (ineffective code motifs) and design patterns (effective code solutions) in software systems. The study used tools to detect instances of anti-patterns and design patterns, and found relationships like use, association, aggregation and composition between certain anti-patterns and design patterns. For example, the Command design pattern had relationships with 50% of instances of the SpeculativeGenerality anti-pattern in one system. The study provides evidence that some anti-patterns are more likely than others to have relationships with design patterns.
When and Why Your Code Starts to Smell Bad (Fabio Palomba)
In past and recent years, the issues related to managing technical debt received significant attention by researchers from both industry and academia. There are several factors that contribute to technical debt. One of these is represented by code bad smells, i.e., symptoms of poor design and implementation choices. While the repercussions of smells on code quality have been empirically assessed, there is still only anecdotal evidence on when and why bad smells are introduced. To fill this gap, we conducted a large empirical study over the change history of 200 open source projects from different software ecosystems and investigated when bad smells are introduced by developers, and the circumstances and reasons behind their introduction. Our study required the development of a strategy to identify smell-introducing commits, the mining of over 0.5M commits, and the manual analysis of 9,164 of them (i.e., those identified as smell-introducing). Our findings mostly contradict common wisdom stating that smells are being introduced during evolutionary tasks. In the light of our results, we also call for the need to develop a new generation of recommendation systems aimed at properly planning smell refactoring activities.
Detecting bad smells in source code using change history information (Carlos Eduardo)
This document presents an approach called HIST to detect bad smells in source code by analyzing change history information extracted from version control systems. HIST detects 5 types of bad smells from Fowler's catalog by analyzing co-changes between source code artifacts. An evaluation of HIST on open source projects found it outperformed traditional static code analysis techniques, as it can detect smells characterized by how source code changes over time. Future work could explore hybrid detection approaches and applying HIST to other types of smells.
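The co-change intuition behind history-based detection can be sketched in a few lines: count how often pairs of artifacts appear in the same commit and flag pairs that change together suspiciously often. This is a toy stand-in for the idea, not HIST's actual rules; the commit history and threshold are invented.

```python
# Toy sketch of co-change mining: count pairs of files changed in the same
# commit and flag frequently co-changing pairs (invented data and threshold).
from itertools import combinations
from collections import Counter

commits = [  # each commit lists the files it touched
    {"Order.java", "OrderDao.java"},
    {"Order.java", "OrderDao.java", "Util.java"},
    {"Order.java", "OrderDao.java"},
    {"Util.java"},
]

co_changes = Counter()
for files in commits:
    for pair in combinations(sorted(files), 2):
        co_changes[pair] += 1

THRESHOLD = 3  # flag pairs that co-changed in at least 3 commits
suspicious = [pair for pair, n in co_changes.items() if n >= THRESHOLD]
print(suspicious)
```

A real history-based detector would additionally distinguish smell types by the shape of the co-change relationship, not just its frequency.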
The document discusses patterns in software development. It begins by defining what a pattern is, noting that a pattern describes a general reusable solution to a commonly occurring problem within a specific context. It then discusses some qualities of patterns, including that they aim to enhance reusability, encapsulate design experiences, and provide a common vocabulary among designers. The document also notes that patterns aim to capture an ineffable "quality without a name". It provides examples of patterns from different programming languages to illustrate recurring solutions.
This document describes a study comparing SVMDetect, an anti-pattern detection approach using support vector machines (SVM), to DETEX, the previous state-of-the-art approach. SVMDetect is applied to detect four common anti-patterns in three software systems. When applied to subsets of each system, SVMDetect achieves significantly higher precision and recall than DETEX. When applied to entire systems, SVMDetect detects more total occurrences of the "Blob" anti-pattern than DETEX. The authors conclude that an SVM-based approach can overcome limitations of previous rule-based anti-pattern detection approaches.
This document describes SMURF, a SVM-based incremental anti-pattern detection approach. It presents SMURF, compares it to previous approaches, and reports on an empirical study evaluating SMURF. The study shows that SMURF has higher precision and recall than previous approaches, can be applied to subsets or whole systems, and its accuracy improves when incorporating practitioner feedback. Future work involves applying SMURF in real-world settings and integrating it into development tools.
The document describes the SMURF approach for detecting anti-patterns in software systems using support vector machines. SMURF was tested on three open source projects and shown to have higher accuracy than previous approaches like DETEX and BDTEX in terms of precision and recall. The accuracy of SMURF improved when it was trained and applied on the same system compared to different systems, and also increased when practitioner feedback was incorporated. Future work is planned to apply SMURF in more systems and anti-patterns to validate its effectiveness.
In the modern world, we are permanently using, leveraging, interacting with, and relying upon systems of ever higher sophistication, ranging from our cars, recommender systems in eCommerce, and networks when we go online, to integrated circuits when using our PCs and smartphones, security-critical software when accessing our bank accounts, and spreadsheets for financial planning and decision making. The complexity of these systems coupled with our high dependency on them implies both a non-negligible likelihood of system failures, and a high potential that such failures have significant negative effects on our everyday life. For that reason, it is a vital requirement to keep the harm of emerging failures to a minimum, which means minimizing the system downtime as well as the cost of system repair. This is where model-based diagnosis comes into play.
Model-based diagnosis is a principled, domain-independent approach that can be generally applied to troubleshoot systems of a wide variety of types, including all the ones mentioned above. It exploits and orchestrates techniques for knowledge representation, automated reasoning, heuristic problem solving, intelligent search, learning, stochastics, statistics, decision making under uncertainty, as well as combinatorics and set theory to detect, localize, and fix faults in abnormally behaving systems.
In this talk, we will give an introduction to the topic of model-based diagnosis, point out the major challenges in the field, and discuss a selection of approaches from our research addressing these challenges. For instance, we will present methods for the optimization of the time and memory performance of diagnosis systems, show efficient techniques for a semi-automatic debugging by interacting with a user or expert, and demonstrate how our algorithms can be effectively leveraged in important application domains such as scheduling or the Semantic Web.
MEME – An Integrated Tool For Advanced Computational Experiments (GIScRG)
The document describes MEME, an integrated tool for advanced computational experiments. MEME allows users to efficiently explore model responses through parameter sweeps and design of experiments. It supports running simulations in parallel on local clusters and grids. MEME collects, analyzes, and visualizes results. It implements intelligent "IntelliSweep" methods like iterative uniform interpolation and genetic algorithms to refine parameter space exploration.
IRJET - Survey on Malware Detection using Deep Learning Methods (IRJET Journal)
This document discusses various machine learning methods for malware detection, including support vector machines (SVM), random forests, and decision trees. It provides an overview of each method and related works that have applied these techniques. Specifically, it examines analyses that used linear SVM, random forests on Android apps, and an improved decision tree algorithm to classify malware families. The document concludes that machine learning methods have become important for malware detection as signatures alone cannot keep up with new malware variants.
Chapter 10: Testing and Quality Assurance (keturahhazelhurst)
Chapter 10: Testing and Quality Assurance
Objectives
Understand quality and basic techniques for software verification and validation.
Analyze the basics of software testing and testing techniques.
Discuss the concept of the "inspection" process.
Testing Introduction
Quality assurance (QA): activities designed to measure and improve quality in a product and in the process.
Quality control (QC): activities designed to validate and verify the quality of the product by detecting faults and "fixing" the defects.
QA and QC are similar, and both need good techniques, process, tools, and team.
What Is "Quality"?
Two traditional definitions:
Conforms to requirements.
Fit to use.
Verification: checking that the software conforms to its requirements (did the software evolve from the requirements properly; does the software "work"?). Is the system correct?
Validation: checking that the software meets user requirements (is it fit to use?). Are we building the correct system?
Some "Error-Detection" Techniques (finding errors)
Testing: executing the program in a controlled environment and "verifying/validating" its output.
Inspections and reviews.
Formal methods (proving software correct).
Static analysis: detects "error-prone" conditions.
Faults and Failures
Error: a mistake made by a programmer or software engineer that causes a fault, which in turn may cause a failure.
Fault (defect, bug): a condition that may cause a failure in the system.
Failure (problem): the inability of the system to perform a function according to its specification, due to some fault.
Fault or failure/problem severity: based on consequences.
Fault or failure/problem priority: based on the importance of developing a fix, which is in turn based on severity.
Testing
An activity performed for:
Evaluating product quality.
Improving products by identifying defects and having them fixed prior to software release.
Dynamic (running-program) verification of the program's behavior on a finite set of test cases selected from the execution domain.
Testing can NOT prove the product works 100% (even though we use testing to demonstrate that parts of the software work), and testing is not always done!
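The point that testing checks only a finite set of cases can be shown with a deliberately faulty function (an invented example, not from the slides): a test suite that misses the faulty input passes even though the program is wrong.

```python
# Illustration: passing tests do not prove the program correct for every
# input. `buggy_abs` is an invented, deliberately faulty example.
def buggy_abs(x):
    return x if x > 0 else -x if x < 0 else 1  # bug: abs(0) should be 0

# A finite test set that happens to miss the faulty input passes cleanly:
for value, expected in [(5, 5), (-5, 5)]:
    assert buggy_abs(value) == expected  # the fault stays hidden

# Only a test case covering the boundary input exposes the failure:
print(buggy_abs(0))  # a good test suite would assert this equals 0
```

This is exactly why test-case selection techniques (specification-based, code-based, boundary analysis) matter: they aim to pick the finite set most likely to expose faults.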
Testing (cont.)
Who tests? Programmers, testers/requirements analysts, users.
What is tested? Unit code testing, functional code testing, integration/system testing, user interface testing.
Why test? Acceptance (customer), conformance (standards, laws, etc.), configuration (user vs. developer), performance, stress, security, etc.
How (are test cases designed)? Intuition, specification-based (black box), code-based (white box), existing cases (regression).
Progression of Testing
Equivalence Class Partitioning
Divide the input into several groups deemed "equivalent" for the purposes of finding errors.
Pick one "representative" from each class to use in testing.
Equivalence classes are determined by requirements/design specifications and some intuition.
Example: pick "larger" of
two integers and . . .
Goals: lessen duplication; complete coverage.
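For the "larger of two integers" example, equivalence class partitioning might look like the sketch below. The particular class breakdown is one reasonable choice for illustration, not the only valid partition.

```python
# Equivalence class partitioning for a "return the larger of two integers"
# function: one representative test case per class (classes chosen for
# illustration).
def larger(a, b):
    return a if a >= b else b

cases = [
    ((7, 3), 7),     # class: a > b
    ((2, 9), 9),     # class: a < b
    ((4, 4), 4),     # class: a == b (boundary)
    ((-5, -1), -1),  # class: both negative, a < b
]
for (a, b), expected in cases:
    assert larger(a, b) == expected
print("all representatives pass")
```

One representative per class keeps duplication low while the classes together cover the input domain, which is exactly the trade-off the slide describes.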
Suppose we have n distinct functional requirements.
Su ...
IRJET - Intrusion Detection using IP Binding in Real Network (IRJET Journal)
This document summarizes a research paper that proposes using genetic algorithms and support vector machines to improve network intrusion detection. It discusses how genetic algorithms can be used to select optimal features for support vector machine classifiers, in order to speed up training time and improve classification accuracy. The genetic algorithm optimizes the crossover and mutation probabilities during evolution to find the best feature subset for identifying network intrusions using support vector machines. Evaluation of this approach suggests it could enhance the effectiveness of intrusion detection systems.
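The feature-selection idea can be illustrated with a much-simplified search loop: mutate a feature subset and keep it when the score does not drop. This is a toy stand-in for the paper's genetic algorithm (no real crossover or tuned mutation probabilities), with an invented fitness function and data.

```python
# Greatly simplified stand-in for GA-based feature selection: random
# single-feature mutations with greedy acceptance, against a toy fitness.
import random

random.seed(0)
N_FEATURES = 8
USEFUL = {0, 3, 5}  # pretend only these features carry intrusion signal

def fitness(subset):
    # Toy score: reward covering useful features, penalize subset size
    # (a real GA would score an SVM trained on the selected features).
    return len(subset & USEFUL) - 0.1 * len(subset)

best = set(random.sample(range(N_FEATURES), 4))
for _ in range(200):  # mutate: flip one random feature in or out
    candidate = set(best)
    candidate.symmetric_difference_update({random.randrange(N_FEATURES)})
    if candidate and fitness(candidate) >= fitness(best):
        best = candidate

print(sorted(best))
```

A full genetic algorithm adds a population, crossover between subsets, and the evolving crossover/mutation probabilities the paper describes, but the select-score-keep loop is the same.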
Machine learning is a branch of computer science that deals with systems that can learn and improve automatically through experience. The document discusses key machine learning concepts like overfitting, cross-validation techniques to avoid overfitting, supervised vs unsupervised learning, and popular machine learning algorithms like decision trees, neural networks, and support vector machines. It also covers ensemble methods, dimensionality reduction techniques, and applications of machine learning in areas like computer vision and natural language processing.
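The overfitting/cross-validation pairing mentioned above can be demonstrated concretely: an unconstrained decision tree scores perfectly on its own training data, while k-fold cross-validation gives a more honest estimate. The dataset here is synthetic, generated purely for illustration.

```python
# Demonstration of overfitting vs. a cross-validated estimate (synthetic data).
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# An unconstrained tree can memorize the training set...
deep = DecisionTreeClassifier(random_state=0).fit(X, y)
print("train accuracy:", deep.score(X, y))

# ...while 5-fold cross-validation scores the model on held-out folds.
cv_scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
print("mean CV accuracy:", round(cv_scores.mean(), 3))
```

The gap between the two numbers is the overfitting the document warns about; cross-validation is the standard guard against reading the first number as real performance.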
These are slides from an Arithmer seminar given by Dr. Masaaki Uesaka at Arithmer Inc.
They summarize recent methods for quality assurance of machine learning models.
Arithmer Seminar is weekly held, where professionals from within our company give lectures on their respective expertise.
Arithmer Inc. is a mathematics company that originated in the Graduate School of Mathematical Sciences at the University of Tokyo. We apply modern mathematics to bring advanced new AI systems into solutions across a wide range of fields. Our job is to think about how to use AI skillfully to make work more efficient and to produce results that are useful to people.
Arithmer began at the University of Tokyo Graduate School of Mathematical Sciences. Today, our research of modern mathematics and AI systems has the capability of providing solutions when dealing with tough complex issues. At Arithmer we believe it is our job to realize the functions of AI through improving work efficiency and producing more useful results for society.
In a world where many manual operations are mechanized, the definition of the word "manual" is evolving. Computers can play chess, perform surgery, and develop into smarter, more humanlike machines with the aid of machine learning algorithms. To learn more about the Naive Bayes algorithm, visit: https://www.rangtech.com/blog/naive-bayes-algorithm
Review of Algorithms for Crime Analysis & Prediction (IRJET Journal)
This document reviews algorithms that can be used for crime analysis and prediction. It discusses various data mining and machine learning techniques including classification algorithms like decision trees, k-nearest neighbors, and random forests as well as clustering algorithms like k-means clustering. Deep learning techniques are also examined for identifying relationships between different types of crimes and predicting where and when crimes may occur. The document evaluates these different algorithmic approaches and concludes that major developments in data science and machine learning now allow for effective crime analysis and prediction by discovering patterns in criminal data.
Identifying and classifying unknown Network Disruption (jagan477830)
This document discusses identifying and classifying unknown network disruptions using machine learning algorithms. It begins by introducing the problem and importance of identifying network disruptions. Then it discusses related work on classifying network protocols. The document outlines the dataset and problem statement of predicting fault severity. It describes the machine learning workflow and various algorithms like random forest, decision tree and gradient boosting that are evaluated on the dataset. Finally, it concludes with achieving the objective of classifying disruptions and discusses future work like optimizing features and using neural networks.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
A Mono- and Multi-objective Approach for Recommending Software RefactoringAli Ouni
This document outlines Ali Ouni's Ph.D. defense presentation on recommending software refactoring using mono-objective and multi-objective approaches. The presentation includes the following key points:
1. It provides context on the need for automated software refactoring recommendation systems to address challenges in manually refactoring code.
2. It describes Ouni's research methodology which involves detecting code smells, generating refactoring recommendations using mono-objective and multi-objective search-based techniques, and evaluating the approaches.
3. It covers code smell detection including generating detection rules using genetic programming from code smell examples, and evaluating the detection approach on several systems.
4. It outlines the presentation including discussing
Recommending Software Refactoring Using Search-based Software EnginneringAli Ouni
This document discusses recommending software refactoring using search-based software engineering. It proposes a three-part approach: 1) using genetic programming to generate rules for detecting code smells, 2) applying mono-objective search algorithms like genetic algorithms to recommend refactorings to address code smells, and 3) using a multi-objective algorithm like NSGA-II to recommend refactorings that optimize multiple objectives like quality metrics and design patterns. The approach is evaluated on several systems, achieving over 90% precision on average in detecting three types of code smells.
AsianPLoP'14: How and Why Design Patterns Impact Quality and Future ChallengesPtidej Team
The document discusses schemas, design patterns, and how patterns relate to schemas and learning. It notes that schemas are mental representations that allow people to recognize patterns. Design patterns aim to capture best practices and reusable solutions to common problems. When patterns are reused, it activates existing schemas and can lead to better design through experiences being shared. The document explores how pattern reuse relates to theories of schema and learning, and how patterns can be codified and shared between designers based on these theories.
Following the previous lectures that focus on the practical uses of good and bad practices, we frame this practices in the context of the developers' cognitive characteristics. We first recall that developers' use sight first and foremost to acquire and reason about their systems. Then, we cast the use of patterns by developers and then discuss the visual system, memory models, and mental models.
This document summarizes an empirical study that analyzed static relationships between anti-patterns (ineffective code motifs) and design patterns (effective code solutions) in software systems. The study used tools to detect instances of anti-patterns and design patterns, and found relationships like use, association, aggregation and composition between certain anti-patterns and design patterns. For example, the Command design pattern had relationships with 50% of instances of the SpeculativeGenerality anti-pattern in one system. The study provides evidence that some anti-patterns are more likely than others to have relationships with design patterns.
When and Why Your Code Starts to Smell BadFabio Palomba
In past and recent years, the issues related to man- aging technical debt received significant attention by researchers from both industry and academia. There are several factors that contribute to technical debt. One of these is represented by code bad smells, i.e., symptoms of poor design and implementation choices. While the repercussions of smells on code quality have been empirically assessed, there is still only anecdotal evidence on when and why bad smells are introduced. To fill this gap, we conducted a large empirical study over the change history of 200 open source projects from different software ecosystems and investigated when bad smells are introduced by developers, and the circumstances and reasons behind their introduction. Our study required the development of a strategy to identify smell- introducing commits, the mining of over 0.5M commits, and the manual analysis of 9,164 of them (i.e., those identified as smell- introducing). Our findings mostly contradict common wisdom stating that smells are being introduced during evolutionary tasks. In the light of our results, we also call for the need to develop a new generation of recommendation systems aimed at properly planning smell refactoring activities.
Detecting bad smells in source code using change history informationCarlos Eduardo
This document presents an approach called HIST to detect bad smells in source code by analyzing change history information extracted from version control systems. HIST detects 5 types of bad smells from Fowler's catalog by analyzing co-changes between source code artifacts. An evaluation of HIST on open source projects found it outperformed traditional static code analysis techniques, as it can detect smells characterized by how source code changes over time. Future work could explore hybrid detection approaches and applying HIST to other types of smells.
The document discusses patterns in software development. It begins by defining what a pattern is, noting that a pattern describes a general reusable solution to a commonly occurring problem within a specific context. It then discusses some qualities of patterns, including that they aim to enhance reusability, encapsulate design experiences, and provide a common vocabulary among designers. The document also notes that patterns aim to capture an ineffable "quality without a name". It provides examples of patterns from different programming languages to illustrate recurring solutions.
This document describes a study comparing SVMDetect, an anti-pattern detection approach using support vector machines (SVM), to DETEX, the previous state-of-the-art approach. SVMDetect is applied to detect four common anti-patterns in three software systems. When applied to subsets of each system, SVMDetect achieves significantly higher precision and recall than DETEX. When applied to entire systems, SVMDetect detects more total occurrences of the "Blob" anti-pattern than DETEX. The authors conclude that an SVM-based approach can overcome limitations of previous rule-based anti-pattern detection approaches.
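As a rough, self-contained illustration of the core idea (classifying classes as anti-patterns or not from their metrics), the sketch below trains a minimal linear SVM via hinge-loss stochastic subgradient descent on invented, normalized class metrics. The feature values, labels, and training parameters are hypothetical and far simpler than what SVMDetect actually uses.

```python
import random

def train_linear_svm(X, y, lr=0.1, lam=0.001, epochs=500, seed=0):
    """Train a linear SVM by stochastic subgradient descent on the hinge loss.

    X: feature vectors (e.g. normalized class metrics such as number of
    methods and coupling); y: labels in {-1, +1}, where +1 marks an
    anti-pattern. All names and values here are illustrative.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for i in rng.sample(range(len(X)), len(X)):
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj - lr * lam * wj for wj in w]  # L2 regularization step
            if margin < 1:  # hinge term is active inside the margin
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Toy training set: "Blob-like" classes have many methods and high coupling.
X = [[0.9, 0.8], [0.8, 0.9], [1.0, 0.7],    # anti-pattern examples (+1)
     [0.1, 0.2], [0.2, 0.1], [0.15, 0.15]]  # normal classes (-1)
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
```

A real detector would use many more metrics, kernel SVMs, and training labels taken from manually validated anti-pattern occurrences.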
This document describes SMURF, a SVM-based incremental anti-pattern detection approach. It presents SMURF, compares it to previous approaches, and reports on an empirical study evaluating SMURF. The study shows that SMURF has higher precision and recall than previous approaches, can be applied to subsets or whole systems, and its accuracy improves when incorporating practitioner feedback. Future work involves applying SMURF in real-world settings and integrating it into development tools.
The document describes the SMURF approach for detecting anti-patterns in software systems using support vector machines. SMURF was tested on three open source projects and shown to have higher accuracy than previous approaches such as DETEX and BDTEX in terms of precision and recall. SMURF's accuracy improved when it was trained and applied on the same system rather than on different systems, and also increased when practitioner feedback was incorporated. Future work will apply SMURF to more systems and anti-patterns to validate its effectiveness.
In the modern world, we are permanently using, leveraging, interacting with, and relying upon systems of ever higher sophistication, ranging from our cars, recommender systems in eCommerce, and networks when we go online, to integrated circuits when using our PCs and smartphones, security-critical software when accessing our bank accounts, and spreadsheets for financial planning and decision making. The complexity of these systems coupled with our high dependency on them implies both a non-negligible likelihood of system failures, and a high potential that such failures have significant negative effects on our everyday life. For that reason, it is a vital requirement to keep the harm of emerging failures to a minimum, which means minimizing the system downtime as well as the cost of system repair. This is where model-based diagnosis comes into play.
Model-based diagnosis is a principled, domain-independent approach that can be generally applied to troubleshoot systems of a wide variety of types, including all the ones mentioned above. It exploits and orchestrates techniques for knowledge representation, automated reasoning, heuristic problem solving, intelligent search, learning, stochastics, statistics, decision making under uncertainty, as well as combinatorics and set theory to detect, localize, and fix faults in abnormally behaving systems.
In this talk, we will give an introduction to the topic of model-based diagnosis, point out the major challenges in the field, and discuss a selection of approaches from our research addressing these challenges. For instance, we will present methods for the optimization of the time and memory performance of diagnosis systems, show efficient techniques for a semi-automatic debugging by interacting with a user or expert, and demonstrate how our algorithms can be effectively leveraged in important application domains such as scheduling or the Semantic Web.
MEME – An Integrated Tool For Advanced Computational Experiments (GIScRG)
The document describes MEME, an integrated tool for advanced computational experiments. MEME allows users to efficiently explore model responses through parameter sweeps and design of experiments. It supports running simulations in parallel on local clusters and grids. MEME collects, analyzes, and visualizes results. It implements intelligent "IntelliSweep" methods like iterative uniform interpolation and genetic algorithms to refine parameter space exploration.
IRJET - Survey on Malware Detection using Deep Learning Methods (IRJET Journal)
This document discusses various machine learning methods for malware detection, including support vector machines (SVM), random forests, and decision trees. It provides an overview of each method and related works that have applied these techniques. Specifically, it examines analyses that used linear SVM, random forests on Android apps, and an improved decision tree algorithm to classify malware families. The document concludes that machine learning methods have become important for malware detection as signatures alone cannot keep up with new malware variants.
Chapter 10 Testing and Quality Assurance (keturahhazelhurst)
Chapter 10: Testing and Quality Assurance
Objectives
Understand quality and basic techniques for software verification and validation.
Analyze the basics of software testing and testing techniques.
Discuss the concept of the "inspection" process.
Testing Introduction
Quality assurance (QA): activities designed to measure and improve quality in a product and its process.
Quality control (QC): activities designed to validate and verify the quality of the product by detecting faults and "fixing" the defects.
QA and QC are similar and closely related.
Good techniques, processes, tools, and a good team are needed.
What Is "Quality"?
Two traditional definitions:
Conforms to requirements.
Fit to use.
Verification: checking that the software conforms to its requirements (did the software evolve from the requirements properly; does the software "work"?). Is the system correct?
Validation: checking that the software meets user requirements (fit to use). Are we building the correct system?
Some "Error-Detection" Techniques (finding errors)
Testing: executing the program in a controlled environment and verifying/validating its output.
Inspections and reviews.
Formal methods (proving software correct).
Static analysis: detects "error-prone conditions".
Faults and Failures
Error: a mistake made by a programmer or software engineer that causes a fault, which in turn may cause a failure.
Fault (defect, bug): a condition that may cause a failure in the system.
Failure (problem): the inability of the system to perform a function according to its specification, due to some fault.
Fault or failure severity is based on consequences.
Fault or failure priority is based on the importance of developing a fix, which is in turn based on severity.
Testing
An activity performed for:
Evaluating product quality.
Improving products by identifying defects and having them fixed prior to software release (not always done!).
Dynamic (running-program) verification of a program's behavior on a finite set of test cases selected from the execution domain.
Testing can NOT prove the product works 100%, even though we use testing to demonstrate that parts of the software work.
Testing (cont.)
Who tests?
Programmers
Testers/requirements analysts
Users
What is tested?
Unit code testing
Functional code testing
Integration/system testing
User interface testing
Why test?
Acceptance (customer)
Conformance (standards, laws, etc.)
Configuration (user vs. developer)
Performance, stress, security, etc.
How are test cases designed?
Intuition
Specification-based (black box)
Code-based (white box)
Existing cases (regression)
Progression of Testing
Equivalence Class Partitioning
Divide the input into several groups deemed "equivalent" for the purpose of finding errors.
Pick one "representative" from each class to use for testing.
Equivalence classes are determined by requirements/design specifications and some intuition.
Example: pick the "larger" of two integers and . . .
Lessen duplication.
Complete coverage.
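The "larger of two integers" example can be made concrete: partition the inputs into classes such as a > b, a < b, and a == b (plus sign combinations, if desired), then test one representative per class. The function and the chosen partitions below are illustrative.

```python
def larger(a, b):
    """Function under test: return the larger of two integers."""
    return a if a >= b else b

# One representative input per equivalence class.
equivalence_classes = {
    "a greater than b": ((7, 3), 7),
    "a less than b":    ((2, 9), 9),
    "a equal to b":     ((4, 4), 4),
    "both negative":    ((-5, -1), -1),
    "mixed signs":      ((-3, 3), 3),
}

for name, (args, expected) in equivalence_classes.items():
    assert larger(*args) == expected, name
```

Five test cases cover the whole integer input space without duplicating inputs that would exercise the same behavior.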
Suppose we have n distinct functional requirements.
Su ...
IRJET- Intrusion Detection using IP Binding in Real Network (IRJET Journal)
This document summarizes a research paper that proposes using genetic algorithms and support vector machines to improve network intrusion detection. It discusses how genetic algorithms can be used to select optimal features for support vector machine classifiers, in order to speed up training time and improve classification accuracy. The genetic algorithm optimizes the crossover and mutation probabilities during evolution to find the best feature subset for identifying network intrusions using support vector machines. Evaluation of this approach suggests it could enhance the effectiveness of intrusion detection systems.
Machine learning is a branch of computer science that deals with systems that can learn and improve automatically through experience. The document discusses key machine learning concepts like overfitting, cross-validation techniques to avoid overfitting, supervised vs unsupervised learning, and popular machine learning algorithms like decision trees, neural networks, and support vector machines. It also covers ensemble methods, dimensionality reduction techniques, and applications of machine learning in areas like computer vision and natural language processing.
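To make the cross-validation idea concrete, here is a plain k-fold split evaluating a simple 1-nearest-neighbour classifier fold by fold; the two-cluster dataset is synthetic and the classifier choice is arbitrary, used only to show the mechanics of holding out each fold in turn.

```python
def k_fold_splits(n, k):
    """Partition indices 0..n-1 into k contiguous, near-equal folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def nearest_neighbour(train_X, train_y, x):
    """Classify x by the label of its closest training point (1-NN)."""
    dists = [sum((a - b) ** 2 for a, b in zip(xi, x)) for xi in train_X]
    return train_y[dists.index(min(dists))]

def cross_val_accuracy(X, y, k=3):
    """Average held-out accuracy over k folds (counters overfitting optimism)."""
    folds = k_fold_splits(len(X), k)
    accuracies = []
    for fold in folds:
        train_idx = [i for i in range(len(X)) if i not in fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(nearest_neighbour(train_X, train_y, X[i]) == y[i]
                      for i in fold)
        accuracies.append(correct / len(fold))
    return sum(accuracies) / k

# Synthetic two-cluster data: class 0 near the origin, class 1 near (1, 1).
X = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [0.9, 1.0], [1.0, 0.9], [0.8, 1.0]]
y = [0, 0, 0, 1, 1, 1]
```

In practice the data would be shuffled (or stratified) before splitting; contiguous folds are used here only to keep the example deterministic.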
These are slides used at an Arithmer seminar given by Dr. Masaaki Uesaka at Arithmer Inc.
They summarize recent methods for quality assurance of machine learning models.
The Arithmer Seminar is held weekly, where professionals from within our company give lectures on their respective areas of expertise.
Arithmer Inc. is a mathematics company that originated in the Graduate School of Mathematical Sciences at the University of Tokyo. We apply modern mathematics to bring advanced new AI systems to solutions in a wide variety of fields. Our job is to think about how to use AI well to make work more efficient and to produce results that are useful to people.
Arithmer began at the University of Tokyo Graduate School of Mathematical Sciences. Today, our research of modern mathematics and AI systems has the capability of providing solutions when dealing with tough complex issues. At Arithmer we believe it is our job to realize the functions of AI through improving work efficiency and producing more useful results for society.
In a world where many manual operations are mechanized, the definition of the word "manual" is evolving. Computers can play chess, perform surgery, and develop into smarter, more humanlike machines with the aid of machine learning algorithms. To learn more about the Naive Bayes algorithm, visit https://www.rangtech.com/blog/naive-bayes-algorithm
Review of Algorithms for Crime Analysis & Prediction (IRJET Journal)
This document reviews algorithms that can be used for crime analysis and prediction. It discusses various data mining and machine learning techniques including classification algorithms like decision trees, k-nearest neighbors, and random forests as well as clustering algorithms like k-means clustering. Deep learning techniques are also examined for identifying relationships between different types of crimes and predicting where and when crimes may occur. The document evaluates these different algorithmic approaches and concludes that major developments in data science and machine learning now allow for effective crime analysis and prediction by discovering patterns in criminal data.
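As a sketch of the clustering technique mentioned above, here is a bare-bones k-means (Lloyd's algorithm) run on invented 2-D "incident location" points; the data and the fixed initial centroids are purely illustrative.

```python
def k_means(points, centroids, iters=10):
    """Lloyd's algorithm: assign points to the nearest centroid, then
    move each centroid to the mean of its assigned points."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        centroids = [
            [sum(coords) / len(cluster) for coords in zip(*cluster)]
            if cluster else c  # keep an empty cluster's centroid unchanged
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Two spatial clusters of incident coordinates (illustrative data).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (6.0, 6.0)])
```

The resulting clusters could then feed a prediction step, e.g. estimating where incidents concentrate; real systems would also choose k and the initial centroids more carefully.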
Identifying and classifying unknown Network Disruption (jagan477830)
This document discusses identifying and classifying unknown network disruptions using machine learning algorithms. It begins by introducing the problem and the importance of identifying network disruptions, then discusses related work on classifying network protocols. The document outlines the dataset and the problem statement of predicting fault severity. It describes the machine learning workflow and the various algorithms, such as random forest, decision tree, and gradient boosting, that are evaluated on the dataset. Finally, it concludes that the objective of classifying disruptions was achieved and discusses future work such as optimizing features and using neural networks.
Email Spam Detection Using Machine Learning (IRJET Journal)
This document discusses using machine learning techniques to detect email spam. It begins with an introduction to the growing problem of email spam and the need for detection techniques. It then discusses commonly used machine learning algorithms for classification like Naive Bayes, Support Vector Machines, Decision Trees, KNN, and Random Forests. The document outlines the objectives, scope and architecture of developing a model to classify emails as spam or not spam. It presents diagrams of the system workflow and components. The conclusion discusses building a system for spam detection using machine learning algorithms on email data and evaluating performance based on accuracy.
The document discusses current practices in software testing, specifically around the use of mock objects. It presents results from several studies:
1. A study of over 2,000 test dependencies in open source and proprietary projects found that the use of mocks depends on the class's responsibilities and architecture. Databases and external dependencies are often mocked, while domain objects are less likely to be mocked.
2. A survey of over 100 professionals found that mocks are mostly introduced when test classes are first created and tend to remain throughout the test class's lifetime. Mocks are occasionally removed in response to changes.
3. An analysis of code review practices for tests found that tests receive fewer reviews than production code. Developers see value in reviewing
- The document summarizes a presentation given by Andy Zaidman at the International Conference on Automation of Software Test (AST 2023) in Melbourne, Australia on May 16th, 2023.
- It discusses findings from studies on how developers engineer test cases and their testing behaviors in IDEs, including strategies like being guided by documentation or code.
- It also presents recommendations to improve developer testing through better tool support, clear adequacy criteria in education, and a focus on improving the user and developer experience of testing tools and processes.
From Model-based to Model and Simulation-based Systems Architectures (Obeo)
Achieving quality engineering through descriptive and analytical models
Systems architecture design is a key activity that affects the overall systems engineering cost. It is hence fundamental to ensure that the system architecture reaches a proper quality. In this paper, we leverage MBSE approaches and complement them with simulation techniques, as a promising way to improve the quality of the system architecture definition and to come up with innovative solutions while securing the systems engineering process.
In this presentation I will show a set of important topics about empirical studies in software engineering that can be useful for increasing the quality of your thesis and monographs in general. You can read this presentation and think about how to do good experimentation by defining its objectives, validation methods, questions, and expected answers, and by defining and measuring metrics. I will also show how researchers can select data so as to avoid biased case studies, using the GQM methodology to organize the study in a simpler view.
Sentiment Analysis using Naïve Bayes, CNN, SVM (IRJET Journal)
This document summarizes research on sentiment analysis techniques including the Naive Bayes, CNN, and SVM algorithms. It discusses how each algorithm works, such as Naive Bayes calculating probabilities to determine the sentiment class and CNN applying convolutional filters to text representations. The document also reviews related work applying these algorithms to sentiment analysis and compares their performance, with Naive Bayes and SVM found to have accuracy comparable to CNN. Finally, it concludes that different transformations may impact algorithm results and that determining the best technique depends on the dataset and language.
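To illustrate the Naive Bayes step described above (computing per-class probabilities from word counts), here is a minimal multinomial Naive Bayes classifier with Laplace smoothing; the tiny labeled corpus is invented for demonstration.

```python
import math
from collections import Counter

def train_nb(docs):
    """docs: list of (tokens, label) pairs. Returns the fitted model."""
    priors = Counter(label for _, label in docs)        # class document counts
    word_counts = {label: Counter() for label in priors}
    for tokens, label in docs:
        word_counts[label].update(tokens)
    vocab = {w for tokens, _ in docs for w in tokens}
    return priors, word_counts, vocab, len(docs)

def classify_nb(model, tokens):
    priors, word_counts, vocab, n_docs = model
    def log_posterior(label):
        total = sum(word_counts[label].values())
        score = math.log(priors[label] / n_docs)        # log P(class)
        for w in tokens:  # log P(word | class) with Laplace (+1) smoothing
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        return score
    return max(priors, key=log_posterior)

corpus = [
    ("great movie loved it".split(), "pos"),
    ("wonderful acting great plot".split(), "pos"),
    ("terrible movie hated it".split(), "neg"),
    ("awful plot terrible acting".split(), "neg"),
]
model = train_nb(corpus)
```

Working in log space avoids numeric underflow when multiplying many small word probabilities, which is why the "probability calculation" is written as a sum of logarithms.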
The document discusses the history and current state of software engineering and its application to IoT systems. It notes that 50 years after the earliest software projects, issues still include cost overruns, property damage, risks to life and death, and challenges ensuring quality. For IoT, fragmentation across hardware, software, APIs and standards poses significant problems. The document proposes that research into IoT software engineering could help address these issues through approaches like developing software to run across diverse IoT platforms, and automatically miniaturizing software through techniques like multi-objective optimization to suit different IoT device capabilities.
1) Issue trackers are often used to track more than just bugs, including features, enhancements, and refactoring work.
2) A manual analysis found that nearly half of issues labeled as "bugs" in issue trackers were actually not bugs.
3) Relying on issue tracker labels alone can introduce significant errors into datasets used for tasks like bug prediction and severity estimation. More work is needed to clean noisy and unreliable data.
The document discusses how to derive dependency structures for legacy J2EE applications. It proposes analyzing all application tiers together using a language-independent model and parsing various artifacts. Configuration files and limited data flow analysis are used to understand dependencies. Container dependencies are explicitly codified by studying technology specifications and codifying dependency rules to apply when certain code patterns are detected in applications. This allows completing an application's dependency graph.
The document discusses the state of practices of service identification in the industry for migrating legacy systems to service-oriented architectures (SOA). It finds that while service identification is seen as important, it remains primarily a manual process focused on identifying coarse-grained business services from source code and business processes. Wrapping and clustering functionalities are common techniques. Fully automating service identification is still challenging due to the need to understand complex legacy system dependencies. The document recommends service identification be business-driven and follow proven methodologies.
This document discusses techniques for testing advanced driver assistance systems (ADAS) through physics-based simulation. It faces challenges due to the large, complex, and multidimensional test input space as well as the computational expense of simulation. The document proposes using a genetic algorithm guided by decision trees to more efficiently search for critical test cases. Classification trees are built to partition the input space into homogeneous regions in order to better guide the selection and generation of test inputs toward more critical areas.
The document reports on the findings of a survey of 45 industrial practitioners on their experiences with legacy-to-SOA migrations. The key findings include: 1) Practitioners migrate legacy systems implemented in Cobol and Java to reduce maintenance costs and improve flexibility/interoperability; 2) Identifying services is an important step but is mostly manual and business-driven; 3) The most used techniques are functionality clustering and wrapping; 4) Desired service qualities are reusability, granularity and loose coupling; 5) Identified services prioritize domain-specific over technical services; 6) RESTful services are most targeted technology.
The document investigates the impact of linguistic anti-patterns (LAs) on program comprehension. It defines LAs as bad naming, documentation, and implementation practices. A study was conducted involving 92 students assessing programs with and without LAs. The study found that LAs negatively impact understandability by increasing time and reducing correctness. Certain LAs like A2, B4, and D1 had a stronger negative effect than others like E1. The study also found that providing knowledge about LAs can help mitigate their impact by making programs easier and faster to comprehend.
The document discusses research on identifying and analyzing the impact of patterns on the quality of multi-language systems. The objectives are to collect and categorize sets of programming languages used together, detect patterns in multi-language programs to track bugs and provide best practices, and study how patterns impact quality. The contributions will be a catalog of multi-language patterns and defects, a detection tool, and analysis of patterns' effects on quality attributes. Current work includes reviewing literature on language combinations and patterns to provide recommendations for high-quality multi-language development.
This document discusses research on change impact analysis in multi-language systems. It begins by outlining recommendations for best practices when using JNI, such as passing primitive types, minimizing calls between native and Java code, and properly handling strings. It then describes a qualitative analysis of JNI usage that identified common practices and issues. Finally, it proposes future work to survey developers on applying recommendations to facilitate change impact analysis in multi-language systems.
The document summarizes a recommendation system that suggests software processes for video game projects based on similarities to past projects. The system analyzes over 100 postmortems from previous games to build a database of development processes and project contexts. It uses principal component analysis to identify similar past projects and recommends a process by combining elements from similar projects' processes. The system was evaluated both quantitatively based on correctness and coverage metrics and qualitatively through surveys and a case study with a developer team.
Will IoT trigger the next software crisis? (Ptidej Team)
This document discusses how the rise of the Internet of Things (IoT) could trigger a new software crisis due to issues like fragmentation, complexity, and lack of standards. It provides a brief history of software engineering challenges over the past 50 years such as cost overruns, safety issues, and prioritizing productivity over quality. The document then examines how these same problems are emerging in the IoT context today. It argues that IoT software engineering practices need to address issues like device software, cloud/app development, and privacy in order to avoid a major crisis.
This document discusses theories related to software design patterns. It notes that while design patterns are commonly used, there is a need for more research on how they impact software quality. The document proposes several areas for developing theories, including systematically categorizing existing patterns based on underlying principles, combining principles to identify new patterns, and developing theories of patterns from developer behavior and for building software systems. Formalizing patterns and identifying their relationships could help teaching and understanding of patterns.
Laleh M. Eshkevari defended her Ph.D dissertation on developing techniques for the automatic detection and classification of identifier renamings in software projects. Her dissertation outlined a taxonomy of renamings, described approaches for renaming detection based on line mapping, entity mapping and data flow analysis, and discussed methods for classifying renamings based on their form and semantic changes. Evaluation of the approaches on several open source projects showed high precision and recall for renaming detection and identified trends in how renamings are used in practice.
1) The document analyzes the co-occurrence of code smells like anti-patterns and clones in software systems and their impact on fault-proneness.
2) It finds that over 50% of classes with anti-patterns also have clones, and 59-78% of classes with clones also participate in anti-patterns.
3) Classes with both anti-patterns and clones are significantly more fault-prone than other classes, with the risk of faults being at least 7 times higher in one system studied.
Trustrace is an approach that uses software repository links like SVN commits to improve the trust in automatically recovered traceability links between requirements and code. It calculates an initial trust value for links based on IR techniques like VSM, and then reweights the links based on additional information from the software repository. An evaluation on two case studies found Trustrace improved precision over VSM alone and showed no significant difference in recall, supporting the hypothesis that Trustrace can improve link recovery accuracy over IR-only approaches.
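To illustrate the IR building block that Trustrace starts from, the sketch below computes a VSM-style cosine similarity between a requirement and code artifacts using raw term-frequency vectors (real implementations typically add TF-IDF weighting, identifier splitting, and stemming); the texts and file names here are invented.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words term-frequency vectors."""
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

requirement = "the system shall encrypt user passwords before storage"
artifacts = {
    "PasswordEncryptor.java": "encrypt user passwords hash store passwords",
    "ReportPrinter.java": "print format report layout page",
}
# Candidate traceability links, ranked by textual similarity.
scores = {name: cosine_similarity(requirement, text)
          for name, text in artifacts.items()}
```

Trustrace's contribution is what happens next: these IR-based scores are treated as initial trust values and reweighted using evidence mined from the software repository (e.g. co-changed files in SVN commits).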
The document presents a taxonomy called ProMeTA for classifying program metamodels used in program reverse engineering. ProMeTA defines characteristics such as target language, abstraction level, meta-language, and more to classify popular metamodels like AST, KDM, FAMIX. The taxonomy aims to provide a comprehensive guide for researchers and practitioners to select, design, and communicate metamodels. The paper also analyzes existing metamodels according to the ProMeTA taxonomy and identifies gaps to guide future metamodel development.
This document describes a controlled, multiple case study of software evolution and defects from industrial projects. It details the data sources used, including source code repositories, issue tracking databases, and interviews. Metrics such as code smells, size, effort, and defects were collected. Programming skills of developers were also measured. Code smell detection tools and custom scripts to analyze code changes were used to extract metrics on a variety of code issues and evolution over time. The data is available online for further analysis.
The document describes a study on detecting linguistic (anti)patterns in RESTful APIs. It presents an approach called DOLAR (Detection Of Linguistic Antipatterns in REST) that analyzes REST API URIs and detects antipatterns using heuristics-based algorithms. Experiments were conducted on 309 methods from 15 public REST APIs to test DOLAR's accuracy, the extensibility of the underlying SOFA framework, and the performance of detection algorithms. The results showed that 42% of methods exhibited contextualized resource names (a pattern) while 14% had contextless resource names (an antipattern), with detection taking under a second on average.
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxSitimaJohn
Ocean Lotus cyber threat actors represent a sophisticated, persistent, and politically motivated group that poses a significant risk to organizations and individuals in the Southeast Asian region. Their continuous evolution and adaptability underscore the need for robust cybersecurity measures and international cooperation to identify and mitigate the threats posed by such advanced persistent threat groups.
Infrastructure Challenges in Scaling RAG with Custom AI modelsZilliz
Building Retrieval-Augmented Generation (RAG) systems with open-source and custom AI models is a complex task. This talk explores the challenges in productionizing RAG systems, including retrieval performance, response synthesis, and evaluation. We’ll discuss how to leverage open-source models like text embeddings, language models, and custom fine-tuned models to enhance RAG performance. Additionally, we’ll cover how BentoML can help orchestrate and scale these AI components efficiently, ensuring seamless deployment and management of RAG systems in the cloud.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
HCL Notes and Domino License Cost Reduction in the World of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able lower your cost through an optimized configuration and keep it low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc
- Practical examples and best practices to implement right away
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices want to take full advantage of the features available on those devices, but many features provide convenience and capability at the expense of security. This best-practices guide outlines steps users can take to better protect their personal devices and information.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
What do a Lego brick and the XZ backdoor have in common? (Speck&Tech)
ABSTRACT: At first glance, a Lego brick and the XZ backdoor might seem to have in common only that both are building blocks, or dependencies, of creative and software projects. In reality, a Lego brick and the XZ backdoor case have much more in common than that.
Join the presentation to dive into a story of interoperability, standards, and open formats, and then to discuss the important role that contributors play in a sustainable open-source community.
BIO: An advocate of free software and of standard, open formats. She has been an active member of the Fedora and openSUSE projects and co-founded the LibreItalia association, where she was involved in several LibreOffice-related events, migrations, and training activities. She previously worked on LibreOffice migrations and training courses for several public administrations and private organizations. Since January 2020 she has worked at SUSE as a Software Release Engineer for Uyuni and SUSE Manager; when she is not following her passion for computers and for Geeko, she cultivates her curiosity about astronomy (the origin of her nickname, deneb_alpha).
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Building Production-Ready Search Pipelines with Spark and Milvus (Zilliz)
Spark is a widely used ETL tool for processing, indexing, and ingesting data into a serving stack for search. Milvus is a production-ready, open-source vector database. In this talk we show how to use Spark to process unstructured data, extract vector representations, and push the vectors to the Milvus vector database for search serving.
Removing Uninteresting Bytes in Software Fuzzing (Aftab Hussain)
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux tools: Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security-analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean, optimized seeds can lead to faster, more comprehensive fuzzing campaigns, and DIAR helps you find such seeds.
- These are the slides of the talk given at the IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), 2022.
UiPath Test Automation using UiPath Test Suite series, part 6 (DianaGray10)
Welcome to the UiPath Test Automation using UiPath Test Suite series, part 6. In this session, we will cover test automation with generative AI and OpenAI.
This webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI test-automation capabilities with OpenAI's advanced natural-language processing.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI?
Test automation with generative AI and OpenAI
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
1. SVMDetect: Support Vector Machines for Anti-pattern Detection

Abdou Maiga, Nasir Ali, Neelesh Bhattacharya, Aminata Sabané, Yann-Gaël Guéhéneuc, Giuliano Antoniol, Esma Aïmeur

DGIGL and DIRO, École Polytechnique and Université de Montréal, Québec, Canada
E-mails: {abdou.maiga, nasir.ali, n.bhattacharya, a.sabane, giuliano.antoniol}@polymtl.ca, {guehene, aimeur}@iro.umontreal.ca

ASE 2012
August 15, 2013
Pattern Trace Identification, Detection, and Enhancement in Java
SOftware Cost-effective Change and Evolution Research Lab
2. Outline

Introduction
Related Work
Our Approach: SVMDetect
Study Results
Discussions
Conclusion and Future Work
Questions
References
3. Introduction: Motivation

Anti-patterns are "poor" solutions to recurring design and implementation problems.
They impact program comprehension, software evolution, and maintenance activities [8].
It is important to detect them early in the software development process to reduce maintenance costs.
4. Introduction: Limitations

Current anti-pattern detection approaches have several limitations:
- they require extensive knowledge of anti-patterns;
- they have limited precision and recall;
- they cannot be applied on subsets of systems.

We propose to apply SVM on subsets, because SVM considers system classes one at a time, not collectively as previous rule-based approaches do.
To the best of our knowledge, researchers have not yet studied the potential benefits of using SVM to detect anti-patterns.
5. Contributions

- SVMDetect, an approach to detect anti-patterns using SVM.
- A comparison, in terms of precision and recall, of SVMDetect against DETEX [13], the best state-of-the-art approach, on 3 programs and 4 anti-patterns.
- The accuracy of SVMDetect is greater than that of DETEX on subsets.
- On whole systems, SVMDetect finds more anti-pattern occurrences than DETEX.

We thus conclude that an SVM-based approach can overcome the limitations of previous approaches.
6. Related Work: Smell/Anti-pattern Detection

Many researchers have studied anti-pattern detection:
- Alikacem et al. [1] used a meta-model representing the source code, with fuzzy thresholds.
- Langelier et al. [10] used a visual approach.
- Marinescu [11] used quantifiable expressions of rules.
- Sahraoui et al. [7] used search-based techniques.
- Moha et al. [13] proposed an approach based on a set of rules describing each anti-pattern.

The work carried out so far suffers from some limitations:
- limited precision and recall (if reported at all);
- no adoption by practitioners yet;
- not applicable on subsets of systems;
- requires extensive knowledge of anti-patterns.
7. Related Work: SVM Applications

SVM has been used in several domains for various applications, e.g., bioinformatics [2] and object recognition [4]. SVM is a recent alternative for classification problems.
- Cao et al. [3] used SVM for term classification.
- Setia et al. [12] used SVM for relevance feedback in image-retrieval systems.
- Kim et al. [9] proposed a change-classification approach based on SVM for predicting latent software bugs.

To the best of our knowledge, no previous approach used SVM for anti-pattern detection.
8. Our Approach: SVMDetect

SVMDetect is based on Support Vector Machines (SVM) with a polynomial kernel to detect occurrences of anti-patterns.
We use SVMDetect to detect the well-known anti-patterns Blob, Functional Decomposition, Spaghetti Code, and Swiss Army Knife. The detection process is identical for each anti-pattern.
We illustrate the detection process with the Blob anti-pattern for the sake of clarity. We define:
- TDS = {Ci, i = 1, ..., p}, a set of classes Ci derived from an object-oriented system, which constitutes the training dataset;
- ∀i, Ci is labelled as Blob (B) or not (N);
- DDS, the set of classes of a system in which we want to detect the Blob classes.
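For reference, the polynomial kernel mentioned above has the standard form below; the slides do not specify the degree or the coefficients, so they should be read as hyper-parameters of the classifier:

```latex
K(C_i, C_j) = \left( \gamma \, x_i^\top x_j + r \right)^d
```

where x_i and x_j are the vectors of object-oriented metrics of classes C_i and C_j, d is the polynomial degree, and γ and r are scaling and offset coefficients.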
9. SVMDetect: Steps

To detect the Blob classes in the set DDS, we apply SVMDetect through the following steps:
- Step 1 (object-oriented metric specification): SVMDetect takes as input the training dataset TDS, with object-oriented metrics computed for each class.
- Step 2 (training of the SVM classifier): train SVMDetect with the TDS defined in Step 1.
- Step 3 (construction of the dataset DDS and detection of anti-pattern occurrences): build the detection dataset DDS and apply the SVMDetect classifier trained in Step 2 to DDS.

We implement SVMDetect in Weka, using its SVM classifier.
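The three steps above can be sketched in code. This is an illustrative sketch only: it uses scikit-learn's polynomial-kernel SVM rather than the Weka classifier the authors used, and the metric names and values are invented for illustration.

```python
# Sketch of the SVMDetect pipeline for the Blob anti-pattern.
# NOTE: metrics and labels here are made up; the paper trains on
# object-oriented metrics of real classes, using Weka's SVM.
from sklearn.svm import SVC

# Step 1: training dataset TDS -- one metric vector per class
# (e.g., [number of methods, number of attributes, coupling]),
# each labelled Blob ("B") or not ("N").
TDS_X = [
    [120, 45, 30],   # very large, highly coupled class
    [110, 40, 25],
    [8, 3, 2],       # small, cohesive class
    [12, 5, 4],
    [6, 2, 1],
]
TDS_y = ["B", "B", "N", "N", "N"]

# Step 2: train an SVM classifier with a polynomial kernel.
classifier = SVC(kernel="poly", degree=3)
classifier.fit(TDS_X, TDS_y)

# Step 3: build the detection dataset DDS (metric vectors of the
# classes to analyse) and classify each class independently.
DDS_X = [[130, 50, 35], [10, 4, 3]]
labels = list(classifier.predict(DDS_X))
print(labels)
```

Because the classifier scores each class's metric vector independently, it can be applied to any subset of a system's classes, which is the property the slides contrast with rule-based approaches.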
10. Empirical Study

- Goal: validate that SVMDetect can overcome the limitations of previous approaches.
- Quality focus: the accuracy of SVMDetect, in terms of precision and recall.
- Perspective: researchers and practitioners interested in verifying whether SVMDetect can be effective in detecting various kinds of anti-patterns and in overcoming the previous limitations.
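Precision and recall, the accuracy measures used throughout the study, are computed per anti-pattern in the usual way (TP, FP, and FN denote true positives, false positives, and false negatives with respect to the manually validated occurrences):

```latex
\text{precision} = \frac{TP}{TP + FP}, \qquad
\text{recall} = \frac{TP}{TP + FN}
```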
11. Empirical Study: Research Questions

RQ1: How does the accuracy of SVMDetect compare with that of DETEX, in terms of precision and recall? We decompose RQ1 as follows:
- RQ1.1: How does the accuracy of SVMDetect compare with that of DETEX, in terms of precision and recall, when both are applied on the same subset of a system?
- RQ1.2: How many occurrences of Blob can SVMDetect detect compared with DETEX on the same entire system?
12. Empirical Study: Objects

Table: Description of the objects of the study

Name     Version   # Lines of Code  # Classes  # Interfaces  Description
ArgoUML  0.19.8    113,017          1,230      67            A design tool for UML
Azureus  2.3.0.6   191,963          1,449      546           A peer-to-peer client implementing the BitTorrent protocol
Xerces   2.7.0     71,217           513        162           A syntactic analyser
13. Empirical Study: Subjects

The subjects of our study are the following four anti-patterns:
- Blob
- Functional Decomposition (FD)
- Spaghetti Code (SC)
- Swiss Army Knife (SAK)

We chose these four anti-patterns because they are well known and commonly studied in previous work, which enables comparison.
14. Study Results: Subsets of Systems (RQ1.1)

Table: Precision of SVMDetect vs. DETEX on subsets (%)

                Blob               FD                 SC                 SAK
         DETEX  SVMDetect   DETEX  SVMDetect   DETEX  SVMDetect   DETEX  SVMDetect
ArgoUML   0.00    97.09      0.00    70.68      0.00    85.00     10.00    75.46
Azureus   0.00    97.32      0.00    72.01      0.00    88.00     10.00    84.54
Xerces    0.00    95.51      0.00    66.93      0.00    86.00      0.00    80.76
15. Study Results: Subsets of Systems (RQ1.1)

Table: Recall of SVMDetect vs. DETEX on subsets (%)

                Blob               FD                 SC                 SAK
         DETEX  SVMDetect   DETEX  SVMDetect   DETEX  SVMDetect   DETEX  SVMDetect
ArgoUML   0.00    84.09      0.00    57.50      0.00    71.00      0.00    77.14
Azureus   0.00    91.33      0.00    84.28      0.00    89.00      0.00    85.71
Xerces    0.00    95.29      0.00    70.00      0.00    86.00      0.00    75.50
16. Study Results: Complete Systems (RQ1.2)

Table: Total recovered occurrences of Blob by DETEX and SVMDetect

            ArgoUML  Azureus  Xerces  Total
DETEX            25       38      39    102
SVMDetect        40       48      55    143

We answer RQ1 ("How does the accuracy of SVMDetect compare with that of DETEX, in terms of precision and recall?") as follows:
- on subsets of systems, SVMDetect dramatically outperforms DETEX;
- on entire systems, SVMDetect detects more occurrences of Blob than DETEX.
17. Discussions: Threats to Validity

- Construct validity (measurement errors, subjectivity): occurrences of anti-patterns were manually validated.
- Internal validity (dependence of the results on the chosen anti-patterns and systems): we used four well-known, representative anti-patterns studied in previous work, and 3 open-source systems of different sizes, also used in previous work.
- Reliability validity (possibility of replication): we used 3 open-source systems available on-line.
- External validity (generalisability): 3 systems of different sizes and from different domains, and a representative subset of anti-patterns.
18. Conclusion

- We introduced SVMDetect, a novel approach based on SVM to detect anti-patterns.
- We evaluated SVMDetect on 3 systems (ArgoUML v0.19.8, Azureus v2.3.0.6, and Xerces v2.7.0) and 4 anti-patterns (Blob, Functional Decomposition, Spaghetti Code, and Swiss Army Knife).
- The accuracy of SVMDetect is greater than that of DETEX on subsets of classes.
- On whole systems, SVMDetect is able to find more anti-pattern occurrences than DETEX.
- An SVM-based approach can overcome the limitations of previous approaches and could be more readily adopted by practitioners.
19. Future Work

Future work includes:
- using SVMDetect in real-world environments;
- reproducing the study with other systems and anti-patterns to increase our confidence in the generalisability of our conclusions;
- taking user feedback into account;
- evaluating the impact of the quality of the training dataset and of the feedback set on SVMDetect's results.
21. References

[1] E. H. Alikacem and H. A. Sahraoui. Détection d'anomalies utilisant un langage de règle de qualité. In LMO. Hermes Science Publications, 2006.
[2] J. Bedo, C. Sanderson, and A. Kowalczyk. An efficient alternative to SVM based recursive feature elimination with applications in natural language processing and bioinformatics. In A. Sattar and B.-h. Kang, editors, AI 2006: Advances in Artificial Intelligence, volume 4304 of Lecture Notes in Computer Science, pages 170–180. Springer Berlin Heidelberg, 2006.
[3] G. Cao, J.-Y. Nie, J. Gao, and S. Robertson. Selecting good expansion terms for pseudo-relevance feedback. In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2008), Singapore, July 20–24, 2008, pages 243–250. ACM, 2008.
[4] M. J. Choi, A. Torralba, and A. S. Willsky. A tree-based context model for object recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34:240–252, Feb. 2012.
[5] C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20:273–297, 1995.
[6] Y.-G. Guéhéneuc, H. Sahraoui, and Farouk Zaidi. Fingerprinting design patterns. In E. Stroulia and A. de Lucia, editors, Proceedings of the 11th Working Conference on Reverse Engineering (WCRE). IEEE Computer Society Press, November 2004.
[7] M. Kessentini, S. Vaucher, and H. Sahraoui. Deviance from perfection is a better criterion than closeness to evil when identifying risky code. In Proceedings of the IEEE/ACM International Conference on Automated Software Engineering (ASE '10), pages 113–122, New York, NY, USA, 2010. ACM.
[8] F. Khomh, M. D. Penta, and Y.-G. Guéhéneuc. An exploratory study of the impact of antipatterns on class change- and fault-proneness. Journal of Empirical Software Engineering (EMSE), 2011.
[9] S. Kim, E. J. W. Jr., and Y. Zhang. Classifying software changes: Clean or buggy? IEEE Transactions on Software Engineering, 34(2):181–196, 2008.
[10] G. Langelier, H. Sahraoui, and P. Poulin. Visualization-based analysis of quality for large-scale software systems. In Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering (ASE '05), pages 214–223, New York, NY, USA, 2005. ACM.
[11] R. Marinescu. Detection strategies: Metrics-based rules for detecting design flaws. In Proceedings of the IEEE 20th International Conference on Software Maintenance. IEEE Computer Society Press, 2004.
[12] L. Setia, J. Ick, and H. Burkhardt. SVM-based relevance feedback in image retrieval using invariant feature histograms.
[13] N. Moha, Y.-G. Guéhéneuc, L. Duchien, and A.-F. Le Meur. DECOR: A method for the specification and detection of code and design smells. Transactions on Software Engineering (TSE), 2009.