Genetic Programming in Automated Test Code Generation


This document discusses using genetic programming to generate automated test code for multi-threaded microprocessors. It presents an experiment applying genetic programming to test code generation for the XMOS multi-threaded microprocessor. The results showed test code generated by the genetic programming approach significantly outperformed both human-generated and randomly generated test code, improving line coverage to 94% while reducing simulation cycles by up to 50%.


Genetic Programming in automated test code
generation for a multi-threaded microprocessor.
Neow Way Yuh
MSc. Machine Learning and Data Mining.
Supervised by Dr. Kerstin Eder and
Mr. Peter Hedinger

Introduction
• The complexity of modern microprocessors makes
verification difficult.
• Recent research on Coverage-Directed Test Generation
(CDG) uses machine learning approaches.
• Genetic Programming (GP) is one of the machine
learning methods used in CDG.
• Initial experiments with GP in multi-threaded
microprocessor verification show potential.
• This presentation starts with an introduction to GP, followed by a
case study on the XMOS multi-threaded microprocessor.

Fundamentals of Genetic Programming (GP)
• GP is based on the process of natural evolution.
• Darwin's theory: natural selection promotes favourable
heritable traits in successive generations.
• In GP, each individual in a population is given a quantitative
measurement (its fitness) that reflects the quality of the individual
relative to the environment.

GP in test code generation
• A selection algorithm samples two individuals (the parents) from the population
based on their fitness values (e.g. individuals with higher fitness are sampled more often).
• The parents are combined using genetic operators to produce new individuals.
• An external simulator is used to evaluate the fitness of the new individuals.
• Implementing GP in Coverage-Directed Test Generation requires less expert
knowledge than other methods (e.g. Bayesian networks).
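The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the MicroGP implementation: the operation names in OPS, the flat list encoding, the single crossover and mutation operator, and toy_fitness (which stands in for the external simulator) are all assumptions made for the example.

```python
import random

# Hypothetical instruction names; the real instruction library is not shown in the slides.
OPS = ["OUT", "IN", "OUTCT", "INCT", "NOP"]

def roulette_select(population, fitnesses, rng):
    """Sample one individual with probability proportional to its fitness."""
    return rng.choices(population, weights=fitnesses, k=1)[0]

def one_point_crossover(a, b, rng):
    """Combine two parents: head of a, tail of b (lengths are assumed equal)."""
    cut = rng.randint(1, min(len(a), len(b)) - 1)
    return a[:cut] + b[cut:]

def mutate(individual, rng, rate=0.1):
    """Replace each operation with a random one with probability `rate`."""
    return [rng.choice(OPS) if rng.random() < rate else op for op in individual]

def toy_fitness(individual):
    """Stand-in for the external simulator: counts distinct operations (always >= 1)."""
    return len(set(individual))

def evolve(pop_size=20, length=8, generations=30, seed=0):
    rng = random.Random(seed)
    population = [[rng.choice(OPS) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        fitnesses = [toy_fitness(ind) for ind in population]
        children = []
        while len(children) < pop_size:
            p1 = roulette_select(population, fitnesses, rng)
            p2 = roulette_select(population, fitnesses, rng)
            children.append(mutate(one_point_crossover(p1, p2, rng), rng))
        population = children
    return max(population, key=toy_fitness)

if __name__ == "__main__":
    print(evolve())
```

In a real CDG flow, toy_fitness would be replaced by a call that runs the generated test on the simulator and reads back the coverage results.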

Case study: XMOS multi-threaded microprocessor
• Software Defined Silicon (SDS).
• Up to 8 threads run simultaneously
on a single core.
• The experiment focuses on channel
communication.
• Exercising the corner cases of the channel communication
design requires a number of threads.
However, the timing window to trigger corner cases,
such as race conditions, is very narrow.

Implemented CDG loop
• The experiment extends the existing MicroGP* program.
• Expert knowledge is encoded in an instruction library.
• An individual is a graph; the nodes in a graph are constructed from
the basic building blocks in the library.
• The extended MicroGP includes 2 crossover and 5 mutation
operators.
• A double-loop system is used to enhance the performance of
MicroGP.
* MicroGP research group
http://www.cad.polito.it/research/microgp.html

Encoding multi-threaded test code in GP
• A test program can be represented by a graph; a node in a graph maps to a
channel operation and a branch represents the content of a new thread.
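One way to picture this encoding is the sketch below. It is an illustrative assumption, not MicroGP's internal data structure: the Node class, the operation names, and the threads helper are hypothetical. Each node holds one channel operation, and a node with a branch starts a new thread whose body is that branch.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    op: str                                  # hypothetical channel operation name
    branch: Optional[List["Node"]] = None    # body of a new thread started at this node

def threads(main: List[Node]) -> List[List[str]]:
    """Flatten the graph into one operation list per thread (main thread first)."""
    result: List[List[str]] = []

    def walk(sequence: List[Node]) -> None:
        ops: List[str] = []
        result.append(ops)
        for node in sequence:
            ops.append(node.op)
            if node.branch is not None:
                walk(node.branch)            # the branch becomes its own thread

    walk(main)
    return result

if __name__ == "__main__":
    program = [
        Node("CHAN_ALLOC"),
        Node("OUT", branch=[Node("IN"), Node("CHAN_FREE")]),
        Node("OUTCT"),
    ]
    for index, ops in enumerate(threads(program)):
        print(f"thread {index}: {ops}")
```

With this representation, moving operations from the main sequence into branches increases the number of threads in the generated test.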

Feedback measurements
• Improves basic coverage measurements such as line
coverage and branch coverage.
• Minimises simulation cycles.
• Minimises the main graph to encourage more threads in
test code.
• Balances the distribution of operations among the
threads.
• Increases the number of channel I/O operations.
• All feedback measurements are optimised together, with
different priorities.
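Optimising several measurements with different priorities can be sketched as a lexicographic comparison: the highest-priority measurement decides, and lower-priority ones only break ties. The priority order, the field names, and the imbalance measure below are assumptions for illustration; the slides do not state how the extended MicroGP orders or combines its feedback.

```python
from typing import List, NamedTuple, Tuple

class Feedback(NamedTuple):
    line_coverage: float      # fraction of lines covered, higher is better
    cycles: int               # simulation cycles, lower is better
    main_graph_size: int      # nodes in the main graph, lower is better
    thread_ops: List[int]     # operations per thread, more even is better
    channel_io_ops: int       # channel I/O operations, higher is better

def fitness_key(fb: Feedback) -> Tuple[float, int, int, int, int]:
    """Build a tuple that Python compares left to right (assumed priority order)."""
    imbalance = max(fb.thread_ops) - min(fb.thread_ops)
    # Negate the measurements to be minimised so that "larger is better" throughout.
    return (fb.line_coverage, -fb.cycles, -fb.main_graph_size, -imbalance, fb.channel_io_ops)

if __name__ == "__main__":
    a = Feedback(0.94, 12000, 4, [5, 5, 4], 9)
    b = Feedback(0.94, 20000, 3, [5, 5, 5], 12)
    best = max([a, b], key=fitness_key)
    print(best)  # a wins: equal coverage, fewer cycles
```

A weighted sum is a common alternative, but it requires choosing weights, whereas a lexicographic order only requires ranking the measurements.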

Experiment Results
• Test code generated by the MicroGP method is significantly better
than both human-written and randomly generated test sequences.
• Line coverage improved to 94%, with up to a 50% reduction in
simulation cycles.
[Figures: Line Coverage, Percentage (%) vs. Random Seed; Simulation Cycle, Unit Cycle vs. Random Seed. Series: Human, Random, MicroGP.]
