This document summarizes a technique for removing numerical drift from scientific models when comparing program runs in different computing environments. The technique works by automatically inserting code to trace the values computed at each step. One program run is used to generate a trace file recording these values. Subsequent runs in other environments then compare computed values to those in the trace file, overwriting any that differ only due to numerical drift. This allows identification of errors due to coding or compilers by exposing differences beyond numerical drift. The technique is applied to the Weather Research and Forecasting (WRF) model to debug one of the most important climate modeling codes.
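A minimal sketch of this record-and-compare cycle, assuming an in-memory trace and hypothetical helper names (record, check, TOLERANCE); none of this is taken from the paper's actual implementation:

    import math

    # Illustrative only: a reference run calls record() for every traced value;
    # runs in other environments call check(), which overwrites differences that
    # fall within the drift tolerance and reports anything larger.
    TOLERANCE = 1.0e-6                      # assumed relative tolerance for drift
    trace = {}                              # (step, variable name) -> reference value

    def record(step, name, value):
        """Reference run: store each computed value in the trace."""
        trace[(step, name)] = value

    def check(step, name, value):
        """Comparison run: keep the reference value when only drift is seen."""
        ref = trace[(step, name)]
        if math.isclose(value, ref, rel_tol=TOLERANCE):
            return ref                      # numerical drift only: overwrite it
        print(f"step {step}: {name} differs beyond drift ({value} vs {ref})")
        return value                        # genuine coding or compiler difference

In the technique described below, the tracing calls are inserted automatically and the reference values are written to a trace file rather than held in memory; the essential step is the decision to overwrite values that differ only by numerical drift.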
International Journal of Software Engineering & Applications (IJSEA), Vol.4, No.2, March 2013. DOI: 10.5121/ijsea.2013.4204
THE REMOVAL OF NUMERICAL DRIFT FROM SCIENTIFIC MODELS
John Collins (1,2), Brian Farrimond (1,2), David Flower (1), Mark Anderson (2) and David Gill (3)

(1) SimCon Ltd, Dartmouth, United Kingdom
    john.collins@simconglobal.com
(2) Department of Computing, Edge Hill University, Ormskirk, United Kingdom
    mark.anderson@edgehill.ac.uk
(3) National Center for Atmospheric Research, Boulder, Colorado, USA
    gill@ucar.gov
ABSTRACT
Computer programs often behave differently under different compilers or in different computing environments. Relative debugging is a collection of techniques by which these differences are analysed. Differences may arise because of different interpretations of errors in the code, because of bugs in the compilers or because of numerical drift, and all of these were observed in the present study. Numerical drift arises when small and acceptable differences in values computed by different systems are integrated, so that the results drift apart. This is well understood and need not degrade the validity of the program results. Coding errors and compiler bugs may degrade the results and should be removed. This paper describes a technique for the comparison of two program runs which removes numerical drift and therefore exposes coding and compiler errors. The procedure is highly automated and requires very little intervention by the user. The technique is applied to the Weather Research and Forecasting model, the most widely used weather and climate modelling code.
KEYWORDS
Relative debugging; numerical drift; coding errors; compiler errors
1. INTRODUCTION
The technique of debugging by identifying differences in execution between versions of a
program was introduced by Abramson and Sosic in 1994 and given the name relative debugging
[1][2][3]. A relative debugging system known as Guard was developed that enabled users to run debuggers on several versions of the program simultaneously [4]. Commands are provided that enable users to identify data structures within the program and to set breakpoints at which the contents of those data structures are compared, with an optional tolerance for insignificant differences. The user can analyse the differences using text, bitmaps or more powerful
visualisation tools. The system is able to compare versions written in different languages - the
user identifies the data structures to be compared through variable name and location within the
source code files.
The importance of this technique lies not with the development of new software, but with the on-going evolution and maintenance of existing software systems [2]. Whilst debugging plays an
important role in the implementation of new systems, the ability to make comparisons between
versions of software, and specifically the data being processed within the code, provides a
DOI : 10.5121/ijsea.2013.4204
powerful mechanism for ensuring that the software is delivering consistent and correct results
throughout its existence. This is achieved through the comparison of values between a reference
program and a development program (Figure 1).
Figure 1: Relative Debugging [2]
Abramson et al. [2] describe the successful application of a relative debugging technique to a program running in a sequential and a parallel environment. Their technique requires the user to specify particular data structures to be compared during the different program runs. A difficulty of
the technique is that numerical drift will eventually cause any assertion that the data structures are
similar to fail. Also, there may be appreciable labour in choosing and specifying the appropriate
data structures. Both issues are addressed in the technique described within this paper.
1.1. Numerical Drift
Numerical drift between runs of the same program in different computing environments occurs
because of the integration of small differences in the results of floating point computations.
These differences are to be expected. There are, at least, three reasons why they should occur:
i. different systems may use different floating point representations for numbers with different
precision. The results of floating point computations are therefore often slightly different.
Differences from this cause are now relatively uncommon because most computers use the same
format of IEEE floating point numbers;
ii. different compilers may make different choices in the order of evaluation of terms in floating point expressions. The order of evaluation is usually strictly defined by, for example, the Fortran standard [5]. However, optimising compilers may sometimes violate the standard in the interests of efficiency (a small illustration is given after this list);
iii. different compilers may make different choices of variables or quantities to be held in the
processor registers. The processor registers of a typical PC chip are 10 bytes long. Values
written to memory are 8 or 4 bytes long. The additional precision of values held in registers may
be integrated to cause drift.
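As a simple, hypothetical illustration of point (ii), the grouping of a single precision sum determines whether a small term survives. The fragment below is illustrative only and is not taken from WRF; 1.0E8 and -1.0E8 are exactly representable in single precision, so the difference between the two groupings is entirely due to the order of evaluation:
PROGRAM drift_demo
! Hypothetical fragment illustrating how the grouping of a single
! precision sum changes the result. In the second grouping the small
! term c is lost when it is added to -1.0E8.
IMPLICIT NONE
REAL :: a, b, c
a = 1.0E8
b = -1.0E8
c = 1.0E-4
PRINT *, (a + b) + c   ! approximately 1.0E-4
PRINT *, a + (b + c)   ! 0.0
END PROGRAM drift_demo
Differences of this kind, integrated over many time steps, are the source of the drift described above.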
The objective of this study is to remove numerical drift. Any remaining differences between
environments are likely to be the result of coding or compiler errors.
2. RELATED WORK
The Guard system was developed as a means of implementing automated relative debugging [2].
The project was driven by the fluid nature of software development for scientific modelling in
which a community contributes towards the on-going development of the model. The community
would generally consist of academics who have a different programming philosophy from those
who program on engineering models [6], and thereby require a mechanism to ensure that the
results generated by incremental changes to the model implementation do not contain significant
differences.
The implemented system was successful in the hands of those users familiar with the code and
likely causes of error. However, it suffered from a number of drawbacks which the system
described in this paper aims to overcome. It has been argued that the automation in the Guard system is somewhat limited [7], in that Guard automates only the comparisons themselves. Further
extensions have been proposed which extend the functionality to backtrack and re-execute code
[8][9].
In looking for possible bugs, the user declares assertions to specify which data structures to monitor and where in the code they are to be compared [2]. The point in the code where a comparison reveals significant differences is unlikely to be the point where the differences are being generated. If differences are indeed detected then further work is
required to pin down the cause of the differences, for example by introducing further monitoring
points. The user's experience and intuition are required to make the detection of the location of
the generation of differences efficient. Visualisation tools are used to help the user interpret and
identify the cause of differences in these circumstances.
The differences may be spurious; for example, the result of numerical drift between the
executions of the two versions. Valid discounting of the effects of numerical drift depends on the
experience of the user. In contrast the system described in the current paper automatically
removes the effect of numerical drift.
Later improvements to the Guard system (given the name DUCT) have focussed on attempting to
automate the creation of assertions aimed at identifying the cause of differences [10][11]. The
variables that are used to generate the values in the variable under investigation are recognised as
the source of the differences. They are automatically identified and chains of preceding variable
assignments that contribute to the values they hold when being used to modify the target variable
are created. From these chains, further assertions are generated. Although this technique moves
the user to the source of the generation of differences it still depends on the user's intuition
regarding potential data structures where differences might occur. A technique that systematically
examines every variable's assignment is capable of detecting differences caused by errors not
expected by the user. This would be a full automation of the procedure rather than the hybrid
presented by DUCT. The system described in this paper provides such a technique by identifying
automatically the locations in the code where all significant differences are generated.
The Guard system is able to compare programs written in different languages because the user
explicitly identifies the data structures in each to be compared and where in their source code files
to compare them. The system described in this paper works only for programs written in Fortran since it depends upon the semantic analysis and automatic code modifications provided by the WinFPT toolkit.
An alternative relative debugging project, HPCVL [7], has considered the limitations in Guard
relating to the identification of equivalent variables in the reference and development programs.
As identified, the variable names may differ between the two versions of the code which would
make the comparison difficult, if not impossible. In order to overcome this, a template which
adopted a labelling mechanism was implemented. This required the source code to be manually amended to include calls to library functions which could be included or removed by pre-processing the code. Whilst this provides a powerful means by which two programs can be
compared, even when they are written in different languages, it has the limitation that the source code must be manually amended, and there is an acknowledged issue that the processing may require a large amount of disk space [7].
Guard, DUCT and HPCVL represent important developments in relative debugging. However, each has limitations in terms of implementation or functionality. HPCVL requires users to instrument the source code manually, and DUCT is not a fully automated system. The
relative debugging techniques previously described by Abramson, Sosic and Watson [3] and in
the related studies are labour intensive. It is not practicable to apply such techniques to large
scientific modelling software, such as the WRF model considered in this paper. These earlier
techniques require deep knowledge of the target program, intuition and some luck to succeed in
identifying areas where problems are likely to occur. There is a danger that users will only find the errors they expect to find and will miss the unexpected, such as compiler errors. The technique described here is able to look for differences in the entire program,
even programs as large as WRF, with no need for manual intervention. It is also able to monitor
execution coverage automatically so that test runs can be fine-tuned to maximise coverage of the
code as it is debugged.
The functionality of Guard is centred on comparison of versions of programs. In any non-trivial
program which integrates floating point values, numerical drift will cause the results from
different systems to diverge. Meaningful comparisons between runs can only be made if this drift
is removed. It is therefore essential that this issue is addressed. The technique described here
removes numerical drift.
3. WEATHER RESEARCH AND FORECASTING MODEL
WRF, the Weather Research and Forecasting code developed at UCAR, is arguably the most important computer program ever written [12]. It is the most widely used modelling code informing international climate change policy, to which over 100 billion US dollars has been committed to date.
WRF is almost completely written in Fortran 95. The code can be built for a range of
architectures, from a single serial processor to very large multi-processor systems and with
varying patterns of shared memory. Parts of the code are modified by the build procedure to adapt to the target architecture. The Fortran code for a serial single-processor system occupies
360,410 lines. There is extensive use of Fortran 90 modules which encapsulate different areas of
the computational process.
The WRF code is divided into three layers [12]. The Mediation layer describes a grid which
represents an area of the Earth's atmosphere. The Model layer describes the physics of the
atmosphere, and deals with issues such as mass flow, radiation, chemistry and effects at the land
and sea surface. The middle layer (the Driver layer) is concerned with distributing data between
the different processors and threads in the target architecture. The data structures in the top two
layers are mostly large hierarchies of Fortran 95 derived types, some with their own overloaded
operators. The data structures in the physics layer are mostly simple arrays of native Fortran
types. The translation takes place in the middle layer. This architecture has implications for the
analysis technique described in this paper.
The version of WRF used in the study was 3.1.
4. WINFPT TOOLKIT
The tool used to insert the trace statements into the source code is WinFPT. WinFPT is written
and maintained by two of the authors. It analyses Fortran codes in the same way as a compiler,
and stores an internal representation of the code, including comments and formatting information.
A full static semantic analysis is carried out, and the tool therefore identifies the data types and other attributes of all variables, and the intents (whether input, output, both or neither) of all sub-program arguments. The trace statements are inserted into the code by modifying the internal
representation. A new version of the code is generated from the internal representation.
Processing of WRF by WinFPT takes about four minutes on a typical PC. The process is fully
automated.
The trace file generation is handled by a small Fortran library linked with the modified code.
Tools were also written to handle the large trace data files.
5. REMOVAL OF NUMERICAL DRIFT
The technique is as follows:
a) statements are added to the code to capture every quantity produced as the result of a
computation. This process is entirely automatic, and is carried out by the WinFPT tools described
above. The statements inserted are subroutine calls. The results, a trace of the program flow and
all intermediate data, are captured to a text file;
b) the code is run in the first computing environment, and the trace file is generated;
It is possible to run the program in the second environment, to generate a second file and to
compare the two traces. This was the original intention of the authors. The results show a very
large number of differences as a result of numerical drift and are almost impossible to analyse.
Instead:
c) the program is run in the second environment, but the subroutines which captured the trace data in the first run are now used to read the original trace file. The values computed are compared with the values read from the first run. There are four possible consequences (a sketch of a comparison routine illustrating these outcomes is given at the end of Section 5.1):
1. the results are identical. No action is taken;
2. the results are similar (the criteria for similarity are passed as parameters to the program). The values computed are overwritten by the values read from the trace file of the first run. This prevents numerical drift;
3. the results are significantly different. The values and the point in the code where the
difference occurred are recorded for investigation. The value computed is overwritten by
the value from the first run so that the analysis may continue;
4. the program follows a different path through the code. The difference is recorded and the
run halts. At present the situation is not recoverable.
5.1 Modifications to the Code
Subroutine calls are inserted automatically into the code to capture or to compare the trace data.
For example, a code fragment from WRF
REAL, PARAMETER :: karman = 0.4
rdt = 1.0 / dtpbl
DO k = 1, kte
sigmaf(k-1) = znw(k)
ENDDO
sigmaf(kte) = 0.0
is modified to:
REAL,PARAMETER :: karman = 0.4
CALL trace_start_sub_program('ACMPBL',6914)
rdt = 1.0/dtpbl
CALL trace_r4_data('RDT',rdt,38749)
DO k = 1,kte
CALL trace_i4_data('K',k,38750)
sigmaf(k-1) = znw(k)
CALL trace_r4_data('sigmaf(k-1)',sigmaf(k-1),38751)
ENDDO
sigmaf(kte) = 0.0
CALL trace_r4_data('sigmaf(kte)',sigmaf(kte),38752)
There is a separate data trace subroutine for each primary data type. The arguments are a string,
which records the variable assigned, including the field and array indices if any, the variable itself
and a unique integer identifier which is used to locate the statement in the code. This identifier is
used to check that the same path is followed through the code on each run.
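The library routines themselves are not listed in this paper. The following is a minimal sketch of how a comparison-mode routine for single-precision reals might implement the four outcomes of Section 5. The module name, the trace file unit, the tolerance variables and the record layout are assumptions made for illustration, not the actual implementation; the tolerance criteria are those described in Section 5.2.
MODULE trace_compare
! Minimal sketch of the comparison mode (hypothetical; the actual
! library source is not given in this paper).
IMPLICIT NONE
INTEGER :: trace_unit = 10        ! unit connected to the first-run trace file
REAL    :: rel_tol    = 0.001     ! 0.1% criterion of Section 5.2
REAL    :: abs_tol    = 1.0E-10   ! absolute criterion of Section 5.2
CONTAINS
SUBROUTINE trace_r4_data(name, value, ident)
CHARACTER(LEN=*), INTENT(IN)    :: name
REAL,             INTENT(INOUT) :: value
INTEGER,          INTENT(IN)    :: ident
INTEGER           :: ref_ident
REAL              :: ref_value
CHARACTER(LEN=32) :: ref_name
! Read the reference record written by the first run (assumed layout).
READ (trace_unit,'(I10,1X,A32,1X,ES24.16)') ref_ident, ref_name, ref_value
IF (ref_ident /= ident) THEN
   WRITE (*,*) '*** Trace sequence error: ', TRIM(name), ident
   STOP                                      ! 4. different path: not recoverable
END IF
IF (value == ref_value) RETURN               ! 1. identical: no action
IF (ABS(value - ref_value) <= abs_tol .OR. &
    ABS(value - ref_value) <= rel_tol * ABS(ref_value)) THEN
   value = ref_value                          ! 2. similar: overwrite to prevent drift
ELSE
   WRITE (*,*) '*** Trace value error: ', TRIM(name), value, ref_value
   value = ref_value                          ! 3. significant: report, then continue
END IF
END SUBROUTINE trace_r4_data
END MODULE trace_compare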
5.2 Criteria for Similarity
Values from the two runs may differ as a result of numerical drift, and no report is made if they
are sufficiently similar. The criteria for similarity are controlled as follows:
i. logical and integer values are required to match exactly;
ii. character values may be required to match, but it was found that there are relatively few character computations in the codes which have been analysed, and most of these are file names which are expected to differ. Provision is therefore made to ignore character variables;
iii. real and complex values are required to differ by less than a percentage difference passed as a
parameter to the modified program. In the study of WRF, this criterion was set to 0.1%.
However, values close to zero are likely to generate large percentage differences by chance. Therefore values were also considered to be similar if they differed by less than an absolute criterion, also passed as a parameter. In the study of WRF this was set to 1.0E-10 (a sketch of the combined test is given below).
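A minimal sketch of the combined test for real values follows, with hypothetical names. The combination of the two criteria is the point: for example, 1.0E-12 and 3.0E-12 differ by 200% in relative terms but by only 2.0E-12 in absolute terms, and are therefore accepted as similar.
LOGICAL FUNCTION reals_similar(computed, reference, rel_tol, abs_tol)
! Sketch of the similarity test of Section 5.2 (hypothetical names).
! In the WRF study rel_tol = 0.001 (0.1%) and abs_tol = 1.0E-10.
IMPLICIT NONE
REAL, INTENT(IN) :: computed, reference, rel_tol, abs_tol
reals_similar = ABS(computed - reference) <= rel_tol * ABS(reference) .OR. &
                ABS(computed - reference) <= abs_tol
END FUNCTION reals_similar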
5.3 Trace Files
The trace is written to a single ASCII file. The records, which record the data assignments,
contain the identifier number, the descriptive string and the value. Values are written in a
consistent format, irrespective of the Fortran kind of the data type in order that runs using
different data kinds may be compared. Records are also written to mark the entry to and return
from each routine.
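The record layout is not specified in the paper. A minimal sketch of writing one kind-independent data record during the capture run, with a hypothetical routine name, unit number and format, might be:
SUBROUTINE trace_write_r4(name, value, ident)
! Hypothetical sketch of writing one data record in the capture run.
! The single-precision value is converted to double precision and
! written with a fixed edit descriptor, so that runs built with
! different real kinds produce directly comparable records.
IMPLICIT NONE
CHARACTER(LEN=*), INTENT(IN) :: name
REAL,             INTENT(IN) :: value
INTEGER,          INTENT(IN) :: ident
INTEGER, PARAMETER :: trace_unit = 10   ! unit connected to the trace file
WRITE (trace_unit,'(I10,1X,A32,1X,ES24.16)') ident, name, DBLE(value)
END SUBROUTINE trace_write_r4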
The trace files are large. A file produced by a very short run of WRF may be of the order of 30
gigabytes. These files are too large to be handled by most text editors, and special tools were
written to extract data from them. The files are also slow to produce. The present implementation
of the technique requires the code to be run serially, on a single processor, and the run speed is of
the order of ten times slower than the serial un-modified code.
6. WRF CASE STUDY
WRF may be run as a single-processor serial program under Linux. Differences are observed between the output data generated by runs under GNU gfortran and Intel ifort. Some differences are due to numerical drift, but the technique also reveals coding and compiler errors.
Three examples of the findings are described below.
6.1 Preparing WRF for Analysis
The build procedure for WRF modifies the code, and generates code automatically to adapt to
different computing environments. Code is modified by the sed stream editor and by the cpp C pre-processor. This is necessary because the code must be optimised for different parallel
architectures with different arrangements of shared memory. WinFPT cannot analyse the
distributed code because it cannot interpret the cpp directives or anticipate the modifications
made by sed. The WRF build procedure was therefore interrupted to capture the post-processed
Fortran 95 files for analysis. A modified version of the build procedure was written to complete
the build from the post-processed files.
6.2 Case 1: An Uninitialised Variable
The WRF code was processed to insert the trace statements, built with gfortran and a short run
was made. The code was rebuilt with ifort and run again. The first difference reported is:
!*** Trace value error:
!Value computed: 148443456
!Trace file line: 24468 NEW_DOMDESC = 538976288
The value computed for NEW_DOMDESC by:
ifort is 148443456
gfortran is 538976288
The identifier, 24468, is used to identify the statement in the code where the variable was
assigned. The statement is a subroutine call, and the variable should be assigned as an
INTENT(INOUT) argument. However, it is never initialised and is not written to in the routine. This is a coding error. The variable is used in the subsequent code, although it is not clear that it has any significant effect on the program results. A minimal fragment reproducing this class of error is sketched below.
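The following hypothetical fragment (illustrative names only, not the WRF code) reproduces the class of error found in this case: the actual argument is never defined before the call and the routine never assigns to it, so the value seen afterwards is whatever the compiler and run-time environment happened to leave in that storage.
PROGRAM uninitialised_demo
! Illustrative fragment: new_domdesc is never initialised before the
! call and is never written to inside the routine, which is the error.
IMPLICIT NONE
INTEGER :: new_domdesc
CALL alloc_domain(new_domdesc)   ! non-conforming: the actual argument is undefined
PRINT *, new_domdesc             ! value depends on compiler and environment
END PROGRAM uninitialised_demo

SUBROUTINE alloc_domain(new_domdesc)
IMPLICIT NONE
INTEGER, INTENT(INOUT) :: new_domdesc
! The dummy argument is declared INTENT(INOUT) but is never assigned,
! so the caller sees whatever happened to be in that storage.
END SUBROUTINE alloc_domain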
6.3 Case 2: A Coding Error, a Compiler Bug and a Poor Fortran 95 Language
Feature
The comparison ifort run halts with the report:
!*** Trace error at start of sub-program: ESMF_TIMECOPY
!Trace file line: 9608 STOPTIME = 0
ifort starts execution of the subroutine esmf_timecopy.
gfortran does not.
1) The Coding Error
The subroutine esmf_timecopy is in esmf_time.f90 in the MODULE esmf_timemod. It is used to
overload the assignment of variables of the derived type esmf_time. The code is:
!===========================================================
!
! ESMF Time Module
module ESMF_TimeMod
!
!===========================================================
:
! !PRIVATE TYPES:
private
!-----------------------------------------------------------
! ! ESMF_Time
:
!-----------------------------------------------------------
!BOP
! !INTERFACE:
interface assignment (=)
! !PRIVATE MEMBER FUNCTIONS:
module procedure ESMF_TimeCopy
! ! DESCRIPTION:
! This interface overloads the = operator for the
! ESMF_Time class
!
!EOP
End interface
!
!----------------------------------------------------------
There are many calls, but they are written as assignment statements. For example:
clockint%currtime = starttime
The coding error is that there is a leading PRIVATE statement in the module esmf_clockmod.
esmf_timecopy is declared public, and is therefore visible when the module is used. However,
ASSIGNMENT(=) is not declared public. Therefore the overloading of the operator should not
be exported.
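A minimal sketch of the language rule involved is given below, with illustrative names and a simplified derived type rather than the actual ESMF code: under a default PRIVATE statement, the overload is exported only if the generic ASSIGNMENT(=) itself is declared PUBLIC.
MODULE time_mod
! Illustrative module (not the ESMF code) showing the accessibility rule.
IMPLICIT NONE
PRIVATE                       ! default accessibility is private
PUBLIC :: time_t, time_copy
PUBLIC :: ASSIGNMENT(=)       ! without this line the overload is not exported,
                              ! and a user of the module gets a plain copy instead
TYPE time_t
   INTEGER :: yr, mm, dd
END TYPE time_t
INTERFACE ASSIGNMENT (=)
   MODULE PROCEDURE time_copy
END INTERFACE
CONTAINS
SUBROUTINE time_copy(out_time, in_time)
TYPE(time_t), INTENT(OUT) :: out_time
TYPE(time_t), INTENT(IN)  :: in_time
out_time%yr = in_time%yr      ! copy components; a whole-structure assignment
out_time%mm = in_time%mm      ! (out_time = in_time) would invoke this routine
out_time%dd = in_time%dd      ! recursively
END SUBROUTINE time_copy
END MODULE time_mod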
2) The Compiler Bug
This code works as intended under ifort because of a bug in the ifort compiler. ifort exports the use of the SUBROUTINE esmf_timecopy to overload assignment when it exports the routine itself. It should not [5]. The code works as written under gfortran, not as intended.
3) A Poor Language Feature
This is a serious weakness of the Fortran 95 language. If an error occurs in the coding of an INTERFACE ASSIGNMENT (=) statement, as occurred here, then the values of the variables on which the overloaded assignment is intended to operate are simply, and silently, copied by intrinsic assignment instead.
6.4 Case 3: Different Orders of Computation (Not an Error)
The trace report is:
!*** Trace sequence error:
!Sequence number reached: 2632
!Trace line: 44475387 2444 Start sub-program:
! WRF_PUT_DOM_TI_INTEGER
44475387 2632 Start sub-program: USE_PACKAGE
gfortran called wrf_put_dom_ti_integer
ifort called use_package
The code here is:
IF ((use_package(io_form)==io_netcdf) .OR. &
(use_package(io_form)==io_phdf5) .OR. &
(use_package(io_form)==io_pnetcdf)) THEN
CALL wrf_put_dom_ti_integer(fid,'SURFACE_INPUT_SOURCE', &
surface_input_source,1,ierr)
Both gfortran and ifort are optimising compilers. Both evaluate the alternatives of the .OR. clause and stop evaluation as soon as one is true. The Fortran standard explicitly states that the order of evaluation is not defined by the language in this context [5]. The gfortran compiler evaluated the first sub-clause first. The ifort compiler did not, and therefore invoked the function use_package again.
This is not an error, provided that the function involved has no side effects (which is the case
here). However, it is not a good idea to have an undefined pattern of program flow, and the trace
analysis traps it.
The present library routines halt the program when the program flow in the second run differs
from that in the first. The code was therefore modified for future runs:
i_use_package = use_package(io_form)
IF ((i_use_package==io_netcdf) .OR. &
(i_use_package==io_phdf5) .OR. &
(i_use_package==io_pnetcdf)) THEN
CALL wrf_put_dom_ti_integer(fid,'SURFACE_INPUT_SOURCE', &
surface_input_source,1,ierr)
6.5 Further Analysis of WRF
A coverage analysis of the WRF code indicated that only 18% of the code was visited in the runs
which were analysed. This is to be expected. In a single WRF run, only one of 14 possible
physics regimes is selected, and the atmospheric physics code is the largest part of the program.
The intention of future work on the project is to analyse all of the physics code. There are also
many static analyses carried out by WinFPT which expose features in the code which should be
corrected.
7. IMPLICATIONS FOR WRF
The WRF modelling system is complex and exploits many Fortran capabilities that permit compile-time checking, such as modules and interfaces. The code is largely standard-compliant Fortran, as gauged by the number of successful compiler ports: IBM xlf, PGI, Intel, gfortran and g95.
While the architecture of the WRF model is sound, over the years capabilities and code have been
added through accretion by many different developers and contributors of the modelling
community. The complicated nature and background of the source code means that the software
has accumulated inconsistencies and inefficiencies with respect to the strict Fortran standards.
The initial tangible benefit that the WinFPT analyses provide is run-time stability. The FPT group is working on identifying coding errors in the WRF model. For example, they have already identified improper usage of an overloaded "=" when ESMF calls are made, and a value returned into a parameter. These errors are the visible tip of the iceberg of unknown, lurking bugs that could trigger failed forecasts for no apparent reason. These problems are also likely to
be a source of different behaviour when running on different machines, such as when porting
from one OS/compiler to another. Just as the WRF community greatly benefits from having
thousands of users investigate the code and identify problems in the physical parameterization
schemes, so the community will also benefit from having a tool that identifies (for later
correction) errors in the Fortran code which will reduce the exposure to model failures.
With the broad usage of the WRF code across many different hardware and software systems, the intercomparability of model results becomes paramount. Here also the WinFPT tool is able to provide a solution by allowing a fine-grained comparison of results while the model is integrating.
8. CONCLUSIONS
The technique described isolates coding and compiler errors which lead to differences in
performance when the same program is run in different computing environments. It has the
advantage that it is highly automated, and requires very little human intervention. However, it is
computationally intensive, and currently requires code to be run serially on a single processor.
Initial investigation making use of the techniques described has enabled a number of issues to be
discovered. These issues can be related to coding in the model and the compilers used to build
the code. The paper discusses three findings from the experiments that have been performed. These represent an important validation of the technique, and the intention of the project is to apply it to a broader coverage of the WRF model code.
The initial experiments have focussed on single processor builds of the WRF model. By making
use of alternative test suites, it is envisaged that coverage of the WRF code will be significantly
increased from the 18% which has currently been achieved. However, this could be further
improved by expanding the experimentation to also include multiprocessor builds. One aim of
future work on the project is to investigate the combination of aspects of this technique and the
relative debugging techniques developed by Abramson et al. [2] to analyse runs made on
multiprocessor systems.
The technique described here offers the possibility of a new phase in software development in
which software validation can be enhanced by comparing execution on different compilers to
track down bugs that would be difficult or impossible to detect otherwise. This is especially
useful where immature compilers are involved.
REFERENCES
[1] D. Abramson and R. Sosic. A debugging and testing tool for supporting software evolution.
Automated Software Engineering: An International Journal, 3(3/4):369–390, August (1996).
[2] D. Abramson, I. Foster, J. Michalakes and R. Sosic. Relative debugging and its application to the
development of large numerical models. In Proceedings of the 1995 ACM/IEEE conference on
Supercomputing, ACM, New York, Article 51.(1995)
[3] D. Abramson, R. Sosic and C. Watson, "Implementation techniques for a parallel relative debugger," Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques, pp. 218-226, Oct (1996)
[4] Abramson D., Foster, I., Michalakes, J. and Sosic R., “Relative Debugging: A New Methodology for
Debugging Scientific Applications”, Communications of the Association for Computing Machinery
(CACM), Vol 39, No 11, pp 67 - 77, Nov (1996).
[5] ANSI, Programming Language Fortran 90, X3.198-1992, American National Standard Institute
(1992).
[6] Easterbrook S.M, Johns T.C., Engineering the Software for Understanding Climate Change,
Computing In Science and Engineering, Nov (2009)
[7] Liu, G., Schmider, H.L., and Edgecombe, K.E.. A Data Identification Scheme for Automatic Relative
Debugging. In Proceedings of the 2004 International Conference on Parallel Processing Workshops
(ICPPW '04). IEEE Computer Society, Washington, DC, USA, 247-253, (2004)
[8] Matthews, G., Hood, R., Johnson, S., and Leggett, P.. Backtracking and Re-execution in the
Automatic Debugging of Parallelized Programs. In Proceeding of the 11th IEEE International
Symposium on High Performance Distributed Computing, 2002, HPDC-11, (2002).
[9] Matthews, G., Hood, R., Jin, H., Johnson, S., and Ierotheou, C.. Automatic Relative Debugging of
OpenMP Programs Tech. Report NAS-03-014, NASA Ames Research Center, Moffett Field,
California, (2003)
[10] Searle, A., Gough, J. and Abramson, D., “DUCT: An Interactive Define-USE Chain Navigation Tool
for Relative Debugging”, Proceedings of the 5th International Workshop in Automated and
Algorithmic Debugging. 2003. Ghent, Belgium, Sep (2003a).
[11] Searle, A., Gough, J. and Abramson, D, "Automating Relative Debugging," Automated Software
Engineering, International Conference on, p. 356, 18th IEEE International Conference on Automated
Software Engineering (ASE'03), (2003b)
[12] Michalakes, J., Dudhia, J., Gill, D., Henderson, T., Klemp, J., Skamarock, W., and Wang, W.: The
Weather Research and Forecast Model: Software Architecture and Performance. Proceedings of the
Eleventh ECMWF Workshop on the Use of High Performance Computing in Meteorology. Eds.
Walter Zwieflhofer and George Mozdzynski. World Scientific, pp 156 – 168, (2005)