Software size estimation at the early stages of project development is essential for meeting the competitive demands of the software industry. Software size is one of the most useful internal attributes and has been used in several effort/cost models as a predictor of the effort and cost needed to design and implement software. With the broad shift towards the object oriented paradigm, an accurate methodology for measuring the size of object oriented projects is needed. The class point approach quantifies classes, the logical building blocks of the object oriented paradigm. In this paper, we propose a class point based approach for software size estimation of On-Line Analytical Processing (OLAP) systems. OLAP is an approach to swiftly answering decision support queries based on a multidimensional view of data, and materialized views can significantly reduce the execution time of such queries. We perform a case study on the TPC-H benchmark, which is representative of OLAP systems. We use a greedy approach to determine a good set of views to materialize; once the number of views is known, the class point approach is used to estimate the size of the OLAP system. The results of our approach are validated.
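A minimal sketch of one form of greedy view selection consistent with that description; the candidate views and benefit figures below are hypothetical, and the paper's own benefit model (which would typically recompute benefits against the views already chosen, as in the classic lattice algorithm) is not reproduced.

# Greedy selection of views to materialize, a simplified sketch.
# 'benefit' maps each candidate view to the query-cost reduction it yields;
# the full algorithm would recompute benefits after every pick.

def greedy_view_selection(benefit, k):
    """Pick up to k views, always taking the current best one."""
    selected = []
    remaining = dict(benefit)
    for _ in range(k):
        if not remaining:
            break
        best = max(remaining, key=remaining.get)   # view with the largest remaining benefit
        selected.append(best)
        del remaining[best]
    return selected

# Hypothetical benefit values for illustration only.
candidate_benefit = {"v_customer_nation": 120, "v_lineitem_year": 310, "v_part_supplier": 80}
print(greedy_view_selection(candidate_benefit, 2))   # ['v_lineitem_year', 'v_customer_nation']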
An Analysis on Query Optimization in Distributed Database (Editor, IJMTER)
The query optimizer is a significant element of today's relational database management systems. It is responsible for translating a user-submitted query, commonly written in a non-procedural language, into an efficient query evaluation program that can be executed against the database. This paper describes the architecture and steps of query processing and optimization, along with their time and memory usage. The key goal of this paper is to explain the basic query optimization process and its architecture.
A Systematic Mapping Study of Performance Analysis and Modelling of Cloud Systems and Applications (IJECEIAES)
Cloud computing is a paradigm that uses utility-driven models to provide dynamic services to clients at all levels. Performance analysis and modelling are essential because of service level agreement guarantees, and studies on the topic are increasing across the cloud landscape on issues such as virtual machines and data storage. The objective of this study is to conduct a systematic mapping study of performance analysis and modelling of cloud systems and applications. A systematic mapping study is useful for visualizing and summarizing the research carried out in an area of interest, and this one provides an overview of studies on the subject using a categorization-based structure. The results are presented in terms of research type, such as evaluation and solution, and contribution type, such as the tools and methods utilized. The results showed more discussion of optimization in relation to tool, method and process, with 6.42%, 14.29% and 7.62% respectively. In addition, analysis based on designs in terms of model had 14.29%, and publications relating to optimization in terms of evaluation research had 9.77%, validation 7.52%, experience 3.01%, and solution 10.51%. Research gaps were identified and should motivate researchers to pursue further research directions.
Size and Time Estimation in Goal Graph Using Use Case Points (UCP): A Survey (IJERA Editor)
In order to reach its desired state and meet the demands of stakeholders, each organization should follow its vision and long-term plan. Goals and strategies are two fundamental bases of vision and mission. Goals define the framework of an organization within which processes, rules and resources are designed. Goals are modelled as a graph by extracting, classifying and determining requirements and their relations. A goal graph shows the goals that should be satisfied in order to keep the organization on the right course. These goals can also be regarded as predefined sub-projects that the business management unit should consider and analyse. If we know the approximate size and duration of each part, we can design better management plans, resulting in more success and fewer failures. This paper studies how the use case points method can be used to calculate size and time in a goal graph.
Software cost estimation is a key open issue for the software industry, which frequently suffers from cost overruns. The most popular technique for object-oriented software cost estimation is the Use Case Points (UCP) method; however, it has two major drawbacks: the uncertainty of the cost factors and the abrupt classification. To address these two issues, the use case complexity classification is refined using fuzzy logic theory, which mitigates the uncertainty of the cost factors and improves the accuracy of the classification.
Software estimation is a crucial task in software engineering. Software estimation
encompasses cost, effort, schedule, and size. The importance of software estimation becomes
critical in the early stages of the software life cycle when the details of software have not
been revealed yet. Several commercial and non-commercial tools exist to estimate software
in the early stages. Most software effort estimation methods require software size as one of
the important metric inputs and consequently, software size estimation in the early stages
becomes essential.
The proposed method presents a technique using fuzzy logic theory to improve the accuracy of the use case points method by refining the use case classification.
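The abstracts above do not give the exact membership functions, so the following is only an illustrative sketch of the idea: hypothetical triangular membership functions over the number of transactions in a use case, defuzzified into a weight instead of an abrupt simple/average/complex cut-off.

# Fuzzy refinement of use case complexity, an illustrative sketch only.
# Standard UCP assigns weight 5 / 10 / 15 by hard transaction-count ranges;
# here each use case gets a degree of membership in every class instead.

def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzy_use_case_weight(transactions):
    # Hypothetical membership functions for simple, average and complex use cases.
    mu = {
        5:  tri(transactions, 0, 2, 5),    # simple
        10: tri(transactions, 3, 6, 9),    # average
        15: tri(transactions, 7, 10, 30),  # complex
    }
    total = sum(mu.values())
    if total == 0:
        return 15.0 if transactions >= 10 else 5.0   # fall back to a crisp rule
    # Defuzzify with a weighted average, smoothing the abrupt classification.
    return sum(w * m for w, m in mu.items()) / total

print(fuzzy_use_case_weight(4))   # lies between the simple and average weights (7.5)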
High Dimensionality Reduction on Graphical Data (eSAT Journals)
Abstract: Although graph embedding has been a powerful tool for modelling the natural structure of data, simply using all features for structure discovery may amplify noise. This is especially serious for high-dimensional data with few samples. To meet this challenge, a novel efficient framework is proposed to perform feature selection for graph embedding, in which a class of graph embedding methods is cast as a least squares regression problem. In this framework, a binary feature selector is introduced to naturally handle the feature cardinality in the least squares formulation. The proposed method is fast and memory efficient, and it is applied to several graph embedding learning problems, including supervised, unsupervised and semi-supervised graph embedding. Key Words: Efficient feature selection, High dimensional data, Sparse graph embedding, Sparse principal component analysis, Subproblem optimization.
Decision Making Framework in e-Business Cloud Environment Using Software Metrics (ijitjournal)
Cloud computing has become central to the IT industry by enabling providers to offer access to their system and application services on a pay-per-use basis. As a result, several enterprises, including Facebook, Microsoft, Google, and Amazon, have started offering such services to their clients. Software quality is a key factor in market competition, and this paper presents a hybrid framework based on the goal/question/metric paradigm to evaluate the quality and effectiveness of previous software products at the project, product and organization levels in a cloud computing environment. Our approach supports decision making at the project, product and organization levels using neural networks and three angular metrics, i.e., project metrics, product metrics, and organization metrics.
Automatically Estimating Software Effort and Cost using Computing Intelligence Techniques (cscpconf)
In the IT industry, precisely estimating the effort, development cost and schedule of each software project matters greatly to a software company, so accurate estimation of manpower is becoming ever more important. In the past, IT companies estimated the work effort of manpower through human experts using statistical methods; however, the outcomes often left management unsatisfied. Recently it has become an interesting question whether computing intelligence techniques can do better in this field. This research uses computing intelligence techniques, such as the Pearson product-moment correlation coefficient and one-way ANOVA to select key factors, and the K-Means clustering algorithm to cluster projects, in order to estimate software project effort. The experimental results show that using computing intelligence techniques yields more precise and more effective estimates than traditional human experts.
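A minimal sketch of the clustering-based estimation step, assuming scikit-learn is available and using a hypothetical feature table; the paper's own factor selection and dataset are not reproduced.

# Sketch of effort estimation via project clustering.
# Projects are clustered on selected key factors; a new project's effort is then
# estimated as the mean effort of the historical projects in its cluster.
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical historical data: [team_size, function_points] and effort in person-months.
X = np.array([[3, 120], [4, 150], [10, 800], [12, 900], [6, 400], [7, 450]], dtype=float)
effort = np.array([14, 18, 95, 110, 40, 46], dtype=float)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

def estimate_effort(new_project):
    cluster = kmeans.predict(np.array([new_project], dtype=float))[0]
    return effort[kmeans.labels_ == cluster].mean()

print(estimate_effort([5, 380]))   # falls in the mid-sized cluster, about 43 person-months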
For years, the Machine Learning community has focused on developing efficient algorithms that can produce very accurate classifiers. However, it is often much easier to find several good classifiers based on dataset combination than a single classifier applied to different datasets. The advantages of using classifier-dataset combinations instead of a single one are twofold: they help lower the computational complexity by using simpler models, and they can improve classification accuracy and performance. Most data mining applications are based on pattern matching algorithms, so improving the performance of the classification has a positive impact on the quality of the overall data mining task. Since combination strategies have proved very useful in improving performance, these techniques have become very important in applications such as cancer detection, speech technology and natural language processing. The aim of this paper is to propose a proprietary metric, the Normalized Geometric Index (NGI), based on the latent properties of datasets, for improving the accuracy of data mining tasks.
Unsupervised Feature Selection Based on the Distribution of Features Attributed to Imbalanced Data Sets (Waqas Tariq)
Since dealing with high dimensional data is computationally complex and sometimes even intractable, several feature reduction methods have recently been developed to reduce the dimensionality of the data and simplify analysis in applications such as text categorization, signal processing, image retrieval and gene expression analysis. Among feature reduction techniques, feature selection is one of the most popular methods because it preserves the original features. However, most current feature selection methods do not perform well when fed imbalanced data sets, which are pervasive in real-world applications. In this paper, we propose a new unsupervised feature selection method attributed to imbalanced data sets, which removes redundant features from the original feature space based on the distribution of features. To show the effectiveness of the proposed method, popular feature selection methods have been implemented and compared. Experimental results on several imbalanced data sets derived from the UCI repository illustrate the effectiveness of our proposed method in comparison with the other methods in terms of both accuracy and the number of selected features.
Issues in Query Processing and Optimization (Editor, IJMTER)
The paper identifies the various issues in query processing and optimization that arise while choosing the best database plan. Unlike preceding query optimization techniques that use only a single approach for identifying the best query plan by extracting data from the database, our approach takes into account the various phases of query processing and optimization, heuristic estimation techniques and a cost function for identifying the best execution plan. A review of the various phases of query processing, the goals of the optimizer, rules for heuristic optimization and the cost components involved is presented in this paper.
SENSITIVITY ANALYSIS OF INFORMATION RETRIEVAL METRICS (ijcsit)
Average Precision, Recall and Precision are the main metrics of Information Retrieval (IR) system performance. Using mathematical and empirical analysis, in this paper we show the properties of those metrics. Mathematically, it is demonstrated that all those parameters are very sensitive to relevance judgment, which is not usually very reliable. We show that shifting a relevant document downwards within the ranked list decreases Average Precision. The variation of the Average Precision value is pronounced in positions 1 to 10, while from the 10th position on this variation is negligible. In addition, we estimate the regularity of the Average Precision changes when we assume that an arbitrary number of relevance judgments within the existing ranked list is switched from non-relevant to relevant. Empirically, it is shown that 6 relevant documents at the end of a 20-document list have approximately the same Average Precision value as a single relevant document at the beginning of the list, while Recall and Precision values increase linearly regardless of the document position in the list. Also, we show that in the case of a Serbian-to-English human translation query followed by English-to-Serbian machine translation, relevance judgment changes significantly and therefore all the parameters for measuring IR system performance are also subject to change.
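A short sketch of the Average Precision computation underlying those observations, over a hypothetical ranked list of binary relevance judgments:

# Average Precision over a ranked list of binary relevance judgments.
# Moving a relevant document downward in the ranking lowers AP.

def average_precision(relevance):
    """relevance: list of 0/1 judgments in ranked order."""
    hits, precisions = 0, []
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

print(average_precision([1, 0, 0, 0, 0]))   # single relevant doc at rank 1 -> 1.0
print(average_precision([0, 1, 0, 0, 0]))   # same doc shifted to rank 2 -> 0.5
print(average_precision([0, 0, 0, 0, 1]))   # same doc shifted to rank 5 -> 0.2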
ISO 9126 Based Software Quality Evaluation Using Choquet Integral (ijseajournal)
Evaluating software quality is an important and essential issue in the development process because it helps
to deliver a competitive software product. Selecting the best software based on quality attributes is a type of multi-criteria decision-making (MCDM) process where interactions among criteria should be considered. This paper presents and develops quantitative evaluations that take interactions among criteria in MCDM problems into account. The aggregation methods Arithmetic Mean (AM) and Weighted Arithmetic Mean (WAM) are introduced, described and compared to the Choquet Integral (CI) approach, which is based on a fuzzy measure and used as a new method for MCDM. The comparisons are
shown by evaluating and ranking software alternatives based on six main quality attributes as identified by
the ISO 9126-1 standard. The evaluation experiments depend on real data collected from case studies.
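A small numerical sketch of the aggregation operators being compared, using a hypothetical two-criteria fuzzy measure; the ISO 9126 case-study data is not reproduced.

# Arithmetic mean, weighted mean and discrete Choquet integral, sketched for two criteria.

def weighted_mean(scores, weights):
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

def choquet(scores, mu):
    """scores: dict criterion -> value in [0,1]; mu: dict frozenset -> capacity in [0,1]."""
    ordered = sorted(scores, key=scores.get, reverse=True)   # criteria by decreasing score
    total = 0.0
    # Classic formula: sum over i of (x_(i) - x_(i+1)) * mu({top i criteria}), with x_(n+1) = 0.
    for i in range(len(ordered)):
        subset = frozenset(ordered[:i + 1])
        x_i = scores[ordered[i]]
        x_next = scores[ordered[i + 1]] if i + 1 < len(ordered) else 0.0
        total += (x_i - x_next) * mu[subset]
    return total

# Hypothetical capacities expressing that the two criteria partly overlap (sub-additive).
mu = {frozenset(["functionality"]): 0.6,
      frozenset(["reliability"]): 0.6,
      frozenset(["functionality", "reliability"]): 1.0}
scores = {"functionality": 0.9, "reliability": 0.5}

print(weighted_mean([0.9, 0.5], [0.5, 0.5]))   # 0.7, interaction ignored
print(choquet(scores, mu))                     # 0.74, interaction between criteria counted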
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO... (ijcsa)
This work is about using the Simulated Annealing algorithm for effort estimation model parameter optimization, which can reduce the difference between the actual and estimated effort used in model development.
The model has been tested using an object-oriented dataset obtained from NASA for research purposes. The parameters of the dataset-based model equation have been found; the equation uses two independent variables, viz. Lines of Code (LOC) and one additional attribute, with software development effort (DE) as the dependent variable. The results have been compared with the author's earlier work on Artificial Neural Network (ANN) and Adaptive Neuro Fuzzy Inference System (ANFIS) models, and it has been observed that the developed SA-based model provides better estimates of software development effort than ANN and ANFIS.
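A rough sketch of the simulated annealing parameter search described above, assuming a hypothetical power-law effort model (effort = a * LOC^b) and made-up data; the NASA dataset and the paper's exact model form are not reproduced.

# Simulated annealing for effort-model parameter fitting, a sketch with made-up data.
import math
import random

loc = [10, 25, 40, 60, 90]                 # KLOC, hypothetical
actual = [24, 65, 110, 180, 290]           # person-months, hypothetical

def mmre(params):
    a, b = params
    errors = [abs(actual[i] - a * loc[i] ** b) / actual[i] for i in range(len(loc))]
    return sum(errors) / len(errors)       # mean magnitude of relative error

def anneal(start, temp=1.0, cooling=0.95, steps=2000):
    current, best = list(start), list(start)
    random.seed(1)
    for _ in range(steps):
        candidate = [max(0.01, p + random.gauss(0, 0.05)) for p in current]
        delta = mmre(candidate) - mmre(current)
        # Accept improvements always, worse moves with a temperature-dependent probability.
        if delta < 0 or random.random() < math.exp(-delta / temp):
            current = candidate
            if mmre(current) < mmre(best):
                best = list(current)
        temp *= cooling
    return best

a, b = anneal([2.0, 1.0])
print(round(a, 3), round(b, 3), round(mmre([a, b]), 4))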
Modeling enterprise architecture using timed colored petri net, single process... (ijmpict)
The purpose of modeling and analyzing enterprise architecture is to ease decision making about the architecture of information systems. Planning is one of the most important tasks in an organization and has a major role in increasing its productivity. The scope of this paper is the scheduling of processes in the enterprise architecture. Scheduling is the decision making on the execution start time of processes used in manufacturing and service systems. Different methods and tools have been proposed for modeling enterprise architecture. Colored Petri nets are an extension of traditional Petri nets whose modeling capability has grown dramatically. A model developed with colored Petri nets is suitable for verifying operational aspects and for performance evaluation of information systems. Because colored Petri nets support hierarchical modeling, predesigned modules can be used for the smaller parts of a system, and with a general algorithm any kind of enterprise architecture can be modeled. A two-level hierarchical model is presented in this paper as a building block for modeling the architecture of Transaction Processing Systems (TPS). This model schedules and runs processes based on a predetermined non-preemptive scheduling method. The model can be used for scheduling processes with four non-preemptive methods, namely priority based (PR), shortest job first (SJF), first come first served (FCFS) and highest response ratio next (HRRN). The presented model is designed so that it can be used as one of the main components in modeling any type of enterprise architecture; most enterprise architectures can be modeled by putting together an appropriate number of these modules and composing them properly.
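As an aside, a compact plain-Python sketch of one of the four policies, highest response ratio next (HRRN), with hypothetical arrival and service times; the Petri-net model itself is not reproduced.

# Non-preemptive HRRN scheduling, a plain-Python sketch (no Petri net involved).
# Response ratio = (waiting time + service time) / service time; the ready process
# with the highest ratio runs next, which balances short jobs against long waits.

def hrrn(processes):
    """processes: list of (name, arrival, service); returns execution order."""
    time, order, pending = 0, [], list(processes)
    while pending:
        ready = [p for p in pending if p[1] <= time]
        if not ready:                       # idle until the next arrival
            time = min(p[1] for p in pending)
            continue
        def ratio(p):
            waiting = time - p[1]
            return (waiting + p[2]) / p[2]
        nxt = max(ready, key=ratio)
        pending.remove(nxt)
        order.append(nxt[0])
        time += nxt[2]                      # non-preemptive: run to completion
    return order

jobs = [("P1", 0, 8), ("P2", 1, 4), ("P3", 2, 2), ("P4", 3, 1)]   # hypothetical
print(hrrn(jobs))   # ['P1', 'P4', 'P3', 'P2'] for these arrival/service times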
Fuzzy Analytic Hierarchy Based DBMS Selection In Turkish National Identity Ca... (Ferhat Ozgur Catak)
Database Management Systems (DBMS) play an important role in supporting enterprise application development. Selecting the right DBMS is a crucial decision in the software engineering process and requires optimizing a number of criteria. Evaluation and selection of a DBMS among several candidates tends to be very complex, involving both quantitative and qualitative issues. A wrong DBMS selection will have a negative effect on the development of the enterprise application; it can turn out to be costly and adversely affect business processes. The following study focuses on the evaluation of a multi-criteria decision problem using fuzzy logic. We demonstrate the methodological considerations regarding group decision making and fuzziness on the DBMS selection problem. We developed a new Fuzzy AHP based decision model, formulated and proposed to select a DBMS easily. In this decision model, the main criteria and their sub-criteria are first determined for the evaluation. These criteria are then weighted by pair-wise comparison, and the DBMS alternatives are evaluated by assigning a rating scale.
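A minimal sketch of the crisp part of that process, deriving priority weights from a pairwise comparison matrix by normalized column averaging; the criteria and judgments are hypothetical and the fuzzy extension is not reproduced here.

# Priority weights from an AHP pairwise comparison matrix (crisp AHP, not fuzzy AHP).
# Entry a[i][j] says how much more important criterion i is than criterion j.

def ahp_weights(matrix):
    n = len(matrix)
    col_sums = [sum(row[j] for row in matrix) for j in range(n)]
    # Normalize each column, then average across each row to approximate the principal eigenvector.
    normalized = [[matrix[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return [sum(normalized[i]) / n for i in range(n)]

criteria = ["performance", "cost", "support"]      # hypothetical DBMS-selection criteria
comparisons = [
    [1,   3,   5],    # performance vs (performance, cost, support)
    [1/3, 1,   2],
    [1/5, 1/2, 1],
]
for name, w in zip(criteria, ahp_weights(comparisons)):
    print(f"{name}: {w:.3f}")     # weights sum to 1; performance dominates here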
An Overview on Data Quality Issues at Data Staging ETL (idescitation)
A data warehouse (DW) is a collection of technologies
aimed at enabling the decision maker to make better and
faster decisions. Data warehouses differ from operational
databases in that they are subject oriented, integrated, time
variant, non volatile, summarized, larger, not normalized, and
perform OLAP. The generic data warehouse architecture
consists of three layers (data sources, DSA, and primary data
warehouse). During the ETL process, data is extracted from
OLTP databases, transformed to match the data warehouse schema, and loaded into the data warehouse database.
OPTIMIZATION APPROACHES IN OPTIMAL INTEGRATED CONTROL, COMPUTATION AND COMMUNICATION (ijctcm)
This paper studies the existing approaches in optimal integrated control, computation and communication
problems. It concentrates on joint optimization problems aimed at finding the communication/computation policy and the control signal. Different aspects, including computational complexity, convexity, proposed methods for finding the optimum and other issues related to control performance, are studied and compared across the different approaches.
Program analysis is useful for debugging, testing and maintenance of software systems because it yields information about the structure and relationships of a program's modules. In general, program analysis is performed based on either a control flow graph or a dependence graph. However, in the case of aspect-oriented programming (AOP), a control flow graph (CFG) or dependence graph (DG) alone is not enough to model the properties of aspect-oriented (AO) programs. Although AOP is good for modular representation of crosscutting concerns, a suitable model for program analysis is required to gather information on an AO program's structure in order to minimize maintenance effort. In this paper the Aspect Oriented Dependence Flow Graph (AODFG) is proposed as an intermediate representation model for the structure of aspect-oriented programs. AODFG is formed by merging the CFG and DG, so that more information is gathered about the dependencies between join points, advice, aspects and their associated constructs, together with the flow of control from one statement to another. We discuss the performance of AODFG by analysing examples of AspectJ programs taken from the AspectJ Development Tools (AJDT).
Secured wireless communication through simulated annealing guided triangularized encryption... (csandit)
In this paper, simulated annealing guided triangularized encryption using a multilayer perceptron generated session key (SATMLP) is proposed for secured wireless communication. Both the sender and the receiver station use identical multilayer perceptrons, and depending on the final output of the perceptrons on both sides, the hidden-layer weight vectors are tuned at both ends. After this tuning step both perceptrons generate identical weight vectors, which are regarded as a one-time session key. In the first level of encryption, triangularized sub-key 1 is used to encrypt the plain text. In the second level, the simulated annealing method helps to generate sub-key 2 for further encryption of the first-level triangularized encrypted text. Finally, the multilayer perceptron generated one-time session key is used to perform the third level of encryption of the second-level encrypted text. The recipient uses the same multilayer perceptron guided session key to decipher the message, obtaining the simulated annealing guided intermediate cipher text; deciphering with sub-key 2 then yields the triangularized encrypted text, and finally sub-key 1 is used to recover the plain text. In this proposed technique the session key is not transmitted over the public channel, apart from the few data transfers needed for the weight synchronization process, because after synchronization both multilayer perceptrons generate the identical weight vector that acts as the session key. Parametric tests are done and the results are compared with some existing classical techniques in terms of the Chi-Square test and transmission response time, showing comparable results for the proposed system.
Performance evaluations of Grigoryan FFT and Cooley-Tukey FFT on Xilinx Virtex FPGAs (csandit)
A large family of signal processing techniques consists of Fourier-transforming a signal, manipulating the Fourier-transformed data in a simple way, and reversing the transformation. We widely use Fourier frequency analysis in the equalization of audio recordings, X-ray crystallography, artefact removal in neurological signal and image processing, voice activity detection in brainstem speech evoked potentials, spectrograms for identifying phonetic sounds in speech processing, and so on. The Discrete Fourier Transform (DFT) is a principal mathematical method for frequency analysis, and the way the DFT is split gives rise to various fast algorithms. In this paper, we present the implementation of two fast DFT algorithms
for evaluating their performance. One of them is the popular radix-2 Cooley-Tukey fast Fourier
transform algorithm (FFT) [1] and the other one is the Grigoryan FFT based on the splitting by
the paired transform [2]. We evaluate the performance of these algorithms by implementing
them on the Xilinx Virtex-II pro [3] and Virtex-5 [4] FPGAs, by developing our own FFT
processor architectures. Finally we show that the Grigoryan FFT runs faster than the Cooley-Tukey FFT and is consequently useful at higher sampling rates. Operating at higher sampling rates is a challenge in DSP applications.
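For reference, a plain-Python recursive sketch of the radix-2 Cooley-Tukey decimation-in-time FFT mentioned above; the Grigoryan paired-transform FFT and the FPGA processor architectures are not shown.

# Recursive radix-2 decimation-in-time Cooley-Tukey FFT (input length must be a power of two).
import cmath

def fft(x):
    n = len(x)
    if n == 1:
        return x
    even = fft(x[0::2])                 # DFT of even-indexed samples
    odd = fft(x[1::2])                  # DFT of odd-indexed samples
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result

# A 1 Hz cosine sampled 8 times: energy appears in bins 1 and 7, as expected.
samples = [cmath.cos(2 * cmath.pi * i / 8) for i in range(8)]
print([round(abs(v), 3) for v in fft(samples)])   # [0.0, 4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 4.0]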
In the present paper, the applicability and capability of AI techniques for effort estimation prediction have been investigated. Neuro-fuzzy models are very robust, characterized by fast computation, and capable of handling distorted data. Due to the presence of non-linearity in the data, they are an efficient quantitative tool for predicting effort. A one-hidden-layer network, named OHLANFIS, has been developed in the MATLAB simulation environment.
The initial parameters of the OHLANFIS are identified using the subtractive clustering method, and the parameters of the Gaussian membership functions are optimally determined using the hybrid learning algorithm. From the analysis it is seen that the effort estimation prediction model developed with the OHLANFIS technique performs better than the standard ANFIS model.
Algorithm Example (daniahendric)
Algorithm Example
For the following task:
Use the random module to write a number guessing game.
The number the computer chooses should change each time you run the program.
Repeatedly ask the user for a number. If the number is different from the computer's, let the user know whether they guessed too high or too low. If the number matches the computer's, the user wins.
Keep track of the number of tries it takes the user to guess it.
An appropriate algorithm might be:
Import the random module
Display a welcome message to the user
Choose a random number between 1 and 100
Get a guess from the user
Set a number of tries to 0
As long as their guess isn’t the number
Check if guess is lower than computer
If so, print a lower message.
Otherwise, is it higher?
If so, print a higher message.
Get another guess
Increment the tries
Repeat
When they guess the computer's number, display the number and their tries count
Notice that each line in the algorithm corresponds to roughly a line of code in Python, but there is no coding itself in the algorithm. Rather the algorithm lays out what needs to happen step by step to achieve the program.
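The algorithm maps almost line for line onto Python. A possible implementation (variable names and messages below are illustrative, not prescribed by the text):

import random

print("Welcome to the number guessing game!")       # display a welcome message
number = random.randint(1, 100)                     # the computer's number changes every run
guess = int(input("Guess a number between 1 and 100: "))
tries = 0                                           # count of additional guesses needed

while guess != number:                              # as long as their guess isn't the number
    if guess < number:
        print("Too low.")
    else:
        print("Too high.")
    guess = int(input("Guess again: "))
    tries += 1                                      # increment the tries

print(f"You got it: the number was {number}. Extra guesses needed: {tries}")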
Software Quality Metrics for Object-Oriented Environments
AUTHORS:
Dr. Linda H. Rosenberg, Unisys Government Systems, Goddard Space Flight Center, Bld 6 Code 300.1, Greenbelt, MD 20771 USA
Lawrence E. Hyatt, Software Assurance Technology Center, Goddard Space Flight Center, Bld 6 Code 302, Greenbelt, MD 20771 USA
I. INTRODUCTION
Object-oriented design and development are popular concepts in today’s software development
environment. They are often heralded as the silver bullet for solving software problems. While
in reality there is no silver bullet, object-oriented development has proved its value for systems
that must be maintained and modified. Object-oriented software development requires a
different approach from more traditional functional decomposition and data flow development
methods. This includes the software metrics used to evaluate object-oriented software.
The concepts of software metrics are well established, and many metrics relating to product
quality have been developed and used. With object-oriented analysis and design methodologies
gaining popularity, it is time to start investigating object-oriented metrics with respect to
software quality. We are interested in the answer to the following questions:
• What concepts and structures in object-oriented design affect the quality of the
software?
• Can traditional metrics measure the critical object-oriented structures?
• If so, are the threshold values for the metrics the same for object-oriented designs as for
functional/data designs?
• Which of the many new metrics found in the literature are useful to measure the critical
concepts of object-oriented structures?
II. METRIC EVALUATION CRITERIA
While metrics for the traditional functional decomposition and data analysis design appro ...
Threshold benchmarking for feature ranking techniques (journalBEEI)
In prediction modeling, the choice of features chosen from the original feature set is crucial for accuracy and model interpretability. Feature ranking techniques rank the features by importance, but there is no consensus on the number of features to cut off. Thus, it becomes important to identify a threshold value or range for removing the redundant features. In this work, an empirical study is conducted to identify a threshold benchmark for feature ranking algorithms. Experiments are conducted on the Apache Click dataset with six popularly used ranker techniques and six machine learning techniques, to deduce a relationship between the total number of input features (N) and the threshold range. The area under the curve analysis shows that ≃ 33-50% of the features are necessary and sufficient to yield a reasonable performance measure, with a variance of 2%, in defect prediction models. Further, we also find that log2(N) as the ranker threshold value represents the lower limit of the range.
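A tiny sketch of applying such a cut-off to a ranked feature list, using the log2(N) lower limit and the roughly 33-50% range reported above; the feature names and ranker scores are hypothetical.

# Cutting off a ranked feature list at the thresholds discussed above.
import math

# Hypothetical output of a feature ranker: (feature, importance score), already sorted.
ranked = [("loc", 0.41), ("cbo", 0.23), ("wmc", 0.15), ("rfc", 0.09),
          ("dit", 0.06), ("noc", 0.03), ("lcom", 0.02), ("ca", 0.01)]

n = len(ranked)
lower = max(1, round(math.log2(n)))         # log2(N) lower limit from the study
upper = round(0.5 * n)                      # upper end of the ~33-50% range

print([f for f, _ in ranked[:lower]])       # minimal subset: ['loc', 'cbo', 'wmc']
print([f for f, _ in ranked[:upper]])       # generous subset: first half of the ranking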
Proceedings of the 2015 Industrial and Systems Engineering Research Conference (wkyra78)
Proceedings of the 2015 Industrial and Systems Engineering Research Conference
S. Cetinkaya and J. K. Ryan, eds.
Use of Symbolic Regression for Lean Six Sigma Projects
Daniel Moreno-Sanchez, MSc.
Jacobo Tijerina-Aguilera, MSc.
Universidad de Monterrey
San Pedro Garza Garcia, NL 66238, Mexico
Arlethe Yari Aguilar-Villarreal, MEng.
Universidad Autonoma de Nuevo Leon
San Nicolas de los Garza, NL 66451, Mexico
Abstract
Lean Six Sigma projects and the quality engineering profession have to deal with an extensive selection of tools, most of them requiring specialized training. The increased availability of standard statistical software motivates the
use of advanced data science techniques to identify relationships between potential causes and project metrics. In
these circumstances, Symbolic Regression has received increased attention from researchers and practitioners to
uncover the intrinsic relationships hidden within complex data without requiring specialized training for its
implementation. The objective of this paper is to evaluate the advantages and drawbacks of using computer assisted
Symbolic Regression within the Analyze phase of a Lean Six Sigma project. An application of this approach in a
service industry project is also presented.
Keywords
Symbolic Regression, Data Science, Lean Six Sigma
1. Introduction
Lean Six Sigma (LSS) has become a well-known hybrid methodology for quality and productivity improvement in
organizations. Its wide adoption in several industries has shaped Process Innovation and Operational Excellence
initiatives, enabling LSS to become a main topic in quality practitioner sites of interest [1], recognized Six Sigma
(SS) certification body of knowledge contents [2], and professional society conferences [3].
However, LSS projects and the quality engineering profession have to deal with an extensive selection of tools, most of them requiring specialized training. To assist LSS practitioners it is common to categorize tools based on the
traditional DMAIC model which stands for Define, Measure, Analyze, Improve, and Control phases. Table 1
presents an overview of the main tools that are commonly used in each phase of a LSS project, allowing team
members to progressively develop an understanding between realizing each phase’s intent and how the selected
tools can contribute to that purpose.
This paper focuses on the Analyze phase where tools for statistical model building are most likely to be selected.
The increased availability of standard statistical software motivates the use of advanced data science techniques to
identify relationships between potential causes and project metrics. In these circumstances Symbolic Regression
(SR) has received increased attention from researchers and practitioners even though SR is still in an early stage of
commercial availability.
The objective of this paper is to evaluate the advantages and drawbacks o ...
Availability Assessment of Software Systems Architecture Using Formal Models (Editor, IJCATR)
There has been a significant effort to analyze, design and implement information systems to process information and data and solve various problems. On the one hand, the complexity of contemporary systems and the striking increase in the variety and volume of information have led to a great number of components and elements, and to a more complex structure and organization of information systems. On the other hand, it is necessary to develop systems that meet all of the stakeholders' functional and non-functional requirements. Considering that evaluating and assessing these requirements prior to the design and implementation phases consumes less time and reduces costs, the best time to measure the evaluable behavior of a system is when its software architecture is available. One of the ways to evaluate a software architecture is to create an executable model of it.
The present research used availability assessment and took repair, maintenance and accident time parameters into consideration. Failures of software and hardware components have been considered in the architecture of software systems. To describe the architecture easily, the authors used Unified Modeling Language (UML). However, due to the informality of UML, they utilized Colored Petri Nets (CPN) for assessment too. Eventually, the researchers evaluated a CPN-based executable model of architecture through CPN-Tools.
Performance Evaluation using Blackboard Technique in Software Architecture (Editor, IJCATR)
Validation of software systems is very useful at the early stages of their development cycle. Evaluation of functional requirements is supported by clear and appropriate approaches, but there is no similar strategy for the evaluation of non-functional requirements such as performance. Since establishing the non-functional requirements has a significant effect on the success of software systems, considerable effort is needed for their evaluation. Moreover, if software performance is specified based on performance models, it may be evaluated at the early stages of the software development cycle. Therefore, modeling and evaluating non-functional requirements at the software architecture level, which is designed at the early stages of the software development cycle and prior to implementation, is very effective.
We propose an approach for evaluating the performance of software systems based on the blackboard technique at the software architecture level. In this approach, the software architecture using the blackboard technique is first described by UML use case, activity and component diagrams; the UML model is then transformed into an executable model based on timed colored Petri nets (TCPN). Consequently, upon execution of the executable model and analysis of its results, non-functional requirements, including performance properties such as response time, may be evaluated at the software architecture level.
Using Data Mining to Identify COSMIC Function Point Measurement Competence (IJECEIAES)
COSMIC Function Point (CFP) measurement errors lead to budget, schedule and quality problems in software projects. Therefore, it is important to identify and plan requirements engineers' CFP training needs quickly and correctly. The purpose of this paper is to identify software requirements engineers' COSMIC Function Point measurement competence development needs by using machine learning algorithms and requirements artifacts created by engineers. The artifacts used were provided by a large service and technology company ecosystem in Telco. First, a feature set was extracted from the requirements model at hand. To prepare the data for educational data mining, requirements and COSMIC Function Point (CFP) audit documents were converted into a CFP data set based on the designed feature set. This data set was used to train and test the machine learning models, with two different experiment settings designed to reach statistically significant results. Ten different machine learning algorithms were used. Finally, algorithm performances were compared with a baseline and with each other to find the best performing models on this data set. In conclusion, REPTree, OneR, and Support Vector Machines (SVM) with Sequential Minimal Optimization (SMO) achieved top performance in forecasting requirements engineers' CFP training needs.
Software requirements prioritization is a recognized practice in requirements engineering (RE) that facilitates the management of stakeholders' subjective views as specified in their requirements listing. Since the RE process is collaborative in nature, its intensiveness from both the knowledge and human perspectives opens up the problem of decision making on requirements, which can be facilitated by requirements prioritization. However, due to the large volume of requirements elicited when considering an ultra-large-scale system, the prioritization techniques proposed so far suffer some setbacks in terms of efficiency, effectiveness and scalability. This paper employs a more efficient ranking algorithm for requirements prioritization, motivated by the limitations of existing techniques. The major objective is to provide a well-defined ranking procedure, based on analysis, suitable for prioritizing software requirements. An empirical evaluation of the proposed technique was made using a typical scenario of the Pharmacy Information System at the Obafemi Awolowo University Teaching Hospital Complex (OAUTHC) as a case study. The results showed the computation of the positive ideal solution (PIS) and negative ideal solution (NIS), as well as the closeness coefficient (CC), for 4 requirements across 3 stakeholders. The CC gave the final ranks of the requirements, where R4 with 2.09 points is the most valued requirement, while R1 and R2, with CCs of 1.37 and 1.05, were next in the order of priority respectively. The CC provides the medium through which problems of multiple criteria decision making can be handled, so as to determine the order of priority of the available alternatives. The paper conveys encouraging evidence for the software engineering community that the approach is capable of resolving redundantly specified requirements, thereby providing the potential to facilitate effective and efficient decision making in handling the differences amongst the requirements that have been prioritized. Thus, prioritizing software requirements with the recommended ranking procedure during software development is crucial in order to reduce development cost.
COMPARATIVE STUDY OF SOFTWARE ESTIMATION TECHNIQUES ijseajournal
Many information technology firms, among other organizations, have been working on how to estimate resources such as funds and effort during software development. Software development life cycles involve many activities and skills needed to avoid risks, and the most suitable software estimation technique should be employed. Therefore, in this research, a comparative study was conducted that considers the accuracy, usage, and suitability of existing methods. It is intended to be useful for project managers and project consultants during the whole software project development process. Both algorithmic and non-algorithmic techniques are considered: model, composite and regression techniques underlie COCOMO, COCOMO II, SLIM and multiple linear regression respectively, while expertise-based and rule-based approaches are applied in the non-algorithmic methods. However, these techniques need some advancement to reduce the errors that are experienced during the software development process. Therefore, in relation to software estimation techniques, this paper proposes a model that can be helpful to information technology firms, researchers and other organizations that use information technology in processes such as budgeting and decision making.
Maintaining the quality of the software is the major challenge in the process of software development.
Software inspections, which use methods like structured walkthroughs and formal code reviews, involve
careful examination of every aspect/stage of software development. In Agile software
development, refactoring helps to improve software quality. Refactoring is a technique for improving the
internal structure of software without changing its behaviour. After much study of the ways to
improve software quality, our research proposes an object-oriented software metric tool called
"MetricAnalyzer". The tool was tested on different codebases and proved to be very useful.
Project PlanFor our Project Plan, we are going to develop.docxwkyra78
Project Plan
For our Project Plan, we are going to develop a new information system for a mid-size organization that will automate payroll transactions. Since we already have all the information that the new system needs, we propose to use the Waterfall Development Methodology, because we move from one phase to the next.
Phases: Planning, Analysis, Design, and System Implementation.
Estimated Projected Time Frame:
For estimating the project time frame, the project manager will develop a preliminary estimation of how long it will take to build the new system. There are several sources the project manager can use to estimate the time frame: they can draw on projects that had similar tasks and technologies, experienced developers can provide the estimate, or the estimate can be based on the type of methodology being used. It is good practice to keep track of actual time and effort values during the SDLC so the data can be refined and used as a guide for future projects. For now, I will use the industry standard to estimate the project time frame. Within the industry standard for a typical business application system, 15% of the effort is spent in the planning phase, 20% in the analysis phase, 35% in the design phase, and 30% in the implementation phase. With the planning phase taking 5 person-months to complete, the total project effort comes to 5 / 0.15, or about 33.3 person-months.
Project time line:
Phase            Share of effort   Estimated effort
Planning         15%               5 person-months
Analysis         20%               6.66 person-months
Design           35%               11.66 person-months
Implementation   30%               10 person-months
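As a quick illustration of the arithmetic above, the following Python sketch (our own, using the figures assumed in this plan) derives the total effort and the per-phase efforts from the industry-standard split, given the 5 person-month planning estimate.

# Sketch: derive total and per-phase effort from the industry-standard split,
# assuming the planning phase is estimated at 5 person-months.
phase_share = {"Planning": 0.15, "Analysis": 0.20, "Design": 0.35, "Implementation": 0.30}

planning_effort = 5.0                                      # person-months (given)
total_effort = planning_effort / phase_share["Planning"]   # 5 / 0.15 = 33.3 person-months

for phase, share in phase_share.items():
    print(f"{phase:15s} {share:4.0%}  {total_effort * share:5.2f} person-months")

Up to rounding, this reproduces the per-phase figures listed in the table above.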
Developing the Work Plan:
Once the project schedule has been established, the project manager can start creating a work plan for the project. The work plan is a schedule that the project manager uses to record and track all the tasks that need to be accomplished over the entire project. The project manager will need to identify all the tasks that are needed and determine how long each task will take. The tasks are then organized within the work breakdown structure. For a main task to be complete, its subtasks have to be completed first.
Staffing the Project:
The project manager needs to figure out how many staff are needed for the project. The amount of staff needed depends on how fast they want to finish the project. An increase in staff does not necessarily mean an increase in productivity. If more staff are needed, make sure to have some kind of reporting structure. The project manager needs to know the staff's capabilities and assign tasks according to their skills. The project manager also needs to know how to motivate the staff for project success.
Coordinating Project Activities:
The project manager needs to have activities put in place during the entire SDLC. Activities are tools that are used to ensure that the project stays on track and that the chance of failure is kept to a minimum. CASE tools (computer-aided software engineering), standards, and documentation are all activi ...
Machine learning for text document classification-efficient classification ap...IAESIJAI
Numerous alternative methods for text classification have been created because of the increase in the amount of online text information available. The cosine similarity classifier is the most extensively utilized simple and efficient approach, and it improves text classification performance. It is combined with estimated values provided by conventional classifiers such as Multinomial Naive Bayes (MNB). Consequently, combining the similarity between a test document and a category with the estimated value for the category enhances the performance of the classifier. This approach provides a text document categorization method that is both efficient and effective. In addition, a method for determining the proper relationship between a set of words in a document and its document category is also obtained.
GENETIC-FUZZY PROCESS METRIC MEASUREMENT SYSTEM FOR AN OPERATING SYSTEMijcseit
The operating system (OS) is the most essential software of the computer system; deprived of it, the computer
system is totally useless. It is the frontier for assessing relevant computer resources, and its performance greatly
influences the user's overall objectives across the system. Related literature has tried different methods and
techniques to measure the process metric performance of the operating system, but none has incorporated
the use of genetic algorithms and fuzzy logic, which is indeed a novel approach.
Extending the work of Michalis, this research focuses on measuring the process metric performance of an
operating system utilizing a set of operating system criteria, while fusing fuzzy logic to handle
impreciseness and a genetic algorithm for process optimization.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, companies must adapt and embrace new ideas to keep up with the competition. However, fostering a culture of innovation takes much work: it takes vision, leadership and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to release software to market, along with traditionally slow and manual security checks, has caused gaps in continuous security, an important piece of the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their application supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on countries – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology pushes into IT, I was wondering, as an "infrastructure container Kubernetes guy", how this fancy AI technology gets managed from an infrastructure and operational point of view. Is it possible to apply our lovely cloud-native principles as well? What benefits could both technologies bring to each other?
Let me take these questions and provide you a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could be beneficial for or limiting to your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I already have working for real.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
joins and aggregation operations prior to the execution of queries and storing the results at the data
warehouse [7],[8]. This dramatically improves the response time of decision support queries.
However, the number of possible views increases exponentially with the number of database
dimensions. We need to find out which views should be materialized in order to improve query
performance under resource constraints. A greedy algorithm is adopted to choose the most
beneficial view per unit of storage space (Benefit-Per-Unit-Space) up to the given storage limit. The
algorithm assumes that there is a linear relationship between the cost of answering a user query
and the size of the view that is used to answer that query. This cost, which is the number of rows
in the view, is then used to select the most beneficial view for materialization. In this paper we
study the TPC-H Benchmark [6] to find the number of tables and views present in an OLAP
system.
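As a rough illustration of the selection rule just described (our own sketch, not code from the paper), the following Python snippet picks views by benefit per unit of storage space until a space budget is exhausted; ben(v, S) stands for an assumed callable returning the query-cost saving of materializing view v given the already-selected set S, and rows[v] for its size in rows.

# Minimal sketch of benefit-per-unit-space view selection under a storage budget.
# `rows` maps each candidate view to its row count; `ben(v, S)` is an assumed
# callable returning the benefit of materializing v given the selected set S.
def select_by_benefit_per_unit_space(rows, ben, root, space_budget):
    selected, used = {root}, rows[root]
    while True:
        fits = [v for v in rows if v not in selected and used + rows[v] <= space_budget]
        if not fits:
            return selected
        best = max(fits, key=lambda v: ben(v, selected) / rows[v])
        if ben(best, selected) <= 0:            # nothing left that reduces query cost
            return selected
        selected.add(best)
        used += rows[best]

A concrete benefit function for the row-count cost model is sketched alongside the greedy pseudocode in Section 5.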
2. CLASS POINT APPROACH
The Class Point approach provides a system-level estimation of the size of OO products. The
class point approach was introduced in 1998 [17]. In object-oriented development, the class
diagram has a great deal of quantification information based on the design document. It contains
the structural functionality of the target system and its class hierarchy, which are the logical
blocks of the developed system. This approach to size estimation focuses on 1) the local methods,
2) the interaction of the class, and 3) the attributes. The Class Point size estimation process is structured
into three main phases, corresponding to analogous phases in the FP approach. During the first
step the design specifications are analyzed in order to identify and classify the classes into four
types of system components, namely the problem domain type, the Human interaction type, the
data management type, and the task management type. During the second step, each identified
class is assigned a complexity level, which is determined on the basis of the local methods in the
class and of the interaction of the class with the rest of the system. The measures and the way
they are used to carry out this step represent the substantial difference between CP1 and CP2.
Indeed, in CP1 the complexity level of each class is determined on the basis of the Number of
External Methods (NEM) and the Number of Services Requested (NSR). The NEM measure of a
class in an object-oriented system is given by the number of public methods in a class. NSR is
used to measure the interconnection of system components. It is again applicable to a single class
and is determined by the number of different services requested from other classes. In CP2, besides
the above measures, the Number Of Attributes (NOA) measure is also taken into account in order
to calculate the complexity level of each class. Once a complexity level of each class has been
assigned, such information and class type are used to assign a weight to the class. Then, the Total
Unadjusted Class Point value (TUCP) is computed as a weighted sum. The Technical Complexity
Factor (TCF) of the application is determined by assigning the degree of influence that 18
general system characteristics have on the application. The sum of the influence degrees related
to such general system characteristics forms the Total Degree of Influence (TDI). Finally, the
Class Point value is determined by adjusting the TUCP with a value obtained by considering
global system characteristics as in FPA and some additional characteristics especially conceived
for object-oriented systems, namely:
• User Adaptivity
• Rapid Prototyping
• Multiuser Interactivity
• Multiple Interfaces.
3. RELATED WORK
Many metrics for size estimation have been proposed for procedure oriented systems among
which the Function points have achieved a wide acceptance in the estimation of size of business
systems [5]. The method provides estimation of size by measuring the functionality of the
system to be developed. This allows Function Point Analysis (FPA) to be applied in the early
phases of the lifecycle, which is the main reason for the success of the method. Function point
depends on the information available at the time of specifications [11]. Several measures have
been defined so far in order to estimate the size of software systems. Chidamber and Kemerer
defined six measures for assessing OO systems [12]. Among them the Weighted Method per
Class (WMC), the Number of Children and the Response for a Class have been used as size
measures. All of these measures are useful for productivity analysis as all of them provide class
level measurement. The Use Case Points approach was introduced by Karner [4] as a software
project effort estimation model. Use Case Point effort estimation is an extension of existing
estimation methods, such as function point analysis. This approach, however, has weak points
when applied to general software projects. In the recent past, other size measures have been
suggested as adaptations of the FP method to Object oriented systems [2]. Whitmire proposes the
application of his 3D Function Points to object-oriented software systems, by considering each
class as an internal file and messages sent across the system boundary as transactions [14].
However, 3D Function Points require a greater degree of detail in order to determine size and
consequently make early counting more difficult. The object point measure is another adaptation of
function point, used in the improved COCOMO 2.0 effort estimation technique [13]. Object Point
count is very similar to Function Point, but objects are taken as the basis of the counting process.
However, such objects are not directly related to objects in the OO methodology, but rather refer
to screens, reports, or 3GL modules. The class point approach is conceived by recasting the ideas
underlying the FP analysis within the OO paradigm and by combining well known OO measures
[1],[3],[9]. The Class Point size estimation process is structured in three main phases, uses two
levels of complexity measures, and classifies the system into component types.
4. OLAP SYSTEM
OLAP systems have become increasingly popular in many application areas as they considerably
ease the process of analysing large amounts of data stored in data warehouses [16]. A data
warehouse must have efficient OLAP tools to explore the data and to provide users with real
insight into the data in the data warehouse. Due to the large size of the data warehouse and the complexity
of queries, quick response time plays an important role, as timely access to information is the
basic requirement of an OLAP system. Unified Modelling Language (UML) diagrams represent
the static and dynamic aspects of an OLAP system. The UML class diagram in Figure 1 shows the
static structural behaviour of the OLAP system, in which operations are designed for the complete
system. The class diagram has persistent classes, like Dimensions, Facts and Views, and control
classes like ORB, Query Execution, OLAP API, OLAP operations, and Aggregation. These
classes are related to each other through associations. The access layer must be able to
translate the data-related requests from the user or business layer, i.e., it must be able to create the
correct SQL statement and execute it. Server programs generally receive requests from the client
programs and execute database retrieval and updates. Each portion of the database
is managed by a server, a process which is responsible for controlling access and retrieval of data
from that database portion. The server dispenses information to client applications. The client and the
server processes communicate through a well-defined set of standard application program
interfaces (APIs). The data model incorporated into a database system defines a framework of
concepts that can be used to express the problem domain. Materialized views within the data
warehouse are transparent to the end user or to the database application. Materialized views are
usually accessed through the query rewrite mechanism. If a materialized view is to be used by
query rewrite, it must be stored in the same database as the fact or detail tables on which it relies.
The motivation for using materialized views is to improve performance. Materialized view
management activities include measuring the space to be used by materialized views,
determining which existing materialized views should be dropped, and ensuring that all materialized
views are refreshed properly each time the database is updated. In such a system, aggregates play
a very important role, because an OLAP query is usually an aggregated view on existing
(relational) data. In SQL, common aggregate functions include COUNT, SUM,
AVG, MIN and MAX. Execution of a query by the relational engine involves parsing of the
submitted statements, optimization of the SQL statements, compilation of the code, and
generation of the query execution plan. During execution, programs call the storage engine to
retrieve or manipulate the data stored in the database. A database administrator adds OLAP
metadata to a data warehouse. The end result is the creation of one or more measure folders that
contain one or more measures. The measures have dimensions, and the dimensions have
hierarchies, levels, and attributes. An OLAP API gives access only to the measures that are
contained in measure folders. Conceiving data as a cube with hierarchical dimensions leads to
conceptually straightforward operations to facilitate analysis. Common OLAP operations include
roll up, slice and dice, drill down and pivot. A roll-up involves summarizing the data along a
dimension. Drill Down allows the user to move from the current data cube to a more detailed data
cube. Slice is the act of picking a rectangular subset of a cube by choosing a single value for one
of its dimensions, creating a new cube with one fewer dimension. The dice operation produces a
subcube by allowing the analyst to pick specific values of multiple dimensions. The Object
Request Broker (ORB) is a process which sends and receives messages to resources and other
services distributed across multiple application servers. CORBA object request brokers (ORBs)
implement a communication channel through which applications can access object interfaces and
request data and services. CORBA requires an Object Request Broker both on the OLAP API
client computer and on the OLAP Services computer. When an application calls a method that
requires an interaction with an OLAP server, the client ORB intercepts the call, interacts with the
OLAP server's ORB to find the object on the server side that can implement the request, passes
the parameters, invokes the object's method, and returns the results. The client application forms
the front end of the system which the user sees and interacts with. The end-user query model
identifies all the conceptual query objects with which the application user interface will deal, thus
taking advantage of the strengths of object-oriented design. It allows for a clear correspondence
between user interface objects and OLAP API objects. The steps to calculate the class
point for OLAP Systems are as follows:
1. Identification of classes
2. Determination of complexity of classes
3. Calculation of unadjusted class point
4. Calculation of technical complexity factor
5. Calculation of class point
Figure 1.Class diagram
Step 1.
From the class diagram of the OLAP system, the classes are classified into typical PDT, HIT, DMT
and TMT classes as given in Table 1.
Step 2.
The class point method uses two complexity level measures, CP1 and CP2. In CP1 the complexity
level of each class is determined based on the number of external methods and the number of services
requested. Both measures are available in the design documentation. The CP1 measure can be
used in early phases of software development, while the CP2 measure can be used only when the
number of attributes is available [1]. Thus, considering only CP1, the complexity and the weights
associated with the various classes forming an OLAP system are given in Table 2.
Table 1. Class types and the corresponding OLAP system classes

Class type                     Description                                       OLAP Systems
Problem Domain Type (PDT)      Represents real world entities in the             Dimensions and Facts
                               application domain
Human Interface Type (HIT)     Satisfies the need for visualizing information    User Interface
                               and human computer interactions
Data Management Type (DMT)     Includes classes which incorporate data           OLAP server, DW server, MVS,
                               storage and retrieval                             Metadata, Mat. views
Task Management Type (TMT)     Includes classes which are responsible            ORB, Query Exec, OLAP API,
                               for tasks                                         OLAP operations, Aggregation

Table 2. Complexity levels and CP1 weights of the OLAP system classes

Class            Type   NEM   NSR   Complexity   Weight
Fact, Dimension  PDT    3     0     Low          3
MVS              DMT    0     2     Avg.         8
OLAP server      DMT    2     10    Avg.         8
DW Server        DMT    1     7     High         13
Metadata         DMT    3     2     Low          5
Aggregate        DMT    5     2     Avg.         8
Mat. Views       DMT    3     0     Low          5
ORB              TMT    3     0     Low          4
OLAP API         TMT    4     8     Avg.         6
Query Exec       TMT    4     4     Avg.         6
OLAP Operation   TMT    5     5     High         9
User Interface   HIT    0     5     Avg.         7

Step 3.
Assuming that there are n dimensions and facts and m materialized views, we can compute the total
unadjusted class point value. The TUCP is computed as the weighted total of the four components of
the application:

TUCP = Σ_{i=1..4} Σ_{j=1..3} w_ij × x_ij

where x_ij is the number of classes of component type i (problem domain, human interaction, etc.)
with complexity level j (low, average, or high), and w_ij is the weighting value for type i and
complexity level j.
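As an illustrative reading of Table 2 (our own sketch, not part of the paper), the fixed classes of the OLAP system contribute a constant weight of 74, while each fact/dimension class adds 3 and each materialized view adds 5, which reproduces the 74 + 3n + 5m form used later in eq. 1. A minimal Python sketch:

# Sketch: TUCP for the OLAP system, using the CP1 weights of Table 2.
# n = number of fact/dimension classes, m = number of materialized views.
FIXED_CLASS_WEIGHTS = {
    "MVS": 8, "OLAP server": 8, "DW Server": 13, "Metadata": 5, "Aggregate": 8,
    "ORB": 4, "OLAP API": 6, "Query Exec": 6, "OLAP Operation": 9, "User Interface": 7,
}                                            # these weights sum to 74

def tucp(n, m):
    # 3 per PDT (fact/dimension) class, 5 per materialized-view class
    return sum(FIXED_CLASS_WEIGHTS.values()) + 3 * n + 5 * m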
Step 4.
The Technical Complexity Factor (TCF) is determined by assigning the degree of influence
(ranging from 0 to 5) that 18 general system characteristics have on the application from the
designer’s point of view [1]. The sum of the influence degrees related to such general system
characteristics forms the Total Degree of Influence (TDI), which is used to determine the TCF
according to the following formula:
TCF = 0.55 + 0.01 × Σ_{i=1..18} f_i
For OLAP systems, characteristics like data communication, distributed functions, multiple
users, ease of operation and adaptability by the user have a strong or significant influence on the development
of the system, whereas characteristics like transaction rate, online data entry, online update and
multiuser interaction have little or no influence on the system. The other
characteristics, performance, reusability, compiler processing and high-end configuration, have an
average influence on the system. Based on the influence of these characteristics, the total degree of
influence is 55 and the TCF is calculated as
TCF = 0.55 + 0.01 × 55 = 0.55 + 0.55 = 1.10
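A tiny Python sketch of the same computation (ours, for illustration): the 18 influence degrees, each between 0 and 5, are summed into the TDI and plugged into the TCF formula.

# Sketch: TCF from the 18 general system characteristics (degrees 0..5 each).
def tcf(influence_degrees):
    assert len(influence_degrees) == 18 and all(0 <= f <= 5 for f in influence_degrees)
    tdi = sum(influence_degrees)             # Total Degree of Influence
    return 0.55 + 0.01 * tdi                 # a TDI of 55 gives TCF = 1.10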
Step 5.
The final value of the Adjusted Class Point (CP) is obtained by multiplying the Total Unadjusted
Class Point value by the TCF:
CP = TUCP × TCF
The CP count can vary with respect to the unadjusted count from -45 percent to +45 percent due to the
adjustment factor. For OLAP systems the final class point value is given as
CP = 1.10 × (74 + 3n + 5m)    (eq. 1)
where n is the number of facts and dimensions and m is the number of materialized views.
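Putting the two steps together, a minimal Python sketch of eq. 1 (our own illustration; the constant 74 is the summed weight of the fixed classes in Table 2):

# Sketch: CP1 for an OLAP system per eq. 1, with n facts/dimensions and
# m materialized views, adjusted by the technical complexity factor.
def class_point(n, m, tcf_value=1.10):
    return tcf_value * (74 + 3 * n + 5 * m)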
5. EXPERIMENTAL STUDY
In order to determine the values of m and n in the above formula, we have used the TPC-H
Benchmark for experimentation and illustration [6]. The TPC Benchmark H models a data
warehouse for any organization which must sell, distribute or manage a product worldwide. The
database has data about each such transaction over a period of seven years. The TPC-H database
is defined to consist of eight separate tables. The names of the tables indicate their
contents: part, supplier, partsupp, customer, nation, region, lineitem and orders. The queries and
the data populating the database have been widely used in research as they have industry-wide
relevance. Using table-class mapping, where a single table is mapped to a single class, we have
obtained the value of n as 8 [2]. In order to obtain the value of m, i.e., the number of materialized
views, we need to understand the lattice structure of a cube and the concept behind the greedy
algorithm for selecting the views. The challenge for the design of OLAP systems is the
exponential explosion of possible views as the number of dimensions increases. If D is the
number of dimensions and h_i the number of hierarchical levels in dimension i, then the general
equation for calculating the number of possible views is:

Possible views = Π_{i=1..D} h_i
As dimensionality increases linearly, the number of possible views explodes exponentially; for
example, three dimensions with three hierarchical levels each already yield 3 × 3 × 3 = 27 candidate
views. An OLAP system cannot materialize all the views in a given lattice structure because of
constraints on storage space, computational time and view maintenance cost. Typically, a strategic
subset of views must be selected for materialization; there is a need to select an optimal set of views
to be materialized. Using a greedy-based approach for view selection, we select a beneficial view at
each step that fits within the space available for view materialization. The greedy algorithm
considers Cost(vw), the cost associated with each view, based on the number of rows in the
view. Here k is the number of views to be materialized in addition to the root view. After selecting a
set S of views, the benefit of view vw relative to S is denoted by Ben(vw,S). For selecting a set
of k views to be materialized, the greedy algorithm is given below:
S = {root view};
for i = 1 to k do begin
select that view vw not in S such that Ben(vw,S) is maximized;
S = S union {vw};
end;
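The following Python rendering of the pseudocode above is our own sketch. The benefit Ben(vw, S) follows the linear cost model described in the introduction: the cost of answering a query equals the row count of the smallest materialized view that can answer it. The lattice encoding (rows and derivable) is hypothetical, and the root view is assumed to answer every query.

# Sketch (ours): greedy selection of k views under the row-count cost model.
# rows[v]      : number of rows in view v (its cost in the linear model)
# derivable[v] : set of views/queries that can be answered from v
def ben(vw, S, rows, derivable):
    # Ben(vw, S): total reduction in query cost if vw is materialized.
    saving = 0
    for q in derivable[vw]:                              # queries answerable from vw
        cheapest_now = min(rows[u] for u in S if q in derivable[u])
        saving += max(0, cheapest_now - rows[vw])
    return saving

def greedy_select(rows, derivable, root, k):
    S = {root}                                           # S = {root view};
    for _ in range(k):                                   # for i = 1 to k
        candidates = [v for v in rows if v not in S]
        if not candidates:
            break
        vw = max(candidates, key=lambda v: ben(v, S, rows, derivable))
        S = S | {vw}                                     # S = S union {vw};
    return S

# Toy example echoing the figures quoted below: materializing a 45,000-row view
# saves 6001192 - 45000 row reads for every query it can answer.
# rows = {"view0": 6001192, "view6": 45000}
# derivable = {"view0": {"view0", "view6"}, "view6": {"view6"}}
# greedy_select(rows, derivable, "view0", k=1)  ->  {"view0", "view6"}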
For our experimental study, we populated a 1-GB benchmark database. We populated the
root node from this database using the Customer (C), Part (P) and Month (M) dimensions, and the
"Sale" measure. The customer dimension has 3 levels of hierarchy: customer, nation and all. The
part dimension has 3 levels of hierarchy: part, part type and all. The sales are analyzed at three levels
of the time hierarchy: month, year and all. We ran the greedy algorithm on the lattice of Figure 2 using
the TPC-H database as described above. When k=5, the greedy algorithm picks the root view
(view 0) and then view 6, whose benefit is maximum, as the views below it are each improved
from 6001192 to 45000 rows when view 6 is used in place of view 0. Similarly, view 9, view 10 and
view 11 are picked in the order of their respective benefits. Table 3 shows the results for the
number of views to be materialized derived after applying the greedy algorithm. In the first problem
instance, we imposed a constraint of 5 views to be materialized; in the second problem instance,
we set it to 10 views; and in the third, we set it to 20 views.
Figure 2: Lattice structure with actual number of rows (Source: TPC-H Benchmark database)
Table 3
Lattice Problem Instance Optimal solution
(Set of Materialized Views)
TPC-H
Materialize 5 views 0,6,9,10,11
Materialize 10 views 0,2,6,9,10,11,12,15,20,21
Materialize 20 views 0,2,3,4,5,6,7,8,
9,10,11,12,13,14,15,16,17,18,20,21,24
Figure 3
The views picked by the greedy algorithm with maximum benefit in terms of storage space are
shown in the table. The Y-axis in the graph of Figure 3 shows the total time taken as well as the
space used; the X-axis shows the number of views picked. From the graph we can make a clear
decision on when to stop materializing views. It is clear from the graph that for the first 10 views the
query time, in terms of number of rows, is reduced substantially. However, we have observed that
performance after materializing 10 views remains constant, and hence we do not materialize the
remaining possible views. Knowing the number of tables and the number of views, we can calculate
the value of CP1 as given in eq. 1, which equals 158. The effort is defined using regression
analysis [1] as Effort = 0.843 × CP1 + 241.85. Thus the effort comes to approximately 380 person-hours.
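A one-line Python sketch of this effort regression (ours, for illustration):

# Sketch: effort regression from [1], applied to a CP1 value (person-hours).
def effort_person_hours(cp1):
    return 0.843 * cp1 + 241.85
# effort_person_hours(158) evaluates to about 375, i.e. roughly the 380
# person-hours reported above.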
6. VALIDATION AND CONCLUSION
Size estimation is one of the critical tasks in object oriented software project management. It is
widely accepted that system size is strongly correlated with development effort [13], [18], [19],
[20]. The Class Point approach provides a system-level size measure by suitably combining well
known OO measures, which consider specific aspects of a single class. In particular, two
measures are proposed, namely, CP1 and CP2. We have used CP1 at the beginning of the
development process of an OLAP system to carry out a preliminary size estimation. In this paper we
have used the UML class diagram document for size estimation of OLAP systems. The results
of our approach are validated against websites like http://datawarehouse.ittoolbox.com, which
estimate the effort of OLAP systems, based on their complexity, as 1) simple OLAP: 2
weeks, 2) medium OLAP: 2 months, 3) complex OLAP: 2 years. Assuming a medium-case
OLAP system where the data warehouse already exists, our approach to size estimation is
very close to the industry estimated values. Future work may include refinement of the size estimation
by applying CP2.
REFERENCES
[1] Gennaro Costagliola and Genoveffa Tortora, "Class Point: An Approach for the Size Estimation of
Object-Oriented Systems", IEEE Transactions on Software Engineering, Vol. 31, No. 1, January
2005, pp. 52-74.
[2] Ali Bahrami, Object-Oriented System Development , International edition , 1999.
[3] Wei Zhou and Qiang Liu, "Extended Class Point Approach of Size Estimation for OO Product",
IEEE-sponsored 2nd International Conference on Computer Engineering and Technology, 2010,
pp. 117-122.
[4] Parastoo Mohagheghi, Bente Anda, and Reidar Conradi, “Effort Estimation of Use Cases for
Incremental large-Scale Software Development, ” Proceedings of the International Conference on
Software Engineering (ICSE’05), pp. 303-311, May 15-21. 2005.
[5] Roger S. Pressman, Software Engineering: A Practitioner's Approach, McGraw-Hill, 2005.
[6] TPC Benchmark H (Decision Support) Revision 2.14.4. http://www.tpc.org.
[7] Yang J., Karlapalem K., Li Q., Algorithms For Materialized View Design in Data Warehousing
Environment. VLDB 1997.
[8] Andreas Rauber, Philipp Tomsich, "An Architecture for Modular On-Line Analytical Processing
Systems: Supporting Distributed and Parallel Query Processing Using Co-operating CORBA Objects",
DEXA Workshop, 1999, pp. 45-49.
[9] Aditi Kapoor, Parul Pandey, "Fuzzy Class Point Approach", IJRTET, Volume 5, Issue 1, 2011.
[10] V. Harinarayan, A. Rajaraman, and J. D. Ullman, "Implementing Data Cubes Efficiently". A full
version of this paper is available at http://db.stanford.edu/pub/harinarayan/1995/cube.ps
[11] J.B. Dreger, Function Point Analysis. Prentice Hall, 1989.
[12] C.F. Kemerer and B.S. Porter, “Improving the Reliability of Function Point Measurement: An
Empirical Study,” IEEE Trans. Software Eng., vol. 18, no. 11, pp. 1011-1024, Nov. 1992.
[13] P. Nesi and T. Querci, “Effort Estimation and Prediction of Object- Oriented Systems,” J. Systems
and Software, vol. 42, pp. 89-102, 1998.
[14] S.A. Whitmire, “3D Function Points: Applications for Object- Oriented Software,” Proc. ASM’96
Conf., 1996.
[15] L.C. Briand, S. Morasca, and V.R. Basili, "Property-Based Software Engineering Measurement",
IEEE Transactions on Software Engineering, vol. 22, no. 1, pp. 68-86, 1996.
[16] Surajit Chaudhuri, Umeshwar Dayal, "An Overview of Data Warehousing and OLAP Technology",
SIGMOD Record, Vol. 26, No. 1, 1997, pp. 65-74.
[17] Smith R. K, “Effort Estimation in Component-Based Software Development: Identifying
Parameters”, In the proceedings of the 36th ACM Southeast Regional Conference, Marietta, Georgia,
USA, pp 323- 325,1998.
[18] V.B. Misic and D.N. Tesic, “Estimation of Effort and Complexity: an Object Oriented Case Study,”
J. Systems and Software, vol. 41, pp. 133-143, 1999.
[19] S. Moser, B. Henderson-Sellers, and V.B. Misic, “Cost Estimation Based on Business Models,” J.
Systems and Software, vol. 49, pp. 33- 42, 1999.
[20] B.W. Boehm, B. Clark, and E. Horowitz, "Cost Models for Future Life Cycle Processes: COCOMO 2.
0,” Ann. Software Eng., vol. 1, no. 1, pp. 1-24, 1995.
Authors
Madhu Bhan. Faculty at M.S. Ramaiah Institute of Technology, Bangalore, India.
Currently pursuing a Ph.D. Areas of interest are Databases and Data Warehousing.
Dr. T.V. Suresh Kumar. Professor at M.S. Ramaiah Institute of Technology,
Bangalore, India. Areas of interest include Software Performance Engineering and Object
Technology.
Dr. K. Rajanikanth. Ph.D. from the Indian Institute of Science (IISc), Bangalore, India.
Currently working as Chairman and Advisor for many bodies of Visvesvaraya Technological
University. Areas of interest include Ad hoc Networks, Image Processing,
Microprocessors and Microcontrollers, and Data Warehouses.