This document summarizes a research paper that proposes DAG-WGAN, a new model for causal structure learning. DAG-WGAN combines a Wasserstein generative adversarial network with an auto-encoder architecture and learns causal structures by generating synthetic data that matches the real data distribution. An acyclicity constraint ensures that the learned causal graphs are directed acyclic graphs. Experimental results show that DAG-WGAN outperforms other models at learning causal structures from high-dimensional tabular data, demonstrating the benefit of the Wasserstein distance metric in the generative process of causal structure learning.
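The summary above mentions an acyclicity constraint. The standard differentiable formulation in this family of models (the NOTEARS penalty; whether DAG-WGAN uses exactly this form is an assumption here) scores a weighted adjacency matrix W with h(W) = tr(exp(W ∘ W)) − d, which is zero precisely when the graph is acyclic. A minimal pure-Python sketch:

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def acyclicity(W, terms=20):
    """h(W) = tr(exp(W o W)) - d via a truncated Taylor series."""
    d = len(W)
    H = [[W[i][j] ** 2 for j in range(d)] for i in range(d)]  # Hadamard square W o W
    # Accumulate exp(H) = I + H + H^2/2! + ... term by term.
    E = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    P = [row[:] for row in E]  # running power H^k
    fact = 1.0
    for k in range(1, terms):
        P = mat_mul(P, H)
        fact *= k
        for i in range(d):
            for j in range(d):
                E[i][j] += P[i][j] / fact
    return sum(E[i][i] for i in range(d)) - d

dag = [[0.0, 0.8], [0.0, 0.0]]   # edge 0 -> 1 only: acyclic, h = 0
cyc = [[0.0, 0.8], [0.5, 0.0]]   # edges both ways: a cycle, h > 0
```

During training, h(W) is driven to zero (e.g. via an augmented Lagrangian), which is what forces the generator's learned structure to be a DAG.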
Clustering high-dimensional data, which arises in almost all fields these days, is becoming a very tedious process. The key disadvantage of high-dimensional data is the curse of dimensionality: as datasets grow, the data points become sparse and local density decreases, making the data difficult to cluster and further degrading the performance of traditional clustering algorithms. Semi-supervised clustering algorithms aim to improve clustering results using limited supervision. The supervision is generally given as pairwise constraints; such constraints are natural for graphs, yet most semi-supervised clustering algorithms are designed for data represented as vectors [2]. In this paper, we unify vector-based and graph-based approaches. We first show that a recently proposed objective function for semi-supervised clustering based on Hidden Markov Random Fields, with squared Euclidean distance and a certain class of constraint penalty functions, can be expressed as a special case of the global kernel k-means objective [3]. A recent theoretical connection between global kernel k-means and several graph clustering objectives then enables us to perform semi-supervised clustering of data. In particular, some methods have been proposed for semi-supervised clustering based on pairwise similarity or dissimilarity information. We propose a kernel approach for semi-supervised clustering and present two special cases of this approach in detail.
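As a concrete illustration of the pairwise-constrained clustering the abstract describes, the sketch below performs one k-means assignment pass in which squared Euclidean distance is penalized for violated must-link and cannot-link constraints. The penalty weight `w` and the function names are illustrative assumptions, not the paper's exact HMRF formulation:

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centers, must_link, cannot_link, labels, w=10.0):
    """One constrained assignment pass over all points."""
    new = list(labels)
    for i, p in enumerate(points):
        best, best_cost = new[i], float("inf")
        for c, center in enumerate(centers):
            cost = sq_dist(p, center)
            for a, b in must_link:      # penalty if a must-link pair is split
                if (i == a and new[b] != c) or (i == b and new[a] != c):
                    cost += w
            for a, b in cannot_link:   # penalty if a cannot-link pair is joined
                if (i == a and new[b] == c) or (i == b and new[a] == c):
                    cost += w
            if cost < best_cost:
                best, best_cost = c, cost
        new[i] = best
    return new

points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
centers = [(0.0, 0.0), (5.0, 5.0)]
# Point 1 starts mislabelled; the must-link with point 0 pulls it back.
labels = assign(points, centers, must_link=[(0, 1)], cannot_link=[(1, 2)],
                labels=[0, 1, 1, 1])
```

Replacing `sq_dist` with a kernel-induced distance gives the kernelized variant the paper develops.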
AN IMPROVED CTGAN FOR DATA PROCESSING METHOD OF IMBALANCED DISK FAILURE (IJCI JOURNAL)
This work addresses the problem of insufficient disk failure data and the imbalance between the numbers of normal and failure samples. Existing Conditional Tabular Generative Adversarial Network (CTGAN) deep learning methods have been proven effective for imbalanced disk failure data, but CTGAN cannot learn the internal information of disk failure data very well. In this paper, a fault diagnosis method based on an improved CTGAN is proposed: a classifier for specific-category discrimination is added, and a discriminator based on a residual network is introduced, yielding the Residual Conditional Tabular Generative Adversarial Network (RCTGAN). First, a residual network is utilized to enhance the stability of the system, and RCTGAN uses a small amount of real failure data to synthesize fake fault data. Then, the synthesized data is mixed with the real data to balance the amounts of normal and failure data. Finally, four classifier models (multilayer perceptron, support vector machine, decision tree, and random forest) are trained on the balanced data set, and their performance is evaluated using the G-mean. The experimental results show that the data synthesized by RCTGAN can further improve the fault diagnosis accuracy of the classifiers.
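The abstract evaluates classifiers with the G-mean, the geometric mean of sensitivity and specificity, a standard metric for imbalanced problems because it collapses toward zero whenever a model ignores the minority (failure) class. A minimal binary-case implementation:

```python
import math

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for binary labels (1 = failure)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0   # recall on failures
    specificity = tn / (tn + fp) if tn + fp else 0.0   # recall on normals
    return math.sqrt(sensitivity * specificity)

# Sensitivity 0.5, specificity 0.75 -> G-mean = sqrt(0.375)
score = g_mean([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
```

Unlike plain accuracy, this score cannot be inflated by predicting the majority class for everything, which is exactly why it suits the rebalancing experiments described above.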
DISK FAILURE PREDICTION BASED ON MULTI-LAYER DOMAIN ADAPTIVE LEARNING (IJCI JOURNAL)
Large-scale data storage is susceptible to failure. As disks are damaged and replaced, traditional machine learning models, which rely on historical data to make predictions, struggle to accurately predict disk failures. This paper presents a novel method for predicting disk failures by leveraging multi-layer domain-adaptive learning. First, disk data with numerous faults is selected as the source domain, and disk data with fewer faults as the target domain. A feature extraction network is then trained on the selected source and target domains, and contrasting the two domains facilitates the transfer of diagnostic knowledge from source to target. The experimental findings demonstrate that the proposed technique can produce a reliable prediction model and improve the ability to predict failures on disk data with few failure samples.
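The transfer step depends on measuring the discrepancy between source- and target-domain features. One common choice in domain-adaptive training, assumed here purely for illustration since the abstract does not name the exact measure, is a Maximum Mean Discrepancy between the domains' feature means, which the feature extractor is trained to shrink:

```python
def mmd_linear(source, target):
    """Linear-kernel MMD: squared distance between domain feature means."""
    d = len(source[0])
    mean_s = [sum(x[i] for x in source) / len(source) for i in range(d)]
    mean_t = [sum(x[i] for x in target) / len(target) for i in range(d)]
    return sum((a - b) ** 2 for a, b in zip(mean_s, mean_t))

# Identical domains have zero discrepancy; shifted domains score high.
same = mmd_linear([(0.0, 0.0), (1.0, 1.0)], [(0.0, 0.0), (1.0, 1.0)])
far = mmd_linear([(0.0, 0.0), (1.0, 1.0)], [(5.0, 5.0), (6.0, 6.0)])
```

A multi-layer variant would compute such a discrepancy at several layers of the feature extractor and add the sum to the training loss.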
SCALABLE SEMI-SUPERVISED LEARNING BY EFFICIENT ANCHOR GRAPH REGULARIZATION (Nexgen Technology)
Clustering, also known as data segmentation, aims to partition a data set into groups (clusters) according to similarity. Cluster analysis has been studied extensively, and many algorithms exist for different types of clustering, but these classical algorithms cannot be applied to big data because of its distinct features; applying traditional techniques to large unstructured data is a challenge. This study proposes a hybrid model to cluster big data using the well-known traditional K-means clustering algorithm. The proposed model consists of three phases: a Mapper phase, a Clustering phase, and a Reduce phase. The first phase uses a map-reduce algorithm to split the big data into small datasets; the second phase runs the traditional K-means algorithm on each of the split small datasets; and the last phase produces the overall clusters for the complete data set. Two functions, Mode and Fuzzy Gaussian, were implemented and compared in the last phase to determine the more suitable one. The experimental study used four benchmark big data sets: Covtype, Covtype-2, Poker, and Poker-2. The results demonstrate the efficiency of the proposed model for clustering big data with the traditional K-means algorithm, and the experiments show that the Fuzzy Gaussian function produces more accurate results than the traditional Mode function.
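The three phases described above can be sketched in plain Python, with an in-process list split standing in for a real MapReduce runtime. The function names and the merge-by-re-clustering reduce step are illustrative assumptions, not the paper's Mode or Fuzzy Gaussian functions:

```python
import random

def kmeans(points, k, iters=20):
    """Classical Lloyd k-means on a list of coordinate tuples."""
    centers = random.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            c = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))
            groups[c].append(p)
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers

def map_phase(data, n_splits):       # split big data into small datasets
    return [data[i::n_splits] for i in range(n_splits)]

def cluster_phase(splits, k):        # classical k-means on each split
    return [kmeans(split, k) for split in splits]

def reduce_phase(local_centers, k):  # merge per-split centroids into global clusters
    return kmeans([c for cs in local_centers for c in cs], k)

random.seed(0)
data = ([(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(50)] +
        [(random.gauss(5, 0.3), random.gauss(5, 0.3)) for _ in range(50)])
global_centers = reduce_phase(cluster_phase(map_phase(data, 4), 2), 2)
```

The reduce step here simply re-clusters the per-split centroids; the paper's Mode and Fuzzy Gaussian functions are alternative ways of performing that final merge.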
Choosing allowability boundaries for describing objects in subject areas (IAESIJAI)
Anomaly detection is one of the most promising problems for study; detectors can be used both as independent units and as preprocessing tools before solving fundamental data mining problems. This article proposes a method for detecting specific errors with the involvement of subject-area experts to fill in knowledge. The method hypothesizes that outliers lie close to the logical boundaries of intervals derived from feature pairs, and that the interval ranges vary across domains. We construct intervals from paired feature values; while forming knowledge in a specific field, a domain specialist checks the logical allowability of objects against the range of the intervals, and if the objects are logical outliers, the specialist ignores or corrects them. We give the general algorithm for building the database with the proposed method as pseudo-code, and we provide comparison results with existing methods.
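The interval idea can be sketched as follows: derive, from trusted data, an allowability interval for each feature pair (here the min/max of the pairwise difference, an assumed choice standing in for the paper's exact construction), then flag for expert review any object whose pair values fall outside an interval:

```python
def pair_intervals(trusted):
    """For each feature pair (i, j), the [min, max] of x_i - x_j over trusted data."""
    d = len(trusted[0])
    intervals = {}
    for i in range(d):
        for j in range(i + 1, d):
            diffs = [row[i] - row[j] for row in trusted]
            intervals[(i, j)] = (min(diffs), max(diffs))
    return intervals

def is_allowable(obj, intervals):
    """True if every pairwise value of obj lies inside its learned interval."""
    return all(lo <= obj[i] - obj[j] <= hi
               for (i, j), (lo, hi) in intervals.items())

trusted = [(1.0, 2.0), (2.0, 3.5), (0.0, 1.2)]   # expert-validated objects
iv = pair_intervals(trusted)
```

Objects that fail the check are exactly the "logical outliers" handed to the domain specialist for correction or rejection.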
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACH (cscpconf)
Due to the intangible nature of software, accurate and reliable software effort estimation is a challenge in the software industry. Very accurate estimates of software development effort are unlikely because of the inherent uncertainty in software development projects and the complex, dynamic interaction of factors that impact software development. Heterogeneity exists in software engineering datasets because the data comes from diverse sources; it can be reduced by relating data values through classification into different clusters. This study focuses on how the combination of clustering and regression techniques can reduce the loss of predictive efficiency caused by data heterogeneity: a clustered approach creates subsets of data with a degree of homogeneity that enhances prediction accuracy. It was also observed in this study that ridge regression performs better than the other regression techniques used in the analysis.
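The cluster-then-regress idea can be sketched with a one-feature ridge regressor per cluster; one-dimensional ridge through the origin has the closed form w = Σxy / (Σx² + λ). The size-threshold grouping rule below is a simplistic stand-in for a real clustering step, and the data is made up for illustration:

```python
def ridge_1d(xs, ys, lam=1.0):
    """Closed-form one-feature ridge slope: sum(x*y) / (sum(x^2) + lambda)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def clustered_ridge(points, split):
    """points: (size, effort) pairs; split: size threshold separating the clusters."""
    small = [(x, y) for x, y in points if x < split]
    large = [(x, y) for x, y in points if x >= split]
    return {"small": ridge_1d(*zip(*small)), "large": ridge_1d(*zip(*large))}

# Toy heterogeneous data: small projects scale ~2x, large ones ~5x.
data = [(1, 2), (2, 4), (3, 6), (10, 50), (12, 60), (14, 70)]
models = clustered_ridge(data, split=5)
```

A single regressor over all six points would average the two regimes away; fitting per cluster preserves the distinct effort ratios, which is the homogeneity benefit the study reports.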
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Data mining is used to manage the huge amounts of information stored in data warehouses and databases and to discover the required information. Numerous data mining techniques have been proposed, such as association rules, decision trees, neural networks, and clustering, and the field has attracted attention for many years. Among the available data mining strategies, clustering is one of the most effective: it groups a dataset into a number of clusters according to predefined rules and can reveal relationships between the different characteristics of the data.
In the k-means clustering algorithm, a distance function is selected based on its relevance for the data, and the Euclidean distance between a cluster's centroid and the data objects outside the cluster is computed to cluster the data points. In this work, the author enhances the Euclidean distance formula to increase cluster quality.
The problems of accuracy and of redundant dissimilar points within clusters remain in the improved k-means, so a new enhanced approach is proposed that applies a similarity function to check a point's similarity level before including it in a cluster.
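The similarity gate described in the last paragraph can be sketched as follows: a point joins the nearest cluster only if its similarity to that centroid clears a threshold. Cosine similarity and the 0.9 threshold are assumed choices for illustration, not the paper's exact similarity function:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def gated_assign(point, centroids, threshold=0.9):
    """Nearest-centroid assignment, but only if the similarity check passes."""
    dists = [sum((x - y) ** 2 for x, y in zip(point, c)) for c in centroids]
    best = dists.index(min(dists))
    if cosine(point, centroids[best]) < threshold:
        return None  # too dissimilar: excluded to protect cluster quality
    return best
```

Holding out low-similarity points is what prevents the redundant dissimilar points the paragraph above complains about from being forced into a cluster.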
A Novel Clustering Method for Similarity Measuring in Text Documents (IJMER)
International Journal of Modern Engineering Research (IJMER) is a peer-reviewed online journal. It serves as an international archival forum for scholarly research related to engineering and science education.
A survey on methods and applications of meta-learning with GNNs (Shreya Goyal)
This survey provides a comprehensive review of works combining graph neural networks (GNNs) and meta-learning, with a thorough summary of the methods and applications in each category. The application of meta-learning to GNNs is a growing and exciting field, and many graph problems stand to benefit immensely from the combination of the two approaches.
An approach for improved students’ performance prediction using homogeneous ... (IJECEIAES)
Web-based learning technologies of educational institutions store massive amounts of interaction data, which can help predict students’ performance with the aid of machine learning algorithms. Accordingly, various researchers have studied ensemble learning methods, which are known to improve the predictive accuracy of traditional classification algorithms. This study proposed an approach for enhancing the performance prediction of different single classification algorithms by using them as base classifiers of homogeneous ensembles (bagging and boosting) and heterogeneous ensembles (voting and stacking). The model considered single classifiers such as multilayer perceptron or neural networks (NN), random forest (RF), naïve Bayes (NB), J48, JRip, OneR, logistic regression (LR), k-nearest neighbor (KNN), and support vector machine (SVM) as candidate base classifiers of the ensembles, and used the University of California Irvine (UCI) open-access student dataset to predict students’ performance. The comparative analysis of the model’s accuracy showed that the best-performing single classifier’s accuracy increased from 93.10% to 93.68% when used as a base classifier of a voting ensemble method. Moreover, the results showed that the voting heterogeneous ensemble performed slightly better than the bagging and boosting homogeneous ensemble methods.
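The voting ensemble behind the best result above reduces to majority voting over the base classifiers' predictions. A minimal sketch, with trivial hard-coded predictions standing in for the NN/RF/NB and other learners named in the abstract:

```python
from collections import Counter

def majority_vote(predictions):
    """predictions: one label list per base classifier; returns the per-sample majority."""
    n = len(predictions[0])
    return [Counter(clf[i] for clf in predictions).most_common(1)[0][0]
            for i in range(n)]

preds = [
    ["pass", "fail", "pass"],  # classifier 1 (e.g. a neural network)
    ["pass", "pass", "pass"],  # classifier 2 (e.g. a random forest)
    ["fail", "fail", "pass"],  # classifier 3 (e.g. naive Bayes)
]
combined = majority_vote(preds)
```

Hard voting like this can outperform any single member when the base classifiers make partly uncorrelated errors, which matches the small accuracy gain (93.10% to 93.68%) reported above.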
Multi Label Spatial Semi Supervised Classification using Spatial Associative ... (cscpconf)
Multi-label spatial classification based on association rules with multi-objective genetic algorithms (MOGA), enriched by semi-supervised learning, is proposed in this paper to deal with the multiple-class-labels problem. We adapt problem transformation for multi-label classification and use a hybrid evolutionary algorithm to optimize the generation of spatial association rules, which addresses single labels. MOGA is then used to combine the single labels into multi-labels under the conflicting objectives of predictive accuracy and comprehensibility. Semi-supervised learning is performed through rule-cover clustering. Finally, an associative classifier is built with a sorting mechanism. The algorithm is simulated and the results are compared with an existing MOGA-based associative classifier, which the proposed method outperforms.
ADABOOST ENSEMBLE WITH SIMPLE GENETIC ALGORITHM FOR STUDENT PREDICTION MODEL (ijcsit)
Predicting student performance is a great concern to higher-education management. This prediction helps to identify and improve students' performance, and several factors may influence it. In the present study, we employ data mining processes, particularly classification, to enhance the quality of the higher-education system. Recently, combining classifiers has emerged as a direction for improving classification accuracy. In this paper, we design and evaluate a fast learning algorithm coupling an AdaBoost ensemble with a simple genetic algorithm, called "Ada-GA", where the genetic algorithm is demonstrated to successfully improve the accuracy of the combined classifier. The Ada-GA algorithm proved considerably useful for identifying at-risk students early, especially in very large classes; this early prediction allows the instructor to provide appropriate advising to those students. The Ada-GA algorithm was implemented and tested on the ASSISTments dataset, and the results show that it successfully improves detection accuracy while reducing the complexity of computation.
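The AdaBoost core that Ada-GA builds on reweights samples after each weak learner: misclassified samples are up-weighted by e^α and correct ones down-weighted by e^(−α), where α = ½·ln((1−ε)/ε) for weighted error ε. A single reweighting round (the GA tuning itself is not shown):

```python
import math

def adaboost_reweight(weights, correct, error):
    """One boosting round: current weights, per-sample correctness flags, weighted error."""
    alpha = 0.5 * math.log((1 - error) / error)          # weak learner's vote weight
    new = [w * math.exp(-alpha if ok else alpha)         # up-weight the mistakes
           for w, ok in zip(weights, correct)]
    z = sum(new)                                         # normalise to a distribution
    return [w / z for w in new], alpha

# Four equally weighted students; the weak learner misses the last one.
w, alpha = adaboost_reweight([0.25] * 4, [True, True, True, False], error=0.25)
```

A classical property falls out: after reweighting, the misclassified samples carry exactly half the total weight, which forces the next weak learner to focus on them.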
An advance extended binomial GLMBoost ensemble method with synthetic minorit... (IJECEIAES)
Classification is an important activity in a variety of domains, and the class imbalance problem reduces the performance of traditional classification approaches. An imbalance problem arises when mismatched class distributions occur among the instances of the classes in a classification dataset. The model proposed in this study is an advanced extended binomial GLMBoost (EBGLMBoost) coupled with the synthetic minority over-sampling technique (SMOTE) to manage imbalance issues: SMOTE balances the distribution of the target variable, while the GLMBoost ensemble technique is built to deal with imbalanced datasets. Twenty different datasets are used for the experiments, and support vector machine (SVM), Nu-SVM, bagging, and AdaBoost classification algorithms are compared with the suggested method. The model's sensitivity, specificity, geometric mean (G-mean), precision, recall, and F-measure on the training and testing datasets are 99.37, 66.95, 80.81, 99.21, 99.37, 99.29 and 98.61, 54.78, 69.88, 98.77, 96.61, 98.68 percent, respectively. A Wilcoxon test confirms that the proposed technique performs well on unbalanced data. The proposed solution is therefore capable of dealing efficiently with the class imbalance problem.
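SMOTE, as used above, synthesizes a minority sample by interpolating between a minority point and one of its nearest minority neighbours: x_new = x + r·(x_nn − x) with r drawn from [0, 1). A minimal sketch of generating one such sample:

```python
import random

def smote_sample(minority, k=2):
    """One synthetic minority point via interpolation toward a random k-NN neighbour."""
    x = random.choice(minority)
    neighbours = sorted((p for p in minority if p is not x),
                        key=lambda p: sum((a - b) ** 2 for a, b in zip(p, x)))[:k]
    nb = random.choice(neighbours)
    gap = random.random()  # position along the segment from x to its neighbour
    return tuple(a + gap * (b - a) for a, b in zip(x, nb))

random.seed(1)
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
synth = smote_sample(minority)
```

Because each new point lies on a segment between two real minority samples, SMOTE enlarges the minority region without duplicating points, which is what lets the boosted classifier above see a balanced target distribution.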
A SURVEY ON OPTIMIZATION APPROACHES TO TEXT DOCUMENT CLUSTERING (ijcsa)
Text document clustering is one of the fastest-growing research areas because of the huge amount of information available in electronic form. Numerous techniques have been introduced for clustering documents so that documents within a cluster have high intra-similarity and low inter-similarity to other clusters. Many document clustering algorithms provide only a localized search for effectively navigating, summarizing, and organizing information, whereas a globally optimal solution can be obtained by applying high-speed, high-quality optimization algorithms that search the entire solution space. In this paper, a brief survey of optimization approaches to text document clustering is carried out.
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL (gerogepatton)
As digital technology becomes more deeply embedded in power systems, protecting the communication
networks of Smart Grids (SG) has emerged as a critical concern. Distributed Network Protocol 3 (DNP3)
represents a multi-tiered application layer protocol extensively utilized in Supervisory Control and Data
Acquisition (SCADA)-based smart grids to facilitate real-time data gathering and control functionalities.
Robust Intrusion Detection Systems (IDS) are necessary for early threat detection and mitigation because
of the interconnection of these networks, which makes them vulnerable to a variety of cyberattacks. To
solve this issue, this paper develops a hybrid Deep Learning (DL) model specifically designed for intrusion
detection in smart grids. The proposed approach combines a Convolutional Neural Network
(CNN) with Long Short-Term Memory (LSTM) networks. We employed a recent intrusion detection
dataset (DNP3), which focuses on unauthorized commands and Denial of Service (DoS) cyberattacks, to
train and test our model. The results of our experiments show that our CNN-LSTM method is much better
at finding smart grid intrusions than other deep learning algorithms used for classification. In addition,
our proposed approach improves accuracy, precision, recall, and F1 score, achieving a high detection
accuracy rate of 99.50%.
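A shape-level sketch of the hybrid idea: a 1-D convolution extracts local patterns from a window of traffic features, and an LSTM cell then consumes the convolved sequence to capture temporal structure. This is pure Python with toy scalar weights, illustrating the data flow only, not the paper's architecture:

```python
import math

def conv1d(seq, kernel):
    """Valid-mode 1-D convolution over a feature window."""
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k))
            for i in range(len(seq) - k + 1)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, w=0.5):
    """Single-unit LSTM cell with one shared scalar weight (toy)."""
    i = sigmoid(w * x + w * h)    # input gate
    f = sigmoid(w * x + w * h)    # forget gate
    o = sigmoid(w * x + w * h)    # output gate
    g = math.tanh(w * x + w * h)  # candidate state
    c = f * c + i * g
    h = o * math.tanh(c)
    return h, c

window = [0.1, 0.9, 0.2, 0.8, 0.1]        # toy DNP3 traffic-feature window
features = conv1d(window, [0.5, -0.5])    # CNN stage: local-pattern extraction
h = c = 0.0
for x in features:                        # LSTM stage: temporal accumulation
    h, c = lstm_step(x, h, c)
```

In the real model each stage has many learned channels and units and a classifier head on the final hidden state; the pipeline shape (convolve, then recur, then classify) is the part this sketch preserves.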
10th International Conference on Artificial Intelligence and Applications (AI 2024) (gerogepatton)
The 10th International Conference on Artificial Intelligence and Applications (AI 2024) will provide an excellent international forum for sharing knowledge and results in the theory, methodology, and applications of Artificial Intelligence. The conference seeks significant contributions to all major fields of Artificial Intelligence and Soft Computing, in both theoretical and practical aspects, and aims to provide a platform for researchers and practitioners from both academia and industry to meet and share cutting-edge developments in the field.
Similar to CAUSALITY LEARNING WITH WASSERSTEIN GENERATIVE ADVERSARIAL NETWORKS
Choosing allowability boundaries for describing objects in subject areasIAESIJAI
Anomaly detection is one of the most promising problems for study and can be used as independent units and preprocessing tools before solving any fundamental data mining problems. This article proposes a method for detecting specific errors with the involvement of experts from subject areas to fill knowledge. The proposed method about outliers hypothesizes that they locate closer to logical boundaries of intervals derived from pair features, and the interval ranges vary in different domains. We construct intervals leveraging pair feature values. While forming knowledge in a specific field, a domain specialist checks the logical allowability of objects based on the range of the intervals. If the objects are logical outliers, the specialist ignores or corrects them. We offer the general algorithm for the formation of the database based on the proposed method in the form of a pseudo-code, and we provide comparison results with existing methods.
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACHcscpconf
Due to the intangible nature of “software”, accurate and reliable software effort estimation is a challenge in the software Industry. It is unlikely to expect very accurate estimates of software
development effort because of the inherent uncertainty in software development projects and the complex and dynamic interaction of factors that impact software development. Heterogeneity exists in the software engineering datasets because data is made available from diverse sources.
This can be reduced by defining certain relationship between the data values by classifying them into different clusters. This study focuses on how the combination of clustering and
regression techniques can reduce the potential problems in effectiveness of predictive efficiency due to heterogeneity of the data. Using a clustered approach creates the subsets of data having a degree of homogeneity that enhances prediction accuracy. It was also observed in this study that ridge regression performs better than other regression techniques used in the analysis.
Estimating project development effort using clustered regression approachcsandit
Due to the intangible nature of “software”, accurate and reliable software effort estimation is a
challenge in the software Industry. It is unlikely to expect very accurate estimates of software
development effort because of the inherent uncertainty in software development projects and the
complex and dynamic interaction of factors that impact software development. Heterogeneity
exists in the software engineering datasets because data is made available from diverse sources.
This can be reduced by defining certain relationship between the data values by classifying
them into different clusters. This study focuses on how the combination of clustering and
regression techniques can reduce the potential problems in effectiveness of predictive efficiency
due to heterogeneity of the data. Using a clustered approach creates the subsets of data having
a degree of homogeneity that enhances prediction accuracy. It was also observed in this study
that ridge regression performs better than other regression techniques used in the analysis.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Data mining is utilized to manage huge measure of information which are put in the data ware houses and databases, to discover required information and data. Numerous data mining systems have been proposed, for example, association rules, decision trees, neural systems, clustering, and so on. It has turned into the purpose of consideration from numerous years. A re-known amongst the available data mining strategies is clustering of the dataset. It is the most effective data mining method. It groups the dataset in number of clusters based on certain guidelines that are predefined. It is dependable to discover the connection between the distinctive characteristics of data.
In k-mean clustering algorithm, the function is being selected on the basis of the relevancy of the function for predicting the data and also the Euclidian distance between the centroid of any cluster and the data objects outside the cluster is being computed for the clustering the data points. In this work, author enhanced the Euclidian distance formula to increase the cluster quality.
The problem of accuracy and of redundant, dissimilar points within clusters remains in the improved k-means, so a new enhanced approach is proposed that uses a similarity function to check the similarity level of each point before including it in a cluster.
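As a rough illustration of the Euclidean-distance assignment step that k-means builds on (a generic sketch in plain Python; the function names are illustrative and the paper's enhanced distance formula is not reproduced here):

```python
import math

def euclidean(p, q):
    # Standard Euclidean distance between two equal-dimension points.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def assign_to_clusters(points, centroids):
    # k-means assignment step: each point joins the cluster whose
    # centroid is nearest under the Euclidean distance.
    clusters = {i: [] for i in range(len(centroids))}
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: euclidean(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters
```

A similarity-based variant, as proposed above, would additionally test each candidate point against a similarity threshold before admitting it to the nearest cluster.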
A Novel Clustering Method for Similarity Measuring in Text Documents
International Journal of Modern Engineering Research (IJMER) is a peer-reviewed online journal. It serves as an international archival forum for scholarly research related to engineering and science education.
A survey on methods and applications of meta-learning with GNNs
This survey paper provides a comprehensive review of works that combine graph neural networks (GNNs) and meta-learning, along with a thorough summary of the methods and applications in these categories. The application of meta-learning to GNNs is a growing and exciting field, and many graph problems stand to benefit immensely from the combination of the two approaches.
An approach for improved students’ performance prediction using homogeneous ...
Web-based learning technologies of educational institutions store a massive amount of interaction data which can be helpful to predict students’ performance through the aid of machine learning algorithms. With this, various researchers focused on studying ensemble learning methods as it is known to improve the predictive accuracy of traditional classification algorithms. This study proposed an approach for enhancing the performance prediction of different single classification algorithms by using them as base classifiers of homogeneous ensembles (bagging and boosting) and heterogeneous ensembles (voting and stacking). The model utilized various single classifiers such as multilayer perceptron or neural networks (NN), random forest (RF), naïve Bayes (NB), J48, JRip, OneR, logistic regression (LR), k-nearest neighbor (KNN), and support vector machine (SVM) to determine the base classifiers of the ensembles. In addition, the study made use of the University of California Irvine (UCI) open-access student dataset to predict students’ performance. The comparative analysis of the model’s accuracy showed that the best-performing single classifier’s accuracy increased further from 93.10% to 93.68% when used as a base classifier of a voting ensemble method. Moreover, results in this study showed that voting heterogeneous ensemble performed slightly better than bagging and boosting homogeneous ensemble methods.
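Hard voting, the simplest form of the voting ensemble found best-performing above, can be sketched as follows; the classifier stand-ins are hypothetical and are not the study's trained NN, RF, or SVM models:

```python
from collections import Counter

def majority_vote(predictions):
    # predictions: one label per base classifier for a single instance.
    # A hard-voting ensemble returns the most frequent label.
    return Counter(predictions).most_common(1)[0][0]

def voting_ensemble(classifiers, x):
    # classifiers: callables mapping an instance to a label, standing
    # in for trained base models such as NN, RF, NB, or SVM.
    return majority_vote([clf(x) for clf in classifiers])
```

Soft voting, by contrast, would average the base classifiers' predicted probabilities rather than counting their hard labels.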
Multi Label Spatial Semi Supervised Classification using Spatial Associative ...
Multi-label spatial classification based on association rules with multi-objective genetic
algorithms (MOGA), enriched by semi-supervised learning, is proposed in this paper to deal
with the multiple-class-labels problem. We adapt problem transformation for multi-label
classification and use a hybrid evolutionary algorithm to optimize the generation of spatial
association rules, which addresses single labels. MOGA is used to combine the single labels
into multi-labels under the conflicting objectives of predictive accuracy and
comprehensibility. Semi-supervised learning is performed through rule-cover clustering.
Finally, an associative classifier is built with a sorting mechanism. The algorithm is
simulated and the results are compared with the existing MOGA-based associative classifier,
which the proposed method outperforms.
ADABOOST ENSEMBLE WITH SIMPLE GENETIC ALGORITHM FOR STUDENT PREDICTION MODEL
Predicting student performance is a great concern to higher-education managements. This
prediction helps to identify and improve students' performance, which several factors may
affect. In the present study, we employ data mining processes, particularly classification, to
enhance the quality of the higher-educational system. Recently, a new direction has been taken to
improve classification accuracy by combining classifiers. In this paper, we design and evaluate a
fast learning algorithm using an AdaBoost ensemble with a simple genetic algorithm, called
“Ada-GA”, where the genetic algorithm is demonstrated to successfully improve the accuracy of
the combined classifier. The Ada-GA algorithm proved to be of considerable usefulness in
identifying at-risk students early, especially in very large classes; this early prediction allows the
instructor to provide appropriate advising to those students. The Ada-GA algorithm was
implemented and tested on the ASSISTments dataset, and the results showed that it successfully
improved detection accuracy while reducing computational complexity.
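The AdaBoost half of Ada-GA reweights training instances after every boosting round; below is a minimal sketch of the standard update rule (generic AdaBoost, not the paper's exact implementation):

```python
import math

def adaboost_weight_update(weights, correct, error):
    # One standard AdaBoost round: alpha is the weak learner's vote
    # weight, misclassified instances gain weight, correctly classified
    # ones lose weight, and the result is renormalized to sum to 1.
    alpha = 0.5 * math.log((1.0 - error) / error)
    updated = [w * math.exp(-alpha if c else alpha)
               for w, c in zip(weights, correct)]
    total = sum(updated)
    return [w / total for w in updated], alpha
```

Because misclassified instances carry more weight in the next round, later weak learners concentrate on the hardest examples, which is what makes the combined classifier stronger than any single one.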
An advance extended binomial GLMBoost ensemble method with synthetic minorit...
Classification is an important activity in a variety of domains, and the class-imbalance problem has reduced the performance of traditional classification approaches. An imbalance problem arises when mismatched class distributions are found among the instances of the classes of a classification dataset. The model proposed in this study couples an advance extended binomial GLMBoost (EBGLMBoost) with the synthetic minority over-sampling technique (SMOTE) to manage imbalance issues: SMOTE ensures that the target variable's distribution is balanced, whereas the GLMBoost ensemble techniques are built to deal with imbalanced datasets. Twenty different datasets are used for the experiments, and support vector machine (SVM), Nu-SVM, bagging, and AdaBoost classification algorithms are compared with the suggested method. The model's sensitivity, specificity, geometric mean (G-mean), precision, recall, and F-measure percentages are 99.37, 66.95, 80.81, 99.21, 99.37, and 99.29 on the training dataset and 98.61, 54.78, 69.88, 98.77, 96.61, and 98.68 on the testing dataset, respectively. With the help of the Wilcoxon test, it is determined that the proposed technique performs well on unbalanced data. Finally, the proposed solution is capable of efficiently dealing with the class-imbalance problem.
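SMOTE's core step is a simple interpolation between a minority-class instance and one of its nearest neighbours; a minimal sketch of that step (the neighbour search itself is omitted, and the function name is illustrative):

```python
import random

def smote_sample(minority_point, neighbor, rng=random.random):
    # SMOTE creates a synthetic minority example by interpolating
    # between a minority instance x and one of its nearest neighbours n:
    #   new = x + gap * (n - x),  with gap drawn uniformly from [0, 1].
    gap = rng()
    return tuple(x + gap * (n - x) for x, n in zip(minority_point, neighbor))
```

Repeating this for many minority instances populates the minority region of feature space until the class distribution is balanced.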
A SURVEY ON OPTIMIZATION APPROACHES TO TEXT DOCUMENT CLUSTERING
Text document clustering is one of the fastest-growing research areas because of the availability of huge amounts of information in electronic form. A number of techniques have been proposed for clustering documents such that documents within a cluster have high intra-similarity and low inter-similarity to other clusters. Many document clustering algorithms provide localized search for effectively navigating, summarizing, and organizing information, whereas a globally optimal solution can be obtained by applying high-speed, high-quality optimization algorithms that search the entire solution space. In this paper, a brief survey of optimization approaches to text document clustering is presented.
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
As digital technology becomes more deeply embedded in power systems, protecting the communication
networks of Smart Grids (SG) has emerged as a critical concern. Distributed Network Protocol 3 (DNP3)
represents a multi-tiered application layer protocol extensively utilized in Supervisory Control and Data
Acquisition (SCADA)-based smart grids to facilitate real-time data gathering and control functionalities.
Robust Intrusion Detection Systems (IDS) are necessary for early threat detection and mitigation because
of the interconnection of these networks, which makes them vulnerable to a variety of cyberattacks. To
solve this issue, this paper develops a hybrid Deep Learning (DL) model specifically designed for intrusion
detection in smart grids. The proposed approach combines the Convolutional Neural Network
(CNN) and Long Short-Term Memory (LSTM) algorithms. We employed a recent intrusion detection
dataset (DNP3), which focuses on unauthorized commands and Denial of Service (DoS) cyberattacks, to
train and test our model. The results of our experiments show that our CNN-LSTM method is much better
at finding smart grid intrusions than other deep learning algorithms used for classification. In addition,
our proposed approach improves accuracy, precision, recall, and F1 score, achieving a high detection
accuracy rate of 99.50%.
10th International Conference on Artificial Intelligence and Applications (AI...
10th International Conference on Artificial Intelligence and Applications (AI 2024) will provide an excellent international forum for sharing knowledge and results in the theory, methodology and applications of Artificial Intelligence. The Conference looks for significant contributions to all major fields of Artificial Intelligence and Soft Computing, in both theoretical and practical aspects. The aim of the Conference is to provide a platform for researchers and practitioners from both academia and industry to meet and share cutting-edge developments in the field.
International Journal of Artificial Intelligence & Applications (IJAIA)
The International Journal of Artificial Intelligence & Applications (IJAIA) is a bimonthly open-access peer-reviewed journal that publishes articles contributing new results in all areas of Artificial Intelligence and its applications. It is an international journal intended for professionals and researchers in all fields of AI, including programmers and software and hardware manufacturers. The journal also aims to publish new attempts in the form of special issues on emerging areas in Artificial Intelligence and applications.
Immunizing Image Classifiers Against Localized Adversary Attacks
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNNs), to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), it significantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
3rd International Conference on Artificial Intelligence Advances (AIAD 2024)
3rd International Conference on Artificial Intelligence Advances (AIAD 2024) will act as a major forum for the presentation of innovative ideas, approaches, developments, and research projects in the area of advanced Artificial Intelligence. It will also serve to facilitate the exchange of information between researchers and industry professionals on the latest issues and advancements in the research area. Core areas of AI and its advanced multi-disciplinary applications will be covered during the conference.
Information Extraction from Product Labels: A Machine Vision Approach
This research tackles the challenge of manual data extraction from product labels by employing a blend of
computer vision and Natural Language Processing (NLP). We introduce an enhanced model that combines
Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) in a Convolutional
Recurrent Neural Network (CRNN) for reliable text recognition. Our model is further refined by
incorporating the Tesseract OCR engine, enhancing its applicability in Optical Character Recognition
(OCR) tasks. The methodology is augmented by NLP techniques and extended through the Open Food
Facts API (Application Programming Interface) for database population and text-only label prediction.
The CRNN model is trained on encoded labels and evaluated for accuracy on a dedicated test set.
Importantly, our approach enables visually impaired individuals to access essential information on
product labels, such as directions and ingredients. Overall, the study highlights the efficacy of deep
learning and OCR in automating label extraction and recognition.
Research on Fuzzy C-Clustering Recursive Genetic Algorithm based on Cloud Co...
Aiming at the problems of poor local-search ability and premature convergence of the fuzzy C-cluster
recursive genetic algorithm (FOLD++), a new fuzzy C-cluster recursive genetic algorithm based on
Bayesian function adaptation search (TS) is proposed, incorporating the idea of Bayesian function
adaptation search into the fuzzy C-cluster recursive genetic algorithm. The new algorithm combines the
advantages of FOLD++ and TS. In the early stage of optimization, the fuzzy C-cluster recursive genetic
algorithm is used to obtain a good initial value, and the individual extreme value pbest is put into the
Bayesian function adaptation table. In the late stage of optimization, when the searching ability of the
fuzzy C-cluster recursive genetic algorithm weakens, the short-term memory function of the Bayesian
function adaptation table is utilized to make the search jump out of the local optimal solution, while
allowing bad solutions to be accepted during the search. The improved algorithm is applied to function
optimization, and the simulation results show that the calculation accuracy and stability of the algorithm
are improved, verifying the effectiveness of the improvement.
10th International Conference on Artificial Intelligence and Soft Computing (...
10th International Conference on Artificial Intelligence and Soft Computing (AIS 2024) will
provide an excellent international forum for sharing knowledge and results in theory, methodology, and
applications of Artificial Intelligence and Soft Computing. The Conference looks for significant
contributions to all major fields of Artificial Intelligence and Soft Computing, in both theoretical and
practical aspects. The aim of the Conference is to provide a platform for researchers and practitioners from
both academia and industry to meet and share cutting-edge developments in the field.
International Journal of Artificial Intelligence & Applications (IJAIA)
Employee attrition refers to the decrease in staff numbers within an organization due to various reasons.
As it has a negative impact on long-term growth objectives and workplace productivity, firms have
recognized it as a significant concern. To address this issue, organizations are increasingly turning to
machine-learning approaches to forecast employee attrition rates. This topic has gained significant
attention from researchers, especially in recent times. Several studies have applied various machine-learning methods to predict employee attrition, producing different results depending on the employed
methods, factors, and datasets. However, there has been no comprehensive comparative review of multiple
studies applying machine-learning models to predict employee attrition to date. Therefore, this study aims
to fill this gap by providing an overview of research conducted on applying machine learning to predict
employee attrition from 2019 to February 2024. A literature review of relevant studies was conducted,
summarized, and classified. Most studies agree on conducting comparative experiments with multiple
predictive models to determine the most effective one. From this literature survey, the RF algorithm and
the XGB ensemble method are repeatedly the best-performing, outperforming many other algorithms.
Additionally, the application of deep learning to employee attrition prediction issues also shows promise.
While there are discrepancies in the datasets used in previous studies, it is notable that the dataset
provided by IBM is the most widely utilized. This study serves as a concise review for new researchers,
facilitating their understanding of the primary techniques employed in predicting employee attrition and
highlighting recent research trends in this field. Furthermore, it provides organizations with insight into
the prominent factors affecting employee attrition, as identified by studies, enabling them to implement
solutions aimed at reducing attrition rates.
10th International Conference on Artificial Intelligence and Applications (AI...
10th International Conference on Artificial Intelligence and Applications (AIFU 2024) is a forum for presenting new advances and research results in the fields of Artificial Intelligence. The conference will bring together leading researchers, engineers and scientists in the domain of interest from around the world. The scope of the conference covers all theoretical and practical aspects of the Artificial Intelligence.
THE TRANSFORMATION RISK-BENEFIT MODEL OF ARTIFICIAL INTELLIGENCE: BALANCING RI...
This paper summarizes the most cogent advantages and risks associated with Artificial Intelligence from an
in-depth review of the literature. Then the authors synthesize the salient risk-related models currently being
used in AI, technology and business-related scenarios. Next, in view of an updated context of AI along with
theories and models reviewed and expanded constructs, the writers propose a new framework called “The
Transformation Risk-Benefit Model of Artificial Intelligence” to address the increasing fears of and levels of
AI risk. Using the model characteristics, the article emphasizes practical and innovative solutions where
benefits outweigh risks, and three use cases in healthcare, climate change/environment and cyber security to
illustrate unique interplay of principles, dimensions and processes of this powerful AI transformational
model.
13th International Conference on Software Engineering and Applications (SEA 2...
13th International Conference on Software Engineering and Applications (SEA 2024) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of Software Engineering and Applications. The goal of this conference is to bring together researchers and practitioners from academia and industry to focus on understanding Modern software engineering concepts and establishing new collaborations in these areas.
AN IMPROVED MT5 MODEL FOR CHINESE TEXT SUMMARY GENERATION
Complicated policy texts require a lot of effort to read, so there is a need for intelligent interpretation of
Chinese policies. To better solve the Chinese Text Summarization task, this paper utilized the mT5 model
as the core framework and initial weights. In addition, this paper reduced the model size
through parameter clipping, used the Gap Sentence Generation (GSG) method as an unsupervised method,
and improved the Chinese tokenizer. After training on a meticulously processed 30GB Chinese training
corpus, the paper developed the enhanced mT5-GSG model. Then, when fine-tuning the Chinese Policy
text, this paper chose the idea of “Dropout Twice”, and innovatively combined the probability distribution
of the two Dropouts through the Wasserstein distance. Experimental results indicate that the proposed
model achieved Rouge-1, Rouge-2, and Rouge-L scores of 56.13%, 45.76%, and 56.41% respectively on
the Chinese policy text summarization dataset.
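For intuition, the first-order Wasserstein distance between two discrete distributions on the same ordered, unit-spaced support equals the total absolute difference between their cumulative distribution functions. The sketch below is a generic illustration of that quantity, not the paper's exact mechanism for combining the two dropout distributions:

```python
def wasserstein_1d(p, q):
    # First-order (earth mover's) Wasserstein distance between two
    # discrete distributions on the same ordered support with unit
    # spacing: the sum of absolute differences of their running CDFs.
    assert len(p) == len(q)
    cdf_p = cdf_q = 0.0
    total = 0.0
    for pi, qi in zip(p, q):
        cdf_p += pi
        cdf_q += qi
        total += abs(cdf_p - cdf_q)
    return total
```

Unlike KL divergence, this distance stays finite and meaningful even when the two distributions have disjoint support, which is one reason it is attractive for comparing model outputs.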
“To be integrated is to feel secure, to feel connected.” The views and experi...
ABSTRACT: Although a significant amount of literature exists on Morocco's migration policies and their
successes and failures since their implementation in 2014, there is limited research on the integration of sub-Saharan African children into schools. This paper is part of a Ph.D. research project that aims to fill this gap. It
reports the main findings of a study conducted with migrant children enrolled in two public schools in Rabat,
Morocco, exploring how integration is defined by the children themselves and identifying the obstacles that they
have encountered thus far. The following paper uses an inductive approach and primarily focuses on the
relationships of children with their teachers and peers as a key aspect of integration for students with a migration
background. The study has led to several crucial findings. It emphasizes the significance of speaking Colloquial
Moroccan Arabic (Darija) and being part of a community for effective integration. Moreover, it reveals that the
use of Modern Standard Arabic as the language of instruction in schools is a source of frustration for students,
indicating the need for language policy reform. The study underlines the importance of considering the
children's agency when being integrated into mainstream public schools.
KEYWORDS: migration, education, integration, sub-Saharan African children, public school
Exploring Factors Affecting the Success of TVET-Industry Partnership: A Case ...
ABSTRACT: The purpose of this study was to explore factors affecting the success of TVET-industry
partnerships. A case study design of the qualitative research method was used to achieve this objective. For the
study, one polytechnic college of Oromia regional state, and two industries were purposively selected. From the
sample polytechnic college and industries, a total of 17 sample respondents were selected. Out of 17
respondents, 10 respondents were selected using the snowball sampling method, and the rest 7 respondents were
selected using the purposive sampling technique. The qualitative data were collected through an in-depth
interview and document analysis. The data were analyzed using thematic approaches. The findings revealed that
TVET-industry partnerships were found to be weak. Lack of key stakeholders' awareness, shortage of improved
training equipment and machines in polytechnic colleges, absence of a trainee health-insurance policy, lack of
incentive mechanisms for private industries, and lack of employer-industry involvement in designing and
developing occupational standards and preparing curricula were some of the impediments to TVET-industry
partnership. Based on the findings, it was recommended that the Oromia TVET bureau, in collaboration
with other relevant concerned regional authorities and TVET colleges, set new strategies for creating strong
awareness for industries, companies, and other relevant stakeholders on the purpose and advantages of
implementing successful TVET-industry partnership. Finally, the Oromia regional government in collaboration
with the TVET bureau needs to create policy-supported incentive strategies such as giving occasional privileges
of duty-free import, tax reduction, and regional government recognition awards based on the level of partnership
contribution to TVET institutions in promoting TVET-industry partnership.
KEY WORDS: employability skills, industries, and partnership
Non-Financial Information and Firm Risk
ABSTRACT: This research aims to examine how ESG disclosure and risk disclosure affect the total risk of
companies. Using cross-sectional data from 355 companies listed on the Indonesia Stock Exchange, data regarding
ESG disclosure and risk was collected. In this research, ESG and risk disclosures are measured based on content
analysis using GRI 4 guidelines for ESG disclosures and COSO ERM for risk disclosures. Using multiple
regression, it is concluded that only risk disclosure can reduce the company's total risk, while ESG disclosure
cannot affect the company's total risk. This shows that only risk disclosure is relevant in determining a
company's total risk.
KEYWORDS: ESG disclosure, risk disclosure, firm risk
The Challenges of Good Governance and Project Implementation in Nigeria: A Re...
ABSTRACT : This study reveals that systemic corruption and other factors including poor leadership,
leadership recruitment processes, ethnic and regional politics, tribalism and mediocrity, poor planning, and
variation of project design have been the causative factors that undermine project implementation in post-independence African states, particularly in Nigeria. The study thus argued that successive governments of
African states, using Nigeria as a case study, have been deeply engrossed in this obnoxious practice that has
undermined infrastructure sector development as well as enthroned impoverishment and mass poverty in these
African countries. This study, therefore, is posed to examine the similarities in causative factors, effects and
consequences of corruption and how it affects governance, projects implementation and national growth. To
achieve this, the study adopted historical research design which is qualitative and explorative in nature. The
study among others suggests that the governments of developing countries should shun corruption and other
forms of obnoxious practices in order to operate effective and efficient systems that promote good governance
and ensure there is adequate projects implementation which are the attributes of a responsible government and
good leadership. Policy makers should also prioritize policy objectives and competence to ensure that policies
are fully implemented within stipulated time frame.
KEYWORDS: Developing Countries, Nigeria, Government, Project Implementation, Project Failure
The Challenges of Good Governance and Project Implementation in Nigeria: A Re...
CAUSALITY LEARNING WITH WASSERSTEIN GENERATIVE ADVERSARIAL NETWORKS

International Journal of Artificial Intelligence and Applications (IJAIA), Vol.13, No.3, May 2022
DOI: 10.5121/ijaia.2022.13301

Hristo Petkov¹, Colin Hanley² and Feng Dong¹
¹Department of Computer and Information Sciences, University of Strathclyde, Glasgow, United Kingdom
²Department of Management Science, University of Strathclyde, Glasgow, United Kingdom
ABSTRACT
Conventional methods for causal structure learning from data face significant challenges due to the
combinatorial search space. Recently, the problem has been formulated as a continuous optimization
framework with an acyclicity constraint to learn Directed Acyclic Graphs (DAGs). Such a framework
allows the utilization of deep generative models for causal structure learning to better capture the relations
between data sample distributions and DAGs. However, so far no study has experimented with the use of
Wasserstein distance in the context of causal structure learning. Our model named DAG-WGAN combines
the Wasserstein-based adversarial loss with an acyclicity constraint in an auto-encoder architecture. It
simultaneously learns causal structures while improving its data generation capability. We compare the
performance of DAG-WGAN with other models that do not involve the Wasserstein metric in order to
identify its contribution to causal structure learning. According to our experiments, our model
performs better on high-cardinality data.
KEYWORDS
Generative Adversarial Networks, Wasserstein Distance, Bayesian Networks, Causal Structure Learning,
Directed Acyclic Graphs.
1. INTRODUCTION
Causal relationships constitute important scientific knowledge. Bayesian Networks (BNs) are
graphical models that represent causal structures between variables and their conditional
dependencies in the form of Directed Acyclic Graphs (DAGs). They are widely used for causal
inference in many application areas such as medicine, genetics and economics [1-3].
Learning causal structure from data is difficult. One of the major challenges arises from the
combinatorial nature of the search space: increasing the number of variables leads to a super-
exponential increase in the number of possible DAGs, which makes the problem computationally intractable.
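This super-exponential growth can be checked directly with Robinson's recurrence for counting labelled DAGs (OEIS A003024); the short script below is an illustrative aside, not part of the paper:

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def num_dags(n):
    """Number of labelled DAGs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * num_dags(n - k)
               for k in range(1, n + 1))

for n in range(1, 6):
    print(n, num_dags(n))   # 1, 3, 25, 543, 29281 — already super-exponential
```

Even at ten variables there are over 4 × 10¹⁸ candidate graphs, which is why exhaustive search is hopeless.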
Over the last few years, several methods have been proposed to address the problem of
discovering DAGs from data [4], including the score-based methods and constraint-based
methods. Kuipers et al. [5], Heckerman et al. [6] and Bouckaert [7] have proposed score-based
methods (SBMs), which formulate causal structure learning as the optimization of a
score function under an acyclicity constraint. SBMs rely on general optimization
techniques to discover DAG structures. Additional structural assumptions and approximate
searches are often needed because the complexity of the search space remains super-exponential.
Pearl [8], Spirtes et al. [1] and Zhang [9] have proposed constraint-based methods (CBMs), which
utilize conditional independence tests to reduce the DAG search space and provide graphs that
satisfy a set of conditional independencies. However, CBMs often produce spurious results and
are not robust to sampling noise. Some works have tried to combine SBMs with
CBMs. For example, in the MMHC algorithm [10], Hill Climbing is used as the score function
and Min-Max Parents and Children (MMPC) is used to check for relations between variables.
Recently Zheng et al. [11] made a breakthrough by transforming the causal structure learning
problem from combinatorial to continuous optimization with an acyclicity constraint, which can
be efficiently solved using conventional optimization methods. This technique has been extended
to cover nonlinear cases. Yu et al. [12] have proposed a model named DAG-GNN based on a
variational auto-encoder architecture. Their method can handle linear, non-linear, continuous and
categorical data. Another model named GraN-DAG [13] has also been proposed to learn
causality from both linear and non-linear data.
Our work is related to the use of generative adversarial networks (GANs) for causal structure
learning. GANs are a powerful approach to generating synthetic data and have also been
applied to synthesizing tabular data [14-15]. To leverage GANs for causal structure
learning, a fundamental question is to identify how much the data distribution metrics contribute
to causal structure learning. Recently, Gao et al. [16] developed a GAN-based model (DAG-
GAN) that learns causal structures from tabular data based on Maximum Mean Discrepancy
(MMD) [17]. In contrast, our work has investigated Wasserstein GAN (WGAN) for learning
causal structures from tabular data. The Wasserstein distance from optimal transport
[18] is an established metric that preserves basic metric properties [19-22]. It has enabled
WGAN to achieve significant improvements in training stability and convergence by addressing
the vanishing-gradient problem and partially removing mode collapse [23]. However, to the best
of our knowledge, no study has so far experimented with the Wasserstein metric for causality
learning.
Our proposed new DAG-WGAN model combines WGAN-GP with an auto-encoder. A critic
(discriminator) is involved to measure the Wasserstein distance between the real and synthetic
data. In essence, the model learns causal structure in a generative process that trains the model to
realistically generate synthetic data. With the explicit modelling of learnable causal relations (i.e.
DAGs), the model learns how to generate synthetic data by simultaneously optimizing the causal
structure and the model parameters via end-to-end training. We compare the performance of
DAG-WGAN with other models that do not involve the Wasserstein metric in order to identify
the contribution from the Wasserstein metric in causal structure learning.
According to our experiments, the new DAG-WGAN model outperforms the other models
by a clear margin on tabular data with many columns. The model works well with both continuous and
discrete data while producing less noisy and more realistic data samples. It can
handle multiple data types, including linear, non-linear, continuous and discrete. We conclude
that the involvement of the Wasserstein metric helps causal structure learning in the generative
process.
2. RELATED WORK
DAGs consist of a set of nodes (variables) and directed edges (connections) between the nodes to
indicate direct causal relationships between the variables. When a DAG entails conditional
independencies of the variables in a joint distribution, the faithfulness condition allows us to
recover the DAG from the joint distribution [24]. We perform DAG learning from the data
distributions exhibited in data samples.
2.1. Traditional DAG Learning Approaches
There are three main approaches for learning DAGs from data, including the constraint-based,
score-based and hybrid approaches.
2.1.1. Constraint-based Approach
Constraint-based search methods limit the graph search space by running local independence
tests to reduce the combinatorial complexity of the search space [25]. Examples of constraint-
based algorithms include Causal Inference (CI) [8] and Fast Causal Inference (FCI) [26, 9].
However, these methods typically only produce a set of candidate causal structures that all satisfy
the same conditional independencies.
2.1.2. Score-based Approach
Score-based methods associate a score to each DAG in the search space in order to measure how
well each graph fits the data. Typical score functions include Bayesian Gaussian equivalent
(BGe) [5], Bayesian Discrete equivalent (BDe) [6], Bayesian Information Criterion (BIC) [27],
Minimum Description Length (MDL) [7]. Assigning a score to each possible graph is difficult
due to the combinatorial nature of the problem. As a result, additional assumptions about the
DAGs have to be made - the most commonly used ones are bounded tree-width [28], tree-
structure [29] and sampling-based structure learning [30-32].
2.1.3. Hybrid Approach
Hybrid methods involve both score-based and constraint-based methods for DAG learning. A
subset of candidate graphs is generated by using independence tests and they are scored by a
score function. One example of the hybrid methods is named Max-Min-Hill-Climbing [10].
Another example is RELAX [33], which introduces “constraint relaxation” of possibly inaccurate
independence constraints of the search space.
2.2. DAG Learning with Continuous Optimization
Recently, a new approach named DAG-NOTEARS was proposed by Zheng et al. [11] to
transform the causal graph structure learning problem from a combinatorial problem into a
continuous optimization problem. This facilitates the use of deep generative models for causal
structure learning to leverage relations between data sample distributions and DAGs. However,
the original DAG-NOTEARS model can only handle linear and continuous data.
Based on DAG-NOTEARS, new solutions have been developed to cover non-linearity. Yu et al.
[12] developed DAG-GNN, which performs causal structure learning by using a Variational
Auto-Encoder architecture. GraN-DAG proposed by Lachapelle et al. [13] is another extension
from DAG-NOTEARS to handle non-linearity. In GraN-DAG, the calculation of the neural
network weights is constrained by the acyclicity constraint between the variables. It can work
with both continuous and discrete data. Meanwhile, the latest work in DAG-NoCurl from Yu et
al. [34] shows that it is also possible to learn DAGs without using explicit DAG constraints.
Notably, DAG-GAN proposed by Gao et al. [16] is one of the latest works that uses GAN for
causal structure learning. The work involves the use of Maximum Mean Discrepancy (MMD)
[17] in its loss function. The concept of "Information Maximization" [35-37] is incorporated into
the GAN framework. The resulting model handles multiple data types (continuous and discrete).
However, their experiments have only covered up to 40 nodes in the graph.
3. GENERATIVE MODEL ARCHITECTURES
Our model is based on the combination of an Auto-Encoder (AE) and a Wasserstein Generative
Adversarial Network (WGAN).
3.1. Auto-encoder architecture
The auto-encoder architecture consists of an encoder, which takes the input data and produces a
latent variable z containing encoded features of the input, and a decoder, which uses z to
reconstruct the input data. This can be represented mathematically as:
z = Enc_φ(X),  X̂ = Dec_θ(z)        (1)
where both the encoder Enc and the decoder Dec are functions parameterized by φ and θ
respectively.
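Equation (1) can be illustrated with a minimal linear auto-encoder sketch in NumPy; the dimensions and parameters below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

d, h = 4, 2                            # input and latent sizes (illustrative)
W_enc = rng.standard_normal((d, h))    # phi: encoder parameters
W_dec = rng.standard_normal((h, d))    # theta: decoder parameters

def enc(X):                            # z = Enc_phi(X)
    return X @ W_enc

def dec(z):                            # X_hat = Dec_theta(z)
    return z @ W_dec

X = rng.standard_normal((8, d))
X_hat = dec(enc(X))
recon_loss = float(np.mean((X - X_hat) ** 2))  # squared-error reconstruction
print(X_hat.shape, recon_loss)
```

In practice Enc and Dec are deep networks trained by gradient descent on the reconstruction loss; the sketch only shows the data flow of Equation (1).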
3.2. Wasserstein Generative Adversarial Network
A standard generative adversarial network consists of a pair of generator/discriminator that
compete against each other. The generator synthesizes data samples from random noise samples
z. The discriminator determines whether the generated data is real or fake. WGAN is an
important improvement over the original GANs. Unlike GANs (in which the discriminator only
tells whether the data samples are real or fake), WGAN calculates the distance between the real
and fake data samples with Wasserstein Distance. WGAN-GP (Wasserstein GAN with Gradient
Penalty) improves WGAN by adding a gradient penalty term that enforces the
Lipschitz constraint. The loss functions of WGAN-GP can be represented by the following
equations:
L_D = E_{x̃∼ℙg}[D(x̃)] − E_{x∼ℙr}[D(x)] + λ E_{x̂∼ℙX̂}[(‖∇x̂ D(x̂)‖₂ − 1)²]
L_G = −E_{x̃∼ℙg}[D(x̃)]        (2)

where the discriminator (critic) D needs to satisfy the 1-Lipschitz constraint, ℙr and ℙg denote
the real and synthetic data distributions, λ is the gradient-penalty weight, and ℙX̂ is the
distribution of samples interpolated between real and generated data.
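The critic loss can be sketched numerically. The toy critic below is linear, so its input gradient is constant and the gradient penalty has a closed form; all names and sizes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 10.0                          # gradient-penalty weight (common default)

w = rng.standard_normal(4)          # toy linear critic D(x) = w . x
def critic(x):
    return x @ w

x_real = rng.standard_normal((64, 4))
x_fake = rng.standard_normal((64, 4))

# Interpolate between real and fake samples: these draws play the role of P_Xhat.
eps = rng.uniform(size=(64, 1))
x_hat = eps * x_real + (1 - eps) * x_fake

# For a linear critic the input gradient is w at every point, so the
# penalty term reduces to lam * (||w||_2 - 1)^2 for each interpolate.
gp = lam * (np.linalg.norm(w) - 1) ** 2

critic_loss = critic(x_fake).mean() - critic(x_real).mean() + gp
print(float(critic_loss))
```

With a neural-network critic the gradient at each interpolate is obtained by automatic differentiation (e.g. `torch.autograd.grad`), but the structure of the loss is the same.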
4. DAG-WGAN
Our proposed DAG-WGAN model involves causal structure in the model architecture by
incorporating an adjacency matrix under an acyclicity constraint - see Figure 1. The model has
two main components: (1) an auto-encoder which computes the latent representations of the input
data; and (2) a WGAN which synthesizes the data with adversarial loss. The decoder of the auto-
encoder is used as the generator for the WGAN to generate synthetic data. The encoder is trained
with the reconstruction loss while the decoder is trained according to both the reconstruction and
adversarial loss. We cover both continuous and discrete data types. The joint WGAN and auto-encoder
architecture is motivated by the success of VAE-GAN (variational auto-encoder GAN) in better
capturing data and feature representations [38].
Figure 1. DAG-WGAN Model Architecture
4.1. Auto-encoder (AE) and Reconstruction Loss
The auto-encoder ensures that the latent space contains meaningful representations, which serve
as the noise input to the generator during adversarial training. The representations are
regularized to prevent over-fitting. Similar to Yu et al. [12], we explicitly model the causal
structure in both the encoder and decoder using a structural causal model (SCM). The encoder Enc
is as follows:
Z = (I − Aᵀ) f₁(X)        (3)

where f₁ is a parameterized function that transforms X, X ∈ ℝ^(m×d) is a data sample from a joint
distribution of m variables in d dimensions, Z ∈ ℝ^(m×d) is the latent representation, and
A ∈ ℝ^(m×m) is the weighted adjacency matrix. The corresponding decoder Dec is as follows:
M_X = f₂((I − Aᵀ)⁻¹ Z)        (4)

where f₂ is also a parameterized function that conceptually inverts f₁. The functions f₁ and f₂ can
perform both linear and non-linear transformations on Z and X. Each variable corresponds to a
node in the weighted adjacency matrix A.
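Under this DAG-GNN-style reading of the encoder/decoder pair (an assumption on our part; the source equations did not survive extraction), the two SCM transforms invert each other exactly when f₁ and f₂ are identities:

```python
import numpy as np

rng = np.random.default_rng(3)
m, d = 4, 2                                  # m variables, d-dimensional samples (toy)
A = np.triu(rng.uniform(size=(m, m)), k=1)   # weighted adjacency of a DAG
X = rng.standard_normal((m, d))

# Identity maps stand in for the learned (possibly non-linear) f1 and f2.
f1 = lambda x: x
f2 = lambda x: x

I = np.eye(m)
Z = (I - A.T) @ f1(X)                        # encoder: SCM-style mixing
M_X = f2(np.linalg.inv(I - A.T) @ Z)         # decoder: un-mixing

print(np.allclose(M_X, X))                   # True: the round trip is exact here
```

With learned non-linear f₁ and f₂ the round trip is only approximate, and the gap is exactly what the reconstruction loss below penalizes.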
The AE is trained with a reconstruction loss, which can be defined as:

L_R = E[‖X − M_X‖²]        (5)

where M_X is the output of the decoder.
To avoid over-fitting, a regularizer is added to the reconstruction loss. The regularizer loss term
takes the following form:
(6)
where MZ is the output of the encoder.
4.2. WGAN-GP
The decoder of the auto-encoder is also used as the generator of the WGAN model. Alongside the
auto-encoder, we use a critic to provide the adversarial loss with gradient penalty. Altogether,
we utilize the critic loss LD to train the critic, the generator loss LG (Equation 2) to train the
generator (namely the decoder in the AE), and the reconstruction loss LR (Equation 5) together
with the regularizer (Equation 6) to train both the encoder and decoder.
The critic is based upon the popular PacGAN [39] framework with the aim of successfully
handling mode collapse and is implemented as follows:
(7)
where x̃ is the data produced from the generator and X is the input data used for the model.
Leaky-ReLU is the activation function and Dropout is used for stability and overfitting
prevention. GP stands for Gradient Penalty [22] and is used in the loss term of the critic. Pac is a
notion coming from PacGAN [39] and is used to prevent mode collapse in categorical data,
which we found practically useful in terms of improving the outcomes. We use 10 data samples
per ‘pac’ to prevent mode collapse. The number of data samples per ‘pac’ may vary; so far we have
found that 10 produces good results. One might argue that the ‘pac’ term can be removed entirely
for non-categorical data, but our experimental results do not support this idea.
4.3. Acyclicity Constraint
Neither minimizing the reconstruction loss, nor optimizing the adversarial loss ensures acyclicity.
Therefore, an additional term needs to be added to the loss function of the model. Here we use
the acyclicity constraint proposed by Yu et al. [12].
h(A) = tr[(I + α A∘A)^m] − m = 0        (8)

where A is the weighted adjacency matrix of the causal graph, α is a positive hyper-parameter,
m is the number of variables, tr is the matrix trace and ∘ is the Hadamard product [40] of A with
itself. This acyclicity requirement reformulates structure learning as a constrained continuous
optimization problem, which we solve with the popular augmented Lagrangian method [41].
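The constraint h(A) in Equation (8) is straightforward to evaluate; the sketch below (not from the paper) checks that it vanishes for a DAG and becomes positive once a cycle is introduced:

```python
import numpy as np

def acyclicity(A, alpha=1.0):
    """h(A) = tr[(I + alpha * (A o A))^m] - m; zero iff A is acyclic."""
    m = A.shape[0]
    M = np.eye(m) + alpha * (A * A)          # A * A is the Hadamard product
    return np.trace(np.linalg.matrix_power(M, m)) - m

A_dag = np.array([[0., 1., 0.],
                  [0., 0., 1.],
                  [0., 0., 0.]])             # 1 -> 2 -> 3 (acyclic)
A_cyc = A_dag.copy()
A_cyc[2, 0] = 1.                             # add 3 -> 1, closing a cycle

print(acyclicity(A_dag))                     # 0.0
print(acyclicity(A_cyc))                     # positive
```

The trace of (I + αA∘A)^m accumulates weighted closed walks of every length up to m, so it exceeds m exactly when some cycle exists; this smoothness is what lets the constraint enter a gradient-based optimizer.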
4.4. Discrete Variables
In addition, our model naturally handles discrete variables by reformulating the reconstruction
loss term using the Cross-Entropy Loss (CEL) as follows:
L_CE = −Σᵢⱼ Xᵢⱼ log (P_X)ᵢⱼ        (9)

where P_X is the output of the decoder and X is the input data for the auto-encoder.
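A generic version of this cross-entropy term can be computed as follows; the mean over samples and the toy one-hot inputs are assumptions for illustration, since the paper's exact normalization is not shown in the source:

```python
import numpy as np

def cross_entropy(X_onehot, P, eps=1e-12):
    """Mean cross-entropy between one-hot targets and decoder probabilities."""
    return float(-np.mean(np.sum(X_onehot * np.log(P + eps), axis=1)))

X = np.array([[1., 0., 0.],
              [0., 1., 0.]])                 # one-hot encoded discrete inputs
P = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.7, 0.1]])              # decoder output probabilities
print(cross_entropy(X, P))                   # -(ln 0.8 + ln 0.7) / 2 ≈ 0.2899
```

The loss decreases as the decoder places more probability mass on the observed categories, mirroring the role of the squared-error loss in the continuous case.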
5. EXPERIMENTS
This section provides experimental results on the model's performance by comparing it against other
related approaches. In particular, our experiments try to identify the contribution of the
Wasserstein loss to causal structure learning through a direct comparison with DAG-GNN [12],
where a similar auto-encoder architecture is used without the Wasserstein loss.
Furthermore, we have also compared the results against DAG-NOTEARS [11] and DAG-NoCurl
[34]. All the comparisons are measured using the Structural Hamming Distance (SHD) [42].
More specifically, we measure the SHD between the learned causal graph and the ground truth
graph. Moreover, we also test the integrity of the generated data against CorGAN [14].
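SHD can be implemented in a few lines; the convention below (missing, extra and reversed edges each count as one error) is one common choice and may differ in detail from [42]:

```python
import numpy as np

def shd(B_true, B_est):
    """Structural Hamming Distance between binary adjacency matrices.
    Each node pair whose edge status differs counts as one error, so a
    missing, extra or reversed edge each contributes 1."""
    d = B_true.shape[0]
    errors = 0
    for i in range(d):
        for j in range(i + 1, d):
            if (B_true[i, j], B_true[j, i]) != (B_est[i, j], B_est[j, i]):
                errors += 1
    return errors

B_true = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [0, 0, 0]])               # 1 -> 2 -> 3
B_rev = np.array([[0, 0, 0],
                  [1, 0, 1],
                  [0, 0, 0]])                # edge 1 -> 2 reversed
print(shd(B_true, B_true), shd(B_true, B_rev))  # 0 1
```

Lower SHD means the learned graph is closer to the ground truth, which is why it is reported throughout Tables 1-4.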
The implementation was based on PyTorch [43]. In addition, we used learning rate schedulers
and Adam optimizers for both the discriminator and the auto-encoder, with a learning rate of 3e-3.
5.1. Continuous data
To evaluate the model with continuous data, our experiments tried to learn causal graphs from
synthetic data that were created with known causal graph structures and equations. To allow
comparisons, we employed the same underlying graphs and equations as those in the related
work, namely DAG-GNN [12], DAG-NoCurl [34] and DAG-NOTEARS [11].
More specifically, the data synthesis was performed in two steps: 1) generating the ground truth
causal graph and 2) generating samples from the graph based on the linear SEM of the ground
truth graph. In Step (1), we generated an Erdos-Renyi directed acyclic graph with an expected
node degree of 3. The DAG was represented in a weighted adjacency matrix A. In Step (2), a
sample X was generated based on the following equations. We used the linear SEM X = Aᵀx + z
for the linear case, and two non-linear variants, X = Aᵀh(x) + z (non-linear-1) and
X = 2sin(Aᵀ(x + 0.5)) + Aᵀcos(x + 0.5) + z (non-linear-2), for the non-linear case.
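Step (2) for the linear case can be sketched as follows; for simplicity this uses a random upper-triangular DAG rather than the paper's Erdos-Renyi graphs with expected degree 3:

```python
import numpy as np

rng = np.random.default_rng(7)
d, n = 5, 1000

# Random upper-triangular weights give a guaranteed DAG (simplified setup).
A = np.triu(rng.uniform(0.5, 2.0, size=(d, d)), k=1) * (rng.uniform(size=(d, d)) < 0.5)

# Linear SEM: x = A^T x + z  =>  X = Z (I - A)^(-1) in row form.
Z = rng.standard_normal((n, d))
X = Z @ np.linalg.inv(np.eye(d) - A)

print(X.shape)                               # (1000, 5)
```

Because A is strictly upper triangular, I − A is always invertible and the samples follow the SEM exactly (X − XA recovers the noise Z).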
The experiments were conducted with 5000 samples per graph. The graph sizes used in the
experiments were 10, 20, 50 and 100. We measured the SHD (averaged over five different
iterations of each model) between the output of a model and the ground truth, and the outcomes
were compared against those from the related models (i.e. those mentioned at the beginning of this
section). In addition to the mean SHD, confidence intervals were measured based on the variance
of the estimated means; these provide insight into the consistency of the models. Tables 1-3 show
the results on continuous data samples:
Table 1. Comparisons of DAG Structure Learning Outcomes between DAG-NOTEARS, DAG-NoCurl,
DAG-GNN and DAG-WGAN with Linear Data Samples

SHD (5000 linear samples)
Model          d = 10        d = 20       d = 50         d = 100
DAG-NOTEARS    8.4 ± 7.94    2.6 ± 1.84   25.2 ± 19.82   106.56 ± 56.51
DAG-NoCurl     7.9 ± 7.26    2.5 ± 1.93   24.6 ± 19.43   99.18 ± 55.27
DAG-GNN        6 ± 7.77      3.2 ± 1.6    21.4 ± 14.15   88.8 ± 47.63
DAG-WGAN       2.2 ± 4.4     2 ± 1.1      4.8 ± 4.26     28.20 ± 12.02
Table 2. Comparison of DAG Structure Learning Outcomes between DAG-NOTEARS, DAG-NoCurl,
DAG-GNN and DAG-WGAN with Non-Linear Data Samples 1

SHD (5000 non-linear-1 samples)
Model          d = 10        d = 20       d = 50         d = 100
DAG-NOTEARS    11.2 ± 4.79   19.3 ± 3.14  53.7 ± 11.39   105.47 ± 13.51
DAG-NoCurl     10.4 ± 4.42   17.4 ± 3.27  51.6 ± 11.43   105.7 ± 13.65
DAG-GNN        9.40 ± 0.8    15 ± 3.58    49.8 ± 7.03    104.8 ± 12.84
DAG-WGAN       9.8 ± 2.4     16 ± 5.4     40.40 ± 10.97  80.40 ± 9.09
Table 3. Comparison of DAG Structure Learning Outcomes between DAG-NOTEARS, DAG-NoCurl,
DAG-GNN and DAG-WGAN with Non-Linear Data Samples 2

SHD (5000 non-linear-2 samples)
Model          d = 10       d = 20       d = 50         d = 100
DAG-NOTEARS    9.8 ± 2.61   22.9 ± 2.14  38.3 ± 13.19   125.21 ± 61.19
DAG-NoCurl     7.4 ± 2.78   17.6 ± 2.25  33.6 ± 12.53   116.8 ± 62.3
DAG-GNN        2.6 ± 2.06   3.80 ± 1.94  13.8 ± 6.88    112.2 ± 59.05
DAG-WGAN       1 ± 1.1      3.4 ± 2.06   12.20 ± 7.81   20.20 ± 11.67
5.2. Benchmark discrete data
To evaluate the model with discrete data, we used the benchmark datasets available at the
Bayesian Network Repository (https://www.bnlearn.com/bnrepository/). The repository provides a
variety of datasets together with their ground truth graphs (Discrete Bayesian Networks, Gaussian
Bayesian Networks and Conditional Linear Gaussian Bayesian Networks) in different sizes
(Small Networks, Medium Networks, Large Networks, Very Large Networks and Massive
Networks). To test the scalability of our model, we used datasets of multiple sizes. The datasets
utilized in the experiment were Sachs, Alarm, Child, Hailfinder and Pathfinder. The SHD metric
was used to measure the performance. Table 4 contains the results from the experiment.
Table 4. Comparison of DAG Structure Learning Outcomes between DAG-WGAN and DAG-GNN with
Discrete Data Samples

Dataset      Nodes   SHD (DAG-WGAN)   SHD (DAG-GNN)
Sachs        11      17               25
Child        20      20               30
Alarm        37      36               55
Hailfinder   56      73               71
Pathfinder   109     196              218
5.3. Data integrity
DAG-WGAN was also evaluated by comparing its data generation capabilities against other
models. More specifically, we compare the data generation capabilities of the models on a
'dimension-wise probability' basis by measuring how well these models learn the real data
distributions per dimension. We used the MIMIC-III dataset [44] in the experiments as the same
dataset was also used in other comparable works. The data is presented in the form of a patient
record, where each record has a fixed size of 1071 entries. We use the same data and
pre-processing steps in order to ensure a fair comparison between all models.
Figure 2 depicts the results of the experiment. We have only compared with CorGAN [14] as it
outperforms other similar models such as medGAN [45] and DBM [46] - see [14], where
results of the other models are available. We present the results in a scatter plot, where each point
represents one of the 1071 entries and the x and y axes represent the success rate for real and
synthetic data respectively. In addition, we use a diagonal line to mark the ideal scenario.
Figure 2. Data generation test results
6. DISCUSSION
We discuss the results of both continuous and discrete datasets. The limitations of the model are
also addressed together with the plans for future work.
The results on the continuous datasets are competitive across all three cases (linear, non-linear-1
and non-linear-2). According to Tables 1, 2 and 3, our model dominates DAG-NOTEARS and
DAG-NoCurl, producing better results throughout all experiments and outperforming DAG-GNN
in most of the cases.
For small-scale datasets (e.g. d=10 and d=20), our model performs better than DAG-GNN in
most cases and is superior to DAG-NOTEARS and DAG-NoCurl in all the experiments.
For large-scale datasets (e.g. d = 50 and d = 100), our model outperforms all the other models
used in the study by a significantly large margin, which implies that it scales better - a
significant advantage.
Notably, our experiments have mainly focused on the comparisons with DAG-GNN, as we aim
to identify contributions from the Wasserstein loss to causal structure learning. In DAG-GNN, a
very similar auto-encoder architecture was employed and DAG-WGAN has added WGAN as an
additional component. Hence the comparison is meaningful in order to identify contributions
from the Wasserstein metric.
For the discrete case, the results from the comparison between the two models are competitive.
Out of the five experiments conducted during the study, four results were clearly in favour of our
model (i.e. DAG-WGAN), and in one case DAG-GNN was slightly better. These are encouraging
results compared to the state-of-the-art DAG-GNN model.
Also, according to the results illustrated in Figure 2, the data generated by DAG-WGAN is,
dimension-wise, more accurate and of higher quality than that generated by CorGAN, medGAN or
DBM.
These results show that DAG-WGAN can handle both continuous and discrete data effectively, and
they also demonstrate the quality of the generated data. As the improvement was achieved by
adding the Wasserstein loss to the auto-encoder architecture, the comparisons support the
hypothesis that the Wasserstein metric contributes to causal structure learning.
However, the discrepancy which occurred in the synthetic continuous data results provides us
with an insight into the limitations of the model. Some of our early analysis shows that further
improvement can be achieved by generalizing the current auto-encoder architecture. Furthermore,
as it stands, our model does not handle vector or mixed-type data. These aspects will
be further experimented with and reported in our future work.
The capability of generating latent representations places generative models in a good position
to address the hidden-confounder challenge in causality learning - some earlier works [47-48]
have moved in this direction. In future work, we will investigate whether DAG-WGAN can
contribute to this by incorporating the "Information Maximization" concept [35-37] using the
Maximum Mean Discrepancy (MMD) loss term [17]. Adding the MMD loss term to the auto-encoder loss
would allow for more meaningful disentangled latent representations, potentially increasing
accuracy. Last but not least, the latest work on DAG-NoCurl [34] shows that speed can be improved
by avoiding explicit DAG constraints. We will investigate how this development can be adapted to
DAG-WGAN to improve its overall performance.
7. CONCLUSION
In this work, we investigate the contribution of the Wasserstein distance metric to causal
structure learning from tabular data. We test the hypothesis with evidence that the Wasserstein
metric can simultaneously improve structure learning while generating more realistic data
samples. This leads to the development of a novel approach to DAG structure learning, which we
have named DAG-WGAN. The effectiveness of our model has been demonstrated in a series of tests,
which show the inclusion of the Wasserstein metric can indeed improve the outcomes of causal
structure learning. According to our results, the Wasserstein metric allows our model to generate
less noisy and more realistic output while being easier to train. The increase in quality of the
synthesized data in turn leads to an improvement in structure learning.
REFERENCES
[1] Spirtes, P., Glymour, C., Scheines, R.: Computation, Causation, and Discovery. AAAI Press,
Cambridge, Massachusetts (1985)
[2] Ott, S., Imoto, S., Miyano, S.: Finding optimal models for small gene networks. Paper presented at
Pacific symposium on Biocomputing (2004)
[3] Sachs, K., Perez, O., Pe'er, D., Lauffenburger, D.A., Nolan, G.P.: Causal protein-signaling
networks derived from multiparameter single-cell data. Science 308(5721), 523–529 (2005)
[4] Chickering, D.M., Heckerman, D., Meek, C.: Large sample learning of Bayesian networks is NP-hard.
Journal of Machine Learning Research 5, 1287–1330 (2004)
[5] Kuipers, J., Moffa, G., Heckerman, D.: Addendum on the scoring of Gaussian directed acyclic
graphical models. The Annals of Statistics 42(4), 1689–1691 (2014)
[6] Heckerman, D., Geiger, D., Chickering, D.M.: Learning Bayesian networks: The combination of
knowledge and statistical data. Machine Learning 20, 197–243 (1995)
[7] Bouckaert, R.: Probabilistic network construction using the minimum description length principle.
Paper presented at European conference on symbolic and quantitative approaches to reasoning and
uncertainty (1993)
[8] Pearl, J.: Causality: models, reasoning, and inference. Econometric Theory 19(46), 675–685 (2003)
[9] Zhang, J.: On the completeness of orientation rules for causal discovery in the presence of latent
confounders and selection bias. Artificial Intelligence 172(16-17), 1873–1896 (2008)
[10] Tsamardinos, I., Brown, L.E., Aliferis, C.F.: The max-min hill-climbing Bayesian network structure
learning algorithm. Machine Learning 65(1), 31–78 (2006)
[11] Zheng, X., Aragam, B., Ravikumar, P., Xing, E.P.: DAGs with NOTEARS: Continuous Optimization
for Structure Learning. Paper presented at the 32nd Conference on Neural Information Processing
Systems, Montréal, Canada (2018)
[12] Yu, Y., Chen, J.J., Gao, T., Yu, M.: DAG-GNN: DAG Structure Learning with Graph Neural
Networks. Paper presented at the 36th International Conference on Machine Learning,
Long Beach, California, PMLR 97 (2019)
[13] Lachapelle, S., Brouillard, P., Deleu, T., Lacoste-Julien, S.: Gradient-Based Neural DAG Learning.
Paper presented at ICLR 2020 (2020)
[14] Torfi, A., Fox, E.A.: CORGAN: Correlation-Capturing Convolutional Neural Networks for
Generating Synthetic Healthcare Records. Paper presented at The Thirty-Third International FLAIRS
Conference (FLAIRS-33) (2020)
[15] Rajabi, A., Garibay, Ozlem O.: Tabfairgan: Fair tabular data generation with generative adversarial
networks. Preprint at https://arxiv.org/abs/2109.00666 (2021)
[16] Gao, Y., Shen, L., Xia, S.-T.: DAG-GAN: Causal Structure Learning with Generative Adversarial
Nets. Paper presented at IEEE International Conference on Acoustics, Speech and Signal Processing
(ICASSP) (2021)
[17] Tolstikhin, I.O., Sriperumbudur, B.K., Schölkopf, B.: Minimax estimation of maximum mean
discrepancy with radial kernels. Paper presented at 30th Conference on Neural Information
Processing Systems (2016)
[18] Kantorovich, L.V.: Mathematical methods of organizing and planning production. Management
Science 6(4), 366–422 (1960)
[19] Cuturi, M.: Sinkhorn Distances: Lightspeed Computation of Optimal Transport. Paper presented at
Advances in Neural Information Processing Systems 26 (2013)
[20] Jacob, L., She, J., Almahairi, A., Rajeswar, S., Courville, A.C.: W2GAN: Recovering an Optimal
Transport Map with a GAN. Paper presented at ICLR 2018 (2018)
[21] Genevay, A., Peyré, G., Cuturi, M.: Learning Generative Models with Sinkhorn Divergences. Paper
presented at 21st International Conference on Artificial Intelligence and Statistics (2018)
[22] Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein GAN. Paper presented at Proceedings of the 34th
International Conference on Machine Learning (2017)
[23] Wiatrak, M., Albrecht, S.V., Nystrom, A.: Stabilizing Generative Adversarial Networks: A Survey.
Preprint at https://arxiv.org/abs/1910.00927 (2019)
[24] Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan
Kaufmann Publishers, Los Angeles, California (1988)
[25] Spirtes, P., Glymour, C.: An algorithm for fast recovery of sparse causal graphs. Social Science
Computer Review 9(1), 62–72 (1991)
International Journal of Artificial Intelligence and Applications (IJAIA), Vol.13, No.3, May 2022
[26] Spirtes, P., Meek, C., Richardson, T.: Causal Inference in the Presence of Latent Variables and
Selection Bias. Paper presented at UAI’95: Proceedings of the Eleventh conference on Uncertainty in
artificial intelligence (1995)
[27] Chickering, D.M., Heckerman, D.: Efficient approximations for the marginal likelihood of Bayesian
networks with hidden variables. Machine Learning 29(2-3), 181–212 (1997)
[28] Nie, S., Maua, D., de Campos, C., Ji, Q.: Advances in Learning Bayesian Networks of Bounded
Treewidth. Paper presented at Advances in Neural Information Processing Systems 27 (2014)
[29] Chow, C., Liu, C.: Approximating discrete probability distributions with dependence trees. IEEE
transactions on Information Theory 14(3), 462–467 (1968)
[30] He, R., Tian, J., Wu, H.: Structure learning in Bayesian networks of a moderate size by efficient
sampling. Journal of Machine Learning Research 17(1), 3483–3536 (2016)
[31] Madigan, D., York, J., Allard, D.: Bayesian graphical models for discrete data. International
Statistical Review / Revue Internationale de Statistique 63(2), 215–232 (1995)
[32] Friedman, N., Koller, D.: Being Bayesian about network structure. a Bayesian approach to structure
discovery in Bayesian networks. Machine Learning 50(1-2), 95–125 (2003)
[33] Fast, A., Jensen, D.: Constraint Relaxation for Learning the Structure of Bayesian Networks. Paper
presented at University of Massachusetts Amherst (2009)
[34] Yu, Y., Gao, T., Yin, N., Ji, Q.: DAGs with No Curl: An Efficient DAG Structure Learning
Approach. Paper presented at Proceedings of the 38th International Conference on Machine Learning
(2021)
[35] Zhao, S., Song, J., Ermon, S.: InfoVAE: Information maximizing variational autoencoders. Preprint at
https://arxiv.org/abs/1706.02262 (2017)
[36] Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., Abbeel, P.: InfoGAN: Interpretable
representation learning by information maximizing generative adversarial nets. Paper presented at
30th Conference on Neural Information Processing Systems (2016)
[37] Ye, F., Bors, A.: InfoVAEGAN: Learning joint interpretable representations by information
maximization and maximum likelihood. Paper presented at 28th International Conference on Image
Processing (2021)
[38] Larsen, A.B.L., Sønderby, S.K., Larochelle, H., Winther, O.: Auto-encoding beyond Pixels Using a
Learned Similarity Metric. Paper presented at Proceedings of the 33rd International Conference on
International Conference on Machine Learning (2016)
[39] Cheng, A.: PAC-GAN: Packet Generation of Network Traffic using Generative Adversarial
Networks. Paper presented at IEEE 10th Annual Information Technology, Electronics and Mobile
Communication Conference (IEMCON) (2019)
[40] Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge, New York
(1985)
[41] Bertsekas, D.: Nonlinear Programming. Athena Scientific, 2nd edition (1999)
[42] de Jongh, M., Druzdzel, M.J.: A comparison of structural distance measures for causal Bayesian
network models. Recent Advances in Intelligent Information Systems, 443–456 (2009)
[43] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein,
N., Antiga, L., Desmaison, A., Köpf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy,
S., Steiner, B., Fang, L., Bai, J., Chintala, S.: PyTorch: An Imperative Style, High-Performance Deep
Learning Library. Paper presented at 33rd Conference on Neural Information Processing Systems,
Vancouver, Canada (2019)
[44] Johnson, A.E.W., Pollard, T.J., Shen, L., Lehman, L.-w.H., Feng, M., Ghassemi, M.M., Moody, B.,
Szolovits, P., Celi, L.A., Mark, R.G.: MIMIC-III, a freely accessible critical care database.
https://doi.org/10.1038/sdata.2016.35 (2016)
[45] Choi, E., Biswal, S., Malin, B.A., Duke, J.D., Stewart, W.F., Sun, J.: Generating Multi-Label Discrete
Patient Records using Generative Adversarial Networks. Paper presented at Proceedings of Machine
Learning for Healthcare 2017 (2017)
[46] Salakhutdinov, R., Hinton, G.E.: Replicated Softmax: An Undirected Topic Model. Paper presented
at Advances in Neural Information Processing Systems 22 (2009)
[47] Louizos, C., Shalit, U., Mooij, J.M., Sontag, D.A., Zemel, R.S., Welling, M.: Causal Effect Inference
with Deep Latent-Variable Models. Preprint at https://arxiv.org/abs/1705.08821 (2017)
[48] Wang, Y., Blei, D.M.: The blessings of multiple causes. Journal of the American Statistical
Association 114(528), 1574–1596 (2019)
AUTHORS
Hristo Petkov received the B.Sc. (First Class) degree from the Department of Computer and
Information Sciences, University of Strathclyde. He is currently pursuing a doctoral degree
in the same department. His research interests include medicine, healthcare, deep learning,
and neural network models.
Colin Hanley received the M.Sc. degree from the Department of Management Science,
University of Strathclyde. He is currently pursuing a career as a Full Stack Data
Developer. His interests include quantitative analysis, human decision making, and data
analytics.
Feng Dong is a Professor of Computer Science at the University of Strathclyde. He was
awarded a PhD by Zhejiang University, China. His recent research has addressed human-
centric AI to support knowledge discovery, visual data analytics, image analysis, and pattern
recognition.