A Bayesian network is a powerful mathematical tool for prediction and diagnosis applications. A large Bayesian network can be composed of many simple networks, which in turn are constructed from simple graphs. A simple graph consists of one child node and many parent nodes. The strength of each relationship between the child node and a parent node is quantified by a weight, and all relationships share the same semantics, such as prerequisite, diagnostic, or aggregation. This research focuses on converting graphic relationships into conditional probabilities in order to construct a simple Bayesian network from a graph. The diagnostic relationship is the main research object, for which a sufficient diagnostic proposition is proposed to validate diagnostic relationships. Relationship conversion adheres to logic gates such as AND, OR, and XOR, which is an essential feature of the research.
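As a hedged illustration of one such conversion, the sketch below builds a child's conditional probability table from parent weights using a noisy-OR style gate; reading each weight as the probability that the parent alone activates the child is an assumption for illustration, not the paper's exact formulas.

```python
from itertools import product

def noisy_or_cpt(weights):
    """Build P(child = 1 | parent configuration) for binary parents,
    reading weight w_i as the probability that parent i alone
    activates the child (noisy-OR gate)."""
    cpt = {}
    for config in product([0, 1], repeat=len(weights)):
        p_off = 1.0
        for w, x in zip(weights, config):
            if x == 1:
                p_off *= 1.0 - w   # active parent i fails to fire the child
        cpt[config] = 1.0 - p_off
    return cpt

# Two parents with diagnostic weights 0.8 and 0.5:
print(noisy_or_cpt([0.8, 0.5]))
# {(0, 0): 0.0, (0, 1): 0.5, (1, 0): 0.8, (1, 1): 0.9}
```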
Bayesian network structure estimation based on the Bayesian/MDL criteria when both discrete and continuous variables are present – Joe Suzuki
J. Suzuki, "Bayesian network structure estimation based on the Bayesian/MDL criteria when both discrete and continuous variables are present," IEEE Data Compression Conference, pp. 307-316, Snowbird, Utah, April 2012.
An enhanced fuzzy rough set based clustering algorithm for categorical data – eSAT Journals
Abstract: In today's world everything is done digitally, so we have lots of raw data. These data are useful for predicting future events if we use them properly. Clustering is a technique that puts closely related data together. Furthermore, data comes in several types: sequential, interval, categorical, etc. In this paper we show the problem with clustering categorical data using rough sets and how we can overcome it with an improvement.
An enhanced fuzzy rough set based clustering algorithm for categorical data – eSAT Publishing House
This document summarizes a research paper that proposes an enhanced fuzzy rough set-based clustering algorithm for categorical data. The paper discusses problems with using traditional rough set theory to cluster categorical data when there are no crisp relations between attributes. It proposes using fuzzy logic to assign weights to attribute values and calculate lower approximations based on the similarity between sets, in order to cluster categorical data when crisp relations do not exist. The proposed method is described through an example comparing traditional rough set clustering to the new fuzzy rough set approach.
1) The document discusses the limiting behavior of the "probability of claiming superiority" (PST) in Bayesian clinical trials as the sample size increases.
2) The main result is that under certain conditions, the PST (also called average power) converges to the prior probability that the alternative hypothesis is true.
3) The two key assumptions for this limiting result are: 1) the posterior distribution is "π-consistent", and 2) the prior probability of the boundary of the alternative hypothesis set is zero.
1) Variational inference is a technique for approximating intractable probabilities using a tractable variational distribution. It works by choosing a variational distribution from a family of simpler distributions and minimizing the divergence between it and the true distribution.
2) Specifically, it maximizes a lower bound on the log probability of the data by minimizing the Kullback-Leibler divergence between an approximate conditional distribution and the true posterior.
3) Mean-field variational inference assumes independence between variables, allowing optimization of one variable at a time in closed form. This iterative process converges to a local optimum of the evidence lower bound. (A numerical check of the log-evidence = ELBO + KL identity follows this list.)
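To make items 1-2 concrete, here is a minimal numerical check (a toy discrete model of my own construction, not from the summarized document) that log p(x) = ELBO + KL(q ‖ p(z|x)), so maximizing the ELBO over q is exactly minimizing the KL divergence:

```python
import numpy as np

# Toy discrete model: latent z in {0, 1}, one fixed observation x.
p_z = np.array([0.4, 0.6])          # prior p(z)
p_x_given_z = np.array([0.9, 0.2])  # likelihood p(x | z) at the observed x

p_joint = p_z * p_x_given_z          # p(x, z)
log_px = np.log(p_joint.sum())       # log evidence log p(x)
posterior = p_joint / p_joint.sum()  # true posterior p(z | x)

q = np.array([0.7, 0.3])             # a variational distribution over z
elbo = np.sum(q * (np.log(p_joint) - np.log(q)))  # E_q[log p(x,z) - log q(z)]
kl = np.sum(q * (np.log(q) - np.log(posterior)))  # KL(q || p(z|x))

# The identity log p(x) = ELBO + KL holds exactly:
assert np.isclose(log_px, elbo + kl)
print(log_px, elbo, kl)
```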
Simplicial closure and higher-order link prediction (SIAMNS18) – Austin Benson
The document discusses higher-order link prediction, which aims to predict the formation of new groups or "simplices" containing more than two nodes, based on structural properties in timestamped simplex data from various domains. It finds that predicting the closure of open triangles (where a pair of nodes have interacted but not with the third) performs well, and that simply averaging the edge weights in a triangle is often a good predictor. Predicting new structures in communication, collaboration and proximity networks can provide insights beyond classical link prediction.
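A minimal sketch of the mean-edge-weight heuristic the summary mentions, on a small hypothetical weighted graph; candidate triangles are scored by the average weight of their three edges as a closure predictor:

```python
from itertools import combinations

# Weighted edges from pairwise interactions (weight = interaction count).
w = {('a', 'b'): 5, ('b', 'c'): 3, ('a', 'c'): 1, ('c', 'd'): 4}
edge = lambda u, v: w.get((u, v)) or w.get((v, u))

nodes = {u for e in w for u in e}
# Score each triple whose three edges all exist; a higher mean edge
# weight suggests the triple is more likely to close into a simplex.
for u, v, t in combinations(sorted(nodes), 3):
    ws = [edge(u, v), edge(v, t), edge(u, t)]
    if all(x is not None for x in ws):
        print((u, v, t), sum(ws) / 3)
```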
The document describes q-Bernstein polynomials, which generalize Bernstein polynomials using q-integers. It proposes using q-Bernstein polynomials to estimate distribution and density functions nonparametrically from sample data. Specifically, it defines q-Bernstein distribution and density estimators, which reduce to standard Bernstein estimators when q=1. The paper studies the asymptotic properties of the estimators and performs numerical studies using cross-validation to select the smoothing parameter q. Figures show the estimators can provide a smooth approximation to distribution and density functions.
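For orientation, a sketch of the standard Bernstein CDF estimator, which the summary identifies as the q = 1 special case; the paper's q-version (not reproduced here) would replace the binomial weights with q-analogues:

```python
import numpy as np
from scipy.stats import binom

def bernstein_cdf(sample, m=20):
    """Smooth Bernstein estimator of a CDF on [0, 1] (the q = 1 case):
    F_m(x) = sum_k F_n(k/m) * C(m, k) x^k (1 - x)^(m - k),
    where F_n is the empirical CDF of the sample."""
    sample = np.sort(np.asarray(sample))
    Fn = lambda t: np.searchsorted(sample, t, side='right') / len(sample)
    ecdf_at_knots = np.array([Fn(k / m) for k in range(m + 1)])
    def F(x):
        weights = binom.pmf(np.arange(m + 1), m, x)   # Bernstein basis at x
        return float(weights @ ecdf_at_knots)
    return F

F = bernstein_cdf(np.random.default_rng(0).uniform(size=200))
print(F(0.5))   # smooth estimate near 0.5 for Uniform(0, 1) data
```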
This document discusses different approaches for modeling inconsistent expert judgments and making decisions when probabilities are inconsistent. It begins by introducing the concept of inconsistent beliefs that cannot be represented by a single joint probability distribution. It then reviews three approaches: the Bayesian model, which applies Bayes' theorem but depends on the choice of prior and expert likelihood functions; the quantum model, which represents inconsistencies using context-dependent quantum states but does not indicate a best decision; and the signed probability model, which relaxes the axioms of probability to allow for negative probabilities and aims to find a signed probability distribution with minimum total variation from unity.
This document presents a mathematical model for finding the bivariate normal distribution of luteinizing hormone (LH) and progesterone in normal women. It begins by deriving the properties of the bivariate normal distribution when two random variables are linearly related. It then applies this to model LH and progesterone levels, showing their joint probability density function follows a bivariate normal distribution. Contours of this distribution are ellipses. The model is applied to help understand normal corpus luteum formation and hormone secretion patterns in the luteal phase of the menstrual cycle.
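For reference, the bivariate normal density whose elliptical contours the summary mentions, written out in LaTeX (the standard formula, not copied from the paper):

```latex
% Bivariate normal density with means \mu_X, \mu_Y, standard deviations
% \sigma_X, \sigma_Y, and correlation \rho; its level sets are ellipses.
f(x,y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}}
  \exp\!\left(-\frac{1}{2(1-\rho^2)}\left[
    \frac{(x-\mu_X)^2}{\sigma_X^2}
    - \frac{2\rho\,(x-\mu_X)(y-\mu_Y)}{\sigma_X\sigma_Y}
    + \frac{(y-\mu_Y)^2}{\sigma_Y^2}\right]\right)
```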
This document provides an overview of probabilistic reasoning and uncertainty in knowledge representation. It discusses:
1) Using probability theory to represent uncertainty quantitatively rather than logical rules with certainty factors.
2) Key concepts in probability theory including random variables, probability distributions, joint probabilities, marginal probabilities, and conditional probabilities.
3) Representing a problem domain as a probabilistic model with a sample space of possible variable states.
4) Independence of variables allowing simpler computation of probabilities.
5) The document is an introduction to probabilistic reasoning concepts to be covered in more detail later, including Bayesian networks. (A small joint-table example follows this list.)
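A small worked example of items 2-4 on a two-variable joint table (illustrative numbers of my own):

```python
import numpy as np

# Joint distribution P(Weather, Traffic) as a table:
# rows = Weather in {sun, rain}; cols = Traffic in {light, heavy}.
joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])

p_weather = joint.sum(axis=1)                    # marginal P(Weather)
p_traffic = joint.sum(axis=0)                    # marginal P(Traffic)
p_traffic_given_rain = joint[1] / p_weather[1]   # conditional P(Traffic | rain)
print(p_weather, p_traffic, p_traffic_given_rain)

# Independence check: does the joint equal the outer product of marginals?
print(np.allclose(joint, np.outer(p_weather, p_traffic)))  # False here
```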
3 Entropy, Relative Entropy and Mutual Information.pdf – CharlesAsiedu4
The document discusses key information theory concepts such as entropy, mutual information, and relative entropy. It introduces entropy as a measure of uncertainty or information content for a random variable based on its probability distribution. Mutual information measures the amount of information one random variable conveys about another, and is related to the concepts of entropy, joint entropy, and conditional entropy. Various properties of these information measures are also examined, including how they are applied in analyzing communication channels.
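A small numerical illustration of those relations on a toy joint distribution, using I(X;Y) = H(X) + H(Y) - H(X,Y) and H(Y|X) = H(X,Y) - H(X):

```python
import numpy as np

def H(p):
    """Shannon entropy in bits of a distribution given as an array."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

joint = np.array([[0.30, 0.20],    # joint P(X, Y)
                  [0.10, 0.40]])
Hx, Hy = H(joint.sum(axis=1)), H(joint.sum(axis=0))
Hxy = H(joint.ravel())
mutual_info = Hx + Hy - Hxy        # I(X;Y)
cond = Hxy - Hx                    # H(Y|X)
print(Hx, Hy, Hxy, mutual_info, cond)
```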
The document discusses various methods for constructing confidence intervals for estimating multinomial proportions. It aims to analyze the propensity for aberrations (i.e., unrealistic bounds such as negative values) in the interval estimates across different classical and Bayesian methods. Specifically, it provides the mathematical conditions under which each method may produce aberrant interval limits, such as zero-width intervals or bounds falling below 0 or above 1, especially for small sample counts. The document also develops an R program to facilitate computational implementation of the various methods for applied analysis of multinomial data.
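As a hedged illustration of such aberrations, here is the simple Bonferroni-adjusted Wald interval (one of many classical methods, not necessarily the paper's exact construction) producing bounds outside [0, 1] for small cell counts:

```python
import numpy as np
from scipy.stats import norm

counts = np.array([1, 2, 47])          # small cell counts invite aberrations
n, k = counts.sum(), len(counts)
p_hat = counts / n
z = norm.ppf(1 - 0.05 / (2 * k))       # Bonferroni-adjusted Wald critical value

half = z * np.sqrt(p_hat * (1 - p_hat) / n)
for c, p, h in zip(counts, p_hat, half):
    lo, hi = p - h, p + h
    flag = "  <-- aberrant (bound outside [0, 1])" if lo < 0 or hi > 1 else ""
    print(f"count={c:2d}  [{lo:+.3f}, {hi:.3f}]{flag}")
```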
This document presents an asymptotic expansion of the posterior density in high dimensional generalized linear models. The main results are:
1) The authors prove a third order correct asymptotic expansion of the posterior density for generalized linear models with canonical link functions when the number of regressors grows with sample size.
2) This asymptotic expansion is then used to derive moment matching priors in the generalized linear model setting.
3) The expansion assumes the number of regressors grows such that p_n^(6+ε)/n → 0 as n → ∞ for some small ε > 0, which is stronger than prior work requiring only p_n^4·log(p_n)/n → 0.
This document discusses univariate and multivariate regressions. It explains how to measure the accuracy of regression coefficient estimates using standard errors, confidence intervals, and hypothesis testing. It then covers multiple linear regression, using methods like forward, backward, and mixed selection to identify important predictor variables. The document concludes by discussing model fitting metrics like RSE and R², and factors to consider when making predictions with regression models.
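A minimal sketch of the two fit metrics on synthetic data, with RSE = sqrt(RSS/(n - p - 1)) and R² = 1 - RSS/TSS:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, 0.0, -1.0]) + rng.normal(scale=0.5, size=100)

Xd = np.column_stack([np.ones(len(X)), X])        # add intercept column
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
resid = y - Xd @ beta

n, p = X.shape
rss = float(resid @ resid)
tss = float(((y - y.mean()) ** 2).sum())
rse = np.sqrt(rss / (n - p - 1))                  # residual standard error
r2 = 1 - rss / tss                                # fraction of variance explained
print(beta, rse, r2)
```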
This document provides an overview of Bayesian networks through a 3-day tutorial. Day 1 introduces Bayesian networks and provides a medical diagnosis example. It defines key concepts like Bayes' theorem and influence diagrams. Day 2 covers propagation algorithms, demonstrating how evidence is propagated through a sample chain network. Day 3 will cover learning from data and using continuous variables and software. The overview outlines propagation algorithms for singly and multiply connected graphs.
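For a singly connected chain, forward propagation of evidence reduces to matrix-vector products; a toy sketch with invented numbers (not the tutorial's example):

```python
import numpy as np

# Chain A -> B -> C with binary nodes; rows index the parent's state.
P_B_given_A = np.array([[0.9, 0.1],
                        [0.3, 0.7]])
P_C_given_B = np.array([[0.8, 0.2],
                        [0.4, 0.6]])

# Propagating evidence A = 1 down the chain is a matrix product:
pi_A = np.array([0.0, 1.0])           # evidence: A observed in state 1
pi_B = pi_A @ P_B_given_A             # P(B | A = 1)
pi_C = pi_B @ P_C_given_B             # P(C | A = 1)
print(pi_B, pi_C)                     # [0.3 0.7] [0.52 0.48]
```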
REPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELS – cscpconf
Uncertainty is pervasive in real-world environments. It arises from vagueness, which is associated with the difficulty of making sharp distinctions, and from ambiguity, which is associated with situations in which the choice among several precise alternatives cannot be perfectly resolved. Analysis of large collections of uncertain data is a primary task in real-world applications because data is incomplete and inaccurate. Uncertain data can be represented in various forms, such as data stream models, linkage models, and graphical models, which are simple, natural ways to process the data and produce optimized results through query processing. In this paper, we propose that the uncertain data model can be represented as a possibilistic data model, and vice versa, for processing uncertain data with various data models such as the possibilistic linkage model, data streams, and possibilistic graphs. This paper presents the representation and processing of the possibilistic linkage model through possible worlds with the use of a product-based operator.
Min-based qualitative possibilistic networks are effective tools for a compact representation of decision problems under uncertainty. Exact approaches for computing decisions based on possibilistic networks are limited by the size of the possibility distributions. Generally, these approaches are based on possibilistic propagation algorithms. An important step in the computation of the decision is the transformation of the DAG (directed acyclic graph) into a secondary structure known as a junction tree (JT). This transformation is known to be costly and represents a difficult problem. We propose in this paper a new approximate approach for the computation of decisions under uncertainty within possibilistic networks. Computing the optimal optimistic decision no longer goes through the junction-tree construction step. Instead, it is performed by calculating the degree of normalization in the moral graph resulting from merging the possibilistic network codifying the agent's knowledge with the one codifying its preferences. (A toy min-fusion illustration follows.)
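A toy illustration (my own, heavily simplified; not the authors' moral-graph algorithm) of min-based fusion and the degree of normalization it mentions:

```python
import numpy as np

# Min-based combination of two possibility distributions over the same
# worlds: knowledge pi_K and preferences pi_P, fused pointwise by min.
pi_K = np.array([1.0, 0.8, 0.3, 0.0])   # possibility of each world
pi_P = np.array([0.2, 1.0, 0.9, 0.6])   # preference degree of each world

combined = np.minimum(pi_K, pi_P)
h = combined.max()    # degree of normalization = height of the fused distribution
print(combined, h)    # h measures how well the best world satisfies both
```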
The document describes methods for jointly estimating graphical models across multiple classes of heterogeneous data. It presents the problem of estimating a single graphical model for heterogeneous data, which can mask underlying differences, or estimating separate models for each class, which ignores common structures. Three algorithms are proposed to jointly estimate graphical models for different classes while preserving common and unique structures. The methods are motivated by applications to gene expression data from multiple tissue types.
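As a point of contrast, the separate-per-class baseline the summary describes can be sketched with off-the-shelf graphical lasso; the paper's joint estimators (not shown) add penalties tying the classes together:

```python
import numpy as np
from sklearn.covariance import GraphicalLassoCV

# Separate-model baseline: one sparse precision matrix per class
# (synthetic stand-in data; real use would pass expression matrices).
rng = np.random.default_rng(8)
classes = {"tissue_A": rng.normal(size=(120, 5)),
           "tissue_B": rng.normal(size=(150, 5))}
for name, data in classes.items():
    model = GraphicalLassoCV().fit(data)
    print(name, np.round(model.precision_, 2))  # zeros = missing edges
```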
A PRACTICAL POWERFUL ROBUST AND INTERPRETABLE FAMILY OF CORRELATION COEFFICIE... – Savas Papadopoulos, Ph.D.
If we held a competition for the most valuable statistical quantity in exploratory data analysis, the winner would most likely be the correlation coefficient, by a wide margin over its nearest competitor. In addition, most data applications contain non-normal data with outliers that cannot be transformed to normality. We therefore search for correlation coefficients robust to non-normality and/or outliers that can be applied across applications and detect influenced or hidden correlations not recognized by the most popular correlation coefficients. We introduce a correlation-coefficient family with the Pearson and Spearman coefficients as special cases. Other family members provide desirably lower p-values than the standard coefficients in the problems above. The proposed family of coefficients, their cut-off points, and their p-values, computed by permutation tests, can be applied by any scientist analyzing data. We share simulations, code, and real data by email or on the internet.
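A sketch of permutation-test p-values for the two named special cases, Pearson and Spearman (the proposed family itself is not specified in this summary, so it is not reproduced):

```python
import numpy as np
from scipy.stats import rankdata

def perm_pvalue(x, y, stat, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a correlation statistic."""
    rng = np.random.default_rng(seed)
    observed = stat(x, y)
    hits = sum(abs(stat(x, rng.permutation(y))) >= abs(observed)
               for _ in range(n_perm))
    return observed, (hits + 1) / (n_perm + 1)

pearson = lambda x, y: np.corrcoef(x, y)[0, 1]
spearman = lambda x, y: pearson(rankdata(x), rankdata(y))

rng = np.random.default_rng(2)
x = rng.normal(size=40)
y = 0.5 * x + rng.standard_t(df=2, size=40)   # heavy-tailed noise, outliers
print(perm_pvalue(x, y, pearson))    # outliers can wash out the signal
print(perm_pvalue(x, y, spearman))   # ranks are more resistant
```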
1) The document discusses non-parametric estimation of probability density functions (pdfs) and gender classification using artificial neural networks (ANNs).
2) It describes limitations of traditional non-parametric methods like Parzen windows and k-nearest neighbors, and how ANNs can help address these issues.
3) ANNs are presented as statistical estimators well suited to pattern recognition and classification problems involving noisy data, providing alternatives to traditional "discrete" classifiers. (A minimal Parzen-window sketch follows this list.)
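The Parzen-window sketch referenced in item 3, with a Gaussian kernel and an illustrative bandwidth:

```python
import numpy as np

def parzen_density(sample, grid, h):
    """Parzen-window (kernel) density estimate with a Gaussian kernel:
    f(x) = (1/n) * sum_i K_h(x - x_i)."""
    sample = np.asarray(sample)[None, :]   # shape (1, n)
    grid = np.asarray(grid)[:, None]       # shape (m, 1)
    K = np.exp(-0.5 * ((grid - sample) / h) ** 2) / (h * np.sqrt(2 * np.pi))
    return K.mean(axis=1)

data = np.random.default_rng(3).normal(size=300)
xs = np.linspace(-4, 4, 9)
print(parzen_density(data, xs, h=0.4))   # larger h = smoother, more bias
```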
The document discusses probabilistic information retrieval and Bayesian networks for modeling document collections and queries. It introduces concepts like conditional probability, Bayes' theorem, and the probability ranking principle. It then describes the binary independence retrieval model and how Bayesian networks can model dependencies between document terms and concepts to improve upon the independence assumption. The use of Bayesian networks allows modeling both documents and queries as networks to estimate the probability that a document satisfies an information need.
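A hedged sketch of binary-independence-model scoring with invented term statistics; documents are ranked by summed log odds of term occurrence under relevance versus non-relevance:

```python
import math

# Binary independence model: RSV(d) = sum over query terms present in d of
# log[p_t(1 - u_t) / (u_t(1 - p_t))], with p_t = P(term | relevant) and
# u_t = P(term | non-relevant); the estimates below are assumptions.
weights = {"bayes": (0.8, 0.2), "network": (0.6, 0.3), "music": (0.1, 0.3)}

def rsv(doc_terms):
    return sum(math.log(p * (1 - u) / (u * (1 - p)))
               for t, (p, u) in weights.items() if t in doc_terms)

docs = {"d1": {"bayes", "network"}, "d2": {"bayes", "music"}, "d3": {"music"}}
print(sorted(docs, key=lambda d: rsv(docs[d]), reverse=True))  # d1, d2, d3
```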
Bayesian Networks - A Brief Introduction – Adnan Masood
- A Bayesian network is a graphical model that depicts probabilistic relationships among variables. It represents a joint probability distribution over variables in a directed acyclic graph with conditional probability tables.
- A Bayesian network consists of a directed acyclic graph whose nodes represent variables and edges represent probabilistic dependencies, along with conditional probability distributions that quantify the relationships.
- Inference using a Bayesian network allows computing probabilities like P(X | evidence) by taking into account the graph structure and probability tables; the enumeration sketch after this list illustrates the computation.
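The enumeration sketch referenced above: a toy three-node network with invented CPTs, computing P(Rain | Sprinkler = 1) by summing the joint over the hidden variable:

```python
from itertools import product

# Tiny network: Cloudy -> Rain, Cloudy -> Sprinkler (all binary).
P_cloudy = {1: 0.5, 0: 0.5}
P_rain = {1: {1: 0.8, 0: 0.2}, 0: {1: 0.1, 0: 0.9}}   # P(Rain | Cloudy)
P_sprk = {1: {1: 0.1, 0: 0.9}, 0: {1: 0.5, 0: 0.5}}   # P(Sprinkler | Cloudy)

def joint(c, r, s):
    return P_cloudy[c] * P_rain[c][r] * P_sprk[c][s]

# P(Rain = 1 | Sprinkler = 1): sum the joint over hidden Cloudy, normalize.
num = sum(joint(c, 1, 1) for c in (0, 1))
den = sum(joint(c, r, 1) for c, r in product((0, 1), repeat=2))
print(num / den)   # ~0.217: rain is less likely once the sprinkler explains it
```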
This document provides an introduction to probability theory and different probability distributions. It begins with defining probability as a quantitative measure of the likelihood of events occurring. It then covers fundamental probability concepts like mutually exclusive events, additive and multiplicative laws of probability, and independent events. The document also introduces random variables and common probability distributions like the binomial, Poisson, and normal distributions. It provides examples of how each distribution is used and concludes with characteristics of the normal distribution.
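Quick numerical examples of the three named distributions via scipy.stats (illustrative parameters):

```python
from scipy.stats import binom, poisson, norm

print(binom.pmf(3, n=10, p=0.2))         # P(exactly 3 successes in 10 trials)
print(poisson.pmf(2, mu=1.5))            # P(2 events at mean rate 1.5)
print(norm.cdf(1.96) - norm.cdf(-1.96))  # ~0.95 within 1.96 SDs of the mean
```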
Naïve Bayes Machine Learning Classification with R Programming: A case study ... – SubmissionResearchpa
This analytical review paper explains the Naïve Bayes machine learning technique for simple probabilistic classification, based on Bayes' theorem with an assumption of independence between features, using R programming. There is a large gap in guidance on which algorithm suits data analysis when a largely categorical variable must be predicted. The model is trained on a training set and makes predictions on test sets for the implementation of the Naïve Bayes classification. The uniqueness of the technique is that it takes new information and tries to make a better forecast by considering the new evidence, which loosely resembles how a human mind selects a proper judgement among alternative choices. The researcher uses the binary.csv data set of 400 observations with 4 attributes of educational data. Admit is the dependent variable, determined by gre score, gpa, and rank of previous grade, which ultimately indicate whether a student will be admitted to the next program. Initially the gre and gpa variables show a 0.36 significance in association with the categorical rank variable. Box plots and density plots demonstrate the overlap between admitted and not-admitted records. The Naïve Bayes classification model classifies the data with a proportion of 0.68 not admitted and 0.31 admitted. The confusion matrix and predictions were calculated with 0.68 accuracy at a 95 percent confidence interval. Similarly, the training accuracy increases from 29 percent to 32 percent when the Naïve Bayes method is run with usekernel set to TRUE, which ultimately decreases misclassification errors on the binary data set. Yagyanath Rimal. (2019). Naïve Bayes Machine Learning Classification with R Programming: A case study of binary data sets. International Journal on Orange Technologies, 1(2), 27-34. Retrieved from https://journals.researchparks.org/index.php/IJOT/article/view/358 Pdf Url: https://journals.researchparks.org/index.php/IJOT/article/view/358/347 Paper Url: https://journals.researchparks.org/index.php/IJOT/article/view/358
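The paper works in R; as a rough, hypothetical Python analogue (synthetic stand-in data shaped like the described binary.csv, with invented coefficients), a Gaussian naive Bayes run looks like this:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, accuracy_score

# Synthetic stand-in for the paper's binary.csv (admit ~ gre, gpa, rank).
rng = np.random.default_rng(4)
n = 400
gre = rng.normal(580, 115, n)
gpa = rng.normal(3.4, 0.38, n)
rank = rng.integers(1, 5, n)
logit = 0.003 * (gre - 580) + 0.8 * (gpa - 3.4) - 0.5 * (rank - 2.5)
admit = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([gre, gpa, rank])
X_tr, X_te, y_tr, y_te = train_test_split(X, admit, test_size=0.25,
                                          random_state=0)
model = GaussianNB().fit(X_tr, y_tr)
pred = model.predict(X_te)
print(confusion_matrix(y_te, pred))
print(accuracy_score(y_te, pred))
```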
We propose a regularized method for multivariate linear regression when the number of predictors may exceed the sample size. This method is designed to strengthen the estimation and the selection of the relevant input features with three ingredients: it takes advantage of the dependency pattern between the responses by estimating the residual covariance; it performs selection on direct links between predictors and responses; and selection is driven by prior structural information. To this end, we build on a recent reformulation of the multivariate linear regression model to a conditional Gaussian graphical model and propose a new regularization scheme accompanied with an efficient optimization procedure. On top of showing very competitive performance on artificial and real data sets, our method demonstrates capabilities for fine interpretation of its parameters, as illustrated in applications to genetics, genomics and spectroscopy.
Adversarial Variational Autoencoders to extend and improve generative model -... – Loc Nguyen
Generative artificial intelligence (GenAI) has been developing with many incredible achievements like ChatGPT and Bard. The deep generative model (DGM) is a branch of GenAI that is preeminent in generating raster data such as images and sound, owing to the strengths of deep neural networks (DNN) in inference and recognition. The built-in inference mechanism of DNN, which simulates and aims at the synaptic plasticity of the human neural network, fosters the generation ability of DGM, which produces surprising results with the support of statistical flexibility. Two popular approaches in DGM are Variational Autoencoders (VAE) and the Generative Adversarial Network (GAN). Both VAE and GAN have their own strong points, although they share the same underlying statistical theory as well as incredible complexity via the hidden layers of a DNN, where the DNN becomes an effective encoding/decoding function without a concrete specification. In this research, I try to unify VAE and GAN into a consistent and consolidated model called Adversarial Variational Autoencoders (AVA), in which VAE and GAN complement each other: VAE is a good data generator, encoding data via the excellent ideology of Kullback-Leibler divergence, and GAN is a significantly important method to assess the reliability of data as realistic or fake. In other words, AVA aims to improve the accuracy of generative models, and it also extends the function of simple generative models. Methodologically, this research focuses on combining applied mathematical concepts with skillful computer-programming techniques in order to implement and solve complicated problems as simply as possible.
Conditional mixture model and its application for regression model – Loc Nguyen
The expectation maximization (EM) algorithm is a powerful mathematical tool for estimating statistical parameters when the data sample contains a hidden part and an observed part. EM is applied to learn a finite mixture model in which the whole distribution of the observed variable is a weighted sum of partial distributions. The coverage ratio of every partial distribution is specified by the probability of the hidden variable. An application of the mixture model is soft clustering, in which each cluster is modeled by the hidden variable, each data point can be assigned to more than one cluster, and the degree of such assignment is represented by the probability of the hidden variable. However, that probability in the traditional mixture model is simplified as a parameter, which can cause a loss of valuable information. Therefore, in this research I propose a so-called conditional mixture model (CMM) in which the probability of the hidden variable is modeled as a full probability density function (PDF) that owns an individual parameter. CMM aims to extend the mixture model. I also propose an application of CMM called the adaptive regression model (ARM). A traditional regression model is effective when the data sample is scattered evenly. If the data points are grouped into clusters, the regression model tries to learn a unified regression function that passes through all data points. Obviously, such a unified function is not effective for evaluating the response variable on grouped data points. The concept of "adaptive" in ARM means that ARM solves this ineffectiveness by first selecting the best cluster of data points and then evaluating the response variable within that best cluster. In other words, ARM reduces the estimation space of the regression model so as to gain high accuracy in calculation. (A minimal EM sketch for the standard mixture model appears after the keywords line below.)
Keywords: expectation maximization (EM) algorithm, finite mixture model, conditional mixture model, regression model, adaptive regression model (ARM).
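For context, a minimal EM sketch for the standard 1-D Gaussian mixture that CMM generalizes; in CMM the mixture weight w would itself become a full PDF with its own parameter rather than a scalar:

```python
import numpy as np

def em_gmm_1d(x, k=2, iters=100):
    """EM for a 1-D Gaussian mixture: the E-step computes hidden-cluster
    responsibilities; the M-step re-estimates weights, means, variances."""
    rng = np.random.default_rng(0)
    w = np.full(k, 1 / k)
    mu = rng.choice(x, k, replace=False)
    var = np.full(k, x.var())
    for _ in range(iters):
        # E-step: responsibilities r[i, j] = P(cluster j | x_i)
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) \
               / np.sqrt(2 * np.pi * var)
        r = w * dens
        r /= r.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted updates
        nk = r.sum(axis=0)
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var

x = np.concatenate([np.random.default_rng(5).normal(0, 1, 300),
                    np.random.default_rng(6).normal(5, 1, 200)])
print(em_gmm_1d(x))   # recovers weights ~(0.6, 0.4) and means ~(0, 5)
```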
Nghịch dân chủ luận (an overview of democracy and political institutions in relation to ...) – Loc Nguyen
The universe has matter and antimatter; society has conflict and amity, so that it develops and declines, then declines and develops. I rely on this to justify an article that is reactionary in character and runs against the current of the times, but readers will find for themselves the inseparable meaning of social forms. Moreover, this article does not go deeply into legal research; it only gives an overview of democracy and political institutions in relation to philosophy and religion, whereby the article's contribution is the notion of the judiciary's "temporary reliance", which truly derives neither from election nor from appointment.
A Novel Collaborative Filtering Algorithm by Bit Mining Frequent Itemsets – Loc Nguyen
Collaborative filtering (CF) is a popular technique in recommendation research. Concretely, the items recommended to a user are determined by surveying her/his communities. There are two main CF approaches: memory-based and model-based. I propose a new model-based CF algorithm that mines frequent itemsets from the rating database, so that items belonging to frequent itemsets are recommended to the user. My CF algorithm gives an immediate response because the mining task is performed in an offline process mode. I also propose a so-called Roller algorithm for improving the mining of frequent itemsets. The Roller algorithm is built on the heuristic assumption that "the larger the support of an item is, the more likely it is that this item will occur in some frequent itemset". It is modeled on the whitewashing task of rolling a roller over a wall in a way that is capable of picking up frequent itemsets. Moreover, I provide enhanced techniques such as bit representation, bit matching, and bit mining in order to speed up the recommendation process. These techniques take advantage of bitwise operations (AND, NOT) so as to reduce storage space and make the algorithms run faster.
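A minimal sketch of the bit-representation idea: each item's transaction set becomes a bitset (a plain Python int here), and itemset support is a bitwise AND followed by a popcount:

```python
# Bit representation of transactions: each item maps to an integer whose
# set bits mark the transactions containing it.
transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
items = sorted({i for t in transactions for i in t})
bits = {i: sum(1 << idx for idx, t in enumerate(transactions) if i in t)
        for i in items}

def support(itemset):
    """Support via bitwise AND of the member items' transaction bitsets."""
    v = ~0
    for i in itemset:
        v &= bits[i]
    return bin(v & ((1 << len(transactions)) - 1)).count("1")

print(support({"a"}), support({"a", "b"}), support({"a", "b", "c"}))  # 3 2 1
```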
Simple image deconvolution based on reverse image convolution and backpropaga... – Loc Nguyen
The deconvolution task is unimportant in a convolutional neural network (CNN) because recovering the convolved image is not imperative when the convolutional layer's role is to extract features. However, deconvolution is useful in some cases for inspecting and reflecting a convolutional filter, as well as for trying to improve a generated image when the information loss is not serious, with regard to the trade-off between information loss and specific features such as edge detection and sharpening. This research proposes a duplicated, reverse process for recovering a filtered image. First, the source layer and target layer are reversed, in accordance with traditional image convolution, so as to train the convolutional filter. Second, the trained filter is reversed again to derive a deconvolutional operator for recovering the filtered image. The reverse process is associated with the backpropagation algorithm, which is most popular in learning neural networks. Experimental results show that the proposed technique better learns filters that focus on discovering pixel differences. Therefore, the main contribution of this research is to inspect convolutional filters from data.
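A hedged sketch of the "reverse" recovery idea under a simplifying assumption that the filter is already known: gradient descent on the squared reconstruction error, where the gradient is a correlation with the filter (the adjoint of "same"-mode convolution for an odd-sized kernel). This is not the paper's exact two-step procedure:

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

rng = np.random.default_rng(9)
x_true = rng.random((16, 16))
k = np.array([[0.0, 0.2, 0.0],
              [0.2, 0.2, 0.2],
              [0.0, 0.2, 0.0]])          # known 3x3 blur filter
y = convolve2d(x_true, k, mode="same")   # the filtered image

# Minimize ||conv(x, k) - y||^2 by backprop-style gradient steps.
x = np.zeros_like(y)
for _ in range(500):
    resid = convolve2d(x, k, mode="same") - y
    x -= 0.9 * correlate2d(resid, k, mode="same")
print(np.abs(x - x_true).mean())   # reconstruction error decreases
```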
Technological Accessibility: Learning Platform Among Senior High School Students – Loc Nguyen
The document discusses technological accessibility among senior high school students. It covers several topics:
- Returning to nature through technology, and technology accessibility as a life skill whose advantages should be strengthened while its problems for youth are avoided.
- Solutions to the dilemma of technology accessibility, including letting youth develop independently, using security controls, and encouraging compassion.
- Researching a career is optional, but following one's passion and balancing positive and negative emotions can contribute to success.
The document discusses how engineering and technology can be used for social impact. It makes three key points:
1. Technology stems from knowledge and science, but assessing the value of knowledge is difficult. However, the importance of technology is clear through its fruits. Education fosters gaining knowledge and leads to technological development.
2. While technology increases wealth, it also widens inequality gaps. However, technology can indirectly address inequality through education by providing universal access to learning and bringing universities to people everywhere.
3. Diversity among countries should be embraced, and late innovations building on existing technologies can be particularly impactful, as was the strategy of a racing champion who overtook opponents in the final round through close observation and ...
Harnessing Technology for Research Education – Loc Nguyen
The document discusses harnessing technology for research and education. It notes some paradoxes of technology, such as how it can increase inequality yet also help remedy it through education. It discusses the future of education being supported by distance learning tools, AI, and virtual/augmented reality. Open universities are seen as an intermediate step towards "home universities" and as a place where students can become lifelong learners and teachers. Understanding, emotion, and compassion are discussed as being more important than merely remembering facts. The document ends by discussing connecting different subjects and technologies like a jigsaw puzzle.
Future of education with support of technology – Loc Nguyen
The document discusses the future of education with the support of technology. It notes that while technology deepens inequality by benefiting the rich, it can indirectly address inequality through education by allowing anyone to access knowledge and study from anywhere. Emerging forms of educational support include distance learning using tools like Zoom, artificial intelligence assistants, and virtual/augmented reality. Blended teaching combining different methods is emphasized. Understanding is seen as more important than memorization, and education should foster understanding, emotional development, and compassion through nature-based activities. The document argues technology can help education promote love by connecting various topics through both fusion and by assembling them like a jigsaw puzzle.
The document discusses generative artificial intelligence (AI) and its applications, using the metaphor of a dragon's flight. It introduces digital generative AI models that can generate images, sounds, and motions. As an example, it describes a model that could generate possible orbits or movements for a dragon and tiger in an image along with a bamboo background. The document then discusses how generative AI could be applied creatively in education to generate new learning materials and methods. It argues that while technology and societies are changing rapidly, climate change is a unifying challenge that nations must work together to overcome.
Adversarial Variational Autoencoders to extend and improve generative model – Loc Nguyen
Generative artificial intelligence (GenAI) has been developing with many incredible achievements like ChatGPT and Bard. The deep generative model (DGM) is a branch of GenAI that is preeminent in generating raster data such as images and sound, thanks to the strengths of deep neural networks (DNN) in inference and recognition. The built-in inference mechanism of DNN, which simulates and aims at the synaptic plasticity of the human neural network, fosters the generation ability of DGM, which produces surprising results with the support of statistical flexibility. Two popular approaches in DGM are Variational Autoencoders (VAE) and the Generative Adversarial Network (GAN). Both have their own strong points, although they share the same underlying statistical theory as well as incredible complexity via the hidden layers of a DNN, where the DNN becomes an effective encoding/decoding function without a concrete specification. In this research, I try to unify VAE and GAN into a consistent and consolidated model called Adversarial Variational Autoencoders (AVA), in which VAE and GAN complement each other: VAE is a good generator, encoding data via the excellent ideology of Kullback-Leibler divergence, and GAN is a significantly important method to assess the reliability of data as realistic or fake. In other words, AVA aims to improve the accuracy of generative models while also extending the function of simple generative models. Methodologically, this research focuses on combining applied mathematical concepts with skillful computer-programming techniques in order to implement and solve complicated problems as simply as possible.
Learning dyadic data and predicting unaccomplished co-occurrent values by mix... – Loc Nguyen
Dyadic data, also called co-occurrence data (COD), contains co-occurrences of objects. Searching for statistical models to represent dyadic data is necessary. Fortunately, the finite mixture model is a solid statistical model for learning and making inferences on dyadic data, because a mixture model is built smoothly and reliably by the expectation maximization (EM) algorithm, which suits the inherent sparseness of dyadic data. This research summarizes mixture models for dyadic data. When each co-occurrence in dyadic data is associated with a value, there are many unaccomplished values because many co-occurrences are nonexistent. In this research, these unaccomplished values are estimated as the mean (expectation) of the random variable given the partial probability distributions inside the dyadic mixture model.
Machine learning forks into three main branches: supervised learning, unsupervised learning, and reinforcement learning, where reinforcement learning has much potential for artificial intelligence (AI) applications because it solves real problems by a progressive process in which possible solutions are improved and fine-tuned continuously. The progressive approach, which reflects the ability of adaptation, is appropriate to the real world, where most events occur and change continuously and unexpectedly. Moreover, data is getting too huge for supervised learning and unsupervised learning to draw valuable knowledge from at one time. Bayesian optimization (BO) models an optimization problem as a probabilistic form called the surrogate model and then directly maximizes an acquisition function created from that surrogate model, in order to implicitly and indirectly maximize the target function and find the solution of the optimization problem. A popular surrogate model is the Gaussian process regression model. The process of maximizing the acquisition function is based on repeatedly updating the posterior probability of the surrogate model, which improves after every iteration. Taking advantage of an acquisition function or utility function is also common in decision theory, but the semantic meaning behind BO is that BO solves problems by a progressive and adaptive approach, updating the surrogate model from a small piece of data at a time, according to the ideology of reinforcement learning. Undoubtedly, BO is a reinforcement learning algorithm with many potential applications, and thus it is surveyed in this research with attention to its mathematical ideas. Moreover, the solution of optimization problems is important not only to applied mathematics but also to AI.
Support vector machine is a powerful machine learning method in data classification. Using it for applied research is easy, but comprehending it for further development requires a lot of effort. This report is a tutorial on support vector machine with full mathematical proofs and examples, which help researchers understand it in the fastest way from theory to practice. The report focuses on the theory of optimization, which is the base of support vector machine.
A Proposal of Two-step Autoregressive Model (Loc Nguyen)
The autoregressive (AR) model and conditional autoregressive (CAR) model are specific regressive models in which the independent variables and the dependent variable imply the same object. They are powerful statistical tools to predict values based on correlation in the time domain and space domain, which is useful in epidemiology analysis. In this research, I combine them in a simple way in which AR and CAR are estimated in two separate steps so as to cover the time domain and space domain in spatial-temporal data analysis. Moreover, I integrate a logistic model into the CAR model, which aims to improve the competence of autoregressive models.
Extreme bound analysis based on correlation coefficient for optimal regressio... (Loc Nguyen)
Regression analysis is an important tool in statistical analysis, in which there is a demand to discover essential independent variables among many others, especially when there is a huge number of random variables. Extreme bound analysis is a powerful approach to extract such important variables, called robust regressors. In this research, a so-called Regressive Expectation Maximization with RObust regressors (REMRO) algorithm is proposed as an alternative method beside other probabilistic methods for analyzing robust variables. With a different ideology from other probabilistic methods, REMRO searches for robust regressors forming the optimal regression model and sorts them in descending order of their fitness values, determined by two proposed concepts of local correlation and global correlation. Local correlation represents sufficient explanatories for possible regressive models, and global correlation reflects the independence level and stand-alone capacity of regressors. Moreover, REMRO can tolerate incomplete data because it applies the Regressive Expectation Maximization (REM) algorithm to fill missing values with estimated values based on the ideology of the expectation maximization (EM) algorithm. From experimental results, REMRO is more accurate for modeling numeric regressors than traditional probabilistic methods like the Sala-i-Martin method, but REMRO cannot yet be applied to nonnumeric regression models in this research.
The document proposes a "jagged stock investment" strategy where stocks are repeatedly purchased at intervals where the price is increasing, similar to rises and falls in a saw blade. It shows mathematically that this strategy can achieve equal or higher returns than bank depositing if the average growth rate and interest rate are the same. The strategy models purchasing a share and its replications over time periods where each replication is bought using the profits from the previous rise. It provides formulas to calculate the overall value and return on investment from this jagged stock investment approach.
This document presents a study on minima distribution and its application to explaining the convergence and convergence speed of optimization algorithms. The author proposes weaker conditions for the convergence and monotonicity properties of minima distribution compared to previous work. Specifically, the convergence condition requires the function τ(x) to be positive and non-minimizing everywhere except at minimizers of the target function f(x). The monotonicity condition requires f(x) and the derivative of τ(x) to be inversely proportional. The author then defines convergence speed as the derivative of the integral of f(x) with respect to the optimization iterations and shows it depends on the slope of the derivative of τ(x).
This is chapter 4 “Variants of EM algorithm” in my book “Tutorial on EM algorithm”, which focuses on EM variants. The main purpose of expectation maximization (EM) algorithm, also GEM algorithm, is to maximize the log-likelihood L(Θ) = log(g(Y|Θ)) with observed data Y by maximizing the conditional expectation Q(Θ’|Θ). Such Q(Θ’|Θ) is defined fixedly in E-step. Therefore, most variants of EM algorithm focus on how to maximize Q(Θ’|Θ) in M-step more effectively so that EM is faster or more accurate.
This is the chapter 3 "Properties and convergence of EM algorithm" in my book “Tutorial on EM algorithm”, which focuses on mathematical explanation of the convergence of GEM algorithm given by DLR (Dempster, Laird, & Rubin, 1977, pp. 6-9).
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati... (AbdullaAlAsif1)
The pygmy halfbeak, Dermogenys colletei, is known for its viviparous nature and presents an intriguing case of relatively low fecundity, raising questions about potential compensatory reproductive strategies employed by this species. Our study delves into the examination of fecundity and the Gonadosomatic Index (GSI) in the pygmy halfbeak, D. colletei (Meisner, 2001), an intriguing viviparous fish indigenous to Sarawak, Borneo. We hypothesize that the pygmy halfbeak, D. colletei, may exhibit unique reproductive adaptations to offset its low fecundity, thus enhancing its survival and fitness. To address this, we conducted a comprehensive study utilizing 28 mature female specimens of D. colletei, carefully measuring fecundity and GSI to shed light on the reproductive adaptations of this species. Our findings reveal that D. colletei indeed exhibits low fecundity, with a mean of 16.76 ± 2.01, and a mean GSI of 12.83 ± 1.27, providing crucial insights into the reproductive mechanisms at play in this species. These results underscore the existence of unique reproductive strategies in D. colletei, enabling its adaptation and persistence in Borneo's diverse aquatic ecosystems, and call for further ecological research to elucidate these mechanisms. This study contributes to a better understanding of viviparous fish in Borneo and to the broader field of aquatic ecology, enhancing our knowledge of species adaptations to unique ecological challenges.
When I was asked to give a companion lecture in support of ‘The Philosophy of Science’ (https://shorturl.at/4pUXz) I decided not to walk through the detail of the many methodologies in order of use. Instead, I chose to employ a long standing, and ongoing, scientific development as an exemplar. And so, I chose the ever evolving story of Thermodynamics as a scientific investigation at its best.
Conducted over a period of >200 years, Thermodynamics R&D, and application, benefitted from the highest levels of professionalism, collaboration, and technical thoroughness. New layers of application, methodology, and practice were made possible by the progressive advance of technology. In turn, this has seen measurement and modelling accuracy continually improved at a micro and macro level.
Perhaps most importantly, Thermodynamics rapidly became a primary tool in the advance of applied science/engineering/technology, spanning micro-tech, to aerospace and cosmology. I can think of no better a story to illustrate the breadth of scientific methodologies and applications at their best.
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige... (University of Maribor)
Slides from talk:
Aleš Zamuda: Remote Sensing and Computational, Evolutionary, Supercomputing, and Intelligent Systems.
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Inter-Society Networking Panel GRSS/MTT-S/CIS Panel Session: Promoting Connection and Cooperation
https://www.etran.rs/2024/en/home-english/
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a... (Ana Luísa Pinho)
Functional Magnetic Resonance Imaging (fMRI) provides means to characterize brain activations in response to behavior. However, cognitive neuroscience has been limited to group-level effects referring to the performance of specific tasks. To obtain the functional profile of elementary cognitive mechanisms, the combination of brain responses to many tasks is required. Yet, to date, both structural atlases and parcellation-based activations do not fully account for cognitive function and still present several limitations. Further, they do not adapt overall to individual characteristics. In this talk, I will give an account of deep-behavioral phenotyping strategies, namely data-driven methods in large task-fMRI datasets, to optimize functional brain-data collection and improve inference of effects-of-interest related to mental processes. Key to this approach is the employment of fast multi-functional paradigms rich in features that can be well parametrized and, consequently, facilitate the creation of psycho-physiological constructs to be modelled with imaging data. Particular emphasis will be given to music stimuli when studying high-order cognitive mechanisms, due to their ecological nature and quality to enable complex behavior compounded by discrete entities. I will also discuss how deep-behavioral phenotyping and individualized models applied to neuroimaging data can better account for the subject-specific organization of domain-general cognitive systems in the human brain. Finally, the accumulation of functional brain signatures brings the possibility to clarify relationships among tasks and create a univocal link between brain systems and mental functions through: (1) the development of ontologies proposing an organization of cognitive processes; and (2) brain-network taxonomies describing functional specialization. To this end, tools to improve commensurability in cognitive science are necessary, such as public repositories, ontology-based platforms and automated meta-analysis tools. I will thus discuss some brain-atlasing resources currently under development, and their applicability in cognitive as well as clinical neuroscience.
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx (RASHMI M G)
This document discusses abnormal or anomalous secondary growth in plants. It defines secondary growth as an increase in plant girth due to the vascular cambium or cork cambium. Anomalous secondary growth does not follow the normal pattern of a single vascular cambium producing xylem internally and phloem externally.
What are greenhouse gases and how many gases are there that affect the Earth? (moosaasad1975)
This document explains what greenhouse gases are, how they affect the Earth and its environment, what the future of the environment and the Earth is, and how they influence weather and climate.
ESR spectroscopy in liquid food and beverages.pptx (PRIYANKA PATEL)
With an increasing population, people need to rely on packaged foodstuffs. Packaging of food materials requires the preservation of food. There are various methods for the treatment of food to preserve it, and irradiation treatment of food is one of them. It is the most common and most harmless method of food preservation, as it does not alter the necessary micronutrients of food materials. Although irradiated food does not cause any harm to human health, quality assessment of food is still required to provide consumers with the necessary information about the food. ESR spectroscopy is the most sophisticated way to investigate the quality of food and the free radicals induced during food processing. The ESR spin trapping technique is useful for the detection of highly unstable radicals in food. The antioxidant capability of liquid food and beverages is mainly assessed by the spin trapping technique.
This presentation explores a brief idea about the structural and functional attributes of nucleotides, the structure and function of genetic materials along with the impact of UV rays and pH upon them.
The ability to recreate computational results with minimal effort and actionable metrics provides a solid foundation for scientific research and software development. When people can replicate an analysis at the touch of a button using open-source software, open data, and methods to assess and compare proposals, it significantly eases verification of results, engagement with a diverse range of contributors, and progress. However, we have yet to fully achieve this; there are still many sociotechnical frictions.
Inspired by David Donoho's vision, this talk aims to revisit the three crucial pillars of frictionless reproducibility (data sharing, code sharing, and competitive challenges) with the perspective of deep software variability.
Our observation is that multiple layers — hardware, operating systems, third-party libraries, software versions, input data, compile-time options, and parameters — are subject to variability that exacerbates frictions but is also essential for achieving robust, generalizable results and fostering innovation. I will first review the literature, providing evidence of how the complex variability interactions across these layers affect qualitative and quantitative software properties, thereby complicating the reproduction and replication of scientific studies in various fields.
I will then present some software engineering and AI techniques that can support the strategic exploration of variability spaces. These include the use of abstractions and models (e.g., feature models), sampling strategies (e.g., uniform, random), cost-effective measurements (e.g., incremental build of software configurations), and dimensionality reduction methods (e.g., transfer learning, feature selection, software debloating).
I will finally argue that deep variability is both the problem and solution of frictionless reproducibility, calling the software science community to develop new methods and tools to manage variability and foster reproducibility in software systems.
Invited talk, Journées Nationales du GDR GPL 2024
Thematic appreciation test is a psychological assessment tool used to measure an individual's appreciation and understanding of specific themes or topics. This test helps to evaluate an individual's ability to connect different ideas and concepts within a given theme, as well as their overall comprehension and interpretation skills. The results of the test can provide valuable insights into an individual's cognitive abilities, creativity, and critical thinking skills.
Converting Graphic Relationships into Conditional Probabilities in Bayesian Network
1. Converting Graphic Relationships into Conditional Probabilities in Bayesian Network
Prof. Loc Nguyen PhD, MD, MBA
An Giang University, Vietnam
Email: ng_phloc@yahoo.com
Homepage: www.locnguyen.net
Published in the book "Bayesian Inference" - InTechOpen, 9/6/2017
2. Abstract
Bayesian network (BN) is a powerful mathematical tool for prediction and diagnosis applications. A large BN can be constituted of many simple networks which in turn are constructed from simple graphs. A simple graph consists of one child node and many parent nodes. The strength of each relationship between a child node and a parent node is quantified by a weight, and all relationships share the same semantics such as prerequisite, diagnostic, and aggregation. The research focuses on converting graphic relationships into conditional probabilities in order to construct a simple BN from a graph. Diagnostic relationship is the main research object, in which a sufficient diagnostic proposition is proposed for validating diagnostic relationship. Relationship conversion adheres to logic gates such as AND, OR, and XOR, which is an essential feature of the research.
3. Table of contents
1. Introduction
2. Diagnostic relationship
3. X-gate inferences
4. Multi-hypothesis diagnostic relationship
5. Conclusion
4. 1. Introduction
• In general, Bayesian network (BN) is a directed acyclic graph (DAG) consisting
of a set of nodes and a set of arcs. Each node is a random variable. Each arc
represents a relationship between two nodes.
• The strength of relationship in a graph can be quantified by a number called
weight. There are some important relationships such as prerequisite, diagnostic,
and aggregation.
• The difference between BN and normal graph is that the strength of every
relationship in BN is represented by a conditional probability table (CPT) whose
entries are conditional probabilities of a child node given parent nodes.
• There are two main approaches to construct a BN:
o The first approach aims to learn BN from training data by machine learning algorithms.
o The second approach is that experts define some graph patterns according to specific relationships and then BN is constructed from such patterns along with the determined CPTs. This research focuses on this second approach.
5. 1. Introduction
• An example of BN: the event "cloudy" is the cause of the event "rain", which in turn is the cause of "grass is wet" (Murphy, 1998).
• Binary random variables (nodes) C, S, R, and W denote the events cloudy, sprinkler, rain, and wet grass. Each node is associated with a CPT.
• Given the "wet grass" evidence W=1, we have P(S=1|W=1) = 0.7 > P(R=1|W=1) = 0.67, which leads to the conclusion that the sprinkler is the most likely cause of the wet grass.
6. 1. Introduction
• An application of BN for adaptive learning, in which the course is modeled by a BN.
• Suppose the course "Java Tutorial" has two main topics, "Getting Started" and "Learning Java Language". Each topic has many subjects. All topics and subjects are hypothesis nodes.
• Each leaf subject is attached to a test, which is modeled as an evidence node.
7. 1. Introduction
BN is used to evaluate users' knowledge (mastered or not mastered).
8. 1. Introduction
• Relationship conversion aims to determine conditional probabilities based on the weights and meanings of relationships. Especially, these relationships adhere to logic X-gates (Wikipedia, 2016) such as AND-gate, OR-gate, and SIGMA-gate. The X-gate inference in this research is derived and inspired from the noisy OR-gate described in the book "Learning Bayesian Networks" (Neapolitan, 2003, pp. 157-159).
• The authors Díez and Druzdzel (2007) also researched OR/MAX, AND/MIN, and noisy XOR inferences, but they focused on canonical models, deterministic models, and ICI models, whereas I focus on logic gates and graphic relationships.
• A factor graph (Wikipedia, Factor graph, 2015) represents the factorization of a global function into many partial functions. Pearl's propagation algorithm (Pearl, 1986), a variant of factor graph, is very successful in BN inference. I did not use factor graphs for constructing BN. The concept "X-gate inference" only implies how to convert a simple graph into a BN.
9. 1. Introduction
• In general, my research focuses on applied probability adhering to Bayesian network, logic gates, and Bayesian user modeling (Millán & Pérez-de-la-Cruz, 2002). The scientific results are shared with the authors Eva Millán and José Luis Pérez-de-la-Cruz.
• By default, the research is applied in the learning context, in which BN is used to assess students' knowledge.
• Evidences are tests, exams, exercises, etc., and hypotheses are learning concepts, knowledge items, etc.
• Diagnostic relationship is very important to Bayesian evaluation in the learning context because it is used to evaluate a student's mastery of concepts. Now we start relationship conversion with a research on diagnostic relationship in the next section.
10. 2. Diagnostic relationship
• In some opinions (Millán, Loboda, & Pérez-de-la-Cruz, 2010, p. 1666), like mine, the diagnostic relationship should be from hypothesis to evidence. For example, a disease is hypothesis X and a symptom is evidence D.
• The weight w of the relationship between X and D is 1. Formula 2.1 specifies the CPT of D when D is binary (0 and 1), converting the simplest diagnostic relationship to the simplest BN.
P(D \mid X) = \begin{cases} D & \text{if } X = 1 \\ 1 - D & \text{if } X = 0 \end{cases} \quad (2.1)
• Evidence D can be used to diagnose hypothesis X if the so-called sufficient diagnostic proposition is satisfied. Such proposition, called shortly the diagnostic condition, states that "D is equivalent to X in the diagnostic relationship if P(X|D) = kP(D|X) given a uniform distribution of X, where the transformation coefficient k is independent of D. In other words, k is constant with regard to D, and so D is called sufficient evidence".
(Figure: simple graph in which hypothesis X points to evidence D with weight 1)
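As an illustration (not part of the original chapter), the following minimal Java sketch enumerates the binary case of formula 2.1 under a uniform prior and checks that P(X|D) = kP(D|X) with the constant k = N/2 = 1; the class and method names here are hypothetical:

// Minimal numeric check of the diagnostic condition for the binary CPT (2.1),
// assuming a uniform prior P(X=0) = P(X=1) = 0.5. Illustrative sketch only.
public class BinaryDiagnosticCheck {
    // P(D|X) from formula 2.1 with D, X in {0, 1}
    static double pDgivenX(int d, int x) {
        return (x == 1) ? d : 1 - d;
    }

    public static void main(String[] args) {
        for (int d = 0; d <= 1; d++) {
            // P(D) = sum over x of P(D|X=x) P(X=x) with the uniform prior
            double pD = 0.5 * (pDgivenX(d, 0) + pDgivenX(d, 1));
            for (int x = 0; x <= 1; x++) {
                double pXgivenD = pDgivenX(d, x) * 0.5 / pD; // Bayes' rule
                double k = (pDgivenX(d, x) == 0) ? Double.NaN
                         : pXgivenD / pDgivenX(d, x);
                System.out.printf("D=%d X=%d: P(X|D)=%.2f k=%.2f%n",
                                  d, x, pXgivenD, k);
            }
        }
        // Every defined k equals N/2 = 1, so D is sufficient evidence for X.
    }
}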
11. 2. Diagnostic relationship
• I survey three other common cases of evidence D: D ∈ {0, 1, 2,…, η}, D ∈ {a, a+1, a+2,…, b}, and D ∈ [a, b].
• In general, formula 2.6 summarizes the CPT of the evidence of a single diagnostic relationship satisfying the diagnostic condition.
P(D \mid X) = \begin{cases} \frac{D}{S} & \text{if } X = 1 \\[4pt] \frac{M}{S} - \frac{D}{S} & \text{if } X = 0 \end{cases}, \qquad k = \frac{N}{2} \quad (2.6)

Where:

N = \begin{cases} 2 & \text{if } D \in \{0,1\} \\ \eta + 1 & \text{if } D \in \{0,1,2,\dots,\eta\} \\ b - a + 1 & \text{if } D \in \{a, a+1, a+2, \dots, b\} \\ b - a & \text{if } D \text{ is continuous and } D \in [a,b] \end{cases}

M = \begin{cases} 1 & \text{if } D \in \{0,1\} \\ \eta & \text{if } D \in \{0,1,2,\dots,\eta\} \\ b + a & \text{if } D \in \{a, a+1, a+2, \dots, b\} \\ b + a & \text{if } D \text{ is continuous and } D \in [a,b] \end{cases}

S = \sum_D D = \frac{NM}{2} = \begin{cases} 1 & \text{if } D \in \{0,1\} \\ \frac{\eta(\eta+1)}{2} & \text{if } D \in \{0,1,2,\dots,\eta\} \\ \frac{(b+a)(b-a+1)}{2} & \text{if } D \in \{a, a+1, a+2, \dots, b\} \\ \frac{b^2 - a^2}{2} & \text{if } D \text{ is continuous and } D \in [a,b] \end{cases}
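As a quick worked check (not in the original slides): with D ∈ {0, 1, 2, 3} we have η = 3, so N = 4, M = 3, and S = 6. Formula 2.6 then gives P(D|X=1) = D/6 and P(D|X=0) = (3 − D)/6, each of which sums to 1 over D, and the transformation coefficient is k = N/2 = 2.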
12. 2. Diagnostic relationship
As an example, we prove that the CPT of evidence D ∈ {0, 1, 2,…, η}, a special case of formula 2.6, satisfies the diagnostic condition.

P(D \mid X) = \begin{cases} \frac{D}{S} & \text{if } X = 1 \\[4pt] \frac{\eta}{S} - \frac{D}{S} & \text{if } X = 0 \end{cases} \quad \text{where } D \in \{0,1,2,\dots,\eta\} \text{ and } S = \sum_{D=0}^{\eta} D = \frac{\eta(\eta+1)}{2}

In fact, we have:

P(D \mid X=0) + P(D \mid X=1) = \frac{D}{S} + \frac{\eta - D}{S} = \frac{2}{\eta + 1}

\sum_{D=0}^{\eta} P(D \mid X=1) = \sum_{D=0}^{\eta} \frac{D}{S} = \frac{S}{S} = 1

\sum_{D=0}^{\eta} P(D \mid X=0) = \sum_{D=0}^{\eta} \frac{\eta - D}{S} = \frac{\eta(\eta+1) - S}{S} = 1

Suppose the prior probability of X is uniform: P(X=0) = P(X=1). We have:

P(X \mid D) = \frac{P(D \mid X) P(X)}{P(D)} = \frac{P(D \mid X)}{P(D \mid X=0) + P(D \mid X=1)} = \frac{\eta+1}{2} P(D \mid X)

So the transformation coefficient k is \frac{\eta+1}{2}.
13. 3. X-gate inferences
• The diagnostic relationship is now extended to more than one hypothesis. Consider a simple graph consisting of one child variable Y and n parent variables Xi. Each relationship from Xi to Y is quantified by a normalized weight wi where 0 ≤ wi ≤ 1.
• Now we convert the graphic relationships of the simple graph into the CPTs of a simple BN. These relationships adhere to X-gates such as AND-gate, OR-gate, and SIGMA-gate. Relationship conversion is to determine the X-gate inference. The simple graph is also called X-gate graph or X-gate network.
(Figure: X-gate network)
14. 3. X-gate inferences
X-gate inference is based on the three following assumptions mentioned in (Neapolitan, 2003, p. 157):
• X-gate inhibition: Given a relationship from source Xi to target Y, there is a factor Ii that inhibits Xi from being integrated into Y.
• Inhibition independence: Inhibitions are mutually independent.
• Accountability: The X-gate network is established by accountable variables Ai for Xi and Ii. Each X-gate inference owns a particular combination of the Ai (s).
(Figure: extended X-gate network with accountable variables)
17. 3. X-gate inferences
• Given Ω = {X1, X2,…, Xn} where |Ω| = n is the cardinality of Ω.
• Let a(Ω) be an arrangement of Ω, which is a set of n instances {X1=x1, X2=x2,…, Xn=xn} where each xi is 1 or 0. The number of all a(Ω) is 2^{|Ω|}. For instance, given Ω = {X1, X2}, there are 2^2 = 4 arrangements: a(Ω) = {X1=1, X2=1}, a(Ω) = {X1=1, X2=0}, a(Ω) = {X1=0, X2=1}, a(Ω) = {X1=0, X2=0}. Let a(Ω:{Xi}) be an arrangement of Ω with fixed Xi.
• Let c(Ω) and c(Ω:{Xi}) be the numbers of arrangements a(Ω) and a(Ω:{Xi}), respectively.
• Let x denote the X-gate operator, for instance, x = ⨀ for AND-gate, x = ⨁ for OR-gate, x = not⨀ for NAND-gate, x = not⨁ for NOR-gate, x = ⨂ for XOR-gate, x = not⨂ for XNOR-gate, x = ⨄ for U-gate, and x = + for SIGMA-gate. Given an x-operator, let s(Ω:{Xi}) and s(Ω) be the sums of P(X1 x X2 x … x Xn) over every arrangement of Ω with and without fixed Xi, respectively:

s(\Omega) = \sum_{a(\Omega)} P(X_1 \,x\, X_2 \,x \cdots x\, X_n), \qquad s(\Omega{:}\{X_i\}) = \sum_{a(\Omega:\{X_i\})} P(X_1 \,x\, X_2 \,x \cdots x\, X_n)
18. 3. X-gate inferences
import java.util.ArrayList;
import java.util.List;

// Produces all n^r arrangements (tuples) of r positions, each taking a value
// in {0, 1, ..., n-1}. With n = 2 it yields the 2^r binary arrangements a(Ω).
public class ArrangementGenerator {
    private final List<int[]> arrangements = new ArrayList<>();
    private final int n, r;

    private ArrangementGenerator(int n, int r) {
        this.n = n;
        this.r = r;
    }

    // Recursively fills position i of the working array a with every value 0..n-1.
    private void create(int[] a, int i) {
        for (int j = 0; j < n; j++) {
            a[i] = j;
            if (i < r - 1) {
                create(a, i + 1);            // fill the next position
            } else {                          // i == r - 1: arrangement complete
                arrangements.add(a.clone()); // store a copy of the arrangement
            }
        }
    }

    // Public entry point (added for completeness): returns all arrangements.
    public static List<int[]> generate(int n, int r) {
        ArrangementGenerator g = new ArrangementGenerator(n, r);
        g.create(new int[r], 0);
        return g.arrangements;
    }
}
Code for producing all arrangements
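For example (an illustrative usage note, not in the original slides), calling ArrangementGenerator.generate(2, 3) returns the 2^3 = 8 binary arrangements of Ω = {X1, X2, X3}, which can then be iterated to accumulate sums such as s(Ω).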
19. 3. X-gate inferences
• Connection between s(Ω:{Xi=1}) and s(Ω:{Xi=0}), and between c(Ω:{Xi=1}) and c(Ω:{Xi=0}):

s(\Omega{:}\{X_i=1\}) + s(\Omega{:}\{X_i=0\}) = s(\Omega)
c(\Omega{:}\{X_i=1\}) + c(\Omega{:}\{X_i=0\}) = c(\Omega)

• Let K be the set of Xi (s) whose values are 1 and let L be the set of Xi (s) whose values are 0. K and L are mutually complementary:

K = \{i : X_i = 1\}, \quad L = \{i : X_i = 0\}, \quad K \cap L = \emptyset, \quad K \cup L = \{1, 2, \dots, n\}
20. 3. X-gate inferences
• AND-gate condition (3.7):

P(Y = 1 \mid A_i = OFF \text{ for some } i) = 0

• AND-gate inference (3.8):

P(X_1 \odot X_2 \odot \cdots \odot X_n) = P(Y=1 \mid X_1, X_2, \dots, X_n) = \begin{cases} \prod_{i=1}^{n} p_i & \text{if all } X_i \text{ (s) are } 1 \\ 0 & \text{if there exists at least one } X_i = 0 \end{cases}

P(Y=0 \mid X_1, X_2, \dots, X_n) = \begin{cases} 1 - \prod_{i=1}^{n} p_i & \text{if all } X_i \text{ (s) are } 1 \\ 1 & \text{if there exists at least one } X_i = 0 \end{cases}
21. 3. X-gate inferences
• OR-gate condition (3.9):

P(Y = 1 \mid A_i = ON \text{ for some } i) = 1

• OR-gate inference (3.10):

P(X_1 \oplus X_2 \oplus \cdots \oplus X_n) = 1 - P(Y=0 \mid X_1, X_2, \dots, X_n) = \begin{cases} 1 - \prod_{i \in K} (1 - p_i) & \text{if } K \neq \emptyset \\ 0 & \text{if } K = \emptyset \end{cases}

P(Y=0 \mid X_1, X_2, \dots, X_n) = \begin{cases} \prod_{i \in K} (1 - p_i) & \text{if } K \neq \emptyset \\ 1 & \text{if } K = \emptyset \end{cases}
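To make formulas 3.8 and 3.10 concrete, here is a small Java sketch (illustrative only; the class and method names are hypothetical) computing the AND-gate and OR-gate inferences from binary source values x[i] and weights p[i]:

// Illustrative sketch of the AND-gate (3.8) and OR-gate (3.10) inferences.
// x[i] in {0,1} is the value of X_i; p[i] is the weight of the arc from X_i to Y.
public class XGateInference {
    // P(Y=1 | X_1,...,X_n) for the AND-gate: product of all p_i if every X_i = 1.
    static double andGate(int[] x, double[] p) {
        double prod = 1.0;
        for (int i = 0; i < x.length; i++) {
            if (x[i] == 0) return 0.0; // one zero-valued source kills the AND-gate
            prod *= p[i];
        }
        return prod;
    }

    // P(Y=1 | X_1,...,X_n) for the OR-gate: 1 minus the product of (1 - p_i) over K.
    static double orGate(int[] x, double[] p) {
        double prodOff = 1.0;
        boolean kEmpty = true;
        for (int i = 0; i < x.length; i++) {
            if (x[i] == 1) { kEmpty = false; prodOff *= 1.0 - p[i]; }
        }
        return kEmpty ? 0.0 : 1.0 - prodOff;
    }

    public static void main(String[] args) {
        int[] x = {1, 1};
        double[] p = {0.5, 0.5};
        System.out.println(andGate(x, p)); // 0.25
        System.out.println(orGate(x, p));  // 0.75
    }
}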
22. 3. X-gate inferences
• NAND-gate inference and NOR-gate inference (3.11):

P(\text{not}(X_1 \odot X_2 \odot \cdots \odot X_n)) = \begin{cases} 1 - \prod_{i \in L} p_i & \text{if } L \neq \emptyset \\ 0 & \text{if } L = \emptyset \end{cases}

P(\text{not}(X_1 \oplus X_2 \oplus \cdots \oplus X_n)) = \begin{cases} \prod_{i=1}^{n} q_i & \text{if } K = \emptyset \\ 0 & \text{if } K \neq \emptyset \end{cases}
23. 3. X-gate inferences
• Two XOR-gate conditions (3.12):

P(Y=1 \mid A_i = ON \text{ for } i \in O, A_i = OFF \text{ for } i \notin O) = P(Y=1 \mid A_1=ON, A_2=OFF, \dots, A_{n-1}=ON, A_n=OFF) = 1

P(Y=1 \mid A_i = ON \text{ for } i \in E, A_i = OFF \text{ for } i \notin E) = P(Y=1 \mid A_1=OFF, A_2=ON, \dots, A_{n-1}=OFF, A_n=ON) = 1

• Let O be the set of Xi (s) whose indices are odd, and let O1 and O2 be the subsets of O in which all Xi (s) are 1 and 0, respectively. Let E be the set of Xi (s) whose indices are even, and let E1 and E2 be the subsets of E in which all Xi (s) are 1 and 0, respectively. The XOR-gate inference (3.13) is:

P(X_1 \otimes X_2 \otimes \cdots \otimes X_n) = P(Y=1 \mid X_1, X_2, \dots, X_n) = \begin{cases}
\prod_{i \in O1} p_i \prod_{i \in E1} (1 - p_i) + \prod_{i \in E1} p_i \prod_{i \in O1} (1 - p_i) & \text{if } O2 = \emptyset \text{ and } E2 = \emptyset \\
\prod_{i \in O1} p_i \prod_{i \in E1} (1 - p_i) & \text{if } O2 = \emptyset \text{ and } E1 \neq \emptyset \text{ and } E2 \neq \emptyset \\
\prod_{i \in O1} p_i & \text{if } O2 = \emptyset \text{ and } E1 = \emptyset \\
\prod_{i \in E1} p_i \prod_{i \in O1} (1 - p_i) & \text{if } E2 = \emptyset \text{ and } O1 \neq \emptyset \text{ and } O2 \neq \emptyset \\
\prod_{i \in E1} p_i & \text{if } E2 = \emptyset \text{ and } O1 = \emptyset \\
0 & \text{if } O2 \neq \emptyset \text{ and } E2 \neq \emptyset \\
0 & \text{if } n < 2 \text{ or } n \text{ is odd}
\end{cases}
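As a quick worked check (not in the original slides), take n = 2 so that O = {X1} and E = {X2}. If X1 = 1 and X2 = 1 then O2 = E2 = ∅ and formula 3.13 gives p1(1 − p2) + p2(1 − p1), the probability that exactly one source takes effect; if X1 = 1 and X2 = 0 then O2 = ∅ and E1 = ∅, giving p1; and if X1 = X2 = 0 then O2 ≠ ∅ and E2 ≠ ∅, giving 0.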
24. 3. X-gate inferences
• Two XNOR-gate conditions (3.14):

P(Y=1 \mid A_i = ON, \forall i) = 1
P(Y=1 \mid A_i = OFF, \forall i) = 1

• XNOR-gate inference (3.15):

P(\text{not}(X_1 \otimes X_2 \otimes \cdots \otimes X_n)) = P(Y=1 \mid X_1, X_2, \dots, X_n) = \begin{cases} \prod_{i=1}^{n} p_i + \prod_{i=1}^{n} (1 - p_i) & \text{if } L = \emptyset \\ \prod_{i \in K} (1 - p_i) & \text{if } L \neq \emptyset \text{ and } K \neq \emptyset \\ 1 & \text{if } L \neq \emptyset \text{ and } K = \emptyset \end{cases}
25. 3. X-gate inferences
Let U be the set of indices such that Ai = ON, and let α ≥ 0 and β ≥ 0 be predefined numbers. The U-gate inference is defined based on α, β, and the cardinality of U. Formula 3.16 specifies the common U-gate conditions:
• |U| = α: P(Y=1 | A1, A2,…, An) = 1 if there are exactly α variables Ai = ON; otherwise P(Y=1 | A1, A2,…, An) = 0.
• |U| ≥ α: P(Y=1 | A1, A2,…, An) = 1 if there are at least α variables Ai = ON; otherwise P(Y=1 | A1, A2,…, An) = 0.
• |U| ≤ β: P(Y=1 | A1, A2,…, An) = 1 if there are at most β variables Ai = ON; otherwise P(Y=1 | A1, A2,…, An) = 0.
• α ≤ |U| ≤ β: P(Y=1 | A1, A2,…, An) = 1 if the number of Ai = ON is from α to β; otherwise P(Y=1 | A1, A2,…, An) = 0.
26. 3. X-gate inferences
Let P_U denote the U-gate probability. Formula 3.17 specifies the U-gate inference and the cardinality of 𝒰, where 𝒰 is the set of subsets U of K.

• Let S_U = \sum_{U \in \mathcal{U}} \prod_{i \in U} p_i \prod_{j \in K \setminus U} (1 - p_j) and P_U = P(X_1 \uplus X_2 \uplus \cdots \uplus X_n) = P(Y=1 \mid X_1, X_2, \dots, X_n).
• As a convention, \prod_{i \in U} p_i = 1 if |U| = 0 and \prod_{j \in K \setminus U} (1 - p_j) = 1 if U = K.
• |U| = 0: P_U = \begin{cases} \prod_{j=1}^{n} (1 - p_j) & \text{if } |K| > 0 \\ 1 & \text{if } |K| = 0 \end{cases} and |𝒰| = 1.
• |U| ≥ 0: P_U = \begin{cases} S_U & \text{if } |K| > 0 \\ 1 & \text{if } |K| = 0 \end{cases} and |𝒰| = 2^{|K|}. The case |U| ≥ 0 is the same as the case |U| ≤ n.
• |U| = n: P_U = \begin{cases} \prod_{i=1}^{n} p_i & \text{if } |K| = n \\ 0 & \text{if } |K| < n \end{cases} and |𝒰| = \begin{cases} 1 & \text{if } |K| = n \\ 0 & \text{if } |K| < n \end{cases}.
27. 3. X-gate inferences
U-gate inference (continued):
• |U| = α with 0 < α < n: P_U = \begin{cases} S_U & \text{if } |K| \geq \alpha \\ 0 & \text{if } |K| < \alpha \end{cases} and |𝒰| = \begin{cases} \binom{|K|}{\alpha} & \text{if } |K| \geq \alpha \\ 0 & \text{if } |K| < \alpha \end{cases}.
• |U| ≥ α with 0 < α < n: P_U = \begin{cases} S_U & \text{if } |K| \geq \alpha \\ 0 & \text{if } |K| < \alpha \end{cases} and |𝒰| = \begin{cases} \sum_{j=\alpha}^{|K|} \binom{|K|}{j} & \text{if } |K| \geq \alpha \\ 0 & \text{if } |K| < \alpha \end{cases}.
• |U| ≤ β with 0 < β < n: P_U = \begin{cases} S_U & \text{if } |K| > 0 \\ 1 & \text{if } |K| = 0 \end{cases} and |𝒰| = \begin{cases} \sum_{j=0}^{\min(\beta,|K|)} \binom{|K|}{j} & \text{if } |K| > 0 \\ 1 & \text{if } |K| = 0 \end{cases}.
• α ≤ |U| ≤ β with 0 < α < n and 0 < β < n: P_U = \begin{cases} S_U & \text{if } |K| \geq \alpha \\ 0 & \text{if } |K| < \alpha \end{cases} and |𝒰| = \begin{cases} \sum_{j=\alpha}^{\min(\beta,|K|)} \binom{|K|}{j} & \text{if } |K| \geq \alpha \\ 0 & \text{if } |K| < \alpha \end{cases}.
28. 3. X-gate inferences
• The U-gate condition on |U| can be arbitrary; it is only relevant to the Ai (s) (ON or OFF) and the way the Ai (s) are combined. For example, AND-gate and OR-gate are specific cases of U-gate with |U| = n and |U| ≥ 1, respectively. XOR-gate and XNOR-gate are also specific cases of U-gate with specific conditions on the Ai (s).
• In this research, U-gate is the most general nonlinear gate, whose U-gate probability contains products of weights.
• Next, we will research a so-called SIGMA-gate that contains only a linear combination of weights.
29. 3. X-gate inferences
• The SIGMA-gate inference (Nguyen, 2016) represents the aggregation relationship satisfying the SIGMA-gate condition specified by formula 3.18:

P(Y) = P\left(\sum_{i=1}^{n} A_i\right) \quad (3.18)

Where the set of Ai (s) is complete and mutually exclusive:

\sum_{i=1}^{n} w_i = 1 \quad \text{and} \quad A_i \cap A_j = \emptyset, \forall i \neq j

• The sigma sum \sum_{i=1}^{n} A_i indicates that Y is the exclusive union of the Ai (s); it does not express arithmetical addition:

Y = \sum_{i=1}^{n} A_i = \bigcup_{i=1}^{n} A_i

P(Y) = P\left(\sum_{i=1}^{n} A_i\right) = P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i)
30. 3. X-gate inferences
• In general, formula 3.19 specifies the theorem of SIGMA-gate inference (Nguyen, 2016). The base of this theorem was mentioned by the authors (Millán & Pérez-de-la-Cruz, 2002, pp. 292-295).

P(X_1 + X_2 + \cdots + X_n) = P\left(\sum_{i=1}^{n} X_i\right) = P(Y=1 \mid X_1, X_2, \dots, X_n) = \sum_{i \in K} w_i

P(Y=0 \mid X_1, X_2, \dots, X_n) = 1 - \sum_{i \in K} w_i = \sum_{i \in L} w_i

Where the set of Ai (s) is complete and mutually exclusive:

\sum_{i=1}^{n} w_i = 1 \quad \text{and} \quad A_i \cap A_j = \emptyset, \forall i \neq j

• The next slide gives the proof of the SIGMA-gate theorem.
31. 3. X-gate inferences
Proof of SIGMA-gate theorem:

P(Y \mid X_1, X_2, \dots, X_n) = P\left(\sum_{i=1}^{n} A_i \,\Big|\, X_1, X_2, \dots, X_n\right) (due to the SIGMA-gate condition)

= \sum_{i=1}^{n} P(A_i \mid X_1, X_2, \dots, X_n) (because the Ai (s) are mutually exclusive)

= \sum_{i=1}^{n} P(A_i \mid X_i) (because Ai is only dependent on Xi)

It implies:

P(Y=1 \mid X_1, X_2, \dots, X_n) = \sum_{i=1}^{n} P(A_i = ON \mid X_i) = \sum_{i \in K} P(A_i = ON \mid X_i = 1) + \sum_{i \notin K} P(A_i = ON \mid X_i = 0) = \sum_{i \in K} w_i + \sum_{i \notin K} 0 = \sum_{i \in K} w_i (due to formula 3.3)
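Following formula 3.19, a minimal Java sketch (illustrative only, with hypothetical names) of the SIGMA-gate inference simply sums the weights of the sources whose value is 1:

// Illustrative sketch of the SIGMA-gate inference (3.19):
// P(Y=1 | X_1,...,X_n) is the sum of weights w_i over sources with X_i = 1,
// assuming the weights are complete: w_1 + ... + w_n = 1.
public class SigmaGateInference {
    static double sigmaGate(int[] x, double[] w) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            if (x[i] == 1) sum += w[i]; // each i in K contributes w_i
        }
        return sum; // P(Y=0 | ...) is 1 - sum, i.e., the weights over L
    }

    public static void main(String[] args) {
        double[] w = {0.5, 0.3, 0.2};                         // complete: sums to 1
        System.out.println(sigmaGate(new int[]{1, 0, 1}, w)); // 0.7
    }
}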
32. 3. X-gate inferences
• As usual, each arc in a simple graph is associated with a "clockwise" strength of the relationship between Xi and Y: the event Xi=1 causes the event Ai=ON with "clockwise" weight wi.
• I define a so-called "counterclockwise" strength of the relationship between Xi and Y, denoted ωi: the event Xi=0 causes the event Ai=OFF with "counterclockwise" weight ωi.
• In other words, each arc in the simple graph is associated with a clockwise weight wi and a counterclockwise weight ωi. Such a graph is called a bi-weight simple graph.
(Figure: bi-weight simple graph)
35. 3. X-gate inferences
• Formula 3.22 specifies the SIGMA-gate bi-inference:

P(X_1 + X_2 + \cdots + X_n) = \sum_{i \in K} w_i + \sum_{i \in L} d_i \quad (3.22)

Where the set of Xi (s) is complete and mutually exclusive:

\sum_{i=1}^{n} W_i = 1 \quad \text{and} \quad X_i \cap X_j = \emptyset, \forall i \neq j

• Following is the proof of the SIGMA-gate bi-inference:

P(X_1 + X_2 + \cdots + X_n) = \sum_{i=1}^{n} P(A_i = ON \mid X_i) = \sum_{i \in K} P(A_i = ON \mid X_i = 1) + \sum_{i \in L} P(A_i = ON \mid X_i = 0) = \sum_{i \in K} w_i + \sum_{i \in L} d_i
36. 4. Multi-hypothesis diagnostic relationship
• Given a simple graph, if we replace the target variable Y by an evidence D, we get a so-called multi-hypothesis diagnostic relationship whose property adheres to X-gate inference. Such relationship is called shortly the X-gate diagnostic relationship.
• According to the aforementioned X-gate network, the target variable must be binary whereas the evidence D can be numeric. Thus, we add an augmented binary target variable Y and then the evidence D is connected directly to Y. Finally, we have the X-gate diagnostic network, or X-D network.
40. 4. Multi-hypothesis diagnostic relationship
Diagnostic theorem: The given X-D network is a combination of the diagnostic relationship and X-gate inference:

P(Y=1 \mid X_1, X_2, \dots, X_n) = P(X_1 \,x\, X_2 \,x \cdots x\, X_n)

P(D \mid Y) = \begin{cases} \frac{D}{S} & \text{if } Y = 1 \\[4pt] \frac{M}{S} - \frac{D}{S} & \text{if } Y = 0 \end{cases}

The diagnostic condition of the X-D network is satisfied if and only if

s(\Omega) = \sum_{a(\Omega)} P(Y=1 \mid a(\Omega)) = 2^{|\Omega|-1}, \quad \forall \Omega \neq \emptyset

At that time, the transformation coefficient becomes k = \frac{N}{2}.

Note that the weights p_i = w_i and ρ_i = ω_i, which are inputs of s(Ω), are abstract variables. Thus, the equality s(Ω) = 2^{|Ω|−1} implies that all abstract variables are removed, and so s(Ω) does not depend on the weights.
41. 4. Multi-hypothesis diagnostic relationship
Proof of diagnostic theorem: The transformation coefficient is rewritten as follows:

k = \frac{2^{n-1} S}{2D(s(\Omega) - 2^{n-1}) + M(2^n - s(\Omega))}

Given the binary case when D = 0 and S = 1, we have 2^{n-1} = 2^{n-1} \cdot 1 = 2^{n-1} S while aD^j = a \cdot 0^j = 0. There is a contradiction, which implies that it is impossible to reduce k into the form k = \frac{aD^j}{bD^j}.

Therefore, if k is constant with regard to D then 2D(s(\Omega) - 2^{n-1}) + M(2^n - s(\Omega)) = C \neq 0, \forall D, where C is constant. We have:

\sum_D \left[ 2D(s(\Omega) - 2^{n-1}) + M(2^n - s(\Omega)) \right] = \sum_D C \Rightarrow 2S(s(\Omega) - 2^{n-1}) + NM(2^n - s(\Omega)) = NC \Rightarrow 2^n S = NC

It is implied that k = \frac{2^{n-1} S}{2D(s(\Omega) - 2^{n-1}) + M(2^n - s(\Omega))} = \frac{NC}{2C} = \frac{N}{2}

This holds:

2^n S = N\left[2D(s(\Omega) - 2^{n-1}) + M(2^n - s(\Omega))\right] = 2ND(s(\Omega) - 2^{n-1}) + 2S(2^n - s(\Omega))
\Rightarrow 2ND(s(\Omega) - 2^{n-1}) - 2S(s(\Omega) - 2^{n-1}) = 0
\Rightarrow (ND - S)(s(\Omega) - 2^{n-1}) = 0

Assuming ND = S, we have ND = S = \frac{NM}{2} \Rightarrow D = \frac{M}{2}. There is a contradiction because ND = S cannot hold for every D, which varies up to its maximum value M. Therefore, if k is constant with regard to D then s(Ω) = 2^{n−1}. Inversely, if s(Ω) = 2^{n−1} then k is:

k = \frac{2^{n-1} S}{2D(2^{n-1} - 2^{n-1}) + M(2^n - 2^{n-1})} = \frac{2^{n-1} S}{M \cdot 2^{n-1}} = \frac{N}{2}

In general, the event that k is constant with regard to D is equivalent to the event s(Ω) = 2^{n−1}.
42. 4. Multi-hypothesis diagnostic relationship
Formula 4.5 specifies the probabilities and transformation coefficient of the X-D network with AND-gate inference, called the AND-D network:

P(D \mid X_i = 1) = \frac{(2D - M)\prod_{i=1}^{n} p_i + 2^{n-1}(M - D)}{2^{n-1} S}

P(D \mid X_i = 0) = \frac{M - D}{S}

P(X_i = 1 \mid D) = \frac{(2D - M)\prod_{i=1}^{n} p_i + 2^{n-1}(M - D)}{(2D - M)\prod_{i=1}^{n} p_i + 2^n(M - D)}

P(X_i = 0 \mid D) = \frac{2^{n-1}(M - D)}{(2D - M)\prod_{i=1}^{n} p_i + 2^n(M - D)}

k = \frac{2^{n-1} S}{(2D - M)\prod_{i=1}^{n} p_i + 2^n(M - D)}

For convenience, we validate the diagnostic condition with a case of two sources: Ω = {X1, X2}, p1 = p2 = w1 = w2 = 0.5, and D ∈ {0, 1, 2, 3}. By applying the diagnostic theorem to the AND-D network: because s(Ω) = 0.25 ≠ 2^{|Ω|−1} = 2, the AND-D network does not satisfy the diagnostic condition.
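The following Java sketch (illustrative only; all names are hypothetical) enumerates the 2^n binary arrangements of Ω and accumulates s(Ω), reproducing s(Ω) = 0.25 for the AND-gate above and s(Ω) = 2 = 2^{|Ω|−1} for the SIGMA-gate with n = 2 and all weights 0.5:

// Illustrative check of the diagnostic theorem: s(Omega) is the sum of
// P(Y=1 | a(Omega)) over all 2^n binary arrangements a(Omega).
public class SOmegaCheck {
    interface Gate { double prob(int[] x, double[] w); }

    static double sOmega(Gate gate, double[] w) {
        int n = w.length;
        double sum = 0.0;
        for (int mask = 0; mask < (1 << n); mask++) { // enumerate arrangements
            int[] x = new int[n];
            for (int i = 0; i < n; i++) x[i] = (mask >> i) & 1;
            sum += gate.prob(x, w);
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] w = {0.5, 0.5};
        // AND-gate inference (formula 3.8)
        Gate and = (x, p) -> {
            double prod = 1.0;
            for (int i = 0; i < x.length; i++) {
                if (x[i] == 0) return 0.0;
                prod *= p[i];
            }
            return prod;
        };
        // SIGMA-gate inference (formula 3.19)
        Gate sigma = (x, ww) -> {
            double s = 0.0;
            for (int i = 0; i < x.length; i++) if (x[i] == 1) s += ww[i];
            return s;
        };
        System.out.println(sOmega(and, w));   // 0.25, not 2^(n-1) = 2: AND-D fails
        System.out.println(sOmega(sigma, w)); // 2.0 = 2^(n-1): SIGMA-D satisfies it
    }
}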
43. 4. Multi-hypothesis diagnostic relationship
• AND-gate, OR-gate, XOR-gate, and XNOR-gate do not satisfy the diagnostic condition, and so they should not be used to assess hypotheses. However, it is not yet determined whether U-gate and SIGMA-gate satisfy the diagnostic condition.
• Formula 4.6 specifies the probabilities of the SIGMA-D network in order to validate it, as follows:

P(D \mid X_i = 1) = \frac{(2D - M)w_i + M}{2S}

P(D \mid X_i = 0) = \frac{(M - 2D)w_i + M}{2S}

P(X_i = 1 \mid D) = \frac{(2D - M)w_i + M}{2M}

P(X_i = 0 \mid D) = \frac{(M - 2D)w_i + M}{2M}

k = \frac{N}{2}

• By applying the diagnostic theorem to the SIGMA-D network, we have s(Ω) = 2^{n-1} \sum_i w_i + 2^{n-1} \sum_i d_i = 2^{n-1} \sum_i (w_i + d_i) = 2^{n-1}, so the SIGMA-D network satisfies the diagnostic condition.
44. 4. Multi-hypothesis diagnostic relationship
• In the case of the SIGMA-gate, the augmented variable Y can be removed from the X-D network. The evidence D is then established as the direct target variable, which composes the so-called direct SIGMA-D network.
• The CPT of the direct SIGMA-D network is determined by formula 4.7:

P(D \mid X_1, X_2, \dots, X_n) = \sum_{i \in K} \frac{D}{S} w_i + \sum_{j \in L} \frac{M - D}{S} w_j \quad (4.7)

Where the set of Xi (s) is complete and mutually exclusive:

\sum_{i=1}^{n} w_i = 1 \quad \text{and} \quad X_i \cap X_j = \emptyset, \forall i \neq j
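A minimal Java sketch of the direct SIGMA-D CPT in formula 4.7 (illustrative; it assumes integer evidence D ∈ {0, 1, …, M} so that S = M(M+1)/2, and all names are hypothetical):

// Illustrative sketch of the direct SIGMA-D CPT (4.7):
// P(D | X_1,...,X_n) = sum_{i in K} (D/S) w_i + sum_{j in L} ((M-D)/S) w_j,
// assuming D ranges over {0, 1, ..., M} so that S = M(M+1)/2.
public class DirectSigmaD {
    static double cpt(int d, int[] x, double[] w, int m) {
        double s = m * (m + 1) / 2.0; // S = sum of all possible values of D
        double p = 0.0;
        for (int i = 0; i < x.length; i++) {
            p += (x[i] == 1) ? (d / s) * w[i] : ((m - d) / s) * w[i];
        }
        return p;
    }

    public static void main(String[] args) {
        double[] w = {0.5, 0.5}; // complete weights
        int m = 3;               // D in {0, 1, 2, 3}
        double total = 0.0;      // each CPT row sums to 1 over D
        for (int d = 0; d <= m; d++) total += cpt(d, new int[]{1, 0}, w, m);
        System.out.println(total); // 1.0
    }
}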
45. 4. Multi-hypothesis diagnostic relationship
• The direct SIGMA-D network shares the same conditional probabilities P(Xi|D) and P(D|Xi) with the SIGMA-D network, as seen in formula 4.6.
• The direct SIGMA-D network also satisfies the diagnostic condition, since its s(Ω) = 2^{n−1}.
46. 4. Multi-hypothesis diagnostic relationship
• The most general nonlinear X-D network is the U-D network, whereas the SIGMA-D network is the linear one. The aforementioned nonlinear X-D networks such as AND, OR, NAND, NOR, XOR, and XNOR are specific cases of the U-D network. Now we validate whether the U-D network satisfies the diagnostic condition.
• The U-gate inference given an arbitrary condition on U is

P(X_1 \uplus X_2 \uplus \cdots \uplus X_n) = \sum_{U \in \mathcal{U}} \prod_{i \in U \cap K} p_i \prod_{i \in U \cap L} (1 - \rho_i) \prod_{i \in \bar{U} \cap K} (1 - p_i) \prod_{i \in \bar{U} \cap L} \rho_i
47. 4. Multi-hypothesis diagnostic relationship
The function f is a sum of many large expressions, and each expression is a product of four possible sub-products (Π) as follows:

Expr = \prod_{i \in U \cap K} p_i \prod_{i \in U \cap L} (1 - \rho_i) \prod_{i \in \bar{U} \cap K} (1 - p_i) \prod_{i \in \bar{U} \cap L} \rho_i

In any case of degradation, there always exist expressions Expr having at least 2 sub-products (Π), for example:

Expr = \prod_{i \in U \cap K} p_i \prod_{i \in U \cap L} (1 - \rho_i)

Consequently, there always exist Expr (s) having at least 5 terms relevant to p_i and ρ_i if n ≥ 5, for example:

Expr = p_1 p_2 p_3 (1 - \rho_4)(1 - \rho_5)

Thus, the degree of f will be larger than or equal to 5 given n ≥ 5. Without loss of generality, each p_i or ρ_i is the sum of a variable x and a variable a_i or b_i, respectively. Note that all p_i, ρ_i, a_i, and b_i are abstract variables:

p_i = x + a_i, \quad \rho_i = x + b_i

The equation f − 2^{n−1} = 0 becomes the equation g(x) = 0, whose degree is m ≥ 5 if n ≥ 5:

g(x) = \pm x^m + C_1 x^{m-1} + \cdots + C_{m-1} x + C_m - 2^{n-1} = 0

Where the coefficients Ci (s) are functions of the ai (s) and bi (s). According to the Abel-Ruffini theorem (Wikipedia, Abel-Ruffini theorem, 2016), the equation g(x) = 0 has no algebraic solution when m ≥ 5. Thus, the abstract variables p_i and ρ_i cannot be eliminated entirely from g(x) = 0, which means there is no specification of the U-gate inference P(X_1 x X_2 x … x X_n) such that the diagnostic condition is satisfied.
48. 4. Multi-hypothesis diagnostic relationship
• It is concluded that there is no nonlinear X-D network satisfying the diagnostic condition, but a new question is raised: does there exist a general linear X-D network satisfying the diagnostic condition?
• Such a linear network is called the GL-D network, and the SIGMA-D network is a specific case of the GL-D network. The GL-gate probability must be a linear combination of the weights:

P(X_1 \,x\, X_2 \,x \cdots x\, X_n) = C + \sum_{i=1}^{n} \alpha_i w_i + \sum_{i=1}^{n} \beta_i d_i

• The GL-gate inference is singular if αi and βi are functions of only Xi, as follows:

P(X_1 \,x\, X_2 \,x \cdots x\, X_n) = C + \sum_{i=1}^{n} h_i(X_i) w_i + \sum_{i=1}^{n} g_i(X_i) d_i
49. 4. Multi-hypothesis diagnostic relationship
• Suppose hi and gi are probability mass functions with regard to Xi. For all i, we have:

0 \leq h_i(X_i) \leq 1, \quad 0 \leq g_i(X_i) \leq 1, \quad h_i(X_i=1) + h_i(X_i=0) = 1, \quad g_i(X_i=1) + g_i(X_i=0) = 1

• The GL-D network satisfies the diagnostic condition if

s(\Omega) = 2^n C + 2^{n-1} \sum_{i=1}^{n} (w_i + d_i) = 2^{n-1} \Rightarrow 2C + \sum_{i=1}^{n} (w_i + d_i) = 1

• Suppose the set of Xi (s) is complete; then \sum_{i=1}^{n} (w_i + d_i) = 1, which implies C = 0.
• Shortly, formula 4.10 specifies the singular GL-gate inference so that the GL-D network satisfies the diagnostic condition:

P(X_1 \,x\, X_2 \,x \cdots x\, X_n) = \sum_{i=1}^{n} h_i(X_i) w_i + \sum_{i=1}^{n} g_i(X_i) d_i \quad (4.10)

Where hi and gi are probability mass functions and the set of Xi (s) is complete:

\sum_{i=1}^{n} (w_i + d_i) = 1
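One can check (a worked note, not in the original slides) that the SIGMA-gate bi-inference of formula 3.22 is a special case of formula 4.10 with h_i(X_i) = X_i and g_i(X_i) = 1 − X_i: both are valid probability mass functions over X_i ∈ {0, 1}, and substituting them yields P = \sum_{i \in K} w_i + \sum_{i \in L} d_i.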
50. 4. Multi-hypothesis diagnostic relationship
• According to the authors (Millán & Pérez-de-la-Cruz, 2002), a hypothesis can have multiple evidences, as seen in the next figure. This is the multi-evidence diagnostic relationship, as opposed to the aforementioned multi-hypothesis diagnostic relationship; it is called shortly the M-E-D network.
• The joint probability of the M-E-D network is

P(Y, D_1, D_2, \dots, D_m) = P(Y) \prod_{j=1}^{m} P(D_j \mid Y) = P(Y) P(D_1, D_2, \dots, D_m \mid Y)

• The possible transformation coefficient is

\frac{1}{k} = \prod_{j=1}^{m} P(D_j \mid Y=1) + \prod_{j=1}^{m} P(D_j \mid Y=0)
51. 4. Multi-hypothesis diagnostic relationship
• The M-E-D network will satisfy the diagnostic condition if k = 1 because all hypotheses and evidences are binary, which requires that the following equation, specified by formula 4.11, have 2m real roots P(Dj|Y) for all m ≥ 2:

\prod_{j=1}^{m} P(D_j \mid Y=1) + \prod_{j=1}^{m} P(D_j \mid Y=0) = 1 \quad (4.11)

• Suppose equation 4.11 has 4 real roots as follows: a_1 = P(D_1=1 \mid Y=1), a_2 = P(D_2=1 \mid Y=1), b_1 = P(D_1=1 \mid Y=0), b_2 = P(D_2=1 \mid Y=0).
• From equation 4.11, it holds that either

a_1 = a_2 = 0, \quad b_1 = b_2, \quad \frac{a_1}{2} + \frac{b_1}{2} = 1 \Rightarrow b_1 = 2

or

a_1 = a_2 = 0.5, \quad b_1 = b_2, \quad \frac{a_1}{2} + \frac{b_1}{2} = 1 \Rightarrow b_1 = 1.5

which leads to a contradiction (b1 = 2 or b1 = 1.5 exceeds 1), and so it is impossible to apply the diagnostic condition to the M-E-D network.
52. 4. Multi-hypothesis diagnostic relationship
• It is impossible to model the M-E-D network by X-gates. A potential solution is to group the many evidences D1, D2,…, Dm into one representative evidence D, which in turn is dependent on the hypothesis Y; but this solution will be inaccurate in specifying conditional probabilities, because the directions of dependencies become inconsistent (relationships from Dj to D and from Y to D), unless all Dj (s) are removed and D becomes a vector. However, an evidence vector does not simplify the hazardous problem; it only changes the current problem into a new one.
• Another solution is to reverse the direction of the relationship, so that the hypothesis is dependent on the evidences and the X-gate inference can be taken advantage of as usual. However, the reversal method violates the viewpoint in this research that the diagnostic relationship must be from hypothesis to evidence.
53. 4. Multi-hypothesis diagnostic relationship
• Another solution to model the M-E-D network by X-gates is based on a so-called partial diagnostic condition, a loose case of the diagnostic condition for the M-E-D network, which is defined as follows:

P(Y \mid D_j) = k P(D_j \mid Y)

• The joint probability of the M-E-D network is:

P(Y, D_1, D_2, \dots, D_m) = P(Y) \prod_{j=1}^{m} P(D_j \mid Y)

• The M-E-D network satisfies the partial diagnostic condition because

P(Y \mid D_j) = \frac{1}{2} P(D_j \mid Y)

• The partial diagnostic condition expresses a different viewpoint. It is not an optimal solution, because we cannot test a disease based on only one symptom while ignoring other obvious symptoms, for example.
54. 4. Multi-hypothesis diagnostic relationship
• If we are successful in specifying the conditional probabilities of the M-E-D network, it is possible to define an extended network constituted of n hypotheses X1, X2,…, Xn and m evidences D1, D2,…, Dm. Such an extended network represents the multi-hypothesis multi-evidence diagnostic relationship, called the M-HE-D network.
• The M-HE-D network is the most general case of diagnostic network, which was mentioned in (Millán & Pérez-de-la-Cruz, 2002, p. 297). We can construct any large diagnostic BN from M-HE-D networks, and so the research is still open.
55. 5. Conclusion
• In short, relationship conversion is to determine conditional probabilities based on logic gates adhering to the semantics of relationships. The weak point of logic gates is the requirement that all variables be binary.
• In order to lessen the impact of this weak point, I use numeric evidence to extend the capacity of simple BN. However, the combination of binary hypotheses and numeric evidence can lead to errors or biases in inference. Therefore, I propose the diagnostic condition so as to confirm that numeric evidence is adequate for complicated inference tasks in BN.
• A large BN can be constituted of many simple BNs. Inference in a large BN is a hard problem. In the future, I will research effective inference methods for the special BN that is constituted of X-gate BNs.
• Moreover, I will try my best to research deeply the M-E-D network and the M-HE-D network, whose problems I cannot solve absolutely now.
56. 5. Conclusion
• The two main documents I referred to in this research are the book "Learning Bayesian Networks" by the author Neapolitan (2003) and the article "A Bayesian Diagnostic Algorithm for Student Modeling and its Evaluation" by the authors Millán and Pérez-de-la-Cruz (2002).
• Especially, the SIGMA-gate inference is based on and derived from the work of the authors Eva Millán and José Luis Pérez-de-la-Cruz.
• This research originated from my PhD research "A User Modeling System for Adaptive Learning" (Nguyen, 2014).
• Other references relevant to user modeling, the overlay model, and Bayesian network are (Fröschl, 2005), (De Bra, Smits, & Stash, 2006), (Murphy, 1998), and (Heckerman, 1995).
57. Thank you for your attention
58. References
1. De Bra, P., Smits, D., & Stash, N. (2006). The Design of AHA! In U. K. Wiil, P. J. Nürnberg, & J. Rubart (Ed.), Proceedings of the seventeenth ACM Hypertext
Conference on Hypertext and hypermedia (Hypertext '06) (pp. 171-195). Odense, Denmark: ACM Press.
2. Díez, F. J., & Druzdzel, M. J. (2007). Canonical Probabilistic Models. National University for Distance Education, Department of Inteligencia Artificial. Madrid:
Research Centre on Intelligent Decision-Support Systems. Retrieved May 9, 2016, from http://www.cisiad.uned.es/techreports/canonical.pdf
3. Fröschl, C. (2005). User Modeling and User Profiling in Adaptive E-learning Systems. Master Thesis, Graz University of Technology, Austria.
4. Heckerman, D. (1995). A Tutorial on Learning With Bayesian Networks. Microsoft Corporation, Microsoft Research. Redmond: Microsoft Research. Retrieved from
ftp://ftp.research.microsoft.com/pub/dtg/david/tutorial.ps
5. Kschischang, F. R., Frey, B. J., & Loeliger, H.-A. (2001, February). Factor Graphs and the Sum-Product Algorithm. IEEE Transactions on Information Theory, 47(2),
498-519. doi:10.1109/18.910572
6. Millán, E., & Pérez-de-la-Cruz, J. L. (2002, June). A Bayesian Diagnostic Algorithm for Student Modeling and its Evaluation. (A. Kobsa, Ed.) User Modeling and
User-Adapted Interaction, 12(2-3), 281-330. doi:10.1023/A:1015027822614
7. Millán, E., Loboda, T., & Pérez-de-la-Cruz, J. L. (2010, July 29). Bayesian networks for student model engineering. (R. S. Heller, J. D. Underwood, & C.-C. Tsai, Eds.)
Computers & Education, 55(4), 1663-1683. doi:10.1016/j.compedu.2010.07.010
8. Murphy, K. P. (1998). A Brief Introduction to Graphical Models and Bayesian Networks. Retrieved 2008, from Kevin P. Murphy's home page:
http://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html
9. Neapolitan, R. E. (2003). Learning Bayesian Networks. Upper Saddle River, New Jersey, USA: Prentice Hall.
10.Nguyen, L. (2014, April). A User Modeling System for Adaptive Learning. University of Science, Ho Chi Minh city, Vietnam. Abuja, Nigeria: Standard Research
Journals. Retrieved from http://standresjournals.org/journals/SSRE/Abstract/2014/april/Loc.html
11.Nguyen, L. (2016, March 28). Theorem of SIGMA-gate Inference in Bayesian Network. (V. S. Franz, Ed.) Wulfenia Journal, 23(3), 280-289.
12.Pearl, J. (1986, September). Fusion, propagation, and structuring in belief networks. Artificial Intelligence, 29(3), 241-288. doi:10.1016/0004-3702(86)90072-X
13.Wikipedia. (2014, October 10). Set (mathematics). (A. Rubin, Editor, & Wikimedia Foundation) Retrieved October 11, 2014, from Wikipedia website:
http://en.wikipedia.org/wiki/Set_(mathematics)
14.Wikipedia. (2015, November 22). Factor graph. (Wikimedia Foundation) Retrieved February 8, 2017, from Wikipedia website:
https://en.wikipedia.org/wiki/Factor_graph
15.Wikipedia. (2016, June 10). Abel-Ruffini theorem. (Wikimedia Foundation) Retrieved June 26, 2016, from Wikipedia website:
https://en.wikipedia.org/wiki/Abel%E2%80%93Ruffini_theorem
16.Wikipedia. (2016, June 2). Logic gate. (Wikimedia Foundation) Retrieved June 4, 2016, from Wikipedia website: https://en.wikipedia.org/wiki/Logic_gate