Regression analysis is an important statistical tool, and a common demand is to discover the essential independent variables among many candidates, especially when the number of random variables is huge. Extreme bound analysis is a powerful approach to extracting such important variables, called robust regressors. In this research, an algorithm called Regressive Expectation Maximization with RObust regressors (REMRO) is proposed as an alternative to other probabilistic methods for analyzing robust variables. Unlike those probabilistic methods, REMRO searches for the robust regressors that form the optimal regression model and sorts them in descending order of fitness values determined by two proposed concepts, local correlation and global correlation. Local correlation measures how well a regressor explains possible regression models, and global correlation reflects a regressor's independence and stand-alone capacity. Moreover, REMRO can tolerate incomplete data because it applies the Regressive Expectation Maximization (REM) algorithm, based on the ideology of the expectation maximization (EM) algorithm, to fill missing values with estimated ones. Experimental results show that REMRO models numeric regressors more accurately than traditional probabilistic methods such as the Sala-i-Martin method, but in this research REMRO cannot yet be applied to nonnumeric regression models.
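The sorting idea can be illustrated with a minimal sketch. This is not the REMRO algorithm itself: the fitness formula below (absolute correlation with the response, discounted by average correlation with the other regressors) is my own stand-in for the paper's local and global correlation concepts.

```python
import numpy as np

def rank_regressors(X, y):
    """Rank candidate regressors by a simple illustrative fitness score.

    'local' stands in for correlation with the response; 'global' stands in
    for stand-alone capacity (one minus mean absolute correlation with the
    other regressors). Their product is the fitness used for sorting.
    """
    n, p = X.shape
    fitness = []
    for j in range(p):
        local = abs(np.corrcoef(X[:, j], y)[0, 1])
        others = [abs(np.corrcoef(X[:, j], X[:, k])[0, 1])
                  for k in range(p) if k != j]
        global_ = 1.0 - (np.mean(others) if others else 0.0)
        fitness.append(local * global_)
    order = np.argsort(fitness)[::-1]   # descending order of fitness
    return list(order), fitness

# toy data: x0 drives y, x1 is noise, x2 nearly duplicates x0
rng = np.random.default_rng(0)
x0 = rng.normal(size=200)
x1 = rng.normal(size=200)
x2 = x0 + 0.01 * rng.normal(size=200)
y = 2.0 * x0 + 0.1 * rng.normal(size=200)
X = np.column_stack([x0, x1, x2])
order, fit = rank_regressors(X, y)
print(order[0] in (0, 2))   # a copy of the true driver ranks first
```

The duplicate regressor x2 is penalized by the global term but still outranks pure noise, which is the qualitative behavior the abstract describes.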
Bayesian Variable Selection in Linear Regression and A Comparison - Atilla YARDIMCI
In this study, Bayesian approaches such as Zellner, Occam's Window, and Gibbs sampling have been compared in terms of selecting the correct subset for variable selection in a linear regression model. The aim of this comparison is to analyze Bayesian variable selection and the behavior of classical criteria, taking into consideration different values of β and σ and prior expected levels.
A Proposal of Two-step Autoregressive Model - Loc Nguyen
The autoregressive (AR) model and the conditional autoregressive (CAR) model are specific regressive models in which the independent variables and the dependent variable refer to the same object. They are powerful statistical tools for predicting values based on correlation in the time domain and the space domain, which is useful in epidemiological analysis. In this research, I combine them in a simple way in which AR and CAR are estimated in two separate steps, so as to cover both the time domain and the space domain in spatial-temporal data analysis. Moreover, I integrate a logistic model into the CAR model, which aims to improve the competence of autoregressive models.
Using Mathematical Foundations To Study The Equivalence Between Mass And Ener... - QUESTJOURNAL
Abstract: This paper studies the equivalence between mass and energy in special relativity, using mathematical methods to connect this work to the de Broglie equation. The relation between momentum and energy is found; mass, momentum, and the speed of light are connected in the energy equation; and the relativistic equivalence between mass and energy is found to answer the logical relationship implied by the de Broglie equation.
On Approach of Estimation Time Scales of Relaxation of Concentration of Charg... - Zac Darcy
In this paper we generalize a recently introduced approach for estimating the time scales of mass transport. The approach is illustrated by estimating the time scales of relaxation of charge-carrier concentrations in a highly doped semiconductor. The diffusion coefficients, the mobility of charge carriers, and the electric field strength in the semiconductor could be arbitrary functions of the coordinate.
Covariance matrices are central to many adaptive filtering and optimisation problems. In practice, they have to be estimated from a finite number of samples; on this, I will review some known results from spectrum estimation and multiple-input multiple-output communications systems, and how properties that are assumed to be inherent in covariance and power spectral densities can easily be lost in the estimation process. I will discuss new results on space-time covariance estimation, and how the estimation from finite sample sets will impact on factorisations such as the eigenvalue decomposition, which is often key to solving the introductory optimisation problems. The purpose of the presentation is to give you some insight into estimating statistics as well as to provide a glimpse on classical signal processing challenges such as the separation of sources from a mixture of signals.
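The claim that properties of the true covariance are easily lost in estimation can be shown with a minimal sketch (the dimensions and sample count below are my own illustrative choices): a true covariance is positive definite, but the sample estimate from fewer samples than dimensions is necessarily rank-deficient.

```python
import numpy as np

# True covariance is the identity (positive definite), but the sample
# covariance estimated from only 5 samples in 8 dimensions is singular:
# its rank is at most n_samples - 1 after mean removal.
rng = np.random.default_rng(3)
dim, n_samples = 8, 5
samples = rng.normal(size=(n_samples, dim))  # i.i.d. unit-variance draws
R_hat = np.cov(samples, rowvar=False)        # 8x8 estimate from 5 samples

eigvals = np.linalg.eigvalsh(R_hat)
print(int(np.sum(eigvals > 1e-10)))          # numerical rank <= 4
```

Positive definiteness, which factorizations such as the eigenvalue decomposition rely on, is thus lost purely as an artifact of the finite sample set.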
Logistic Regression, Linear and Quadratic Discriminant Analyses, and KNN Tarek Dib
A summary of the classification methods: Logistic regression, Linear Discriminant Analysis, Quadratic Discriminant Analysis and a comparison of these three methods with K-Nearest Neighbors algorithm.
Computer Science
Active and Programmable Networks
Active safety systems
Ad Hoc & Sensor Network
Ad hoc networks for pervasive communications
Adaptive, autonomic and context-aware computing
Advanced Computing technology and its applications
Advanced Computing Architectures and New Programming Models
Advanced control and measurement
Aeronautical Engineering
Agent-based middleware
Alert applications
Automotive, marine and aero-space control and all other control applications
Autonomic and self-managing middleware
Autonomous vehicle
Biochemistry
Bioinformatics
BioTechnology (Chemistry, Mathematics, Statistics, Geology)
Broadband and intelligent networks
Broadband wireless technologies
CAD/CAM/CAT/CIM
Call admission and flow/congestion control
Capacity planning and dimensioning
Changing Access to Patient Information
Channel capacity modelling and analysis
Civil Engineering
Cloud Computing and Applications
Collaborative applications
Communication application
Communication architectures for pervasive computing
Communication systems
Computational intelligence
Computer and microprocessor-based control
Computer Architecture and Embedded Systems
Computer Business
Computer Sciences and Applications
Computer Vision
Computer-based information systems in health care
Computing Ethics
Computing Practices & Applications
Congestion and/or Flow Control
Content Distribution
Context-awareness and middleware
Creativity in Internet management and retailing
Cross-layer design and Physical layer based issue
Cryptography
Data Base Management
Data fusion
Data Mining
Data retrieval
Data Storage Management
Decision analysis methods
Decision making
Digital Economy and Digital Divide
Digital signal processing theory
Distributed Sensor Networks
Drives automation
Drug Design
Drug Development
DSP implementation
E-Business
E-Commerce
E-Government
Electronic transceiver device for Retail Marketing Industries
Electronics Engineering
Embedded Computer Systems
Emerging advances in business and its applications
Emerging signal processing areas
Enabling technologies for pervasive systems
Energy-efficient and green pervasive computing
Environmental Engineering
Estimation and identification techniques
Evaluation techniques for middleware solutions
Event-based, publish/subscribe, and message-oriented middleware
Evolutionary computing and intelligent systems
Expert approaches
Facilities planning and management
Flexible manufacturing systems
Formal methods and tools for designing
Fuzzy algorithms
Fuzzy logics
GPS and location-based app
Linear Discriminant Analysis and Its Generalization - 일상 온
A brief introduction to linear discriminant analysis and some extended methods. Much of the material is taken from The Elements of Statistical Learning by Hastie et al. (2008).
On the Principle of Optimality for Linear Stochastic Dynamic System - ijfcstjournal
In this work, processes represented by a linear stochastic dynamic system are investigated, and by considering the optimal control problem, the principle of optimality is proven. Also, for the existence of an optimal control and the corresponding optimal trajectory, proofs of the theorems of necessary and sufficient conditions are attained.
We propose a regularized method for multivariate linear regression when the number of predictors may exceed the sample size. This method is designed to strengthen the estimation and the selection of the relevant input features with three ingredients: it takes advantage of the dependency pattern between the responses by estimating the residual covariance; it performs selection on direct links between predictors and responses; and selection is driven by prior structural information. To this end, we build on a recent reformulation of the multivariate linear regression model to a conditional Gaussian graphical model and propose a new regularization scheme accompanied with an efficient optimization procedure. On top of showing very competitive performance on artificial and real data sets, our method demonstrates capabilities for fine interpretation of its parameters, as illustrated in applications to genetics, genomics and spectroscopy.
In this work, we propose to apply trust region optimization to deep reinforcement learning using a recently proposed Kronecker-factored approximation to the curvature. We extend the framework of natural policy gradient and propose to optimize both the actor and the critic using Kronecker-factored approximate curvature (K-FAC) with trust region; hence we call our method Actor Critic using Kronecker-Factored Trust Region (ACKTR). To the best of our knowledge, this is the first scalable trust region natural gradient method for actor-critic methods. It is also a method that learns non-trivial tasks in continuous control as well as discrete control policies directly from raw pixel inputs. We tested our approach across discrete domains in Atari games as well as continuous domains in the MuJoCo environment. With the proposed methods, we are able to achieve higher rewards and a 2- to 3-fold improvement in sample efficiency on average, compared to previous state-of-the-art on-policy actor-critic methods. Code is available at https://github.com/openai/baselines.
Class of Estimators of Population Median Using New Parametric Relationship fo... - inventionjournals
In this paper, we define a class of estimators of the population median using known information on the population mean (X̄) of the auxiliary variable, making use of a new parametric relationship for the population median. We derive the asymptotic expression for the MSE of any estimator of the proposed class, and also its minimum value. As the minimum MSE is the same for all estimators of the defined class, to choose the optimum estimator of the class for a given population with respect to bias as well, we consider some important sub-classes of the generalized class. The optimum biases of the considered estimators are obtained (up to terms of order n^-1) and compared with each other. Theoretical results are supported by an empirical study based on twelve populations, showing the superiority of the suggested estimator over others.
IJERA (International Journal of Engineering Research and Applications) is an international, online, ... peer-reviewed journal. For more details or to submit your article, please visit www.ijera.com
A PROBABILISTIC ALGORITHM OF COMPUTING THE POLYNOMIAL GREATEST COMMON DIVISOR... - ijscmcj
In earlier work, the subresultant algorithm was proposed to decrease the coefficient growth in the Euclidean algorithm for polynomials. However, the output polynomial remainders may have a small factor which can be removed to satisfy our needs. Later, an improved subresultant algorithm was given by representing the subresultant algorithm in another way, adding a variable called τ to express the small factor. A way to compute this variable was proposed by Brown, who worked at IBM. Nevertheless, that way failed to determine each τ correctly.
Adversarial Variational Autoencoders to extend and improve generative model -... - Loc Nguyen
Generative artificial intelligence (GenAI) has been developing with many incredible achievements like ChatGPT and Bard. The deep generative model (DGM) is a branch of GenAI that is preeminent in generating raster data such as images and sound, due to the strong points of deep neural networks (DNN) in inference and recognition. The built-in inference mechanism of a DNN, which simulates and aims at the synaptic plasticity of the human neural network, fosters the generation ability of DGMs, which produce surprising results with the support of statistical flexibility. Two popular approaches in DGM are Variational Autoencoders (VAE) and Generative Adversarial Networks (GAN). Both VAE and GAN have their own strong points, although they share the same underlying statistical theory as well as incredible complexity via the hidden layers of a DNN, where the DNN becomes an effective encoding/decoding function without concrete specifications. In this research, I try to unify VAE and GAN into a consistent and consolidated model called Adversarial Variational Autoencoders (AVA), in which VAE and GAN complement each other: VAE is a good data generator, encoding data via the excellent ideology of Kullback-Leibler divergence, and GAN is a significantly important method to assess whether data is realistic or fake. In other words, AVA aims to improve the accuracy of generative models; besides, AVA extends the function of simple generative models. In methodology, this research focuses on a combination of applied mathematical concepts and skillful computer programming techniques in order to implement and solve complicated problems as simply as possible.
Conditional mixture model and its application for regression model - Loc Nguyen
The expectation maximization (EM) algorithm is a powerful mathematical tool for estimating statistical parameters when the data sample contains a hidden part and an observed part. EM is applied to learn the finite mixture model, in which the whole distribution of the observed variable is a weighted sum of partial distributions, and the coverage ratio of every partial distribution is specified by the probability of the hidden variable. An application of the mixture model is soft clustering, in which a cluster is modeled by the hidden variable: each data point can be assigned to more than one cluster, and the degree of such assignment is represented by the probability of the hidden variable. However, this probability in the traditional mixture model is simplified as a parameter, which can cause a loss of valuable information. Therefore, in this research I propose a so-called conditional mixture model (CMM), in which the probability of the hidden variable is modeled as a full probability density function (PDF) with its own parameters. CMM aims to extend the mixture model. I also propose an application of CMM called the adaptive regression model (ARM). A traditional regression model is effective when the data sample is scattered evenly. If the data points are grouped into clusters, the regression model tries to learn a unified regression function which goes through all data points; obviously, such a unified function is not effective for evaluating the response variable on grouped data points. The concept "adaptive" in ARM means that ARM solves this ineffectiveness by first selecting the best cluster of data points and then evaluating the response variable within that cluster. In other words, ARM reduces the estimation space of the regression model so as to gain high accuracy in calculation.
Keywords: expectation maximization (EM) algorithm, finite mixture model, conditional mixture model, regression model, adaptive regression model (ARM).
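The "select the best cluster first, then regress within it" idea can be sketched minimally. This is not the paper's ARM (which builds on the conditional mixture model): the nearest-center assignment and per-cluster line fit below are my own illustrative simplifications.

```python
import numpy as np

# Two clusters of points with different linear trends: a single unified
# regression line through all points would fit neither trend well.
rng = np.random.default_rng(1)
x1 = rng.uniform(0, 1, 100); y1 = 3 * x1 + rng.normal(0, 0.05, 100)
x2 = rng.uniform(5, 6, 100); y2 = -2 * x2 + 20 + rng.normal(0, 0.05, 100)
x = np.concatenate([x1, x2]); y = np.concatenate([y1, y2])

centers = np.array([x1.mean(), x2.mean()])
models = []
for c in centers:
    mask = np.abs(x - c) < 2.0               # assign points to this center
    models.append(np.polyfit(x[mask], y[mask], 1))  # per-cluster line

def predict(x_new):
    k = int(np.argmin(np.abs(centers - x_new)))  # select best cluster first
    a, b = models[k]
    return a * x_new + b                          # evaluate within it

print(round(predict(0.5), 1))   # follows cluster 1's trend, y ~ 3x
print(round(predict(5.5), 1))   # follows cluster 2's trend, y ~ -2x + 20
```

Each query is answered by the regression model of its own cluster, which is the estimation-space reduction the abstract refers to.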
A Contrarian Treatise on Democracy (an overview of democracy and political institutions related to ... - Loc Nguyen
The universe has matter and antimatter; society has conflict and amity, so that it develops and declines, then declines and develops. I rely on this to justify an article of a reactionary nature that runs against the current of the times, but readers will find for themselves the inseparable meaning of social forms. Moreover, this article does not go deeply into legal research; it only offers an overview of democracy and political institutions in relation to philosophy and religion, whereby the contribution of the article is the concept of the "provisional" nature of the judiciary, which does not truly derive from election nor truly from appointment.
A Novel Collaborative Filtering Algorithm by Bit Mining Frequent Itemsets - Loc Nguyen
Collaborative filtering (CF) is a popular technique in recommendation research. Concretely, the items recommended to a user are determined by surveying her/his communities. There are two main CF approaches: memory-based and model-based. I propose a new model-based CF algorithm that mines frequent itemsets from the rating database; hence, items which belong to frequent itemsets are recommended to the user. My CF algorithm gives an immediate response because the mining task is performed in an offline process mode. I also propose another algorithm, called the Roller algorithm, for improving the process of mining frequent itemsets. The Roller algorithm is built on the heuristic assumption "the larger the support of an item is, the more likely it is that this item will occur in some frequent itemset." It is modeled on the whitewashing task, rolling a roller over a wall in such a way that it is capable of picking up frequent itemsets. Moreover, I provide enhanced techniques such as bit representation, bit matching, and bit mining in order to speed up the recommendation process. These techniques take advantage of bitwise operations (AND, NOT) so as to reduce storage space and make the algorithms run faster.
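The bit representation idea can be sketched as follows (the data layout and names are my illustrative assumptions, not the paper's): each item keeps a bit vector over users, so the support of an itemset is a population count of AND-ed vectors.

```python
# rows = users, columns = items (1 means the user rated the item)
ratings = [
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [1, 1, 0, 0],
]
n_users = len(ratings)
n_items = len(ratings[0])

# one integer bitmask per item, bit u set if user u rated the item
item_bits = [0] * n_items
for u, row in enumerate(ratings):
    for i, v in enumerate(row):
        if v:
            item_bits[i] |= 1 << u

def support(itemset):
    """Fraction of users containing all items: AND the masks, then count bits."""
    acc = (1 << n_users) - 1        # start from the all-users mask
    for i in itemset:
        acc &= item_bits[i]         # bitwise AND intersects user sets
    return bin(acc).count("1") / n_users

print(support([0, 1]))   # users 0, 1, 3 rated both items -> 0.75
```

One machine word covers many users at once, which is where the storage and speed gains from bitwise operations come from.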
Simple image deconvolution based on reverse image convolution and backpropaga... - Loc Nguyen
The deconvolution task is not central to convolutional neural networks (CNN), because recovering the convoluted image is not imperative when the convolutional layer's purpose is to extract features. However, deconvolution is useful in some cases for inspecting and reflecting a convolutional filter, as well as for trying to improve a generated image when information loss is not serious with regard to the trade-off between information loss and specific features such as edge detection and sharpening. This research proposes a duplicated and reverse process for recovering a filtered image. Firstly, the source layer and the target layer are reversed with respect to traditional image convolution so as to train the convolutional filter. Secondly, the trained filter is reversed again to derive a deconvolutional operator for recovering the filtered image. The reverse process is associated with the backpropagation algorithm, which is most popular in learning neural networks. Experimental results show that the proposed technique is better at learning filters that focus on discovering pixel differences. Therefore, the main contribution of this research is to inspect convolutional filters from data.
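A generic flavor of backpropagation-based deconvolution can be sketched in one dimension (this is a standard gradient-descent inversion under my own toy setup, not the paper's exact two-step reverse procedure): recover a signal blurred by a known filter by descending the squared reconstruction error.

```python
import numpy as np

# Blur a random 1-D signal with a known filter, then recover it by
# gradient descent on 0.5 * ||conv(x, k) - blurred||^2, i.e. by
# backpropagating the residual through the convolution.
rng = np.random.default_rng(2)
signal = rng.normal(size=64)
kernel = np.array([0.25, 0.5, 0.25])            # simple symmetric blur
blurred = np.convolve(signal, kernel, mode="same")

x = np.zeros_like(signal)                        # estimate of the signal
lr = 0.5
for _ in range(500):
    residual = np.convolve(x, kernel, mode="same") - blurred
    # gradient = correlation of the residual with the (flipped) kernel
    grad = np.convolve(residual, kernel[::-1], mode="same")
    x -= lr * grad

err = np.mean((x - signal) ** 2)
print(err < np.mean((blurred - signal) ** 2))    # recovery beats the blur
```

High-frequency components, which the blur nearly destroys, converge slowly; this mirrors the trade-off between information loss and sharpening mentioned in the abstract.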
Adversarial Variational Autoencoders to extend and improve generative modelLoc Nguyen
Generative artificial intelligence (GenAI) has been developing with many incredible achievements like ChatGPT and Bard. The deep generative model (DGM) is a branch of GenAI which is preeminent in generating raster data such as images and sound, due to the strong points of deep neural networks (DNN) in inference and recognition. The built-in inference mechanism of a DNN, which simulates and aims at the synaptic plasticity of the human neural network, fosters the generation ability of a DGM, which produces surprising results with the support of statistical flexibility. Two popular approaches in DGM are Variational Autoencoders (VAE) and Generative Adversarial Networks (GAN). Both VAE and GAN have their own strong points, although they share an underlying statistical theory as well as incredible complexity via the hidden layers of a DNN, where the DNN becomes an effective encoding/decoding function without concrete specifications. In this research, I try to unify VAE and GAN into a consistent and consolidated model called Adversarial Variational Autoencoders (AVA), in which VAE and GAN complement each other: VAE provides a good generator by encoding data via the excellent ideology of Kullback-Leibler divergence, and GAN is a significantly important method to assess the reliability of data as realistic or fake. In other words, AVA aims to improve the accuracy of generative models; besides, AVA extends the function of simple generative models. In methodology, this research focuses on the combination of applied mathematical concepts and skillful techniques of computer programming in order to implement and solve complicated problems as simply as possible.
Learning dyadic data and predicting unaccomplished co-occurrent values by mix...Loc Nguyen
Dyadic data, also called co-occurrence data (COD), contains co-occurrences of objects. Searching for statistical models to represent dyadic data is necessary. Fortunately, the finite mixture model is a solid statistical model for learning and making inferences on dyadic data, because a mixture model is built smoothly and reliably by the expectation maximization (EM) algorithm, which is suited to the inherent sparseness of dyadic data. This research summarizes mixture models for dyadic data. When each co-occurrence in dyadic data is associated with a value, there are many unaccomplished values because a lot of co-occurrences are nonexistent. In this research, these unaccomplished values are estimated as the mean (expectation) of a random variable given the partial probabilistic distributions inside the dyadic mixture model.
Machine learning forks into three main branches: supervised learning, unsupervised learning, and reinforcement learning, where reinforcement learning has much potential for artificial intelligence (AI) applications because it solves real problems by a progressive process in which possible solutions are improved and fine-tuned continuously. The progressive approach, which reflects the ability of adaptation, is appropriate to the real world, where most events occur and change continuously and unexpectedly. Moreover, data is getting too huge for supervised learning and unsupervised learning to draw valuable knowledge from at one time. Bayesian optimization (BO) models an optimization problem as a probabilistic form called the surrogate model and then directly maximizes an acquisition function created from that surrogate model, in order to maximize implicitly and indirectly the target function and find the solution of the optimization problem. A popular surrogate model is the Gaussian process regression model. The process of maximizing the acquisition function is based on updating the posterior probability of the surrogate model repeatedly, which improves after every iteration. Taking advantage of an acquisition function or utility function is also common in decision theory, but the semantic meaning behind BO is that BO solves problems by a progressive and adaptive approach, updating the surrogate model from a small piece of data at a time, according to the ideology of reinforcement learning. Undoubtedly, BO is a reinforcement learning algorithm with many potential applications, and thus it is surveyed in this research with attention to its mathematical ideas. Moreover, the solution of the optimization problem is important not only to applied mathematics but also to AI.
Support vector machine is a powerful machine learning method in data classification. Using it for applied research is easy, but comprehending it for further development requires a lot of effort. This report is a tutorial on the support vector machine, complete with mathematical proofs and examples, which helps researchers to understand it in the fastest way, from theory to practice. The report focuses on the theory of optimization, which is the basis of the support vector machine.
There are many ways to invest, such as bank deposits, enterprise business, and stock investment. A bank deposit is a safe and easy way to invest, and hence it is known as a reference tool for comparing or making decisions about other investment methods. Alternately, stock investment is a preferred method, with good feelings about its preeminence. However, according to the mathematical model, stock investment and bank deposits have the same benefit if their growth rate and interest rate are the same. Therefore, I propose a so-called jagged stock investment (JSI) strategy, in which the chain of stock purchases in a given time interval is modeled as a saw, with the expectation that the JSI strategy is frequently profitable.
Global optimization is an imperative development of local optimization, because many problems in artificial intelligence and machine learning require highly accurate solutions over the entire domain. There are many methods to solve global optimization, which can be classified into three groups: analytic methods (purely mathematical methods), probabilistic methods, and heuristic methods. Especially, heuristic methods like particle swarm optimization and artificial bee colony attract researchers because of their effective and practical techniques, which are easy to implement in computer programming languages. However, these heuristic methods lack a theoretical mathematical foundation. Fortunately, minima distribution establishes a strict mathematical relationship between the optimized target function and its global minima. In this research, I try to study minima distribution and apply it to explaining the convergence and convergence speed of optimization algorithms. Especially, weak conditions of convergence and monotonicity within minima distribution are drawn so as to be appropriate for practical optimization methods.
This is chapter 4 “Variants of EM algorithm” in my book “Tutorial on EM algorithm”, which focuses on EM variants. The main purpose of expectation maximization (EM) algorithm, also GEM algorithm, is to maximize the log-likelihood L(Θ) = log(g(Y|Θ)) with observed data Y by maximizing the conditional expectation Q(Θ’|Θ). Such Q(Θ’|Θ) is defined fixedly in E-step. Therefore, most variants of EM algorithm focus on how to maximize Q(Θ’|Θ) in M-step more effectively so that EM is faster or more accurate.
This is the chapter 3 "Properties and convergence of EM algorithm" in my book “Tutorial on EM algorithm”, which focuses on mathematical explanation of the convergence of GEM algorithm given by DLR (Dempster, Laird, & Rubin, 1977, pp. 6-9).
Local optimization with a convex function is solved perfectly by traditional mathematical methods such as Newton-Raphson and gradient descent, but it is not easy to solve global optimization with an arbitrary function, although there are some purely mathematical approaches such as approximation, cutting plane, branch and bound, and interval methods, which can be impractical because of their complexity and high computational cost. Recently, some evolutionary algorithms inspired by biological activities have been proposed to solve global optimization at an acceptable heuristic level. Among them is the particle swarm optimization (PSO) algorithm, which has proved to be an effective and feasible solution for global optimization in real applications. Although the ideology of PSO is not complicated, it has many variants, which can make new researchers confused. Therefore, this tutorial focuses on describing, systemizing, and classifying PSO in a succinct and straightforward way. Moreover, a combination of PSO with another evolutionary algorithm, the artificial bee colony (ABC) algorithm, for improving PSO itself or solving other advanced problems, is mentioned too.
Maximum likelihood estimation (MLE) is a popular method for parameter estimation in both applied probability and statistics, but MLE cannot solve the problem of incomplete data or hidden data, because it is impossible to maximize the likelihood function from hidden data. The expectation maximization (EM) algorithm is a powerful mathematical tool for solving this problem if there is a relationship between hidden data and observed data. Such a hinting relationship is specified by a mapping from hidden data to observed data or by a joint probability between hidden data and observed data (showing MLE, EM, and practical EM; the hidden info implies the hinting relationship).
The essential ideology of EM is to maximize the expectation of likelihood function over observed data based on the hinting relationship instead of maximizing directly the likelihood function of hidden data (showing the full EM with proof along with two steps).
An important application of EM is (finite) mixture model which in turn is developed towards two trends such as infinite mixture model and semiparametric mixture model. Especially, in semiparametric mixture model, component probabilistic density functions are not parameterized. Semiparametric mixture model is interesting and potential for other applications where probabilistic components are not easy to be specified (showing mixture models).
I raise a question that whether it is possible to backward discover semiparametric EM from semiparametric mixture model. I hope that this question will open a new trend or new extension for EM algorithm (showing the question).
Introduction:
RNA interference (RNAi) or Post-Transcriptional Gene Silencing (PTGS) is an important biological process for modulating eukaryotic gene expression.
It is a highly conserved process of posttranscriptional gene silencing by which double-stranded RNA (dsRNA) causes sequence-specific degradation of mRNA sequences.
dsRNA-induced gene silencing (RNAi) has been reported in a wide range of eukaryotes, including worms, insects, mammals, and plants.
This process mediates resistance to both endogenous parasitic and exogenous pathogenic nucleic acids, and regulates the expression of protein-coding genes.
What are small ncRNAs?
micro RNA (miRNA)
short interfering RNA (siRNA)
Properties of small non-coding RNA:
Involved in silencing mRNA transcripts.
Called “small” because they are usually only about 21-24 nucleotides long.
Synthesized by first cutting up longer precursor sequences (like the 61nt one that Lee discovered).
Silence an mRNA by base pairing with some sequence on the mRNA.
Discovery of siRNA?
The first small RNA:
In 1993, Rosalind Lee (Victor Ambros lab) was studying a non-coding gene in C. elegans, lin-4, that was involved in silencing of another gene, lin-14, at the appropriate time in the development of the worm C. elegans.
Two small transcripts of lin-4 (22nt and 61nt) were found to be complementary to a sequence in the 3' UTR of lin-14.
Because lin-4 encoded no protein, she deduced that it must be these transcripts that are causing the silencing by RNA-RNA interactions.
Types of RNAi (non-coding RNA)
miRNA
Length: 23-25 nt
Trans-acting
Binds its target mRNA with mismatches
Translation inhibition
siRNA
Length: 21 nt
Cis-acting
Binds its target mRNA as a perfectly complementary sequence
piRNA (Piwi-interacting RNA)
Length: 25 to 36 nt
Expressed in germ cells
Regulates transposon activity
MECHANISM OF RNAI:
First the double-stranded RNA teams up with a protein complex named Dicer, which cuts the long RNA into short pieces.
Then another protein complex called RISC (RNA-induced silencing complex) discards one of the two RNA strands.
The RISC-docked, single-stranded RNA then pairs with the homologous mRNA and destroys it.
THE RISC COMPLEX:
RISC is a large (>500 kDa) multi-protein RNA-binding complex which triggers mRNA degradation in response to siRNA.
Unwinding of the double-stranded siRNA is performed by an ATP-independent helicase.
The active component of RISC is the Ago protein (an endonuclease), which cleaves the target mRNA.
DICER: endonuclease (RNase III family)
Argonaute: Central Component of the RNA-Induced Silencing Complex (RISC)
One strand of the dsRNA produced by Dicer is retained in the RISC complex in association with Argonaute
ARGONAUTE PROTEIN:
1. PAZ (PIWI/Argonaute/Zwille): recognition of the target mRNA.
2. PIWI (P-element induced wimpy testis): breaks the phosphodiester bond of the mRNA (RNase H activity).
miRNA:
Double-stranded RNAs are naturally produced in eukaryotic cells during development, and they have a key role in regulating gene expression.
Multi-source connectivity as the driver of solar wind variability in the heli...Sérgio Sacani
The ambient solar wind that fills the heliosphere originates from multiple sources in the solar corona and is highly structured. It is often described as high-speed, relatively homogeneous plasma streams from coronal holes and slow-speed, highly variable streams whose source regions are under debate. A key goal of ESA/NASA's Solar Orbiter mission is to identify solar wind sources and understand what drives the complexity seen in the heliosphere. By combining magnetic field modelling and spectroscopic techniques with high-resolution observations and measurements, we show that the solar wind variability detected in situ by Solar Orbiter in March 2022 is driven by spatio-temporal changes in the magnetic connectivity to multiple sources in the solar atmosphere. The magnetic field footpoints connected to the spacecraft moved from the boundaries of a coronal hole to one active region (12961) and then across to another region (12957). This is reflected in the in situ measurements, which show the transition from fast to highly Alfvénic and then to slow solar wind that is disrupted by the arrival of a coronal mass ejection. Our results describe solar wind variability at 0.5 au but are applicable to near-Earth observatories.
Nutraceutical market, scope and growth: Herbal drug technologyLokesh Patil
As consumer awareness of health and wellness rises, the nutraceutical market, which includes goods like functional foods, drinks, and dietary supplements that provide health advantages beyond basic nutrition, is growing significantly. As healthcare expenses rise, the population ages, and people increasingly want natural and preventative health solutions, this industry is expanding quickly. Product formulation innovations and the use of cutting-edge technology for customized nutrition further drive market expansion. With its worldwide reach, the nutraceutical industry is expected to keep growing and to provide significant opportunities for research and investment across a number of categories, including vitamins, minerals, probiotics, and herbal supplements.
Professional air quality monitoring systems provide immediate, on-site data for analysis, compliance, and decision-making.
Monitor common gases, weather parameters, particulates.
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Sérgio Sacani
Since volcanic activity was first discovered on Io from Voyager images in 1979, changes on Io's surface have been monitored from both spacecraft and ground-based telescopes. Here, we present the highest spatial resolution images of Io ever obtained from a ground-based telescope. These images, acquired by the SHARK-VIS instrument on the Large Binocular Telescope, show evidence of a major resurfacing event on Io's trailing hemisphere. When compared to the most recent spacecraft images, the SHARK-VIS images show that a plume deposit from a powerful eruption at Pillan Patera has covered part of the long-lived Pele plume deposit. Although this type of resurfacing event may be common on Io, few have been detected due to the rarity of spacecraft visits and the previously low spatial resolution available from Earth-based telescopes. The SHARK-VIS instrument ushers in a new era of high-resolution imaging of Io's surface using adaptive optics at visible wavelengths.
PRESENTATION ABOUT PRINCIPLES OF COSMETIC EVALUATION
Extreme bound analysis based on correlation coefficient for optimal regression model
1. Extreme bound analysis based on correlation coefficient for optimal regression model
Prof. Dr. Loc Nguyen
Loc Nguyen’s Academic Network, Vietnam
Email: ng_phloc@yahoo.com, Homepage: www.locnguyen.net
Prof. Dr. Ali A. Amer
Computer Science Department, TAIZ University, TAIZ, Yemen
Email: aliaaa2004@yahoo.com
Extreme bound analysis correlation
13/12/2022 1
TASHKENT 2ND-International Congress on Modern Sciences
Tashkent Chemical-Technological Institute
December 16-17, 2022, Tashkent, UZBEKISTAN
2. Abstract
Regression analysis is an important tool in statistical analysis, in which there is a demand for discovering essential independent variables among many others, especially when there is a huge number of random variables. Extreme bound analysis is a powerful approach to extract such important variables, called robust regressors. In this research, a so-called Regressive Expectation Maximization with RObust regressors (REMRO) algorithm is proposed as an alternative to other probabilistic methods for analyzing robust variables. Following a different ideology from the probabilistic methods, REMRO searches for the robust regressors that form the optimal regression model and sorts them in descending order of their fitness values, which are determined by two proposed concepts: local correlation and global correlation. Local correlation represents sufficient explanatory power with respect to possible regression models, while global correlation reflects the independence level and stand-alone capacity of regressors. Moreover, REMRO can resist incomplete data because it applies the Regressive Expectation Maximization (REM) algorithm to fill missing values with estimated values, based on the ideology of the expectation maximization (EM) algorithm. Experimental results show that REMRO models numeric regressors more accurately than traditional probabilistic methods like the Sala-I-Martin method, but REMRO cannot yet be applied to nonnumeric regression models in this research.
4. 1. Introduction
Given a dependent random variable Z and a set of independent random variables X = (1, X1, X2,…, Xn)T, regression analysis aims to build a regression function Z = α0 + α1X1 + α2X2 + … + αnXn, called the regression model, from a sample (X, z) of size N. As a convention, the Xj are called regressors and Z is called the responsor, whereas α = (α0, α1, α2,…, αn)T are called regressive coefficients. The sample (X, z) takes the form of a data matrix as follows:
$$\boldsymbol{X} = \begin{pmatrix} \boldsymbol{x}_1^T \\ \boldsymbol{x}_2^T \\ \vdots \\ \boldsymbol{x}_N^T \end{pmatrix} = \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1n} \\ 1 & x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{N1} & x_{N2} & \cdots & x_{Nn} \end{pmatrix}, \quad \boldsymbol{x}_i = \begin{pmatrix} 1 \\ x_{i1} \\ x_{i2} \\ \vdots \\ x_{in} \end{pmatrix}, \quad \boldsymbol{x}^{(j)} = \begin{pmatrix} x_{1j} \\ x_{2j} \\ \vdots \\ x_{Nj} \end{pmatrix}, \quad \boldsymbol{z} = \begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_N \end{pmatrix}, \quad \boldsymbol{Z} = \begin{pmatrix} 1 & z_1 \\ 1 & z_2 \\ \vdots & \vdots \\ 1 & z_N \end{pmatrix} \quad (1.1)$$
Therefore, xij and zi are the ith instances of regressor Xj and responsor Z at the ith row of the matrix (X, z). Because the sample (X, z) can be incomplete in this research, X and z can contain missing values; let $z_i^-$ and $x_{ij}^-$ denote the missing values of responsor Z and regressor Xj at the ith row of the matrix (X, z).
5. 1. Introduction
When both responsor and regressors are random variables, the assumption of their normal distribution is
specified by the probability density function (PDF) of Z as follows:
$$P(Z \mid X, \boldsymbol{\alpha}) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{\left(Z - \boldsymbol{\alpha}^T X\right)^2}{2\sigma^2}\right) \quad (1.2)$$
Note, αᵀX and σ² are the mean and variance of Z with regard to P(Z | X, α), respectively. The popular technique to build the regression model is the least squares method, which produces the same result as the likelihood method based on the PDF of Z, but the likelihood method additionally yields an estimate of the variance σ². The PDF P(Z | X, α) is essential to calculate the likelihood function of a given sample. Let $\hat{\boldsymbol{\alpha}} = (\hat{\alpha}_0, \hat{\alpha}_1, \hat{\alpha}_2, \ldots, \hat{\alpha}_n)^T$ be the estimates of the regressive coefficients α = (α0, α1, α2,…, αn)T resulting from the least squares method or likelihood method; then the estimate of responsor Z is easily calculated by the regression function as follows:

$$\hat{Z} = \hat{\alpha}_0 + \sum_{j=1}^{n} \hat{\alpha}_j X_j = \hat{\boldsymbol{\alpha}}^T X \quad (1.3)$$
When there is a large number of random variables, which consumes a lot of computing resources to produce a regression model, there is a demand for discovering the essential independent variables among the many others. Extreme bound analysis (EBA) is a powerful approach to extract such important variables, called robust regressors. Traditional EBA methods focus on taking advantage of the probabilistic appropriateness of regressors.
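The least squares estimation described above can be sketched numerically. The following minimal example (assuming NumPy; the data and variable names are illustrative) builds the design matrix X with a leading column of ones for the intercept and recovers the regressive coefficients used in equation (1.3):

```python
import numpy as np

# Sketch of fitting Z = alpha^T X by least squares; the first column of ones
# in the design matrix carries the intercept alpha_0.
rng = np.random.default_rng(0)
N, n = 50, 2
X = np.column_stack([np.ones(N), rng.normal(size=(N, n))])  # rows (1, X1, X2)
true_alpha = np.array([1.0, 2.0, -3.0])
z = X @ true_alpha  # noiseless sample, so the estimate is exact

alpha_hat, *_ = np.linalg.lstsq(X, z, rcond=None)  # (alpha_0, alpha_1, alpha_2)
z_hat = X @ alpha_hat  # estimated responsor per equation (1.3)
print(np.allclose(alpha_hat, true_alpha))  # → True
```

With noisy data the recovered coefficients would only approximate the true ones; the noiseless sample here just makes the check unambiguous.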
6. 1. Introduction
Concerning the domain of EBA, let A, B, and C be the free set, focus set, and doubtful set of regressors, respectively; the regression function of model k is rewritten, without loss of its meaning, as follows:

$$\hat{Z}(k) = \alpha_0 + \boldsymbol{\alpha}_A^T A + \alpha_k X_k + \boldsymbol{\alpha}_D^T D \quad (1.4)$$
Where D is a combination of regressors taken from the doubtful set C without regressor Xk; consequently, αA and αD are the regressive coefficients extracted from α corresponding to the free set A and the combination set D, respectively. According to Levine, Renelt, and Leamer, supposing the variance of each model k is σk², if the 95% confidence interval of αk, namely [αk − 1.96σk, αk + 1.96σk] (Hlavac, 2016, p. 4), lies entirely above or below 0, then the regressor Xk is robust. Alternately, Sala-I-Martin estimated the mean ᾱk of αk weighted by K likelihood values over K models, where K is the number of combinations taken from the doubtful set C. Later on, Sala-I-Martin calculated a fitness value for every regressor Xk, represented by the cumulative distribution function at 0, denoted cdf(0), given the mean ᾱk and model variance σk². The larger the cdf(0) is, the more robust the regressor is. In general, these probabilistic methods are effective enough to apply to any data types of regressors and responsor, although they may not exactly evaluate regressors that are independent of any models, because the probabilistic analysis inside these methods requires concrete regression models which are already built. Therefore, in this research, an alternative method based on correlation is proposed beside these probabilistic methods for analyzing robust variables, in which highly independent regressors receive more attention than ever. The proposed algorithm is described in the next section.
7. 2. Methodology
In this section, we describe the proposed EBA method based on correlation coefficient for the optimal regression model. Essentially, two concepts of correlation are proposed: local correlation and global correlation. Local correlation, also called model correlation, implies the fitness of a target regressive parameter with respect to a given regression model. Note, the regressive parameter α = (α0, α1, α2,…, αn)T is the set of regressive coefficients corresponding to the regressors X = (X1, X2,…, Xn), and let Z and $\hat{Z}$ be the responsor and its estimate, respectively. Given regression model k, let $R_k(X_j, \hat{Z})$ and $R_k(\hat{Z}, Z)$ be the correlation between Xj and $\hat{Z}$ and the correlation between $\hat{Z}$ and Z within model k, respectively.
$$R_k(X_j, \hat{Z}) = \frac{\sum_{i=1}^{N}\left(x_{ij} - \bar{x}_j\right)\left(\hat{z}_i - \bar{\hat{z}}\right)}{\sqrt{\sum_{i=1}^{N}\left(x_{ij} - \bar{x}_j\right)^2}\sqrt{\sum_{i=1}^{N}\left(\hat{z}_i - \bar{\hat{z}}\right)^2}}, \quad R_k(\hat{Z}, Z) = \frac{\sum_{i=1}^{N}\left(\hat{z}_i - \bar{\hat{z}}\right)\left(z_i - \bar{z}\right)}{\sqrt{\sum_{i=1}^{N}\left(\hat{z}_i - \bar{\hat{z}}\right)^2}\sqrt{\sum_{i=1}^{N}\left(z_i - \bar{z}\right)^2}} \quad (2.1)$$

Where,

$$\bar{x}_j = \frac{1}{N}\sum_{i=1}^{N} x_{ij}, \quad \bar{z} = \frac{1}{N}\sum_{i=1}^{N} z_i, \quad \bar{\hat{z}} = \frac{1}{N}\sum_{i=1}^{N} \hat{z}_i, \quad \hat{z}_i = \hat{\boldsymbol{\alpha}}^T \boldsymbol{x}_i = \hat{\alpha}_0 + \sum_{j=1}^{n} \hat{\alpha}_j x_{ij}$$
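The two correlations in equation (2.1) are plain Pearson coefficients computed against the model estimate ẑ. A minimal sketch (the data values are hypothetical, not taken from the paper):

```python
import numpy as np

def pearson(a, b):
    """Plain correlation coefficient, matching equation (2.1) term by term."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    num = np.sum((a - a.mean()) * (b - b.mean()))
    den = np.sqrt(np.sum((a - a.mean()) ** 2) * np.sum((b - b.mean()) ** 2))
    return num / den

# Toy model k: x_j is one regressor column, z the responsor, z_hat the estimate.
x_j = np.array([1.0, 2.0, 3.0, 4.0])
z = np.array([2.1, 3.9, 6.2, 7.8])
z_hat = np.array([2.0, 4.0, 6.0, 8.0])

r_xj_zhat = pearson(x_j, z_hat)  # R_k(X_j, Z_hat)
r_zhat_z = pearson(z_hat, z)     # R_k(Z_hat, Z)
```

Here `z_hat` is perfectly linear in `x_j`, so the first correlation is exactly 1 while the second stays slightly below 1 because of the noise in `z`.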
8. 2. Methodology
Let Rk(Xj, Z) be the local correlation of Xj and Z within model k. Obviously, Rk(Xj, Z) reflects the fitness or appropriateness of the regressive coefficient estimate $\hat{\alpha}_j$ regarding model k. The local correlation Rk(Xj, Z) is defined as the product of $R_k(X_j, \hat{Z})$ and $R_k(\hat{Z}, Z)$ as follows:

$$R_k(X_j, Z) = R_k(X_j, \hat{Z})\, R_k(\hat{Z}, Z) \quad (2.2)$$

Indeed, local correlation is a conditional correlation of a regressor along its estimated coefficient given a condition, which is the estimated regression model, and so the intermediate variable representing such condition is the estimated responsor $\hat{Z}$. For K estimated models, the averaged local correlation $\bar{R}(X_j, Z)$ is calculated as follows:

$$\bar{R}(X_j, Z) = \frac{1}{K}\sum_{k=1}^{K} R_k(X_j, Z) \quad (2.3)$$
Global correlation implies the fitness of the target regressive parameter without concerning any regression models. Let R(Xj, Z) denote the global correlation between regressor Xj and responsor Z, defined as the usual correlation coefficient:

$$R(X_j, Z) = \frac{\sum_{i=1}^{N}\left(x_{ij} - \bar{x}_j\right)\left(z_i - \bar{z}\right)}{\sqrt{\sum_{i=1}^{N}\left(x_{ij} - \bar{x}_j\right)^2}\sqrt{\sum_{i=1}^{N}\left(z_i - \bar{z}\right)^2}} \quad (2.4)$$
9. 2. Methodology
A regressor Xj along with its implicit regressive coefficient αj is good if it gives sufficient explanatory power to possible models and if it is independent enough to reflect the responsor Z on its own. In other words, the first condition of sufficient explanatory power to possible models is represented by local correlation, and the second condition of independent reflection is represented by global correlation. Therefore, the fitness of Xj and αj is defined as the product of the averaged local correlation $\bar{R}(X_j, Z)$ and the global correlation R(Xj, Z) as follows:

$$\varphi_j = \bar{R}(X_j, Z)\, R(X_j, Z) \quad (2.5)$$

The larger the fitness φj is, the better the implicit estimate $\hat{\alpha}_j$ is, and the better the regressor Xj is. Good regressors Xj (also αj or $\hat{\alpha}_j$) which have large enough fitness values φj are called robust regressors. Consequently, the Regressive Expectation Maximization with RObust regressors (REMRO) algorithm searches for robust regressors and sorts them in descending order with their fitness values φj as the searching criterion. Another problem is how to produce the K models needed to calculate the averaged local correlation. Fortunately, Sala-I-Martin (Sala-I-Martin, 1997) generated a set of K combinations of doubtful regressors whose fitness needs to be checked. Each of the K models is estimated with one combination of doubtful regressors, and the estimation method can be the least squares method as usual. Moreover, REMRO can resist incomplete data because it applies the Regressive Expectation Maximization (REM) algorithm to fill missing values for both regressors and responsor with estimated values, based on the ideology of the expectation maximization (EM) algorithm. Let the free set A be the set of regressors which are compulsorily included in the regression model and let the focus set B = X∖A be the complement of A with respect to the entire set X. Let d be the number of regressors in each combination set Dk taken from the doubtful set C = B∖{Xj}, where Xj is the currently focused regressor; the next slide is the flow chart of the REMRO algorithm.
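The fitness computation and the descending sort can be sketched as follows (a minimal illustration: the `pearson` helper and the toy arrays are assumptions, and the sample fitness values are borrowed from Table 3.1 later in the slides):

```python
import numpy as np

def pearson(a, b):
    # Plain correlation coefficient as in equations (2.1) and (2.4).
    a, b = np.asarray(a, float), np.asarray(b, float)
    num = np.sum((a - a.mean()) * (b - b.mean()))
    den = np.sqrt(np.sum((a - a.mean()) ** 2) * np.sum((b - b.mean()) ** 2))
    return num / den

def fitness(x_j, z, z_hats):
    """phi_j = averaged local correlation * global correlation, per (2.2)-(2.5).
    z_hats holds the estimated responsor of each of the K candidate models."""
    local = np.mean([pearson(x_j, zh) * pearson(zh, z) for zh in z_hats])
    return local * pearson(x_j, z)

# Sorting regressors by fitness descending (values from Table 3.1, REMRO row).
phis = {"cyl": 0.7262, "disp": 0.7200, "wt": 0.6133, "hp": 0.6435}
robust = sorted(phis, key=phis.get, reverse=True)
print(robust)  # → ['cyl', 'disp', 'hp', 'wt']
```

In a perfectly linear, noiseless case every correlation equals 1, so the fitness collapses to its maximum value of 1, which is a convenient sanity check.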
10. 2. Methodology
Indeed, REMRO estimates the fitness values of the focused regressors in B and then builds up a regression model with the high-fitness regressors. The final regression model estimated by REMRO with only robust regressors is called the optimal regression model.

Figure 2.1. Flow chart of REMRO
11. 2. Methodology
Each combination suggested in some literature includes three doubtful regressors, d = 3. Because the exhaustive number of combinations grows as large as 2^|C| − 1 if d is browsed from 1 to the cardinality |C| of the doubtful set, the size d of each combination is proposed as half the cardinality of the doubtful set C, and hence the number of models is determined as follows:

$$d = \left\lfloor \frac{|C|}{2} \right\rfloor, \quad K = \binom{|C|}{d} = \frac{|C|!}{d!\,(|C|-d)!} \quad (2.6)$$

Note, the notation ⌊·⌋ represents the lower integer (floor) of a given real number. The accuracy of the fitness computation decreases when the number of target models is limited by such a new d, but this reduction makes REMRO faster, and the decrease in accuracy is alleviated by the global correlation R(Xj, Z), which does not concern any model.
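Equation (2.6) can be checked with a short computation (assuming, purely for illustration, a doubtful set of 7 regressors):

```python
from math import comb

# Sketch of equation (2.6): with doubtful set C, each combination takes
# d = floor(|C| / 2) regressors, so the number of candidate models K is C(|C|, d).
C_size = 7           # |C|, a hypothetical doubtful-set cardinality
d = C_size // 2      # floor(|C| / 2)
K = comb(C_size, d)  # |C|! / (d! (|C| - d)!)
print(d, K)  # → 3 35
```

Compare this with the exhaustive count 2^|C| − 1 = 127 for the same set: fixing d halves the exponent's effect and keeps the model count manageable.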
12. 2. Methodology
Sala-I-Martin (Sala-I-Martin, 1997, pp. 179-180) estimated the fitness of the estimate $\hat{\alpha}_j$ as the value of the cumulative distribution function of αj at 0, denoted cdf(αj = 0 | ᾱj, $\hat{\sigma}_j^2$), after calculating the mean ᾱj and the variance $\hat{\sigma}_j^2$ of αj based on the likelihood function over K models.

$$\varphi_j = \mathrm{cdf}\left(0 \mid \bar{\alpha}_j, \hat{\sigma}_j^2\right) \quad (2.7)$$

Especially, Sala-I-Martin defined the variance $\hat{\sigma}_j^2$ as the averaged variance of the K models. When REMRO is tested against the Sala-I-Martin method, the Sala-I-Martin formulation is improved by estimating $\hat{\sigma}_j^2$ only from the K distributed values of $\hat{\alpha}_j$, because the averaged variance of K models does not reflect the variation of regressors. For instance, given K models, and supposing each estimate of αj within model k is $\hat{\alpha}_j(k)$, the variance $\hat{\sigma}_j^2$ is calculated as follows:

$$\hat{\sigma}_j^2 = \frac{\sum_{k=1}^{K} \left(\hat{\alpha}_j(k) - \bar{\alpha}_j\right)^2 L_k}{\sum_{k=1}^{K} L_k} \quad (2.8)$$
Where Lk is the likelihood function of model k, with the assumption that regressor instances are also mutually independent random variables:

$$L_k = \prod_{i=1}^{N} P_k\left(z_i \mid \boldsymbol{x}_i, \boldsymbol{\alpha}_k\right)$$

Where Pk(zi | xi, αk) is the PDF of zi given model k:

$$P_k\left(z_i \mid \boldsymbol{x}_i, \boldsymbol{\alpha}_k\right) = \frac{1}{\sqrt{2\pi\sigma_k^2}} \exp\left(-\frac{\left(z_i - \boldsymbol{\alpha}_k^T \boldsymbol{x}_i\right)^2}{2\sigma_k^2}\right)$$
13. 2. Methodology
The variance $\sigma_k^2$ of model k is estimated as follows:

$$\hat{\sigma}_k^2 = \frac{1}{N}\sum_{i=1}^{N} \left(z_i - \hat{z}_i(k)\right)^2$$

Where $\hat{z}_i(k)$ is the estimate of zi with model k. The mean ᾱj still follows the Sala-I-Martin formulation (Sala-I-Martin, 1997, p. 179).

$$\bar{\alpha}_j = \frac{\sum_{k=1}^{K} \hat{\alpha}_j(k)\, L_k}{\sum_{k=1}^{K} L_k} \quad (2.9)$$
According to the formulation of $\hat{\sigma}_j^2$ here, when ᾱj is a mean under the likelihood distribution, the variance $\hat{\sigma}_j^2$ is estimated under the likelihood distribution too, which is slightly different from the usual sample mean and sample variance. In practice, Lk is replaced by the logarithm of the likelihood function, lk = log(Lk), in order to prevent producing very small numbers due to large data matrices with many rows.
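The likelihood-weighted mean and variance of equations (2.8)-(2.9) and the fitness cdf(0) of equation (2.7) can be sketched as follows (the per-model estimates and likelihoods are hypothetical numbers; the normal CDF is computed via the error function to avoid a SciPy dependency):

```python
import numpy as np
from math import erf, sqrt

# Hypothetical estimates of alpha_j over K = 4 models, with model likelihoods.
alpha_k = np.array([-1.60, -1.50, -1.70, -1.55])
L_k = np.array([0.20, 0.30, 0.25, 0.25])

# Likelihood-weighted mean (2.9) and variance (2.8) of the coefficient.
alpha_bar = float(np.sum(alpha_k * L_k) / np.sum(L_k))
var_j = float(np.sum((alpha_k - alpha_bar) ** 2 * L_k) / np.sum(L_k))

# Fitness (2.7): normal CDF at 0 given mean alpha_bar and variance var_j.
phi = 0.5 * (1.0 + erf((0.0 - alpha_bar) / sqrt(2.0 * var_j)))
```

Since every estimate here is clearly negative, cdf(0) comes out essentially equal to 1, matching the intuition that a coefficient bounded away from zero marks a robust regressor.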
14. 2. Methodology
REMRO applies the REM algorithm to compute the regressive estimates $\hat{\boldsymbol{\alpha}} = (\hat{\alpha}_0, \hat{\alpha}_1, \hat{\alpha}_2, \ldots, \hat{\alpha}_n)^T$, and REM, in turn, applies the EM algorithm to resist missing values. It is necessary to describe REM briefly. REM (Nguyen & Ho, 2018) builds in parallel an entire regressive function and many partial inverse regressive functions, so that missing values are estimated by both types of functions, the entire function and the inverse functions. The model construction process of REM follows the ideology of the EM algorithm, especially the EM loop, but it is a bidirectional process. Recall that $z_i^-$ and $x_{ij}^-$ denote the missing values of responsor Z and regressor Xj at the ith row of the matrix (X, z), which are estimated by REM as follows (Nguyen & Ho, 2018):
$$x_{ij}^- = \beta_{j0}^{(t)} + \beta_{j1}^{(t)} z_i^-, \quad z_i^- = \frac{\sum_{j \in U_i} \alpha_j^{(t)} \beta_{j0}^{(t)} + \sum_{k \notin U_i} \alpha_k^{(t)} x_{ik}}{1 - \sum_{j \in U_i} \alpha_j^{(t)} \beta_{j1}^{(t)}} \quad (2.10)$$
Note, Ui is the set of indices of missing values xij for fixed i, and the βjk are the regressive coefficients of the partial inverse regressive functions. Although the ideology of REM is interesting, the pivot of this research is the association of local correlation and global correlation for computing fitness values of regressors. The source code of REM and REMRO is available at
https://github.com/ngphloc/rem/tree/master/3_implementation/src/net/rem
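Equation (2.10) can be sketched directly (a minimal illustration with hypothetical coefficients; the helper `rem_fill` is not part of the published REM source code):

```python
# Sketch of equation (2.10): given the entire regression (alpha) and the partial
# inverse regressions (beta_j0, beta_j1) at iteration t, the missing responsor
# value z_i^- is solved from the mutual substitution, and each missing regressor
# value x_ij^- then follows from its inverse regression on z_i^-.

def rem_fill(alpha, beta, x_row, U):
    """alpha: coefficients (alpha_0..alpha_n); beta: dict j -> (beta_j0, beta_j1);
    x_row: regressor instance with x_row[0] == 1 (entries at indices in U missing);
    U: set U_i of indices j whose x_ij is missing."""
    num = sum(alpha[j] * beta[j][0] for j in U)
    num += sum(alpha[k] * x_row[k] for k in range(len(alpha)) if k not in U)
    den = 1.0 - sum(alpha[j] * beta[j][1] for j in U)
    z_est = num / den
    x_est = {j: beta[j][0] + beta[j][1] * z_est for j in U}
    return z_est, x_est

# Hypothetical case: one missing regressor value x_i1.
alpha = [1.0, 0.5, 2.0]   # alpha_0, alpha_1, alpha_2
beta = {1: (0.2, 0.1)}    # inverse regression X_1 = 0.2 + 0.1 * Z
x_row = [1.0, None, 3.0]  # x_i1 missing, x_i2 = 3.0 observed
z_est, x_est = rem_fill(alpha, beta, x_row, {1})
```

In the real algorithm both α and the β coefficients are re-estimated at every EM iteration t; this sketch shows only the fill-in step for a fixed iteration.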
15. 3. Experimental results and discussions
In this experiment, REMRO is tested against the Sala-I-Martin method (Sala-I-Martin, 1997) with mean absolute error (MAE) as the testing metric. MAE is the absolute deviation between the original responsor Z in the data matrix and the estimated responsor $\hat{Z}$ produced by the regression model.

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} \left|z_i - \hat{z}_i\right|$$

The smaller the MAE is, the better the method is. The test dataset is the traditional "1974 Motor Trend" data (mtcars) available in the R data package (Hlavac, 2016, p. 10), which measures fuel consumption based on technical parameters; the responsor is the vehicle's miles per gallon (mpg) and the 8 numeric regressors are number of cylinders (cyl), displacement in cubic inches (disp), gross horsepower (hp), rear axle ratio (drat), weight in thousands of pounds (wt), quarter-mile time in seconds (qsec), number of forward gears (gear), and number of carburetors (carb). Only 4 robust regressors are extracted, which is fifty percent of the doubtful set. Table 3.1 in the next slide shows the experimental results, in which the second column lists the sorted fitness values of robust regressors, the third column shows the optimal regression models, and the fourth column shows the evaluation metric MAE of the REMRO method and the Sala-I-Martin method.
16. 3. Experimental results and discussions
According to Table 3.1, the robust regressors of REMRO and of the Sala-I-Martin method are (cyl, disp, hp, wt) and (cyl, disp, hp, qsec), with sorted fitness values (0.7262, 0.7200, 0.6435, 0.6133) and (0.9913, 0.7055, 0.6908, 0.6545), respectively. Because the MAE of REMRO (1.771) is smaller than that of the Sala-I-Martin method (2.245), REMRO performs better in this test. Moreover, the two methods share the same three regressors cyl, disp, and hp, but their last robust regressors differ, and it is this difference that makes REMRO better than the Sala-I-Martin method here.
Table 3.1. Experimental results

Method: REMRO
  Fitness:       fit(cyl) = 0.7262, fit(disp) = 0.7200, fit(hp) = 0.6435, fit(wt) = 0.6133
  Optimal model: mpg = 40.8285 - 1.2933*(cyl) + 0.0116*(disp) - 0.0205*(hp) - 3.8539*(wt)
  MAE:           1.771

Method: Sala-I-Martin
  Fitness:       fit(cyl) = 0.9913, fit(disp) = 0.7055, fit(hp) = 0.6908, fit(qsec) = 0.6545
  Optimal model: mpg = 49.2352 - 1.6137*(cyl) - 0.0119*(disp) - 0.0288*(hp) - 0.6827*(qsec)
  MAE:           2.245
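As a quick sanity check (not part of the original experiment), the two optimal models from Table 3.1 can be applied to a single mtcars row; the Mazda RX4 values used below (cyl = 6, disp = 160, hp = 110, wt = 2.62, qsec = 16.46, actual mpg = 21.0) are quoted from the standard R dataset:

```python
# coefficients taken verbatim from the two optimal models in Table 3.1
def remro_mpg(cyl, disp, hp, wt):
    return 40.8285 - 1.2933*cyl + 0.0116*disp - 0.0205*hp - 3.8539*wt

def sala_mpg(cyl, disp, hp, qsec):
    return 49.2352 - 1.6137*cyl - 0.0119*disp - 0.0288*hp - 0.6827*qsec

# Mazda RX4 row of mtcars (actual mpg = 21.0)
pred_remro = remro_mpg(cyl=6, disp=160.0, hp=110, wt=2.62)
pred_sala = sala_mpg(cyl=6, disp=160.0, hp=110, qsec=16.46)
# both predictions land near the true value of 21.0 mpg
```

Both models predict in the low twenties for this row, consistent with the aggregate MAE figures reported in the table.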
17. 3. Experimental results and discussions
It is easy to deduce from the experimental results that the strong point of REMRO is to appreciate the importance of strongly independent regressors through their global correlation, since such regressors can explain the responsor well without associating with other regressors. However, the Sala-I-Martin method can work well with binary and multinomial data, because computing the likelihoods that estimate the fitness values does not depend directly on the data types of the regressors, whereas the arithmetic formulation of correlation coefficients strictly requires numerical regressors. Therefore, the Sala-I-Martin method is more general than REMRO, as it can be applied to many data types of regressors. The Sala-I-Martin method can even be used for the logit regression model, because probabilistic applications are coherent aspects of such a logistic model; note that the likelihood function is essentially the probability of a random variable, and the prior/posterior functions are probabilities of the parameter in Bayesian statistics.
18. 4. Conclusions
From the experimental results, REMRO is more accurate for modeling numeric regressors and responsors, but it is not as general as the Sala-I-Martin method and other ones. In the future, we will try our best to improve REMRO by researching methods to approximate or replace numeric correlation with similar concepts for mixtures of nonnumeric and numeric variables.
19. References
1. Hlavac, M. (2016). ExtremeBounds: Extreme Bounds Analysis in R. Journal of Statistical Software, 72(9), 1-22. doi:10.18637/jss.v072.i09
2. Nguyen, L., & Ho, T.-H. T. (2018). Fetal Weight Estimation in Case of Missing Data. Experimental Medicine (EM), 1(2), 45-65. doi:10.31058/j.em.2018.12004
3. Sala-I-Martin, X. X. (1997). I Just Ran Two Million Regressions. The American Economic Review, 87(2), 178-183. http://www.jstor.org/stable/2950909
13/12/2022 Extreme bound analysis correlation 19
20. Thank you for listening