LASSO MODELING AS AN ALTERNATIVE TO PCA BASED MULTIVARIATE MODELS TO SYSTEM W...mathsjournal
Principal component analysis (PCA) is widespread and widely used in various areas of science, such as bioinformatics, econometrics, and chemometrics, among others. Because PCA is based on eigenvalues and eigenvectors, it is a weak approach for high-dimensional systems with some degree of sparsity, and in these situations PCA is no longer a recommended procedure. Sparsity is very common in near-infrared spectroscopy because of the large number of spectra required and the broad water-absorption bands, which make the spectra very similar and the data matrix heavily sparse, degrading precision and accuracy in multivariate modeling and in projections of the data matrix onto smaller dimensions. To overcome these shortcomings, the LASSO, a model not based on PCA, was applied to an NIR spectral dataset from biodiesel, and its performance was statistically compared with traditional multivariate models such as PCR and PLSR.
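The sparsity-inducing behaviour that motivates the LASSO comparison above comes from the soft-thresholding step of coordinate descent, which sets weak coefficients exactly to zero. A minimal pure-Python sketch on a toy two-feature problem (all data hypothetical, not the paper's NIR pipeline):

```python
def soft_threshold(rho, lam):
    # Soft-thresholding operator: the source of LASSO's exact zeros.
    if rho < -lam:
        return rho + lam
    if rho > lam:
        return rho - lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for min (1/2n)||y - Xb||^2 + lam*||b||_1.
    Assumes columns of X are roughly centered."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the partial residual (excluding j).
            rho = sum(X[i][j] * (y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j))
                      for i in range(n)) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            b[j] = soft_threshold(rho, lam) / z
    return b

# Tiny synthetic example: y depends on the first column only,
# so the LASSO should zero out the second coefficient.
X = [[1.0, 0.1], [-1.0, -0.2], [2.0, 0.0], [-2.0, 0.1]]
y = [2.0, -2.1, 4.0, -3.9]
coef = lasso_cd(X, y, lam=0.5)
```

With this penalty the second, uninformative coefficient is driven exactly to zero, which is the behaviour the abstract contrasts with PCA-based projections.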
Neural Network-Based Actuator Fault Diagnosis for a Non-Linear Multi-Tank SystemISA Interchange
The paper is devoted to the problem of robust actuator fault diagnosis of dynamic non-linear systems. In the proposed method, it is assumed that the diagnosed system can be modelled by a recurrent neural network, which can be transformed into a linear parameter-varying form. Such a system description allows developing a design scheme for a robust unknown-input observer within the H∞ framework for a class of non-linear systems. The proposed approach is designed so that a prescribed disturbance attenuation level is achieved with respect to the actuator fault estimation error while guaranteeing the convergence of the observer. The application of the robust unknown-input observer enables actuator fault estimation, which allows applying the developed approach to fault-tolerant control tasks.
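The fault-estimation idea above can be illustrated, far more simply than the paper's robust H∞ unknown-input observer, with a plain Luenberger observer on a scalar tank model: an additive actuator fault makes the output residual jump above a threshold. All gains and numbers are illustrative assumptions:

```python
def simulate_with_observer(n_steps, fault_step, fault_bias):
    """Scalar tank model x[k+1] = a*x[k] + b*u[k]; an additive actuator
    fault of size `fault_bias` appears at `fault_step`. A Luenberger
    observer tracks the state; the residual |y - x_hat| grows once the
    fault appears (a simple stand-in for the paper's robust observer)."""
    a, b, L = 0.9, 0.5, 0.6          # plant and observer gain (illustrative)
    x, x_hat = 1.0, 0.0
    residuals = []
    for k in range(n_steps):
        u = 1.0                                   # constant commanded input
        fault = fault_bias if k >= fault_step else 0.0
        x = a * x + b * (u + fault)               # plant with faulty actuator
        y = x                                     # full-state measurement for simplicity
        pred = a * x_hat + b * u                  # observer prediction
        x_hat = pred + L * (y - pred)             # correction step
        residuals.append(abs(y - x_hat))
    return residuals

res = simulate_with_observer(n_steps=40, fault_step=20, fault_bias=0.8)
threshold = 0.05
alarm = [k for k, r in enumerate(res) if k > 5 and r > threshold]
```

Before the fault the residual decays geometrically; after step 20 it settles at a nonzero level, so a fixed threshold flags the fault at its onset.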
Nonlinear filtering approaches to field mapping by sampling using mobile sensorsijassn
This work proposes a novel application of existing powerful nonlinear filters, such as the standard Extended Kalman Filter (EKF), some of its variants, and the standard Unscented Kalman Filter (UKF), to the estimation of a continuous spatio-temporal field that is spread over a wide area and hence represented by a large number of parameters when parameterized. We couple these filters with an adaptive sampling scheme performed by a single mobile sensor and investigate their performance with a view to significantly improving the speed and accuracy of the overall field estimation. Extensive simulation work was carried out to show that different variants of the standard EKF and the standard UKF can be used to improve the accuracy of the field estimate. This paper also aims to provide some guidelines for users of these filters in reaching a practical trade-off between the desired field estimation accuracy and the required computational load.
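For a single grid point of a static field, the Kalman update reduces to a recursive weighted average with shrinking variance, which is the core of the filters discussed above. A minimal scalar sketch with hypothetical noise levels (not the EKF/UKF variants or the adaptive sampling of the paper):

```python
import random

def kalman_static(measurements, meas_var, init_mean=0.0, init_var=100.0):
    """Minimal scalar Kalman filter for a static parameter (e.g. the field
    value at one grid point). For a static state the KF reduces to a
    recursive weighted average whose variance shrinks with each sample."""
    m, P = init_mean, init_var
    for z in measurements:
        K = P / (P + meas_var)      # Kalman gain
        m = m + K * (z - m)         # updated estimate
        P = (1 - K) * P             # updated variance
    return m, P

random.seed(0)
true_value = 3.0                                 # hypothetical field value
zs = [true_value + random.gauss(0, 0.5) for _ in range(200)]
est, var = kalman_static(zs, meas_var=0.25)
```

The posterior variance `P` is exactly the quantity an adaptive sampler would use to decide where the mobile sensor should measure next.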
Matlab reversible watermarking based on invariant image classification and d...Ecway Technologies
This paper investigates the plausibility of using approximate models for hypothesis generation in a RANSAC framework to accurately and reliably estimate the fundamental matrix. Two novel fundamental matrix estimators are introduced that sample two correspondences to generate affine-fundamental matrices for RANSAC hypotheses. A new RANSAC framework is presented that uses local optimization to estimate the fundamental matrix from the consensus correspondence sets of verified hypotheses, which are approximate models. In a rigorous evaluation, the proposed estimators are shown to perform better than other approximate models previously used in the literature for fundamental matrix estimation. In addition, the proposed estimators are over 30 times faster, in terms of models verified, than the 7-point method, and offer comparable accuracy and repeatability on a large subset of the test set.
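The hypothesize-verify-refine loop described above can be sketched generically: cheap minimal-sample hypotheses, a consensus test, and a least-squares refit on the best consensus set standing in for local optimization. Line fitting replaces fundamental-matrix estimation here purely for brevity:

```python
import random

def fit_line(p1, p2):
    # Minimal two-sample hypothesis: the line y = m*x + c through two points.
    (x1, y1), (x2, y2) = p1, p2
    m = (y2 - y1) / (x2 - x1)
    return m, y1 - m * x1

def refit(points):
    # Least-squares refit on the consensus set (the "local optimization" step).
    n = len(points)
    sx = sum(x for x, _ in points); sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points); sxy = sum(x * y for x, y in points)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return m, (sy - m * sx) / n

def ransac(points, n_iters=100, tol=0.2, seed=1):
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iters):
        p1, p2 = rng.sample(points, 2)
        if p1[0] == p2[0]:
            continue                     # degenerate sample, skip
        m, c = fit_line(p1, p2)          # cheap approximate hypothesis
        inliers = [(x, y) for x, y in points if abs(y - (m * x + c)) < tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return refit(best_inliers), best_inliers

# 10 collinear points (y = 2x + 1) plus two gross outliers.
pts = [(float(x), 2.0 * x + 1.0) for x in range(10)] + [(3.0, 40.0), (7.0, -5.0)]
(m, c), inliers = ransac(pts)
```

The two-correspondence affine-fundamental sampling of the paper plays the role of `fit_line` here: a cheaper minimal model whose consensus set is then refined into the full model.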
This paper develops a hybrid Measure-Correlate-Predict (MCP) strategy to predict the long-term wind resource variations at a farm site. The hybrid MCP method uses the recorded data of multiple reference stations to estimate the long-term wind condition at the target farm site. The weight of each reference station in the hybrid strategy is determined based on (i) the distance and (ii) the elevation difference between the target farm site and each reference station. The applicability of the proposed hybrid strategy is investigated using four different MCP methods: (i) linear regression; (ii) variance ratio; (iii) Weibull scale; and (iv) Artificial Neural Networks (ANNs). To implement this method, we use the hourly averaged wind data recorded at six stations in North Dakota between 2008 and 2010. The station Pillsbury is selected as the target farm site, and the recorded data at the other five stations (Dazey, Galesbury, Hillsboro, Mayville and Prosper) are used as reference station data. Three sets of performance metrics are used to evaluate the hybrid MCP method. The first set analyzes the statistical performance, including the mean wind speed, the wind speed variance, the root mean squared error, and the maximum absolute error. The second set evaluates the distribution of long-term wind speed; to this end, the Weibull distribution and the Multivariate and Multimodal Wind Distribution (MMWD) models are adopted in this paper. The third set analyzes the energy production capacity and the efficiency of the wind farm. The results illustrate that the many-to-one correlation in such a hybrid approach can provide a more reliable prediction of the long-term onsite wind variations than one-to-one correlations.
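The many-to-one idea above can be sketched as follows: each reference station contributes a linear-regression MCP prediction, combined with a weight that decreases with distance and elevation difference. The weighting formula and all station data below are hypothetical placeholders, not the paper's definition:

```python
def linreg(xs, ys):
    # Ordinary least-squares slope/intercept: the "linear regression" MCP variant.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    m = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return m, my - m * mx

def hybrid_mcp_predict(stations, long_term_ref):
    """Each reference station gets a weight inversely related to its distance
    and elevation difference from the target site (hypothetical weighting
    form). Per-station linear MCP predictions are combined as a weighted
    average: the many-to-one correlation of the paper."""
    weights, preds = [], []
    for s in stations:
        w = 1.0 / (s["dist_km"] * (1.0 + s["delev_m"] / 100.0))
        m, c = linreg(s["ref_concurrent"], s["target_concurrent"])
        weights.append(w)
        preds.append(m * long_term_ref[s["name"]] + c)
    return sum(w * p for w, p in zip(weights, preds)) / sum(weights)

stations = [  # hypothetical concurrent wind-speed records (m/s)
    {"name": "A", "dist_km": 20.0, "delev_m": 10.0,
     "ref_concurrent": [4.0, 6.0, 8.0], "target_concurrent": [4.2, 6.1, 8.3]},
    {"name": "B", "dist_km": 50.0, "delev_m": 40.0,
     "ref_concurrent": [5.0, 7.0, 9.0], "target_concurrent": [4.8, 6.9, 8.7]},
]
long_term = {"A": 7.0, "B": 7.5}   # long-term mean speeds at the references
pred = hybrid_mcp_predict(stations, long_term)
```

The nearer, lower-elevation-difference station dominates the weighted average, which is the intended effect of the hybrid weighting.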
WATERSHED MODELING USING ARTIFICIAL NEURAL NETWORKS IAEME Publication
Artificial Neural Network analysis was used for modeling the rainfall-runoff relationship. A new instantaneous ANN watershed model was built and tested herein using the Walnut Gulch watershed (catchment) area. The model represents the instantaneous response of a catchment to a rainfall event using discretized rainfall-runoff values over a selected time interval (∆t); as this time interval decreases, the actual response is modeled more accurately. The model was applied to one of the sub-catchments of the Walnut Gulch watershed (sub-catchment No. 9, flume 11) and was found to successfully represent the lag time and the timing of runoff related to the hyetograph properties.
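The discretized-response idea above amounts to convolving the rainfall hyetograph with a per-∆t unit response; a sketch with a hypothetical kernel (the paper learns this response with an ANN rather than fixing it):

```python
def runoff_hydrograph(rainfall, unit_response):
    """Discrete convolution of a rainfall hyetograph with a unit response
    (one ordinate per time step Δt). A finer Δt resolves the catchment's
    instantaneous response more accurately, which is the idea behind the
    discretized ANN watershed model."""
    n_out = len(rainfall) + len(unit_response) - 1
    q = [0.0] * n_out
    for i, p in enumerate(rainfall):
        for j, u in enumerate(unit_response):
            q[i + j] += p * u
    return q

rain = [0.0, 10.0, 20.0, 5.0, 0.0]   # mm per Δt (hypothetical event)
unit = [0.1, 0.5, 0.3, 0.1]          # unit-response ordinates (sum to 1)
q = runoff_hydrograph(rain, unit)
peak_step = max(range(len(q)), key=q.__getitem__)
```

The runoff peak lags the rainfall peak by one step here, illustrating the lag-time behaviour the model is reported to capture; total runoff equals total rainfall because the unit response sums to one.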
During the process of molecular structure elucidation, the selection of the most probable structural hypothesis may be based on chemical shift prediction. The prediction is carried out using either empirical or quantum-mechanical (QM) methods. When QM methods are used, NMR prediction commonly utilizes the GIAO option of the DFT approximation. In this approach the structural hypotheses are expected to be devised by the scientist. In this article we hope to show that the most rational way to create structural hypotheses is by applying an expert system capable of deducing all potential structures consistent with the experimental spectral data, specifically 2D NMR data. When an expert system is used, the best structure(s) can be distinguished using chemical shift prediction, which is best performed by either an incremental or a neural network algorithm. The time-consuming QM calculations can then be applied, if necessary, to one or more of the 'best' structures to confirm the suggested solution.
Critical path analysis and low-complexity implementation of the lms adaptive ...LogicMindtech Nologies
Coordinated and adaptive information collecting in target tracking wireless s...LogicMindtech Nologies
A REVIEW ON OPTIMIZATION OF LEAST SQUARES SUPPORT VECTOR MACHINE FOR TIME SER...ijaia
The Support Vector Machine has emerged as an active research topic in the machine learning community and is extensively used in various fields, including prediction and pattern recognition. The Least Squares Support Vector Machine (LSSVM), a variant of the Support Vector Machine, offers a better solution strategy. In order to utilize the LSSVM's capability in data mining tasks such as prediction, there is a need to optimize its hyperparameters. This paper presents a review of techniques used to optimize these parameters, grouped into two main classes: Evolutionary Computation and Cross-Validation.
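The cross-validation class of hyperparameter optimization mentioned above can be sketched for an RBF-kernel LSSVM. The dual solve below omits the bias term for brevity, so it is a simplified LSSVM; the grid and data are illustrative:

```python
import math

def rbf(x1, x2, sigma):
    return math.exp(-(x1 - x2) ** 2 / (2 * sigma ** 2))

def solve(A, b):
    # Gaussian elimination with partial pivoting, for small dense systems.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def lssvm_fit(xs, ys, gamma, sigma):
    # LSSVM dual, bias term omitted for brevity: (K + I/gamma) alpha = y.
    n = len(xs)
    K = [[rbf(xs[i], xs[j], sigma) + (1.0 / gamma if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, ys)

def lssvm_predict(xs, alpha, sigma, x):
    return sum(a * rbf(xi, x, sigma) for a, xi in zip(alpha, xs))

def cv_error(xs, ys, gamma, sigma, k=3):
    # k-fold cross-validation error: the second optimization class in the review.
    err, n = 0.0, len(xs)
    for fold in range(k):
        tr = [i for i in range(n) if i % k != fold]
        te = [i for i in range(n) if i % k == fold]
        alpha = lssvm_fit([xs[i] for i in tr], [ys[i] for i in tr], gamma, sigma)
        err += sum((lssvm_predict([xs[i] for i in tr], alpha, sigma, xs[i]) - ys[i]) ** 2
                   for i in te)
    return err / n

xs = [i / 10.0 for i in range(20)]
ys = [math.sin(x) for x in xs]
grid = [(g, s) for g in (1.0, 100.0) for s in (0.2, 1.0)]
best = min(grid, key=lambda gs: cv_error(xs, ys, *gs))   # grid search by CV
```

An evolutionary optimizer, the review's other class, would replace the grid search with a population-based search over (gamma, sigma) while keeping the same CV fitness.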
DEEP LEARNING NEURAL NETWORK APPROACHES TO LAND USE-DEMOGRAPHIC- TEMPORAL BA...civejjour
Land use and transportation planning are interdependent, and both are important factors in forecasting urban development. In recent years, predicting traffic based on land use, along with several other variables, has become a worthwhile area of study. In this paper, it is proposed that Deep Neural Network Regression (DNN-Regression) and Recurrent Neural Network (DNN-RNN) methods be used to predict traffic. These methods use three key variables: land use, demographic, and temporal data. The proposed methods were evaluated against other methods using datasets collected from the City of Calgary, Canada. The proposed DNN-Regression focuses on demographic and land use variables for traffic prediction. The study also predicted traffic temporally in the same geographical area using the DNN-RNN, which employs long short-term memory. Comparative experiments revealed that the proposed DNN-Regression and DNN-RNN models outperformed the other methods.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
A ROBUST MISSING VALUE IMPUTATION METHOD MIFOIMPUTE FOR INCOMPLETE MOLECULAR ...ijcsa
Missing data imputation is an important research topic in data mining. Large-scale molecular descriptor data may contain missing values (MVs). However, some methods for downstream analyses, including some prediction tools, require a complete descriptor data matrix. We propose and evaluate an iterative imputation method, MiFoImpute, based on a random forest. By averaging over many unpruned regression trees, the random forest intrinsically constitutes a multiple imputation scheme. Using the NRMSE and NMAE estimates of the random forest, we are able to estimate the imputation error. Evaluation is performed on two molecular descriptor datasets generated from a diverse selection of pharmaceutical fields, with artificially introduced missing values ranging from 10% to 30%. The experimental results demonstrate that missing values have a great impact on the effectiveness of imputation techniques and that our method MiFoImpute is more robust to missing values than the ten other imputation methods used as benchmarks. Additionally, MiFoImpute exhibits attractive computational efficiency and can cope with high-dimensional data.
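The iterative imputation loop can be sketched as follows, with a plain linear fit standing in for MiFoImpute's random forest (a deliberate simplification; the two-column data are hypothetical):

```python
def column_mean(rows, j):
    vals = [r[j] for r in rows if r[j] is not None]
    return sum(vals) / len(vals)

def linreg(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    m = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return m, my - m * mx

def impute(rows, target_col, predictor_col, n_iter=5):
    """Iterative imputation in the spirit of MiFoImpute: initialize missing
    entries (None) with the column mean, then repeatedly refit a predictor
    on the observed rows and re-predict the missing entries. A linear fit
    stands in here for the paper's random forest."""
    data = [r[:] for r in rows]
    missing = [i for i, r in enumerate(data) if r[target_col] is None]
    fill = column_mean(data, target_col)
    for i in missing:
        data[i][target_col] = fill
    for _ in range(n_iter):
        obs = [i for i in range(len(data)) if i not in missing]
        m, c = linreg([data[i][predictor_col] for i in obs],
                      [data[i][target_col] for i in obs])
        for i in missing:
            data[i][target_col] = m * data[i][predictor_col] + c
    return data

# Hypothetical two-descriptor dataset with one missing value in column 1.
rows = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, None], [5.0, 10.1]]
completed = impute(rows, target_col=1, predictor_col=0)
```

Replacing `linreg` with a random forest regressor, and repeating the loop over every incomplete column, gives the missForest-style scheme the abstract describes; the out-of-bag error of the forest then yields the NRMSE/NMAE estimates.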
In silico prediction of small molecule properties is widely used today in industry and academia. NMR spectra, in particular, are predicted by a variety of software packages. Across this array of software options, two main approaches are used:
Database-based. Compounds are compared against a database, and the result is calculated using data for close structural relatives found in the dataset.
Regression-based. An experimental database is used to calculate the parameters for a non-linear regression. The chemical shift is calculated by a non-linear function of variables which describe characteristic features of the molecule of interest.
These two approaches require different strategies for implementation and optimization. Database-based results are improved by acquiring larger databases and/or including user-specific data in the calculation. Non-linear regression algorithms can be improved through the regression itself, or by improving the structural descriptors.
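The incremental (additive) style of empirical shift prediction mentioned in these abstracts can be sketched as a base value plus substituent increments. Every number below is a placeholder purely to show the additive form; real increment tables are fitted to experimental databases:

```python
# Hypothetical additive-increment scheme for a CH2 proton chemical shift:
# delta = base + sum of substituent increments (a Shoolery-type rule).
BASE_CH2 = 1.25            # ppm, illustrative base value, not a real constant
INCREMENTS = {             # illustrative substituent increments, ppm
    "phenyl": 1.3,
    "Cl": 2.0,
    "OR": 2.4,
    "alkyl": 0.0,
}

def predict_shift(substituents):
    """Incremental prediction: look up an increment for each substituent
    and add it to the base shift. Real tables are fitted to experimental
    data; the numbers here are placeholders."""
    return BASE_CH2 + sum(INCREMENTS[s] for s in substituents)

delta = predict_shift(["phenyl", "Cl"])   # e.g. a PhCH2Cl-type methylene
```

A neural-network predictor replaces the fixed lookup table with learned weights over richer structural descriptors, but the input/output contract is the same.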
The validation of the performance of a neural network based 13C NMR prediction algorithm using a test set available from an open source publicly available database, NMRShiftDB, is described. The validation was performed using a version of the database containing ca. 214,000 chemical shifts as well as for two subsets of the database to compare performance when overlap with the training set is taken into account. The first subset contained ca. 93,000 chemical shifts that were absent from the ACD\CNMR DB, the “excluded shift set” used for training of the neural network and the ACD\CNMR prediction algorithm, while the second contained ca. 121,000 shifts that were present in the ACD\CNMR DB training set, the “included shift set”. This work has shown that the mean error between experimental and predicted shifts for the entire database is 1.59 ppm, while the mean deviation for the subset with included shifts is 1.47 ppm and 1.74 ppm for excluded shifts. Since similar work has been reported online for another algorithm we compared the results with the errors determined using Robien’s CNMR Neural Network Predictor using the entire NMRShiftDB for program validation.
Multimode system condition monitoring using sparsity reconstruction for quali...IJECEIAES
In this paper, we introduce an improved multivariate statistical monitoring method based on the stacked sparse autoencoder (SSAE). Our contribution focuses on the choice of the SSAE model based on neural networks to solve diagnostic problems of complex systems. In order to monitor the process performance, the squared prediction error (SPE) chart is linked with nonparametric adaptive confidence bounds which arise from the kernel density estimation to minimize erroneous alerts. Then, faults are localized using two methods: contribution plots and sensor validity index (SVI). The results are obtained from experiments and real data from a drinkable water processing plant, demonstrating how the applied technique is performed. The simulation results of the SSAE model show a better ability to detect and identify sensor failures.
Similar to LASSO MODELING AS AN ALTERNATIVE TO PCA BASED MULTIVARIATE MODELS TO SYSTEM WITH HEAVY SPARSITY: “BIODIESEL QUALITY BY NIR SPECTROSCOPY” (20)
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...mathsjournal
Speaker diarization is a critical task in speech processing that aims to identify "who spoke when?" in an
audio or video recording that contains unknown amounts of speech from unknown speakers and unknown
number of speakers. Diarization has numerous applications in speech recognition, speaker identification,
and automatic captioning. Supervised and unsupervised algorithms are used to address speaker diarization
problems, but providing exhaustive labeling for the training dataset can become costly in supervised
learning, while accuracy can be compromised when using unsupervised approaches. This paper presents a
novel approach to speaker diarization, which defines loosely labeled data and employs x-vector embedding
and a formalized approach for threshold searching with a given abstract similarity metric to cluster
temporal segments into unique user segments. The proposed algorithm uses concepts of graph theory,
matrix algebra, and genetic algorithm to formulate and solve the optimization problem. Additionally, the
algorithm is applied to English, Spanish, and Chinese audios, and the performance is evaluated using wellknown similarity metrics. The results demonstrate that the robustness of the proposed approach. The
findings of this research have significant implications for speech processing, speaker identification
including those with tonal differences. The proposed method offers a practical and efficient solution for
speaker diarization in real-world scenarios where there are labeling time and cost constraints.
A POSSIBLE RESOLUTION TO HILBERT’S FIRST PROBLEM BY APPLYING CANTOR’S DIAGONA...mathsjournal
We present herein a new approach to the Continuum hypothesis CH. We will employ a string conditioning,
a technique that limits the range of a string over some of its sub-domains for forming subsets K of R. We
will prove that these are well defined and in fact proper subsets of R by making use of Cantor’s Diagonal
argument in its original form to establish the cardinality of K between that of (N,R) respectively
A Positive Integer 𝑵 Such That 𝒑𝒏 + 𝒑𝒏+𝟑 ~ 𝒑𝒏+𝟏 + 𝒑𝒏+𝟐 For All 𝒏 ≥ 𝑵mathsjournal
According to Bertrand's postulate, we have 𝑝𝑛 + 𝑝𝑛 ≥ 𝑝𝑛+1. Is it true that for all 𝑛 > 1 then 𝑝𝑛−1 + 𝑝𝑛 ≥𝑝𝑛+1? Then 𝑝𝑛 + 𝑝𝑛+3 > 𝑝𝑛+1 + 𝑝𝑛+2where 𝑛 ≥ 𝑁, 𝑁 is a large enough value?
A POSSIBLE RESOLUTION TO HILBERT’S FIRST PROBLEM BY APPLYING CANTOR’S DIAGONA...mathsjournal
We present herein a new approach to the Continuum hypothesis CH. We will employ a string conditioning,
a technique that limits the range of a string over some of its sub-domains for forming subsets K of R. We
will prove that these are well defined and in fact proper subsets of R by making use of Cantor’s Diagonal
argument in its original form to establish the cardinality of K between that of (N,R) respectively.
Moving Target Detection Using CA, SO and GO-CFAR detectors in Nonhomogeneous ...mathsjournal
systems in complex situations. A fundamental problem in radar systems is to automatically detect targets while maintaining a
desired constant false alarm probability. This work studies two detection approaches, the first with a fixed threshold and the
other with an adaptive one. In the latter, we have learned the three types of detectors CA, SO, and GO-CFAR. This research
aims to apply intelligent techniques to improve detection performance in a nonhomogeneous environment using standard
CFAR detectors. The objective is to maintain the false alarm probability and enhance target detection by combining
intelligent techniques. With these objectives in mind, implementing standard CFAR detectors is applied to nonhomogeneous
environment data. The primary focus is understanding the reason for the false detection when applying standard CFAR
detectors in a nonhomogeneous environment and how to avoid it using intelligent approaches.
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...mathsjournal
Speaker diarization is a critical task in speech processing that aims to identify "who spoke when?" in an
audio or video recording that contains unknown amounts of speech from unknown speakers and unknown
number of speakers. Diarization has numerous applications in speech recognition, speaker identification,
and automatic captioning. Supervised and unsupervised algorithms are used to address speaker diarization
problems, but providing exhaustive labeling for the training dataset can become costly in supervised
learning, while accuracy can be compromised when using unsupervised approaches. This paper presents a
novel approach to speaker diarization, which defines loosely labeled data and employs x-vector embedding
and a formalized approach for threshold searching with a given abstract similarity metric to cluster
temporal segments into unique user segments. The proposed algorithm uses concepts of graph theory,
matrix algebra, and genetic algorithm to formulate and solve the optimization problem. Additionally, the
algorithm is applied to English, Spanish, and Chinese audios, and the performance is evaluated using wellknown similarity metrics. The results demonstrate that the robustness of the proposed approach. The
findings of this research have significant implications for speech processing, speaker identification
including those with tonal differences. The proposed method offers a practical and efficient solution for
speaker diarization in real-world scenarios where there are labeling time and cost constraints.
The Impact of Allee Effect on a Predator-Prey Model with Holling Type II Func...mathsjournal
There is currently much interest in predator–prey models across a variety of bioscientific disciplines. The focus is on quantifying predator–prey interactions, and this quantification is being formulated especially as regards climate change. In this article, a stability analysis is used to analyse the behaviour of a general two-species model with respect to the Allee effect (on the growth rate and nutrient limitation level of the prey population). We present a description of the local and non-local interaction stability of the model and detail the types of bifurcation which arise, proving that there is a Hopf bifurcation in the Allee effect module. A stable periodic oscillation was encountered which was due to the Allee effect on the
prey species. As a result of this, the positive equilibrium of the model could change from stable to unstable and then back to stable, as the strength of the Allee effect (or the ‘handling’ time taken by predators when predating) increased continuously from zero. Hopf bifurcation has arose yield some complex patterns that have not been observed previously in predator-prey models, and these, at the same time, reflect long term behaviours. These findings have significant implications for ecological studies, not least with respect to examining the mobility of the two species involved in the non-local domain using Turing instability. A spiral generated by local interaction (reflecting the instability that forms even when an infinitely large
carrying capacity is assumed) is used in the model.
A POSSIBLE RESOLUTION TO HILBERT’S FIRST PROBLEM BY APPLYING CANTOR’S DIAGONA...mathsjournal
We present herein a new approach to the Continuum hypothesis CH. We will employ a string conditioning,a technique that limits the range of a string over some of its sub-domains for forming subsets K of R. We will prove that these are well defined and in fact proper subsets of R by making use of Cantor’s Diagonal argument in its original form to establish the cardinality of K between that of (N,R) respectively.
Moving Target Detection Using CA, SO and GO-CFAR detectors in Nonhomogeneous ...mathsjournal
Modernization of radar technology and improved signal processing techniques are necessary to improve detection systems in complex situations. A fundamental problem in radar systems is to automatically detect targets while maintaining a
desired constant false alarm probability. This work studies two detection approaches, the first with a fixed threshold and the
other with an adaptive one. In the latter, we have learned the three types of detectors CA, SO, and GO-CFAR. This research
aims to apply intelligent techniques to improve detection performance in a nonhomogeneous environment using standard
CFAR detectors. The objective is to maintain the false alarm probability and enhance target detection by combining
intelligent techniques. With these objectives in mind, implementing standard CFAR detectors is applied to nonhomogeneous
environment data. The primary focus is understanding the reason for the false detection when applying standard CFAR
detectors in a nonhomogeneous environment and how to avoid it using intelligent approaches
OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIAR...mathsjournal
Speaker diarization is a critical task in speech processing that aims to identify "who spoke when?" in an
audio or video recording that contains unknown amounts of speech from unknown speakers and unknown
number of speakers. Diarization has numerous applications in speech recognition, speaker identification,
and automatic captioning. Supervised and unsupervised algorithms are used to address speaker diarization
problems, but providing exhaustive labeling for the training dataset can become costly in supervised
learning, while accuracy can be compromised when using unsupervised approaches. This paper presents a
novel approach to speaker diarization, which defines loosely labeled data and employs x-vector embedding
and a formalized approach for threshold searching with a given abstract similarity metric to cluster
temporal segments into unique user segments. The proposed algorithm uses concepts of graph theory,
matrix algebra, and genetic algorithm to formulate and solve the optimization problem. Additionally, the
algorithm is applied to English, Spanish, and Chinese audios, and the performance is evaluated using wellknown similarity metrics. The results demonstrate that the robustness of the proposed approach. The
findings of this research have significant implications for speech processing, speaker identification
including those with tonal differences. The proposed method offers a practical and efficient solution for
speaker diarization in real-world scenarios where there are labeling time and cost constraints
Modified Alpha-Rooting Color Image Enhancement Method on the Two Side 2-D Qua...mathsjournal
Color in an image is resolved to 3 or 4 color components and 2-Dimages of these components are stored in separate channels. Most of the color image enhancement algorithms are applied channel-by-channel on each image. But such a system of color image processing is not processing the original color. When a color image is represented as a quaternion image, processing is done in original colors. This paper proposes an implementation of the quaternion approach of enhancement algorithm for enhancing color images and is referred as the modified alpha-rooting by the two-dimensional quaternion discrete Fourier transform (2-D QDFT). Enhancement results of this proposed method are compared with the channel-by-channel image enhancement by the 2-D DFT. Enhancements in color images are quantitatively measured by the color enhancement measure estimation (CEME), which allows for selecting optimum parameters for processing by thegenetic algorithm. Enhancement of color images by the quaternion based method allows for obtaining images which are closer to the genuine representation of the real original color.
An Application of Assignment Problem in Laptop Selection Problem Using MATLABmathsjournal
The assignment – selection problem used to find one-to- one match of given “Users” to “Laptops”, the main objective is to minimize the cost as per user requirement. This paper presents satisfactory solution for real assignment – Laptop selection problem using MATLAB coding.
The aim of this paper is to study the class of β-normal spaces. The relationships among s-normal spaces, pnormal spaces and β-normal spaces are investigated. Moreover, we study the forms of generalized β-closed
functions. We obtain characterizations of β-normal spaces, properties of the forms of generalized β-closed
functions and preservation theorems.
Cubic Response Surface Designs Using Bibd in Four Dimensionsmathsjournal
Response Surface Methodology (RSM) has applications in Chemical, Physical, Meteorological, Industrial and Biological fields. The estimation of slope response surface occurs frequently in practical situations for the experimenter. The rates of change of the response surface, like rates of change in the yield of crop to various fertilizers, to estimate the rates of change in chemical experiments etc. are of
interest. If the fit of second order response is inadequate for the design points, we continue the
experiment so as to fit a third order response surface. Higher order response surface designs are sometimes needed in Industrial and Meteorological applications. Gardiner et al (1959) introduced third order rotatable designs for exploring response surface. Anjaneyulu et al (1994-1995) constructed third order slope rotatable designs using doubly balanced incomplete block designs. Anjaneyulu et al (2001)
introduced third order slope rotatable designs using central composite type design points. Seshu babu et al (2011) studied modified construction of third order slope rotatable designs using central composite
designs. Seshu babu et al (2014) constructed TOSRD using BIBD. In view of wide applicability of third
order models in RSM and importance of slope rotatability, we introduce A Cubic Slope Rotatable Designs Using BIBD in four dimensions.
The caustic that occur in geodesics in space-times which are solutions to the gravitational field equations with the energy-momentum tensor satisfying the dominant energy condition can be circumvented if quantum variations are allowed. An action is developed such that the variation yields the field equations
and the geodesic condition, and its quantization provides a method for determining the extent of the wave packet around the classical path.
Approximate Analytical Solution of Non-Linear Boussinesq Equation for the Uns...mathsjournal
For one dimensional homogeneous, isotropic aquifer, without accretion the governing Boussinesq equation under Dupuit assumptions is a nonlinear partial differential equation. In the present paper approximate analytical solution of nonlinear Boussinesq equation is obtained using Homotopy perturbation transform method(HPTM). The solution is compared with the exact solution. The comparison shows that the HPTM is efficient, accurate and reliable. The analysis of two important aquifer
parameters namely viz. specific yield and hydraulic conductivity is studied to see the effects on the height of water table. The results resemble well with the physical phenomena.
Common Fixed Point Theorems in Compatible Mappings of Type (P*) of Generalize...mathsjournal
In this paper, we give some new definition of Compatible mappings of type (P), type (P-1) and type (P-2) in intuitionistic generalized fuzzy metric spaces and prove Common fixed point theorems for six mappings
under the conditions of compatible mappings of type (P-1) and type (P-2) in complete intuitionistic fuzzy
metric spaces. Our results intuitionistically fuzzify the result of Muthuraj and Pandiselvi [15]
Mathematics subject classifications: 45H10, 54H25
A Probabilistic Algorithm for Computation of Polynomial Greatest Common with ...mathsjournal
In the earlier work, Knuth present an algorithm to decrease the coefficient growth in the Euclidean algorithm of polynomials called subresultant algorithm. However, the output polynomials may have a small factor which can be removed. Then later, Brown of Bell Telephone Laboratories showed the subresultant in another way by adding a variant called 𝜏 and gave a way to compute the variant. Nevertheless, the way failed to determine every 𝜏 correctly.
In this paper, we will give a probabilistic algorithm to determine the variant 𝜏 correctly in most cases by adding a few steps instead of computing 𝑡(𝑥) when given 𝑓(𝑥) and𝑔(𝑥) ∈ ℤ[𝑥], where 𝑡(𝑥) satisfies that 𝑠(𝑥)𝑓(𝑥) + 𝑡(𝑥)𝑔(𝑥) = 𝑟(𝑥), here 𝑡(𝑥), 𝑠(𝑥) ∈ ℤ[𝑥]
Table of Contents - September 2022, Volume 9, Number 2/3mathsjournal
Applied Mathematics and Sciences: An International Journal (MathSJ ) aims to publish original research papers and survey articles on all areas of pure mathematics, theoretical applied mathematics, mathematical physics, theoretical mechanics, probability and mathematical statistics, and theoretical biology. All articles are fully refereed and are judged by their contribution to advancing the state of the science of mathematics.
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdfTechSoup
In this webinar you will learn how your organization can access TechSoup's wide variety of product discount and donation programs. From hardware to software, we'll give you a tour of the tools available to help your nonprofit with productivity, collaboration, financial management, donor tracking, security, and more.
The French Revolution, which began in 1789, was a period of radical social and political upheaval in France. It marked the decline of absolute monarchies, the rise of secular and democratic republics, and the eventual rise of Napoleon Bonaparte. This revolutionary period is crucial in understanding the transition from feudalism to modernity in Europe.
For more information, visit-www.vavaclasses.com
Biological screening of herbal drugs: Introduction and Need for
Phyto-Pharmacological Screening, New Strategies for evaluating
Natural Products, In vitro evaluation techniques for Antioxidants, Antimicrobial and Anticancer drugs. In vivo evaluation techniques
for Anti-inflammatory, Antiulcer, Anticancer, Wound healing, Antidiabetic, Hepatoprotective, Cardio protective, Diuretics and
Antifertility, Toxicity studies as per OECD guidelines
Honest Reviews of Tim Han LMA Course Program.pptxtimhan337
Personal development courses are widely available today, with each one promising life-changing outcomes. Tim Han’s Life Mastery Achievers (LMA) Course has drawn a lot of interest. In addition to offering my frank assessment of Success Insider’s LMA Course, this piece examines the course’s effects via a variety of Tim Han LMA course reviews and Success Insider comments.
Acetabularia Information For Class 9 .docxvaibhavrinwa19
Acetabularia acetabulum is a single-celled green alga that in its vegetative state is morphologically differentiated into a basal rhizoid and an axially elongated stalk, which bears whorls of branching hairs. The single diploid nucleus resides in the rhizoid.
How to Make a Field invisible in Odoo 17Celine George
It is possible to hide or invisible some fields in odoo. Commonly using “invisible” attribute in the field definition to invisible the fields. This slide will show how to make a field invisible in odoo 17.
Instructions for Submissions thorugh G- Classroom.pptxJheel Barad
This presentation provides a briefing on how to upload submissions and documents in Google Classroom. It was prepared as part of an orientation for new Sainik School in-service teacher trainees. As a training officer, my goal is to ensure that you are comfortable and proficient with this essential tool for managing assignments and fostering student engagement.
Embracing GenAI - A Strategic ImperativePeter Windle
Artificial Intelligence (AI) technologies such as Generative AI, Image Generators and Large Language Models have had a dramatic impact on teaching, learning and assessment over the past 18 months. The most immediate threat AI posed was to Academic Integrity with Higher Education Institutes (HEIs) focusing their efforts on combating the use of GenAI in assessment. Guidelines were developed for staff and students, policies put in place too. Innovative educators have forged paths in the use of Generative AI for teaching, learning and assessments leading to pockets of transformation springing up across HEIs, often with little or no top-down guidance, support or direction.
This Gasta posits a strategic approach to integrating AI into HEIs to prepare staff, students and the curriculum for an evolving world and workplace. We will highlight the advantages of working with these technologies beyond the realm of teaching, learning and assessment by considering prompt engineering skills, industry impact, curriculum changes, and the need for staff upskilling. In contrast, not engaging strategically with Generative AI poses risks, including falling behind peers, missed opportunities and failing to ensure our graduates remain employable. The rapid evolution of AI technologies necessitates a proactive and strategic approach if we are to remain relevant.
A Strategic Approach: GenAI in EducationPeter Windle
Artificial Intelligence (AI) technologies such as Generative AI, Image Generators and Large Language Models have had a dramatic impact on teaching, learning and assessment over the past 18 months. The most immediate threat AI posed was to Academic Integrity with Higher Education Institutes (HEIs) focusing their efforts on combating the use of GenAI in assessment. Guidelines were developed for staff and students, policies put in place too. Innovative educators have forged paths in the use of Generative AI for teaching, learning and assessments leading to pockets of transformation springing up across HEIs, often with little or no top-down guidance, support or direction.
This Gasta posits a strategic approach to integrating AI into HEIs to prepare staff, students and the curriculum for an evolving world and workplace. We will highlight the advantages of working with these technologies beyond the realm of teaching, learning and assessment by considering prompt engineering skills, industry impact, curriculum changes, and the need for staff upskilling. In contrast, not engaging strategically with Generative AI poses risks, including falling behind peers, missed opportunities and failing to ensure our graduates remain employable. The rapid evolution of AI technologies necessitates a proactive and strategic approach if we are to remain relevant.
LASSO MODELING AS AN ALTERNATIVE TO PCA BASED MULTIVARIATE MODELS TO SYSTEM WITH HEAVY SPARSITY: “BIODIESEL QUALITY BY NIR SPECTROSCOPY”
Applied Mathematics and Sciences: An International Journal (MathSJ), Vol. 7, No. 1, March 2020
DOI: 10.5121/mathsj.2020.7101
Cesar Mello¹, Cassiano Escudeiro¹, and Isao Noda²

¹ Integrated Institute of Clinical Research - IPclin, Jundiai, State of São Paulo, Brazil.
² Department of Materials Science and Engineering, University of Delaware, Newark, DE, USA.
ABSTRACT
Principal component analysis (PCA) is widespread and widely used in various areas of science, such as bioinformatics, econometrics, and chemometrics, among others. However, PCA is based on the eigenvalues and eigenvectors of the data matrix, which is a weak approach for high-dimensional systems with some degree of sparsity; in these situations PCA is no longer a recommended procedure. Sparsity is very common in near-infrared spectroscopy due to the large number of spectra required and the broad water absorption bands, which make the spectra very similar and the data matrix heavily sparse, degrading precision and accuracy in multivariate modeling and in projections of the data matrix onto smaller dimensions. To overcome these shortcomings, the LASSO, a method not based on PCA, was applied to a NIR spectra dataset from biodiesel, and its performance was statistically compared with traditional multivariate models such as PCR and PLSR.
KEYWORDS
LASSO model, sparsity, multivariate modeling, NIR spectroscopy, biodiesel from recyclable sources.
1. INTRODUCTION
Multivariate modeling is a traditional chemometric tool, very useful for building models from near-infrared (NIR) spectroscopy data sets to ensure product quality and to evaluate, very quickly, the performance of industrial processes in many areas, such as foods, fuels, and polymers. This spectroscopic technique provides fast, noninvasive, and nondestructive analysis of samples, predicting a large number of sample properties from NIR spectra [1]. The great majority of multivariate models use, at least in the initial steps, principal component analysis (PCA) [2]. PCA-based methods are especially important for spectroscopy data sets, where the number of independent variables is greater than the number of acquired spectra, so the system of equations is underdetermined. However, PCA-based methods such as principal component regression (PCR) [3] and partial least squares regression (PLSR) [4] have at least two important shortcomings: when the matrix from the NIR dataset tends toward sparsity, and when the relationship between the sample properties and the spectra is slightly nonlinear. At this point it is important to make clear that when we refer to the matrix from the NIR spectra set, we mean the correlation matrix (C), calculated by the following steps:
i. Data centering on the average:

x_{i,j} = x_{i,j} - x̄_j    (1)

where x̄_j is the average of the j-th column of the data:

x̄_j = (1/I) Σ_{i=1}^{I} x_{i,j}    (2)

where i indexes the rows, j indexes the columns, and I is the total number of rows.

ii. Autoscaling:

x_{i,j} = (x_{i,j} - x̄_j) / s_j    (3)

where s_j is the standard deviation of the j-th column of the data matrix X.

iii. Correlation matrix C:

C = (X_autoscaled^T · X_autoscaled) / (I - 1)    (4)
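The steps of Eqs. (1)-(4) can be sketched in a few lines of Python with NumPy (an assumed toolchain; the paper does not specify software):

```python
import numpy as np

def correlation_matrix(X):
    """Compute the correlation matrix C of Eqs. (1)-(4): mean-center
    each column, autoscale by the column standard deviation, then
    form C = X_as^T X_as / (I - 1)."""
    I = X.shape[0]                  # number of spectra (rows)
    x_bar = X.mean(axis=0)          # Eq. (2): column averages
    s = X.std(axis=0, ddof=1)       # column standard deviations
    X_as = (X - x_bar) / s          # Eqs. (1) and (3): autoscaling
    return X_as.T @ X_as / (I - 1)  # Eq. (4)

# Small synthetic example: 5 "spectra" over 3 wavelengths
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
C = correlation_matrix(X)
# C agrees with NumPy's built-in column correlation
assert np.allclose(C, np.corrcoef(X, rowvar=False))
```

With ddof=1 autoscaling, C has unit diagonal and Pearson correlations off-diagonal, matching `np.corrcoef`.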
Some authors claim that their PCR and PLSR models can handle small nonlinearities and a very small degree of sparsity through the addition of extra principal components (PCs). In practice, however, external disturbances such as light scattering, baseline fluctuations, and noise introduce nonlinearities and increase the sparsity of the spectral data, deteriorating the prediction accuracy and precision of PCR and PLSR modeling.
In many cases, however, it is possible to minimize these sources of nonlinearity very effectively. Several mathematical techniques exist for these tasks, such as multiplicative scatter correction (MSC) [5], noise minimization by the fast Fourier transform (FFT) [6] or wavelet filters [7], Gram-Schmidt orthogonalization [8], and detrend baseline correction [9], just to name a few. Such pre-processing methods allow PCA-based models to be used on the spectral data set without major problems. The NIR technique itself, however, presents several shortcomings that are very difficult to overcome. The intrinsically broad NIR absorption bands, called overtones, together with the highly undesirable broad water absorption bands, make the NIR spectrum a collection of broad, heavily overlapped bands that are difficult to interpret physically or chemically. These broad bands affect the NIR spectrum as a whole, making the NIR spectra of different samples very similar to one another. Together, these physical effects generate heavy sparsity in C. In such cases, PCA-based multivariate modeling methods lose accuracy and precision, by orders of magnitude depending on the system under evaluation. Moreover, in such cases it is necessary to acquire a large number of NIR spectra for the prediction and validation procedures of these models.
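As an illustration of one of the pre-processing techniques named above, here is a minimal sketch of multiplicative scatter correction in Python with NumPy (our sketch; the paper provides no code). Each spectrum is regressed against a reference spectrum and corrected by the fitted offset and gain:

```python
import numpy as np

def msc(X, reference=None):
    """Multiplicative scatter correction: fit each spectrum to the
    reference by least squares, x_i ~ a + b * ref, then return
    the corrected spectrum (x_i - a) / b."""
    ref = X.mean(axis=0) if reference is None else reference
    X_corr = np.empty_like(X, dtype=float)
    for i, x in enumerate(X):
        b, a = np.polyfit(ref, x, deg=1)  # slope b, intercept a
        X_corr[i] = (x - a) / b
    return X_corr

# A spectrum distorted by a pure offset and gain is restored:
base = np.linspace(0.0, 1.0, 50) ** 2
X = np.vstack([base, 1.5 * base + 0.2])
Xc = msc(X, reference=base)
assert np.allclose(Xc[1], base)
```

Real scatter is only approximately multiplicative, so MSC reduces rather than eliminates these distortions.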
We thus have a reasonably complicated situation: a large data matrix with heavy sparsity. In such cases, multivariate modeling methods based on principal component analysis can fail. It is therefore recommended to first analyze the asymptotic behavior of the principal components, i.e., the directions of the PCs, in large-dimensional systems such as the matrices arising from NIR spectroscopy datasets. This asymptotic analysis basically shows that if the first few eigenvalues of the population covariance matrix are large enough compared to the others, then the corresponding estimated PC directions are consistent, converging to the appropriate subspace, while most of the other PC directions are strongly inconsistent, i.e., they carry no physical meaning. After such a mathematical analysis, one can decide whether it is possible to use PCA or not.
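This spiked-eigenvalue behavior can be illustrated numerically. The following NumPy sketch (our illustration, not from the paper) builds data with one large population eigenvalue: the first estimated PC direction recovers the true direction, while trailing PC directions are essentially noise:

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 100, 500                 # far fewer samples than variables
v = np.zeros(p)
v[0] = 1.0                      # true leading population direction
# Strong signal along v plus isotropic noise
X = rng.normal(size=(n, 1)) * 20 * v + rng.normal(size=(n, p))
X -= X.mean(axis=0)             # center before PCA
_, _, Vt = np.linalg.svd(X, full_matrices=False)
cos_first = abs(Vt[0] @ v)      # near 1: first PC is consistent
cos_last = abs(Vt[-1] @ v)      # near 0: trailing PC is meaningless
```

When the leading eigenvalue is not large enough relative to the noise, `cos_first` also collapses, which is the inconsistency that makes PCA unreliable on heavily sparse, high-dimensional C.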
However, it is much simpler to compute the condition number of C before applying PCA. Thus, to a first approximation, one can establish a very simple rule for the use of PCA by calculating the condition number (CN) [10] of the matrix from the NIR dataset. A little further analysis of the CN of C is necessary, though. The CN of a matrix is a measure of whether the problem is in "good condition" to be treated numerically. A problem with a small CN is called well-conditioned, while a problem with a high condition number is called ill-conditioned, due to, among other effects, sparsity. For example, the CN associated with a linear system estimates the accuracy that can be obtained for an approximate solution of the system, before the effects of rounding errors are taken into account. Paired with this problem, there are any number of algorithms that can be used to compute the solution. Some numerical algorithms have a property called backward stability; in general, a backward stable algorithm can be expected to accurately solve only well-conditioned problems. The CN, however, is an intrinsic property of C, and we can use the following simple rule of thumb: if the CN is small, then it is possible, to a first approximation, to apply principal component analysis and to use traditional, well-established multivariate models such as PCR and PLSR. This type of mathematical analysis is a little cumbersome and very unusual in the daily use of NIR, so it is much more practical and efficient to use LASSO modeling as the method of choice. Despite the
possible difficulties mentioned above, the PCA is still an ordinary and very efficient
mathematical method for dimensionality reducing from the spectral data set with the
shortcomings aforementionedand the vast majority of NIR spectrometers have software to
perform PCA based methods such as PCR and PLSR, among others. The widespread use of
these nonparametric methods relies on its easy computational implementation, and such
methods, in general, are algorithmic methods. In the 1980s, routine interactive computing was
just beginning to raise its head, and exploratory data analysis was a new idea. Since then, we
have witnessed a number of remarkable developments in local computing power and data
storage. Besides that, efficient and interactive chemometrics packages have enabled
sophisticated data analyses to carry out effortlessly and the vast majority of NIR spectrometers
have software to perform PCA based methods such as PCR and PLSR, among others. However
such kind of nonparametric methods provide approximates solutions and not necessarily robust
ones, causing a lot of wrong solutions and misinterpretations even in quite simple
systems.These nonparametric methods are not robust inmathematically sense due to its high
sensibility to the nonlinearities, outliers and heavy sparsity in C. Hence, before using the
nonparametric methods, it might be a good strategy to analyze the sparsity of C, before building
PCA based models as extensively mentioned.In this work, a parametric method not largely
usedin multivariate modelingmainly in, at least,three very important areas such as
chemometrics, econometrics and financial sciences, called Least Absolute Shrinkage, Selection
Operator (LASSO) was used and its performance was evaluated against models done with PCR
and PLSR.The LASSO method, also reduce the dimensionality of a data set such as PCA does,
but using a different mathematical approach as we will describe ahead [11, 12]. The LASSO
method, also reduce the dimensionality of C such as PCA does, but using a different
mathematical approach as we will describe ahead. PCR and PLSR as along with LASSO
method were used to build multivariate calibration models to determinate the percentage in
mass (w/w) of Biodiesel[13], from animal fat recyclable sources, contained in Diesel B20,
which is a complex blend of soybean, animal fat Biodiesel and petroleum Diesel. Depending on
the percentage (w/w) of animal fat Biodiesel and operational temperatures, it can cause several
and different engine failures. Hence, the multivariate modelingvery robust, less sensitive to
outliers and mainly sparsity is a very attractive alternative to determine the content of animal fat
Biodiesel in Diesel B20, by NIR spectroscopy, to improve the quality control of fuel to avoiding
operational failures related to the Diesel quality, in engine vehicles. Nowadays, the use of
multivariate modeling in modeling NIR spectroscopic data sets is widely employed, being
considered a standard procedure for quantitative analysis in several analytical techniques and
precisely because of this fact, very robust multivariate modeling methods with low sensitivity to
sparsity of the NIR spectra matrix should be used.
1.1. The LASSO Modeling
The LASSO method is a hybrid of variable selection and shrinkage estimators. The procedure shrinks the coefficients of some of the variables not merely towards zero but exactly to zero, giving an implicit form of variable selection. In standard multiple regression we have Eq. 5:
y_i = λ_0 + Σ_{j=1}^{p} λ_j x_{ij} + ε    (5)
where y_1, y_2, ..., y_n are measurements of a response variable y; x_{ij}, i = 1, 2, ..., n and j = 1, 2, ..., p, are the corresponding values of the p predictor variables; λ_0, λ_1, ..., λ_p are the parameters of the regression in Eq. 5; and ε is an error term. In least squares regression, the parameters are obtained by minimizing the residual sum of squares:
min Σ_{i=1}^{n} ( y_i − λ_0 − Σ_{j=1}^{p} λ_j x_{ij} )²    (6)
The LASSO modeling imposes an additional restriction on the coefficients, namely:
Σ_{j=1}^{p} |λ_j| ≤ t    (7)
Thus, in LASSO modeling we have a typical constrained minimization problem, in which t acts as a "tuning parameter": suitable choices of the constraint t have the interesting property of forcing some of the coefficients in the regression equation to become exactly zero. Computing the solution of Eq. 6 under the restriction given by Eq. 7 is a quadratic programming problem with linear inequality constraints, and there are several efficient and very stable algorithms for solving this kind of problem. Finally, it is possible to summarize the LASSO theory as ordinary least squares (OLS) [14] with restrictions, leading to an exact solution obtained by quadratic programming. In this work the LASSO parametric method was applied to build a direct model for a spectral data set of one hundred NIR spectra, in order to determine the percentage of animal fat Biodiesel in Diesel B20; it can equally be applied to a matrix C with two hundred, three hundred, or even more NIR spectra. Finally, the results were compared with two traditional nonparametric methods, partial least squares regression (PLSR) and principal component regression (PCR).
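The shrinkage behavior described above can be made concrete with a small self-contained sketch. Note that this is not the quadratic-programming solver used in this work (Octave routines); it is the equivalent coordinate-descent formulation with soft-thresholding, applied to hypothetical toy data, shown only to illustrate how coefficients are forced exactly to zero:

```python
def soft_threshold(rho, alpha):
    # Soft-thresholding operator: drives small coefficients exactly to zero.
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0

def lasso_cd(X, y, alpha, n_iter=200):
    # Minimise (1/2n)*sum((y - X@coef)^2) + alpha*sum(|coef_j|) by cyclic
    # coordinate descent (no intercept: centre y and the columns of X first).
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho, z = 0.0, 0.0
            for i in range(n):
                # Partial residual, excluding feature j's own contribution.
                pred_others = sum(X[i][k] * coef[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_others)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            coef[j] = soft_threshold(rho, alpha) / z if z > 0.0 else 0.0
    return coef

# Toy data: y depends on the first (centred) predictor only; the second is noise.
X = [[-1.5, 0.1], [-0.5, -0.1], [0.5, 0.1], [1.5, -0.1]]
y = [-3.0, -1.0, 1.0, 3.0]
print(lasso_cd(X, y, alpha=0.1))  # second coefficient is shrunk exactly to zero
```

The noise coefficient comes out exactly 0.0, not merely small, which is the implicit variable selection the text refers to.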
2. EXPERIMENTAL
The process of producing biodiesel from renewable natural sources is well established and widely known, and references can be found as aforementioned; it is therefore not necessary to present the process in great detail in this work. We emphasize once again that the most important point is the evaluation of the biodiesel produced through NIR spectroscopy, and with this in mind the experimental part can be simplified. A set of one hundred blends of Diesel B20 and animal fat Biodiesel was initially prepared, with different concentrations of animal fat Biodiesel in Diesel B20. The animal fat Biodiesel content in Diesel B20 ranged from 0.00% to 13.80% (w/w), where Diesel B20 is a blend of soybean Biodiesel and petroleum diesel containing at most 50 ppm (mass) of sulfur. Near infrared (NIR) spectra were acquired after each addition of animal fat Biodiesel, using a Perkin Elmer 100N NIR spectrometer equipped with a liquid reflectance accessory, in the range from 4,000 to 400 cm⁻¹, with a resolution of 4 cm⁻¹ and 64 scans per sample. All samples were acquired in triplicate, and the average spectrum was used to build the calibration model. The set of one hundred NIR spectra was split into two data sets using the Kennard–Stone [16] algorithm: one with 70 samples for calibration and the other with 30 samples for prediction and model validation. The matrix C was pre-processed using MSC, and the noise was minimized with a Fast Fourier Transform filter. All computational procedures were performed with routines developed in GNU Octave 5.1.0 [15].
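For reference, the MSC step can be sketched as follows: each spectrum is regressed on the mean spectrum of the set, and the fitted additive and multiplicative scatter effects are removed. This is a generic illustration with toy numbers, not the actual Octave routine used in this work:

```python
def msc(spectra):
    # Multiplicative scatter correction: regress every spectrum s on the mean
    # spectrum (s ≈ a + b·ref) and remove the additive (a) and
    # multiplicative (b) effects: corrected = (s - a) / b.
    m = len(spectra[0])
    ref = [sum(s[k] for s in spectra) / len(spectra) for k in range(m)]
    ref_mean = sum(ref) / m
    corrected = []
    for s in spectra:
        s_mean = sum(s) / m
        cov = sum((ref[k] - ref_mean) * (s[k] - s_mean) for k in range(m))
        var = sum((ref[k] - ref_mean) ** 2 for k in range(m))
        b = cov / var                 # multiplicative scatter effect
        a = s_mean - b * ref_mean     # additive scatter effect
        corrected.append([(s[k] - a) / b for k in range(m)])
    return corrected
```

Two spectra that differ only by an affine (baseline plus gain) distortion are mapped onto essentially the same corrected spectrum, which is the point of the pre-processing.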
3. RESULTS AND DISCUSSION
Using the procedure described in the experimental section, a set of one hundred NIR spectra was acquired in reflectance mode. For better visualization, the usual transformation of the reflectance measurements (R) was applied:
log(1/R)    (8)
The set of spectra was plotted with the wavenumbers in cm⁻¹ on the abscissa and log(1/R) on the ordinate, as shown in Figure 1.
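Eq. 8 is a simple point-wise transform of each reflectance spectrum; a one-line sketch, assuming each spectrum is stored as a list of reflectance values:

```python
import math

def reflectance_to_pseudo_absorbance(R):
    # Point-wise log10(1/R) transform of Eq. 8.
    return [math.log10(1.0 / r) for r in R]

print(reflectance_to_pseudo_absorbance([1.0, 0.1, 0.01]))
```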
Figure 1. The NIR Spectra obtained for 99 additions of animal fat Biodiesel in Diesel B20, and one with
pure Diesel B20, in a final set of 100 NIR spectra.
The coefficients λ were calculated using simple quadratic programming [16]; Figure 2 points out the shrinkage of the coefficients (λ) to zero, as described in the introduction of the LASSO method.
Figure 2. Graphical representation of the set of lambda (λ) values fitted by LASSO against degrees of freedom (df).
Figure 2 shows the λ values obtained by cross-validation. The cross-validation was done in blocks [17], since it is impossible to use the leave-one-out (LOO) scheme for the LASSO method.
Figure 3. Plot of λ against mean square error (MSE), used to choose the λ value with the smallest MSE.
The calibration curve obtained using the LASSO method is shown in Figure 4, with the actual values of %(w/w) animal fat Biodiesel on the abscissa and the values of %(w/w) animal fat Biodiesel obtained from the LASSO model on the ordinate. The figures of merit are shown inside Figure 4.
Figure 4. Plot of the calibration curve obtained by the LASSO method. The goodness of fit is shown inside the figure; R² = 0.9999.
The plot of Q residuals vs. Hotelling's T² statistic gives a compact view of both residual and score outliers (as well as inliers). The axes show that the scores capture 90.25% of the total variance (abscissa), while 9.75% of the variance remains in the residuals (ordinate). Figure 5 shows that there are not many score outliers, and some of them present only small values. As can be observed in Figure 5, the LASSO method outperforms the PCA-based models, because the presence of some outliers does not degrade its performance as it does in PCA-based models.
Figure 5. Plot of the Hotelling T² test to verify the presence of outliers.
A possible solution for PCA-based models is to remove the outliers beforehand, which is a cumbersome and, in some sense, subjective task. Moreover, it is essential to emphasize that reducing the size of the data set also reduces the performance of the PCA-based methods, owing to the condition number; there is always a trade-off. Nevertheless, if it is necessary to remove some outliers, a heuristic method can be applied to remove them from the large data set. This heuristic is quite simple: manually removing the outliers whose values are very discrepant in the data set. Such a heuristic goes against one of the main ideas of chemometrics, which is to extract maximum information with minimal experimental and subjective work. In cases where the data set is small, as in many chemometrics applications, perhaps the best approach is to keep all outliers and hope that the error does not increase too much. The LASSO method is a very robust alternative in these cases; that is, it does not show high sensitivity to outliers, especially for small datasets. The optimal number of principal components (PC's) for PCR was calculated by cross-validation using the LOO method. From Figure 6 it is easy to choose the optimal number of PC's, four for this data set, using the parsimony principle and observing that after the value 4 the RMSECV/RMSEC curve against the number of PC's tends asymptotically to a constant value, as shown in Figure 6.
Figure 6. Plot of the root mean square error of calibration (RMSEC) and the root mean square error of cross-validation (RMSECV) for the PCR model using the leave-one-out method.
Figure 7 shows the calibration curve for PCR, with the actual values of %(w/w) animal fat Biodiesel on the abscissa and the values of %(w/w) animal fat Biodiesel obtained with the PCR model on the ordinate. The goodness of fit is shown inside Figure 7.
Figure 7. Plot of the calibration curve obtained by the PCR method; the goodness of fit is shown inside the figure; R² = 0.9298.
The optimal number of latent variables (LV's) for PLSR was obtained by cross-validation using the LOO method. From Figure 8 it is easy to choose the optimal number of LV's for this data set: around four or five, since after these values the RMSEC/RMSECV curve tends asymptotically to zero. Using the parsimony principle, four LV's were chosen to build the PLSR model.
Figure 8. Plot of RMSEC and RMSECV for the PLSR model using the leave-one-out method.
The calibration curve for PLSR is shown in Figure 9, with the actual values on the abscissa and the values of %(w/w) animal fat Biodiesel obtained with the PLSR model on the ordinate. The figures of merit are shown inside Figure 9.
Figure 9. Plot of the calibration curve obtained by the PLSR method; the goodness of fit is shown in the figure; R² = 0.9918.
In the next three graphics (Figures 10, 11 and 12) the models obtained are further checked by evaluating the possible presence of overfitting or underfitting. This assessment is performed in a simple but statistically efficient and correct way: a normal probability distribution function is fitted to the residuals left by each model built in this work. Naturally, the model with the best fit to the experimental data set should show residuals tending, under ideal conditions, to a perfect Gaussian distribution shape. In Figure 10 it is possible to observe that the model obtained with PCR leaves an approximately Gaussian distribution without tails. The Gaussian fit of the residuals indicates clearly that the PCR model presents neither overfitting nor underfitting.
Figure 10. Gaussian fit of the residuals left by PCR; the goodness of fit is shown in the figure; x̄ = 4.7562·10⁻¹⁴ and s = 1.1381.
In Figure 11 it is possible to observe that the model obtained with PLSR leaves a Gaussian distribution with a light tail, indicating that the PLSR model presents a small amount of overfitting and/or underfitting.
Figure 11. Gaussian fit of the residuals left by PLSR; the goodness of fit is shown in the figure; x̄ = 1.6954·10⁻¹⁴ and s = 1.1381.
A definitive comparative discussion of PCR against PLSR is not a simple task, since there are many possible interpretations of overfitting or underfitting. It is possible to point out, however, specifically for this dataset and without being flippant, that the PCA applied to the independent variables revealed a poor performance, possibly due to the size of the data set. There is no consensus in the literature to explain the better performance of PCR over PLSR on some data sets, since the optimal number of LV's was chosen by cross-validation, as shown in Figure 8. Perhaps, if the dataset were larger, the light tail would not appear. However, obtaining a large data set is not a good idea either, as aforementioned, and the slight overfitting presented by the PLSR model is not a big problem for practical purposes, since local models have been constructed for both kinds of model and, in this range of concentrations, the PLSR model is also valid and useful.
In Figure 12 it is possible to observe that the model obtained with LASSO leaves a Gaussian distribution with a very light tail, as PLSR does, indicating clearly that, overall, the LASSO model presents less overfitting and/or underfitting than the PCR and PLSR models.
Figure 12. Gaussian fit of the residuals left by LASSO; the goodness of fit is shown in the figure; x̄ = 1.2093·10⁻¹⁴ and s = 0.0629.
The relative performance of the different models was evaluated in terms of the root mean square error (RMSE):
RMSE = √[ Σ_{i=1}^{n} (y_i − ŷ_{model,i})² / (n − 1) ]    (9)
where y_i is the actual value, ŷ_{model,i} is the value predicted by the model, and n is the number of samples used in the prediction set. A very simple analysis of the RMSE values, shown in Table 1, indicates that the LASSO method was the most efficient multivariate model, followed by PLSR and PCR, respectively.
Table 1. RMSE for the different types of models evaluated: LASSO, PLSR and PCR.
Model    RMSE
LASSO    0.1133
PLSR     0.3646
PCR      1.3040
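The RMSE of Eq. 9 (with its n − 1 denominator) is straightforward to compute; a minimal sketch with made-up numbers:

```python
import math

def rmse(y_true, y_pred):
    # Root mean square error with the (n - 1) denominator of Eq. 9.
    n = len(y_true)
    return math.sqrt(sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / (n - 1))

print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))  # sqrt(4 / 2) ≈ 1.414
```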
Such differences between the models are due to the nature of the methods and to the size of the NIR spectroscopy data set used in this article. The parametric model built with the LASSO method has a very robust numeric solution, while the nonparametric PLSR and PCR models are PCA-based methods, which, on some data sets, provide a mathematically weak and unstable approximate solution. Perhaps, if the data set used in this article were larger, the results obtained with the nonparametric models could be approximately equal to those of the parametric LASSO method. It is possible to affirm that the nonparametric methods used in this article are very sensitive to outliers and to the size of the data set; that is, they are not robust methods for small or ill-conditioned data sets. To check whether the assumptions made above are correct, an F-test [18] at the 95% confidence level was used to compare the RMSE of the different multivariate models used in this article; the F-test is given by Eq. 10:
F(p_i, p_j) = ( RMSE_{nonparametric model (PCR, PLSR)} / RMSE_{parametric model (LASSO)} )²    (10)
where p is the number of validation samples, in our case p = 30. The F-test critical value at the 95% confidence level is F_critical = 1.61. When the F-test is applied to the RMSE values obtained from the different modeling methods, it is observed that the LASSO model is significantly better than PLSR (F = 2.1) and PCR (F = 2.4). Despite its better performance, LASSO is a little more difficult to implement computationally and requires more computational time, around one minute, whereas the PLSR and PCR modeling takes only a few seconds. However, this statement may not have a long life, since low-cost microcomputers and more efficient numerical algorithms have grown exponentially, in addition to enhanced processing power, which is drastically reducing CPU time.
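The comparison in Eq. 10 reduces to a squared ratio of RMSE values checked against the tabulated critical value; a minimal sketch (the 1.61 critical value is the one quoted in the text, and the RMSE arguments below are made-up numbers):

```python
def f_ratio(rmse_a, rmse_b):
    # Squared ratio of the larger RMSE to the smaller one, as in Eq. 10.
    hi, lo = max(rmse_a, rmse_b), min(rmse_a, rmse_b)
    return (hi / lo) ** 2

F_CRITICAL = 1.61  # 95% confidence level, as quoted in the text

def significantly_different(rmse_a, rmse_b):
    # True when the F statistic exceeds the tabulated critical value.
    return f_ratio(rmse_a, rmse_b) > F_CRITICAL

print(significantly_different(0.2, 0.1))  # F = 4.0 > 1.61 -> True
```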
4. CONCLUSIONS
Based on the results obtained and on the preceding discussion, it is possible to conclude that the LASSO method showed better results than nonparametric methods such as PCR and PLSR, owing to the size of the data set, as aforementioned. Another very important point about PCA-based inverse models is the need to repeat the NIR spectrometer calibration frequently, since with the passage of time and heavy use of the NIR spectrometer the internal calibration is lost and the NIR spectra are no longer reliable. When we decided to use the LASSO method as the multivariate modeling method for the quality determination of Biodiesel by NIR spectrometry, our intention was not only to evaluate the use of NIR spectroscopy applied to Biodiesel quality, but also to verify whether the LASSO method outperforms traditional inverse chemometrics methods such as PCR and PLSR. Although we have presented the main advantages of LASSO over PCA-based methods for spectroscopic datasets where the CN is large, the LASSO method can present severe discrepancies if the distribution of the regression errors shows a heavy tail for a finite sample. Sparse principal component analysis (SPCA) aims at estimating a PCA-like model in which sparsity is induced on the model parameters, scores and/or loadings. However, this is still an algorithm-based method, like PCA, without an exact solution. SPCA works reasonably well in certain types of spectroscopy, such as mass spectrometry, where the sparsity of the spectroscopic data matrix can in some cases be reduced experimentally, using spectrometers of high accuracy and precision as well as chemical standards of very high purity; this makes the method very expensive, accessible only in large research centres and out of reach of the vast majority of researchers worldwide. Therefore, SPCA goes in the opposite direction of the PCA-based methods, which were developed precisely to obtain calibration models in situations where high instrumental accuracy was not possible or important, as occurred in NIR spectroscopy in the early 80s. However, the purpose of this article is not to criticize the use of PCA- or SPCA-based methods [19], but to present alternatives with exact solutions as opposed to purely algorithmic methods.
The better results obtained with the LASSO method show clearly that PCA-based models have an important shortcoming for data sets with a tendency to sparsity, which is quite usual in the daily practice of NIR spectroscopy in reflectance mode. Moreover, LASSO is a very simple alternative to PCA-based models when the matrix describing the spectroscopic data set tends to become even sparser. Finally, it is possible to notice that parametric methods work better in the presence of outliers than nonparametric methods such as PCR and PLSR.
5. ACKNOWLEDGEMENTS
The authors thank FAPESP, CNPq, the University of Delaware, and the Institute for Integrated Clinical Research – IPclin.
REFERENCES
[1] Siesler, H. W., Ozaki, Y. and Kawata, S., (2001), "Near-Infrared Spectroscopy: Principles, Instruments, Applications", Wiley-VCH Verlag GmbH.
[2] Jolliffe, I. T., (2002), "Principal Component Analysis", 2nd edition, Springer.
[3] De Maesschalck, R., Estienne, F., Verdú-Andrés, J., Candol, J., Centner, V. A., Despagne, F., Jouan-Rimbaud, D., Walczak, B., Massart, D. L., de Jong, S., de Noord, O. E., Puel, C. and Vandeginste, B. M. G.
[4] Brereton, R., Jansen, J., Lopes, J., Marine, F., Pomerantse, A., Rodionova, O., Roger, M. J., Walczak, B. and Tauler, R., (2018), Anal. and Bioanal. Chem., Vol. 410(26), pp. 540-543.
[5] Naes, T., Isaksson, T. and Kowalski, B., (1990), Anal. Chem., Vol. 62(7), pp. 664-673.
[6] Mello, C., Ozório, E. and Kubota, L. T., (2000), Química Nova, Vol. 25, pp. 690-698.
[7] Ismail, A. R. and Asfour, S. S., (1999), Jour. Biomech., Vol. 32, pp. 317-322.
[8] Björck, Å., (1994), Linear Algebra and its Applications, Vol. 197-198, pp. 297-316.
[9] Tanabe, J., Miller, D., Tregellas, J., Freedman, R. and Mayer, F. G., (2002), NeuroImage, Vol. 15, pp. 902-907.
[10] Charpentier, S., Fouchet, K. and Zarouf, S., (2019), Analysis and Mathematical Physics, Vol. 9(3), pp. 971-990.
[11] Friedman, J., Hastie, T. and Tibshirani, R., (2008), Biostatistics, Vol. 9, pp. 432-441.
[12] Horowitz, J. L., (2019), Annual Review of Economics, Vol. 11, pp. 193-224.
[13] Knothe, G., Krahl, J. and Van Gerpen, J. (Eds.), (2010), "The Biodiesel Handbook", Academic Press and AOCS Press.
[14] Goldberger, A. S., (1964), "Classical Linear Regression", in Econometric Theory, New York: John Wiley & Sons, p. 158.
[15] Pencheva, T., Atanassov, K. and Shannon, A., (2009), Tenth Int. Workshop on Generalized Nets, Sofia, 5 December 2009, pp. 1-7.
[15] http://www.gnu.org/software/octave/doc/interpreter/, last accessed November 2019.
[16] Frank, M. and Wolfe, P., (1956), "An Algorithm for Quadratic Programming", Naval Research Logistics Quarterly, Vol. 3, pp. 95-110.
[17] Wülfert, F., Kok, W. T., de Noord, O. E. and Smilde, A. K., (2000), Chemom. Intell. Lab. Syst., Vol. 24(2), pp. 189-200.
[18] Box, G. E. P., (1953), "Non-Normality and Tests on Variances", Biometrika, Vol. 40(3/4), pp. 318-335.
[19] Gajjar, S., Kulahci, M. and Palazoglu, A., (2016), "Use of Sparse Principal Component Analysis (SPCA) for Fault Detection", IFAC, pp. 693-698, 11th IFAC Symposium on Dynamics and Control of Process Systems, including Biosystems, June 6-8, 2016, NTNU, Trondheim, Norway.