Suicide-related behaviours among psychiatric patients need to be prevented. Predicting these behaviours from patient medical records would be very useful for prevention efforts at a psychiatric hospital. This research developed such a prediction model at the only psychiatric hospital in Bali Province using the Smooth Support Vector Machine (SSVM) method, a further development of the Support Vector Machine. The method used 30,660 patient medical records from the last five years. Data cleaning yielded 2,665 records relevant to this research, including 111 patients with suicide-related behaviours who were under active treatment. The cleaned data were then transformed into ten predictor variables and one response variable, and split into training and testing sets for model building and accuracy evaluation. Based on the experiments, the best average accuracy of 63% was obtained by using 30% of the relevant data as testing data and by using training data with a one-to-one ratio between patients with suicide-related behaviours and patients without them. Future work should confirm whether accuracy can be improved with the Reduced Support Vector Machine method, a further development of SSVM.
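The abstract does not spell out what makes the SVM "smooth"; in the commonly used SSVM formulation (an assumption here, since the paper's exact variant is not stated), the non-differentiable plus function max(x, 0) in the SVM objective is replaced by a smooth approximation so that Newton-type optimization applies. A minimal sketch of that smoothing idea:

```python
import math

def plus(x):
    """The plus function (x)+ = max(x, 0) used in the SVM objective."""
    return max(x, 0.0)

def smooth_plus(x, alpha=10.0):
    """Smooth approximation p(x, alpha) = x + (1/alpha) * log(1 + exp(-alpha * x)).

    As alpha grows, p(x, alpha) converges to max(x, 0) while staying
    differentiable everywhere, which is what makes the SSVM objective
    solvable by Newton-type methods.
    """
    ax = alpha * x
    if ax < -30.0:
        # exp(-ax) would overflow; here log(1 + e^{-ax}) ~ -ax, so p ~ 0.
        return 0.0
    return x + (1.0 / alpha) * math.log1p(math.exp(-ax))
```

For large alpha, `smooth_plus` is numerically indistinguishable from `plus` away from zero, while remaining smooth at zero.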
MULTI-PARAMETER BASED PERFORMANCE EVALUATION OF CLASSIFICATION ALGORITHMS (ijcsit)
Diabetes is among the most common diseases in India. It affects patients' health and also leads to other chronic diseases. Prediction of diabetes plays a significant role in saving lives and cost. Predicting diabetes in the human body is a challenging task because it depends on several factors. A few studies have reported the performance of classification algorithms in terms of accuracy. The results in these studies are difficult for medical practitioners to understand, and they also lack visual aids because they are presented in pure text format. This survey uses ROC and PRC graphical measures to improve the understanding of results. A detailed parameter-wise comparison, lacking in other reported surveys, is also presented. Execution time, accuracy, TP rate, FP rate, precision, recall, and F-measure are used for comparative analysis, and a confusion matrix is prepared for a quick review of each algorithm. Ten-fold cross-validation is used for estimating the prediction model. Different sets of classification algorithms are analyzed on a diabetes dataset acquired from the UCI repository.
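The comparison parameters listed above (TP rate, FP rate, precision, recall, F-measure) all derive from the confusion matrix, and ten-fold cross-validation is just a fixed index-partitioning scheme. A minimal sketch of both:

```python
def metrics(tp, fp, fn, tn):
    """Derive the survey's comparison parameters from confusion-matrix counts."""
    tp_rate = tp / (tp + fn)          # also called recall or sensitivity
    fp_rate = fp / (fp + tn)
    precision = tp / (tp + fp)
    recall = tp_rate
    f_measure = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"tp_rate": tp_rate, "fp_rate": fp_rate, "precision": precision,
            "recall": recall, "f_measure": f_measure, "accuracy": accuracy}

def k_fold(n, k=10):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test
```

Each of the k folds serves once as the test set while the other k - 1 folds train the model; the reported estimate is the average over the k runs.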
Improved probabilistic distance based locality preserving projections method ... (IJECEIAES)
In this paper, dimensionality reduction of large datasets is achieved using the proposed distance-based Non-negative Matrix Factorization (NMF) technique, which is intended to solve the data dimensionality problem. Here, NMF and distance measurement aim to resolve the non-orthogonality problem caused by increased dataset dimensionality. The method initially partitions the datasets and organizes them into a defined geometric structure, while avoiding the need to capture the dataset structure through a distance-based similarity measurement. The proposed method is designed to fit dynamic datasets and includes the intrinsic structure using data geometry. The complexity of the data is further reduced using an Improved Distance-based Locality Preserving Projection. The proposed method is evaluated against existing methods in terms of accuracy, average accuracy, mutual information and average mutual information.
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CARE (ijistjournal)
This document discusses clustering dichotomous health care data using the K-means algorithm after transforming the data using Wiener transformation. It begins with an introduction to dichotomous data and the challenges of clustering medical data. It then describes the K-means clustering algorithm and various distance measures used for binary data clustering. The document proposes using Wiener transformation to first transform binary data to real values before applying K-means clustering. It evaluates the results on a lens dataset using inter-cluster and intra-cluster distances, finding the transformed data yields better clusters than the original binary data according to these metrics.
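The pipeline described (transform binary vectors to real values, cluster with K-means, then score by intra-cluster distance) can be sketched as follows. The smoothing transform below is a generic moving-average stand-in, not the paper's actual Wiener transformation:

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(cluster):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(cluster) for col in zip(*cluster)]

def smooth(binary_row):
    """Stand-in transform: 3-point moving average turning 0/1 values into reals.
    (The paper uses the Wiener transformation; this is only a placeholder.)"""
    out = []
    for i in range(len(binary_row)):
        window = binary_row[max(0, i - 1):i + 2]
        out.append(sum(window) / len(window))
    return out

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means: assign to nearest center, recompute centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centers[i]))
            clusters[nearest].append(p)
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

def intra_cluster(clusters, centers):
    """Total squared distance of points to their own center (lower = tighter)."""
    return sum(dist2(p, c) for cl, c in zip(clusters, centers) for p in cl)
```

The document's evaluation compares exactly this intra-cluster tightness (and the analogous inter-cluster separation) on the original binary data versus the transformed real-valued data.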
An Influence of Measurement Scale of Predictor Variable on Logistic Regressio... (IJECEIAES)
Much real-world decision making is based on binary categories of information: agree or disagree, accept or reject, succeed or fail, and so on. Information of this kind is the output of a classification method, which is the domain of both statistics (e.g., the logistic regression method) and machine learning (e.g., Learning Vector Quantization (LVQ)). The input of a classification method plays a crucial role in the resulting output. This paper investigated the influence of different types of input data measurement (interval, ratio, and nominal) on the performance of logistic regression and LVQ in classifying an object. Logistic regression modeling was done in several stages until a model that passed the model suitability test was obtained. LVQ modeling was tested on several codebook sizes, and the most optimal LVQ model was selected. The best model of each method was then compared on object classification performance using the hit-ratio indicator. For logistic regression, two models passed the model suitability test: those with interval-scale and nominal-scale predictor variables; for LVQ, three optimal models with different codebooks were obtained. On data with interval-scale predictor variables, the performance of the two methods is the same. Both models perform equally poorly when the predictor variables are on a nominal scale. On data with ratio-scale predictor variables, the LVQ method produces moderate performance, while no logistic regression model passed the model suitability test. Thus, if the input dataset has interval- or ratio-scale predictor variables, the LVQ method is preferable for modeling the object classification.
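The LVQ codebook training mentioned above can be sketched with the classic LVQ1 update rule: the winning prototype moves toward a training sample of the same class and away from a sample of a different class. This is a generic sketch, not the paper's exact configuration:

```python
def lvq1_train(samples, labels, codebook, codebook_labels, lr=0.3, epochs=20):
    """LVQ1: pull the nearest prototype toward same-class samples, push it away otherwise."""
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)      # linearly decaying learning rate
        for x, y in zip(samples, labels):
            # find the winning (nearest) prototype by squared distance
            w = min(range(len(codebook)),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])))
            sign = 1.0 if codebook_labels[w] == y else -1.0
            codebook[w] = [c + sign * rate * (a - c) for a, c in zip(x, codebook[w])]
    return codebook

def lvq_classify(x, codebook, codebook_labels):
    """Classify by the label of the nearest prototype."""
    w = min(range(len(codebook)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])))
    return codebook_labels[w]
```

The codebook size tested in the paper corresponds to the number of prototypes per class in this sketch.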
POSTERIOR RESOLUTION AND STRUCTURAL MODIFICATION FOR PARAMETER DETERMINATION ... (IJCI JOURNAL)
When only a few lower-mode data are available to evaluate a large number of unknown parameters, it is difficult to acquire information about all of them. The challenge in this kind of updating problem is first to gain confidence about the parameters that are evaluated correctly using the available data, and second to get information about the remaining parameters. In this work, the first issue is resolved by employing the sensitivity of the modal data used for updating. Once it is determined which parameters are evaluated satisfactorily using the available modal data, the remaining parameters are evaluated using the modal data of a virtual structure. This virtual structure is created by adding or removing some known stiffness to or from some of the stories of the original structure. A 12-story shear building is considered for the numerical illustration of the approach. Results of the study show that the present approach is an effective tool for system identification problems when only a few data are available for updating.
Multimodal authentication is one of the prime concepts in current real-world applications, and various approaches have been proposed for it. In this paper, an intuitive strategy is proposed as a framework for providing a more secure key in biometric security. First, features are extracted through PCA by SVD from the chosen biometric patterns; then key components are extracted using the LU factorization technique, selected with different key sizes, and combined using a convolution kernel method (Exponential Kronecker Product, eKP) as the Context-Sensitive Exponent Associative Memory model (CSEAM). The verification process is done in a similar way and checked with the MSE measure. This model would give a better outcome when compared with SVD factorization [1] as feature selection. The process is computed for different key sizes and the results are presented.
Accuracy, Sensitivity and Specificity Measurement of Various Classification T... (IOSR Journals)
This document compares the accuracy, sensitivity, and specificity of various classification techniques when applied to healthcare data on diabetes. It analyzes several algorithms implemented in Weka (Multilayer Perceptron, Bayes Network, J48graft, JRip) and other tools (PNN, LVQ, FFN, etc. in MATLAB and GINI in RapidMiner) on a diabetes dataset. The results show that J48graft had the highest accuracy at 81.33%, while PNN had the highest sensitivity at 63.33% and DTDN had the highest specificity at 88.8%, based on calculations using true/false positive/negative values. Therefore, different algorithms performed best for different evaluation metrics on this healthcare dataset.
This document describes a study that developed an Android application to predict and suggest measures for diabetes using data mining techniques. The study used the Pima Indian diabetes dataset to build a decision tree classification model using the C4.5 algorithm to predict whether a person is diabetic or not based on their attributes. The most significant attributes identified for prediction were plasma glucose level, body mass index, diabetes pedigree function, and insulin level. The developed Android app allows a user to enter their details, which are then run through the decision tree model to predict their diabetes status and provide suggested measures if predicted to be diabetic. The goal of the study was to create a mobile application to help individuals assess their risk of diabetes and maintain healthy habits.
25 16285 a hybrid ijeecs 012 edit septian (IAESIJEECS)
Feature selection aims to choose an optimal subset of features that is necessary and sufficient to improve the generalization performance and running efficiency of a learning algorithm. To obtain the optimal subset in the feature selection process, a hybrid feature selection method based on mutual information and a genetic algorithm is proposed in this paper. To make full use of the advantages of the filter and wrapper models, the algorithm is divided into two phases: a filter phase and a wrapper phase. In the filter phase, the algorithm first uses mutual information to rank the features, providing heuristic information for the subsequent genetic algorithm and accelerating its search. In the wrapper phase, the genetic algorithm is used as the search strategy, with classifier performance and subset dimension as the evaluation criteria, to search for the best subset of features. Experimental results on benchmark datasets show that the proposed algorithm achieves higher classification accuracy with a smaller feature dimension, and its running time is less than that of using the genetic algorithm alone.
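The filter phase's mutual-information ranking can be sketched for discrete features as follows (the GA wrapper phase is omitted here):

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X; Y) = sum over (x, y) of p(x, y) * log(p(x, y) / (p(x) p(y)))."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        mi += p_joint * math.log(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi

def rank_features(columns, labels):
    """Sort feature indices by decreasing mutual information with the labels."""
    scores = [mutual_information(col, labels) for col in columns]
    return sorted(range(len(columns)), key=lambda i: -scores[i])
```

Features with no statistical dependence on the labels score zero and sink to the bottom of the ranking, which is exactly the heuristic ordering the GA then exploits.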
Classification of medical datasets using back propagation neural network powe... (IJECEIAES)
Classification is one of the most indispensable domains in data mining and machine learning. The classification process has a good reputation in the area of disease diagnosis by computer systems, where progress in smart computer technologies can be invested in diagnosing various diseases based on data from real patients documented in databases. The paper introduces a methodology for diagnosing a set of diseases, including two types of cancer (breast cancer and lung cancer), two datasets for diabetes, and heart attack. A Back-Propagation Neural Network plays the role of the classifier. The performance of the neural net is enhanced by using a genetic algorithm, which provides the classifier with the optimal features to raise the classification rate as high as possible. The system showed high efficiency in dealing with databases that differ from each other in size, number of features and nature of the data, as the results illustrate: the classification rate reached 100% on most datasets.
Hypothesis on Different Data Mining Algorithms (IJERA Editor)
In this paper, different classification algorithms for data mining are discussed. Data mining is about explaining the past and predicting the future by means of data analysis. Classification is a data mining task that categorizes data based on numerical or categorical variables. Many algorithms have been proposed to classify data; out of them, five are comparatively studied here for data mining through classification. There are four different classification approaches, namely frequency table, covariance matrix, similarity functions and others. As research on classification methods, algorithms such as Naive Bayes, K-Nearest Neighbors, Decision Tree, Artificial Neural Network and Support Vector Machine are studied and examined using benchmark datasets such as Iris and Lung Cancer.
Discretization methods for Bayesian networks in the case of the earthquake (journalBEEI)
The document discusses different discretization methods that can be used to discretize continuous variables when applying Bayesian networks (BN) to earthquake damage data. It compares equal-width, equal-frequency, and K-means discretization. All three methods are applied to discretize continuous variables from earthquake damage data into three groups. A BN structure is constructed using the discretized data to determine building damage risk. The K-means method produced the highest accuracy based on a confusion matrix, indicating it is the best discretization method for this data.
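Two of the three discretization methods compared, equal-width and equal-frequency binning, can be sketched directly; K-means discretization would instead cluster the values and use the cluster index as the bin:

```python
def equal_width(values, k):
    """Split the value range into k bins of identical width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0          # guard against a constant column
    return [min(int((v - lo) / width), k - 1) for v in values]

def equal_frequency(values, k):
    """Assign roughly the same number of values to each bin, by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = min(rank * k // len(values), k - 1)
    return bins
```

The contrast matters for skewed data: equal-width bins can leave some bins nearly empty, while equal-frequency bins balance the counts at the cost of uneven bin widths, which is why the document compares all three on the same earthquake damage variables.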
This document proposes using truncated non-negative matrix factorization (NMF) with sparseness constraints for privacy-preserving data perturbation. NMF is used to distort individual data values while preserving statistical distributions. Experimental results on breast cancer and ionosphere datasets show that the method effectively conceals sensitive information while maintaining data mining performance after distortion, as measured by a k-nearest neighbors classifier's accuracy. The degree of data distortion and privacy can be controlled by varying the NMF rank, sparseness constraint, and truncation threshold.
This document summarizes several major data classification techniques, including decision tree induction, Bayesian classification, rule-based classification, classification by back propagation, support vector machines, lazy learners, genetic algorithms, rough set approach, and fuzzy set approach. It provides an overview of each technique, describing their basic concepts and key algorithms. The goal is to help readers understand different data classification methodologies and which may be suitable for various domain-specific problems.
Analysis On Classification Techniques In Mammographic Mass Data Set (IJERA Editor)
Data mining, the extraction of hidden information from large databases, is used to predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. Data mining classification techniques deal with determining which group each data instance is associated with, and they can handle a wide variety of data so that a large amount of data can be involved in processing. This paper presents an analysis of various data mining classification techniques, such as Decision Tree Induction, Naïve Bayes, and k-Nearest Neighbour (KNN) classifiers, on a mammographic mass dataset.
Textural Feature Extraction of Natural Objects for Image Classification (CSCJournals)
The field of digital image processing has been growing in scope in recent years. A digital image is represented as a two-dimensional array of pixels, where each pixel has intensity and location information. Analysis of digital images involves the extraction of meaningful information from them, based on certain requirements. Digital image analysis requires the extraction of features, which transforms data in a high-dimensional space into a space of fewer dimensions. Feature vectors are n-dimensional vectors of numerical features used to represent an object. We have used Haralick features to classify various images using different classification algorithms: Support Vector Machines (SVM), Logistic Classifier, Random Forests, Multi-Layer Perceptron and Naïve Bayes Classifier. We then used cross-validation to assess how well a classifier works on a generalized data set, as compared to the classifications obtained during training.
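Haralick features are statistics computed from a gray-level co-occurrence matrix (GLCM). A minimal sketch for a single offset (horizontally adjacent pixels) with two of the Haralick statistics; a full implementation would cover several offsets and the remaining statistics:

```python
from collections import Counter

def glcm(image):
    """Normalized co-occurrence probabilities for horizontally adjacent pixels."""
    counts = Counter()
    for row in image:
        for a, b in zip(row, row[1:]):
            counts[(a, b)] += 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}

def contrast(p):
    """Haralick contrast: sum over (i, j) of (i - j)^2 * p(i, j)."""
    return sum(((i - j) ** 2) * v for (i, j), v in p.items())

def energy(p):
    """Haralick energy (angular second moment): sum of p(i, j)^2."""
    return sum(v * v for v in p.values())
```

A uniform image has zero contrast and maximal energy; a checkerboard has high contrast and low energy, which is why these statistics discriminate textures.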
The document discusses optimizing the parameters of an artificial neural network (ANN) model for predicting schizophrenia. It explores different numbers of hidden layers, epochs, and fold units to determine the configuration that results in the highest training accuracy and lowest error. The optimal parameters were found to be 8 hidden layers, 2 fold units, and up to 1000 epochs, achieving a peak training accuracy of 71.34% and prediction accuracy of 74.53%.
A new model for iris data set classification based on linear support vector m... (IJECEIAES)
1. The authors propose a new model for classifying the iris data set using a linear support vector machine (SVM) classifier with genetic algorithm optimization of the SVM's C and gamma parameters.
2. Principal component analysis was used to reduce the iris data set features from four to three before classification.
3. The genetic algorithm was shown to optimize the SVM parameters, achieving 98.7% accuracy on the iris data set classification compared to 95.3% accuracy without parameter optimization.
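The GA parameter search over (C, gamma) can be sketched generically. The fitness function below is a hypothetical stand-in chosen only to make the sketch self-contained; in the paper, fitness would be the cross-validated SVM accuracy on the PCA-reduced iris data:

```python
import random

def ga_optimize(fitness, bounds, pop_size=20, generations=40, seed=1):
    """Tiny real-valued GA: truncation selection, blend crossover, Gaussian mutation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]                 # keep the fitter half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            w = rng.random()
            child = [w * x + (1 - w) * y for x, y in zip(a, b)]   # blend crossover
            child = [min(max(g + rng.gauss(0, 0.05 * (hi - lo)), lo), hi)
                     for g, (lo, hi) in zip(child, bounds)]       # mutate and clip
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Hypothetical fitness peaking at C = 10, gamma = 0.5; a real run would
# evaluate SVM cross-validation accuracy here instead.
def toy_fitness(params):
    c, g = params
    return -((c - 10.0) ** 2 + (g - 0.5) ** 2)
```

Because the parents survive each generation, the best (C, gamma) found is never lost, and the population drifts toward the fitness peak.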
K-Means Segmentation Method for Automatic Leaf Disease Detection (IJERA Editor)
Automatic detection of leaf disease is an essential research topic in agricultural research. It may prove beneficial for monitoring fields and for early detection of leaf diseases from the symptoms that appear on the leaves. Defect segmentation is carried out in two steps. This paper illustrates the K-means clustering method for segmentation of the diseased portion of a leaf. First, the pixels are clustered based on their color and spatial features. Then the clustered blocks are merged into a specific number of regions. This approach provides a feasible, robust solution for defect segmentation of leaves.
Preprocessing and Classification in WEKA Using Different Classifiers (IJERA Editor)
Data mining is the process of extracting information from a dataset and transforming it into an understandable structure for further use; it also discovers patterns in large data sets [1]. Data mining includes a number of important techniques, such as preprocessing and classification. Classification, a technique based on supervised learning, is used for predicting the group membership of a data instance. In this paper we apply preprocessing and classification to a diabetes database, apply classifiers to it, and compare the results based on certain parameters using WEKA. 77.2 million people in India are suffering from pre-diabetes, and the ICMR estimates that around 65.1 million are diabetes patients. Globally, in 2010, 227 to 285 million people had diabetes; about 90% of cases were related to type 2, equal to 3.3% of the population, with equal rates in women and men. In 2011, diabetes resulted in 1.4 million deaths worldwide, making it a leading cause of death.
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of Engineering and Technology.
Efficient Disease Classifier Using Data Mining Techniques: Refinement of Rand... (IOSR Journals)
The document describes research on improving the termination criteria for the random forest algorithm when classifying disease data. The random forest algorithm constructs a forest of decision trees to predict disease classification. Typically, random forest terminates tree construction based on accuracy and correlation metrics. The proposed method uses binomial distribution, multinomial distribution, and sequential probability ratio testing to determine when to terminate tree construction, with the goal of stopping earlier than typical random forest approaches. Experimental results on five disease datasets show the proposed method with probability distributions achieves better accuracy than classical random forest while constructing fewer trees, improving the efficiency of the algorithm.
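The sequential probability ratio test (SPRT) used above as a termination criterion can be sketched for Bernoulli observations: accumulate the log-likelihood ratio and stop as soon as it crosses either of Wald's thresholds. How the paper maps tree-accuracy observations onto H0/H1 is not detailed here, so this is only the generic test:

```python
import math

def sprt(observations, p0, p1, alpha=0.05, beta=0.05):
    """Wald's SPRT for Bernoulli data: decide H1 (p = p1) versus H0 (p = p0).

    Returns ("H0" | "H1" | "continue", number of samples consumed).
    """
    upper = math.log((1 - beta) / alpha)      # accept H1 at or above this
    lower = math.log(beta / (1 - alpha))      # accept H0 at or below this
    llr = 0.0
    for n, x in enumerate(observations, start=1):
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "continue", len(observations)
```

The appeal for random forest termination is exactly the early stopping visible here: a decisive stream of observations ends the test after only a handful of samples (trees), rather than a fixed large number.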
Mammogram image segmentation using rough clustering (eSAT Journals)
This document discusses using rough clustering algorithms for mammogram image segmentation. It proposes using Rough K-Means clustering on Haralick texture features extracted from mammogram images. The Rough K-Means algorithm is compared to traditional K-Means and Fuzzy C-Means using metrics like mean square error and root mean square error. Preliminary results found that Rough K-Means produced better segmentation results than the other methods. The document provides background on rough set theory, image segmentation, feature extraction, and different clustering algorithms that can be used.
The document compares the predictive performance of classification trees and logistic regression models in determining whether individuals are insured or not based on demographic and socioeconomic characteristics. It first describes the data and techniques used, including classification trees, cost complexity pruning, logistic regression, bagging, and random forests. It then describes the research method of using different portions of the data for model training and testing. The results show that logistic regression performs as well as classification trees on nonlinear data and better on linear data. Both methods select income as an important predictor, while logistic regression favors dummy variables and classification trees favor continuous variables. Random forests have the highest predictive accuracy overall.
This document presents a new algorithm called UDT-CDF for building decision trees to classify uncertain numerical data. It improves on previous algorithms like UDT that were based on probability density functions (PDFs). The key aspects of the new algorithm are:
1. It uses cumulative distribution functions (CDFs) rather than PDFs to represent uncertain numerical attributes, since CDFs provide more complete probability information.
2. It splits data at decision tree nodes based on the CDF, placing data with values covering the split point into both branches weighted by the CDF.
3. Experimental results show the new CDF-based algorithm achieves more accurate classifications and is more computationally efficient than the PDF-based UDT algorithm.
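The fractional split described in point 2 can be sketched for an uncertain value modeled as a Gaussian: the value is placed in both branches, weighted by its CDF mass on each side of the split point. The function names and the Gaussian model are illustrative, not the UDT-CDF implementation:

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def split_weights(mu, sigma, split):
    """Fraction of an uncertain value's probability mass sent to each branch."""
    left = normal_cdf(split, mu, sigma)
    return left, 1.0 - left
```

A value whose distribution straddles the split contributes partially to both subtrees, while a value far from the split contributes almost entirely to one side, matching the CDF-weighting idea in the algorithm.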
INFLUENCE OF DATA GEOMETRY IN RANDOM SUBSET FEATURE SELECTION (IJDKP)
The geometry of data, also known as its probability distribution, is an important consideration for accurate computation of data mining tasks such as pre-processing, classification and interpretation. The data geometry influences the outcome and accuracy of statistical analysis to a large extent. The current paper focuses on understanding the influence of data geometry in the feature subset selection process using the random forest algorithm. In practice, it is assumed that the data follow a normal distribution, and most of the time this may not be true. The dimensionality reduction varies due to changes in the distribution of the data. A comparison is made using three standard distributions: Triangular, Uniform and Normal. The results are discussed in this paper.
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM... (ijistjournal)
Classification is a widely used technique in the data mining domain, where scalability and efficiency are the immediate problems for classification algorithms on large databases. We suggest improvements to the existing C4.5 decision tree algorithm. In this paper, attribute-oriented induction (AOI) and relevance analysis are incorporated with concept-hierarchy knowledge and the HeightBalancePriority algorithm for construction of the decision tree, along with multi-level mining. Priorities are assigned to attributes by evaluating information entropy at different levels of abstraction while building the decision tree with the HeightBalancePriority algorithm. Modified DMQL queries are used to understand and explore the shortcomings of the decision trees generated by the C4.5 classifier on an education dataset, and the results are compared with the proposed approach.
This document summarizes and compares different classification algorithms that can be used for disease prediction in data mining. It first introduces disease prediction and classification processes. It then reviews related works that have used various classification algorithms like random forest, support vector machine, and naive Bayes for tasks like disease diagnosis, text classification, and rainfall forecasting. The document also discusses supervised, unsupervised, and semi-supervised machine learning. It provides details on support vector machine and random forest algorithms, describing how each works and is used for classification. Finally, it analyzes the random forest algorithm construction process.
A Comparative Analysis on the Evaluation of Classification Algorithms in the ... (IJECEIAES)
Data mining techniques are applied in many applications as a standard procedure for analyzing the large volume of available data, extracting useful information and knowledge to support major decision-making processes. Diabetes mellitus is a chronic, widespread, deadly syndrome occurring all around the world. It is characterized by hyperglycemia occurring due to abnormalities in insulin secretion, which in turn results in an irregular rise of glucose level. In recent years, the impact of diabetes mellitus has increased to a great extent, especially in developing countries like India, mainly due to irregularities in food habits and lifestyle. Thus, early diagnosis and classification of this deadly disease has become an active area of research in the last decade. Numerous clustering and classification techniques are available in the literature for visualizing temporal data to identify trends for controlling diabetes mellitus. This work presents an experimental study of several algorithms which classify diabetes mellitus data effectively. The existing algorithms are analyzed thoroughly to identify their advantages and limitations, and their performance is assessed to determine the best approach.
A Mathematical Programming Approach for Selection of Variables in Cluster Ana... (IJRES Journal)
The document presents a mathematical programming approach for selecting important variables in cluster analysis. It formulates a nonlinear binary model to minimize the distance between observations within clusters, using indicator variables to select important variables. The model is applied to a sample dataset of 30 observations across 5 variables, correctly identifying variables 3, 4 and 5 as most important for clustering the observations into two groups. The results are compared to an existing variable selection heuristic, with the mathematical programming approach achieving a 100% correct classification versus 97% for the other method.
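The idea of using indicator variables to pick the variable subset that minimizes within-cluster distance can be sketched by brute-force enumeration on a toy dataset. This is a hypothetical miniature of the paper's nonlinear binary model, which would be solved with a mathematical programming solver rather than enumeration:

```python
import math
from itertools import combinations

def within_cluster_distance(rows, clusters, vars_):
    """Sum of pairwise distances between same-cluster rows, over chosen variables."""
    total = 0.0
    for c in set(clusters):
        members = [r for r, k in zip(rows, clusters) if k == c]
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                total += math.dist([members[i][v] for v in vars_],
                                   [members[j][v] for v in vars_])
    return total

def best_subset(rows, clusters, n_vars, size):
    """Enumerate indicator choices (as index subsets) and keep the tightest one."""
    return min(combinations(range(n_vars), size),
               key=lambda vs: within_cluster_distance(rows, clusters, vs))

# Toy data: only variable 2 keeps the two known clusters tight.
rows = [(0.1, 0.9, 0.0), (0.8, 0.2, 0.1), (0.3, 0.5, 1.0), (0.7, 0.4, 0.9)]
clusters = [0, 0, 1, 1]
print(best_subset(rows, clusters, n_vars=3, size=1))
```

Enumeration explodes combinatorially, which is exactly why the paper formulates the selection as a mathematical program instead.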
25 16285 a hybrid ijeecs 012 edit septian (IAES IJEECS)
Feature selection aims to choose an optimal subset of features that is necessary and sufficient to improve the generalization performance and the running efficiency of the learning algorithm. To obtain the optimal subset in the feature selection process, a hybrid feature selection based on mutual information and a genetic algorithm is proposed in this paper. To make full use of the advantages of the filter and wrapper models, the algorithm is divided into two phases: the filter phase and the wrapper phase. In the filter phase, the algorithm first uses mutual information to rank the features, providing heuristic information for the subsequent genetic algorithm and accelerating its search process. In the wrapper phase, the genetic algorithm is used as the search strategy, with classifier performance and subset dimension as the evaluation criteria, to search for the best subset of features. Experimental results on benchmark datasets show that the proposed algorithm achieves higher classification accuracy with a smaller feature dimension, and its running time is less than that of the genetic algorithm alone.
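The filter phase, ranking features by their mutual information with the class label, can be sketched in plain Python (a toy illustration with made-up discrete features, not the paper's implementation):

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X; Y) in bits for two discrete sequences of equal length."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

# Toy labels: feature A copies the label, feature B is independent of it.
labels    = [0, 0, 1, 1, 0, 0, 1, 1]
feature_a = [0, 0, 1, 1, 0, 0, 1, 1]   # perfectly informative
feature_b = [0, 1, 0, 1, 0, 1, 0, 1]   # carries no label information

scores = {"A": mutual_information(feature_a, labels),
          "B": mutual_information(feature_b, labels)}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

The ranking from this filter step is what seeds the genetic algorithm's initial search in the wrapper phase.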
Classification of medical datasets using back propagation neural network powe... (IJECEIAES)
Classification is one of the most indispensable tasks in data mining and machine learning. The classification process has a good reputation in the area of disease diagnosis by computer systems, where progress in smart computer technologies can be invested in diagnosing various diseases based on data of real patients documented in databases. The paper introduces a methodology for diagnosing a set of diseases, including two types of cancer (breast and lung), two datasets for diabetes, and heart attack. A back-propagation neural network plays the role of classifier. The performance of the neural net is enhanced by using a genetic algorithm, which provides the classifier with the optimal features to raise the classification rate as high as possible. The system showed high efficiency in dealing with databases that differ from each other in size, number of features and nature of the data, as the results illustrate: the classification rate reached 100% on most datasets.
Hypothesis on Different Data Mining Algorithms (IJERA Editor)
In this paper, different classification algorithms for data mining are discussed. Data mining is about explaining the past and predicting the future by means of data analysis. Classification is a data mining task which categorizes data based on numerical or categorical variables. Many algorithms have been proposed to classify data; of these, five are comparatively studied here. There are four different classification approaches, namely Frequency Table, Covariance Matrix, Similarity Functions and Others. As research on classification methods, algorithms such as Naive Bayesian, K Nearest Neighbors, Decision Tree, Artificial Neural Network and Support Vector Machine are studied and examined using benchmark datasets like Iris and Lung Cancer.
Discretization methods for Bayesian networks in the case of the earthquake (journalBEEI)
The document discusses different discretization methods that can be used to discretize continuous variables when applying Bayesian networks (BN) to earthquake damage data. It compares equal-width, equal-frequency, and K-means discretization. All three methods are applied to discretize continuous variables from earthquake damage data into three groups. A BN structure is constructed using the discretized data to determine building damage risk. The K-means method produced the highest accuracy based on a confusion matrix, indicating it is the best discretization method for this data.
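The three discretization methods compared can be sketched for a one-dimensional variable (a minimal illustration; the paper applies them to earthquake damage data and feeds the results into a Bayesian network):

```python
import statistics

def equal_width(xs, k):
    """Bins of equal value range."""
    lo, hi = min(xs), max(xs)
    w = (hi - lo) / k
    return [min(int((x - lo) / w), k - 1) for x in xs]

def equal_frequency(xs, k):
    """Bins holding (roughly) equal numbers of points."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    bins = [0] * len(xs)
    for rank, i in enumerate(order):
        bins[i] = min(rank * k // len(xs), k - 1)
    return bins

def kmeans_1d(xs, k, iters=20):
    """Bins as 1-D k-means clusters, seeded at spread-out quantiles."""
    centers = [sorted(xs)[i * (len(xs) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for x in xs:
            groups[min(range(k), key=lambda j: abs(x - centers[j]))].append(x)
        centers = [statistics.mean(g) if g else c for g, c in zip(groups, centers)]
    return [min(range(k), key=lambda j: abs(x - centers[j])) for x in xs]

xs = [1.0, 1.2, 1.1, 5.0, 5.3, 9.8, 10.0, 10.2]
print(equal_width(xs, 3), equal_frequency(xs, 3), kmeans_1d(xs, 3))
```

Note how equal-frequency groups the outlier 9.8 with the 5s to balance bin counts, while equal-width and k-means follow the natural gaps, the kind of difference the confusion-matrix comparison in the paper quantifies.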
This document proposes using truncated non-negative matrix factorization (NMF) with sparseness constraints for privacy-preserving data perturbation. NMF is used to distort individual data values while preserving statistical distributions. Experimental results on breast cancer and ionosphere datasets show that the method effectively conceals sensitive information while maintaining data mining performance after distortion, as measured by a k-nearest neighbors classifier's accuracy. The degree of data distortion and privacy can be controlled by varying the NMF rank, sparseness constraint, and truncation threshold.
This document summarizes several major data classification techniques, including decision tree induction, Bayesian classification, rule-based classification, classification by back propagation, support vector machines, lazy learners, genetic algorithms, rough set approach, and fuzzy set approach. It provides an overview of each technique, describing their basic concepts and key algorithms. The goal is to help readers understand different data classification methodologies and which may be suitable for various domain-specific problems.
Analysis On Classification Techniques In Mammographic Mass Data Set (IJERA Editor)
Data mining, the extraction of hidden information from large databases, is used to predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. Data mining classification techniques deal with determining the group with which each data instance is associated. They can handle a wide variety of data, so that large amounts of data can be involved in processing. This paper presents an analysis of various data mining classification techniques, such as Decision Tree Induction, Naïve Bayes and k-Nearest Neighbour (KNN) classifiers, on a mammographic mass dataset.
Textural Feature Extraction of Natural Objects for Image Classification (CSCJournals)
The field of digital image processing has been growing in scope in recent years. A digital image is represented as a two-dimensional array of pixels, where each pixel carries intensity and location information. Analysis of digital images involves extraction of meaningful information from them, based on certain requirements. Digital image analysis requires the extraction of features, which transforms data from a high-dimensional space to a space of fewer dimensions. Feature vectors are n-dimensional vectors of numerical features used to represent an object. We have used Haralick features to classify various images using different classification algorithms such as Support Vector Machines (SVM), Logistic Classifier, Random Forests, Multi Layer Perceptron and the Naïve Bayes Classifier. We then used cross validation to assess how well each classifier works on a generalized data set, as compared to the classifications obtained during training.
The document discusses optimizing the parameters of an artificial neural network (ANN) model for predicting schizophrenia. It explores different numbers of hidden layers, epochs, and fold units to determine the configuration that results in the highest training accuracy and lowest error. The optimal parameters were found to be 8 hidden layers, 2 fold units, and up to 1000 epochs, achieving a peak training accuracy of 71.34% and prediction accuracy of 74.53%.
A new model for iris data set classification based on linear support vector m... (IJECEIAES)
1. The authors propose a new model for classifying the iris data set using a linear support vector machine (SVM) classifier with genetic algorithm optimization of the SVM's C and gamma parameters.
2. Principal component analysis was used to reduce the iris data set features from four to three before classification.
3. The genetic algorithm was shown to optimize the SVM parameters, achieving 98.7% accuracy on the iris data set classification compared to 95.3% accuracy without parameter optimization.
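The genetic-algorithm search over the SVM's C and gamma parameters can be sketched as follows. Since training an actual SVM is out of scope here, a hypothetical surrogate fitness function stands in for cross-validated accuracy; in the paper's setting, that function would train and score the SVM:

```python
import random

rng = random.Random(1)

def fitness(c, gamma):
    """Stand-in for cross-validated SVM accuracy (hypothetical surrogate,
    peaking near C=10, gamma=0.1); a real run would train an SVM here."""
    return 1.0 / (1.0 + (c - 10.0) ** 2 + (gamma - 0.1) ** 2)

def evolve(pop_size=20, generations=40):
    pop = [(rng.uniform(0.1, 100.0), rng.uniform(0.001, 1.0)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(*ind), reverse=True)
        parents = pop[: pop_size // 2]          # selection: keep the fittest half
        children = []
        while len(children) < pop_size - len(parents):
            (c1, g1), (c2, g2) = rng.sample(parents, 2)
            c = 0.5 * (c1 + c2) + rng.gauss(0.0, 1.0)    # crossover + mutation
            g = 0.5 * (g1 + g2) + rng.gauss(0.0, 0.01)
            children.append((max(c, 0.001), max(g, 1e-4)))
        pop = parents + children
    return max(pop, key=lambda ind: fitness(*ind))

best_c, best_gamma = evolve()
print(best_c, best_gamma)
```

The population drifts toward the fitness peak over the generations, which is the mechanism behind the accuracy gain the paper reports.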
K-Means Segmentation Method for Automatic Leaf Disease Detection (IJERA Editor)
Automatic detection of leaf disease is an essential research topic in agriculture. It can benefit field monitoring and the early detection of leaf diseases through the symptoms that appear on the leaves. Defect segmentation is carried out in two steps. This paper illustrates the K-means clustering method for segmentation of the diseased portion of a leaf. First, the pixels are clustered based on their color and spatial features. Then the clustered blocks are merged into a specific number of regions. This approach provides a feasible, robust solution for defect segmentation of leaves.
Preprocessing and Classification in WEKA Using Different Classifiers (IJERA Editor)
Data mining is a process of extracting information from a dataset and transforming it into an understandable structure for further use; it also discovers patterns in large data sets [1]. Data mining has a number of important techniques, such as preprocessing and classification. Classification is one such technique, based on supervised learning, used for predicting group membership of data instances. In this paper we apply preprocessing and classification to a diabetes database, apply classifiers to it, and compare the results against certain parameters using WEKA. 77.2 million people in India suffer from pre-diabetes, and the ICMR estimates that around 65.1 million are diabetes patients. Globally in 2010, 227 to 285 million people had diabetes, of which 90% of cases were type 2; this equals 3.3% of the population, with equal rates in women and men. In 2011 diabetes resulted in 1.4 million deaths worldwide, making it one of the leading causes of death.
Efficient Disease Classifier Using Data Mining Techniques: Refinement of Rand... (IOSR Journals)
The document describes research on improving the termination criteria for the random forest algorithm when classifying disease data. The random forest algorithm constructs a forest of decision trees to predict disease classification. Typically, random forest terminates tree construction based on accuracy and correlation metrics. The proposed method uses binomial distribution, multinomial distribution, and sequential probability ratio testing to determine when to terminate tree construction, with the goal of stopping earlier than typical random forest approaches. Experimental results on five disease datasets show the proposed method with probability distributions achieves better accuracy than classical random forest while constructing fewer trees, improving the efficiency of the algorithm.
Mammogram image segmentation using rough clustering (eSAT Journals)
This document discusses using rough clustering algorithms for mammogram image segmentation. It proposes using Rough K-Means clustering on Haralick texture features extracted from mammogram images. The Rough K-Means algorithm is compared to traditional K-Means and Fuzzy C-Means using metrics like mean square error and root mean square error. Preliminary results found that Rough K-Means produced better segmentation results than the other methods. The document provides background on rough set theory, image segmentation, feature extraction, and different clustering algorithms that can be used.
IRJET - Disease Prediction using Machine Learning (IRJET Journal)
This document discusses using machine learning techniques to predict diseases based on patient symptoms. Specifically, it proposes using Naive Bayes, k-nearest neighbors (KNN), and logistic regression algorithms on structured and unstructured hospital data to predict diseases like diabetes, malaria, jaundice, dengue, and tuberculosis. The system is intended to make disease prediction more accessible to end users by analyzing their symptoms without needing to visit a doctor. It aims to improve prediction accuracy by handling both structured and unstructured data using machine learning models.
Prediction model of algal blooms using logistic regression and confusion matrix (IJECEIAES)
Algal blooms data are collected and refined as experimental data for algal bloom prediction. The refined algal blooms dataset is analyzed by logistic regression, and statistical tests and regularization are performed to find the marine environmental factors affecting algal blooms. The predicted value of an algal bloom is obtained through logistic regression analysis using those marine environment factors. The actual and predicted values of the algal blooms dataset are applied to a confusion matrix. By improving the decision boundary of the existing logistic regression, the accuracy, sensitivity and precision of algal bloom prediction are improved. In this paper, the algal bloom prediction model is established by an ensemble method using logistic regression and a confusion matrix. Algal bloom prediction is improved, and this is verified through big data analysis.
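The confusion-matrix metrics and the effect of moving the decision boundary can be sketched with made-up prediction probabilities (hypothetical values, not the paper's algal bloom data):

```python
def confusion_matrix(actual, predicted):
    """Return (TP, TN, FP, FN) counts for binary labels."""
    tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
    tn = sum(a == 0 and p == 0 for a, p in zip(actual, predicted))
    fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))
    fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))
    return tp, tn, fp, fn

def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return accuracy, sensitivity, precision

# Hypothetical logistic-regression probabilities and true bloom labels.
probs  = [0.9, 0.8, 0.65, 0.4, 0.3, 0.2, 0.55, 0.1]
actual = [1,   1,   1,    0,   0,   0,   0,    1  ]

for threshold in (0.5, 0.6):   # moving the decision boundary trades metrics off
    predicted = [int(p >= threshold) for p in probs]
    print(threshold, metrics(*confusion_matrix(actual, predicted)))
```

Raising the threshold here removes a false positive at no cost in sensitivity, the kind of boundary adjustment the paper uses to improve its metrics.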
Regression, theil’s and mlp forecasting models of stock index (iaemedu)
This document compares different forecasting models for daily stock prices: linear regression, Theil's incomplete method, and multilayer perceptron (MLP). Principal component analysis was used to reduce the input variables to a single component. Linear regression and Theil's method had similar error rates that were lower than MLP based on MAE, MAPE, and SMAPE. The linear regression and Theil's method models had R-squared values near 1, indicating close fit to the data. Overall, the linear and Theil's models provided more accurate short-term forecasts of daily stock prices than the MLP based on error and fit metrics.
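The error metrics used in the comparison, MAE, MAPE and SMAPE, are straightforward to compute (a minimal sketch with made-up price values, using the common definitions; SMAPE in particular has several variants in the literature):

```python
def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

def smape(actual, forecast):
    """Symmetric MAPE, in percent (one common variant)."""
    return 100.0 * sum(2.0 * abs(a - f) / (abs(a) + abs(f))
                       for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical closing prices and one model's forecasts.
actual   = [100.0, 102.0, 101.0, 105.0]
forecast = [ 99.0, 103.0, 102.0, 104.0]
print(mae(actual, forecast), mape(actual, forecast), smape(actual, forecast))
```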
SVM-PSO based Feature Selection for Improving Medical Diagnosis Reliability u... (cscpconf)
Improving the accuracy of supervised classification algorithms in biomedical applications, especially CADx, is an active area of research. This paper proposes the construction of a rotation forest (RF) ensemble using 20 learners over two clinical datasets, namely lymphography and backache. We propose a new feature selection strategy based on support vector machines optimized by particle swarm optimization, yielding a relevant and minimal feature subset for obtaining higher ensemble accuracy. We quantitatively analyzed the 20 base learners over the two datasets, carried out the experiments with a 10-fold cross-validation leave-one-out strategy, and evaluated the performance of the 20 classifiers using the metrics accuracy (acc), kappa value (K), root mean square error (RMSE) and area under the receiver operating characteristic curve (ROC). The base classifiers achieved average accuracies of 79.96% and 81.71% for the lymphography and backache datasets respectively, while the RF ensembles produced average accuracies of 83.72% and 85.77% for the respective diseases. The paper presents promising results using RF ensembles and provides a new direction towards the construction of reliable and robust medical diagnosis systems.
The healthcare industry is considered one of the largest industries in the world, and like the medical industry it holds a very large amount of health-related and medical data. This data helps discover useful trends and patterns that can be used in diagnosis and decision making. Clustering techniques like K-means, D-Stream, COBWEB and EM have been used for healthcare purposes such as heart disease diagnosis and cancer detection. This paper focuses on the use of the K-means and D-Stream algorithms in healthcare. These algorithms were used to determine whether a person is fit or unfit, with the fitness decision based on his or her historical and current data. Both clustering algorithms were analyzed by applying them to patients' current and historical biomedical databases; the analysis depends on attributes such as peripheral blood oxygenation, diastolic arterial blood pressure, systolic arterial blood pressure, heart rate, heredity, obesity and cigarette smoking. The analysis found that the density-based clustering algorithm, i.e. the D-Stream algorithm, gives more accurate results than K-means when used for cluster formation on historical biomedical data, overcoming drawbacks of the K-means algorithm.
Iganfis Data Mining Approach for Forecasting Cancer Threats (ijsrd.com)
This document proposes a new approach called IGANFIS that combines information gain (IG) and adaptive neuro fuzzy inference system (ANFIS) to classify cancer threats from medical data. IG is used to select the most important cancer features from the data to reduce dimensionality before being input to ANFIS. ANFIS then trains on the selected features to build a fuzzy inference system for cancer diagnosis and prediction. The approach is tested on breast cancer datasets and achieves higher classification accuracy compared to other methods.
Cerebral infarction classification using multiple support vector machine with... (journalBEEI)
Stroke ranks as the third leading cause of death in the world after heart disease and cancer. It also occupies the first position among diseases that cause both mild and severe disability. The most common type of stroke is cerebral infarction, which increases every year in Indonesia. This disease does not only occur in the elderly, but also in young and productive people, which makes early detection very important. Although there are various medical methods used to classify cerebral infarction, this study uses a multiple support vector machine with information gain feature selection (MSVM-IG). MSVM-IG is a modification combining IG feature selection and SVM, where the SVM is applied twice in the classification process, utilizing the support vectors as a new dataset. The data were obtained from Cipto Mangunkusumo Hospital, Jakarta. Based on the results, the proposed method was able to achieve an accuracy of 81%; therefore, this method can be considered for better classification results.
A BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATA (IJSCAI Journal)
Advancement in information technology has made a major impact on medical science, where researchers come up with new ideas for improving the classification rate of various diseases. Breast cancer is one such disease, killing a large number of people around the world. Diagnosing the disease at its earliest instance makes a huge impact on its treatment. The authors propose a Binary Bat Algorithm (BBA) based Feedforward Neural Network (FNN) hybrid model, where the advantages of BBA and the efficiency of FNN are exploited for the classification of three benchmark breast cancer datasets into malignant and benign cases. Here BBA is used to generate a V-shaped hyperbolic tangent function for training the network, and a fitness function is used for error minimization. FNNBBA-based classification produces 92.61% accuracy on training data and 89.95% on testing data.
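The V-shaped hyperbolic tangent transfer function mentioned above maps a bat's real-valued velocity to a bit-flip probability. Below is a minimal sketch of that binarization step (one common V-shaped update rule; the paper's exact update may differ):

```python
import math
import random

def v_transfer(x):
    """V-shaped transfer: maps a real-valued velocity to a bit-flip probability."""
    return abs(math.tanh(x))

def binarize(position_bit, velocity, rng):
    """Flip the bit with probability V(velocity), as in V-shaped binary updates."""
    return 1 - position_bit if rng.random() < v_transfer(velocity) else position_bit

rng = random.Random(0)
# Zero velocity never flips the bit; large |velocity| makes a flip near-certain.
print(v_transfer(0.0), v_transfer(100.0))
print(binarize(0, 100.0, rng))
```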
A Novel Efficient Medical Image Segmentation Methodology (aciijournal)
Image segmentation plays a crucial role in many medical applications. The threshold-based approach is the most common and effective method for medical image segmentation, but it has some shortcomings, such as high complexity, poor real-time capability and premature convergence. To address these issues, an improved evolution strategy is proposed for medical image segmentation. Two populations evolve concurrently: one focuses on local search, in order to find solutions near the optimal solution, while the other, implemented based on chaotic theory, focuses on global search so as to keep the variety of individuals and jump out of local maxima, overcoming the problem of premature convergence. The encoding scheme, fitness function and evolution operators are also designed. The experimental results validated the effectiveness and efficiency of the proposed approach.
Influence over the Dimensionality Reduction and Clustering for Air Quality Me... (IJAEMSJORNAL)
The current trend in the industry is to analyze large data sets and apply data mining and machine learning techniques to identify patterns. But the challenge with huge data sets is the high dimensionality associated with them. Sometimes in data analytics applications, large amounts of data produce worse performance. Also, most data mining algorithms are implemented column-wise, and too many columns restrict performance and make them slower. Therefore, dimensionality reduction is an important step in data analysis. Dimensionality reduction is a technique that converts high-dimensional data into a much lower dimension, such that maximum variance is explained within the first few dimensions. This paper focuses on multivariate statistical and artificial neural network techniques for data reduction. Each method has a different rationale for preserving the relationship between input parameters during analysis. Principal Component Analysis, a multivariate technique, and the Self-Organising Map, a neural network technique, are presented in this paper. Also, a hierarchical clustering approach has been applied to the reduced data set. A case study of air quality measurement has been considered to evaluate the performance of the proposed techniques.
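The reduce-then-cluster pipeline described above can be sketched as follows; this is a minimal illustration with random data standing in for the air-quality measurements, and the 90% variance threshold and cluster count are assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))  # 200 samples, 50 "sensor" columns (synthetic)

# Keep enough principal components to explain 90% of the variance
pca = PCA(n_components=0.90).fit(X)
X_low = pca.transform(X)
print(f"{X.shape[1]} dimensions reduced to {X_low.shape[1]}")

# Hierarchical (agglomerative) clustering on the reduced data
labels = AgglomerativeClustering(n_clusters=3).fit_predict(X_low)
```

Passing a float to `n_components` makes scikit-learn pick the smallest number of components whose cumulative explained variance reaches that fraction, which matches the "maximum variance in the first few dimensions" rationale.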
Effect of Feature Selection on Gene Expression Datasets Classification Accura... (IJECEIAES)
Feature selection attracts researchers who deal with machine learning and data mining. It consists of selecting the variables that have the greatest impact on the dataset classification and discarding the rest. This dimensionality reduction allows classifiers to be faster and more accurate. This paper examines the effect of feature selection on the accuracy of classifiers widely used in the literature. The classifiers are compared on three real datasets pre-processed with feature selection methods. More than 9% improvement in classification accuracy is observed, and k-means appears to be the classifier most sensitive to feature selection.
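A minimal version of this kind of experiment, the same classifier scored with and without feature selection, might look like the following; the dataset, classifier, and `k=5` are stand-ins, not those of the paper.

```python
from sklearn.datasets import load_wine
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

X, y = load_wine(return_X_y=True)

# Baseline: the classifier cross-validated on all features
acc_all = cross_val_score(KNeighborsClassifier(), X, y, cv=5).mean()

# Same classifier after keeping the top 5 features by ANOVA F-score
pipe = make_pipeline(SelectKBest(f_classif, k=5), KNeighborsClassifier())
acc_sel = cross_val_score(pipe, X, y, cv=5).mean()
print(f"all features: {acc_all:.3f}, top-5 features: {acc_sel:.3f}")
```

Putting the selector inside the pipeline matters: it refits the selection on each training fold, so the accuracy comparison is not contaminated by test data.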
This document summarizes a research paper that proposes a new inventory prediction method for supply chain management called BP-GA chaos prediction algorithm. The method uses a backpropagation neural network combined with a genetic algorithm to forecast inventory levels based on chaotic time series analysis. It aims to overcome limitations of traditional chaos prediction approaches. The paper reviews other inventory forecasting research and chaotic prediction methods. It then describes the new hybrid BP-GA method in detail, which establishes a chaotic neural network model optimized through a genetic algorithm. An experiment applying this method to inventory prediction is said to achieve good results.
Comparative analysis of multimodal medical image fusion using pca and wavelet... (IJLT EMAS)
Nowadays, there are a lot of medical images, and their numbers are increasing day by day. These medical images are stored in large databases. To minimize redundancy and optimize the storage capacity of images, medical image fusion is used. The main aim of medical image fusion is to combine complementary information from multiple imaging modalities (e.g., CT, MRI, PET) of the same scene. After performing image fusion, the resultant image is more informative and suitable for patient diagnosis. This paper describes several fusion techniques for obtaining a fused image, presenting two approaches: Spatial Fusion and Transform Fusion. It covers Principal Component Analysis, a spatial-domain technique, and the Discrete Wavelet Transform and Stationary Wavelet Transform, which are transform-domain techniques. Performance metrics are implemented to evaluate the image fusion algorithms. Experimental results show that the fusion method based on the Stationary Wavelet Transform outperforms Principal Component Analysis and the Discrete Wavelet Transform.
The document summarizes research on medical image segmentation algorithms. It discusses k-means clustering, fuzzy c-means clustering, and proposes enhancements to these algorithms. Specifically, it introduces an enhanced k-means algorithm that improves initial cluster center selection. It also presents a kernelized fuzzy c-means approach that maps data points into a feature space to perform clustering. The algorithms are tested on MRI brain images and evaluated based on segmentation accuracy. The enhanced methods aim to produce more precise segmentations for medical applications such as diagnosis and treatment planning.
Chronic kidney disease prediction is one of the most important issues in healthcare analytics, and prediction in the medical field is among the most interesting and challenging tasks. In this paper, we employ machine learning techniques for predicting chronic kidney disease using clinical data, with algorithms such as the decision tree (DT) and naive Bayes (NB). The performance of these models is compared in order to select the best classifier for predicting chronic kidney disease on the given dataset.
Similar to Smooth Support Vector Machine for Suicide-Related Behaviours Prediction
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw... (IJECEIAES)
Medical image analysis has witnessed significant advancements with deep learning techniques. In the domain of brain tumor segmentation, the ability to precisely delineate tumor boundaries from magnetic resonance imaging (MRI) scans holds profound implications for diagnosis. This study presents an ensemble convolutional neural network (CNN) with transfer learning, integrating the state-of-the-art Deeplabv3+ architecture with the ResNet18 backbone. The model is rigorously trained and evaluated, exhibiting remarkable performance metrics, including an impressive global accuracy of 99.286%, a high class accuracy of 82.191%, a mean intersection over union (IoU) of 79.900%, a weighted IoU of 98.620%, and a Boundary F1 (BF) score of 83.303%. Notably, a detailed comparative analysis with existing methods showcases the superiority of our proposed model. These findings underscore the model's competence in precise brain tumor localization, underscoring its potential to revolutionize medical image analysis and enhance healthcare outcomes. This research paves the way for future exploration and optimization of advanced CNN models in medical imaging, emphasizing addressing false positives and resource efficiency.
Embedded machine learning-based road conditions and driving behavior monitoring (IJECEIAES)
Car accident rates have increased in recent years, resulting in losses in human lives, properties, and other financial costs. An embedded machine learning-based system is developed to address this critical issue. The system can monitor road conditions, detect driving patterns, and identify aggressive driving behaviors. The system is based on neural networks trained on a comprehensive dataset of driving events, driving styles, and road conditions. The system effectively detects potential risks and helps mitigate the frequency and impact of accidents. The primary goal is to ensure the safety of drivers and vehicles. Collecting data involved gathering information on three key road events: normal street and normal drive, speed bumps, circular yellow speed bumps, and three aggressive driving actions: sudden start, sudden stop, and sudden entry. The gathered data is processed and analyzed using a machine learning system designed for limited power and memory devices. The developed system resulted in 91.9% accuracy, 93.6% precision, and 92% recall. The achieved inference time on an Arduino Nano 33 BLE Sense with a 32-bit CPU running at 64 MHz is 34 ms and requires 2.6 kB peak RAM and 139.9 kB program flash memory, making it suitable for resource-constrained embedded systems.
Advanced control scheme of doubly fed induction generator for wind turbine us... (IJECEIAES)
This paper describes a speed control device for generating electrical energy on an electricity network based on the doubly fed induction generator (DFIG) used for wind power conversion systems. At first, a double-fed induction generator model was constructed. A control law is formulated to govern the flow of energy between the stator of a DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC) and second order sliding mode controller (SOSMC). Their different results in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations are compared. MATLAB/Simulink was used to conduct the simulations for the preceding study. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
Neural network optimizer of proportional-integral-differential controller par... (IJECEIAES)
Wide application of proportional-integral-differential (PID)-regulator in industry requires constant improvement of methods of its parameters adjustment. The paper deals with the issues of optimization of PID-regulator parameters with the use of neural network technology methods. A methodology for choosing the architecture (structure) of neural network optimizer is proposed, which consists in determining the number of layers, the number of neurons in each layer, as well as the form and type of activation function. Algorithms of neural network training based on the application of the method of minimizing the mismatch between the regulated value and the target value are developed. The method of back propagation of gradients is proposed to select the optimal training rate of neurons of the neural network. The neural network optimizer, which is a superstructure of the linear PID controller, allows increasing the regulation accuracy from 0.23 to 0.09, thus reducing the power consumption from 65% to 53%. The results of the conducted experiments allow us to conclude that the created neural superstructure may well become a prototype of an automatic voltage regulator (AVR)-type industrial controller for tuning the parameters of the PID controller.
An improved modulation technique suitable for a three level flying capacitor ... (IJECEIAES)
This research paper introduces an innovative modulation technique for controlling a 3-level flying capacitor multilevel inverter (FCMLI), aiming to streamline the modulation process in contrast to conventional methods. The proposed simplified modulation technique paves the way for more straightforward and efficient control of multilevel inverters, enabling their widespread adoption and integration into modern power electronic systems. Through the amalgamation of sinusoidal pulse width modulation (SPWM) with a high-frequency square wave pulse, this controlling technique attains energy equilibrium across the coupling capacitor. The modulation scheme incorporates a simplified switching pattern and a decreased count of voltage references, thereby simplifying the control algorithm.
A review on features and methods of potential fishing zone (IJECEIAES)
This review focuses on the importance of identifying potential fishing zones in seawater for sustainable fishing practices. It examines features such as sea surface temperature (SST) and sea surface height (SSH), along with the classification methods used to classify the data. The study underscores the importance of examining potential fishing zones using advanced analytical techniques and thoroughly explores the methodologies employed by researchers, covering both past and current approaches, with the examination centering on data characteristics and the application of classification algorithms. The prediction of potential fishing zones relies significantly on the effectiveness of these algorithms. Previous research has assessed the performance of models such as support vector machines (SVM), naive Bayes, and artificial neural networks (ANN); in one reported result, SVM classified fisheries test data with 97.6% accuracy compared with 94.2% for naive Bayes. Based on recent works in this area, several recommendations for future work are presented to further improve the performance of potential fishing zone models, which is important to the fisheries community.
Electrical signal interference minimization using appropriate core material f... (IJECEIAES)
As demand for smaller, quicker, and more powerful devices rises, Moore's law is strictly followed. The industry has worked hard to make small devices that boost productivity, with the goal of optimizing device density. Scientists are reducing connection delays to improve circuit performance, which led them to three-dimensional integrated circuit (3D IC) concepts that stack active devices and create vertical connections to diminish latency and shorten interconnects. Electrical coupling is a major concern with 3D integrated circuits. Researchers have developed and tested through-silicon vias (TSVs) and substrates to decrease electrical wave coupling. This study illustrates a novel noise coupling reduction method using several electrical coupling models. A 22% drop in coupling from wave-carrying to victim TSVs introduces this new paradigm and improves system performance even at higher THz frequencies.
Electric vehicle and photovoltaic advanced roles in enhancing the financial p... (IJECEIAES)
Climate change's impact on the planet forced the United Nations and governments to promote green energies and electric transportation. The deployments of photovoltaic (PV) and electric vehicle (EV) systems gained stronger momentum due to their numerous advantages over fossil fuel types. The advantages go beyond sustainability to reach financial support and stability. The work in this paper introduces a hybrid system between PV and EV to support industrial and commercial plants. This paper covers the theoretical framework of the proposed hybrid system, including the equations required to complete the cost analysis when PV and EV are present. In addition, the proposed design diagram, which sets the priorities and requirements of the system, is presented. The proposed approach allows setups to advance their power stability, especially during power outages. The presented information supports researchers and plant owners in completing the necessary analysis while promoting the deployment of clean energy. The result of a case study representing a dairy milk farm supports the theoretical work and highlights the advanced benefits to existing plants. The short return on investment of the proposed approach supports the paper's novel approach to a sustainable electrical system. In addition, the proposed system allows for an isolated power setup without the need for a transmission line, which enhances the safety of the electrical network.
Bibliometric analysis highlighting the role of women in addressing climate ch... (IJECEIAES)
Fossil fuel consumption increased quickly, contributing to climate change that is evident in unusual flooding, droughts, and global warming. Over the past ten years, women's involvement in society has grown dramatically, and they have succeeded in playing a noticeable role in reducing climate change. A bibliometric analysis of data from the last ten years has been carried out to examine the role of women in addressing climate change. The findings are discussed in relation to the sustainable development goals (SDGs), particularly SDG 7 and SDG 13. The results consider contributions made by women in various sectors while taking geographic dispersion into account. The bibliometric analysis delves into topics including women's leadership in environmental groups, their involvement in policymaking, their contributions to sustainable development projects, and the influence of gender diversity on attempts to mitigate climate change. This study's results highlight how women have influenced policies and actions related to climate change, point out areas of research deficiency, and offer recommendations on how to increase the role of women in addressing climate change and achieving sustainability. To achieve more successful results, this initiative aims to highlight the significance of gender equality and encourage inclusivity in climate change decision-making processes.
Voltage and frequency control of microgrid in presence of micro-turbine inter... (IJECEIAES)
The active and reactive load changes have a significant impact on voltage and frequency. In this paper, in order to stabilize the microgrid (MG) against load variations in islanding mode, the active and reactive power of all distributed generators (DGs), including energy storage (battery), diesel generator, and micro-turbine, are controlled. The micro-turbine generator is connected to the MG through a three-phase to three-phase matrix converter, and the droop control method is applied for controlling the voltage and frequency of the MG. In addition, a method is introduced for voltage and frequency control of micro-turbines in the transition from grid-connected mode to islanding mode. A novel switching strategy of the matrix converter is used for converting the high-frequency output voltage of the micro-turbine to the grid-side frequency of the utility system. Moreover, using the switching strategy, low-order harmonics in the output current and voltage are not produced, and consequently the size of the output filter is reduced. In fact, the suggested control strategy is load-independent and has no frequency conversion restrictions. The proposed approach for voltage and frequency regulation demonstrates exceptional performance and favorable response across various load alteration scenarios. The suggested strategy is examined in several scenarios in the MG test systems, and the simulation results are addressed.
Enhancing battery system identification: nonlinear autoregressive modeling fo... (IJECEIAES)
Precisely characterizing Li-ion batteries is essential for optimizing their performance, enhancing safety, and prolonging their lifespan across various applications, such as electric vehicles and renewable energy systems. This article introduces an innovative nonlinear methodology for system identification of a Li-ion battery, employing a nonlinear autoregressive with exogenous inputs (NARX) model. The proposed approach integrates the benefits of nonlinear modeling with the adaptability of the NARX structure, facilitating a more comprehensive representation of the intricate electrochemical processes within the battery. Experimental data collected from a Li-ion battery operating under diverse scenarios are employed to validate the effectiveness of the proposed methodology. The identified NARX model exhibits superior accuracy in predicting the battery's behavior compared to traditional linear models. This study underscores the importance of accounting for nonlinearities in battery modeling, providing insights into the intricate relationships between state-of-charge, voltage, and current under dynamic conditions.
Smart grid deployment: from a bibliometric analysis to a survey (IJECEIAES)
Smart grids are one of the last decades' innovations in electrical energy. They bring relevant advantages compared to the traditional grid and attract significant interest from the research community. Assessing the field's evolution is essential to propose guidelines for facing new and future smart grid challenges. In addition, knowing the main technologies involved in the deployment of smart grids (SGs) is important to highlight possible shortcomings that can be mitigated by developing new tools. This paper contributes to the research trends mentioned above by focusing on two objectives. First, a bibliometric analysis is presented to give an overview of the current research level on smart grid deployment. Second, a survey of the main technological approaches used for smart grid implementation and their contributions is presented. To that effect, we searched the Web of Science (WoS) and Scopus databases. We obtained 5,663 documents from WoS and 7,215 from Scopus on smart grid implementation or deployment. Given the extraction limitation in the Scopus database, 5,872 of the 7,215 documents were extracted using a multi-step process. These two datasets have been analyzed using a bibliometric tool called bibliometrix. The main outputs are presented with some recommendations for future research.
Use of analytical hierarchy process for selecting and prioritizing islanding ... (IJECEIAES)
One of the problems associated with power systems is the islanding condition, which must be rapidly and properly detected to prevent any negative consequences for the system's protection, stability, and security. This paper offers a thorough overview of several islanding detection strategies, which are divided into two categories: classic approaches, including local and remote approaches, and modern techniques, including techniques based on signal processing and computational intelligence. Additionally, each approach is compared and assessed based on several factors, including implementation costs, non-detected zones, declining power quality, and response times, using the analytical hierarchy process (AHP). When all criteria are compared together, the multi-criteria decision-making analysis yields overall weights of 24.7% for passive methods, 7.8% for active methods, 5.6% for hybrid methods, 14.5% for remote methods, 26.6% for signal processing-based methods, and 20.8% for computational intelligence-based methods. Thus, it can be seen from the total weights that hybrid approaches are the least suitable choice, while signal processing-based methods are the most appropriate islanding detection method to select and implement in power systems with respect to the aforementioned factors. The proposed hierarchy model is studied and examined using Expert Choice software.
Enhancing of single-stage grid-connected photovoltaic system using fuzzy logi... (IJECEIAES)
The power generated by photovoltaic (PV) systems is influenced by environmental factors. This variability hampers the control and utilization of solar cells' peak output. In this study, a single-stage grid-connected PV system is designed to enhance power quality. Our approach employs fuzzy logic in the direct power control (DPC) of a three-phase voltage source inverter (VSI), enabling seamless integration of the PV connected to the grid. Additionally, a fuzzy logic-based maximum power point tracking (MPPT) controller is adopted, which outperforms traditional methods like incremental conductance (INC) in enhancing solar cell efficiency and minimizing the response time. Moreover, the inverter's real-time active and reactive power is directly managed to achieve a unity power factor (UPF). The system's performance is assessed through MATLAB/Simulink implementation, showing marked improvement over conventional methods, particularly in steady-state and varying weather conditions. For solar irradiances of 500 and 1,000 W/m², the results show that the proposed method reduces the total harmonic distortion (THD) of the current injected into the grid by approximately 46% and 38%, respectively, compared to conventional methods. Furthermore, we compare the simulation results with IEEE standards to evaluate the system's grid compatibility.
Enhancing photovoltaic system maximum power point tracking with fuzzy logic-b... (IJECEIAES)
Photovoltaic systems have emerged as a promising energy resource that caters to the future needs of society, owing to their renewable, inexhaustible, and cost-free nature. The power output of these systems relies on solar cell radiation and temperature. In order to mitigate the dependence on atmospheric conditions and enhance power tracking, a conventional approach has been improved by integrating various methods. To optimize the generation of electricity from solar systems, the maximum power point tracking (MPPT) technique is employed. To overcome limitations such as steady-state voltage oscillations and improve transient response, two traditional MPPT methods, namely the fuzzy logic controller (FLC) and perturb and observe (P&O), have been modified. This research paper aims to simulate and validate the step size of the proposed modified P&O and FLC techniques within the MPPT algorithm using MATLAB/Simulink for efficient power tracking in photovoltaic systems.
Adaptive synchronous sliding control for a robot manipulator based on neural ... (IJECEIAES)
Robot manipulators have become important equipment in production lines, medical fields, and transportation. Improving the quality of trajectory tracking for robot hands is always an attractive topic in the research community. This is a challenging problem because robot manipulators are complex nonlinear systems and are often subject to fluctuations in loads and external disturbances. This article proposes an adaptive synchronous sliding control scheme to improve trajectory tracking performance for a robot manipulator. The proposed controller ensures that the positions of the joints track the desired trajectory, synchronizes the errors, and significantly reduces chattering. First, the synchronous tracking errors and synchronous sliding surfaces are presented. Second, the synchronous tracking error dynamics are determined. Third, a robust adaptive control law is designed; the unknown components of the model are estimated online by the neural network, and the parameters of the switching elements are selected by fuzzy logic. The built algorithm ensures that the tracking and approximation errors are ultimately uniformly bounded (UUB). Finally, the effectiveness of the constructed algorithm is demonstrated through simulation and experimental results, which show that the proposed controller is effective, with small synchronous tracking errors and a significantly reduced chattering phenomenon.
Remote field-programmable gate array laboratory for signal acquisition and de... (IJECEIAES)
A remote laboratory utilizing field-programmable gate array (FPGA) technologies enhances students' learning experience anywhere and anytime in embedded system design. Existing remote laboratories prioritize hardware access and visual feedback for observing board behavior after programming, neglecting comprehensive debugging tools to resolve errors that require internal signal acquisition. This paper proposes a novel remote embedded-system design approach targeting FPGA technologies that is fully interactive via a web-based platform. Our solution provides FPGA board access and debugging capabilities beyond the visual feedback provided by existing remote laboratories. We implemented a lab module that users can seamlessly incorporate into their FPGA design. The module minimizes hardware resource utilization while enabling the acquisition of a large number of data samples from the signal during experiments by adaptively compressing the signal prior to data transmission. The results demonstrate an average compression ratio of 2.90 across three benchmark signals, indicating efficient signal acquisition and effective debugging and analysis. This method allows users to acquire more data samples than conventional methods. The proposed lab allows students to remotely test and debug their designs, bridging the gap between theory and practice in embedded system design.
Detecting and resolving feature envy through automated machine learning and m... (IJECEIAES)
Efficiently identifying and resolving code smells enhances software project quality. This paper presents a novel solution, utilizing automated machine learning (AutoML) techniques, to detect code smells and apply move method refactoring. By evaluating code metrics before and after refactoring, we assessed its impact on coupling, complexity, and cohesion. Key contributions of this research include a unique dataset for code smell classification and the development of models using AutoGluon for optimal performance. Furthermore, the study identifies the top 20 influential features in classifying feature envy, a well-known code smell, stemming from excessive reliance on external classes. We also explored how move method refactoring addresses feature envy, revealing reduced coupling and complexity, and improved cohesion, ultimately enhancing code quality. In summary, this research offers an empirical, data-driven approach, integrating AutoML and move method refactoring to optimize software project quality. Insights gained shed light on the benefits of refactoring on code quality and the significance of specific features in detecting feature envy. Future research can expand to explore additional refactoring techniques and a broader range of code metrics, advancing software engineering practices and standards.
Smart monitoring technique for solar cell systems using internet of things ba... (IJECEIAES)
Rapid, remote monitoring of solar cell system status parameters (solar irradiance, temperature, and humidity) is critical to enhancing system efficiency. Hence, in the present article an improved smart internet of things (IoT) prototype based on an embedded system built around the NodeMCU ESP8266 (ESP-12E) was implemented experimentally. Three different regions of Egypt, the cities of Luxor, Cairo, and El-Beheira, were chosen to study their solar irradiance profiles, temperature, and humidity with the proposed IoT system. The monitored solar irradiance, temperature, and humidity data were visualized live on Ubidots over the hypertext transfer protocol (HTTP). The measured solar power radiation in Luxor, Cairo, and El-Beheira ranged between 216-1000, 245-958, and 187-692 W/m² respectively during the solar day. The accuracy and rapidity of the monitoring results obtained using the proposed IoT system make it a strong candidate for application in monitoring solar cell systems. On the other hand, the solar power radiation results for the three regions strongly recommend Luxor and Cairo, rather than El-Beheira, as suitable places to build a solar cell system station.
An efficient security framework for intrusion detection and prevention in int... (IJECEIAES)
Over the past few years, the internet of things (IoT) has advanced to connect billions of smart devices to improve quality of life. However, anomalies or malicious intrusions pose several security loopholes, leading to performance degradation and threats to data security in IoT operations. IoT security systems must therefore monitor and restrict unwanted events in the IoT network. Recently, various technical solutions based on machine learning (ML) models have been derived for identifying and restricting unwanted events in IoT. However, most ML-based approaches are prone to misclassification due to inappropriate feature selection. Additionally, most ML approaches applied to intrusion detection and prevention rely on supervised learning, which requires a large amount of labeled data for training; such complex datasets are impossible to source in a large network like IoT. To address this problem, this study introduces an efficient learning mechanism to strengthen IoT security. The proposed algorithm incorporates supervised and unsupervised approaches to improve the learning models for intrusion detection and mitigation. Compared with related works, the experimental outcome shows that the model performs well on a benchmark dataset, accomplishing an improved detection accuracy of approximately 99.21%.
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsVictor Morales
K8sGPT is a tool that analyzes and diagnoses Kubernetes clusters. This presentation was used to share the requirements and dependencies to deploy K8sGPT in a local environment.
International Conference on NLP, Artificial Intelligence, Machine Learning an...gerogepatton
International Conference on NLP, Artificial Intelligence, Machine Learning and Applications (NLAIM 2024) offers a premier global platform for exchanging insights and findings in the theory, methodology, and applications of NLP, Artificial Intelligence, Machine Learning, and their applications. The conference seeks substantial contributions across all key domains of NLP, Artificial Intelligence, Machine Learning, and their practical applications, aiming to foster both theoretical advancements and real-world implementations. With a focus on facilitating collaboration between researchers and practitioners from academia and industry, the conference serves as a nexus for sharing the latest developments in the field.
TIME DIVISION MULTIPLEXING TECHNIQUE FOR COMMUNICATION SYSTEMHODECEDSIET
Time Division Multiplexing (TDM) is a method of transmitting multiple signals over a single communication channel by dividing the signal into many segments, each having a very short duration of time. These time slots are then allocated to different data streams, allowing multiple signals to share the same transmission medium efficiently. TDM is widely used in telecommunications and data communication systems.
### How TDM Works
1. **Time Slots Allocation**: The core principle of TDM is to assign distinct time slots to each signal. During each time slot, the respective signal is transmitted, and then the process repeats cyclically. For example, if there are four signals to be transmitted, the TDM cycle will divide time into four slots, each assigned to one signal.
2. **Synchronization**: Synchronization is crucial in TDM systems to ensure that the signals are correctly aligned with their respective time slots. Both the transmitter and receiver must be synchronized to avoid any overlap or loss of data. This synchronization is typically maintained by a clock signal that ensures time slots are accurately aligned.
3. **Frame Structure**: TDM data is organized into frames, where each frame consists of a set of time slots. Each frame is repeated at regular intervals, ensuring continuous transmission of data streams. The frame structure helps in managing the data streams and maintaining the synchronization between the transmitter and receiver.
4. **Multiplexer and Demultiplexer**: At the transmitting end, a multiplexer combines multiple input signals into a single composite signal by assigning each signal to a specific time slot. At the receiving end, a demultiplexer separates the composite signal back into individual signals based on their respective time slots.
### Types of TDM
1. **Synchronous TDM**: In synchronous TDM, time slots are pre-assigned to each signal, regardless of whether the signal has data to transmit or not. This can lead to inefficiencies if some time slots remain empty due to the absence of data.
2. **Asynchronous TDM (or Statistical TDM)**: Asynchronous TDM addresses the inefficiencies of synchronous TDM by allocating time slots dynamically based on the presence of data. Time slots are assigned only when there is data to transmit, which optimizes the use of the communication channel.
### Applications of TDM
- **Telecommunications**: TDM is extensively used in telecommunication systems, such as in T1 and E1 lines, where multiple telephone calls are transmitted over a single line by assigning each call to a specific time slot.
- **Digital Audio and Video Broadcasting**: TDM is used in broadcasting systems to transmit multiple audio or video streams over a single channel, ensuring efficient use of bandwidth.
- **Computer Networks**: TDM is used in network protocols and systems to manage the transmission of data from multiple sources over a single network medium.
### Advantages of TDM
- **Efficient Use of Bandwidth**: TDM all
Comparative analysis between traditional aquaponics and reconstructed aquapon...bijceesjournal
The aquaponic system of planting is a method that does not require soil usage. It is a method that only needs water, fish, lava rocks (a substitute for soil), and plants. Aquaponic systems are sustainable and environmentally friendly. Its use not only helps to plant in small spaces but also helps reduce artificial chemical use and minimizes excess water use, as aquaponics consumes 90% less water than soil-based gardening. The study applied a descriptive and experimental design to assess and compare conventional and reconstructed aquaponic methods for reproducing tomatoes. The researchers created an observation checklist to determine the significant factors of the study. The study aims to determine the significant difference between traditional aquaponics and reconstructed aquaponics systems propagating tomatoes in terms of height, weight, girth, and number of fruits. The reconstructed aquaponics system’s higher growth yield results in a much more nourished crop than the traditional aquaponics system. It is superior in its number of fruits, height, weight, and girth measurement. Moreover, the reconstructed aquaponics system is proven to eliminate all the hindrances present in the traditional aquaponics system, which are overcrowding of fish, algae growth, pest problems, contaminated water, and dead fish.
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Sinan KOZAK
Sinan from the Delivery Hero mobile infrastructure engineering team shares a deep dive into performance acceleration with Gradle build cache optimizations. Sinan shares their journey into solving complex build-cache problems that affect Gradle builds. By understanding the challenges and solutions found in our journey, we aim to demonstrate the possibilities for faster builds. The case study reveals how overlapping outputs and cache misconfigurations led to significant increases in build times, especially as the project scaled up with numerous modules using Paparazzi tests. The journey from diagnosing to defeating cache issues offers invaluable lessons on maintaining cache integrity without sacrificing functionality.
Batteries -Introduction – Types of Batteries – discharging and charging of battery - characteristics of battery –battery rating- various tests on battery- – Primary battery: silver button cell- Secondary battery :Ni-Cd battery-modern battery: lithium ion battery-maintenance of batteries-choices of batteries for electric vehicle applications.
Fuel Cells: Introduction- importance and classification of fuel cells - description, principle, components, applications of fuel cells: H2-O2 fuel cell, alkaline fuel cell, molten carbonate fuel cell and direct methanol fuel cells.
A review on techniques and modelling methodologies used for checking electrom...nooriasukmaningtyas
The proper function of the integrated circuit (IC) in an inhibiting electromagnetic environment has always been a serious concern throughout the decades of revolution in the world of electronics, from disjunct devices to today’s integrated circuit technology, where billions of transistors are combined on a single chip. The automotive industry and smart vehicles in particular, are confronting design issues such as being prone to electromagnetic interference (EMI). Electronic control devices calculate incorrect outputs because of EMI and sensors give misleading values which can prove fatal in case of automotives. In this paper, the authors have non exhaustively tried to review research work concerned with the investigation of EMI in ICs and prediction of this EMI using various modelling methodologies and measurement setups.
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTjpsjournal1
The rivalry between prominent international actors for dominance over Central Asia's hydrocarbon
reserves and the ancient silk trade route, along with China's diplomatic endeavours in the area, has been
referred to as the "New Great Game." This research centres on the power struggle, considering
geopolitical, geostrategic, and geoeconomic variables. Topics including trade, political hegemony, oil
politics, and conventional and nontraditional security are all explored and explained by the researcher.
Using Mackinder's Heartland, Spykman Rimland, and Hegemonic Stability theories, examines China's role
in Central Asia. This study adheres to the empirical epistemological method and has taken care of
objectivity. This study analyze primary and secondary research documents critically to elaborate role of
china’s geo economic outreach in central Asian countries and its future prospect. China is thriving in trade,
pipeline politics, and winning states, according to this study, thanks to important instruments like the
Shanghai Cooperation Organisation and the Belt and Road Economic Initiative. According to this study,
China is seeing significant success in commerce, pipeline politics, and gaining influence on other
governments. This success may be attributed to the effective utilisation of key tools such as the Shanghai
Cooperation Organisation and the Belt and Road Economic Initiative.
ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 8, No. 5, October 2018: 3399–3406
techniques. The numerical results show that SSVM is faster than other methods and has better generalization
ability [7].
2. SMOOTH SUPPORT VECTOR MACHINE
As the basis of SSVM, SVM [11] is a method to find the optimal hyperplane that separates two classes of the input space. Separation of more than two classes has been conducted previously by the authors on fingerprint data [12]-[15]. Figure 1 shows several alternative hyperplanes (discrimination boundaries) and the best hyperplane for a data set consisting of two classes, i.e. class {−1} and class {+1}. The best hyperplane is the one with the maximum margin among the alternative dividing lines (discriminant boundaries). The margin is the distance between the hyperplane and the nearest point of each class; this nearest point is the so-called support vector [16].
Figure 1. Several alternative hyperplanes (left), the best hyperplane (right)
A classification problem of m points in the n-dimensional real space R^n is represented by an m×n matrix A. The membership of each point in class {−1} or {+1} is defined by an m×m diagonal matrix D with −1 and +1 on its diagonal. The linear SVM algorithm is shown by (1), with a positive SVM parameter ν; an m×1 slack-variable vector y that measures the classification error and has non-negative values; an m×1 column vector e whose elements all have the value 1; an n×1 normal vector w; and a bias value γ that determines the location of the hyperplane relative to the origin:

    min_{w,γ,y}  ν e^T y + (1/2) w^T w
    s.t.         D(Aw − eγ) + y ≥ e,  y ≥ 0                                (1)

The constraint inequality above compares the vectors element by element. When the two classes can be separated perfectly by the defined hyperplane x^T w = γ, there are two parallel hyperplanes which are the boundaries of those two classes, i.e. x^T w = γ − 1 for class {−1} and x^T w = γ + 1 for class {+1}. A non-linear hyperplane is obtained by transforming the standard SVM formulation through the dual representation (2), and by using the "kernel trick" through a Gaussian kernel function (3), where μ is a kernel parameter and i, j = 1, 2, ..., m:

    w = A^T D u                                                            (2)

    K(x_i, x_j) = exp(−μ ||x_i − x_j||^2)                                  (3)

By substituting (2) into (1), the non-linear problem functions are obtained, as shown by (4):

    min_{u,γ,y}  ν e^T y + (1/2) u^T D A A^T D u
    s.t.         D(A A^T D u − eγ) + y ≥ e,  y ≥ 0                         (4)

The solution of (4) yields the separating surface x^T A^T D u = γ. By replacing A A^T with the non-linear kernel K(A, A^T), and by minimizing the error y with weight ν/2 together with u and γ, the generalized non-linear SVM is shown by (5):

    min_{u,γ,y}  (ν/2) y^T y + (1/2)(u^T u + γ^2)
    s.t.         D(K(A, A^T) D u − eγ) + y ≥ e,  y ≥ 0                     (5)
To solve (5), the constraint is used to eliminate the slack variable y, as defined by (6) (written here for the linear case with w = A^T D u; the non-linear case replaces Aw with K(A, A^T)Du and w with u):

    y = (e − D(Aw − eγ))_+                                                 (6)

By substituting (6) into (5), an SVM problem equation is obtained which is equivalent to the unconstrained SVM optimization, as shown by (7):

    min_{w,γ}  (ν/2) ||(e − D(Aw − eγ))_+||^2 + (1/2)(w^T w + γ^2)         (7)

In (·)_+, the negative components are replaced by zeros. Equation (7) has a unique solution, but its objective function is not twice differentiable, which precludes the use of a fast Newton method. For that, a smoothing technique was proposed [10] that replaces the plus function (·)_+ by p(x, α) = x + (1/α) log(1 + e^{−αx}), the integral of the sigmoid function (1 + e^{−αx})^{−1} of neural networks. Equation (8) shows the SSVM, where α > 0 is the smoothing parameter:

    min_{w,γ}  Φ_α(w, γ) = (ν/2) ||p(e − D(Aw − eγ), α)||^2 + (1/2)(w^T w + γ^2)   (8)

Equation (8) can be optimized by using a numerical approach through the Newton-Armijo method. The first step is to initialize (w^0, γ^0), where the superscript i indicates the i-th iteration of (w, γ). The second step is to repeat the iteration until the gradient of the objective function in (8) is equal to zero, i.e. ∇Φ_α(w^i, γ^i) = 0. The third step is to calculate (w^{i+1}, γ^{i+1}) as follows:
a. Newton direction: determine the direction d^i by solving (9):

    ∇^2 Φ_α(w^i, γ^i) d^i = −∇Φ_α(w^i, γ^i)^T                              (9)

b. Armijo stepsize: choose a stepsize λ_i such that

    (w^{i+1}, γ^{i+1}) = (w^i, γ^i) + λ_i d^i                              (10)

where λ_i = max{1, 1/2, 1/4, ...} is chosen such that

    Φ_α(w^i, γ^i) − Φ_α((w^i, γ^i) + λ_i d^i) ≥ −δ λ_i ∇Φ_α(w^i, γ^i) d^i  (11)

where δ ∈ (0, 1/2). When ∇Φ_α(w^i, γ^i) = 0, the Newton-Armijo algorithm iteration is stopped, and the convergent values of w and γ are obtained for the hyperplane function, as shown by (12):

    f(x) = sign(x^T w − γ)                                                 (12)
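As an illustration of the smooth reformulation (8), a minimal NumPy sketch of the linear SSVM is given below. This is not the SSVM toolbox used in this research [17]: plain gradient descent with an Armijo-style backtracking line search stands in for the full Newton-Armijo method, and all parameter values are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    """Numerically stable sigmoid 1 / (1 + exp(-x))."""
    out = np.empty_like(x, dtype=float)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    ex = np.exp(x[~pos])
    out[~pos] = ex / (1.0 + ex)
    return out

def smooth_plus(x, alpha):
    """p(x, alpha) = x + log(1 + exp(-alpha*x)) / alpha, the smooth
    approximation of the plus function (x)_+, in an overflow-safe form."""
    return np.maximum(x, 0.0) + np.log1p(np.exp(-alpha * np.abs(x))) / alpha

def ssvm_train(A, d, nu=1.0, alpha=5.0, iters=300):
    """Minimize Phi(w, g) = nu/2 ||p(e - D(Aw - e*g), alpha)||^2
    + 1/2 (w'w + g^2) by gradient descent with Armijo backtracking."""
    m, n = A.shape
    w, g = np.zeros(n), 0.0

    def phi(w, g):
        r = smooth_plus(1.0 - d * (A @ w - g), alpha)
        return 0.5 * nu * (r @ r) + 0.5 * (w @ w + g * g)

    for _ in range(iters):
        s = 1.0 - d * (A @ w - g)
        c = nu * smooth_plus(s, alpha) * sigmoid(alpha * s)  # chain rule: p * p'
        gw = -(A.T @ (d * c)) + w        # gradient w.r.t. w
        gg = np.sum(d * c) + g           # gradient w.r.t. gamma
        step, f0 = 1.0, phi(w, g)
        # Armijo backtracking: halve the step until sufficient decrease
        while phi(w - step * gw, g - step * gg) > f0 - 1e-4 * step * (gw @ gw + gg * gg):
            step *= 0.5
            if step < 1e-12:
                break
        w, g = w - step * gw, g - step * gg
    return w, g

def ssvm_predict(A, w, g):
    """Classify with the separating hyperplane x'w = gamma, as in (12)."""
    return np.sign(A @ w - g)
```

A larger α makes p(x, α) a tighter approximation of (x)_+ but the objective stiffer to optimize; the Newton step of the full method converges in far fewer iterations than this first-order sketch.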
3. RESEARCH METHOD
For SRBs prediction using SSVM, a five-stage research method was conducted, i.e.: 1) data preparation; 2) data transformation; 3) training and testing data selection; 4) SSVM model development; and 5) SSVM model evaluation.
Data preparation is related to the data collection from the electronic and non-electronic medical records of the only psychiatric hospital in Bali Province. There are 30,660 inpatient and outpatient medical records from the last five years up to April 2016. Data were collected through database queries on the hospital information system and then exported to CSV format. Data cleaning gave 2665 relevant records for this research, including 111 patients that have SRBs and are under active treatment. Removed data have one or more of these three conditions, i.e.: 1) not psychiatric-disorder patient data (drug-free or psychiatric-disorder-free certificate applicant, dental patient, drug patient, neurology patient, or physiotherapy patient); 2) incomplete data (manual data that was migrated into the information system, where the related patient has not been an inpatient or outpatient since the data migration time); and 3) inactive patient data (passed-away patient or not an outpatient).
Data transformation is related to the transformation of the previous data into predictor variables and a response variable. Ten predictor variables were obtained from the database query above, i.e. disease diagnosis (x1), profession (x2), education (x3), payment type/health insurance type (x4), domicile (x5), age (x6), age range (x7), sex (x8), marital status (x9), and family history (x10). The response variable (y) takes the values −1 and +1, representing the class of patients that have no SRBs and the class of patients that have SRBs, respectively. Response variable data were obtained from the non-electronic medical records related to data of suicide attempts or instrumental SRBs [1]. Figure 2 shows a research sample of the predictor (instance) and response (label) matrix by using the SSVM toolbox library [17]. Each row and column of the instance matrix represent a patient and his/her related ten predictor variables, respectively.
Figure 2. Predictor and response matrix
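The transformation step can be sketched as follows. All field names and category codes below are hypothetical illustrations; the actual coding scheme used with the hospital records is not specified in the paper.

```python
# Hypothetical integer codes for categorical medical-record fields; the
# actual codes used to build the instance matrix are assumptions.
DIAGNOSIS = {"F20": 1, "F31": 2, "F32": 3}   # example ICD-10 diagnosis codes
EDUCATION = {"none": 0, "primary": 1, "secondary": 2, "higher": 3}

def to_instance(record):
    """Map one patient record (a dict) to the ten predictor values x1..x10."""
    return [
        DIAGNOSIS.get(record["diagnosis"], 0),   # x1  disease diagnosis
        record["profession_code"],               # x2  profession
        EDUCATION.get(record["education"], 0),   # x3  education
        record["insurance_code"],                # x4  payment/insurance type
        record["domicile_code"],                 # x5  domicile
        record["age"],                           # x6  age
        record["age_range_code"],                # x7  age range
        1 if record["sex"] == "male" else 0,     # x8  sex
        record["marital_code"],                  # x9  marital status
        1 if record["family_history"] else 0,    # x10 family history
    ]

def to_label(has_srb):
    """Response y: +1 for patients that have SRBs, -1 otherwise."""
    return 1 if has_srb else -1
```

Stacking `to_instance` rows and `to_label` values gives the instance and label matrices of Figure 2.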
Training and testing data selection is related to the next stages of SSVM model development and evaluation. Two data selection mechanisms were used to get the best hyperplane in Figure 1, i.e. by using:
a. Ten-fold cross validation (10-fcv) selection [18] and data ratio selection of the 2665 relevant records. k-fold cross validation splits the data into k sections at random, where each section has the same class proportion as the initial class proportion. Each section in turn is used as testing data while the rest is used as training data, so there will be k accuracies; the final accuracy is the average of those k accuracies. On data ratio, data of patients that have SRBs and patients that have no SRBs were randomly selected in a certain ratio for training and testing data, respectively. For example, the ratio 90:10 means 90% of the relevant data as training data and 10% of the relevant data as testing data. Training data consist of 90% of the data of patients that have SRBs and 90% of the data of patients that have no SRBs, while testing data consist of the remaining 10% of each class;
b. Data ratio selection based on the 111 records of patients that have SRBs among the 2665 relevant records. For example, the data ratio 1:2 means training data (also used as testing data) consisting of the 111 records of patients that have SRBs and 222 records of patients that have no SRBs. Several best-performing SSVM models obtained this way were then tested by using randomly selected data amounting to 10%, 20%, ..., and 100% of the 2665 relevant records.
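The 1:k ratio selection in mechanism b above can be sketched as a minimal illustration (function and variable names are assumptions, not part of the SSVM toolbox):

```python
import numpy as np

def ratio_select(y, k, rng):
    """Pick all SRB patients (y == +1) plus k times as many non-SRB
    patients (y == -1), sampled at random without replacement."""
    srb = np.flatnonzero(y == 1)
    non = np.flatnonzero(y == -1)
    picked = rng.choice(non, size=int(round(k * len(srb))), replace=False)
    return np.concatenate([srb, picked])

# Example: 111 SRB patients among 2665 records, ratio 1:1
rng = np.random.default_rng(0)
y = -np.ones(2665, dtype=int)
y[:111] = 1
idx = ratio_select(y, 1.0, rng)
# -> 222 training indices: 111 SRB + 111 non-SRB
```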
SSVM model development is related to the parameters w and γ in (12), which were computed by using the SSVM toolbox library [17]. Several pairs of w and γ were computed based on the various training and testing data selections above. SSVM model evaluation is related to the classification accuracy, which can be determined by using a contingency table [19], as shown by Table 1. Based on that table, the classification accuracy can be measured by using (13).

Table 1. Classification accuracy contingency table

    Actual      Prediction I (Negative)    Prediction II (Positive)
    Negative    True Negative (TN)         False Positive (FP)
    Positive    False Negative (FN)        True Positive (TP)

    Accuracy = (TP + TN) / (TP + TN + FP + FN) × 100%                      (13)

where TN is the number of predictions of patients that have no SRBs where in fact the patients have no such behaviours; TP is the number of predictions of patients that have SRBs where in fact the patients have such behaviours; FP is the number of predictions of patients that have SRBs where in fact the patients have no such behaviours; and FN is the number of predictions of patients that have no SRBs where in fact the patients have such behaviours.
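Table 1 and equation (13) translate directly into a small helper (a sketch; the −1/+1 label convention follows the paper):

```python
def contingency(actual, predicted):
    """Count TN, FN, FP, TP for labels coded -1 (negative) / +1 (positive)."""
    tn = sum(1 for a, p in zip(actual, predicted) if a == -1 and p == -1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == +1 and p == -1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == -1 and p == +1)
    tp = sum(1 for a, p in zip(actual, predicted) if a == +1 and p == +1)
    return tn, fn, fp, tp

def accuracy(tn, fn, fp, tp):
    """Equation (13): (TP + TN) / (TP + TN + FP + FN) * 100%."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)

# For example, contingency([-1, 1, 1, -1], [-1, -1, 1, 1]) counts one of
# each cell, and accuracy(1, 1, 1, 1) gives 50.0
```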
4. RESULTS AND ANALYSIS
The experiment was based on the previous two selection mechanisms of training and testing data, running on an Intel Core™ i5-4460T CPU @ 1.90 GHz with 4 GB RAM and the Windows 10 64-bit operating system.
4.1. Data selection mechanism 1
Table 2 shows the high accuracy of the SSVM model for every data selection type, but all of their TPs were zero, which makes all of those models unusable for the prediction of patients that have SRBs. The high accuracy came from the high number of TN in (13), since most patients in the data of this research have no SRBs.
4.2. Data selection mechanism 2
Table 3 shows non-zero TPs for the six SSVM models that were obtained by using the six data ratio selections from 1:0.5 up to 1:1. Each of those SSVM models was then tested by using ten portions of the 2665 records, so each SSVM model gives ten accuracy results, whose average is shown by Figure 3.
Table 2. SSVM model performance by using data selection mechanism 1

    Selection  Testing data  w     γ       TN    FN   FP  TP  Accuracy  Time (s)
    10-fcv     variable      0.01  0.0017  2554  111  0   0   0.9583    907.954
    90:10      266           0.01  0.0018  255   11   0   0   0.9586    722.597
    80:20      533           0.01  0.0019  511   22   0   0   0.9587    555.397
    70:30      799           0.01  0.0016  766   33   0   0   0.9587    361.729
    60:40      1065          0.01  0.0023  1021  22   0   0   0.9587    250.639
    50:50      1282          0.01  0.0018  1227  55   0   0   0.9571    165.321

Table 3. SSVM model performance by using data selection mechanism 2

    Selection  Testing data  w         γ         TN    FN   FP  TP   Accuracy  Time (s)
    1:0.5      167           0.01      7.88E-05  8     5    48  106  0.6826    1.312
    1:0.6      178           0.01      8.83E-05  10    5    57  106  0.6517    1.317
    1:0.7      189           1.78E+03  5.79E-04  77    52   1   59   0.7196    1.627
    1:0.8      200           1.78E+03  5.60E-04  86    53   3   58   0.72      1.978
    1:0.9      211           1.00E+04  7.78E-05  96    49   4   62   0.7488    2.634
    1:1        222           1.7783    2.67E-05  101   57   10  54   0.6982    2.094
    1:2        333           0.0562    0.0323    222   111  0   0    0.6667    4.155
    1:3        444           0.0562    0.0268    333   111  0   0    0.75      10.11
    1:4        555           0.0562    0.0245    444   111  0   0    0.8       18.337
    1:5        666           0.0562    0.0216    555   111  0   0    0.8333    32.273
    1:6        777           0.01      0.002     666   111  0   0    0.8571    40.845
    1:7        888           0.01      0.003     777   111  0   0    0.875     69.532
    1:8        999           0.01      0.0022    888   111  0   0    0.8889    80.796
    1:9        1110          0.01      0.0021    999   111  0   0    0.9       127.757
    1:10       1221          0.01      0.0022    1110  111  0   0    0.9091    155.998
    1:11       1332          0.01      0.0021    1221  111  0   0    0.9167    197.339
    1:12       1443          0.01      0.0028    1332  111  0   0    0.9231    233.046
    1:13       1554          0.01      0.0015    1443  111  0   0    0.9286    224.112
    1:14       1665          0.01      0.0021    1554  111  0   0    0.9333    278.644
    1:15       1776          0.01      0.0022    1665  111  0   0    0.9375    327.802
    1:16       1887          0.01      0.0023    1776  111  0   0    0.9412    396.622
    1:17       1998          0.01      0.0021    1887  111  0   0    0.9444    465.335
    1:18       2109          0.01      0.0022    1998  111  0   0    0.9474    560.466
    1:19       2220          0.01      0.0017    2109  111  0   0    0.95      576.861
    1:20       2331          0.01      0.0023    2220  111  0   0    0.9524    637.395
    1:21       2442          0.01      0.0024    2331  111  0   0    0.9545    768.092
    1:22       2553          0.01      0.0023    2442  111  0   0    0.9565    860.427
    1:23       2665          0.01      0.0017    2554  111  0   0    0.9583    907.954

Based on Figure 3, the SSVM model generated by the data ratio 1:1 (training data consisting of 111 records of patients that have SRBs and 111 records of patients that have no SRBs) gave the best average accuracy, at about 63%. On running time, most of the time was used for SSVM model development related to the parameters w and γ in (12). There was no significant increase in running time for different portions of relevant data, as shown by Table 4.

Figure 3. Performance of six SSVM models by using six data ratio selections
Table 4. SSVM model performance by using data ratio 1:1

    Data portion  Testing data  TN    FN  FP    TP  Accuracy  Time (s)
    10%           266           170   4   85    7   0.6654    2.042
    20%           533           394   12  117   10  0.75797   2.044
    30%           799           672   19  94    14  0.8586    2.042
    40%           1065          531   27  490   17  0.51455   2.202
    50%           1383          860   28  467   28  0.6421    2.042
    60%           1600          959   30  574   37  0.6225    2.034
    70%           1866          818   38  970   40  0.4598    2.02
    80%           2243          1207  45  947   44  0.5577    2.038
    90%           2399          1320  53  979   47  0.5698    2.064
    100%          2665          1490  57  1064  54  0.57936   2.058

The results above were based on the ten predictor variables described previously. Table 5 gives the result on the influence of each of those variables on the SSVM model performance, by using the data ratio 1:1 and testing data at 30% of the 2665 relevant records, as shown in Table 4.

Table 5. SSVM model performance on reduced predictor variables

    No  Reduced predictor variables           TN   FN  FP   TP  Accuracy
    1   x1, x2, x3, x4, x5, x6, x7, x8, x9    629  18  137  15  0.80601
    2   x1, x2, x3, x4, x5, x6, x7, x8, x10   629  18  137  15  0.80601
    3   x1, x2, x3, x4, x5, x6, x7, x9, x10   629  18  137  15  0.80601
    4   x1, x2, x3, x4, x5, x6, x8, x9, x10   629  18  137  15  0.80601
    5   x1, x2, x3, x4, x5, x7, x8, x9, x10   619  18  147  15  0.79349
    6   x1, x2, x3, x4, x6, x7, x8, x9, x10   667  30  99   3   0.83855
    7   x1, x2, x3, x5, x6, x7, x8, x9, x10   759  32  7    1   0.95129
    8   x1, x2, x4, x5, x6, x7, x8, x9, x10   722  30  44   3   0.9074
    9   x1, x3, x4, x5, x6, x7, x8, x9, x10   737  33  29   0   0.9224
    10  x2, x3, x4, x5, x6, x7, x8, x9, x10   716  32  50   1   0.8974

Table 5 shows that six predictor variables, i.e. disease diagnosis (x1), profession (x2), education (x3), payment type/health insurance type (x4), domicile (x5), and age (x6), have much influence on the SSVM model result, because of the decreasing value of its TP and/or the increasing value of its FP when each of those variables is removed. Hypothetically, age range (x7), sex (x8), marital status (x9), and family history (x10) have influence on the prediction. The age range (19− 5 years old) that was used as a reference by the psychiatric hospital apparently has a relatively small influence in this SSVM model, and neither do sex, marital status, nor family history, even though female sex, unmarried status, or social network were considered to be risk factors of SRBs [1].

5. CONCLUSION AND FUTURE WORK
Suicide-related behaviours (SRBs) prediction with SSVM gave the best average accuracy at 63%. This accuracy can be obtained by using 30% of the 2665 relevant records as testing data and by using training data with a one-to-one ratio in number between patients that have SRBs and patients that have no SRBs. In future work, accuracy improvement needs to be confirmed by using the Reduced Support Vector Machine (RSVM) method, as the further development of SSVM [10].

ACKNOWLEDGEMENTS
The authors would like to thank the Bali Province Government, through its Investment and Licensing Agency and its Psychiatric Hospital, for the permission to conduct this research.
7. Int J Elec & Comp Eng ISSN: 2088-8708
Smooth Support Vector Machine for Suicide-Related Behaviours Prediction (G. Indrawan)
3405
REFERENCES
[1] J. L. Skeem, E. Silver, P. S. Appelbaum, and J. Tiemann, "Suicide-Related Behavior after Psychiatric Hospital Discharge: Implications for Risk Assessment and Management", Behavioral Sciences and the Law, vol. 24, pp. 731-746, 2006.
[2] T. L. Stedman, Stedman's Medical Dictionary, 28th ed. Philadelphia: Lippincott Williams & Wilkins, 2006.
[3] K. Hawton and K. van Heeringen, "Suicide", Lancet, vol. 373, no. 9672, pp. 1372-1381, 2009.
[4] WHO, "Suicide Fact Sheet", 2016.
[5] G. K. Brown, A. T. Beck, R. A. Steer, and J. R. Grisham, "Risk Factors for Suicide in Psychiatric Outpatients: A 20-year Prospective Study", Journal of Consulting and Clinical Psychology, vol. 68, no. 5, pp. 371-377, 2000.
[6] J. Cooper, et al., "Suicide after Deliberate Self-Harm: A 4-year Cohort Study", American Journal of Psychiatry, vol. 162, no. 2, pp. 297-303, 2005.
[7] Y. J. Lee and O. L. Mangasarian, "A Smooth Support Vector Machine for Classification", Journal of Computational Optimization and Applications, pp. 5-22, 2001.
[8] S. W. Purnami and A. Embong, "Smooth Support Vector Machine for Breast Cancer Classification", in The 4th IMT-GT 2008 Conference of Mathematics, Statistics and Its Application (ICMSA 2008), 2008.
[9] M. Furqan, A. Embong, A. Suryanti, S. W. Purnami, and S. Sajadin, "Smooth Support Vector Machine for Face Recognition using Principal Component Analysis", in 2nd International Conference on Green Technology and Engineering (ICGTE), 2009.
[10] Y. J. Lee and O. L. Mangasarian, "RSVM: Reduced Support Vector Machine", in The First SIAM International Conference on Data Mining, 2001.
[11] J. Han and M. Kamber, Data Mining: Concepts and Techniques, 2nd ed. San Francisco: Morgan Kaufmann Publishers, 2006.
[12] G. Indrawan, S. Akbar, and B. Sitohang, "Review of Sequential Access Method for Fingerprint Identification", TELKOMNIKA (Telecommunication, Computing, Electronics and Control), vol. 10, no. 2, pp. 199-206, Jun. 2012.
[13] G. Indrawan, A. S. Nugroho, S. Akbar, and B. Sitohang, "A Multi-Threaded Fingerprint Direct-Access Strategy Using Local-Star-Structure-based Discriminator Features", TELKOMNIKA (Telecommunication, Computing, Electronics and Control), vol. 12, no. 5, pp. 4079-4090, 2014.
[14] G. Indrawan, S. Akbar, and B. Sitohang, "Fingerprint Direct-Access Strategy Using Local-Star-Structure-based Discriminator Features: A Comparison Study", International Journal of Electrical and Computer Engineering (IJECE), vol. 4, no. 5, pp. 817-829, 2014.
[15] G. Indrawan, S. Akbar, and B. Sitohang, "On Analyzing of Fingerprint Direct-Access Strategies", in International Conference on Data and Software Engineering (ICoDSE), 2014.
[16] A. S. Nugroho, A. B. Witarto, and D. Handoko, "Application of Support Vector Machine in Bioinformatics", in Indonesian Scientific Meeting in Central Japan, 2003.
[17] DSMI, "Smooth Support Vector Machine Toolbox | Data Science and Machine Intelligence Lab", 2014. [Online]. Available: http://dmlab8.csie.ntust.edu.tw/#toolbox. [Accessed: 06-Apr-2017].
[18] P. Refaeilzadeh, L. Tang, and H. Liu, "Cross Validation", Encyclopedia of Database Systems. Springer, 2009.
[19] S. W. Purnami, J. M. Zain, and T. Heriawan, "An Alternative Algorithm for Classification Large Categorical Dataset: k-mode Clustering Reduced Support Vector Machine", International Journal of Database Theory and Application, vol. 4, no. 1, 2011.
BIOGRAPHIES OF AUTHORS
Dr. G. Indrawan. Head of Computer Science Department, Graduate Program, Universitas
Pendidikan Ganesha, Bali, Indonesia. He received his doctoral degree in Electrical Engineering
and Informatics from Bandung Institute of Technology, Indonesia. His research interests include
biometrics, pattern recognition, and robotics. He can be reached at gindrawan@undiksha.ac.id.
I K. P. Sudiarsa, M.Kom. IT Administrator of Psychiatric Hospital of Bali, Indonesia. He
received his master degree in Computer Science from Universitas Pendidikan Ganesha, Bali,
Indonesia. His research interests include machine learning, and data mining. He can be reached at
kompassudiarsa@gmail.com.
Dr. K. Agustini. Vice Head of the Instructional Technology Department, Graduate Program, Universitas Pendidikan Ganesha, Bali, Indonesia. She is also a lecturer at the Computer Science Department, Graduate Program, Universitas Pendidikan Ganesha, Bali, Indonesia. She received her doctoral degree in Education Technology from Jakarta State University and her master's degree in Computer Science from Bogor Agricultural Institute. Her research interests include pattern recognition and learning media. She can be reached at ketutagustini@undiksha.ac.id.
Prof. Dr. Sariyasa. Head of the Mathematics Education Department, Graduate Program, Universitas Pendidikan Ganesha, Bali, Indonesia. He is also a lecturer at the Computer Science Department, Graduate Program, Universitas Pendidikan Ganesha, Bali, Indonesia. He received his master's and doctoral degrees in Mathematics from Flinders University, Australia. His research interests include complex analysis and numerical methods. He can be reached at sariyasa@undiksha.ac.id.