This document compares three classification methods (artificial neural networks, decision trees, and logistic regression) for predicting malignancy in thyroid tumor patients using a clinical dataset. It describes each method and applies them to a dataset of 259 thyroid tumor patients. The artificial neural network achieved 98% accuracy on the training set and 92% on the validation set. The decision tree method used 150 cases to build a model and achieved 86% accuracy. Logistic regression analysis resulted in 88% accuracy. The artificial neural network was able to accurately predict malignancy and identified important attributes such as multiple nodules and family cancer history.
Abstract: This paper summarizes the concept of data mining and illustrates the significance of its methodologies. Data mining based on neural networks and genetic algorithms is examined in detail, and the key technologies and approaches for achieving data mining with neural networks and genetic algorithms are surveyed. The paper also conducts a formal review of the area of rule extraction from ANNs and GAs. Keywords: Data Mining, Neural Network, Genetic Algorithm, Rule Extraction.
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of Engineering and Technology.
Propose an Enhanced Framework for Prediction of Heart Disease (IJERA Editor)
Heart disease diagnosis is a complex task that requires considerable experience. Traditionally, the doctor prescribes a number of medical tests, such as cardiac MRI, ECG and stress tests, to examine for heart disease. Today, the health care industry holds hidden information within its huge amount of health care data, and effective decisions can be made by means of this hidden information. For appropriate results, advanced computer-based data mining techniques are applied to it. Artificial neural networks (ANNs) are mathematical techniques used in the empirical sciences for inference and categorisation, and also for modelling real neural networks. Mental phenomena such as acting, wanting, knowing, remembering, perceiving, thinking and inferring can be understood using the theory of ANNs, which are powerful instruments for inference and classification, where problems of probability and induction can arise. In this paper, classification techniques, namely the Naive Bayes classification algorithm and artificial neural networks, are used to classify the attributes in the given data set. Attribute filtering techniques, namely PCA (Principal Component Analysis) and Information Gain attribute subset evaluation, are used for feature selection in the given data set to predict heart disease symptoms. A new framework based on these techniques is proposed: the input dataset is fed into a feature selection block, which selects whichever technique yields the fewest attributes; classification is then performed with the two algorithms, and the attributes selected by both classification tasks are taken for the prediction of heart disease. The framework reduces the time needed to predict heart disease symptoms and lets the user identify the important attributes.
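As a rough illustration of the Information Gain attribute evaluation step described above, the following sketch ranks discrete attributes by the entropy reduction they provide. The attribute names and records are invented for illustration, not taken from the paper's dataset:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy reduction from splitting the labels on a discrete attribute."""
    n = len(labels)
    remainder = 0.0
    for v in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == v]
        remainder += (len(subset) / n) * entropy(subset)
    return entropy(labels) - remainder

# Invented toy records: (chest_pain, exercise_angina, disease_label)
records = [
    ("yes", "yes", 1), ("yes", "no", 1), ("yes", "yes", 1),
    ("no", "no", 0), ("no", "yes", 0), ("no", "no", 0),
]
chest_pain = [r[0] for r in records]
angina = [r[1] for r in records]
labels = [r[2] for r in records]

print(information_gain(chest_pain, labels))  # 1.0: splits the classes perfectly
print(information_gain(angina, labels))      # ~0.08: nearly uninformative
```

An attribute that perfectly separates the classes earns the full label entropy as its gain; a near-random attribute earns close to zero, which is how such a filter selects the fewest informative attributes.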
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA... (ijsc)
As biomedical databases grow day by day, finding the essential features for disease prediction has become more complex due to high dimensionality and sparsity problems. Moreover, given the large number of micro-array datasets in biomedical repositories, it is difficult to analyze, predict and interpret the feature information using traditional feature-selection-based classification models. Most traditional feature-selection-based classification algorithms have computational issues on microarray datasets, such as dimension reduction, uncertainty and class imbalance. The ensemble classifier is a scalable model for extreme learning machines owing to its high efficiency and fast processing speed for real-time applications. The main objective of feature-selection-based ensemble learning models is to classify high dimensional data with high computational efficiency and a high true positive rate. In the proposed model, an optimized Particle Swarm Optimization (PSO) based ensemble classification model was developed for high dimensional microarray datasets. Experimental results showed that the proposed model has high computational efficiency compared with traditional feature-selection-based classification models in terms of accuracy, true positive rate and error rate.
BRAIN TUMOR MRI IMAGE CLASSIFICATION WITH FEATURE SELECTION AND EXTRACTION USI... (ijistjournal)
Feature extraction is a method of capturing the visual content of an image: the process of representing a raw image in reduced form to facilitate decision making such as pattern classification. We address the problem of classifying MRI brain images by creating a robust and more accurate classifier that can act as an expert assistant to medical practitioners. The objective of this paper is to present a novel method of feature selection and extraction. The approach combines intensity, texture and shape based features and classifies the tumor region as white matter, gray matter, CSF, abnormal or normal area. The experiment is performed on 140 tumor-containing brain MR images from the Internet Brain Segmentation Repository. The proposed technique has been carried out over a larger database than any previous work and is more robust and effective. PCA and Linear Discriminant Analysis (LDA) were applied on the training sets to reduce the number of features used, and the Support Vector Machine (SVM) classifier served as a comparison of nonlinear techniques vs. linear ones. Feature selection using the proposed technique is more beneficial because it analyses the data according to the grouping class variable and gives a reduced feature set with high classification accuracy.
Decision Tree Classifiers to determine the patient’s Post-operative Recovery ... (Waqas Tariq)
Machine learning aims to generate classifying expressions simple enough to be easily understood by humans. Many machine learning approaches are available for classification, among which decision tree learning is one of the most popular. In this paper we propose a systematic decision-tree-based approach to automatically determine a patient's post-operative recovery status. Decision tree structures are constructed using data mining methods and are then used to classify discharge decisions.
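Decision tree construction boils down to repeatedly picking the split that most reduces node impurity. A minimal sketch of that core step, with invented post-operative attributes rather than the paper's actual data, might look like:

```python
from collections import Counter

def gini(labels):
    """Gini impurity: chance two random draws from the node disagree."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold, weighted impurity) that best splits the node."""
    n = len(labels)
    best = (None, None, gini(labels))
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (f, t, score)
    return best

# Invented records: (temperature, systolic_bp) -> discharge (1) or keep in ward (0)
rows = [(36.5, 110), (36.8, 115), (38.2, 130), (38.5, 140), (37.9, 125)]
labels = [1, 1, 0, 0, 0]
feature, threshold, impurity = best_split(rows, labels)
print(feature, threshold, impurity)  # 0 36.8 0.0 -> "temperature <= 36.8" separates perfectly
```

A full tree simply applies this search recursively to the left and right partitions until the nodes are pure or too small, which is what makes the resulting rules easy for a human to read.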
Automated segmentation and classification technique for brain stroke (IJECEIAES)
Diffusion-Weighted Imaging (DWI) plays an important role in the diagnosis of brain stroke by providing detailed information regarding soft tissue contrast in the brain. Conventionally, the differential diagnosis of brain stroke lesions is performed manually by professional neuroradiologists in a highly subjective and time-consuming process. This study proposes a segmentation and classification technique to detect brain stroke lesions based on DWI. The stroke lesion types considered are acute ischemic, sub-acute ischemic, chronic ischemic and acute hemorrhage. For segmentation, fuzzy c-means (FCM) and active contours are proposed to segment the lesion region; FCM is combined with an active contour to separate the cerebral spinal fluid (CSF) from the hypointense lesion. Pre-processing is applied to the DWI for image normalization, background removal and image enhancement. Performance has been evaluated using the Jaccard index, Dice Coefficient (DC), false positive rate (FPR) and false negative rate (FNR); the average results are 0.55, 0.68, 0.23 and 0.23, respectively. A first-order statistical method is applied to the segmentation result to obtain features for the classifier input. For classification, a bagged tree classifier is proposed to classify the type of stroke, achieving an accuracy of 90.8%. Based on these results, the proposed technique has the potential to segment and classify brain stroke lesions from DWI images.
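The fuzzy c-means step can be sketched as follows on invented 1-D intensities. This is a deliberate simplification: the study works on 2-D DWI slices and combines FCM with active contours, whereas the sketch only shows the core alternating update:

```python
import random

random.seed(1)

def fuzzy_c_means(xs, c=2, m=2.0, iters=50):
    """Fuzzy c-means on 1-D intensities: every point keeps a graded
    membership in each cluster instead of a hard label."""
    centers = random.sample(xs, c)
    u = []
    for _ in range(iters):
        u = []
        for x in xs:
            dists = [abs(x - ck) or 1e-12 for ck in centers]
            row = []
            for dk in dists:
                # Standard FCM membership: inverse relative distance.
                row.append(1.0 / sum((dk / dj) ** (2 / (m - 1)) for dj in dists))
            u.append(row)
        # Centers move to the fuzzily weighted mean of the data.
        centers = [
            sum(u[i][k] ** m * xs[i] for i in range(len(xs)))
            / sum(u[i][k] ** m for i in range(len(xs)))
            for k in range(c)
        ]
    return centers, u

# Invented intensities: a dark CSF-like group and a bright lesion-like group.
xs = [0.10, 0.15, 0.20, 0.80, 0.85, 0.90]
centers, memberships = fuzzy_c_means(xs)
print(sorted(centers))
```

The graded memberships are what make FCM attractive here: a voxel whose intensity sits between CSF and lesion is not forced into either class, and the follow-up active contour can resolve the ambiguity.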
View classification of medical X-ray images using PNN classifier, decision tr... (eSAT Journals)
Abstract: With electronic advancements in medical image processing, the enormous quantity of medical X-ray images being produced can be handled effectively through automated indexing, comparison, analysis and annotation, which is pivotal to radiologists in interpreting images and diagnosing disease. To this end, this paper proposes an efficient methodology for view classification of X-ray images for automated annotation from their vast database, simplifying decision making for physicians and radiologists despite the immeasurable and ever-growing volume of X-ray images. X-ray images of six classes, namely chest, head, foot, palm, spine and neck, have been collected. The proposed framework proceeds as follows: the images are pre-processed using an M3 filter and segmented by the Expectation Maximization (EM) algorithm, followed by feature extraction through the Discrete Wavelet Transform. View classification is performed by comparing the Probabilistic Neural Network (PNN), a Decision Tree algorithm and a Support Vector Machine (SVM); the PNN yields an accuracy of 75%, the Decision Tree 92.77% and the SVM 93.33%. Key Words: M3 filter, Expectation Maximization, Discrete Wavelet Transform, Probabilistic Neural Network, Decision Tree Algorithm, Support Vector Machine.
Growth in medical informatics can be observed nowadays. Advancements in different medical fields uncover various critical diseases and provide guidelines for their cure. This has been possible only because of well-developed medical databases and the automation of the data analysis process. This analysis requires considerable learning and intelligence, for which data mining techniques provide the basis; many such techniques are available, including decision tree induction, rule-based classification or mining, support vector machines, stochastic classification, logistic regression, Naïve Bayes, artificial neural networks, fuzzy logic and genetic algorithms. This paper provides the basics of data mining, surveys the techniques effectively available in the medical sciences, and reviews the efforts made on medical databases using data mining techniques for human disease diagnosis.
MRI Image Segmentation Using Gradient Based Watershed Transform In Level Set ... (IJERA Editor)
Brain image classification is one of the most important parts of clinical diagnostic tools. Brain images typically contain noise, inhomogeneity and sometimes deviation, so precise segmentation of brain images is a very challenging task. Nevertheless, accurate segmentation of these images is crucial for a spot-on diagnosis by clinical tools. Intensity inhomogeneity often arises in real-world images and presents a substantial challenge in image segmentation. The most widely used image segmentation algorithms are region-based and usually rely on the homogeneity of image intensities in the regions of interest, so they often fail to deliver precise segmentation results under intensity inhomogeneity. This research presents a more accurate segmentation using a gradient-based watershed transform within the level set method for a medical diagnosis system. Experimental results show that our method achieves a much better segmentation accuracy than traditional approaches; the results are also validated in terms of measured properties of image regions such as eccentricity and perimeter.
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CARE (ijistjournal)
Dichotomous data is a type of categorical data that is binary, with categories zero and one. Health care data is one of the most heavily used kinds of categorical data. Binary data is the simplest form of data used in health care databases, where close-ended questions can be used; it is very efficient in computation and memory capacity for representing categorical data. Clustering health care or medical data is very tedious due to its complex data representation models, high dimensionality and data sparsity. In this paper, clustering is performed after transforming the dichotomous data into real values by the Wiener transformation. The proposed algorithm can be used to determine correlations between the health disorders and symptoms observed in large medical and health binary databases. Computational results show that clustering based on the Wiener transformation is very efficient in terms of objectivity and subjectivity.
An optimized approach for extensive segmentation and classification of brain ... (IJECEIAES)
With the significant contribution of medical image processing to the effective diagnosis of critical health conditions in humans, various methods and techniques have evolved for abnormality detection and classification. A review of existing approaches highlights that a substantial amount of work addresses the detection and segmentation process, with less effective modelling of the classification problem. This manuscript discusses a simple and robust technique that offers a comprehensive segmentation process as well as a classification process using an Artificial Neural Network. Unlike existing approaches, the study offers finer-grained foreground/background indexing in its comprehensive segmentation process, and introduces a unique morphological operation along with a graph-belief network, ensuring approximately 99% accuracy for the proposed system in contrast to existing learning schemes.
A BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATA (IJSCAI Journal)
Advancements in information technology have had a major impact on medical science, where researchers come up with new ideas for improving the classification rate of various diseases. Breast cancer is one such disease, killing a large number of people around the world, and diagnosing it at the earliest instance makes a huge impact on its treatment. The authors propose a hybrid model of a Binary Bat Algorithm (BBA) and a Feedforward Neural Network (FNN), where the advantages of BBA and the efficiency of FNN are exploited to classify three benchmark breast cancer datasets into malignant and benign cases. Here, BBA is used with a V-shaped hyperbolic tangent transfer function for training the network, and a fitness function is used for error minimization. FNN-BBA based classification produces 92.61% accuracy on training data and 89.95% on testing data.
Distributed Digital Artifacts on the Semantic Web (Editor IJCATR)
Distributed digital artifacts incorporate cryptographic hash values into URIs, called trusty URIs, in a distributed environment, building high-quality, verifiable and immutable web resources to prevent the rising man-in-the-middle attack. The greatest challenge of a centralized system is that it gives users no way to check whether data has been modified, and communication is limited to a single server. The solution is a distributed digital artifact system, where resources are distributed among different domains to enable inter-domain communication. With emerging developments on the web, attacks have increased rapidly, among which the man-in-the-middle attack (MIMA) is a serious issue that puts user security under threat. This work tries to prevent MIMA to an extent by providing self-referencing trusty URIs even in a distributed environment. Any manipulation of the data is efficiently identified, and any further access to that data is blocked by informing the user that the uniform location has changed. The system uses self-reference so that each resource contains its trusty URI, a lineage algorithm for generating the seed, and the SHA-512 hash generation algorithm to ensure security. It is implemented on the semantic web, an extension of the world wide web, using RDF (Resource Description Framework) to identify resources. The framework was thus developed to overcome existing challenges by distributing the digital artifacts on the semantic web, enabling secure communication between different domains across the network and thereby preventing MIMA.
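The core trusty-URI idea, appending a content hash to a resource URI so that tampering is detectable from the URI alone, can be sketched with Python's standard hashlib. This is a simplification: the actual trusty URI scheme encodes the digest differently (Base64 with a module prefix), and the sketch omits the lineage-based seed generation:

```python
import hashlib

def make_trusty_uri(base_uri, content):
    """Embed a SHA-512 digest of the resource's content in its URI, so the
    URI itself is enough to detect later tampering."""
    digest = hashlib.sha512(content).hexdigest()
    return base_uri + "#" + digest

def verify(trusty_uri, content):
    """Recompute the digest and compare it with the one carried by the URI."""
    _, _, embedded = trusty_uri.partition("#")
    return hashlib.sha512(content).hexdigest() == embedded

uri = make_trusty_uri("https://example.org/artifact/42", b"original RDF triples")
print(verify(uri, b"original RDF triples"))   # True
print(verify(uri, b"tampered RDF triples"))   # False
```

Because the digest travels with the reference rather than with the resource, any server in the distributed system can hand out the content while any client can still detect a man-in-the-middle modification.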
Front End Data Cleaning And Transformation In Standard Printed Form Using Neu... (ijcsa)
Collecting data at the front end and loading it into a database manually can introduce errors into data sets and is a very time-consuming process. Scanning a data document as an image and recognizing the corresponding information in that image is a possible solution to this challenge. This paper presents an automated solution to the problem of data cleansing and recognition of user-written data, transforming it into standard printed format with the help of artificial neural networks. Three different neural models, namely direct, correlation-based and hierarchical, have been developed to handle this task. The solution is developed to validate the proposed logic in a very hostile input environment.
Classification of medical datasets using back propagation neural network powe... (IJECEIAES)
Classification is one of the most indispensable domains in data mining and machine learning. The classification process has a good reputation in computer-based disease diagnosis, where progress in smart computer technologies can be invested in diagnosing various diseases from data of real patients documented in databases. The paper introduces a methodology for diagnosing a set of diseases including two types of cancer (breast and lung), two diabetes datasets, and heart attack. A back propagation neural network plays the role of classifier. The performance of the neural net is enhanced by a genetic algorithm, which provides the classifier with the optimal features to raise the classification rate as high as possible. The system showed high efficiency in dealing with databases that differ in size, number of features and nature of data, as the results illustrate: the classification rate reached 100% on most datasets.
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM: ... (IJNSA Journal)
In health research, one of the major tasks is to retrieve and analyze heterogeneous databases containing a single patient's information gathered from a large volume of data over a long period of time. The main objective of this paper is to present our ontology-based information retrieval approach for a clinical information system. We performed a case study in a real-life hospital setting. The results obtained illustrate the feasibility of the proposed approach, which significantly improved the information retrieval process on a large volume of data over a long period, from August 2011 until January 2012.
Classification Of Iris Plant Using Feedforward Neural Network (irjes)
The classification and recognition of a type on the basis of individual features and behaviors constitute a preliminary measure and an important goal in the behavioral sciences. Current statistical methods do not always yield satisfactory answers. A feed-forward artificial neural network is a computer model inspired by the structure of the human brain, viewed as a set of artificial nerve cells interconnected with other neurons. The primary aim of this paper is to demonstrate the process of developing an artificial-neural-network-based classifier for the Iris database. The problem concerns the identification of iris plant species on the basis of plant attribute measurements. The paper uses feed-forward neural networks to identify iris plants from four measurements: sepal length, sepal width, petal length and petal width. Using this data set, a neural network (NN) is applied to the classification of the iris data, and the error back-propagation algorithm (EBPA) is used for training. Simulation results illustrate the effectiveness of the neural system in iris class identification.
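Error back-propagation training of a small feed-forward network can be sketched as follows, using an invented two-feature, two-class subset in the spirit of the iris measurements (the real task has four features and three classes, and the hidden-layer size here is arbitrary):

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Invented two-class data: (petal_length, petal_width) pairs, label 0 or 1.
data = [((1.4, 0.2), 0), ((1.3, 0.2), 0), ((1.5, 0.3), 0),
        ((4.7, 1.4), 1), ((4.5, 1.5), 1), ((5.0, 1.7), 1)]

H = 3  # hidden units
w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
w2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0

def forward(x):
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
         for ws, b in zip(w1, b1)]
    return h, sigmoid(sum(w * hi for w, hi in zip(w2, h)) + b2)

lr = 0.5
for _ in range(2000):  # error back-propagation over the training set
    for x, y in data:
        h, out = forward(x)
        d_out = (out - y) * out * (1 - out)          # output-layer delta
        for j in range(H):
            d_h = d_out * w2[j] * h[j] * (1 - h[j])  # hidden delta (pre-update w2)
            w2[j] -= lr * d_out * h[j]
            for i in range(2):
                w1[j][i] -= lr * d_h * x[i]
            b1[j] -= lr * d_h
        b2 -= lr * d_out

preds = [round(forward(x)[1]) for x, _ in data]
print(preds)
```

Each pass propagates the output error backwards through the weights, which is exactly the EBPA training loop the abstract refers to, just at toy scale.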
View classification of medical x ray images using pnn classifier, decision tr...eSAT Journals
Abstract: In this era of electronic advancements in the field of medical image processing, the quantum of medical X-ray images so produced exorbitantly can be effectively addressed by means of automated indexing, comparing, analysing and annotating that will really be pivotal to the radiologists in interpreting and diagnosing the diseases. In order to envisage such an objective, it has been humbly endeavoured in this paper by proposing an efficient methodology that takes care of the view classification of the X-ray images for the automated annotation from their vast database, with which the decision making for the physicians and radiologists becomes simpler despite an immeasurable and ever-growing trends of the X-ray images. In this paper, X-ray images of six different classes namely chest, head, foot, palm, spine and neck have been collected. The framework proposed in this paper involves the following: The images are pre-processed using M3 filter and segmentation by Expectation Maximization (EM) algorithm, followed by feature extraction through Discrete Wavelet Transform. The orientation of X-ray images has been performed in this work by comparing among the Probabilistic Neural Network (PNN), Decision Tree algorithm and Support Vector Machine (SVM), while the PNN yields an accuracy of 75%, the Decision Tree with 92.77% and the SVM of 93.33%. Key Words: M3 filter, Expectation Maximaization, Discrete Wavelet Transformation, Probabilistic Neural Network, Decision Tree Algorithm and Support Vector Machine.
The growth of medical informatics can be observed these days. Advances in different medical fields help discover various critical diseases and provide guidelines for their cure. This has been possible only because of well-established medical databases and the automation of the data analysis process. This analysis requires considerable learning and intelligence, for which data mining techniques provide the basis; many such techniques are available, including Decision Tree Induction, Rule-Based Classification, Support Vector Machines, Stochastic Classification, Logistic Regression, Naïve Bayes, Artificial Neural Networks and Fuzzy Logic, and Genetic Algorithms. This paper provides the basics of data mining, surveys its effective techniques available in the medical sciences, and reviews the efforts made on medical databases using data mining techniques for human disease diagnosis.
MRI Image Segmentation Using Gradient Based Watershed Transform In Level Set ...IJERA Editor
Brain image classification is one of the most important parts of clinical investigative tools. Brain images typically contain noise, inhomogeneity and sometimes deviation, so precise segmentation of brain images is a very challenging task. Nevertheless, accurate segmentation of these images is crucial for a correct diagnosis by clinical tools. Intensity inhomogeneity also often arises in real-world images, presenting a substantial challenge in image segmentation. The most widely used image segmentation algorithms are region-based and usually rely on the homogeneity of image intensities in the regions of interest, so they often fail to deliver precise segmentation results in the presence of intensity inhomogeneity. This research presents a more accurate segmentation using a gradient-based watershed transform within the level set method for a medical diagnosis system. Experimental results show that our method achieves a much better segmentation accuracy than traditional approaches; the results are also validated in terms of certain measured properties of the image regions, such as eccentricity and perimeter.
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CAREijistjournal
Dichotomous data is a type of categorical data that is binary, with categories zero and one. Health care data is one of the most heavily used kinds of categorical data. Binary data is the simplest form of data used in health care databases, where close-ended questions can be used; it is very efficient in terms of computational cost and memory capacity for representing categorical data. Clustering health care or medical data is very tedious due to its complex data representation models, high dimensionality and data sparsity. In this paper, clustering is performed after transforming the dichotomous data into real values by the Wiener transformation. The proposed algorithm can be used to determine the correlation of health disorders and symptoms observed in large medical and health binary databases. Computational results show that clustering based on the Wiener transformation is very efficient in terms of objectivity and subjectivity.
An optimized approach for extensive segmentation and classification of brain ...IJECEIAES
With the significant contribution of medical image processing to the effective diagnosis of critical health conditions in humans, various methods and techniques have evolved for abnormality detection and classification. A review of existing approaches highlights that a substantial amount of work has been carried out on detection and segmentation, but modelling for the classification problem has been less effective. This manuscript discusses a simple and robust technique that offers a comprehensive segmentation process as well as a classification process using an Artificial Neural Network. Unlike existing approaches, the study offers more granularity in foreground/background indexing within its comprehensive segmentation process, while introducing a unique morphological operation along with a graph-belief network, ensuring approximately 99% accuracy for the proposed system in contrast to existing learning schemes.
A BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATAIJSCAI Journal
Advancements in information technology have made a major impact on medical science, where researchers come up with new ideas for improving the classification rate of various diseases. Breast cancer is one such disease, killing a large number of people around the world. Diagnosing the disease at its earliest instance makes a huge impact on its treatment. The authors propose a hybrid model combining the Binary Bat Algorithm (BBA) with a Feedforward Neural Network (FNN), in which the advantages of BBA and the efficiency of FNN are exploited to classify three benchmark breast cancer datasets into malignant and benign cases. Here BBA is used to generate a V-shaped hyperbolic tangent function for training the network, and a fitness function is used for error minimization. FNNBBA-based classification produces 92.61% accuracy on training data and 89.95% on testing data.
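The V-shaped hyperbolic-tangent transfer function mentioned above is commonly defined in the binary-metaheuristic literature as |tanh(v)|, used as a bit-flip probability. The sketch below shows that rule; the function names and update rule are illustrative assumptions, not the paper's exact formulation:

```python
import math
import random

def v_shaped(v):
    """V-shaped transfer function |tanh(v)|, as used in binary bat/PSO
    variants: maps a continuous velocity to a bit-flip probability."""
    return abs(math.tanh(v))

def binarize(position_bits, velocities, rng=random.Random(0)):
    """Flip each bit with probability V(velocity) - the usual update
    rule for V-shaped transfer functions in binary metaheuristics."""
    return [b ^ (rng.random() < v_shaped(v))
            for b, v in zip(position_bits, velocities)]

# Large |velocity| -> flip almost surely; zero velocity -> keep the bit.
print(binarize([0, 1, 0], [10.0, 0.0, -10.0]))  # → [1, 1, 1]
```

Because V(v) is symmetric around zero, a bat with no velocity keeps its current bit, which is what distinguishes V-shaped from S-shaped (sigmoid) transfer functions.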
Distributed Digital Artifacts on the Semantic WebEditor IJCATR
Distributed digital artifacts incorporate cryptographic hash values into URIs, called trusty URIs, in a distributed environment, building high-quality, verifiable and unchangeable web resources to prevent the rising man-in-the-middle attack. The greatest challenge of a centralized system is that it gives users no way to check whether data has been modified, and communication is limited to a single server. The solution is a distributed digital artifact system in which resources are distributed among different domains to enable inter-domain communication. Owing to the emerging developments of the web, attacks have increased rapidly, among which the man-in-the-middle attack (MIMA) is a serious issue that puts user security at threat. This work tries to prevent MIMA to an extent by providing self-references and trusty URIs even in a distributed environment. Any manipulation of the data is efficiently identified, and any further access to that data is blocked by informing the user that the uniform location has changed. The system uses a self-reference to contain the trusty URI for each resource, a lineage algorithm for generating the seed, and the SHA-512 hash generation algorithm to ensure security. It is implemented on the Semantic Web, an extension of the World Wide Web, using RDF (Resource Description Framework) to identify resources. The framework was thus developed to overcome existing challenges by making the digital artifacts on the Semantic Web distributed, enabling communication between different domains across the network securely and thereby preventing MIMA.
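The core idea of a trusty URI — a URI that carries a cryptographic hash of the resource it names, so tampering is detectable — can be sketched as follows. The suffix format and helper names here are assumptions for illustration; the actual trusty-URI specification defines its own module codes and encoding:

```python
import base64
import hashlib

def make_trusty_uri(base_uri, content):
    """Append a SHA-512 digest of the content to the URI so any
    later modification of the resource is detectable."""
    digest = hashlib.sha512(content.encode("utf-8")).digest()
    # URL-safe base64, trimmed of padding, serves as the artifact code.
    code = base64.urlsafe_b64encode(digest).decode().rstrip("=")
    return f"{base_uri}.{code}"

def verify_trusty_uri(trusty_uri, content):
    """Recompute the hash and compare it with the code in the URI."""
    base, _, _code = trusty_uri.rpartition(".")
    return make_trusty_uri(base, content) == trusty_uri

uri = make_trusty_uri("http://example.org/artifact/42", "hello resource")
print(verify_trusty_uri(uri, "hello resource"))     # True
print(verify_trusty_uri(uri, "tampered resource"))  # False
```

Any client holding only the URI can verify the content it receives, which is what defeats silent man-in-the-middle modification.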
Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...ijcsa
Manually collecting data and loading it into a database at the front end may introduce errors into data sets and is a very time-consuming process. Scanning a data document as an image and recognizing the corresponding information in that image can be considered a possible solution to this challenge. This paper presents an automated solution to the problem of data cleansing and recognition of user-written data, transforming it into a standard printed format with the help of artificial neural networks. Three different neural models, namely direct, correlation-based and hierarchical, have been developed to handle this issue. The solution is designed to justify the proposed logic even in a very hostile input environment.
Classification of medical datasets using back propagation neural network powe...IJECEIAES
Classification is one of the most indispensable domains in data mining and machine learning. The classification process has a good reputation in the area of disease diagnosis by computer systems, where progress in smart computer technologies can be invested in diagnosing various diseases based on data of real patients documented in databases. The paper introduces a methodology for diagnosing a set of diseases including two types of cancer (breast and lung), two diabetes datasets and a heart attack dataset. A Back Propagation Neural Network plays the role of the classifier. The performance of the neural net is enhanced by a genetic algorithm, which provides the classifier with the optimal features to raise the classification rate as high as possible. The system showed high efficiency in dealing with databases that differ from each other in size, number of features and nature of the data, as the results illustrate: the classification rate reached 100% on most datasets.
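The GA-driven feature selection described above can be sketched as a tiny genetic algorithm over feature bit-masks. The fitness used here is a stand-in (in the paper it would be the BPNN's classification rate), and all names and parameters are illustrative:

```python
import random

def ga_select(n_features, fitness, pop_size=20, generations=40,
              rng=random.Random(1)):
    """Tiny genetic algorithm over feature bit-masks.
    `fitness` scores a mask; higher is better."""
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]              # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)      # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_features)           # point mutation
            child[i] ^= 1
            children.append(child)
        pop = parents + children                    # parents act as elites
    return max(pop, key=fitness)

# Stand-in fitness: reward masks that keep only the first three features.
target = [1, 1, 1, 0, 0, 0, 0, 0]
best = ga_select(8, lambda m: sum(x == t for x, t in zip(m, target)))
print(best)  # typically converges to the target mask
```

Swapping the stand-in fitness for "accuracy of a classifier trained on the masked features" turns this sketch into wrapper-style feature selection of the kind the abstract describes.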
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...IJNSA Journal
In health research, one of the major tasks is to retrieve and analyze heterogeneous databases containing a single patient's information gathered from a large volume of data over a long period of time. The main objective of this paper is to present our ontology-based information retrieval approach for a clinical information system. We performed a case study in a real-life hospital setting. The results obtained illustrate the feasibility of the proposed approach, which significantly improved the information retrieval process over a large volume of data covering a long period, from August 2011 until January 2012.
Classification Of Iris Plant Using Feedforward Neural Networkirjes
The classification and recognition of type on the basis of individual features and behaviors constitutes a preliminary measure and an important target in the behavioral sciences. Current statistical methods do not always yield satisfactory answers. A feedforward artificial neural network is a computer model inspired by the structure of the human brain: it can be viewed as a set of artificial nerve cells interconnected with other neurons. The primary aim of this paper is to demonstrate the process of developing an artificial neural network based classifier for the Iris database. The problem concerns the identification of Iris plant species on the basis of plant attribute measurements: sepal length, sepal width, petal length, and petal width. Using this data set, a neural network (NN) is applied to the classification of the Iris data, and the error back-propagation algorithm (EBPA) is used to train the ANN. Simulation results illustrate the effectiveness of the neural system in Iris class identification.
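The training scheme just described — a feedforward network trained with error back-propagation — can be sketched in a few lines of NumPy. This toy uses synthetic 2-D, two-class data rather than the actual Iris measurements, and the architecture, seed and learning rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the Iris measurements: 2 features, 2 classes.
X = rng.uniform(0, 1, size=(200, 2))
y = (X[:, 0] + X[:, 1] > 1.0).astype(float).reshape(-1, 1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer, trained with plain error back-propagation (EBPA).
W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)
lr = 0.5

for epoch in range(2000):
    h = sigmoid(X @ W1 + b1)            # forward pass
    out = sigmoid(h @ W2 + b2)
    err = out - y                       # output error
    d2 = err * out * (1 - out)          # backprop through output sigmoid
    d1 = (d2 @ W2.T) * h * (1 - h)      # backprop into hidden layer
    W2 -= lr * h.T @ d2 / len(X); b2 -= lr * d2.mean(0)
    W1 -= lr * X.T @ d1 / len(X); b1 -= lr * d1.mean(0)

acc = ((out > 0.5) == (y > 0.5)).mean()
print(f"training accuracy: {acc:.2f}")
```

The real experiment would substitute the four Iris measurements and three species labels, but the forward/backward passes are the same.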
An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...ijsc
As the size of biomedical databases grows day by day, finding the essential features for disease prediction has become more complex due to high dimensionality and sparsity problems. Also, with the large number of microarray datasets available in biomedical repositories, it is difficult to analyze, predict and interpret the feature information using traditional feature-selection-based classification models. Most traditional feature-selection-based classification algorithms face computational issues such as dimension reduction, uncertainty and class imbalance on microarray datasets. The ensemble classifier is a scalable model for extreme learning machines owing to its high efficiency and fast processing speed for real-time applications. The main objective of feature-selection-based ensemble learning models is to classify high-dimensional data with high computational efficiency and a high true positive rate. In the proposed model, an optimized Particle Swarm Optimization (PSO) based ensemble classification model is developed on high-dimensional microarray datasets. Experimental results show that the proposed model achieves higher computational efficiency than traditional feature-selection-based classification models as far as accuracy, true positive rate and error rate are concerned.
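The ensemble idea at the core of such models can be illustrated with the simplest combination rule, majority voting over the base classifiers' predictions. This is only a sketch; the paper's PSO-optimized ensemble is more elaborate:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier prediction lists by majority vote -
    the simplest ensemble combination rule."""
    combined = []
    for votes in zip(*predictions):          # one tuple of votes per sample
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Three base classifiers vote on four samples; the ensemble follows
# the majority even when one model is wrong on a sample.
clf_a = [1, 0, 1, 1]
clf_b = [1, 0, 0, 1]
clf_c = [0, 0, 1, 1]
print(majority_vote([clf_a, clf_b, clf_c]))  # → [1, 0, 1, 1]
```

A PSO step could then weight or select the base classifiers, replacing the uniform vote with an optimized one.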
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...Editor IJCATR
In this paper we focus on some techniques for solving data mining tasks: statistics, decision trees and neural networks. The new approach has succeeded in defining some new criteria for the evaluation process, and it has obtained valuable results covering what each technique is, the environment for using it, its advantages and disadvantages, the consequences of choosing it to extract hidden predictive information from large databases, and its methods of implementation. Finally, the paper presents some valuable recommendations in this field.
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATAcscpconf
Many studies use different data mining techniques to analyze mass spectrometry data and extract useful knowledge about biomarkers. These biomarkers allow medical experts to determine whether an individual has a disease. Some of these studies have proposed models that achieve high accuracy; however, the black-box nature and complexity of the proposed models pose significant issues. Thus, to address this problem and build an accurate model, we use a genetic algorithm for feature selection along with a rule-based classifier, namely the Genetic Rule-Based Classifier algorithm for Mass Spectra data (GRC-MS). According to the literature, rule-based classifiers provide understandable rules but are not accurate, while genetic algorithms have achieved excellent results when used with different classifiers for feature selection. Experiments are conducted on a real dataset, and the proposed classifier GRC-MS achieves 99.7% accuracy. In addition, the generated rules are more understandable than those of other classifier models.
An efficient convolutional neural network-based classifier for an imbalanced ...IAESIJAI
Imbalanced datasets pose a major challenge for researchers addressing machine learning tasks. In these datasets, samples of the different classes are not in equal proportion; rather, the gap between the numbers of individual class samples is significantly large. Classification models perform better on datasets with an equal proportion of data tuples in both classes, but in reality medical image datasets are skewed and hence not always suitable for a model to achieve improved classification performance. Various techniques have therefore been suggested in the literature to overcome this challenge. This paper applies an oversampling technique to an imbalanced dataset and focuses on a customized convolutional neural network model that classifies the images into two categories: diseased and non-diseased. The outcome of the proposed model can assist health experts in the detection of oral cancer. The proposed model exhibits 99% accuracy after data augmentation, with precision, recall and F1-score values very close to 1. In addition, a statistical test is performed to validate the statistical significance of the model. The proposed model is found to be an optimized classifier in terms of the number of network layers and the number of neurons.
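The oversampling step mentioned above can be illustrated with its simplest variant, random oversampling by duplication (an assumption for illustration; the paper pairs oversampling with image augmentation):

```python
import random

def random_oversample(samples, labels, rng=random.Random(0)):
    """Duplicate minority-class samples at random until every
    class is represented as often as the largest one."""
    by_class = {}
    for s, l in zip(samples, labels):
        by_class.setdefault(l, []).append(s)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for s in group + extra:
            out_samples.append(s)
            out_labels.append(label)
    return out_samples, out_labels

# 5 diseased vs 1 non-diseased -> balanced 5 vs 5 after oversampling.
X = ["img1", "img2", "img3", "img4", "img5", "img6"]
y = [1, 1, 1, 1, 1, 0]
Xb, yb = random_oversample(X, y)
print(yb.count(0), yb.count(1))  # → 5 5
```

In an image pipeline the duplicated minority samples would additionally be perturbed (rotations, flips, crops) so the network does not see identical copies.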
Clustering and Classification of Cancer Data Using Soft Computing Technique IOSR Journals
Clustering and classification of cancer data have been used with success in the medical field. In this paper, the K-means and fuzzy C-means algorithms are compared to find the accuracy of their results. The paper addresses the problem of learning to classify cancer data with these two different methods, using information derived from training and testing. It surveys various soft-computing-based classification approaches, compares the classification techniques on this health care data, and presents the accuracy of the results on the cancer data.
An Enhanced Feature Selection Method to Predict the Severity in Brain Tumorijtsrd
The level of severity of a brain tumor is captured through MRI and then assessed by the physician for medical interpretation. The facts behind the MRI images are analyzed by the physician for further medication and follow-up activities. An MRI image is composed of a large volume of features, including irrelevant, missing and uncertain information; it does not always express the facts clearly enough for correct interpretation, and it contains a huge amount of redundant information. A mathematical model known as rough set theory has been applied to resolve this problem by eliminating the redundancy in medical image data. This paper uses a rough set method to find the severity level of the brain tumor in a given MRI image. Rough set feature selection algorithms are applied to the medical image data to select the prominent features, which improve the classification accuracy of the brain tumor and deliver a set of decision rules for the classification task. A search method based on particle swarm optimization is proposed for minimizing the attribute set, and the approach is compared with an existing rough set reduction algorithm for accuracy. The reducts originating from the proposed algorithm are more efficient and can generate decision rules that better classify the tumor types. The rule-based method provided by the rough set approach delivers higher classification accuracy than other smart methods such as fuzzy rule extraction, neural networks, decision trees and fuzzy networks like Fuzzy Min-Max Neural Networks.
Parthiban J and Dr. B. Sathees Kumar, "An Enhanced Feature Selection Method to Predict the Severity in Brain Tumor", International Journal of Trend in Scientific Research and Development (IJTSRD), ISSN 2456-6470, Volume 3, Issue 5, August 2019. URL: https://www.ijtsrd.com/papers/ijtsrd26802.pdf Paper URL: https://www.ijtsrd.com/computer-science/data-miining/26802/an-enhanced-feature-selection-method-to-predict-the-severity-in-brain-tumor/parthiban-j
Chronic kidney disease prediction is one of the most important issues in healthcare analytics, and prediction in the medical field is among the most interesting and challenging tasks in day-to-day life. In this paper, we employ machine learning techniques for predicting chronic kidney disease using clinical data, using algorithms such as the Decision Tree (DT) and Naive Bayes (NB) algorithms. The performance of these models is compared in order to select the best classifier for predicting chronic kidney disease on the given dataset.
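The Naive Bayes side of such a comparison can be sketched with a minimal Gaussian Naive Bayes classifier on toy data. The real study runs on clinical kidney-disease records; everything below (class names, data, the class itself) is illustrative:

```python
import math
from collections import defaultdict

class TinyGaussianNB:
    """Minimal Gaussian Naive Bayes: per-class feature means and
    variances plus a class prior, combined by Bayes' rule in log space."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for row, label in zip(X, y):
            groups[label].append(row)
        self.stats, n = {}, len(X)
        for label, rows in groups.items():
            means = [sum(col) / len(rows) for col in zip(*rows)]
            vars_ = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                     for col, m in zip(zip(*rows), means)]
            self.stats[label] = (math.log(len(rows) / n), means, vars_)
        return self

    def predict_one(self, x):
        def log_posterior(label):
            prior, means, vars_ = self.stats[label]
            return prior + sum(
                -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                for xi, m, v in zip(x, means, vars_))
        return max(self.stats, key=log_posterior)

# Two well-separated clusters: the model recovers the right class.
X = [[1.0, 1.1], [0.9, 1.0], [1.1, 0.9],
     [5.0, 5.1], [5.1, 4.9], [4.9, 5.0]]
y = [0, 0, 0, 1, 1, 1]
model = TinyGaussianNB().fit(X, y)
print(model.predict_one([1.0, 1.0]))  # → 0
print(model.predict_one([5.0, 5.0]))  # → 1
```

A decision tree would carve the same space with axis-aligned splits; comparing the two on held-out data is exactly the experiment the abstract describes.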
An efficient feature selection algorithm for health care data analysisjournalBEEI
Diabetes is a silent killer that will slowly kill a person if it goes undetected. Existing systems that use the F-score method and K-means clustering to check whether a person has diabetes are not 100% accurate, and anything less than 100% is not acceptable in the medical field, as it could cost the lives of many people. Our proposed system aims to use some of the best features of the existing algorithms to predict diabetes; this research work combines these features into a novel algorithm that aims to be 100% accurate in its prediction. With the surge in technological advancements, we can use data mining to predict when a person will be diagnosed with diabetes. Specifically, we analyze the best features of the chi-square algorithm and the advanced clustering algorithm (ACA). This research work uses the Pima Indian Diabetes dataset provided by the National Institute of Diabetes and Digestive and Kidney Diseases. Using classification theorems and methods, we consider factors such as age, BMI and blood pressure and the importance given to these attributes overall, single these attributes out, and use them for the prediction of diabetes.
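The chi-square feature-ranking step can be sketched for binary features and a binary label: a feature that tracks the class receives a high statistic, while a constant feature scores zero. The data below is a toy stand-in, not the Pima dataset:

```python
def chi_square_score(feature, labels):
    """Chi-square statistic between a binary feature and a binary
    label, computed from the 2x2 contingency table."""
    n = len(feature)
    obs = {(f, l): 0 for f in (0, 1) for l in (0, 1)}
    for f, l in zip(feature, labels):
        obs[(f, l)] += 1
    f_tot = {f: obs[(f, 0)] + obs[(f, 1)] for f in (0, 1)}
    l_tot = {l: obs[(0, l)] + obs[(1, l)] for l in (0, 1)}
    score = 0.0
    for f in (0, 1):
        for l in (0, 1):
            expected = f_tot[f] * l_tot[l] / n
            if expected:  # skip empty rows/columns
                score += (obs[(f, l)] - expected) ** 2 / expected
    return score

# A feature that tracks the label perfectly outranks a constant one.
labels     = [0, 0, 0, 0, 1, 1, 1, 1]
relevant   = [0, 0, 0, 0, 1, 1, 1, 1]
irrelevant = [1, 1, 1, 1, 1, 1, 1, 1]
print(chi_square_score(relevant, labels))    # → 8.0
print(chi_square_score(irrelevant, labels))  # → 0.0
```

Ranking real-valued attributes such as age or BMI would first discretize them into bins before building the contingency table.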
Similar to PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFERENT METHODS OF CLASSIFICATION IN DATA MINING (20)
ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR cscpconf
The progressive development of Synthetic Aperture Radar (SAR) systems has diversified the exploitation of the images they generate in different applications of geoscience. The detection and monitoring of surface deformations produced by various phenomena have benefited from this evolution and have been realized by interferometry (InSAR) and differential interferometry (DInSAR) techniques. Nevertheless, spatial and temporal decorrelation of the interferometric pairs used strongly limits the precision of the results of these techniques. In this context, we propose a methodological approach for detecting and analyzing surface deformation using differential interferograms, showing the limits of this technique according to noise quality and level. The detectability model is generated from deformation signatures by simulating a linear fault merged into image pairs from the ERS1/ERS2 sensors acquired over a region of the Algerian south.
4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATIONcscpconf
A novel trajectory-guided, concatenative approach for synthesizing high-quality video renders from real image samples is proposed. The automated lip-reading system seeks, in a library of video data, the real image sample sequence closest to the HMM-predicted trajectory. The object trajectory is obtained by projecting the face patterns into a KDA feature space. The approach identifies a speaker's face by synthesizing the identity surface of the subject's face from a small sample of patterns that sparsely cover the view sphere. A KDA algorithm is used to discriminate the lip-reading images, after which the low-dimensional fundamental lip feature vector is reduced using the 2D-DCT. The dimensionality of the mouth area set is normally reduced based on PCA to obtain the eigen-lips approach proposed by [33]. The subjective performance results of the cost function under the automatic lip-reading model did not illustrate superior performance of the method.
MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...cscpconf
Universities offer a software engineering capstone course to simulate a real-world working environment in which students can work in a team for a fixed period to deliver a quality product. The objective of this paper is to report on our experience in moving from the Waterfall process to an Agile process in conducting the software engineering capstone project. We present the capstone course designs for both Waterfall-driven and Agile-driven methodologies, highlighting the structure, deliverables and assessment plans. To evaluate the improvement, we surveyed two different sections taught by two different instructors about the students' experience of moving from the traditional Waterfall model to an Agile-like process. Twenty-eight students filled in the survey, which consisted of eight multiple-choice questions and an open-ended question to collect feedback. The survey results show that students were able to gain hands-on experience simulating a real-world working environment. The results also show that the Agile approach helped students achieve an overall better design and avoid mistakes they had made in the initial design completed in the first phase of the capstone project. In addition, they were able to assess their team capabilities and training needs, and thus learn the required technologies earlier, which is reflected in the final product quality.
PROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIEScscpconf
Using social media in education provides learners with an informal way of communicating. Informal communication tends to remove barriers and hence promotes student engagement. This paper presents our experience of using three different social media technologies in teaching a software project management course. We conducted surveys at the end of every semester to evaluate students' satisfaction and engagement. Results show that using social media enhances students' engagement and satisfaction; however, familiarity with the tool is an important factor in student satisfaction.
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICcscpconf
Using a computer to answer questions has been a human dream since the beginning of the digital era. Question-answering systems are referred to as intelligent systems that can provide responses to questions asked by the user, based on facts or rules stored in a knowledge base, and can generate answers to questions asked in natural language. One of the first main ideas of fuzzy logic was to work on the problem of computer understanding of natural language. This survey paper therefore provides an overview of what question answering is, its system architecture, and its possible relationship with and differences from fuzzy logic, as well as previous related research with respect to the approaches followed. At the end, the survey provides an analytical discussion of the proposed QA models, alone or combined with fuzzy logic, and their main contributions and limitations.
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS cscpconf
Human beings generate different speech waveforms while speaking the same word at different times. Also, different human beings have different accents and generate significantly varying speech waveforms for the same word. There is a need to measure the distances between various words to facilitate the preparation of pronunciation dictionaries. A new algorithm called Dynamic Phone Warping (DPW) is presented in this paper. It uses dynamic programming techniques for global alignment and shortest-distance measurement. The DPW algorithm can be used to enhance the pronunciation dictionaries of well-known languages like English or to build pronunciation dictionaries for lesser-known sparse languages. Precision measurement experiments show 88.9% accuracy.
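The global-alignment distance that DPW computes can be sketched with a standard dynamic-programming recurrence over phone sequences. The unit costs and example pronunciations below are illustrative assumptions; DPW's actual cost model may differ:

```python
def phone_distance(seq_a, seq_b, sub_cost=1.0, indel_cost=1.0):
    """Global alignment distance between two phone sequences via
    dynamic programming (edit-distance style recurrence)."""
    n, m = len(seq_a), len(seq_b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel_cost
    for j in range(1, m + 1):
        d[0][j] = j * indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = 0.0 if seq_a[i - 1] == seq_b[j - 1] else sub_cost
            d[i][j] = min(d[i - 1][j - 1] + match,   # substitute / match
                          d[i - 1][j] + indel_cost,  # delete
                          d[i][j - 1] + indel_cost)  # insert
    return d[n][m]

# Two pronunciations of "tomato" (ARPAbet-style, illustrative).
print(phone_distance("T AH M EY T OW".split(),
                     "T AH M AA T OW".split()))  # → 1.0
print(phone_distance("K AE T".split(), "K AE T".split()))  # → 0.0
```

Replacing the flat substitution cost with phone-similarity weights would make close phones (e.g. two vowels) cheaper to substitute than distant ones.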
INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS cscpconf
In education, the use of electronic (E) examination systems is not a novel idea, as E-examination systems have been used to conduct objective assessments for the last few years. This research deals with randomly designed E-examinations and proposes an E-assessment system that can be used for subjective questions. The system assesses answers to subjective questions by finding a matching ratio between the keywords in the instructor's and the student's answers; the matching ratio is computed from semantic and document similarity. The assessment system is composed of four modules: preprocessing, keyword expansion, matching, and grading. A survey and a case study were used in the research design to validate the proposed system. The examination assessment system will help instructors save time, costs, and resources, while increasing efficiency and improving the productivity of exam setting and assessment.
TWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTICcscpconf
African Buffalo Optimization (ABO) is one of the most recent swarm intelligence based metaheuristics, inspired by the buffalo's behavior and lifestyle. Unfortunately, the standard ABO algorithm is designed only for continuous optimization problems. In this paper, the authors propose two discrete binary ABO algorithms to deal with binary optimization problems. The first version (called SBABO) uses the sigmoid function and a probability model to generate binary solutions; the second (called LBABO) uses logical operators to manipulate the binary solutions. Computational results on two knapsack problem (KP and MKP) instances show the effectiveness of the proposed algorithms and their ability to achieve good and promising solutions.
DETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAINcscpconf
In recent years, many malware writers have relied on Dynamic Domain Name Services (DDNS) to maintain their Command and Control (C&C) network infrastructure and ensure a persistent presence on a compromised host. Among the various DDNS techniques, the Domain Generation Algorithm (DGA) is often perceived as the most difficult to detect using traditional methods. This paper presents an approach for detecting DGA using frequency analysis of the character distribution and weighted scores of the domain names. The approach's feasibility is demonstrated using a range of legitimate domains and a number of malicious algorithmically generated domain names. Findings from this study show that domain names made up of English characters "a-z" achieving a weighted score of < 45 are often associated with DGA. When a weighted score of < 45 is applied to the Alexa one million list of domain names, only 15% of the domain names were treated as non-human generated.
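The weighted-score idea can be sketched by weighting each character by its frequency in ordinary English text: human-registered names tend to use common letters, while DGA output looks closer to uniform noise. The weight table, scaling, and example domains below are assumptions for illustration; the paper's exact scoring scheme and its threshold of 45 are defined on its own scale:

```python
# Approximate relative frequency (%) of letters in English text,
# used here as per-character weights (an assumed stand-in for the
# paper's scoring table).
ENGLISH_FREQ = {
    'e': 12.7, 't': 9.1, 'a': 8.2, 'o': 7.5, 'i': 7.0, 'n': 6.7,
    's': 6.3, 'h': 6.1, 'r': 6.0, 'd': 4.3, 'l': 4.0, 'c': 2.8,
    'u': 2.8, 'm': 2.4, 'w': 2.4, 'f': 2.2, 'g': 2.0, 'y': 2.0,
    'p': 1.9, 'b': 1.5, 'v': 1.0, 'k': 0.8, 'j': 0.15, 'x': 0.15,
    'q': 0.1, 'z': 0.07,
}

def weighted_score(domain):
    """Average per-character weight of a domain label: English-like
    names score high, uniformly random strings score low."""
    letters = [c for c in domain.lower() if c in ENGLISH_FREQ]
    if not letters:
        return 0.0
    return 10 * sum(ENGLISH_FREQ[c] for c in letters) / len(letters)

print(weighted_score("netflix"))   # human-registered: high score
print(weighted_score("xqzvkjqx"))  # DGA-looking: low score
```

A detector would then flag labels whose score falls below a tuned threshold, exactly the role the < 45 cut-off plays in the paper.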
GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...cscpconf
The amount of piracy in streaming digital content in general, and in the music industry in particular, poses a real challenge to digital content owners. This paper presents a DRM solution for monetizing, tracking and controlling online streaming content across platforms for IP-enabled devices. The paper benefits from current advances in blockchain and cryptocurrencies. Specifically, it presents the Global Music Asset Assurance (GoMAA) digital currency and the iMediaStreams blockchain to enable the secure dissemination and tracking of streamed content. The proposed solution gives the data owner the ability to control the flow of information even after it has been released, by creating a secure, self-installed, cross-platform reader located in the digital content file header. The proposed system gives content owners options to manage their digital information (audio, video, speech, etc.), including tracking of the most consumed segments, once it is released. The system benefits from token distribution between the content owner (music bands), the content distributor (online radio stations) and the content consumer (fans) on the system blockchain.
IMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEMcscpconf
This paper discusses the importance of verb suffix mapping in a discourse translation system. In discourse translation, the crucial steps are anaphora resolution and generation. In anaphora resolution, cohesion links such as pronouns are identified between portions of text. These binders make the text cohesive by referring to nouns appearing in the sentences before or after them. In machine translation systems, to convert source language sentences into meaningful target language sentences, the verb suffixes should be changed according to the cohesion links identified. This step of the translation process is emphasized in the present paper; specifically, the discussion covers how verbs change according to their subjects and anaphors. To explain the concept, English is used as the source language (SL) and the Indian language Telugu as the target language (TL).
EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...cscpconf
In this paper, based on the definition of the conformable fractional derivative, the functional variable method (FVM) is proposed to seek exact traveling wave solutions of two higher-dimensional space-time fractional KdV-type equations in mathematical physics, namely the (3+1)-dimensional space-time fractional Zakharov-Kuznetsov (ZK) equation and the (2+1)-dimensional space-time fractional Generalized Zakharov-Kuznetsov-Benjamin-Bona-Mahony (GZK-BBM) equation. Some new solutions are procured and depicted. These solutions, which include kink-shaped, singular kink, bell-shaped soliton, singular soliton and periodic wave solutions, have many potential applications in mathematical physics and engineering. The simplicity and reliability of the proposed method are verified.
AUTOMATED PENETRATION TESTING: AN OVERVIEWcscpconf
The using of information technology resources is rapidly increasing in organizations,
businesses, and even governments, that led to arise various attacks, and vulnerabilities in the
field. All resources make it a must to do frequently a penetration test (PT) for the environment
and see what can the attacker gain and what is the current environment's vulnerabilities. This
paper reviews some of the automated penetration testing techniques and presents its
enhancement over the traditional manual approaches. To the best of our knowledge, it is the
first research that takes into consideration the concept of penetration testing and the standards
in the area.This research tackles the comparison between the manual and automated
penetration testing, the main tools used in penetration testing. Additionally, compares between
some methodologies used to build an automated penetration testing platform.
CLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORKcscpconf
Since the mid of 1990s, functional connectivity study using fMRI (fcMRI) has drawn increasing
attention of neuroscientists and computer scientists, since it opens a new window to explore
functional network of human brain with relatively high resolution. BOLD technique provides
almost accurate state of brain. Past researches prove that neuro diseases damage the brain
network interaction, protein- protein interaction and gene-gene interaction. A number of
neurological research paper also analyse the relationship among damaged part. By
computational method especially machine learning technique we can show such classifications.
In this paper we used OASIS fMRI dataset affected with Alzheimer’s disease and normal
patient’s dataset. After proper processing the fMRI data we use the processed data to form
classifier models using SVM (Support Vector Machine), KNN (K- nearest neighbour) & Naïve
Bayes. We also compare the accuracy of our proposed method with existing methods. In future,
we will other combinations of methods for better accuracy.
VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...cscpconf
In order to treat and analyze real datasets, fuzzy association rules have been proposed. Several
algorithms have been introduced to extract these rules. However, these algorithms suffer from
the problems of utility, redundancy and large number of extracted fuzzy association rules. The
expert will then be confronted with this huge amount of fuzzy association rules. The task of
validation becomes fastidious. In order to solve these problems, we propose a new validation
method. Our method is based on three steps. (i) We extract a generic base of non redundant
fuzzy association rules by applying EFAR-PN algorithm based on fuzzy formal concept analysis.
(ii) we categorize extracted rules into groups and (iii) we evaluate the relevance of these rules
using structural equation model.
PROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATAcscpconf
In many applications of data mining, class imbalance is noticed when examples in one class are
overrepresented. Traditional classifiers result in poor accuracy of the minority class due to the
class imbalance. Further, the presence of within class imbalance where classes are composed of
multiple sub-concepts with different number of examples also affect the performance of
classifier. In this paper, we propose an oversampling technique that handles between class and
within class imbalance simultaneously and also takes into consideration the generalization
ability in data space. The proposed method is based on two steps- performing Model Based
Clustering with respect to classes to identify the sub-concepts; and then computing the
separating hyperplane based on equal posterior probability between the classes. The proposed
method is tested on 10 publicly available data sets and the result shows that the proposed
method is statistically superior to other existing oversampling methods.
CHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCHcscpconf
Data collection is an essential, but manpower intensive procedure in ecological research. An
algorithm was developed by the author which incorporated two important computer vision
techniques to automate data cataloging for butterfly measurements. Optical Character
Recognition is used for character recognition and Contour Detection is used for imageprocessing.
Proper pre-processing is first done on the images to improve accuracy. Although
there are limitations to Tesseract’s detection of certain fonts, overall, it can successfully identify
words of basic fonts. Contour detection is an advanced technique that can be utilized to
measure an image. Shapes and mathematical calculations are crucial in determining the precise
location of the points on which to draw the body and forewing lines of the butterfly. Overall,
92% accuracy were achieved by the program for the set of butterflies measured.
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...cscpconf
Smart cities utilize Internet of Things (IoT) devices and sensors to enhance the quality of the city
services including energy, transportation, health, and much more. They generate massive
volumes of structured and unstructured data on a daily basis. Also, social networks, such as
Twitter, Facebook, and Google+, are becoming a new source of real-time information in smart
cities. Social network users are acting as social sensors. These datasets so large and complex
are difficult to manage with conventional data management tools and methods. To become
valuable, this massive amount of data, known as 'big data,' needs to be processed and
comprehended to hold the promise of supporting a broad range of urban and smart cities
functions, including among others transportation, water, and energy consumption, pollution
surveillance, and smart city governance. In this work, we investigate how social media analytics
help to analyze smart city data collected from various social media sources, such as Twitter and
Facebook, to detect various events taking place in a smart city and identify the importance of
events and concerns of citizens regarding some events. A case scenario analyses the opinions of
users concerning the traffic in three largest cities in the UAE
SOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGEcscpconf
The anonymity of social networks makes it attractive for hate speech to mask their criminal
activities online posing a challenge to the world and in particular Ethiopia. With this everincreasing
volume of social media data, hate speech identification becomes a challenge in
aggravating conflict between citizens of nations. The high rate of production, has become
difficult to collect, store and analyze such big data using traditional detection methods. This
paper proposed the application of apache spark in hate speech detection to reduce the
challenges. Authors developed an apache spark based model to classify Amharic Facebook
posts and comments into hate and not hate. Authors employed Random forest and Naïve Bayes
for learning and Word2Vec and TF-IDF for feature selection. Tested by 10-fold crossvalidation,
the model based on word2vec embedding performed best with 79.83%accuracy. The
proposed method achieve a promising result with unique feature of spark for big data.
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXTcscpconf
This article presents Part of Speech tagging for Nepali text using General Regression Neural
Network (GRNN). The corpus is divided into two parts viz. training and testing. The network is
trained and validated on both training and testing data. It is observed that 96.13% words are
correctly being tagged on training set whereas 74.38% words are tagged correctly on testing
data set using GRNN. The result is compared with the traditional Viterbi algorithm based on
Hidden Markov Model. Viterbi algorithm yields 97.2% and 40% classification accuracies on
training and testing data sets respectively. GRNN based POS Tagger is more consistent than the
traditional Viterbi decoding technique.
Operation “Blue Star” is the only event in the history of Independent India where the state went into war with its own people. Even after about 40 years it is not clear if it was culmination of states anger over people of the region, a political game of power or start of dictatorial chapter in the democratic setup.
The people of Punjab felt alienated from main stream due to denial of their just demands during a long democratic struggle since independence. As it happen all over the word, it led to militant struggle with great loss of lives of military, police and civilian personnel. Killing of Indira Gandhi and massacre of innocent Sikhs in Delhi and other India cities was also associated with this movement.
How to Create Map Views in the Odoo 17 ERPCeline George
The map views are useful for providing a geographical representation of data. They allow users to visualize and analyze the data in a more intuitive manner.
Read| The latest issue of The Challenger is here! We are thrilled to announce that our school paper has qualified for the NATIONAL SCHOOLS PRESS CONFERENCE (NSPC) 2024. Thank you for your unwavering support and trust. Dive into the stories that made us stand out!
We all have good and bad thoughts from time to time and situation to situation. We are bombarded daily with spiraling thoughts(both negative and positive) creating all-consuming feel , making us difficult to manage with associated suffering. Good thoughts are like our Mob Signal (Positive thought) amidst noise(negative thought) in the atmosphere. Negative thoughts like noise outweigh positive thoughts. These thoughts often create unwanted confusion, trouble, stress and frustration in our mind as well as chaos in our physical world. Negative thoughts are also known as “distorted thinking”.
Instructions for Submissions thorugh G- Classroom.pptxJheel Barad
This presentation provides a briefing on how to upload submissions and documents in Google Classroom. It was prepared as part of an orientation for new Sainik School in-service teacher trainees. As a training officer, my goal is to ensure that you are comfortable and proficient with this essential tool for managing assignments and fostering student engagement.
The Art Pastor's Guide to Sabbath | Steve ThomasonSteve Thomason
What is the purpose of the Sabbath Law in the Torah. It is interesting to compare how the context of the law shifts from Exodus to Deuteronomy. Who gets to rest, and why?
Computer Science & Information Technology (CS & IT)
There are different classification methods among data mining techniques [2]. Some of them, such as linear and logistic regression models, depend deeply on underlying theoretical assumptions; these are called parametric methods. Others, such as artificial neural networks, decision trees and K-nearest neighbour, are assumption free.
Recently, nonparametric data mining methods, which do not depend on theoretical distributional assumptions, have received more attention in practice. Real datasets seldom follow the underlying theoretical assumptions of parametric methods, and clinical datasets are such an example. The avoidance of ideal theoretical assumptions on one hand, and the variability in the nature of clinical data and their vague relations on the other, cause this favourability. The attribute nature of biological data and their vague relations do not fit the theoretical distributional assumptions of parametric methods. For instance, a dichotomous attribute in logistic regression analysis, which is modelled based on the other attributes, should follow a Bernoulli theoretical distribution. The independence of the model's error terms, the independence of the other attributes in the model, and a sufficient sample size are further ideal theoretical assumptions of these modelling methods. The application of such methods, and the accuracy of their results, depend on the fulfilment of these ideal assumptions. In comparison, other methods such as artificial neural networks and decision trees learn from a set of existing prototypes without any specific underlying assumptions. The relations among the attributes are discovered from one part of the dataset (the training set), and the parameters are estimated in such a way that the prediction error is minimized. Then, the predictive power of the model is evaluated on the other part of the dataset (the testing set).
These two methods have advantages and disadvantages of their own [3]. In general, decision trees require much less training time than neural networks. However, unlike neural networks, decision tree techniques are more sensitive to noise: the hidden layers in a neural network discover complicated relations in the data and assign more weight to the important attributes. Nevertheless, their performance depends on the dataset. For some data, much more time and complicated computing are needed to train a neural network, or the results are hardly interpretable by the experts of that field. For some other data, decision tree techniques are not able to discover reasonable relations. There are studies which compare the abilities of these two methods, and also compare them with logistic regression analysis, in different research fields; for instance, see [4, 5, 6].
In this paper, we compare a feed-forward back propagation neural network with a well-known decision tree algorithm, namely J48, on our clinical dataset. A feed-forward back propagation neural network repeatedly examines all the training data in the process of updating its weights [7]. The mentioned decision tree learning algorithm recursively partitions the training data into ever smaller subsets on which a test is made [8].
2. METHODS AND MATERIALS
In this section, the three applied classification methods are described briefly. Then, the clinical diagnosis problem and the patients' population used in the practical part of the study are explained at the end.
2.1. Artificial Neural Networks
Neural networks are a branch of artificial intelligence. Their models are inspired by the neural system of the human brain, and they have been applied in many research fields such as biology, psychology, statistics, mathematics, medical science and computer science, as well as a variety of business areas like finance, management and decision making, marketing and production [9]. Recently, artificial neural networks (ANNs) have become a very popular model for diagnosing disease. However, ANNs have both disadvantages and advantages for medical analysis. Their
discrimination power, discovery of complex and nonlinear relationships among the attributes, and prediction of cases are the most important advantages of ANNs. Nevertheless, they can be over-fitted to the training data, and time consuming because of their computational requirements [9]. The selection of an appropriate training algorithm, transfer functions, initial values of the network weights, and also the number of parameters and hidden layers defining the network size, determines the performance of an ANN. Compared to logistic regression analysis, however, neural network models are more flexible [10].
In this study, we use a type of neural network, namely a feed-forward network with the back propagation algorithm, to model the relations among attributes in our clinical dataset. A feed-forward back propagation neural network is a supervised network; that is, it uses training and testing data to build a model. The data involve a set of input attributes with their corresponding output. The network uses the training data to "learn" how to predict the known output, and the testing data are used for validation. The aim is to predict the output for any given inputs in such a way that the distance between the observed and predicted outputs is minimized. The algorithm repeatedly examines all the training data to update its weights. These weights are adjusted during training, while the signal itself flows only in the forward direction through the network, without any feedback loops [9].
The simplified process for training a feed-forward back propagation network is as follows [9]:
1. Input data is entered to the network and propagated through the network until it reaches
the output layer. This forward process produces a predicted output.
2. The predicted output is subtracted from the actual output and an error value for the
network is calculated.
3. The neural network then uses supervised learning, which in most cases is back
propagation, to train the network. Back propagation is a learning algorithm for adjusting
the weights.
4. Once back propagation has finished, the forward process starts again, and this cycle is
continued until the error between predicted and actual outputs is minimized.
Equations 1 to 3 summarize the formulations as follows:

a^{m+1} = f^{m+1}( w^{m+1} a^{m} + b^{m+1} ),   for m = 0, …, M − 1   (1)

where {p_1, t_1}, {p_2, t_2}, …, {p_Q, t_Q} is the data set (p_i and t_i are the i-th input and output, respectively), and M is the number of layers, with f^{m} as the transfer function, w^{m} as the weight and b^{m} as the bias in the m-th layer. The input of the first layer is the network input and the output of the last layer is the network output:

a^{0} = p,   a = a^{M}   (2)

The parameters (w^{m} and b^{m}) are estimated in such a way that the mean square error (the mean distance between the observed and estimated outputs, t_i and a_i respectively) is minimized:

F = (1/Q) Σ_{q=1}^{Q} (t_q − a_q)^T (t_q − a_q)   (3)
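The training cycle described above can be sketched as a minimal one-hidden-layer network in Python with NumPy. This is only a toy illustration of the forward pass, error computation and weight update; it is not the Matlab network configured later in the paper, and the miniature two-input data set below is hypothetical.

```python
import numpy as np

def logsig(x):
    # Log-sigmoid transfer function (the transfer function used by the study's network)
    return 1.0 / (1.0 + np.exp(-x))

def train_ffbp(P, T, n_hidden=4, lr=2.0, epochs=5000, seed=1):
    """Train a one-hidden-layer feed-forward network by back propagation.

    P: (n_samples, n_inputs) inputs; T: (n_samples, 1) targets in {0, 1}.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (P.shape[1], n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, 1));          b2 = np.zeros(1)
    n = len(P)
    for _ in range(epochs):
        # Step 1: forward pass, a^{m+1} = f^{m+1}(w^{m+1} a^m + b^{m+1})
        a1 = logsig(P @ W1 + b1)
        a2 = logsig(a1 @ W2 + b2)
        # Step 2: error between predicted and actual output
        e = a2 - T
        # Step 3: back-propagate sensitivities (log-sigmoid derivative is a(1 - a))
        d2 = e * a2 * (1.0 - a2)
        d1 = (d2 @ W2.T) * a1 * (1.0 - a1)
        # Step 4: adjust the weights, then the forward pass starts again
        W2 -= lr * (a1.T @ d2) / n; b2 -= lr * d2.mean(axis=0)
        W1 -= lr * (P.T @ d1) / n;  b1 -= lr * d1.mean(axis=0)
    return W1, b1, W2, b2

def predict(P, W1, b1, W2, b2):
    return (logsig(logsig(P @ W1 + b1) @ W2 + b2) > 0.5).astype(int).ravel()

# Hypothetical toy data: the output is 1 when either of two binary risk factors is present
P = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [1]], dtype=float)
params = train_ffbp(P, T)
```

The cycle stops in practice when the error falls below a convergence precision or a maximum number of epochs is reached, as in the settings reported in Table 1.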
2.2. Decision Trees
A decision tree is a typical method for the classification of objects into decision classes [12]. A decision tree classifier is a function as follows:

dt : dom(X_1) × dom(X_2) × … × dom(X_n) → dom(Y)   (4)

In which,
X_1, X_2, …, X_n are input attributes and Y is the output, where X_i has domain dom(X_i) and Y has domain dom(Y). We assume without loss of generality that dom(Y) = {Y_1, Y_2} (a dichotomous discrete attribute).

A decision tree is a directed, acyclic graph T in the form of a tree. Each node in the tree has zero or more outgoing edges. If a node has no outgoing edges, it is called a decision node (a leaf node); otherwise, it is called a test node (or an attribute node). Each decision node N is labelled with one of the possible decision classes Y ∈ {Y_1, Y_2}. Each test node is labelled with one input attribute X_i ∈ {X_1, X_2, …, X_n}, called the splitting attribute. Each splitting attribute X_i has a splitting function f_i associated with it. The splitting function f_i determines the outgoing edge from the test node, based on the value of attribute X_i of an object O in question. It is of the form X_i ∈ Y_i, where Y_i ⊂ dom(X_i); if the value of the attribute X_i of object O is within Y_i, then the corresponding outgoing edge from the test node is chosen.

The problem of decision tree construction is as follows: given a data set X = (x_1, x_2, …, x_n), where the x_i are input random samples from an unknown probability distribution P, find a decision tree classifier T such that the misclassification rate R_T(P) is minimal [12].
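The recursive partitioning idea can be sketched as a tiny entropy-based tree builder in the spirit of ID3/C4.5 (of which J48 is an implementation). This is not the WEKA code, and the two miniature attributes and labels below are hypothetical.

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(rows, labels):
    """Pick the splitting attribute: the index with the highest information gain."""
    base = entropy(labels)
    best, best_gain = None, 0.0
    for i in range(len(rows[0])):
        rem = 0.0
        for v in set(r[i] for r in rows):
            sub = [l for r, l in zip(rows, labels) if r[i] == v]
            rem += len(sub) / len(labels) * entropy(sub)
        if base - rem > best_gain:
            best, best_gain = i, base - rem
    return best

def build_tree(rows, labels):
    """Recursively partition the training data into ever smaller subsets."""
    if len(set(labels)) == 1:
        return labels[0]                                 # pure leaf: a decision node
    attr = best_split(rows, labels)
    if attr is None:
        return Counter(labels).most_common(1)[0][0]      # no useful split: majority leaf
    node = {"attr": attr, "branches": {}}
    for v in set(r[attr] for r in rows):
        sub = [(r, l) for r, l in zip(rows, labels) if r[attr] == v]
        node["branches"][v] = build_tree([r for r, _ in sub], [l for _, l in sub])
    return node

def classify(tree, row):
    while isinstance(tree, dict):                        # follow outgoing edges to a leaf
        tree = tree["branches"][row[tree["attr"]]]
    return tree

# Hypothetical miniature data: (multiple_nodule, family_history) -> tumour type
rows = [("yes", "yes"), ("yes", "no"), ("no", "yes"), ("no", "no")]
labels = ["malignant", "malignant", "malignant", "benign"]
tree = build_tree(rows, labels)
```

J48 additionally handles numeric attributes, missing values and pruning, which this sketch omits.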
2.3. Logistic Regression Analysis
Logistic regression, also called the logistic model or logit model, is a type of predictive model in which the target variable (output attribute) is dichotomous, for instance healthy or unhealthy, dead or alive, win or loss, etc. Logistic regression is used to predict the probability of occurrence of the desired one of the two events in the output variable by fitting the data to a logistic curve. As in many forms of regression analysis, the input attributes may be either numerical or categorical. For example, the probability that a person has a heart attack within a specified time might be predicted from knowledge of the person's age, sex and body mass index [9]. Logistic regression is widely applied in the medical sciences.

The output attribute Y of a subject can take one of two possible values, denoted by 1 and 0 (for example, Y = 1 if a disease is present; otherwise Y = 0). Let X = (x_1, x_2, …, x_n) be the vector of input attributes. The logistic regression model is used to explain the effects of the input attributes on the probability of occurrence of the value 1 for the output Y.
Logit{P(Y = 1)} = log{ P(Y = 1) / (1 − P(Y = 1)) } = b_0 + b_1 x_1 + … + b_n x_n   (5)
where P stands for probability, b_0 is called the "intercept" and b_1, b_2, … are called the "regression coefficients" of x_1, x_2, … respectively. Each regression coefficient describes the importance of the corresponding input attribute for the output. A positive regression coefficient means that the input increases the probability of the outcome, whereas a negative regression coefficient means that it decreases the probability of the outcome. In addition, the absolute value of a coefficient reflects the strength of its effect: a large value means a strong influence, and a near-zero regression coefficient means little influence on the probability of the outcome [9].
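The model of Eq. (5) is usually fitted by maximizing the log-likelihood. A minimal sketch using plain gradient ascent follows; the study itself used SPSS, and the one-attribute toy data below are hypothetical.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, epochs=5000):
    """Estimate b_0, b_1, ..., b_n of Eq. (5) by gradient ascent on the log-likelihood."""
    X1 = np.hstack([np.ones((len(X), 1)), X])   # column of ones carries the intercept b_0
    b = np.zeros(X1.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X1 @ b))       # P(Y = 1 | x) from the logistic curve
        b += lr * X1.T @ (y - p) / len(y)       # gradient of the log-likelihood
    return b

def predict_proba(X, b):
    X1 = np.hstack([np.ones((len(X), 1)), X])
    return 1.0 / (1.0 + np.exp(-X1 @ b))

# Hypothetical one-attribute toy data: larger x raises the odds of Y = 1
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
b = fit_logistic(X, y)
```

With these data the fitted b_1 is positive, matching the interpretation above: the attribute increases the probability of the outcome.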
Analysis of our clinical data was done by Matlab 2008a for artificial neural network method,
WEKA software, version 3.7.1 for decision tree technique and SPSS software, version 16.0 for
logistic regression analysis.
The accuracy rate in prediction for these three methods is calculated to determine the best predictive method.
2.4. Patients’ Population and Clinical Problem
The malignancy or benignity of thyroid nodule is sometimes in ambiguity for the physicians.
Only based on the pathology result after the surgery and removal of the thyroid tumour, the type
of tumour is certainly determined, whereas it is better to avoid the unnecessary surgery. Although
there are various factors which help the physician in diagnosis before the surgery but, the ultimate
decision is still in ambiguity. For instance, FNA (Fine Needle Aspiration) of the nodule is one of
the most useful tools in determining type of tumour thyroid but it has some limitations in accurate
report and some significant mistakes while decision making, are made [1].
In the present study, all patients referred to Shahid Rajaee hospital in Shiraz, Iran for thyroid tumour surgery in two recent years (2011 and 2012) participated in our study. During this time, 259 cases with a positive FNA result entered the study. Most of them were female (211 versus 48 cases), with an overall mean age of 42.3 ± 13.6 years (41 ± 13 for females and 47.9 ± 15 for males). Patients' characteristics such as gender, age, thyroid nodule size, tumour size, type of operation, duration of the disease and patient family history were considered as input attributes. The dichotomous output is the malignancy or benignity of the thyroid nodule, derived from the pathology results after surgery.
3. RESULTS
The three methods explained above were applied to the clinical dataset, and the results are summarized separately as follows:
3.1 Artificial Neural Network Results
Table 1 gives the information on the network trained on the thyroid tumour data. Of the 259 cases, 75 percent (196 cases), chosen randomly, were used as the training set, and 25 percent (63 cases) were used for validation. The absolute values of the final updated weights at the 14th step determine the important attributes for the malignancy of the tumour. According to the trained network results, the first five important attributes are Multiple Nodule, Cancer Family History, Size of Left Lobe, Lobectomy and Type of Operation, respectively.
Table 1. The information of the trained artificial neural network on thyroid tumour data

Network type                      Feed-forward
Algorithm                         Back Propagation
No. of inputs                     29
No. of outputs                    1
No. of hidden layers              1
No. of neurons in hidden layer    10
Transfer function                 Log-sigmoid
Iterations                        50
Epochs                            1000
Convergence precision             0.00001
No. of steps to convergence       14
The accuracy of the prediction is 98 percent for the training set and 92 percent for the validation set. These results confirm the power of the trained network in prediction. Table 2 shows the prediction results. The target output is the observed result from pathology, which detects the real status of the patient after surgery, and the estimated output is derived from the trained network.
Table 2. The estimated output versus the observed one by the trained neural network

               Estimated output
               In training set          In validation set
Target         Malignant    Benign      Malignant    Benign
Malignant         81           2            32          3
Benign             1         112             2         26
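The reported accuracy rates follow directly from the confusion matrix in Table 2, since accuracy is simply the proportion of correctly classified cases:

```python
def accuracy(tp, fn, fp, tn):
    # Accuracy = correctly classified cases / all cases
    return (tp + tn) / (tp + fn + fp + tn)

# Training set of Table 2: 81 malignant and 112 benign cases correctly predicted
train_acc = accuracy(tp=81, fn=2, fp=1, tn=112)   # 193 / 196
# Validation set of Table 2: 32 malignant and 26 benign cases correctly predicted
valid_acc = accuracy(tp=32, fn=3, fp=2, tn=26)    # 58 / 63
```

Rounded, these give the 98 percent (training) and 92 percent (validation) accuracies quoted above.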
3.2 Decision Tree Results
From the 259 cases, 150 were randomly chosen to train the decision tree. The J48 algorithm was used, and the results are summarized in Tables 3 and 4. The first five important attributes near the root of the tree are Size of Left Lobe, Type of Operation, Multiple Nodule, Encapsulation and Cancer Family History, respectively. These are close to the neural network results, but with a lower accuracy rate (80 percent on the training set and 75 percent on the validation set).
Table 3. The estimated output versus the observed one by the derived decision tree

               Estimated output
               In training set          In validation set
Target         Malignant    Benign      Malignant    Benign
Malignant         43          12            52         11
Benign            18          77            16         30
Table 4. The information of the derived decision tree on thyroid tumour data

Root mean squared error               0.4826
Relative absolute error               99.1188 %
Root relative squared error           101.0544 %
Coverage of cases (0.95 level)        98.7013 %
Mean rel. region size (0.95 level)    98.7013 %
Total number of instances             259
3.3 Logistic Regression Analysis Results
In our clinical dataset, for logistic regression analysis the real type of tumour for each case, taken from the pathology result, is considered as the binary output of the model. Therefore, Y = 1 for a malignant and Y = 0 for a benign tumour, and 29 patients' characteristics are considered as the input vector of attributes, i.e. X = (x_1, x_2, …, x_29). The conditional forward method was used to enter variables into the model. No significant relation was found in the data; that is, none of the input attributes entered the model, and logistic regression analysis was not able to find a significant relation between the inputs and the output. Several other variable entrance methods were also tried, without any significant results. Table 5 and Eq. 6 describe the estimated model, which contains only one coefficient, the intercept. The coefficient of determination, which detects the model's ability to explain the output's variability, is 0.69. This means that, without any information from these 29 attributes of the patient, 69 percent of tumour malignancy would be estimable. This result is not confirmed by the physician.
Table 5. Estimated coefficient of logistic regression fitted on the thyroid tumour dataset

Variables in      B        Standard   Wald         Degree of   Significance   EXP(B)
the equation               Error      statistics   freedom
Constant          -0.974   0.139      48.868       1           <0.001         0.378
log{ P(Y = 1) / (1 − P(Y = 1)) } = −0.974   (6)
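For reference, the intercept-only model of Eq. (6) implies a single constant malignancy probability for every patient, and the EXP(B) column of Table 5 is simply the exponentiated intercept:

```python
import math

b0 = -0.974                        # fitted intercept from Table 5
odds = math.exp(b0)                # odds of malignancy; matches the EXP(B) column
p_malignant = odds / (1 + odds)    # constant P(Y = 1) implied by Eq. (6)
```

This works out to odds of about 0.378 and a constant malignancy probability of roughly 0.27 for every case, regardless of the patient's attributes.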
4. CONCLUSIONS
As mentioned earlier, the ideal theoretical assumptions of parametric classification methods impose some limitations in practice. In other words, real datasets seldom fulfil their assumptions. For the logistic regression classification method, for instance, in addition to underlying distributional assumptions such as a Bernoulli probability distribution for the observed dichotomous outputs, independence of the observed inputs, independent and identically distributed error terms and a sufficient sample size in both categories of the output, the number of inputs matters too. Parametric methods are seldom able to manage more than 30 attributes. In the clinical example of our study, the theoretical assumptions may not hold. Furthermore, 29 input attributes, including 12 categorical variables entered into the model through 27 indicator variables, may exceed the model's ability. This may explain the strange results, far from the physician's expectation.
In comparison, the two nonparametric classification methods discussed in the present study are more flexible and closer to real data circumstances. On our clinical dataset, the neural network method consumed more computation time, with complicated results not easily interpretable by the clinical experts. The decision tree method, in contrast, represents simple decision rules in a shorter time. However, it is more sensitive to noise (irrelevant attributes) and has a lower accuracy rate than the neural network.
To choose the best method, the authors suggest further attempts to test more networks with different algorithms and transfer functions, as well as various decision tree algorithms, on the discussed dataset. However, the choice also depends on the physician's preference.
ACKNOWLEDGEMENTS
The authors would like to thank trauma research center of Shahid Rajaee hospital in Shiraz
University of Medical Sciences and its staff, for their valuable cooperation in data gathering
process.
REFERENCES
[1] Papageorgiou E, Kotsioni I & Linos A,(2005) “Data mining: a new technique in medical research”,
Hormones, Vol. 4, No. 4, pp 189-191.
[2] Seifert J.W. (2010) “Data Mining and Homeland Security: An Overview”, BiblioGov.
[3] Lawrence O. Hall, Xiaomei Liu, Kevin W. Bowyer & Robert Banfield, (2003) "Why are neural
networks sometimes much more accurate than decision trees: an analysis on a bioinformatics
problem", IEEE International Conference on Systems, Man & Cybernetics, Washington, D.C.,
pp 2851-2856, October 5-8.
[4] Kazemnejad A, Batvandi Z & Faradmal J, (2010) “Comparison of artificial neural network and binary
logistic regression for determination of impaired glucose tolerance/diabetes”, Eastern Mediterranean
Health Journal, Vol. 16, No. 6, pp 615-620.
[5] Kue W.J, Chang R.F, Chen D.R, et al. (2001) “Data Mining with decision trees for diagnosis of
breast tumor in medical ultrasonic images”, Breast Cancer Research and Treatment, Vol. 66, pp 51-
57.
[6] Sakai S, Kobayashi K, et al. (2007) “Comparison of the levels of accuracy of an artificial neural
network model and a logistic regression model for the diagnosis of acute appendicitis” J Med Syst,
Vol. 31, pp 357–364.
[7] Anthony M. & Bartlett P, (1999) “Neural Network Learning: Theoretical Foundations”, Cambridge
University press.
[8] Quinlan J.R, (1996) “C4.5: Programs for Machine Learning”, Morgan Kaufmann, San Mateo, CA.
[9] Raghavendra B.K & Srivatsa S.K, (2011) "Evaluation of logistic regression and neural network
model with sensitivity analysis on medical datasets", International Journal of Computer Science
and Security, Vol. 5, No. 5, pp 503-511.
[10] Tasdelen B, Helvaci S, Kaleagasi H & Ozge A, (2009) “Artificial neural network analysis for
prediction of headache prognosis in elderly patients”, Turk J Med Sci , Vol. 39, No. 1, pp 5-12.
[11] Zhang J, Lok T, Lyu M.R, (2007) "A hybrid particle swarm optimization-back-propagation
algorithm for feedforward neural network training", Applied Mathematics and Computation,
Vol. 185, pp 1026-1037.
[12] Kokol P, Pohorec S, Stiglic G & Podgorelec V, (2012) “Evolutionary design of decision trees for
medical application”, WIRE Data Mining Knowl Discov, Vol. 2, pp 237-254.
[13] Shahbaz Khan F, Anwer R.M, Torgersson O, Falkman G, (2007) "Data Mining in Oral Medicine
Using Decision Trees", World Academy of Science, Engineering and Technology, Vol. 37, pp 12-16.
Authors
Saeedeh Pourahmad is an assistant professor at the Biostatistics Department of Shiraz
University of Medical Sciences in Iran. She obtained a B.Sc. in Statistics from Shiraz
University in 2002, and an M.S. and a Ph.D. degree in Biostatistics from Shiraz University
of Medical Sciences in Iran, in 2004 and 2011 respectively. Her research is on modelling in
fuzzy environments, neural networks, nonlinear and linear relations in crisp environments,
and their clinical applications.
Mohsen Azad is an M.S. student at the Biostatistics Department of Shiraz University of
Medical Sciences in Iran. He obtained a B.Sc. in Statistics from Shahid Bahonar
University in Kerman (Iran) in 2010. He is currently involved with the present research.
Shahram Paydar is an assistant professor at the Department of Surgery in the Medical School
of Shiraz University of Medical Sciences in Iran. He obtained an MD degree and the National
Board of Surgery in 2000 and 2007, respectively, from Shiraz University of Medical Sciences
in Iran. He has been a full-time member of the Trauma Research Center in Rajaee Hospital of
Shiraz, Iran since Oct. 2007, and his research is in trauma and emergency management and
thoracic surgery.
Hamid Reza Abbasi is an associate professor at the Department of Surgery in the Medical
School of Shiraz University of Medical Sciences in Iran. He obtained an MD degree and a
General Surgery MD in 1994 and 1998, respectively, from Shiraz University of Medical
Sciences in Iran. His research is in determining the prognosis of the acutely ischemic limb,
correlation of sonographic measurement of IVC diameter, etc.