SlideShare a Scribd company logo
1 of 8
Download to read offline
Sundarapandian et al. (Eds): ICAITA, SAI, SEAS, CDKP, CMCA, CS & IT 08,
pp. 01–08, 2012. © CS & IT-CSCP 2012 DOI : 10.5121/csit.2012.2501
PREDICTION OF MALIGNANCY IN SUSPECTED
THYROID TUMOUR PATIENTS BY THREE
DIFFERENT METHODS OF CLASSIFICATION IN
DATA MINING
Saeedeh Pourahmad1,3
, Mohsen Azad1
,Shahram Paydar2
and Hamid Reza
Abbasi2
1
Department of Biostatistics, School of Medicine,
Shiraz University of Medical Sciences, Shiraz, Iran
pourahmad@sums.ac.ir
azadmo@sums.ac.ir
2
Department of Surgery and Trauma Research Center, School of Medicine,
Shiraz University of Medical Sciences, Shiraz, Iran
paydarsh@sums.ac.ir
abbasihr@sums.ac.ir
3
Colorectal Research Center,
Shiraz University of Medical Sciences, Shiraz, Iran
ABSTRACT
In the present study, the abilities of three classification methods of data mining namely artificial
neural networks with feed-forward back propagation algorithm, J48 decision tree method and
logistic regression analysis are compared in a medical real dataset. The prediction of
malignancy in suspected thyroid tumour patients is the objective of the study. The accuracy of
the correct predictions (the minimum error rate), the amount of time consuming in the
modelling process and the interpretability and simplicity of the results for clinical experts are
the factors considered to choose the best method.
Keywords
Data Mining, Artificial Neural Networks, Decision Trees, Logistic Regression Analysis,
Malignancy, Thyroid Tumour Patients
1. INTRODUCTION
Thyroid nodule is a common problem in population and decision making for type of management
is subject of controversies. Management varies from observation to total thyroidectomy.
Sequence of diagnostic procedures is based on management protocols of the center. FNA (Fine
Needle Aspiration) of the nodule is one of the most useful tools in determining type of
management of thyroid nodules but has some limitations in accurate report and some significant
mistakes while decision making, are made [1].
Therefore, it is necessary to help the physicians by data mining techniques and search deeply in
patients’ attributes to find the meaningful relations and develop the diagnosis process.
2 Computer Science & Information Technology (CS & IT)
There are different classification methods in data mining techniques [2]. Some of them are deeply
dependent to underlying theoretical assumptions such as linear and logistic regression models.
They are called parametric methods. And some the others are assumptions free like artificial
neural networks, decision trees, K-nearest neighbourhood, etc.
Recently, non parametric methods of data mining techniques which are not depending on
theoretical distributional assumptions receive more attention in practice. The real dataset seldom
follows the underlying theoretical assumptions of parametric methods. Clinical dataset is such an
example. The avoidance of ideal theoretical assumptions in one hand and the variability in the
nature of clinical dataset and their vague relations on the other hand cause this favourability. The
attribute nature of biological data and their vague relation does not consist with theoretical
distributional assumptions of parametric methods. For instances, dichotomous attribute in logistic
regression analysis which is modelled based on the other attributes should follow Bernoulli
theoretical distribution. The independency of error terms of models, the independency of other
attributes in the model and enough sample size are the other ideal theoretical assumptions of these
modelling methods. The application of such methods and the accuracy of their results depend on
fulfilment of their ideal assumptions. In comparison, other methods such as artificial neural
networks and decision trees use the learning process from a set of existent prototypes without any
specific underlying assumptions. Therefore, the relation among the attributes is discovered from a
part of the dataset (training set) and the parameters are estimated in such a way that the error
prediction is minimized. Then, the power of model’s prediction is evaluated by the other part of
dataset (testing set).
These two mentioned methods have some advantages and disadvantages in their own [3]. In
general, decision trees will require much less training time than neural networks. However,
despite of neural network methods, decision tree techniques are less sensitive to the noise. In
other words, the hidden layers in neural network discover complicated relations of the data and
assign more weights to the important attributes. Nevertheless, their performance depends on the
dataset. For some data, much more time and complicated computing is needed to train a neural
network or/and the results are hardly interpretable by the expert of that field. For some other data,
decision tree techniques are not able to discover reasonable relations. There are some studies
which compare the abilities of these two methods and also, with logistic regression analysis in
different research fields. For instances see [4, 5, 6].
In this paper, we compare feed forward back propagation neural networks and a well-known
algorithms of decision tree namely J48 on our clinical dataset. A feed forward back propagation
neural network repeatedly examines all the training data in the process of updating its weights [7].
The mentioned decision tree learning algorithms recursively partitions the training data into ever
smaller subsets on which a test is made [8].
2. METHODS AND MATERIALS
In this section, three applied classification methods are described briefly. Then, the clinical
diagnosis problem and the patients’ population used in practical part of the study are explained at
the end.
2.1. Artificial Neural Networks
Neural networks are the branch of artificial intelligence. Their models are inspired by the neural
systems of human brain. And have been applied in many research fields such as biology,
psychology, statistics, mathematics, medical science, computer sciences, and also, a variety of
business areas like finance, management and decision making, marketing and production [9].
Recently, artificial neural networks (ANNs) become a very popular model to diagnose disease.
However, ANNs have some disadvantages and some advantages for medical analysis. Their
Computer Science & Information Technology (CS & IT) 3
discrimination power, discovery the complex and nonlinear relationship among the attributes, and
prediction of the cases are the most important advantages of ANNs. Nevertheless, they can be
over-fitted for training data, and time consuming because of computational requirements [9]. The
selection of an appropriate training algorithm, transfer functions, initial values of network weights
and also, the number of parameters and hidden layers to define the network size determine the
performance of an ANN. But compared to logistic regression analysis, neural network models are
more flexible [10].
In this study, we use a type of neural networks namely feed-forwards network with back
propagation algorithm to model the relations among attributes in our clinical dataset. A feed-
forward back propagation neural network is a supervised network. That is, it uses training and
testing data to build a model. The data involves a set of input attributes with their corresponding
output. The network uses the training data to “learn” how to predict the known output, and the
testing data is used for validation. The aim is to predict the output for any given inputs in such a
way that the distance between the observed and predicted outputs becomes minimized. This
algorithm repeatedly examines all the training data to update its weights. These weights are
adjusted during training and the process is only in the forward direction through the network
without any feedback loops [9].
The simplified process for training a feed-forward back propagation network is as follows [9]:
1. Input data is entered to the network and propagated through the network until it reaches
the output layer. This forward process produces a predicted output.
2. The predicted output is subtracted from the actual output and an error value for the
networks is calculated.
3. The neural network then uses supervised learning, which in most cases is back
propagation, to train the network. Back propagation is a learning algorithm for adjusting
the weights.
4. Once back propagation has finished, the forward process starts again, and this cycle is
continued until the error between predicted and actual outputs is minimized.
Equations 1 to 3 summarize the formulations as follows:
ܽ௠ାଵ
= ݂௠ାଵሺ‫ݓ‬௠ାଵ
ܽ௠
+ ܾ௠ାଵሻ ݂‫ݎ݋‬ ݉ = 0, … , ‫ܯ‬ − 1 ሺ1ሻ
Where,
ሼ‫݌‬ଵ, ‫ݐ‬ଵሽ, ሼ‫݌‬ଶ, ‫ݐ‬ଶሽ, … , ሼ‫݌‬ொ, ‫ݐ‬ொሽ is the data set (pi and ti are the ith
input and output respectively), and
M is the number of layers with f m
as the transfer function, wm
as the weight and bm
as the bias in
the mth
layer. The input of the first layer is the network input and the output of the last layer is the
network output.
ܽ = ܽ௠
, ܽ଴
= ‫݌‬ ሺ2ሻ
The parameters (wm
and bm
) are estimated in such a way that the mean square errors (the mean
distances between the observed and estimated outputs, ti and ai respectively) is minimized.
2.2. Decision Trees
Decision tree is a typical method for the classification of objects in to decision classes [12]. A
decision tree classifier is a function as follows:
)()(...)()(: 21 YdomXdomXdomXdomdt n →××× (4)
In which,
4 Computer Science & Information Technology (CS & IT)
nXXX ,...,, 21 are input attributes and Y is the output, where Xi has domain dom (Xi) and Y has
domain dom (Y). We assume without loss of generality that dom (Y)={Y1, Y2} (a dichotomous
discrete attributes).
A decision tree is a directed, acyclic graph T in a form of a tree. Each node in a tree has either
zero or more outgoing edges. If a node has no outgoing edges, then it is called a decision node (a
leaf node); otherwise, a node is called a test node (or an attribute node). Each decision node N is
labeled with one of the possible decision classes }.,{ 21 YYY ∈ Each test node is labelled with one
input attribute },...,,{ 21 ni XXXX ∈ i.e. called the splitting attribute. Each splitting attribute Xi
has a splitting function fi associated with it. The splitting function fi determine the outgoing edge
from the test node, based on the attribute value Xi of an object O in question. It is in form of
ii YX ∈ where );( ii XdomY ⊂ if the value of the attribute Xi of object O is within Yi , then the
corresponding outgoing edge from the test node is chosen.
The problem of decision tree construction is as follows: Given a data set ),...,,( 21 nxxxX = ,
where xi are input random samples from an unknown probability distribution P, find a decision
tree classifier T such that the misclassification rate RT(P) is minimal [12].
2.3. Logistic Regression Analysis
Logistic regression is also called logistic model or logit model, is a type of predictive model in
which the target variable (output attribute) is a dichotomous variable for instances healthy or
unhealthy, dead or alive, win or loss, etc. Logistic regression is used for the prediction of the
probability of occurrence the desired event from two existent ones in output variable by fitting the
data into a logistic curve. Like many forms of regression analysis, the input attributes may be
either numerical or categorical. For example, the probability that a person has a heart attack in a
specified time that might be predicted from the knowledge of person’s age, sex and body mass
index [9]. Logistic regression is widely applied in the medical sciences.
The output attribute Y, of a subject can take one of two possible values, denoted by 1 and 0 (for
example, Y=1 if a disease is present; otherwise Y=0). Let X=(x1, x2,,…, xn) be the vector of input
attributes. The logistic regression model is used to explain the effects of the input attributes on the
probability of occurrence the value 1 for output Y.
‫ݐ݅݃݋ܮ‬ ሼܲሺܻ = 1ሻሽ = log ቊ
ܲሺܻ = 1ሻ
1 − ܲሺܻ = 1ሻ
ቋ = ܾ଴ + ܾଵ‫ݔ‬ଵ + ⋯ + ܾ௡‫ݔ‬௡ ሺ5ሻ
Where, P stands for probability, b0 is called the “intercept” and b1, b2,… are called the “regression
coefficients” of x1, x2, … respectively. Each of the regression coefficients describes the
importance of corresponding input attribute on output. A positive regression coefficient means
that this input increases the probability of outcome, where as a negative regression coefficient
means that the considered input decreases the probability of outcome. In addition, the absolute
value of the coefficient detects its effect on the probability of outcome. A large values means
strongly influences and a non-zero regression coefficient means little influence on the probability
of outcome [9].
Analysis of our clinical data was done by Matlab 2008a for artificial neural network method,
WEKA software, version 3.7.1 for decision tree technique and SPSS software, version 16.0 for
logistic regression analysis.
The accuracy rate in prediction for these three methods is calculated to determine the best
predictive methods.
Computer Science & Information Technology (CS & IT) 5
2.4. Patients’ Population and Clinical Problem
The malignancy or benignity of thyroid nodule is sometimes in ambiguity for the physicians.
Only based on the pathology result after the surgery and removal of the thyroid tumour, the type
of tumour is certainly determined, whereas it is better to avoid the unnecessary surgery. Although
there are various factors which help the physician in diagnosis before the surgery but, the ultimate
decision is still in ambiguity. For instance, FNA (Fine Needle Aspiration) of the nodule is one of
the most useful tools in determining type of tumour thyroid but it has some limitations in accurate
report and some significant mistakes while decision making, are made [1].
In the present study, all the patients in two recent years (2011 & 2012) which are referred to
Shahid Rajaee hospital in Shiraz, Iran for the surgery of thyroid tumour are participated to our
study. During this time, 259 cases with positive FNA result are entered the study. Most of them
are female (211 versus 48 cases) with overall mean age of 42.3± 13.6 (41±13 for female and
47.9±15 for male). Some patients’ characteristics are considered as the inputs attributes such as
gender, age, thyroid nodules size, tumour size, type of operation, the duration of the disease,
patient family history,The dichotomous output is the malignancy or benignity of thyroid nodule
derived from the pathology results after the surgery.
3. RESULTS
Three explained methods applied to the clinical dataset and the results are summarized separately
as follows:
3.1 Artificial Neural Network Results
Table 1 explains the information of the network trained by the thyroid tumour data. From the 259
cases, 75 percents (196 cases) which were chosen randomly are used in training set and 25
percents (63 cases) are used for validation. The absolute values of the final updated weights on
the 14th
step determine the important attributes on the malignancy of the tumour. According to the
trained network results, the first five important attributes are Multiple Nodule, Cancer Family
History, Size of Left Lobe, Lobectomy and Type of Operation, respectively.
Table 1. The information of trsined artificial neural network on thyroid tumour data
Network type
Algorithm
No. of inputs
No. of output
No. of hidden layer
No. of neuron in hidden layer
Transfer function
Iteration
Epochs
Convergence precision
No. of step to convergence
Feed-forward
Back Propagation
29
1
1
10
Log-sigmoid
50
1000
0.00001
14
The accuracy of the prediction is 98 percents for training set and 92 percents for validation set.
These results confirm the power of the trained network in prediction. Table 2 shows the
prediction results. The target output is the observed result from pathology which detects the real
status of the patient after surgery and the estimated output is derived from the trained network.
6 Computer Science & Information Technology (CS & IT)
Table 2. The estimated output versus the observed one by the trained neural network
Estimated output
Target
In training set
Malignant Benign
In validation set
Malignant Benign
Malignant 81 2 32 3
Benign 1 112 2 26
3.2 Decision Tree Results
150 cases were randomly chosen from 259 cases to train the decision tree. J48 algorithm is used
and the results are summarized in Tables 3 and 4. The first five important attributes near the root
of the tree are Size of Left Lobe, Type of Operation, Multiple Nodule, Encapsulation and Cancer
Family History, respectively. Almost near to neural network results but with lower accuracy rate
(80 percents in training and 75 percents in validation set).
Table 3. The estimated output versus the observed one by the derived decision tree
Estimated output
Target
In training set
Malignant Benign
In validation set
Malignant Benign
Malignant 43 12 52 11
Benign 18 77 16 30
Table 4. The information of derived decision tree on thyroid tumour data
Root mean squared error
Relative absolute error
Root relative squared error
Coverage of cases (0.95 level)
Mean rel. region size (0.95 level)
Total Number of Instances
0.4826
99.1188 %
101.0544 %
98.7013 %
98.7013 %
259
3.3 Logistic Regression Analysis Results
In our clinical dataset for logistic regression analysis, the real type of tumour for each case from
pathology result is considered as the binary output of the model. Therefore, Y=1 for malignant
and Y=0 for benign tumour. And 29 patients’ characteristics are considered as the input vector of
attributes, i.e. X=(x1, x2,,…, x29). Conditional forwards method is used as the variables’ entrance
method to the model. No significant relation found among the data. That is, none of the input
attribute enters the model and logistic regression analysis was not able to find the significant
relation between the inputs and the output. Several methods of variables’ entrance were also used
without any significant results. Table 5 and Eq. 6 describe the estimated model containing the
only coefficient as the intercept. Coefficient of determination which detects the model ability to
explain the output’s variability is 0.69. It means that without any information of these 29
Computer Science & Information Technology (CS & IT) 7
attributes of the patient, 69 percents of tumor malignancy is estimable. This result is not
confirmed by the physician.
Table 5. Estimated coefficient of logistic regression fitted on thyroid tumor dataset
Variables
in the
equation
B Standard
Error
Wald
statistics
Degree of
freedom
Significance EXP(B)
Constant -0.974 0.139 48.868 1 <0.001 0.378
log ቊ
ܲሺܻ = 1ሻ
1 − ܲሺܻ = 1ሻ
ቋ = −0.974 ሺ6ሻ
4. CONCLUSIONS
As mentioned earlier, the ideal theoretical assumptions’ of parametric classification methods
bring them some limitations in practice. In other words, the real dataset seldom fulfil their
assumptions. For logistic regression classification method for instance, in addition of underlying
distributional assumptions such as Bernoulli probability distribution for observed dichotomous
outputs, independency of observed inputs, independency and identically distribution of error
terms and enough sample size in both categories of output, the number of inputs is important too.
Usually, parametric methods are seldom able to manage more than 30 attributes. In the clinical
example of our study, the theoretical assumptions may not be held. Furthermore, 29 input
attributes with 12 categorical variables which are entered to the model by 27 indicator variables
may seem out of the model’s ability. And this fact may cause to the strange results far from the
physician’s expectance.
In comparison, two nonparametric classification methods discussed in the present study are more
flexible and nearer to the real data circumstances. In our clinical dataset, neural network method
consumed more time for computation with complicated results not simply interpretable by the
clinical experts. Decision tree method in contrast, represents the simple decision rules in shorter
time. However, it is less sensitive to the noise (irrelevant attribute) with lower accuracy rate than
neural network.
To choose the best method, authors suggest further attempt to test more networks with different
algorithms and transfer functions and also various algorithms of decision tree on the discussed
dataset. However, it depends to the physician’s preference too.
ACKNOWLEDGEMENTS
The authors would like to thank trauma research center of Shahid Rajaee hospital in Shiraz
University of Medical Sciences and its staff, for their valuable cooperation in data gathering
process.
REFERENCES
[1] Papageorgiou E, Kotsioni I & Linos A,(2005) “Data mining: a new technique in medical research”,
Hormones, Vol. 4, No. 4, pp 189-191.
[2] Seifert J.W. (2010) “Data Mining and Homeland Security: An Overview”, BiblioGov.
[3] Lawrence O. Hall, Xiaomei Liu, Kevin W. Bowyer2 & Robert Ban_eld, (2003) “Why are neural
networks sometimes much more accurate than decision trees: an analysis on a
bio-informatics problem”, IEEE International Conference on Systems, Man & Cybernetics,
Washington, D.C., pp. 2851-2856, October 5–8.
8 Computer Science & Information Technology (CS & IT)
[4] Kazemnejad A, Batvandi Z & Faradmal J, (2010) “Comparison of artificial neural network and binary
logistic regression for determination of impaired glucose tolerance/diabetes”, Eastern Mediterranean
Health Journal, Vol. 16, No. 6, pp 615-620.
[5] Kue W.J, Chang R.F, Chen D.R, et al. (2001) “Data Mining with decision trees for diagnosis of
breast tumor in medical ultrasonic images”, Breast Cancer Research and Treatment, Vol. 66, pp 51-
57.
[6] Sakai S, Kobayashi K, et al. (2007) “Comparison of the levels of accuracy of an artificial neural
network model and a logistic regression model for the diagnosis of acute appendicitis” J Med Syst,
Vol. 31, pp 357–364.
[7] Anthony M. & Bartlett P, (1999) “Neural Network Learning: Theoretical Foundations”, Cambridge
University press.
[8] Quinlan J.R, (1996) “C4.5: Programs for Machine Learning”, Morgan Kaufmann, San Mateo, CA.
[9] Raghavendra B.K & Srivatsa S.K, (2011) “Evaluation of logistic regression and eural network model
with sensitivity analysis on medical datasets”, International jcomputer science and security, Vol 5,
no, 5. pp 503-511.
[10] Tasdelen B, Helvaci S, Kaleagasi H & Ozge A, (2009) “Artificial neural network analysis for
prediction of headache prognosis in elderly patients”, Turk J Med Sci , Vol. 39, No. 1, pp 5-12.
[11] Zhang J, Lok T, R.Lyu M, (2007) “A hybrid particle swarm optimization-back-Propagation algorithm
for feedforward neural network training”, Applied Mathematics and Computation, Vol. 185, pp1026-
1037.
[12] Kokol P, Pohorec S, Stiglic G & Podgorelec V, (2012) “Evolutionary design of decision trees for
medical application”, WIRE Data Mining Knowl Discov, Vol. 2, pp 237-254.
[13] Shahbaz Khan F, Anwer RM, Torgersson O, Falkman G, (2007) “Data Mining in Oral Medicine
Using Decision tree”, World Academy of Science,Engineering and Technology, Vol. 37, pp 12-16.
Authors
Saeedeh Pourahmad is an assistant professor at the Biostatistics Department of Shiraz
University of Medical Sciences in Iran. She obtained a B.Sc. in Statistics from Shiraz
University in 2002. She also obtained an M.S. and a Ph.D. degree in Biostatistics from
Shiraz University of Medical Sciences in 2004 and 2011 in Iran, respectively. Her
research is modelling in Fuzzy environments, neural networks, nonlinear and linear
relations in crisp environments and their clinical applications.
Mohsen Azad is a M.S. student at the Biostatistics Department of Shiraz University of
Medical Sciences in Iran. He obtained a B.Sc. in Statistics from Shahid Bahonar
University in Kerman (Iran) in 2010. He is currently involved with the present research.
Shahram Paydar is an assistant professor at the Department of Surgery in Medical School
of Shiraz University of Medical Sciences in Iran. He obtained a MD d egree and Na
tional board of surgery in 2000 and 2007 from Shiraz University of Medical Sciences in
Iran, respectively. He is a full time member of Trauma Research Center in Rajaee
Hospital of Shiraz, Iran, oct. 2007 and his research is in Trauma and emergency
management and Thoracic surgery.
Hamid Reza Abbasi is an associated professor at the Department of Surgery in Medical
School of Shiraz University of Medical Sciences in Iran. He obtained a MD degree and
General Surgery, MD in 1994 and 1998 from Shiraz University of Medical Sciences in
Iran, respectively. His research is in determining prognosis of acutely ischemic limb,
correlation of sonographic measurement of IVC diameter, etc.

More Related Content

What's hot

View classification of medical x ray images using pnn classifier, decision tr...
View classification of medical x ray images using pnn classifier, decision tr...View classification of medical x ray images using pnn classifier, decision tr...
View classification of medical x ray images using pnn classifier, decision tr...eSAT Journals
 
Analysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease DatasetAnalysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease DatasetIRJET Journal
 
MRI Image Segmentation Using Gradient Based Watershed Transform In Level Set ...
MRI Image Segmentation Using Gradient Based Watershed Transform In Level Set ...MRI Image Segmentation Using Gradient Based Watershed Transform In Level Set ...
MRI Image Segmentation Using Gradient Based Watershed Transform In Level Set ...IJERA Editor
 
Heart Disease Prediction Using Associative Relational Classification Techniq...
Heart Disease Prediction Using Associative Relational  Classification Techniq...Heart Disease Prediction Using Associative Relational  Classification Techniq...
Heart Disease Prediction Using Associative Relational Classification Techniq...IJMER
 
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CARE
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CARECLUSTERING DICHOTOMOUS DATA FOR HEALTH CARE
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CAREijistjournal
 
Performance Evaluation of Different Data Mining Classification Algorithm and ...
Performance Evaluation of Different Data Mining Classification Algorithm and ...Performance Evaluation of Different Data Mining Classification Algorithm and ...
Performance Evaluation of Different Data Mining Classification Algorithm and ...IOSR Journals
 
Successive iteration method for reconstruction of missing data
Successive iteration method for reconstruction of missing dataSuccessive iteration method for reconstruction of missing data
Successive iteration method for reconstruction of missing dataIAEME Publication
 
An optimized approach for extensive segmentation and classification of brain ...
An optimized approach for extensive segmentation and classification of brain ...An optimized approach for extensive segmentation and classification of brain ...
An optimized approach for extensive segmentation and classification of brain ...IJECEIAES
 
A BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATA
A BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATAA BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATA
A BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATAIJSCAI Journal
 
Distributed Digital Artifacts on the Semantic Web
Distributed Digital Artifacts on the Semantic WebDistributed Digital Artifacts on the Semantic Web
Distributed Digital Artifacts on the Semantic WebEditor IJCATR
 
IRJET- Medical Data Mining
IRJET- Medical Data MiningIRJET- Medical Data Mining
IRJET- Medical Data MiningIRJET Journal
 
Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...
Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...
Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...ijcsa
 
Classification of medical datasets using back propagation neural network powe...
Classification of medical datasets using back propagation neural network powe...Classification of medical datasets using back propagation neural network powe...
Classification of medical datasets using back propagation neural network powe...IJECEIAES
 
IRJET - Detection of Heamorrhage in Brain using Deep Learning
IRJET - Detection of Heamorrhage in Brain using Deep LearningIRJET - Detection of Heamorrhage in Brain using Deep Learning
IRJET - Detection of Heamorrhage in Brain using Deep LearningIRJET Journal
 
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL  FOR HEALTHCARE INFORMATION SYSTEM :   ...ONTOLOGY-DRIVEN INFORMATION RETRIEVAL  FOR HEALTHCARE INFORMATION SYSTEM :   ...
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...IJNSA Journal
 
Classification Of Iris Plant Using Feedforward Neural Network
Classification Of Iris Plant Using Feedforward Neural NetworkClassification Of Iris Plant Using Feedforward Neural Network
Classification Of Iris Plant Using Feedforward Neural Networkirjes
 
Ijarcet vol-2-issue-4-1393-1397
Ijarcet vol-2-issue-4-1393-1397Ijarcet vol-2-issue-4-1393-1397
Ijarcet vol-2-issue-4-1393-1397Editor IJARCET
 

What's hot (19)

50120130406032
5012013040603250120130406032
50120130406032
 
View classification of medical x ray images using pnn classifier, decision tr...
View classification of medical x ray images using pnn classifier, decision tr...View classification of medical x ray images using pnn classifier, decision tr...
View classification of medical x ray images using pnn classifier, decision tr...
 
Analysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease DatasetAnalysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease Dataset
 
[IJCT-V3I2P26] Authors: Sunny Sharma
[IJCT-V3I2P26] Authors: Sunny Sharma[IJCT-V3I2P26] Authors: Sunny Sharma
[IJCT-V3I2P26] Authors: Sunny Sharma
 
MRI Image Segmentation Using Gradient Based Watershed Transform In Level Set ...
MRI Image Segmentation Using Gradient Based Watershed Transform In Level Set ...MRI Image Segmentation Using Gradient Based Watershed Transform In Level Set ...
MRI Image Segmentation Using Gradient Based Watershed Transform In Level Set ...
 
Heart Disease Prediction Using Associative Relational Classification Techniq...
Heart Disease Prediction Using Associative Relational  Classification Techniq...Heart Disease Prediction Using Associative Relational  Classification Techniq...
Heart Disease Prediction Using Associative Relational Classification Techniq...
 
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CARE
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CARECLUSTERING DICHOTOMOUS DATA FOR HEALTH CARE
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CARE
 
Performance Evaluation of Different Data Mining Classification Algorithm and ...
Performance Evaluation of Different Data Mining Classification Algorithm and ...Performance Evaluation of Different Data Mining Classification Algorithm and ...
Performance Evaluation of Different Data Mining Classification Algorithm and ...
 
Successive iteration method for reconstruction of missing data
Successive iteration method for reconstruction of missing dataSuccessive iteration method for reconstruction of missing data
Successive iteration method for reconstruction of missing data
 
An optimized approach for extensive segmentation and classification of brain ...
An optimized approach for extensive segmentation and classification of brain ...An optimized approach for extensive segmentation and classification of brain ...
An optimized approach for extensive segmentation and classification of brain ...
 
A BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATA
A BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATAA BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATA
A BINARY BAT INSPIRED ALGORITHM FOR THE CLASSIFICATION OF BREAST CANCER DATA
 
Distributed Digital Artifacts on the Semantic Web
Distributed Digital Artifacts on the Semantic WebDistributed Digital Artifacts on the Semantic Web
Distributed Digital Artifacts on the Semantic Web
 
IRJET- Medical Data Mining
IRJET- Medical Data MiningIRJET- Medical Data Mining
IRJET- Medical Data Mining
 
Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...
Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...
Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...
 
Classification of medical datasets using back propagation neural network powe...
Classification of medical datasets using back propagation neural network powe...Classification of medical datasets using back propagation neural network powe...
Classification of medical datasets using back propagation neural network powe...
 
IRJET - Detection of Heamorrhage in Brain using Deep Learning
IRJET - Detection of Heamorrhage in Brain using Deep LearningIRJET - Detection of Heamorrhage in Brain using Deep Learning
IRJET - Detection of Heamorrhage in Brain using Deep Learning
 
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL  FOR HEALTHCARE INFORMATION SYSTEM :   ...ONTOLOGY-DRIVEN INFORMATION RETRIEVAL  FOR HEALTHCARE INFORMATION SYSTEM :   ...
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...
 
Classification Of Iris Plant Using Feedforward Neural Network
Classification Of Iris Plant Using Feedforward Neural NetworkClassification Of Iris Plant Using Feedforward Neural Network
Classification Of Iris Plant Using Feedforward Neural Network
 
Ijarcet vol-2-issue-4-1393-1397
Ijarcet vol-2-issue-4-1393-1397Ijarcet vol-2-issue-4-1393-1397
Ijarcet vol-2-issue-4-1393-1397
 

Similar to PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFERENT METHODS OF CLASSIFICATION IN DATA MINING

An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...
An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...
An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...ijsc
 
Heart Failure Prediction using Different MachineLearning Techniques
Heart Failure Prediction using Different MachineLearning TechniquesHeart Failure Prediction using Different MachineLearning Techniques
Heart Failure Prediction using Different MachineLearning TechniquesIRJET Journal
 
Survey of various methods used for integrating machine learning into brain tu...
Survey of various methods used for integrating machine learning into brain tu...Survey of various methods used for integrating machine learning into brain tu...
Survey of various methods used for integrating machine learning into brain tu...Drjabez
 
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...Editor IJCATR
 
DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...
DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...
DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...IRJET Journal
 
Health Care Application using Machine Learning and Deep Learning
Health Care Application using Machine Learning and Deep LearningHealth Care Application using Machine Learning and Deep Learning
Health Care Application using Machine Learning and Deep LearningIRJET Journal
 
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATAGRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATAcscpconf
 
An efficient convolutional neural network-based classifier for an imbalanced ...
An efficient convolutional neural network-based classifier for an imbalanced ...An efficient convolutional neural network-based classifier for an imbalanced ...
An efficient convolutional neural network-based classifier for an imbalanced ...IAESIJAI
 
Utilizing Machine Learning, Detect Chronic Kidney Disease and Suggest A Healt...
Utilizing Machine Learning, Detect Chronic Kidney Disease and Suggest A Healt...Utilizing Machine Learning, Detect Chronic Kidney Disease and Suggest A Healt...
Utilizing Machine Learning, Detect Chronic Kidney Disease and Suggest A Healt...IRJET Journal
 
Clustering and Classification of Cancer Data Using Soft Computing Technique
Clustering and Classification of Cancer Data Using Soft Computing Technique Clustering and Classification of Cancer Data Using Soft Computing Technique
Clustering and Classification of Cancer Data Using Soft Computing Technique IOSR Journals
 
Pattern recognition system based on support vector machines
Pattern recognition system based on support vector machinesPattern recognition system based on support vector machines
Pattern recognition system based on support vector machinesAlexander Decker
 
An Introduction To Artificial Intelligence And Its Applications In Biomedical...
An Introduction To Artificial Intelligence And Its Applications In Biomedical...An Introduction To Artificial Intelligence And Its Applications In Biomedical...
An Introduction To Artificial Intelligence And Its Applications In Biomedical...Jill Brown
 
An Enhanced Feature Selection Method to Predict the Severity in Brain Tumor
An Enhanced Feature Selection Method to Predict the Severity in Brain TumorAn Enhanced Feature Selection Method to Predict the Severity in Brain Tumor
An Enhanced Feature Selection Method to Predict the Severity in Brain Tumorijtsrd
 
IRJET-Survey on Data Mining Techniques for Disease Prediction
IRJET-Survey on Data Mining Techniques for Disease PredictionIRJET-Survey on Data Mining Techniques for Disease Prediction
IRJET-Survey on Data Mining Techniques for Disease PredictionIRJET Journal
 
Chronic Kidney Disease Prediction Using Machine Learning
Chronic Kidney Disease Prediction Using Machine LearningChronic Kidney Disease Prediction Using Machine Learning
Chronic Kidney Disease Prediction Using Machine LearningIJCSIS Research Publications
 
Comparative study of artificial neural network based classification for liver...
Comparative study of artificial neural network based classification for liver...Comparative study of artificial neural network based classification for liver...
Comparative study of artificial neural network based classification for liver...Alexander Decker
 
IDENTIFICATION OF OUTLIERS IN OXAZOLINES AND OXAZOLES HIGH DIMENSION MOLECULA...
IDENTIFICATION OF OUTLIERS IN OXAZOLINES AND OXAZOLES HIGH DIMENSION MOLECULA...IDENTIFICATION OF OUTLIERS IN OXAZOLINES AND OXAZOLES HIGH DIMENSION MOLECULA...
IDENTIFICATION OF OUTLIERS IN OXAZOLINES AND OXAZOLES HIGH DIMENSION MOLECULA...IJDKP
 
Review on Mesothelioma Diagnosis
Review on Mesothelioma DiagnosisReview on Mesothelioma Diagnosis
Review on Mesothelioma DiagnosisIRJET Journal
 
IRJET - Heart Health Classification and Prediction using Machine Learning
IRJET -  	  Heart Health Classification and Prediction using Machine LearningIRJET -  	  Heart Health Classification and Prediction using Machine Learning
IRJET - Heart Health Classification and Prediction using Machine LearningIRJET Journal
 
An efficient feature selection algorithm for health care data analysis
An efficient feature selection algorithm for health care data analysisAn efficient feature selection algorithm for health care data analysis
An efficient feature selection algorithm for health care data analysisjournalBEEI
 

Similar to PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFERENT METHODS OF CLASSIFICATION IN DATA MINING (20)

An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...
An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...
An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...
 
Heart Failure Prediction using Different MachineLearning Techniques
Heart Failure Prediction using Different MachineLearning TechniquesHeart Failure Prediction using Different MachineLearning Techniques
Heart Failure Prediction using Different MachineLearning Techniques
 
Survey of various methods used for integrating machine learning into brain tu...
Survey of various methods used for integrating machine learning into brain tu...Survey of various methods used for integrating machine learning into brain tu...
Survey of various methods used for integrating machine learning into brain tu...
 
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
 
DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...
DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...
DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...
 
Health Care Application using Machine Learning and Deep Learning
Health Care Application using Machine Learning and Deep LearningHealth Care Application using Machine Learning and Deep Learning
Health Care Application using Machine Learning and Deep Learning
 
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATAGRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
 
An efficient convolutional neural network-based classifier for an imbalanced ...
An efficient convolutional neural network-based classifier for an imbalanced ...An efficient convolutional neural network-based classifier for an imbalanced ...
An efficient convolutional neural network-based classifier for an imbalanced ...
 
Utilizing Machine Learning, Detect Chronic Kidney Disease and Suggest A Healt...
Utilizing Machine Learning, Detect Chronic Kidney Disease and Suggest A Healt...Utilizing Machine Learning, Detect Chronic Kidney Disease and Suggest A Healt...
Utilizing Machine Learning, Detect Chronic Kidney Disease and Suggest A Healt...
 
Clustering and Classification of Cancer Data Using Soft Computing Technique
Clustering and Classification of Cancer Data Using Soft Computing Technique Clustering and Classification of Cancer Data Using Soft Computing Technique
Clustering and Classification of Cancer Data Using Soft Computing Technique
 
Pattern recognition system based on support vector machines
Pattern recognition system based on support vector machinesPattern recognition system based on support vector machines
Pattern recognition system based on support vector machines
 
An Introduction To Artificial Intelligence And Its Applications In Biomedical...
An Introduction To Artificial Intelligence And Its Applications In Biomedical...An Introduction To Artificial Intelligence And Its Applications In Biomedical...
An Introduction To Artificial Intelligence And Its Applications In Biomedical...
 
An Enhanced Feature Selection Method to Predict the Severity in Brain Tumor
An Enhanced Feature Selection Method to Predict the Severity in Brain TumorAn Enhanced Feature Selection Method to Predict the Severity in Brain Tumor
An Enhanced Feature Selection Method to Predict the Severity in Brain Tumor
 
IRJET-Survey on Data Mining Techniques for Disease Prediction
IRJET-Survey on Data Mining Techniques for Disease PredictionIRJET-Survey on Data Mining Techniques for Disease Prediction
IRJET-Survey on Data Mining Techniques for Disease Prediction
 
Chronic Kidney Disease Prediction Using Machine Learning
Chronic Kidney Disease Prediction Using Machine LearningChronic Kidney Disease Prediction Using Machine Learning
Chronic Kidney Disease Prediction Using Machine Learning
 
Comparative study of artificial neural network based classification for liver...
Comparative study of artificial neural network based classification for liver...Comparative study of artificial neural network based classification for liver...
Comparative study of artificial neural network based classification for liver...
 
IDENTIFICATION OF OUTLIERS IN OXAZOLINES AND OXAZOLES HIGH DIMENSION MOLECULA...
IDENTIFICATION OF OUTLIERS IN OXAZOLINES AND OXAZOLES HIGH DIMENSION MOLECULA...IDENTIFICATION OF OUTLIERS IN OXAZOLINES AND OXAZOLES HIGH DIMENSION MOLECULA...
IDENTIFICATION OF OUTLIERS IN OXAZOLINES AND OXAZOLES HIGH DIMENSION MOLECULA...
 
Review on Mesothelioma Diagnosis
Review on Mesothelioma DiagnosisReview on Mesothelioma Diagnosis
Review on Mesothelioma Diagnosis
 
IRJET - Heart Health Classification and Prediction using Machine Learning
IRJET -  	  Heart Health Classification and Prediction using Machine LearningIRJET -  	  Heart Health Classification and Prediction using Machine Learning
IRJET - Heart Health Classification and Prediction using Machine Learning
 
An efficient feature selection algorithm for health care data analysis
An efficient feature selection algorithm for health care data analysisAn efficient feature selection algorithm for health care data analysis
An efficient feature selection algorithm for health care data analysis
 

More from cscpconf

ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR
ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR
ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR cscpconf
 
4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATION
4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATION4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATION
4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATIONcscpconf
 
MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...
MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...
MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...cscpconf
 
PROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIES
PROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIESPROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIES
PROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIEScscpconf
 
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICA SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICcscpconf
 
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS cscpconf
 
INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS
INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS
INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS cscpconf
 
TWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTIC
TWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTICTWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTIC
TWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTICcscpconf
 
DETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAIN
DETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAINDETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAIN
DETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAINcscpconf
 
GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...
GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...
GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...cscpconf
 
IMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEM
IMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEMIMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEM
IMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEMcscpconf
 
EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...
EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...
EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...cscpconf
 
AUTOMATED PENETRATION TESTING: AN OVERVIEW
AUTOMATED PENETRATION TESTING: AN OVERVIEWAUTOMATED PENETRATION TESTING: AN OVERVIEW
AUTOMATED PENETRATION TESTING: AN OVERVIEWcscpconf
 
CLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORK
CLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORKCLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORK
CLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORKcscpconf
 
VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...
VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...
VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...cscpconf
 
PROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA
PROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATAPROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA
PROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATAcscpconf
 
CHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCH
CHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCHCHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCH
CHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCHcscpconf
 
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...cscpconf
 
SOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGE
SOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGESOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGE
SOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGEcscpconf
 
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXT
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXTGENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXT
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXTcscpconf
 

More from cscpconf (20)

ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR
ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR
ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR
 
4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATION
4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATION4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATION
4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATION
 
MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...
MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...
MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...
 
PROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIES
PROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIESPROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIES
PROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIES
 
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICA SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
 
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS
 
INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS
INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS
INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS
 
TWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTIC
TWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTICTWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTIC
TWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTIC
 
DETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAIN
DETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAINDETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAIN
DETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAIN
 
GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...
GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...
GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...
 
IMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEM
IMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEMIMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEM
IMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEM
 
EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...
EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...
EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...
 
AUTOMATED PENETRATION TESTING: AN OVERVIEW
AUTOMATED PENETRATION TESTING: AN OVERVIEWAUTOMATED PENETRATION TESTING: AN OVERVIEW
AUTOMATED PENETRATION TESTING: AN OVERVIEW
 
CLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORK
CLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORKCLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORK
CLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORK
 
VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...
VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...
VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...
 
PROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA
PROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATAPROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA
PROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA
 
CHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCH
CHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCHCHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCH
CHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCH
 
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...
 
SOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGE
SOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGESOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGE
SOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGE
 
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXT
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXTGENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXT
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXT
 

Recently uploaded

Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
Atmosphere science 7 quarter 4 .........
Atmosphere science 7 quarter 4 .........Atmosphere science 7 quarter 4 .........
Atmosphere science 7 quarter 4 .........LeaCamillePacle
 
Quarter 4 Peace-education.pptx Catch Up Friday
Quarter 4 Peace-education.pptx Catch Up FridayQuarter 4 Peace-education.pptx Catch Up Friday
Quarter 4 Peace-education.pptx Catch Up FridayMakMakNepo
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxLigayaBacuel1
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfphamnguyenenglishnb
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 

Recently uploaded (20)

Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"
 
OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
Atmosphere science 7 quarter 4 .........
Atmosphere science 7 quarter 4 .........Atmosphere science 7 quarter 4 .........
Atmosphere science 7 quarter 4 .........
 
Quarter 4 Peace-education.pptx Catch Up Friday
Quarter 4 Peace-education.pptx Catch Up FridayQuarter 4 Peace-education.pptx Catch Up Friday
Quarter 4 Peace-education.pptx Catch Up Friday
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptx
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 

PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFERENT METHODS OF CLASSIFICATION IN DATA MINING

  • 1. Sundarapandian et al. (Eds): ICAITA, SAI, SEAS, CDKP, CMCA, CS & IT 08, pp. 01–08, 2012. © CS & IT-CSCP 2012 DOI : 10.5121/csit.2012.2501 PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFERENT METHODS OF CLASSIFICATION IN DATA MINING Saeedeh Pourahmad1,3 , Mohsen Azad1 ,Shahram Paydar2 and Hamid Reza Abbasi2 1 Department of Biostatistics, School of Medicine, Shiraz University of Medical Sciences, Shiraz, Iran pourahmad@sums.ac.ir azadmo@sums.ac.ir 2 Department of Surgery and Trauma Research Center, School of Medicine, Shiraz University of Medical Sciences, Shiraz, Iran paydarsh@sums.ac.ir abbasihr@sums.ac.ir 3 Colorectal Research Center, Shiraz University of Medical Sciences, Shiraz, Iran ABSTRACT In the present study, the abilities of three classification methods of data mining namely artificial neural networks with feed-forward back propagation algorithm, J48 decision tree method and logistic regression analysis are compared in a medical real dataset. The prediction of malignancy in suspected thyroid tumour patients is the objective of the study. The accuracy of the correct predictions (the minimum error rate), the amount of time consuming in the modelling process and the interpretability and simplicity of the results for clinical experts are the factors considered to choose the best method. Keywords Data Mining, Artificial Neural Networks, Decision Trees, Logistic Regression Analysis, Malignancy, Thyroid Tumour Patients 1. INTRODUCTION Thyroid nodule is a common problem in population and decision making for type of management is subject of controversies. Management varies from observation to total thyroidectomy. Sequence of diagnostic procedures is based on management protocols of the center. FNA (Fine Needle Aspiration) of the nodule is one of the most useful tools in determining type of management of thyroid nodules but has some limitations in accurate report and some significant mistakes while decision making, are made [1]. Therefore, it is necessary to help the physicians by data mining techniques and search deeply in patients’ attributes to find the meaningful relations and develop the diagnosis process.
  • 2. 2 Computer Science & Information Technology (CS & IT) There are different classification methods in data mining techniques [2]. Some of them are deeply dependent to underlying theoretical assumptions such as linear and logistic regression models. They are called parametric methods. And some the others are assumptions free like artificial neural networks, decision trees, K-nearest neighbourhood, etc. Recently, non parametric methods of data mining techniques which are not depending on theoretical distributional assumptions receive more attention in practice. The real dataset seldom follows the underlying theoretical assumptions of parametric methods. Clinical dataset is such an example. The avoidance of ideal theoretical assumptions in one hand and the variability in the nature of clinical dataset and their vague relations on the other hand cause this favourability. The attribute nature of biological data and their vague relation does not consist with theoretical distributional assumptions of parametric methods. For instances, dichotomous attribute in logistic regression analysis which is modelled based on the other attributes should follow Bernoulli theoretical distribution. The independency of error terms of models, the independency of other attributes in the model and enough sample size are the other ideal theoretical assumptions of these modelling methods. The application of such methods and the accuracy of their results depend on fulfilment of their ideal assumptions. In comparison, other methods such as artificial neural networks and decision trees use the learning process from a set of existent prototypes without any specific underlying assumptions. Therefore, the relation among the attributes is discovered from a part of the dataset (training set) and the parameters are estimated in such a way that the error prediction is minimized. Then, the power of model’s prediction is evaluated by the other part of dataset (testing set). These two mentioned methods have some advantages and disadvantages in their own [3]. In general, decision trees will require much less training time than neural networks. However, despite of neural network methods, decision tree techniques are less sensitive to the noise. In other words, the hidden layers in neural network discover complicated relations of the data and assign more weights to the important attributes. Nevertheless, their performance depends on the dataset. For some data, much more time and complicated computing is needed to train a neural network or/and the results are hardly interpretable by the expert of that field. For some other data, decision tree techniques are not able to discover reasonable relations. There are some studies which compare the abilities of these two methods and also, with logistic regression analysis in different research fields. For instances see [4, 5, 6]. In this paper, we compare feed forward back propagation neural networks and a well-known algorithms of decision tree namely J48 on our clinical dataset. A feed forward back propagation neural network repeatedly examines all the training data in the process of updating its weights [7]. The mentioned decision tree learning algorithms recursively partitions the training data into ever smaller subsets on which a test is made [8]. 2. METHODS AND MATERIALS In this section, three applied classification methods are described briefly. Then, the clinical diagnosis problem and the patients’ population used in practical part of the study are explained at the end. 2.1. Artificial Neural Networks Neural networks are the branch of artificial intelligence. Their models are inspired by the neural systems of human brain. And have been applied in many research fields such as biology, psychology, statistics, mathematics, medical science, computer sciences, and also, a variety of business areas like finance, management and decision making, marketing and production [9]. Recently, artificial neural networks (ANNs) become a very popular model to diagnose disease. However, ANNs have some disadvantages and some advantages for medical analysis. Their
  • 3. Computer Science & Information Technology (CS & IT) 3 discrimination power, discovery the complex and nonlinear relationship among the attributes, and prediction of the cases are the most important advantages of ANNs. Nevertheless, they can be over-fitted for training data, and time consuming because of computational requirements [9]. The selection of an appropriate training algorithm, transfer functions, initial values of network weights and also, the number of parameters and hidden layers to define the network size determine the performance of an ANN. But compared to logistic regression analysis, neural network models are more flexible [10]. In this study, we use a type of neural networks namely feed-forwards network with back propagation algorithm to model the relations among attributes in our clinical dataset. A feed- forward back propagation neural network is a supervised network. That is, it uses training and testing data to build a model. The data involves a set of input attributes with their corresponding output. The network uses the training data to “learn” how to predict the known output, and the testing data is used for validation. The aim is to predict the output for any given inputs in such a way that the distance between the observed and predicted outputs becomes minimized. This algorithm repeatedly examines all the training data to update its weights. These weights are adjusted during training and the process is only in the forward direction through the network without any feedback loops [9]. The simplified process for training a feed-forward back propagation network is as follows [9]: 1. Input data is entered to the network and propagated through the network until it reaches the output layer. This forward process produces a predicted output. 2. The predicted output is subtracted from the actual output and an error value for the networks is calculated. 3. The neural network then uses supervised learning, which in most cases is back propagation, to train the network. Back propagation is a learning algorithm for adjusting the weights. 4. Once back propagation has finished, the forward process starts again, and this cycle is continued until the error between predicted and actual outputs is minimized. Equations 1 to 3 summarize the formulations as follows: ܽ௠ାଵ = ݂௠ାଵሺ‫ݓ‬௠ାଵ ܽ௠ + ܾ௠ାଵሻ ݂‫ݎ݋‬ ݉ = 0, … , ‫ܯ‬ − 1 ሺ1ሻ Where, ሼ‫݌‬ଵ, ‫ݐ‬ଵሽ, ሼ‫݌‬ଶ, ‫ݐ‬ଶሽ, … , ሼ‫݌‬ொ, ‫ݐ‬ொሽ is the data set (pi and ti are the ith input and output respectively), and M is the number of layers with f m as the transfer function, wm as the weight and bm as the bias in the mth layer. The input of the first layer is the network input and the output of the last layer is the network output. ܽ = ܽ௠ , ܽ଴ = ‫݌‬ ሺ2ሻ The parameters (wm and bm ) are estimated in such a way that the mean square errors (the mean distances between the observed and estimated outputs, ti and ai respectively) is minimized. 2.2. Decision Trees Decision tree is a typical method for the classification of objects in to decision classes [12]. A decision tree classifier is a function as follows: )()(...)()(: 21 YdomXdomXdomXdomdt n →××× (4) In which,
  • 4. 4 Computer Science & Information Technology (CS & IT) nXXX ,...,, 21 are input attributes and Y is the output, where Xi has domain dom (Xi) and Y has domain dom (Y). We assume without loss of generality that dom (Y)={Y1, Y2} (a dichotomous discrete attributes). A decision tree is a directed, acyclic graph T in a form of a tree. Each node in a tree has either zero or more outgoing edges. If a node has no outgoing edges, then it is called a decision node (a leaf node); otherwise, a node is called a test node (or an attribute node). Each decision node N is labeled with one of the possible decision classes }.,{ 21 YYY ∈ Each test node is labelled with one input attribute },...,,{ 21 ni XXXX ∈ i.e. called the splitting attribute. Each splitting attribute Xi has a splitting function fi associated with it. The splitting function fi determine the outgoing edge from the test node, based on the attribute value Xi of an object O in question. It is in form of ii YX ∈ where );( ii XdomY ⊂ if the value of the attribute Xi of object O is within Yi , then the corresponding outgoing edge from the test node is chosen. The problem of decision tree construction is as follows: Given a data set ),...,,( 21 nxxxX = , where xi are input random samples from an unknown probability distribution P, find a decision tree classifier T such that the misclassification rate RT(P) is minimal [12]. 2.3. Logistic Regression Analysis Logistic regression is also called logistic model or logit model, is a type of predictive model in which the target variable (output attribute) is a dichotomous variable for instances healthy or unhealthy, dead or alive, win or loss, etc. Logistic regression is used for the prediction of the probability of occurrence the desired event from two existent ones in output variable by fitting the data into a logistic curve. Like many forms of regression analysis, the input attributes may be either numerical or categorical. For example, the probability that a person has a heart attack in a specified time that might be predicted from the knowledge of person’s age, sex and body mass index [9]. Logistic regression is widely applied in the medical sciences. The output attribute Y, of a subject can take one of two possible values, denoted by 1 and 0 (for example, Y=1 if a disease is present; otherwise Y=0). Let X=(x1, x2,,…, xn) be the vector of input attributes. The logistic regression model is used to explain the effects of the input attributes on the probability of occurrence the value 1 for output Y. ‫ݐ݅݃݋ܮ‬ ሼܲሺܻ = 1ሻሽ = log ቊ ܲሺܻ = 1ሻ 1 − ܲሺܻ = 1ሻ ቋ = ܾ଴ + ܾଵ‫ݔ‬ଵ + ⋯ + ܾ௡‫ݔ‬௡ ሺ5ሻ Where, P stands for probability, b0 is called the “intercept” and b1, b2,… are called the “regression coefficients” of x1, x2, … respectively. Each of the regression coefficients describes the importance of corresponding input attribute on output. A positive regression coefficient means that this input increases the probability of outcome, where as a negative regression coefficient means that the considered input decreases the probability of outcome. In addition, the absolute value of the coefficient detects its effect on the probability of outcome. A large values means strongly influences and a non-zero regression coefficient means little influence on the probability of outcome [9]. Analysis of our clinical data was done by Matlab 2008a for artificial neural network method, WEKA software, version 3.7.1 for decision tree technique and SPSS software, version 16.0 for logistic regression analysis. The accuracy rate in prediction for these three methods is calculated to determine the best predictive methods.
  • 5. Computer Science & Information Technology (CS & IT) 5 2.4. Patients’ Population and Clinical Problem The malignancy or benignity of thyroid nodule is sometimes in ambiguity for the physicians. Only based on the pathology result after the surgery and removal of the thyroid tumour, the type of tumour is certainly determined, whereas it is better to avoid the unnecessary surgery. Although there are various factors which help the physician in diagnosis before the surgery but, the ultimate decision is still in ambiguity. For instance, FNA (Fine Needle Aspiration) of the nodule is one of the most useful tools in determining type of tumour thyroid but it has some limitations in accurate report and some significant mistakes while decision making, are made [1]. In the present study, all the patients in two recent years (2011 & 2012) which are referred to Shahid Rajaee hospital in Shiraz, Iran for the surgery of thyroid tumour are participated to our study. During this time, 259 cases with positive FNA result are entered the study. Most of them are female (211 versus 48 cases) with overall mean age of 42.3± 13.6 (41±13 for female and 47.9±15 for male). Some patients’ characteristics are considered as the inputs attributes such as gender, age, thyroid nodules size, tumour size, type of operation, the duration of the disease, patient family history,The dichotomous output is the malignancy or benignity of thyroid nodule derived from the pathology results after the surgery. 3. RESULTS Three explained methods applied to the clinical dataset and the results are summarized separately as follows: 3.1 Artificial Neural Network Results Table 1 explains the information of the network trained by the thyroid tumour data. From the 259 cases, 75 percents (196 cases) which were chosen randomly are used in training set and 25 percents (63 cases) are used for validation. The absolute values of the final updated weights on the 14th step determine the important attributes on the malignancy of the tumour. According to the trained network results, the first five important attributes are Multiple Nodule, Cancer Family History, Size of Left Lobe, Lobectomy and Type of Operation, respectively. Table 1. The information of trsined artificial neural network on thyroid tumour data Network type Algorithm No. of inputs No. of output No. of hidden layer No. of neuron in hidden layer Transfer function Iteration Epochs Convergence precision No. of step to convergence Feed-forward Back Propagation 29 1 1 10 Log-sigmoid 50 1000 0.00001 14 The accuracy of the prediction is 98 percents for training set and 92 percents for validation set. These results confirm the power of the trained network in prediction. Table 2 shows the prediction results. The target output is the observed result from pathology which detects the real status of the patient after surgery and the estimated output is derived from the trained network.
  • 6. 6 Computer Science & Information Technology (CS & IT) Table 2. The estimated output versus the observed one by the trained neural network Estimated output Target In training set Malignant Benign In validation set Malignant Benign Malignant 81 2 32 3 Benign 1 112 2 26 3.2 Decision Tree Results 150 cases were randomly chosen from 259 cases to train the decision tree. J48 algorithm is used and the results are summarized in Tables 3 and 4. The first five important attributes near the root of the tree are Size of Left Lobe, Type of Operation, Multiple Nodule, Encapsulation and Cancer Family History, respectively. Almost near to neural network results but with lower accuracy rate (80 percents in training and 75 percents in validation set). Table 3. The estimated output versus the observed one by the derived decision tree Estimated output Target In training set Malignant Benign In validation set Malignant Benign Malignant 43 12 52 11 Benign 18 77 16 30 Table 4. The information of derived decision tree on thyroid tumour data Root mean squared error Relative absolute error Root relative squared error Coverage of cases (0.95 level) Mean rel. region size (0.95 level) Total Number of Instances 0.4826 99.1188 % 101.0544 % 98.7013 % 98.7013 % 259 3.3 Logistic Regression Analysis Results In our clinical dataset for logistic regression analysis, the real type of tumour for each case from pathology result is considered as the binary output of the model. Therefore, Y=1 for malignant and Y=0 for benign tumour. And 29 patients’ characteristics are considered as the input vector of attributes, i.e. X=(x1, x2,,…, x29). Conditional forwards method is used as the variables’ entrance method to the model. No significant relation found among the data. That is, none of the input attribute enters the model and logistic regression analysis was not able to find the significant relation between the inputs and the output. Several methods of variables’ entrance were also used without any significant results. Table 5 and Eq. 6 describe the estimated model containing the only coefficient as the intercept. Coefficient of determination which detects the model ability to explain the output’s variability is 0.69. It means that without any information of these 29
  • 7. Computer Science & Information Technology (CS & IT) 7 attributes of the patient, 69 percents of tumor malignancy is estimable. This result is not confirmed by the physician. Table 5. Estimated coefficient of logistic regression fitted on thyroid tumor dataset Variables in the equation B Standard Error Wald statistics Degree of freedom Significance EXP(B) Constant -0.974 0.139 48.868 1 <0.001 0.378 log ቊ ܲሺܻ = 1ሻ 1 − ܲሺܻ = 1ሻ ቋ = −0.974 ሺ6ሻ 4. CONCLUSIONS As mentioned earlier, the ideal theoretical assumptions’ of parametric classification methods bring them some limitations in practice. In other words, the real dataset seldom fulfil their assumptions. For logistic regression classification method for instance, in addition of underlying distributional assumptions such as Bernoulli probability distribution for observed dichotomous outputs, independency of observed inputs, independency and identically distribution of error terms and enough sample size in both categories of output, the number of inputs is important too. Usually, parametric methods are seldom able to manage more than 30 attributes. In the clinical example of our study, the theoretical assumptions may not be held. Furthermore, 29 input attributes with 12 categorical variables which are entered to the model by 27 indicator variables may seem out of the model’s ability. And this fact may cause to the strange results far from the physician’s expectance. In comparison, two nonparametric classification methods discussed in the present study are more flexible and nearer to the real data circumstances. In our clinical dataset, neural network method consumed more time for computation with complicated results not simply interpretable by the clinical experts. Decision tree method in contrast, represents the simple decision rules in shorter time. However, it is less sensitive to the noise (irrelevant attribute) with lower accuracy rate than neural network. To choose the best method, authors suggest further attempt to test more networks with different algorithms and transfer functions and also various algorithms of decision tree on the discussed dataset. However, it depends to the physician’s preference too. ACKNOWLEDGEMENTS The authors would like to thank trauma research center of Shahid Rajaee hospital in Shiraz University of Medical Sciences and its staff, for their valuable cooperation in data gathering process. REFERENCES [1] Papageorgiou E, Kotsioni I & Linos A,(2005) “Data mining: a new technique in medical research”, Hormones, Vol. 4, No. 4, pp 189-191. [2] Seifert J.W. (2010) “Data Mining and Homeland Security: An Overview”, BiblioGov. [3] Lawrence O. Hall, Xiaomei Liu, Kevin W. Bowyer2 & Robert Ban_eld, (2003) “Why are neural networks sometimes much more accurate than decision trees: an analysis on a bio-informatics problem”, IEEE International Conference on Systems, Man & Cybernetics, Washington, D.C., pp. 2851-2856, October 5–8.
  • 8. 8 Computer Science & Information Technology (CS & IT) [4] Kazemnejad A, Batvandi Z & Faradmal J, (2010) “Comparison of artificial neural network and binary logistic regression for determination of impaired glucose tolerance/diabetes”, Eastern Mediterranean Health Journal, Vol. 16, No. 6, pp 615-620. [5] Kue W.J, Chang R.F, Chen D.R, et al. (2001) “Data Mining with decision trees for diagnosis of breast tumor in medical ultrasonic images”, Breast Cancer Research and Treatment, Vol. 66, pp 51- 57. [6] Sakai S, Kobayashi K, et al. (2007) “Comparison of the levels of accuracy of an artificial neural network model and a logistic regression model for the diagnosis of acute appendicitis” J Med Syst, Vol. 31, pp 357–364. [7] Anthony M. & Bartlett P, (1999) “Neural Network Learning: Theoretical Foundations”, Cambridge University press. [8] Quinlan J.R, (1996) “C4.5: Programs for Machine Learning”, Morgan Kaufmann, San Mateo, CA. [9] Raghavendra B.K & Srivatsa S.K, (2011) “Evaluation of logistic regression and eural network model with sensitivity analysis on medical datasets”, International jcomputer science and security, Vol 5, no, 5. pp 503-511. [10] Tasdelen B, Helvaci S, Kaleagasi H & Ozge A, (2009) “Artificial neural network analysis for prediction of headache prognosis in elderly patients”, Turk J Med Sci , Vol. 39, No. 1, pp 5-12. [11] Zhang J, Lok T, R.Lyu M, (2007) “A hybrid particle swarm optimization-back-Propagation algorithm for feedforward neural network training”, Applied Mathematics and Computation, Vol. 185, pp1026- 1037. [12] Kokol P, Pohorec S, Stiglic G & Podgorelec V, (2012) “Evolutionary design of decision trees for medical application”, WIRE Data Mining Knowl Discov, Vol. 2, pp 237-254. [13] Shahbaz Khan F, Anwer RM, Torgersson O, Falkman G, (2007) “Data Mining in Oral Medicine Using Decision tree”, World Academy of Science,Engineering and Technology, Vol. 37, pp 12-16. Authors Saeedeh Pourahmad is an assistant professor at the Biostatistics Department of Shiraz University of Medical Sciences in Iran. She obtained a B.Sc. in Statistics from Shiraz University in 2002. She also obtained an M.S. and a Ph.D. degree in Biostatistics from Shiraz University of Medical Sciences in 2004 and 2011 in Iran, respectively. Her research is modelling in Fuzzy environments, neural networks, nonlinear and linear relations in crisp environments and their clinical applications. Mohsen Azad is a M.S. student at the Biostatistics Department of Shiraz University of Medical Sciences in Iran. He obtained a B.Sc. in Statistics from Shahid Bahonar University in Kerman (Iran) in 2010. He is currently involved with the present research. Shahram Paydar is an assistant professor at the Department of Surgery in Medical School of Shiraz University of Medical Sciences in Iran. He obtained a MD d egree and Na tional board of surgery in 2000 and 2007 from Shiraz University of Medical Sciences in Iran, respectively. He is a full time member of Trauma Research Center in Rajaee Hospital of Shiraz, Iran, oct. 2007 and his research is in Trauma and emergency management and Thoracic surgery. Hamid Reza Abbasi is an associated professor at the Department of Surgery in Medical School of Shiraz University of Medical Sciences in Iran. He obtained a MD degree and General Surgery, MD in 1994 and 1998 from Shiraz University of Medical Sciences in Iran, respectively. His research is in determining prognosis of acutely ischemic limb, correlation of sonographic measurement of IVC diameter, etc.