Mining lung cancer data and other diseases data using data mining techniques
 


International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367 (Print), ISSN 0976-6375 (Online), Volume 4, Issue 2, March – April (2013), pp. 508-516. © IAEME: www.iaeme.com/ijcet.asp. Journal Impact Factor (2013): 6.1302 (Calculated by GISI), www.jifactor.com

MINING LUNG CANCER DATA AND OTHER DISEASES DATA USING DATA MINING TECHNIQUES: A SURVEY

Parag Deoskar1, Dr. Divakar Singh2, Dr. Anju Singh3
1 MTech Scholar, CSE Deptt., BUIT, Barkatullah University, Bhopal
2 HOD of CSE Deptt., BUIT, Barkatullah University, Bhopal
3 Astt. Prof. of CSE Deptt., BUIT, Barkatullah University, Bhopal

ABSTRACT

Any list of the world's most dangerous diseases includes cancer, and lung cancer is one of the most dangerous cancer types. The disease spreads through uncontrolled cell growth in tissues of the lung. Early detection can save the life and improve the survivability of patients. In this paper we survey several aspects of data mining that are used for lung cancer prediction. Data mining is useful in lung cancer classification. We also survey aspects of the ant colony optimization (ACO) technique. Ant colony optimization helps in increasing or decreasing the disease prediction value. This study surveys assorted data mining and ant colony optimization techniques for appropriate rule generation and classification, which lead to accurate cancer classification. In addition, it provides a basic framework for further improvement in medical diagnosis.

Keywords: ACO, data mining, rule pruning, Pheromone

1. INTRODUCTION

Lung cancer is a disease caused by uncontrolled cell growth in tissues of the lung. If the cancer is not treated at an early stage, this growth can spread beyond the lung, in a process called metastasis, into nearby tissue and eventually into other parts of the body. Most cancers that start in the lung (primary lung cancers) are carcinomas that derive from epithelial cells. Tobacco smoke is a common cause of lung cancer. Lung cancer is the main cause of cancer death worldwide, and it is difficult to detect in its early stages because symptoms may appear only at an advanced stage, sometimes the last stage. Several studies suggest that early detection of lung cancer decreases the mortality rate.
Classification is one of the most important tasks in mining any data set. A classification problem is concerned with assigning an object to one of a set of predefined classes on the basis of its attributes [1], [2]. Many decision tasks in engineering, medicine, and management science can be considered classification problems. Popular examples are pattern classification, speech recognition, character recognition, medical diagnosis and credit scoring.

In our case, however, classification alone is insufficient for classifying a lung cancer dataset. Frequent pattern mining is a better tool for separating relevant data from the raw dataset. The performance of association rule mining depends directly on frequent pattern mining, which is the core problem of mining association rules [3]. As research on frequent itemset mining has developed and become more detailed, it has come to be widely used in the field of data mining, for example in association rule mining, correlation analysis, classification, clustering [4], support vector machines [5] and positive association rule classification [6].

The main aim of data mining is to extract important information from huge amounts of raw data. We emphasize mining lung cancer data to discover knowledge that is not only accurate but also comprehensible for lung cancer detection [7], [8], [9]. Comprehensibility is important whenever discovered knowledge will be used to support a human decision. If discovered knowledge is not comprehensible to a user, it is not possible to interpret and validate it. Trust in the discovered rules is therefore very important: in decision making, knowledge that cannot be interpreted and validated can lead to incorrect decisions.

We provide here an overview of medical data mining techniques. The rest of this paper is arranged as follows: Section 2 introduces medical data mining; Section 3 describes ant colony optimization; Section 4 describes related works; Section 5 presents the theoretical analysis; Section 6 gives the conclusion.

2. MEDICAL DATA MINING

Data mining refers to extracting or "mining" knowledge from large amounts of data or databases [10]. The process of finding useful patterns or meaning in raw data has been called knowledge discovery in databases (KDD) [11]. KDD includes cleaning of inconsistent data. Data mining also provides pattern classification, visualization and rule separation.

To understand the utility of data mining, it helps to categorize data mining techniques by their function, as below [12]:

1) Regression is a statistical methodology that is often used for numeric prediction.
2) Association returns affinities of a set of records.
3) Sequential pattern function searches for frequent subsequences in a sequence dataset, where a sequence records an ordering of events.
4) Summarization is to make a compact description for a subset of data.
5) Classification maps a data item into one of the predefined classes.
6) Clustering identifies a finite set of categories to describe the data.
7) Dependency modeling describes significant dependencies between variables.
8) Change and deviation detection is to discover the most significant changes in the data by using previously measured values.
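As an illustration of the association function (item 2) and of the frequent pattern mining on which association rules depend, the following minimal Python sketch computes support and confidence for one rule over a toy set of records. The record contents, the item labels A, B and C, and the function names support and confidence are assumptions made for this example; they are not taken from any of the surveyed papers.

```python
from typing import FrozenSet, Iterable, Sequence


def support(transactions: Sequence[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    wanted = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for record in transactions if wanted <= record)
    return hits / len(transactions)


def confidence(
    transactions: Sequence[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Confidence of the rule antecedent -> consequent.

    Defined as support(antecedent and consequent) / support(antecedent).
    """
    left = frozenset(antecedent)
    both = left | frozenset(consequent)
    left_support = support(transactions, left)
    if left_support == 0.0:
        return 0.0
    return support(transactions, both) / left_support


# Hypothetical toy records; A, B and C are placeholder attribute labels.
RECORDS = [
    frozenset({"A", "B", "C"}),
    frozenset({"A", "B"}),
    frozenset({"A", "C"}),
    frozenset({"B"}),
]

if __name__ == "__main__":
    print("support({A, B})   =", support(RECORDS, {"A", "B"}))
    print("confidence(A->B)  =", confidence(RECORDS, {"A"}, {"B"}))
```

An itemset is usually called frequent when its support reaches a user-chosen minimum threshold; rules are then generated only from frequent itemsets, which is why rule quality depends on the frequent pattern mining step.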
Medical diagnosis is highly subjective because it depends on the clinical experience and personal perception of the doctor. A number of studies have shown that the diagnosis of one patient can differ significantly if the patient is examined by different doctors, or even by the same doctor at different times [13]. The idea of medical data mining is to extract hidden knowledge in the medical field using data mining techniques. It is possible to identify patterns even if we do not fully understand the causal mechanisms behind them. Even patterns that are irrelevant can be discovered [14]. Clinical repositories containing large amounts of biological, clinical, and administrative data are increasingly becoming available as health care systems integrate patient information for research and utilization objectives [15]. Data mining techniques applied to these databases discover relationships and patterns that are helpful in studying the progression and the management of disease [16]. A typical clinical data mining study follows a cycle: structured data and narrative text, hypotheses, tabulated data and statistics, analysis and interpretation, new knowledge and more questions, outcomes and observations, and back to structured data and narrative text [17]. Prediction or early diagnosis of a disease can be treated as kinds of evaluation. For diseases like skin cancer, breast cancer or lung cancer, early detection is vital because it can help in saving a patient's life [18].

3. ANT COLONY OPTIMIZATION

The Ant Colony Optimization (ACO) algorithm is a meta-heuristic that combines a distributed environment, a positive feedback system, and a systematic greedy approach to find an optimal solution for combinatorial optimization problems.

The Ant Colony Optimization algorithm is mainly inspired by the experiments run by Goss et al. [19] using a colony of real ants in a real environment. They studied and observed the behaviour of those ants and suggested that real ants are able to select the shortest path between their nest and a food resource when alternate paths exist between the two. This search for a food resource is possible through an indirect form of communication amongst the ants known as stigmergy. While travelling to the food resource, ants deposit a chemical substance, called pheromone, on the ground. When they arrive at a decision point, ants make a probability-based choice, biased by the intensity of the pheromone they smell. This behaviour has an autocatalytic effect, because an ant choosing a path increases the probability that the same path will be chosen again by other ants. When the ants return after finishing the search, the probability of choosing the same path is higher because the quantity of pheromone on it has increased. The pheromone released on the chosen path thus marks the path for the other ants. In short, the ants converge on the shortest path.

Figure 1 shows the behaviour of ants in a double bridge experiment [20]. In this case the pheromone-laying mechanism itself causes the shortest path to be chosen. The first ants to arrive at the food source are those that took the two shortest branches. When these ants start their return trip, more pheromone is present on the short branch than on the long branch, so the short branch is more likely to be chosen. This ant behaviour was first formulated as the Ant System (AS) by Dorigo et al. [21]. Based on the AS algorithm, the Ant Colony Optimization (ACO) algorithm was proposed [22].
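The double bridge behaviour can be imitated with a very small simulation. The Python sketch below is a simplified model under stated assumptions, not the Ant System or ACO algorithm of [21], [22]: each ant picks a branch with probability proportional to the pheromone on it, all pheromone evaporates at a fixed rate, and the chosen branch receives a deposit inversely proportional to its length (a stand-in for the faster round trip on the short branch). The function name, its parameters and their default values are hypothetical.

```python
import random
from typing import Dict, Tuple


def simulate_double_bridge(
    n_ants: int = 2000,
    short_len: float = 1.0,
    long_len: float = 2.0,
    evaporation: float = 0.02,
    seed: int = 0,
) -> Tuple[Dict[str, int], Dict[str, float]]:
    """Toy double bridge model with two branches, 'short' and 'long'.

    Returns (how often each branch was chosen, final pheromone levels).
    """
    rng = random.Random(seed)
    length = {"short": short_len, "long": long_len}
    pheromone = {"short": 1.0, "long": 1.0}  # both branches start equal
    choices = {"short": 0, "long": 0}

    for _ in range(n_ants):
        # Probabilistic choice biased by pheromone intensity.
        total = pheromone["short"] + pheromone["long"]
        branch = "short" if rng.random() < pheromone["short"] / total else "long"
        choices[branch] += 1

        # Evaporation on every branch.
        for name in pheromone:
            pheromone[name] *= 1.0 - evaporation

        # Positive feedback: the shorter branch gets the larger deposit.
        pheromone[branch] += 1.0 / length[branch]

    return choices, pheromone


if __name__ == "__main__":
    counts, levels = simulate_double_bridge()
    print("branch choices:", counts)
    print("final pheromone:", levels)
```

In this model the autocatalytic effect is visible directly: every choice of a branch raises the probability that later ants choose it too, and because the short branch is reinforced more strongly per choice, most ants are expected to end up on it.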
In the ACO algorithm, the optimization problem is expressed as a graph G = (C, L), where C is the set of components of the problem and L is the set of possible connections or transitions among the elements of C. A solution is represented in terms of feasible paths on the graph G, with respect to a set of given constraints and predicates. The population of ants, also called agents, collectively solves the problem under consideration using the graph representation. Although each individual ant is likely to be poor at finding a solution, good-quality solutions can emerge as a result of collective interaction amongst the ants. Pheromone trails encode a long-term memory of the whole ant search process, from the start to the food resource destination. The pheromone value depends on the problem formulation, the representation and the optimization objective, which differ from case to case.

Figure 1: Double bridge experiment. (a) Ants start exploring the double bridge. (b) Eventually most of the ants choose the shortest path [20].

The algorithm presented by Dorigo et al. [22] is given below:

Algorithm ACO meta heuristic();
    while (termination criterion not satisfied)
        ant generation and activity();
        pheromone evaporation();
        daemon actions(); "optional"
    end while
end Algorithm

4. RELATED WORKS

In 2011, Hnin Wint Khaing et al. [23] presented an efficient approach for the prediction of heart attack risk levels from a heart disease database. In their algorithm the heart disease database is first clustered into groups of similar elements using the K-means clustering algorithm. Their approach allows the number of fragments to be controlled through the k parameter. Frequent patterns relevant to heart disease are then mined from the extracted data using the MAFIA (Maximal Frequent Itemset Algorithm) algorithm. The learning algorithm is then trained with the selected significant patterns for the effective prediction of heart attack. They employed the ID3 algorithm as the training algorithm to show the level of heart attack risk with a decision tree. According to the authors, the results showed that the designed prediction system is capable of predicting heart attack effectively.

In 2011, Zenggui Ou et al. [24] discussed how to use sequential characteristics in the course of Web data mining to carry out the structural transfer of semi-structured data based on the time effect of the data, that is, the systematic structuring of Web resource data, and thereby solve the problem of retrieval effectiveness.

In 2010, Zakaria Suliman Zubi et al. [25] noted that lung cancer is a disease of uncontrolled cell growth in tissues of the lung and one of the most common and deadly diseases in the world. The authors suggest that detecting lung cancer in its early stage is the key to its cure. In general, measures for early stage lung cancer diagnosis mainly include those utilizing X-ray chest films, CT, MRI, etc. Medical image mining is a promising area of computational intelligence applied to automatically analysing patients' records, aiming at the discovery of new knowledge potentially useful for medical decision making.

In 2011, Yao Liu et al. [26] proposed and implemented a classifier using discrete particle swarm optimization (DPSO) with an additional new rule pruning procedure for detecting lung cancer and breast cancer, which according to the authors are the most common cancers for men and women. According to the authors' experiments, the new pruning method further improves the classification accuracy, and their approach is effective in making cancer predictions.

In 2011, Chandrasekhar U et al. [27] discussed and analysed recent improvements in clustering algorithms such as PP (Projection Pursuit) based on the ACO algorithm for high dimensional data, recent applications of data clustering with ACO, the application of an ant-based clustering algorithm for object finding by multiple robots in the image processing field, and the hybrid PSO/ACO algorithm for better optimized results. According to the authors, cluster analysis is a popular and widely used data analysis and data mining technique. High quality and fast clustering algorithms play a vital role in helping users navigate, effectively organize and structure data. They observed that Ant Colony Optimization (ACO), a swarm intelligence technique, has been integrated with clustering algorithms and used by many applications over the past few years.

In 2011, Shyi-Ching Liang et al. [28] noted that the classification rule is the most common representation of a rule in data mining. It is based on a supervised learning process that generates rules from a training data set. The main goal of classification rule mining is the prediction of the predefined class based on the group. Based on the ACO algorithm, Ant-Miner solved the classification rule problem. According to the authors, Ant-Miner shows good performance on many datasets. In this paper the authors propose an extension of Ant-Miner that incorporates the concepts of parallel processing and grouping. Intercommunication via pheromone among ants is a critical part of ant colony optimization's searching mechanism, but the design of Ant-Miner modifies this part slightly in a way that removes the parallel searching capability. Based on Ant-Miner, they propose an extension that modifies the algorithm design to incorporate parallel processing. The pheromone trails deposited by ants during the searching procedure affect each other. With the help of pheromone, ants can make better decisions while searching. They provide a possible direction for research on the classification rule problem.
In 2011, Mete ÇELİK et al. [29] discussed several classical and heuristic algorithms proposed to mine classification rules out of large datasets. The authors proposed a novel heuristic classification data mining approach based on the artificial bee colony (ABC) algorithm, called ABC-Miner. The proposed approach was compared with the Particle Swarm Optimization (PSO) rule classification algorithm and the C4.5 algorithm using benchmark datasets. The experimental results show good efficiency of the proposed method.

In 2011, G. Sophia Reena et al. [30] noted that cancer research is an interesting research area in the field of medicine. The authors suggest that classification is highly important for cancer diagnosis and treatment. The precise prediction of different tumor types has immense value in providing better care and minimizing toxicity for patients. The authors suggest that classification of patient samples available as gene expression profiles has become a subject of widespread study in biomedical research in recent years. Formerly, cancer classification depended on morphological and clinical findings. The recent arrival of microarray technology has permitted the simultaneous observation of thousands of genes, which prompted progress in cancer classification using gene expression data. Their study focuses on the broadly used assorted data mining and machine learning techniques for appropriate gene selection, which lead to accurate cancer classification.

In 2013, S. Vijiyarani et al. [31] reviewed the field and noted that data mining is defined as sifting through very large amounts of data for useful information. Some of the most important and popular data mining techniques are association rules, classification, clustering, prediction and sequential patterns. Data mining techniques are used for a variety of applications. In the health care industry, data mining plays an important role in predicting diseases. Detecting a disease normally requires a number of tests from the patient, but with data mining techniques the number of tests can be reduced. This reduced set of tests plays an important role in time and performance. The technique has advantages and disadvantages. They analyse how data mining techniques are used for predicting different types of diseases.

As per our study, several works and algorithms have been presented for efficient cancer detection. The algorithms are based on data mining, fuzzy logic, particle swarm optimization, etc. Several authors work specifically on different types of cancer. After analysing those research papers we find that several research works are based on lung cancer, heart disease and breast cancer. Some of the authors present good results in the case of breast cancer and heart disease but fail to achieve high accuracy in the case of lung cancer. In 2011, Yao Liu et al. [26] proposed and implemented a classifier using DPSO with a new rule pruning procedure for detecting lung cancer and breast cancer, using data from the UCI repository. In the case of lung cancer they achieve an accuracy of 68.33 with discrete particle swarm optimization and 64.44 with particle swarm optimization. In the case of breast cancer they achieve an accuracy of 97.23 with discrete particle swarm optimization and 97.06 with particle swarm optimization. They also provide a comparison with different related techniques such as PART, SMO, Naïve Bayes, KNN and classification tree. As per our analysis the result is good in the case of breast cancer. In the case of lung cancer there is room for improvement, because the prediction accuracy is not so high. Data mining and ant colony optimization, combined, can be expected to produce better results by using pheromone trails, which are updated automatically on the basis of iteration and frequent pattern analysis.
5. THEORETICAL ANALYSIS

The theoretical analysis of different diseases with different data mining techniques and their accuracy of detection is shown in Table 1.

Table 1: Theoretical Analysis

Author | Technique Name | Disease Name | Accuracy
Hnin Wint Khaing et al. [23] | K-Means based MAFIA | Heart Disease Prediction | 74%
Hnin Wint Khaing et al. [23] | K-Means based MAFIA with ID3 | Heart Disease Prediction | 85%
Shyi-Ching Liang et al. [28] | Ant Colony Optimization and Classification Rule Problem | Breast Cancer | 70.33
Mete ÇELİK [29] | ABC-Miner | Breast Cancer | Standard Deviation of 0.082
Yao Liu et al. [26] | Mining Cancer data with Discrete Particle Swarm Optimization and Rule Pruning | Lung Cancer | 68.33 (DPSO (new))
Yao Liu et al. [26] | Mining Cancer data with Discrete Particle Swarm Optimization and Rule Pruning | Lung Cancer | 64.44 (PSO (new))

6. CONCLUSION

The use of data mining techniques in lung cancer classification increases the chance of making a correct and early detection, which could prove to be vital in combating the disease. In this paper, we provide a survey on lung cancer detection. We also analyse the utility of data mining, by which an efficient lung cancer detection technique can be found. From this analysis we identify several classification algorithms and their results, which give insights for future work.

As the area of lung cancer is very challenging and researchers are continuing to make progress in efficient detection, there is a lot of scope for improving detection. As per our observation, there are some future suggestions, which are listed below:

1) We can apply neural network and fuzzy based techniques to train a cancer data set for better classification and accuracy.
2) We can apply an optimization technique like Ant Colony Optimization to optimize the classification [33] and improve the detection.
3) A machine learning environment or Support Vector Machine [32] is also an insight for better detection.
4) We can use a homogeneity based algorithm to find overfitting and overgeneralization characteristics. It can be applied through a clustering algorithm like K-Means.
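Since K-Means appears both in suggestion 4 and in the approach of Khaing et al. [23], a minimal sketch of the basic algorithm may help. The Python code below is a plain Lloyd-style K-Means written for illustration only; the function names, the random initialisation from the data points, and the toy one-dimensional data are assumptions, and it is not the implementation used in any surveyed paper.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(
    points: Sequence[Point], k: int, iterations: int = 50, seed: int = 0
) -> Tuple[List[int], List[Point]]:
    """Cluster `points` into `k` groups; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct data points picked at random.
    centroids: List[Point] = rng.sample(list(points), k)
    labels: List[int] = []

    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))

    return labels, centroids


if __name__ == "__main__":
    # Hypothetical one-dimensional data with two well separated groups.
    data = [(0.0,), (1.0,), (2.0,), (100.0,), (101.0,), (102.0,)]
    labels, centroids = k_means(data, k=2)
    print("labels:", labels)
    print("centroids:", sorted(centroids))
```

The k parameter fixes the number of groups in advance, which is the property Khaing et al. use to control the number of fragments.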
REFERENCES

[1] R. O. Duda, P. E. Hart, and D. G. Stork, "Pattern Classification". New York: Wiley, 2000.
[2] D. J. Hand and S. D. Jacka, "Discrimination and Classification." New York: Wiley, 1981.
[3] Agrawal R and Srikant R, "Fast Algorithms for Mining Association Rules", Proc of the International Conference on Very Large Databases. Santiago, Chile, 1994.
[4] Chen XiaoYun and Hu YunFa, "Mining Algorithms of N Most Frequent Itemsets", Pattern Recognition and Artificial Intelligence, 2007.
[5] Hetal Bhavsar, Dr. Amit Ganatra, "Variations of Support Vector Machine classification Technique: A survey", International Journal of Advanced Computer Research (IJACR) Volume-2 Number-4 Issue-6 December-2012.
[6] Nikhil Jain, Vishal Sharma, Mahesh Malviya, "Reduction of Negative and Positive Association Rule Mining and Maintain Superiority of Rule Using Modified Genetic Algorithm", International Journal of Advanced Computer Research (IJACR) Volume-2 Number-4 Issue-6 December-2012.
[7] M. Dorigo, G. Di Caro, and L. M. Gambardella, "Ant algorithms for discrete optimization," Artif. Life, vol. 5, no. 2, pp. 137–172, 1999.
[8] U. M. Fayyad, G. Piatetsky-Shapiro, and P. Smyth, "From data mining to knowledge discovery: An overview," in Advances in Knowledge Discovery & Data Mining, U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, Eds. Cambridge, MA: MIT Press, pp. 1–34, 1996.
[9] Junzo Watada, Keisuke Aoki, Masahiro Kawano, Muhammad Suzuri Hitam, Dual Scaling Approach to Data M Journal of Advanced Computational Intelligence Intelligent Informatics (JACIII), Vol. 10, No. 4, pp. 441-447, 2006.
[10] Jiawei Han and Micheline Kamber, "Data Mining Concepts and Techniques." San Francisco, CA: Elsevier Inc, 2006.
[11] U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, "From Data Mining to Knowledge Discovery: An Overview," in Advances in Knowledge Discovery and Data Mining, 1996.
[12] I.-N. Lee, S.-C. Liao, and M. Embrechts, "Data mining techniques applied to medical information," Med. Inform, pp. 81-102, 2000.
[13] E, Donald, "Introduction to Data Mining for Medical Informatics," Clin Lab Med, pp. 9-35, 2008.
[14] R. Zhang, Y, Katta, "Medical Data Mining," Data Mining and Knowledge Discovery, pp. 305-308, 2002.
[15] Irene M. Mullins et al., "Data mining and clinical data repositories: Insights from a 667,000 patient data set," Computers in Biology and Medicine, vol. 36, pp. 1351-1377, 2006.
[16] J. C. Lobach, D. F. Goodwin, L. K. Hales, J. W. Hage, M. L. EdwardHammond, W. Parther, "Medical Data Mining: Knowledge Discovery in a Clinical Data Warehouse," 1997.
[17] Dan Dalan, "Clinical data mining and research in the allergy office," Current Opinion in Allergy & Clinical Immunology, vol. 10, no. 3, pp. 171-177, June 2010.
[18] Y. Mahajani, G. Aslandogan, "Evidence Combination in Medical Data Mining," in Information Technology: Coding and Computing (ITCC), 2004.
[19] S. Goss, S. Aron, J. L. Deneubourg, and J. M. Pasteels. "Self-organized Shortcuts in the Argentine Ant." Naturwissenschaften, 76:579–581, 1989.
[20] M. Dorigo, Gianni Di Caro, and Luca M. Gambardella. "Ant Algorithms for Discrete Optimization." Technical Report Tech. Rep. IRIDIA/98-10, IRIDIA, Universite Libre de Bruxelles, Brussels, Belgium, 1998.
[21] M. Dorigo, V. Maniezzo and A. Colorni. "The Ant System: An Autocatalytic Optimizing Process." Revised 91-016, Dept. of Electronica, Milan Polytechnic, 1991.
[22] M. Dorigo and G. Di Caro. "New Ideas in Optimisation." McGraw Hill, London, UK, 1999.
[23] Hnin Wint Khaing, "Data Mining based Fragmentation and Prediction of Medical Data", IEEE 2011.
[24] Zenggui Ou, "Data structuring and effective retrieval in the mining of web sequential characteristic", Electronic and Mechanical Engineering and Information Technology (EMEIT), 2011.
[25] Zakaria Suliman Zubi, Rema Asheibani Saad, "Using Some Data Mining Techniques for Early Diagnosis of Lung Cancer", Recent Researches in Artificial Intelligence, Knowledge Engineering and Data Bases, 2010.
[26] Yao Liu and Yuk Ying Chung, "Mining Cancer data with Discrete Particle Swarm Optimization and Rule Pruning", IEEE 2011.
[27] Chandrasekhar U, Naga Poojitha Rao P, "Recent Trends in Ant Colony Optimization and Data Clustering: A Brief Survey", IEEE 2011.
[28] Shyi-Ching Liang, Yen-Chun Lee and Pei-Chiang Lee, "The Application of Ant Colony Optimization to the Classification Rule Problem", IEEE International Conference on Granular Computing, 2011.
[29] Mete ÇELİK, Derviş KARABOĞA and Fehim KÖYLÜ, "Artificial Bee Colony Data Miner (ABC-Miner)", IEEE 2011.
[30] G. Sophia Reena and P. Rajeswari, "A Survey of Human Cancer Classification using Micro Array Data", IJCTA 2011.
[31] S. Vijiyarani and S. Sudha, "Disease Prediction in Data Mining Technique A Survey", International Journal of Computer Applications & Information Technology Vol. II, Issue I, January 2013.
[32] Smruti Rekha Das, Pradeepta Kumar Panigrahi, Kaberi Das and Debahuti Mishra, "Improving RBF Kernel Function of Support Vector Machine using Particle Swarm Optimization", International Journal of Advanced Computer Research (IJACR) Volume-2 Number-4 Issue-7 December-2012.
[33] Anshuman Singh Sadh, Nitin Shukla, "Association Rules Optimization: A Survey", International Journal of Advanced Computer Research (IJACR), Volume-3 Number-1 Issue-9 March-2013.
[34] R. Manickam and D. Boominath, "An Analysis of Data Mining: Past, Present and Future", International journal of Computer Engineering & Technology (IJCET), Volume 3, Issue 1, 2012, pp. 1 - 9, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.
[35] Chaitrali S. Dangare and Dr. Sulabha S. Apte, "A Data Mining Approach for Prediction of Heart Disease Using Neural Networks", International journal of Computer Engineering & Technology (IJCET), Volume 3, Issue 3, 2012, pp. 30 - 40, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.