Abstract: We analyze the breast Cancer data available from the WBC, WDBC from UCI machine learning with
the aim of developing accurate prediction models for breast cancer using data mining techniques. Data mining
has, for good reason, recently attracted a lot of attention, it is a new Technology, tackling new problem, with
great potential for valuable commercial and scientific discoveries. The experiments are conducted in WEKA.
Several data mining classification techniques were used on the proposed data. There are many classification
techniques in data mining such as Decision Tree, Rules NNge, Tree random forest, Random Tree, lazy IBK. The
aim of this paper is to investigate the performance of different classification techniques. The data breast cancer
data with a total 286 rows and 10 columns will be used to test and justify the different between the classification
methods and algorithm.
Keywords - Machine learning, data mining Weka, classification, breast cancer
Applying Deep Learning to Transform Breast Cancer DiagnosisCognizant
Deep convolutional neural networks can assist pathologists in breast cancer diagnosis by automatically filtering benign tissue biopsies, identifying malignant regions and labeling important cellular features like nuclei for further analysis. Automatic detection of diagnostically relevant regions-of-interest and nuclei segmentation reduces the pathologist’s workload, while ensuring that no critical region is overlooked, rendering breast cancer diagnosis more reliable, efficient and cost-effective.
A Review on Data Mining Techniques for Prediction of Breast Cancer RecurrenceDr. Amarjeet Singh
The most common type of cancer in women
worldwide is the Breast Cancer. Breast cancer may be
detected early using Mammograms, probably before it's
spread. Recurrent breast cancer could occur months or years
after initial treatment. The cancer could return within the
same place because the original cancer (local recurrence), or it
may spread to different areas of your body (distant
recurrence). Early stage treatment is done not only to cure
breast cancer however additionally facilitate in preventing its
repetition/recurrence. Data mining algorithms provide
assistance in predicting the early-stage breast cancer that
continually has been difficult analysis drawback. The
projected analysis can establish the most effective algorithm
that predicts the recurrence of the breast cancer and improve
the accuracy the algorithms. Large information like Clump,
Classification, Association Rules, Prediction and Neural
Networks, Decision Trees can be analyzed using data mining
applications and techniques.
Breast Cancer Detection from Mammography Images Using Machine Learning Algorithms (U-Net Segmentation and Dense Net Classifier implementation are in progress)
Machine Learning - Breast Cancer DiagnosisPramod Sharma
Machine learning is helping in making smart decisions faster. In this presentation measurements carried out on FNAC was analysed. The results were validated using 20 percent of the data. The data used for POC is from UCI Repository/
For the scope of this project, it was decided to analyze the data to form distinct clusters based on their tumor type. Unsupervised learning (K-means clustering and hierarchical clustering) were used. Also, it was decided to analyze this data as a classification task. Based on different attributes (primarily mass spectrometry analysis results for 12553 proteins) few classification algorithms were implemented to see if the model can generate the accurate label of cancer type.
Applying Deep Learning to Transform Breast Cancer DiagnosisCognizant
Deep convolutional neural networks can assist pathologists in breast cancer diagnosis by automatically filtering benign tissue biopsies, identifying malignant regions and labeling important cellular features like nuclei for further analysis. Automatic detection of diagnostically relevant regions-of-interest and nuclei segmentation reduces the pathologist’s workload, while ensuring that no critical region is overlooked, rendering breast cancer diagnosis more reliable, efficient and cost-effective.
A Review on Data Mining Techniques for Prediction of Breast Cancer RecurrenceDr. Amarjeet Singh
The most common type of cancer in women
worldwide is the Breast Cancer. Breast cancer may be
detected early using Mammograms, probably before it's
spread. Recurrent breast cancer could occur months or years
after initial treatment. The cancer could return within the
same place because the original cancer (local recurrence), or it
may spread to different areas of your body (distant
recurrence). Early stage treatment is done not only to cure
breast cancer however additionally facilitate in preventing its
repetition/recurrence. Data mining algorithms provide
assistance in predicting the early-stage breast cancer that
continually has been difficult analysis drawback. The
projected analysis can establish the most effective algorithm
that predicts the recurrence of the breast cancer and improve
the accuracy the algorithms. Large information like Clump,
Classification, Association Rules, Prediction and Neural
Networks, Decision Trees can be analyzed using data mining
applications and techniques.
Breast Cancer Detection from Mammography Images Using Machine Learning Algorithms (U-Net Segmentation and Dense Net Classifier implementation are in progress)
Machine Learning - Breast Cancer DiagnosisPramod Sharma
Machine learning is helping in making smart decisions faster. In this presentation measurements carried out on FNAC was analysed. The results were validated using 20 percent of the data. The data used for POC is from UCI Repository/
For the scope of this project, it was decided to analyze the data to form distinct clusters based on their tumor type. Unsupervised learning (K-means clustering and hierarchical clustering) were used. Also, it was decided to analyze this data as a classification task. Based on different attributes (primarily mass spectrometry analysis results for 12553 proteins) few classification algorithms were implemented to see if the model can generate the accurate label of cancer type.
Breast Cancer Diagnostics with Bayesian NetworksBayesia USA
The Wisconsin Breast Cancer Database (WBCD) is a widely studied (and publicly available) data set from the field of breast cancer diagnostics. The creators of this database, Wolberg, Street, Heisey and Managasarian, made an important contribution with their research towards automating diagnostics with image processing and machine learning.
Beyond the medical field, many statisticians and computer scientists have proposed a wide range of classification models based on WBCD. Such new methods have continuously raised the benchmark in terms of diagnostic performance.
Our white paper now reevaluates the Wisconsin Breast Cancer Database within the framework of Bayesian networks, which, to our knowledge, has not been done before. We demonstrate how the BayesiaLab software can extremely quickly — and simply — create a Bayesian network model that is on par performance-wise with virtually all existing models that have been developed from WBCD over the last 15 years.
BREAST CANCER DIAGNOSIS USING MACHINE LEARNING ALGORITHMS –A SURVEYijdpsjournal
Breast cancer has become a common factor now-a-days. Despite the fact, not all general hospitals
have the facilities to diagnose breast cancer through mammograms. Waiting for diagnosing a breast
cancer for a long time may increase the possibility of the cancer spreading. Therefore a computerized
breast cancer diagnosis has been developed to reduce the time taken to diagnose the breast cancer and
reduce the death rate. This paper summarizes the survey on breast cancer diagnosis using various machine
learning algorithms and methods, which are used to improve the accuracy of predicting cancer. This survey
can also help us to know about number of papers that are implemented to diagnose the breast cancer.
A Comparative Study on the Methods Used for the Detection of Breast Cancerrahulmonikasharma
Among women in the world, the death caused by the Breast cancer has become the leading role. At an initial stage, the tumor in the breast is hard to detect. Manual attempt have proven to be time consuming and inefficient in many cases. Hence there is a need for efficient methods that diagnoses the cancerous cell without human involvement with high accuracy. Mammography is a special case of CT scan which adopts X-ray method with high resolution film. so that it can detect well the tumors in the breast. This paper describes the comparative study of the various data mining methods on the detection of the breast cancer by using image processing techniques.
Breast cancer diagnosis and recurrence prediction using machine learning tech...eSAT Journals
Abstract Breast Cancer has become the common cause of death among women. Due to long hours invested in manual diagnosis and lesser diagnostic system available emphasize the development of automated diagnosis for early diagnosis of the disease. Our aim is to classify whether the breast cancer is benign or malignant and predict the recurrence and non-recurrence of malignant cases after a certain period. To achieve this we have used machine learning techniques such as Support Vector Machine, Logistic Regression, KNN and Naive Bayes. These techniques are coded in MATLAB using UCI machine learning depository. We have compared the accuracies of different techniques and observed the results. We found SVM most suited for predictive analysis and KNN performed best for our overall methodology. Keywords: Breast Cancer, SVM, KNN, Naive Bayes, Logistic Regression, Classification.
Predictive Analysis of Breast Cancer Detection using Classification AlgorithmSushanti Acharya
Dissertation project titled “Predictive analysis of Breast Cancer detection using Classification”. For the research conducted, Breast Cancer Wisconsin Diagnostics dataset was used for analysis. Using R language machine learning model was designed based on various algorithms and the derived results were then visualized to present the most accurate model of them all (SVM in this case).
Breast cancer diagnosis via data mining performance analysis of seven differe...cseij
According to World Health Organization (WHO), breast cancer is the top cancer in women both in the
developed and the developing world. Increased life expectancy, urbanization and adoption of western
lifestyles trigger the occurrence of breast cancer in the developing world. Most cancer events are
diagnosed in the late phases of the illness and so, early detection in order to improve breast cancer
outcome and survival is very crucial.
In this study, it is intended to contribute to the early diagnosis of breast cancer. An analysis on breast
cancer diagnoses for the patients is given. For the purpose, first of all, data about the patients whose
cancers’ have already been diagnosed is gathered and they are arranged, and then whether the other
patients are in trouble with breast cancer is tried to be predicted under cover of those data. Predictions of
the other patients are realized through seven different algorithms and the accuracies of those have been
given. The data about the patients have been taken from UCI Machine Learning Repository thanks to Dr.
William H. Wolberg from the University of Wisconsin Hospitals, Madison. During the prediction process,
RapidMiner 5.0 data mining tool is used to apply data mining with the desired algorithms.
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
Thermal Imaging: Opportunities and Challenges for Breast Cancer DetectionTarek Gaber
Thermal Imaging: Opportunities and Challenges for Breast Cancer Detection
Abstract:
Breast cancer is the most common cancer among women in the world. It is estimated that one in eight women, all over the wide, would develop breast cancer during her life. Breast cancer is considered one of the first-leading causes of cancer deaths among women. The early detection of breast cancer could save many women's life. Mammogram is one of the most imaging technology used for diagnosing breast cancer. Although mammogram has recorded a high detection and classification accuracy, it is difficult in imaging dense breast tissues, its performance is poor in younger women, it is harmful, and it couldn’t detect breast tumor that less than 2 mm. To overcome these limitations, it was found that there is a relation between the temperature and the presence of breast cancer. Utilizing this fact, infrared thermography could be a good source of breast images to study and detect cancer at the early stages which is crucial for cancer patients for increasing the rate of breast cancer survival.
This talk aims to give an overview of the thermal imaging technology, its possible applications in the medical field, focusing on its opportunities and challenges for the early detection of breast cancer and highlighting the state-of-the-art of this point.
On Predicting and Analyzing Breast Cancer using Data Mining ApproachMasud Rana Basunia
Breast Cancer is one of the crucial and burning diseases that has invaded women. Predicting breast cancer manually takes a lot of time and it is difficult for the physician to classification. So, detecting cancer through various automatic diagnostic techniques is very necessary. Data mining is the process of running powerful classification techniques that extract useful information from data. The uses and potentials of these techniques have found its scope in medical data. Classification techniques tend to simplify the prediction segment.
Old wine in new wineskins: Revisiting counselling in traditional Ndebele and ...IOSR Journals
The institution of counselling is present in all human communities as people share their sorrows,
mentor, empower and advise each other. The service of advising and grooming is all that counselling is. This
paper seeks to explore the institution of counselling in Shona and Ndebele traditional societies before the advent
of western formalised counselling institutions. The research sets to prove that counselling is not a new
phenomenon in these societies, that is a remnant of colonialism but rather it is an old institution that has been
window dressed with western strategies and formalisms. African traditional counselling strategies as seen in the
Shona and Ndebele examples emphasise more on the preventive forms of counselling than crisis counselling.
Advice and mentoring are prioritised in these societies as a way of helping people stay out of trouble that in
future will require therapeutic or crisis counselling. Modern day counselling has been professionalised and
commercialised and requires people to pay for it yet in Shona and Ndebele traditional societies it was part of
one’s responsibility to make sure others are well advised and counselled if they are emotionally troubled.
Professional counselling in marriage, carrier guidance, teenage grooming for example is not a new practice but
an old practice done differently like old wine in new wineskins.
Breast Cancer Diagnostics with Bayesian NetworksBayesia USA
The Wisconsin Breast Cancer Database (WBCD) is a widely studied (and publicly available) data set from the field of breast cancer diagnostics. The creators of this database, Wolberg, Street, Heisey and Managasarian, made an important contribution with their research towards automating diagnostics with image processing and machine learning.
Beyond the medical field, many statisticians and computer scientists have proposed a wide range of classification models based on WBCD. Such new methods have continuously raised the benchmark in terms of diagnostic performance.
Our white paper now reevaluates the Wisconsin Breast Cancer Database within the framework of Bayesian networks, which, to our knowledge, has not been done before. We demonstrate how the BayesiaLab software can extremely quickly — and simply — create a Bayesian network model that is on par performance-wise with virtually all existing models that have been developed from WBCD over the last 15 years.
BREAST CANCER DIAGNOSIS USING MACHINE LEARNING ALGORITHMS –A SURVEYijdpsjournal
Breast cancer has become a common factor now-a-days. Despite the fact, not all general hospitals
have the facilities to diagnose breast cancer through mammograms. Waiting for diagnosing a breast
cancer for a long time may increase the possibility of the cancer spreading. Therefore a computerized
breast cancer diagnosis has been developed to reduce the time taken to diagnose the breast cancer and
reduce the death rate. This paper summarizes the survey on breast cancer diagnosis using various machine
learning algorithms and methods, which are used to improve the accuracy of predicting cancer. This survey
can also help us to know about number of papers that are implemented to diagnose the breast cancer.
A Comparative Study on the Methods Used for the Detection of Breast Cancerrahulmonikasharma
Among women in the world, the death caused by the Breast cancer has become the leading role. At an initial stage, the tumor in the breast is hard to detect. Manual attempt have proven to be time consuming and inefficient in many cases. Hence there is a need for efficient methods that diagnoses the cancerous cell without human involvement with high accuracy. Mammography is a special case of CT scan which adopts X-ray method with high resolution film. so that it can detect well the tumors in the breast. This paper describes the comparative study of the various data mining methods on the detection of the breast cancer by using image processing techniques.
Breast cancer diagnosis and recurrence prediction using machine learning tech...eSAT Journals
Abstract Breast Cancer has become the common cause of death among women. Due to long hours invested in manual diagnosis and lesser diagnostic system available emphasize the development of automated diagnosis for early diagnosis of the disease. Our aim is to classify whether the breast cancer is benign or malignant and predict the recurrence and non-recurrence of malignant cases after a certain period. To achieve this we have used machine learning techniques such as Support Vector Machine, Logistic Regression, KNN and Naive Bayes. These techniques are coded in MATLAB using UCI machine learning depository. We have compared the accuracies of different techniques and observed the results. We found SVM most suited for predictive analysis and KNN performed best for our overall methodology. Keywords: Breast Cancer, SVM, KNN, Naive Bayes, Logistic Regression, Classification.
Predictive Analysis of Breast Cancer Detection using Classification AlgorithmSushanti Acharya
Dissertation project titled “Predictive analysis of Breast Cancer detection using Classification”. For the research conducted, Breast Cancer Wisconsin Diagnostics dataset was used for analysis. Using R language machine learning model was designed based on various algorithms and the derived results were then visualized to present the most accurate model of them all (SVM in this case).
Breast cancer diagnosis via data mining performance analysis of seven differe...cseij
According to World Health Organization (WHO), breast cancer is the top cancer in women both in the
developed and the developing world. Increased life expectancy, urbanization and adoption of western
lifestyles trigger the occurrence of breast cancer in the developing world. Most cancer events are
diagnosed in the late phases of the illness and so, early detection in order to improve breast cancer
outcome and survival is very crucial.
In this study, it is intended to contribute to the early diagnosis of breast cancer. An analysis on breast
cancer diagnoses for the patients is given. For the purpose, first of all, data about the patients whose
cancers’ have already been diagnosed is gathered and they are arranged, and then whether the other
patients are in trouble with breast cancer is tried to be predicted under cover of those data. Predictions of
the other patients are realized through seven different algorithms and the accuracies of those have been
given. The data about the patients have been taken from UCI Machine Learning Repository thanks to Dr.
William H. Wolberg from the University of Wisconsin Hospitals, Madison. During the prediction process,
RapidMiner 5.0 data mining tool is used to apply data mining with the desired algorithms.
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
Thermal Imaging: Opportunities and Challenges for Breast Cancer DetectionTarek Gaber
Thermal Imaging: Opportunities and Challenges for Breast Cancer Detection
Abstract:
Breast cancer is the most common cancer among women in the world. It is estimated that one in eight women, all over the wide, would develop breast cancer during her life. Breast cancer is considered one of the first-leading causes of cancer deaths among women. The early detection of breast cancer could save many women's life. Mammogram is one of the most imaging technology used for diagnosing breast cancer. Although mammogram has recorded a high detection and classification accuracy, it is difficult in imaging dense breast tissues, its performance is poor in younger women, it is harmful, and it couldn’t detect breast tumor that less than 2 mm. To overcome these limitations, it was found that there is a relation between the temperature and the presence of breast cancer. Utilizing this fact, infrared thermography could be a good source of breast images to study and detect cancer at the early stages which is crucial for cancer patients for increasing the rate of breast cancer survival.
This talk aims to give an overview of the thermal imaging technology, its possible applications in the medical field, focusing on its opportunities and challenges for the early detection of breast cancer and highlighting the state-of-the-art of this point.
On Predicting and Analyzing Breast Cancer using Data Mining ApproachMasud Rana Basunia
Breast Cancer is one of the crucial and burning diseases that has invaded women. Predicting breast cancer manually takes a lot of time and it is difficult for the physician to classification. So, detecting cancer through various automatic diagnostic techniques is very necessary. Data mining is the process of running powerful classification techniques that extract useful information from data. The uses and potentials of these techniques have found its scope in medical data. Classification techniques tend to simplify the prediction segment.
Old wine in new wineskins: Revisiting counselling in traditional Ndebele and ...IOSR Journals
The institution of counselling is present in all human communities as people share their sorrows,
mentor, empower and advise each other. The service of advising and grooming is all that counselling is. This
paper seeks to explore the institution of counselling in Shona and Ndebele traditional societies before the advent
of western formalised counselling institutions. The research sets to prove that counselling is not a new
phenomenon in these societies, that is a remnant of colonialism but rather it is an old institution that has been
window dressed with western strategies and formalisms. African traditional counselling strategies as seen in the
Shona and Ndebele examples emphasise more on the preventive forms of counselling than crisis counselling.
Advice and mentoring are prioritised in these societies as a way of helping people stay out of trouble that in
future will require therapeutic or crisis counselling. Modern day counselling has been professionalised and
commercialised and requires people to pay for it yet in Shona and Ndebele traditional societies it was part of
one’s responsibility to make sure others are well advised and counselled if they are emotionally troubled.
Professional counselling in marriage, carrier guidance, teenage grooming for example is not a new practice but
an old practice done differently like old wine in new wineskins.
Numerical solution of heat equation through double interpolationIOSR Journals
In this article an attempt is made to find the solution of one-dimensional Heat equation with initial and boundary conditions using the techniques of numerical methods, and the finite differences. Applying Bender-Schmidt recurrence relation formula we found u(x ,t) values at lattice points. Further using the double interpolation we found the solution of Heat equation as double interpolating polynomial
Structure and transport coefficients of liquid Argon and neon using molecular...IOSR Journals
Molecular dynamics simulation was employed to deduce the dynamics property distribution function of Argon
and Neon liquid. With the use of a Lennnard-Jones pair potential model, an inter-atomic interaction function was observed
between pair of particles in a system of many particles, which indicates that the pair distribution function determines the
structures of liquid Argon. This distribution effect regarding the liquid structure of Lennard-Jones potential was strongly
affected such that its viscosity depends on density distribution of the model. The radial distribution function, g(r) agrees well
with the experimental data used. Our results regarding Argon and Neon show that their signatures are quite different at
each temperature, such that their corresponding viscosity is not consistent. Two sharps turning points are more
prominent in Argon, one at temperature of 83.88 Kelvin (K) with viscosity of -0.548 Pascal second (Pa-s) and the
other at temperature of 215.64 K with viscosity of -0.228 Pa-s.
In Argon and Neon liquid, temperature and density are inversely and directly proportional to diffusion
coefficient, in that order. This characteristic suggests that the observed non linearity could result from the non
uniform thermal expansion in liquid Argon and Neon, which are between the temperature range of 21.98 K and
239.52 K.
Micellar Effect On Dephosphorylation Of Bis-4-Chloro-3,5-Dimethylphenylphosph...IOSR Journals
The rate enhancement depends on the hydrophobicity of the nucleophile. The micellar catalyzed reaction between bis-4-chloro-3,5-dimethylphenylphosphate ester and hydroxide or hydroperoxide anions has been examined in buffered medium (pH 8-10). First order rate constant (Kψ) for the reaction of hydroxide ion with bis-4-CDMPP go through maxima with the increasing concentration of cetyltrimethylammoniumbromide (CTABr). Micelles of CTABr very effective catalyst to the reactions of phosphate diesters. Rate constants measured with OH2- ions are approximately twice and thrice than that of OH- ions in presence of CTABr.
Performance Evaluation of IEEE STD 802.16d TransceiverIOSR Journals
WiMAX ("Worldwide Interoperability for Microwave Access") technology is developed to meet the
growing demand of increased data rate and accessing the internet at high speeds. 802.16 family of standards is
officially called Wireless MAN in IEEE. Orthogonal frequency division multiplexing (OFDM) is multicarrier
modulation technique used in IEEE 802.16d (fixed WiMAX) communication standard. OFDM is used to
increase data rate of wireless medium with higher spectral efficiency. The proposed work is to evaluate
performance of IEEE Std 802.16d transceiver in MATLAB R2009b simulink environment. System performance
evaluated using BER vs SNR for different modulation technique such as 4 QAM, 16 QAM, 64 QAM under
different channel condition
Chemical Investigations of Some Commercial Samples of Calcium Based Ayurvedic...IOSR Journals
Kapardika bhasma is an important Ayurvedic drug of marine origin. Even though it is composed of mainly of calcium carbonate it exhibits excellent medicinal properties which are not associated with standard calcium carbonate. In the present study four commercial samples are characterized using techniques like EDX, SEM, IR, UV,XRD and TG analysis to throw light on their chemical composition and chemical properties .Such comparative study may help to standardise and to interpret the biological and medicinal properties of such traditional drug.
Measurement of Efficiency Level in Nigerian Seaport after Reform Policy Imple...IOSR Journals
This paper focuses on the impact of reforms on port performance using Onne and Rivers ports as a reference point. It analyses the pre and post reform eras of the ports in terms of their performance. The reforms took effect from 1996 after the Federal Government of Nigeria concessioned the ports to private investors. Parameters such as Ship traffic, Cargo throughput, Ship turn round time, Berth Occupancy and personnel were used as variables for the assessment. Secondary Data were collected from the Nigerian Ports Authority and Integrated Logistic Services Nigeria (Intels) for the period 2001 to 2010 and analyzed using Data Envelopment Analysis to assess the efficiency of the port. Analysis revealed a continuous improvement in the overall efficiency of both Ports Since 2006 when the new measure was introduced. Average Ship turn-around time improved in the ports due to modern and fast cargo handling equipment and more cargo handling space which were provided. There is an increase in Ship traffic calling at the ports, resulting in increased cargo throughput and berth occupancy rate at ports of Onne and Rivers. The reform also led to more private investment in the ports’ existing and new facilities and the introduction of a World Class service in port operation. This study concludes that the Ports of Onne and Rivers are performing better under the reform programme of the Federal Government of Nigeria. It finally recommends the urgent need for a regulator to appraise the performance of the reform programme from time to time as provided by the agreement and for the full adoption and utilization of management information system (MIS) to aid performance efficiency.
Kinetic study of free and immobilized protease from Aspergillus sp.IOSR Journals
In the present investigation partially purified alkaline protease from Aspergillus sp. As#6 and As#7 strains were entrapped in calcium alginate beads and characterized using casein as a substrate. Temperature and pH maxima of protease from As#6 strain showed no changes before and after immobilization and remained stable at 450C and pH 9, respectively. However km value was slightly shifted from 4.5mg/ml to 5 mg/ml. Proteases from As#7 strain showed shifting in pH optima to a more alkaline range (10.0) as compared with free enzyme (9.0). Optimum temperature for protease from As#7 strain showed changes after immobilization and shifted from 650C to 850C. However there was no significant effect on Km value but Vmax of immobilized protease from As#7 strain was also shifted from 200U/ml to 370U/ml. Immobilized protease from As#6 strain was reused for 3 cycles with 22% loss in its activity whereas immobilize protease from As#7 strain was reused for 3 cycles with 17% loss in its activity. Protease from As#7 strain has a higher affinity for the substrate and higher proteolysis activity than protease from As#6 strain. The present work concludes that Aspergillus As#7 strain may be a good source of industrial protease
Breast cancer is the leading cause of death for women worldwide. Cancer can be discovered early, lowering the rate of death. Machine learning techniques are a hot field of research, and they have been shown to be helpful in cancer prediction and early detection. The primary purpose of this research is to identify which machine learning algorithms are the most successful in predicting and diagnosing breast cancer, according to five criteria: specificity, sensitivity, precision, accuracy, and F1 score. The project is finished in the Anaconda environment, which uses Python's NumPy and SciPy numerical and scientific libraries as well as matplotlib and Pandas. In this study, the Wisconsin diagnostic breast cancer dataset was used to evaluate eleven machine learning classifiers: decision tree, quadratic discriminant analysis, AdaBoost, Bagging meta estimator, Extra randomized trees, Gaussian process classifier, Ridge, Gaussian nave Bayes, k-Nearest neighbors, multilayer perceptron, and support vector classifier. During performance analysis, extremely randomized trees outperformed all other classifiers with an F1-score of 96.77% after data collection and data analysis.
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASEIJCSEIT Journal
Breast cancer is one of the leading cancers for women in developed countries including India. It is the
second most common cause of cancer death in women. The high incidence of breast cancer in women has
increased significantly in the last years. In this paper we have discussed various data mining approaches
that have been utilized for breast cancer diagnosis and prognosis. Breast Cancer Diagnosis is
distinguishing of benign from malignant breast lumps and Breast Cancer Prognosis predicts when Breast
Cancer is to recur in patients that have had their cancers excised. This study paper summarizes various
review and technical articles on breast cancer diagnosis and prognosis also we focus on current research
being carried out using the data mining techniques to enhance the breast cancer diagnosis and prognosis.
Classification AlgorithmBased Analysis of Breast Cancer DataIIRindia
The classification algorithms are very frequently used algorithms for analyzing various kinds of data available in different repositories which have real world applications. The main objective of this research work is to find the performance of classification algorithms in analyzing Breast Cancer data via analyzing the mammogram images based its characteristics.Different attribute values of cancer affected mammogram images are considered for analysis in this work. The Patients food habits, age of the patients, their life styles, occupation, their problem about the diseases and other information are taken into account for classification. Finally, performance of classification algorithms J48, CART and ADTree are given with its accuracy. The accuracy of taken algorithms is measured by various measures like specificity, sensitivity and kappa statistics (Errors).
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...IJTET Journal
inAbstract— Pattern Recognition (PR) plays an important role in field of Bioinformatics. PR is concerned with processing raw measurement data by a computer to arrive at a prediction that can be used to formulate a decision to be taken. The important problem in which pattern recognition are applied have common that they are too complex to model explicitly. Diverse methods of this PR are used to analyze, segment and manage the high dimensional microarray gene data for classification. PR is concerned with the development of systems that learn to solve a given problem using a set of instances, each instances represented by a number of features. The microarray expression technologies are possible to monitor the expression levels of thousands of genes simultaneously. The microarrays generated large amount of data has stimulate the development of various computational methods to different biological processes by gene expression profiling. Microarray Gene Expression Profiling (MGEP) is important in Bioinformatics, it yield various high dimensional data used in various clinical applications like cancer diagnostics and drug designing. In this work a new schema has developed for classification of unknown malignant tumors into known class. According to this work an new classification scheme includes the transformation of very high dimensional microarray data into mahalanobis space before classification. The eligibility of the proposed classification scheme has proved to 10 commonly available cancer gene datasets, this contains both the binary and multiclass data sets. To improve the performance of the classification gene selection method is applied to the datasets as a preprocessing and data extraction step.
Breast Tumor Detection Using Efficient Machine Learning and Deep Learning Tec...mlaij
Breast cancer tissues grow when cells in the breast expand and divide uncontrollably, resulting in a lump of tissue commonly called and named tumor. Breast cancer is the second most prevalent cancer among women, following skin cancer. While it is more commonly diagnosed in women aged 50 and above, it can affect individuals of any age. Although it is rare, men can also develop breast cancer, accounting for less than 1% of all cases, with approximately 2,600 cases reported annually in the United States. Early detection of breast tumors is crucial in reducing the risk of developing breast cancer. A publicly available dataset containing features of breast tumors was utilized to identify breast tumors using machine learning and deep learning techniques. Various prediction models were constructed, including logistic regression (LR), decision tree (DT), random forest (RF), support vector machine (SVM), Gradient Boosting (GB), Extreme Gradient Boosting (XGB), Light GBM, and a recurrent neural network (RNN) model. These models were trained to classify and predict breast tumor cases based on the provided features.
BREAST TUMOR DETECTION USING EFFICIENT MACHINE LEARNING AND DEEP LEARNING TEC...mlaij
Breast cancer tissues grow when cells in the breast expand and divide uncontrollably, resulting in a lump
of tissue commonly called and named tumor. Breast cancer is the second most prevalent cancer among
women, following skin cancer. While it is more commonly diagnosed in women aged 50 and above, it can
affect individuals of any age. Although it is rare, men can also develop breast cancer, accounting for less
than 1% of all cases, with approximately 2,600 cases reported annually in the United States. Early
detection of breast tumors is crucial in reducing the risk of developing breast cancer. A publicly available
dataset containing features of breast tumors was utilized to identify breast tumors using machine learning
and deep learning techniques. Various prediction models were constructed, including logistic regression
(LR), decision tree (DT), random forest (RF), support vector machine (SVM), Gradient Boosting (GB),
Extreme Gradient Boosting (XGB), Light GBM, and a recurrent neural network (RNN) model. These
models were trained to classify and predict breast tumor cases based on the provided features.
Breast Tumor Detection Using Efficient Machine Learning and Deep Learning Tec...mlaij
Machine Learning and Applications: An International Journal (MLAIJ) is a quarterly open access peer-reviewed journal that publishes articles which contribute new results in all areas of the machine learning. The journal is devoted to the publication of high quality papers on theoretical and practical aspects of machine learning and applications.The goal of this journal is to bring together researchers and practitioners from academia and industry to focus on machine learning advancements, and establishing new collaborations in these areas. Original research papers, state-of-the-art reviews are invited for publication in all areas of machine learning.
Authors are solicited to contribute to the journal by submitting articles that illustrate research results, projects, surveying works and industrial experiences that describe significant advances in the areas of machine learning.
Image processing and machine learning techniques used in computer-aided dete...IJECEIAES
This paper aims to review the previously developed Computer-aided detection (CAD) systems for mammogram screening because increasing death rate in women due to breast cancer is a global medical issue and it can be controlled only by early detection with regular screening. Till now mammography is the widely used breast imaging modality. CAD systems have been adopted by the radiologists to increase the accuracy of the breast cancer diagnosis by avoiding human errors and experience related issues. This study reveals that in spite of the higher accuracy obtained by the earlier proposed CAD systems for breast cancer diagnosis, they are not fully automated. Moreover, the false-positive mammogram screening cases are high in number and over-diagnosis of breast cancer exposes a patient towards harmful overtreatment for which a huge amount of money is being wasted. In addition, it is also reported that the mammogram screening result with and without CAD systems does not have noticeable difference, whereas the undetected cancer cases by CAD system are increasing. Thus, future research is required to improve the performance of CAD system for mammogram screening and make it completely automated.
PREDICTION OF BREAST CANCER USING DATA MINING TECHNIQUESIAEME Publication
Women who have improved from breast cancer (BC) constantly panic about setback. The way that they have persevered through the meticulous treatment makes repeat their biggest fear. However, with current spreads in technology, early repeat prediction can enable patients to get treatment prior. The accessibility of broad information and propelled techniques make precise and fast prediction possible. This examination expects to think about the exactness of a couple of existing information mining calculations in predicting BC repeat. It inserts a particle swarm optimization as highlight choice into ANN classifier. An objective of increasing the accuracy level of the prediction model.
The Evolution and Impact of Medical Science Journals in Advancing Healthcaresana473753
Medical science journals have evolved into essential tools for advancing healthcare by disseminating research findings, promoting evidence-based practices, and fostering collaboration. Their historical significance, role in evidence-based medicine, and adaptability to the digital age make them indispensable in the quest for improved healthcare outcomes. As they continue to evolve, medical science journals will play a vital role in shaping the future of medicine and healthcare worldwide.
"journals" refer to academic or professional publications that contain articles and research papers related to various aspects of the medical field. These journals serve as a platform for the dissemination of new medical knowledge, research findings, clinical studies, and expert opinions. They play a crucial role in advancing medical science, sharing best practices, and keeping healthcare professionals, researchers, and students informed about the latest developments in medicine and related disciplines.
Similar to Performance and Evaluation of Data Mining Techniques in Cancer Diagnosis (20)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams from the hydrologist’s survey of the valley before construction, all aspects and involved disciplines, fluid dynamics, structural engineering, generation and mains frequency regulation to the very transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...Amil Baba Dawood bangali
Contact with Dawood Bhai Just call on +92322-6382012 and we'll help you. We'll solve all your problems within 12 to 24 hours and with 101% guarantee and with astrology systematic. If you want to take any personal or professional advice then also you can call us on +92322-6382012 , ONLINE LOVE PROBLEM & Other all types of Daily Life Problem's.Then CALL or WHATSAPP us on +92322-6382012 and Get all these problems solutions here by Amil Baba DAWOOD BANGALI
#vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore#blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #blackmagicforlove #blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #Amilbabainuk #amilbabainspain #amilbabaindubai #Amilbabainnorway #amilbabainkrachi #amilbabainlahore #amilbabaingujranwalan #amilbabainislamabad
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
Student information management system project report ii.pdfKamal Acharya
Our project explains about the student management. This project mainly explains the various actions related to student details. This project shows some ease in adding, editing and deleting the student details. It also provides a less time consuming process for viewing, adding, editing and deleting the marks of the students.
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
Performance and Evaluation of Data Mining Techniques in Cancer Diagnosis
1. IOSR Journal of Computer Engineering (IOSR-JCE)
e-ISSN: 2278-0661, p- ISSN: 2278-8727Volume 15, Issue 5 (Nov. - Dec. 2013), PP 39-44
www.iosrjournals.org
www.iosrjournals.org 39 | Page
Performance and Evaluation of Data Mining Techniques in
Cancer Diagnosis
R.M. Chandrasekar Ph.D, V. Palaniammal M.C.A., M.Phil.
(Professor, Computer Science, Annamalai University, Chidambaram, Tamil Nadu, India.)
(Asst. Professor, Thiruvalluvar University College of Arts & Science, Kallakurichi, Tamil Nadu, India.)
Abstract: We analyze the breast Cancer data available from the WBC, WDBC from UCI machine learning with
the aim of developing accurate prediction models for breast cancer using data mining techniques. Data mining
has, for good reason, recently attracted a lot of attention, it is a new Technology, tackling new problem, with
great potential for valuable commercial and scientific discoveries. The experiments are conducted in WEKA.
Several data mining classification techniques were used on the proposed data. There are many classification
techniques in data mining such as Decision Tree, Rules NNge, Tree random forest, Random Tree, lazy IBK. The
aim of this paper is to investigate the performance of different classification techniques. The data breast cancer
data with a total 286 rows and 10 columns will be used to test and justify the different between the classification
methods and algorithm.
Keywords - Machine learning, data mining Weka, classification, breast cancer
I. Introduction
About 1 in 8 U.S women (just under 12%) will develop invasive breast cancer over the course of her
life. In 2011, an estimated 230,480 new cases of invasive breast cancer were expected to be diagnosed in women
in the U.S, along with 57,650 new cases of non-invasive (in situ) breast cancer. For women in the U.S breast
cancer death rates are higher than those for any other cancer besides lung cancer. In 2011 there were more than
2.6million breast cancer survivor in the U.S.A women’s risk of breast cancer approximately relative (mother,
sister, daughter) who has been diagnosed with breast cancer. About 15% of women who get breast cancer have a
family member diagnosed with it. About 85% of breast cancers occur in women who have no family history of
breast cancer. These occur due to genetic mutations that happen as a result of the aging process and life in
general.
Breast cancer (An overview):
Breast Cancer is the most common cancer diseases among women excluding no melanoma skin.
Cancers are divided into two types benign and Malignant. If the cancer is benign under the conditions of early
diagnosis. Malignancy status includes the three basic measurement 1.Age 2.Longer tumor length 3.ADC or
Apparent Diffusion coefficient (biopsy confirmed).Attributes 1.patient id 2.diagnosis m=malignant ,b=benign
3.Real valued features. Cell uncles’ 1.radius 2.texture 3.perimeter 4.Area 5.Smoothness.I Diagnosis Attributes
are 1.Menopausal status a. Premenopausal, b.post menopausal. II.Basic diagnosis :-1.clinical
examinations,2.mammography (y/n),3.MRI (y/n),4.Ultrasound (y/n),5.Fine Needle aspiration (y/n),6.Core
biopsy (y/n),7.Open biopsy (y/n) ,8.Other (y/n),9.Unknown (y/n).III Clinical Trial Enrollment:-y/n/Not
stated/inadequate. IV. Initial Representation:-Screening-mmaaography,Screening-MRI,Screening-
Others,Symptomatic.V.Total extent of Lesion (DCIS AND INVASIVE) 1.Lesion in mm.VI.Lymphovascular
Invasion(presence of tumor cells in endothelium-lined spaces) 1.present/absent/suspicious/not stated /unknown.
Number stages of Breast cancer:-
There are 4 number stages of breast cancer, staging takes into various factors, including 1.The size of
the tumor (tumor means either breast lymph’s or Area of cancer cells found on a Scan or mammogram)
2.Cancer cells have spread into nearby lymph glands (lymph Node) 3.The tumor cells has spread to any other
part of the body (Metastasized-TNM). Stage1 breast cancer is split into 2 Stages.Stage1A tumor is 2 cm or
smaller and has spread outside the breast.Stage1B: Small areas of breast Cancer cells are found in the lymph
node closed to the breast and either the tumor is 2 cm or smaller. Stage 2 breast cancer: stage 2A:Tumor 2cm or
smaller in the breast and cancer cells are found in 1 to 3 lymph Node in the armpit or in the lymph Node near
the breast bone. Stage 2B: The tumor is larger than 2 cm but not larger than 5 cm and small areas of cancer cells
are in the lymph Node.2 to 5 cm spread 1 to 3 lymph nodes in the armpit or near the breastbones or the tumor is
larger than 5cm and has not spread to the lymph node. Stage 3A breast cancer No tumor is seen in the breast or
the tumor may be any size and cancer is found in 4 to 9 lymph glands under the arm or in the lymph glands near
the breast bone. The tumor is more than 5 cm and has spread into up to 3 lymph nodes near the breast bone.
2. Performance and Evaluation of Data Mining Techniques in Cancer Diagnosis
www.iosrjournals.org 40 | Page
Stage 3B: The tumor has spread to the skin of the breast or to the chest wall and made the kin break down or
cancer swelling. The cancer may have spread to the up 9 lymph Node in the armpit or to the lymph glands near
the breast bone. Stage 4 breast cancer: The tumor can be any size. The lymph Node may or may not contain
cancer cells. The cancer has spread to the other parts of the body such as bone, lungs, liver, and brain. The TNM
(Tumor, Node, and Metastasis) system is specifies for each type of cancer. Once a patient‘s T, N, and M
categories have been determined, this information is combined in a process called stage grouping to determine a
women disease stage.Stage0 (the least advantage stage) to Stage4 (the most advantage stage).
II. Headings
Existing System:
The data sets used are SEER data or Wisconsin data. Data pre-processing is applied before data mining
to improve the quality of the data. Data pre-processing includes data cleaning, data integration, data
transformation and data reduction techniques. The features used for classification purposes coincided with the
Breast Imaging Reporting and Data System (BI-RADS) as this is how radiologists classify breast cancer. The
BI-RADS features of density, mass shape, mass margin and abnormality assessment rank are used as they have
been proven to provide good classification accuracy. A Classification method, Decision tree algorithms are
widely used in medical field to classify the medical data for diagnosis. Feature Selection increases the accuracy
of the Classifier because it eliminates irrelevant attributes. Feature selection with decision tree classification
greatly enhances the quality of the data in medical diagnosis. CART algorithm with various feature selection
methods to find out whether the same feature selection method may lead to best accuracy on various datasets of
same domain. Artificial neural networks (ANNs) and support vector machines have been recently proposed as a
very effective method for pattern recognition, machine learning and data mining. The discrimination capability
of the features extracted from the sonograms was tested by using the SVM (support vector machine), ANN and
KNN (K-nearest neighbor) classifier. It was found that the SVM gave the greatest accuracy while the ANN had
the highest sensitivity. Back propagation neural network (BPNN) and radial basis function network (RBFN)
were used for the training and testing of data.
Algorithm:
1. CART Algorithm
2. Decision tree algorithms
3. ID3
4. C4.5 decision tree algorithms
5. Hunt’s algorithm
6. SVM
7. Naïve Bayes classifier.
8. The back-propagated neural network.
Proposed System:
Early diagnosis needs an accurate and reliable diagnosis procedure that can be used by physicians to
distinguish benign breast tumors from malignant ones without going for surgical biopsy. The objective of these
predictions is to assign patients to one of the two group either a “benign” that is noncancerous or a “malignant”
that is cancerous. The prognosis problem is the long-term care for the disease for patients whose cancer has been
surgically removed. In this paper, we propose a model-based data mining technique with a neural network
classification technique and the improvements possible using an ensemble approach. Ensembles have been
proposed as a mechanism for improving the classification accuracy of existing classifiers providing that
constituents are diverse. Contrasting the cluster analysis technique against a baseline neural network classifier,
and then considers the effects of applying an ensemble technique to improve the accuracies. In different cases
the state of the disease condition itself can be marked by stages where the diagnostic symptoms or signs can be
subtle or different to other stages of the disease. This means that there is often not a clean mapping between the
diagnostic features and the diagnosis. The usage of clustering has also been extended to classifiers and detection
systems in order to improve detection and provide greater classification accuracy. Use of a feature selection
technique with a decision tree classifier to classify breast cancer. Another advantage of the model-based
clustering approach is that no decisions have to be made about the scaling of the observed variables: for
instance, when working with normal distributions with unknown variances, the results will be the same
irrespective of whether the variables are normalized or not. In order to fasten the classification we are using
Artificial Neural Network (ANN).
Algorithm:
1. Decision tree algorithms – For feature selection
3. Performance and Evaluation of Data Mining Techniques in Cancer Diagnosis
www.iosrjournals.org 41 | Page
2. C4.5 Algorithm – For classification
3. ANN (Artificial Neural Network)
Software Requirement:
The WEKA is an ensemble of tools for data classification, regression, clustering, association rules, and
visualization. The toolkit is developed in Java and is an open source software issued under the GNU
General Public License.
WBCD dataset.
III. Figures And Tables
Characteristics of database:
Wisconsin diagnostics Breast Cancer this database is created by William H.Wolberge at University of
Wisconsin. This database contains 286 observations among which 201 are benign cases and 85 are malignant
cases. These instances are described by 10 Attributes:-.class no-recurrence events, recurrence events, 2.age,
3.Menopause, 4.tumor_size, 5.inv_nodes, 6.node_cap, 7.deg_malig, 8.breast, 9.breast_quad, 10.irradiat
Table1:
Attributes Values
Age 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80-89, 90-99
menopause lt40, ge40,
premeno
tumor-size 0-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-34, 35-39, 40- 44,45-49, 50-54,55-59
inv-nodes 0-2, 3-5, 6-8, 9-11, 12-14, 15-17, 18-20, 21-23, 24-26,27-29, 30-32, 33-35,36-39
node-caps yes, no
deg-malig 1, 2, 3
Breast left, right
breast-quad left-up, left-low, right-up, right-low, central
irradiat yes, no
Class no-recurrence-events, recurrence-events
Classification prediction process in data mining:
Data mining is collection of techniques for efficient automated discovery of previously unknown valid
novel, useful and understandable pattern in large databases. Classification is the process of minding a model that
describes and distinguishes data classes concepts. This derived model is based on the analysis of a set of training
data. Classification predicts categorial labels predictable numerical data values rather than class lables.
We used several classification methods resulting in identification of top 5 classification alogorithm.
This paper presents a brief description of the classifiers and meta_classifiers used in the experiments results.
Fig1: Classification prediction process in data mining
Rules NNge (Non_Nested Generalised Examplars):
NNge Classifier new examples by determining the nearest neighbour in the exemplar database using a
Euclidean distance function.
4. Performance and Evaluation of Data Mining Techniques in Cancer Diagnosis
www.iosrjournals.org 42 | Page
The function is 𝐷 𝐸𝐻 = 𝑊
𝐻 𝑊
𝐼
𝐸 𝐼−𝐻 𝐼
𝑀𝐴𝑋 𝐼−𝑀𝐼𝑁 𝐼
𝑀
𝐼=1
2
Lazy K Star classifier
Lasy kstar algorithm most successful classifier. Implementation of an instance based classifier, which
uses the entropic distance measures. Test instance is based upon the class of those training instance.
Lazy IBK (K_Nearest Neighbours classifier):
IBK classifier is simple instance based learner that uses the class of nearest K training instances for the
class of the test instance. Predicts the class of the single nearest training instance for each test instance. Get
maximum no of instances allowed in the training pool.
Meta Decorate (Diverse Ensemble creation by oppositional relabeling of artificial Training Examples):
Meta learners such as Boosting Bagging and Random Forest provide diversity by sub sampling or
reweighting the existing training data. Decorate performs by adding randomly constructed to the training
set,when building new members.
𝑑𝑗 𝑥 =
0 𝑖𝑓𝑐𝑗 𝑥
1 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
= 𝑐∗
𝑥
𝐷 =
1
𝑁𝑀
𝑑𝑗 𝑥𝑖
𝑛
𝑖=1
𝑚
𝑗=1
𝑐∗
𝑥 =arg max 𝑝 𝑘 𝑥 ,kє{1…….Q)
𝑝 𝑘 𝑥 = 𝑝𝑐3,𝑘 𝑥𝑐 𝑗𝜖 𝑐∗ / 𝑐∗
Tree Random Forest:
Random Forest classifier consists of multiple decision trees. The final class of an instance in a Random
forest is assigned by outputting the class that is the mode of the outputs of individual trees. Which can produce
robust and accurate classification. Random classifiers is based on a combination of many decision Tree
predictors such that each tree depends on the values of a random vector sampled independently and with the
same distribution for all trees in the forest. RF has excellent accuracy among current classifier Algorithm.
Fig2:Block diagram of the survival detection system
Table2:
Simulation result of each Algorithm
Algorithm
Total instance 286
Correctly classify
Instance(%value)
Incorrectly classify
Instance(%value)
Time taken
(seconds)
Kuppa status
Rule NNge 249
97.2028
8
2.7972
0.16 0.933
Random Forest 280
97.9021
6
2.0979
0.11 0.9501
Meta Decorate 259
90.5594
27
9.4406
1.81 0.7584
Lazy IBK 280
97.9021
6
2.0979
0 0.9491
Lazy Kstar 280
97.9021
6
2.0979
0 0.9491
Random Tree 280
97.9021
6
2.0979
0.02 0.9491
5. Performance and Evaluation of Data Mining Techniques in Cancer Diagnosis
www.iosrjournals.org 43 | Page
Classification
Knm
SVM
Neural N/W
Decision Tree Classifier
Output Begin or Malignant
Pixel analysis
Intensity Analysis
Texture Analysis
Feature Extraction
Fuzzy C-means
K-means clustering
Free hand clustering
Active contour
clustering
Segmentation
1. Image filter
Linear filter
Median Filter
Adaptive Filter
2. Images resize
3. Contrast enhancement
Run time
Cache enhancement
algorithm
Input Image
Pre-processing
Read single image
Runtime read image
Read multiple images
Table3:
Training and Simulation errors:
Algorithm
Total instance
286
Mean Absolute
Error %
Root Mean
Squared Error
%
Relative
absolute
Error%
Root Relative
Squared
Error%
Rules NNge 0.028 0.1672 6.6868 36.5947
Random Forest 0.0978 01614 23.3712 35.3247
Meta Decorate 0.3655 0.3797 87.3643 83.083
Lazy-IBK 0.0253 0.1053 6.0487 23.0348
Lazy-Kstar 0.0747 0.1399 17.8681 30.6094
Random Tree 0.0221 0.1052 5.2937 23.0237
Architecture Diagram
6. Performance and Evaluation of Data Mining Techniques in Cancer Diagnosis
www.iosrjournals.org 44 | Page
IV. Conclusion
This paper presents effective classification Techniques. After investigation of different classification
Algorithm we have chosen 6 classifier based on our simulation performance and we have used Tree Random
classifier achieved overall classification accuracy 98%, which is significanant. In future work we propose to
analyze Ensemble classifier for 100% accuracy.
Acknowledgements
The authors like to express our appreciations to prof. R.M. Chandrasekar for his help in correcting
earlier versions of this paper. We also would like to thank the anonymous reviewers for their valuable comments
and in sightful suggestions.
References
[1] Mitchell, T.M. 1997. Machine Learning. McGraw-Hill Science
[2] John, G., Cleary, E. & Leonard, E. 1995. K*: An Instance-based Learner Using an Entropic Distance Measure. In: 12th
International Conference on Machine Learning, 108-114.
[3] Aha, D. & Kibler, D. 1991. Instance-based learning algorithms. Machine Learning.
[4] American Cancer Societies. Breast Cancer Facts& Figures 2005-2006. Atlanta: American Cancer Society, Inc.
(http://www.cancer.org/).
[5] National Resource. Cancer Epidemiology Biomarkers & Prevention 1999; 8:1117-1121.
[6] Houston, Andrea L. and Chen, et. al.. Medical Data Mining on the Internet: Research on a Cancer Information System. Artificial
Intelligence Review 1999; 13:437-466.
[7] Lundin M, Lundin J, Burke HB, Toikkanen S Pylkkanen L, Joensuu H. Artificial neural Networks applied to survival prediction in
breast cancer. Oncology 1999; 57:281-6.
[8] Delen D, Walker G, Kadam A. Predicting breast cancer survivability: a comparison of three data Mining methods. Artificial
Intelligence in Medicine. 2005 Jun; 34(2):113-27.
[9] Ian H. Witten and Eibe Frank. Data Mining: Practical machine learning tools and techniques, 2nd Edition. San Francisco: Morgan
Kaufmann; 2005.
[10] Weka: Data Mining Software in Java,http://www.cs.waikato.ac.nz/ml/weka
[11] lando Anunciac¸ ˜ao and Bruno C. Gomes and Susana Vinga and Jorge Gaspar and Arlindo L. Oliveira and Jos´e Rueff , A Data
Mining Approach for the detection of High-Risk Breast Cancer Groups
[12] aria-Luiza Antonie, Osmar R. Za¨ıane, Alexandru Coma, .Application of Data Mining Techniques for Medical Image
Classification.Proceeding of second International worshop on Mutimedia data mining(MDM/KDD’2001),in conjuction with ACM
SIGKDD conference.SAN FRANCISCO,USA,AUG 26,2001.