Breast-Cancer-Detection-final.pptx..............

MAJOR PROJECT
ON
BREAST CANCER DETECTION
Guided By:
Bivasa Ranjan Parida
Submitted By:
Mahima Milan Mohapatra - 1801219074
Suroshree Ghosh - 1801219161
Gyana Prakash Sahoo - 1801219056
Biswajit Sahoo - 1801219035
Dept. of Computer Science & Engineering
College of Engineering Bhubaneswar

:Content:
Introduction
System Specification
Methodologies
System Architecture
Project Interface
Task Performed
Confusion Matrix
Project Interface for breast cancer detected
Project Interface for breast cancer not detected
Advantages & Disadvantages
Applications
Future Scope
Conclusion
Reference

Introduction
• Cancer is a disease in which abnormal cells divide uncontrollably and destroy body tissue.
• Tumors are mainly of two types:
  • Malignant (cancerous)
  • Benign (non-cancerous)
• Breast cancer is the second leading cause of cancer deaths among women.
• At the same time, it is among the most curable cancer types if it is diagnosed early.

System Specification
Hardware Requirements:
• System: Pentium IV 2.4 GHz
• Hard Disk: 500 GB
• RAM: 4 GB
• Any desktop/laptop system with the above configuration or higher
Software Requirements:
• Operating System: Windows 7 and above
• Coding Language: Python 2.7 and above
• Scripting tool: Jupyter Notebook
• Libraries: Pandas, NumPy, scikit-learn (sklearn), stats, Matplotlib, Seaborn, statistics

Methodologies
What is a Support Vector Machine (SVM)?
• A supervised pattern classification method.
• A powerful and versatile machine learning model.
• Well suited to small or medium-sized datasets.
• A training algorithm for learning classification and regression rules from data.
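As a minimal sketch of the idea, the snippet below fits a linear SVM classifier to a handful of toy points and classifies two new ones. The data and variable names are hypothetical and chosen only for illustration; scikit-learn's SVC class is assumed to be available.

```python
from sklearn.svm import SVC

# Hypothetical toy data: two well-separated groups of 2-D points.
X = [[0, 0], [1, 1], [0, 1], [4, 4], [5, 5], [4, 5]]
y = [0, 0, 0, 1, 1, 1]

# A linear kernel looks for the separating line with the widest margin.
clf = SVC(kernel="linear")
clf.fit(X, y)

# Classify one new point near each group.
predictions = clf.predict([[0.5, 0.5], [4.5, 4.5]])
n_support = len(clf.support_vectors_)
print(predictions, n_support)
```

Only the training points closest to the boundary (the support vectors) determine the decision rule, which is part of why the method suits small and medium-sized datasets.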

System Architecture
(Flowchart: Breast cancer detection)
1. Start
2. Training data
3. Preprocessed data
4. Cleaned dataset
5. Data visualization
6. Prediction using the SVM algorithm
7. Analysis of the output and performance
8. Stop
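The flow above could be sketched end to end roughly as follows. This is an illustration under stated assumptions, not the project's actual code: it uses scikit-learn's bundled breast cancer dataset as a stand-in for the project data, and the split ratio, scaler, and kernel are illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Training data (assumption): scikit-learn's bundled dataset,
# not the cell_samples.csv file used in the project.
X, y = load_breast_cancer(return_X_y=True)

# Hold out part of the data to analyse performance later.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing (feature scaling) followed by the SVM classifier.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X_train, y_train)

# Prediction and analysis of the output.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print(accuracy)
print(cm)
```

Wrapping the scaler and the classifier in one pipeline keeps the scaling parameters learned from the training split only, so the held-out data does not leak into preprocessing.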

Task Performed
Preparing the Data:
The following packages are loaded:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
Using pandas, we load the dataset and print some basic information.
df = pd.read_csv("cell_samples.csv")
df.head()
df.tail()

• Output:
This displays the first and last rows of the dataset used in our model.

• Now we can count how many diagnoses are malignant and how many are benign.
Output:
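The counting step might look like the sketch below. The slides do not show the dataset's columns, so the small DataFrame, the "Class" column name, and the 2 = benign / 4 = malignant coding are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for cell_samples.csv; the column names and the
# 2 = benign / 4 = malignant coding are assumptions, not from the slides.
df = pd.DataFrame({
    "Clump": [5, 3, 8, 1, 10, 2],
    "Class": [2, 2, 4, 2, 4, 2],
})

# Map the numeric codes to readable labels, then count each diagnosis.
labels = df["Class"].map({2: "benign", 4: "malignant"})
counts = labels.value_counts()
print(counts)
```

Checking the class balance early is useful because a strongly imbalanced dataset makes plain accuracy a misleading measure.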
• Now we can use seaborn to create a heat map of the correlations between the features.
plt.figure(figsize=(14, 11))
sns.heatmap(df.corr(), annot=True, cmap='viridis')
plt.show()

Why Choose SVC?
(Fig: Confusion Matrix)
                   Predicted Negative   Predicted Positive
Actual Negative    TN = 114             FP = 4
Actual Positive    FN = 3               TP = 54

From the confusion matrix we can calculate accuracy, error, precision, and recall.
1. Accuracy = (TP + TN) / Total
   = (54 + 114) / 175
   = 168 / 175
   = 0.96
2. Error = 1 - Accuracy
   = 1 - 0.96
   = 0.04
3. Precision = TP / Predicted positive
   = 54 / 58
   = 0.93
4. Recall = TP / Actual positive
   = 54 / 57
   = 0.95
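The same arithmetic can be checked in a few lines of Python. The TP and TN counts come from the calculation above; FP and FN are not stated directly, so they are derived here from the predicted-positive total (58) and the actual-positive total (57). Variable names are illustrative.

```python
# Counts from the calculation above; FP and FN are derived from it
# (predicted positive 58 = TP + FP, actual positive 57 = TP + FN).
tp, tn = 54, 114
fp = 58 - tp
fn = 57 - tp

total = tp + tn + fp + fn
accuracy = (tp + tn) / total
error = 1 - accuracy
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(f"accuracy={accuracy:.2f} error={error:.2f} "
      f"precision={precision:.2f} recall={recall:.2f}")
# prints: accuracy=0.96 error=0.04 precision=0.93 recall=0.95
```

For a screening task, recall deserves particular attention: the 3 false negatives are malignant cases that the model would miss.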

Project Interface for Breast Cancer Detected

Project Interface for Breast Cancer Not Detected

Advantages
• Effective in high-dimensional spaces.
• Still effective when the number of dimensions is greater than the number of samples.
• Memory efficient.
Disadvantages
• If the number of features is much greater than the number of samples, the kernel function must be chosen carefully to avoid over-fitting.
• SVMs do not directly provide probability estimates; these are calculated using an expensive five-fold cross-validation.
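To illustrate the last point, the sketch below requests probability estimates from scikit-learn's SVC. The synthetic data is an assumption used only to keep the example self-contained; it is not the project's dataset.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic stand-in data (assumption), not the project's dataset.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# probability=True makes scikit-learn fit an extra calibration step using
# internal cross-validation, which slows training.
clf = SVC(kernel="rbf", probability=True, random_state=0)
clf.fit(X, y)

# Calibrated class probabilities versus the raw signed margin scores.
proba = clf.predict_proba(X[:3])
scores = clf.decision_function(X[:3])
print(proba)
print(scores)
```

When only a ranking or a yes/no decision is needed, decision_function gives a score without the extra calibration cost.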

Applications
• Early detection leads to more treatment options and a better chance of survival.
• Breast cancer detected at an early stage has a survival rate of 93 percent or higher in the first five years.
• It is much easier to treat at an early stage than at a late stage.

Future Scope
Finding breast cancer at an early stage can help save the lives of thousands of women, and of men as well. Hence, this project can play an important role in the future:
• Projects like this help real-world patients and doctors gather as much information as they can.
• Using machine learning algorithms, we will be able to classify and predict the cancer as benign or malignant.
• Machine learning algorithms can be used for medical research; they advance the system and reduce human error.

Conclusion
• After applying different classification models, we obtained a different accuracy for each. Decision Tree, K-NN, Support Vector Machine, and Logistic Regression achieved 94.64 percent, 89.22 percent, 96.87 percent, and 94.67 percent accuracy, respectively.
• This research established the model's performance and the significant factors affecting breast cancer patients' survival rates, which may be used in clinical practice, especially in the Asian scenario.

Reference
1. https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/1472-6947-8-56
2. https://airccse.org/journal/ijdps/papers/4313ijdps09.pdf
3. https://link.springer.com/article/10.1007/s10489-007-0073-z
4. https://www.sciencedirect.com/science/article/pii/S1877050916302575
5. https://www.academia.edu/71848246/Prediction_of_Breast_Cancer_Disease_using_Machine_Learning_Algorithms
