1. BREAST CANCER PREDICTION USING MACHINE LEARNING TECHNIQUES
SUBMITTED BY:
ADARSH THAKUR (22MCA0200)
SHUBHAM JHA (22MCA0398)
RAVI KUMAR PATEL (22MCA0338)
GUIDE: Dr. Senthilkumar T
2. PROBLEM DESCRIPTION
• Breast cancer in women has become very common. It is the second most prevalent
cancer diagnosed in women worldwide.
• Early signs of breast cancer are difficult to identify.
• This project addresses the problem by using historical data on different
attributes of body cells to develop a model that predicts breast cancer with
reasonable accuracy.
• The dataset comes from the popular dataset platform Kaggle. Different
parameters are considered for predicting cancer, using algorithms such as KNN
and Logistic Regression.
3. WHAT IS BREAST CANCER?
• Breast cancer is a type of cancer that develops in the breast cells.
• Breast cancer is the second most commonly diagnosed cancer in women in the
United States.
• Both men and women can develop breast cancer, but women are much
more likely to do so.
• Breast cancer can arise in different areas of the breast.
4. MOTIVATION
• As of 2019, about 1 in 8 U.S. women (approx. 12%) will develop invasive
breast cancer at some point in their lives.
• The 5-year survival rate for breast cancer is 100% with early detection and 15%
with late detection (Cancer Research UK).
• Machine learning (ML) techniques have played a key role in healthcare in recent
years.
• For breast cancer, ML techniques can distinguish between malignant and benign
tumors, enabling early detection.
• Most ML-based applications focus on large datasets, citing ML’s ability to handle
big data.
• However, most users only have access to small, publicly available datasets.
• It is therefore interesting to analyze whether traditional, basic ML algorithms
can achieve high classification accuracy on small datasets.
6. PROPOSED WORK
• The proposed project addresses the problem by using historical data on
different attributes of body cells to develop a model that predicts
breast cancer.
• The goal is to predict breast cancer with reasonable accuracy.
• We compare several machine learning algorithms and use the accuracy
of each to determine the best algorithm for the training dataset.
7. RESEARCH DESIGN
• Machine learning algorithms implemented: Logistic Regression and K-Nearest
Neighbors (KNN).
• These models are incorporated in the breast cancer prediction platform.
• Traditional benchmark machine learning technique: pretrained models.
• We compare the benchmark model performance against the performance of our
machine learning algorithms.
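The comparison of the two implemented algorithms can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn's bundled `load_breast_cancer` data as a stand-in for the Kaggle CSV, and the 5-fold cross-validation, `n_neighbors=5`, and feature scaling are illustrative choices that the slides do not specify. The pretrained benchmark is not shown because the slides do not name it.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled Wisconsin data stands in for the Kaggle CSV.
X, y = load_breast_cancer(return_X_y=True)

# Scaling sits inside each pipeline so it is re-fit on every training fold.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Mean accuracy over 5 stratified folds (an assumed evaluation protocol).
mean_accuracy = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    mean_accuracy[name] = scores.mean()
    print(f"{name}: mean accuracy = {mean_accuracy[name]:.4f}")
```

Cross-validation is used here because a single split of a small dataset can give a noisy accuracy estimate; the figures it prints need not match those reported in the slides.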
9. METHODOLOGY AND MODULE DESCRIPTION
Workflow stages, in order:
• Data collection
• Data pre-processing
• Applying machine learning algorithms
• Predicting results and visualization
• Choosing the algorithm with the best accuracy to train the model
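The stages above could be wired together roughly as shown below. This is a sketch under stated assumptions: scikit-learn's bundled dataset replaces the Kaggle download, the 80/20 stratified split and `random_state=0` are arbitrary, and a printed confusion matrix stands in for the visualization step. The `candidates` and `results` names are hypothetical.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Data collection (assumed stand-in for the Kaggle CSV).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Data pre-processing (scaling) bundled with each candidate algorithm.
candidates = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Apply each algorithm, predict on the held-out split, and summarize results.
results = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    matrix = confusion_matrix(y_test, predictions)
    results[name] = {
        "accuracy": accuracy_score(y_test, predictions),
        "errors": int((predictions != y_test).sum()),
        "confusion_matrix": matrix,
    }
    print(f"{name}: accuracy = {results[name]['accuracy']:.4f}, "
          f"incorrect predictions = {results[name]['errors']}")
    print(matrix)

# Choose the algorithm with the best held-out accuracy.
best_name = max(results, key=lambda n: results[n]["accuracy"])
print("Best algorithm:", best_name)
```

Selecting the winner on the same held-out split that reports its accuracy gives a slightly optimistic estimate; a separate validation split would be stricter.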
10. DATA
• The dataset is publicly available and was created by Dr. William H. Wolberg,
physician at the University of Wisconsin Hospital at Madison,
Wisconsin, USA (Wolberg and Mangasarian 1990).
• Breast-cancer-Wisconsin has 569 instances (benign: 357, malignant:
212).
• 2 classes (62.74% benign and 37.26% malignant).
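The class balance quoted above can be checked with a few lines of Python. The sketch assumes that scikit-learn's bundled `load_breast_cancer` dataset is the same 569-instance Wisconsin data as the Kaggle file used in the project.

```python
from collections import Counter

from sklearn.datasets import load_breast_cancer

# Assumption: the bundled dataset matches the Kaggle copy used in the slides.
data = load_breast_cancer()

# target_names is ['malignant', 'benign']; map each integer label to its name.
labels = [str(data.target_names[t]) for t in data.target]
counts = Counter(labels)
total = len(labels)

for name, n in sorted(counts.items()):
    print(f"{name}: {n} ({100 * n / total:.2f}%)")
```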
14. CONCLUSION
• This study identifies KNN as the most successful model for breast cancer
classification (93.70% accuracy and the fewest incorrect predictions).
• Concave_points_mean and perimeter_mean are the most important features for
classifying breast cancer outcomes.
• Our results show that ML algorithms can classify breast cancer outcomes with high
accuracy and identify key characteristics even on small datasets.
• Thus, even on a smaller dataset, standard classification models can achieve higher
accuracy than more complicated models.
• We highlight the significant potential of ML techniques as a diagnostic tool
for early detection of breast cancer.
• In future, we plan to apply more machine learning and deep learning techniques
and compare the results.
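The slides do not say how feature importance was measured. One simple way to rank features, shown here purely as an assumed stand-in, is the univariate ANOVA F-score from scikit-learn's `f_classif`. Note that the bundled dataset names its columns differently from the Kaggle CSV (for example `mean concave points` rather than `Concave_points_mean`).

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import f_classif

# Assumption: scikit-learn's bundled data stands in for the Kaggle CSV.
data = load_breast_cancer()

# ANOVA F-score per feature: larger values mean the feature separates
# the malignant and benign classes more strongly on its own.
f_scores, p_values = f_classif(data.data, data.target)

# Rank features from most to least discriminative.
ranking = sorted(
    zip(data.feature_names, f_scores), key=lambda pair: pair[1], reverse=True
)

for name, score in ranking[:5]:
    print(f"{name}: F = {score:.1f}")
```

A univariate score ignores interactions between features, so its ranking may differ from one derived from a fitted model.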