EDA on Haberman's Survival Data
Objective:
The objective is to classify whether a patient survives after surgery for breast cancer.
Data Description:
The data was obtained from https://www.kaggle.com/gowtamsingulur/habermancsv.
The data set contains cases from a study conducted between 1958 and 1970 at the University of Chicago's Billings Hospital
on the survival of patients who had undergone surgery for breast cancer.
Importing Libraries
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
Loading data
In [2]:
data=pd.read_csv("C:/Users/KIIT/Applied_AI_Practice_Data/Haberman/haberman.csv")
data.head()
In [3]:
data.columns
'age' - Age of the patient at the time of the operation.
'year' - Year of the operation, stored as the year minus 1900 (e.g. 64 means 1964).
'nodes' - Number of positive axillary lymph nodes. Lymph nodes are small, bean-shaped organs in the underarm that act as filters.
A lymph node that contains cancer cells is called positive.
'status' - Survival status of the patient (1 = survived 5 years or longer, 2 = died within 5 years).
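Because 'year' holds only the offset from 1900, the calendar year can be recovered by adding 1900. A minimal sketch on the five rows printed by data.head(); the column name op_year is an assumption introduced here for illustration and is not part of the dataset:

```python
import pandas as pd

# First five rows of the dataset, as printed by data.head()
sample = pd.DataFrame({
    "age": [30, 30, 30, 31, 31],
    "year": [64, 62, 65, 59, 65],
    "nodes": [1, 3, 0, 2, 4],
    "status": [1, 1, 1, 1, 1],
})

# 'op_year' is a hypothetical helper column: offset from 1900 -> calendar year
sample["op_year"] = sample["year"] + 1900
print(sample[["year", "op_year"]])
```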
In [4]:
data.count()
Out[2]:
   age  year  nodes  status
0   30    64      1       1
1   30    62      3       1
2   30    65      0       1
3   31    59      2       1
4   31    65      4       1
Out[3]:
Index(['age', 'year', 'nodes', 'status'], dtype='object')
In [5]:
data.isnull().sum()
In [6]:
data['status'].value_counts()
Observations
There are 4 columns; 'status' is the output (target) column.
'status' has 2 classes: 1 -> survived, 2 -> died.
There are 306 records in total.
There are no missing values in the dataset.
The classes are imbalanced: 225 patients survived and 81 died.
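The class counts can be turned into proportions to make the imbalance explicit. A small sketch that uses only the two counts reported by data['status'].value_counts() (225 and 81), so it runs without the CSV file:

```python
import pandas as pd

# Class counts as reported by data['status'].value_counts()
counts = pd.Series({1: 225, 2: 81}, name="status")

# Share of each class among the 306 records
share = counts / counts.sum()
print(share.round(3))
```

With the full DataFrame loaded, data['status'].value_counts(normalize=True) should give the same proportions directly.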
In [7]:
data['status']=data['status'].apply(lambda x: 'survived' if x==1 else 'died')
For better readability, the status codes are relabelled: 1 -> 'survived' and 2 -> 'died'.
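The same relabelling can also be written with Series.map and an explicit dictionary. This keeps the code-to-label mapping visible and turns any unexpected code into NaN rather than labelling it 'died'. A sketch on a few made-up status codes (illustrative values, not taken from the dataset):

```python
import pandas as pd

# Illustrative status codes only; 1 = survived, 2 = died
status = pd.Series([1, 2, 1, 1, 2], name="status")

# Explicit mapping from numeric code to readable label
labels = {1: "survived", 2: "died"}
status_labelled = status.map(labels)

print(status_labelled.value_counts())
```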
In [8]:
s=sns.FacetGrid(data,hue='status',height=6)
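# sns.distplot is deprecated in recent seaborn releases; histplot with kde=True gives a comparable plot
s.map(sns.histplot,'age',kde=True,stat='density')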
s.add_legend()
plt.show()
Out[4]:
age 306
year 306
nodes 306
status 306
dtype: int64
Out[5]:
age 0
year 0
nodes 0
status 0
dtype: int64
Out[6]:
1 225
2 81
Name: status, dtype: int64
In [9]:
s=sns.FacetGrid(data,hue='status',height=6)
s.map(sns.histplot,'year',kde=True,stat='density')
s.add_legend()
plt.show()
In [10]:
s=sns.FacetGrid(data,hue='status',height=6)
s.map(sns.histplot,'nodes',kde=True,stat='density')
s.add_legend()
plt.show()
Observation
From the first plot (age), patients younger than 35 appear more likely to survive,
and patients older than 75 appear less likely to survive.
The majority of patients who survived had fewer than 5 positive nodes.
However, these plots alone do not separate the two classes clearly.
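The visual impressions about age can be cross-checked numerically by computing the share of survivors per age band. A minimal sketch, assuming 'status' already holds the labels 'survived' and 'died'; the function name survival_share_by_age, the bin edges, and the toy rows are assumptions made for illustration and are not dataset values:

```python
import pandas as pd

def survival_share_by_age(df, edges=(0, 34, 75, 120)):
    """Return the fraction of 'survived' rows within each age band.

    With integer ages, the default edges give the bands
    (0, 34] = younger than 35, (34, 75], and (75, 120] = older than 75.
    """
    bands = pd.cut(df["age"], bins=list(edges))
    survived = df["status"].eq("survived")
    return survived.groupby(bands, observed=False).mean()

# Toy rows for demonstration only
toy = pd.DataFrame({
    "age": [30, 33, 40, 52, 60, 78],
    "status": ["survived", "survived", "died", "survived", "died", "died"],
})
result = survival_share_by_age(toy)
print(result)
```

Calling survival_share_by_age(data) after the relabelling step would apply the same check to the real dataset.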
In [11]:
sns.set_style('whitegrid')
sns.pairplot(data,hue='status',height=5)
plt.show()
Observation
Looking at the pairwise scatter plots, the two classes cannot be separated.
In [12]:
sns.boxplot(x='status',y='age',data=data)
plt.show()
In [13]:
sns.boxplot(x='status',y='year',data=data)
plt.show()
In [14]:
sns.boxplot(x='status',y='nodes',data=data)
plt.show()
Observation
We may conclude that patients with 3-4 or fewer positive nodes are more likely to have survived.
About 75% of the patients who survived had fewer than 4 positive nodes.
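The quartile statement can be checked directly instead of being read off the box plot. A sketch using groupby with quantile; the toy values are made up for illustration, so the printed numbers say nothing about the real dataset:

```python
import pandas as pd

# Toy rows for demonstration only
toy = pd.DataFrame({
    "nodes": [0, 0, 1, 2, 3, 1, 4, 9, 13, 23],
    "status": ["survived"] * 5 + ["died"] * 5,
})

# 25th, 50th and 75th percentile of positive nodes for each class
quartiles = (
    toy.groupby("status")["nodes"]
    .quantile([0.25, 0.5, 0.75])
    .unstack()
)
print(quartiles)
```

Replacing toy with the loaded data gives the per-class quartiles that the box plot for 'nodes' displays.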
