2. What is Data Analysis?
Data analysis is the process of cleaning, changing, and
processing raw data and extracting actionable, relevant
information that helps businesses make informed decisions.
The procedure helps reduce the risks inherent in decision-
making by providing useful insights and statistics, often
presented in charts, images, tables, and graphs.
3. Steps followed in data analysis
Data exploration: This is the initial step, where we learn about the dataset’s content and
characteristics: its size, whether it contains missing values, and whether it contains
outliers.
Data cleaning: This step involves handling missing values and outliers, and removing
variables from the dataset that are not suitable for our analysis.
Model building: Once the data is cleaned, we divide the dataset into training and
testing sets and apply an algorithm suited to the nature of the problem and the type
of dataset we have.
Performance: Finally, check the performance of the model on unseen data.
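The exploration and cleaning steps above can be sketched with pandas. This is only an illustration on a small made-up table: the column names (age, chol, patient_id), the median fill, and the 1.5 x IQR outlier rule are assumptions chosen for the example, not choices stated in these slides.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for a real dataset.
df = pd.DataFrame({
    "age": [29, 45, 61, np.nan, 52, 38],
    "chol": [210, 250, 900, 230, 240, 220],  # 900 is an artificial outlier
    "patient_id": [1, 2, 3, 4, 5, 6],        # identifier, not useful for modelling
})

# Exploration: dataset size and number of missing values per column.
n_rows, n_cols = df.shape
missing_per_column = df.isna().sum()
print(n_rows, n_cols)
print(missing_per_column)

# Cleaning: drop the unwanted variable and fill the missing age with the median.
cleaned = df.drop(columns=["patient_id"])
cleaned["age"] = cleaned["age"].fillna(cleaned["age"].median())

# Cleaning: keep only rows whose chol value lies within 1.5 * IQR of the quartiles.
q1, q3 = cleaned["chol"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = cleaned["chol"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
cleaned = cleaned[in_range].reset_index(drop=True)
print(cleaned)
```

Whether to fill, drop, or keep missing values and outliers depends on the dataset; the rules used here are one common option.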
4. Heart Disease Data Analysis
In this analysis we predict whether a person has heart disease. The data is
taken from a Kaggle dataset.
The dataset has 13 independent variables and 1 dependent variable.
In the dependent variable, ‘0’ means the person does not have heart disease
and ‘1’ means the person has heart disease.
6. Heart Disease Data Analysis
Use head() to display the first five rows of the dataset.
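A minimal sketch of this step, assuming pandas. The small table below is a hypothetical stand-in for the real data, and the file name in the comment is an assumption, since the slides do not name the file.

```python
import pandas as pd

# Hypothetical stand-in for the Kaggle data. With the real file this would be
# something like: df = pd.read_csv("heart.csv")  (file name assumed)
df = pd.DataFrame({
    "age": [63, 37, 41, 56, 57, 57, 56, 44],
    "sex": [1, 1, 0, 1, 0, 1, 0, 1],
    "target": [1, 1, 1, 1, 1, 0, 0, 0],
})

# head() returns the first five rows by default; head(n) returns the first n.
first_rows = df.head()
print(first_rows)
```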
7. Heart Disease Data Analysis
Detect and remove duplicate rows from the dataset.
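One way to do this with pandas is duplicated() to count exact repeats and drop_duplicates() to remove them. The toy rows below are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data; the second row is an exact copy of the first.
df = pd.DataFrame({
    "age": [63, 63, 41, 56],
    "chol": [233, 233, 204, 236],
    "target": [1, 1, 1, 0],
})

# duplicated() marks every row that repeats an earlier row.
n_duplicates = int(df.duplicated().sum())
print("duplicate rows:", n_duplicates)

# drop_duplicates() keeps the first occurrence of each repeated row.
deduplicated = df.drop_duplicates().reset_index(drop=True)
print(deduplicated.shape)
```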
8. Heart Disease Data Analysis
Check whether the dataset is evenly distributed between the two classes.
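Assuming "evenly distributed" refers to the balance between the two classes of the dependent variable, a quick check is to count the rows per class. The column name target and the values below are hypothetical.

```python
import pandas as pd

# Hypothetical labels: 1 = heart disease, 0 = no heart disease.
df = pd.DataFrame({"target": [1, 1, 1, 0, 0, 1, 0, 1]})

# Absolute count of rows in each class.
counts = df["target"].value_counts()
print(counts)

# Share of each class; values near 0.5 indicate a balanced dataset.
proportions = df["target"].value_counts(normalize=True)
print(proportions)
```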
9. Heart Disease Data Analysis
Check the correlation of each independent variable with the dependent variable.
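A sketch of this check using the Pearson correlation from pandas corr(). The feature names and values are invented for the example; the slides do not say which correlation measure was used.

```python
import pandas as pd

# Hypothetical toy data with two features and a binary target.
df = pd.DataFrame({
    "age": [63, 37, 41, 56, 57, 44, 52, 60],
    "max_heart_rate": [150, 187, 172, 178, 120, 130, 115, 125],
    "target": [1, 1, 1, 1, 0, 0, 0, 0],
})

# Correlation of every feature with the target (the target itself is dropped).
correlations = df.corr()["target"].drop("target")

# Order features by the strength of the correlation, ignoring its sign.
order = correlations.abs().sort_values(ascending=False).index
ranked = correlations.reindex(order)
print(ranked)
```

A heatmap of the full correlation matrix is a common way to present the same information visually.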
10. Heart Disease Data Analysis
Now split the dataset into training and testing sets.
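A possible version of the split with scikit-learn's train_test_split. The 80/20 ratio, the fixed random_state, and the stratified split are assumptions; the slides do not give the settings that were used. The data is randomly generated for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data: two random features and a balanced binary target.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "feature_a": rng.normal(size=100),
    "feature_b": rng.normal(size=100),
    "target": np.repeat([0, 1], 50),
})

X = df.drop(columns=["target"])  # independent variables
y = df["target"]                 # dependent variable

# Hold out 20% of the rows for testing; stratify keeps the class ratio equal
# in both parts, and random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(X_train.shape, X_test.shape)
```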
11. Build the model
On this dataset we tried several algorithms, including logistic regression,
decision tree, and random forest. The model that gave the best accuracy
is the “Random Forest classifier”.
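The comparison can be sketched with scikit-learn as below. The data is synthetic (make_classification with 13 features, to mirror the number of independent variables), and all hyperparameters are assumptions, so the accuracies it prints say nothing about the results on the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cleaned dataset: 13 features, binary target.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# The three algorithms named on the slide, with assumed settings.
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training set and score it on the unseen test set.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in accuracies.items():
    print(f"{name}: {acc:.3f}")

best_model = max(accuracies, key=accuracies.get)
print("best on this synthetic data:", best_model)
```

Accuracy on a single split can vary with the random seed, so cross-validation is often used to make such a comparison more reliable.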