COVID-19 INDIA
HEMA–BHARAT–KHUSHBU Page 1
SHREE N. J. SONECHA MNG & TECH INSTITUTE
CHANDUVAV
Developed By:-
1) Asker Hema [195533693002]
2) Dusara Khushbu [195533693008]
3) Makvana Bharat [195533693020]
DATA ANALYSIS
PREFACE
Theory in any subject is important, but without practice it
is of little use, particularly for computer students. A student
project developer cannot become a capable technologist without
a practical understanding of the field. Hence this visit
provides a golden opportunity for all student developers.
The principal objective of the office visit is to learn about
the operating processes carried out in various workplaces.
Another attractive feature is learning office management and
discipline, which are equally important in life.
ACKNOWLEDGEMENT
The success of any project is never limited to the
individuals undertaking it. It is the collective effort of the
people around an individual that spells success. For all the
effort behind this successful project, we are highly indebted
to the following people, without whom this project would
never have been completed.
Mr. Chirag Rachchh sir, HOD of the MCA Department, guided
us and regularly supervised our project. We would also like to
express our deep gratitude to all our friends for their
valuable suggestions and cooperation.
Finally, our special thanks go to Mr. Dipak Thanki sir,
Assistant Professor, who encouraged and motivated us
directly or indirectly.
INDEX
NO CONCEPT
1. Basic Objectives
1.1 Introduction
1.2 Need for Data Analytics
1.3 Problem Statement
1.4 Introduction to Python
2. Understand Data
2.1 About Data Source
2.2 Understand Data: Basic Questions
2.3 Understand Data: Data Wrangling
2.4 Exploratory Analysis
3. Methodology
3.1 Extract Features & Model Methodology
3.2 Introduction to Model and Methodology
3.3 Data Visualization
3.4 Various query outcome, visualization,
analysis and conclusion
3.5 Implementation of model & Methodology
3.6 Advantages and Limitations of proposed
Model
4. Conclusion
Conclusion
5. References
List of References
INTRODUCTION
Dataset Title: Covid-19 India
Project For: N. J. Sonecha Mgt & Tech Institute, Chanduvav
Developed By: Dusara Khushbu, Asker Hema, Makvana Bharat
College: N. J. Sonecha Mgt and Tech Institute, Chanduvav
Project Guide: Thanki Dipak Sir
1. Basic Objectives
Introduction
Data science projects offer you a promising way to
kick-start your career in this field. Not only do you get to
learn data science by applying it, you also get projects to
showcase on your CV! Nowadays, recruiters evaluate a
candidate’s potential by his/her work and don’t put a lot
of emphasis on certifications. It wouldn’t matter if you
just tell them how much you know if you have nothing to
show them! That’s where most people struggle and miss
out.
You might have worked on several problems before,
but if you can’t make it presentable & easy-to-explain,
how on earth would someone know what you are capable
of? That’s where these projects will help you. Think of
the time you’ll spend on these projects like your training
sessions. The more time you spend practicing, the better
you’ll become!
We’ve made sure to provide you with a taste of a
variety of problems from different domains. We believe
everyone must learn to smartly work with huge amounts
of data, hence large datasets are included. Also, we’ve
made sure all the datasets are open and free to access.
Data science courses often spend a significant amount of time
on theory and not enough on practical application. To make real
progress along the path toward becoming a data scientist,
it’s important to start building data science projects as
soon as possible.
Need for Data Analytics
Data analytics is the science of analyzing raw data
in order to draw conclusions about that information.
Many of the techniques and processes of data analytics
have been automated into mechanical processes
and algorithms that work over raw data for human
consumption.
Data analytics techniques can reveal trends and
metrics that would otherwise be lost in the mass of
information. This information can then be used to
optimize processes to increase the overall efficiency of
a business or system.
Data analytics (DA) is the process of examining data
sets in order to draw conclusions about the
information they contain, increasingly with the aid of
specialized systems and software. Data analytics
technologies and techniques are widely used in
commercial industries to enable organizations to
make more-informed business decisions and by
scientists and researchers to verify or disprove
scientific models, theories and hypotheses.
• Why Data Analytics:
Data analytics is important because it helps
businesses optimize their performances. Implementing it
into the business model means companies can help reduce
costs by identifying more efficient ways of doing business
and by storing large amounts of data.
• Data Analytics Life Cycle [figure]
Problem Statement
Project Definition:
Our project examines the symptoms that occur with COVID-19
in India, such as:
• Cough
• Fever
• Difficulty breathing (severe cases)
• Tiredness
Our project detail:-
These datasets relate to COVID-19, which started out in
China and has now spread globally, with countries
scrambling to tackle it. The virus that started out as a
healthcare emergency has now started to have serious
economic consequences.
For the purpose of this project, we will only be looking
at the data at the country level and not at the Province/State
level. We create a consolidated dataset that combines the
datasets for Cases, Deaths and Recoveries. We have also
created a function to get the daily count from the cumulative
data.
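The function for getting daily counts from cumulative data is not shown in the report. A minimal sketch of how it could look is given below. The function name and the sample values are assumptions made for illustration, and the first day's count is assumed to equal the first cumulative value.

```python
from typing import List


def daily_from_cumulative(cumulative: List[int]) -> List[int]:
    """Convert a cumulative series into per-day counts.

    Assumption: the first entry is the count for the first reported day.
    """
    daily = []
    previous = 0
    for total in cumulative:
        # Each day's count is the increase over the previous cumulative total.
        daily.append(total - previous)
        previous = total
    return daily


# Made-up cumulative totals, for illustration only.
cumulative_cases = [1, 3, 3, 7, 12]
print(daily_from_cumulative(cumulative_cases))  # [1, 2, 0, 4, 5]
```

A negative daily value would indicate that the cumulative figure was revised downward; how to treat such revisions is left open here.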
Column Detail:-
(1) Date:
Date of cumulative report
(2) Name of State / UT:
Name of the state / Union Territory / National
Capital Region
(3) Total Confirmed cases (Indian National):
Cumulative count of Indian nationals confirmed with
COVID-19
(4) Total Confirmed cases (Foreign National):
Cumulative count of foreign nationals confirmed
with COVID-19
(5) Cured/Discharged/Migrated:
Cumulative count of cured / discharged cases
(6) Latitude:
Latitude of the location
(7) Longitude:
Longitude of the location
(8) Death:
Cumulative count of deaths reported
(9) Total Confirmed cases:
Total confirmed cases
(10) Age bracket / Gender:
Age / age range / age bracket and gender of the patient
(11) detected_city:
City in which case is detected
(12) detected_district:
District in which case is detected
(13) detected_state:
State in which case is detected
(14) state_code:
Code of the state in which case is detected
(15) current_status:
Current status
(16) Notes:
Notes
(17) suspected_contacted_patient:
Suspected contacted patient
(18) Nationality:
Nationality of the patient
(19) type_of_transmission:
Type of transmission
(20) status_change_date:
Date on which case status changed
(21) backup_notes:
Backup notes
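To show how these columns might be read in Python, the sketch below parses a small CSV text with the standard csv module. The sample rows are made up, the exact header spellings in the real file are an assumption, and the derived "active" column (confirmed minus cured minus deaths) is our own illustration, not part of the dataset.

```python
import csv
import io

# Made-up rows for illustration only; not taken from the real dataset.
SAMPLE_CSV = """Date,Name of State / UT,Total Confirmed cases,Cured/Discharged/Migrated,Death
2020-03-20,State A,10,2,1
2020-03-20,State B,5,0,0
"""


def load_rows(text):
    """Parse CSV text into dictionaries with integer counts."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        confirmed = int(record["Total Confirmed cases"])
        cured = int(record["Cured/Discharged/Migrated"])
        deaths = int(record["Death"])
        rows.append({
            "date": record["Date"],
            "state": record["Name of State / UT"],
            "confirmed": confirmed,
            "cured": cured,
            "deaths": deaths,
            # Derived column (an assumption, not present in the source data).
            "active": confirmed - cured - deaths,
        })
    return rows


rows = load_rows(SAMPLE_CSV)
for row in rows:
    print(row["state"], row["active"])
```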
Introduction To Python
What is python:
Python is a popular programming language. It was
created by Guido van Rossum, and released in 1991.
Python is a general purpose programming language that
is becoming ever more popular for data science.
Companies worldwide are using Python to draw insights from
their data and gain a competitive edge.
What can Python do:
Python can be used on a server to create web
applications.
Python can be used alongside software to create
workflows.
Python can connect to database systems. It can also read
and modify files.
Python can be used to handle big data and perform
complex mathematics.
Python can be used for rapid prototyping, or for
production-ready software development.
Why Python?
Python works on different platforms (Windows, Mac,
Linux, Raspberry Pi, etc).
Features in Python:
• Easy to code
• High-level programming language
• Free and open source
• Object-oriented language
• GUI programming support
• Extensible
• Portable
• Integrated language
2. Understand Data
About Data Source:
Source:- Covid-19 India
Data Source link:-
https://www.kaggle.com/imdevskp/covid19-corona-virus-india-dataset
Basic Questions:
Q.1) What is a corona virus?
Corona viruses are a large family of viruses that are
known to cause illness ranging from the common cold to
more severe diseases such as Middle East Respiratory
Syndrome (MERS) and Severe Acute Respiratory
Syndrome (SARS).
Q.2) Who is most at risk for the corona virus
disease?
People of all ages can be infected by the new corona virus
(2019-nCoV). Older people, and people with pre-existing
medical conditions (such as asthma, diabetes, heart
disease) appear to be more vulnerable to becoming
severely ill with the virus.
WHO advises people of all ages to take steps to protect
themselves from the virus, for example by following good
hand hygiene and good respiratory hygiene.
Q.3) Is there a vaccine for the corona virus
disease?
When a disease is new, there is no vaccine until one is
developed. It can take a number of years for a new
vaccine to be developed.
Q.4) Can corona viruses be transmitted from
person to person?
Yes, some corona viruses can be transmitted from person
to person, usually after close contact with an infected
patient, for example, in a household, workplace, or health
care centre.
Q.5) Is there a treatment for a novel corona
virus?
There is no specific treatment for disease caused by a
novel corona virus.
Q.6) What can I do to protect myself?
Standard recommendations to reduce exposure to and
transmission of a range of illnesses include maintaining
basic hand and respiratory hygiene, and safe food
practices and avoiding close contact, when possible, with
anyone showing symptoms of respiratory illness such as
coughing and sneezing.
Q.7) Are health workers at risk from a novel
corona virus?
Yes, they can be, as health care workers come into contact
with patients more often than the general public. WHO
recommends that health care workers consistently apply
appropriate infection prevention and control measures.
Data Wrangling
Data wrangling, sometimes referred to as data
munging, is the process of transforming and
mapping data from one "raw" data form into
another format with the intent of making it more
appropriate and valuable for a variety of
downstream purposes such as analytics.
A data wrangler is a person who performs these
transformation operations.
• Use of Data Wrangling:
The data transformations are typically applied to distinct
entities (e.g. fields, rows, columns, data values etc.)
within a data set, and could include such actions as
extractions, parsing, joining, standardizing, augmenting,
cleansing, consolidating and filtering to create desired
wrangling outputs that can be leveraged downstream.
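A few of these actions (standardizing, cleansing, filtering and consolidating) can be sketched in plain Python as follows. The records and field names are made up for illustration and are not taken from the dataset.

```python
from collections import defaultdict

# Made-up raw records for illustration only.
raw_records = [
    {"state": " kerala ", "confirmed": "12"},
    {"state": "Kerala", "confirmed": "5"},
    {"state": "DELHI", "confirmed": ""},   # missing value
    {"state": "Delhi", "confirmed": "7"},
]


def wrangle(records):
    """Standardize names, drop unusable rows and total counts per state."""
    totals = defaultdict(int)
    for record in records:
        # Standardizing: trim whitespace and use one capitalization style.
        state = record["state"].strip().title()
        value = record["confirmed"].strip()
        # Cleansing / filtering: skip rows without a usable number.
        if not value.isdigit():
            continue
        # Parsing and consolidating: convert to int and sum per state.
        totals[state] += int(value)
    return dict(totals)


print(wrangle(raw_records))  # {'Kerala': 17, 'Delhi': 7}
```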
Exploratory Analysis:
Exploratory data analysis (EDA) is an approach
to analyzing datasets to summarize their main
characteristics, often with visual methods.
A statistical model can be used or not, but primarily
EDA is for seeing what the data can tell us beyond the
formal modeling or hypothesis testing task.
Typical graphical techniques used in EDA are:
• Box Plot
• Histogram
• Run Chart
• Pareto Chart
• Pie Chart
• Ogive chart
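Two of these charts rest on simple summaries that can be computed before any plotting. The sketch below, using only the standard library, computes the five numbers a box plot draws and the bin counts behind a histogram. The values and the bin width are made up, and the "inclusive" quartile method is one of several conventions.

```python
import statistics


def five_number_summary(values):
    """Return min, Q1, median, Q3 and max: the numbers a box plot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def histogram_counts(values, bin_width):
    """Count how many values fall into each bin of the given width."""
    counts = {}
    for value in values:
        bin_start = (value // bin_width) * bin_width
        counts[bin_start] = counts.get(bin_start, 0) + 1
    return dict(sorted(counts.items()))


# Made-up daily case counts, for illustration only.
daily_cases = [2, 4, 4, 5, 7, 9, 10, 12, 15]
print(five_number_summary(daily_cases))
print(histogram_counts(daily_cases, bin_width=5))
```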
3. Methodology
Extract Features & Model Methodology:
Important Columns:-
Date:
Date of cumulative report
Name of State / UT:
Name of the state / Union Territory /
National Capital Region
Total Confirmed cases (Indian National):
Cumulative count of Indian national
confirmed with COVID-19
Total Confirmed cases (Foreign National):
Cumulative count of foreign national
confirmed with COVID-19
Cured/Discharged/Migrated:
Cumulative count of cured / discharged
cases
Latitude:
Latitude of the location
Longitude:
Longitude of the location
Death:
Cumulative count of deaths reported
Total Confirmed cases:
Total confirmed cases
Types of Model:-
Data modeling is the process of producing a
descriptive diagram of relationships between various
types of information that are to be stored in a database.
1. Test Datasets
2. Classification Test Problems
3. Regression Test Problems
Test Datasets:-
The test dataset is a dataset used to provide an unbiased
evaluation of a final model fit on the training dataset.
A test dataset is a dataset that is independent of the
training dataset, but that follows the same probability
distribution as the training dataset. If a model fit to the
training dataset also fits the test dataset well,
minimal overfitting has taken place.
Test datasets can also be small contrived problems that allow
you to test and debug your algorithms and test harness.
• They can be generated quickly and easily.
• They are small and easily visualized in two
dimensions.
• They can be scaled up trivially.
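The idea of a test dataset that is independent of the training dataset can be sketched with a simple hold-out split. This is an illustration using the standard library only. The function name, the 25% test fraction and the seed are assumptions, not choices stated in the report.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle rows and hold out a fraction as an independent test set."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    test_size = int(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


rows = list(range(20))  # stand-in for 20 observations
train, test = train_test_split(rows)
print(len(train), len(test))  # 15 5
```

Because both parts come from the same shuffled data, they can be expected to follow the same distribution while sharing no observations.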
Classification Test Problems:-
After building a predictive classification model, you need
to evaluate the performance of the model, that is, how
good the model is at predicting the outcome of new
observations (test data) that have not been used to
train the model.
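One common way to measure this is accuracy, the share of test observations whose label is predicted correctly. The sketch below is an illustration with made-up labels. Accuracy is only one possible measure and the report does not say which one was used.

```python
def accuracy(actual, predicted):
    """Fraction of test observations whose label was predicted correctly."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Made-up labels for five held-out observations.
actual_labels = [1, 0, 1, 1, 0]
predicted_labels = [1, 0, 0, 1, 0]
print(accuracy(actual_labels, predicted_labels))  # 0.8
```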
• Blobs Classification Problem:-
Generates blobs of points drawn from a Gaussian distribution.
You can control how many blobs to generate and the
number of samples to generate.
• Moons Classification Problem:-
Used for binary classification; it generates a swirl
pattern (two moons) and is suitable for algorithms that are
capable of learning nonlinear class boundaries.
• Circles Classification Problem:-
A binary classification problem in which the samples fall
into concentric circles.
You can control the amount of noise in the shapes.
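Generators for these three problems are available in scikit-learn as make_blobs, make_moons and make_circles in sklearn.datasets. To show the idea without that library, the sketch below builds a circles problem with the standard library only. The function name and parameters are our own and it is not scikit-learn's implementation.

```python
import math
import random


def concentric_circles(n_samples=100, noise=0.05, factor=0.5, seed=0):
    """Generate two concentric circles: label 0 is outer, label 1 is inner."""
    rng = random.Random(seed)
    points, labels = [], []
    for i in range(n_samples):
        label = i % 2                       # alternate between the two classes
        radius = factor if label == 1 else 1.0
        angle = rng.uniform(0.0, 2.0 * math.pi)
        # Gaussian noise moves each point slightly off its circle.
        x = radius * math.cos(angle) + rng.gauss(0.0, noise)
        y = radius * math.sin(angle) + rng.gauss(0.0, noise)
        points.append((x, y))
        labels.append(label)
    return points, labels


points, labels = concentric_circles(n_samples=6)
for (x, y), label in zip(points, labels):
    print(label, round(x, 2), round(y, 2))
```

Raising the noise value blurs the two rings into each other, which makes the classes harder to separate.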
Regression Test Problems:-
Regression is the problem of predicting a quantity given
an observation.
We will create a dataset with a linear relationship between
the inputs and the outputs.
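Such a dataset can be generated as in the sketch below, which is an illustration only. The slope, intercept, noise level and input range are hypothetical values, not ones taken from the report.

```python
import random


def make_linear_dataset(n_samples=50, slope=2.0, intercept=1.0,
                        noise=0.5, seed=0):
    """Generate (x, y) pairs with y = slope * x + intercept + Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n_samples)]
    ys = [slope * x + intercept + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


xs, ys = make_linear_dataset(n_samples=5)
for x, y in zip(xs, ys):
    print(round(x, 2), round(y, 2))
```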
Problems and Issues of Linear Regression
o Specification
o Proxy Variables and Measurement Error
o Selection Bias
o Multicollinearity
o Autocorrelation
o Heteroskedasticity
o Simultaneous Equations
o Limited Dependent Variables
Introduction To Model & Methodology:
What is Regression?
Regression analysis is a set of statistical processes for
estimating the relationships among variables. It includes
many techniques for modeling and analyzing several
variables, when the focus is on the relationship between a
dependent variable and one or more independent
variables.
This technique is used for forecasting, time series
modeling and finding the causal effect
relationship between the variables.
Why do we use Regression Analysis?
Regression analysis estimates the relationship between
two or more variables.
There are multiple benefits of using regression analysis.
They are as follows:
1. It indicates the significant relationships between
the dependent variable and the independent variables.
2. It indicates the strength of the impact of
multiple independent variables on a dependent
variable.
Types Of Regression:
1. Linear Regression:
It is one of the most widely known modeling techniques.
Linear regression is usually among the first few topics
that people pick while learning predictive modeling.
In this technique, the dependent variable is continuous, the
independent variable(s) can be continuous or discrete, and
the nature of the regression line is linear.
There must be a linear relationship between the independent
and dependent variables.
Linear regression is very sensitive to outliers. They can
badly distort the regression line and eventually the
forecasted values.
2. Logistic Regression:
Logistic regression is used to find the probability of
event=Success and event=Failure.
We should use logistic regression when the dependent
variable is binary (0/1, True/False, Yes/No) in nature.
Logistic regression is widely used for classification
problems.
Logistic regression doesn’t require a linear relationship
between the dependent and independent variables.
It can handle various types of relationships because it
applies a non-linear log transformation to the predicted
odds ratio.
Logistic regression estimates the parameters of a logistic
model and is a form of binomial regression.
Logistic regression is used to deal with data that has two
possible outcomes and to model the relationship between
those outcomes and the predictors.
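The log transformation mentioned above can be sketched as follows. The sigmoid function turns a log-odds value into a probability and the logit function reverses it. The weight and bias values are hypothetical, chosen only so that the example is easy to check by hand. They are not fitted to any data.

```python
import math


def sigmoid(z):
    """Map a log-odds value z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(p):
    """Inverse of sigmoid: the logit (log of the odds) of probability p."""
    return math.log(p / (1.0 - p))


def predict_probability(x, weight, bias):
    """P(event = Success | x) for a one-feature logistic model."""
    return sigmoid(weight * x + bias)


# Hypothetical coefficients, chosen only for illustration.
p = predict_probability(x=2.0, weight=1.5, bias=-3.0)
print(p)  # 0.5, because 1.5 * 2.0 - 3.0 = 0 and sigmoid(0) = 0.5
```

A probability above 0.5 would usually be read as event=Success and one below 0.5 as event=Failure, although the cut-off is a choice.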
3. Polynomial Regression:
A regression equation is a polynomial regression equation
if the power of the independent variable is more than 1.
It is used for curvilinear data. Polynomial regression is fit
with the method of least squares.
The goal of regression analysis is to model the expected
value of a dependent variable y in terms of the
independent variable x.
Formula:
For a straight-line fit through N points (x, y), least squares gives:
$$y = mx + b$$
$$m = \frac{N \sum xy - (\sum x)(\sum y)}{N \sum x^2 - (\sum x)^2}$$
$$b = \frac{\sum y - m \sum x}{N}$$
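The formulas for m and b can be checked with a short Python sketch. The function name and the sample points are our own. The points lie exactly on y = 2x + 1, so the expected slope and intercept are known in advance.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept b for the line y = m*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical; slope is undefined")
    m = (n * sum_xy - sum_x * sum_y) / denominator
    b = (sum_y - m * sum_x) / n
    return m, b


# Made-up points that lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
print(fit_line(xs, ys))  # (2.0, 1.0)
```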
Data Visualization:
All Graph:-
Output:-
Box-Plot:-
Output:-
Histogram:-
Output:-
Info:-
Output:-
Scatter chart:-
Output:-
5. List of References:
1. http://www.kaggle.com
2. python data set
3. google.com
4. http://medicalnewstoday.com
Plagiarism Report:
-Thank You.

KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdf
 

Covid 19[hbk]

  • 1. COVID-19 INDIA HEMA–BHARAT–KHUSHBU Page 1 SHREE N.J.SONECHA MNG & TECH INSTITUTE, CHANDUVAV. Developed By: 1) Asker Hema [195533693002] 2) Dusara Khushbu [195533693008] 3) Makvana Bharat [195533693020]. DATA ANALYSIS
  • 2. PREFACE. Theory in any subject is important, but without practice it is of little use, particularly for a computer student. A student project developer cannot become a capable technologist without a practical understanding of the field. Hence this visit provides a golden opportunity for all student developers. The principal objective of the office visit is to learn about the operational processes carried out in various places. Another attractive feature is the chance to learn office management and discipline, which are equally important in life.
  • 3. ACKNOWLEDGEMENT. The success of any project is never limited to the individuals undertaking it. It is the collective effort of the people around an individual that spells success. For all the effort behind this successful project, we are highly indebted to the following people, without whom this project would never have been completed. Mr. Chirag Rachchh, HOD of the MCA Department, guided us and regularly supervised our project. We would like to express our deep gratitude to all our friends for their valuable suggestions and cooperation. Finally, our special thanks go to Mr. Dipak Thanki, Assistant Professor, who encouraged and motivated us directly or indirectly.
  • 4. INDEX NO CONCEPT 1. Basic Objectives 1.1 Introduction 1.2 Need for Data Analytics 1.3 Problem Statement 1.4 Introduction to Python 2. Understand Data 2.1 About Data Source 2.2 Understand Data: Basic Questions 2.3 Understand Data: Data Wrangling 2.4 Exploratory Analysis 3. Methodology 3.1 Extract Features & Model Methodology 3.2 Introduction to Model and Methodology 3.3 Data Visualization 3.4 Various query outcome, visualization, analysis and conclusion
  • 5. 3.5 Implementation of model & Methodology 3.6 Advantages and Limitations of proposed Model 4. Conclusion Conclusion 5. References List of References
  • 6. INTRODUCTION. Dataset Title: Covid-19 India. Project For: N. J. Sonecha Mgt & Tech Institute, Chanduvav. Developed By: Dusara Khushbu, Asker Hema, Makvana Bharat. College: N. J. Sonecha Mgt and Tech Institute, Chanduvav. Project Guide: Thanki Dipak Sir
  • 7. 1. Basic Objectives. Introduction: Data science projects offer you a promising way to kick-start your career in this field. Not only do you get to learn data science by applying it, you also get projects to showcase on your CV! Nowadays, recruiters evaluate a candidate’s potential by his/her work and don’t put a lot of emphasis on certifications. It wouldn’t matter if you just tell them how much you know if you have nothing to show them! That’s where most people struggle and miss out. You might have worked on several problems before, but if you can’t make it presentable and easy to explain, how on earth would someone know what you are capable of? That’s where these projects will help you. Think of the time you’ll spend on these projects like your training
  • 8. sessions. The more time you spend practicing, the better you’ll become! We’ve made sure to provide you with a taste of a variety of problems from different domains. We believe everyone must learn to work smartly with huge amounts of data, hence large datasets are included. Also, we’ve made sure all the datasets are open and free to access. Data science students often spend a significant amount of time on theory and not enough on practical application. To make real progress along the path toward becoming a data scientist, it’s important to start building data science projects as soon as possible. Need for Data Analytics: Data analytics is the science of analyzing raw data in order to draw conclusions about that information. Many of the techniques and processes of data analytics
  • 9. have been automated into mechanical processes and algorithms that work over raw data for human consumption. Data analytics techniques can reveal trends and metrics that would otherwise be lost in the mass of information. This information can then be used to optimize processes to increase the overall efficiency of a business or system. Data analytics (DA) is the process of examining data sets in order to draw conclusions about the information they contain, increasingly with the aid of specialized systems and software. Data analytics technologies and techniques are widely used in commercial industries to enable organizations to make more-informed business decisions and by scientists and researchers to verify or disprove scientific models, theories and hypotheses.
  • 10. Why Data Analytics: Data analytics is important because it helps businesses optimize their performance. Implementing it into the business model means companies can reduce costs by identifying more efficient ways of doing business and by storing large amounts of data.
  • 11. Data Analytics Life Cycle
  • 12. Problem Statement. Project Definition: Our project examines the symptoms that occur with COVID-19 in India, such as cough, fever, difficulty breathing (severe cases), and tiredness. Project detail: These datasets relate to COVID-19, which started out in China and has now spread globally, with countries scrambling to tackle it. The virus that started out as a healthcare emergency has now started to have serious economic consequences. For the purpose of this article, we will only be looking at the data at a country level and not at the Province/State
  • 13. level. Let’s create a consolidated dataset that combines the datasets for Cases, Deaths and Recoveries. I have also created a function to get the daily count from the cumulative data. Column Detail: (1) Date: Date of cumulative report (2) Name of State / UT: Name of the state / Union Territory / National Capital Region (3) Total Confirmed cases (Indian National): Cumulative count of Indian nationals confirmed with COVID-19 (4) Total Confirmed cases (Foreign National): Cumulative count of foreign nationals confirmed with COVID-19 (5) Cured/Discharged/Migrated:
  • 14. Cumulative count of cured / discharged cases (6) Latitude: Latitude of the location (7) Longitude: Longitude of the location (8) Death: Cumulative count of deaths reported (9) Total Confirmed cases: Total confirmed cases (10) Gender: Age / Age range / Age bracket (11) detected_city: City in which case is detected (12) detected_district: District in which case is detected (13) detected_state: State in which case is detected
  • 15. (14) state_code: Code of the state in which case is detected (15) current_status: Current status (16) Notes: Note (17) suspected_contacted_patient: Suspected contacted patient (18) Nationality: Nationality of the patient (19) type_of_transmission: Type of transmission (20) status_change_date: Date on which case status changed (21) backup_notes: Backup notes
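The report mentions a function that derives daily counts from the cumulative columns but does not show it. A minimal sketch of what such a function might look like follows. The name `daily_from_cumulative`, the sample numbers, and the choice to clamp negative differences to zero are all assumptions made for illustration, not the authors' code.

```python
from typing import List


def daily_from_cumulative(cumulative: List[int]) -> List[int]:
    """Convert a running total (one value per day) into per-day counts."""
    daily = []
    previous = 0
    for total in cumulative:
        # Assumption: if a later correction makes the total dip,
        # report 0 new cases for that day instead of a negative number.
        daily.append(max(total - previous, 0))
        previous = total
    return daily


# Made-up cumulative confirmed cases for five consecutive days.
cumulative_cases = [1, 3, 3, 10, 15]
daily_cases = daily_from_cumulative(cumulative_cases)
print(daily_cases)  # [1, 2, 0, 7, 5]
```

The same helper could be applied to any of the cumulative columns listed above (confirmed, cured, deaths), one state at a time.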
  • 16. Introduction to Python. What is Python: Python is a popular programming language. It was created by Guido van Rossum and released in 1991. Python is a general-purpose programming language that is becoming ever more popular for data science. Companies worldwide are using Python to extract insights from their data and gain a competitive edge. What can Python do: Python can be used on a server to create web applications.
  • 17. Python can be used alongside software to create workflows. Python can connect to database systems. It can also read and modify files. Python can be used to handle big data and perform complex mathematics. Python can be used for rapid prototyping or for production-ready software development. Why Python? Python works on different platforms (Windows, Mac, Linux, Raspberry Pi, etc.). Features in Python: Easy to code. Free and open source. Object-oriented language. GUI programming support. High-level language.
  • 18. Extensible. Portable. Integrated. 2. Understand Data. About Data Source: Source: Covid-19 India Data. Source link: https://www.kaggle.com/imdevskp/covid19-corona-virus-india-dataset Basic Questions: Q.1) What is a coronavirus? Coronaviruses are a large family of viruses that are known to cause illness ranging from the common cold to more severe diseases such as Middle East Respiratory
  • 19. Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS). Q.2) Who is most at risk for the coronavirus disease? People of all ages can be infected by the new coronavirus (2019-nCoV). Older people, and people with pre-existing medical conditions (such as asthma, diabetes, heart disease) appear to be more vulnerable to becoming severely ill with the virus. WHO advises people of all ages to take steps to protect themselves from the virus, for example by following good hand hygiene and good respiratory hygiene. Q.3) Is there a vaccine for the coronavirus disease? When a disease is new, there is no vaccine until one is developed. It can take a number of years for a new vaccine to be developed.
  • 20. Q.4) Can coronaviruses be transmitted from person to person? Yes, some coronaviruses can be transmitted from person to person, usually after close contact with an infected patient, for example in a household, workplace, or health care centre. Q.5) Is there a treatment for a novel coronavirus? There is no specific treatment for disease caused by a novel coronavirus. Q.6) What can I do to protect myself? Standard recommendations to reduce exposure to and transmission of a range of illnesses include maintaining basic hand and respiratory hygiene, following safe food practices, and avoiding close contact, when possible, with anyone showing symptoms of respiratory illness such as coughing and sneezing.
  • 21. Q.7) Are health workers at risk from a novel coronavirus? Yes, they can be, as health care workers come into contact with patients more often than the general public. WHO recommends that health care workers consistently apply appropriate ... Data Wrangling: Data wrangling, sometimes referred to as data munging, is the process of transforming and mapping data from one "raw" data form into another format with the intent of making it more appropriate and valuable for a variety of downstream purposes such as analytics. A data wrangler is a person who performs these transformation operations.
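To make the idea of wrangling concrete, here is a small sketch using only Python's standard `csv` module. The column names come from the column list in this report; the sample rows, the DD/MM/YYYY date format, and the helper name `wrangle` are assumptions for illustration and may not match the actual Kaggle file.

```python
import csv
import io

# Made-up raw rows with typical problems: stray spaces,
# inconsistent capitalisation, and a missing value.
RAW = """Date,Name of State / UT,Total Confirmed cases,Death
30/01/2020, kerala ,1,0
31/01/2020,Kerala,1,0
01/02/2020,KERALA,2,
"""


def wrangle(raw_text):
    """Parse, standardise, and clean raw CSV text into a list of dicts."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        # Parsing: assumed DD/MM/YYYY input, rewritten as ISO YYYY-MM-DD.
        day, month, year = record["Date"].split("/")
        rows.append({
            "date": f"{year}-{month}-{day}",
            # Standardising: trim whitespace and unify capitalisation.
            "state": record["Name of State / UT"].strip().title(),
            "confirmed": int(record["Total Confirmed cases"]),
            # Cleansing: treat an empty death count as 0.
            "deaths": int(record["Death"] or 0),
        })
    return rows


clean_rows = wrangle(RAW)
for row in clean_rows:
    print(row)
```

After this step every row uses the same state spelling and date format, which is what makes later grouping and plotting reliable.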
  • 22. Use of Data Wrangling: The data transformations are typically applied to distinct entities (e.g. fields, rows, columns, data values etc.) within a data set, and could include such actions as extractions, parsing, joining, standardizing, augmenting, cleansing, consolidating and filtering to create desired wrangling outputs that can be leveraged downstream. Exploratory Analysis: Exploratory data analysis (EDA) is an approach to analyzing datasets to summarize their main characteristics, often with visual methods. A statistical method can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task. Typical graphical techniques used in EDA are:
  • 23. Box Plot, Histogram, Run Chart, Pareto Chart, Pie Chart, Ogive Chart. 3. Methodology. Extract Features & Model Methodology: Important Columns: Date: Date of cumulative report. Name of State / UT: Name of the state / Union Territory / National Capital Region. Total Confirmed cases (Indian National): Cumulative count of Indian nationals confirmed with COVID-19. Total Confirmed cases (Foreign National):
  • 24. Cumulative count of foreign nationals confirmed with COVID-19. Cured/Discharged/Migrated: Cumulative count of cured / discharged cases. Latitude: Latitude of the location. Longitude: Longitude of the location. Death: Cumulative count of deaths reported. Total Confirmed cases: Total confirmed cases. Types of Model: Data modeling is the process of producing a descriptive diagram of relationships between various types of information that are to be stored in a database. 1. Test Datasets
  • 25. 2. Classification Test Problems 3. Regression Test Problems. Test Datasets: The test dataset is a dataset used to provide an unbiased evaluation of a final model fit on the training dataset. A test dataset is independent of the training dataset but follows the same probability distribution. If a model fit to the training dataset also fits the test dataset well, minimal overfitting has taken place. Test datasets are small contrived problems that allow you to test and debug your algorithms and test harness. They can be generated quickly and easily. They are small and easily visualized in two dimensions. They can be scaled up trivially.
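The separation between training and test data described above can be sketched in a few lines of standard-library Python. The helper name `split_train_test`, the 25% test fraction, and the contrived samples are assumptions chosen for illustration; the report does not say how its own split was made.

```python
import random


def split_train_test(samples, test_fraction=0.25, seed=42):
    """Shuffle a copy of the samples and hold out a fraction for testing."""
    shuffled = list(samples)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    # The first n_test shuffled items become the test set; the rest train.
    return shuffled[n_test:], shuffled[:n_test]


# A small contrived dataset: (input, output) pairs on the line y = 2x + 1.
samples = [(x, 2 * x + 1) for x in range(20)]
train, test = split_train_test(samples)
print(len(train), len(test))  # 15 5
```

Because both parts are drawn from the same shuffled pool, they follow the same distribution, and no sample appears in both sets.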
  • 26. Classification Test Problems: After building a predictive classification model, you need to evaluate its performance, that is, how good the model is at predicting the outcome of new observations (test data that have not been used to train the model). Blobs Classification Problem: Generates blobs of points with a Gaussian distribution. You can control how many blobs to generate and the number of samples to generate. Moons Classification Problem: Used for binary classification; it generates a swirl pattern and is suitable for algorithms that are capable of learning nonlinear class boundaries.
  • 27. Circles Classification Problem: Generates a dataset whose points fall into concentric circles. You can control the amount of noise in the shapes. Regression Test Problems: Regression is the problem of predicting a quantity given an observation. We will create a dataset with a linear relationship between the inputs and the outputs. Problems and Issues of Linear Regression: o Specification o Proxy Variables and Measurement Error o Selection Bias o Multicollinearity o Autocorrelation o Heteroskedasticity o Simultaneous Equations
  • 28. o Limited Dependent Variables. Introduction to Model & Methodology: What is Regression? Regression analysis is a set of statistical processes for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables. This technique is used for forecasting, time series modeling and finding the causal effect relationship between the variables. Why do we use Regression Analysis? Regression analysis estimates the relationship between two or more variables.
  • 29. There are multiple benefits of using regression analysis. They are as follows: 1. It indicates the significant relationships between the dependent variable and the independent variables. 2. It indicates the strength of impact of multiple independent variables on a dependent variable. Types of Regression: 1. Linear Regression: It is one of the most widely known modeling techniques. Linear regression is usually among the first few topics that people pick while learning predictive modeling. In this technique, the dependent variable is continuous, the independent variable(s) can be continuous or discrete, and the nature of the regression line is linear.
  • 30. There must be a linear relationship between the independent and dependent variables. Linear regression is very sensitive to outliers. They can terribly affect the regression line and eventually the forecasted values. 2. Logistic Regression: Logistic regression is used to find the probability of event=Success and event=Failure. We should use logistic regression when the dependent variable is binary (0/1, True/False, Yes/No) in nature. Logistic regression is widely used for classification problems. Logistic regression doesn’t require a linear relationship between the dependent and independent variables.
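As a rough illustration of how logistic regression turns an input into the probability of a binary outcome, the sketch below applies the logistic (sigmoid) function to a linear score. The weight and bias values are made up rather than fitted to the COVID-19 data, and the function names are assumptions for this example.

```python
import math


def sigmoid(z):
    """Map any real-valued score to a probability between 0 and 1."""
    # Adequate for moderate z; very large negative z would overflow math.exp.
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(x, weight, bias):
    """Probability of the positive class (event=Success) for one input x."""
    return sigmoid(weight * x + bias)


# Hypothetical parameters: the score is 1.5 * 2.0 - 3.0 = 0, so p = 0.5.
p = predict_probability(x=2.0, weight=1.5, bias=-3.0)
label = 1 if p >= 0.5 else 0  # threshold the probability to get a class
print(p, label)  # 0.5 1
```

In a real model the weight and bias would be estimated from labelled training data instead of being chosen by hand.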
  • 31. Logistic regression can handle various types of relationships because it applies a non-linear log transformation to the predicted odds ratio. Logistic regression estimates the parameters of a logistic model and is a form of binomial regression. It is used to deal with data that has two possible outcomes and to model the relationship between those outcomes and the predictors. 3. Polynomial Regression: A regression equation is a polynomial regression equation if the power of the independent variable is more than 1. It is used for curvilinear data. Polynomial regression is fit with the method of least squares.
  • 32. The goal of regression analysis is to model the expected value of a dependent variable y in terms of the independent variable x. Formula (least-squares straight line through N points): $y = mx + b$, with $m = \dfrac{N\sum xy - \left(\sum x\right)\left(\sum y\right)}{N\sum x^{2} - \left(\sum x\right)^{2}}$ and $b = \dfrac{\sum y - m\sum x}{N}$.
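The slope and intercept formulas above translate directly into code. The sketch below is one possible plain-Python rendering; the function name `fit_line` and the sample points are assumptions for illustration, and the data is made up so that the exact answer is known in advance.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept b for the line y = m*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # m = (N*sum(xy) - sum(x)*sum(y)) / (N*sum(x^2) - (sum(x))^2)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # b = (sum(y) - m*sum(x)) / N
    b = (sum_y - m * sum_x) / n
    return m, b


# Made-up points lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
m, b = fit_line(xs, ys)
print(m, b)  # 2.0 1.0
```

The denominator is zero when all x values are identical, so a production version would need to guard against that case.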
  • 33. Data Visualization: All Graphs
  • 40. List of References: 1. http://www.kaggle.com 2. python data set 3. google.com 4. http://medicalnewstoday.com Plagiarism Report. Thank You.