5. Data Analysis & Report Writing: Data Analysis: Cleaning of Data, Editing, Coding, Tabular representation of data, frequency tables, Univariate analysis - Interpretation of Mean, Median, Mode; Standard deviation, Coefficient of Variation. Graphical Representation of Data: Appropriate Usage of Bar charts, Pie charts, Line charts, Histograms. Bivariate Analysis: Cross tabulations, Bivariate Correlation Analysis - meaning & types of correlation, Karl Pearson’s coefficient of correlation and Spearman’s rank correlation. Chi-square test including testing hypothesis of association, association of attributes. Linear Regression Analysis: Meaning of regression, Purpose and use, Linear regression; Interpretation of regression coefficient, Applications in business scenarios. Test of Significance: Small sample tests: t (Mean, proportion) and F tests, Z test. Non-parametric tests: Binomial test of proportion, Randomness test. Analysis of Variance: One way and two-way Classifications. Research Reports: Structure of Research report, Report writing and Presentation.
Unit -5 - Data Analysis & Report Writing.pptx
1. UNIT V - Data Analysis & Report Writing
Dr. Prachi Murkute
2. Data Analysis & Report Writing
• Data Analysis: Cleaning of Data, Editing, Coding, Tabular representation of data, frequency tables, Univariate
analysis - Interpretation of Mean, Median, Mode; Standard deviation, Coefficient of Variation.
• Graphical Representation of Data: Appropriate Usage of Bar charts, Pie charts, Line charts, Histograms.
• Bivariate Analysis: Cross tabulations, Bivariate Correlation Analysis - meaning & types of correlation, Karl
Pearson’s coefficient of correlation and Spearman’s rank correlation. Chi-square test including testing
hypothesis of association, association of attributes.
• Linear Regression Analysis: Meaning of regression, Purpose and use, Linear regression; Interpretation of
regression coefficient, Applications in business scenarios.
• Test of Significance: Small sample tests: t (Mean, proportion) and F tests, Z test. Non-parametric tests:
Binomial test of proportion, Randomness test. Analysis of Variance: One way and two-way Classifications.
• Research Reports: Structure of Research report, Report writing and Presentation.
3/5/2023 Dr. Prachi Murkute 2
3. Cleaning of Data
• Data cleaning is the process of fixing or removing incorrect,
corrupted, incorrectly formatted, duplicate, or incomplete
data within a dataset. When combining multiple data sources,
there are many opportunities for data to be duplicated or
mislabeled.
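A minimal sketch of the cleaning steps described above, in Python. The records and field names (id, age, city) are hypothetical and used only for illustration; the rule of dropping records with a missing age is an assumption, not a general recommendation.

```python
# Hypothetical survey records; field names are illustrative only.
records = [
    {"id": 1, "age": "34", "city": "Pune"},
    {"id": 2, "age": "", "city": "Mumbai"},     # incomplete: age is missing
    {"id": 1, "age": "34", "city": "Pune"},     # duplicate of the first record
    {"id": 3, "age": " 41 ", "city": "pune "},  # incorrectly formatted
]

def clean(rows):
    """Fix formatting, drop incomplete rows, and remove duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        age = row["age"].strip()
        if not age:
            continue  # assumption: a record without an age is unusable
        fixed = {
            "id": row["id"],
            "age": int(age),
            "city": row["city"].strip().title(),
        }
        key = (fixed["id"], fixed["age"], fixed["city"])
        if key in seen:
            continue  # skip exact duplicates
        seen.add(key)
        cleaned.append(fixed)
    return cleaned

cleaned = clean(records)
for row in cleaned:
    print(row)
```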
4. Editing
• Data editing is the process of "improving" collected survey
data. The improvement involves finding erroneous data and
then correcting it. Errors may have happened along the way
from the respondent to the survey organization's data files for
various reasons, intended or unintended.
5. Coding
• Coding is a qualitative data analysis strategy in which some
aspect of the data is assigned a descriptive label that allows the
researcher to identify related content across the data. How you
decide to code your data, or whether to code it at all, should be
driven by your methodology.
6. Tabular representation of data
• In tabular representation of data, the given data set is
presented in rows and columns. A table arranges a large amount
of data in an organised, easy-to-read form.
7. Frequency tables
• Frequency refers to the number of times an event or a value
occurs. A frequency table is a table that lists items and shows
the number of times each item occurs.
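As an illustration, a frequency table can be built with Python's collections.Counter. The list of responses below is made up for the example.

```python
from collections import Counter

# Hypothetical answers to a single survey question.
responses = ["Yes", "No", "Yes", "Yes", "Undecided", "No"]

# Counter maps each distinct item to the number of times it occurs.
freq = Counter(responses)

# Print the table, most frequent item first.
for item, count in freq.most_common():
    print(f"{item:<10} {count}")
```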
8. Mean
For example, the mean of 2, 6, 4, 5, 8 is:
Mean = (2 + 6 + 4 + 5 + 8) / 5 = 25/5 = 5.
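The same calculation can be checked in Python, either by hand or with the standard statistics module. This is only a small sketch using the five values from the example.

```python
from statistics import mean

values = [2, 6, 4, 5, 8]

# Mean = sum of the values divided by how many values there are.
by_hand = sum(values) / len(values)
by_library = mean(values)

print(by_hand, by_library)
```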
9. Median
• Median, in statistics, is the middle value of the given list
of data when arranged in order.
• When the list has an even number of values, there are two
middle values, and the median is their mean. For example, if the
two middle values are 4 and 7, the median is (4 + 7) / 2 = 5.5.
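A short Python sketch of both cases, using the standard statistics.median function. The two lists are made up; the second is chosen so that its middle values are 4 and 7, as in the example.

```python
from statistics import median

odd_list = [3, 1, 9]        # sorted: 1, 3, 9 -> one middle value
even_list = [10, 4, 7, 2]   # sorted: 2, 4, 7, 10 -> middle values 4 and 7

median_odd = median(odd_list)
median_even = median(even_list)  # mean of the two middle values

print(median_odd, median_even)
```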
12. Standard deviation
• A standard deviation (or
σ) is a measure of how
dispersed the data is in
relation to the mean.
Low standard deviation
means data are clustered
around the mean, and
high standard deviation
indicates data are more
spread out.
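To make the contrast concrete, the sketch below compares two made-up data sets that share the same mean (10) but differ in spread. It uses statistics.pstdev, the population standard deviation; for a sample, statistics.stdev (which divides by n - 1) would be the usual choice.

```python
from statistics import mean, pstdev

clustered = [9, 10, 10, 11]   # values close to the mean
spread = [2, 6, 14, 18]       # values far from the mean

# Both data sets have mean 10, but very different dispersion.
sd_clustered = pstdev(clustered)   # about 0.71
sd_spread = pstdev(spread)         # about 6.32

print(mean(clustered), sd_clustered)
print(mean(spread), sd_spread)
```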
13. Coefficient of Variation
Coefficient of variation is a relative measure of dispersion
that is used to compare the variability of data. It is expressed
as the ratio of the standard deviation to the mean, often stated
as a percentage.
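A minimal sketch, assuming the usual definition CV = (standard deviation / mean) x 100. The function name and both data sets are hypothetical, and the population standard deviation is used for simplicity.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Return the CV as a percentage: (standard deviation / mean) * 100."""
    return pstdev(values) / mean(values) * 100

# Two made-up data sets with the same mean but different spread.
clustered = [9, 10, 10, 11]
spread = [2, 6, 14, 18]

cv_clustered = coefficient_of_variation(clustered)
cv_spread = coefficient_of_variation(spread)

print(round(cv_clustered, 2), round(cv_spread, 2))
```

Because the CV is a ratio, it can be used to compare variability between data sets measured in different units or with very different means.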
14. Graphical Representation of Data: Appropriate Usage of Bar charts
• Bar charts should be used when
you are showing segments of
information. Vertical bar charts
are useful to compare different
categorical or discrete variables,
such as age groups, classes,
schools, etc., as long as there are
not too many categories to
compare. They are also very
useful for time series data.
15. Pie charts, Line charts, Histograms
16. Bivariate Analysis
• Cross tabulations, Bivariate Correlation Analysis - meaning & types of
correlation, Karl Pearson’s coefficient of correlation and Spearman’s
rank correlation. Chi-square test including testing hypothesis of
association, association of attributes.
17. Cross tabulations
• Cross tabulations are data
tables that display not
only the results of the
entire group of
respondents, but also
the results from
specifically defined
subgroups.
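A small sketch of a cross tabulation in plain Python. The respondents, the subgroup variable (gender) and the answer categories are all hypothetical.

```python
from collections import Counter

# Hypothetical (subgroup, answer) pairs, one per respondent.
responses = [
    ("Male", "Yes"), ("Male", "No"), ("Female", "Yes"),
    ("Female", "Yes"), ("Male", "Yes"), ("Female", "No"),
]

counts = Counter(responses)
groups = sorted({g for g, _ in responses})
answers = sorted({a for _, a in responses})

# One row per subgroup, one column per answer.
table = {g: {a: counts[(g, a)] for a in answers} for g in groups}
# Results for the entire group of respondents.
totals = {a: sum(table[g][a] for g in groups) for a in answers}

for g in groups:
    print(g, table[g])
print("All", totals)
```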
18. Bivariate Correlation Analysis - meaning &
types of correlation
• A bivariate correlation analyzes whether and how two
variables covary linearly, that is, whether the values of one
variable change in a linear fashion as the values of the other
change.
19. Types of correlation
• There are three types of correlation:
1. Positive and negative correlation.
2. Linear and non-linear correlation.
3. Simple, multiple, and partial correlation.
20. Positive and negative correlation
• For example, when
two stocks move in
the same direction,
the correlation
coefficient is positive.
Conversely, when two
stocks move in
opposite directions, the
correlation coefficient is
negative.
21. Linear and non-linear correlation
• Linear correlation exists when the ratio of change between
two given variables is constant.
• Example: every time income increases by 20%, expenditure
rises by 5%.
• Non-linear correlation exists when the ratio of change
between two given variables varies.
23. Karl Pearson’s coefficient of correlation and Spearman’s rank correlation
• Karl Pearson’s coefficient of correlation is a linear
correlation coefficient that takes values in the range -1 to +1.
A value of -1 signifies perfect negative correlation, +1
indicates perfect positive correlation, and 0 indicates no linear
correlation.
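A minimal sketch of the coefficient computed from its textbook formula, r = Σ(x - x̄)(y - ȳ) / √(Σ(x - x̄)² · Σ(y - ȳ)²). The function name and the data are illustrative; the two y series are chosen so that the result hits both ends of the range.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]     # rises exactly in step with x
y_down = [10, 8, 6, 4, 2]   # falls exactly in step with x

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(r_up, r_down)
```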
24. Spearman’s rank correlation
• Spearman's rank
correlation measures the strength
and direction of association
between two ranked variables. It
basically gives the measure of
monotonicity of the relation between
two variables i.e. how well the
relationship between two variables
could be represented using a
monotonic function.
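A sketch of the rank correlation using the shortcut formula ρ = 1 - 6Σd² / (n(n² - 1)), where d is the difference between the two ranks of each observation. This form assumes there are no tied values; the helper names and data are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward. Assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman's rank correlation via the no-ties shortcut formula."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

x = [10, 20, 30, 40, 50]
y_cubic = [1, 8, 27, 64, 125]     # non-linear but always increasing
y_reversed = [5, 4, 3, 2, 1]      # always decreasing

rho_cubic = spearman_rho(x, y_cubic)
rho_reversed = spearman_rho(x, y_reversed)
print(rho_cubic, rho_reversed)
```

The cubic series is not a straight line, yet its rank correlation is +1 because the relationship is perfectly monotonic.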
25. Chi-square test
• A chi-squared test (written χ²) is a test for categorical data
that compares observed frequencies with the frequencies expected
under a hypothesis. The test was introduced by Karl Pearson in
1900 for categorical data analysis, so it is also known
as Pearson’s chi-squared test.
• The chi-square test is used to estimate how likely the
observed data would be if the null hypothesis were true. In a
test of association, the null hypothesis is that the two
attributes are independent.
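A minimal sketch of the test statistic for a test of association, χ² = Σ (O - E)² / E, where each expected count E is (row total x column total) / grand total. The 2 x 2 table of counts is made up for the example.

```python
def chi_square(observed):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (obs - expected) ** 2 / expected
    return statistic

# Hypothetical counts: rows are two groups, columns are two attributes.
observed = [[30, 20],
            [20, 30]]

chi2 = chi_square(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # degrees of freedom
print(chi2, df)
```

Here every expected count is 25, so the statistic is 4.0 with 1 degree of freedom. That exceeds the 5% critical value of 3.841, so the hypothesis of no association would be rejected at that level.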
26. Linear Regression Analysis:
• Meaning of regression: Regression is a statistical technique
that relates a dependent variable to one or more
independent (explanatory) variables. A regression model is
able to show whether changes observed in the dependent
variable are associated with changes in one or more of the
explanatory variables.
• Purpose and use: Regression allows researchers to predict or
explain the variation in one variable based on another
variable.
• Definition: The variable that researchers are trying to
explain or predict is called the response variable. It is also
called the dependent variable because it depends on another
variable.
27. Linear regression; Interpretation of regression coefficient
• A positive coefficient indicates
that as the value of the
independent variable increases,
the mean of the dependent
variable also tends to increase.
A negative coefficient suggests
that as the independent variable
increases, the dependent variable
tends to decrease.
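To illustrate the interpretation, here is a sketch of simple linear regression by least squares, with slope b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and intercept a = ȳ - b·x̄. The business scenario (advertising spend and sales) and all figures are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend (x) and sales (y), same units.
ad_spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]

slope, intercept = fit_line(ad_spend, sales)
print(slope, intercept)
```

The slope here is positive (2), which reads as: each extra unit of advertising spend is associated with an average increase of 2 units in sales.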
29. Small sample tests: t (Mean, proportion)
• If the sample size is less than 30, i.e., n < 30, the
sample may be regarded as a small sample. The test used for
small samples is popularly known as the t-test, and it is based
on Student’s t-distribution. For a test of the mean, take the
null hypothesis that there is no significant difference
between the sample mean and the population mean.
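A minimal sketch of the one-sample t statistic, t = (x̄ - μ₀) / (s / √n), with n - 1 degrees of freedom, where s is the sample standard deviation. The sample values and the hypothesised population mean of 48 are made up.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev divides by n - 1
    t = (mean(sample) - mu0) / standard_error
    return t, n - 1

sample = [48, 52, 50, 49, 51]   # hypothetical small sample (n < 30)
t_stat, df = one_sample_t(sample, mu0=48)
print(round(t_stat, 3), df)
```

The statistic is then compared with the critical value from the t-table for the chosen significance level and n - 1 degrees of freedom.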
30. F tests
• An F-test is any statistical test in
which the test statistic has an F-
distribution under the null
hypothesis. It is most often used
when comparing
31. Non-parametric tests: Binomial test of
proportion, Randomness test.
• Binomial test of proportion -
A binomial test uses
sample data to determine
if the population
proportion of one level in
a binary (or dichotomous)
variable equals a specific
claimed value.
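A sketch of an exact one-sided binomial test in Python. It computes the probability of seeing at least k successes in n trials if the claimed proportion p0 were true. The function name and the figures (8 successes out of 10, claimed proportion 0.5) are hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, p0):
    """P(X >= k) for X ~ Binomial(n, p0): a one-sided exact p-value."""
    return sum(
        comb(n, i) * p0 ** i * (1 - p0) ** (n - i)
        for i in range(k, n + 1)
    )

# Hypothetical sample: 8 of 10 respondents prefer brand A; claim is p0 = 0.5.
p_value = binomial_upper_tail(8, 10, 0.5)
print(p_value)
```

A small p-value suggests that the population proportion is larger than the claimed value; a two-sided test would also count equally extreme outcomes in the lower tail.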
32. Randomness Test
• To carry out the run test of randomness, first set up the
null and alternative hypotheses. In the one-sample run test,
the null hypothesis is that the observations occur in random
order. In the two-sample form of the run test, the null
hypothesis is that the distributions of the two continuous
populations are the same. The alternative hypothesis is the
opposite of the null hypothesis.
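A small sketch of the first step of a one-sample run test: counting the runs in a two-valued sequence and comparing the count with the number expected under randomness, 2·n1·n2 / (n1 + n2) + 1. The sequence of H and T outcomes is made up, and the sketch stops short of computing a p-value.

```python
from itertools import groupby

def count_runs(sequence):
    """Number of runs: maximal blocks of identical consecutive symbols."""
    return sum(1 for _ in groupby(sequence))

def expected_runs(n1, n2):
    """Expected number of runs if the order of the symbols is random."""
    return 2 * n1 * n2 / (n1 + n2) + 1

sequence = "HHTTHTHH"           # runs: HH, TT, H, T, HH
n1 = sequence.count("H")
n2 = sequence.count("T")

observed = count_runs(sequence)
expected = expected_runs(n1, n2)
print(observed, expected)
```

Far too few or far too many runs relative to the expected number is evidence against the null hypothesis of randomness.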
34. Research Reports: Structure of Research
report, Report writing and Presentation
• A research report is a well-
crafted document that outlines
the processes, data, and
findings of a systematic
investigation.
• It is an important document that
serves as a first-hand account
of the research process, and it
is typically considered as an
objective and accurate source
of information.
• A complete research paper reporting on experimental
research will typically contain a Title page, Abstract,
Introduction, Methods, Results, Discussion, and
References sections. Many will also contain Figures and
Tables, and some will have an Appendix or Appendices.