Research Methodology and Biostatistics, Moptom.pptx

Research Methodology and Biostatistics
Dipsikha Aryal
MPH 2nd year
BP Koirala Institute of Health Science
Dharan, Nepal

Contents
• Formulation of research objectives
• Variables
• Types of research design
• Data collection tools and technique
• Probability
• Central tendency
• Sample and sampling
• Sensitivity, specificity, PPV and NPV
• Health, illness and disease

Formulation of research objectives
• What are objectives?
 The objectives of a research study summarize what is to be achieved by the study.
 Closely related to the statement of problem

Types of objectives
• General
• Specific

General objective
• What the researcher expects to achieve by the study in general terms
• Broader perspective
Specific objectives
• Measurable statements on the specific questions to be answered
• Smaller, sequential, logically connected parts of the general objective

Criteria for formulating objectives (SMART)
 S- Specific
 M- Measurable
 A- Achievable
 R- Relevant
 T- Time bound

Variables
• A variable is a well-defined characteristic whose value can differ from one individual or observation to another.
Types
1) Qualitative variable
- Nominal
- Ordinal
2) Quantitative variable
- Discrete
- Continuous

Variables based on causal relationship
• Independent variables- Variables that are used to describe or measure the
factors that are assumed to cause or at least to influence the problem.
• Dependent variable- Variable that is used to describe or measure the problem
under study
• Confounding variable- An extraneous variable that is associated with both the problem and a
possible cause of the problem; it may either strengthen or weaken the apparent
relationship between the problem and the possible cause.

Example (diagram):
Cause (independent variable) → Effect (dependent variable)
Other factor (confounding variable), associated with both the cause and the effect

Research design
• A research design is the conceptual structure within which research is
conducted; it constitutes the blueprint for the collection, measurement and
analysis of data.
• Need ?

Types
 Qualitative research design
 Quantitative research design
Qualitative research design
 Case study
 Ethnographical study
 Historical research
 Ethno-methodological research

Quantitative research design
 Observational
 Descriptive research design- Case study, community diagnosis,
descriptive ecological study
 Analytical research design- Cross sectional, Ecological studies, case-
control studies, cohort studies

 Experimental research design
 Pre- experimental research design
 Quasi experimental research design
 True experimental research design

Cross-sectional studies
o Simplest form of observational study
o Prevalence study
o Based on single examination of cross section of a population at one given
time (snapshot)
o Researcher has no control over the exposure of interest
o Examples???

Case-control study
• Analytical study in which the characteristics of a group of individuals with the
disease (cases) are compared with those of a group without the disease (controls).
• Retrospective study
• It attempts to test a causal hypothesis
• Both exposure and outcome have occurred before the start of the study
• Study proceeds backwards from effect to cause

[Figure: design of a case-control study, drawn along a time axis]

Cohort study
• A cohort is defined as a group of people who share a common
characteristic or experience within a defined time period.
• Groups defined by exposure status are followed to assess the occurrence of disease
• Study proceeds forward from cause to effect
• Prospective/ forward-looking/ longitudinal study
• Incidence can be calculated

Interpretation of relative risk (RR)
• RR =1 ( No association)
• RR > 1 (Risk factor)
• RR < 1 ( Protective)
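
The interpretation above can be illustrated with a small calculation. This is a minimal sketch, assuming relative risk is the incidence of disease among the exposed divided by the incidence among the non-exposed; the function names and the cohort counts are hypothetical and are not taken from the slides.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Incidence among the exposed divided by incidence among the non-exposed."""
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    return incidence_exposed / incidence_unexposed


def interpret(rr):
    """Map an RR value onto the three categories listed above."""
    if rr > 1:
        return "risk factor"
    if rr < 1:
        return "protective"
    return "no association"


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 non-exposed develop the disease.
rr = relative_risk(30, 100, 10, 100)
print(round(rr, 2), interpret(rr))
```

In practice an RR is reported with a confidence interval, so a value close to 1 should not be read as an association on its own.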

Experimental studies
• Pre-experimental study
Manipulation, no control and no randomization
• Quasi experimental study
Manipulation, control but no randomization
• True experimental study
Manipulation, control and randomization

Tools and technique of data collection:
Data Collection Techniques Data collection tools
Observing, measuring Checklist
Interviewing Pen, watch, scales, microscope
Administering written questionnaire Interview guide, checklist,
questionnaire, tape recorder
Focus group discussion Questionnaire

Basic concept of probability
• Related terms: probable, likelihood, predictability
• Probability is familiar to health workers and is frequently encountered
• Probability is the chance that something will happen
• Defined as the proportion of times an event occurs in a very
large number of trials.

Concept
• Also defined as ratio of number of times a particular event occurs to the number
of trials
• n= Total number of possible outcomes
• m= the number of ways of achieving success
• p= probability
• Probability (p) = number of ways of achieving success / total number of
possible outcomes, i.e., p = m/n

The probability scale
• Probability of something which is certain to happen is 1.
• Probability of something which is impossible to happen is 0.
• Therefore, probability always lies between 0 and 1, i.e., 0 ≤ p ≤ 1.
• The sum of the probabilities of an event happening (p) and not happening (q) is 1.
• So, p + q=1
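
The ratio p = m/n and the rule p + q = 1 can be checked with a short calculation. This is a minimal sketch; the die example and the function name `probability` are assumptions chosen for illustration.

```python
from fractions import Fraction


def probability(successes, outcomes):
    """Classical probability p = m / n, kept as an exact fraction."""
    if outcomes <= 0 or not 0 <= successes <= outcomes:
        raise ValueError("need n > 0 and 0 <= m <= n")
    return Fraction(successes, outcomes)


# Hypothetical example: getting an even number on a fair six-sided die.
p = probability(3, 6)  # m = 3 favourable faces (2, 4, 6), n = 6 possible faces
q = 1 - p              # probability that the event does not happen
print(p, q, p + q)
```

Using `Fraction` avoids floating-point rounding, so the sum p + q is exactly 1.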

Terminologies
• Outcomes: All possible results of an experiment are known as outcomes. E.g., the possible
outcomes of tossing a coin are head and tail.
• Independent event: An event whose occurrence is not affected by the other events.
E.g., in a coin-tossing experiment, the heads obtained on the 1st, 2nd and 3rd tosses are
independent of one another.
• Dependent event: When the occurrence of one event influences the occurrence of another
event, the second event is said to be dependent.

Terminologies
• Mutually exclusive events: The happening of one event excludes the happening of the other
events.
E.g., a person cannot have both blood group A and blood group O.
• Non-mutually exclusive events: Events that can happen at the same time.
E.g., a patient with both DM and HTN
• Equally likely events: The likelihood of every event is the same. E.g., the birth of a male or a female
child is taken to have an equal chance.

Measure of central tendency

Central tendency
• Measures of central tendency are summary (descriptive) statistics used to
indicate the central location of data values.
• The most commonly used measures of central tendency are the mean, median and mode.

Types of central tendency
1) Mathematical average (mean)
- Arithmetic mean
- Geometric mean
- Harmonic mean
2) Positional average
- Median
- Mode
- Quartile
- Deciles

• Mean: The arithmetic mean, or simply the mean, is commonly known as the average.
• Mean (x̄) = Sum of observations / Number of observations
• Median: The value that divides the ordered observations into two equal parts. It is the 50th
percentile of a distribution.
• Median = value of the {(n + 1) ÷ 2}th observation when the data are arranged in order
• Mode: The most frequently occurring value in a data set.
• A distribution may be unimodal, bimodal or multimodal.
• Mode = 3 Median − 2 Mean (empirical relation for moderately skewed distributions)
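
The three measures can be computed with Python's standard `statistics` module. This is a minimal sketch on a hypothetical data set; `multimode` (Python 3.8 or later) is used because a data set may have more than one mode.

```python
import statistics

# Hypothetical data set of 7 observations, already in ascending order
values = [2, 3, 3, 4, 5, 7, 11]

mean = statistics.mean(values)        # sum of observations / number of observations
median = statistics.median(values)    # the {(n + 1) / 2}th = 4th ordered value
modes = statistics.multimode(values)  # a list, since data may be bimodal or multimodal

print(mean, median, modes)
```

For an even number of observations, `statistics.median` returns the average of the two middle values.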

Sample and sampling procedure

Sampling
• Sampling is that part of statistical practice concerned with the selection of
individual observations intended to yield knowledge about population of
concern.
• A daily life example is that of cooking rice. A housewife just picks up a few
grains of rice from the cooking vessel and gets a fairly good idea whether
the entire lot of rice is fully cooked or it requires more cooking.

• A small group chosen from the population for study is called a sample.
• A value calculated from a defined population, such as the mean (μ), standard
deviation (σ) or proportion (P), is called a parameter.
• A value calculated from a sample, such as the mean (x̄),
standard deviation (s) or proportion (p), is called a statistic.

Strategies for determining sample size
i) Census for small population
ii) Using a sample size of a similar study
iii) Using published tables
iv) Using a formula to calculate sample size
- For quantitative data (numerical data)
- For qualitative data (categorical data)

For qualitative data
 To estimate the proportion of qualitative characteristics. For eg: Prevalence
of TB, proportion of eligible couples using contraceptives.
 The tools required are proportion (p), maximum allowable error (d) and
standard normal variate (Z).

The sample size n (for an infinite population) is given by
$$ n = \frac{z^2 p q}{d^2} $$
where n = required sample size
p = expected prevalence or proportion
q = 1 - p
z = standard normal variate (z = 1.64 for 90% CI, z = 1.96 for 95% CI, z = 2.58 for 99% CI)
d = allowable error

For a finite population, the sample size n is given by
$$ n = \frac{n_0}{1 + n_0 / N} $$
where n = sample size
N = population size
and n_0 is the sample size for an infinite population,
$$ n_0 = \frac{z^2 p q}{d^2} $$

1) A survey is planned to determine what proportion of higher
secondary students have abused drugs. If the prevalence p is
not available from previous studies and a pilot sample cannot be
drawn, what sample size would be required with a 99% CI and an
allowable error of 0.04?
Here,
If p is not given, we assume it to be 50%, i.e., p = 0.5, q = 0.5
Z = 2.58 for 99% CI
Allowable error (d) = 0.04
Now,
$$ n = \frac{z^2 p q}{d^2} = \frac{(2.58)^2 \times 0.5 \times 0.5}{(0.04)^2} = 1040.06 \approx 1040 $$
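
The arithmetic of this worked example can be reproduced in a few lines. This is a minimal sketch; the function names are hypothetical, and the population size of 5000 used for the finite-population correction is an assumed value that does not appear in the example.

```python
def sample_size_proportion(z, p, d):
    """n = z^2 * p * q / d^2 for an infinite population, with q = 1 - p."""
    q = 1 - p
    return (z ** 2) * p * q / (d ** 2)


def finite_population_correction(n0, population_size):
    """n = n0 / (1 + n0 / N) for a finite population of size N."""
    return n0 / (1 + n0 / population_size)


# Values from the worked example: 99% CI (z = 2.58), p = 0.5, d = 0.04
n0 = sample_size_proportion(z=2.58, p=0.5, d=0.04)
print(round(n0, 2))

# Assumed population of 5000 students, only to show the correction step
n_finite = finite_population_correction(n0, 5000)
print(round(n_finite, 2))
```

The functions return unrounded values so that the rounding convention stays a separate, visible decision.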

For quantitative data
• To estimate mean value of quantitative characteristics such as mean
cholesterol level, the mean age of teenage pregnancy, etc in a population.
• The tools required are allowable error(d), population standard deviation(σ)
and standard normal variate (Z)

The sample size n (for an infinite population) is given by
$$ n = \frac{Z^2 \sigma^2}{d^2} $$
where n = sample size
Z = statistic for the level of confidence (for 95% CI, Z is 1.96)
σ = population standard deviation
d = allowable error

1) A health officer wishes to estimate the mean hemoglobin level in a defined community.
Preliminary information is that the mean is about 150mg/dl with an SD of 30mg/dl. If a sampling
error of up to 5mg/dl in the estimate is to be tolerated, how many subjects should be included in
the study at 95% CI?
If the community to be sampled has 1000 people, what should the sample size be at 95%
CI?
Here,
Mean = 150mg/dl
SD for sample (s) = 30mg/dl
Maximum allowable error (d) = 5mg/dl
Standard normal variate (Z) = 1.96 at 95% CI
Now,
$$ n = \frac{Z^2 s^2}{d^2} = \frac{(1.96)^2 \times (30)^2}{(5)^2} = 138.30 \approx 139 \text{ (rounded up)} $$
If N = 1000 and n_0 = 139,
$$ n = \frac{n_0}{1 + n_0 / N} = \frac{139}{1 + 139/1000} = 122.04 \approx 122 $$
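
The two steps of this example, the infinite-population formula followed by the finite-population correction, can be checked as follows. This is a minimal sketch with hypothetical function names; rounding n0 up before applying the correction mirrors the worked example and is a convention, not a rule stated in the slides.

```python
import math


def sample_size_mean(z, sd, d):
    """n = Z^2 * s^2 / d^2 for estimating a mean in an infinite population."""
    return (z ** 2) * (sd ** 2) / (d ** 2)


def finite_population_correction(n0, population_size):
    """n = n0 / (1 + n0 / N) for a finite population of size N."""
    return n0 / (1 + n0 / population_size)


n0 = sample_size_mean(z=1.96, sd=30, d=5)           # about 138.30
n0_rounded = math.ceil(n0)                          # rounded up to a whole subject
n = finite_population_correction(n0_rounded, 1000)  # community of 1000 people

print(round(n0, 2), n0_rounded, round(n, 2))
```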

Sampling Techniques/Methods
1. Probability Sampling Technique
2. Non-Probability Sampling Technique

Probability Sampling Technique
1. Simple Random Sampling
2. Stratified Random Sampling
3. Systematic Random Sampling
4. Cluster Random Sampling
5. Multistage Random Sampling
6. Probability Proportional to Size (PPS) Sampling

Non Probability Sampling Technique
1. Purposive/Judgmental Sampling
2. Convenience Sampling
3. Snowball sampling
4. Quota sampling

Simple Random Sampling
Simple random sampling is the most basic of the probability sampling techniques.
- All members have an equal chance (probability) of being selected.
- Free from sampling bias
- This method is applicable when the population is small, homogeneous and readily available
(a complete list of the population, i.e., a sampling frame, must be available).
- A table of random numbers or the lottery method is used to select the sample.
Example: We wish to draw a sample
of 4 students from a population of 14 students.
Place all 14 students' names in a container and
draw out 4 names one by one.
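
The lottery draw in this example can be imitated with Python's standard `random` module. This is a minimal sketch; the student labels and the fixed seed are assumptions, and the seed is there only so that the same draw can be repeated.

```python
import random

# Hypothetical sampling frame: a complete list of the 14 students
students = [f"Student {i}" for i in range(1, 15)]

rng = random.Random(42)           # fixed seed so the draw is reproducible
sample = rng.sample(students, 4)  # 4 names drawn without replacement

print(sample)
```

`sample` draws without replacement, which matches pulling names out of a container one by one.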

Stratified Random Sampling
• This sampling scheme is applicable when the population is not homogeneous.
• The population is divided into homogeneous subsets, or strata, defined by
predetermined criteria (such as geographical location, demographic characteristics,
economic factors, sociological factors, etc.).
• A stratified random sample is one obtained by separating the population elements
into non-overlapping groups called strata and then selecting a simple random
sample from each stratum.
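
The two steps described above, separating the population into strata and then drawing a simple random sample from each stratum, can be sketched as follows. The function name, the urban/rural population and the equal allocation of 3 units per stratum are all assumptions made for illustration; the slides do not specify how many units to take from each stratum.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Group units into strata, then draw a simple random sample from each one."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    return {
        name: rng.sample(members, min(n_per_stratum, len(members)))
        for name, members in strata.items()
    }


# Hypothetical population of 30 people, each tagged with an area of residence
population = [(i, "urban" if i % 3 else "rural") for i in range(1, 31)]

sample = stratified_sample(population, stratum_of=lambda u: u[1], n_per_stratum=3, seed=1)
for stratum, members in sample.items():
    print(stratum, members)
```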

Systematic Random Sampling
• It is more often applied to field studies when the population is large, scattered and not
homogeneous.
• It is used when a complete and up-to-date list of the sampling units is available, from which
the sample is drawn.
• Systematic sampling relies on arranging the target population according to some
ordering scheme and then selecting elements at regular intervals through that ordered
list.
• Sampling interval (K) = Total population / Desired sample size = N/n
• Example: N = 15, n = 5; K = 15/5 = 3
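
The N = 15, n = 5 example can be sketched in code. This is a minimal illustration; picking a random starting point among the first K units is a common convention and is assumed here, since the slides only give the interval.

```python
import random


def systematic_sample(units, n, seed=None):
    """Select every K-th unit from an ordered list, with K = N // n."""
    k = len(units) // n                       # sampling interval K = N / n
    start = random.Random(seed).randrange(k)  # assumed: random start within the first K units
    return units[start::k][:n]


units = list(range(1, 16))  # ordered list of N = 15 sampling units
sample = systematic_sample(units, n=5, seed=0)
print(sample)
```

Whatever the starting point, consecutive selected units are exactly K = 3 positions apart.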

Cluster Sampling
• A cluster sample is obtained by selecting clusters from the population
on the basis of simple random sampling
• Naturally occurring groups are clusters
• This method is used when the units of the population form natural groups or
clusters, such as villages, wards, blocks or school classes; every unit in a
selected cluster is included in the sample.

Probability Proportional to Size (PPS) Sampling
• Probability proportional to size sampling is used in survey research when the
sampling units vary in size.
• The sample is drawn from the different clusters in proportion to each cluster's
share of the total population.
• For example, if a sample of 3000 is to be taken from three wards of a municipality and
the populations of wards no. 1, 2 and 3 are 6000, 5000 and 4000 (total 15000), the sample sizes will be
3000 × 6000/15000 = 1200, 3000 × 5000/15000 = 1000 and 3000 × 4000/15000 = 800, respectively.
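
The allocation for the three wards follows from each ward's share of the total population. This is a minimal sketch, assuming the sample is split in direct proportion to ward population; the function name is hypothetical.

```python
def proportional_allocation(total_sample, populations):
    """Split a total sample across clusters in proportion to their population."""
    total_population = sum(populations.values())
    return {
        name: total_sample * size / total_population
        for name, size in populations.items()
    }


# Ward populations from the example; total sample size of 3000
wards = {"Ward 1": 6000, "Ward 2": 5000, "Ward 3": 4000}
allocation = proportional_allocation(3000, wards)

for ward, size in allocation.items():
    print(ward, size)
```

With other figures the shares may not be whole numbers, in which case a rounding rule has to be chosen so that the parts still add up to the total.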

Non-Probability Sampling
• The elements of the population do not have a known, equal chance of being
selected.
• If a sampling frame is not available, a non-probability sampling technique is
used.
• A sample drawn with this technique may or may not represent the population well.
• Within the probability sampling we are using in between.

Correlation: introduction
• Correlation is a measure of the relationship between two quantitative variables,
such as weight and cholesterol, or weight and height.
• It is used to denote association between the variables.
• The extent or degree of relationship between two
continuous variables is measured by the correlation coefficient.
• It is denoted by the letter 'r'.
• The correlation coefficient ranges between minus one (-1) and plus one (+1), i.e.,
-1 ≤ r ≤ +1.

Explanation
1. A zero correlation indicates that there is no linear relationship between the
variables.
2. A correlation of -1 shows a perfect negative correlation.
3. A correlation of +1 indicates a perfect positive correlation
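
The two extreme cases can be demonstrated numerically. This is a minimal sketch, assuming r refers to Pearson's product-moment correlation coefficient, which the slides do not name explicitly; the height and weight values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariation of x and y scaled by their spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


height = [150, 155, 160, 165, 170]  # hypothetical heights in cm
weight = [50, 54, 58, 62, 66]       # rises exactly linearly with height

print(pearson_r(height, weight))        # perfect positive correlation
print(pearson_r(height, weight[::-1]))  # reversed order: perfect negative correlation
```

Real data rarely lie exactly on a line, so observed values of r normally fall strictly between -1 and +1.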

Presentation of data
Quantitative data
 Tabular form: Inclusive, exclusive table
 Graphical: Line chart, histogram
 Numerical: Measure of central tendency, Measure of dispersion,
correlation

Qualitative data
 Tabular: Simple, complex, summary table
 Graphical: Pie chart, bar diagram
 Numerical: Proportion, percentage, Rate and ratio

Sensitivity, specificity, PPV, NPV

Validity of a screening test

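Evaluation of a screening test
In the standard 2 × 2 table, a = true positives, b = false positives, c = false negatives and d = true negatives.
• Sensitivity = probability of a positive test in a person with the disease
= a / (a + c) × 100
• Specificity = probability of a negative test in a person without the disease
= d / (b + d) × 100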

• Positive predictive value
= probability that a person has the disease when the test is
positive
= a / (a + b) × 100
• Negative predictive value
= probability that a person does not have the disease when the test is
negative
= d / (c + d) × 100

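Example: screening for glaucoma
True cases of glaucoma: 100, of whom 50 test positive; people without glaucoma: 2000, of whom 1900 test negative
Sensitivity = 50/100 = 50%
Specificity = 1900/2000 = 95%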
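
All four measures can be computed from the cells of the 2 × 2 table. This is a minimal sketch, assuming the usual layout (a = true positives, b = false positives, c = false negatives, d = true negatives); the cell counts below are inferred from the fractions 50/100 and 1900/2000 in the glaucoma example, and the function name is hypothetical.

```python
def screening_metrics(a, b, c, d):
    """Return sensitivity, specificity, PPV and NPV as percentages.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    return {
        "sensitivity": 100 * a / (a + c),
        "specificity": 100 * d / (b + d),
        "ppv": 100 * a / (a + b),
        "npv": 100 * d / (c + d),
    }


# Inferred from the example: 100 true cases (50 detected), 2000 non-cases (1900 test negative)
metrics = screening_metrics(a=50, b=100, c=50, d=1900)
for name, value in metrics.items():
    print(name, round(value, 2))
```

With these counts the positive predictive value is far lower than the specificity, because false positives among the many non-cases outnumber the true positives.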

Concept
• Disease is a physiological or psychological dysfunction: literally "without ease", i.e.,
something is wrong in bodily function
• Illness is the individual's perception of, and behavior in response to, the
disease; the person feels aware of not being well
• Sickness is the state of social dysfunction, i.e., a role that the individual
assumes when ill.

• Disease elimination: the interruption of transmission of a
disease, e.g., elimination of measles, polio and diphtheria from large
geographical areas.
• Disease eradication: the cessation of infection and disease from the
whole world, e.g., smallpox

Population Medicine
• Population medicine is referred to as hygiene, public health, preventive
medicine, social medicine or community medicine.
• Hygiene: is the science of health and embraces all factors which
contribute to healthful living.

• Public Health: Is the science and art of preventing diseases, prolonging
life and promoting health and efficiency through organized community
efforts.
• Community Health: All the personal health and environmental services in
any human community, irrespective of whether such services were public
or private ones.

• Community Medicine: is the science and art of preventing disease, prolonging life and
promoting health and efficiency of a community through their active and continued
participation.
• Preventive Medicine: is the science and art of preventing disease, prolonging life and
promoting health and efficiency of groups of individuals, and individuals within these
groups, through interception of disease process.
• Social Medicine: is the science and art of preventing disease, prolonging life and
promoting health and efficiency of populations by intercepting social factors that have a
direct or indirect relationship with the diseases.

MCQs…
1) Measurement of blood pressure is which type of data:
 Nominal
Ordinal
Interval
 Continuous

2) Which can have more than one value?
 Mean
 Median
 Mode
 Any of the above

3) For an epidemiological study, every 10th person is selected from a population.
This type of sampling is known as:
 Simple random sampling
 Stratified random sampling
 Systematic random sampling
 Cluster random sampling

4) The appropriate statistical method to compare two proportions is:
 Chi-square test
 Student’s t-test
 Odds ratio
 Correlation coefficient

5) The analytical study where population is the unit of study:
 Cross- sectional
 Ecological
 Case-control
 Cohort

6) Incidence rate is calculated from:
 Case-control study
 Prospective study
 Retrospective study
 RCT

7) The ratio between the incidence of disease among exposed and non-
exposed is called:
 Causal risk
 Relative risk
 Attributable risk
 Odds ratio
