Research methodology & Biostatistics


* Few jewels from ocean
Dr. Kusum Gaur
Professor, PSM
WHO Fellow IEC

Definition of Research

“Research is a systematized effort to gain new knowledge.”

12/08/2012 Dr. Kusum Gaur 2

Steps in Research (Holy 11)
1. Collect review of literature/Situation Analysis
2. Identify and prioritize health problems
3. Decide aims & objectives
4. Planning Methodology
5. Execution
6. Compilation, Classification & Presentation of data
7. Analysis
8. Test of Significance/Test of Hypothesis
9. Inferences
10. Report Writing
11. Dissemination of Report


[Figure: the research cycle in three phases]
Planning: 1. Define Research Problem → 2. Review of Literature → 3. Methodology → 4. Data Collection for Pretest
Execution: 5. Data Collection
Process of Concluding: 6. Analysis → 7. Inferences → 8. Reporting


STEP-1

DEFINITION
OF THE
RESEARCH PROBLEM


RESEARCH PROBLEM ?

Research Problem refers to some difficulty
which a researcher experiences and
wants to obtain a solution for the same.

i.e. a question or issue to be examined.


Process of Defining Problem

Analysis of the Situation

Identify & Prioritize Problems

Select & Define Problem

Statement of
Research Objectives


CRITERIA OF SELECTION
The selection of one appropriate researchable
problem out of the identified problems requires
evaluation of certain criteria.

* Internal / Personal criteria – Researcher's side

* External Criteria – Problem side factors


INTERNAL CRITERIA OF SELECTION

• Researcher's Interest
• Researcher's Competence
• Researcher's own Resources:
  – Human Resource
  – Money
  – Material
  – Time


EXTERNAL CRITERIA OF SELECTION

 Researchability of the problem,
 Importance and Urgency,
 Novelty of the Problem,
 Feasibility,
 Facilities,
 Social Relevance
 Public health Importance


DEFINE RESEARCH PROBLEM
(Title of the Research Topic)

 Transforming the selected research problem into a
scientifically researchable statement.

 Problem definition or Problem statement should be
clear, precise, self-explanatory and include:-

 What
 How
 When
 Where


RESEARCH OBJECTIVES
(Objectives)
• Research objectives are statements of the questions to be investigated, with the goal of answering the overall research problem.
• Research objectives should be clear and achievable.
• Generally, they are written as statements using the word “to” (for example, 'to discover ...', 'to determine ...', 'to establish ...', 'to find out ...', 'to assess ...').
• Each objective should lead to an inference at the end of the study.

Hypothetical Research Question
• Problem: PCR of Diabetes Mellitus (DM) has been increasing very fast during the last five years
• Mission: Reduce the incidence of DM
• Belief: Meditation is good for reducing stress, which is an important precursor of DM
• Hypothesis: H - Meditation decreases the risk of DM

Association of Meditation with Diabetes Mellitus

Aim: To Study the association of Meditation with
Diabetes Mellitus in patients attending at Medical
OPD of SMS Hospital, Jaipur (Raj) India.

Objectives:
1. To assess and compare the proportion of DM
cases in individuals doing regular meditation and
not doing meditation.

2. To find out the risk ratio of DM in individuals not doing meditation relative to those doing regular meditation.

STEP-2

REVIEW
OF
LITERATURE


Review of literature

What ?

Why ?

Where ?


What ?
REVIEW OF LITERATURE

Literature Review is the documentation

of a published and unpublished work

from secondary sources of data

in the areas of specific interest to the researcher.


Why ? - PURPOSE OF REVIEW
• To find out which problems have already been investigated and which need further investigation.
• To formulate a researchable hypothesis.
• To gain background knowledge.
• To identify data sources.
• To learn how others structured their reports.

Where ?
SOURCES OF LITERATURE
 Books and Journals
 Databases
Bibliographic Databases
Abstract Databases
Full-Text Databases
 Govt. and NGO Records & Reports
 Internet
 On line journals: ww.articalbase.com …….
 E. Databases – Popline, Medline …….
 Research Dissertations / Thesis


Methodology
• Study Area: Location of study (hospital, community, etc.)
• Study Period: Start to end of study (the maximum period available for the study should be defined)
• *Selection of Study Design
• *Selection of Study Population
• Prerequisites of study: study tools, terminologies, orientation trainings, etc.

*will be taken separately

Methodology……

• Study Tools for data collection: subjects, proforma,
examination, measurements, lab investigations
• Planning
 Data collection, compilation, data entry
 Data cleaning
 Analysis plan:
• Confidentiality
• Ethical clearance: Consent from
 Institutional Review Board
 Observational units


Study Design

A study design is a specific plan or protocol
for conducting the study,
which allows the investigator
to translate the conceptual hypothesis
into an operational one.


Direction of Study

1. Backward: Retrospective
2. Forward: Prospective
3. Cross-sectional
4. Ambidirectional

Decision Tree
Intervention done?
• No: Observational Study. Comparison group?
  – No: Descriptive Study
  – Yes: Analytic Study. Direction of study?
     E → O: Cohort Study
     E = O: Cross-Sectional Study
     E ← O: Case-Control Study
• Yes: Experimental Study. Randomization?
  – No: NRCT Study
  – Yes: RCT Study

Epidemiological Study Design
Observational Studies
 Descriptive Studies

Analytic
Cross-Sectional
Case-Control
Cohort

Experimental / Interventional studies
As per Control: RCT/NRCT
As per Blinding: Single /Double Blind
As per Design: Simple/Cross-over
As per Area: Field/Clinical/Lab

Descriptive Studies

• Case reports
• Case series
• Population studies


Descriptive Studies: Uses

• Hypothesis generating

• Suggesting associations


Descriptive Type of Observational Study

• Other Name: Case-Series/Population
• Unit of Study: Case/Individuals
• Study Question: What is happening?
• Direction of Inquiry
• Study Design: desired information about cases/individuals is collected

Case-Series …….

Advantages
• Easy to do
• Excellent at identifying unusual situation
• Good for generating hypotheses

Disadvantages
• Generally short-term
• Investigators self-select (bias!)
• no controls

09/03/2010 Dr. Kusum Gaur 30

Analytical Observational Studies

• Cross-sectional

• Case-control

• Cohort


Cross-sectional Study
• Data collected at a single point in time

• Describes associations

• Prevalence
A “Snapshot”



• Other Name: Prevalence Study
• Unit of Study: Individual
• Direction of Inquiry
• Study Design:
  Population → Diseased → Exposed to Factor / Not Exposed to Factor
  Population → Non-Diseased → Exposed to Factor / Not Exposed to Factor

Objectives of a Cross-Sectional Study

To find out association



[Figure: cross-sectional design]
Defined Population → Sample of Population → two groups (Regular Meditation, Not doing meditation) → Prevalence of DM in each group
Time Frame = Present

E.g. Out of a population of 1000, 100 were doing meditation regularly, and only 2 of them had DM. The remaining 900 were not doing meditation at all, and 220 of them had DM.

                 DM +    DM -    Total
Meditation +       2      98      100
Meditation -     220     680      900
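
The counts in this example can be turned into prevalence figures for the two groups. The sketch below is only an illustration; the dictionary and function names are made up for this purpose and are not part of the original slides.

```python
# Counts taken from the worked example (1000 people surveyed).
meditators = {"dm": 2, "no_dm": 98}
non_meditators = {"dm": 220, "no_dm": 680}


def prevalence_percent(group):
    """Prevalence = cases / group total, expressed as a percentage."""
    total = group["dm"] + group["no_dm"]
    return 100 * group["dm"] / total


prev_meditators = prevalence_percent(meditators)
prev_non_meditators = prevalence_percent(non_meditators)

print(f"Prevalence of DM, regular meditation: {prev_meditators:.1f}%")
print(f"Prevalence of DM, no meditation:      {prev_non_meditators:.1f}%")
```

A cross-sectional comparison like this describes an association at one point in time; it does not by itself establish cause and effect.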



• Strengths
– Quick
– Cheap

• Weaknesses
– Cannot establish cause-effect


Case-Control Studies
 Start with people who have disease(Cases)

 Match them with controls that do not have
disease (Match Confounding)

 Look back and assess exposures


Controls

A control is a standard of comparison
(confounded with variability but without effect)

for

• Effects

• Variability


Case-Control Study
• Other Name: Retrospective Study
• Unit of Study: Cases/Control
• Study Question: What has happened?
• Direction of Inquiry: F ← O
• Study Design:
  Cases → Exposed / Not Exposed
  Control → Exposed / Not Exposed

Objective of a Case-Control Study

To find out association

To assess Odds Ratio

Case-Control Study

[Figure: case-control design, looking from Present back to Past]
Cases (patients with DM) → Regular Meditation / No Meditation
Controls (persons w/o DM) → Regular Meditation / No Meditation

The logic of Case-Control Studies

Cases differ from controls only in having the
disease

If exposure does not predispose to having
the disease, then exposure should be equally
distributed between the cases and controls.

 The extent of greater previous exposure
among the cases reflects the increased risk
that exposure confers

Case-Control Studies: Strengths

• Good for rare outcomes: cancer
• Can examine relation of exposures to disease
• Useful to generate hypothesis
• Fast
• Cheap
• Provides Odds Ratio


Case-Control Studies: Weaknesses

• Cannot measure
– Incidence
– Prevalence
– Relative Risk
• Can only study one outcome
• High susceptibility to bias


Cohort Study

• Begin with disease-free individuals

• Classify patients as exposed/unexposed

• Record outcomes in both groups

• Compare outcomes using relative risk


Cohort Study
• Other Name: Prospective Study / Follow-up Study / Incidence Study
• Unit of Study: Individual
• Direction of Inquiry: F → O
• Study Design:
  Cohort Exposed to Factor → Diseased / Non-Diseased
  Cohort Not Exposed to Factor → Diseased / Non-Diseased

Logic of Cohort Study

A cohort is a group of persons sharing a common characteristic.

A difference in the rate at which exposed and control subjects contract a disease is due to the difference in exposure, since other factors are known and similar.

Cohort Study

 Prospective (usually)

 Controlled

 Can determine causes and incidence of
diseases as well as identify risk factors

 Generally expensive, time consuming and
difficult to carry out

Steps for Cohort Study

 Identify geographically defined group
 Identify exposed subjects and not exposed
subjects
 Follow over a specific time
 Record the fraction in each group who
develop the condition of interest
 Compare these fractions using RR, AR or OR


Objectives of a Cohort Study

• To find out association
• To assess Risk Ratio
• To find out Relative Risk
• To find out Attributable Risk

Prospective Cohort Study

[Figure: cohort followed from Present to Future]
Cohort, No Meditation → DM / No DM
Cohort, Regular Meditation → DM / No DM

Cohort Study: Strengths

• Can measure multiple outcomes

• Can adjust for confounding variables

• Can calculate Attributable Risk


Cohort Study: Weaknesses

• Expensive

• Time consuming

• Cannot study rare outcomes

• Confounding variables


Measurements of association

Cohort Study
• Significance Test
• Relative Risk
• Attributable Risk
• OR

Case Control Study
• Significance Test
• OR

Measures of Association
Significance Test – tests the significance of the difference in exposure between cases and controls
Odds Ratio – ratio of the odds of disease among the exposed to the odds of disease among the non-exposed
Relative Risk – ratio of the incidence among the exposed to the incidence among the non-exposed
Attributable Risk – difference between the incidence among the exposed and the non-exposed, expressed as a percentage of the incidence among the exposed
An RR or OR of 1 indicates no effect of exposure (equal odds)

‘Z’ Score of Exposure Rates

                Cases    Controls
Exposed           a         b
Non-exposed       c         d

\[ P_2 = \text{Exposure rate in cases} = \frac{a}{a+c} \times 100 \]

\[ P_1 = \text{Exposure rate in controls} = \frac{b}{b+d} \times 100 \]

\[ Z\ \text{Score} = \frac{P_2 - P_1}{\mathrm{SEDP}}, \qquad \mathrm{SEDP} = \sqrt{\frac{P_1 Q_1}{N_1} + \frac{P_2 Q_2}{N_2}} \]
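
As a rough sketch, the Z score of exposure rates can be computed as follows. The counts are hypothetical, and the code assumes that Q = 100 - P and that N1 and N2 are the numbers of controls and cases respectively; the function name is illustrative only.

```python
import math


def exposure_z_score(a, b, c, d):
    """Z score for the difference in exposure rates between cases and controls.

    a, c = exposed and non-exposed cases; b, d = exposed and non-exposed controls.
    Rates are percentages, so Q is taken as 100 - P (an assumption).
    """
    p2 = 100 * a / (a + c)          # exposure rate in cases
    p1 = 100 * b / (b + d)          # exposure rate in controls
    q1, q2 = 100 - p1, 100 - p2
    n1, n2 = b + d, a + c           # assumed: N1 = controls, N2 = cases
    sedp = math.sqrt(p1 * q1 / n1 + p2 * q2 / n2)
    return p1, p2, sedp, (p2 - p1) / sedp


# Hypothetical counts: 100 cases (30 exposed) and 100 controls (10 exposed).
p1, p2, sedp, z = exposure_z_score(a=30, b=10, c=70, d=90)
print(f"P1={p1:.1f}%  P2={p2:.1f}%  SEDP={sedp:.3f}  Z={z:.3f}")
```

A Z score above roughly 2 (1.96) is conventionally read as significant at the 95% confidence level.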

\[ \text{Odds Ratio} = \frac{ad}{bc}\ \text{times} \]

\[ RR = \frac{\text{Incidence among Exposed}}{\text{Incidence among Non-Exposed}} = \frac{a/(a+b)}{c/(c+d)} = \frac{a(c+d)}{c(a+b)}\ \text{times} \]

Attributable Risk

\[ AR = \frac{\text{Incidence among Exposed} - \text{Incidence among Non-Exposed}}{\text{Incidence among Exposed}} \times 100 \]

\[ \text{Incidence among Exposed} = \frac{a}{a+b} \times 100 \]

\[ \text{Incidence among Non-Exposed} = \frac{c}{c+d} \times 100 \]
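
A minimal sketch of these three measures, assuming a 2x2 table whose rows are exposed / non-exposed and whose columns are diseased / non-diseased (a, b in the first row; c, d in the second). The counts reuse the meditation example, treating "no meditation" as the exposure; they come from a cross-sectional illustration, so the "incidence" here is illustrative only. Function names are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """OR = ad / bc."""
    return (a * d) / (b * c)


def relative_risk(a, b, c, d):
    """RR = incidence among exposed / incidence among non-exposed."""
    return (a / (a + b)) / (c / (c + d))


def attributable_risk_percent(a, b, c, d):
    """AR = (Ie - Iu) / Ie x 100, with incidences expressed as percentages."""
    incidence_exposed = 100 * a / (a + b)
    incidence_unexposed = 100 * c / (c + d)
    return 100 * (incidence_exposed - incidence_unexposed) / incidence_exposed


# Exposure = no meditation: 220 with DM, 680 without.
# Non-exposed = regular meditation: 2 with DM, 98 without.
a, b, c, d = 220, 680, 2, 98
or_value = odds_ratio(a, b, c, d)
rr_value = relative_risk(a, b, c, d)
ar_value = attributable_risk_percent(a, b, c, d)
print(f"OR={or_value:.2f}  RR={rr_value:.2f}  AR={ar_value:.1f}%")
```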


Experimental Studies

Clinical trials provide the “gold standard” for determining the relationship between a factor and an event.

Types of Experimental Study

As per Randomization:
• Randomized Control Trials (RCT)

• Concurrent Parallel Design (RCT)

• Sequential RCT Design

• RCT with External Control

• Non – Randomized Trials (NRCT)


Types of Experimental Study….

As per Design:
• Simple

• Cross-Over Study Design

As per Study Area:
• Field Trials

• Clinical Trials

• Lab. Trials

Quality of Experimental Study

• Randomization

• Blinding

• Control

• Cross-Over


Controls in Clinical Trials

A clinical trial is a comparative, prospective
experiment conducted in human subjects

• Historical controls are better than no
controls

• Patients can serve as own controls - This is
usually beneficial as the comparison
removes patient differences


Blinding
Good practice: factors that can affect the
evaluation of outcome should not be permitted
to influence the evaluation process

Single-blind
Patient or evaluator (either of one) is blinded as
to intervention

Double-blind design
Neither patient nor outcome evaluator knows Rx
to which patient was assigned


Randomized Control Trials (RCT)

• Before and After Comparison

• Comparison with Placebo

• Comparison Of two medicine/procedure/tests

• Comparison Of > two medicine/procedure/tests


Experimental Study
• Other Name: Intervention Study
• Objective: To know the effect of intervention
• Unit of Study: Individual meeting entry criteria
• Study Question: What is happening after intervention in both groups?
• Direction of Inquiry: I → E
• Study Design 1 (Intervention with Placebo):
  Group 1/cases → Intervention → Positive Outcome / Negative Outcome
  Group 2/control → Placebo → Positive Outcome / Negative Outcome

Clinical Trial

[Figure: flow of a randomized clinical trial]
Study Population → Randomize → Treatment Group → Outcomes
Study Population → Randomize → Control Group → Outcomes

Intervention Study - Design 2
(Comparison of Effect of Two Interventions)

Cases meeting entry criteria are divided into two groups:
  Group-1 → Intervention-1 → Positive Outcome / Negative Outcome
  Group-2 → Intervention-2 → Positive Outcome / Negative Outcome

Cross Over Design

Cases meeting entry criteria are divided into Group-1 and Group-2.
• Before crossover:
  Group-1 → Intervention-1 → Positive Outcome / Negative Outcome
  Group-2 → Intervention-2 → Positive Outcome / Negative Outcome
• Crossover
• After crossover:
  Group-1 → Intervention-2 → Positive Outcome / Negative Outcome
  Group-2 → Intervention-1 → Positive Outcome / Negative Outcome

Other Types of Experimental Study

• Quasi-Experimental Study

• Block Experimental Study

Quasi-Experimental Study

Cases meeting entry criteria are divided into two groups:
  Group-1 → Intervention → Positive Outcome / Negative Outcome
  Group-2 → No Intervention → Positive Outcome / Negative Outcome

Block Experimental Study

Cases meeting entry criteria are divided into three groups:
  Group-1 → Intervention-1 → Positive Outcome / Negative Outcome
  Group-2 → Intervention-2 → Positive Outcome / Negative Outcome
  Group-3 → Intervention-3 → Positive Outcome / Negative Outcome

Steps of Experimental Study
1. Drawing up a Protocol
2. Reference Population
3. Sample Population
4. Exclusions
5. Randomization into Experimental Group and Control Group
6. Manipulation/Intervention
7. Follow-up
8. Assessment of Outcome

Ideal Study Design for establishing causality

Ethical Issues

STUDY QUESTIONS AND APPROPRIATE DESIGNS

Type of Question               Appropriate Study Design
Burden of illness              Field Surveys
  - Prevalence                 Cross Sectional Survey
  - Incidence                  Longitudinal survey
Causation, Risk & Prognosis    Case Control Study, Cohort study, RCT
Treatment Efficacy             Randomized Controlled study
Diagnostic Test Evaluation     Randomized Controlled study
Cost Effectiveness             Randomized Controlled study

Hierarchy of Epidemiological Study Design
(from establishing causality at the top to generating hypotheses at the bottom)

1. RCT (Establish Causality)
2. Cohort
3. Case Control
4. Cross-Sectional
5. Case Series
6. Case Report (Generate Hypothesis)

Methodology

• Study Area: Location of study (hospital, community, etc.)
• Study Period: Start to end of study (the maximum period available for the study should be defined)
• *Selection of Study Design
• *Selection of Study Population
  – Sample Size
  – Sampling Technique
• Prerequisites of study: study tools, terminologies, orientation trainings, etc.

Selection of study population

Whole Population

Sample Population


What is Sample ?

• A sample is a small representative
segment of a population

• Inferences drawn from a sample are
expected to be applicable for the source
population


Why do we need a sample?

To get inferences

applicable to universe

with minimum resources


Sample – Qualities

A sample is a part of the population, but it should be a true representative of the whole.

Qualities
• Adequate size
• Appropriate sampling technique

Factors on which SAMPLE SIZE depend:

• Population Factors
– Type of information available
• Type of study
– Type of Data
– Type of study design
– Type of sampling
– Type of Statistical Analysis for outcome needed
• Determined values of research by researcher
– Power
– Significance level


Power: Ability to detect a true effect when it exists (1 - beta error)

Alpha Error: Chance of reporting an effect that does not exist (false positive)

Type of Data & level of Measurements

Qualitative – Counted Facts – Nominal Data
Measured as numbers & expressed as proportions

Quantitative – Measured Facts – Numerical Data
Measured as quantity & expressed as Mean ± SD

*Ordinal Data – Rank Order Data
Measured as rank & expressed as Median & Percentile

Sample size for Qualitative data

\[ \text{Sample Size} = \frac{Z^2 P Q}{L^2} = \frac{4 P Q}{L^2} \]

P = Prevalence of disease
Q = 100 - P
L = allowable error
Z = 1.96 ≈ 2 for 95% CL
for descriptive/case-series type of study design
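
A small sketch of this calculation follows. The prevalence (20%) and allowable error (5 percentage points) are hypothetical inputs, and the function name is illustrative; rounding up to the next whole subject is an assumption, since the slide does not state a rounding rule.

```python
import math


def sample_size_proportion(p, allowable_error, z=1.96):
    """n = Z^2 * P * Q / L^2, with P, Q and L all expressed in percent."""
    q = 100 - p
    return math.ceil(z ** 2 * p * q / allowable_error ** 2)


# Hypothetical inputs: prevalence 20%, allowable error 5 percentage points.
n_exact_z = sample_size_proportion(20, 5)          # uses Z = 1.96
n_rounded_z = sample_size_proportion(20, 5, z=2)   # uses the 4PQ/L^2 shortcut
print(n_exact_z, n_rounded_z)
```

The shortcut with Z ≈ 2 gives a slightly larger, more conservative sample than Z = 1.96.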


Sample size for Quantitative data

\[ \text{Sample Size} = \frac{Z^2\, SD^2}{L^2} = \frac{4\, SD^2}{L^2} \]

SD = Standard Deviation
L = allowable error
Z = 1.96 ≈ 2 for 95% CL
For Descriptive Studies only
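
The same idea for quantitative data can be sketched as below. The SD of 14 and the allowable error of 2 units are illustrative inputs only, and rounding up is again an assumption.

```python
import math


def sample_size_mean(sd, allowable_error, z=1.96):
    """n = Z^2 * SD^2 / L^2 for estimating a mean in a descriptive study."""
    return math.ceil(z ** 2 * sd ** 2 / allowable_error ** 2)


# Illustrative inputs: SD = 14, allowable error = 2 (same units as the measurement).
n_mean_exact_z = sample_size_mean(14, 2)          # uses Z = 1.96
n_mean_rounded_z = sample_size_mean(14, 2, z=2)   # uses the 4SD^2/L^2 shortcut
print(n_mean_exact_z, n_mean_rounded_z)
```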


Finite Correction

Sample Size – Finite Population (where the population is less than 50,000)

\[ \text{New SS} = \frac{SS}{1 + \dfrac{SS - 1}{\text{Pop}}} \]
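
A minimal sketch of the finite correction, read here as New SS = SS / (1 + (SS - 1)/Pop). The base sample size of 246 and the population of 2000 are hypothetical, and the function name is illustrative.

```python
import math


def finite_population_correction(ss, population):
    """Adjust a base sample size SS for a finite population of size Pop."""
    return ss / (1 + (ss - 1) / population)


# Hypothetical inputs: base sample size 246, population of 2000.
adjusted_ss = finite_population_correction(246, 2000)
adjusted_ss_rounded = math.ceil(adjusted_ss)  # round up to whole subjects
print(f"{adjusted_ss:.2f} -> {adjusted_ss_rounded}")
```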

How many controls?

\[ k = \frac{n}{2 n_0 - n} \]

Here n0 = no. of cases available, n = expected no. of cases (with an equal number of controls), and k = no. of controls per case.

• k = 13 / (2*11 – 13) = 13 / 9 = 1.44
• k*n0 = 1.44*11 ≈ 16 controls (and 11 cases)
  – Same precision as 13 controls and 13 cases
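
The worked example can be reproduced with a short sketch. The function name is hypothetical, and the guard against a non-positive denominator is an added assumption (the formula has no solution when fewer than half the required cases are available).

```python
import math


def controls_per_case(cases_available, cases_needed):
    """k = n / (2*n0 - n), where n0 = cases available and n = cases needed."""
    denominator = 2 * cases_available - cases_needed
    if denominator <= 0:
        raise ValueError("too few cases available for this formula")
    return cases_needed / denominator


k = controls_per_case(cases_available=11, cases_needed=13)
controls_needed = math.ceil(k * 11)  # round up to whole controls
print(f"k = {k:.2f}, controls = {controls_needed}")
```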

Sampling Design factors of sample size

\[ \text{Design Effect} = \frac{\text{Variance of Specified Sampling}}{\text{Variance of Simple Random Sampling}} \]

Sampling Technique effect on Sample Size

Sampling Technique              Design Effect (Size Multiplier)
Simple Random Sampling          1
Systematic Random Sampling      1.2
Stratified Random Sampling      0.8
Cluster Random Sampling         2

Conventionally accepted
Researcher’s Estimations

Alpha Error 0.05

Power 80%

Confidence Limit 95%


Key Concepts: Sample size
• Sampling Design – a larger sample is needed for cluster sampling
• Desired Power – more power requires a larger sample
• Allowable error – a smaller allowable error requires a larger sample
• Heterogeneity – a more heterogeneous population needs a larger sample to cover its diversity
• Nature of Analysis – complex multivariate analysis needs a larger sample

Steps -Sample Size Estimation
• Stage 1- * Base Sample Size Calculation (n)

• Stage 2 – Sample Size with Design Effect (d)
=n*d

• Stage 3- Contingency Addition (e.g. 5%)
SS Estimation for study population
=(n*d)+5%of n

*Use appropriate equation for sample size
calculation
http://stat.ubc.ca/~rollin/stats/ssize/
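
The three stages above can be sketched as a single calculation. The base sample size (246) and design effect (2, as for cluster sampling) are hypothetical inputs; the contingency is taken as 5% of the base n, following the slide's expression, and rounding up is an assumption.

```python
import math


def staged_sample_size(base_n, design_effect, contingency=0.05):
    """Stage 2: n*d; Stage 3: add a contingency taken as a fraction of base n."""
    with_design_effect = base_n * design_effect
    return math.ceil(with_design_effect + contingency * base_n)


# Hypothetical inputs: base n = 246, design effect = 2, contingency = 5%.
final_sample_size = staged_sample_size(246, 2)
print(final_sample_size)
```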

E.G. Mean 1= 5, Mean 2 = 15 & SD = 14 inputting values


SAMPLING TECHNIQUES

• PROBABILITY/RANDOM SAMPLING

• NONPROBABILITY SAMPLING


Random sampling Techniques

The aim is to give every observation unit an equal chance of being selected for the study sample.

(No observation unit should have zero probability of selection.)

* Random Sampling Techniques

Simple Random Technique

Systematic Random Technique

Stratified Random Technique

Multiphase Random Technique

Multistage Random Technique

Cluster Random Technique


Simple Random Technique

• Lottery Method

• Random Table Method


Steps –Use of Random Table
• Stage 1- Give number to each member of population
• Stage 2 – Determine total population size (N)
• Stage 3- Determine Sample size (S)

• Stage 4 – Drop one finger on Random Table with eyes
closed
• Stage 5 – Drop one finger with eyes closed on direction
to be chosen – Up/Down/Rt/Lt

• Stage 6- Determine first number within 0 to N
• Stage 7- * Determine other numbers till Sample size (S)

* Once a number is chosen do not repeat it again

Steps – Use of Random Table..
e.g. N=300, S=50

Random no.    Selected no. (last 3 digits, within 0-300)
49468
49699
14043         043
15013         013
12600
33122         122
94169         169
89916
74169         (169 already selected, so not repeated)
32007         007
www.evaluation
wikiog/index/how_to_use_a_random_number_Table
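
The selection rule in this example can be sketched as follows. The function name is hypothetical, and the code assumes population members are numbered from 1 to N and that the last three digits of each random number are used.

```python
def select_from_random_table(random_numbers, population_size, sample_size):
    """Keep the last three digits of each entry if they fall within 1..N
    and have not been chosen before; stop once the sample is complete."""
    chosen = []
    for number in random_numbers:
        candidate = number % 1000  # last three digits
        if 1 <= candidate <= population_size and candidate not in chosen:
            chosen.append(candidate)
        if len(chosen) == sample_size:
            break
    return chosen


# Random numbers from the example; N = 300, sample size 50.
table_rows = [49468, 49699, 14043, 15013, 12600,
              33122, 94169, 89916, 74169, 32007]
picked = select_from_random_table(table_rows, population_size=300, sample_size=50)
print(picked)
```

Only five numbers are selected from these ten rows, so in practice reading continues along the chosen direction in the table until the sample size is reached.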

Systematic Random Technique

The selection of the sample follows a systematic interval of selection
• Find the serial interval (K) = total population / sample size
• 1st observation through simple random sampling among 1 to K
• Next observation = (1st + K)th observation
• Next observation = (2nd + K)th observation
• and so on, till the no. of observations = Sample Size

Systematic Random Technique
Population: 100 units numbered 1 to 100
N = 100 (Given)
S = 20 (Estimated)
K = N/S = 100/20 = 5
1st observation between 1 and 5 through SRS, e.g. 3
Every 5th observation from the 3rd observation will be included in the sample population.
So, the sample population will be the 3rd, 8th, 13th, 18th, 23rd, 28th, 33rd, 38th, 43rd, 48th, 53rd, 58th, 63rd, 68th, 73rd, 78th, 83rd, 88th, 93rd and 98th observations.
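
This example can be reproduced with a short sketch. The function name is illustrative; the random start is drawn with Python's standard `random.randint` when no start is supplied, and the code assumes N is an exact multiple of S.

```python
import random


def systematic_sample(population_size, sample_size, start=None):
    """Pick every K-th unit, K = N / S, from a random start between 1 and K."""
    k = population_size // sample_size
    if start is None:
        start = random.randint(1, k)  # simple random choice of the first unit
    return [start + i * k for i in range(sample_size)]


# N = 100, S = 20, K = 5, first observation fixed at 3 as in the example.
sample_units = systematic_sample(100, 20, start=3)
print(sample_units)
```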

Stratified Random Technique
Sample selection within each stratum is through the Simple Random / Systematic Random Technique

[Figure: population divided into strata, with a sample drawn from each]
Strata 1 → Sample
Strata 2 → Sample
Strata 3 → Sample

Multiphase Random Technique

[Figure: successive phases of selection]
Population → (S/S) → Suspected cases → (Screening Test) → Probable cases → (Specific test) → Cases for study

Multistage Random Technique
At each stage Simple RT is used

[Figure: successive stages of selection]
Population of Nation → States (State 1, State 2) → districts within each state → villages within each district → Study Population

Cluster Random Technique
The unit of random selection is a cluster rather than an individual
• CI = Total population / 30 (in 30 Cluster Technique)

[Figure: Population of Nation divided into clusters (Cluster 1, Cluster 2, Cluster 3, Cluster 4, ... Cluster 27, Cluster 28, Cluster 29, Cluster 30); the Study Population is drawn from the clusters through Simple RT]

Stratified Vs Cluster Technique

Stratified Technique
• Homogenous groups are made
• Randomly select sample from each group
• To make it more truly representative, take sample population proportion to size (PPS)
• Less chances of error than simple random

Cluster Technique
• Comparable groups of population are made (usually 30)
• Randomly select sample from each group
• More chances of error than simple random

Non Probability Sampling

• When random samples are not possible
• Rare disease
• Small population
• Special population
• Special Condition
• Difficult to reach population


Non-probability Samples

Convenience
 Purposive
 Quota
 Snow ball study


Snow ball sampling

Contact tracing
Initial respondent helps in recruiting
new population
Useful in network analysis approach


Step-4 & 5

Data Collection
and
Data Management

Sources of Data

• Primary –Own generated data

• Secondary –Already generated data
Published
Non-Published


Primary Vs Secondary source of Data

Primary data
• Need to be generated
• First hand information
• Questionnaire
• Purpose served
• Analysis as per purpose
• Require more time and money

Secondary data
• Readily available
• Second hand information
• No need of questionnaire
• Purpose served ?
• Descriptive
• Less expensive

Type of Data Collection Methods

Interview
  – Personal
  – Telephonic
Observation
Experimental
Interview and Observation
Observation and Experimental
Interview, Observation and Experimental

Forms of questions (Open Vs Closed)

Open ended
• Possible responses are not given
• Mean, SD, Median
• For seeking opinions, attitudes, perceptions
• Provides in depth info.
• Experience of investigator and analyst required

Close ended
• Categories are given, already coded
• Proportion
• For eliciting factual information
• Not so in depth
• Investigator's bias
• Ease of answering
• Easy to analyse

Considerations in formulating questionnaire

(Questionnaire/Interview schedule)

 Use simple and everyday language

 Do not use ambiguous questions(?/?)

 Do not ask leading questions

 The order of questions:

 Guideline for filling an instrument, pen-pencil

Pre testing


Validity of a Research Instrument

• Ability of an instrument to measure what it is designed to measure
• Establish the logical link between the questions and objectives
• Items/questions cover the full range of the issue/attitude being measured

Steps
1. Decide the information required.
2. Define the target respondents.
3. Method(s) of reaching target
4. Decide on question content.
5. Develop the question wording.
6. Put questions into a meaningful order.
7. Check the length of the questionnaire.
8. Pre-test the questionnaire.
9. Develop the final survey form

Organization and Compilation of Data

Data are organized and compiled (in a Master Chart) so as to be reliable, relevant, adequate and reasonably complete, with the following requisites:
• Simplicity
• Briefness
• Utility
• Distinctiveness
• Comparability
• Scientific Arrangement
• Attractive
• Effective

Steps of Observations

• Entry of Observation Units

• Master Chart

• Tabulation

• Diagrammatic Presentation

Tabulation – Content of Table
• Table No.: sequence in the text
• Title of Table: short, clear and self-explanatory about what the table is for
• Body of Table: consists of rows and columns
  – 1st row shows headings of columns
  – 1st column shows headings of rows
  – the rest of the rows and columns show data as required
  – the number of rows and columns should be limited to maintain simplicity of the table
• Source of Data (if it is other than the present study): written just below the body of the table
• Foot Note: written just below the body of the table, if there is any hidden information
• Inferences: summary value of the table

Types of Tables

As per purpose
General tables –about Socio-demographic profile
Specific tables –about Aims and objectives

As per originality
Original tables-from original Data
Derived tables –from original tables

As per Construction
Simple tables- showing one variable at one time
Complex tables – showing > one variable at one time


Diagrammatic Presentations

Qualitative Data
• Bar: Simple, Multiple, Component
• Pie
• Line
• Pictogram
• Spot Map

Quantitative Data
• Histogram
• Frequency Polygon
• Cumulative Frequency Polygon
• Scatter Diagram
• Box and Whisker
• Correlation Diagram

Simple Bar diagram

[Figure: simple bar diagram with one bar per category (1st Qtr to 4th Qtr)]

Multiple Bar diagram

[Figure: multiple bar diagram. Categories: (1) Very Dissatisfied, (2) Dissatisfied, (3) Neither satisfied nor dissatisfied, (4) Satisfied, (5) Very Satisfied. Series: (1) 1-5 Years, (2) 6-10 Years, (3) 11 & Above Years]

Pie diagram

Angle of a pie sector = (proportion of that variable) × 360 degrees

[Figure: pie diagram with sectors for 1st Qtr, 2nd Qtr, 3rd Qtr and 4th Qtr]

Line diagram

[Figure: line diagram of Series 1 and Series 2 over the years 2000 to 2005]

Histogram (Area Diagram)

[Figure: histogram of Series 1 by class interval: 0 to 5 yrs, 5 yrs to 10 yrs, 10 yrs to 15 yrs, 15 yrs to 20 yrs, 20 yrs to 25 yrs]

Scatter Diagram

[Figure: scatter diagram with a linear trend line; x-axis: No. of Patients, y-axis: Duration of diabetes in yrs.]

Radar diagram

[Figure: radar diagram of Series 1 and Series 2 over the dates 5/1/2002 to 9/1/2002]

Box & Whisker

[Figure: box and whisker diagram showing Open, High, Low and Close values for the dates 5/1/2002 to 9/1/2002]

Biostatistics = Biology + Statistics

• Biostatistics is application of statistics in
biology i.e. science of figure in medical science

• Data: Set of information, facts or figures
numerically coded and from which conclusions
may be drawn is called data (singular-datum).

• Statistics: The collection of methods used in
planning an experiment
and analyzing data in order to draw accurate
conclusions.

Type of Biostatistics

• Descriptive statistics generally characterizes
or describes a set of data elements

• Inferential statistics tries to infer information
about a population by using information
gathered by sampling

Descriptive Analysis

Qualitative Data
• Rates
• Ratios
• Proportions

Quantitative Data
Central Tendencies
• Mean
• Mode
• Median
Dispersion
• Standard Deviation
• Standard Error
• Confidence Limit
• Skewness

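Descriptive Analysis of
Qualitative Data

\[ \text{Rate} = \frac{\text{No. of total events in a year } (A)}{\text{MYP of that region } (T)} \times 1000 \]

Here MYP is the mid-year population.

\[ \text{Ratio} = \frac{\text{No. of total } (A)}{\text{No. of total } (B)} \]

\[ \text{Percentage of Events} = \frac{\text{No. of specific events } (A)}{\text{Total events } (T)} \times 100 \]

\[ \text{Proportional Rate} = \frac{\text{Events of specific cause } (A)}{\text{Total deaths } (T)} \times 10^{n} \]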
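The four measures for qualitative data can be sketched in Python as below. The function names and the sample figures in the usage lines are hypothetical and chosen only to illustrate the arithmetic.

```python
def rate_per_1000(events_in_year, mid_year_population):
    """Rate = events in a year / mid-year population x 1000."""
    return events_in_year / mid_year_population * 1000


def ratio(a, b):
    """Ratio of one count (A) to another count (B)."""
    return a / b


def percentage(specific_events, total_events):
    """Percentage of events = specific events / total events x 100."""
    return specific_events / total_events * 100


def proportional_rate(cause_events, total_deaths, n=2):
    """Proportional rate = events of a specific cause / total deaths x 10^n."""
    return cause_events / total_deaths * 10 ** n


# Hypothetical example: 250 events in a mid-year population of 10,000.
example_rate = rate_per_1000(250, 10_000)
```

With `n=2` the proportional rate is expressed per 100 deaths; other values of `n` change only the multiplier.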

Descriptive Analysis of
Quantitative Data

Mean = Mathematical average
\[ \bar{X} = \frac{\sum X}{N} \]

Mode = Most commonly occurring value

Median = Centre value when the observations are arranged in increasing or decreasing order, i.e. the value at position
\[ \frac{N+1}{2} \]

Standard Deviation = It tells how much scores deviate from the mean
• it is the square root of the variance
• it is the most commonly used measure of spread
\[ SD = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} \]

Standard Error = Standard deviation of the sample mean
\[ SE = \frac{SD}{\sqrt{N}} \]

Skewness = Asymmetry of the distribution, measured from the deviation of the mean from the median
\[ SK = \frac{3(\text{Mean} - \text{Median})}{SD} \]
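A minimal Python sketch of these descriptive measures, using only the standard library. It assumes the SD formula with N in the denominator, as on the slide (many texts use N − 1 for a sample); the function name `describe` and the sample values are illustrative.

```python
import math
from statistics import mean, median, multimode


def describe(values):
    """Return mean, median, mode(s), SD, SE and skewness of a list of numbers."""
    n = len(values)
    m = mean(values)
    med = median(values)
    # SD with N in the denominator, as in the slide's formula.
    sd = math.sqrt(sum((x - m) ** 2 for x in values) / n)
    se = sd / math.sqrt(n)
    skewness = 3 * (m - med) / sd
    return {
        "mean": m,
        "median": med,
        "mode": multimode(values),  # list, since there may be several modes
        "sd": sd,
        "se": se,
        "skewness": skewness,
    }


# Hypothetical observations.
summary = describe([2, 4, 4, 4, 5, 5, 7, 9])
```

For these values the mean is 5, the median 4.5 and the SD 2, so the skewness is 3 × 0.5 / 2 = 0.75.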

Appropriate choice of significance tests

TEST OF SIGNIFICANCE OF QUALITATIVE DATA

One Sample (sample proportion to population proportion)
  Large Sample (>30): ‘Z’ Score
  Small Sample (<30): Corrected ‘Z’ Score

Two Sample
  Independent
    Large Sample: ‘Z’ Score / Chi-Square
    Small Sample: Yates' Corrected Chi-Square
  Dependent: McNemar

>Two Sample
  Independent
    Large Sample: Chi-Square
    Small Sample: Yates' Corrected Chi-Square
  Dependent: Cochran's

TEST OF SIGNIFICANCE OF QUANTITATIVE DATA

One Sample (sample mean to population mean)
  Large Sample (>30): ‘Z’ Test
  Small Sample (<30): ‘T’ Test

Two Sample
  Independent
    Large Sample: ‘Z’ Test
    Small Sample: ‘T’ Test
  Dependent: Paired ‘T’ Test

>Two Sample
  Independent
    Large Sample: ANOVA
    Small Sample: ANOVA
  Dependent: Friedman ANOVA

STUDY DESIGNS AND APPROPRIATE TEST

Type / Study Design: Appropriate Significance Test

Descriptive Study

Analytical
  Case Control Study: OR
    Qualitative: ‘Z’ Score Test / Chi-Square Test
    Quantitative: ‘Z’ Test / ‘t’ Test
  Cohort Study: OR, AR & RR
    Qualitative: ‘Z’ Score Test / Chi-Square Test
    Quantitative: ‘Z’ Test / ‘t’ Test

(OR = odds ratio, AR = attributable risk, RR = relative risk)

STUDY DESIGNS AND APPROPRIATE TEST
Type / Study Design: Appropriate Significance Test

Randomized Controlled study
Quantitative (before and after): Paired ‘t’ Test
Quantitative (before and after, >1 follow-up): Friedman ANOVA
Quantitative (between two groups): Unpaired ‘t’ Test
Quantitative (between > two groups): ANOVA Test

Randomized Controlled study
Qualitative (before and after): McNemar Test
Qualitative (before and after, >1 follow-up): Cochran's Test
Qualitative (between two groups): ‘Z’ Score / Chi-square Test
Qualitative (between > two groups): Chi-square Test

STATISTICAL TEST OF SIGNIFICANCE

Two Groups
  Nominal: ‘Z’ Score Test, Chi-square Test
  Numerical: ‘Z’ Test (n>30), ‘T’ Test (n<30)
  Ordinal: Mann-Whitney

> Two Groups
  Nominal: Chi-square Test
  Numerical: ANOVA
  Ordinal: Kruskal-Wallis

Paired Two
  Nominal: McNemar
  Numerical: Paired ANOVA
  Ordinal: Wilcoxon Signed Rank

Multiple Observation in same individual
  Nominal: Cochran
  Numerical: Repeated Multivariate ANOVA
  Ordinal: Friedman

Association of Two Variables
  Nominal: Contingency Coefficient
  Numerical: Correlation (Pearson), Regression
  Ordinal: Spearman Correlation

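STATISTICAL TEST OF SIGNIFICANCE

Each row gives: number and type of DV (dependent variable); number and type of IV (independent variable); covariates, if any; test; goal of analysis.

Research Question: Group differences
• DV: 1 nominal; IV: 1 nominal; Test: chi square; Goal: determine if there is a difference between groups
• DV: 1 continuous; IV: 1 dichotomous; Test: t-test; Goal: determine significance of mean group differences
• DV: 1 continuous; IV: 1 categorical; Test: one-way ANOVA; Goal: determine significance of mean group differences
• DV: 1 continuous; IV: 1 categorical; Covariates: 1+; Test: one-way ANCOVA; Goal: determine significance of mean group differences
• DV: 1 continuous; IV: 2+ categorical; Test: factorial ANOVA; Goal: determine significance of mean group differences
• DV: 1 continuous; IV: 2+ categorical; Covariates: 1+; Test: factorial ANCOVA; Goal: determine significance of mean group differences
• DV: 2+ continuous; IV: 1 categorical; Test: one-way MANOVA; Goal: create linear combo of DVs to maximize mean group differences
• DV: 2+ continuous; IV: 1 categorical; Covariates: 1+; Test: one-way MANCOVA; Goal: create linear combo of DVs to maximize mean group differences
• DV: 2+ continuous; IV: 2+ categorical; Test: factorial MANOVA; Goal: create linear combo of DVs to maximize mean group differences
• DV: 2+ continuous; IV: 2+ categorical; Covariates: 1+; Test: factorial MANCOVA; Goal: create linear combo of DVs to maximize mean group differences

Research Question: Degree of relationship
• DV: 1 continuous; IV: 1 continuous; Test: Bivariate Correlation; Goal: determine relationship/prediction
• DV: 1 continuous; IV: 2+ continuous; Test: Multiple Regression; Goal: linear combination to predict the DV
• DV: 1+ continuous; IV: 2+ continuous; Test: Path Analysis; Goal: estimate causal relations among variables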

Comparing difference between
Two Sample Proportions
‘Z’ Score Test

\[ Z = \frac{P_2 - P_1}{SEDP} \]

\[ SEDP = \sqrt{\frac{P_1 Q_1}{N_1} + \frac{P_2 Q_2}{N_2}} \]

here,
P1 – proportion (%) of that event in 1st Sample
P2 – proportion (%) of that event in 2nd Sample
SEDP – Standard Error of Difference in Proportion
Q1 – proportion without that event in 1st Sample, i.e. Q1 = 100 – P1
Q2 – proportion without that event in 2nd Sample, i.e. Q2 = 100 – P2
N1 – Sample Size of 1st Sample
N2 – Sample Size of 2nd Sample

Inference of ‘Z’ Score Test

If ‘Z’ < 2: Difference is Not Significant

If ‘Z’ > 2: Difference is Significant

If ‘Z’ > 3: Difference is Highly Significant
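The ‘Z’ score test for two sample proportions and its inference rule can be sketched in Python as below. Proportions are taken as percentages (so Q = 100 − P), the standard error is the square root of P1Q1/N1 + P2Q2/N2, and the absolute value of Z is used for the inference. The function names and sample figures are illustrative assumptions.

```python
import math


def z_score_proportions(p1, n1, p2, n2):
    """Z score for the difference between two sample proportions (in %)."""
    q1, q2 = 100 - p1, 100 - p2
    # Standard Error of Difference in Proportion (SEDP).
    sedp = math.sqrt(p1 * q1 / n1 + p2 * q2 / n2)
    return (p2 - p1) / sedp


def interpret_z(z):
    """Apply the slide's cut-offs of 2 and 3 to the size of Z."""
    size = abs(z)
    if size > 3:
        return "Highly Significant"
    if size > 2:
        return "Significant"
    return "Not Significant"


# Hypothetical example: 40% of 100 subjects versus 60% of 100 subjects.
z = z_score_proportions(40, 100, 60, 100)
verdict = interpret_z(z)
```

Here SEDP is √48 ≈ 6.93, so Z ≈ 2.89 and the difference is read as significant but not highly significant.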

>Two Sample Proportions
Chi-Square Test
Indications
Qualitative data
Normal distribution
Comparing difference between
Two Sample proportions
Multiple Sample proportions


>Two Sample Proportions
Chi-Square Test

\[ \chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}, \qquad E = \frac{T_r \times T_c}{T} \]

\[ \chi^2 = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} + \frac{(O_3 - E_3)^2}{E_3} + \dots + \frac{(O_n - E_n)^2}{E_n} \]

here,
O – Observed value of cell
E – Expected value of cell, considering Null Hypothesis
Tr – Total of that Row
Tc – Total of that Column
T – Grand Total, i.e. a+b+c+d

Degree of Freedom (DF) = (C – 1)(R – 1)
R = No. of Rows, C = No. of Columns

Inference of Chi Square (χ²)
The Chi Square (χ²) value is looked up in the Chi Square (χ²) Table at Degree of Freedom DF = (R – 1)(C – 1)
(here R = No. of Rows & C = No. of Columns)
at the desired level of significance.

Inferences
If the Chi Square (χ²) Test Value is higher than the Table value: Difference in proportions is Significant at that desired level of significance.

If the Chi Square (χ²) Test Value is lower than the Table value: Difference in proportions is Not Significant at that desired level of significance.
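A short Python sketch of the Chi-Square calculation, with expected values taken as (row total × column total) / grand total. The function name `chi_square` and the 2 × 2 counts are hypothetical; the table value of 3.84 is the usual 5% critical value for 1 degree of freedom.

```python
def chi_square(table):
    """Return (chi-square value, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected value of the cell under the null hypothesis.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical 2 x 2 table of observed counts.
chi2_value, dof = chi_square([[10, 20], [20, 10]])
significant_at_5_percent = chi2_value > 3.84  # table value for DF = 1
```

Every expected value here is 15, so the test value is 4 × 25/15 ≈ 6.67, which is higher than the table value.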

Two Sample Means (>30)
‘Z’ Test
Pre-requisites
• Quantitative data
• Homogenous, normally distributed Random Sample
• Sample Size > 30

Indications
• To see the Significance of any Observation in reference to the Mean Value of that sample
• Comparing difference between
  – Sample Mean to Population Mean
  – Means of Two independent Samples

Two Sample Means (>30)
‘Z’ Test

\[ Z = \frac{\bar{X}_2 - \bar{X}_1}{SEDM} \]

\[ SEDM = \sqrt{\frac{SD_1^2}{N_1} + \frac{SD_2^2}{N_2}} \]

here,
X̄1 – Mean of that event in 1st Sample
X̄2 – Mean of that event in 2nd Sample
SEDM – Standard Error of Difference in Means
SD1 – Standard Deviation of 1st Sample
SD2 – Standard Deviation of 2nd Sample
N1 – Sample Size of 1st Sample
N2 – Sample Size of 2nd Sample
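As a sketch, the ‘Z’ test for two large-sample means can be computed from summary figures as follows. SD1 and SD2 are treated as the standard deviations of the two samples, and the standard error is the square root of SD1²/N1 + SD2²/N2. The function name and the sample figures are hypothetical.

```python
import math


def z_test_means(mean1, sd1, n1, mean2, sd2, n2):
    """Z value for the difference between the means of two independent samples."""
    # Standard Error of Difference in Means (SEDM).
    sedm = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean2 - mean1) / sedm


# Hypothetical example: means 120 and 124, SD 10 in both groups of 100 each.
z_means = z_test_means(120, 10, 100, 124, 10, 100)
```

SEDM is √2 ≈ 1.41 here, giving Z ≈ 2.83, which is above the cut-off of 2.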

Two Sample Means (<30)
‘T’ Test
Prerequisites
• Random Sample
• Normally Distributed
• Sample Size < 30

Type of ‘T’ Test

as per design
Unpaired / Paired

for inference
One Tail /Two tail


Unpaired ‘T’ Test Design

Population-1 → Sample-1 (S-1) → Mean-1
Population-2 → Sample-2 (S-2) → Mean-2
Mean-1 and Mean-2 are compared with the Unpaired ‘T’ test.

Paired ‘T’ Test Design

Population → Sample → Observations-1 → Intervention → Observations-2
Mean-1 (from Observations-1) and Mean-2 (from Observations-2) are compared with the Paired ‘T’ test.

One Tail ‘T’ Test

[Figure: distribution curve with an Acceptance Zone and a single Rejection Zone]
One Tail – Results are expected only in one direction

Two Tail ‘T’ Test

[Figure: distribution curve with an Acceptance Zone between two Rejection Zones]
Two Tail – Results are expected in both directions

Two Sample Means (<30)
‘T’ Test

\[ T = \frac{\bar{X}_2 - \bar{X}_1}{SEDM} \]

\[ SEDM = \sqrt{\frac{SD_1^2}{N_1} + \frac{SD_2^2}{N_2}} \]

here,
X̄1 – Mean of that event in 1st Sample
X̄2 – Mean of that event in 2nd Sample
SEDM – Standard Error of Difference in Means
SD1 – Standard Deviation of 1st Sample
SD2 – Standard Deviation of 2nd Sample
N1 – Sample Size of 1st Sample
N2 – Sample Size of 2nd Sample

Degree of Freedom (DF) = (N1 – 1) + (N2 – 1) = N1 + N2 – 2

Inference of ‘T’ Test Value
The ‘T’ Test Value is matched at Degree of Freedom (DF) = N1 + N2 – 2 in the Table of ‘T’ at the desired level of significance.

Inferences
If the ‘T’ Test Value is higher than the Table value: Difference in Means is Significant at that desired level of significance.

If the ‘T’ Test Value is lower than the Table value: Difference in Means is Not Significant at that desired level of significance.
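A minimal Python sketch of the unpaired ‘T’ test as laid out in these slides: the test value uses the same standard error as the ‘Z’ test, with DF = N1 + N2 − 2, and is then compared with a value read from the ‘T’ table. Many textbooks pool the two variances instead, so this is a sketch of the slide's version only. The function names and sample figures are hypothetical; 2.101 is the usual two-tail 5% table value for 18 degrees of freedom.

```python
import math


def unpaired_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (T value, degrees of freedom) for two independent small samples."""
    sedm = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t_value = (mean2 - mean1) / sedm
    df = n1 + n2 - 2
    return t_value, df


def is_significant(test_value, table_value):
    """Difference is significant when the test value exceeds the table value."""
    return abs(test_value) > table_value


# Hypothetical example: two groups of 10 with means 50 and 56 and SD 5 each.
t_value, t_df = unpaired_t(50, 5, 10, 56, 5, 10)
t_significant = is_significant(t_value, 2.101)  # table value at DF = 18, 5% level
```

Here the test value is 6 / √5 ≈ 2.68 at 18 degrees of freedom, which is higher than the table value.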

>Two Sample Means

ANALYSIS OF VARIANCE (ANOVA) TEST

Pre-requisites
• Homogenous, normally distributed Random Sample

Indications
• Comparing difference between more than Two Means

>Two Sample Means
‘ANOVA’ Test

\[ \text{ANOVA (Variance Ratio)} = \frac{MSOS_1}{MSOS_2} \]

MSOS1 – Mean Sum Of Squares Between Classes = SOS1 / (K – 1)
MSOS2 – Mean Sum Of Squares Within Classes = (Total SOS – SOS1) / (N – K)

\[ \text{Total SOS} = \sum X^2 - \frac{(\sum X)^2}{N} \]

SOS1 – Sum Of Squares Between Classes

\[ SOS_1 = \frac{(\sum X_a)^2}{N_a} + \frac{(\sum X_b)^2}{N_b} + \frac{(\sum X_c)^2}{N_c} + \dots + \frac{(\sum X_k)^2}{N_k} - \frac{(\sum X)^2}{N} \]

here K = number of classes and N = total number of observations

At Degree of Freedom (DF) = (K – 1) Horizontal, (N – K) Vertical

Inference of ANOVA
Find out the Variance Ratio value at Degree of Freedom
(DF) = (K – 1) Horizontal, (N – K) Vertical
from the Variance Ratio Table
at the desired level of significance.

Inferences
If Test value is > Table value: Difference in Means is Significant at that desired level of significance.

If Test value is < Table value: Difference in Means is Not Significant at that desired level of significance.
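The one-way ANOVA arithmetic can be sketched in Python as below. It assumes the within-classes sum of squares is Total SOS minus the between-classes SOS, divided by N − K to give its mean square. The function name `one_way_anova` and the three small groups are hypothetical; 5.14 is the usual 5% variance-ratio table value at DF (2, 6).

```python
def one_way_anova(groups):
    """Return (variance ratio, (K - 1, N - K)) for a list of groups of values."""
    all_values = [x for group in groups for x in group]
    n = len(all_values)
    k = len(groups)

    correction = sum(all_values) ** 2 / n                # (sum X)^2 / N
    total_sos = sum(x * x for x in all_values) - correction
    between_sos = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    within_sos = total_sos - between_sos

    ms_between = between_sos / (k - 1)
    ms_within = within_sos / (n - k)
    return ms_between / ms_within, (k - 1, n - k)


# Hypothetical data: three groups of three observations each.
variance_ratio, anova_df = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
anova_significant = variance_ratio > 5.14  # table value at DF (2, 6), 5% level
```

For these groups Total SOS is 32, the between-classes SOS is 26 and the within-classes SOS is 6, so the variance ratio is 13 / 1 = 13.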

CORRELATION

Indications

To find out relationship between variables


Type & Degree of Correlation

Positive Correlation (r): Inference
+1: Perfect +ve Correlation
> 0.95: About Perfect +ve Correlation
> 0.75: V. Good Correlation
0.75 – 0.5: Moderate Correlation
0.5 – 0.25: Fair Correlation
0.25 – 0: No Correlation

Negative Correlation (r): Inference
–1: Perfect –ve Correlation
< –0.95: About Perfect –ve Correlation
< –0.75: V. Good Correlation
–0.75 to –0.5: Moderate Correlation
–0.5 to –0.25: Fair Correlation
–0.25 to 0: No Correlation

CORRELATION

Two Variables
  Un-Paired Data: Pearson’s Correlation
  Paired Data: Spearman’s Rank Order Correlation

> Two Variables: Multivariate Correlation

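Pearson’s correlation

\[ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}} = \frac{\sum xy}{\sqrt{\sum x^2 \, \sum y^2}} \]

here x = X – X̄ and y = Y – Ȳ are the deviations from the means.

Direct Method

\[ r = \frac{\sum XY - \frac{\sum X \sum Y}{N}}{\sqrt{\left\{\sum X^2 - \frac{(\sum X)^2}{N}\right\}\left\{\sum Y^2 - \frac{(\sum Y)^2}{N}\right\}}} \]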

Pearson’s correlation (continued)

here,
∑XY = Sum of the products of X and Y
∑X = Sum of all observations of X Series
∑Y = Sum of all observations of Y Series
N = Total no. of observations
∑X² = Sum of Squares of all observations of X Series
∑Y² = Sum of Squares of all observations of Y Series
(∑X)² = Square of Sum of all observations of X Series
(∑Y)² = Square of Sum of all observations of Y Series
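The direct method for Pearson's correlation can be sketched in Python with the sums defined above. The function name `pearson_r` and the paired observations are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient by the direct method."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = sum_xy - sum_x * sum_y / n
    denominator = math.sqrt(
        (sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n)
    )
    return numerator / denominator


# Hypothetical paired observations.
r_value = pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

For these pairs ∑XY = 53, ∑X = ∑Y = 15 and ∑X² = ∑Y² = 55, so r = 8 / 10 = 0.8.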

Spearman’s Rank Order Correlation

\[ r_s = 1 - \frac{6 \sum D^2}{N^3 - N} \]

here D = difference between the two ranks of each pair and N = number of pairs
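A minimal Python sketch of Spearman's rank order correlation. It takes D as the difference between the two ranks of each pair and assumes there are no tied values, since the simple ranking used here does not handle ties. The function name and data are hypothetical.

```python
def spearman_rs(xs, ys):
    """Spearman's rank correlation, assuming no tied values in xs or ys."""

    def ranks(values):
        ordered = sorted(values)
        # Rank 1 for the smallest value; valid only when there are no ties.
        return [ordered.index(v) + 1 for v in values]

    rank_x, rank_y = ranks(xs), ranks(ys)
    n = len(xs)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n ** 3 - n)


# Hypothetical paired observations.
rs_value = spearman_rs([10, 20, 30, 40, 50], [2, 1, 4, 3, 5])
```

The rank differences are 1, 1, 1, 1 and 0, so ∑D² = 4 and rs = 1 − 24/120 = 0.8.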

Significance Test for Correlation (r)

Standard Error (SE) of rs = 1 / √(N – 1)

Inference
• If rs > 2 SE of rs: Correlation is Significant at 5% level
• If rs < 2 SE of rs: Correlation is Not Significant at 5% level

REGRESSION

Indication
To find out causal relationship between variables

REGRESSION COEFFICIENT - It is a measure of the change in the dependent variable (y) with one unit change in the other variable (x)

Regression line with Regression Equation

The regression equation of ‘Y’ on ‘X’ is expressed as follows:
Yc = a + bX
Here, ‘a’ is the intercept & ‘b’ is the slope

Regression Lines

Regression line of Y on X is Y = a + bX ----(1)
Regression line of X on Y is X = a + bY ----(2)

Here- Y = one variable
X = other variable
a = intercept of the regression line
b = slope of the regression line (regression coefficient)
(a and b take different values for each of the two lines)

Regression – Equations
Regression Equation of X on Y

\[ (X - \bar{X}) = r \, \frac{SD_X}{SD_Y} \, (Y - \bar{Y}) \quad (3) \]

Regression Equation of Y on X

\[ (Y - \bar{Y}) = r \, \frac{SD_Y}{SD_X} \, (X - \bar{X}) \quad (4) \]

here SD_X = SD of series X and SD_Y = SD of series Y

Regression – coefficients
Regression Coefficient of X on Y

\[ b_{xy} = r \, \frac{SD_X}{SD_Y} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (Y - \bar{Y})^2} \]

Regression Coefficient of Y on X

\[ b_{yx} = r \, \frac{SD_Y}{SD_X} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} \]

Relation of correlation and Regression

\[ r = \sqrt{b_{xy} \, b_{yx}} \]

(r takes the sign of the regression coefficients)
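The regression coefficients, the intercept of the line of Y on X, and the link back to the correlation coefficient can be sketched in Python as follows. The sketch uses the standard least-squares definitions (b of Y on X divides by the sum of squared deviations of X, b of X on Y by that of Y) and takes the intercept as a = Ȳ − bX̄, which follows from equation (4). The function name and data are hypothetical.

```python
import math


def regression_summary(xs, ys):
    """Return intercept and slope of Y on X, slope of X on Y, and r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)

    b_yx = sxy / sxx                 # regression coefficient of Y on X
    b_xy = sxy / syy                 # regression coefficient of X on Y
    a = mean_y - b_yx * mean_x       # intercept of the line Y = a + bX
    # r is the square root of the product, carrying the sign of the slopes.
    r = math.copysign(math.sqrt(b_yx * b_xy), sxy)
    return {"a": a, "b_yx": b_yx, "b_xy": b_xy, "r": r}


# Hypothetical paired observations.
fit = regression_summary([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

For these pairs both coefficients are 0.8, the line of Y on X is Y = 0.6 + 0.8X, and r = √(0.8 × 0.8) = 0.8.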

Between
Tests/Procedure/Therapy
For comparison with Gold Standard:
Sensitivity
Specificity
PPV (positive predictive value)
NPV (negative predictive value)
ROC (receiver operating characteristic curve)

For agreement of association: Kappa
For appropriate cut-off value for diagnostic test: ROC

Sensitivity and Specificity

Observation in new test against status based on gold standard test:

                      Diseased              Normal
Test positive         True positive (a)     False positive (b)
Test negative         False negative (c)    True negative (d)

Sensitivity = a / (a+c)
Specificity = d / (b+d)
PPV = a / (a+b)
NPV = d / (c+d)
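The four measures follow directly from the cells a, b, c and d of the 2 × 2 table. A minimal Python sketch, with a hypothetical function name and hypothetical counts:

```python
def diagnostic_measures(a, b, c, d):
    """Sensitivity, specificity, PPV and NPV from a 2 x 2 table.

    a = true positive, b = false positive,
    c = false negative, d = true negative.
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
    }


# Hypothetical counts: 100 diseased and 200 normal subjects.
measures = diagnostic_measures(a=80, b=30, c=20, d=170)
```

Here the sensitivity is 80/100 = 0.80 and the specificity is 170/200 = 0.85.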

Kappa Statistics
(Measurement of Agreement)
Test Value Inference
0.93 – 1 Excellent Agreement
0.81 – 0.92 Very Good Agreement
0.61 – 0.80 Good Agreement
0.41 – 0.60 Fair Agreement
0.21 – 0.40 Slight Agreement
0.01 – 0.20 Poor Agreement
< 0.01 No Agreement
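The slides give only the interpretation bands for Kappa, not its formula. The sketch below assumes the standard Cohen's kappa for two raters or tests on a 2 × 2 table, kappa = (observed agreement − expected agreement) / (1 − expected agreement), and then applies the bands from the table above. Function names and counts are hypothetical.

```python
def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 2 x 2 agreement table (assumed formula).

    a = both positive, d = both negative, b and c = disagreements.
    """
    n = a + b + c + d
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1 - expected)


def interpret_kappa(kappa):
    """Map a kappa value onto the agreement bands listed on the slide."""
    bands = [
        (0.93, "Excellent Agreement"),
        (0.81, "Very Good Agreement"),
        (0.61, "Good Agreement"),
        (0.41, "Fair Agreement"),
        (0.21, "Slight Agreement"),
        (0.01, "Poor Agreement"),
    ]
    for lower_limit, label in bands:
        if kappa >= lower_limit:
            return label
    return "No Agreement"


# Hypothetical agreement table for two tests on 100 subjects.
kappa_value = cohen_kappa(a=45, b=5, c=10, d=40)
kappa_label = interpret_kappa(kappa_value)
```

Observed agreement is 0.85 and expected agreement is 0.50, so kappa ≈ 0.70, which falls in the 0.61 – 0.80 band.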

Non-Parametric Tests
Advantages
Distribution free
Easier to do
Easier to understand/infer

Disadvantages
They ignore a certain amount of information
Indicated only for ordinal or nominal data
Statistically less efficient
Indicated only to test hypotheses, not for estimates

Parametric Test Vs Non-Parametric

Test Quality              Parametric             Non-Parametric
Assumed Distribution      Normal                 Any
Assumed Variance          Homogenous             Any
Data Type                 Interval-Continuous    Nominal / Ordinal
Data set Relationship     Independent            Any
Usual Centre Measure      Mean                   Median
Advantages                More conclusions       Easier to calculate
                          More efficient         Less affected by outliers

Parametric Test Vs Non-Parametric

Test: Parametric / Non-Parametric
Correlation test: Pearson / Spearman
Independent measures, 2 groups: Independent-measures t-test / Mann-Whitney test
Independent measures, >2 groups: One-way, independent-measures ANOVA / Kruskal-Wallis test
Repeated measures, 2 conditions: Matched-pair t-test / Wilcoxon test
Repeated measures, >2 conditions: One-way, repeated measures ANOVA / Friedman's test

Sign Test (K Test) – nonparametric test for quantitative paired data

Sign test

• Simplest
• Based on direction (–/+/0)
• Signs as per the direction are counted

• Inference – if S ≤ K, Null hypothesis (H₀) is rejected
• Here ‘S’ is the net sum of signs as per sign
• ‘K’ is a constant

Sign test – Steps
Sign K Test for Small Sample (<30)
– Find out the net sum of signs as per sign (S)
– S = (total + signs) – (total – signs)
– K = (n-1)/2 - 0.98√n
• Inference – if S ≤ K, Null hypothesis (H₀) is rejected

Sign Z Test for Large Sample (>30)
– Find out the number of signs (X), leaving out ties; here X = no. of + signs
– Z = (X – np) / √(np(1–p))
• Inference – if Z > 2, Null hypothesis is rejected
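The large-sample version of the sign test can be sketched in Python as below. The sketch assumes X is the number of + signs, n is the number of non-tied pairs (ties are left out), and p = 0.5 under the null hypothesis. These readings are assumptions, as the slide does not define n and p explicitly. The function name and counts are hypothetical.

```python
import math


def sign_test_z(n_plus, n_minus, p=0.5):
    """Z value for the large-sample sign test; ties are assumed to be excluded."""
    n = n_plus + n_minus          # number of non-tied pairs
    x = n_plus                    # X = number of + signs
    return (x - n * p) / math.sqrt(n * p * (1 - p))


# Hypothetical example: 30 plus signs and 10 minus signs.
z_sign = sign_test_z(30, 10)
reject_null = abs(z_sign) > 2
```

With 40 non-tied pairs, np = 20 and √(np(1 − p)) = √10, so Z ≈ 3.16 and the null hypothesis is rejected.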


Step-7

Inferences


Steps in Statistical Inference

Generating NULL and ALTERNATIVE
hypothesis
Testing the hypothesis using appropriate
statistical tests
Obtaining ‘p’ value
Concluding from the ‘p’ value.
Obtaining Level of Significance
Comparing ‘p’ value with CI.

‘P’ Value and Inferences
with Normal Curve


Mean ± 1SD = 68% values - Confidence Limit 68% - P Value ≥ 0.05 - NS
Mean ± 2SD = 95% values - Confidence Limit 95% - P Value < 0.05 - S
Mean ± 3SD = 99% values - Confidence Limit 99% - P Value < 0.001 - HS

(NS = Not Significant, S = Significant, HS = Highly Significant)

Conventionally Accepted
Significance Level

• P Value > 0.05: LS = Not Significant

• P Value < 0.05: LS = Significant

• P Value < 0.001: LS = Highly Significant

(LS = Level of Significance)
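These conventional cut-offs can be written as a small Python helper. The function name is hypothetical, and a P value of exactly 0.05 is treated as not significant, in line with the "≥ 0.05" reading on the normal-curve slide.

```python
def significance_label(p_value):
    """Label a P value using the conventional cut-offs of 0.05 and 0.001."""
    if p_value < 0.001:
        return "Highly Significant"
    if p_value < 0.05:
        return "Significant"
    return "Not Significant"


# Hypothetical P values from three different tests.
labels = [significance_label(p) for p in (0.2, 0.03, 0.0004)]
```

The three hypothetical values fall into the not significant, significant and highly significant bands respectively.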

Step-8

Reporting


Steps of Report Writing

Title of Project
Abstract
Introduction
Aims & Objectives
Methodology
Observations-Compilation, Classification &
Presentation of data with analysis and inferences
Discussion
Conclusions
Recommendations
Limitations
Acknowledgment
Bibliography


Discussion

Explanation of findings
Logic and reasoning for the results as they appear
Compare and contrast with findings of other
researchers
Based on objectives of the study
Should answer the research question
Scope & limitations of the study


Recommendations & conclusions

• Based on our findings
• Limited to objectives of the study
• Policy implications
• Relevance should be emphasized
• Should be exclusively limited to
observations


Managerial and financial aspects

Protocol development
Time line/Gantt chart
Peer review
Development of tools
Training in data collection
Budget/ financial accounting
Quality control
Monitoring & Evaluation


Time Line / Gantt chart / Log Frame

[Figure: Gantt chart of the activities below against the periods 1.1.12-15.1.12, 16.1-31.1, 1.2.12-15.2.12, 1.3.12-15.5.12, 16.5.12-15.6.12, 16.6.12-15.7.12 and 16.7.12-31.7.12]

Activities
Planning
Officials
Que. Dev
Training
Pilot Survey
Corrections
Re-training
Resource Proc
Survey
Analysis
Report Writing
Dissemination of Report

Computer in Statistics


Web sites related to Statistics

• http://stattrek.com
• http://vassarstat.net
• http://www.scribd.com
• http://www.statistixl.com
• http://statistics calculators.com
• http://stat.ubc.ca/~rollin/stats/ssize/
• ………………………………………………………
……


Computer Software in Statistics

• Microsoft Excel
• SPSS
• Epi Info
• Epi tab
• Minitab
• GraphPad
• Primer
• MedCalc
• ……………..

Always there is room for improvement

