Introduction to
Research Methodology
For Basic Research Workshop
Prof. Dr. Jamalludin Ab Rahman B.Med.Sc., MD, MPH
Department of Community Medicine
Kulliyyah of Medicine
International Islamic University Malaysia
The programme
Day 1
 Getting the research idea
 Literature review & managing them (using EndNote)
 Building conceptual framework
 Phrasing objective
 Determine best study design
 Sampling plan & sample size
 Plan for data collection – preparing data dictionary
9April2017Researchmethodologyfornurses
2
The programme….
Day 2
 Plan for statistical analysis – preparing dummy table
 Basic statistical analysis
 Descriptive statistics – determine Normality, mean (SD),
median (IQR), count (%)
 Analytical statistics – compare means, compare
proportion, correlation
9April2017Researchmethodologyfornurses
3
Contents
 Why we do research?
 How we do research?
 Effective literature search & management
 Choosing best study design
 Planning for data collection
 Planning for statistical test
 How to write for publication
9April2017Researchmethodologyfornurses
4
Why we do research?
 Answers our curiosity?
 Forced by our superior
 For promotion
9April2017Researchmethodologyfornurses
5
Researchmethodologyfornurses
6
Start with end in mind
(Bijaksana)
9April2017
Researchmethodologyfornurses
7
Idea
Justify Explore Strategize
Collect
Analyse Report Publish
Initiation
Planning &
strategize (proposal)
Data collection
Reporting & sharing
Study design
Sampling plan
Sample size determination
Plan for data collection
Plan for statistical analysis
9April2017
The 4 Critical Strategies
1. Clear conceptual framework
2. SMART objectives
3. Detail data dictionary
4. Expected dummy table
Researchmethodologyfornurses
8
9April2017
Organise your research idea
Jamalludin Ab Rahman MD MPH
9April2017Researchmethodologyfornurses
9
Identify variables involved
 Main outcomes (dependant)
 Explanatory (exposures, factors)
variables (independent)
 Confounding variables
9April2017Researchmethodologyfornurses
10
9April2017Researchmethodologyfornurses
11
Exposure Outcome
Confounder
9April2017Researchmethodologyfornurses
12
9April2017Researchmethodologyfornurses
13
Alcohol-
based
mouth wash
Oral Cancer
Smoking
9April2017Researchmethodologyfornurses
14
9April2017Researchmethodologyfornurses
15
Varkevisser, C. M.,
Pathmanathan, I.,
& Brownlee, A. T.
(2003). Designing
and conducting
health systems
research projects:
Kit Publishers.
Figure 3 Structural Equation Model Showing the Relationship Between
Family Processes, Child Characteristics, and Achievement for Girls
Aged 6 to 11 years
9April2017Researchmethodologyfornurses
16
Thornley S (2012)
Causation and
Statistical Prediction:
Perfect Strangers or
Bedfellows?
J Biom Biostat 3:e115.
doi:10.4172/2155-
6180.1000e115
9April2017Researchmethodologyfornurses
17
Literature Review and
Bibliographic Management
Jamalludin Ab Rahman MD MPH
9April2017Researchmethodologyfornurses
18
9April2017Researchmethodologyfornurses
19
Research add
more information
to the circle of
knowledge
9April2017Researchmethodologyfornurses
20
Research
Information
New
knowledge
Literature
9April2017Researchmethodologyfornurses
21
Literature review is not
 A chronological catalogue of all
material,
 A collection of quotes or
paraphrases, or
 A sales pitch selling only the good
side of a study.
9April2017Researchmethodologyfornurses
22
Literature review is
 Organised
 Structured
 Evaluated (critically appraised)
research materials
9April2017Researchmethodologyfornurses
23
The steps – 5S
1. Strategise (plan) Most important step
 Get your topic, problem statement & conceptual framework
ready (& in-sync)
 Identify components, variables, keywords & authors
2. Search – by keywords, domains of interest
3. Screen – title & abstract
4. Sort – organise & prioritise materials
5. Summarise – evaluate & re-organise
9April2017Researchmethodologyfornurses
24
9April2017Researchmethodologyfornurses
25
Strategise
Search
Sort
Summarise
• Based on problem
statement &
conceptual
framework
• Identify causal
relationship,
variables,
domains, authors
• Organise into the
pre-planned
structure or
domain of interest
• Online – Google,
Google scholar,
individual journal’s
page, unpublished
materials
• Use electronic
bibliographic apps
e.g. EndNote,
Mendeley, Papers
• Credibility of articles
(journal & authors)
• Identify study
population, study
design,
measurement
method used
• Check with the
problem statement
• Annotate (e.g. by
using EndNote)
• Re-organise the
information if
needed
Screen
• Screen the title,
abstract
9April2017Researchmethodologyfornurses
26
Strategise
Search
Sort
Summarise
• What is OSA?
Diagnosis
• Prevalence of OSA
• Why study? HPT is
prevalent. OSA is
under-diagnosed &
could be one of the
reason
• HPT OSA
• Possible
confounders – Sex,
BMI
• Based on Headings:
Introduction, Method,
Discussion
• Based on domain:
Epidemiology of
OSA, Causal
relationships (with
HPT, BMI, Sex)
• Based on point FOR
& AGAINST
hypothesis
• Subject matters:
Obstructive Sleep
Apnea (& Apnoea),
Sleep Disordered
Breathing, Malaysia
(local relevance),
Hypertension, PSG
vs. screening tool
• Designs: Cohort,
Case-control
• Confounders
• Credibility of articles
(journal & authors)
• Identify study
population, study
design,
measurement
method used
• Check with the
problem statement
• Annotate (e.g. by
using EndNote)
• Re-organise the
information if
needed
Screen
• Screen the title,
abstract
Title: Relationship of OSA & HPT
Design: Case-control (HPT vs. No-HPT)
HPT OSA Male
Obese
9April2017Researchmethodologyfornurses
27
The number of articles found
Alternative spelling
How they were sorted
Time period
Observe the year,
author & journal
Full text
Number of citations by Google
You can cite (import)
into EndNote
The same doc maybe
available at multiple
places
9April2017Researchmethodologyfornurses
30
Domains of interest. Can assign
many domains for one article
Customise the header
Include full text or any
relevant documents
Phrasing objectives
Jamalludin Ab Rahman MD MPH
9April2017Researchmethodologyfornurses
32
SMART
 Specific – What you really want to do
 Measurable – Must be able to be measured
 Attainable – Achievable
 Relevant – Relevant to you
 Timely – Can be done within the time you have
9April2017Researchmethodologyfornurses
33
e.g.
 To determine the prevalence of breastfeeding among
mothers
Can rephrase to:
 To measure the prevalence of exclusive breastfeeding
among working mothers attending post-natal clinic in Pahang
 To measure the prevalence of exclusive breastfeeding and
factors related with it among working mothers attending post-
natal clinic in Pahang
9April2017Researchmethodologyfornurses
34
Choosing Best
Research Designs
Jamalludin Ab Rahman MD MPH
9April2017Researchmethodologyfornurses
35
Research design
Design
Observational
Cross
sectional
Case-control
Cohort
Experimental
Animal Trial
Clinical Trial
Community
Trial
9April2017Researchmethodologyfornurses
36
Measure exposure &
outcome & the same time
Fix the outcome.
Measure the exposure
Fix the exposure.
Measure the outcome
9April2017Researchmethodologyfornurses
37
Cause & Effect
EffectCause
Cause precedes effect
Time
Exposure Outcome
Smoking Lung cancer
High fat diet Obesity
New drug Better prognosis
9April2017Researchmethodologyfornurses
38
Cross sectional study
EffectCause
Exposure Outcome
Smoking Lung cancer
High fat diet Obesity
New drug Better prognosis
Observe BOTH Cause & Effect at the same time – No temporal association
9April2017Researchmethodologyfornurses
39
Case-Control study
EffectCause
Exposure Outcome
Smoking Lung cancer
High fat diet Obesity
New drug Better prognosis
Observe the Cause (PAST events)
9April2017Researchmethodologyfornurses
40
Cohort study
EffectCause
Exposure Outcome
Smoking Lung cancer
High fat diet Obesity
New drug Better prognosis
Observe the effect (NEW event)
9April2017Researchmethodologyfornurses
41
Experimental study
Effect
Cause
(New
drug)
Effect
Other
cause
(Control)
9April2017Researchmethodologyfornurses
42
Experimental study
Effect
New
drug
Effect
Control
No effect
No effect
Randomisation
Eligible
population
Sampling
Population
Hierarchy of
Evidence
Randomised
controlled trial
Cohort
Case control
Case report
Expert opinion
9April2017Researchmethodologyfornurses
43
Meta-analysis
Systematic review
Cross sectional study
 Population based – represent population
 Measure exposure & outcomes at the same point in time –
No temporal association
 Impossible to infer causality
 Prevalence study – measure magnitude or burden of disease
 Descriptive
 Repeated cross-sectional study ~ pseudo-longitudinal e.g.
British Association for the Study of Community Dentistry (BASCD) guidance on sampling for
surveys of child dental health. A BASCD coordinated dental epidemiology programme quality
standard (Pine et al. 1997)
9April2017Researchmethodologyfornurses
44
Cross sectional study
Advantages
 Measure prevalence of a
population
 Measure multiple exposures &
outcomes
 Relatively inexpensive
 Relative shorter time
Disadvantages
 No temporal association – no
inference to causality
 Prevalence-incidence bias (Nyman
bias) e.g. if smokers die due to
AMI faster, a cross-sectional study
will reveal less smoker among AMI
patients
 Health workers effect e.g. when
survey done from house to house,
only health respondent are
available in their home/office
9April2017Researchmethodologyfornurses
45
Cross sectional study - Example
 NHMS ~ Household study, all Malaysian (N=47,610 for
2006)
 NOHSA ~ Adult (>15) (N=14,444 for 2010)
 NMCS ~ GP vs. PHC (N~12,000)
 NHANES (US) - http://www.cdc.gov/nchs/nhanes.htm
9April2017Researchmethodologyfornurses
46
Case control study
 Fix the outcomes, measure the exposures
 Longitudinal
 Retrospective
 Case = outcome of interest
 Control = comparing outcome
9April2017Researchmethodologyfornurses
47
Case control study
9April2017
Research methodology for nurses 48
Case
Control
No exposure
Exposure
No exposure
Exposure
History Now
Lung cancer
patients
Healthy
patients
Smoking
Non smoking
Smoking
Non smoking
Lung Cancer
No Lung
Cancer
Smoking 20 (18.2%) 90 (81.8%)
Not Smoking 5 (4.5%) 105 (95.5%)
2 (df=1)= 10.150, p =0.001, OR = 4.7 (CI95% 1.7 – 13.0)
Because p < 0.05, we reject H0. Therefore there
is a different between smoker & non smoker
9April2017Researchmethodologyfornurses
49
Cases
 Well defined
 Source – institution
vs. population
Control
 Matched vs. Unmatched
 Matching ~ controls
resemble the cases with
regard to certain characteristics
(age, gender, SES etc)
 Individual vs. Group
matching
 Source – institution vs.
population
 Ratio to cases ~ up to 4:1
9April2017
Research methodology for nurses 50
Case control study
Advantages
 Good for rare conditions or
diseases
 Less time needed to conduct the
study because the condition or
disease has already occurred
 Measure multiple risk factors
 Can establish an association
Disadvantages
 Recall bias
 Not good for evaluating
diagnostic tests because it’s
already clear that the cases have
the condition and the controls do
not
 It can be difficult to find a suitable
control group
9April2017Researchmethodologyfornurses
51
Example
 Tobacco use and risk of
myocardial infarction in
52 countries in the
INTERHEART study: a
case-control study. (Teo
2006)
9April2017Researchmethodologyfornurses
52
Cohort study
 Measure outcomes
 Compare incidence of a disease (or condition) among
exposed and unexposed individuals over time
 Disease free at the onset (or inception)
 Repeated measurements ~ follow up
 Prospective vs. retrospective cohort
9April2017Researchmethodologyfornurses
53
Cohort study
9April2017Researchmethodologyfornurses
54
Disease
No disease
No exposure
Exposure
Now Future
Lung cancer
No lung cancer
Smoking
Non smoking
Disease
No disease
Lung cancer
No lung cancer
Define cohort
 Both exposed & not exposed groups have equal chance
to:
 Develop disease
 Be followed-up
 Types:
 Representative – low exposed subjects
 Enriched – high exposed subjects
 Specific group – occupational, institution etc
9April2017Researchmethodologyfornurses
55
Measurements
 Exposure
 Carefully defined in advance
 Standard measurement for both E+ & E- groups
 Outcome
 Primary vs. Secondary outcome
9April2017Researchmethodologyfornurses
56
Follow-up
 Keep participation at > 90%
 Equal likelihood to detect disease in all subjects
 Active vs. Passive follow-up
 Blinding
9April2017Researchmethodologyfornurses
57
Example Obesity as an Independent Risk Factor for
Cardiovascular Disease: A 26-year Follow-up of
Participants in the Framingham Heart Study
(Hubert 1983)
9April2017Researchmethodologyfornurses
58
Cohort study
Advantages
 Infer causality
 Measure multiple outcomes
 Study rare exposure
 Measure incidence
Disadvantages
 Costly
 Loss to follow up
 Large sample size for rare
outcomes
 Selection bias
9April2017Researchmethodologyfornurses
59
9April2017Researchmethodologyfornurses
60
9April2017Researchmethodologyfornurses
61
Sampling Plan
(Sampling technique & sample size)
Jamalludin Ab Rahman MD MPH
9April2017Researchmethodologyfornurses
62
Before we sample
Determine study place, duration & subjects
 Describe study place
- especially if plan to represent a population
 State time & duration
 Who or what are the subjects
- population, people, animal etc.
9April2017Researchmethodologyfornurses
63
Subjects
 Target population
 Study population
 Sampling frame
 Sampling unit
 Observation unit
9April2017Researchmethodologyfornurses
64
Example – NHMS III 2006
 Target population
 Study population
 Sampling frame
 Sampling unit
 Observation unit
9April2017Researchmethodologyfornurses
65
All Malaysian
Household up to strata 6
List of Enumeration Block & Living Quarters
Enumeration Block & Living Quarters
All household in the selected Living Quarters
9April2017Researchmethodologyfornurses
66
Random Non-Random
Represent a population
Must have sampling frame
• Simple
• Systematic
• Stratified
• Cluster
Still valid if justified
• Convenience
• Purposive
• Snowball
• Volunteer
Simple
9April2017Researchmethodologyfornurses
67
Systematic
9April2017Researchmethodologyfornurses
68
1 2 43
A
B
C
D
E
Cluster
9April2017Researchmethodologyfornurses
69
Stratified
9April2017Researchmethodologyfornurses
70
Random sampling
Random
Simple
Systematic
Stratified
Cluster
Multistage
9April2017Researchmethodologyfornurses
71
The ideal method. Randomly sample 10 students from
a class of 50 students.
Follow certain pattern, order. Only first sample is
random.
Study population divided intro strata. All strata
selected. Portion of sample in each strata sampled.
Study population divided intro clusters. Assume all
clusters are the same. Not all clusters selected. Only
some will be sampled.
Several sampling techniques applied at different stage.
Non random sampling
Non random
Convenience
Purposive
Quota
Snowball
9April2017Researchmethodologyfornurses
72
Haphazard. Grab sampling. No rules. Just get the
sample.
Non representative. Convenience but with certain
purpose. E.g. diabetic patients on single insulin
therapy.
When the sampling stop after achieving certain
size.
Sample by reference or recommendation.
Example – NHMS III 2006
Two stage stratified random sampling
 Target population
 Study population
 Strata
 Clusters
 Sampling frame
 Sampling unit
 Observation unit
 Sample distribution
9April2017Researchmethodologyfornurses
73
All Malaysian
Household up to strata 6
List of Enumeration Block & Living Quarters
Enumeration Block & Living Quarters
All household in the selected Living Quarters
State & location (urban or rural)
Enumeration Block & Living Quarters
Proportionate to size
Example – A clinical trial
 Target population All diabetic patients
 Study population Diabetic for at least 1 year on
insulin therapy who attended
MOPD from Jan-Dec 2014
 Sampling frame N/A
 Sampling unit Any diabetic who fulfilled the
selection criteria
 Observation unit Same as SU
9April2017Researchmethodologyfornurses
74
Why we calculate sample size?
 Estimate sample required for external validity (inference)
& logistic preparation
 Justify sample size for research to:
 represent population
 measure treatment effect (detect difference)
 We don’t do research haphazardly
9April2017Researchmethodologyfornurses
75
Sample size is an estimate
 Best sample size for specific purpose (for the main
objective)
 Never an exact value, always an estimate
 Calculated for certain expected estimates (outcomes) at
certain degree of precision
 Expected values - estimated from previous studies or
from intelligent ‘guess’
9April2017Researchmethodologyfornurses
76
Plan with the end in mind
 No one formula fits all
 Depending on objective:
1. Represent population
2. Hypothesis testing
 Adjust for design effect (based on type of sampling),
confidence level, alpha error, power, stratification and
anticipated response rate
9April2017Researchmethodologyfornurses
77
Pre-requisites
 Objective determined
 Sampling method known
 Estimate the outcome
 Precision required
 Statistical test used is known
 Set the power (b) and confidence level (a)
 Anticipate non-response
9April2017Researchmethodologyfornurses
78
Represent population
 Represent population e.g. state, district, village,
institution etc
 Usually for prevalence study e.g. to measure prevalence
of hypertension in Malaysia; or mean DMF among adult
in Pahang
9April2017Researchmethodologyfornurses
79
Single proportion
 𝑛 =
𝑧 𝛼/2
2 𝑝(1−𝑝)
𝑑2
 Where z = value from standard normal distribution
corresponding to desired confidence level, a (𝑧 𝛼/2=1.96 for
95%CI)
 p is expected true proportion
 d is desired precision
 For small populations n can be adjusted so that 𝑛 𝑐 =
𝑁𝑛
𝑁+𝑛
9April2017Researchmethodologyfornurses
80
9April2017Researchmethodologyfornurses
81
95% CI is a = 0.05
a/2 = 0.025
𝑧 𝛼/2 =1.96
Power 80% or 0.8
1-b = 1-0.8 = 0.2
b = 0.2
𝑧 𝛽 =0.84
Example
 Study to measure prevalence of hypertension in a village
of 1000 people
 Expected prevalence = 40%, Precision = 5%, CI=95%,
expected non-response 20%
 𝑛 =
1.962 ∗0.4(1−0.4)
0.052 = 369 ≅ 370
 Anticipating non response, 𝑛 = 450
9April2017Researchmethodologyfornurses
82
Single mean
 Study to measure one sample mean
 𝑛 = (
𝑧 𝛼/2∗𝑠
𝑑
)2, where s = expected standard deviation
(SD)
 We use smallest d to get largest n possible
9April2017Researchmethodologyfornurses
83
Example
 Study to measure average DMF among adults in a
village of 2000 people
 Estimated DMF = 11 (SD 10) with the precision of 2 at
95%CI
 𝑛 = (
1.96∗10
2
)2
= 96.04 ≅ 100
9April2017Researchmethodologyfornurses
84
Hypothesis testing
 How one variable is different from the other
 E.g. different % of hypertension between male & female
9April2017Researchmethodologyfornurses
85
Compare two proportions
 𝑛 =
(𝑧 𝛼
2
+𝑧 𝛽)2∗(𝑝1 1−𝑝1 +𝑝2 1−𝑝2 )
(𝑝1−𝑝2)2
 Total sample size = 2*n
 Where p1 and p2 are the expected sample proportions of
the two groups; za/2 is the critical value of the Normal
distribution at a/2 and zβ is the critical value of the Normal
distribution at β (e.g. for a power of 80%, β is 0.2 and the
critical value is 0.84)
9April2017Researchmethodologyfornurses
86
Example
 A study to compare prevalence of diabetes mellitus
between male and female. Expected prevalence are
30% vs. 40% respectively. Power 80% at 95%
confidence level.
 𝑛 = (1.96 + 0.84)2
∗
0.3 1−0.3 +0.4 1−0.4
0.3−0.4 2 = 7.9 ∗ 45=355.5
≅ 360
 Total sample size = 360 * 2 = 720
9April2017Researchmethodologyfornurses
87
Compare two means
 𝑛 =
2∗ 𝑧 𝛼
2
+𝑧 𝛽
2∗𝑠2
𝑑2
 Where d=smallest means difference and s = standard
deviation
 Total sample size = 2*n
9April2017Researchmethodologyfornurses
88
Example
 A study to compare means of HbA1c between diabetic
treated with new drug versus the standard drug (control).
Expected difference of 10.5% vs. 11.5% with SD
estimated of 5%
 𝑛 =
2∗52
12 ∗ 1.96 + 0.84 2
= 395
 Total sample size = 790
9April2017Researchmethodologyfornurses
89
Using applications
 Software or online calculators
 Make sure the formula is valid
 Suggestion:
 PS: Power and Sample Size Calculation
http://biostat.mc.vanderbilt.edu/wiki/Main/PowerSampleSize
 Epi Info http://wwwn.cdc.gov/epiinfo/7/
 OpenEpi
http://www.openepi.com/oe2.3/menu/openepimenu.htm
9April2017Researchmethodologyfornurses
90
Epi Info – Population survey
9April2017Researchmethodologyfornurses
91
Epi Info – Cohort & Cross-sectional
9April2017Researchmethodologyfornurses
92
Open Epi –
Compare two
means
9April2017Researchmethodologyfornurses
93
PS - Dichotomous
9April2017Researchmethodologyfornurses
94
PS – Compare
two means
9April2017Researchmethodologyfornurses
95
Planning for Data Collection
Jamalludin Ab Rahman MD MPH
9April2017Researchmethodologyfornurses
96
You wanna collect now?
1. Objectives are clear
2. Variables identified & defined
3. Samples identified
4. Instrument valid & reliable
5. Instrument tested
6. Simulation done
Only then, you go and collect the data
9April2017Researchmethodologyfornurses
97
9April2017Researchmethodologyfornurses
98
9April2017Researchmethodologyfornurses
99
Variables identified & defined
Conceptual
framework
Identify variables
Define variables
How to define variable?
 Definition
 Working/operational definition
 How it will be measured
 Precision of measurement
 How it will be reported
9April2017Researchmethodologyfornurses
100
Furnish the info for each variable
e.g. Obesity
 Definition
9April2017Researchmethodologyfornurses
101
Overweight and obesity are
defined as abnormal or
excessive fat accumulation that
presents a risk to health. (WHO
1998).
Furnish the info for each variable
e.g. Obesity
 Definition
 Working definition
9April2017Researchmethodologyfornurses
102
A crude population measure of obesity is the
body mass index (BMI), a person's weight (in
kilograms) divided by the square of his or her
height (in metres). In this study, overweight
and obesity are grouped together and
labelled as Obesity. Any BMI 25 kg/m2 or
above is considered obese (WHO 1998).
Furnish the info for each variable
e.g. Obesity
 Definition
 Working definition
 How it will be measured
 Precision of measurement
9April2017Researchmethodologyfornurses
103
Weight is measured using
calibrated Seca 762 with precision
to the nearest of 1 kg
Height using Seca 206 with
precision to the nearest 1 mm.
9April2017Researchmethodologyfornurses
104
Make sure the instrument are valid & reliable
9April2017Researchmethodologyfornurses
105
9April2017Researchmethodologyfornurses
106
NATIONAL HEALTH AND
NUTRITION EXAMINATION
SURVEY III. Body
Measurements
(Anthropometry) 1998
Furnish the info for each variable
e.g. Obesity
 Definition
 Working definition
 How it will be measured
 Precision of measurement
 How it will be reported
9April2017Researchmethodologyfornurses
107
Obesity will be described in
count (N) and proportion (%).
9April2017Researchmethodologyfornurses
108
9April2017Researchmethodologyfornurses
109
9April2017Researchmethodologyfornurses
110
9April2017Researchmethodologyfornurses
111
9April2017Researchmethodologyfornurses
112
Then only you start to design
the data collection form
Tools to create form
 Usual word processing (e.g. Word), spreadsheet (e.g.
Excel)
 Better with Database (Microsoft Access)
 Google Form
 EpiInfo Make Questionnaire
9April2017Researchmethodologyfornurses
113
9April2017Researchmethodologyfornurses
114
9April2017Researchmethodologyfornurses
115
9April2017Researchmethodologyfornurses
116
9April2017Researchmethodologyfornurses
117
• Test the form
• Do complete testing
• Revise the form
• Run some analyses
Plan for Statistical Analysis
Jamalludin Ab Rahman MD MPH
9April2017Researchmethodologyfornurses
118
9April2017Researchmethodologyfornurses
119
Research idea
General objectives
Specific Objective Specific Objective Specific Objective
Analysis Analysis Analysis
9April2017Researchmethodologyfornurses
120
Type of statistical analysis
AnalyticalDescriptive
e.g. Describe socio-
demographic characteristics -
Age, Sex, Race etc.
e.g. Prevalence of
hypertension.
e.g. Compare demographic
characteristics between two
population – Compare age
between male & female
e.g. Distribution of gender by
hypertension status
e.g. How demographic
characteristics (more than one
factors) explain hypertension
Univariable Bivariable Multivariable
IV DV IV DV IV DV IV
Steps for the plan
1. Focus on each specific objective
2. Decide whether it will be descriptive or analytical
3. For descriptive statistics: Mean (SD) or Median (IQR) or
N(%)
4. For analytical (present of hypothesis); state the Ho
5. Select suitable statistical test – usually run Univariate
first, then Multivariate
6. Present this in a dummy table
9April2017Researchmethodologyfornurses
121
Dummy table - example
9April2017Researchmethodologyfornurses
122
Data quality
 Valid value
e.g. age > 200 years, weight > 500 kg, pregnant male
etc
 No missing value
 Relevant skip response
e.g. Not Applicable response for number of pregnancy
for male respondent
 Declare method to ensure good data quality – e.g.
double data entry
9April2017Researchmethodologyfornurses
123
How to report the results
 Start with descriptive statistics – usually for baseline
findings e.g. description of study population – socio-
demographic or basic clinical characteristics
 Proceed with hypothesis testing (if any) – e.g. compare
between groups/treatments
 State statistical tests used
 State data transformation done (if any)
 Declare missing value & how they were managed
9April2017Researchmethodologyfornurses
124
9April2017Researchmethodologyfornurses
125
9April2017Researchmethodologyfornurses
126
9April2017Researchmethodologyfornurses
127
Type of data
(Level of measurement)
Categorical
Nominal Ordinal
Numerical
Discrete Continuous
e.g. Gender, Race e.g. Cancer
staging, Severity
of CXR for PTB
e.g. Parity,
Gravida
e.g. Hb, RBS,
cholesterol.
Data
Categorical
Frequency
(count) &
Percentage
Numerical
Normal Mean (SD)
Not Normal
Median
(Range/IQR)
How to describe a data
9April2017Researchmethodologyfornurses
128
Bivariable analysis
 Compare numericals – compare means, compare ranks,
correlation
 Compare proportions – chi-square
9April2017Researchmethodologyfornurses
129
Bivariable analysis
Dependent variable
(outcome)
Numerical Categorical
Independent
variable
(factor)
Numerical Correlation
Independent sample t-test/
Mann-Whitney U
Categorical
Independent sample t-test/
Mann-Whitney U
One-way ANOVA/Kruskall
Wallis
Chi-square
9April2017Researchmethodologyfornurses
130
Bivariable analysis - detail
9April2017Researchmethodologyfornurses
131
Variable 1 Variable 2 Test
Categorical Categorical Chi-square
Categorical (2 pop) Numerical (Normal) Independent sample t-test
Categorical (2 pop) Numerical (Not Normal) Mann-Whitney U test
Categorical (> 2 pop) Numerical (Normal) One-way ANOVA
Categorical (> 2 pop) Numerical (Not Normal) Kruskal-Wallis test
Numerical (Normal) Numerical (Normal) Pearson Correlation Coefficient Test
Numerical (Normal/ Not Normal) Numerical (Not Normal) Spearman Correlation Coefficient Test
Numerical (Normal) Numerical (Normal) – Paired Paired t-test
Numerical (Not Normal) Numerical (Not Normal) – Paired Wilcoxon Signed Rank Test
Multivariate test
Dependant Independent variables Test
Numerical All numerical Linear regression
Numerical Numerical & categorical GLM
Repeated numericals Numerical & categorical Repeated measures ANOVA
Categorical Numerical & categorical Logistic regression
9April2017Researchmethodologyfornurses
132
Specific test
 Survival analysis – time to event analysis
 Time series – observe trend over time & forecasting
 Factor analysis – data reduction
 Structural equation modelling – general tool to explore or
confirm a ‘causal’ model (relationship between variables)
9April2017Researchmethodologyfornurses
133
How to publish your research
Jamalludin Ab Rahman MD MPH
9April2017Researchmethodologyfornurses
134
Howtowritea
researchpaper
Researchmethodologyfornurses
135
Title
Method
Discussion
& conclusion
State
objectives
Results
Introduction
Abstract
Finalise title
The
backbone
of the study
How the
study was
done
Present based
on objectives
Answer objectives.
Discuss differences,
recommendation & limitation
Justify research. State
research gap. Phrase
objective clearly
9April2017
Before you write
 Decide the target audience
 Scientific publication or report
 Choose journal
 Study the format & requirement
 Separate text, table & graphics
9April2017Researchmethodologyfornurses
136
The structure
Publication
 Title
 Abstract
 Keywords
 Introduction
 Method
 Results
 Discussion
 References
Report/thesis*
 Title
 Abstract
 Introduction
 Literature review
 Objective
 Methodology
 Results
 Discussion
 References
9April2017Researchmethodologyfornurses
137
* Institution specific
The suggested sequence
1. Based on specific objective, analyse the data & produce
planned tables
2. Interpret & describe the results in Result section
3. Discuss in Discussion section
4. Answer the research questions
5. Complete the method & introduction
6. Finally, write the abstract
9April2017Researchmethodologyfornurses
138
Writing result
 Describe your result (no discussion)
 No reference (usually)
 Text vs. table vs. graphic (no redundancy)
 Text to summarise, Table for detail, Graphic to show
trend
 May state relevant statistics done (if not mentioned in
method)
9April2017Researchmethodologyfornurses
139
Writing discussion
 Should answer the research questions mentioned in
Introduction
 Discuss the result
 Do not repeat text as in Result
 May state limitation (but don’t go overboard)
 Recommend
 Conclude
9April2017Researchmethodologyfornurses
140
Research Management
Jamalludin Ab Rahman MD MPH
9April2017Researchmethodologyfornurses
141
Other things to plan
1. Ethical consideration – consent form, advisory
committee
2. Budget
3. Approval
9April2017Researchmethodologyfornurses
142
In summary, what are the critical
information
1. Specific objectives
2. Conceptual framework
3. Data dictionary
4. Dummy table & analytics guidelines
9April2017Researchmethodologyfornurses
143

Introduction to Research methodology

  • 1.
    Introduction to Research Methodology ForBasic Research Workshop Prof. Dr. Jamalludin Ab Rahman B.Med.Sc., MD, MPH Department of Community Medicine Kulliyyah of Medicine International Islamic University Malaysia
  • 2.
    The programme Day 1 Getting the research idea  Literature review & managing them (using EndNote)  Building conceptual framework  Phrasing objective  Determine best study design  Sampling plan & sample size  Plan for data collection – preparing data dictionary 9April2017Researchmethodologyfornurses 2
  • 3.
    The programme…. Day 2 Plan for statistical analysis – preparing dummy table  Basic statistical analysis  Descriptive statistics – determine Normality, mean (SD), median (IQR), count (%)  Analytical statistics – compare means, compare proportion, correlation 9April2017Researchmethodologyfornurses 3
  • 4.
    Contents  Why wedo research?  How we do research?  Effective literature search & management  Choosing best study design  Planning for data collection  Planning for statistical test  How to write for publication 9April2017Researchmethodologyfornurses 4
  • 5.
    Why we doresearch?  Answers our curiosity?  Forced by our superior  For promotion 9April2017Researchmethodologyfornurses 5
  • 6.
    Researchmethodologyfornurses 6 Start with endin mind (Bijaksana) 9April2017
  • 7.
    Researchmethodologyfornurses 7 Idea Justify Explore Strategize Collect AnalyseReport Publish Initiation Planning & strategize (proposal) Data collection Reporting & sharing Study design Sampling plan Sample size determination Plan for data collection Plan for statistical analysis 9April2017
  • 8.
    The 4 CriticalStrategies 1. Clear conceptual framework 2. SMART objectives 3. Detail data dictionary 4. Expected dummy table Researchmethodologyfornurses 8 9April2017
  • 9.
    Organise your researchidea Jamalludin Ab Rahman MD MPH 9April2017Researchmethodologyfornurses 9
  • 10.
    Identify variables involved Main outcomes (dependant)  Explanatory (exposures, factors) variables (independent)  Confounding variables 9April2017Researchmethodologyfornurses 10
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
    9April2017Researchmethodologyfornurses 15 Varkevisser, C. M., Pathmanathan,I., & Brownlee, A. T. (2003). Designing and conducting health systems research projects: Kit Publishers.
  • 16.
    Figure 3 StructuralEquation Model Showing the Relationship Between Family Processes, Child Characteristics, and Achievement for Girls Aged 6 to 11 years 9April2017Researchmethodologyfornurses 16
  • 17.
    Thornley S (2012) Causationand Statistical Prediction: Perfect Strangers or Bedfellows? J Biom Biostat 3:e115. doi:10.4172/2155- 6180.1000e115 9April2017Researchmethodologyfornurses 17
  • 18.
    Literature Review and BibliographicManagement Jamalludin Ab Rahman MD MPH 9April2017Researchmethodologyfornurses 18
  • 19.
  • 20.
  • 21.
  • 22.
    Literature review isnot  A chronological catalogue of all material,  A collection of quotes or paraphrases, or  A sales pitch selling only the good side of a study. 9April2017Researchmethodologyfornurses 22
  • 23.
    Literature review is Organised  Structured  Evaluated (critically appraised) research materials 9April2017Researchmethodologyfornurses 23
  • 24.
    The steps –5S 1. Strategise (plan) Most important step  Get your topic, problem statement & conceptual framework ready (& in-sync)  Identify components, variables, keywords & authors 2. Search – by keywords, domains of interest 3. Screen – title & abstract 4. Sort – organise & prioritise materials 5. Summarise – evaluate & re-organise 9April2017Researchmethodologyfornurses 24
  • 25.
    9April2017Researchmethodologyfornurses 25 Strategise Search Sort Summarise • Based onproblem statement & conceptual framework • Identify causal relationship, variables, domains, authors • Organise into the pre-planned structure or domain of interest • Online – Google, Google scholar, individual journal’s page, unpublished materials • Use electronic bibliographic apps e.g. EndNote, Mendeley, Papers • Credibility of articles (journal & authors) • Identify study population, study design, measurement method used • Check with the problem statement • Annotate (e.g. by using EndNote) • Re-organise the information if needed Screen • Screen the title, abstract
  • 26.
    9April2017Researchmethodologyfornurses 26 Strategise Search Sort Summarise • What isOSA? Diagnosis • Prevalence of OSA • Why study? HPT is prevalent. OSA is under-diagnosed & could be one of the reason • HPT OSA • Possible confounders – Sex, BMI • Based on Headings: Introduction, Method, Discussion • Based on domain: Epidemiology of OSA, Causal relationships (with HPT, BMI, Sex) • Based on point FOR & AGAINST hypothesis • Subject matters: Obstructive Sleep Apnea (& Apnoea), Sleep Disordered Breathing, Malaysia (local relevance), Hypertension, PSG vs. screening tool • Designs: Cohort, Case-control • Confounders • Credibility of articles (journal & authors) • Identify study population, study design, measurement method used • Check with the problem statement • Annotate (e.g. by using EndNote) • Re-organise the information if needed Screen • Screen the title, abstract Title: Relationship of OSA & HPT Design: Case-control (HPT vs. No-HPT) HPT OSA Male Obese
  • 27.
    9April2017Researchmethodologyfornurses 27 The number ofarticles found Alternative spelling How they were sorted Time period Observe the year, author & journal Full text Number of citations by Google You can cite (import) into EndNote The same doc maybe available at multiple places
  • 28.
    9April2017Researchmethodologyfornurses 30 Domains of interest.Can assign many domains for one article Customise the header Include full text or any relevant documents
  • 29.
    Phrasing objectives Jamalludin AbRahman MD MPH 9April2017Researchmethodologyfornurses 32
  • 30.
    SMART  Specific –What you really want to do  Measurable – Must be able to be measured  Attainable – Achievable  Relevant – Relevant to you  Timely – Can be done within the time you have 9April2017Researchmethodologyfornurses 33
  • 31.
    e.g.  To determinethe prevalence of breastfeeding among mothers Can rephrase to:  To measure the prevalence of exclusive breastfeeding among working mothers attending post-natal clinic in Pahang  To measure the prevalence of exclusive breastfeeding and factors related with it among working mothers attending post- natal clinic in Pahang 9April2017Researchmethodologyfornurses 34
  • 32.
    Choosing Best Research Designs JamalludinAb Rahman MD MPH 9April2017Researchmethodologyfornurses 35
  • 33.
    Research design Design Observational Cross sectional Case-control Cohort Experimental Animal Trial ClinicalTrial Community Trial 9April2017Researchmethodologyfornurses 36 Measure exposure & outcome & the same time Fix the outcome. Measure the exposure Fix the exposure. Measure the outcome
  • 34.
    9April2017Researchmethodologyfornurses 37 Cause & Effect EffectCause Causeprecedes effect Time Exposure Outcome Smoking Lung cancer High fat diet Obesity New drug Better prognosis
  • 35.
    9April2017Researchmethodologyfornurses 38 Cross sectional study EffectCause ExposureOutcome Smoking Lung cancer High fat diet Obesity New drug Better prognosis Observe BOTH Cause & Effect at the same time – No temporal association
  • 36.
    9April2017Researchmethodologyfornurses 39 Case-Control study EffectCause Exposure Outcome SmokingLung cancer High fat diet Obesity New drug Better prognosis Observe the Cause (PAST events)
  • 37.
    9April2017Researchmethodologyfornurses 40 Cohort study EffectCause Exposure Outcome SmokingLung cancer High fat diet Obesity New drug Better prognosis Observe the effect (NEW event)
  • 38.
  • 39.
  • 40.
    Hierarchy of Evidence Randomised controlled trial Cohort Casecontrol Case report Expert opinion 9April2017Researchmethodologyfornurses 43 Meta-analysis Systematic review
  • 41.
    Cross sectional study Population based – represent population  Measure exposure & outcomes at the same point in time – No temporal association  Impossible to infer causality  Prevalence study – measure magnitude or burden of disease  Descriptive  Repeated cross-sectional study ~ pseudo-longitudinal e.g. British Association for the Study of Community Dentistry (BASCD) guidance on sampling for surveys of child dental health. A BASCD coordinated dental epidemiology programme quality standard (Pine et al. 1997) 9April2017Researchmethodologyfornurses 44
  • 42.
    Cross sectional study Advantages Measure prevalence of a population  Measure multiple exposures & outcomes  Relatively inexpensive  Relative shorter time Disadvantages  No temporal association – no inference to causality  Prevalence-incidence bias (Nyman bias) e.g. if smokers die due to AMI faster, a cross-sectional study will reveal less smoker among AMI patients  Health workers effect e.g. when survey done from house to house, only health respondent are available in their home/office 9April2017Researchmethodologyfornurses 45
  • 43.
    Cross sectional study- Example  NHMS ~ Household study, all Malaysian (N=47,610 for 2006)  NOHSA ~ Adult (>15) (N=14,444 for 2010)  NMCS ~ GP vs. PHC (N~12,000)  NHANES (US) - http://www.cdc.gov/nchs/nhanes.htm 9April2017Researchmethodologyfornurses 46
  • 44.
    Case control study Fix the outcomes, measure the exposures  Longitudinal  Retrospective  Case = outcome of interest  Control = comparing outcome 9April2017Researchmethodologyfornurses 47
  • 45.
    Case control study 9April2017 Researchmethodology for nurses 48 Case Control No exposure Exposure No exposure Exposure History Now Lung cancer patients Healthy patients Smoking Non smoking Smoking Non smoking
  • 46.
    Lung Cancer No Lung Cancer Smoking20 (18.2%) 90 (81.8%) Not Smoking 5 (4.5%) 105 (95.5%) 2 (df=1)= 10.150, p =0.001, OR = 4.7 (CI95% 1.7 – 13.0) Because p < 0.05, we reject H0. Therefore there is a different between smoker & non smoker 9April2017Researchmethodologyfornurses 49
  • 47.
    Cases  Well defined Source – institution vs. population Control  Matched vs. Unmatched  Matching ~ controls resemble the cases with regard to certain characteristics (age, gender, SES etc)  Individual vs. Group matching  Source – institution vs. population  Ratio to cases ~ up to 4:1 9April2017 Research methodology for nurses 50
  • 48.
    Case control study Advantages Good for rare conditions or diseases  Less time needed to conduct the study because the condition or disease has already occurred  Measure multiple risk factors  Can establish an association Disadvantages  Recall bias  Not good for evaluating diagnostic tests because it’s already clear that the cases have the condition and the controls do not  It can be difficult to find a suitable control group 9April2017Researchmethodologyfornurses 51
  • 49.
    Example  Tobacco useand risk of myocardial infarction in 52 countries in the INTERHEART study: a case-control study. (Teo 2006) 9April2017Researchmethodologyfornurses 52
  • 50.
    Cohort study  Measureoutcomes  Compare incidence of a disease (or condition) among exposed and unexposed individuals over time  Disease free at the onset (or inception)  Repeated measurements ~ follow up  Prospective vs. retrospective cohort 9April2017Researchmethodologyfornurses 53
  • 51.
    Cohort study 9April2017Researchmethodologyfornurses 54 Disease No disease Noexposure Exposure Now Future Lung cancer No lung cancer Smoking Non smoking Disease No disease Lung cancer No lung cancer
  • 52.
    Define cohort  Bothexposed & not exposed groups have equal chance to:  Develop disease  Be followed-up  Types:  Representative – low exposed subjects  Enriched – high exposed subjects  Specific group – occupational, institution etc 9April2017Researchmethodologyfornurses 55
  • 53.
    Measurements  Exposure  Carefullydefined in advance  Standard measurement for both E+ & E- groups  Outcome  Primary vs. Secondary outcome 9April2017Researchmethodologyfornurses 56
  • 54.
    Follow-up  Keep participationat > 90%  Equal likelihood to detect disease in all subjects  Active vs. Passive follow-up  Blinding 9April2017Researchmethodologyfornurses 57
  • 55.
    Example Obesity asan Independent Risk Factor for Cardiovascular Disease: A 26-year Follow-up of Participants in the Framingham Heart Study (Hubert 1983) 9April2017Researchmethodologyfornurses 58
  • 56.
    Cohort study Advantages  Infercausality  Measure multiple outcomes  Study rare exposure  Measure incidence Disadvantages  Costly  Loss to follow up  Large sample size for rare outcomes  Selection bias 9April2017Researchmethodologyfornurses 59
  • 57.
  • 58.
  • 59.
    Sampling Plan (Sampling technique& sample size) Jamalludin Ab Rahman MD MPH 9April2017Researchmethodologyfornurses 62
  • 60.
    Before we sample Determinestudy place, duration & subjects  Describe study place - especially if plan to represent a population  State time & duration  Who or what are the subjects - population, people, animal etc. 9April2017Researchmethodologyfornurses 63
  • 61.
    Subjects  Target population Study population  Sampling frame  Sampling unit  Observation unit 9April2017Researchmethodologyfornurses 64
  • 62.
    Example – NHMSIII 2006  Target population  Study population  Sampling frame  Sampling unit  Observation unit 9April2017Researchmethodologyfornurses 65 All Malaysian Household up to strata 6 List of Enumeration Block & Living Quarters Enumeration Block & Living Quarters All household in the selected Living Quarters
  • 63.
    9April2017Researchmethodologyfornurses 66 Random Non-Random Represent apopulation Must have sampling frame • Simple • Systematic • Stratified • Cluster Still valid if justified • Convenience • Purposive • Snowball • Volunteer
  • 64.
  • 65.
  • 66.
  • 67.
  • 68.
    Random sampling Random Simple Systematic Stratified Cluster Multistage 9April2017Researchmethodologyfornurses 71 The idealmethod. Randomly sample 10 students from a class of 50 students. Follow certain pattern, order. Only first sample is random. Study population divided intro strata. All strata selected. Portion of sample in each strata sampled. Study population divided intro clusters. Assume all clusters are the same. Not all clusters selected. Only some will be sampled. Several sampling techniques applied at different stage.
  • 69.
    Non random sampling Nonrandom Convenience Purposive Quota Snowball 9April2017Researchmethodologyfornurses 72 Haphazard. Grab sampling. No rules. Just get the sample. Non representative. Convenience but with certain purpose. E.g. diabetic patients on single insulin therapy. When the sampling stop after achieving certain size. Sample by reference or recommendation.
  • 70.
    Example – NHMSIII 2006 Two stage stratified random sampling  Target population  Study population  Strata  Clusters  Sampling frame  Sampling unit  Observation unit  Sample distribution 9April2017Researchmethodologyfornurses 73 All Malaysian Household up to strata 6 List of Enumeration Block & Living Quarters Enumeration Block & Living Quarters All household in the selected Living Quarters State & location (urban or rural) Enumeration Block & Living Quarters Proportionate to size
  • 71.
    Example – Aclinical trial  Target population All diabetic patients  Study population Diabetic for at least 1 year on insulin therapy who attended MOPD from Jan-Dec 2014  Sampling frame N/A  Sampling unit Any diabetic who fulfilled the selection criteria  Observation unit Same as SU 9April2017Researchmethodologyfornurses 74
  • 72.
    Why we calculatesample size?  Estimate sample required for external validity (inference) & logistic preparation  Justify sample size for research to:  represent population  measure treatment effect (detect difference)  We don’t do research haphazardly 9April2017Researchmethodologyfornurses 75
  • 73.
    Sample size isan estimate  Best sample size for specific purpose (for the main objective)  Never an exact value, always an estimate  Calculated for certain expected estimates (outcomes) at certain degree of precision  Expected values - estimated from previous studies or from intelligent ‘guess’ 9April2017Researchmethodologyfornurses 76
  • 74.
    Plan with theend in mind  No one formula fits all  Depending on objective: 1. Represent population 2. Hypothesis testing  Adjust for design effect (based on type of sampling), confidence level, alpha error, power, stratification and anticipated response rate 9April2017Researchmethodologyfornurses 77
  • 75.
    Pre-requisites  Objective determined Sampling method known  Estimate the outcome  Precision required  Statistical test used is known  Set the power (b) and confidence level (a)  Anticipate non-response 9April2017Researchmethodologyfornurses 78
  • 76.
    Represent population  Representpopulation e.g. state, district, village, institution etc  Usually for prevalence study e.g. to measure prevalence of hypertension in Malaysia; or mean DMF among adult in Pahang 9April2017Researchmethodologyfornurses 79
  • 77.
    Single proportion  𝑛= 𝑧 𝛼/2 2 𝑝(1−𝑝) 𝑑2  Where z = value from standard normal distribution corresponding to desired confidence level, a (𝑧 𝛼/2=1.96 for 95%CI)  p is expected true proportion  d is desired precision  For small populations n can be adjusted so that 𝑛 𝑐 = 𝑁𝑛 𝑁+𝑛 9April2017Researchmethodologyfornurses 80
  • 78.
    9April2017Researchmethodologyfornurses 81 95% CI isa = 0.05 a/2 = 0.025 𝑧 𝛼/2 =1.96 Power 80% or 0.8 1-b = 1-0.8 = 0.2 b = 0.2 𝑧 𝛽 =0.84
  • 79.
    Example  Study tomeasure prevalence of hypertension in a village of 1000 people  Expected prevalence = 40%, Precision = 5%, CI=95%, expected non-response 20%  𝑛 = 1.962 ∗0.4(1−0.4) 0.052 = 369 ≅ 370  Anticipating non response, 𝑛 = 450 9April2017Researchmethodologyfornurses 82
  • 80.
    Single mean  Studyto measure one sample mean  𝑛 = ( 𝑧 𝛼/2∗𝑠 𝑑 )2, where s = expected standard deviation (SD)  We use smallest d to get largest n possible 9April2017Researchmethodologyfornurses 83
  • 81.
    Example  Study tomeasure average DMF among adults in a village of 2000 people  Estimated DMF = 11 (SD 10) with the precision of 2 at 95%CI  𝑛 = ( 1.96∗10 2 )2 = 96.04 ≅ 100 9April2017Researchmethodologyfornurses 84
  • 82.
    Hypothesis testing  Howone variable is different from the other  E.g. different % of hypertension between male & female 9April2017Researchmethodologyfornurses 85
  • 83.
    Compare two proportions 𝑛 = (𝑧 𝛼 2 +𝑧 𝛽)2∗(𝑝1 1−𝑝1 +𝑝2 1−𝑝2 ) (𝑝1−𝑝2)2  Total sample size = 2*n  Where p1 and p2 are the expected sample proportions of the two groups; za/2 is the critical value of the Normal distribution at a/2 and zβ is the critical value of the Normal distribution at β (e.g. for a power of 80%, β is 0.2 and the critical value is 0.84) 9April2017Researchmethodologyfornurses 86
  • 84.
    Example  A studyto compare prevalence of diabetes mellitus between male and female. Expected prevalence are 30% vs. 40% respectively. Power 80% at 95% confidence level.  𝑛 = (1.96 + 0.84)2 ∗ 0.3 1−0.3 +0.4 1−0.4 0.3−0.4 2 = 7.9 ∗ 45=355.5 ≅ 360  Total sample size = 360 * 2 = 720 9April2017Researchmethodologyfornurses 87
  • 85.
    Compare two means 𝑛 = 2∗ 𝑧 𝛼 2 +𝑧 𝛽 2∗𝑠2 𝑑2  Where d=smallest means difference and s = standard deviation  Total sample size = 2*n 9April2017Researchmethodologyfornurses 88
  • 86.
    Example  A studyto compare means of HbA1c between diabetic treated with new drug versus the standard drug (control). Expected difference of 10.5% vs. 11.5% with SD estimated of 5%  𝑛 = 2∗52 12 ∗ 1.96 + 0.84 2 = 395  Total sample size = 790 9April2017Researchmethodologyfornurses 89
  • 87.
    Using applications  Softwareor online calculators  Make sure the formula is valid  Suggestion:  PS: Power and Sample Size Calculation http://biostat.mc.vanderbilt.edu/wiki/Main/PowerSampleSize  Epi Info http://wwwn.cdc.gov/epiinfo/7/  OpenEpi http://www.openepi.com/oe2.3/menu/openepimenu.htm 9April2017Researchmethodologyfornurses 90
  • 88.
    Epi Info –Population survey 9April2017Researchmethodologyfornurses 91
  • 89.
    Epi Info –Cohort & Cross-sectional 9April2017Researchmethodologyfornurses 92
  • 90.
    Open Epi – Comparetwo means 9April2017Researchmethodologyfornurses 93
  • 91.
  • 92.
    PS – Compare twomeans 9April2017Researchmethodologyfornurses 95
  • 93.
    Planning for DataCollection Jamalludin Ab Rahman MD MPH 9April2017Researchmethodologyfornurses 96
  • 94.
    You wanna collectnow? 1. Objectives are clear 2. Variables identified & defined 3. Samples identified 4. Instrument valid & reliable 5. Instrument tested 6. Simulation done Only then, you go and collect the data 9April2017Researchmethodologyfornurses 97
  • 95.
  • 96.
    9April2017Researchmethodologyfornurses 99 Variables identified &defined Conceptual framework Identify variables Define variables
  • 97.
    How to definevariable?  Definition  Working/operational definition  How it will be measured  Precision of measurement  How it will be reported 9April2017Researchmethodologyfornurses 100
  • 98.
    Furnish the infofor each variable e.g. Obesity  Definition 9April2017Researchmethodologyfornurses 101 Overweight and obesity are defined as abnormal or excessive fat accumulation that presents a risk to health. (WHO 1998).
  • 99.
    Furnish the infofor each variable e.g. Obesity  Definition  Working definition 9April2017Researchmethodologyfornurses 102 A crude population measure of obesity is the body mass index (BMI), a person's weight (in kilograms) divided by the square of his or her height (in metres). In this study, overweight and obesity are grouped together and labelled as Obesity. Any BMI 25 kg/m2 or above is considered obese (WHO 1998).
  • 100.
    Furnish the infofor each variable e.g. Obesity  Definition  Working definition  How it will be measured  Precision of measurement 9April2017Researchmethodologyfornurses 103 Weight is measured using calibrated Seca 762 with precision to the nearest of 1 kg Height using Seca 206 with precision to the nearest 1 mm.
  • 101.
  • 102.
  • 103.
    9April2017Researchmethodologyfornurses 106 NATIONAL HEALTH AND NUTRITIONEXAMINATION SURVEY III. Body Measurements (Anthropometry) 1998
  • 104.
    Furnish the infofor each variable e.g. Obesity  Definition  Working definition  How it will be measured  Precision of measurement  How it will be reported 9April2017Researchmethodologyfornurses 107 Obesity will be described in count (N) and proportion (%).
  • 105.
  • 106.
  • 107.
  • 108.
  • 109.
    9April2017Researchmethodologyfornurses 112 Then only youstart to design the data collection form
  • 110.
    Tools to createform  Usual word processing (e.g. Word), spreadsheet (e.g. Excel)  Better with Database (Microsoft Access)  Google Form  EpiInfo Make Questionnaire 9April2017Researchmethodologyfornurses 113
  • 111.
  • 112.
  • 113.
  • 114.
    9April2017Researchmethodologyfornurses 117 • Test theform • Do complete testing • Revise the form • Run some analyses
  • 115.
    Plan for StatisticalAnalysis Jamalludin Ab Rahman MD MPH 9April2017Researchmethodologyfornurses 118
  • 116.
    9April2017Researchmethodologyfornurses 119 Research idea General objectives SpecificObjective Specific Objective Specific Objective Analysis Analysis Analysis
  • 117.
    9April2017Researchmethodologyfornurses 120 Type of statisticalanalysis AnalyticalDescriptive e.g. Describe socio- demographic characteristics - Age, Sex, Race etc. e.g. Prevalence of hypertension. e.g. Compare demographic characteristics between two population – Compare age between male & female e.g. Distribution of gender by hypertension status e.g. How demographic characteristics (more than one factors) explain hypertension Univariable Bivariable Multivariable IV DV IV DV IV DV IV
  • 118.
    Steps for theplan 1. Focus on each specific objective 2. Decide whether it will be descriptive or analytical 3. For descriptive statistics: Mean (SD) or Median (IQR) or N(%) 4. For analytical (present of hypothesis); state the Ho 5. Select suitable statistical test – usually run Univariate first, then Multivariate 6. Present this in a dummy table 9April2017Researchmethodologyfornurses 121
  • 119.
    Dummy table -example 9April2017Researchmethodologyfornurses 122
  • 120.
    Data quality  Validvalue e.g. age > 200 years, weight > 500 kg, pregnant male etc  No missing value  Relevant skip response e.g. Not Applicable response for number of pregnancy for male respondent  Declare method to ensure good data quality – e.g. double data entry 9April2017Researchmethodologyfornurses 123
  • 121.
    How to reportthe results  Start with descriptive statistics – usually for baseline findings e.g. description of study population – socio- demographic or basic clinical characteristics  Proceed with hypothesis testing (if any) – e.g. compare between groups/treatments  State statistical tests used  State data transformation done (if any)  Declare missing value & how they were managed 9April2017Researchmethodologyfornurses 124
  • 122.
  • 123.
  • 124.
    9April2017Researchmethodologyfornurses 127 Type of data (Levelof measurement) Categorical Nominal Ordinal Numerical Discrete Continuous e.g. Gender, Race e.g. Cancer staging, Severity of CXR for PTB e.g. Parity, Gravida e.g. Hb, RBS, cholesterol.
  • 125.
    Data Categorical Frequency (count) & Percentage Numerical Normal Mean(SD) Not Normal Median (Range/IQR) How to describe a data 9April2017Researchmethodologyfornurses 128
  • 126.
    Bivariable analysis  Comparenumericals – compare means, compare ranks, correlation  Compare proportions – chi-square 9April2017Researchmethodologyfornurses 129
  • 127.
    Bivariable analysis Dependent variable (outcome) NumericalCategorical Independent variable (factor) Numerical Correlation Independent sample t-test/ Mann-Whitney U Categorical Independent sample t-test/ Mann-Whitney U One-way ANOVA/Kruskall Wallis Chi-square 9April2017Researchmethodologyfornurses 130
  • 128.
    Bivariable analysis -detail 9April2017Researchmethodologyfornurses 131 Variable 1 Variable 2 Test Categorical Categorical Chi-square Categorical (2 pop) Numerical (Normal) Independent sample t-test Categorical (2 pop) Numerical (Not Normal) Mann-Whitney U test Categorical (> 2 pop) Numerical (Normal) One-way ANOVA Categorical (> 2 pop) Numerical (Not Normal) Kruskal-Wallis test Numerical (Normal) Numerical (Normal) Pearson Correlation Coefficient Test Numerical (Normal/ Not Normal) Numerical (Not Normal) Spearman Correlation Coefficient Test Numerical (Normal) Numerical (Normal) – Paired Paired t-test Numerical (Not Normal) Numerical (Not Normal) – Paired Wilcoxon Signed Rank Test
  • 129.
    Multivariate test Dependant Independentvariables Test Numerical All numerical Linear regression Numerical Numerical & categorical GLM Repeated numericals Numerical & categorical Repeated measures ANOVA Categorical Numerical & categorical Logistic regression 9April2017Researchmethodologyfornurses 132
  • 130.
    Specific test  Survivalanalysis – time to event analysis  Time series – observe trend over time & forecasting  Factor analysis – data reduction  Structural equation modelling – general tool to explore or confirm a ‘causal’ model (relationship between variables) 9April2017Researchmethodologyfornurses 133
  • 131.
    How to publishyour research Jamalludin Ab Rahman MD MPH 9April2017Researchmethodologyfornurses 134
  • 132.
    Howtowritea researchpaper Researchmethodologyfornurses 135 Title Method Discussion & conclusion State objectives Results Introduction Abstract Finalise title The backbone ofthe study How the study was done Present based on objectives Answer objectives. Discuss differences, recommendation & limitation Justify research. State research gap. Phrase objective clearly 9April2017
  • 133.
    Before you write Decide the target audience  Scientific publication or report  Choose journal  Study the format & requirement  Separate text, table & graphics 9April2017Researchmethodologyfornurses 136
  • 134.
    The structure Publication  Title Abstract  Keywords  Introduction  Method  Results  Discussion  References Report/thesis*  Title  Abstract  Introduction  Literature review  Objective  Methodology  Results  Discussion  References 9April2017Researchmethodologyfornurses 137 * Institution specific
  • 135.
    The suggested sequence 1.Based on specific objective, analyse the data & produce planned tables 2. Interpret & describe the results in Result section 3. Discuss in Discussion section 4. Answer the research questions 5. Complete the method & introduction 6. Finally, write the abstract 9April2017Researchmethodologyfornurses 138
  • 136.
    Writing result  Describeyour result (no discussion)  No reference (usually)  Text vs. table vs. graphic (no redundancy)  Text to summarise, Table for detail, Graphic to show trend  May state relevant statistics done (if not mentioned in method) 9April2017Researchmethodologyfornurses 139
  • 137.
    Writing discussion  Shouldanswer the research questions mentioned in Introduction  Discuss the result  Do not repeat text as in Result  May state limitation (but don’t go overboard)  Recommend  Conclude 9April2017Researchmethodologyfornurses 140
  • 138.
    Research Management Jamalludin AbRahman MD MPH 9April2017Researchmethodologyfornurses 141
  • 139.
    Other things toplan 1. Ethical consideration – consent form, advisory committee 2. Budget 3. Approval 9April2017Researchmethodologyfornurses 142
  • 140.
    In summary, whatare the critical information 1. Specific objectives 2. Conceptual framework 3. Data dictionary 4. Dummy table & analytics guidelines 9April2017Researchmethodologyfornurses 143

Editor's Notes

  • #22 Collect data, analyse them & publish
  • #68 Randomly select 5 cases Use random number table Use random calculator
  • #69 Aim is to select 5 students out of 20 If we plan to select 5 samples, create 5 groups (it becomes 20/5 samples per group) – Group A, B, C, D & E Sort cases – any order for each group – e.g. alphabetical Give number to all 5 cases in all groups Randomly select a number from 4 (because there are 4 cases) If No 3 is selected, choose ALL case No 3 in all groups
  • #80 The Decayed, Missing, Filled (DMF) index has been used for more than 70 years and is well established as the key measure of caries experience in dental epidemiology. total number of teeth or surfaces that are decayed (D), missing (M), or filled (F) in an individual.