Step-by-step guide to critiquing
research. Part 1: quantitative research
Michael Coughlan, Patricia Cronin, Frances Ryan
Abstract
When caring for patients it is essential that nurses are using the
current best practice. To determine what this is, nurses must be able
to read research critically. But for many qualified and student nurses
the terminology used in research can be difficult to understand,
thus making critical reading even more daunting. It is imperative
in nursing that care has its foundations in sound research and it is
essential that all nurses have the ability to critically appraise research
to identify what is best practice. This article is a step-by-step approach
to critiquing quantitative research to help nurses demystify the
process and decode the terminology.
Key words: Quantitative research methodologies • Review process • Research
For many qualified nurses and nursing students
research is research, and it is often quite difficult
to grasp what others are referring to when they
discuss the limitations and/or strengths within
a research study. Research texts and journals refer to
critiquing the literature, critical analysis, reviewing the
literature, evaluation and appraisal of the literature which
are in essence the same thing (Bassett and Bassett, 2003).
Terminology in research can be confusing for the novice
research reader where a term like 'random' refers to an
organized manner of selecting items or participants, and the
word 'significance' is applied to a degree of chance. Thus
the aim of this article is to take a step-by-step approach to
critiquing research in an attempt to help nurses demystify
the process and decode the terminology.
When caring for patients it is essential that nurses are
using the current best practice. To determine what this is,
nurses must be able to read research. The adage 'All that
glitters is not gold' is also true in research. Not all research
is of the same quality or of a high standard and therefore
nurses should not take research at face value simply
because it has been published (Cullum and Droogan, 1999;
Polit and Beck, 2006). Critiquing is a systematic method of
Michael Coughlan, Patricia Cronin and Frances Ryan are Lecturers,
School of Nursing and Midwifery, University of Dublin, Trinity
College, Dublin
Accepted for publication: March 2007
appraising the strengths and limitations of a piece of research
in order to determine its credibility and/or its applicability
to practice (Valente, 2003). Seeking only limitations in a
study is criticism, and critiquing and criticism are not the
same (Burns and Grove, 1997). A critique is an impersonal
evaluation of the strengths and limitations of the research
being reviewed and should not be seen as a disparagement
of the researcher's ability. Neither should it be regarded as
a jousting match between the researcher and the reviewer.
Burns and Grove (1999) call this an 'intellectual critique'
in that it is not the creator but the creation that is being
evaluated. The reviewer maintains objectivity throughout
the critique. No personal views are expressed by the
reviewer and the strengths and/or limitations of the study
and the implications of these are highlighted with reference
to research texts or journals. It is also important to remember
that research works within the realms of probability where
nothing is absolutely certain. It is therefore important to
refer to the apparent strengths, limitations and findings
of a piece of research (Burns and Grove, 1997). The use
of personal pronouns is also avoided in order that an
appearance of objectivity can be maintained.
Credibility and integrity
There are numerous tools available to help both novice and
advanced reviewers to critique research studies (Tanner,
2003). These tools generally ask questions that can help the
reviewer to determine the degree to which the steps in the
research process were followed. However, some steps are
more important than others and very few tools acknowledge
this. Ryan-Wenger (1992) suggests that questions in a
critiquing tool can be subdivided into those that are useful
for getting a feel for the study being presented, which she
calls 'credibility variables', and those that are essential for
evaluating the research process, called 'integrity variables'.
Credibility variables concentrate on how believable the
work appears and focus on the researcher's qualifications and
ability to undertake and accurately present the study. The
answers to these questions are important when critiquing
a piece of research as they can offer the reader an insight
into what to expect in the remainder of the study.
However, the reader should be aware that identified strengths
and limitations within this section will not necessarily
correspond with what will be found in the rest of the work.
Integrity questions, on the other hand, are interested in the
robustness of the research method, seeking to identify how
appropriately and accurately the researcher followed the
steps in the research process. The answers to these questions
658 British Journal of Nursing, 2007, Vol 16, No 11
RESEARCH METHODOLOGIES
Table 1. Research questions - guidelines for critiquing a quantitative research study

Elements influencing the believability of the research

Writing style: Is the report well written - concise, grammatically correct, avoiding the use of jargon? Is it well laid out and organized?
Author: Do the researcher(s') qualifications/position indicate a degree of knowledge in this particular field?
Report title: Is the title clear, accurate and unambiguous?
Abstract: Does the abstract offer a clear overview of the study, including the research problem, sample, methodology, findings and recommendations?

Elements influencing the robustness of the research

Purpose/research problem: Is the purpose of the study/research problem clearly identified?
Logical consistency: Does the research report follow the steps of the research process in a logical manner? Do these steps naturally flow and are the links clear?
Literature review: Is the review logically organized? Does it offer a balanced critical analysis of the literature? Is the majority of the literature of recent origin? Is it mainly from primary sources and of an empirical nature?
Theoretical framework: Has a conceptual or theoretical framework been identified? Is the framework adequately described? Is the framework appropriate?
Aims/objectives/research question/hypotheses: Have aims and objectives, a research question or hypothesis been identified? If so, are they clearly stated? Do they reflect the information presented in the literature review?
Sample: Has the target population been clearly identified? How was the sample selected? Was it a probability or non-probability sample? Is it of adequate size? Are the inclusion/exclusion criteria clearly identified?
Ethical considerations: Were the participants fully informed about the nature of the research? Was the autonomy/confidentiality of the participants guaranteed? Were the participants protected from harm? Was ethical permission granted for the study?
Operational definitions: Are all the terms, theories and concepts mentioned in the study clearly defined?
Methodology: Is the research design clearly identified? Has the data gathering instrument been described? Is the instrument appropriate? How was it developed? Were reliability and validity testing undertaken and the results discussed? Was a pilot study undertaken?
Data analysis/results: What type of data and statistical analysis was undertaken? Was it appropriate? How many of the sample participated? Is the significance of the findings discussed?
Discussion: Are the findings linked back to the literature review? If a hypothesis was identified, was it supported? Were the strengths and limitations of the study, including generalizability, discussed? Was a recommendation for further research made?
References: Were all the books, journal articles and other media alluded to in the study accurately referenced?
will help to identify the trustworthiness of the study and its
applicability to nursing practice.
Critiquing the research steps
In critiquing the steps in the research process a number
of questions need to be asked. However, these questions
are seeking more than a simple 'yes' or 'no' answer. The
questions are posed to stimulate the reviewer to consider
the implications of what the researcher has done. Does the
way a step has been applied appear to add to the strength
of the study or does it appear as a possible limitation to
implementation of the study's findings? (Table 1).
Elements influencing believability of the study
Writing style
Research reports should be well written, grammatically
correct, concise and well organized. The use of jargon should
be avoided where possible. The style should be such that it
attracts the reader to read on (Polit and Beck, 2006).
Author(s)
The author(s') qualifications and job title can be a useful
indicator into the researcher(s') knowledge of the area
under investigation and ability to ask the appropriate
questions (Conkin Dale, 2005). Conversely a research
study should be evaluated on its own merits and not
assumed to be valid and reliable simply based on the
author(s') qualifications.
Report title
The title should be between 10 and 15 words long and
should clearly identify for the reader the purpose of the
study (Connell Meehan, 1999). Titles that are too long or
too short can be confusing or misleading (Parahoo, 2006).
Abstract
The abstract should provide a succinct overview of the
research and should include information regarding the
purpose of the study, method, sample size and selection.
the main findings and conclusions and recommendations
(Conkin Dale, 2005). From the abstract the reader should
be able to determine if the study is of interest and whether
or not to continue reading (Parahoo, 2006).
Eiements influencing robustness
Purpose of the study/research problem
A research problem is often first presented to the reader in
the introduction to the study (Bassett and Bassett, 2003).
Depending on what is to be investigated some authors will
refer to it as the purpose of the study. In either case the
statement should at least broadly indicate to the reader what
is to be studied (Polit and Beck, 2006). Broad problems are
often multi-faceted and will need to become narrower and
more focused before they can be researched. In this the
literature review can play a major role (Parahoo, 2006).
Logical consistency
A research study needs to follow the steps in the process in a
logical manner. There should also be a clear link between the
steps beginning with the purpose of the study and following
through the literature review, the theoretical framework, the
research question, the methodology section, the data analysis,
and the findings (Ryan-Wenger, 1992).
Literature review
The primary purpose of the literature review is to define
or develop the research question while also identifying
an appropriate method of data collection (Burns and
Grove, 1997). It should also help to identify any gaps in
the literature relating to the problem and to suggest how
those gaps might be filled. The literature review should
demonstrate an appropriate depth and breadth of reading
around the topic in question. The majority of studies
included should be of recent origin and ideally less than
five years old. However, there may be exceptions to this,
for example, in areas where there is a lack of research, or a
seminal or all-important piece of work that is still relevant to
current practice. It is important also that the review should
include some historical as well as contemporary material
in order to put the subject being studied into context. The
depth of coverage will depend on the nature of the subject,
for example, for a subject with a vast range of literature then
the review will need to concentrate on a very specific area
(Carnwell, 1997). Another important consideration is the
type and source of literature presented. Primary empirical
data from the original source is more favourable than a
secondary source or anecdotal information where the
author relies on personal evidence or opinion that is not
founded on research.
A good review usually begins with an introduction which
identifies the key words used to conduct the search and
information about which databases were used. The themes
that emerged from the literature should then be presented
and discussed (Carnwell, 1997). In presenting previous
work it is important that the data is reviewed critically,
highlighting both the strengths and limitations of the study.
It should also be compared and contrasted with the findings
of other studies (Burns and Grove, 1997).
Theoretical framework
Following the identification of the research problem
and the review of the literature the researcher should
present the theoretical framework (Bassett and Bassett,
2003). Theoretical frameworks are a concept that novice
and experienced researchers find confusing. It is initially
important to note that not all research studies use a defined
theoretical framework (Robson, 2002). A theoretical
framework can be a conceptual model that is used as a
guide for the study (Conkin Dale, 2005) or themes from
the literature that are conceptually mapped and used to set
boundaries for the research (Miles and Huberman, 1994).
A sound framework also identifies the various concepts
being studied and the relationship between those concepts
(Burns and Grove, 1997). Such relationships should have
been identified in the literature. The research study should
then build on this theory through empirical observation.
Some theoretical frameworks may include a hypothesis.
Theoretical frameworks tend to be better developed in
experimental and quasi-experimental studies and often
poorly developed or non-existent in descriptive studies
(Burns and Grove, 1999). The theoretical framework should
be clearly identified and explained to the reader.
Aims and objectives/research question/
research hypothesis
The purpose of the aims and objectives of a study, the research
question and the research hypothesis is to form a link between
the initially stated purpose of the study or research problem
and how the study will be undertaken (Burns and Grove,
1999). They should be clearly stated and be congruent with
the data presented in the literature review. The use of these
items is dependent on the type of research being performed.
Some descriptive studies may not identify any of these items
but simply refer to the purpose of the study or the research
problem, others will include either aims and objectives or
research questions (Burns and Grove, 1999). Correlational
designs study the relationships that exist between two or
more variables and accordingly use either a research question
or hypothesis. Experimental and quasi-experimental studies
should clearly state a hypothesis identifying the variables to
be manipulated, the population that is being studied and the
predicted outcome (Burns and Grove, 1999).
Sample and sample size
The degree to which a sample reflects the population it
was drawn from is known as representativeness and in
quantitative research this is a decisive factor in determining
the adequacy of a study (Polit and Beck, 2006). In order
to select a sample that is likely to be representative and
thus identify findings that are probably generalizable to
the target population a probability sample should be used
(Parahoo, 2006). The size of the sample is also important in
quantitative research as small samples are at risk of being
overly representative of small subgroups within the target
population. For example, if, in a sample of general nurses, it
was noticed that 40% of the respondents were males, then
males would appear to be over represented in the sample,
thereby creating a sampling error. The risk of sampling
errors decreases as larger sample sizes are used (Burns and
Grove, 1997). In selecting the sample the researcher should
clearly identify who the target population are and what
criteria were used to include or exclude participants. It
should also be evident how the sample was selected and
how many were invited to participate (Russell, 2005).
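The selection process described above can be sketched in a few lines of Python; the sampling frame, sample size and ID format below are all invented for illustration:

```python
import random

# Hypothetical sampling frame: 500 registered nurses on a staff list,
# each identified by an ID (names and numbers are invented).
population = [f"nurse_{i:03d}" for i in range(500)]

# Simple random sampling, one form of probability sampling: every
# member of the target population has an equal chance of selection,
# which supports representativeness and hence generalizability.
random.seed(42)  # seeded only so the sketch is reproducible
sample = random.sample(population, k=50)

# A report should also state the target population, the inclusion/
# exclusion criteria, how selection was done and how many were invited.
```

A non-probability approach (for example, recruiting whoever is on shift) would not give each member an equal chance and so weakens any claim to representativeness.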
Ethical considerations
Beauchamp and Childress (2001) identify four fundamental
moral principles: autonomy, non-maleficence, beneficence
and justice. Autonomy infers that an individual has the right
to freely decide to participate in a research study without
fear of coercion and with a full knowledge of what is being
investigated. Non-maleficence implies an intention of not
harming and preventing harm occurring to participants
both of a physical and psychological nature (Parahoo,
2006). Beneficence is interpreted as the research benefiting
the participant and society as a whole (Beauchamp and
Childress, 2001). Justice is concerned with all participants
being treated as equals and no one group of individuals
receiving preferential treatment because, for example, of
their position in society (Parahoo, 2006). Beauchamp and
Childress (2001) also identify four moral rules that are both
closely connected to each other and with the principle of
autonomy. They are veracity (truthfulness), fidelity (loyalty
and trust), confidentiality and privacy. The latter pair are often
linked and imply that the researcher has a duty to respect the
confidentiality and/or the anonymity of participants and
non-participating subjects.
Ethical committees or institutional review boards have to
give approval before research can be undertaken. Their role
is to determine that ethical principles are being applied and
that the rights of the individual are being adhered to (Burns
and Grove, 1999).
Operational definitions
In a research study the researcher needs to ensure that
the reader understands what is meant by the terms and
concepts that are used in the research. To ensure this any
concepts or terms referred to should be clearly defined
(Parahoo, 2006).
Methodology: research design
Methodology refers to the nuts and bolts of how a
research study is undertaken. There are a number of
important elements that need to be referred to here and
the first of these is the research design. There are several
types of quantitative studies that can be structured under
the headings of true experimental, quasi-experimental
and non-experimental designs (Robson, 2002) {Table 2).
Although it is outside the remit of this article, within each
of these categories there are a range of designs that will
impact on how the data collection and data analysis phases
of the study are undertaken. However, Robson (2002)
states these designs are similar in many respects as most
are concerned with patterns of group behaviour, averages,
tendencies and properties.
Methodology: data collection
The next element to consider after the research design
is the data collection method. In a quantitative study any
number of strategies can be adopted when collecting data
and these can include interviews, questionnaires, attitude
scales or observational tools. Questionnaires are the most
commonly used data gathering instruments and consist
mainly of closed questions with a choice of fixed answers.
Postal questionnaires are administered via the mail and have
the value of perceived anonymity. Questionnaires can also be
administered in face-to-face interviews or in some instances
over the telephone (Polit and Beck, 2006).
Methodology: instrument design
After identifying the appropriate data gathering method
the next step that needs to be considered is the design
of the instrument. Researchers have the choice of using
a previously designed instrument or developing one for
the study and this choice should be clearly declared for
the reader. Designing an instrument is a protracted and
sometimes difficult process (Burns and Grove, 1997) but the
overall aim is that the final questions will be clearly linked
to the research questions and will elicit accurate information
and will help achieve the goals of the research. This, however,
needs to be demonstrated by the researcher.
Table 2. Research designs

Experimental
Sample: 2 or more groups
Sample allocation: Random
Features: Groups get different treatments
Outcome: Cause and effect relationship

Quasi-experimental
Sample: One or more groups
Sample allocation: Random
Features: One variable has not been manipulated or controlled (usually because it cannot be)
Outcome: Cause and effect relationship, but less powerful than experimental

Non-experimental (e.g. descriptive; includes cross-sectional, correlational, comparative and longitudinal studies)
Sample: One or more groups
Sample allocation: Not applicable
Features: Discover new meaning; describe what already exists; measure the relationship between two or more variables
Outcome: Possible hypothesis for future research; tentative explanations
If a previously designed instrument is selected the researcher
should clearly establish that the chosen instrument is the most
appropriate. This is achieved by outlining how the instrument
has measured the concepts under study. Previously designed
instruments are often in the form of standardized tests
or scales that have been developed for the purpose of
measuring a range of views, perceptions, attitudes, opinions
or even abilities. There are a multitude of tests and scales
available, therefore the researcher is expected to provide the
appropriate evidence in relation to the validity and reliability
of the instrument (Polit and Beck, 2006).
Methodology: validity and reliability
One of the most important features of any instrument is
that it measures the concept being studied in an unwavering
and consistent way. These are addressed under the broad
headings of validity and reliability respectively. In general,
validity is described as the ability of the instrument to
measure what it is supposed to measure and reliability the
instrument's ability to consistently and accurately measure
the concept under study (Wood et al, 2006). For the most
part, if a well-established 'off the shelf' instrument has been
used and not adapted in any way, the validity and reliability
will have been determined already and the researcher
should outline what this is. However, if the instrument
has been adapted in any way or is being used for a new
population then previous validity and reliability will not
apply. In these circumstances the researcher should indicate
how the reliability and validity of the adapted instrument
was established (Polit and Beck, 2006).
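The article does not name a particular reliability statistic, but internal consistency of a multi-item scale is commonly reported as Cronbach's alpha. A minimal Python sketch, using invented scale scores, shows how such a coefficient is computed:

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Internal-consistency reliability of a multi-item scale.

    item_scores: one inner list per item, each holding that item's
    score for every respondent (same respondent order in each list).
    """
    k = len(item_scores)
    item_var = sum(pvariance(item) for item in item_scores)
    totals = [sum(t) for t in zip(*item_scores)]  # per-respondent totals
    return (k / (k - 1)) * (1 - item_var / pvariance(totals))

# Hypothetical 3-item attitude scale answered by 5 respondents.
items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 4, 2, 4, 3],
]
alpha = cronbach_alpha(items)  # values near 1 indicate high consistency
```

As the paragraph above notes, an adapted instrument would need such a figure re-established on data from the new population rather than carried over from the original.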
To establish if the chosen instrument is clear and
unambiguous and to ensure that the proposed study has
been conceptually well planned a mini-version of the main
study, referred to as a pilot study, should be undertaken before
the main study. Samples used in the pilot study are generally
omitted from the main study. Following the pilot study the
researcher may adjust definitions, alter the research question,
address changes to the measuring instrument or even alter
the sampling strategy.
Having described the research design, the researcher should
outline in clear, logical steps the process by which the data
was collected. All steps should be fully described and easy to
follow (Russell, 2005).
Analysis and results
Data analysis in quantitative research studies is often seen
as a daunting process. Much of this is associated with
apparently complex language and the notion of statistical
tests. The researcher should clearly identify what statistical
tests were undertaken, why these tests were used and
what the results were. A rule of thumb is that studies that
are descriptive in design use only descriptive statistics, while
correlational, quasi-experimental and experimental studies
use inferential statistics. The latter are subdivided
into tests to measure relationships and differences between
variables (Clegg, 1990).
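As an illustration of the descriptive side of that rule of thumb, a summary of survey scores needs nothing beyond the Python standard library (the scores below are invented):

```python
from statistics import mean, median, stdev

# Hypothetical anxiety scores (0-40 scale) from a descriptive survey.
scores = [12, 18, 15, 22, 9, 14, 17, 20, 11, 16]

# Descriptive statistics summarize the sample itself; no inference is
# drawn about the wider population at this stage.
summary = {
    "n": len(scores),
    "mean": mean(scores),
    "median": median(scores),
    "sd": stdev(scores),
}
```

A reviewer would expect such summaries to be reported for the key variables before any inferential test is introduced.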
Inferential statistical tests are used to identify if a
relationship or difference between variables is statistically
significant. Statistical significance helps the researcher to
rule out one important threat to validity and that is that the
result could be due to chance rather than to real differences
in the population. Quantitative studies usually identify the
lowest level of significance as P≤0.05 (P = probability)
(Clegg, 1990).
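The logic of ruling out chance can be made concrete with a small simulation. The sketch below (invented pain scores; a permutation test, which is not a test the article itself names) estimates how often a difference as large as the one observed would arise if group labels were assigned purely by chance:

```python
import random
from statistics import mean

# Hypothetical post-intervention pain scores for two groups in a
# quasi-experimental study (figures invented for illustration).
control      = [6, 7, 5, 8, 6, 7, 6, 8]
intervention = [4, 5, 4, 6, 5, 4, 5, 5]

observed = mean(control) - mean(intervention)

# Permutation test: if the labels made no difference, how often would
# shuffled labels produce a gap at least this large? That proportion
# approximates the P value.
random.seed(1)  # seeded only for reproducibility
pooled = control + intervention
trials, extreme = 5000, 0
for _ in range(trials):
    random.shuffle(pooled)
    diff = mean(pooled[:8]) - mean(pooled[8:])
    if abs(diff) >= abs(observed):
        extreme += 1

p_value = extreme / trials  # here well below the 0.05 threshold
```

A small P value does not prove the finding is clinically important; it only makes chance an unlikely explanation for the observed difference.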
To enhance readability researchers frequently present
their findings and data analysis section under the headings
of the research questions (Russell, 2005). This can help the
reviewer determine if the results that are presented clearly
answer the research questions. Tables, charts and graphs may
be used to summarize the results and should be accurate,
clearly identified and enhance the presentation of results
(Russell, 2005).
The percentage of the sample who participated in
the study is an important element in considering the
generalizability of the results. At least fifty percent of the
sample is needed to participate if a response bias is to be
avoided (Polit and Beck, 2006).
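The check itself is simple arithmetic; a sketch with invented figures:

```python
# Hypothetical figures from a postal questionnaire study.
invited = 230
returned = 148

response_rate = returned / invited * 100  # about 64% here

# Polit and Beck (2006) suggest at least half the sample must take
# part if a response bias is to be avoided.
adequate = response_rate >= 50.0
```

A reviewer should therefore look for both the number invited and the number who actually participated, not just the final sample size.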
Discussion/conclusion/recommendations
The discussion of the findings should flow logically from the
data and should be related back to the literature review thus
placing the study in context (Russell, 2005). If the hypothesis
was deemed to have been supported by the findings,
the researcher should develop this in the discussion. If a
theoretical or conceptual framework was used in the study
then the relationship with the findings should be explored.
Any interpretations or inferences drawn should be clearly
identified as such and consistent with the results.
The significance of the findings should be stated but
these should be considered within the overall strengths
and limitations of the study (Polit and Beck, 2006). In this
section some consideration should be given to whether
or not the findings of the study were generalizable, also
referred to as external validity. Not all studies make a claim
to generalizability but the researcher should have undertaken
an assessment of the key factors in the design, sampling and
analysis of the study to support any such claim.
Finally the researcher should have explored the clinical
significance and relevance of the study. Applying findings
in practice should be suggested with caution and will
obviously depend on the nature and purpose of the study.
In addition, the researcher should make relevant and
meaningful suggestions for future research in the area
(Connell Meehan, 1999).
References
The research study should conclude with an accurate list
of all the books, journal articles, reports and other media
that were referred to in the work (Polit and Beck, 2006).
The referenced material is also a useful source of further
information on the subject being studied.
Conclusions
The process of critiquing involves an in-depth examination
of each stage of the research process. It is not a criticism but
rather an impersonal scrutiny of a piece of work using a
balanced and objective approach, the purpose of which is to
highlight both strengths and weaknesses, in order to identify
whether a piece of research is trustworthy and unbiased. As
nursing practice is becoming increasingly more evidence
based, it is important that care has its foundations in sound
research. It is therefore important that all nurses have the
ability to critically appraise research in order to identify what
is best practice.
Bassett C, Bassett J (2003) Reading and critiquing research. Br J Perioper Nurs 13(4): 162-4
Beauchamp T, Childress J (2001) Principles of Biomedical Ethics. 5th edn. Oxford University Press, Oxford
Burns N, Grove S (1997) The Practice of Nursing Research: Conduct, Critique and Utilization. 3rd edn. WB Saunders Company, Philadelphia
Burns N, Grove S (1999) Understanding Nursing Research. 2nd edn. WB Saunders Company, Philadelphia
Carnwell R (1997) Critiquing research. Nurs Pract 8(12): 16-21
Clegg F (1990) Simple Statistics: A Course Book for the Social Sciences. 2nd edn. Cambridge University Press, Cambridge
Conkin Dale J (2005) Critiquing research for use in practice. J Pediatr Health Care 19: 183-6
Connell Meehan T (1999) The research critique. In: Treacy P, Hyde A, eds. Nursing Research and Design. UCD Press, Dublin: 57-74
Cullum N, Droogan J (1999) Using research and the role of systematic reviews of the literature. In: Mulhall A, Le May A, eds. Nursing Research: Dissemination and Implementation. Churchill Livingstone, Edinburgh: 109-23
Miles M, Huberman A (1994) Qualitative Data Analysis. 2nd edn. Sage, Thousand Oaks, CA
Parahoo K (2006) Nursing Research: Principles, Process and Issues. 2nd edn. Palgrave Macmillan, Houndmills, Basingstoke
Polit D, Beck C (2006) Essentials of Nursing Research: Methods, Appraisal and Utilization. 6th edn. Lippincott Williams and Wilkins, Philadelphia
Robson C (2002) Real World Research. 2nd edn. Blackwell Publishing, Oxford
Russell C (2005) Evaluating quantitative research reports. Nephrol Nurs J 32(1): 61-4
Ryan-Wenger N (1992) Guidelines for critique of a research report. Heart Lung 21(4): 394-401
Tanner J (2003) Reading and critiquing research. Br J Perioper Nurs 13(4): 162-4
Valente S (2003) Research dissemination and utilization: improving care at the bedside. J Nurs Care Qual 18(2): 114-21
Wood MJ, Ross-Kerr JC, Brink PJ (2006) Basic Steps in Planning Nursing Research: From Question to Proposal. 6th edn. Jones and Bartlett, Sudbury
KEY POINTS
I Many qualified and student nurses have difficulty
understanding the concepts and terminology associated
with research and research critique.
IThe ability to critically read research is essential if the
profession is to achieve and maintain its goal to be
evidenced based.
IA critique of a piece of research is not a criticism of
the wori<, but an impersonai review to highlight the
strengths and iimitations of the study.
I It is important that all nurses have the ability to criticaiiy
appraise research In order to identify what is best
practice.
Critiquing Nursing Research 2nd edition
Critiquing
Nursing Research
23. 2nd edition
ISBN-W; 1- 85642-316-6; lSBN-13; 978-1-85642-316-8; 234 x
156 mm; p/back; 224 pages;
publicatior) November 2006; £25.99
By John R Cutdiffe and Martin Ward
This 2nd edition of Critiquing Nursing Research retains the
features which made the original
a 'best seller' whilst incorporating new material in order to
expand the book's applicability. In
addition to reviewing and subsequently updating the material of
the original text, the authors
have added two further examples of approaches to crtitique
along with examples and an
additonal chapter on how to critique research as part of the
work of preparing a dissertation.
The fundamentals of the book however remain the same. It
focuses specifically on critiquing
nursing research; the increasing requirement for nurses to
become conversant with research,
understand its link with the use of evidence to underpin
practice; and the movement towards
becoming an evidence-based discipline.
As nurse education around the world increasingly moves
towards an all-graduate discipline, it
is vital for nurses to have the ability to critique research in
order to benefit practice. This book
is the perfect tool for those seeking to gain or develop precisely
that skill and is a must-have
for all students nurses, teachers and academics.
John Cutclitfe holds the 'David G. Braithwaite' Protessor of
24. Nursing Endowed Chair at the University of Texas (Tyler); he is
also an Adjunct Professor of Psychiatric Nursing at Stenberg
College International School of Nursing, Vancouver, Canada.
Matin Ward is an Independent tvtental Health Nurse Consultant
and Director of tvlW Protessional Develcpment Ltd.
To order your copy please contact us using the details below or
visit our website
www.quaybooks.co.yk where you will also tind details ot other
Quay Books otters and titles.
John Cutcliffe and Martin Ward
IQUAYBOOKS
AdMsioiiDftUHiolthcareM
Quay Books Division I MA Healthcare Limited
Jesses Farm I Snow Hill I Dinton I Salisbury I Wiltshire I SP3
5HN I UK
Tel: 01722 716998 I Fax: 01722 716887 I E-mail:
[email protected] I Web: www.quaybooks.co.uk
A
ilH
MAHbUTHCASIUMITED
Uritishjoiirnnl of Nursinji;. 2OO7.V0I 16. No 11 663
because it has been published (Cullum and Droogan, 1999; Polit and Beck, 2006).

Michael Coughlan, Patricia Cronin and Frances Ryan are Lecturers, School of Nursing and Midwifery, University of Dublin, Trinity College, Dublin. Accepted for publication: March 2007

Critiquing is a systematic method of appraising the strengths and limitations of a piece of research in order to determine its credibility and/or its applicability
to practice (Valente, 2003). Seeking only limitations in a
study is criticism and critiquing and criticism are not the
same (Burns and Grove, 1997). A critique is an impersonal
evaluation of the strengths and limitations of the research
being reviewed and should not be seen as a disparagement
of the researcher's ability. Neither should it be regarded as
a jousting match between the researcher and the reviewer.
Burns and Grove (1999) call this an 'intellectual critique'
in that it is not the creator but the creation that is being
evaluated. The reviewer maintains objectivity throughout
the critique. No personal views are expressed by the
reviewer and the strengths and/or limitations of the study
and the implications of these are highlighted with reference
to research texts or journals. It is also important to remember
that research works within the realms of probability where
nothing is absolutely certain. It is therefore important to
refer to the apparent strengths, limitations and findings
of a piece of research (Burns and Grove, 1997). The use
of personal pronouns is also avoided in order that an
appearance of objectivity can be maintained.
Credibility and integrity
There are numerous tools available to help both novice and
advanced reviewers to critique research studies (Tanner,
2003). These tools generally ask questions that can help the
reviewer to determine the degree to which the steps in the
research process were followed. However, some steps are
more important than others and very few tools acknowledge
this. Ryan-Wenger (1992) suggests that questions in a
critiquing tool can be subdivided into those that are useful
for getting a feel for the study being presented which she
calls 'credibility variables' and those that are essential for
evaluating the research process called 'integrity variables'.
Credibility variables concentrate on how believable the
work appears and focus on the researcher's qualifications and
ability to undertake and accurately present the study. The
answers to these questions are important when critiquing
a piece of research as they can offer the reader an insight
into what to expect in the remainder of the study.
However, the reader should be aware that identified strengths
and limitations within this section will not necessarily
correspond with what will be found in the rest of the work.
Integrity questions, on the other hand, are interested in the robustness of the research method, seeking to identify how appropriately and accurately the researcher followed the steps in the research process. The answers to these questions will help to identify the trustworthiness of the study and its applicability to nursing practice.

Table 1. Research questions: guidelines for critiquing a quantitative research study

Elements influencing the believability of the research
■ Writing style: Is the report well written: concise, grammatically correct, avoiding the use of jargon? Is it well laid out and organized?
■ Author: Do the researcher(s') qualifications/position indicate a degree of knowledge in this particular field?
■ Report title: Is the title clear, accurate and unambiguous?
■ Abstract: Does the abstract offer a clear overview of the study, including the research problem, sample, methodology, findings and recommendations?

Elements influencing the robustness of the research
■ Purpose/research problem: Is the purpose of the study/research problem clearly identified?
■ Logical consistency: Does the research report follow the steps of the research process in a logical manner? Do these steps naturally flow and are the links clear?
■ Literature review: Is the review logically organized? Does it offer a balanced critical analysis of the literature? Is the majority of the literature of recent origin? Is it mainly from primary sources and of an empirical nature?
■ Theoretical framework: Has a conceptual or theoretical framework been identified? Is the framework adequately described? Is the framework appropriate?
■ Aims/objectives/research question/hypotheses: Have aims and objectives, a research question or hypothesis been identified? If so, are they clearly stated? Do they reflect the information presented in the literature review?
■ Sample: Has the target population been clearly identified? How was the sample selected? Was it a probability or non-probability sample? Is it of adequate size? Are the inclusion/exclusion criteria clearly identified?
■ Ethical considerations: Were the participants fully informed about the nature of the research? Was the autonomy/confidentiality of the participants guaranteed? Were the participants protected from harm? Was ethical permission granted for the study?
■ Operational definitions: Are all the terms, theories and concepts mentioned in the study clearly defined?
■ Methodology: Is the research design clearly identified? Has the data-gathering instrument been described? Is the instrument appropriate? How was it developed? Were reliability and validity testing undertaken and the results discussed? Was a pilot study undertaken?
■ Data analysis/results: What type of data and statistical analysis was undertaken? Was it appropriate? How many of the sample participated? Was the significance of the findings discussed?
■ Discussion: Are the findings linked back to the literature review? If a hypothesis was identified, was it supported? Were the strengths and limitations of the study, including generalizability, discussed? Was a recommendation for further research made?
■ References: Were all the books, journals and other media alluded to in the study accurately referenced?
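Table 1 can also double as a working checklist. The sketch below is purely illustrative (the article prescribes no software): it groups the questions under Ryan-Wenger's (1992) credibility and integrity headings so a reviewer can record an answer against each one. The structure and names are invented for this example.

```python
# Hypothetical sketch: Table 1 rendered as a reusable checklist.
# The question wording follows the article; the data structure and
# function names are invented for illustration.
CRITIQUE_CHECKLIST = {
    "credibility": {  # how believable the work appears
        "writing style": "Is the report concise, well organized and free of jargon?",
        "author": "Do the researchers' qualifications indicate knowledge of the field?",
        "report title": "Is the title clear, accurate and unambiguous?",
        "abstract": "Does the abstract cover problem, sample, methodology, "
                    "findings and recommendations?",
    },
    "integrity": {  # how rigorously the research process was followed
        "purpose": "Is the research problem clearly identified?",
        "logical consistency": "Do the steps of the research process flow and link clearly?",
        "literature review": "Is the review balanced, recent and from primary sources?",
        "theoretical framework": "Is a framework identified, described and appropriate?",
        "aims": "Are aims, research questions or hypotheses clearly stated?",
        "sample": "Is the sample probability-based and of adequate size, with clear criteria?",
        "ethics": "Were consent, confidentiality and ethical approval addressed?",
        "operational definitions": "Are all terms, theories and concepts clearly defined?",
        "methodology": "Are design, instrument, reliability/validity and pilot described?",
        "analysis": "Were the statistical tests appropriate and the response rate adequate?",
        "discussion": "Are findings linked to the literature, with limitations discussed?",
        "references": "Is every cited work accurately referenced?",
    },
}

def blank_review():
    """Return a review sheet with every question still unanswered (None)."""
    return {section: dict.fromkeys(questions)
            for section, questions in CRITIQUE_CHECKLIST.items()}

review = blank_review()
review["integrity"]["sample"] = "Convenience sample of 40; generalizability limited."
answered = sum(v is not None for s in review.values() for v in s.values())
total = sum(len(s) for s in review.values())
print(f"{answered} of {total} questions answered")
```

Keeping credibility and integrity questions separate mirrors the article's point that the two groups carry different weight: an unanswered integrity question matters more than an unanswered credibility one.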
Critiquing the research steps
In critiquing the steps in the research process a number
of questions need to be asked. However, these questions
are seeking more than a simple 'yes' or 'no' answer. The
questions are posed to stimulate the reviewer to consider
the implications of what the researcher has done. Does the
way a step has been applied appear to add to the strength
of the study or does it appear as a possible limitation to
implementation of the study's findings? (Table 1).
Elements influencing believability of the study
Writing style
Research reports should be well written, grammatically
correct, concise and well organized. The use of jargon should
be avoided where possible. The style should be such that it
attracts the reader to read on (Polit and Beck, 2006).
Author(s)
The author(s') qualifications and job title can be a useful indicator of the researcher(s') knowledge of the area under investigation and ability to ask the appropriate questions (Conkin Dale, 2005). Conversely, a research study should be evaluated on its own merits and not assumed to be valid and reliable simply based on the author(s') qualifications.
Report title
The title should be between 10 and 15 words long and
should clearly identify for the reader the purpose of the
study (Connell Meehan, 1999). Titles that are too long or
too short can be confusing or misleading (Parahoo, 2006).
Abstract
The abstract should provide a succinct overview of the research and should include information regarding the purpose of the study, method, sample size and selection, the main findings, conclusions and recommendations (Conkin Dale, 2005). From the abstract the reader should
be able to determine if the study is of interest and whether
or not to continue reading (Parahoo, 2006).
Elements influencing robustness
Purpose of the study/research problem
A research problem is often first presented to the reader in
the introduction to the study (Bassett and Bassett, 2003).
Depending on what is to be investigated some authors will
refer to it as the purpose of the study. In either case the
statement should at least broadly indicate to the reader what
is to be studied (Polit and Beck, 2006). Broad problems are
often multi-faceted and will need to become narrower and
more focused before they can be researched. In this the
literature review can play a major role (Parahoo, 2006).
Logical consistency
A research study needs to follow the steps in the process in a
logical manner. There should also be a clear link between the
steps beginning with the purpose of the study and following
through the literature review, the theoretical framework, the
research question, the methodology section, the data analysis,
and the findings (Ryan-Wenger, 1992).
Literature review
The primary purpose of the literature review is to define
or develop the research question while also identifying
an appropriate method of data collection (Burns and
Grove, 1997). It should also help to identify any gaps in
the literature relating to the problem and to suggest how
those gaps might be filled. The literature review should
demonstrate an appropriate depth and breadth of reading
around the topic in question. The majority of studies
included should be of recent origin and ideally less than
five years old. However, there may be exceptions to this,
for example, in areas where there is a lack of research, or a
seminal or all-important piece of work that is still relevant to
current practice. It is important also that the review should
include some historical as well as contemporary material
in order to put the subject being studied into context. The
depth of coverage will depend on the nature of the subject,
for example, for a subject with a vast range of literature then
the review will need to concentrate on a very specific area
(Carnwell, 1997). Another important consideration is the
type and source of literature presented. Primary empirical
data from the original source is more favourable than a
secondary source or anecdotal information where the
author relies on personal evidence or opinion that is not
founded on research.
A good review usually begins with an introduction which
identifies the key words used to conduct the search and
information about which databases were used. The themes
that emerged from the literature should then be presented
and discussed (Carnwell, 1997). In presenting previous
work it is important that the data is reviewed critically,
highlighting both the strengths and limitations of the study.
It should also be compared and contrasted with the findings
of other studies (Burns and Grove, 1997).
Theoretical framework
Following the identification of the research problem
and the review of the literature the researcher should
present the theoretical framework (Bassett and Bassett,
2003). Theoretical frameworks are a concept that novice
and experienced researchers find confusing. It is initially
important to note that not all research studies use a defined
theoretical framework (Robson, 2002). A theoretical
framework can be a conceptual model that is used as a
guide for the study (Conkin Dale, 2005) or themes from
the literature that are conceptually mapped and used to set
boundaries for the research (Miles and Huberman, 1994).
A sound framework also identifies the various concepts
being studied and the relationship between those concepts
(Burns and Grove, 1997). Such relationships should have
been identified in the literature. The research study should
then build on this theory through empirical observation.
Some theoretical frameworks may include a hypothesis.
Theoretical frameworks tend to be better developed in
experimental and quasi-experimental studies and often
poorly developed or non-existent in descriptive studies
(Burns and Grove, 1999). The theoretical framework should
be clearly identified and explained to the reader.
Aims and objectives/research question/
research hypothesis
The purpose of the aims and objectives of a study, the research
question and the research hypothesis is to form a link between
the initially stated purpose of the study or research problem
and how the study will be undertaken (Burns and Grove,
1999). They should be clearly stated and be congruent with
the data presented in the literature review. The use of these
items is dependent on the type of research being performed.
Some descriptive studies may not identify any of these items
but simply refer to the purpose of the study or the research
problem, others will include either aims and objectives or
research questions (Burns and Grove, 1999). Correlational designs study the relationships that exist between two or
more variables and accordingly use either a research question
or hypothesis. Experimental and quasi-experimental studies
should clearly state a hypothesis identifying the variables to
be manipulated, the population that is being studied and the
predicted outcome (Burns and Grove, 1999).
Sample and sample size
The degree to which a sample reflects the population it
was drawn from is known as representativeness and in
quantitative research this is a decisive factor in determining
the adequacy of a study (Polit and Beck, 2006). In order
to select a sample that is likely to be representative and
thus identify findings that are probably generalizable to
the target population a probability sample should be used
(Parahoo, 2006). The size of the sample is also important in
quantitative research as small samples are at risk of being
overly representative of small subgroups within the target
population. For example, if, in a sample of general nurses, it
was noticed that 40% of the respondents were males, then
males would appear to be over represented in the sample,
thereby creating a sampling error. The risk of sampling
errors decreases as larger sample sizes are used (Burns and
Grove, 1997). In selecting the sample the researcher should
clearly identify who the target population are and what
criteria were used to include or exclude participants. It
should also be evident how the sample was selected and
how many were invited to participate (Russell, 2005).
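The effect of sample size on sampling error can be seen in a short simulation: repeatedly draw simple random samples from a population and watch how far the sample proportion strays from the true figure. This is an illustrative sketch only; the population figures are invented, and the article reports no such experiment.

```python
# Illustrative sketch (not from the article): sampling error shrinks as
# sample size grows. All numbers are invented for demonstration.
import random
import statistics

random.seed(42)

# Hypothetical target population of 10 000 general nurses, 10% of whom are male.
population = ["male"] * 1_000 + ["female"] * 9_000

def male_proportion(sample):
    """Proportion of males in a drawn sample."""
    return sample.count("male") / len(sample)

for n in (20, 200, 2000):
    # Draw 500 independent probability (simple random) samples of size n and
    # measure how much the sample proportion varies around the true 10%.
    props = [male_proportion(random.sample(population, n)) for _ in range(500)]
    print(f"n={n:4d}: mean={statistics.mean(props):.3f}, "
          f"spread={statistics.stdev(props):.3f}")
```

The spread column falls as n rises, which is the point made in the text: a sample of 20 can easily over-represent a small subgroup, while a sample of 2000 rarely does.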
Ethical considerations
Beauchamp and Childress (2001) identify four fundamental
moral principles: autonomy, non-maleficence, beneficence
and justice. Autonomy implies that an individual has the right
to freely decide to participate in a research study without
fear of coercion and with a full knowledge of what is being
investigated. Non-maleficence implies an intention of not
harming and preventing harm occurring to participants
both of a physical and psychological nature (Parahoo,
2006). Beneficence is interpreted as the research benefiting
the participant and society as a whole (Beauchamp and
Childress, 2001). Justice is concerned with all participants
being treated as equals and no one group of individuals
receiving preferential treatment because, for example, of
their position in society (Parahoo, 2006). Beauchamp and
Childress (2001) also identify four moral rules that are both
closely connected to each other and with the principle of
autonomy. They are veracity (truthfulness), fidelity (loyalty
and trust), confidentiality and privacy. The latter pair are often
linked and imply that the researcher has a duty to respect the
confidentiality and/or the anonymity of participants and
non-participating subjects.
Ethical committees or institutional review boards have to
give approval before research can be undertaken. Their role
is to determine that ethical principles are being applied and
that the rights of the individual are being adhered to (Burns
and Grove, 1999).
Operational definitions
In a research study the researcher needs to ensure that
the reader understands what is meant by the terms and
concepts that are used in the research. To ensure this any
concepts or terms referred to should be clearly defined
(Parahoo, 2006).
Methodology: research design
Methodology refers to the nuts and bolts of how a
research study is undertaken. There are a number of
important elements that need to be referred to here and
the first of these is the research design. There are several
types of quantitative studies that can be structured under
the headings of true experimental, quasi-experimental
and non-experimental designs (Robson, 2002) (Table 2).
Although it is outside the remit of this article, within each
of these categories there are a range of designs that will
impact on how the data collection and data analysis phases
of the study are undertaken. However, Robson (2002)
states these designs are similar in many respects as most
are concerned with patterns of group behaviour, averages,
tendencies and properties.
Methodology: data collection
The next element to consider after the research design
is the data collection method. In a quantitative study any
number of strategies can be adopted when collecting data
and these can include interviews, questionnaires, attitude
scales or observational tools. Questionnaires are the most
commonly used data gathering instruments and consist
mainly of closed questions with a choice of fixed answers.
Postal questionnaires are administered via the mail and have
the value of perceived anonymity. Questionnaires can also be
administered in face-to-face interviews or in some instances
over the telephone (Polit and Beck, 2006).
Methodology: instrument design
After identifying the appropriate data gathering method
the next step that needs to be considered is the design
of the instrument. Researchers have the choice of using
a previously designed instrument or developing one for
the study and this choice should be clearly declared for
the reader. Designing an instrument is a protracted and
sometimes difficult process (Burns and Grove, 1997) but the
overall aim is that the final questions will be clearly linked
to the research questions and will elicit accurate information
and will help achieve the goals of the research. This, however,
needs to be demonstrated by the researcher.
Table 2. Research designs

■ Experimental: two or more groups; random sample allocation; groups get different treatments; outcome: cause and effect relationship.
■ Quasi-experimental: one or more groups; random sample allocation; one variable has not been manipulated or controlled (usually because it cannot be); outcome: cause and effect relationship, but less powerful than experimental.
■ Non-experimental (e.g. descriptive; includes cross-sectional, correlational, comparative and longitudinal studies): one or more groups; sample allocation not applicable; discovers new meaning, describes what already exists, measures the relationship between two or more variables; outcomes: possible hypotheses for future research, tentative explanations.
Britishjournal of Nursing. 2007. Vol 16. No 11 661
If a previously designed instrument is selected the researcher should clearly establish that the chosen instrument is the most appropriate. This is achieved by outlining how the instrument
has measured the concepts under study. Previously designed
instruments are often in the form of standardized tests
or scales that have been developed for the purpose of
measuring a range of views, perceptions, attitudes, opinions
or even abilities. There are a multitude of tests and scales
available, therefore the researcher is expected to provide the
appropriate evidence in relation to the validity and reliability
of the instrument (Polit and Beck, 2006).
Methodology: validity and reliability
One of the most important features of any instrument is
that it measures the concept being studied in an unwavering
and consistent way. These are addressed under the broad
headings of validity and reliability respectively. In general,
validity is described as the ability of the instrument to
measure what it is supposed to measure and reliability the
instrument's ability to consistently and accurately measure
the concept under study (Wood et al, 2006). For the most
part, if a well established 'off the shelf' instrument has been
used and not adapted in any way, the validity and reliability
will have been determined already and the researcher
should outline what this is. However, if the instrument
has been adapted in any way or is being used for a new
population then previous validity and reliability will not
apply. In these circumstances the researcher should indicate
how the reliability and validity of the adapted instrument
was established (Polit and Beck, 2006).
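Reliability checks such as test-retest are usually reported as a correlation coefficient, with values near 1 indicating a stable instrument. As a hypothetical sketch (the scores below are invented, not data from any study), Pearson's r can be computed with the standard library alone:

```python
# Hypothetical sketch: test-retest reliability expressed as a Pearson
# correlation. The scores are invented for illustration.
import statistics

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Scores from the same ten respondents on two administrations of a scale.
test1 = [12, 15, 9, 20, 17, 11, 14, 18, 10, 16]
test2 = [13, 14, 10, 19, 18, 10, 15, 17, 11, 15]

r = pearson_r(test1, test2)
print(f"test-retest r = {r:.2f}")  # values near 1 suggest stable measurement
```

A reviewer would expect the researcher to report such a coefficient (or an equivalent, e.g. Cronbach's alpha for internal consistency) whenever an instrument has been adapted or newly developed.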
To establish if the chosen instrument is clear and
unambiguous and to ensure that the proposed study has
been conceptually well planned a mini-version of the main
study, referred to as a pilot study, should be undertaken before
the main study. Samples used in the pilot study are generally
omitted from the main study. Following the pilot study the
researcher may adjust definitions, alter the research question,
address changes to the measuring instrument or even alter
the sampling strategy.
Having described the research design, the researcher should
outline in clear, logical steps the process by which the data
was collected. All steps should be fully described and easy to
follow (Russell, 2005).
Analysis and results
Data analysis in quantitative research studies is often seen
as a daunting process. Much of this is associated with
apparently complex language and the notion of statistical
tests. The researcher should clearly identify what statistical
tests were undertaken, why these tests were used and
what the results were. A rule of thumb is that studies that are descriptive in design use only descriptive statistics, while correlational, quasi-experimental and experimental studies use inferential statistics. The latter are subdivided
into tests to measure relationships and differences between
variables (Clegg, 1990).
Inferential statistical tests are used to identify if a
relationship or difference between variables is statistically
significant. Statistical significance helps the researcher to
rule out one important threat to validity and that is that the
result could be due to chance rather than to real differences
in the population. Quantitative studies usually identify the lowest level of significance as P≤0.05 (P = probability) (Clegg, 1990).
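What 'significant at P≤0.05' means can be made concrete with a permutation test: shuffle the group labels many times and count how often chance alone produces a difference at least as large as the one observed. The data and group names below are invented for illustration; published studies would more often report a parametric test such as a t-test.

```python
# Illustrative sketch of what a reported P value means: the probability of
# seeing a difference at least this large by chance alone. Data are invented;
# a permutation test stands in for the parametric tests a study might report.
import random
import statistics

random.seed(7)

# Hypothetical pain scores for a treatment and a control group.
treatment = [3, 4, 2, 5, 3, 4, 3, 2, 4, 3]
control   = [5, 6, 4, 7, 5, 6, 5, 4, 6, 5]

observed = statistics.mean(control) - statistics.mean(treatment)

def permutation_p(a, b, reps=10_000):
    """Share of random relabellings producing a difference >= the observed one."""
    pooled = a + b
    obs = statistics.mean(b) - statistics.mean(a)
    hits = 0
    for _ in range(reps):
        random.shuffle(pooled)
        diff = statistics.mean(pooled[len(a):]) - statistics.mean(pooled[:len(a)])
        if diff >= obs:
            hits += 1
    return hits / reps

p = permutation_p(treatment, control)
print(f"observed difference = {observed:.1f}, P = {p:.4f}")
# A P value at or below 0.05 would lead the researcher to call the
# difference statistically significant.
```

A small P value rules out one threat to validity named in the text: that the apparent difference is merely a chance feature of this particular sample rather than a real difference in the population.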
To enhance readability researchers frequently present
their findings and data analysis section under the headings
of the research questions (Russell, 2005). This can help the
reviewer determine if the results that are presented clearly
answer the research questions. Tables, charts and graphs may
be used to summarize the results and should be accurate,
clearly identified and enhance the presentation of results
(Russell, 2005).
The percentage of the sample who participated in
the study is an important element in considering the
generalizability of the results. At least fifty percent of the
sample is needed to participate if a response bias is to be
avoided (Polit and Beck, 2006).
Discussion/conclusion/recommendations
The discussion of the findings should flow logically from the
data and should be related back to the literature review thus
placing the study in context (Russell, 2005). If the hypothesis
was deemed to have been supported by the findings,
the researcher should develop this in the discussion. If a
theoretical or conceptual framework was used in the study
then the relationship with the findings should be explored.
Any interpretations or inferences drawn should be clearly
identified as such and consistent with the results.
The significance of the findings should be stated but
these should be considered within the overall strengths
and limitations of the study (Polit and Beck, 2006). In this
section some consideration should be given to whether
or not the findings of the study were generalizable, also
referred to as external validity. Not all studies make a claim
to generalizability but the researcher should have undertaken
an assessment of the key factors in the design, sampling and
analysis of the study to support any such claim.
Finally the researcher should have explored the clinical
significance and relevance of the study. Applying findings
in practice should be suggested with caution and will
obviously depend on the nature and purpose of the study.
In addition, the researcher should make relevant and
meaningful suggestions for future research in the area
(Connell Meehan, 1999).
References
The research study should conclude with an accurate list of all the books, journal articles, reports and other media
that were referred to in the work (Polit and Beck, 2006).
The referenced material is also a useful source of further
information on the subject being studied.
Conclusions
The process of critiquing involves an in-depth examination of each stage of the research process. It is not a criticism but rather an impersonal scrutiny of a piece of work using a balanced and objective approach, the purpose of which is to highlight both strengths and weaknesses, in order to identify whether a piece of research is trustworthy and unbiased. As nursing practice is becoming increasingly evidence based, it is important that care has its foundations in sound research. It is therefore important that all nurses have the ability to critically appraise research in order to identify what is best practice.

British Journal of Nursing, 2007, Vol 16, No 11
Bassett C, Bassett J (2003) Reading and critiquing research. Br J Perioper Nurs 13(4): 162-4
Beauchamp T, Childress J (2001) Principles of Biomedical Ethics. 5th edn. Oxford University Press, Oxford
Burns N, Grove S (1997) The Practice of Nursing Research: Conduct, Critique and Utilization. 3rd edn. WB Saunders Company, Philadelphia
Burns N, Grove S (1999) Understanding Nursing Research. 2nd edn. WB Saunders Company, Philadelphia
Carnwell R (1997) Critiquing research. Nurs Pract 8(12): 16-21
Clegg F (1990) Simple Statistics: A Course Book for the Social Sciences. 2nd edn. Cambridge University Press, Cambridge
Conkin Dale J (2005) Critiquing research for use in practice. J Pediatr Health Care 19: 183-6
Connell Meehan T (1999) The research critique. In: Treacy P, Hyde A, eds. Nursing Research and Design. UCD Press, Dublin: 57-74
Cullum N, Droogan J (1999) Using research and the role of systematic reviews of the literature. In: Mulhall A, Le May A, eds. Nursing Research: Dissemination and Implementation. Churchill Livingstone, Edinburgh: 109-23
Miles M, Huberman A (1994) Qualitative Data Analysis. 2nd edn. Sage, Thousand Oaks, CA
Parahoo K (2006) Nursing Research: Principles, Process and Issues. 2nd edn. Palgrave Macmillan, Houndmills, Basingstoke
Polit D, Beck C (2006) Essentials of Nursing Research: Methods, Appraisal and Utilization. 6th edn. Lippincott Williams and Wilkins, Philadelphia
Robson C (2002) Real World Research. 2nd edn. Blackwell Publishing, Oxford
Russell C (2005) Evaluating quantitative research reports. Nephrol Nurs J 32(1): 61-4
Ryan-Wenger N (1992) Guidelines for critique of a research report. Heart Lung 21(4): 394-401
Tanner J (2003) Reading and critiquing research. Br J Perioper Nurs 13(4): 162-4
Valente S (2003) Research dissemination and utilization: Improving care at the bedside. J Nurs Care Quality 18(2): 114-21
Wood MJ, Ross-Kerr JC, Brink PJ (2006) Basic Steps in Planning Nursing Research: From Question to Proposal. 6th edn. Jones and Bartlett, Sudbury
KEY POINTS

- Many qualified and student nurses have difficulty understanding the concepts and terminology associated with research and research critique.
- The ability to critically read research is essential if the profession is to achieve and maintain its goal to be evidence based.
- A critique of a piece of research is not a criticism of the work, but an impersonal review to highlight the strengths and limitations of the study.
- It is important that all nurses have the ability to critically appraise research in order to identify what is best practice.
Chapter 7
Dependent t-Tests and Repeated Measures Analysis of Variance

Learning Objectives

After reading this chapter, you will be able to . . .

1. describe the impact that initial between-groups differences have on test results when using the t-test or analysis of variance.
2. compare the independent t-test to the dependent-groups t-test.
3. complete a dependent- (paired/repeated-) samples t-test.
4. explain what power means in statistical testing.
5. compare the one-way ANOVA to the repeated-measures ANOVA.
6. complete a repeated-measures ANOVA.
7. interpret results and draw conclusions of within-group designs.
8. present within-group analysis results in APA format.
9. employ the Wilcoxon signed-ranks W-test and Friedman's nonparametric ANOVA.
Tests of significant difference, such as the t-test and analysis of
variance, are of two kinds: tests involving independent (or
between) groups and those that employ related,
or dependent (or within) groups. The tests covered to this point
in the book have involved
only independent groups tests. However, there are important
advantages related to the
dependent groups procedures, and they are used frequently in
data analysis.
In this chapter, the focus will be on the dependent-groups equivalents of the independent t-test and the one-way ANOVA. Since the same groups are used over time or across treatments, these are called dependent- or within-groups designs; matched or equivalent groups can also be employed as an alternative design. All are collectively known as repeated-measures designs. Although repeated-measures designs answer the same questions as
their independent groups equivalents (i.e., are there significant differences within groups, across times/treatments, or between matched/equivalent groups), under particular circumstances they can do so with greater economy and more statistical power.
7.1 Reconsidering the t and F Ratios
The scores produced in both the independent t and the one-way
ANOVA are ratios. In the case of the t-test, the ratio is the
result of dividing the difference between the means
of the groups by the standard error of the difference:
t = (M1 − M2) / SEd
With ANOVA, the F ratio is the mean square between divided
by the mean square within:
F = MSbet / MSwith
With either t or F, the denominator in the ratio reflects how
much scores vary within (rather
than between) the groups of subjects involved in the study.
These differences are easiest
to see in the way the standard error of the difference is
calculated for a t-test. When group
sizes are equal, the formula is
SEd = √((SEM1)² + (SEM2)²), with SEM = s/√n
and s, of course, a measure of score variation in any group.
So the standard error of the difference is based on the standard
error of the mean, which
in turn is based on the standard deviation. These connections
make it clear that score variance within groups in a t-test has its root in the standard deviation for
each group of scores. If we
reverse the order and work from the standard deviation back to
the standard error of the
difference, note the following:
• When scores vary substantially in a group, it is reflected in a
large standard
deviation.
• When the standard deviation is relatively large, the standard
error of the mean
must likewise be large because the standard deviation is the
numerator in the
formula for SEM.
• A large standard error of the mean results in a large standard
error of the difference because that statistic is the square root
of the sum of the squared standard errors of the mean.
• When the standard error of the difference is large, the differ-
ence between the means has to be correspondingly larger in
order for the result to be statistically significant. The table of
critical values indicates that no t ratio (the ratio of the differ-
ences between the means and the standard error of the differ-
ence) may be less than 1.96 for a two-tailed test and less than
1.645 for a one-tailed test, based on the critical α = .05.
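The chain described above, from the standard deviation to the SEM to the SEd to t, can be traced numerically. This is a minimal sketch, not from the text; the group scores are invented for illustration:

```python
import math
from statistics import mean, stdev

def independent_t(group1, group2):
    """t ratio for two equal-sized independent groups, built from the
    chain: standard deviation -> SEM -> SEd -> t."""
    sem1 = stdev(group1) / math.sqrt(len(group1))   # SEM = s / sqrt(n)
    sem2 = stdev(group2) / math.sqrt(len(group2))
    sed = math.sqrt(sem1**2 + sem2**2)              # SEd = sqrt(SEM1^2 + SEM2^2)
    return (mean(group1) - mean(group2)) / sed      # t = (M1 - M2) / SEd

g1 = [10, 12, 11, 13, 14]   # hypothetical scores
g2 = [12, 14, 13, 15, 16]
t = independent_t(g1, g2)   # -2.0 here; more within-group spread would shrink |t|
```

Substituting more variable scores for either group inflates the denominator and pulls t toward zero, which is exactly the effect the bullets describe.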
Error Variance
The point of this is that the value of t in the t-test—and it is the
same for F in an ANOVA—
is greatly affected by the amount of variability within the
groups involved. When the
variability within those groups is extensive, the values of t and
F are correspondingly
diminished and less likely to be statistically significant than
when there is relatively little
variability within the groups.
These differences within groups stem from differences in the
way individuals within the
samples react to whatever treatment is the independent variable;
different people respond
differently to the same stimulus. These differences represent
error variance, which is what
occurs whenever scores differ for reasons not related to the
influence of the IV.
Other Sources of Error Variance
Within-group differences are not the only source of error
variance in the calculation of t
and F. Both t-test and ANOVA are based on the assumption that
the groups involved are
equivalent before the independent variable is introduced. In a t-
test where the impact of
relaxation therapy on clients’ anxiety is the issue, the
assumption is that before the ther-
apy is introduced, the group, which will receive the therapy, and
the control group, which
will not, begin with equivalent levels of anxiety. That
assumption is the key to attributing
any differences after the treatment to the therapy, the IV.
Confounding Variables
In comparisons such as this, the initial equivalence of the
groups can be a problem, how-
ever. Maybe there were differences in anxiety before the
therapy was introduced. There
might be differences in the employment circumstances of each
group, and perhaps those
threatened with unemployment are more anxious than the
others. Maybe there are age-
related differences. These other influences that are not
controlled in an experiment are
sometimes called confounding variables.
If a psychologist wants to examine the impact that a substance
abuse program has on
addicts' behavior, a study might be set up as follows. Two
groups of the same number of
addicts are selected, and the substance abuse program is
provided to one group. After the
program, the psychologist measures the level of substance abuse
in both groups to see
whether there is a difference.
Try It! A: If the size of the standard deviation is related to the size of the group, in a t-test, what is the relationship between sample size and error?
The problem is that the presence or absence of the program is
not the only thing that might
prompt subjects to respond differently. Perhaps subjects’
background experiences are dif-
ferent. Maybe there are ethnic group differences, age
differences, or social class differ-
ences. If any of those differences affect substance abuse
behavior, there is an opportunity
to confuse the influence of those factors with the impact of the
substance abuse program,
which is the IV. If those other differences are not controlled and
they affect the dependent
variable, they contribute to error variance. There is error
variance any time the dependent
variable (DV) scores fluctuate for reasons unrelated to the IV.
Therefore, there is error variance reflected in the variability
within groups, and there is error
variance represented in any difference between groups that is
not related to the IV. Test
results can be meaningful only when the score variance that is
related to the independent
variable is substantially greater than the error variance—what is
controlled must contrib-
ute more to score values than what is left uncontrolled. This
makes it important to look for
ways to control error variance so that it is not confused with the
variability in scores that
stems from the independent variable. Controlling for
confounding variables is a necessary
research activity. A confounding variable can affect the IV-DV
relationship, thereby lowering
internal validity and thus the statistical conclusion validity of
your findings. Failing to take
confounding variables into account can result in misleading data
and erroneous conclu-
sions, to the detriment of the researcher’s reputation. In other
words, be wary of research findings that make sweeping general statements, as there may be several confounding elements. That said, controlling for extraneous confounding variables can be done in several ways.
7.2 Dependent-Groups Designs
Ideally, any before-the-treatment differences between the
groups in a study will be min-
imal. Recall that random selection occurs when every member
of a population has an
equal chance of being selected. The logic behind random
selection is that when groups are
randomly drawn from the same population, they will differ only by chance. But they will differ, because no sample can represent the population with complete fidelity, and occasionally the chance differences will affect the way subjects respond to the IV.
One way to reduce error variance is to adopt what are called
dependent-groups designs. The independent t-test and the one-
way
ANOVA required independent groups. Members of one group
can-
not also be members of other groups in the same study.
However, in
the case of the t-test, if the same group is measured, exposed to
the
treatment, and then measured again, an important source of
error
variance is controlled. Using the same group twice makes initial
equivalence no longer a concern. Any scoring variability
between the
first and second measure should more accurately reflect the
impact of
the independent variable.
The Dependent-Samples t-Tests
One dependent-groups test where the same group is measured
twice is called the before/
after t-test, also known as the pre/post t-test. An alternative is
called the matched-pairs
or dependent-samples t-test, where each participant in the first
group is matched to
someone in the second group who has a similar characteristic.
Try It! B: How does random selection attempt to control error variance in statistical testing?

Yet a third alternative that
is basically the same as a before/after design is the within-
treatment design where each
participant is used across two treatment groups (usually given at
two different times,
which makes it the same as the before/after t-test). In the latter
option the participant acts
as his or her own control where one of the treatments may be a
placebo. All three types of
dependent-samples t-tests have the same objective, which is
controlling the error variance
that is due to initial between-groups differences. Following are
examples of each test.
• The before/after design: A researcher is interested in the
impact that positive rein-
forcement has on employees’ sales productivity. Besides the
sales commission,
the researcher introduces a rewards program that can result in
increased vaca-
tion time. The researcher gauges sales productivity for a month,
introduces the
rewards program, and gauges sales productivity during the
second month for the
same people. The researcher will explore differences in
employee productivity
before and after the positive reinforcement intervention (the
rewards program).
If significance is obtained, then the null hypothesis (i.e., there is no difference in employee productivity after the introduction of the rewards program) can be rejected in favor of the alternative hypothesis (i.e., there is a significant increase in employee productivity after the introduction of the rewards program).
• The matched-pairs design: A school counselor is interested in
the impact that verbal
reinforcement has on students’ reading achievement. To
eliminate between-groups
differences, the researcher selects 30 people for the treatment
group and matches
each person in the treatment group to someone in the control
group who has a
similar reading score on a standardized test. The researcher then
introduces the
verbal reinforcement program to those in the treatment group
for a specified period
and over time compares the performance of students in the two
groups as well as
their performance within the group. The matched-pairs design is similar to an independent-groups design with one major exception: the groups are matched (or equivalent) to each other as closely as possible based on a particular measure; in this case the match is based on reading scores on a standardized test.
• Within-treatment design: A psychiatrist measures each study
participant on taking a
placebo, and then the actual drug for depression to test for
significant differences
over the two treatments (placebo versus drug). Here a
counterbalancing design may be
employed to minimize the order effects that plague repeated-measures designs. Specifically, the order in which treatments are given can influence the outcome. Therefore,
the treatments are given to the groups at different times as
depicted in Figure 7.1.
Figure 7.1: Counterbalance design

Group 1: Treatment A, then Treatment B, then Posttest
Group 2: Treatment B, then Treatment A, then Posttest

Source: Oskar Blakstad (May 8, 2009). Counterbalanced Measures Design by Martyn Shuttleworth. Retrieved Aug 22, 2013 from Explorable.com: http://explorable.com/counterbalanced-measures-design

Although there are differences in how the tests are set up, calculating the t-statistic is the same in each case. The differences between the approaches are conceptual, not
mathematical. Both approaches have the same purpose—to
control for any score varia-
tion stemming from nonrelevant factors. They both reduce the
error variance that comes
from using nonequivalent groups. Therefore, testing for homogeneity of variance (Levene's test) is moot here, as we are not dealing with differences between groups. On the
other hand, we are dealing with variance within groups and
across pairs of treatments.
If there are such significant differences, then this issue
constitutes what is described as a
violation of sphericity, which will be discussed in Section 7.3
with more depth when we
examine repeated-measures ANOVA.
Calculating t in a Dependent-Groups Design
Although the differences between before/after, matched-pairs,
and within-treatment
t-tests are not math-related, there are several approaches to
calculating the t statistic in the
dependent-groups tests. Whatever their differences, they all take
into
account the fact that the two sets of scores are related. One
approach is
to calculate the correlation between the two sets of scores and
then to
use the strength of the correlation value to reduce the error
variance—
the higher the correlation between the two sets of scores, the
lower the
error variance. Rather than correlations, which come up later in
the
book, we will rely on “difference scores.” But whether we use
correla-
tion values or difference scores, the result is the same.
The distribution of difference scores was discussed in Chapter 5
when the independent
t-test was introduced. The point of that distribution is to
determine the point at which the
difference between a pair of sample means (M1 − M2) is so
great that the most probable
explanation is that the samples were not drawn from populations
with the same means.
That same distribution also provides the theoretical
underpinning for the dependent-
groups tests, but rather than the difference between the means
of the two groups
(M1 − M2), the difference score in the dependent-groups tests is
based on the mean of the
differences between pairs of individual scores. That is, the
differences between each pair of
related scores will be determined, and then the mean of those
differences will become the
numerator in the t ratio. If the mean of the difference scores is
sufficiently different from
the mean of the distribution of difference scores (which, recall,
is 0), the t value will be
statistically significant.
The denominator in the t ratio is another standard error of the
mean value, but in this case,
it is the standard error of the mean for those difference scores.
The mechanics of checking
for significance are similar to what was done for the
independent t:
• A critical value from the t table defines the point at which the
t ratio is statisti-
cally significant.
• The critical value is dependent upon the degrees of freedom
for the problem.
For the dependent-samples t, the degrees of freedom are the
number of pairs of
scores minus 1 (n − 1).
The dependent-groups t-test statistic has this form:

t = Md / SEMd (Formula 7.1)
Try It! C: How are the before/after t-test and the matched-pairs t-test different?
where
Md = the mean of the difference scores
SEMd = the standard error of the mean for the difference scores
The steps for completing the test follow:

1. From the two scores for each subject, subtract the second from the first to determine the difference score, d, for each pair.
2. Determine the mean of the d scores: Md = Σd / number of pairs
3. Calculate the standard deviation of the d values, sd.
4. Calculate the standard error of the mean for the difference scores, SEMd, by dividing the result of Step 3 by the square root of the number of pairs of scores: SEMd = sd / √(number of pairs)
5. Calculate t = Md / SEMd
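The five steps can be expressed directly in code. This is an illustrative sketch with invented scores, not the chapter's data:

```python
import math
from statistics import mean, stdev

def dependent_t(first, second):
    """Dependent-samples t: subtract each second score from the first,
    then divide the mean difference by its standard error."""
    d = [a - b for a, b in zip(first, second)]   # Step 1: difference score per pair
    md = mean(d)                                 # Step 2: mean of the d scores
    sd = stdev(d)                                # Step 3: SD of the d values
    semd = sd / math.sqrt(len(d))                # Step 4: SEMd = sd / sqrt(pairs)
    return md / semd                             # Step 5: t = Md / SEMd

before = [5, 7, 6, 8, 9]       # hypothetical first measurements
after = [6, 9, 7, 10, 9]       # hypothetical second measurements
t = dependent_t(before, after) # about -3.207, with df = 5 pairs - 1 = 4
```

The sign of t simply reflects the direction of subtraction; for a two-tailed test it is the absolute value that is compared against the critical value.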
Following is an example to illustrate the steps to calculating the
dependent measures
t-test. A psychologist is investigating the impact that verbal
reinforcement has on the
number of questions students ask in a seminar.
• Ten upper-level students participate in two seminars where a presentation is followed by students' questions.
• In the first seminar, no feedback is provided by the instructor
after a student asks
the presenter a question.
• In the second seminar, the instructor offers feedback—such as
“That’s an excel-
lent question” or “Very interesting question” or “Yes, that had
occurred to me as
well”—after each question.
• The psychologist will test the following hypothesis:
H0: There is no significant mean difference in the number of student questions asked from seminar 1 to seminar 2.
H0: μseminar 1 questions = μseminar 2 questions
• By rejecting H0, the psychologist will find support for the alternative hypothesis:
Ha: There is a significant mean difference in the number of student questions asked from seminar 1 to seminar 2.
Ha: μseminar 1 questions ≠ μseminar 2 questions
Is there a significant difference between the number of
questions students ask in the first
seminar compared to the number of questions students ask in the
second seminar? The
number of questions asked by each student in both seminars and
the solution to the problem are in Figure 7.2.
Figure 7.2: Calculating the before/after and within-treatment t

(The Seminar 1 and Seminar 2 columns in the figure list the number of questions each of the ten students asked; d is the difference for each pair, with Σd = −11.)

1. Determine the difference between each pair of scores, d, by subtraction.
2. Determine the mean of the differences, the d values: Md = Σd/10 = −11/10 = −1.1
3. Calculate the standard deviation of the d values. Verify that sd = 1.101
4. Just as the standard error of the mean in the earlier tests was s/√n, determine the standard error of the mean for the difference scores (SEMd) by dividing the result of Step 3 by the square root of the number of pairs: SEMd = sd/√np = 1.101/√10 = 0.348
5. Divide Md by SEMd to determine t: t = Md/SEMd = −1.1/0.348 = −3.161
6. As noted earlier, the degrees of freedom for the critical value of t for this test are the number of pairs of scores, np − 1: t0.05(9) = 2.262
The calculated value of t exceeds the critical value from Table
5.1 (which is also Table B in
the Appendix). The result is statistically significant. Note that it
is the absolute value of the
calculated t in which we are interested. Because the question
was whether there is a sig-
nificant difference in the number of questions, it is a two-tailed
test, and it does not matter
which session had the greater number; it also does not matter
whether Session 1 is larger
than Session 2 or the other way around. The students in the
second session, where ques-
tions were followed by feedback, asked significantly more
questions than the students did
in the first session, when the instructor offered no feedback.
The Degrees of Freedom, the Dependent-Groups Test, and
Power
With Md = −1.1, there is comparatively little difference between the two sets of scores.
the two sets of scores.
What makes such a small mean difference statistically
significant? The answer is in the
amount of error variance in this problem. When the error
variance is also very small—the
standard error of the difference scores is just .348—
comparatively small mean differences
can be statistically significant. The rationale for using
dependent-
groups tests as opposed to independent-group designs is that the
for-
mer are comparatively more powerful; there is less error to
contend
with, thereby increasing the probability of rejecting the null
hypoth-
esis. This brings us to the discussion of power in statistical
testing.
Table B in the Appendix, the critical values of t, indicates that
critical
values decline as degrees of freedom increase. That occurs not
only in
the critical values for t but also for F in analysis of variance and
in fact
for most tables of critical values for statistical tests. For the
dependent-
groups t-test, the degrees of freedom are based on
• the number of pairs of related scores, minus 1.
For the independent-groups t-test, the degrees of freedom are
based on
• the number of scores in both groups, minus 2 (Chapter 5).
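The two degrees-of-freedom rules can be written out directly. With 10 raw scores per group, for example, the dependent test has 9 df while the independent test has 18:

```python
def df_dependent(n_pairs):
    # degrees of freedom: number of pairs of related scores, minus 1
    return n_pairs - 1

def df_independent(n1, n2):
    # degrees of freedom: number of scores in both groups, minus 2
    return n1 + n2 - 2

print(df_dependent(10))        # 9
print(df_independent(10, 10))  # 18
```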
This means that critical values are larger in a dependent-groups
test for the same number
of raw scores involved. But even a test with a larger critical
value can produce significant
results when there is more control of error variance. This is
what the dependent-groups
test provides. The central point is this: When each pair of scores
comes from the same par-
ticipant, or from a matched pair of participants, the random
variability from nonequiva-
lent groups is minimal because scores tend to vary similarly for
each pair, resulting in
relatively little error variance. The small SEMd value that
results more than compensates
for the fewer degrees of freedom and the associated larger
critical value connected to
dependent-groups tests.
In statistical testing, power is defined as the likelihood of detecting a significant difference when it is present. The more powerful statistical test is the
one that will most read-
ily detect a significant difference. As long as the sets of scores
are closely related, the
dependent-measures test is more powerful than the independent-
groups equivalent.
Try It! D: What does it mean to say that the within-subjects test has more power than the independent t-test?
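The power claim can be illustrated with a small Monte Carlo sketch, not from the text. Each simulated subject carries a stable individual effect, so the paired scores are closely related; the paired test then rejects a false H0 far more often than the independent test applied to the same data. The distributions, effect size, and trial count are all invented for illustration:

```python
import math
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the simulation is reproducible

def paired_t(x, y):
    # t = Md / SEMd, built from the difference scores
    d = [a - b for a, b in zip(x, y)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))

def independent_t(x, y):
    # t = (M1 - M2) / SEd, built from within-group variability
    sed = math.sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    return (mean(x) - mean(y)) / sed

trials = 2000
paired_hits = indep_hits = 0
for _ in range(trials):
    subject = [random.gauss(0, 2) for _ in range(10)]        # stable subject effects
    before = [s + random.gauss(0, 1) for s in subject]
    after = [s + 0.8 + random.gauss(0, 1) for s in subject]  # true effect = 0.8
    if abs(paired_t(after, before)) > 2.262:                 # critical t, df = 9
        paired_hits += 1
    if abs(independent_t(after, before)) > 2.101:            # critical t, df = 18
        indep_hits += 1

paired_power = paired_hits / trials
indep_power = indep_hits / trials
# related scores -> small error variance -> the paired test detects the effect far more often
```

Because the subject effect (SD 2) dwarfs the measurement noise (SD 1), most of the score variability cancels out of the difference scores, which is precisely why the dependent test is the more powerful one here.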
A Matched-Pairs Example
Another form of the dependent-groups t-test is the matched-
pairs design. In this approach,
rather than measure the same people repeatedly, each
participant in one group is paired
with a participant in the other group who is similar.
For example, consider a market analyst who wants to determine
whether a television com-
mercial will induce consumers to spend more on a breakfast
cereal. The analyst selects a
group of consumers entering a grocery store, induces them to
view the television com-
mercial, and then tracks their expenditures on breakfast cereal.
A second group is selected,
and they also shop, but they do not view the television
commercial. The analyst selects
people for the second group who match the age and gender
characteristics of those in the
first group. This controls for age and gender because those
characteristics might affect
spending for the particular product. Each individual from Group
1 has a companion in
Group 2 of the same age and sex. The expenditures in dollars
for the members of each
group and the solution to the problem are in Figure 7.3.
Figure 7.3: Calculating a matched-pairs t-test

(The Viewed and Did Not View columns in the figure list each matched shopper's cereal expenditures; d is the difference for each pair.)

Verify that Md = 1.125 and sd = 2.092
SEMd = sd/√np = 2.092/√10 = 0.662
t = Md/SEMd = 1.125/0.662 = 1.700
t0.05(9) = 2.262
The absolute value of t is less than the critical value from Table
5.1 (or Appendix Table B)
for df = 9. The difference is not statistically significant. There are probably several ways to
explain the outcome, but we will explore just three.
• The most obvious explanation is that the commercial did not
work. The shoppers
who viewed the commercial were not induced to spend
significantly more than
those who did not view it.
• Another explanation has to do with the matching. Perhaps age
and gender are
not related to how much people spend shopping for the
particular product. Per-
haps the shopper’s level of income is the most important
characteristic, and that
was not controlled in the pairing.
• Another explanation is related to sample size. Small samples
tend to be more
variable than larger samples, and variability is what the
denominator in the
t-ratio reflects. Perhaps if this had been a larger sample, the
SEMd would have
had a smaller value, and the t would have been significant.
The second explanation points out the disadvantage of matched-
pairs designs compared
to repeated-measures designs. The individual conducting the
study must be in a posi-
tion to know which characteristics of the participants are most
relevant to explaining the
dependent variable so that they can be matched in both groups.
Otherwise, it is impos-
sible to know whether a nonsignificant outcome reflects an
inadequate match, control of
the wrong variables, or a treatment that just does not affect the
DV.
Comparing the Dependent-Samples t-Test to the Independent t
In order to compare the dependent-samples t-test and the
independent t more directly, we
are going to apply both tests to the same data. This will
illustrate how each test deals with
error variance; however, a caution is necessary before
beginning: Once data is collected,
there really is no situation where someone can choose which
test to use because either the
groups are independent, or they are not. Therefore, we proceed
purely as an academic
exercise, recognizing that such a situation is not going to
happen in the ordinary course
of events.
As an example, a university program encourages students to
take a service learning class
that emphasizes the importance of community service as a part
of the students’ educa-
tional experience. Data is gathered on the number of hours
former students spend in com-
munity service per month after they complete the course and
graduate from the university.
• For the independent t-test, the students are divided between
those who took a
service learning class and graduates of the same year who did
not.
• For the dependent-groups t-test, those who took the service
learning class are
matched to a student with the same major, age, and gender who
did not take
the class.
The data and the solutions to both tests are in Figure 7.4.
Figure 7.4: The before/after t-test versus the independent t-test
Because the differences between the scores are quite consistent, as they tend to be when participants are matched effectively, there is very little variance in the difference scores. This results in a comparatively small standard deviation of difference scores and a small standard error of the mean for the difference scores. This allows t ratios with even relatively small numerators to be statistically significant. Because for the independent t-test there is no assumption that the two groups are related, error variance is based on the differences within the groups of raw scores, and the denominator is large enough that the t value is not significant.
[Figure 7.4 presents the scores in two columns, Class and No Class.]

SEd = √(SEM1² + SEM2²) = √(0.453² + 0.316²) = 0.553

As a matched-pairs t-test the result is

t = Md / SEMd = 0.650 / 0.211 = 3.081; t0.05(9) = 2.262. The result is significant.

As an independent t-test the result is

t = (M1 − M2) / SEd = (3.50 − 2.850) / 0.553 = 1.175; t0.05(18) = 2.101. The result is not significant.
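The contrast between the two denominators can be reproduced with a short stdlib Python sketch. The scores below are hypothetical, not the values from Figure 7.4; because the pairwise differences are very consistent, the paired t is large while the independent t computed from the same numbers stays modest.

```python
import math

# Hypothetical hours of monthly community service for matched graduates.
class_group = [4, 5, 3, 6, 4, 5, 2, 4, 3, 5]
no_class    = [3, 4, 3, 5, 3, 4, 1, 3, 2, 4]

def mean(xs):
    return sum(xs) / len(xs)

def sample_sd(xs):
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

# Dependent-samples t: error variance comes from the difference scores.
d = [a - b for a, b in zip(class_group, no_class)]
t_paired = mean(d) / (sample_sd(d) / math.sqrt(len(d)))

# Independent t: error variance comes from the raw scores in each group.
n1, n2 = len(class_group), len(no_class)
se_d = math.sqrt(sample_sd(class_group) ** 2 / n1 + sample_sd(no_class) ** 2 / n2)
t_indep = (mean(class_group) - mean(no_class)) / se_d
```

Both tests use the same numerator (the mean difference is 0.9 either way), yet t_paired comes out around 9.0 while t_indep is about 1.72, mirroring the significant/not-significant split in Figure 7.4.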
The Dependent-Groups t-Test on Excel
If the problem in Figure 7.4 is completed in Excel as a dependent-groups test, the procedure is as follows:
• Create the data file in Excel. Column A is labeled Class to indicate those who had the service learning class, and column B is labeled No Class. Enter the data, beginning with cell A2 for the first group and cell B2 for the second group.
• Click the Data tab at the top of the page.
• At the extreme right, choose Data Analysis.
• In the Analysis Tools window, select t-test: Paired Two Sample for Means and click OK.
• There are two blanks near the top of the window for Variable 1 Range and Variable 2 Range. In the first, enter A2:A11, indicating that the data for the first (Class) group is in cells A2 to A11. In the second, enter B2:B11 for the No Class group.
• Indicate that the hypothesized mean difference is 0. This reflects the value for the mean of the distribution of difference scores.
• Indicate A13 for the output range, so that the results do not overlay the data scores.
• Click OK.

Widen column A so that all the output is readable. The result is the screenshot in Figure 7.5.
Figure 7.5: The Excel output for the dependent-samples t-test using the data from Figure 7.4

In the Excel solution, t = 3.074 rather than the 3.081 from the longhand solution. The Excel approach is to calculate the correlation between scores to find a solution, rather than to determine the difference between scores as we did. Note that the Pearson correlation (which will be explained in Chapter 8) is indicated at .91. In any event, the very minor difference, .007, between the solution in Figure 7.4 and the Excel solution in Figure 7.5 is not relevant to the outcome, as it is attributable to rounding error. The Excel output also indicates results for both one-tailed and two-tailed tests at p = .05; the outcome is statistically significant.
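Excel's correlation-based route and the longhand difference-score route are algebraically the same test, because the variance of the difference scores equals s1² + s2² − 2·r·s1·s2. A stdlib sketch with invented scores confirms that the two forms agree:

```python
import math

# Hypothetical paired scores (not the textbook's data).
x = [7, 5, 8, 6, 9, 4, 7, 6, 8, 5]
y = [6, 5, 7, 4, 8, 4, 6, 5, 7, 4]
n = len(x)

def mean(v):
    return sum(v) / len(v)

def sd(v):
    m = mean(v)
    return math.sqrt(sum((a - m) ** 2 for a in v) / (len(v) - 1))

# Pearson r between the paired scores (the value Excel reports).
r = (sum(a * b for a, b in zip(x, y)) - n * mean(x) * mean(y)) / ((n - 1) * sd(x) * sd(y))

# Difference-score form: t = Md / (sd of differences / sqrt(n)).
d = [a - b for a, b in zip(x, y)]
t_diff = mean(d) / (sd(d) / math.sqrt(n))

# Correlation form: the 2*r*s1*s2 term removes the shared (matched) variance.
se = math.sqrt((sd(x) ** 2 + sd(y) ** 2 - 2 * r * sd(x) * sd(y)) / n)
t_corr = (mean(x) - mean(y)) / se
```

Any gap between the two values in practice, like the .007 above, comes only from rounding at intermediate steps, not from the method.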
Apply It!
Repeated Measures

Diabetes is a group of metabolic diseases in which the body cannot properly regulate blood sugar. Management of this disease is achieved by controlling normal levels of glucose in the blood for as much of the time as possible. This requires an accurate, portable glucose monitor for home use.

A medical device company has developed a new portable glucose monitor and wishes to compare it against a laboratory standard. This will produce a data set in which two different monitors measure the glucose level of 11 randomly chosen diabetes patients. Although the two monitors take the blood samples at the same time, this can be considered an example of the before/after dependent-samples t-test because the same group is measured twice. By using the same set of patients for both monitors, each patient is his or her own control. Obtaining two measurements for each patient reduces measurement variability compared to using two independent sets of patients. Choosing a level of significance of p ≤ .05, we use the paired-sample t-test to test the null hypothesis that there is no difference in measurements between the two monitors.
• H0: μglucose_portable_monitor = μglucose_lab_monitor

By rejecting H0 the company will find support for the alternative hypothesis that there is a significant mean difference in the glucose level between both machines.

• Ha: μglucose_portable_monitor ≠ μglucose_lab_monitor
The glucose readings from each of the two monitors are measured in milligrams per deciliter and are shown in the following table. There is a large variability within each column because each patient is different, and the readings were taken at various times of the day.
Patient Portable Monitor Laboratory Standard
A 112 120
B 85 82
C 103 116
D 154 168
E 65 75
F 52 51
G 85 96
H 72 79
I 167 178
J 123 141
K 142 153
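The excerpt hands this problem to Excel, but the paired t can also be computed directly from the table above with a few lines of stdlib Python (the critical value t0.05(10) = 2.228 comes from a standard t table, not from this excerpt):

```python
import math

# Paired (dependent-samples) t-test on the glucose readings tabled above.
portable = [112, 85, 103, 154, 65, 52, 85, 72, 167, 123, 142]
lab      = [120, 82, 116, 168, 75, 51, 96, 79, 178, 141, 153]

d = [p - l for p, l in zip(portable, lab)]   # difference scores
n = len(d)                                   # 11 patients
md = sum(d) / n                              # mean difference
sd_d = math.sqrt(sum((x - md) ** 2 for x in d) / (n - 1))
sem_d = sd_d / math.sqrt(n)                  # SE of the mean difference
t = md / sem_d                               # df = n - 1 = 10
```

On these data t works out to about −4.82; since |t| exceeds t0.05(10) = 2.228, the null hypothesis of no difference between the monitors would be rejected.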
Apply It! (continued)

The Excel solution follows:

             Variable 1   Variable 2
Mean         105.45       114.45
Variance     1428.67      1736.27
Observations 11           11

Comparing the Three Dependent t-Tests With the Independent t-Test

The before/after and matched-pairs approaches to calculating a dependent-groups t-test each have advantages. The before/after design provides the greatest control over the extraneous variables that can confound the results in a matched-pairs design. When using the matching approach, there is always a chance that subjects in Group 2 are not matched closely enough on some relevant variable and the resulting mismatches create error variance. In the service learning example, students were matched according to age, major, and gender. But if marital status affects students' willingness to be involved in community service and it is not controlled, there could be an imbalance of married/not-married