Jamie Cozens & Adam Thorne
Test-retest reliability of a standing isometric hamstring bridge test using force platforms to
examine isometric peak hamstring force in the dominant vs non-dominant leg
Introduction
Hamstring strains are common in a number of sports, most notably invasion games such as
football, where dynamic movements such as jumping and sprinting are frequent. In an
epidemiological study by Cohen et al (2011), the average numbers of training sessions and games
missed through hamstring strains were 11.3 and 2.6 respectively over a ten-year period, with the
number of games missed ranging from 0-16 across the 39 subjects. A number of factors affect the
amount of time spent on the side-lines, including the player’s age, injury history and the grade of
the injury. Over a season and within a large squad of players, this lost time represents a major
disadvantage for a sports team. Muscle strains have a high rate of recurrence (Orchard, Best and
Verrall, 2005), often due to incomplete rehabilitation when players return to play too quickly. As
such, reliably
assessing the ability of the player to safely return to play is crucial for reducing the recurrence rate
and improving player availability over the entire season. One such way of assessing muscle
damage and readiness to return to play is by measuring isometric peak hamstring force (IPHF).
Whilst accurate measures of IPHF can be obtained using an isokinetic dynamometer, the
equipment is expensive and lacks portability. Isometric contractions have been shown to be a safer
modality for reporting peak force than eccentric contractions, as they cause little or no structural
muscle damage in comparison (Nosaka, Newton and Sacco, 2002). McCall et al (2015) presented
two simple tests that produced reliable measures of IPHF and could be implemented in
significantly less time and in a field setting. The method involved subjects lying supine in two
conditions: a) with the hip at 90° (from full hip extension) and the foot resting on a force platform,
raised to elicit a knee angle of 90° (from full knee extension); or b) with the knee at 30° and the
hip at 30°. During a three-second contraction, the coefficients of variation for IPHF in both the
dominant and non-dominant legs, for both tests, were reported as lower than the 10% cut-off used
to declare reliability. However, it was suggested that the force produced during this test may have
been influenced by activation of the hip extensor muscles, which would incorrectly reflect IPHF. This
problem was also experienced during pilot testing of a similar supine protocol from our group,
with subjects facing difficulty resisting hip extension. Standing upright as opposed to lying supine
may limit the extent of hip extension and, as such, provide a more accurate assessment of IPHF,
whilst maintaining the quick and easy-to-administer characteristics of the former test. Therefore,
the aim of the study was to investigate the test-retest reliability of a standing isometric hamstring
test reporting IPHF.
Methods
Subject Characteristics
Six subjects (five male, one female) working at St George's Park completed the study. All subjects
completed the IPHF test at least one week prior to testing in order to familiarise themselves with
the protocol. All subjects reported a healthy state on the day of testing, with no previous lower limb
injuries.
Protocol
Subjects were required on two separate days, separated by 24 hours, for testing and retesting, with
the same protocol used on each day. Subjects completed a standardised warm-up at 80-90 watts on
a watt-bike for five minutes. Following a five-minute rest, testing began. Subjects placed their
dominant leg on a zeroed force plate (Pasco Scientific, Inc. Roseville, CA, USA, PS-2141), which
was attached to a flat, solid board ratcheted to a couch that could be adjusted for height and
proximity. With their foot placed in the centre of the force plate, the couch was adjusted until hip
angle was at 90° and knee angle was at 20° from full extension. An allowance of 5° either side
was given. Angles were measured using a handheld goniometer. To calculate force production
relative to leg mass, the leg to be worked was weighed upon the force plate. The subjects were
instructed to stand with the head, upper back, buttocks and heel flat against the wall and their arms
across their chest. In order to fully activate the hamstring muscles, after a three second countdown
the subjects were instructed to push down on the platform through the heel, by attempting to curl
their heel back towards the wall for three seconds. Following the effort, subjects were allowed to
surrender the hamstring bridge position and rest for two minutes before completing the next two
trials in order to negate the effects of fatigue on force production. For the repeated efforts, the foot
was placed again on the centre of the platform and the trials were completed without re-measurement
of angles, to replicate the field testing procedure. Following completion of the dominant leg, the
non-dominant leg underwent the same test, and within a week the testing protocol was repeated
to determine test-retest reliability. During each trial, isometric peak force was recorded along
with the rate of force development (RoFD) at 100, 150 and 200 ms.
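The protocol does not state how RoFD was derived from the force trace. As a minimal sketch only, assuming RoFD is taken as the average slope of the trace from contraction onset to each time point, and assuming a hypothetical 1000 Hz sampling rate and a known onset sample (neither is reported here), the calculation might look like this. The function name and the example trace are illustrative, not taken from the study.

```python
def rate_of_force_development(force_n, onset_index, window_ms, sample_rate_hz=1000):
    """Average slope (N/ms) of the force trace over window_ms after onset.

    Assumption: RoFD = (F(onset + window) - F(onset)) / window. The study
    does not report its exact definition, sampling rate or onset criterion.
    """
    samples = round(window_ms * sample_rate_hz / 1000)
    end_index = onset_index + samples
    if end_index >= len(force_n):
        raise ValueError("force trace is shorter than the requested window")
    return (force_n[end_index] - force_n[onset_index]) / window_ms


# Hypothetical trace: 50 ms of quiet standing, then force rising 2 N every ms.
trace = [0.0] * 50 + [2.0 * i for i in range(300)]
rofd = {ms: rate_of_force_development(trace, 50, ms) for ms in (100, 150, 200)}
```

Under this definition the result depends directly on where onset is placed, so any inconsistency in when a subject starts pushing would carry straight through to the RoFD values.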
Statistical Analysis
A Shapiro-Wilk test was used to test the data for normal distribution, with a paired-samples t-test
used to test for differences between the test and retest averages for the dominant and non-dominant
legs. Significance was set at p < 0.05. A spreadsheet by Hopkins (2002) was used to determine the
change in mean (CIM) between the test and retest trials, the typical error of measurement (TE) and
the intraclass correlation coefficient (ICC), each with 90% confidence limits. The minimal difference
needed to be considered real (MD) was calculated as TE × 1.96 × √2 (Weir, 2005). The limits of
agreement for IPHF were calculated in SPSS (SPSS 22.0) and plotted using the Bland-Altman
method. Effect sizes for normally distributed between-test comparisons were calculated using
Cohen’s d.
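The MD formula can be illustrated with a short sketch. It assumes the typical error is the standard deviation of the test-retest differences divided by √2, which is the usual definition in this kind of reliability analysis, though the internals of the spreadsheet used here are not described. The function names are illustrative. The inputs are the dominant-leg IPHF values reported in Tables 1 and 2.

```python
import math


def typical_error(sd_of_differences):
    """Typical error of measurement: SD of test-retest differences / sqrt(2).

    Assumption: this is the definition applied by the reliability spreadsheet.
    """
    return sd_of_differences / math.sqrt(2)


def minimal_difference(te):
    """Minimal difference needed to be considered real: TE x 1.96 x sqrt(2)."""
    return te * 1.96 * math.sqrt(2)


# Dominant-leg IPHF: SD of differences from Table 1, TE from Table 2.
te_dominant = typical_error(10.97)        # about 7.76 N
md_dominant = minimal_difference(7.76)    # about 21.51 N
```

Because the two √2 factors cancel, MD is equivalent to 1.96 times the standard deviation of the differences, so a retest change smaller than roughly 21.5 N in the dominant leg could not be distinguished from measurement noise.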
Results
Dominant and non-dominant IPHF and RoFD were normally distributed. Paired-samples t-tests
showed no significant difference between test and retest for either dominant or non-dominant
IPHF; the results are shown in Table 1. Paired-samples t-tests were also run for the
RoFD of the non-dominant leg at 100 ms, 150 ms and 200 ms respectively (t = -.232, p = .826, ES
= -0.03; t = -.138, p = .896, ES = -0.02; t = .073, p = .945, ES = -0.02). No significant
difference was found for the RoFD of the dominant leg at
100 ms, 150 ms and 200 ms respectively (t = -1.730, p = .144, ES = -0.21; t = -.700, p = .515, ES =
-0.07; t = -.551, p = .605, ES = -0.07). All effect sizes were reported as small. Table 2 presents
the reliability variables for IPHF (typical error, change in mean, intraclass correlation coefficient
and minimal difference) of the tests performed on the force plate. Tables 3 and 4 present the same
reliability variables for RoFD of the dominant and non-dominant legs respectively.
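The IPHF t statistics in Table 1 can be cross-checked from the summary values alone. The sketch below assumes the Mean and Std. Deviation columns of Table 1 are the mean and standard deviation of the test-retest differences, and uses n = 6 from the Methods. The function name is illustrative.

```python
import math


def paired_t(mean_difference, sd_of_differences, n):
    """Paired-samples t statistic: mean difference / (SD of differences / sqrt(n))."""
    return mean_difference / (sd_of_differences / math.sqrt(n))


# Summary values as reported in Table 1 (assumed to describe the differences).
t_dominant = paired_t(-9.51, 10.97, 6)        # close to the reported -2.123
t_non_dominant = paired_t(-11.8, 13.17, 6)    # close to the reported -2.195
```

With only five degrees of freedom, a t value of this size falls short of the p < 0.05 threshold, which is consistent with the non-significant results reported.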
Table 1. t-test means, standard deviations, effect sizes and the magnitude of the effect sizes for
dominant and non-dominant IPHF.
IPHF          t       p      ES     ES magnitude  Mean    Std. deviation
Dominant      -2.123  0.087  -0.11  Small         -9.51   10.97
Non-dominant  -2.195  0.080  -0.12  Small         -11.8   13.17

Table 2. Reliability of simple and quick test for isometric peak hamstring force in standing
subjects.
                           TE (90% CL)          Change in mean (90% CL)  ICC (90% CL)       MD
Dominant leg test (N)      7.76 (5.22 - 16.22)  9.52 (0.48 - 18.55)      0.99 (0.95 - 1.0)  21.51
Non-dominant leg test (N)  9.31 (6.26 - 19.46)  11.80 (0.97 - 22.63)     0.99 (0.94 - 1.0)  25.81
Note: TE - typical error of measurement; CL - confidence limits; ICC - intraclass correlation
coefficient; MD - minimal difference.

Table 3. Reliability of simple and quick test for rate of force development in dominant leg of
standing subjects.
                      TE (90% CL)            Change in mean (90% CL)  ICC (90% CL)        MD
RoFD at 100ms (N/ms)  46.96 (31.56 - 98.12)  46.91 (-7.73 - 101.55)   0.92 (0.67 - 0.98)  130.17
RoFD at 150ms (N/ms)  37.46 (25.18 - 78.26)  15.14 (-28.44 - 58.72)   0.99 (0.93 - 1.0)   103.8
RoFD at 200ms (N/ms)  40.36 (27.12 - 84.31)  12.84 (-34.11 - 59.79)   0.98 (0.89 - 0.99)  111.85

Table 4. Reliability of simple and quick test for rate of force development in non-dominant leg
of standing subjects.
                      TE (90% CL)             Change in mean (90% CL)  ICC (90% CL)        MD
RoFD at 100ms (N/ms)  83.15 (55.88 - 173.73)  11.15 (-85.59 - 107.89)  0.93 (0.71 - 0.98)  230.49
RoFD at 150ms (N/ms)  65.87 (44.27 - 137.62)  5.25 (-71.38 - 81.88)    0.93 (0.70 - 0.98)  182.6
RoFD at 200ms (N/ms)  70.66 (47.49 - 147.62)  -2.99 (-85.19 - 79.22)   0.87 (0.46 - 0.96)  195.9

A Bland-Altman plot was used to determine the limits of agreement for the IPHF in the
dominant and non-dominant legs (Bland and Altman, 1986). Data points lying between the upper
and lower limits indicate good test-retest reliability. Figures 1 and 2 show the test-retest averages
for each subject plotted against the test-retest difference and the subsequent limits of agreement.
See appendices for RoFD and corresponding limits of agreement (Figures 3-8).
Figure 1. Subject test-retest average plotted against the test-retest difference and the limits of
agreement for IPHF in the dominant leg. Plotted reference lines: mean = -9.51; +1.96 SD = 10.1;
-1.96 SD = -29.2. Note: SD = standard deviation.
Figure 2. Subject test-retest average plotted against the test-retest difference and the limits of
agreement for IPHF in the non-dominant leg. Plotted reference lines: mean = -11.8; +1.96 SD = 11.76;
-1.96 SD = -35.36. Note: SD = standard deviation.
Mean and limits of agreement were used to determine the range of the test-retest difference for
both the dominant and non-dominant IPHF; the results are shown below in Table 5.
Table 5. Mean difference, upper and lower limits of agreement and the range of both dominant
and non-dominant isometric peak hamstring force.
IPHF              Mean difference  Upper limit  Lower limit  Range
Dominant (N)      -9.52            10.13        -29.16       39.3
Non-dominant (N)  -11.8            11.76        -35.36       47.1
Discussion
The primary aim of this study was to investigate the test-retest reliability of a standing hamstring
bridge test. To the best of our knowledge, this is the first study to examine this body position and
the results have been encouraging. Throughout testing, this novel method proved to be quick and
easy to administer, making it suitable for application within a practical setting. The IPHF values
recorded were not only highly correlated but also fell between the upper and lower limits of
agreement in both dominant and non-dominant legs.
The range between the upper and lower limits of agreement was 39.3 N and 47.1 N for the
dominant and non-dominant legs respectively. In relation to the average means across the subjects,
this represented a variation equating to almost 16% of their average maximal effort (249 N). A
similar hamstring test by McCall et al (2015) showed a mean difference of 11-16%, although that
study focused on simulated match fatigue. McCall et al reported IPHF in a supine position as a
useful tool for measuring isometric force reductions in professional football players. Limits of
agreement will differ between studies, but the range of difference between test and re-test averages
of the McCall study was consistent with the present study. A difference between 38 N and 54 N was
recorded, although that study was completed by professional French Ligue 1 football players, so it
can be expected that their results would be more consistent than those of the recreationally active
subjects used in the present study.
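As a quick arithmetic check of the ranges in Table 5 and the percentage quoted in the Discussion, the sketch below recomputes both from the reported limits of agreement. It assumes the 249 N average maximal effort applies to both legs, since only a single average is given. The function names are illustrative.

```python
def agreement_range(lower_limit, upper_limit):
    """Width of the limits of agreement, in the same units as the limits (N)."""
    return upper_limit - lower_limit


def percent_of(value, reference):
    """Express value as a percentage of a reference value."""
    return 100 * value / reference


AVERAGE_MAX_EFFORT_N = 249  # single average reported; assumed here for both legs

range_dominant = agreement_range(-29.16, 10.13)
range_non_dominant = agreement_range(-35.36, 11.76)
pct_dominant = percent_of(range_dominant, AVERAGE_MAX_EFFORT_N)
pct_non_dominant = percent_of(range_non_dominant, AVERAGE_MAX_EFFORT_N)
```

On these figures the "almost 16%" corresponds to the dominant leg; under the same assumption the non-dominant range works out closer to 19% of the average maximal effort.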
One key question arising from this is whether such a range represents a justifiable and reliable
difference between trials for subjects. A possible explanation for this high, but nonetheless
reliable, fluctuation is the technique adopted by the subjects. Prior to testing, a pilot study
was carried out which highlighted a number of issues regarding the technique used to produce
force, such as hip extension, heel placement and arm position. This hip extension and subsequent
change in hip angle allowed for an increase in vertical force when compared with the original test
measures. For example, one subject exhibited an IPHF equal to or greater than another subject
despite a noticeable difference in body type and limb length. Although anthropometry cannot be
discarded so easily, the results from this pilot study showed vast differences between test and
retest trials, which coincided with an ever-changing technique. Although these issues were resolved
in the present study, it remains important to continue enforcing strict coaching cues to remove
inconsistencies in technique and improve reliability. Furthermore, it is possible that the element of
competition, with the aim of beating previous scores or attempting to increase the magnitude of the
force trace displayed to the subjects, influenced the subjects into producing force by any means
necessary, at the expense of correct form. Therefore, in the present study, subjects were blinded to
their own and others’ scores and were unable to view the force trace until the test-retest protocol
had been completed in both legs.
Another noteworthy finding is that the majority of the subject test-retest averages fell between the
mean and lower limit of agreement, suggesting a fatiguing effect during trials. Each subject
completed three trials on each leg during both test and retest sessions, with a two-minute rest period
implemented between each trial. Verbal feedback from subjects during testing indicated a fatiguing
effect, with three subjects reporting tightness in their hamstrings, and another commenting
on the uncomfortable starting position for the IPHF recording. While the latter issue was removed
via two minutes of rest without maintaining the starting position, the former could be attributed to
the warm-up procedure adopted. For the present study, subjects were required to cycle for 5
minutes at a power output of 80-90 watts. This protocol was adopted from
previous lab-based warm-ups for testing at the present study’s facilities, including heat chamber
exposure and isokinetic dynamometry tests. However, McCall et al (2015) implemented a 10-
minute warm-up, with seven minutes of pedalling at 90 watts followed by three minutes at 120 watts,
for a supine isometric hamstring bridge test. It is worth mentioning that their study focussed on
simulated match fatigue; as such, a higher-intensity warm-up was adopted. Although a fatiguing
effect will naturally occur during testing, the warm-up quality and quantity can play a significant
role in the severity of the fatiguing state experienced. Further testing could investigate the effect
of warm-up intensity on IPHF and to what extent the results lie within the limits of agreement in
order to determine the effectiveness of simulated match fatigue. Another potential avenue to
pursue is a study of test-retest versus a simulated fatigued state, in order to determine the impact
that one or more intense bouts of exercise would have on consistency.
Rate of force development (RoFD) was another variable collected throughout testing, with the
results published in the appendices of this report. Although considered an interesting data set to
examine, results showed a wide range of force development throughout subject trials. This could
stem from the informal countdown from three, with many subjects commencing their trial before
the end of the countdown, whilst others started after this. Although this relates to consistency in
coaching points and verbal instructions throughout testing, it is clear that rate of force development
was more consistent in the dominant leg than the non-dominant leg. Further research could
investigate torque ratios throughout testing and their effect on IPHF consistency.
Limitations
As only one practitioner was present for the trials, it is possible that technique may still have
been inconsistent despite attempts to standardise this. Employing a second practitioner to
monitor technique and manually limit hip extension and knee flexion would be recommended.
Although every effort was made to limit the movement of the couch, a sturdier but adjustable
platform specifically designed for IPHF testing may improve reliability further.
Conclusion
The present study has demonstrated that a standing hamstring bridge test is a reliable method for
determining IPHF in the general population. This method is potentially more practical than
previously mentioned methods and as such represents a more suitable alternative for use within a
sports setting. Further research should focus on determining the sensitivity of the method with
regards to post-exercise fatigue, for example after a competitive match.
References
Cohen, S. B., Towers, J. D., Zoga, A., Irrgang, J. J., Makda, J., Deluca, P. F., and Bradley, J. P.
(2011, 04). Hamstring Injuries in Professional Football Players: Magnetic Resonance Imaging
Correlation with Return to Play. Sports Health: A Multidisciplinary Approach, 3(5), 423-430.
Hopkins, W. G. (2015). Spreadsheets for analysis of validity and reliability. Sports Science, 19,
36-42.
McCall, A., Nedelec, M., Carling, C., Gall, F. L., Berthoin, S., and Dupont, G. (2015, 04).
Reliability and sensitivity of a simple isometric posterior lower limb muscle test in professional
football players. Journal of Sports Sciences, 33(12), 1298-1304.
Orchard, J., Best, T. M., and Verrall, G. M. (2005, 11). Return to Play Following Muscle Strains.
Clinical Journal of Sport Medicine, 15(6), 436-44.
Nosaka, K., Newton, M., and Sacco, P. (2002). Responses of human elbow flexor muscles to
electrically stimulated forced lengthening exercise. Acta Physiologica Scandinavica, 174, 137–
145.
Weir, J. P. (2005). Quantifying test-retest reliability using the intraclass correlation coefficient
and the SEM. Journal of Strength & Conditioning Research, 19, 231–240.
Hopkins, W. G. (2002). A scale of magnitude for effect sizes. Retrieved from
http://www.sportsci.org/resource/stats/effectmag.html
Bland, J. M. and Altman, D. G. (1986). Statistical methods for assessing agreement between two
methods of clinical measurement. The Lancet, 307-310.
Appendices
Figure 3. Subject test-retest average plotted against the test-retest difference and the limits of
agreement for RoFD at 100ms in the dominant leg.
Figure 4. Subject test-retest average plotted against the test-retest difference and the limits of
agreement for RoFD at 150ms in the dominant leg.
Figure 5. Subject test-retest average plotted against the test-retest difference and the limits of
agreement for RoFD at 200ms in the dominant leg.
Figure 6. Subject test-retest average plotted against the test-retest difference and the limits of
agreement for RoFD at 100ms in the non-dominant leg.
Figure 7. Subject test-retest average plotted against the test-retest difference and the limits of
agreement for RoFD at 150ms in the non-dominant leg.
Figure 8. Subject test-retest average plotted against the test-retest difference and the limits of
agreement for RoFD at 200ms in the non-dominant leg.