INTRODUCTION

Sports medicine professionals use various clinical tests to determine a patient’s capacity to produce force. Isokinetic dynamometry is currently considered the gold standard for determining single-joint muscle strength.1 However, it is not widely available in the clinical setting due to limitations of cost, space, and requisite time for testing. An alternative that is frequently used clinically is isometric testing with a hand-held dynamometer (HHD). This approach has been shown to be reliable for various joints when set up in a rigid and repeatable manner,2 and has been reported to be significantly correlated with isokinetic testing.3,4 Although previous work has demonstrated good to excellent reliability for clinician-stabilized HHD,4 belt stabilization is typically recommended.2 The benefit of belt stabilization may be more apparent when testing larger muscle groups capable of higher force production if the clinician or patient is not able to maintain a stable and rigid testing position.5,6 Further supporting this notion, the strength and sex of the examiner have been shown to influence reliability.5–7 Utilizing belt fixation or other forms of external fixation is one method that may mitigate the limitation of the clinician or patient’s strength.7–10 External fixation has been accomplished clinically through various means, such as using gait belts, bracing against tables or walls, and using custom-built frames and devices.

One common HHD assessment is knee flexion torque production. This measure may have utility in determining inter-limb deficits and comparing to normative data, and may also allow a clinician to monitor progress throughout rehabilitation, determine effectiveness of interventions, and inform return-to-sport decision making.11,12 It should be acknowledged that this measure may not be directly used to infer (re)injury risk or sport performance. Prior literature has included a variety of methods with an HHD to obtain this measure.3,7,13–20 These studies collectively include a variety of patient positions, joint angles, and stabilization methods. It is important to note that knee flexion torque may vary depending on test position, with potentially more torque-producing capacity at longer muscle lengths (hip flexion and knee extension), possibly due to a muscle’s length-tension relationship.21,22 One should be cognizant of the testing positions and methods, as these may be relevant when interpreting tests or comparing to previously reported data.

It is important to understand which clinically feasible positions and methods yield reliable measures, as this gives insight into the force-producing capacity of a muscle to aid in clinical decision making. In the clinical environment, it is also important to be efficient during testing, as a measure of knee flexion torque represents only a portion of typical testing batteries. If reliable measures can be efficiently obtained during assessment of one physical quality, more time is available to ensure appropriate and reliable measurement of other qualities, as well as to educate the patient on the interpretation of the results and possible interventions based on those results. It should be noted that many prior studies that support using belt fixation over clinician stabilization have been completed for the assessment of relatively strong and large muscle groups such as the knee and hip extensors.7–10 The knee flexors are typically capable of much less force production than the knee extensors, which is supported by a recent review including nearly 14,000 participants.22 Because of this, clinician stabilization when testing the knee flexors may be less likely to yield unacceptable reliability than when testing the knee extensors. Therefore, the purpose of this study was to determine inter- and intra-rater reliability of two clinically efficient methods of assessing isometric knee flexion torque using an HHD with clinician stabilization. The hypothesis was that each method would yield good to excellent inter-rater (between testers, within session) and intra-rater (within each tester, across sessions) reliability.

METHODS

Subjects

Participants were recruited for this cross-sectional study as a convenience sample via an organizational email and word of mouth. Participants were assessed on two separate days (average seven days between sessions). Sessions were scheduled as close as possible to the same time of day to account for potential circadian variation.23 Exclusion criteria included prior history of hip or knee injury requiring medical intervention and inability to understand testing procedures or provide consent. This study was approved by the Lawrence Memorial Hospital Institutional Review Board, and participants provided written informed consent and were given the opportunity to ask any study-related questions prior to participating.

Procedures

Participants began each session with a self-selected three to five minute warm-up on a stationary bike, elliptical, or treadmill. Moment arm length was measured as the distance between the center of the lateral femoral condyle and the most lateral point of the lateral malleolus in the seated position using a standard measuring tape. This distance spans from the knee joint axis of rotation24 to the point of force application and was used in calculating torque. Dynamometry testing was completed in two positions, by two examiners, on both the dominant and non-dominant limb. Limb dominance was determined by asking “which leg would you prefer to kick a ball with?” The order of the tester, limb, and position was randomized on the first day and the same order was repeated for the second session. Figure 1 displays the process of testing among position, limb, and examiner. Each examiner was blinded to the results of the other examiner. The testing positions included 1) seated at the edge of a table with the hips and knees flexed to 90° with hands gripping the sides of the table for stabilization, with the examiner holding the dynamometer between the leg of the table and the leg of the participant (posterior to the lateral malleolus), and 2) lying prone on the table with the hip at 0° and knee at 90° with hands gripping the table for stabilization, while the examiner assumed a stride stance position with elbows locked in extension and hands overlapping (Figure 2). This examiner position was chosen based on the experience of the examiners, as it was believed to afford adequate stability for the test.

The dynamometer used for testing was a Hoggan MicroFET2® HHD (Hoggan Health Industries, Salt Lake City, UT, USA). This device has previously demonstrated good to excellent inter- and intra-rater reliability, as well as concurrent validity compared to a fixed dynamometer for a knee flexion torque test.15 “Make” tests were utilized, meaning that the participant volitionally produced as much force as possible during each test. Prior to testing trials for each session, one to three submaximal trials were completed for familiarization. Following this, three maximal-effort trials were completed. The participant was instructed to gradually increase force during the first second of a three to five second maximal-effort push into the dynamometer. Vigorous verbal encouragement was provided by the examiner to help ensure that maximal effort was achieved. Approximately 10 seconds of rest was given between contractions as the examiner recorded the data, and two to three minutes of rest was given between examiners. If an individual tester noted a trial to be greater than ~20% different from the other two trials for that same session and tester, then an additional 30-second rest was given, and another trial was completed.
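The repeat-trial rule described above can be sketched in a few lines of Python. This is only an illustration, not the examiners' actual procedure: the function name `trials_needing_repeat` is hypothetical, and comparing each trial to the mean of the other two is an assumed reading of "~20% different from the other two trials", which the text does not define precisely.

```python
def trials_needing_repeat(peak_forces_n, threshold=0.20):
    """Return the indices of trials that differ from the mean of the
    remaining trials by more than `threshold` (as a fraction).

    ASSUMPTION: "different from the other two trials" is interpreted as
    the relative difference from the mean of the other trials; the study
    describes the rule only approximately (~20%).
    """
    flagged = []
    for i, value in enumerate(peak_forces_n):
        others = [f for j, f in enumerate(peak_forces_n) if j != i]
        reference = sum(others) / len(others)
        if reference > 0 and abs(value - reference) / reference > threshold:
            flagged.append(i)
    return flagged


# Hypothetical peak forces (N) from three maximal-effort trials.
consistent_session = [200.0, 205.0, 198.0]
inconsistent_session = [200.0, 205.0, 150.0]

print(trials_needing_repeat(consistent_session))
print(trials_needing_repeat(inconsistent_session))
```

Under this reading, a flagged trial would prompt the additional 30-second rest and one further attempt.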

Figure 1. Testing procedure order.

The order of the examiner, position, and limb was randomized on the first session and repeated on the second session for each participant.

Figure 2. Testing methods.

A) The participant is seated at the edge of a table with the hip and knee flexed to 90 degrees and hands gripping the sides of the table while the clinician stabilizes the dynamometer between the participant’s leg and table. B) The participant is prone with the hip at 0 degrees and knee at 90 degrees and hands gripping the sides of the table while the clinician assumes a stride stance with elbows locked in extension to stabilize the dynamometer on the participant’s leg.

Statistical Analysis

Participant demographics were reported using descriptive statistics. Peak force (N) was recorded from each trial and converted to torque using the shank length (torque = force × moment arm). The average peak force of the three maximal attempts from each limb, position, and examiner was used for analysis. The intraclass correlation coefficient (ICC) and 95% confidence intervals were calculated in SPSS v.26 (IBM, Armonk, NY) to determine inter- and intra-rater reliability. ICC values were classified as poor (<0.50), moderate (0.50-0.75), good (0.75-0.90), and excellent (>0.90).25 An a priori alpha of 0.05 was used for statistical analysis. The standard error of measurement (SEM) was calculated as SD × √(1 − ICC).26 The minimal detectable change (MDC) was calculated as 1.96 × √2 × SEM.26 MDC was also reported as a percentage.
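The calculations described in this section can be sketched as follows. This is a minimal illustration rather than the analysis actually run in SPSS: the function names are hypothetical, the choice of which SD to pass in is left to the caller because the text does not specify it, expressing MDC as a percentage of the mean torque is an assumption, and the handling of the shared boundaries (0.75 and 0.90) in the ICC categories is likewise an assumption.

```python
import math


def torque_nm(peak_force_n, moment_arm_m):
    """Torque (Nm) as peak force (N) multiplied by moment arm length (m)."""
    return peak_force_n * moment_arm_m


def sem(sd, icc):
    """Standard error of measurement: SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc(sem_value):
    """Minimal detectable change: 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem_value


def mdc_percent(mdc_value, mean_value):
    """MDC as a percentage.

    ASSUMPTION: the percentage is taken relative to the mean torque;
    the text only states that MDC was also reported as a percentage.
    """
    return 100.0 * mdc_value / mean_value


def classify_icc(icc):
    """Map an ICC to the categories used in this study.

    ASSUMPTION: the boundary values 0.75 and 0.90 are assigned to "good",
    since the published ranges share their end points.
    """
    if icc < 0.50:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"


# Hypothetical values for illustration only (not study data).
example_torque = torque_nm(peak_force_n=200.0, moment_arm_m=0.40)
example_sem = sem(sd=10.0, icc=0.91)
example_mdc = mdc(example_sem)
print(example_torque, example_sem, example_mdc, classify_icc(0.91))
```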

RESULTS

Inter-rater Reliability

Twenty healthy recreationally active individuals participated in this study (Table 1). Inter-rater reliability (between testers, within session) was good to excellent for both the seated and prone positions for both the dominant and non-dominant limbs. MDC ranged from 30-45% for the seated position and 21-40% for the prone position (Table 2).

Table 1. Participant demographics
Characteristic       Mean ± StD
Age (yrs)            30.4 ± 8.9
Height (cm)          173.2 ± 14.4
Weight (kg)          77.5 ± 17.7
Sex (male/female)    11/9

cm=centimeters, kg=kilograms, StD=standard deviation, yrs=years

Table 2. Torque and inter-rater reliability for each limb, position, and testing session
                     Rater 1           Rater 2           Inter-rater reliability
Visit    Test        Mean ± StD (Nm)   Mean ± StD (Nm)   ICC (95% CI)       SEM (Nm)  SEM (%)  MDC (Nm)  MDC (%)
Visit 1  Seated ND   65.0 ± 37.7       66.5 ± 33.3       0.92 (0.75,0.97)   9.8       15       27.3      41
Visit 1  Seated D    77.2 ± 34.0       73.8 ± 30.2       0.93 (0.78,0.98)   8.3       11       23.1      31
Visit 1  Prone ND    56.4 ± 22.7       63.9 ± 22.9       0.96 (0.87,0.99)   4.7       8        12.9      21
Visit 1  Prone D     53.1 ± 18.4       69.0 ± 22.6       0.84 (0.51,0.95)   8.8       14       24.3      40
Visit 2  Seated ND   78.7 ± 36.5       70.1 ± 32.4       0.88 (0.62,0.96)   12.0      16       33.2      45
Visit 2  Seated D    85.3 ± 35.6       80.3 ± 39.0       0.94 (0.82,0.98)   8.8       11       24.5      30
Visit 2  Prone ND    59.1 ± 20.0       70.3 ± 26.3       0.94 (0.82,0.98)   5.7       9        15.9      25
Visit 2  Prone D     57.0 ± 17.6       73.1 ± 24.2       0.88 (0.62,0.96)   7.9       12       21.8      34

CI=confidence interval, D=dominant, ICC=intraclass correlation coefficient, MDC=minimal detectable change, ND=non-dominant, Nm=Newton meters, SEM=standard error of measurement, StD=standard deviation

Intra-rater Reliability

Intra-rater reliability (within each tester, across sessions) was good for the seated position and excellent for the prone position for examiner 1, while it was good to excellent for both positions for examiner 2. MDC ranged from 43-62% for the seated position and 23-34% for the prone position (Table 3).

Table 3. Intra-rater reliability for each limb, position, and rater
Rater    Test        ICC (95% CI)       SEM (Nm)  SEM (%)  MDC (Nm)  MDC (%)
Rater 1  Seated ND   0.89 (0.69,0.96)   12.5      17       34.6      48
Rater 1  Seated D    0.82 (0.53,0.94)   14.8      18       40.9      50
Rater 1  Prone ND    0.94 (0.82,0.98)   5.2       9        14.4      25
Rater 1  Prone D     0.94 (0.82,0.98)   4.6       8        12.6      23
Rater 2  Seated ND   0.90 (0.75,0.96)   10.6      16       29.4      43
Rater 2  Seated D    0.77 (0.48,0.91)   17.3      22       47.9      62
Rater 2  Prone ND    0.90 (0.76,0.96)   8.2       12       22.7      34
Rater 2  Prone D     0.89 (0.73,0.96)   8.4       12       23.2      33

CI=confidence interval, D=dominant, ICC=intraclass correlation coefficient, MDC=minimal detectable change, ND=non-dominant, Nm=Newton meters, SEM=standard error of measurement

DISCUSSION

The purpose of this investigation was to determine inter- and intra-rater reliability of isometric knee flexion torque production during two clinically efficient and pragmatic testing methods. The hypothesis that good to excellent inter-rater reliability (between testers, within session) and intra-rater reliability (within each tester, across sessions) would be found for both methods was supported.

These findings for inter-rater reliability are consistent with previous reports. Others who have investigated HHD assessment of knee flexion in a seated position (90° hip flexion, 90° knee flexion) have reported good to excellent inter-rater ICC values15–17 ranging from 0.82 to 0.99. Specifically, Mentiplay et al.15 used perhaps the most comparable seated method to this study (90° hip flexion, 90° knee flexion, with clinician stabilization) and reported ICC values of 0.82-0.92. Others who used a seated position (90° hip flexion, 90° knee flexion, with external fixation) reported ICC values of 0.93-0.97 in one study16 and 0.99 in another.17 In the prone position, prior work7,19,20 has investigated various degrees of hip and knee flexion, with all included angles yielding good ICC values from 0.82 to 0.87. None of these prior studies assessed the prone position used in this investigation (0° hip flexion, 90° knee flexion). The most comparable to the prone position in this study was possibly that of van der Made et al.19 (0° hip flexion, 15° knee flexion, with clinician stabilization), who reported ICC values of 0.80-0.84. Other studies utilizing a prone method reported ICC values of 0.84 (0° hip flexion, 0° knee flexion, with external fixation),7 0.82 (0° hip flexion, 15° knee flexion, with external fixation),19 and 0.87 (45° hip flexion, 30° knee flexion, with external fixation).20

Regarding intra-rater reliability, prior literature13–15,17,18 has been highly variable in the seated position, with poor to excellent ICC values ranging from 0.49 to 0.98. As noted above, Mentiplay et al.15 had the most comparable seated method to this study and reported intra-rater ICC values of 0.89-0.96. Others that assessed the seated position at these same joint angles with external fixation reported ICC values of 0.77,13 0.62-0.66,14 0.98,17 and 0.49.18 The authors are not aware of a prior study reporting intra-rater reliability for the prone position at 0° hip flexion, 90° knee flexion. In the prone position (45° hip flexion, 30° knee flexion), Wollin et al.20 reported a good intra-rater ICC value of 0.86.

Clinical Utility

The importance of the findings in this study may be highlighted in that neither testing approach involved additional devices or set-up time for external fixators or other equipment. Indeed, inter- and intra-rater reliability was generally high for both methods despite not utilizing external fixation for the dynamometer or participant, and was similar to values previously reported using fixed HHD for knee flexion torque.7,13,16,17,19,20 It may be argued that the seated position in this study offers a form of external fixation (the table) and therefore is no longer entirely clinician-stabilized by definition. The use of the table leg does add a novel aspect to this assessment while maintaining clinical pragmatism, and offers another seated method available to the clinician in addition to the seated methods from prior studies described above.13–18 Further, one should not interpret this study as a comparison of clinician stabilization to external fixation, only as an investigation of the reliability of two clinically efficient and pragmatic assessment methods.

Most prior studies do not directly compare clinician stabilization to external fixation for knee flexion specifically, so it may not be directly concluded whether the stabilization condition influenced reliability for this specific joint assessment. In one study that did compare belt stabilization of the dynamometer to clinician stabilization in the prone position (0° hip flexion, 15° knee flexion), the ICC values for inter-rater reliability were 0.82 and 0.84 respectively, suggesting that the belt stabilization did not influence reliability for that specific method of assessment.19 Further, utilizing external or belt stabilization does not necessarily mean that good or excellent reliability will be achieved. For example, Martins et al.14 reported only moderate reliability (ICC: 0.62-0.66) and Toonstra et al.18 reported poor reliability (ICC: 0.49) despite utilizing external fixation of the dynamometer when assessing knee flexion torque. Additionally, both van der Made et al.19 and Mentiplay et al.15 showed good to excellent reliability for knee flexion torque assessment without a belt or external stabilization method. This may suggest that other aspects of the assessment method influence reliability, which may include but are not limited to the actual device used for stabilization, the dynamometer, the patient positioning, clinician positioning, characteristics of the clinician (e.g., sex, weight, strength), the instructions given, and both the patient’s and clinician’s familiarity with the assessment method. It should be made clear that this lack of influence of external stabilization is only being suggested for isometric knee flexion torque assessment. Reliability of assessment of other joints and actions that are expected to produce greater torque, such as knee extension and hip extension, has been shown to generally be higher with an external stabilization method than with clinician stabilization.8–10,27 This is intuitive, as a clinician would reasonably have more difficulty providing adequate stabilization against larger torque values.

Some clinicians may suggest that the extra time taken for HHD and patient fixation is a deterrent to obtaining the objective measurement. This deterring factor is mitigated with the methods of testing in this study, while still offering an acceptable form of stabilization. In the seated position, the participant’s own body weight stabilized the table providing for an immovable table leg to push the dynamometer against so that examiner strength is not a limiting factor. In the prone position, although the HHD is not fixated against an immovable object since it is held in place entirely by the examiner, the position assumed by the examiner (stride standing with elbows in full extension [Figure 2]) allowed for rigid enough stabilization to yield good to excellent reliability. Further, the participants were instructed to hold the table with both hands during each method to further provide some level of patient stabilization and limit compensatory mechanisms. It must be noted that sex, strength, and weight of the examiner and the patient could still be reasonably expected to influence reliability.5–7 Wikholm and Bohannon6 suggested reliability is more likely to be influenced by tester strength when participant force generation is greater than 120 Newtons. Below that value, reliability was not influenced by the strength of the tester. It is logical that this threshold may be variable among positions, joints, and actions assessed. It may be notable that both examiners in this investigation were males weighing >200 pounds that regularly participate in resistance training, which may have contributed to the observation of good to excellent reliability.5–7 Nonetheless, this study does provide options for clinically efficient and feasible methods of assessing knee flexion torque production.

One important observation in this investigation is the MDC values. MDCs were higher for the seated position, with values as high as 45% and 62% for inter- and intra-rater reliability, respectively. This is higher than previously reported inter-rater MDCs of up to 25-29%,15,16 while previously reported intra-rater MDCs have been highly variable, with the largest values ranging from 24-61%.14,15,18 This represents a potential limitation of utilizing this testing method despite the good to excellent reliability. These high MDC values suggest that a reassessment would need to yield a relatively large change from the initial assessment for the clinician to be confident that a real change in knee flexion torque production capacity has occurred. If a change does not exceed this large MDC, then the clinician may just be observing expected variations in force output for the method of testing. This indicates a potential supporting element for testing in the prone position, as MDC values were as low as 21% and 23% for inter- and intra-rater reliability, respectively (Tables 2 and 3). This is more comparable to prior studies, which report inter-rater MDCs ranging from 13-25%7,19,20 and an intra-rater MDC of 14%.20 The reason for the lower MDCs in the prone position may be mathematically due to lower standard deviations relative to the mean recorded with that method.

The examiners subjectively noted that some participants in the seated position directed their line of force straight posterior into the dynamometer, while some appeared to direct their force in a slightly more superiorly oriented vector. This may have been due to a slight compensation of concurrent hip flexion by the participants. Despite the dynamometer being held against an immovable table leg, the back of the dynamometer was a rounded surface that occasionally tended to tilt slightly against the flat leg of the table. This may have resulted in a relative decrease in the amount of force directed perpendicular to the dynamometer, since the force vector was oriented slightly superior. As participants had varying degrees of force vector orientation, this may partially explain the larger relative standard deviations observed in the seated position and, subsequently, the larger SEM and MDC calculations. If choosing to assess with the seated method utilized in this study, one should take care to keep the face of the dynamometer perpendicular to the force vector produced by the patient. The clinician may improve this assessment by ensuring both hands firmly grasp both sides of the dynamometer and by utilizing adequate practice trials to ensure the participant is not adopting a compensatory hip flexion strategy.
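The effect of an off-axis force vector can be illustrated with basic trigonometry. The sketch below assumes, as a simplification that was not measured in this study, that the dynamometer registers only the component of force normal to its face, so that a tilt of θ between the applied force and that normal scales the reading by cos θ. The function name and the example angles are hypothetical.

```python
import math


def force_normal_to_dynamometer(applied_force_n, tilt_deg):
    """Component of the applied force that is perpendicular to the
    dynamometer face.

    ASSUMPTION: the device reads only the normal component, so the
    reading is applied_force * cos(tilt). Tilt angles were not
    recorded in the study; the values below are illustrative only.
    """
    return applied_force_n * math.cos(math.radians(tilt_deg))


# Hypothetical 200 N effort at several tilt angles.
for tilt in (0, 10, 20, 30):
    reading = force_normal_to_dynamometer(200.0, tilt)
    loss_pct = 100.0 * (1.0 - reading / 200.0)
    print(f"tilt {tilt:>2} deg: reading {reading:6.1f} N, loss {loss_pct:4.1f}%")
```

Because the loss grows with the tilt angle, participants who adopt different degrees of tilt would be expected to add between-participant and between-trial variability to the seated measure under this simplified model.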

The overall results of this study suggest that either of the clinically applicable assessment methods utilized in this study may be used to obtain a reliable measure of knee flexion torque production. The prone method may offer an advantage in that the MDC values are lower, indicating that this method may be more sensitive to detecting a true change when reassessing throughout the course of rehabilitation. Both methods provide the advantage of clinical efficiency, as the only equipment required is a table and an HHD, with no additional time and attention devoted to fixating the HHD or patient with external devices. When time and equipment constraints are not limiting, the authors still suggest utilizing whichever available methods of patient and dynamometer fixation afford the most rigid and repeatable set-up. This should especially be done if the clinician does not feel confident in their ability to stabilize the dynamometer, or if the patient demonstrates compensations. Future research should identify clinically efficient, pragmatic, and reliable torque assessments for various joint angles and actions.
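How the MDC might be applied at reassessment can be sketched as below. The MDC values are the Rater 1 intra-rater values for the dominant limb from Table 3 (12.6 Nm prone, 40.9 Nm seated); the patient torques and the function name `exceeds_mdc` are hypothetical and serve only to illustrate the comparison.

```python
def exceeds_mdc(baseline_nm, followup_nm, mdc_nm):
    """Return True when the absolute change between two assessments is
    larger than the minimal detectable change for the method used."""
    return abs(followup_nm - baseline_nm) > mdc_nm


# Intra-rater MDC values (Nm) for Rater 1, dominant limb, from Table 3.
MDC_PRONE_D_NM = 12.6
MDC_SEATED_D_NM = 40.9

# Hypothetical patient: 60 Nm at baseline, 80 Nm at follow-up.
baseline, followup = 60.0, 80.0

print("prone method:", exceeds_mdc(baseline, followup, MDC_PRONE_D_NM))
print("seated method:", exceeds_mdc(baseline, followup, MDC_SEATED_D_NM))
```

In this hypothetical case, the same 20 Nm improvement would exceed the prone MDC but not the seated MDC, which is the practical sense in which a lower MDC makes a method more sensitive to change.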

Limitations

There are several limitations of this investigation that should be acknowledged. First, there was no formal power analysis completed prior to commencement of the study, which may have influenced the statistical results. Participants may not have achieved true maximal force production on all repetitions, as an approximate 10-second break between trials may arguably have been inadequate. However, this testing protocol was chosen as it represents pragmatic testing during actual patient care when time may be a limiting factor. Additionally, the limb tested was alternated between positions to mitigate this, and the examiners anecdotally did not observe any consistent decrease in performance between trials. Regarding familiarization, there was not a true familiarization protocol in which the testing procedures were completed without data collection on a separate day. However, the same process of submaximal familiarization trials was used for each testing session for consistency. Both examiners were males weighing >200 pounds who regularly participate in resistance training; therefore, these results may not generalize to clinician populations not sharing these characteristics. Further, the participants were all healthy individuals with no history of significant knee or hip injuries, and results in an injured population may differ. Finally, the observed results are specific to the particular methods of testing in this study and should not be assumed to generalize across other testers, body positions, joint angles, etc.

CONCLUSION

The results of this study support that both the seated and prone positions with clinician stabilization may be utilized as a reliable means of determining knee flexion torque. The prone position yielded lower MDC values, suggesting that it may be more sensitive in detecting actual change across multiple assessments. While it is suggested to use the most rigid and repeatable methods of stabilization that time and equipment afford, the clinician-stabilized methods utilized in this study offer a clinically efficient means of assessing knee flexion torque in a pragmatic clinical environment.


Conflicts of Interest

None declared.