Restoration of muscle strength is a primary goal after orthopedic knee and shoulder injuries. Deficits in quadriceps and hamstring peak force place patients at a greater risk of re-injury following anterior cruciate ligament reconstruction and hamstring strains, respectively.1,2 Similarly, decreased shoulder internal and external rotation strength is associated with recurrent anterior shoulder instability.3 Despite the importance of restoring muscle force production after injury, many clinicians still rely on highly subjective methods to assess muscle strength.
In recent surveys of physical therapists’ return-to-sport decision-making process, 56% reported using manual muscle testing (MMT) as their sole means of measuring knee strength after anterior cruciate ligament reconstruction and 83% relied on MMT for the upper extremity.4,5 MMT is a semi-quantitative method using joint motion against gravity and manual resistance to grade muscle force on a 6-point ordinal scale (0 = No observed muscle activity, 5 = “Normal” muscle function).6 While quick and easy to perform, MMT has varying inter-rater reliability and limited sensitivity to detect strength deficits once a minimal threshold of force production is achieved.6–9
Electromechanical dynamometry (BIO) and hand-held dynamometry (HHD) are more objective alternatives to MMT. BIO is currently the gold standard in measuring single-joint peak force production.10 While demonstrating excellent reliability and sensitivity to detect between-limb peak force production deficits, BIO is not routinely used in clinical settings due to its high cost, space requirements, and lengthy testing procedures. Hand-held dynamometry, a portable and more affordable dynamometer, has demonstrated acceptable reliability and validity in both the knee and shoulder when compared to BIO.10–12 Despite the accessibility of HHD, continued reliance on MMT suggests that cost may be one of the barriers to widespread adoption.4,5
Commercially available crane scales (CS), or strain gauges, have been explored as a method to assess single-joint muscle force production, with potentially greater objectivity than MMT and substantially lower cost than electromechanical dynamometers or HHD devices. However, the reliability and validity of these devices have not been established for measuring single-joint muscle force production. Therefore, the purpose of this study was to determine whether a commercially available CS is a valid alternative for isometric single-joint peak force testing of the knee and shoulder when compared to HHD and BIO. The two primary aims were to determine the between-trials test-retest reliability and concurrent criterion validity of an accessible CS for measuring isometric knee and shoulder strength. It was hypothesized that the CS would demonstrate excellent between-trials test-retest reliability (intraclass correlation coefficients [ICCs] > 0.9) and clinically acceptable concurrent criterion validity when compared to BIO and HHD (ICCs > 0.75).
METHODS
Participants
The University of Pittsburgh Institutional Review Board approved the study, and all participants provided written informed consent. A convenience sample of healthy, physically active individuals, 18 to 45 years old, was recruited through email solicitation, posted flyers, and word-of-mouth from the University of Pittsburgh community from May 1st through December 31st, 2022. Individuals were excluded if they currently had a musculoskeletal injury to the upper or lower extremities or a health condition that limited their ability to complete strenuous exercise, per the American College of Sports Medicine Preparticipation Health Screening Recommendations.13 An a priori power analysis was not performed.
Study Design
This was a single-session, between-trials test-retest reliability and concurrent criterion validity study. The device of interest was a Klau Mini Weighing Crane Scale (KlauDirect), rated for loads up to 300 kg. The device was chosen for investigation due to its wide availability from online retailers, ability to hold peak force readings, and range of force measurement. The gold standard comparator for concurrent criterion validity was a Biodex System 4 in isometric mode (Biodex Medical Systems, Inc., Shirley, NY, USA). A Lafayette Hand-Held Dynamometer (Model 01663, Lafayette Instrument, Lafayette, IN, USA) was used as an additional comparator since these devices are commonly available in clinical settings.
Age, height, weight, sex, and dominant limb were collected before strength testing. The participants’ preferred lower extremity to kick a soccer ball was established as the dominant lower limb. Their preferred upper extremity for throwing tasks and activities of daily living was established as the dominant upper limb.
Isometric muscle strength was assessed using each device for five motions: knee extension (EXT), knee flexion (FLEX), shoulder internal rotation (IR), shoulder external rotation (ER), and shoulder abduction (ABD). To limit any influence of limb dominance and fatigue, one limb was tested for each participant, and the limb to be tested and the order of device testing were randomized using a block randomization scheme (www.randomizer.org) with a block size of 12. The study team was not blinded to the randomization scheme before participant enrollment. For each device, motions were evaluated in a standardized order as follows: EXT, FLEX, IR, ER, then ABD. Testing was completed by three student physical therapists in the final year of their education, who completed approximately two hours of training before initiation of the study. Testing for each device was completed by the same examiner across all participants. Given the nature of the study design, testers were not blinded to testing results for each device and testing position.
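The block randomization described above can be illustrated with a short sketch. The study generated its scheme with www.randomizer.org and does not report how the blocks were composed, so the structure below is an assumption: each block of 12 contains every combination of two limbs and the six possible orders of three devices exactly once. The limb labels, the function name block_randomize, and the seed are hypothetical.

```python
import itertools
import random

DEVICES = ("CS", "HHD", "BIO")
LIMBS = ("dominant", "non-dominant")  # assumed labels, not from the study


def block_randomize(n_participants, seed=None):
    """Return one (limb, device_order) allocation per participant.

    Assumption: each block holds every limb x device-order combination
    exactly once (2 limbs x 6 orders = 12) and is shuffled independently.
    """
    rng = random.Random(seed)
    cells = [
        (limb, order)
        for limb in LIMBS
        for order in itertools.permutations(DEVICES)
    ]
    allocations = []
    while len(allocations) < n_participants:
        block = cells[:]        # fresh copy of the 12 combinations
        rng.shuffle(block)      # random order within the block
        allocations.extend(block)
    return allocations[:n_participants]


# Hypothetical usage: a schedule for 20 participants.
schedule = block_randomize(20, seed=1)
for limb, order in schedule[:3]:
    print(limb, "->", ", ".join(order))
```

With 20 participants the second block is only partly used, so limb and device order are balanced exactly within the first 12 allocations and approximately across the full sample.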
Isometric Strength Testing
A standardized warm-up and testing process was followed for all devices and motions. Participants were instructed to complete two warmup repetitions at 50 and 75 percent of their self-perceived maximal intensity for each testing position. Approximately 30 seconds were provided between warmup repetitions. The moment arm for CS and HHD testing was recorded using a flexible tape measure. Three maximal intensity trials were then completed in each testing position with 60 seconds between trials. Once all motions were evaluated for the respective device, approximately three to five minutes were provided to switch between devices. Participants received strong verbal encouragement during all testing trials to ensure maximal intensity was achieved. Testing positions and the degree of stabilization for the CS and HHD were selected to closely mimic how these tools are being utilized in a clinical setting.
Knee Extension
Knee extension testing was performed with the participant seated at 90° of hip and knee flexion. If necessary, a cushioned pad was added to the seat to ensure that the participant’s foot did not contact the ground. For the CS assessment, the device was attached to the lower leg using a padded ankle cuff positioned two finger breadths above the lateral malleolus. The CS was oriented behind the lower leg, perpendicular to the shank, and attached to an immovable object using an inelastic strap (Figure 1A). For the HHD assessment, the device was secured to the front of the lower leg, approximately two finger breadths above the lateral malleolus using an inelastic loop, with the opposite end of the loop wrapped around an immovable object.14 The moment arm for CS and HHD testing was recorded as the distance in centimeters from the lateral femoral condyle to the center of the ankle cuff along the lateral lower leg. The shank cuff for the BIO assessment was attached two finger breadths above the lateral malleolus, and inelastic straps were used to stabilize the thigh, waist, and trunk.14 Participants were instructed to hold onto the chair or stabilization handles to limit extraneous movement.
Knee Flexion
Participant positioning for knee flexion was identical to knee extension for each testing device. For the CS assessment, the CS was oriented in front of the lower leg, perpendicular to the shank, and attached to an immovable object using an inelastic strap (Figure 1B). The examiner stood behind the participant and stabilized the chair to maximize stabilization during testing trials. For the HHD assessment, the device was secured to the back of the lower leg, approximately two finger breadths above the lateral malleolus using an inelastic loop, with the opposite end of the loop wrapped around an immovable object.15 The moment arm for CS and HHD testing was recorded as the distance in centimeters from the lateral femoral condyle to the center of the ankle cuff along the lateral lower leg. The shank cuff for the BIO assessment was attached two finger breadths above the lateral malleolus.15
Shoulder External Rotation
Shoulder external rotation for CS and BIO measurements was performed with the participant seated at 90° of hip and knee flexion, approximately 0° of shoulder abduction, neutral shoulder rotation, 90° of elbow flexion, and neutral forearm position. For the CS assessment, the device was attached to the distal forearm using a wrist cuff positioned just proximal to the wrist joint. The CS was oriented medial to the upper extremity, perpendicular to the forearm, and attached to an immovable object using an inelastic strap. A small towel was placed between the elbow and the trunk to maintain the arm at 0° abduction. For the HHD assessment, the participant was positioned supine with the shoulder abducted and elbow flexed to 90°. The device was placed on the dorsal forearm, just proximal to the wrist joint, with stabilization of the HHD provided by the tester.11 The moment arm for CS and HHD assessment was recorded as the distance in centimeters from the lateral epicondyle of the humerus to the center of the wrist cuff or the HHD pad along the dorsal forearm. Participants were placed into a padded elbow attachment for the BIO assessment while holding a stabilized handle in the aforementioned testing position.11
Shoulder Internal Rotation
Participant positions for shoulder internal rotation were identical to shoulder external rotation for each testing device. For the CS assessment, the device was attached to the distal forearm using a wrist cuff positioned just proximal to the wrist joint. The CS was oriented lateral to the upper extremity, perpendicular to the forearm, and attached to an immovable object using an inelastic strap (Figure 1D). For the HHD assessment, the participant positioning was identical to external rotation but the device was placed on the volar forearm, just proximal to the wrist joint, with stabilization provided by the tester.12 The moment arm for CS and HHD assessment was recorded as the distance from the lateral epicondyle of the humerus to the center of the wrist strap or dynamometer along the dorsal forearm. Participants were placed into a padded elbow attachment for the BIO assessment while holding a stabilized handle in the aforementioned testing position.11
Shoulder Abduction
CS and BIO measurements of shoulder abduction were completed with the participant seated at 90° of hip and knee flexion, 90° of shoulder elevation in the scapular plane, and neutral shoulder rotation. For the CS assessment, the device was attached to the distal forearm using a wrist cuff positioned just proximal to the wrist joint. The CS was oriented inferior to the upper extremity, perpendicular to the forearm, and attached to an immovable object using an inelastic strap (Figure 1C). For the HHD assessment, the device was secured to the dorsal forearm, just proximal to the wrist joint, with stabilization provided by the tester while the participant lay supine with the shoulder abducted to 90°.16 The moment arm for CS and HHD assessment was recorded as the distance from the acromioclavicular joint to the center of the wrist cuff. Participants held a stabilized handle for the BIO assessment while seated in the aforementioned testing position.17
Statistical Analyses
Between-trials test-retest reliability was examined using intraclass correlation coefficients (ICCs) from a single-rating, absolute-agreement, two-way mixed-effects model.18 ICCs were classified as poor (<0.5), moderate (0.5 to <0.75), good (0.75 to <0.90), or excellent (0.90 or above).18 The standard error of measurement (SEM) was calculated based on the obtained ICCs: $\mathrm{SEM} = s\sqrt{1 - \mathrm{ICC}}$, where s = standard deviation of sample measurements.19 Additionally, the SEM was used to calculate the minimal detectable change at the 95% confidence level (MDC95%): $\mathrm{MDC}_{95\%} = 1.96 \times \sqrt{2} \times \mathrm{SEM}$.20
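The reliability statistics above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' SPSS procedure: the ICC point estimate uses the standard mean-squares expression for a single-rating, absolute-agreement, two-way model; the SEM and MDC95% use the conventional forms SEM = s × sqrt(1 − ICC) and MDC95% = 1.96 × sqrt(2) × SEM; and s is taken here as the standard deviation of all pooled measurements. The function names and the force values are hypothetical.

```python
import numpy as np


def icc_a1(ratings):
    """Single-rating, absolute-agreement, two-way ICC point estimate.

    ratings: array-like of shape (n_participants, k_trials).
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()  # between participants
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()  # between trials
    ss_total = ((x - grand) ** 2).sum()
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def sem(sd, icc):
    """Standard error of measurement from a sample SD and an ICC."""
    return sd * np.sqrt(1.0 - icc)


def mdc95(sem_value):
    """Minimal detectable change at the 95% confidence level."""
    return 1.96 * np.sqrt(2.0) * sem_value


# Hypothetical peak forces (N): 5 participants x 3 trials.
forces = np.array([
    [310.0, 322.0, 318.0],
    [405.0, 398.0, 412.0],
    [256.0, 262.0, 259.0],
    [480.0, 471.0, 488.0],
    [350.0, 344.0, 357.0],
])
icc = icc_a1(forces)
sem_n = sem(forces.std(ddof=1), icc)  # assumption: s = SD of pooled values
print(f"ICC = {icc:.3f}, SEM = {sem_n:.1f} N, MDC95 = {mdc95(sem_n):.1f} N")
```

As a consistency check, applying mdc95 to the SEM range reported in the Results (7.2 to 23.3 N) gives approximately 20.0 to 64.6 N, which matches the reported MDC95% range.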
Before validity analyses, CS and HHD results were converted to newton meters (Nm). Since the direction of force was oriented approximately 90 degrees relative to the moment arm, the equation was simplified to $\tau = MA \times F$, where τ = torque, MA = moment arm, and F = force.
Validity was examined using ICCs, Pearson’s correlation coefficients, absolute and relative error calculations, Bland-Altman plots, and simple linear regression. Normality was confirmed for torque values of each device and motion using Shapiro-Wilk tests and visual inspection of quantile-quantile plots. ICCs were computed using an average-rating, absolute-agreement, two-way mixed-effects model and classified according to the cut-offs described above.18 Pearson’s correlation coefficient (r) was used to examine the relationships between torque obtained by the CS and torque obtained by BIO and HHD. Correlation coefficients were classified as negligible (<0.10), weak (0.10 to 0.39), moderate (0.40 to 0.69), strong (0.70 to 0.89), or very strong (0.90 to 1.00).21 While ICCs are the preferred method for examining the concurrent criterion validity of continuous measures,18 Pearson’s correlation coefficient was calculated to facilitate comparison with previous studies. Bland-Altman plots were constructed with 95% limits of agreement (LoA) and bias estimates.22 Additionally, simple linear regression was utilized, with the difference between devices regressed on the average of measurements from the CS and each alternative device. The resulting linear regression line was overlaid on the Bland-Altman plots and visually inspected for evidence of a proportional bias.23 A substantial deviation of the regression line slope from zero would imply a non-constant bias across the range of measurement. Bland-Altman plots with regression lines were generated in Prism 10 v. 10.2.2 (GraphPad Software, LLC, Boston, MA, USA) and all other analyses were performed in SPSS Statistics v. 28 (IBM Corp., Armonk, NY, USA).
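The torque conversion and the agreement statistics described above can be sketched as follows. This is a minimal illustration, not the authors' Prism or SPSS workflow. It assumes force is already expressed in newtons and the moment arm in centimeters, that differences are computed as CS minus comparator (so a negative bias means the CS reads lower), and that absolute and relative error are the mean absolute difference and the mean absolute difference as a percentage of the comparator value; the study does not state its exact error definitions. All names and data values are hypothetical.

```python
import numpy as np


def torque_nm(force_n, moment_arm_cm):
    """Torque (N*m) for a force applied perpendicular to the moment arm."""
    force = np.asarray(force_n, dtype=float)
    arm_m = np.asarray(moment_arm_cm, dtype=float) / 100.0  # cm -> m
    return force * arm_m


def agreement_summary(test, reference):
    """Bland-Altman bias, 95% limits of agreement, and proportional-bias slope."""
    test = np.asarray(test, dtype=float)
    reference = np.asarray(reference, dtype=float)
    diff = test - reference            # negative: test device reads lower
    mean = (test + reference) / 2.0
    bias = diff.mean()
    sd_diff = diff.std(ddof=1)
    # Regress the difference on the average; a slope far from zero
    # suggests the bias changes with torque magnitude.
    slope, intercept = np.polyfit(mean, diff, 1)
    return {
        "bias": bias,
        "loa_lower": bias - 1.96 * sd_diff,
        "loa_upper": bias + 1.96 * sd_diff,
        "slope": slope,
        "intercept": intercept,
        "abs_error": np.abs(diff).mean(),
        "rel_error_pct": (np.abs(diff) / reference).mean() * 100.0,
    }


# Hypothetical data for four participants.
cs_force_n = [180.0, 220.0, 260.0, 300.0]
moment_arm_cm = [35.0, 36.0, 34.0, 37.0]
cs_torque = torque_nm(cs_force_n, moment_arm_cm)
bio_torque = np.array([70.0, 90.0, 100.0, 130.0])

summary = agreement_summary(cs_torque, bio_torque)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this made-up example the bias is negative and the slope is negative, the pattern one would expect when a device underestimates the comparator by a larger margin at higher torques.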
RESULTS
A total of 20 healthy participants were recruited. Participants were 50% female and had a mean (SD) age of 25.4 (3.8) years, height of 172.1 (9.2) cm, weight of 74.2 (13.3) kg, and body mass index of 24.9 (2.9) kg/m2. All 20 participants completed CS testing for all motions. However, due to equipment malfunctions, 19 participants completed BIO shoulder abduction testing, 12 completed HHD knee extension testing, and 11 completed HHD testing for all other motions.
Test-Retest Reliability
ICCs were excellent for between-trials test-retest reliability (Table 1). The SEM ranged from 7.2 to 23.3 N and the MDC95% ranged from 20.0 to 64.6 N (Table 1).
Concurrent Validity – Biodex (BIO)
ICCs for validity ranged from poor to moderate, with substantial absolute and relative error (Table 2). The correlation between devices was strong for knee flexion and moderate for all other motions (Table 2). Absolute and relative errors were greater for knee motions than for shoulder motions. On average, the CS underestimated the values obtained by BIO, with a negative bias for all motions at the knee and shoulder (Figure 2). There appeared to be evidence for a proportional bias with knee extension and shoulder abduction, with bias increasing as the average torque produced increased (Figures 2B and 2E).
Concurrent Validity – Hand-held Dynamometry (HHD)
ICCs for validity were excellent (Table 3). Very strong correlations were found between devices for knee flexion, shoulder IR, and shoulder abduction, and strong correlations for knee extension and shoulder external rotation. Relative error was 9.7% or greater for all motions evaluated but less than the error observed for validity with BIO (Table 3). On average, the CS underestimated values obtained by HHD, with a negative bias for all tested motions (Figure 3). Similar to the comparison with BIO, there was evidence for a proportional bias with knee extension and shoulder abduction, with bias increasing as the average torque produced increased (Figures 3B and 3E).
See Supplementary Tables 1-3 for results of linear regression.
DISCUSSION
This study investigated a CS as an alternative to BIO and HHD for measuring isometric knee and shoulder strength. The CS device demonstrated excellent between-trial test-retest reliability for assessing force during knee and shoulder strength testing. Concurrent criterion validity was moderate to poor compared to gold standard BIO but was excellent compared to HHD. In general, the CS underestimated torque values obtained by both BIO and HHD and there was evidence of a proportional bias for knee extension and shoulder abduction.
The hypothesis regarding between-trials test-retest reliability was supported, as ICCs were excellent (ICCs >0.9) for all motions. Therefore, the CS provides consistent values within a single testing session. Since all data were collected at a single time point, it was not possible to assess between-session reliability. As a result, the coefficients, SEM, and MDC values should not be generalized to measurements obtained across different testing sessions.
The hypotheses regarding concurrent validity were partially supported. Compared to the gold standard BIO, ICCs were poorer than expected (ICCs < 0.75) for all motions. However, concurrent validity with HHD was clinically acceptable, with excellent ICCs (ICCs >0.9) for all motions. The CS underestimated torque values obtained by BIO and HHD for both comparisons. The magnitude of underestimation was much larger for BIO than for HHD. Additionally, there appeared to be a proportional bias for knee extension and shoulder abduction when compared to either alternative device. The CS tended to underestimate torque values to a greater degree as torque increased in these instances.
To the authors’ knowledge, this is the first study to investigate the reliability and validity of a readily available CS for assessing knee and shoulder strength. One previous study investigated the concurrent criterion validity of an in-line dynamometer for knee extension strength in participants with and without anterior cruciate ligament reconstruction.24 Validity compared to gold standard BIO was excellent (ICCs = 0.97 – 0.98) and there was no evidence of fixed or proportional bias. However, knee extension strength was assessed at 60° of knee flexion using a substantially more costly and less accessible in-line dynamometer specifically designed to measure muscle strength. Therefore, these results are not directly comparable.
In the absence of prior research exploring CSs as a method of strength assessment, the authors deemed it necessary to include an analysis of concurrent validity with HHD. HHD is more widely used in clinical settings than BIO. While it is unlikely that CSs will supplant BIO, they may yield results similar to those obtained through HHD. Consequently, comparing the current findings with existing literature comparing BIO and HHD provides valuable context for this study.
Most prior studies comparing BIO and HHD focused on the knee and evaluated validity using correlation coefficients other than ICCs, such as Pearson’s or Spearman’s coefficients. Additionally, there is substantial variability in the testing position and degree of stabilization during HHD across studies. Correlation coefficients ranged from 0.75 – 0.93 for knee extension and 0.69 – 0.88 for knee flexion.14,15,25–28 The observed correlation coefficients in the current study between the CS and BIO were lower for knee extension but similar for knee flexion. Two studies examined the concurrent validity of shoulder external and internal rotation with correlation coefficients ranging from 0.65 to 0.79 and 0.66 to 0.82, respectively.12,29 The correlation coefficients observed in the current study were lower for both motions. Although commonly used, correlation coefficients are a measure of relationship, not agreement, and do not account for fixed bias between two measures.30 Only one study has utilized ICCs to examine concurrent validity between BIO and HHD.31 The authors found ICCs ranging from 0.72 to 0.87 for knee flexion and 0.40 to 0.92 for knee extension. The ICCs in the current study were similar for knee flexion but lower for knee extension. Combined, it appears that the concurrent validity of the CS is similar to that of HHD for knee flexion but lower for knee extension and shoulder motions.
A key decision in designing the study was to prioritize ecological validity and generalizability by selecting device set-ups and participant positioning that mimic how CSs and HHDs are typically used in clinical practice. Consequently, the CS and HHD assessments involved substantially less stabilization compared to the BIO assessments and to previous investigations of similar devices.14,24 For instance, thigh and trunk straps were not used during lower extremity testing with the CS and HHD, as such stabilization is rarely employed in clinical practice based on the authors’ experience. The reduced stabilization may have introduced extraneous movements and reduced force or torque production.32,33 These factors could partially explain why the CS substantially underestimated torque relative to the BIO, as well as the differing concurrent validity findings between the BIO and HHD and the inconsistencies with prior research. The authors suggest that clinicians use the maximum amount of stabilization that is feasible when using the devices to improve measurement accuracy.
Several limitations to this study must be acknowledged. First, the study was conducted in a relatively small sample of healthy individuals. As a result, there was low precision around the point estimates of the validity coefficients, a risk of type II error if interpreting the significance of linear regression, and limited generalizability to injured populations. The concerns regarding a small sample size were compounded by equipment malfunction, which further reduced the sample size for HHD measurements. To minimize the risk of type II error, the authors relied on visual inspection of regression lines and Bland-Altman plots, rather than statistical significance, to identify fixed and proportional bias. A second limitation was the length of the data collection, since testing was performed for five motions with three devices. This increased the potential for physical and mental fatigue. The order of device testing was randomized to limit bias introduced by physical fatigue. Finally, the participant positioning was not identical for all devices. The HHD set-up for shoulder motions deviated from the CS set-up to mirror clinical use. Due to mechanical limitations, it was not possible to exactly recreate the CS set-up for shoulder motions on the BIO. These discrepancies in positioning likely contributed to the decreased concurrent validity.
Clinicians should carefully weigh the affordability and convenience of commercially available CSs against their unknown longitudinal reliability and concerns regarding concurrent validity before integrating these devices into their clinical practice for measuring knee or shoulder strength. At the time of this study, the cost of the CS was approximately $50, substantially lower than comparable HHDs ($1,000) and electromechanical dynamometers ($40,000).
Although the CS exhibited excellent between-trials reliability, clinicians should be aware that these values were obtained without removing and reapplying the device, which would have likely resulted in lower reliability coefficients. Additionally, between-session and interrater reliability remain unexplored, requiring caution when comparing values over time or between different examiners. Finally, the minimal detectable change needs to be established over longer time intervals to determine whether perceived changes in strength are more than measurement error.
Despite lower-than-expected concurrent validity compared to BIO across all motions, the CS demonstrated excellent concurrent validity with HHD, suggesting its ability to produce values comparable to those obtained via HHD. However, clinicians should be aware that concerns have been raised about the reliability and validity of HHD for large muscle groups, such as the knee extensors.10,34 If clinicians intend to calculate a limb symmetry index based on these measurements, concerns regarding validity may be mitigated, as the index may minimize the impact of a fixed bias. However, limb symmetry indexes were not examined in this study, as only one side was tested per participant.
CONCLUSIONS
The results of this study indicate that a readily available CS demonstrates excellent between-trials test-retest reliability for assessing isometric knee and shoulder strength. Concurrent validity with HHD is also excellent, but moderate to poor compared to BIO. Therefore, a CS may provide values similar to HHD but is not comparable to BIO. Future research should investigate longitudinal reliability and measurement properties when calculating limb symmetry indices in injured populations to provide additional context on the clinical utility of the device.
DISCLOSURES
The authors have nothing to disclose.
ACKNOWLEDGEMENTS
This work was supported in part by the National Institutes of Health, National Institute of Arthritis and Musculoskeletal and Skin Diseases under award #K23AR080741.