INTRODUCTION

Accurate quantification of muscle force production is important in clinical practice. It allows clinicians to identify limitations and to track the progression of muscle strength over time, which has been shown to be a strong predictor of functional gains.1–3 Manual muscle testing (MMT) has been commonly used clinically to evaluate muscle strength.4,5 However, MMT has been criticized for its subjectivity and low reliability in quantifying muscle force production.6,7 Its continued clinical use can be attributed to its ease of performance and lack of cost.

Hand-held dynamometers (HHDs) quantify muscle force production accurately8 and offer an alternative to MMT for objectively monitoring patients’ strength progress over time.8,9 HHD isometric testing can use a “make-test” technique, in which the subject exerts muscle force against a stationary dynamometer. This method involves less measurement error than an MMT “break test”.10,11 Various HHDs have been studied for use in the upper extremity and have shown good-to-excellent reliability [Intraclass Correlation Coefficient (ICC) >.75] and concurrent validity when compared to isokinetic dynamometry.9,12 HHDs appeal to clinicians due to their portability, ease of use, and lower cost compared to isokinetic devices. However, the price (> $1,000) of currently available HHDs may be prohibitive for some clinicians.10,12

The microFET2 (MF2; Hoggan Scientific, Salt Lake City, UT) is a commonly used HHD with high (>.85) reliability in assessing muscle force production across various injury populations3,10,13,14 that include shoulder patients.3,13,14 Similarly, the MF2 has been found to be highly reliable among healthy adults.15 Its moderate-to-high (r ≥.50) concurrent validity has also been established against isokinetic testing16–20 among injury populations16,21 that include the shoulder17,18,20 as well as in healthy adults.19 Thus, it can be considered a criterion-standard for assessing muscle force production.

The ActivForce (AF; Activbody, San Diego, CA) digital dynamometer is a newly available HHD that is marketed as both an exercise monitoring tool and an isometric muscle force testing instrument. It was manufactured to be a clinically useful tool for establishing muscle strength impairment baselines and tracking the progress of strengthening programs among patients with various pathologies. The AF is smaller, lighter, and less expensive (priced at < $200.00) than other HHDs. However, to the best of the authors’ knowledge, its psychometric properties have yet to be established. Additionally, the effect of tester clinical experience on the psychometric properties of both the MF2 and AF HHDs has not been established. Knowing how tester experience affects the clinometric properties of these HHDs may advance their clinical utility and their implementation among clinicians with diverse clinical backgrounds.

This study aimed to determine: 1) the intra- and inter-tester reliability of the MF2 and AF HHDs for novice and experienced clinicians when testing shoulder isometric muscle force [external rotation (ER), internal rotation (IR), and forward elevation (FE) in the scapular plane] in healthy adults, 2) the standard error of measurement (SEM) and the minimal detectable change (MDC) values for both HHDs when assessing shoulder isometric muscle force in healthy adults, and 3) the criterion validity of the AF compared to the MF2 HHD when testing shoulder isometric muscle force in healthy adults.

METHODS

This was an observational study approved by the DeSales University Institutional Review Board. Participants were recruited via word of mouth and electronic communication (social media and emails) within the DeSales University community. All participants provided signed informed consent and were screened for eligibility, and their rights were protected. Inclusion criteria were being at least 18 years of age with no history of shoulder surgery at the dominant arm within two years, no shoulder pain within the prior six months, and pain-free functional shoulder active range of motion. Hand dominance was determined as the participant’s preferred side for writing.22 Functional shoulder motion in ER, IR, and FE was defined as being able to reach with the dominant hand behind the head, up the spine behind the back, and overhead midway between the sagittal and frontal planes, respectively. Exclusion criteria were inability to speak or read English, cognitive impairment impacting the safety of the participant, any painful shoulder pathology with muscle weakness at the dominant arm, and pregnancy.

Instrumentation

Muscle force production of each participant’s dominant shoulder in ER, IR, and FE was measured using the MF2 and AF HHDs (Figure 1). A standardized testing position was adopted for both devices with the participants seated (Figure 2). Shoulder ER and IR were tested with the participant’s arm fully adducted, the elbow at 90°, and the forearm in a neutral position. Shoulder FE was tested at 45° in the plane of the scapula,23 which was defined as a participant’s natural shoulder upward motion that occurs between the frontal and sagittal planes (halfway between the coracoid process and posterolateral acromion corner), identified via palpation.24 To minimize error and maximize testing standardization and efficiency, the tester passively placed the participant’s arm in each test position while the HHD was placed at the volar distal forearm for IR, the dorsal distal forearm for ER, and the distal lateral humerus for FE.23 Before data collection, a small (n=10) pilot training session was performed to ensure consistent application of the study’s protocol. All three testers were present during this training session. Feedback was shared between testers and used constructively to establish inter-tester agreement and consistency across all testing steps.

Figure 1. The HHDs used in this study: MF2 (A) and AF (B).
Figure 2. Shoulder testing positions for IR (A), ER (B), and FE (C).

Procedures

Three testers assessed each participant with each HHD. Two testers were experienced clinicians (>20 years of experience) and one tester was a third-year physical therapy student with only novice HHD skills. Both expert clinicians held advanced Physical Therapy certifications in Orthopedic, Upper Extremity, and Manual Therapy. The tester, device, and motion assignment order were randomized. An investigator not performing the testing read each device and recorded the data; thus, both the participant and the tester were blinded to the results. The MF2 device has a digital display at its side, which was blocked from the tester’s view during testing. The AF device does not display the force; instead, it connects remotely to a cell phone on which the test score is displayed. After each test trial, the independent reader, who stood next to each tester, read and recorded the HHD value. Testing consisted of two trials of a maximal isometric “make test” with a 30-sec rest period between trials. Standardized verbal commands were used for each trial (“push as hard as possible, push, push, push”), which lasted for three seconds. This protocol was repeated with all testers using both HHDs for each motion assessed. A three to five min rest period was given between motions and testers, respectively. All time periods of the study were monitored with a stopwatch. At the end of each testing session, participants were asked to report which device felt most comfortable during testing.
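The randomized assignment described above could be generated along the following lines. This is only an illustrative sketch: the study does not report how the randomization was implemented, so the nested ordering (testers, then devices, then motions), the function name `randomized_schedule`, and the use of a seed are all assumptions made here for illustration.

```python
import random

TESTERS = ["Expert 1", "Expert 2", "Novice"]
DEVICES = ["MF2", "AF"]
MOTIONS = ["ER", "IR", "FE"]
TRIALS_PER_BLOCK = 2          # two "make test" trials per block (from the protocol)
REST_BETWEEN_TRIALS_S = 30    # 30-sec rest between trials (from the protocol)


def randomized_schedule(seed=None):
    """Return one participant's test blocks in a randomized order.

    Assumption: tester order is shuffled first, then device order within
    each tester, then motion order within each device. The study only
    states that these orders were randomized, not how.
    """
    rng = random.Random(seed)
    schedule = []
    for tester in rng.sample(TESTERS, k=len(TESTERS)):
        for device in rng.sample(DEVICES, k=len(DEVICES)):
            for motion in rng.sample(MOTIONS, k=len(MOTIONS)):
                schedule.append({
                    "tester": tester,
                    "device": device,
                    "motion": motion,
                    "trials": TRIALS_PER_BLOCK,
                    "rest_between_trials_s": REST_BETWEEN_TRIALS_S,
                })
    return schedule


if __name__ == "__main__":
    # Print a hypothetical schedule for one participant.
    for block in randomized_schedule(seed=1):
        print(block["tester"], block["device"], block["motion"])
```

Each participant receives 18 blocks (3 testers × 2 devices × 3 motions), each consisting of two trials. Passing a seed makes a given participant's order reproducible for record keeping.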

Data Analysis

Descriptive and inferential statistical analyses were performed using SPSS version 25 (IBM Corp., Armonk, NY). ICC (2,2) was used to determine the intra-tester reliability within each tester based on the two trials each tester completed for each shoulder motion. ICC (2,1) was used to determine the inter-tester reliability across testers of similar and different clinical experience levels based on the mean value of the two trials each tester completed for each shoulder motion. ICCs were interpreted as ≥.75 high, .40 - .75 moderate, and <.40 poor.24,25 SEM and MDC values were determined for both HHDs. The SEM represented the within- and between-tester HHD measurement error for each shoulder motion.25 SEM was calculated as SEM = SD × √(1 − ICC), where SD is the measurement standard deviation within and between each tester.24 The MDC, which is a measure of test responsiveness,26 represented the minimum test-score change for a statistically significant difference, taking into account variation between subjects, raters, and SEM.21,24 MDC was calculated as MDC = z × SEM × √2, with a z score of 1.96 reflecting a 95% confidence level.24 Concurrent validity was determined via Pearson correlation coefficients by comparing the AF to its criterion-referenced MF2 HHD in all three shoulder motions. Pearson correlations were interpreted as: >.75 good-high, 0.50 - 0.75 moderate-good, and <.50 fair-poor.24 Participants’ preferences on instrument comfort levels during testing were reported via frequency statistics.
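The reliability statistics described above can be sketched in a few lines of Python. The study itself used SPSS, so the code below is not the authors' implementation. It assumes the standard two-way random-effects ICC formulas (single-measure and average-measure forms) computed from ANOVA mean squares, and the function names `icc2`, `sem`, and `mdc95` and the example force values are hypothetical.

```python
import math


def icc2(scores):
    """Return (ICC(2,1), ICC(2,k)) for an n-subjects x k-ratings table.

    Assumes the two-way random-effects formulas based on the ANOVA
    mean squares for subjects (rows), ratings (columns), and error.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subjects mean square
    msc = ss_cols / (k - 1)               # between-ratings mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    single = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    average = (msr - mse) / (msr + (msc - mse) / n)
    return single, average


def sem(sd, icc):
    """Standard error of measurement: SD x sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem_value, z=1.96):
    """Minimal detectable change at 95% confidence: z x SEM x sqrt(2)."""
    return z * sem_value * math.sqrt(2.0)


if __name__ == "__main__":
    # Hypothetical force readings (lbs): one row per participant,
    # one column per trial. Not data from this study.
    trials = [[21.0, 22.5], [30.2, 29.8], [17.4, 18.1], [25.0, 24.2]]
    single, average = icc2(trials)
    print(f"ICC(2,1) = {single:.3f}, ICC(2,k) = {average:.3f}")
    # A SEM of 1.60 lbs corresponds to an MDC95 of about 4.43 lbs.
    print(f"MDC95 for SEM 1.60 = {mdc95(1.60):.2f}")
```

The last line illustrates the link between the two error statistics: because MDC95 is a fixed multiple of SEM (1.96 × √2 ≈ 2.77), any device or tester with a larger SEM necessarily has a proportionally larger MDC95.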

RESULTS

Among 30 recruited participants, one was excluded due to shoulder pain during testing. The final analysis included 29 healthy participants (17 females and 12 males) with a mean age of 30 ± 11.4 years. The majority (25/29) of participants were right-hand dominant. Descriptive intra- and inter-tester reliability statistics for both devices across all three testers and shoulder motions are shown in Tables 1 and 2.

Table 1. Intra-tester descriptive statistics for both HHDs and all three testers and motions.
Tester | Instrument/Motion | ICC (2,1) | 95% CI | SEM (lbs) | MDC95 (lbs)
Expert 1 | MF2/FE | 0.98 | (0.97, 0.99) | 1.60 | 4.43
Expert 1 | MF2/IR | 0.97 | (0.94, 0.98) | 2.14 | 5.93
Expert 1 | MF2/ER | 0.98 | (0.97, 0.99) | 1.06 | 2.92
Expert 1 | AF/FE | 0.99 | (0.97, 0.99) | 1.36 | 3.75
Expert 1 | AF/IR | 0.99 | (0.98, 0.99) | 1.36 | 3.75
Expert 1 | AF/ER | 0.98 | (0.95, 0.99) | 0.91 | 2.51
Expert 2 | MF2/FE | 0.96 | (0.91, 0.98) | 1.75 | 4.83
Expert 2 | MF2/IR | 0.97 | (0.94, 0.98) | 1.93 | 5.32
Expert 2 | MF2/ER | 0.97 | (0.93, 0.98) | 1.25 | 3.45
Expert 2 | AF/FE | 0.97 | (0.94, 0.98) | 1.86 | 5.13
Expert 2 | AF/IR | 0.98 | (0.97, 0.99) | 1.88 | 5.18
Expert 2 | AF/ER | 0.97 | (0.93, 0.98) | 1.39 | 3.83
Novice | MF2/FE | 0.96 | (0.92, 0.98) | 2.19 | 6.04
Novice | MF2/IR | 0.98 | (0.96, 0.99) | 1.82 | 5.02
Novice | MF2/ER | 0.97 | (0.94, 0.98) | 1.23 | 3.39
Novice | AF/FE | 0.96 | (0.93, 0.98) | 2.82 | 7.78
Novice | AF/IR | 0.98 | (0.97, 0.99) | 1.95 | 5.38
Novice | AF/ER | 0.95 | (0.91, 0.98) | 1.68 | 4.63

ICC = Intraclass Correlation Coefficient; CI = Confidence Interval; SEM = Standard Error of Measurement; MDC95 = Minimal Detectable Change at the 95% confidence level; MF2 = microFET2; AF = ActivForce; FE = Forward Elevation; IR = Internal Rotation; ER = External Rotation.

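Table 2. Inter-tester descriptive statistics for both HHDs and all three testers and motions.
Testers | Instrument/Motion | ICC (2,2) | 95% CI | SEM (lbs) | MDC95 (lbs)
Exp 1 vs. Exp 2 | MF2/FE | 0.92 | (0.84, 0.96) | 2.78 | 7.67
Exp 1 vs. Exp 2 | MF2/IR | 0.96 | (0.92, 0.98) | 2.28 | 6.29
Exp 1 vs. Exp 2 | MF2/ER | 0.93 | (0.85, 0.96) | 1.93 | 5.32
Exp 1 vs. Exp 2 | AF/FE | 0.89 | (0.94, 0.94) | 4.01 | 11.06
Exp 1 vs. Exp 2 | AF/IR | 0.95 | (0.83, 0.98) | 2.99 | 8.25
Exp 1 vs. Exp 2 | AF/ER | 0.85 | (0.05, 0.96) | 2.77 | 7.64
Nov vs. Exp 1 | MF2/FE | 0.93 | (0.80, 0.97) | 2.91 | 8.03
Nov vs. Exp 1 | MF2/IR | 0.97 | (0.93, 0.98) | 2.13 | 5.87
Nov vs. Exp 1 | MF2/ER | 0.91 | (0.81, 0.95) | 2.17 | 5.98
Nov vs. Exp 1 | AF/FE | 0.93 | (0.85, 0.96) | 3.63 | 10.01
Nov vs. Exp 1 | AF/IR | 0.96 | (0.92, 0.98) | 2.73 | 7.53
Nov vs. Exp 1 | AF/ER | 0.94 | (0.87, 0.97) | 1.68 | 4.63
Nov vs. Exp 2 | MF2/FE | 0.88 | (0.76, 0.94) | 3.34 | 9.21
Nov vs. Exp 2 | MF2/IR | 0.96 | (0.93, 0.98) | 2.38 | 6.56
Nov vs. Exp 2 | MF2/ER | 0.96 | (0.90, 0.98) | 1.42 | 3.91
Nov vs. Exp 2 | AF/FE | 0.95 | (0.90, 0.97) | 2.74 | 7.56
Nov vs. Exp 2 | AF/IR | 0.95 | (0.85, 0.98) | 3.01 | 8.30
Nov vs. Exp 2 | AF/ER | 0.89 | (0.03, 0.97) | 2.53 | 6.98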

ICC = Intraclass Correlation Coefficient; CI = Confidence Interval; SEM = Standard Error of Measurement; MDC95 = Minimal Detectable Change at the 95% confidence level; Exp = Expert; Nov = Novice; AF = ActivForce; MF2 = microFET2; FE = Forward Elevation; IR = Internal Rotation; ER = External Rotation.

Results demonstrated high intra-tester (.95 - .99) and inter-tester (.85 - .97) ICCs for both devices, all testers, and all shoulder motions. Table 3 shows the average ICC, SEM, and MDC values for both devices and all testers when all shoulder motions were combined. Pearson correlation analysis indicated strong correlations between the MF2 and AF for the ER (r = .89, p < .001), IR (r = .93, p < .001), and FE (r = .91, p < .001) shoulder motions. These correlations were statistically significant at the 0.01 level. The AF was reported as the preferred HHD by the majority (86%) of the participants based on comfort levels during testing.
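As an illustration of the concurrent validity analysis, the sketch below computes a Pearson correlation between paired readings and labels it with the interpretation bands stated in the Data Analysis section. The function names `pearson_r` and `interpret_r` and the paired readings are hypothetical, and the code is not the SPSS procedure used in the study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired samples of length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def interpret_r(r):
    """Label a correlation using the bands stated in Data Analysis."""
    magnitude = abs(r)
    if magnitude > 0.75:
        return "good-high"
    if magnitude >= 0.50:
        return "moderate-good"
    return "fair-poor"


if __name__ == "__main__":
    # Hypothetical paired force readings (lbs) from the two devices
    # for the same participants. Not data from this study.
    mf2 = [20.1, 25.4, 31.0, 18.2, 27.7]
    af = [19.5, 26.0, 29.8, 17.9, 28.4]
    r = pearson_r(mf2, af)
    print(f"r = {r:.2f} ({interpret_r(r)})")
```

Under these bands, all three reported correlations (r = .89, .93, and .91) fall in the good-high category.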

Table 3. Average intra- and inter-tester descriptive statistics for both HHDs and clinical experience levels with all shoulder motions combined.
Condition | Testers | MF2 ICC | MF2 SEM (lbs) | MF2 MDC95 (lbs) | AF ICC | AF SEM (lbs) | AF MDC95 (lbs)
Intra-tester | Exp | .97 | 1.62 | 4.48 | .98 | 1.46 | 4.02
Intra-tester | Nov | .97 | 1.74 | 4.81 | .96 | 2.15 | 5.93
Intra-tester | All | .97 | 1.68 | 4.64 | .97 | 1.80 | 4.97
Inter-tester | Exp-Exp | .93 | 2.33 | 6.42 | .89 | 3.25 | 8.98
Inter-tester | Nov-Exp | .93 | 2.39 | 6.59 | .93 | 2.72 | 7.50
Inter-tester | All | .93 | 2.36 | 6.50 | .91 | 2.98 | 8.24

MF2 = microFET2; AF = ActivForce; ICC = Intraclass Correlation Coefficient; SEM = Standard Error of Measurement; MDC95 = Minimal Detectable Change at the 95% confidence level; Nov = Novice Tester; Exp = Experienced Tester.

DISCUSSION

To the best of the authors’ knowledge, this is the first study to determine the AF psychometric properties. This HHD was found to be a highly reliable and valid tool for assessing shoulder muscle force in ER, IR, and FE motions in healthy adults as compared to the gold-standard MF2. Like the MF2, the AF demonstrated excellent levels of intra- and inter-tester reliability and criterion validity for all tested motions amongst both experienced and novice testers. In this study, tester clinical experience differences minimally influenced the AF intra- and inter-tester reliability, concurrent validity, and SEM and MDC values, demonstrating comparable psychometric stability to the MF2.

The intra- and inter-tester reliability ICCs were high (.85-.99) for both the new AF and the MF2, without noticeable differences among the three shoulder motions. These ICCs agree with previous studies that have found high (.82-.99) intra- and inter-tester ICCs for HHDs, including the MF2, in assessing shoulder strength in patients with shoulder pathology,14 swimmers,13 and healthy adults.3,15,20 All the previously referenced studies3,13–15,20 were affected by a lack of standardization in body and shoulder positioning during testing. The current study’s HHD isometric strength-testing process adopted the same test positions as previous studies,23,24 which reported high (.79-.96) intra- and inter-tester ICCs when assessing shoulder strength in ER, IR, and FE with an HHD.23 The advantages of the selected shoulder-testing positions were thought to be ease of instrument stabilization against the body, better representation of functional shoulder positions, and consistent test standardization. Muscle force testing in ER, IR, and FE at below shoulder-level positions could also be applied more readily in patients with shoulder pain. This study’s high ICCs confirmed the results of Leggin et al23 for these testing positions with both the AF and the MF2.

The other variable in this study was tester clinical experience. The initial hypothesis was that a tester with novice clinical experience would be less reliable in HHD testing than the experienced testers. To the best of the authors’ knowledge, this is the first study to investigate the influence of clinical experience on the reliability of shoulder isometric muscle force HHD testing. Thus, it is not feasible to compare this study’s findings to previous studies that have utilized testers with advanced clinical skills or did not report instrument reliability differences based on tester-experience levels. This study has shown that the AF is as reliable as other commonly utilized HHDs such as the MF2, in assessing shoulder isometric muscle force, regardless of tester experience.

In this study, both devices were found to have small intra-tester (1.68 - 1.80 lbs.) and inter-tester (2.36 - 2.98 lbs.) SEM values for all shoulder motions combined (Table 3). The AF had slightly higher inter-tester SEM values than the MF2 HHD, a difference that was not statistically analyzed. Clinical experience and shoulder motion did not noticeably affect the SEM values of these devices, although the novice tester had slightly higher SEM values than the experienced clinicians. The observed SEM variability could be attributed to potential sources of measurement error such as slight inconsistencies in verbal cueing, participants’ body compensation patterns, and muscle fatigue. Although a small amount of measurement error is considered inevitable, testing randomization and standardization were expected to keep this study’s SEM at low levels.

The MDC values of both HHDs followed a similar pattern to their SEM levels. The AF had slightly higher MDC values than the MF2, potentially due to its higher SEM levels. Based on their MDC values, score changes of 5-6 lbs. and 5-8 lbs. are required to detect statistically significant differences in muscle force production with the MF2 and AF HHD, respectively. Tester clinical experience did not noticeably influence the instruments’ MDC values (Table 3). Among the three shoulder motions, slightly higher MDC values existed for FE and IR than for ER across all testers and both HHDs. Based on MDC averaged values from Tables 2 and 3, a test-score change of approximately 7 lbs. implies a statistically significant difference in muscle force production for FE and IR. The comparable test-score change in muscle force production for ER is approximately 4 lbs. These findings are consistent with previously reported HHD SEM and MDC values for shoulder ER, IR,3 and flexion in healthy adults.27
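In practice, the MDC values above act as thresholds: a change between two test sessions is interpreted as exceeding measurement error only if it is larger than the relevant MDC95. The following sketch uses the "All" rows of Table 3 as thresholds. The function name `exceeds_mdc` and the example strength scores are hypothetical.

```python
# MDC95 thresholds (lbs) taken from the "All" rows of Table 3,
# keyed by (device, condition).
MDC95_LBS = {
    ("MF2", "intra"): 4.64,
    ("MF2", "inter"): 6.50,
    ("AF", "intra"): 4.97,
    ("AF", "inter"): 8.24,
}


def exceeds_mdc(baseline_lbs, follow_up_lbs, device, condition):
    """Return True if the absolute change is larger than the MDC95.

    condition is "intra" when the same tester measured both sessions
    and "inter" when different testers did.
    """
    change = abs(follow_up_lbs - baseline_lbs)
    return change > MDC95_LBS[(device, condition)]


if __name__ == "__main__":
    # Hypothetical patient who improves from 20 lbs to 26 lbs.
    for (device, condition) in MDC95_LBS:
        result = exceeds_mdc(20.0, 26.0, device, condition)
        print(device, condition, "exceeds MDC95:", result)
```

For this hypothetical 6-lb gain, the change exceeds the intra-tester thresholds for both devices but not the inter-tester thresholds, which illustrates why having the same tester perform repeat measurements makes smaller changes detectable.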

Regarding the concurrent validity of the AF compared to a criterion-standard dynamometer for shoulder muscle force production, the current results also agree with previous research.12 In this study, the criterion-standard device was the MF2, which has been strongly validated against an isokinetic device for testing shoulder isometric strength in healthy adults.17,18,20 This study’s results confirmed that the AF strongly correlates (r = .89 - .93) with its counterpart MF2 HHD in testing shoulder ER, IR, and FE muscle force. These correlations are slightly higher than the values (r = .76 - .78) previously reported between the MF2 and isokinetic machines.17,18,20 Such strong correlations strengthen the external validity and clinical utility of the new AF and are in line with recent claims in the literature that HHD muscle force testing should be considered a valid, acceptable, and clinically meaningful alternative to externally fixed and expensive isokinetic dynamometers.28

This study is strengthened by both its observational design and its methodological approach. In terms of design, the aim of determining intra- and inter-tester reliability among multiple testers with different experience levels strengthened the study’s utility and its extrapolation to clinical practice. In today’s clinical arena, where HHD devices are used regardless of clinical experience level, an instrument’s psychometric properties should be established for all users. Likewise, the study’s aim of determining the AF concurrent validity by comparing it to an established criterion-standard HHD is consistent with current research standards. It also supports the new instrument’s external validity and clinical credibility.12 The AF is a pocket-sized, lightweight, easy-to-use HHD that is more affordable than others available on the market, with unique advanced features to electronically measure, monitor, and store isometric muscle force data remotely. It can measure up to 200 pounds of peak muscle force. The standardized data collection process, with randomization and blinding of both the testers and the participants, and the well-delineated recruitment process with specific inclusion and exclusion criteria strengthened the study. Data collection for each participant was completed within a single session to prevent possible confounding variables (differences in activity, fatigue, diet, hydration, and motivation level) that may induce muscle force production variability with different-day testing. Although muscle fatigue was a concern in this study, the incorporation of consistent breaks between testers and test repetitions and the randomized assignment process should have offset any fatigue effects related to same-day testing.

Study results should be generalized with caution due to some methodological limitations. This was an exploratory study that used a convenience sample of 29 participants. It used a single data-collection site and recruited only healthy adults, limiting the generalizability of its findings to patients with shoulder pathology. However, the inclusion of only healthy participants allowed the psychometric properties of the new AF to be determined in a more stable population, avoiding confounding influences from musculoskeletal injury (pain and muscle weakness). Also, no attempts were made to diversify the sample beyond the available participants’ gender and age. This study presents useful preliminary data on the AF psychometric properties and on how clinical experience might influence its clinometric levels. Future studies should establish the AF psychometric properties in assessing patients with musculoskeletal pathologies.

CONCLUSION

The results of this study indicate that the AF is a highly reliable (based on its ICC, SEM, and MDC values) and valid tool for assessing shoulder isometric muscle force in ER, IR, and FE as compared to its criterion-standard MF2 in healthy adults, regardless of tester clinical experience. Evidence from this study suggests that the AF might offer clinicians an objective and cost-effective HHD option for assessing shoulder muscle force production.


Acknowledgements

We are grateful to our DeSales DPT students, Michael Mathis, Sara Cinelli, Joshua Anderson, Cortlyn Van Deutsch, Kaylie Vrabel, and Thomas Epsaro, for their vital contributions and commitment during the entire data collection process. We also thank Wendy Thomas for her invaluable assistance with the study scheduling process, and all our volunteer participants.

Disclosures

The authors report no present or future financial disclosures.