BACKGROUND
The glenohumeral (shoulder) joint is the most mobile articulation of the human body.1 Due to this vast mobility, proprioception is particularly crucial to keep the humeral head centered throughout the range of motion and ensure optimal upper limb function.2 Proprioception deficits may result in decreased shoulder stability, altered control of the shoulder, and eventually injuries, pain and disability.3,4 Therefore, in clinical settings, all aspects of shoulder proprioception should be objectively measured to characterize the proprioceptive deficits that need to be rehabilitated in order to improve treatment outcomes.5,6 Capsuloligamentous and musculotendinous tissues contribute to proprioceptive feedback through the inputs provided by mechanoreceptors located within these structures.7,8 These inputs provide neural substrates for the central nervous system to refine the motor output required for maintaining shoulder stability.9,10
Research on shoulder proprioception has mainly focused on joint position sense (JPS) and sense of movement (SOM), while only a few studies have investigated shoulder sense of force (SOF).6,8,9,11–13 SOF encompasses the ability to perceive, interpret, and reproduce force applied to or generated by a joint, integrating sensory and motor components to achieve precise joint control.14 Since the glenohumeral joint relies primarily on dynamic control to maintain stability, the evaluation of SOF could be of clinical interest. Indeed, SOF is largely derived from peripheral afferent inflow rather than central signals when the subject has to reproduce a force with the same limb.15,16 The main proprioceptors for SOF appear to be Golgi tendon organs, whereas muscle spindles are primarily responsible for JPS, but there is no clear consensus.15,17 As neuromuscular control of the rotator cuff is important for stabilizing the joint and limiting the risk of injury, it could be of interest to assess SOF in rehabilitation settings.18,19 Although some measurement techniques have been proposed to assess SOF, mostly using an isokinetic dynamometer (IKD), their implementation in clinical practice is limited due to the equipment and time required to perform these evaluations.8,9,20 In contrast, hand-held dynamometers (HHD) are reliable for measuring forces, relatively cheap, and quick and easy to use.20 To date, no HHD protocol has been proposed that assesses shoulder SOF, and the psychometric properties of this measurement are unreported. Previous studies conducted on the elbow joint indicate that SOF protocols could be impacted by contraction intensity and by the type and nature of movement reproduced by the participants.11,13,21
The purpose of this study was to test a new SOF protocol with a handheld dynamometer (HHD) and examine its agreement with an isokinetic dynamometer (IKD), as well as its reliability and the effect of contraction intensity.
STUDY DESIGN
Cross-sectional measurement study
MATERIALS AND METHODS
Participants
Participants aged 18-35 years with no self-reported shoulder pain or disability were included. Exclusion criteria included previous dislocation, subluxation, surgery of the upper quadrant, or pain in these regions within the six months prior to the study. The study was approved by the Erasme ULB Ethical Committee (registration number B4062021000304), and all participants provided written informed consent.
Study protocol
All measurements were conducted at the Research Unit of Rehabilitation Sciences, Université Libre de Bruxelles, and ERASME Hospital, Brussels. Upon arrival, participants’ sociodemographic data, arm dominance, injury status, wellbeing (Hooper index), fatigue (visual analog scale) and sport activity were recorded online. Arm dominance was defined by the arm used to throw a ball.
Participants underwent two evaluation sessions, separated by three to six days. The first session (n=51) assessed inter-device agreement and provided preliminary data on the HHD SOF protocol’s error score. Only the dominant arm was tested using both the IKD (Cybex Human Norm, Humac, CA, USA) and HHD (MusTec BioFET, Almere, Netherlands), with the order of assessments randomized through a blind draw.
Twenty-five participants from the initial sample volunteered for a second session focused on HHD reliability, evaluating intra-rater (inter-day) and inter-rater reliability on the dominant arm, and intra-rater (intra-day) reliability on the non-dominant arm by the same operator to minimize fatigue and save time. Due to the HHD SOF protocol’s duration (approximately two hours), both arms were used alternately to measure different reliability parameters, following the findings of Manheout et al.13 that SOF evaluation is consistent between arms in healthy individuals. Only HHD reliability was evaluated in the second session, as previous studies have already reported ICC values for IKD in SOF protocols,12 and logistical constraints limited access to the IKD for this second session.
Participants performed submaximal isometric contractions in the testing position for three minutes, alternating ten seconds of contraction with ten seconds of rest, to warm up and familiarize themselves with the force reproduction tasks. After a two-minute rest, maximal voluntary isometric contractions (MVIC) in both internal and external rotation were assessed. Following a three-minute rest, six force reproduction tasks were performed: three targets in each of the two directions of isometric contraction (internal and external rotation). Three trials were performed for each of the conditions. The testing device (IKD or HHD) was randomly selected and kept consistent in both sessions. A 15-minute break was provided between each device protocol. See Figure 1 for a protocol overview.
Sense of force protocol
Testing Position
The testing position was consistent across all measurements (Figure 2). Participants lay supine on a table with the arm abducted at 90°, the shoulder in neutral rotation, and the elbow flexed at 90°. The arm was supported by the table to prevent horizontal abduction. The non-tested arm rested on the belly. A strap around the anterior superior iliac spine stabilized the participant’s position. Participants wore a blindfold and closed their fist during the HHD protocol to mimic the handle used in the IKD protocol, ensuring standardized hand positioning. Visual feedback was entirely removed in all testing conditions to ensure unbiased sensory input.
For IKD measurements, the dynamometer axis was aligned with the projection of the humeral head’s center of rotation at the elbow level. Participants grasped the handle of the shoulder-elbow attachment with their forearm in pronation. All tests were conducted in the isometric mode.
For HHD testing, the forearm was placed against the HHD (2 cm proximal to the ulnar styloid process on the dorsal forearm for external rotation [ER] or ventral forearm for internal rotation [IR]). The examiner positioned themselves in a lunge stance opposite to the direction of force measurement, stabilizing their elbows against their stomach. “Make” tests were used during MVIC and force reproduction tasks due to their superior reliability compared to “break” tests.22 This testing position was chosen to be close to the position of the apprehension test, a clinical test for shoulder instability.23 Additionally, this position is practical for clinical settings, as it is easy for patients to relax in a supine position, and clinicians can replicate the setup quickly between evaluations and devices.
Maximal Voluntary Isometric Contraction (MVIC) Testing
The MVIC of the shoulder rotators was assessed in the same testing position. Participants gradually increased force from zero to maximum within two seconds, maintaining it for five seconds. Three trials were performed, with 60 seconds of rest between trials and rotations to prevent fatigue. The highest MVIC force was used to determine the three targets (10% MVIC, 30% MVIC, 50% MVIC) for the force reproduction tasks.13,24 This intensity range was selected to accommodate future testing on pathological populations. Previous studies on the elbow joint have shown differences in SOF scores based on intensity.21 Participants had three minutes to familiarize themselves with the reproduction task at different target intensities with auditory feedback, followed by a three-minute break before performing the force reproduction task.
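As an illustration of how the three targets follow from the MVIC trials, the short sketch below takes the highest of the recorded MVIC values and scales it by 10%, 30% and 50%. The function name `force_targets` and the example force values are hypothetical and are not taken from the study data.

```python
def force_targets(mvic_trials, fractions=(0.10, 0.30, 0.50)):
    """Return target forces derived from the highest MVIC trial.

    mvic_trials: peak forces of the MVIC trials (N for HHD, N.m for IKD).
    fractions:   target intensities as fractions of MVIC (10%, 30%, 50%).
    """
    peak = max(mvic_trials)  # the highest MVIC force defines the targets
    return {round(f * 100): peak * f for f in fractions}


# Hypothetical example: three external rotation MVIC trials in newtons.
targets = force_targets([98.0, 104.0, 101.0])
for pct, force in targets.items():
    print(f"{pct}% MVIC target: {force:.1f}")
```

The same targets would be computed separately for internal and external rotation and for each device, since each has its own MVIC.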
Force Reproduction Task
The force reproduction task involved isometric contractions (%MVIC), divided into two phases: the target phase and the reproduction phase.24 In the target phase, participants gradually matched the target force within two seconds, guided by verbal cues from the assessor, and sustained the contraction for eight seconds.16 During this phase, participants focused on the sensation of force generated by the shoulder rotators. All target phases were completed before starting the reproduction phase.
In the reproduction phase, participants attempted to reach the target force within a 10-second window without feedback from the investigator. They verbally signaled when they believed they had reached the target, maintaining the contraction until the investigator instructed them to stop. The three seconds following the participant’s signal were recorded for analysis. A 20-second break separated the target and reproduction phases, during which participants kept their testing arm in the same position. Three trials were performed at each contraction intensity and rotation, with a 10-second rest between trials. The order of contraction and rotation was randomly determined, and a 30-second rest was provided between each target (Figure 3).
Data reduction
Raw data from the IKD and HHD were extracted into spreadsheets. Mean and standard deviation were calculated using Excel V.16.56 for Mac. To measure the accuracy of the sense of force (SOF), relative error (RE) and constant error (CE) scores were used. For each trial, the error score was calculated as the difference between the targeted force (%MVIC) and the observed force, with the average of three trials analyzed. RE was expressed as the absolute difference from the target, while CE was calculated without using absolute values to indicate the direction of mismatch. A positive CE score indicated an overestimation of the target. The coefficient of variation (CV) was calculated to represent the steadiness of force reproduction, expressed as a proportion of the mean force produced during the three trials.
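The error scores described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the spreadsheet procedure actually used: the per-trial error is taken as reproduced force minus target force (so that a positive CE corresponds to overestimation, as stated in the text), forces are expressed in %MVIC, and CV is taken as the standard deviation of the three reproduced forces divided by their mean. The function name `sof_scores` and the example values are hypothetical.

```python
from statistics import mean, stdev


def sof_scores(target_pct, reproduced_pct):
    """Compute CE, RE and CV for one condition (one target, one rotation).

    target_pct:     target force in %MVIC.
    reproduced_pct: reproduced forces in %MVIC, one value per trial.
    """
    # Assumption: error = reproduced - target, so positive means overshoot.
    errors = [r - target_pct for r in reproduced_pct]
    ce = mean(errors)                       # signed error, keeps direction
    re = mean([abs(e) for e in errors])     # absolute error, ignores direction
    # Assumption: CV = SD of the trials as a proportion of their mean.
    cv = stdev(reproduced_pct) / mean(reproduced_pct)
    return {"CE": ce, "RE": re, "CV": cv}


# Hypothetical example: a 30% MVIC target reproduced over three trials.
scores = sof_scores(30, [34, 28, 31])
print(scores)
```

Because CE keeps the sign of each trial, overshoots and undershoots can cancel out, which is why RE is reported alongside it.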
Sample size and Statistical analysis
Based on similar studies12,13 and COSMIN recommendations,25 51 healthy adults were included. With seven observations per subject, alpha at 0.05 and power at 0.8, twenty-five subjects were required to complete the reliability analysis.26 Statistical analysis was performed using JASP software (version 0.13.1), with a significance level set at p<0.05. The Shapiro-Wilk test was used to assess normality. A paired t-test was conducted to assess systematic differences between sessions in terms of wellbeing and fatigue.
Intra-rater (inter-session) reliability was assessed between sessions 1 and 2 on the dominant arm, while intra-rater (intra-session) and inter-rater (intra-session) reliability were evaluated during Session 2 on the non-dominant arm. The intraclass correlation coefficient ICC(3,k) was used to calculate reliability, with ICC values ranging from 0 (no reliability) to 1 (perfect reliability).27,28 The standard error of measurement (SEM) and minimal detectable change (MDC95) were calculated to assess absolute reliability. Following previous recommendations,29,30 SEM was calculated as SD × √(1 − ICC), and MDC95 as SEM × 1.96 × √2.
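The two absolute reliability formulas can be written out as a short sketch. The function names `sem` and `mdc95` and the numeric inputs are hypothetical and serve only to show the arithmetic; they are not values from Table 2.

```python
import math


def sem(sd, icc):
    """Standard error of measurement: SD x sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem_value):
    """Minimal detectable change at 95% confidence: SEM x 1.96 x sqrt(2)."""
    return sem_value * 1.96 * math.sqrt(2.0)


# Hypothetical example: SD of 10 (%MVIC) and an ICC of 0.64.
sem_value = sem(sd=10.0, icc=0.64)
print(f"SEM   = {sem_value:.2f}")
print(f"MDC95 = {mdc95(sem_value):.2f}")
```

The √2 factor accounts for measurement error being present in both of the two measurements being compared, so the MDC95 is always about 2.77 times the SEM.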
A linear regression was performed between HHD and IKD measurements (dependent variable) and the theoretical target from MVIC testing (independent variable). Bias, precision, and agreement between the two methods were assessed.31 As measurement units differed (N for HHD and N.m for IKD), total bias was calculated for each device using the formula: bias = α + x × (β − 1), where x is the theoretical target, α represents differential bias and β represents proportional bias.32 Nonparametric t-tests were used to compare the total bias between the two devices.
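The total bias formula follows from the fitted line: if the measured force is modelled as α + β × x, then the expected difference from the target x is α + x × (β − 1). The sketch below fits the line by ordinary least squares and evaluates that expression. It is an illustration under assumptions (plain least squares, hypothetical function names `fit_line` and `total_bias`, made-up data) and is not the analysis pipeline used in the study.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = alpha + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx                 # proportional bias (slope)
    alpha = mean_y - beta * mean_x   # differential bias (intercept)
    return alpha, beta


def total_bias(alpha, beta, x):
    """Total bias at target x: alpha + x * (beta - 1)."""
    return alpha + x * (beta - 1.0)


# Hypothetical data: theoretical targets and the forces actually reproduced.
targets = [10.0, 30.0, 50.0]
measured = [11.0, 29.0, 47.0]
alpha, beta = fit_line(targets, measured)
for x in targets:
    print(f"target {x:.0f}: total bias = {total_bias(alpha, beta, x):+.2f}")
```

With a slope below 1 and a positive intercept, as in this made-up example, the bias is positive at low targets and negative at high targets, which is the overestimation and underestimation pattern the formula is meant to capture.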
A two-way ANOVA for repeated measures was conducted, with %Target and direction of rotation as independent variables and RE score as the dependent variable. Post hoc analyses with Bonferroni corrections were performed, and RE scores were compared between IKD and HHD for similar targets.
RESULTS
Twenty-four females (age = 23 ± 2 years old, height = 1.68 ± 0.08 m, body mass = 60 ± 8 kg, weekly global sports volume = 4 ± 3 h) and 27 males (age = 23 ± 3 years old, height = 1.80 ± 0.06 m, body mass = 78 ± 11 kg, weekly global sports volume = 4 ± 4 h) participated in the first session. Of these, 25 participants (14 females: age = 22 ± 2 years old, height = 1.68 ± 0.08 m, body mass = 61 ± 8 kg, weekly global sports volume = 4 ± 3 h; and 11 males: age = 24 ± 2 years old, height = 1.80 ± 0.06 m, body mass = 80 ± 9 kg, weekly global sports volume = 4 ± 5 h) participated in the second measurement session.
There were no significant differences in wellbeing parameters, self-reported fatigue or pain between Session 1 and Session 2 (p-value > 0.05). Self-reported fatigue was also not significantly different between pre- and post-test values within a session (p-value > 0.05).
MVIC data are summarized by mean and standard deviation in Table 1.
Hand Held Dynamometer – Sense Of Force protocol reliability analysis
Intra-rater (inter-session) reliability ICCs ranged from 0.44 (30% IR) to 0.64 (50% ER), indicating low to moderate reliability. SEM values varied from 4% (50% ER) to 12% (10% IR), with MDC95 ranging from 12% to 42%. Intra-rater (intra-session) reliability ICCs ranged from 0.27 (30% IR) to 0.84 (10% ER), showing low to high reliability. SEM values ranged from 3% (50% ER) to 11% (10% IR), and MDC95 ranged from 9% to 30%. Inter-rater (intra-session) reliability ICCs ranged from 0.07 to 0.43, indicating little to low reliability, with SEM values from 5% (50% ER) to 19% (10% IR) and MDC95 from 13% to 53% (Table 2).
Inter-device agreement between Handheld Dynamometer and Isokinetic dynamometer
Total bias was significantly different (p-values < 0.05) between the HHD and IKD SOF protocols at 30% and 50% MVIC but not at 10% MVIC. The HHD target at 50% IR presented the highest bias, at -12.34 (Table 3).
There were significant differences between rotations for each tool (p-values < 0.005). For the IKD, ER presented lower total bias than IR at 10% MVIC and 50% MVIC, with the opposite pattern at 30% MVIC. For the HHD, ER presented lower total bias for all targets (p-value < 0.001).
Impact of SOF protocol modalities on the error score
CE scores ranged from -8 (50% IR) to 16 (10% IR), whereas RE and CV ranged from 7 (50% IR) to 19 (10% ER) and from 3% to 6%, respectively (see Table 4 for details).
There was no significant difference between Session 1 and Session 2 for all measurements except for CE at 30% IR and CV at 50% IR (p-value < 0.05).
A two-way ANOVA showed a significant effect of target intensity on the SOF protocol error score, F(2,100)=10.72, p<0.01. No significant effects were found for rotation or the target*rotation interaction, F(2,100)=0.23, p=0.788.
Post hoc tests showed a difference between targets at 10% MVIC and 50% MVIC (p<0.001) and between targets at 10% and 30% MVIC (p<0.001), but not between 30% and 50% MVIC. Comparing RE between devices for the same target, the HHD showed significantly lower values than the IKD for all targets and rotations.
DISCUSSION
The main goal of this study was to develop a new clinical protocol for assessing a relatively unexplored aspect of proprioception: the SOF. The specific aims were to appraise the relative and absolute reliability of the HHD protocol, to determine the agreement between two devices in assessing SOF, and to present initial findings regarding parameters influencing the SOF error score. Globally, the results suggest that the relative reliability of the HHD protocol for evaluating SOF ranges from low to high for intra-rater measurements and from little to low for inter-rater measurements.
Key results from this investigation reveal better intra-rater (inter-day) absolute reliability for higher targets (30% and 50% MVIC) compared to lower targets (10% MVIC), with SEM and MDC ranging from 4% (Target 50% ER) to 15% (Target 50% IR). Moreover, employing the same protocol for SOF assessment with different tools yielded different results, precluding direct tool comparisons. Finally, the SOF error score appears to be influenced by the intensity of contraction but not by the direction of the isometric contraction.
Reliability
The reliability findings differ from previous studies on shoulder SOF. Dover et al.12 reported high test-retest (intra-rater inter-day) reliability, while Manheout et al.13 showed high reliability for internal (Cronbach’s α=0.85) and external rotation (Cronbach’s α=0.91). In contrast, the present study observed low to moderate intra-rater (inter-day) reliability. Differences in testing positions, statistical analysis, and assessment tools likely contributed to these discrepancies. Dover et al.12 used a standing position with an IKD but did not report specific ICC values or details about their reliability analysis. Manheout et al.13 used a seated position with a stationary dynamometer and a different statistical approach (Cronbach’s alpha). The current study employed an HHD protocol where the examiner held the device without external stabilization, possibly leading to higher variability compared to the IKD. Proprioception measurement is influenced by testing position,33 making direct comparisons challenging. However, the simplified HHD protocol was designed to save time during clinical assessments. Better reliability during intra-rater testing and at higher contraction intensities was observed, possibly due to the challenges in guiding patients verbally at lower intensities, where they may rely more on other sensory information such as input from skin receptors.34 The lack of external fixation may have increased variability at low intensity due to HHD orientation and skin pressure.
Inter-device agreement
A secondary objective of this study was to assess the agreement between the HHD and IKD in measuring SOF. Since the conditions for Bland-Altman’s limits of agreement method were not met, regression analysis was used.31,32 The targeted values were in N.m for the IKD and in N for the HHD, so each device was compared to its own theoretical target based on MVIC testing, and this outcome was described as the total bias for each tool.
Regarding agreement, the IKD had a significantly lower total bias than HHD (p<0.05) for targets and rotations above 30% MVIC. Muscles involved in internal rotation, being larger, may cause more compensation during isometric contraction, and the IKD likely provided better support, particularly at the elbow joint. Despite using the make test in the HHD protocol, variation in examiner-applied pressure could have led to differences in bias and error scores between the two devices. Overall, the total bias was greater for the HHD across all targets and rotations, with both tools showing a similar performance trend (same bias direction) and differences only in the extent of bias between devices.
Error Score
Force reproduction was generally overestimated at 10% MVIC and underestimated at 50% MVIC for both methods, with higher error scores at 10% MVIC compared to 30% and 50%. Similar trends were noted by Henry et al.24 and Esreoglu et al.,16 showing overestimation at low force levels. The discharge rate of muscle spindles and proprioceptors, which increases with higher force, correlates with the level of sensory input available to the central nervous system.35
Onneweer et al.36 found that force reproduction error scores depend on contraction intensity, whereas joint position sense (JPS) does not. SOF likely relies on muscle spindles and Golgi tendon organs, both of which provide better feedback at higher contraction intensities.16 At lower force levels, other sensory strategies, such as tactile information, may guide movement and force production.34 For a more accurate assessment of SOF through proprioception, clinicians should consider using contraction intensities above 30% MVIC. Lastly, neither arm dominance nor contraction direction significantly affected the relative error score (p>0.005), consistent with other studies on shoulder proprioception.16,17
Clinical implications and recommendations
This study provides preliminary guidelines for clinicians interested in assessing proprioception, specifically the SOF. First, it is crucial to acknowledge that direct comparison of data obtained from the HHD and the IKD is not reasonable. Therefore, it is important to specify the tools employed for assessment when communicating with other medical professionals. Second, since the direction of isometric contraction (IR or ER) at 90° abduction and neutral rotation does not appear to significantly impact the SOF error score, selecting the direction that aligns with the clinical presentation of each patient is recommended to save time. Future research is warranted to examine the applicability of the SOF protocol in pathological populations, such as patients with anterior instability, who may experience apprehension in extreme positions.37 Third, study findings indicate that the target at 50% demonstrated the highest reliability, emphasizing its priority in assessment for diagnostic purposes. However, given the low to moderate intra- and inter-rater ICC values, clinicians should refrain from using the SOF protocol as a diagnostic tool or for patient categorization at this time. Instead, they should focus on the use of absolute reliability metrics to inform sensory-driven rehabilitation or training programs.27 Clinically meaningful changes in SOF can be considered only if patients show an improvement in their RE score by 12% or 15% or more. Finally, as ICC values are low to moderate, it may be beneficial to alter the protocol by using a strap to control compensations at the shoulder, as well as an external fixation device for the HHD to minimize the impact of the clinician during measurement. While the strap provides additional feedback to patients, the potential improvement in reliability may outweigh this potential bias.
Limitations and future directions
Several limitations of this study need to be considered. It was decided to simulate a clinical setting by using minimal equipment, a decision that may have introduced compensatory factors during the HHD force reproduction protocol when compared to the IKD. Moreover, this protocol lasted almost two hours, and it cannot be excluded that participants experienced fatigue or lacked motivation, which could have influenced the results. Although efforts were made to control these factors by assessing fatigue and pain and providing precise instructions to participants on how to move their arm, the study results may still have been impacted by them. However, in real-life clinical practice, applying the SOF protocol with a single patient, focusing on one intensity and one direction, would take less than ten minutes. This streamlined approach would minimize patient fatigue while offering clinicians a practical and objective measure of proprioception. Nevertheless, further research is needed to determine whether SOF can be effectively trained, improved, and translated into real-life scenarios over time. Furthermore, it is essential to note that the data were exclusively collected from healthy young individuals, preventing the generalization of these findings to other populations or individuals with pathological conditions.
In future research on this topic, the development of a more concise protocol with fewer targets and the implementation of a strap to minimize compensations at the shoulder joint and enhance movement control should be considered. Additionally, the use of an external fixation for the HHD to reduce the impact of the examiner during data recording could be beneficial. Finally, future studies examining the SOF in participants with shoulder conditions are required.
CONCLUSION
The results of this study indicate that an optimal combination of relative and absolute reliability (moderate) was achieved at 50% MVIC for both intra- and inter-rater reliability. Consequently, clinicians are advised to use this testing parameter when assessing SOF. It is important to specify the tools used for SOF assessment, as the results of this study indicate that they cannot be used interchangeably. Finally, no significant differences were found between internal and external rotation error scores. However, contraction intensity did affect the error score: higher error scores were observed at the lower intensity (10% MVIC) than at the higher intensities (30% MVIC, 50% MVIC).
Disclaimer
The authors report no conflicts of interest.