Reliability of Upper Extremity Functional Performance Tests for the Non-overhead Athlete

Bryan L Riemann; George J Davies

doi:10.26603/001c.87924

Introduction

The kinesiological complexity of the shoulder complex, coupled with the lack of inherent anatomical static stability, predisposes the shoulder to a variety of injuries, particularly during participation in sport and physical activity.^1,2 Especially problematic are the long term sequelae that can accompany shoulder injury.³ For example, following acute traumatic shoulder injury, there is an alarmingly high rate of recurrent instability and reinjury that occurs upon return to activity.^4,5 The risk for recurrent instability and reinjury appears the highest for young adult males participating in collision sports.^2,6

Following shoulder injury or surgery (or both), many athletes will go through rehabilitation programs to help them return to their sport. One contributing factor to the high reinjury rates could be the lack of agreement regarding which tests, outcome measures, or metrics should be used for making clinical decisions regarding readiness to return to sport (RTS). Consequently, many athletes could be returning to their sporting activities without adequate rehabilitation to optimize shoulder function. Therefore, following sufficient time for tissue healing constraints and patient education, it is important to establish specific tests and criteria that can be used for facilitating RTS decisions. At a minimum, the testing battery should consist of the following components: patient reported outcomes (PROs), impairment measurements (e.g., range of motion, proprioceptive measurements, etc.), measures of muscle strength, power and endurance, general upper extremity functional performance tests (UEFPT), neuro-cognitive-mechanical reactive tests, and sport specific tests.⁷

Naturally, the UEFPT selected to assist with RTS decisions should relate to the patient’s type of sport. Additionally, for a test to be clinically useful as criteria for RTS, it must be reliable. For athletes participating in overhead sports, the reliability of several open kinetic chain UEFPT were recently reported.⁸ Regarding athletes from non-overhead sports, there have been reliability studies conducted on the closed kinetic chain upper extremity stability test (CKCUEST) and seated medicine ball chest push test (SMBCPT); however, there are several complicating factors in some of the reports that inhibit the ability to draw definitive consensus about the reliability of the two tests. Although several authors have reported relative reliability for the CKCUEST and SMBCPT using intraclass correlation coefficients (ICC), in some cases non-generalizable forms of the ICC were used or the ICC model was not specified. Furthermore, some investigators have conducted reliability analyses using a mixed sample of males and females^9–12 while others have conducted separate analyses for males and females^13,14 or limited their study sample to one sex.^15–17 In the case of the CKCUEST, this is particularly important as different plank positions are adopted by males and females¹⁸ which prompts sex differences in the underlying biomechanics.¹⁹ Additionally, slight variations of the original test descriptions^17,20 have been used in the reliability studies, such as having females assume a full push up position instead of the modified push up position during the CKCUEST^12,21 and specifying horizontal projection of the medicine ball^10,11,16 during the SMBCPT compared to the original description of the test.²⁰ Finally, computing reliability estimates for both tests in the same cohort has not been considered. 
Having reliability estimates for both tests in the same cohort will facilitate direct comparison of their relative (i.e., ICC) and absolute (i.e., coefficient of variation) reliability. Thus, the purpose of the current investigation was to establish the test-retest reliability of the closed kinetic chain upper extremity stability test (CKCUEST), seated medicine ball chest pass test (SMBCPT) and hands-release push-up test (HRPUT) in a cohort of males and females with a history of non-overhead sport participation. A secondary purpose was to examine the associations between the three UEFPT. Of note, the HRPUT provides a mechanism to solely evaluate upper extremity concentric force and endurance, as the release phase eliminates the augmentation provided by the eccentric stretch phase (e.g., strain energy) of a traditional push-up. It was hypothesized that there would be a moderate magnitude association between the CKCUEST and HRPUT but no associations involving the SMBCPT.

METHODS

Participants

Forty healthy young adults (20 males, 20 females) with a history of non-overhead sport participation were recruited for the study. Prior to study participation, all participants completed a demographic, physical activity and injury history form, and the 2019 Physical Activity Readiness Questionnaire. A member of the investigative team reviewed all completed forms to verify each participant was appropriate for study participation. Inclusion criteria included being between 18 and 35 yrs old, meeting criteria set by the American College of Sports Medicine for being physically active,²² and participation in a non-overhead recreational or competitive sport for a minimum of one year. Participants were excluded from study participation if they had a previous history of cervical spine or upper extremity injury or surgery within a year prior to data collection, were deficient in range of motion needed to perform the tests or were unable to complete the tests as prescribed. This investigation was approved by a university institutional review board and all participants read and signed an approved informed consent form.

Research Design

This study used a randomized repeated measures research design with study participation requiring completion of two data collection sessions with three to seven days of separation. The two data collection sessions were scheduled at near identical times each day and participants were asked to avoid vigorous physical activity (e.g., upper extremity resistance or exercise training) 24 hrs prior to each session. The order of the three UEFPT tests was randomized during the first session and duplicated during the second session. Prior to data collection, all participants completed a warm-up which consisted of arm circles forwards, backwards, and arm crosses. Each warm-up activity was completed in thirty seconds, for a total of three sets. After completion of the warm-up, subjects were shown a pre-recorded demonstration video illustrating the tests to be performed. Prior to the CKCUEST and HRPUT, participants completed five to eight submaximal practice repetitions to demonstrate proficiency with the test procedures, defined as being able to repetitively perform the two tasks without hesitation between repetitions. Practice for the SMBCPT was built into the gradient warm-up described below. During all practice trials, participants received cuing as needed. Two-minute rest periods were given between each UEFPT.

Test Procedures

Closed Kinetic Chain Upper Extremity Stability Test (CKCUEST)

The procedures for the CKCUEST followed the original descriptions of the test.^17,18 For the starting position, males were tested in the traditional push-up position and females were tested in the modified push-up position from the knees. Participants were positioned with their hands over two pieces of athletic tape placed 0.914 m apart, with the upper extremities positioned perpendicular to the floor and shoulders over the hands. On the command, “go”, the participant removed one hand from the floor, touched the opposite line, and then replaced the same hand on the original line. The participant then removed the opposite hand from the floor, touched the opposite line, and replaced the same hand on its original line. A single test consisted of this alternating procedure for 15 seconds. Participants performed as many touches as possible in the allotted time, and touches were recorded by researchers. A touch was defined as the hand crossing over and touching the opposite line. Prior to three maximal effort trials, participants performed a submaximal test at 50% maximum perceived exertion to facilitate test acclimation. Participants rested for 45 seconds between each testing trial. The number of touches from each testing trial was counted and averaged across the three attempts.

Seated Medicine Ball Chest Push Test (SMBCPT)

The SMBCPT (Figure 1) was performed with the participants sitting with their back stabilized against a wall, legs straight, with feet hip-width apart while holding a 2.73kg medicine ball at chest level. Participants were instructed to push the medicine ball as far forward as possible while keeping their back flush against the wall, and their elbows in towards their sides during the push maneuver. Participants completed a 50% effort and 100% effort warm-up trial prior to three maximal effort trials. Three maximal throws for distance were performed and measurements were recorded in meters from the wall to where the medicine ball landed. The furthest distance achieved out of the three trials was recorded as the performance outcome.

Figure 1. Seated Medicine Ball Chest Pass Test starting position (left) and ending position (right).

Hands-release Push-up Test (HRPUT)

The HRPUT began in the prone position with the participants’ hands flat on the ground directly under their shoulders similar to the traditional push-up position (Figure 2-left). The males assumed a full plank position with feet together and toes on the ground, whereas the females used a modified push-up position from the knees. From the initial tripod position, participants were next instructed to perform the lowering phase of a push-up until their chest, front of the hips, and thighs made contact with the ground; this served as the start position for the test. Upon the command, “go”, participants pushed their entire body as a single unit into the “up” position by fully extending their elbows. The participant’s entire body was required to remain in a straight alignment, from head to the ankles (males) or knees (females), with no bending or flexing of the knees, hips, trunk, or neck. Failure to maintain a straight alignment during a repetition resulted in that repetition not counting. After the elbows were fully extended and the “up” position was reached, the participants bent their elbows to lower their body back to the ground as a single unit until the chest, hips, and thighs made ground contact. Without moving the head, body, or legs, the participants raised both hands from the ground simultaneously, keeping the upper arms adjacent to their trunk, so that a clear hands-ground gap was visible to ensure the participants released their hands from the ground (Figure 2-right). Their hands were then lowered to the floor directly under their shoulders. This completed one repetition. Failure to make a continuous effort to push off from the ground, resting on the ground, or lifting the feet off the ground during the test resulted in termination of the test. In addition, the test was terminated if the participant expressed that they could not continue. The only authorized rest position during the test was the “up” position.
Participants completed as many correct repetitions as possible in two minutes. The number of correct repetitions was counted and recorded as the performance outcome.

Figure 2. Hands-release Push Up Test starting position (left) and ending position (right).

Statistical Analysis

All statistical procedures were conducted using IBM SPSS Statistics for Windows, version 27 (IBM Corp., Armonk, NY, USA) and Microsoft Excel, version 16 (Microsoft Corporation, Redmond, WA, USA). Separate statistical analyses were conducted for the males and females. Exploratory analyses were conducted on the data from each session to identify potential data entry errors. Normality of the between session difference scores was examined using Q-Q plots and Shapiro-Wilk tests. Heteroscedasticity between sessions was examined by using the Bland-Altman method.²³ Systematic bias between testing sessions was evaluated using dependent t tests. Absolute reliability was determined by computing standard error of measurement (SEM),²⁴ 90% minimal detectable difference (MDD_90%), and coefficient of variation (CV).²⁵ Relative reliability was computed using ICC_2,1. Additionally, following examination of scatterplots, separate sex Pearson correlational analyses were conducted between the three UEFPT. Coefficients of variation were considered acceptable when values were below 10%. The magnitudes of the ICC and correlation coefficients were interpreted as follows: less than 0.40: poor/weak, between 0.40 and 0.59: fair, between 0.60 and 0.74: good/moderate, and between 0.75 and 1.00: excellent/strong.²⁶
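
As an illustration of the absolute reliability metrics, the following is a minimal Python sketch and not the SPSS/Excel procedure used in the study. It assumes the typical-error form of the SEM (standard deviation of the between-session difference scores divided by √2) and MDD_90% = SEM × 1.645 × √2. Both formulas are assumptions, because the text cites the methods without stating them, and the function names are hypothetical. The CV is left out because its computation is not described. The input of 1.4 touches is the female UECKCST difference SD reported in Table 3.

```python
import math

Z_90 = 1.645  # z-value for a two-sided 90% confidence level


def sem_from_difference_sd(sd_diff: float) -> float:
    """Assumed typical-error SEM: SD of between-session differences / sqrt(2)."""
    return sd_diff / math.sqrt(2)


def mdd90(sem: float) -> float:
    """Assumed 90% minimal detectable difference: SEM * 1.645 * sqrt(2)."""
    return sem * Z_90 * math.sqrt(2)


# Female UECKCST: SD of the session 2 minus session 1 differences (Table 3).
sem = sem_from_difference_sd(1.4)
print(round(sem, 1), round(mdd90(sem), 1))
```

With this input the sketch returns values that round to 1.0 and 2.3 touches, which agree with the SEM and MDD_90% tabulated for that row.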

RESULTS

Forty healthy young adults (20 males, 20 females) with a history of participation in various non-overhead sports participated in the study (Table 1).

Table 1. Participant demographics.

	Females	Males
Age (yrs)	23.9 ± 2.0	25.2 ± 3.2
Height (m)	1.66 ± .08	1.81 ± .07
Mass (kg)	63.0 ± 7.7	85.1 ± 13.2
Sport
Weightlifting	3	5
Flag football/football	2	7
Running	6	2
Basketball	4	2
Soccer	2	2
Cheerleading	1	0
Golf	1	1
Gymnastics	1	0
Wrestling	0	1

Apart from the UECKCST (p=0.006) for the females, results of the exploratory analysis for normality revealed no significant (p=0.070 to 0.524) departures for the difference scores. Closer inspection of the UECKCST scores revealed one female participant who exhibited fewer touches (2.4 touches) during the second session. The lower score contrasted with the other 19 females, who all showed improvement during the second session (i.e., more touches). When normality of the difference scores was examined without this participant included, the distribution no longer differed significantly from normal (p=0.066). Because the non-normality was attributable to a single participant, coupled with wanting to keep the analyses as simple as possible, the UECKCST data for the females did not undergo a data transformation. Results of the heteroscedasticity analysis yielded no significant relationships, with Kendall’s Tau values ranging from -.159 to .174.

Descriptive statistics for test performance across the two sessions are provided in Table 2. Only the UECKCST, for both sexes (Table 3), demonstrated significant (p ≤ 0.003) improvements in performance during the second session compared to the first session. Except for the HRPUT in males, the absolute reliability expressed as CV for the remainder of the tests in both sexes was at or below 8.8%. Across all three absolute reliability metrics, the results were slightly better for the females compared to the males (smaller values). The ICC results demonstrated excellent relative reliability for all three tests.

Table 2. Descriptive statistics for each of the functional performance tests. The units associated with CKCUEST are touches, the units associated with HRPUT are push ups and the units for the SMBCPT are meters.

Test	Sex	Session 1 (Mean ± SD)	Session 2 (Mean ± SD)
UECKCST	F	21.5 ± 4.8	23.4 ± 4.6
UECKCST	M	25.4 ± 3.5	26.9 ± 3.3
HRPUT	F	35.5 ± 8.2	36.6 ± 7.6
HRPUT	M	35.4 ± 7.8	33.9 ± 9.8
SMBCPT	F	3.76 ± .56	3.81 ± .67
SMBCPT	M	5.62 ± .70	5.76 ± .76

UECKCST: upper extremity closed kinetic chain stability test, HRPUT: hands release push up test; SMBCPT: seated medicine ball chest pass test; SD: standard deviation; F: female; M: male

Table 3. Results of the systematic bias (difference between the two assessment sessions), absolute reliability, and relative reliability analysis conducted on the functional performance tests. The units associated with CKCUEST are touches, the units associated with HRPUT are push ups, and the units for the SMBCPT are meters.

Test	Sex	Systematic Bias (Mean ± SD)	Systematic Bias p-value	SEM	CV (%)	MDD_90%	ICC (95% CI)
UECKCST	F	1.8 ± 1.4	<0.001*	1.0	5.0	2.3	.954 (.888-.982)
UECKCST	M	1.5 ± 2.0	0.003*	1.4	6.0	3.3	.823 (.606-.926)
HRPUT	F	1.1 ± 3.9	0.217	2.7	8.8	6.4	.881 (.725-.951)
HRPUT	M	-1.5 ± 4.6	0.175	3.3	12.5	7.6	.864 (.698-.944)
SMBCPT	F	.05 ± .27	0.383	.19	4.5	.45	.903 (.773-.961)
SMBCPT	M	.13 ± .41	0.160	.29	5.1	.67	.843 (.646-.935)

*statistically significant systematic bias (p<.05)
UECKCST: upper extremity closed kinetic chain stability test, HRPUT: hands release push up test; SMBCPT: seated medicine ball chest pass test; SD: standard deviation; SEM: standard error of the measurement; CV: coefficient of variation; MDD_90%: 90% minimal detectable difference; ICC: intraclass correlation coefficient; CI: confidence interval; D: dominant; ND: nondominant; F: female; M: male

Results of the correlational analyses (Figure 3) yielded only one significant relationship (r=.691, p=0.001) between UECKCST and SMBCPT for the females. Aside from the non-significant relationship between UECKCST and HRPUT for the females (r=.342, p=0.140), the remainder of the coefficients were less than .149 (p >0.530).

Figure 3. Scatterplots between the upper extremity closed kinetic chain stability test (UECKCST), seated medicine ball chest push test (SMBCPT), and hands-release push up test (HRPUT) for the females (closed circles) and males (open circles).

DISCUSSION

The primary purpose of this investigation was to assess the intersession reliability of three UEFPT to assist with RTS decision making in a cohort of males and females with a history of non-overhead sport participation. Based on current literature voids, paramount to the investigation was conducting comprehensive reliability analyses (i.e., systematic bias, relative reliability, absolute reliability), separately for each sex, to enable direct comparisons between the three tests. Across the three UEFPT, relative reliability was excellent, with the reliability estimates being slightly higher for females compared to males. Based upon the coefficients of variation, only the UECKCST and SMBCPT had values less than 10% across both sexes; the HRPUT coefficient of variation for the males was just beyond 10%. Thus, this investigation largely provides support for all three UEFPT and provides clinicians with the necessary information to interpret serial testing with these three tests.

The ICC, which largely represents the consistency of an individual’s rank relative to the group across the two testing sessions, for the three UEFPT across both sexes were higher than the threshold criterion of .75. Important to consider when assessing the clinical meaningfulness of ICC are the confidence interval widths, which represent the precision of the estimates (e.g., narrower indicates more precision), and the lower bounds (e.g., are they within an acceptable range). The confidence interval widths and lower bounds for the females also support the relative reliability of the three UEFPT. For the males, in addition to the ICC being slightly smaller than the females, the confidence interval precision was less (i.e., wider confidence intervals) and the lower bounds were below .7. This may be attributable to more activity level variability for the males, which likely results in greater upper extremity strength and power differences in the cohort. Compared to the females, more males were involved with weight training and football, activities that both require high degrees of upper body strength; however, within those activities there were likely high degrees of proficiency variability, such as the various positions in football and player size, which likely create more variation in the actual strength and test performance. As described previously, mixed sex samples, slight methodology differences, and the unspecified or non-generalized intraclass correlation coefficient models used in previous literature make drawing consensus difficult. Those issues aside, our intraclass correlation coefficient values for the UECKCST were either similar or higher than previous literature,^9,13,14,17 while our ICC for the SMBCPT were slightly less^10,11 with the exception of one study.¹⁶
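
For readers who want to see how an ICC of the form reported here is assembled, the following is a minimal Python sketch of the Shrout and Fleiss ICC(2,1) (two-way random effects, absolute agreement, single measurement) computed from two-way ANOVA mean squares. It is an illustration only: the function name and the small data set are hypothetical, the study's ICC values were produced in SPSS, and no confidence interval is computed.

```python
def icc_2_1(scores):
    """ICC(2,1) from a table with one row per participant, one column per session."""
    n = len(scores)       # number of participants
    k = len(scores[0])    # number of sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares (participants, sessions, residual).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-participant mean square
    msc = ss_cols / (k - 1)               # between-session mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical data: every participant scores exactly one unit higher in session 2.
shifted = [[1, 2], [2, 3], [3, 4]]
print(icc_2_1(shifted))
```

In this hypothetical example the rank order is perfectly preserved, yet the uniform one-unit shift lowers the coefficient to about 0.67, because the absolute-agreement form counts between-session differences against the estimate.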

While ICC are the most reported reliability statistics, absolute reliability and systematic bias are essential for clinicians to interpret test performance, whether comparing to normative data or conducting serial assessments on the same patient. For example, when conducting serial assessments during rehabilitation to monitor patient progress, a patient’s performance change must exceed the mean systematic bias plus an absolute reliability estimate (e.g., SEM, MDD) to definitively declare improvement. Compared to exceeding the standard error of measurement, a score improvement greater than the minimal detectable difference provides much greater confidence that the change reflects patient improvement. Likewise, if comparing a patient’s score to normative data yields a lower score that exceeds the minimal detectable difference, the clinician can be confident the patient has a deficiency. Similar to the ICC, the females demonstrated slightly better absolute reliability than the males, which is likely explained by the aforementioned speculation regarding the ICC.

The SEM reported for the UECKCST across two testing sessions ranges from 1.5 to 2.8 touches for healthy females and from 1.2 to 2.0 touches for healthy males.^9,14 The females in the current study demonstrated a lower SEM than the previous literature, while the males were within the range of the previous reports. Interestingly, the females in the current study demonstrated a smaller SEM (i.e., better absolute reliability) than the males, whereas in the previous reports the males demonstrated smaller SEM than the females. These slight differences are likely attributable to inclusion/exclusion criteria differences. The current investigation focused upon non-overhead athletes. In contrast, the previous literature using healthy participants focused on sedentary,¹⁴ upper extremity sport-specific recreational athletes,¹⁴ or just physically active participants.¹³

Only two previous investigations have reported SEM for the SMBCPT, in a cohort of young adult females¹⁶ and a mixed cohort of overhead athletes.¹¹ While our SEM for the females is within ~0.06m of the previously reported female cohort, our male participants exhibited SEM that exceeds the previous mixed sample report by ~0.19m. One explanation may be that both previous investigations specified that a strict horizontal throw be used. In contrast, and similar to a previous report examining the association between SMBCPT and bench press throw performance,¹⁵ we did not limit the maneuver to a strict horizontal projection. Gillespie and Keenum²⁷ compared chest pass throw performance while seated in a chair under controlled and uncontrolled projection angle conditions and reported ~0.30m greater throw distance for the uncontrolled projection angle condition. Thus, because we did not restrict projection angle, it is likely that our throw distances are greater than those in the two previous SMBCPT reports,^11,16 and that SEM increases accompany the greater throw distances.

As a unitless measure, the CV provides a means to directly compare the absolute reliability of the three UEFPT. Across both sexes, slightly smaller CV (better absolute reliability) were identified for the SMBCPT than the UECKCST, with the HRPUT demonstrating the largest CV. It is also important to note that the HRPUT CV for the males was slightly larger than the 10% acceptability threshold. The larger CV for the HRPUT is likely a function of the test duration being up to two minutes. Despite the larger CV, in light of the MDD_90% being 6.4 (females) and 7.6 (males) push-ups for the two-minute effort the HRPUT can likely play an integral role in RTS decision making following upper extremity injury or surgery.

In the current study, only the UECKCST exhibited a statistically significant improvement across the two testing sessions. The two previous studies that considered systematic bias for the UECKCST also reported statistically significant improvements with repeated testing.^9,13 Although the previous literature does not provide full descriptive statistics regarding the significant session improvements, estimation from the graphs presented suggests improvements between 1.5 and 2.4 touches; the significant mean improvements in the current study (1.8 touches for females, 1.5 touches for males) are both within this range. Closer inspection of our data yielded 85% (17/20) of the females and 75% (15/20) of the males to show greater touches during the second assessment. Further investigation yielded 75% (15/20) of the females and 45% (9/20) of the males demonstrating touch improvements that exceeded the standard error of measurement (i.e., typical error) threshold. As a result, practitioners should expect, on average, a one to two touch improvement when interpreting UECKCST scores from a second test administration. Thus, to be reasonably certain improvement in test performance exceeds that attributable to repeated exposure and measurement error, practitioners should seek an increase in touches that exceeds the sum of the systematic bias and 90% minimal detectable difference. Those thresholds, based on systematic bias plus minimal detectable differences, are ~4 and 5 touches for females and males, respectively. There was only one previous report,¹⁰ which used a radar gun to measure ball velocity instead of measuring distance, that examined and reported no systematic changes in SMBCPT performance.
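
The threshold arithmetic described above can be sketched in a few lines of Python. This is an illustrative calculation only, using the UECKCST systematic bias and MDD_90% values from Table 3; the function and variable names are hypothetical.

```python
def improvement_threshold(systematic_bias: float, mdd_90: float) -> float:
    """Change in score that exceeds both the expected retest gain and the MDD_90%."""
    return systematic_bias + mdd_90


# UECKCST values from Table 3, in touches: (systematic bias, MDD_90%).
ueckcst = {"F": (1.8, 2.3), "M": (1.5, 3.3)}

for sex, (bias, mdd) in ueckcst.items():
    print(sex, round(improvement_threshold(bias, mdd), 1))
```

The sums come to 4.1 touches for females and 4.8 touches for males, consistent with the approximate thresholds of 4 and 5 touches stated in the text.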

The void of research examining the HRPUT prohibits comparing the current results to previous results. Since the collection of the data for the current study, a slightly different HRPUT version has been incorporated into some military performance tests in place of the traditional push-up test. In contrast to the hands being lifted with upper arms remaining adjacent to the trunk within the current study, the military version incorporates the hands being moved outward from the shoulders until a full shoulder abduction position is reached. A traditional push-up is technically an upper extremity stretch-shortening movement because of the eccentric stretch created with the downward movement followed by the concentric power production push-up phase. To isolate the concentric power of the upper extremity musculature, the release phase of the HRPUT eliminates the eccentric stretch and attenuates the stretch-shortening augmentation to performance. Therefore, the HRPUT may be best suited to evaluate true concentric power press capacity of the upper extremity muscles, similar to upper extremity movements in sport that occur without the stretch-shortening sequence (e.g., wrestling, jujutsu, gymnastics, etc.). Our results, except for the CV for the males being 2.5% greater than the acceptable threshold, provide initial supportive insight into the test-retest reliability of the HRPUT.

In contrast to the authors’ hypotheses, there were no significant relationships between the UECKCST and HRPUT for either the females or males. Despite similar testing positions, the authors speculate that, given the large discrepancy in test duration (UECKCST=15s, HRPUT=2 min), the two tests rely upon different muscle performance characteristics. Specifically, it is likely that the HRPUT draws more upon muscle endurance whereas the UECKCST relies upon a blend of muscle performance and power to execute the frontal plane upper extremity movements. Additionally, while HRPUT solely involves moving the head-torso vertically, the UECKCST requires a degree of eye-hand coordination to sequentially touch the targets. Based upon previous research reporting a weak relationship between UECKCST and SMBCPT,¹⁶ the significant moderate strength association between the UECKCST and SMBCPT for the females in the current study was unexpected. The SMBCPT, being a single maximal effort maneuver, is a bilateral upper extremity muscle power assessment that has been previously demonstrated to associate with ballistic bench press performance.¹⁵ The relatively short UECKCST test duration (15s), coupled with the modified push-up position used by the females to perform the UECKCST not requiring high levels of muscle endurance to maintain the position while executing alternating touches, likely explains the statistically significant association.

Overall, except for the relationship between the UECKCST and SMBCPT for the females, the lack of associations between tests suggests they are assessing different aspects of upper extremity functional performance and thus can be used along a continuum. Specifically, as a progression hierarchy, the authors recommend first assessing a patient with the SMBCPT, followed by the UECKCST and then the HRPUT. Unlike the UEFPT for overhead sport activities that are completed unilaterally,⁸ all three of the UEFPT in the current investigation are bilateral assessments. Consequently, practitioners should be cognizant that the healthy (uninjured) side may be able to partially compensate for deficiencies in the injured limb. As a result, practitioners may want to follow up bilateral UEFPT with unilateral UEFPT and examination of limb symmetry indices.²⁸

Finally, it is important to recognize that this study sample included both recreational and competitive athletes across a variety of specific non-overhead sports. This study population was chosen to provide practitioners with initial reliability estimates regarding these three UEFPT across a broad population of non-overhead athletes. As test performance may differ between participation level and sports, future research should explore the reliability of the three UEFPT in other varied populations.

CONCLUSION

The results of the current study provide support for all three UEFPT and offer clinicians the necessary information to interpret serial testing using these three tests. With the exception of the moderate relationship between UECKCST and SMBCPT for the females, the lack of strong associations between performance of the three tests implies that they reflect different aspects of upper extremity functional performance.

Acknowledgements

We would like to thank Nicole Ebel, Morgan Taylor, Ben DeLoach and Luke Thayer for their assistance with participant recruitment and data collection.

Conflicts of interest

The authors report no conflicts of interest.
