INTRODUCTION
The Modified Thomas Test (MTT) is an orthopedic assessment employed by clinicians to examine hip flexor length, specifically targeting the iliopsoas and rectus femoris muscles.1,2 Goniometers, commonly utilized in clinical settings, provide quantitative measurements of joint motion, in degrees, for various joints in the upper and lower extremities.3 They can also be used to quantify outcomes of muscle length tests. In this study, “hip flexor length” refers to the overall assessment of the MTT, quantified by “hip extension ROM” measured using a goniometer during the MTT. Despite the generally recognized good-to-excellent reliability of goniometric measurements for muscle length in the lower extremity, a reported limitation is that the tool requires two hands, which makes it difficult to stabilize other body segments, particularly when measuring isolated hip range of motion that may be affected by lumbopelvic contributions.4,5
Intra-rater reliability measures the consistency of one individual’s measurements, while inter-rater reliability assesses consistency between different individuals measuring the same phenomenon. The MTT can be used to determine hip flexor length using goniometric measurements of hip extension. The MTT can also be scored using a pass/fail method, where a “pass” indicates that the subject’s ROM meets or exceeds 0 degrees of hip extension in the testing position, and a “fail” signifies that the subject does not achieve this position and remains in varying degrees of hip flexion while in the testing position.
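The pass/fail rule described above can be written as a one-line threshold check. The following is a minimal sketch, not part of the study protocol; the function name `score_mtt` and the sign convention (extension positive, residual flexion negative) are assumptions made for illustration.

```python
def score_mtt(hip_extension_deg: float) -> str:
    """Return 'pass' when hip extension ROM is 0 degrees or more, else 'fail'.

    Assumed sign convention: extension is positive, residual hip flexion
    in the testing position is negative.
    """
    return "pass" if hip_extension_deg >= 0 else "fail"


# Hypothetical readings in degrees, for illustration only.
readings = [5.0, 0.0, -8.5]
print([score_mtt(r) for r in readings])  # ['pass', 'pass', 'fail']
```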
The MTT is used in clinical practice to assess hip flexor length in patients with conditions such as lower back pain, knee dysfunction, and hip pain.6–10 However, the limited evidence regarding the MTT has shown conflicting reliability due to confounding variables including the lack of pelvic stabilization and varying positions of the contralateral hip.1 Researchers have found that uncontrolled pelvic tilt during the MTT measurement contributes to the overestimation of hip extension, leading to poor reliability.11,12 The purpose of this study was to evaluate the intra- and inter-rater reliability of the MTT to assess hip flexor length using a goniometer. It was hypothesized that controlling for pelvic tilt would result in high inter- and intra-rater reliability of these measurements.
METHODS
Subjects consisted of 64 volunteers (128 limbs). A power analysis estimated that a sample size of approximately 93 limbs would provide a 95% confidence level when analyzing data at a p<0.05 level of significance. Inclusion criteria were willingness to participate and being at least 20 years old. Exclusion criteria included recent surgery (within the prior six months) of the lumbar, hip, or knee region, recent physical trauma to the lumbar region or lower extremity (within the prior six months), or current care from a clinician for low back pain. Participants, primarily university students and staff, completed a detailed pre-participation questionnaire covering weekly activity level, height, weight, average sleep duration, dominant leg, prior injuries or surgeries, and back pain intensity (0-10 scale). This study adhered to ethical standards for human research and was approved by the Institutional Review Board (IRB). Informed consent was obtained from all individual participants included in the study.
The primary researcher was a physical therapist, board-certified in orthopedic physical therapy, with 12 years of experience in an outpatient setting. The four additional researchers were Doctor of Physical Therapy (DPT) students in the final year of their program. Following completion of all required paperwork, participants proceeded to the first station for MTT measurements of hip flexor length.
Procedures
Participants were instructed to sit at the edge of the mat table, pull one knee towards the chest and gradually roll back to the table. Subjects were instructed to allow the opposite leg to hang off the table. Once the participant was in supine on the mat table, the examiner assisted the participant in further flexing the hip not being measured with one hand while palpating under the lumbar spine for a neutral lumbopelvic tilt with the other. Neutral lumbopelvic tilt was operationally defined as the natural lordosis of the lumbar spine without excessive arching or flattening. Once neutral was found, the participant maintained their neutral position (confirmed by examiner palpation) for completion of the measurement. This allowed for maintaining a neutral lumbopelvic tilt and avoiding compensatory excessive lumbar lordosis during testing. Upon achieving the testing position, examiners utilized a standard plastic goniometer, positioning the fulcrum at the greater trochanter, the distal arm at the lateral midline of the femur, and the proximal arm at the lateral midline of the trunk. Stickers were placed on bony landmarks via palpation to encourage accuracy during measurements. Degrees of goniometric measurements were blinded by a piece of construction paper placed over the goniometer face to prevent the measurer from seeing the results.
Each participant underwent a total of eight measurements: two trials on each leg by each of the two examiners at two separate stations. While one examiner aligned the goniometer, blinded to the measurement by a piece of construction paper, the other examiner removed the paper, read and recorded the goniometric measurement. The stations were separated by a curtain to ensure independent measurements without vocal communication. Intra-rater reliability was assessed by comparing the two measurements on the same leg taken by the same examiner at each station, while inter-rater reliability was determined by comparing measurements on the same legs taken by different examiners across the two stations.
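The measurement design described above can be enumerated to check the counts. This is a minimal sketch under stated assumptions: the examiner labels "A" and "B" are hypothetical, and the inter-rater comparison is shown per leg only, because the text does not state how the two trials were combined for that comparison.

```python
from itertools import product

EXAMINERS = ("A", "B")  # hypothetical labels for the two examiners
LEGS = ("left", "right")
TRIALS = (1, 2)

# Every (examiner, leg, trial) combination is one goniometric measurement.
measurements = list(product(EXAMINERS, LEGS, TRIALS))

# Intra-rater: same examiner and same leg, trial 1 versus trial 2.
intra_pairs = [((e, leg, 1), (e, leg, 2)) for e, leg in product(EXAMINERS, LEGS)]

# Inter-rater: same leg, examiner A versus examiner B.
inter_pairs = [(("A", leg), ("B", leg)) for leg in LEGS]

# 8 measurements, 4 intra-rater pairs, 2 inter-rater pairs per participant
print(len(measurements), len(intra_pairs), len(inter_pairs))
```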
In this study, hip flexor length was measured in the MTT position with hip extension ROM being recorded in degrees. Data were entered into a Microsoft Excel spreadsheet for initial organization and verification before being analyzed in SPSS.
Statistical Analysis
Statistical analysis was performed using SPSS software. Intraclass correlation coefficients (ICCs) were calculated to assess both intra-rater and inter-rater reliability. SEMs were also computed to quantify the variability around the mean scores across trials, providing a comprehensive understanding of measurement consistency and reliability. Descriptive statistics, including means and standard deviations, were calculated to summarize the central tendency and dispersion of the hip ROM data.
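For readers who want to see how an ICC is obtained from raw scores, the sketch below computes one common form from a limbs-by-measurements table. The ICC model used in SPSS is not stated here, so the choice of ICC(2,1) (two-way random effects, absolute agreement, single measurement) is an assumption, as are the function name `icc_2_1` and the sample values.

```python
import numpy as np


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    scores: array-like of shape (n_limbs, k), where the k columns are the
    repeated trials (intra-rater) or the different examiners (inter-rater).
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()

    # Sums of squares for rows (limbs), columns (raters/trials), and residual.
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols

    # Mean squares from the two-way ANOVA decomposition.
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical hip extension ROM (degrees): five limbs, two examiners.
example = [[4.0, 5.0], [-6.0, -4.0], [12.0, 10.0], [0.0, 1.0], [8.0, 9.0]]
print(round(icc_2_1(example), 3))
```

Other ICC forms (for example, consistency rather than absolute agreement, or average rather than single measures) use the same mean squares with a different final ratio, so reported values depend on which model was selected.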
RESULTS
The convenience sample of 64 volunteers, primarily physical therapy students, was between 20 and 43 years of age, and comprised 42 females and 22 males. On average, participants were moderately active, engaging in physical activities for about 5.61 ± 3.63 hours weekly. The average sleep duration was approximately 6.77 ± 0.88 hours per night, indicating consistent sleep patterns. Additionally, the average Body Mass Index (BMI) was 25.55 ± 3.66 kg/m². The average hip flexor length measured using the MTT among the participants was 5.43 ± 9.73 degrees, suggesting a moderate level of variability across this sample population.
Intra-rater Reliability
The intra-rater reliability was high, indicated by mean ICCs of 0.899 and 0.923, suggesting repeatable measurement outcomes and good levels of agreement across multiple trials by the same rater (Figures 2-5, Table 1).
Inter-rater Reliability
High inter-rater reliability was also demonstrated, with mean ICCs ranging from 0.831 to 0.871 (Table 2).
Additionally, the SEMs for hip flexor length, derived from the MTT ROM data, further supported the precision of the measurements. The overall average SEM across both examiners and both sides (left and right) was 2.85 degrees. Overall, these results underscore the high reliability and measurement precision of the MTT, affirming its utility in both clinical and research settings for evaluating hip flexor length.
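As an illustration of how an SEM relates to an ICC, the sketch below assumes that SEM denotes the standard error of measurement and uses the commonly cited formula SEM = SD × sqrt(1 − ICC). The text does not state which formula was applied, so this is an assumption, and the input values are hypothetical; the sketch is not expected to reproduce the 2.85-degree figure.

```python
import math


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """SEM = SD * sqrt(1 - ICC), in the same units as SD (here, degrees)."""
    if icc > 1.0:
        raise ValueError("ICC cannot exceed 1")
    return sd * math.sqrt(1.0 - icc)


# Hypothetical inputs: SD of 10.0 degrees and an ICC of 0.91.
sem = standard_error_of_measurement(10.0, 0.91)
print(f"SEM is about {sem:.1f} degrees")
```

Under this formula, a higher ICC or a less variable sample both yield a smaller SEM, which is why the two statistics are usually reported together.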
DISCUSSION
Although the MTT is widely used in orthopedic and physical therapy practice, its reference validity and measurement reliability among practitioners have been questioned. This study aimed to assess the reliability of the MTT, addressing inconsistencies in previous studies, particularly those with variations in controlling pelvic tilt during testing.2,12
The results of the current study indicate strong intra- and inter-rater reliability for the MTT when utilizing goniometric measurements. These findings align with those reported by Vigotsky et al.12 and Kim and Ha,13 underscoring the increased reliability, specificity, and sensitivity of the MTT when accounting for lumbopelvic movement and controlling for pelvic tilt. Prior studies have consistently shown that pelvic tilt significantly affects the differences between MTT measurements of hip flexor length and standard hip extension goniometric measurements taken in the prone position. This emphasizes the important role pelvic tilt plays in the relationship between hip muscle length and pelvic position, compared to measuring hip joint ROM.
The results of the current study contradict the findings of both Peeler and Anderson14 and Gabbe et al.,15 who reported poor reliability for this test; this difference is likely due to the lack of pelvic control in those prior studies. However, Neves et al.8 suggested that positive results for shortening may also be influenced by an increase in joint capsule and ligament stiffness, a factor not considered in this study. The improved reliability observed in the current study can likely be attributed to the control of pelvic tilt during the MTT measurements. By attempting to reach and maintain a neutral pelvic position, the examiners lessened the potential for measurement error and variability, leading to more consistent results.
Maintaining a neutral pelvic tilt helps isolate the hip flexor muscles and provides a more accurate assessment of hip extension ROM.12 Without controlling for pelvic tilt, compensatory movements in the lumbar spine and pelvis can occur, leading to overestimation or underestimation of actual hip extension.13 This discrepancy highlights the importance of standardized testing protocols that account for pelvic positioning to achieve reliable and valid measurements. Additionally, the use of anatomical landmark stickers may have improved accuracy of alignment during goniometer evaluation across data collection by different examiners for each participant. Outcomes of the current study offer valuable information to clinicians, emphasizing the importance of controlling pelvic tilt when performing the MTT, as pelvic tilt appears to contribute to variability in measures.
The current findings contrast with those of Watkins et al.,16 challenging the assertion that goniometric measurements display high reliability within the same therapist but lack consistency between different therapists. The current results are consistent with the work of Clapis et al.17 demonstrating that goniometric measurement of hip flexor length using the MTT displayed both intra- and inter-rater reliability, surpassing that of inclinometer-based measurements. The superior intra-rater reliability observed may be attributed to several factors. Individual examiners tend to develop consistent personal techniques and methods when performing repeated measurements, which reduces variability and leads to higher intra-rater reliability. In contrast, inter-rater reliability involves comparing measurements between different examiners, each of whom may have slight variations in their technique or interpretation, despite standardized training and protocols. These small differences can lead to slightly lower inter-rater reliability compared to intra-rater reliability.
Moreover, the blinding technique used in the study, where the goniometric measurement was obscured by construction paper, helped minimize bias but did not entirely eliminate individual differences in measurement technique. Therefore, while the standardized protocols and training resulted in overall high reliability, intra-rater reliability was higher because each examiner was more consistent with their own methods than with another examiner’s methods. Clapis et al.17 emphasized the importance of consistency in measurement techniques, which inherently supports higher inter- and intra-rater reliability. The current study’s findings align with this, showing that when a single examiner performs repeated measurements, the consistency of their technique leads to more reliable results. However, despite these differences, the study still demonstrated high inter-rater reliability, indicating that standardized training and protocols were effective in achieving reliable measurements across different examiners.
This study’s strengths include the inclusion of both sexes, the minimization of potential bias through blinding, and the involvement of examiners with varying levels of experience, which demonstrates the reliability of the MTT across different expertise levels.
Although the MTT demonstrated high reliability among examiners, it is crucial to acknowledge certain limitations inherent to the study. The assessment was conducted on a young, healthy population, limiting the generalizability of the findings to broader age groups and diverse populations. Hip flexor stretching may have occurred with repeated measurements, as each participant underwent four goniometric measurements per leg, which could have influenced hip flexor length and therefore measured ROM. Furthermore, the order in which subjects were tested was not randomized, which could have introduced an order effect, potentially influencing the reliability outcomes. Future studies should consider randomizing the order of testing to eliminate this potential bias.
CONCLUSION
The findings of this study demonstrate that the MTT shows strong inter- and intra-rater reliability when pelvic position is controlled, aligning with the results of studies that have implemented similar controls. Previously reported poor reliability of the MTT in some studies may be due to the lack of control of pelvic position. These results support the use of the MTT as a reliable measure of hip flexor length in clinical practice when a neutral lumbar spine is maintained. It is important that physical therapists and medical professionals use reliable tests when assessing hip flexor length.
Conflicts of Interest
The authors declare no conflicts of interest.