doi: 10.4085/1062-6050-48.3.03
by the National Athletic Trainers' Association, Inc
www.nata.org/journal-of-athletic-training
original research
... excellent, few authors have determined interrater reliability. Preliminary evidence has shown poor reliability between assessors.
Objective: To determine interrater reliability using a group of investigators at 2 testing sites. A corollary purpose was to examine the interrater reliability when using normalized and nonnormalized performance scores on the SEBT.
Design: Descriptive laboratory study.
Setting: University research laboratory.
Patients or Other Participants: A total of 29 healthy participants between 18 and 50 years of age.
Intervention(s): Participants were evaluated by 5 raters at 2 testing sites. After participants performed 4 practice trials, each rater assessed 3 test trials in the anterior, posteromedial, and posterolateral reaching directions of the SEBT.
... excellent. For the normalized maximum excursion distances, the intraclass correlation coefficients (1,1) ranged from 0.86 to 0.92. Reliability for the nonnormalized measurements was stronger, ranging from 0.89 to 0.94.
Conclusions: When the raters have been trained by an experienced rater, the SEBT is a test with excellent reliability when used across multiple raters in different settings. This information adds to the body of knowledge that exists regarding the usefulness of the SEBT as an assessment tool in clinical and research practice. Establishing excellent interrater reliability with normalized and nonnormalized scores strengthens the evidence for using the SEBT, especially at multiple sites.
Key Words: dynamic postural control, clinical balance tests, functional balance
Key Points
• When multiple raters in different settings were trained by an experienced rater, the Star Excursion Balance Test had excellent reliability.
• Whether the chosen outcome was average or maximum scored and used raw or normalized data, the anterior, posteromedial, and posterolateral directions had excellent reliability.
Clinicians often use postural-control assessments to evaluate the risk of injury, initial deficits resulting from injury, and the level of improvement after intervention for an injury. Dynamic postural control has gained popularity in clinical and research settings as an assessment of function. One measurement of dynamic postural control that has increased in frequency of use is the Star Excursion Balance Test (SEBT). The measure of dynamic postural control is inferred from how far a participant can reach while maintaining a base of support. Widespread use of the SEBT in the clinical and research settings has demonstrated its strong capability to differentiate patients with lower extremity conditions such as ankle instability,1–8 anterior cruciate ligament reconstruction,9 and patellofemoral pain.10 Additionally, the SEBT can assess improvements in dynamic postural control after exercise interventions.4,11,12

The limited literature available suggests that when a single investigator performs the assessments and the participant has had an adequate number of practice trials, the conventional categorization of reliability13 is consistently moderate or better, even across multiple days of ...

... examine the reliability of the SEBT. A single investigator conducted trials on 20 healthy participants, who performed reaches in 4 directions of the SEBT during 2 sessions, with moderate to strong intraclass correlation coefficient (ICC) scores ranging from 0.67 to 0.87.15 Hertel et al14 recruited 16 healthy women who performed all 8 directions of SEBT over 2 testing sessions, with 2 investigators evaluating each participant on each day. The range of interrater reliability for the 8 directions was wide (ICC = 0.35–0.93), whereas the range of intrarater reliability was more narrow with stronger reliability scores (ICC = 0.78–0.96). The authors attributed the wide range in interrater reliability to lower scores that occurred on the first day of testing and were a potential artefact of a learning effect. They recommended 6 practice trials to overcome the learning effect and consequently improve reliability of the measure. More recently, Robinson and Gribble17 found that 4 practice trials were sufficient to overcome the learning effect, with better consistency in subsequent test trials. Similarly, Munro and Herrington16 noted that reliability between test trials improved after a fourth consecutive trial, with excellent ...
... SEBT measures. To date, no authors have had more than 2 assessors examine interrater reliability; a larger number of assessors is paramount to strengthening interrater reliability and expanding the use of this inexpensive tool to multi-site applications in which a number of individuals may be sharing information about patient screening. Interrater reliability is also essential to underpin the development of prevention and intervention strategies for lower extremity injury. Additionally, since the first reliability studies were performed, the now-accepted practice is to use normalized reach distances (reach distance / leg length).19 Therefore, we must revisit the reliability of the SEBT using normalized, rather than absolute, reach distances.

The primary purpose of our study was to determine interrater reliability using a group of investigators at 2 testing sites. The raters were all trained by the same ...

A total of 29 individuals volunteered to participate in the trial: 19 at 1 test site and 10 at the other site. At each site, participants reported to the laboratory for a single testing session. The stance leg was determined by randomization. The length of the stance leg was measured from the anterior-superior iliac spine to the most distal point of the ipsilateral medial malleolus, using a standard tape measure while participants lay supine on a plinth.

Each participant's performance on the SEBT was rated by all 3 raters in the manner described in the Performance of the SEBT section, with the order of raters being randomized. A verbal and visual demonstration of the SEBT was given to participants by the first rater, and the constraints of the test were explained. The participants then underwent the same protocol, and their SEBT performance was measured by the 3 raters. Participants performed 4 ...
... touched heavily or came to rest at the touchdown point, had to make contact with the ground with the reaching foot to maintain balance, or lifted or shifted any part of the foot of the stance limb during the trial.19,21

Although the SEBT consists of 8 directions, conventional testing procedures have adopted a condensed version of the test, using the ANT (Figure 1), PM (Figure 2), and PL (Figure 3) reaching directions.11,18,22 For the ANT reach, the stance-foot position is to place the toes at the 0 mark position of the anterior reach direction line. For the PM and PL reaches, the heel is placed at the 0 mark position of the anterior reach direction line. At the first testing station, 4 practice trials were required in each direction.17 Participants were afforded 5 minutes' rest between the practice and test trials.

Data Reduction
From each reaching direction (ANT, PM, and PL), the excursion distances were recorded (cm) and considered the ...

... Additionally, this model was applied to the leg-length measure. The data from the 2 sites were pooled, and an ICC (1,1) was used because participants were rated by different sets of 3 raters. An ICC (1,1) of <0.4 represents poor reliability; 0.4 to 0.75, fair to good reliability; and >0.75, excellent reliability.25

Results
For all 16 measures, the interrater reliability was excellent. For the normalized maximum excursion distances, the ICC (1,1) ranged from 0.86 to 0.92 (Table 2). Reliability for the nonnormalized measurements was stronger, ranging from 0.89 to 0.94 (Table 3). The interrater reliability of the leg-length measurement was excellent (ICC [1,1] = 0.92, 95% confidence interval = 0.86, 0.96). The ICC (1,1) and 95% confidence interval for the average, maximum, and composite scores in each direction, for both normalized and nonnormalized measurements, are shown in Tables 2 and 3, respectively.
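The ICC (1,1) and the reliability bands used in this analysis can be illustrated with a short sketch. It assumes the one-way random-effects, single-rating formulation of Shrout and Fleiss, ICC(1,1) = (MSB − MSW) / (MSB + (k − 1) MSW), where MSB and MSW are the between-participant and within-participant mean squares and k is the number of ratings per participant. The function names `icc_1_1` and `classify_icc` and the example ratings are hypothetical; they are not the study's software or data, and the sketch does not compute confidence intervals.

```python
def icc_1_1(ratings):
    """One-way random-effects, single-rating ICC(1,1).

    ratings: list of rows, one row per participant, each row holding
    the k scores that participant received from k raters.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 participants, each with the same k >= 2 ratings")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]

    # One-way ANOVA sums of squares: between participants and within participants.
    ss_between = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_within = sum((x - m) ** 2 for row, m in zip(ratings, row_means) for x in row)
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_between - ms_within) / denominator


def classify_icc(icc):
    """Bands as stated in the text: <0.4 poor, 0.4 to 0.75 fair to good, >0.75 excellent."""
    if icc < 0.4:
        return "poor"
    if icc <= 0.75:
        return "fair to good"
    return "excellent"


# Hypothetical normalized reach scores: 4 participants, each rated by 3 raters.
example = [
    [0.71, 0.72, 0.70],
    [0.80, 0.79, 0.81],
    [0.65, 0.66, 0.66],
    [0.90, 0.88, 0.89],
]
value = icc_1_1(example)
print(round(value, 3), classify_icc(value))
```

Because the one-way model does not separate a rater effect from residual error, it suits designs like this one, in which different participants are rated by different sets of raters.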
... without compromising the reliability of results, provided each rater is initially trained in the measurement of the SEBT by an experienced rater. This finding has promising clinical and research implications: consistent, reliable data can be collected when multiple investigators at multiple sites are trained and then assess participants performing the SEBT. A limitation to the application of our findings is that an expert provided the training and was involved in rating the participants. The next needed step is to determine whether similar levels of reliability can be obtained with the use of written or video instructions (or both) that could be distributed to clinicians and researchers.

In the only previous report of interrater reliability of the SEBT, Hertel et al14 reported ICCs between 0.35 and 0.93 when 16 healthy females performed all 8 directions of the SEBT over 2 testing sessions and 2 investigators evaluated each participant on each day. Lower estimates of reliability were found on day 1 of testing: ICCs were 0.76, 0.58, and 0.80 for the ANT, PL, and PM directions, respectively, less than demonstrated in our study. Hertel et al14 included the initial published recommendation for a specific number of requisite practice trials before SEBT assessment because of higher ICC values on a second day of testing. The interrater reliability scores in our study might be higher because we included 4 practice trials in each direction before the test was performed, based on a more recent study17 examining the learning effect during SEBT performance.

Furthermore, although it is not stated in their testing protocol, Figure 1 in the investigation of Hertel et al14 shows a participant performing the test wearing footwear. Our participants performed the test barefoot, potentially allowing for a more accurate measurement of excursion distance along the tape measure. Additionally, Hertel et al14 did not normalize reaching distances. One purpose of our investigation was to establish the interrater reliability using the normalized procedures that are now standard for the SEBT.19 Both normalized and nonnormalized reaching distances were associated with stronger interrater reliability than previously reported.14 Again, this is likely because we afforded participants a specific number of practice trials, which was a recommendation from the initial reliability study by Hertel et al.14

Little difference was observable in the strength of the reliability between the normalized and nonnormalized reaching data. Although the ICC values associated with the nonnormalized data seemed to be slightly higher, all values had excellent reliability with small 95% confidence intervals. Therefore, it is logical that SEBT performance should continue to include normalized reaching distances as previously established,19 and we can be confident that the associated reliability will be excellent.

A secondary purpose of this study was to establish interrater reliability of the leg-length measurements. Because this measurement is critical to the normalization of the SEBT reaches, it was imperative that this be reliable across our group of investigators. We demonstrated excellent interrater reliability using the selected leg-length measurement technique. This technique was chosen based on the original article19 describing normalizing SEBT reach distances. It should be noted that all the investigators were either credentialed clinicians or students in clinician preparation programs, and all had experience using this technique. Therefore, we must conclude that this result would be applicable to similar populations of individuals with clinical backgrounds.

An additional secondary purpose of the study was to compare the reliability of the SEBT assessments when using the average of 3 trials versus the maximum reach distance among 3 trials. In more than half of the assessments, the average normalized value had higher associated ICC values than the maximum values. However, for all the assessments, the ICC values were strong (>0.81). Common practice is to use an average of 3 or more trials on the SEBT. These data could be interpreted to show that if reliability is strong when maximum trials are used, perhaps greater time efficiency could be gained from only recording the maximum trial. When working with large sample sizes, this reduction may be advantageous. Additional investigation will be needed to determine how much time is saved when using only the maximum of 3 trials.

Each investigator in our study was trained in the measurement of the SEBT before the testing sessions. An investigator with more than 11 years of experience in SEBT performance and measurement provided instruction to the raters at each site before deeming them competent to take measures independently. The Hertel et al14 reliability study was conducted early in the initial development of the SEBT. Since this time, the evidence base surrounding the SEBT has been well established, and the test has become widely used in both clinical and research settings. The pretesting training of raters in the current study by an ...
Table 3. Interrater Reliability for Nonnormalized Reach Distances on the Star Excursion Balance Test

           Reach Distance, Intraclass Correlation Coefficient (1,1) (95% Confidence Interval)
Distance   Anterior            Posteromedial       Posterolateral      Composite
Average    0.92 (0.86, 0.96)   0.92 (0.85, 0.96)   0.92 (0.86, 0.96)   0.91 (0.85, 0.96)
Maximum    0.89 (0.82, 0.94)   0.90 (0.82, 0.95)   0.93 (0.88, 0.96)   0.94 (0.88, 0.97)
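The Average and Maximum rows of the table summarize the 3 test trials recorded in each reach direction. The sketch below shows one way such per-direction summaries could be derived, with and without normalization by leg length (reach distance / leg length). The function name `summarize_trials`, its parameters, and the trial values are hypothetical and serve only as illustration. The composite score is left out because its definition does not appear in this excerpt.

```python
from statistics import mean


def summarize_trials(trials_cm, leg_length_cm=None):
    """Return the average and maximum of the test trials for one reach direction.

    trials_cm: excursion distances in centimeters (3 test trials in the study).
    leg_length_cm: if given, each trial is first divided by leg length,
    so the summaries are normalized (unitless) rather than in centimeters.
    """
    if not trials_cm:
        raise ValueError("at least one trial is required")
    if leg_length_cm is not None:
        if leg_length_cm <= 0:
            raise ValueError("leg length must be positive")
        values = [t / leg_length_cm for t in trials_cm]
    else:
        values = list(trials_cm)
    return {"average": mean(values), "maximum": max(values)}


# Hypothetical anterior-reach trials (cm) and leg length (cm) for one participant.
anterior_trials = [60.0, 63.0, 66.0]
print(summarize_trials(anterior_trials))                      # nonnormalized, in cm
print(summarize_trials(anterior_trials, leg_length_cm=90.0))  # normalized
```

Recording only the maximum needs the same 3 trials but a single stored value per direction, which is the time-efficiency trade-off raised in the discussion.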
... current results, improves the external validity of the study, and allows for greater generalizability of results.

The main limitation of our study is that only 3 reach directions were evaluated. Therefore, we can conclude only that interrater reliability is excellent for the ANT, PM, and PL reach directions, as well as the composite score of these 3 reaching directions. However, previous researchers5 have demonstrated considerable redundancy in performance of the different reach directions of the SEBT in participants with and without chronic ankle instability. It is now common practice to include only 3 reach directions to assess dynamic postural control with the SEBT, and the 3 reach directions in our trial are those most often used in both clinical and research settings.

Additionally, we had a relatively small total sample size, with unbalanced numbers of participants at the testing sites. Our sample sizes were based on convenience and were restricted by time and availability due to the nature of the multi-site design and the international collaboration. We believe that our study provides interesting and useful information, but future researchers could consider replicating our methods with larger and equal sample sizes at each testing site.

All participants performed the procedures without shoes, but we did not control whether the participants wore socks during the performances. Although this could pose a limitation, a recent investigation,26 using the same 3 reaching directions as those in our study, demonstrated no differences in SEBT performance between a bare foot and wearing a regular sock.

Finally, a potential limitation is that the verbal instructions and practice trials were only provided to the participant at the first testing station by the first randomly assigned rater, followed by performing test trials for each of the 3 raters. This protocol was implemented because our interpretation of the previous reliability studies14,17 suggested that during a single testing session, after a specific number of practice trials are allowed, no significant improvement in performance follows. This protocol is useful for studying reliability, but we do not believe that having 3 raters is practical. Instead, a single rater would provide the verbal instructions and observe the prescribed practice trials for a single patient. This is a small discrepancy between our study protocol and practical ...

References
1. Akbari M, Karimi H, Farahini H, Faghihzadeh S. Balance problems after unilateral lateral ankle sprains. J Rehabil Res Dev. 2006;43(7):819–824.
2. Gribble PA, Hertel J, Denegar CR. Chronic ankle instability and fatigue create proximal joint alterations during performance of the Star Excursion Balance Test. Int J Sports Med. 2007;28(3):236–242.
3. Gribble PA, Hertel J, Denegar CR, Buckley WE. The effects of fatigue and chronic ankle instability on dynamic postural control. J Athl Train. 2004;39(4):321–329.
4. Hale SA, Hertel J, Olmsted-Kramer LC. The effect of a 4-week comprehensive rehabilitation program on postural control and lower extremity function in individuals with chronic ankle instability. J Orthop Sport Phys Ther. 2007;37(6):303–311.
5. Hertel J, Braham RA, Hale SA, Olmsted-Kramer LC. Simplifying the Star Excursion Balance Test: analyses of subjects with and without chronic ankle instability. J Orthop Sport Phys Ther. 2006;36(3):131–137.
6. Martinez-Ramirez A, Lecumberri P, Gomez M, Izquierdo M. Wavelet analysis based on time-frequency information discriminate chronic ankle instability. Clin Biomech (Bristol, Avon). 2010;25(3):256–264.
7. Nakagawa L, Hoffman M. Performance in static, dynamic, and clinical tests of postural control in individuals with recurrent ankle sprains. J Sport Rehabil. 2004;13(3):255–268.
8. Olmsted LC, Carcia CR, Hertel J, Shultz SJ. Efficacy of the Star Excursion Balance Tests in detecting reach deficits in subjects with chronic ankle instability. J Athl Train. 2002;37(4):501–506.
9. Herrington L, Hatcher J, Hatcher A, McNicholas M. A comparison of Star Excursion Balance Test reach distances between ACL deficient patients and asymptomatic controls. Knee. 2009;16(2):149–152.
10. Aminaka N, Gribble PA. Patellar taping, patellofemoral pain syndrome, lower extremity kinematics, and dynamic postural control. J Athl Train. 2008;43(1):21–28.
11. McKeon PO, Ingersoll CD, Kerrigan DC, Saliba E, Bennett BC, Hertel J. Balance training improves function and postural control in those with chronic ankle instability. Med Sci Sports Exerc. 2008;40(10):1810–1819.
12. McLeod TC, Armstrong T, Miller M, Sauers JL. Balance improvements in female high school basketball players after a 6-week neuromuscular-training program. J Sport Rehabil. 2009;18(4):465–481.
13. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979;86(2):420–428.
14. Hertel J, Miller SJ, Denegar CR. Intratester and intertester reliability during the Star Excursion Balance Test. J Sport Rehabil. 2000;9(2):104–116.
Address correspondence to Phillip A. Gribble, PhD, ATC, FNATA, 2801 W Bancroft, University of Toledo, Mailstop #119, Toledo, OH
43606. Address e-mail to phillip.gribble@utoledo.edu.