The foot posture index, ankle lunge test, Beighton scale and the lower limb assessment score in healthy children: a reliability study
© Evans et al; licensee BioMed Central Ltd. 2012
Received: 9 October 2011
Accepted: 9 January 2012
Published: 9 January 2012
Outcome measures are important when evaluating treatments and physiological progress in paediatric populations. Reliable, relevant measures of foot posture are important for such assessments to be accurate over time. The aim of the study was to assess the intra- and inter-rater reliability of common outcome measures for paediatric foot conditions.
A repeated measures, same-subject design assessed the intra- and inter-rater reliability of measures of foot posture, joint hypermobility and ankle range: the Foot Posture Index (FPI-6), the ankle lunge test, the Beighton scale and the lower limb assessment score (LLAS), used by two examiners in 30 healthy children (aged 7 to 15 years). The Oxford Ankle Foot Questionnaire (OxAFQ-C) was completed by participants and a parent, to assess the extent of foot and ankle problems.
The OxAFQ-C returned a mean (SD) score of 6 (6) for parents and 7 (5) for children, showing good agreement between parents and children and indicating mid-range (transient) disability. Intra-rater reliability was good for the FPI-6 (ICC = 0.93-0.94), ankle lunge test (ICC = 0.85-0.95), Beighton scale (ICC = 0.96-0.98) and LLAS (ICC = 0.90-0.98). Inter-rater reliability was largely good: FPI-6 (ICC = 0.79), ankle lunge test (ICC = 0.83), Beighton scale (ICC = 0.73) and LLAS (ICC = 0.78).
The four measures investigated demonstrated adequate intra-rater and inter-rater reliability in this paediatric sample, which further justifies their use in clinical practice.
Outcome measures are important when evaluating the effectiveness of treatment and progress towards a final goal in paediatric populations. A Cochrane systematic review that we recently published highlighted the importance of using reliable and validated outcome measures. However, the current evidence around the use of reliable outcome measures in paediatric populations is sparse.
In the paediatric health care setting, measuring children's progress towards individual outcomes is increasingly important. Such measurements must be individual, in view of the diversity of developmental disabilities, goals, and interventions. The heterogeneity of the population often induces researchers to use generic standardised measurement tools or health-related quality of life measures; however, many are limited in terms of specificity and responsiveness to change. In contrast, in studies of homogeneous groups the sample size is often too small to detect convincing and clinically relevant differences between two treatment strategies.
Whilst flatfoot is considered to be the most common condition seen in paediatric orthopaedic clinics, it is not clear at what age children develop out of physiological flatfoot, and in the absence of obvious pathology, when and if a flatfoot is defined as pathological . As a frequently reported condition it has significant implications. These are not only for the individual child, where pain or the appearance of the foot is outside normal expectations, but also for the clinician in terms of assessment and management, and the health care setting in terms of resources.
Paediatric flatfoot has been found to be associated with reduced ankle joint range of motion, to be inversely related to age, to be more prevalent in boys, and to correlate directly with joint hypermobility and being overweight/obese. To complement the clinical assessment, the Oxford Ankle Foot Questionnaire - Children can assess the extent to which the lives of children, aged 5 to 16 years, are affected by foot and ankle problems. This patient-reported questionnaire takes into account the perceptions of both the child and their parent/carer. Usual, objective clinical assessment methods do not always capture the subjective patient perspective and may not accurately reflect how children function in their typical environments.
The reliability of clinicians' ratings is an important consideration in areas such as diagnosis and the interpretation of examination findings. Reliability of clinical foot measures commonly used in paediatric foot assessments has previously been investigated in various ways and for varying purposes. For example, Macfarlane et al established good intra-rater reliability for hand-held dynamometry when establishing isometric torque reference values for the lower leg muscles of 154 young, healthy children. Gilmour reported good intra-rater and inter-rater reliability for the measurement of the medial longitudinal arch, utilising the arch index calculated from footprints, in 272 children. This same study also established good intra-rater and inter-rater reliability for the measure of navicular height from the floor in standing subjects. Navicular height (NH), the Foot Posture Index (FPI), resting calcaneal stance position (RCSP), neutral calcaneal stance position (NCSP) and navicular drop (ND) were examined in young children (4 to 6 years) and adolescents (8 to 15 years) in an intra-rater and inter-rater reliability study. This study found differences in the reliability of the measures between the two age groups, with much lower inter-rater reliability in the younger children. From this study came the notion that young children require a different approach to foot posture assessment, from which the more recent paediatric flat foot proforma has evolved, and for which adequate inter-rater reliability has been found.
Morrison found good intra-rater reliability for ND in 13 children and also found good inter-rater reliability for the FPI in children aged 5 to 16 years. The reliability of measures of ankle range has been sparsely examined in healthy children. Bennell et al established the reliability of the weight-bearing ankle lunge test in adults and, although they used the same test to examine ankle motion in ballet dancers (aged 8 to 11 years), did not examine its reliability in this younger sample [18, 19]. Measures of joint hypermobility (the Beighton scale and the lower limb assessment score) have demonstrated good inter-rater reliability in adults and children respectively [20–23].
Whilst there have been some recent attempts to examine aspects of paediatric foot posture and joint range in children, the results are based upon differing subject samples and methodologies. Hence, the aim of this study was to examine the intra- and inter-rater reliability of clinical measures of foot posture, joint hypermobility and ankle joint range in a test-retest analysis of the same sample of healthy children.
Thirty children were recruited as a convenience sample from the Auckland University of Technology podiatry clinic and from staff associated with this clinic. All participants were healthy, asymptomatic children aged between 7 and 15 years. The institutional ethics committee approved the study and parents/guardians provided written informed consent.
Demographic and participant characteristic information, including age, gender, height, weight and body mass index, was recorded for each child at baseline. In addition, the parent and child versions of the Oxford Ankle Foot Questionnaire - Children (OxAFQ-C) were completed as the initial stage of data collection. The OxAFQ-C is a validated instrument used to assess the disability associated with foot and ankle problems in children aged from 5 to 16 years. Scores from the questionnaire can be calculated in three domains of children's lives: physical, school and play, and emotional. The questionnaire is appropriate for children with a range of conditions and can provide clinically useful information to supplement other assessment methods.
Four other foot and ankle musculoskeletal measurement instruments or tools were used in this study (determining the reliability of each was the primary aim of this study). These instruments included the Foot Posture Index (FPI-6), the Beighton scale and the lower limb assessment score (LLAS). The fourth test, the ankle lunge test, utilised a digital read-out inclinometer to record lower leg angulations. The two examiners were both podiatrists. One examiner had 20 years of clinical experience (rater 1), whilst the other examiner was a newly graduated podiatrist with 1 year of clinical experience (rater 2).
The participating child and their parent/guardian each completed their respective versions of the OxAFQ-C (15 questions). The maximum score for these questionnaires is 15, with lower scores indicative of more severe disability . Each child was then independently assessed twice by each examiner, for each of the FPI-6, Beighton scale, LLAS and ankle lunge test. At least two hours separated the assessment periods. All data collection occurred over three consecutive days.
The FPI-6 was evaluated with each child standing, using the original protocol. FPI-6 scores can range from -2 to +2 for each of the six criteria and from -12 to +12 for the total score, which indicates the position of each foot along the supinated (negative score) to pronated (positive score) continuum of foot posture.
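The scoring arithmetic described above can be sketched in a few lines of Python. This is an illustrative sketch only and not part of the study protocol: the function names `fpi6_total` and `fpi6_side` are hypothetical, and the sign-based label is a simplification that applies no normative cut-offs.

```python
from typing import Sequence

ALLOWED_CRITERION_SCORES = (-2, -1, 0, 1, 2)


def fpi6_total(criteria: Sequence[int]) -> int:
    """Sum six FPI-6 criterion scores, each an integer from -2 to +2."""
    if len(criteria) != 6:
        raise ValueError("FPI-6 requires exactly six criterion scores")
    for score in criteria:
        if score not in ALLOWED_CRITERION_SCORES:
            raise ValueError(f"criterion score out of range: {score}")
    return sum(criteria)  # total lies between -12 and +12


def fpi6_side(total: int) -> str:
    """Report only the sign of the total (hypothetical labels, no clinical cut-offs)."""
    if total < 0:
        return "supinated side of the continuum"
    if total > 0:
        return "pronated side of the continuum"
    return "zero"
```

Each foot would be scored separately, so a child contributes two totals, one per foot.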
The Beighton scale [22, 23] was rated to ascertain the presence of joint hypermobility at the wrist, fifth metacarpal phalangeal joint, elbow, knee (all bilateral and non-weight-bearing) and the lumbo-sacral spine (forward flexion, in stance). The Beighton scale yields a score from a 9-point rating, whereby the usual arbitrary cut-off of 5/9 or greater indicates joint hypermobility .
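As a rough illustration of how the 9-point rating and its cut-off combine, the following Python sketch tallies one point per positive site. The names `BILATERAL_SITES`, `beighton_score` and `is_hypermobile` are hypothetical, and the sketch assumes each site has already been judged positive or negative by the examiner; it does not encode the clinical criteria for each manoeuvre.

```python
from typing import Mapping

# Four sites assessed on both sides, plus one spinal item, gives 9 points.
BILATERAL_SITES = ("wrist", "fifth_mcp", "elbow", "knee")
BEIGHTON_CUTOFF = 5  # "5/9 or greater", the usual arbitrary cut-off cited in the text


def beighton_score(left: Mapping[str, bool], right: Mapping[str, bool], spine: bool) -> int:
    """Count positive findings: 4 left sites + 4 right sites + spine."""
    score = sum(bool(left[site]) for site in BILATERAL_SITES)
    score += sum(bool(right[site]) for site in BILATERAL_SITES)
    score += int(bool(spine))
    return score


def is_hypermobile(score: int, cutoff: int = BEIGHTON_CUTOFF) -> bool:
    """Apply the cut-off: scores at or above it indicate joint hypermobility."""
    return score >= cutoff
```

Keeping the cut-off as a parameter reflects that the text itself describes 5/9 as arbitrary.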
The LLAS  was assessed to gauge joint hypermobility of the lower limb (hip, knee, ankle, subtalar, midtarsal and first metatarsophalangeal joint). The subtalar joint assessment only involved weight-bearing evaluation. The LLAS yields a 12-point score/side, and by convention the total (24 point) score is halved to deliver a final score out of 12 (with arbitrary cut-off of 7/12 or greater indicative of joint hypermobility) .
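The halving convention described above is simple arithmetic, sketched here in Python. The function names are hypothetical, and the sketch assumes the per-side scores (0 to 12) have already been obtained by the examiner; the individual LLAS items are not modelled.

```python
LLAS_MAX_PER_SIDE = 12
LLAS_CUTOFF = 7  # "7/12 or greater", the arbitrary cut-off cited in the text


def llas_final_score(left_score: int, right_score: int) -> float:
    """Halve the 24-point total of both sides to give a final score out of 12."""
    for side_score in (left_score, right_score):
        if not 0 <= side_score <= LLAS_MAX_PER_SIDE:
            raise ValueError(f"per-side score out of range: {side_score}")
    return (left_score + right_score) / 2


def llas_indicates_hypermobility(final_score: float, cutoff: float = LLAS_CUTOFF) -> bool:
    """Scores at or above the cut-off indicate lower limb joint hypermobility."""
    return final_score >= cutoff
```

Because the total is halved, the final score can take half-point values such as 6.5.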
The ankle lunge test was performed using the method described by Bennell et al and adapted by Irving et al. This method incorporates an inclinometer (Smart Tool™) held on the anterior surface of the tibia, which is used to measure the participant's lunge angle. Because previous work [17, 19] has shown the lunge test to return symmetrical and reliable results, only the left-side lunge angle was measured for the purposes of this study.
Data management and statistical analysis
Following data collection, all data were entered and statistical analysis was conducted using SPSS Version 17 for Windows (SPSS, Inc., Chicago, IL, USA). Mean (SD) and n (%) were used to explore the demographic and participant characteristic data.
Reliability was assessed by calculating the intraclass correlation coefficients (ICCs) for each of the FPI-6, Beighton scale, LLAS and ankle lunge test (ICC (2,k) absolute agreement). ICCs across the same-subject repeated measures trials were calculated for each of the two examiners (intra-rater) and between the two examiners (inter-rater). Interpretation of the ICCs was conducted in accordance with Portney and Watkins, whereby values > 0.75 indicate good reliability, values ranging from 0.50 to 0.75 indicate moderate reliability and values < 0.50 imply poor reliability. We also used standard error of measurement (SEM) statistics. The SEM is expressed in the actual unit of measurement, which aids clinical interpretation: the smaller the SEM, the more reliable the measure.
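For readers who wish to reproduce this kind of analysis outside SPSS, the following Python sketch shows one way to compute ICC (2,k) with absolute agreement from a two-way ANOVA decomposition, apply the Portney and Watkins bands quoted above, and derive an SEM. It is a sketch under stated assumptions rather than the procedure used in this study: the function names are hypothetical, and the SEM formula (SD multiplied by the square root of 1 minus the ICC) is a commonly used formulation that the text does not explicitly confirm.

```python
import math
from statistics import mean
from typing import Sequence


def icc_2k(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,k), absolute agreement. Rows are subjects; columns are raters or trials."""
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters or trials
    all_values = [x for row in ratings for x in row]
    grand = mean(all_values)
    row_means = [mean(row) for row in ratings]
    col_means = [mean([row[j] for row in ratings]) for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for x in all_values)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subjects mean square
    msc = ss_cols / (k - 1)               # between-raters mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (msc - mse) / n)


def classify_icc(icc: float) -> str:
    """Bands quoted in the text: > 0.75 good, 0.50 to 0.75 moderate, < 0.50 poor."""
    if icc > 0.75:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


def sem_from_icc(sd: float, icc: float) -> float:
    """Assumed formula: SEM = SD * sqrt(1 - ICC), in the unit of the measure."""
    return sd * math.sqrt(1.0 - icc)
```

Under these bands, `classify_icc(0.79)` returns "good" and `classify_icc(0.73)` returns "moderate", which matches how an inter-rater ICC just under 0.75 is treated in the interpretation scheme.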
Table 1. Participant characteristics (n = 30)
Age, years, mean (SD), range: 10.6 (2.3), 7.0 - 15.0
Ethnicity, n (%): Caucasian, 27 (90%); Maori, 1 (3%); Asian, 2 (7%)
Rows without recoverable values: Females, n (%); Weight, kg, mean (SD); Height, m, mean (SD); BMI, kg/m2, mean (SD); OxAFQ-C (Parent), mean (SD); OxAFQ-C (Child), mean (SD)
[Table: Intra-rater reliability results for each examiner, across both testing periods (n = 30). Rows for each examiner: Foot Posture Index, lunge test, Beighton scale and lower limb assessment score, each reported as mean (SD).]
[Table: Inter-rater reliability for each measure and SEM for each of the repeated trials (n = 30). Columns: ICC (95% CI) for each trial. Rows recovered: Foot Posture Index, lower limb assessment score.]
[Table: Inter-rater reliability: mean inter-rater ICCs (95% CIs) and SEM across both testing trials (n = 30). Column: ICC (95% CI). Rows recovered: Foot Posture Index, lower limb assessment score.]
The OxAFQ-C raw domain scores demonstrated good agreement between parents and children (Table 1). Given this study's small convenience sample, little more can be inferred from these findings. The OxAFQ-C was developed as a site-specific (ankle/foot) instrument to provide an inexpensive and expedient method for assessing health status and evaluating outcomes from the perspective of children aged between 5 and 16 years. This patient-reported measure should be regularly used to assess the extent to which children are affected by foot and ankle problems.
The examiners displayed largely good intra-rater and inter-rater reliability for the FPI-6, the lunge test, the Beighton scale and the LLAS when applied to the sample population of children with a mean age of 10.6 years. Intra-rater analysis returned very good intraclass correlations and a small SEM for each measure. Rater 1 was the more experienced of the two raters and returned lower FPI-6 scores and also lower LLAS scores, suggesting that experience and clinical exposure modulate the assessment of flat feet and joint hypermobility within the lower limb.
Inter-rater reliability results, as categorised by the Portney and Watkins levels, were good for the FPI-6, the lunge test and the LLAS. The Beighton score fell only slightly short of this cut-off level and, being upper limb dominant, the scale may be a less familiar clinical tool for podiatrists, especially podiatry students. Clinicians can feel confident in using the FPI-6, the lunge test and either hypermobility evaluation tool in a busy clinical setting.
This study confirms the reliability of the FPI-6 [12, 15, 24] and the LLAS  in the paediatric setting. Whilst widely used as an expedient measure of global joint hypermobility, the Beighton scale has not previously been examined for its reliability in children. Previous studies in adults have found the Beighton scale to yield good inter-rater reliability [21–23]. The lunge test [17, 28] has been demonstrated to be a reliable clinical tool for ankle joint range assessment in adults [17, 25], but has not been tested for reliability in the paediatric population until now.
Given the known relationships between foot posture and ankle range, ankle range and hypermobility, and foot posture and hypermobility, it is pertinent to have identified the most useful measures for clinical assessment of these parameters. Clinicians and researchers can confidently use the FPI-6, the lunge test, the Beighton scale and/or the LLAS, which are often used in concert, for both baseline and monitoring purposes.
The LLAS has distinct advantages for use in the podiatry setting as it evaluates hypermobility in the lower limb and foot very specifically. The LLAS takes longer to administer than the briefer, more global Beighton scale, but yields far more information distal to the hips. The Beighton scale is a very quick and slightly coarser filter for hypermobility screening and, in one author's (AE) experience, is useful prior to the more specific LLAS.
This study had limitations. The sample comprised children with a mean (SD) age of 10.6 (2.3) years, so caution is advised when using these measures in children who are considerably younger or older than 10 years, and especially in younger children, for whom very different results with clinical foot measures have previously been found. The clearly disparate examiner experience appears to have affected results and must be noted in the assessment of both joint hypermobility and foot posture, where a less experienced examiner may over-estimate their extent.
Future research directions include the establishment of normative reference values across age groups for each of the four measures: the FPI-6, the lunge test, the Beighton scale and the LLAS. Such values already exist for the FPI-6 , so the assignation of normal values for the other three measures for healthy children and specific disease groups (e.g. cerebral palsy, Down's syndrome) would greatly assist both clinicians and research teams.
The present study has, for the first time, found that the four measures of the FPI-6, the lunge test, the Beighton scale and LLAS, demonstrate adequate intra-rater and inter-rater reliability in a paediatric sample. These findings indicate that all of these measures are useful in both clinical settings and research protocols that address the paediatric foot.
1. Rome K, Ashford RL, Evans AM: Non-surgical interventions for paediatric pes planus (Review). Cochrane Database Syst Rev. 2010, 7: 10.1002/14651858.CD006311.pub2.
2. Evans AM: The flat-footed child - to treat or not to treat, what is the clinician to do?. J Am Podiatr Med Assoc. 2008, 98: 386-393.
3. Rose KJ, Burns J, North KN: Factors associated with foot and ankle strength in healthy pre-school aged children and age-matched cases of Charcot-Marie-Tooth disease type 1A. J Child Neurol. 2010, 25: 463-468. 10.1177/0883073809340698.
4. Redmond AC, Crane YZ, Menz HB: Normative values for the Foot Posture Index. J Foot Ankle Res. 2008, 1:
5. Pfeiffer M, Kotz R, Ledl T, Hauser G, Sluga M: Prevalence of flat foot in preschool-aged children. Pediatrics. 2006, 118: 634-639. 10.1542/peds.2005-2126.
6. Murray KJ: Hypermobility disorders in children and adolescents. Best Pract Res Clin Rheumatol. 2006, 20: 329-351s. 10.1016/j.berh.2005.12.003.
7. Evans AM, Rome K: A Cochrane review of the evidence for non-surgical interventions for flexible pediatric flat feet. Eur J Phys Rehabil Med. 2011, 47: 69-89.
8. Morris C, Doll HA, Wainwright A, Theologis T, Fitzpatrick R: The Oxford ankle foot questionnaire for children: scaling, reliability and validity. J Bone Joint Surg Br. 2008, 90: 1451-1456. 10.1302/0301-620X.90B11.21000.
9. Sim J, Wright CC: The Kappa Statistic in Reliability Studies: Use, Interpretation, and Sample Size Requirements. Phys Ther. 2005, 85: 257-268.
10. Macfarlane TS, Larson CA, Stiller C: Lower extremity muscle strength in 6- to 8-year-old children using hand-held dynamometry. Pediatr Phys Ther. 2008, 20: 128-136. 10.1097/PEP.0b013e318172432d.
11. Gilmour JC, Burns Y: The measurement of the medial longitudinal arch in children. Foot Ankle Int. 2001, 22: 493-498.
12. Evans AM, Copper AW, Scharfbillig RW, Scutter SD, Williams MT: Reliability of the Foot Posture Index and Traditional Measures of Foot Position. J Am Podiatr Med Assoc. 2003, 93: 203-213.
13. Evans AM, Nicholson H, Zakaris N: The paediatric flat foot proforma (p-FFP): improved and abridged following a reproducibility study. J Foot Ankle Res. 2009, 2: 25. 10.1186/1757-1146-2-25.
14. Morrison SC, Durward BR, Watt GF, Donaldson MDC: The intra-rater reliability of anthropometric data collection conducted on the peripubescent foot: A pilot study. Foot. 2005, 15: 180-184. 10.1016/j.foot.2005.07.003.
15. Morrison SC, Ferrari J: Inter-rater reliability of the Foot Posture Index (FPI-6) in the assessment of the paediatric foot. J Foot Ankle Res. 2009, 2: 26. 10.1186/1757-1146-2-26.
16. Evans AM, Scutter S: Sagittal Plane Range of Motion of the Pediatric Ankle Joint. A Reliability Study. J Am Podiatr Med Assoc. 2006, 96: 418-422.
17. Bennell K, Talbot R, Wajsweiner H, Techovanich W, Kelly DH, Hall AJ: Intra-rater and Inter-rater reliability of a weight-bearing lunge measure of ankle dorsiflexion. Aust J Physiother. 1998, 44: 175-179.
18. Bennell K, Khan KM, Matthews B, De Gruyter M, Cook E, Holzer K, Wark JD: Hip and ankle range of motion and hip muscle strength in young female ballet dancers and controls. Br J Sports Med. 1999, 33: 340-346. 10.1136/bjsm.33.5.340.
19. Bennell K, Khan K, Matthews B, Singleton C: Changes in hip and ankle range of motion and muscle strength in 8 - 11 year old novice female ballet dancers and controls: a 12 month follow up study. Br J Sports Med. 2001, 35: 54-59. 10.1136/bjsm.35.1.54.
20. Ferrari J, Parslow C, Lim E, Hayward A: Joint hypermobility: the use of a new assessment tool to measure lower limb hypermobility. Clin Exp Rheumatol. 2005, 23: 413-420.
21. Remvig L, Jensen DV, Ward RC: Are diagnostic criteria for general joint hypermobility and benign joint hypermobility syndrome based on reproducible and valid tests? A review of the literature. J Rheumatol. 2007, 34: 798-803.
22. Juul-Kristensen B, Rogind H, Jensen DV, Remvig L: Inter-examiner reproducibility of tests and criteria for generalized joint hypermobility and benign joint hypermobility syndrome. Rheumatol. 2007, 46: 1835-1841. 10.1093/rheumatology/kem290.
23. van der Geissen LJ, Liekens D, Rutgers KJ, Hartman A, Mulder PG, Oranje AP: Validation of Beighton score and prevalence of connective tissue signs in 773 Dutch children. J Rheumatol. 2001, 28: 2726-2730.
24. Redmond AC, Crosbie J, Ouvrier R: Development and validation of a novel rating system for scoring foot posture: the Foot Posture Index. Clin Biomech. 2006, 21: 89-98. 10.1016/j.clinbiomech.2005.08.002.
25. Irving DB, Cook JL, Young M, Menz HB: Obesity and pronated foot type may increase the risk of chronic plantar heel pain: a matched case-control study. BMC Musculoskelet Disord. 2007, 8: 41. 10.1186/1471-2474-8-41.
26. Portney LG, Watkins MP: Foundations of clinical research. Applications to practice. 2000, Upper Saddle River, NJ: Prentice Hall Health, 2
27. Weir JP: Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. J Strength Cond Res. 2005, 19: 231-240.
28. Dennis RJ, Finch CF, Elliott BC, Farhart PJ: The reliability of musculoskeletal screening tests used in cricket. Phys Ther Sport. 2007, 9: 25-33.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.