Intra-assessor reliability and measurement error of ultrasound measures for foot muscle morphology in older adults using a tablet-based ultrasound machine

Background To gain insight into the role of plantar intrinsic foot muscles in fall-related gait parameters in older adults, it is fundamental to assess foot muscles separately. Ultrasonography is considered a promising instrument to quantify the strength capacity of individual muscles by assessing their morphology. The main goal of this study was to investigate the intra-assessor reliability and measurement error for ultrasound measures for the morphology of selected foot muscles and the plantar fascia in older adults using a tablet-based device. The secondary aim was to compare the measurement error between older and younger adults and between two different ultrasound machines. Methods Ultrasound images of selected foot muscles and the plantar fascia were collected in younger and older adults by a single operator, intensively trained in scanning the foot muscles, on two occasions, 1–8 days apart, using a tablet-based and a mainframe system. The intra-assessor reliability and standard error of measurement for the cross-sectional area and/or thickness were assessed by analysis of variance. The error variance was statistically compared across age groups and machines. Results Eighteen physically active older adults (mean age 73.8 (SD: 4.9) years) and ten younger adults (mean age 21.9 (SD: 1.8) years) participated in the study. In older adults, the standard error of measurement ranged from 2.8 to 11.9%. The ICC ranged from 0.57 to 0.97, but was excellent in most cases. The error variance for six morphology measures was statistically smaller in younger adults, but was small in older adults as well. When different error variances were observed across machines, overall, the tablet-based device showed superior repeatability. Conclusions This intra-assessor reliability study showed that a tablet-based ultrasound machine can be reliably used to assess the morphology of selected foot muscles in older adults, with the exception of plantar fascia thickness. Although the measurement errors were sometimes smaller in younger adults, they seem adequate in older adults to detect group mean hypertrophy as a response to training. A tablet-based ultrasound device seems to be a reliable alternative to a mainframe system. This advocates its use when foot muscle morphology in older adults is of interest. Supplementary Information The online version contains supplementary material available at 10.1186/s13047-022-00510-1.


Background
Less propulsive gait and diminished balance capabilities, being consequences of the normal ageing process [1,2], are associated with an increased likelihood of falling in older adults [1,[3][4][5][6]. Plantar intrinsic foot muscles (PIFMs) play an important role in these two features of gait, at least when they are unaffected [7][8][9]. For instance, propulsion is aided by the PIFMs by stiffening the foot during the push-off phase of walking, hence contributing to the effective force transmission onto the ground [7]. The PIFMs also act to stabilize the foot arch, which is imperative for sound postural balance [8,9].
Concurrent with the PIFMs' related mobility decline, a decreased capacity of the PIFMs to produce force has been observed in older adults [10]. In this population, toe flexor weakness is associated with a higher probability of falling [11]. Consequently, toe flexor strengthening is often one of the goals in fall prevention interventions [12]. However, toe flexor strength is the combined result of contraction of intrinsic foot muscles (i.e., origin and insertion in the foot) and extrinsic foot muscles (i.e., origin proximal to ankle joint, insertion in the foot), both having a shared as well as a distinct function [9,13]. It is thus important to distinguish the PIFMs as a separate group of foot muscles, as well as to distinguish individual PIFMs, in order to gain more insight in the unique role of PIFMs in fall-related mobility parameters. These insights may lead to the enhancement of related treatment.
Recently, some studies investigated the role of individual PIFMs in foot function [14] or evaluated a PIFM strengthening intervention [15], using a measure for toe flexor strength or strength capacity. However, directly assessing the strength of individual foot muscles is unviable in vivo because of the redundant combinations of intrinsic and extrinsic foot muscles' contractions resulting in the same net force [16]. Therefore, measuring flexor strength of plantar foot muscles in units of force is restricted to measuring net toe flexor or toe grip force produced by the PIFMs in conjunction with the extrinsic foot muscles [16]. To overcome this limitation, ultrasound has been applied to study individual foot muscles (i.e., both intrinsic and extrinsic) [17][18][19]. This imaging technique is used to obtain the dimensions of these muscles, as an estimate of its capacity to exert force. The validity of this approach is confirmed by the observation that both the cross-sectional area (CSA) and the thickness of the PIFMs correlate well with maximum toe flexor force [10,[20][21][22].
Although magnetic resonance imaging (MRI) is considered the gold standard in the assessment of muscle morphology, ultrasonography is often preferable in both clinical and research settings [23]. In comparison with an MRI machine, an ultrasound machine is more accessible, portable and has superior temporal and spatial resolution when used to image superficial structures [23,24]. The ongoing advancement of pocket-sized ultrasound equipment advocates the utility even more [25]. Despite the eminent tissue differentiating capabilities of MRI [26], ultrasound derived measures for lower extremity muscle morphology correlate well with values obtained by using MRI [26][27][28][29]. In contrast to MRI, ultrasonography enables the operator to capture a muscle's contraction in a cineloop, which facilitates the post-processing identification of a muscle's circumference [30].
Determining the morphology of specifically the PIFMs, as part of the foot muscles, using ultrasound images is, however, challenging. This is due to the complex anatomical architecture of each of these muscles [31] and their non-parallel arrangement over several muscle layers [32]. Nevertheless, in general, studies revealed excellent interand intra-operator reliability and acceptable measurement errors for the ultrasound assessment of the thickness and CSA of PIFMs in younger adults [21,[33][34][35][36]. In addition, a study [37] that compared the reliability across machines for one of the PIFMs (i.e., abductor hallucis), demonstrated at least good reliability, even when using a laptopbased machine. These findings indicate the potential of ultrasonography to discriminate between individuals and to measure changes over time [38].
However, these measurement properties (i.e., reliability and measurement error) cannot be simply generalized to older adults for two reasons. Most importantly, physiological changes that occur with ageing, such as a higher degree of intramuscular adiposity and connective tissue [39] or increased presence of callus [40], may interfere with image quality and thus may limit the accuracy of muscle morphology measurements [41,42]. Furthermore, the reliability, as expressed in the intra-class correlation coefficient (ICC), is mathematically dependent on the biological variability between subjects [38] and this may differ between younger and older adults. Although ultrasonography has been shown to be a reliable instrument to measure quadriceps and gastrocnemius morphology in older adults [43], this is still unknown for the foot muscles. These muscles, including the PIFMs, extrinsic toe flexor muscles, but also extrinsic in-and evertors, should be jointly assessed together with the plantar fascia (PF), considering the synergistic contribution of this group of foot tissues to foot function [17][18][19]. The reliability of ultrasonography to assess the morphology for these foot muscles and the PF needs to be determined in order to judge the potential of this instrument to be used in future research concerning, for instance, the role of the foot muscles in relation to fall risk-related mobility parameters. For this future purpose, a single operator performing the ultrasound scans is recommended, as this is expected to result in more reliable measures [43]. Therefore, the main goal of this study was to investigate the intra-assessor reliability and measurement error of ultrasound measures for the morphology of selected foot muscles and the PF in older adults using a tabletbased ultrasound machine. In addition, we assessed a younger population using the same ultrasound machine and a mainframe machine to explore factors that could underlie these measurement properties. To investigate the effect of age, we compared the measurement error between older and younger adults using the tablet-based machine. To investigate if improved repeatability in older adults can be expected when changing to a mainframe machine, we compared the measurement error between the tablet-based and the mainframe machine in younger adults.

Study design
The design used was a blinded single assessor test-retest reliability study.

Ethical considerations
The medical ethical committee of Maxima MC declared that ethical approval was not required for this study protocol (N19.105). Written informed consent was obtained before the start of data collection.

Study population Recruitment
A sample of 18 older adults was recruited in the region of Eindhoven, The Netherlands, via advertisement in senior housing complexes and by sending a recruitment leaflet per e-mail to the social network of the investigator. Ten younger adults were recruited through personal communication within the University of Applied Sciences, Eindhoven, The Netherlands. Due to a lack of consensus on the required number of participants to achieve reliable measurement properties [38], the sample sizes were based on comparable studies [33,34].

Selection procedure
Individuals were eligible for participation in the older adult group if they were at least 65 years of age, in accordance with the categorization of the World Health Organization [44]. To participate in the younger adult group, volunteers had to be between the ages of 18 and 45 years, as lower extremity muscles start to atrophy after the age of approximately 45 years [45]. Further, volunteers had to be free of any known condition or disease affecting foot muscles and had to be able to ambulate ten meter without using a walking aid in order to represent a mobile population. Volunteers were excluded from study participation if they reported bilateral musculoskeletal injuries or bilateral symptoms distal to the knee (i.e., current musculoskeletal pain or overuse symptoms, orthopedic surgery or acute injury within the past 5 years, amputation) or if mobility or lower extremity motor function was likely to be affected by medical conditions (i.e., neurological condition, systemic disease, cardiovascular or pulmonary disease).

Measurement procedure
The measurement set-up is schematically depicted in Fig. 1. In a first period, images of foot muscles and PF were acquired in the older participants using a tabletbased ultrasound machine only. Thereafter, we decided to repeat the protocol in younger adults to explore factors that could underlie the established measurement properties. Approximately one year after the data collection period in older adults, the data were collected in the younger participants using both the tablet-based and a mainframe ultrasound machine for all measurements. This allowed us to investigate the influence of age on the measurement error and also to see if improved image quality, as expected from the mainframe machine, would reduce the measurement error. The ultrasound images in each participant were collected on two separate measurement occasions, at least one day apart [33], with a maximum of 8 days apart, assuming that foot muscle and PF morphology remains stable within this period. Participants were instructed not to engage in vigorous physical activities in the three days prior to the measurement sessions to avoid exercise-induced swelling of the muscles. The time of day was kept constant over the repeated measurements within participants. The older adults were measured at home. The younger adults were invited to the movement analysis laboratory at Fontys University of Applied Sciences (Eindhoven, the Netherlands).
At the first measurement occasion, demographics and characteristics were collected that are related to foot muscle and PF morphology or its ultrasound measurement reliability. Body length and weight were assessed manually in the older adults (213i and 750, Seca co., Hamburg, Germany) and electronically in the younger adults (DS-103, Dong Sahn Jenix co., Seoul, Korea). The participants were also asked about their physical activity behavior.
The ultrasound scans were performed by the investigator (LW), having a master degree in human movement sciences and several years of experience in teaching foot and ankle anatomy to podiatry students. She underwent a 10-month training program in imaging the foot muscles and PF prior to the study, without having previous experience in ultrasound imaging. The training started with three technical lectures and an individual training session from a formal ultrasound teacher specialized in musculoskeletal ultrasound. Throughout the training period, a few instructional sessions were supervised by either a physiotherapist experienced in clinical musculoskeletal ultrasonography or by a researcher experienced in imaging the foot muscles. The remainder of the training consisted of unsupervised sessions in which the proposed scan protocol [33,34] was practiced on younger adult volunteers alternated by interpreting the ultrasound images using an interactive anatomy atlas of MRI images, cadaveric videos and schematic illustrations. After the training, a pilot study in younger adults revealed intra-assessor ICC and limits of agreement (LoA) values that were overall comparable to the ones found in a previous study with an experienced operator [33]. This was considered sufficient to start the data collection. Additionally, the collected data in the current study were examined for any inevitable ongoing improvement in the operator's skills, which is explained in more detail in the statistical analysis paragraph below.
In the older and younger participants, the foot tissues were imaged using a tablet-based ultrasound system (Lumify, Philips Ultrasound, Inc., Bothell, USA) consisting of a 4-12 MHz broadband linear array transducer with a footprint length of 34 mm, the Lumify app and a Samsung Galaxy S4 tablet (Samsung Electronics co., Suwon, South Korea). Because of practical feasibility reasons, the scan protocol was repeated only in the younger participants using a mainframe system (Xario 200 g, Canon, Tochigi, Japan) with a 5-14 MHz linear array transducer (PLU-1005BT, Toshiba, Tochigi, Japan) with a footprint length of 58 mm. The order of the systems used was randomly chosen for each measurement occasion.
The dominant stance limb, decided by asking the participant to stand on one leg, or the asymptomatic limb, in case of unilateral symptoms, was scanned. Foot structures that were imaged consisted of intrinsic foot muscles (i.e., abductor hallucis (AbH), flexor digitorum brevis (FDB), quadratus plantae (QP), flexor hallucis brevis (FHB), abductor digiti minimi (AbDM)) and extrinsic foot muscles (i.e., tibialis anterior (TA), peroneus longus together with the peroneus brevis (PER), flexor digitorum longus (FDL), flexor hallucis longus (FHL)) and PF. TA, PER, FDL and AbH were imaged while the participants were in a supine position, their knee slightly bent and their distal thigh resting on a cushion, preventing compression of the lower leg muscles. To image FHL, FDB, QP, FHB, AbDM and PF, the participants lay in a prone position, their foot hanging freely off the plinth and their distal shank resting on a cushion. Using anatomical landmarks, washable lines were drawn on the skin to guide the placement of the transducer. The scan protocol was adopted from previous studies [33,34]. Based on our own pilot testing, we decided to image FHL slightly proximal to the ankle joint and TA at 25% of lower leg length. Figure 2 illustrates the transducer position and Additional file 1 provides a detailed scan protocol. Considering their shape, we decided to image all muscles, except for the FDL, and the PF in the longitudinal plane and selected muscles (TA, PER, AbH, FHL, FDB, QP, FDL) additionally in the transverse plane. Imaging QP in the transverse plane was omitted in the protocol for younger adults, because of the indefinite appearance of QP in the transverse images. This was most probably due to the non-parallel orientation of QP and its surroundings and was not expected to be an effect of age.
A generous amount of water-based coupling gel was applied between transducer and skin to obtain a clear image while avoiding compression of the tissue of interest. The probe was held perpendicular to the tissue border to achieve optimal appearance of the tissues of interest. The depth, focal point and gain were adjusted for each participant and each tissue to optimize image  quality. In case of difficulties with identifying individual muscles and whenever possible, participants were asked to perform a specific movement to provoke a contraction that assisted the offline identification of the muscle's border at the time of processing [46]. The contraction as well as the relaxed state of the muscle was captured in the same cine-loop of up to 5 s duration. Once the cineloops were acquired for each muscle and the PF, the protocol was repeated twice resulting in three trials for each morphology measure. To minimize the discomfort for the participant of lying still, an efficient workflow was accomplished by fixing the order of tissues to be imaged. However, to avoid systematic interference (e.g., due to fatigue or familiarization) with the ultrasound measures, the starting position of the participant (i.e. supine or prone) was randomly chosen for each measurement occasion. At the end of the session, the drawn scanning lines were removed from the skin. The captured videos were stored offline for the post-processing measurements. The whole procedure was repeated at the second measurement occasion, 1-8 days later.

Data processing
The assessor who performed the post-processing to determine the morphology of the foot muscles and the PF was the same person who acquired the images (LW). Cine-loops were post-processed per trial per participant several days to several weeks after the images were collected. To avoid recall bias, no more than one trial per participant was processed per day, followed by at least two trials of other participants. This ensured that no less than 34 other values were assigned before another trial of the same participant was administered. Muscle dimensions obtained in previous trials were not presented to the assessor to further minimize the risk of recall bias.
Image J software (National Institute for Health, Bethesda, MD, USA) was used for the offline processing of data. Within the cine loop, the best quality frame was selected in which the muscle was in a relaxed state. To measure the thickness of the muscles and the PF, the built-in digital caliper was applied on all longitudinal images and the transverse images of TA, PER and FHL. The thickness of a muscle was represented by the vertical distance between the muscle's epimysium, while the thickness of the PF was determined by the perpendicular distance between the deeper and superficial fascia borders. The CSA of FDL, AbH, FDB and QP was measured by delineating all intramuscular tissue using the polygon or freehand tool. Care was taken not to include hyperechoic surroundings of the muscles (e.g., epimysium or fascia) in the measurements for muscle morphology. The mean of three measurements (i.e., trials) for each morphology measure per occasion was administered as the final measure for further analysis.
Statistical analysis SPSS 25.0 (IBM, Chicago, IL, USA) was used for the statistical analysis. The study populations' characteristics were specified by describing gender, age, body length, body weight, BMI, daily time spent on their feet and whether or not the global recommendations on physical activity for health were met [44]. Measurement characteristics included whether the dominant stance leg was measured, actual number of days between the measurement occasions and the difference in time of day between the measurement occasions.
ICC (3,1; absolute agreement) for a single measurement was estimated by intra-assessor reliability analysis of the repeated ultrasound measurements (occasions) based on a 2-way mixed-effects model. Cut-off values were used to interpret the ICC as a measure for reliability (i.e., < 0.50: 'poor', 0.50-0.75: 'moderate', 0.75-0.90: 'good' and > 0.90: 'excellent') [47]. The standard error of measurement (SEM) and smallest detectable change (SDC) were calculated using the error variance, estimated using a 2-way mixed-effects model via the restricted maximum likelihood approach, according to the formulae: where σ so. e 2 is the error variance consisting of variance due to both systematic and random error.
All measurement properties (i.e., ICC, SEM, and SDC) were calculated separately for 1) older and 2) younger adults using the tablet-based machine and 3) for younger adults using the mainframe machine (Fig. 1). 'Occasion' was the only component that was varied across the repeated measurements.
In order to explore a possible learning curve, the older adults were divided over three sets of participants based on the chronological order of the first ultrasound measurement occasion [48]. The SEM for each morphology measure was calculated for each set of participants. When the SEM dropped with more than 50% from the one set to the next set of participants, without showing a previous increase, it was decided that the operator was still learning. The measurement properties (i.e., ICC, SEM, SDC), for the morphology measure(s) where this applied to, were then determined with the exclusion of the respective set(s) of participants.
Statistically (α = 0.05), 1) the SEMs for the tablet-based ultrasound measurements were compared between the older and younger adults and 2) the SEMs in the younger adults were compared between the tablet-based and the mainframe machine by applying the 'variance ratio approach' and the 'paired approach', respectively [49]. Table 1 shows the characteristics for the study populations and the measurements. Eighteen older adults with a mean age of 73.8 (SD: 4.9) years, and ten younger adults with a mean age of 21.9 (SD: 1.8) years participated in the study. The older adults had a higher body weight and BMI compared to the younger adults (mean body weight: 75.9 (SD: 13.5) kg vs. 63.8 (SD: 10.9) kg, p < 0.05; mean BMI: 26.3 (SD: 3.2) kg/m 2 vs. 21.4 (SD: 2.5) kg/m 2 , p < 0.05). Other participant characteristics were not statistically different between the age groups, nor were the measurement characteristics.

Results
The learning curves (Additional file 2), revealed that the SEM of the second set of older adults (n = 6) was less than half the SEM of the first set of older adults (n = 6) for the CSA of FDB, thickness of QP, AbDM, PF prox , PF dist , and FHL long , whereas the SEM was stable (i.e., the learning curve flattened) between the second and the third set of participants. Therefore, for these morphology measures, the data of the first set of older adults was omitted from the reliability analysis. This resulted in the final SEM to be 18 to 50% lower compared to when no data was omitted (Additional file 3). The SEM remained within the critical limits over the three sets of participants for all other morphology measures. Therefore, for these remaining morphology measures, the data of all participants were included in the analysis. The first set of participants (n = 6) from which data was omitted was not statistically different from the remaining participants (n = 12) on the demographics. The groups' mean morphology measures for each of the two occasions are listed in Table 2. Table 3 and Fig. 3A-C show the intra-assessor measurement properties (i.e., ICC, SEM and SDC) for each muscle and morphology measure (i.e., CSA and thickness). The exact p-values for the comparison of the error variances, are listed in Additional file 3. The raw data on which the measurement properties are based are graphically presented in Additional file 4. Table 3 demonstrates that, in older adults, the ICC of intrinsic foot muscle and PF morphology measures ranged from 0.57 (PF dist ) to 0.96 (CSA AbH and FDB). In the older adult group, the SEM did not exceed an absolute value of 1.0 mm for the thickness of any of the foot muscles or the PF. Relative to the average tissue size, in older adults, the SEM was smallest for the thickness of FHL (2.8 and 3.2%) and TA (3.2 and 3.9%) and largest for the CSA of FDL (11.9%), followed by the CSA of QP (9.7%). Considering only the thickness measures in the older adult group, the largest relative SEM was found for PF mid (7.7%) and PF dist (8.0%). When the relative SEM in older adults was compared between the two morphology measures for the same muscle, smaller values were observed for the CSA of both the AbH and FDB (AbH: 5.0% vs. 7.0%; FDB: 5.7% vs. 6.3%).
Comparing the age groups (Table 3, Fig. 3A-C), it was shown that the SEM was significantly greater in older adults for the thickness of AbH, CSA of FDB, PF prox , PF mid and FHL long . The corresponding relative SEM ranged from 3.2% (FHL long ) to 7.7% (PF mid ) in older adults versus 1.4% (FHL long ) to 5.8% (PF mid ) in younger adults. In contrast, the thickness of TA trans was measured with a significantly smaller measurement error and more reliably in older adults compared to younger adults (SEM: 0.8 mm (3.2%) vs. 1.3 mm (5.9%), ICC: 0.94 vs. 0.67).
The comparison between the two ultrasound machines ( Table 3, Fig. 3A-C) revealed a statistically larger SEM for the thickness of AbH and PER long when the mainframe machine was used compared to when the tabletbased machine was used (AbH: 0.4 mm (3.7%) vs. 0.2 mm (1.6%); PER long : 1.0 mm (7.2%) vs. 0.7 mm (5.0%)). For PF dist , a significantly smaller SEM was achieved when the images were obtained with the mainframe machine (0.06 mm (5.0%) vs 0.13 mm (11.0%)).

Discussion
In this study, we assessed the intra-assessor reliability and measurement error for the morphology of selected foot muscles and PF derived from ultrasound images collected by a single operator using a tablet-based machine in older adults. We also compared the measurement error with that obtained in younger adults and examined the influence of the ultrasound machine that was used. The results showed that the morphology of most of the assessed muscles can be reliably assessed with acceptable measurement error in older adults when using a tablet-based device, although measurement errors were smaller for some muscles in younger adults. In general, using a mainframe machine did not improve the repeatability in younger adults. In fact, when the repeatability differed between the machines, the repeatability of muscle morphology measures was superior for the tablet-based machine. The morphology of foot muscles could be assessed in older adults with an error ranging from 2.8 to 11.9%, equating to an SDC of 7.7 to 33.1%. When omitting the CSA of FDL, which showed an exceptionally large error, and selecting the most accurate morphology measure (i.e., CSA or thickness) for each muscle, the SDC was 15.7% at its greatest extent in the older adult group using the table-based device. This means that, on an individual level, a change in foot muscle size beyond 15.7% can be considered a real change [38]. In order to be a meaningful metric to measure a group mean change in muscle morphology, for example in a prospective intervention study, this change should exceed the SDC divided by the square root of the sample size [50]. A group change of this magnitude is realistic as an 8-week foot strengthening intervention in younger adults showed average foot muscle hypertrophy ranging from 5 to 15% [17] and, in general, older adults are expected to have a similar response to strength training [51]. Whether the morphological changes of foot muscles as a response to training in older adults indeed exceed the SDCs, needs to be investigated in future studies. Our range of SDCs corroborates well with the limits of agreement (LoA), a metric comparable with the SDC [52], reported by previous studies where the same muscles were examined in younger populations by operators with 8 years of ultrasound experience using more advanced machines [33,36]. Next to the measurement errors, the  Table 3 Intra-assessor measurement properties for ultrasound morphology of selected foot muscles and plantar fascia indicates the presence of heteroscedasticity. * indicates a significant difference (P < 0.05) across age groups or ultrasound machines reliability of the muscle morphology measures in older adults was predominantly 'excellent' and at least 'good' (ICC: 0.75 to 0.96). This signifies that overall the measurement error is small enough relative to the betweensubject variance [38], enabling us to differentiate between older individuals. Considering direct measures for toe flexor strength, previous research [21] revealed poorer reliability for toe flexor strength measures compared to ultrasound morphology (i.e., an indirect strength measure). In addition, a study in older adults [53] indicated a larger measurement error for toe flexor strength measurements than reported in the current study for foot muscle morphology. Together with the favorable ability of ultrasound to assess individual muscles, and although being an indirect strength measure, this supports its use to study the role of foot muscles in the older adult population. Future studies can use this methodology to investigate the effect of a foot strengthening program on foot muscle morphology in older adults. The latter also meets the requirement for a prospective study to approve the responsiveness of ultrasound to detect foot muscle hypertrophy in older adults.
For AbH and FDB, the cross-sectional area was measured with a slightly smaller error when compared to the thickness of the same muscle in older adults. The human error is assumed to affect the straightforward linear distance between the deeper and superficial epimysium (i.e., thickness) to a lesser degree than the demarcation of a muscle's outline (i.e., CSA) [37]. Apparently, this advantage did not outweigh the larger image capturing variability associated with longitudinal imaging of these muscles with an oval-like cross-section. The superior repeatability for the CSA of AbH and FDB is auspicious as a two-dimensional quantity is better able to cover a non-uniform change in muscle morphology that may occur in response to exercise or muscle disuse [28]. The SDC for AbH's CSA in the current study was substantially smaller than what was found using MRI (28 vs. 46.1 mm 2 ) and comparable for the CSA of FDB (39 vs. 36.4 mm 2 ) [54]. The ability to accurately measure the CSA of AbH and FDB in older adults is further promising as AbH and FDB are, together with QP, the intrinsic foot muscles most closely aligned with the medial longitudinal arch of the foot. Hence, these are key structures in the investigation of foot function [55,56].
In contrast to FDB and AbH, QP is a deeper located muscle and is, therefore, less accessible by ultrasound [57]. Indeed, the results show that QP's CSA was measured with only small precision in older adults, the SDC being 26.8%. This SDC is three times worse than shown in a previous study in younger adults [33]. Apart from the contrasting study populations, the studies differ substantially at the level of experience of the operators (i.e., 10 months vs. 8 years) and the ultrasound machines used (tablet-based vs. mainframe). The post-processing delineating of the QP's fascial borders, was experienced as extremely difficult by the researcher of the current study. Partially, this was due to the oblique orientation of the tissues adjacent to the medial aspect of the QP (e.g., plantar nerves, FDL tendon). This causes the ultrasound beam to reflect away from these tissues [57], resulting in an isoechoic appearance of QP and its surroundings. Together with the poor lateral resolution at this depth, this may have led to inaccuracy in the definition of the muscle's envelope. In contrast to the SEM for the CSA, the SEM for the thickness of QP was reasonably low and presents, therefore, a better alternative to quantify the morphology of QP for operators with similar experience. Three muscle morphology measures (CSA: FDB; thickness: AbH, FHL long ) showed superior repeatability in younger adults compared to older adults. This may be caused by an age-related decline in muscle quality, such as a higher degree of intramuscular adipose tissue [39], which indeed has been observed in the PIFMs [58]. As a consequence, the ultrasound beam scatters and sound is largely absorbed before reaching the deeper epimysial border, preventing it to appear as a bright hyperechoic structure [57]. Further, muscle contractions, aimed to aid the offline muscle delineation, were not always accomplished in the older adults as they tended to fall asleep or the researcher observed difficulties to relax the muscle after contraction. Although the superior repeatability for some muscle morphology quantifications in the younger age group was in line with our expectations, the measurement errors for these morphology measures were extremely low in the younger adults (SEM: 1.4-3.6%) and acceptable in older adults (SEM: 3.7-7.0%). Surprisingly, TA trans showed less measurement error in older adults (SEM: 3.2 vs. 5.9%) which could be explained by its more consistent shape in longitudinal direction compared to in younger adults.
Against our expectations, for the muscles showing dissimilar repeatability across the machines (i.e., thickness AbH and PER long ), the tablet-based device turned out to be advantageous. Apparently, for these muscles, the larger field of view owing to the larger footprint of the probe of the mainframe system and its expected superior image quality due to the use of advanced options were of insufficient benefit. The tabled-based device, instead, is equipped with a smaller probe and lighter cables, minimizing the chance for unintentional transducer manipulation that would be at the cost of the perpendicularity of the image captured and thus repeatability [57]. The similar performance across the machines indicates that for each muscle the image quality of the tablet-based machine was sufficient to delineate the epimysial borders, which has previously been demonstrated for the thickness and CSA of AbH using a laptop-based machine [37]. Likewise, another study observed invariant thickness of hip extensor muscles across two different ultrasound machines [59]. Our findings confirm the assumption from other authors that machines can be used interchangeably in the assessment of muscle morphology [41], except for FHB, which showed a systematic difference across the machines in our study. Hence, this implies that health care professionals are not contingent on the availability of a specific machine to monitor muscle morphological changes over time. It further indicates that measurements for foot muscles morphology are not restricted to well-equipped hospitals or research labs, but can easily take place at the home of aged or diseased populations. This is promising given the current implementation of transitioning care from hospital to peoples own home.
The relative SDCs for PF, a passive structure associated with foot posture [19], increased the more distal it was assessed, ranging from 16.5 to 22.2% in older adults. In younger adults the SDC was significantly smaller for the proximal and middle portion (PF prox : 9.6%; PF mid : 16.0%), resembling previously reported inter-assessor repeatability [34]. Callus may have interfered with image quality [43] in older adults more than in younger as this skin condition is more prevalent in that population [40] and especially manifested in the region of PF prox and PF mid . Changing to the mainframe machine improved the repeatability for the distal portion of PF to an SDC of 13.9% and is, as such, also in agreement with the literature [34]. This can be explained by the enhanced visibility of deeper tissues (e.g., 2nd metatarsal bone) through the advanced penetration of sound by the mainframe machine, aiding the correct probe position [30]. Whenever available, a mainframe machine is, therefore, recommended to measure the thickness of PF in the feet of older adults when callus is present.
The current study is the first to investigate the reliability and measurement error for a large selection of both intrinsic and extrinsic foot muscles, in addition to PF, in older adults. Accounting for the operator's proficiency and excluding measurements accordingly ensured a valid comparison of the measurement properties between older and younger adults. Nevertheless, the study was subjected to several limitations that need to be considered. Most importantly, this intra-assessor reliability study does not provide information on the validity of the ultrasound morphology measures. Although the operator was a novice scanner at the start of the training, she received intensive specific training in scanning the foot muscles according to a fixed protocol. In addition, detailed anatomical knowledge of the scanner and the triangulation during the training program further contributed to valid ultrasound measurements. Nevertheless, because of the uncertain accuracy of the morphology measures, a systematic error cannot be ruled out. A systematic error would influence the mean morphology measure itself and, as a consequence, the relative measurement properties (i.e., relative SEM and SDC), but not the absolute measurement properties. However, it is not expected that this has occurred substantially. In addition, the repeated measurements were sometimes only one or two days apart. Therefore, the image capturing at the second occasion may have been subjected to recall bias. Nevertheless, this bias is expected to be marginal, considering the straightforward scan protocol. Further, the measurement properties were estimated from samples consisting of 10 to 18 participants, believed to be relatively small [60]. However, the sample size of the older adult group exceeded that of other ultrasound reliability studies with this amount of tissues [33,34]. Another limitation is that the comparison between machines only pertains to younger adults, as we decided to extend the protocol after the data collection was completed in older adults. Whilst it is unclear how a change of machine affects the repeatability in older adults, the measurement properties for the tablet-based device were already promising for its future use. The results are, however, limited to a subset of tissue morphology measures determined from pilot testing with the tablet-based device. Hence, the applicability of the mainframe machine in the assessment of, in particular, the CSA of muscles such as QP, FHB and AbDM remains elusive. Lastly, only a single operator was involved in this study which means that the measurement properties cannot simply be generalized to any other operator [60]. This is because not only is the quality and consistency of the acquired images determined by the ultrasound experience and the background of the operator, but also does the post-processing delineation or identification of muscle borders rely heavily on the anatomical knowledge of the rater [61]. Nevertheless, a 10month period of intense specific training, during which a specific foot muscle scan protocol was used, appeared to be sufficient to obtain good to excellent reliability and measurement error, but only when a single operator performs the measurements.

Conclusion
The results of this intra-assessor reliability study showed that a tablet-based ultrasound machine can be reliably used to assess the morphology of selected foot muscles in older adults, with the exception of plantar fascia thickness. This supports the use of this instrument in future studies to gain understanding of the role of these foot muscles in foot function. Although the measurement errors were smaller in younger adults for some muscle morphology measures, they seem adequate in older adults to detect hypertrophy as a response to training on a group level. The use of a tablet-based device seems to be a good alternative to a mainframe system, but how its superior repeatability applies to older adults needs to be further investigated. Nevertheless, our findings advocate the use of ultrasound in future studies or in clinical practice when foot muscle morphology is the outcome of interest, without being restricted to expensive ultrasound machines that often have limited access. In addition, the use of a tablet-based device enables the researcher or clinician to perform the ultrasound measurements at any location, even at the home of the older adults.