Bootstrap exploration of the duration of surface electromyography sampling in relation to the precision of exposure estimation

Scand J Work Environ Health 2007;33(5):358–367.
Objectives This study examined the effect of sampling duration, in units of work cycles, on the precision of estimates of exposure to forceful exertion obtained with surface electromyography (EMG).
Methods Recordings of the activity of the flexor digitorum superficialis, extensor digitorum, and upper trapezius muscles over 30 consecutive work cycles were obtained for a random sample of 25 manufacturing workers, each of whom was performing a unique production task representing a portion of the whole job. The mean root-mean-square amplitude and the 10th, 50th, and 90th percentiles of the distribution function of the amplitude probability were calculated for each cycle. Bootstrap analyses were used to examine the precision of the summary measures as the sampling duration increased incrementally from 1 to 30 work cycles. Precision was estimated by calculating the coefficient of variation (CV) of the bootstrap distributions at each sampling duration increment.
Results The average minimum sampling duration for a bootstrap distribution CV of 15% ranged from 2.0 (SD 1.5) cycles to 7.5 (SD 9.6) cycles, depending on muscle and summary measure. For a 5% CV, the average minimum sampling duration ranged from 11.9 (SD 9.0) to 20.9 (SD 10.5) cycles.
Conclusions The results suggest that sampling as few as three work cycles was sufficient to obtain a bootstrap distribution CV of 15% for some of the muscles and summary measures examined in this study. While limited to machine-paced, cyclic manufacturing work, these results will assist the development of exposure assessment strategies in future epidemiologic studies of physical risk factors and musculoskeletal disorders.

Musculoskeletal disorders of the upper extremities continue to affect a substantial proportion of manufacturing industry workers. Several physical risk factors commonly found in the work environment, such as repetitive motion, awkward postures, and forceful exertions, have been found to be consistently positively associated with musculoskeletal disorders of the upper extremities (1,2). While the strongest associations appear to exist when risk factors are present in combination (3,4), epidemiologic studies have reported an independent, positive association between forceful exertions and specific musculoskeletal disorders of the upper extremities, such as carpal tunnel syndrome and epicondylitis (5-8). Quantitative estimation of the association between exposure to specific levels of forceful exertions and musculoskeletal disorders of the upper extremities is sparse, however, due partly to the use of imprecise self-report or observational exposure assessment techniques (9-12).
When collected for an adequate duration, direct quantitative measurements of muscle activity with surface electromyography (EMG) can produce precise estimates of exposure to forceful exertions (10,13). Surface EMG has most commonly been used to describe muscle activity patterns in laboratory and small-scale field settings to characterize exposure to forceful exertion or to compare exposure levels pre- and postintervention. The upper trapezius is among the most commonly studied muscles with respect to forceful exertions of the shoulder (14), while the flexor digitorum superficialis and extensor digitorum are commonly studied with respect to the distal upper extremities (15,16).
For several reasons, few large-scale epidemiologic studies have used EMG to estimate exposure to forceful exertion (4,17-20). The cost of the instrumentation can be prohibitive, and equipment operation and maintenance may require specialized training. In addition, conventional EMG systems are either not portable or use limited-range telemetry, requiring the subject to be near the data collection and storage location. Newer portable systems are available (21-23), but they have been used in only a limited number of field studies.
In addition to operational limitations, the sampling duration, in units of work cycles, required to obtain estimates of exposure to forceful exertion of adequate precision is not well characterized and has been selected arbitrarily in previous investigations. Reported EMG sampling durations in field studies of cyclic manufacturing work range from about 20 minutes per task (24) to more than 60 minutes per task (25). If multiple tasks are sampled, then the total EMG sampling duration can exceed several hours per study participant. Prolonged sampling periods may result in unacceptable levels of interference with workplace production, especially in epidemiologic studies capturing exposure information for multiple physical risk factors and for which hundreds of persons may be needed for adequate statistical power.
Previous studies have investigated the reliability of surface EMG summary measurements repeated during the same experimental day or measurements repeated on different days (26,27). The precision of EMG measurements has been examined in several studies in the form of the within-person components of exposure variance generated from modeling techniques using a random-effects analysis of variance (ANOVA) (25,28-30). In general, the precision of an EMG measurement improves as the sampling duration increases (29). Traditionally, sampling duration is specified in terms of time and is held constant for each study participant. However, for repetitive tasks, a fixed sampling duration will result in exposure estimates based on different numbers of work cycles when study participants are drawn from a population with a wide range of cycle times. It is unknown whether, and by how much, the precision of EMG summary measures varies when computed over a range of work cycles.
A better characterization of the relationship between sampling duration and exposure estimate precision would allow researchers to optimize sampling durations in light of their resources, the number of study participants available, workplace constraints, and anticipated effect size. To address this issue in our study, we evaluated the effect of varying the sampling duration, in units of work cycles, on the precision of exposure estimates derived from EMG data.

Study population
We report the results of an analysis of EMG data collected during a prospective cohort study designed to examine the association between physical risk factors and musculoskeletal disorders of the upper extremities among household appliance manufacturing workers. All of the participants were 18 years of age or older and were employed in production jobs at a single facility. Altogether 232 persons were enrolled in the cohort at the time of this analysis. Of these, 198 performed cyclic production jobs and were eligible for inclusion in our study; the remaining 34 performed noncyclic work and were excluded. A random sample of 25 eligible cohort members was selected for participation in this study [12 female, 13 male, average age 47.7 (SD 7.2) years]. The median number of tasks comprising the whole job for each participant was three (range 1-6 tasks). For the participants performing multiple tasks, we randomly selected one for inclusion in this study. Thus 25 unique tasks were obtained from among the 25 participants, with an average cycle time of 26.6 (range 14.4-49.9) seconds.

Source electromyographic data
The EMG data collected for each participant consisted of a continuous recording of the activity of the dominant-side upper trapezius, the flexor digitorum superficialis (flexors), and the extensor digitorum communis (extensors) muscles over 30 consecutive work cycles. Bipolar, silver-silver chloride surface electrodes, with an interelectrode distance of 20 millimeters and a preamplifier gain of 30, were used for all of the recordings. Standard placement procedures were used to position the electrodes over the three muscle groups (31).
The EMG data were collected with a portable data logger system. Within the data logger unit, the raw, analog EMG signals were bandpass-filtered between 10 and 4000 hertz, further amplified with a gain of 2000, root-mean-square (RMS) processed in real time with a 100-millisecond time constant, and sampled at 100 hertz with a 12-bit analog-to-digital converter. The digitized signals were then streamed to compact flash memory for later analysis in the laboratory.
The EMG data were normalized with submaximal reference voluntary exertions (RVE) for each muscle. A rapid normalization procedure was used (32), such that the reference exertions for all three muscle groups were performed simultaneously. The participants grasped a hand dynamometer (Commander Grip Track, JTECH Medical, Salt Lake City, UT, USA) with a power grip while standing with the dominant arm abducted in the scapular plane, the elbow fully extended and forearm pronated. A 2-kilogram weight was then placed over the dorsum of the hand to elicit a contractile response in both the extensors and upper trapezius. At the same time, the participants maintained a grip force of 88.94 newtons for 15 seconds.
The average RMS amplitude, in millivolts, of the middle 10 seconds of the 15-second reference contractions was calculated for each muscle group. Three reference contractions were performed for each person, and measures of resting muscle activity were also obtained. For each muscle, the EMG voltage values sampled during the worktask (EMG_task) were expressed in terms of the percentage of the RVE (%RVE) using equation 1, where EMG_RVE is the average of the three reference contraction voltages and EMG_rest is the baseline voltage:
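One form of equation 1 that is consistent with these definitions is sketched below. It assumes that the resting baseline is subtracted linearly from both the task and the reference voltages; that subtraction scheme is an assumption rather than a detail confirmed by the text.

```latex
\%\mathrm{RVE} = \frac{\mathrm{EMG}_{\mathrm{task}} - \mathrm{EMG}_{\mathrm{rest}}}{\mathrm{EMG}_{\mathrm{RVE}} - \mathrm{EMG}_{\mathrm{rest}}} \times 100 \qquad (1)
```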

Data analysis procedures
After the normalization, all of the EMG signals from the worktasks were analyzed with a custom signal-processing package written in LabVIEW 7.1 (National Instruments, Austin, TX, USA) (33). The continuous signals from each participant were parsed into 30 discrete work cycles with the aid of digital video recordings obtained at the time of the measurement (figure 1). For each cycle and muscle group, the mean RMS amplitude and the 10th, 50th, and 90th percentile values of the amplitude probability distribution function (APDF) were calculated (in %RVE). The effect of varying the number of work cycles sampled on the precision of these four EMG summary measures was then evaluated with a bootstrapping procedure.
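The per-cycle summary step can be illustrated with a minimal Python sketch. It is not the LabVIEW package used in the study: the function names, the `boundaries` list of sample indices (standing in for the video-derived cycle end points), and the linear-interpolation percentile rule are all assumptions made for illustration.

```python
from statistics import mean


def percentile(values, p):
    """Percentile (p in 0..100) of a non-empty sequence, by linear interpolation."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    rank = (p / 100) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])


def summarize_cycles(signal, boundaries):
    """Return one dict of the four summary measures per work cycle.

    signal     -- normalized EMG samples (%RVE) for one muscle
    boundaries -- hypothetical sample indices: the start of each cycle,
                  followed by the end of the last cycle
    """
    summaries = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        cycle = signal[start:end]
        summaries.append({
            "mean": mean(cycle),           # mean RMS amplitude of the cycle
            "p10": percentile(cycle, 10),  # 10th percentile of the APDF
            "p50": percentile(cycle, 50),  # 50th percentile of the APDF
            "p90": percentile(cycle, 90),  # 90th percentile of the APDF
        })
    return summaries
```

With 30 cycle boundaries per recording, each key yields one bootstrap parent sample of 30 values per muscle.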

Statistical methods
Bootstrapping is a statistical technique whereby the precision of a parameter estimate, such as mean RMS amplitude, can be evaluated empirically by simulating the process of sampling population data using observed sample data (34). The chief advantage of the bootstrap procedure is that assumptions about the distribution of the population data (eg, normality) are not required in order to make inferences about the parameter of interest (35). The bootstrap procedure begins with an observed parent data set of sample size N, from which a population parameter θ is to be estimated with the statistic θ̂. A resample of size n is randomly drawn with replacement from the original N, such that each value has a probability 1/N of being chosen for inclusion in the resample each time a value is selected from the original sample of size N. The resampling process is repeated a large number of times (eg, 1000) in order to simulate the process of repeated sampling from the population (36). Then θ̂ is recalculated for each iteration based on the n resampled values. The distribution of the bootstrap replicates of θ̂, called the bootstrap distribution, serves as an estimate of the sampling distribution of θ̂. If, for example, 1000 resampling iterations are executed, then 1000 estimates of θ would be produced. The precision of θ̂ can be estimated by either constructing percentile ranges (36,37) or calculating the coefficient of variation (CV) of the 1000 estimates (38). The latter method was chosen for this study.
For each of the 25 participants, 12 bootstrap parent samples of 30 observations each were obtained from the 30 work cycles of EMG data (three muscle groups by four EMG summary measures). Families of bootstrap distribution CV values were obtained from each parent sample by incrementally increasing the resample size from 1 cycle to 30 cycles and generating 1000 estimates of the EMG summary measure at each resample size (figure 2).
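The resampling scheme described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the estimate for each resample is assumed to be the average of the resampled per-cycle values, which the text does not state explicitly.

```python
import random
from statistics import mean, stdev


def bootstrap_cv(parent, resample_size, iterations=1000, rng=None):
    """CV (%) of the bootstrap distribution for one resample size.

    parent        -- per-cycle values of one summary measure (eg, 30 cycles)
    resample_size -- number of cycles drawn with replacement per iteration
    """
    rng = rng or random.Random()
    # Each draw picks any parent value with probability 1/N (with replacement).
    # Assumption: the estimate for a resample is the mean of its values.
    estimates = [
        mean(rng.choices(parent, k=resample_size))
        for _ in range(iterations)
    ]
    return 100 * stdev(estimates) / mean(estimates)


def cv_family(parent, iterations=1000, seed=0):
    """Bootstrap CV at every resample size from 1 to len(parent)."""
    rng = random.Random(seed)
    return {
        n: bootstrap_cv(parent, n, iterations, rng)
        for n in range(1, len(parent) + 1)
    }
```

Calling `cv_family` on one parent sample of 30 per-cycle values returns a dictionary keyed by resample size, corresponding to one family of CV values.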
A fundamental assumption of the bootstrap is that the data values within the parent sample are statistically independent (36). Since the parent samples in this study consisted of time series data, there was concern that EMG summary measures from successive cycles may not satisfy the independence criterion. Therefore, we conducted standard autocorrelation analyses on each parent sample as a test for independence prior to performing the bootstrap procedure (39,40). Parent samples exhibiting significant autocorrelation at lags of one and two cycles were excluded from further analysis.
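A minimal sketch of such an independence screen follows. The text does not specify the significance test, so this example assumes the common large-sample bound of about 1.96/√N for the sample autocorrelation of an independent series, and it reads the exclusion rule as requiring both lags to exceed that bound. The function names are hypothetical.

```python
from statistics import mean


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    centre = mean(series)
    centered = [x - centre for x in series]
    denom = sum(c * c for c in centered)
    if denom == 0:
        return 0.0  # a constant series carries no correlation information
    num = sum(centered[i] * centered[i + lag]
              for i in range(len(series) - lag))
    return num / denom


def is_autocorrelated(series, lags=(1, 2), z=1.96):
    """True if the autocorrelation exceeds the bound at every listed lag.

    Assumption: approximate 95% bound of z / sqrt(N) for an independent
    series; requiring all lags to exceed it is one reading of the rule.
    """
    bound = z / len(series) ** 0.5
    return all(abs(autocorrelation(series, k)) > bound for k in lags)
```

A parent sample with a steady upward drift across the 30 cycles, as might occur with fatigue or equipment drift, would be flagged by this check.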
The effect of sampling duration on the precision of an EMG summary measure was estimated by calculating the CV for each bootstrap distribution of 1000 estimates at each resample size for each muscle group. For each participant, muscle group, and EMG summary measure, the minimum resample size needed to obtain bootstrap distribution CV values of 15%, 10%, and 5% was determined. A value of 30 cycles was assigned in cases in which a particular CV level was not achieved. A one-way repeated-measures analysis of variance (ANOVA) was used to explore the differences between the muscle groups in the average minimum resample size needed to achieve a desired precision level. The Greenhouse-Geisser correction was used to adjust the degrees-of-freedom of the models to compensate for sphericity violations (41), and the Tukey procedure was used for posthoc pairwise comparisons between the muscle groups. Separate analyses were conducted at each precision level for each EMG summary measure.
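The threshold search, including the assignment of 30 cycles when a target is never reached, can be sketched as follows. The function name and the `cv_by_size` dictionary are hypothetical, and the sketch assumes that the minimum resample size is the first size at which the CV reaches the target, without requiring it to stay below the target afterwards.

```python
def minimum_resample_size(cv_by_size, target_cv, ceiling=30):
    """Smallest resample size whose bootstrap CV (%) is at or below target_cv.

    cv_by_size -- dict mapping resample size (cycles) to bootstrap CV (%)
    ceiling    -- value assigned when the target is never achieved
    """
    for n in sorted(cv_by_size):
        if cv_by_size[n] <= target_cv:
            return n
    return ceiling


def minimum_sizes(cv_by_size, targets=(15, 10, 5)):
    """Minimum resample size at each of the three precision levels."""
    return {t: minimum_resample_size(cv_by_size, t) for t in targets}
```

Because unattained targets are recorded as 30 cycles, averages that include such cases understate the true requirement.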
The total sampling duration of the 30 parent work cycles, in terms of time, varied among the study participants due to the cycle time variation between the selected tasks. To explore the possibility that cycle time differences between the participants may have affected the precision of the bootstrap distributions, we calculated the Pearson correlation between the average cycle time and the bootstrap distribution CV associated with a resample size of 10 cycles. Altogether, we conducted 12 separate analyses, one for each combination of the EMG parameters and muscle groups. The data eliminated from the bootstrap analyses on the basis of the autocorrelation results were also excluded from the Pearson correlation analyses.
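For reference, the Pearson correlation used in this step can be computed as in the short sketch below. The function name is hypothetical; in use, the two inputs would be the participants' average cycle times and their bootstrap distribution CV values at a resample size of 10 cycles.

```python
def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5
```

A positive value indicates that the CV rises, and precision falls, as the average cycle time increases.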

Results
Due to the presence of autocorrelation, the number of participants within each bootstrap analysis set was reduced slightly. For the mean RMS amplitude and 50th percentile APDF summary measures, the number of participants was reduced from 25 to 23. For the 90th and 10th percentile APDF measures, the number of participants was reduced from 25 to 22. Across all of the summary measures and muscle groups, 10% of the overall data set was excluded from the bootstrap analyses due to the presence of autocorrelation at lags of one and two cycles.
For all of the participants remaining in the analyses after the autocorrelation procedures, the precision of the bootstrap distributions for each summary measure and muscle group increased as the resample size increased (an example from a typical participant is displayed in figure 3). Accordingly, for all of the EMG summary measures and muscle groups, the average bootstrap distribution CV, the parameter selected as our estimate of precision, decreased as the resample size increased (figure 4).
Shown in table 1 are the results of both the bootstrap analyses and the repeated-measures ANOVA conducted at the three precision levels for each EMG summary measure. The average minimum resample size increased with increasing levels of the bootstrap distribution precision (ie, a reduction in CV) for all of the summary measures and muscle groups. At CV levels of 15% and 10%, the 90th percentile of the APDF for the flexors required the largest average minimum resample sizes [7.5 (SD 9.6) cycles for a CV of 15% and 11.5 (SD 10.2) cycles for a CV of 10%]. For the 5% CV level, the largest average minimum resample size was 20.9 (SD 10.5) cycles. The smallest average minimum resample sizes at each of the three precision levels were all found for the extensors [2.0 (SD 1.5) cycles for the 90th percentile of the APDF and a CV of 15%, 3.6 (SD 3.0) cycles for the mean RMS amplitude and a CV of 10%, and 11.9 (SD 9.0) cycles for the mean RMS amplitude and a CV of 5%].
The results of the repeated measures ANOVA models were mixed. For the mean RMS amplitude and the 50th percentile of the APDF, no significant differences were found in the average minimum resample size between the muscle groups at any of the three precision levels. However, significant differences between the muscle groups were found for the 90th and 10th percentiles of the APDF. For the 90th percentile APDF summary measure, the Tukey posthoc pairwise comparisons indicated no difference in the average minimum resample size between the extensors and upper trapezius muscle groups at any of the three CV levels. The flexors, however, needed a larger average minimum resample size than the extensors to obtain each of the three CV levels and a larger resample size than the upper trapezius for the 15% and 10% CV levels. For the 10th percentile APDF metric, the upper trapezius muscle required a larger resample size than the flexors at each of the CV levels.
The correlation between the average cycle time and the bootstrap distribution CV at a resample size of 10 cycles was statistically significant only for the 90th percentile APDF of the flexor muscle (r=0.48, P=0.01). The direction of the correlation was positive and therefore indicated that, as the average cycle time increased, the bootstrap distribution became less precise. The correlation between the average cycle time and the estimated precision associated with the mean RMS amplitude of the flexors was also positive, although it was not statistically significant. The correlations for the remaining 10 analyses were both negative and not statistically significant.

Discussion
Several investigators have sought to quantify the contributions of the different sources of variability affecting estimates of exposure to physical risk factors obtained with both observational and direct quantitative measures, including surface EMG (29,42-47). Understanding the nature of exposure variance is essential when exposure assessment strategies are being developed, especially in field studies with large numbers of participants performing multiple and varied tasks. Still, formulations of the magnitude of different exposure variance components are dependent at least partially on an adequate precision of the individual measurements on which they are based. While previous studies have used bootstrapping to explore several exposure assessment issues (28,35,37,38,48), this is the first study using the technique to examine the effect of sampling duration on the precision of an individual EMG measurement.
Overall exposure variability in occupational epidemiology has been broadly partitioned into between-participant and within-participant components (49). If an individual exposure assessment strategy is used, the components of variance can help estimate the degree of attenuation of risk estimates resulting from measurement error (50-53). Attenuation can be substantial if the within-participant variability is large when compared with the between-participant variability. Our study, however, cannot be used to estimate attenuation, since within-participant variability also includes a between-day component (28). Similarly, estimates of the within-participant (within-day) variability could not be generated since, for each participant, we considered the precision of an EMG measurement obtained for only one of several tasks comprising the entire job. Capturing 30 cycles of EMG across multiple tasks and multiple days was beyond the scope of our study. In addition, the selection of participants and tasks was systematic and not representative of the actual exposure distribution within the facility. As a result, formulations of between-participant variance using the current data are also not appropriate.
Aside from showing the general expected finding of increased precision with increasing resample size, figure 3 also illustrates the fact that the bootstrap distributions were asymmetric, especially at low resample sizes. Although figure 3 indicates positive asymmetry, negative asymmetry was observed for some persons. The nonsymmetric bootstrap distributions at low resample sizes suggest that analytical procedures for estimating precision which assume normality, such as the classical confidence interval for the mean, are likely to be error prone.
Few studies have computed EMG summary measures of occupational tasks on a cycle-by-cycle basis, regarding each cycle as a distinct measurement period. In a controlled laboratory experiment of a 4-second [...] Moëller et al (25) reported between-cycle CV levels of 11% to 19% for the trapezius and 4% to 14% for the extensors in a small field study of three electronics assembly tasks with cycle times ranging from 3 to 4 minutes. In our study, the between-cycle CV levels in the bootstrap parent samples obtained with random-effects ANOVA models ranged from 19% to 43%, depending on the muscle and summary measure (analyses not shown). The between-cycle CV levels were higher in our data set for two reasons. First, unlike Mathiassen et al (29), we collected our data in an uncontrolled field setting. Second, unlike Moëller et al (25), we did not exclude work cycles with unexpected periods of rest (eg, a short delay in production) from the analyses.
Table 1. Means and standard deviations of the minimum resample size needed to obtain a bootstrap distribution coefficient of variation (CV) of 15%, 10%, and 5%, by electromyographic summary measure and muscle group. (RMS = root-mean-square, APDF = amplitude probability distribution function)
The assignment of 30 cycles as the minimum resample size necessary to achieve a specific level of precision occurred for less than 1% of the participants at a CV of 15% and less than 5% of the participants at a CV of 10%. However, the parent samples from nearly 30% of the participants failed to achieve a CV of 5% within a resample size of 30 cycles. While it is unlikely that the assignment of 30 cycles for the 15% and 10% CV levels affected the results, the average minimum resample size needed to achieve a bootstrap distribution CV of 5% is probably underestimated. Therefore, caution is needed before the results from the 5% CV level are used as a guideline for determining the number of cycles to sample.
The autocorrelation analyses were critical to the development of this work. Significant autocorrelation, especially at small lag values such as one cycle, is a possible indicator of phenomena such as the onset of localized muscle fatigue, equipment drift, or nonrandom changes in the work environment occurring during the original data collection period. While autocorrelation appeared to be an issue for a small percentage of the data set, excluding the autocorrelated data did not affect the overall results. If autocorrelation were more pervasive within the bootstrap parent samples, the EMG information contained within one work cycle could not be considered statistically independent of the EMG information contained within other work cycles.
The independence of work cycles in terms of EMG information also had an operational benefit regarding the bootstrap procedure. In field EMG data collection situations, random work cycles are typically not recorded. Rather, as was the case with the source data used in this study, work cycles are sampled as a consecutive sequence. Ideally, the bootstrap procedure would have been performed by randomly selecting blocks of consecutive work cycles, rather than by using individual cycles, during the resampling process to better reproduce the reality of field data collection. However, selecting blocks of consecutive work cycles would have reduced the size of the parent sample available at each resample size. To maintain a bootstrap parent sample of 30 observations with a resample size of 30 consecutive work cycles, EMG data for 59 work cycles would have been required. Therefore, establishing the independence of the individual cycles within each parent sample of 30 cycles allowed for a maximum analysis of the available data.
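The figure of 59 work cycles follows from counting blocks, assuming that a block may start at any recorded cycle and that blocks may overlap. The symbols M (number of recorded cycles), L (block length in cycles), and B (number of distinct blocks) are introduced here for illustration only:

```latex
B = M - L + 1, \qquad B = 30,\ L = 30 \;\Rightarrow\; M = B + L - 1 = 30 + 30 - 1 = 59
```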
The average cycle time of the tasks performed by the 25 study participants was consistent with assembly tasks in both field (54) and laboratory (30) studies. A reasonable concern was a possible effect of cycle time on the resample size required to obtain the different levels of precision, since EMG summary measures computed over short time periods are more sensitive to transient changes in muscle activity than measures computed over long time periods. As a consequence, a low level of precision (high CV) in the bootstrapped distributions of exposure estimates may be observed. However, the correlations between the average cycle time and the estimated precision revealed a modest effect only for the 90th percentile APDF of the flexors. The positive direction of the correlation was somewhat surprising; however, the decreased precision with increasing average cycle time may be a consequence of increased between-cycle exposure variability. Longer cycle times may give workers more opportunity to vary motion and effort patterns, so that each cycle has a unique exposure profile. In general, differences in the average cycle times of the tasks did not meaningfully influence the results of the bootstrap analyses. Thus precision of the EMG summary measures appeared to be more strongly related to the number of work cycles sampled than to the actual time duration of the sampling period.
The results of the repeated-measures ANOVA models imply that different sampling requirements are needed for different muscle groups, depending on the summary measure. Obviously, when EMG is carried out with multiple muscles, a sufficient number of cycles should be sampled to ensure the desired level of precision for the muscle group with the greatest degree of exposure variability. As figure 5 shows, a sampling duration based on a specific number of work cycles results in exposure estimates that are more precise for some persons than for others. However, in epidemiologic studies involving large numbers of participants performing multiple and varied tasks, such as the longitudinal study for which our data were collected, adjusting the sampling duration based on a priori knowledge of each person's EMG profile is not possible. Therefore, applying the bootstrap procedures outlined by us to a pilot sample of people with varied exposure can serve to guide the overall exposure assessment effort. If, for example, sampling seven work cycles per task was adequate, on the average, to achieve a desired precision level in the pilot data, then researchers would have a target minimum requirement to apply to the entire study population.
Mean RMS amplitude and the APDF are common EMG summary measures found in the ergonomics literature (55). In terms of exposure to forceful exertion, mean RMS amplitude is strictly an estimate of average intensity, while the APDF presents the probabilities associated with different intensity levels over the duration of the recording. Other EMG analysis techniques, such as exposure variation analysis (56) and gap analysis (57), provide insight into different aspects of overall exposure. Gap analysis, for instance, quantifies the frequency and duration of periods of muscular rest. Therefore, applying the methods used by us to additional EMG summary measures may provide different results.
Cyclic manufacturing work is an ideal scenario for the use of the bootstrap procedure described in this study. The video recordings obtained at the time of the EMG measurement allowed for a straightforward demarcation of the work-cycle end points. However, the EMG data in this study represent a subsample obtained from a larger population of manufacturing workers employed within a single facility. Thus the results may not be applicable to repetitive work with different exposure and cycle time features. Highly variable noncyclic work is a characteristic of many industries with high rates for upper-extremity musculoskeletal disorders, such as construction and agriculture (23,48,58,59). In such cases, work is not machine-paced and exposure to physical risk factors has little or no periodicity.
Adapting the bootstrap procedure to study the effect of sampling duration on exposure estimate precision for noncyclic work may prove problematic. First, rather than originating from well-defined work cycles, the bootstrap parent samples would need to be based on periods of EMG activity of equal lengths of time (eg, 1-minute segments). In addition, since noncyclic work may be composed of long periods in tasks with differing mean exposure levels, autocorrelation of the EMG information is more likely to be an issue than in cyclic work (60). In this case, whole-day EMG recordings may be the most viable solution for obtaining precise exposure estimates. However, even EMG summary measures obtained from whole-day recordings may not accurately capture aggregate exposure since the nature of the work conditions may change on a day-to-day basis.
In conclusion, the asymmetry about the mean of the bootstrap distributions at small resample sizes suggests that bootstrapping, which does not require normally distributed data, was an appropriate strategy for exploring the effect of the number of work cycles sampled on exposure estimate precision in this study. Autocorrelation, while present to a small extent, did not invalidate the bootstrap assumption of independence of the EMG information between adjacent work cycles for most of the study participants. Depending on the desired precision level, the difference in sampling requirements between the least and most variable muscle group ranged from three to eight work cycles. Intuitively, when EMG is carried out for multiple muscles, the number of work cycles to be sampled simultaneously should be specified according to the muscle group with the greatest degree of expected between-cycle variability. For some of the summary measures and muscle groups examined in this study, sampling as few as three work cycles was sufficient to achieve a 15% coefficient of variation of the empirically derived sampling distribution of exposure estimates.
Support was provided by NIOSH grant RO1/OH007945-01A1. The contents are solely the responsibility of the authors and do not necessarily represent the official views of the National Institute for Occupational Safety and Health. The authors thank the management, production supervisors, and the workers at the study facility for their gracious assistance with this project.