Validity of a self-completed questionnaire measuring the physical demands of work

Validity of a self-completed questionnaire measuring the physical demands of work. Scand J Work Environ Health 1998;24(5):376-385.
Objectives This study determined the accuracy of workers in quantifying occupational physical demands on a self-administered questionnaire.
Methods First, a self-administered questionnaire on work postures, manual materials-handling, and repetitive upper-limb movements was validated using direct simultaneous observations for 123 randomly selected employees from 6 occupational settings. Second, weight estimation accuracy was assessed on visual analogue scales for 6 manual materials-handling activities using 20 randomly selected employees from 1 occupational setting.
Results At a dichotomous level (ever-never), the accuracy of most of the self-reported physical demands was good (sensitivity 60-100%; specificity 56-100%). A more detailed analysis of the dimensions studied (frequency, duration, and amplitude) also showed that the accuracy of the self-reported estimates was satisfactory. Full agreement between the estimated and observed frequency was >60% for most of the manual materials-handling activities. In addition, the average difference between the estimated and observed duration of the physical demands was found to be small. Finally, the average difference between the self-reported and actual weights of various loads was found to be modest.
Conclusions The self-reported questionnaire used in this study would provide a useful instrument for estimating occupational physical demands and the frequency, duration, and amplitude of these demands in future epidemiologic studies associated with musculoskeletal pain.

The high prevalence of musculoskeletal disorders in physically demanding occupations is a well-documented feature of both cross-sectional surveys (1-3) and cohort studies (4,5). Physical occupational exposures that might be associated with such symptoms have typically been classified indirectly using job titles (6-8). However, it is recognized that job title alone will poorly define a person's occupational exposures (9-11), and it does not allow conclusions to be drawn about the quality and quantity of physical work load (12). Before the contribution of physical exposures in the workplace to the occurrence of musculoskeletal disorders can be determined, detailed and accurate descriptions of such exposures are needed (13). Such descriptions require that physical demands (eg, posture, manual materials-handling, and tasks involving repetitive limb movements) be measured in the workplace on an individual basis.
The most widely used method of measuring occupational physical demands on an individual basis is the self-administered questionnaire (14), which offers benefits over alternative methods in terms of cost, versatility, and capacity (use in large populations) (15). However, an estimation of physical demands by self-reports has been criticized as being inaccurate, especially for the more complex load-bearing activities carried out by employees (16). An accurate estimation of such demands is, however, essential in determining the relationship with musculoskeletal disorders in the workplace (17). It is therefore important to evaluate the performance of a questionnaire in obtaining valid information about physical demands before an epidemiologic study based on self-reports on a questionnaire is conducted (16,18). Despite this situation, there have been relatively few studies published concerning the validity of quantitative data collected by self-administered questionnaires (19).
This study determines how accurately workers quantify the physical demands of their work in a self-administered questionnaire and validates the questionnaire across a variety of occupational groups carrying out a diverse range of manual tasks. It is the first phase of an epidemiologic survey investigating how the physical demands of work are associated with shoulder pain.

Design and subjects
The study comprised two investigations, both cross-sectional in design.

Subjects
Part 1. Employees from 6 workplaces in south Manchester formed the population from which the subjects were selected for part 1 of the study. The subjects comprised 140 full-time employees randomly selected to represent between 5% and 10% of each occupational setting. The occupational groups included cashiers and shelf stackers (supermarket, N=40), cashiers and shop assistants (department store, N=20), production-line workers (packaging factory, N=20), mail sorters (post office, N=20), nurses (hospital, N=20), security staff and baggage handlers (airport, N=20).
The subjects were randomly assigned 1 hour of their workshift for which observations were made of the physical work being carried out. After the period of observation, the subjects completed a self-administered questionnaire concerning the same hour of physical work.
Part 2. An additional 10% random sample of cashiers and shop assistants was selected from 1 occupational setting (department store, N=20) to form the subjects for part 2 of this investigation.
The subjects estimated the weights of 5 items of known weight involved in 6 manual materials-handling operations.

Self-administered questionnaire (part 1)
A questionnaire was developed that included 8 items on manual materials-handling, 4 items on work postures, and 2 items on repetitive movements of the upper limbs (see the appendix). The items were selected because they had previously been identified as risk factors for shoulder pain. For each of the manual materials-handling operations, the subjects were asked to estimate the frequency (on a 4-point scale, after a preliminary analysis revealed that employees had difficulty estimating frequency as an absolute value), weight (on a visual analogue scale, not analyzed in the current study), and duration (recorded in minutes). For the work postures and repetitive movements, the subjects were asked to report only frequency and duration. The reference period for the questionnaire was 1 specified recent workhour.

Observations
The observations were carried out by 2 researchers, an ergonomist and a physiotherapist, who discussed and agreed upon the definitions of the observations being made (eg, "work above shoulder level" was defined as "a task being carried out with arm(s) raised above a 90° angle"). The researchers then practiced the observation method at the research institution by observing volunteers carrying out each physical demand.
Each subject was observed for 1 hour by 1 of the 2 researchers. The observations were carried out using a time-sampling approach. At 30-second intervals the researchers recorded which physical demands were being carried out on an observation schedule after being prompted by an auditory cue.
The definition of standing and sitting for periods of more than 30 minutes required positive observations of these postures to have been made consecutively on at least 60 of the 30-second time intervals.
Crude estimates of frequency were calculated for the manual materials-handling activities as a summation of positive observations made at the 30-second intervals (a maximum of 120 representing the hour of observation). For the analysis, frequency was categorized into 4 groups reflecting the ordinal scale in the self-administered questionnaire and compared with categories of self-reported frequency. Estimates of duration for the occupational physical demands were based on the total number of positive observations made at 30-second intervals divided by 2 (a maximum of 60 minutes).
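The derivation of frequency and duration from the time-sampled record can be illustrated with a minimal Python sketch. It assumes that each subject's hour is stored as 120 true/false values (one per 30-second interval) and that the frequency categories are 1-10, 11-30, 31-50, and 51 or more per hour; the function names and the use of category 0 for "never observed" are illustrative assumptions, not part of the study protocol.

```python
from typing import List

# Assumed category bounds for the 4-point frequency scale (times/hour).
FREQ_BOUNDS = [(1, 10), (11, 30), (31, 50), (51, 120)]


def crude_frequency(obs: List[bool]) -> int:
    """Number of positive observations across the 30-second intervals."""
    return sum(obs)


def duration_minutes(obs: List[bool]) -> float:
    """Each positive 30-second interval contributes half a minute."""
    return sum(obs) / 2


def frequency_category(count: int) -> int:
    """Return 1-4 for the ordinal category, or 0 if never observed (assumption)."""
    for idx, (low, high) in enumerate(FREQ_BOUNDS, start=1):
        if low <= count <= high:
            return idx
    return 0


def sustained(obs: List[bool], run_length: int = 60) -> bool:
    """True if positive observations occur on at least run_length consecutive intervals."""
    longest = current = 0
    for seen in obs:
        current = current + 1 if seen else 0
        longest = max(longest, current)
    return longest >= run_length


# Hypothetical subject: activity seen on 24 of the 120 intervals.
example = [True] * 24 + [False] * 96
print(crude_frequency(example), duration_minutes(example),
      frequency_category(crude_frequency(example)), sustained(example))
```

With a run length of 60 intervals, the last function corresponds to the definition of standing or sitting for more than 30 minutes.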
A pilot study of the observation method using 2 researchers was carried out to determine interobserver agreement in the assessment of the duration of physical demands. Six manual workers from a flour mill in south Manchester were observed simultaneously by the 2 researchers for 1 hour. The interobserver repeatability was very good for the duration of work postures and manual materials-handling activities (table 1). There was less agreement for the repetitive use of the wrists and arms, the 95% limits of agreement ranging from -4.8 to 6.8 minutes and from -5.2 to 3.6 minutes, respectively. The imprecise definition used by necessity to describe repetitive upper-limb movements (defined as "an observed activity requiring multiple wrist or arm movements") might have resulted in this greater discrepancy.
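As an illustration of how 95% limits of agreement of this kind are commonly obtained, the sketch below uses the mean difference between observers plus or minus 1.96 standard deviations of the differences. The text does not state the exact formula used in the pilot study, so this choice is an assumption, and the durations shown are invented for demonstration only.

```python
from statistics import mean, stdev


def limits_of_agreement(observer_a, observer_b):
    """Return (lower, upper) 95% limits of agreement for paired durations.

    Assumes the common definition: mean difference +/- 1.96 * SD of differences.
    """
    diffs = [a - b for a, b in zip(observer_a, observer_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias - spread, bias + spread


# Hypothetical durations (minutes) recorded by 2 observers for 6 workers.
obs_a = [12.0, 30.5, 8.0, 45.0, 20.0, 15.5]
obs_b = [11.5, 31.0, 9.0, 44.0, 20.5, 15.0]
lower, upper = limits_of_agreement(obs_a, obs_b)
print(f"95% limits of agreement: {lower:.1f} to {upper:.1f} minutes")
```

Narrow limits centered near zero would indicate that the 2 observers recorded similar durations for the same worker.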

Estimation of handled weight
Various items were selected from the warehouse of the department store and weighed using electronic scales. Five different items were used in 6 manual materials-handling operations (4 tasks involving lifting and carrying and 2 tasks involving pushing and pulling, a total of 30 items). The tasks and items were presented in random order to the subjects, who were asked to complete a visual analogue scale (VAS) for each estimate of weight before proceeding to the next.

Statistical analysis
The accuracy of self-reported occupational physical demands was calculated using a sensitivity and specificity analysis based on the premise that the observations represented the standard. Sensitivity and specificity were calculated as the proportion of workers correctly reporting positive exposure (sensitivity) and the proportion of workers correctly reporting negative exposure (specificity). The analysis was carried out as a dichotomous exposure (ie, whether or not the physical demand was carried out). Estimated frequency (recorded on a 4-point ordinal scale: 1-10 times/hour, 11-30 times/hour, 31-50 times/hour, ≥51 times/hour) was compared to observed frequency (using the same categories), the percentage of full agreement being calculated as the proportion of workers' estimates of frequency in the same category as the observed frequency. Disagreement was described as mild, moderate, and severe if the workers' estimates of frequency were 1, 2, or 3 categories away from the observed frequency, respectively. [...] For the estimation of handled weight (part 2) the differences between the reported (estimated) and observed (weighed) weights were calculated on a continuous scale, and the median and interquartile range were reported.
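The sensitivity and specificity calculation can be sketched in Python as follows, treating the observation as the standard. The function name and the paired true/false input format are illustrative assumptions, and the example data are invented.

```python
def sensitivity_specificity(reported, observed):
    """Compare dichotomous self-reports against observations (the standard).

    Returns (sensitivity, specificity); a value is None when its
    denominator is zero (no observed-exposed or no observed-unexposed workers).
    """
    pairs = list(zip(reported, observed))
    tp = sum(1 for r, o in pairs if r and o)          # reported and observed
    fn = sum(1 for r, o in pairs if not r and o)      # observed but not reported
    tn = sum(1 for r, o in pairs if not r and not o)  # neither
    fp = sum(1 for r, o in pairs if r and not o)      # reported but not observed
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical data for 5 workers and one physical demand.
reported = [True, True, False, False, True]
observed = [True, False, False, True, True]
sens, spec = sensitivity_specificity(reported, observed)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```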

Results
Of the 140 employees observed, 123 (88%) completed and returned the questionnaires on the day of observation, the majority having completed the questionnaire immediately after the reference hour. Of the respondents, 76 were women (response rate 92%) and 47 were men (response rate 82%). The mean age of the employees who completed the physical demands questionnaire was 36 years. The respondents did not differ from the 17 employees who did not complete the questionnaire in terms of age (P=0.58), gender (P=0.81), or physical demands recorded by the researchers (P>0.20 for all comparisons). It is likely, therefore, that nonresponse was related primarily to general disinterest in the study, an impression confirmed by researchers visiting the participating companies.
All of the physical demands detailed in the questionnaire were observed to take place (table 2, column 2). For the work postures, approximately 10% of the employees were observed to be standing or seated for periods of ≥30 minutes, whereas approximately half the employees were observed to kneel or carry out work at shoulder level. The majority of the manual materials-handling activities were observed to be carried out by most of the subjects, with as many as two-thirds of the employees lifting or carrying weights. However, only 4 persons were observed to carry weights on a shoulder during the observation hour. Two-thirds of the employees were observed to carry out repetitive movements of the wrists or arms for periods of ≥10 minutes.
Questions referring to activities involving "lifting" and "carrying" were combined after 5 of the 6 companies had been visited, when it was established that employees did not distinguish between the 2 activities.

Agreement for the recording of physical activities
The accuracy of self-reported estimates of occupational physical demands at a dichotomous level (ever; never) was greatest for the work postures (table 2). For all the posture variables the sensitivity and specificity of the self-reports were at least 70%.
For the majority of the manual materials-handling activities, the accuracy of the self-reports was good. The sensitivity values for the self-reports were at least 60% for all except 2 of the manual materials-handling activities ("carrying weights with one hand", 43%, and "lifting weights above shoulder level", 40%). Only the activity "lifting weights with both hands" had a low specificity value, all the other activities receiving a specificity of at least 70%.
Activities involving repetitive movement of the wrists and arms showed good sensitivity (above 80% for both activities); however, the specificity was below 60% for repetitive arm movements.

Agreement for the frequency and duration dimensions of exposure
The employees' ability to discriminate between levels of frequency (according to the 4-point scale) was satisfactory (table 3). Full agreement was above 60% for the majority of the manual materials-handling activities, with kappa statistics of moderate concordance (kappa 0.2-0.6). The poor kappa statistics observed for "carrying with both hands" and "pulling weights" reflect the high proportion of employees reporting a frequency of 1-10 times/hour. In such a situation the expected agreement by chance will be very high, resulting in a low kappa value even when the observed agreement is good. Only for "lifting or carrying weights with one hand" was the full agreement below 50%. For estimates which were not in agreement, the majority were classified as mild-to-moderate disagreement (the employees' estimates were only 1 or 2 categories away from the observed frequency).
Generally speaking, the employees tended to overestimate the duration of physical demands slightly, although the overestimation only averaged about 5 minutes (table 4). Repetitive movements of the upper limbs were overestimated to a greater extent, although the median differences for such exposures were still below 10 minutes.
Table notes: Number of subjects classified as exposed and not exposed by the subjects (self-reports) and by the researchers (observations). The sensitivity and specificity have been predicated on the assumption that the researchers' observations are the standard for the self-report. Data only available for the first 5 companies (N=85). Data only available for the last company (N=38). The lifting and carrying variables were combined after the analysis of the data from the first 5 companies. For this reason, frequency information in a categorized form was only available for these variables separately for 2 companies (N=35) and for these variables combined for 1 company (N=38).
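The remark that kappa can be low even when observed agreement is good can be illustrated with a small Python sketch. It assumes an unweighted Cohen's kappa (the text does not say which variant was used), and the category data are invented so that most workers fall into the lowest frequency category.

```python
from collections import Counter


def full_agreement(estimated, observed):
    """Proportion of workers whose estimated category equals the observed one."""
    return sum(e == o for e, o in zip(estimated, observed)) / len(estimated)


def cohen_kappa(estimated, observed):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(estimated)
    p_o = full_agreement(estimated, observed)
    est_counts, obs_counts = Counter(estimated), Counter(observed)
    categories = set(estimated) | set(observed)
    # Chance agreement from the marginal distribution of each rating.
    p_e = sum(est_counts[c] * obs_counts[c] for c in categories) / n ** 2
    if p_e == 1:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (p_o - p_e) / (1 - p_e)


def disagreement_label(estimated_cat, observed_cat):
    """Label a pair by its distance in categories (1 mild, 2 moderate, 3 severe)."""
    labels = {0: "full agreement", 1: "mild", 2: "moderate", 3: "severe"}
    return labels[abs(estimated_cat - observed_cat)]


# Hypothetical data: 20 workers, most in category 1 (1-10 times/hour).
obs = [1] * 18 + [2] * 2
est = [1] * 17 + [2] + [1, 2]
print(f"full agreement={full_agreement(est, obs):.2f}, kappa={cohen_kappa(est, obs):.2f}")
```

In this invented example 18 of 20 pairs agree, yet chance agreement is already 0.82 because both ratings cluster in one category, so kappa is much lower than the raw agreement suggests.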

Agreement for the estimating-handled-weights dimension of exposure
The median, interquartile range, and range of the differences between the weights estimated on a VAS and the actual weights (electronic scales) of handled items are illustrated in figure 1, in which box-and-whisker plots have been used for the 6 manual materials-handling activities. For each activity the 5 handled items have been placed in the order of actual weight (lightest to heaviest) although they were administered to the employees in random order. The workers were able to estimate the weight of the handled items accurately on the VAS. The median difference was <10 kg for manual materials-handling not involving assisted force. For "pushing" and "pulling" there was a greater discrepancy between the estimations and the actual weights.
For all the manual materials-handling activities, as the items became heavier, the estimates were more discrepant. However, the low magnitude of the difference suggests that the workers could differentiate between items of different weights.
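The summary statistics behind such a box-and-whisker display can be sketched as follows. The sketch assumes that the difference is taken as estimated minus actual weight and uses Python's default quartile method, which may differ from the one used for figure 1; the weights are invented.

```python
from statistics import median, quantiles


def summarize_differences(estimated_kg, actual_kg):
    """Median, quartiles, and range of (estimated - actual) for one item."""
    diffs = [e - actual_kg for e in estimated_kg]
    q1, _, q3 = quantiles(diffs, n=4)  # default "exclusive" method (assumption)
    return {
        "median": median(diffs),
        "q1": q1,
        "q3": q3,
        "min": min(diffs),
        "max": max(diffs),
    }


# Hypothetical VAS estimates (kg) from 5 employees for an item weighing 5.0 kg.
estimates = [4.0, 5.5, 6.0, 4.5, 7.0]
summary = summarize_differences(estimates, actual_kg=5.0)
print(summary)
```

Repeating the summary for each of the 5 items within an activity, ordered by actual weight, would show whether the discrepancy grows as items become heavier.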

Discussion
Validity of self-reports in comparison with other studies
General physical demands. For the majority of the self-reported physical demands, at a dichotomous level of whether or not the activity was carried out, the accuracy of recall was good. In other studies (19,20) the greatest accuracy was observed for self-reported posture. However, such studies have also reported poor accuracy for the recall of more active tasks (eg, those involving repetitive movements or manual materials-handling) (18). It is encouraging, therefore, that the accuracy of the self-reports was found to be good for the majority of the manual materials-handling tasks and for repetitive movements of the upper limbs in the current study. Of concern, however, were the items that did not have a satisfactory accuracy: "carrying weights with one hand" (43% sensitivity) and "lifting weights above shoulder height" (40% sensitivity). When the former was combined with "lifting weights with one hand", the new activity "lifting or carrying weights with one hand" did have a satisfactory accuracy (75% sensitivity and 97% specificity). In addition, an alteration of the latter item to "lifting weights at or above shoulder level" was found to have good agreement after being assessed in a small study later (data not shown).
Frequency and duration of physical demands. One limitation of the time-sampling approach adopted in this study is the likelihood that the frequency of manual materials-handling would be underestimated. The observers might have missed activities being repeated between the 30-second time intervals. In this respect, the observation method cannot be considered as a standard, and our analysis of these data is restricted to a comparison of self-reports with a crude measure based on the observations. Although the agreement between the self-reported and observed frequencies was high for the majority of the manual materials-handling activities, the high level of agreement is likely to reflect a clustering of responses in the lowest frequency categories. It is therefore [...] to that used in the current study to represent frequency and found good agreement between the estimated and observed frequency of weights of >1-5 kg (kappa = 0.65-0.66).
The greatest variation in the estimation of self-reported duration was for repetitive movements of the wrists and arms. This finding is likely to have been a reflection of the natural difficulty in understanding what represents a "repetitive movement". Rather than impose an artificial definition which is necessarily complicated, it was decided to leave assessment of what represented a repetitive activity to the participants themselves. On the average, the misclassification in terms of overestimated duration was still small for these movements.
It is difficult to compare our analysis of the self-reported duration of physical demands with that of other studies given the different styles of question used to represent duration; other studies have tended to inquire about duration as a closed question. For example, Wiktorin et al (19) (18) found that estimates of the duration of postural activities (in hours and minutes, converted to a percentage of the workday) significantly differed from the observed durations as follows: standing: difference, observed minus estimated duration = 34%, P<0.0001; kneeling: difference = -7.6%, P<0.0001; and walking: difference = -33.3%, P<0.0001. These discrepancies might have resulted from the requirement for respondents to recall hours and minutes for the postures during the whole workday. Such estimation is likely to be difficult and subject to inaccuracy. The employees in our study were only required to estimate the duration of exposures for 1 hour of their shift.
Estimates of handled weight. It is believed that this is the first time that the VAS has been used to estimate the self-reported weight of items handled by employees. The sample of workers was found to estimate handled weights accurately using the VAS. One additional feature of the design of the VAS, which was felt to be useful for United Kingdom populations, was the inclusion of a conversion card for converting pounds into kilograms (which also included illustrations to guide employees in their estimates). This provision might have improved the employees' estimates of weight, although the possibility was not tested.
Studies have tended to inquire about the weight of handled items using an ordinal scale with a categorization of weights. This method requires employees to be familiar with the weights of items they have handled, and it appears that employees find it easier to remember handling heavier loads (6-15 kg) than lighter loads (1-5 kg) (19). Given that the employees accurately estimated the weight of a variety of both light and heavy loads on the VAS, the use of a VAS (together with the illustrated conversion card) might be a more appropriate measure with which to estimate the self-reported weight of handled items in a questionnaire than more direct questions are.

Methodological issues
Although the observation method was found to have good interobserver reliability (table 1), it is unlikely that the observations were entirely accurate measures of the physical demands carried out by employees. Ideally, the use of a direct measure (eg, video taping) would have provided a more accurate reference for the self-reports. However, the use of such methods was not practical for the present study. It is possible that, using the time-sampling approach, the observers missed activities carried out between the 30-second time intervals. If so, it would have resulted in an underestimate of the specificity. However, despite possible flaws in the observation method, the accuracy of the majority of self-reported exposures was still found to be good.
As mentioned previously, the time-sampling approach used in the study is likely to have led to an underestimation of the frequency of manual materials-handling. The occupational groups chosen for study tended to report a low frequency of manual materials-handling, and this result may explain the high proportion of agreement between the self-reports and observations. For occupational groups with a greater frequency of manual materials-handling, the time-sampling approach would not have been appropriate for assessing frequency accurately. In such situations, when observations are made, accurate measures of frequency can only be obtained by recording in real time.
One factor that might have influenced the estimates of validity in this study is the observation procedure. It is possible that the presence of an observer may have influenced the way in which the workers carried out their work or may have prompted them to remember activities carried out during the recall period. It is unlikely that the workers would have changed their normal work activities due to the observations, given the demands of the work involved in the occupational settings. However, it is unclear whether the observers could have indirectly influenced recall; employees, aware of the observation procedure, may have estimated physical exposures in greater detail than if the observations had not taken place (20).
The reference period of 1 hour was appropriate for the occupational settings forming the sampling frame for the current study. Occupational activities could be well defined in a relatively short reference period because the tasks did not vary substantially across the shifts. Clearly, for occupations with much greater variability of occupational tasks, the reference period may result in a less accurate quantification of physical demands.

Concluding remarks
General conclusions from previous studies of the validity of self-reported physical demands have been that, while recall is satisfactory at a dichotomous level (ever; never), quantifying the magnitude of the physical demands is more problematic. In this study we demonstrated that workers can estimate both general physical demands and the dimensions of these demands with satisfactory accuracy. The questionnaire in this study was used in a variety of occupational settings, and it differed in several ways from those used in previous studies. These differences (for example, the use of a VAS to estimate weight instead of questions with specified levels of weight) may have contributed to the improved accuracy observed in the recall of the workers.
Although the questionnaire on occupational physical demands in our study was designed to assess potential physical factors associated with shoulder symptoms, it covered a wide spectrum of exposures also relevant to other musculoskeletal conditions, for example, low-back symptoms, neck symptoms, and lower-limb symptoms. The questionnaire on occupational physical demands is therefore recommended as an instrument for estimating physical factors associated with musculoskeletal pain in appropriate occupational settings in future epidemiologic studies.