Validation of a questionnaire for assessing physical work load

Objectives Reliable, valid, and compatible methods are required for exploring the complex interactive effects of psychosocial and physical stressors on complaints and disorders. An instrument for assessing physical work load that integrates information from a biomechanical model of lumbar load is presented and validated. Methods Four hundred and fifty-five people working in nursing homes for elderly people in Germany filled out the developed questionnaire 3 times within 1 year. Test-retest reliability was calculated, and validity was checked in several ways. Relationships with other, theoretically related and unrelated variables were examined. Results The test-retest reliability of the questionnaire measures was about 0.65. The convergent and discriminant validity was satisfactory, and the questionnaire was able to separate professional subgroups with different physical work loads. The Spearman rank-order correlations between physical load and musculoskeletal complaints were about 0.30. Conclusions The method developed in this study is a reliable and valid instrument for assessing physical work load. The integration of statistical methods from psychological testing and theory in the development of methods exploring the effects of physical work load is advocated.

Disorders and complaints regarding the musculoskeletal system are the most frequent reasons for sick leave in Germany (1), as in other industrialized countries (2,3).
For the prevention of such disorders and complaints, information is needed on factors contributing to their pathogenesis. During the last several years it has become increasingly clear that both physical work load and psychosocial factors at work contribute to the development of musculoskeletal problems (2,4-6). Unfortunately, these 2 sets of factors, physical and psychosocial, have only been studied separately in most cases. The unknown covariation of physical and psychosocial demand factors across workplaces makes it difficult to estimate the "true" influence of either set of factors. In addition, there may be complex interactions between the 2 factors. If one is interested in the complex and probably interactive effects of physical and psychosocial stressors on musculoskeletal problems, information on physical and psychosocial work load has to be combined, and thus should be assessed in compatible ways.
The instruments used for the integration of information from the 2 sources, physical and psychosocial, should have sufficient reliabilities, and their validities should have been proved. But the instruments that are available for measuring and determining physical work load on the one hand and psychosocial stressors on the other, and that also fulfill criteria for reliability and validity, often differ in their measurement approach. More precisely, many good measures of physical work load are observational in nature (7). Instruments in this class of measurement methods are exact in their determination of an index of physical work load (eg, body postures, consumed energy, etc), but they are also very time-consuming and thus allow the examination of small samples (up to 20 subjects) only. Psychosocial stressors, in contrast, are assessed by means of questionnaires (8,9), a technique that builds on psychological-statistical measurement theory. For such methods, large samples are needed (more than 100 subjects) to provide results that are stable across samples and therefore have theoretical and practical relevance. These different approaches result in difficulties in combining the information from the instruments, at least at the person level, so that little is known on how physical work load and psychosocial stressors interact in causing musculoskeletal problems (10).
A wide range of questionnaires on physical work load is already available. But the reliability and validity of these scales have not been checked in most cases and therefore must be regarded as questionable (11). Two strategies are suggested on how to check a questionnaire on physical work load (11). The first is to compute the test-retest reliability. A high test-retest correlation indicates that physical work load has been assessed reliably and with only little random error because random measurement error at one point in time does not correlate with random measurement error at any other point in time. The second is to check the validity of the instrument. This check can be performed by comparing the mean squares within and between homogeneous groups of workers with the same job. The mean square within the groups should always be significantly lower than the mean square between the groups. If an instrument is not able to separate groups with objectively different physical work loads, it cannot be a valid indicator of individual work load.
Furthermore, a questionnaire is valid only if the resulting measures correlate to a greater extent with theoretically related variables (convergent validity) than with theoretically unrelated variables (discriminant validity) (12,13). For example, a meaningful index of physical work load in one body region (eg, the lower back) should correlate to a greater extent with reported problems in that, or a close, body region than with problems in more distant body regions (eg, the hands or feet). Moreover, the relation should be stronger for people with greater exposure to the stressor (eg, full-time workers) than for people less exposed to the stressor (eg, part-time workers). When variables are being selected for discriminant validation, a look at some current developments in the area of psychological stress-strain research can be helpful. In this field, it has been reported that a general tendency to report things in a negative manner ["negative affectivity" (14-16)] may influence the relations between stressors and strain when both are subjectively assessed. People with a tendency to report things negatively generally report more psychosomatic complaints (16,17), less work satisfaction (16), and less self-efficacy (18,19) than people without this tendency. If no correlations between these variables and the subjectively assessed physical work load are found, it could be concluded that the physical work load measure is not affected by negative affectivity.
One common problem of questionnaires assessing physical work load is the lack of theoretical models that determine which items should be selected and, therefore, which stressors should be assessed. But such a theoretically guided item-selection process seems to be essential. Some of the possible physical work load stressors may be more important than others for different forms of musculoskeletal problems, and the combined effects of stressors might otherwise be overlooked.
In the following report we describe the validation of a questionnaire (20) that is suitable for large epidemiologic studies on physical work load. The questionnaire is based on a sophisticated biomechanical model describing forces in the lumbar spine during work activities (21). Reliability and different aspects of validity were tested according to the principles already presented in this report.

Study design
The reported data are part of a large longitudinal study on different aspects of work in German nursing homes for elderly people. The questionnaires were administered 3 times, in the spring of 1996, the autumn of 1996, and the spring of 1997. The participants filled out the questionnaire in small groups of 5 to 25 people during normal workhours, while one member of the research team was present. Participation was voluntary. No contact with co-participants was allowed during the procedure. Confidentiality was assured by giving each participant an individual code number which was accessible only by the members of the research team.

Subjects
All staff members of 16 nursing homes at the time of the first collection of responses were asked to participate in the study. Only those who filled out the questionnaire the first time were asked to fill out the questionnaire on the 2 later occasions. The number of participants was 610, 530 and 491, respectively. The subjects belonged mainly to the following 4 job categories: nursing, service, social work, and management. Altogether 455 persons (45.5% of all the staff members of the participating nursing homes) filled out the questionnaire all 3 times. Table 1 gives basic information on the biographical characteristics of the sample.

Instruments
All the described instruments were embedded in a larger questionnaire asking about different aspects of work design, most of them psychological (eg, job satisfaction, control, ambiguity, etc).
Questionnaire on physical work load. Probably the most important factors contributing to physical work load, especially when occupational work conditions are being researched, are unfavorable postures of the body or the extremities, like bending, twisting, kneeling, or squatting, as well as the handling of heavy loads required during work (22,23). Therefore we decided to assess such loads. A questionnaire with 19 items describing these work situations was constructed. In the questionnaire, whose English version is presented in appendix I, the items were also presented as pictograms. Five of the items described postures of the trunk (in the following text the item identifications are given in parentheses): straight, upright (T1) (trunk bent 5 degrees forward), slightly inclined (T2) (trunk bent 45 degrees forward), strongly inclined (T3) (trunk bent 75 degrees forward), twisted (T4), and laterally bent (T5). Three items asked for the following positions of the arms: 2 arms below shoulder height (A1), 1 arm above shoulder height (A2), and 2 arms above shoulder height (A3). Five items asked for positions of the legs: sitting (L1), standing (L2), squatting (L3) (trunk bent 15 degrees forward), kneeling on one or both knees (L4), and walking or moving (L5). Six items described the lifting of weights. Three concerned lifting with the trunk upright (Wu1-Wu3) and 3 with the trunk inclined 60 degrees (Wi1-Wi3). Each set of 3 items asked for lifting of light weights (<10 kg; Wu1 & Wi1), medium weights (10-20 kg; Wu2 & Wi2), and heavy weights (>20 kg; Wu3 & Wi3).
For the construction of an index for the physical work load of each person the total compressive force acting at the lower lumbar spine was determined for all the body postures and loads contained in the items. With the use of the biomechanical model "The Dortmunder" (21), the compressive forces at the lumbosacral disc L5-S1 that result from different body postures with or without the handling of loads can be estimated.
The index of physical work load discussed in this report does not cover the "real" compressive force on the lumbosacral disc (eg, in kilonewtons) for each person at one given moment, but, instead, the relative contribution of compressive forces caused by the body postures and the handling of weights described in the items to the overall load of the spine. Therefore, all the calculated parameters were derived from an arbitrarily chosen person who was 174 cm tall and who weighed 66 kg. The compressive forces resulting from the 4 postures trunk upright (5 degrees forward), 2 arms below the shoulder, sitting without lifting weights, and standing without lifting weights (items T1, A1, L1, L2) equaled the lowest compressive force and were regarded as a standard for further calculations. The index of physical work load is a weighted summation of the scores of the remaining 15 items. The weight of each item is the difference between the compressive force at the posture given by that item and the "standard compressive force" on the lumbar spine. The weighting factors and the complete formula, which is a simplified version of the formula described in Klimmer et al (20), are presented in appendix II.
Our aim was to construct a questionnaire that measures physical load factors which might possibly interact with psychosocial factors in the determination of different forms of strain (musculoskeletal problems, psychosomatic complaints, etc). Because it can be assumed that these processes take longer times to develop, we were interested in the physical workload factors that affect persons regularly. Therefore, we asked for an average frequency of occurrence of body positions or the handling of loads during ordinary daily work. The answers were given on a 5-point rating scale ranging from "never" to "very often". The weighting factors from a biomechanical model were multiplied by the item scores of the corresponding body postures reported in the questionnaire and then added to an index of physical work load. This index is a measure of physical work load on the lower lumbar spine within longer time frames (days to months). On the assumption that most factors contributing to physical work load in certain regions of the body also affect the lower lumbar spine, it seemed reasonable to regard this index as an equivalent to the overall work load.
For purposes of validation, we constructed another index ("standard positions index"), which was an unweighted combination of the 4 items not contributing to the workload index (standard positions: items T1, A1, L1, L2). A low correlation between this index and the workload index and also with musculoskeletal problems (see the following text) would indicate that method variance (because of some form of response bias) is not responsible for any observed relation between the workload index and other scales.
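The two indices described above can be sketched in a few lines. This is only an illustration: the actual weighting factors are given in appendix II and are not reproduced here, so the weights below are hypothetical placeholders, and the numeric coding of the 5-point frequency scale (0 = "never" to 4 = "very often") is likewise an assumption.

```python
# Items whose postures define the "standard" (lowest) compressive force.
# They are excluded from the weighted workload index.
STANDARD_ITEMS = ("T1", "A1", "L1", "L2")


def workload_index(scores, weights):
    """Weighted sum of item scores over the non-standard items.

    scores  -- dict mapping item id to frequency rating (assumed 0..4)
    weights -- dict mapping item id to its weighting factor
               (difference from the standard compressive force)
    """
    overlap = set(weights) & set(STANDARD_ITEMS)
    if overlap:
        raise ValueError(f"standard items must not be weighted: {sorted(overlap)}")
    return sum(weights[item] * scores[item] for item in weights)


def standard_positions_index(scores):
    """Unweighted sum of the 4 standard-position items."""
    return sum(scores[item] for item in STANDARD_ITEMS)


# Hypothetical weights for 3 of the 15 weighted items (NOT the published values).
example_weights = {"T2": 1.0, "T3": 2.5, "Wi3": 4.0}
# Hypothetical answers of one respondent.
example_scores = {"T1": 4, "A1": 3, "L1": 1, "L2": 4, "T2": 2, "T3": 1, "Wi3": 0}

load = workload_index(example_scores, example_weights)
standard = standard_positions_index(example_scores)
```

Keeping the weights as a separate argument makes it easy to substitute the published factors once they are at hand.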
Questionnaire on musculoskeletal symptoms. Symptoms in different body regions were recorded by means of a German version of the general form of the Nordic questionnaire for the analysis of musculoskeletal symptoms (24). This questionnaire asks for musculoskeletal problems in 9 body regions during the last 12 months and the last 7 days and also inquires as to whether these problems restricted normal work during the last 12 months. For the validation procedure of the described workload index we needed a few easy-to-handle and psychometrically sound indicators (eg, with acceptable Cronbach's alpha and test-retest reliabilities and with a distribution approximately equaling the normal distribution) for musculoskeletal problems. Therefore we constructed scales from the items of the Nordic questionnaire that were coded as follows: neither symptoms nor restrictions = 0, symptoms in the last 12 months but not in the last 7 days and no restrictions = 1, symptoms in the last 12 months and either symptoms in the last 7 days or restrictions = 2, symptoms in both the last 12 months and the last 7 days and, in addition, restrictions = 3. These indicators were subjected to a principal component analysis because we wanted to have 2 or more statistically almost independent measures for validating the physical workload index. As with the standard positions index that has already been described, the correlations of the indices of musculoskeletal problems with the workload index should result in a theoretically meaningful pattern. The correlations of our physical workload index with problems in proximal regions like the back should be higher than with problems in distal body regions like the hands or the feet.
An analysis of the eigenvalues showed that extraction of either 1 or 2 factors seemed reasonable (eigenvalues of 3.82, 1.08, and 0.95 for the first filling out of the questionnaire, explaining 42%, 12%, and 11% of the total variance, respectively; data from the other 2 sets of questionnaire responses gave comparable results). For purposes of validation we decided to extract 2 factors, followed by a varimax rotation. The indices for problems in the lower and upper back, the neck, and the shoulders consistently loaded on the first factor, whereas problems with the knees, the feet, the elbows, and the hands loaded on the other. Therefore we constructed 2 indices by adding the items of each factor, one describing problems within proximal body regions and one describing problems within distal body regions. Cronbach's alpha ranged across the 3 sets of responses from 0.84 to 0.86 for the proximal scale and from 0.61 to 0.70 for the distal scale. The index describing problems with the hips loaded inconsistently on both factors and was thus excluded from further analysis.
For further validation some other variables were assessed. Psychosomatic complaints were measured with a slightly modified version of a validated German complaint list developed by von Zerssen (25). This list asks how intensely the participants experienced 24 general symptoms like headache, insomnia, or nausea on a 4-point scale from "not at all" (score 0) to "strongly" (score 3). It measures 1 homogeneous construct (25,26). Of the 24 items, 4 items asked for musculoskeletal complaints and were excluded from the analysis because of the overlap with the Nordic questionnaire. The resulting 20-item list had an internal consistency (Cronbach alpha) of at least 0.90 for all 3 sets of responses. Job satisfaction was measured with a 7-item scale taken from a German job-satisfaction questionnaire (27) tapping 7 different aspects of job satisfaction (eg, satisfaction with the supervisor, the job itself, the payment). The scale had a Cronbach alpha between 0.80 and 0.85 for the 3 sets of responses. Self-efficacy, a psychological construct developed by Bandura (28), describes how certain a person is about his or her own performance on a specific task or about the general mastering of problems in life. In this study, self-efficacy was taken as a measure of discriminant validity for physical work load. This choice is grounded in its known psychological (but not physical) contribution to the etiology of complaints (eg, people high in self-efficacy have lower levels of anxiety and depression and higher levels of immunologic functioning). The scale was developed by Schwarzer (19) and had an internal consistency of about 0.85 for all 3 sets of responses.
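The 0-3 coding of the Nordic questionnaire answers for one body region can be written as a small function. This is a minimal sketch: the function and argument names are invented for illustration, and the handling of inconsistent answers (7-day symptoms or restrictions reported without 12-month symptoms) is an assumption, since the text does not say how such cases were treated.

```python
def symptom_score(last_12_months: bool, last_7_days: bool, restricted: bool) -> int:
    """Code the three yes/no answers for one body region as a 0-3 score.

    0 = neither symptoms nor restrictions
    1 = symptoms in the last 12 months only
    2 = 12-month symptoms plus either 7-day symptoms or restrictions
    3 = 12-month symptoms plus both 7-day symptoms and restrictions
    """
    if not last_12_months:
        # Assumption: without 12-month symptoms the region is scored 0,
        # even if the other answers are (inconsistently) "yes".
        return 0
    return 1 + int(last_7_days) + int(restricted)
```

A scale score for a group of body regions is then simply the sum of the region scores.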
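Two of the figures in this paragraph can be checked or reproduced with short calculations. The sketch below assumes that all 9 body-region indicators entered the principal component analysis in standardized form, so that the total variance equals 9; it also gives a textbook implementation of Cronbach's alpha, which is not necessarily the exact routine used in the study.

```python
from statistics import variance


def explained_percent(eigenvalue: float, n_variables: int) -> int:
    """Share of total variance for one component of standardized variables."""
    return round(100 * eigenvalue / n_variables)


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of respondent scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(items)
    totals = [sum(person) for person in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Reported eigenvalues for the first set of responses, 9 indicators assumed.
shares = [explained_percent(ev, 9) for ev in (3.82, 1.08, 0.95)]

# Toy data: two perfectly parallel items give the maximum alpha.
toy_alpha = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]])
```

Under the stated assumption the eigenvalues reproduce the reported percentages of explained variance.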

Reliability
For the participants in the same job throughout the study, the assessed indices should have had a reasonably high stability over time to be meaningful. To test this assumption, test-retest reliabilities were calculated for the index of physical work load and for the indices of musculoskeletal complaints (table 2). The test-retest correlations of all 3 indices across time intervals ranging from a minimum of 4 months to a maximum of 1 year were satisfactory and were all above 0.60; for the proximal complaints they were even above 0.70. This finding indicates that both physical work load and musculoskeletal complaints were measured with little random error.
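A test-retest coefficient of this kind is the correlation between the scores of the same persons at two measurement occasions. The sketch below uses the Pearson product-moment correlation, which is an assumption, as the text does not name the coefficient used for the retest analysis; the data are invented.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented workload index values of 5 persons at two occasions.
first_occasion = [12.0, 30.5, 22.0, 8.0, 41.0]
second_occasion = [14.0, 28.0, 25.0, 10.0, 37.5]
retest_r = pearson_r(first_occasion, second_occasion)
```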

Validity
The questionnaire was validated using the following 2 steps: (i) the power of the questionnaire to discriminate between workers in different job categories was examined both with and without statistical control for musculoskeletal symptoms and the other strain variables and (ii) several aspects of the convergent and discriminant validity of the questionnaire were tested. As an additional check of the convergent validity, the correlations between stressors and strains were contrasted for people with longer or shorter exposure to physical work load (workhours: part-time versus full-time).
In the first step of the validation process, different job categories were contrasted. Among the job categories of the sample, social workers and managing directors should have had less physically demanding work than the other 2 groups. We tested this aspect of validity with an analysis of variance (ANOVA) with job category as the independent variable and physical work load as the dependent variable. The results for all 3 sets of responses showed significant differences between the groups [t1: F(3, 431) = 35.6, P<0.001; t2: F(3, 442) = 51.3, P<0.001; t3: F(3, 448) = 46.5, P<0.001]. The physical work load was highest for the participants working in nursing and successively lower for the persons in the service trade, the social workers, and the managing directors (figure 1). A posteriori Scheffé tests between each pair of groups showed that all the comparisons between the groups, except that between the social workers and the managing directors, were significant at the 0.05 level. To make sure that these differences were not biased by different levels of strain (eg, satisfaction or musculoskeletal symptoms), we also contrasted the groups with an analysis of covariance in which musculoskeletal symptoms, psychosomatic complaints, job satisfaction, and self-efficacy were introduced as statistical controls. In this analysis the groups still differed significantly with respect to the physical load questionnaire [t1: F(3, 390) = 32.7, P<0.001; t2: F(3, 412) = 49.5, P<0.001; t3: F(3, 435) = 50.0, P<0.001]. Thus the differences between the groups seemed to depend on real differences in physical work load and not on differences in the variables related to psychological processes.
As has already been described, in the second step of the validation process, we expected a high correlation between physical work load and musculoskeletal complaints in proximal body regions, lower correlations between physical work load and complaints in distal body regions and psychosomatic complaints, and no correlations between physical work load and job satisfaction and self-efficacy. Table 3 gives the information concerning the means, standard deviations, and correlations of the scales for each set of responses.
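The F ratio of a one-way ANOVA is the mean square between groups divided by the mean square within groups. The following sketch shows the computation on invented data; it is a plain textbook version, not the software used in the study, and it does not compute the P value.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the individual scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = k - 1
    df_within = n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Invented workload scores for 3 small groups.
f_ratio, df_b, df_w = one_way_anova_f([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

With the 4 job categories of the study, the between-groups degrees of freedom are 4 - 1 = 3, which matches the first value in each reported F(3, ...) statistic.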
At first inspection the results show that the means of the questionnaire variables did not change significantly across the 3 sets of responses. In addition the correlations within each cell were remarkably similar for all 3 sets of responses.
For the physical workload index to be valid, it should correlate to a higher extent with theoretically related variables than with theoretically unrelated variables. The low correlations of physical work load with gender, age, and tenure coincided with this hypothesis because in the field under research no such correlations can be expected. As expected, physical work load did not correlate with the "standard positions index". There were significant relations to all 3 symptom and complaint scales, the correlations being highest for proximal musculoskeletal symptoms and somewhat lower for distal symptoms (arms and legs) and psychosomatic complaints. For 2 sets of responses there were also low negative significant correlations with job satisfaction, but the workload index was not related to self-efficacy.
With the exception of significant correlations with the age of the participants, the "standard positions index" did not correlate consistently with any other variable. Both measures of musculoskeletal symptoms were positively related to age and tenure; interestingly, the correlations were somewhat higher for distal symptoms than for proximal symptoms. Both were also highly related to psychosomatic complaints (r≈0.45 for proximal and r≈0.40 for distal symptoms). There were low negative correlations with job satisfaction. Proximal and distal symptoms correlated with each other by about 0.50. As expected, self-efficacy correlated only slightly with age, psychosomatic complaints (negative sign), and job satisfaction.
Figure 1. Physical work load on the lower back in different job categories (nursing, service, social workers, managing directors) in nursing homes for elderly people for all 3 sets of questionnaire responses (t1, t2, t3). All the differences between the groups were significant, except for the social workers and managing directors. None of the differences within the groups were significant.
Thus the correlation matrix generally supports the assumption of a convergent and discriminant validity for the presented scale of physical work load, although the relations of physical work load with 2 of the 3 more "psychological" variables (psychosomatic complaints and job satisfaction) require further explanation. A closer inspection of the correlation matrix showed that there was not only a high correlation of musculoskeletal symptoms with both psychosomatic complaints and job satisfaction, but also, simultaneously, with physical work load. This finding may lead to the assumption that physical work load affects psychosomatic complaints and job satisfaction through the mediation of musculoskeletal symptoms (eg, people with high load have more backache, which also leads to psychosomatic complaints and influences satisfaction negatively). To check this assumption, we partialed out musculoskeletal symptoms from the correlation of physical work load to psychosomatic complaints and job satisfaction. This procedure led to nonsignificant correlations except in one case. For the second set of responses a significant correlation of 0.17 remained between physical work load and psychosomatic complaints. Thus it may be that physical work load influences psychosomatic complaints and job satisfaction via musculoskeletal complaints. In addition, we expected a higher correlation for employees working full-time than for employees working part-time. Table 4 presents the correlations between physical work load and musculoskeletal symptoms for full-time and part-time workers separately.
In fact, the relations between the variables for full-time workers were stronger than the relations for part-time workers. This finding can be regarded as another indicator of the validity of the questionnaire.
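The partialing-out step used in the mediation check above can be expressed with the standard first-order partial correlation formula. The sketch below is illustrative only: the input correlations are rounded or hypothetical values, not the entries of table 3.

```python
from math import sqrt


def partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of x and y with z partialed out.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))
    """
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# x = physical work load, y = psychosomatic complaints,
# z = musculoskeletal symptoms (the assumed mediator).
# 0.30 and 0.45 are the approximate values reported in the text;
# 0.20 for r_xy is a hypothetical value chosen for illustration.
remaining = partial_r(r_xy=0.20, r_xz=0.30, r_yz=0.45)
```

If the zero-order correlation between x and y equals the product of the two correlations with z, the partial correlation is 0, which is the pattern expected under complete mediation.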

Discussion
A questionnaire for assessing physical work load through self-reports was checked for reliability and different aspects of validity with data from a longitudinal study of employees working in nursing homes for elderly people. The questionnaire had a high test-retest reliability, showed an adequate convergent and discriminant validity, and separated groups with objective differences in physical work load. Further testing indicated that the questionnaire data are not only a consequence of differences in symptoms, complaints, or some psychological variables, but also depend, at least to a reasonable extent, on differences in the work conditions. In agreement with recent results of Viikari-Juntura et al (29), the data presented advocate the use of carefully constructed self-report measures in studies aiming to identify causes of musculoskeletal problems. Of course, such questionnaires are not designed for making individual diagnoses, but, instead, are meant to generate and test hypotheses about the causes of musculoskeletal problems. Nonetheless, further testing that compares results from the questionnaire with data on physical work load from other sources [eg, logbooks, task analysis, or observational methods (29-31)] is recommended. It would also be helpful to check the reliability and validity of the questionnaire in other samples with different kinds of physical work load.
Thus far, there have been many calls for more and better interdisciplinary work on the etiology of musculoskeletal diseases. Our results on the validation of a questionnaire of physical work load, achieved with procedures grounded in psychological-statistical measurement theory, show that this may not only be true for theoretical work, but interdisciplinarity could also be a promising approach in the field of method development. The described statistical methods could add to the possibilities of testing the validity of self-report measures of physical work load, if other methods, like observations or analyses via job title, are not available or favorable.
Although it is not the focus of this article, the correlation of about 0.45 between musculoskeletal and psychosomatic complaints is remarkable. As far as we know, there is no study that has reported any correlation between these 2 different measures of strain. While in the more psychologically oriented literature psychosomatic complaints are discussed as an outcome of psychological stressors (eg, control, demands, and support), the literature on diseases of the musculoskeletal system relies on musculoskeletal symptoms and complaints. Our results indicate that there is a considerable overlap between these 2 constructs and this overlap warrants further research and theoretical discussion. Furthermore, the mediating function of musculoskeletal complaints between physical work load on one hand and psychosomatic complaints and job satisfaction on the other could be considered as one of the mechanisms in the complex network between physical and psychosocial work conditions and different measures of strain.