Physical load during work and leisure time as risk factors for back pain.

This systematic review assessed aspects of physical load during work and leisure time as risk factors for back pain. Several reviews on this topic are available, but this one is based on a strict systematic approach to identify and summarize the evidence, comparable with that applied in the clinical literature on the efficacy of intervention for back pain. A computerized bibliographical search was made of several data bases for studies with a cohort or case-referent design. Cross-sectional studies were excluded. A rating system was used to assess the strength of the evidence, based on the methodological quality of 28 cohort and 3 case-referent studies and the consistency of the findings. Strong evidence exists for manual materials handling, bending and twisting, and whole-body vibration as risk factors for back pain. The evidence was moderate for patient handling and heavy physical work, and no evidence was found for standing or walking, sitting, sports, and total leisure-time physical activity.

Back pain is a major health problem in the Western world. The lifetime prevalence has been estimated at 60% to 90%, and the point prevalence varies between 15% and 42%, depending on the study population and the definition of back pain. The annual incidence of back pain has been reported to be approximately 5% (1)(2)(3)(4).
In a recent study of a general population in The Netherlands, the annual prevalence of low-back pain was found to be 46% for men and 52% for women. This study also showed that the high prevalence of back pain has important consequences in terms of disability, the utilization of health services, and sick leave. Twenty-eight percent of the people with low-back pain were restricted in their daily activities, 42% underwent medical treatment, 23% took time off work, 8% received a (partial) disability pension, and 6% changed jobs or had adaptations in the workplace (5). In 1991, the total cost of back pain to society in The Netherlands was estimated to be 1.7% of the gross national product (6).
The prevalence rate of low-back pain also varies between workers in different professions. High prevalence rates are found, in particular, for nonsedentary occupations (7,8). This phenomenon indicates that work-related factors may play a role in the etiology of back pain. In order to define potentially effective intervention in the workplace, the relationship between various exposures and back pain must be examined more specifically.
Several reviews on risk factors for back pain have already been published (9)(10)(11)(12)(13)(14)(15). However, none of them include clearly defined inclusion and exclusion criteria, a methodological quality assessment of the studies, and explicit criteria on which overall conclusions on the strength of the evidence were based. The current interest in evidence-based medicine has led to an extensive increase in the publication of systematic reviews, because a systematic approach is less susceptible to bias. This increase has, in turn, led to the development of methodological guidelines for systematic reviews (16). This paper examines the evidence for certain aspects of physical load as risk factors for back pain. Physical load is assumed to have both an acute and a cumulative effect on the occurrence of back pain. A load that exceeds the failure tolerance of the tissue, applied once, can cause back pain. However, cumulative load resulting from subfailure magnitude loads may be even more important. In such cases, back pain is assumed to be the result of a repeated application of loads or the long-term application of a sustained load. Moreover, a combination of cumulative and acute loads can also cause back pain (17,18).
In this paper, a systematic approach, comparable with that applied in the clinical literature on the efficacy of intervention for back pain (16,19), was used to answer the following research questions, based on the available literature: (i) which aspects of physical load at work are risk factors for the occurrence of back pain and (ii) which aspects of physical load during leisure time are risk factors for the occurrence of back pain?

Search strategy and screening
The available literature was identified by means of a computerized search of several bibliographical data bases, including Medline (1966 to November 1997), Embase (1988 to October 1997), Psyclit (to September 1997), NIOSHTIC, CISDOC and HSELINE (1977 to July 1997), and Sportdiscus (1949 to October 1997). The following key words were used: back pain, low-back pain, lumbago, backache, intervertebral disk displacement, hernia, herniated disc, sciatica, sciatic pain, risk factors, causality, causative, precipitating factors, determinants, predictor, etiology, aetiology, epidemiology, and case-control studies, retrospective studies, case-referent, prospective studies, longitudinal studies, follow-up studies, and cohort studies. For practical reasons, the search was restricted to publications in English, Dutch, German and French. The abstracts of all the citations were retrieved and examined.

Selection
A selection was made from the identified papers. The first reviewer (WH) was responsible for the entire selection, but in order to check the reproducibility of the selection process, a second reviewer (MP) selected a random sample (N=100) from the papers identified in Medline.
The studies had to meet the following inclusion criteria:
- The design of the study had to be case-referent or cohort (prospective or historical) with at least 1 year of follow-up. Studies with a cross-sectional design, defined as studies in which the exposure(s) and the disease were assessed at the same time, were excluded.
- The study had to concern a working population or a community-based population. Studies involving patient populations were excluded.
- The operationalization of back pain had to be based on symptoms or signs of nonspecific back pain, self-reported or measured otherwise, including such consequences of back pain as sick leave, medical consultation, or treatment and disability. Studies on back pain due to a definite herniated lumbar intervertebral disc diagnosed according to well-defined diagnostic criteria and studies on back pain due to osteoporosis, cancer, or other specific causes were excluded. Studies which focused on back pain during pregnancy were also excluded.
- The exposures accepted for study included physical load during work or physical load during leisure time. Studies which only involved a comparison between different occupational groups were excluded.
- The publication had to be a full report. Letters and abstracts were excluded.
The references of all the selected articles and recently published review articles (9,11) were screened for additional, potentially eligible publications.

Methodological quality assessment
The selected studies were scored by two reviewers (WH, MP), independently, on the basis of a standardized set of criteria. The criteria concerned the study population, the exposure measurements, the assessment of back pain, and the analysis and presentation of the data. Two slightly different, separate criteria lists were used for the cohort studies and the case-referent studies. A description of these 2 lists is given in the appendix. These lists were adapted from criteria lists used in systematic reviews of randomized controlled trials on treatment (16) and criteria lists used in other reviews of observational studies (20,21).
The reviewers rated each criterion according to the following rules: + = informative description of the criterion at issue, and study meets the criterion; - = informative description, but study does not meet the criterion; and ? = lacking or insufficient information, so that assigning "+" or "-" was not possible.
All disagreements between the reviewers were subsequently discussed during a consensus meeting. If disagreements were not resolved during this meeting, a third reviewer (PB) was consulted for a final judgment. Each study was assigned a total methods score, which was the sum of all positive ratings for the criteria on validity and precision. This evaluation finally resulted in a hierarchical order for both the cohort and the case-referent studies, ranking the studies according to their methodological quality.
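The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the 6-item example ratings are hypothetical, and the cut-off of more than 50% of the maximum score is the one given under "Data extraction and analysis". The text does not say how a score of exactly 50% is classified; the sketch assumes it is not high quality.

```python
# Minimal sketch of the total methods score (hypothetical helper names).
# Ratings follow the rules in the text: "+" = criterion met,
# "-" = criterion not met, "?" = lacking or insufficient information.

VALID_RATINGS = {"+", "-", "?"}


def methods_score(ratings):
    """Return the number of positive ratings among the scored criteria.

    `ratings` maps a criterion label to "+", "-" or "?". Only criteria on
    validity and precision should be passed in, as in the review.
    """
    for criterion, rating in ratings.items():
        if rating not in VALID_RATINGS:
            raise ValueError(f"unknown rating {rating!r} for criterion {criterion!r}")
    return sum(1 for rating in ratings.values() if rating == "+")


def is_high_quality(score, max_score, cutoff=0.5):
    """High quality means a score of MORE than `cutoff` of the maximum.

    Assumption: a score of exactly 50% is treated as not high quality,
    because the text only defines "more than" and "less than" 50%.
    """
    return score > cutoff * max_score


# Hypothetical study rated on a hypothetical 6-item list.
example = {"1": "+", "2": "+", "3": "-", "4": "?", "5": "+", "6": "+"}
example_score = methods_score(example)
example_is_high = is_high_quality(example_score, max_score=len(example))
```

Because "?" ratings count the same as "-" in the sum, a poorly reported study scores low even if it was well conducted, which is a property of the rule as stated rather than of this sketch.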

Data extraction and analysis
Data on the effect of the exposures of interest were abstracted from the text and tables of the original publications. Whenever possible, the data extraction not only included information on the statistical significance of the effect, but also on the magnitude of the estimated effect. For some studies that did not provide an effect estimate, this figure was computed from the data provided in the article. If a study (only) reported that a factor did not enter the model in stepwise modeling, this result was disregarded in the data extraction because a stepwise analysis is not appropriate for modeling focused on the assessment of a causal relationship (22).
Due to the expected heterogeneity with regard to the study population, exposure measurements, and assessment of back pain, it had been previously decided to refrain from statistically pooling the findings of the individual studies. In order to synthesize the available information, use was made of a method based on levels of evidence adapted from the US Clinical Practice Guideline for Acute Low Back Pain in Adults (23). The rating system was applied to each individual exposure, and it consisted of 3 levels of scientific evidence based on the number, the quality, and the outcome of the studies as follows: (i) strong evidence: provided by generally consistent findings in multiple high-quality studies, (ii) moderate evidence: provided by generally consistent findings in 1 high-quality study and at least 1 low-quality study, or in multiple low-quality studies, (iii) no evidence: only 1 study available or inconsistent findings in multiple studies. Strong or moderate evidence could concern both the presence and the absence of an effect. A study was considered to be of high quality if the methodological quality score was more than 50% of the maximum score, and low-quality studies were those with a methodological score of less than 50% of the maximum score. The findings of the studies were considered to be inconsistent if less than 75% of the available studies reported the same conclusion. In the case of multiple high-quality studies, the available low-quality studies were disregarded in the drawing of an overall conclusion. In the assessment of the level of evidence for an exposure, an increased risk was regarded as a positive effect, regardless of the statistical significance. A risk estimate [relative risk (RR) or odds ratio (OR)] in the region of 1 was considered to indicate no effect, and a decreased risk was considered to indicate a negative effect, notwithstanding the statistical significance of this effect.
Studies that only reported nonsignificance, without presenting an effect estimate, were excluded from the evaluation. This exclusion, and ignoring the statistical significance of the findings, was based on the fact that, in general, the information provided in the articles was too meager to evaluate if no significant effect was found, either because there was no effect or because of a lack of statistical power, due to a small study size, a small percentage of exposed subjects, or a small percentage of subjects who developed the disease in question (24). As ignoring the statistical significance could be controversial, the exposures for which it was concluded that there was strong or moderate evidence of an effect were subjected to a sensitivity analysis. In this analysis all the studies with a nonsignificant effect were considered to indicate no effect.
If studies reported results of analyses with different outcome measures, the assessment of the effect was based on the results obtained for symptoms and findings, as opposed to measures of the consequences of back pain such as sick leave, medical consultation or treatment and disability. If studies reported results of analyses in different subgroups, the studies were considered to indicate a positive or a negative effect if such an effect was found in at least 1 of the subgroups.
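The rating system described in this section can be written out as a small decision rule. The Python sketch below is one reading of the stated rules, not the reviewers' own procedure: the function name, the tuple representation of a study, and the effect labels are assumptions made for illustration. Studies that only reported nonsignificance without an effect estimate are assumed to have been removed before the function is called.

```python
from collections import Counter


def level_of_evidence(studies, consistency=0.75):
    """Classify the evidence for one exposure (illustrative sketch).

    `studies` is a list of (quality, effect) tuples, where quality is
    "high" or "low" and effect is "positive", "none" or "negative".
    Returns (level, direction); direction is None when there is no evidence.
    """
    high = [s for s in studies if s[0] == "high"]
    # With multiple high-quality studies, low-quality studies are disregarded.
    pool = high if len(high) >= 2 else list(studies)
    if len(pool) < 2:
        return "no evidence", None  # only 1 study (or none) available
    direction, count = Counter(effect for _, effect in pool).most_common(1)[0]
    # Findings are inconsistent if less than 75% report the same conclusion.
    if count / len(pool) < consistency:
        return "no evidence", None
    if len(high) >= 2:
        return "strong", direction
    # Remaining cases: 1 high- plus at least 1 low-quality study,
    # or multiple low-quality studies.
    return "moderate", direction


# Findings as reported in the Results for three exposures.
manual_handling = [("high", "positive")] * 3 + [("high", "none")]
patient_handling = [("low", "positive")] * 3
standing_walking = [("high", "positive"), ("high", "none")]

results = {
    "manual materials handling": level_of_evidence(manual_handling),
    "patient handling": level_of_evidence(patient_handling),
    "standing or walking": level_of_evidence(standing_walking),
}
```

For manual materials handling, 3 of 4 high-quality studies agree, which is exactly 75% and therefore not "less than 75%", so the findings count as consistent under the rule as worded.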

Selection
The literature search in the various data bases resulted in the identification of 1363 publications, mostly in English. Twenty-seven studies met the inclusion criteria. On the basis of a screening of the references of the articles on these studies and recent reviews (9,11), an additional 9 studies were included (94)(95)(96)(97)(98)(99)(100)(101)(102)(103)(104). The selection of studies for inclusion, from a random sample (N=100) of the papers identified in Medline by the second reviewer, led to an initial 2% disagreement. Five of the 36 selected studies were excluded post hoc for the following reasons: (i) there was low variability in physical load because the study population was restricted to workers with lifting tasks (50,51,95,98,99), (ii) the physical exposures at work were measured by means of a questionnaire on which only 1 of a list of items could be ticked (85), and (iii) the early retirements that were studied did not necessarily have a back disorder as the main diagnosis (25,26). Thus a total of 31 studies was finally included in this review, comprised of 28 cohort studies (27-49, 52-76, 78, 82-84, 86-94, 96, 97, 100, 102-104) and 3 case-referent studies (77,79-81, 101). For most of the studies there was more than 1 publication, and the assessment of the methodological quality of these studies was based on the information provided in all the publications.

Methodological quality assessment
The scoring of the 28 cohort studies and the 3 case-referent studies led to an overall initial disagreement of 20% (95 of 476 items) and 25% (14 of 57 items), respectively. The 2 reviewers subsequently reached consensus on all the initial disagreements. Tables 1 and 2 show the cohort and case-referent studies on physical load as a risk factor for back pain in order of their methodological quality score. Eleven of the 28 (39%) cohort studies (28-45, 49, 52, 61-71, 73, 78, 82-84, 86, 100) and 2 of the 3 (67%) case-referent studies (77,79-81) had a positive score for over 50% of the criteria on validity and precision, and they were therefore considered to be of high quality.
Tables 3 and 4 give a detailed description of important aspects of the cohort and case-referent studies included in the review.

Table 2. Case-referent studies on physical load during work and leisure time as risk factors for back pain, ranked according to their methodological quality score.
Table note: The numbers refer to the numbers of the criteria in the list for the methodological quality assessment in the appendix: + = yes, - = no, ? = don't know.
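The percentages reported for the initial disagreement and for the share of high-quality studies follow directly from the counts given in the text. The short Python check below recomputes them; the helper name is hypothetical, and rounding to whole percentages is an assumption about how the figures were reported.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Initial disagreement between the 2 reviewers (counts from the text).
cohort_disagreement = percent(95, 476)
case_referent_disagreement = percent(14, 57)

# Share of studies rated as high quality (counts from the text).
cohort_high_quality = percent(11, 28)
case_referent_high_quality = percent(2, 3)

# Items scored per study, implied by the totals: 476 items over 28 cohort
# studies and 57 items over 3 case-referent studies.
cohort_items_per_study = 476 // 28
case_referent_items_per_study = 57 // 3
```

The item totals divide evenly by the number of studies, which suggests that 17 and 19 criteria per study were scored for the two designs, respectively; the criteria lists themselves are in the appendix.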

Physical load at work
Lifting: manual materials handling and patient handling
Four high-quality studies and 1 low-quality study reporting on the effect of manual materials handling were identified (53, 67, 77-79). Manual materials handling includes lifting, moving, carrying, and holding loads. Three high-quality studies found a statistically significant positive effect for manual materials handling (67, 77,79) and 1 high-quality study found no effect (78). According to these findings there is strong evidence for manual materials handling as a risk factor for back pain. The magnitude of the risk estimates [relative risk (RR) or odds ratio (OR)] ranged from 1.5 to 3.1.
Three low-quality studies examined the effect of patient handling (91,93, 102). Patient handling includes the lifting and moving of patients. All the studies found a statistically significant positive effect for patient handling. According to these results there is moderate evidence that patient handling is a risk factor for back pain. The magnitude of the risk estimates (RR or OR) ranged from 1.7 to 2.7.

Table 3. Summary of the cohort studies on physical load and back pain. (MQS = methodological quality score based on items on validity and precision, NS = not significant, RR = relative risk, OR = odds ratio)
Notes to Table 3:
- Some of the results of the multivariable analyses in the article(s) on this study were disregarded in the data abstraction because it was only reported that a factor did not enter the model in stepwise modeling.
- It is unclear if the analysis of risk factors in this study was really based on longitudinal data.
- More results of this study, for example, with different operationalizations of back pain, are presented in a more-detailed version of this table, which is available from the author. The results that are presented were used in the assessment of the levels of evidence.
- The article on this study does not make exactly clear how the exposure of persons without complaints was assessed, and therefore it is not possible to judge if the conducted statistical analysis is correct.
- Variables concerning physical load at work were also examined in this study, but disregarded in the data abstraction because the information was derived from an open question in which it was asked to report the work-related stressful factors that were experienced, three at most.
- The results for work-related risk factors for a multivariable analysis including occupation in the article on this study were disregarded in the data abstraction because of the possibility of overadjustment due to the high correlation between occupation and work-related risk factors.

Bending and twisting
Two high-quality studies reported on the effect of bending and twisting (79, 82). Both studies found a statistically significant positive effect for bending and twisting. According to these results there is strong evidence for a positive effect for bending and twisting. In the only study that presented an effect estimate, an odds ratio of 8.1 was found (79).

Standing or walking
Three high-quality studies determined the effect of prolonged standing or walking (49,67,78). One found a statistically significant positive effect for prolonged standing or walking (67), and 1 found no effect (78). The third study only reported that no statistically significant effect was found (49). According to these inconsistent results, there is no evidence for an effect of prolonged standing or walking.

Sitting
Two high-quality studies (49, 67) studied the effect of prolonged sitting. One found a statistically significant negative effect for women only (67), and the other only reported that no statistically significant effect was found (49). Therefore, no evidence was found for an effect of prolonged sitting.

Whole-body vibration
Three high-quality studies and 1 low-quality study examined the effect of driving a car (67,72, 77, 78) and a 4th high-quality study evaluated the effect of whole-body vibration (82). This latter study included a group of machine operators. The exposure of the back during machine operating is somewhat similar to that when driving a car, namely, low-frequency whole-body vibration in a seated position. Three high-quality studies found a statistically significant positive effect for this exposure (77,78, 82). One high-quality study found a nonsignificant positive effect of driving a car for women only (67). According to these results there is strong evidence that whole-body vibration is a risk factor for back pain. The 2 studies that presented an effect estimate found an odds ratio of approximately 4.8 (67,78).

Notes to Table 4 (case-referent studies):
- The article on this study presents two effect estimates for lifting of loads, both adjusted for confounders. One of the presented estimates was lower and nonsignificant due to the inclusion of severity of alarms, a variable that was highly correlated with lifting of loads and therefore disregarded in the data abstraction.
- More results of this study are presented in a more-detailed version of this table, which is available from the author. The results that are presented were used in the assessment of the levels of evidence.

Heavy physical work
Finally, there were several studies which did not study specific aspects of physical load, but evaluated physical activity in the workplace in general. Five high-quality and 6 low-quality studies reported on the effect of this exposure (27,29,38,60,61,73,74,77,94,96,104). Since 4 of the high-quality studies only reported that no statistically significant effect was found (29,38,61,73), assessment of the consistency of the evidence for this exposure was based on the combined results of the 1 high-quality and the 5 low-quality studies that reported an estimate or the direction of the effect found. One study found a nonsignificant negative effect for heavy physical work (104). Five studies showed that a high level of physical activity had a statistically significant positive effect (60, 74, 77, 94,96). According to these results there is moderate evidence that a high level of physical activity is a risk factor for back pain. The magnitude of the risk estimates (RR or OR) ranged from 1.5 to 9.8.
The studies differed somewhat in the timing of the exposure. The effect of a cumulative work load (94), the effect of short-term physical exertion (77), and the effect of the current physical work load at baseline (60, 74,96) were examined. It was, however, not possible to draw separate conclusions for the cumulative and short-term effects of heavy physical work.

Sensitivity analysis
For manual materials handling, patient handling, bending and twisting, whole-body vibration, and heavy physical work it was concluded that there was (strong or moderate) evidence of an effect. Considering all the studies that found a nonsignificant effect for these exposures to indicate no effect did not change the conclusions for manual materials handling, patient handling, bending and twisting, or whole-body vibration. For heavy physical work this assumption would mean that 6 studies indicated no effect, and 5 studies indicated a positive effect, and therefore the conclusion would be drawn that there is no evidence for an effect of heavy physical work, due to inconsistent findings.
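The change in conclusion for heavy physical work can be reproduced from the study counts given in the text and the 75% consistency rule from the methods. The Python sketch below uses only those counts; the function name and the dictionary layout are assumptions made for illustration.

```python
def is_consistent(counts, threshold=0.75):
    """Return True if at least `threshold` of the studies share one conclusion.

    `counts` maps a conclusion ("positive", "none", "negative") to the
    number of studies that reported it.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no studies to evaluate")
    return max(counts.values()) / total >= threshold


# Main analysis for heavy physical work: statistical significance ignored,
# studies reporting only nonsignificance excluded (5 positive, 1 negative).
main_analysis = {"positive": 5, "negative": 1}

# Sensitivity analysis: every nonsignificant finding counts as no effect
# (6 studies indicating no effect, 5 indicating a positive effect).
sensitivity_analysis = {"none": 6, "positive": 5}

main_consistent = is_consistent(main_analysis)                # 5/6 agree
sensitivity_consistent = is_consistent(sensitivity_analysis)  # 6/11 agree
```

In the main analysis 5 of 6 studies (83%) agree, so the findings are consistent. Under the sensitivity assumption the largest group is 6 of 11 studies (55%), which falls below 75% and yields the "no evidence" conclusion stated above.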

Sports
Six high-quality and 5 low-quality studies examined the effect of sports activities (52, 60, 62, 73, 76-78, 82, 87, 96, 101). Two high-quality studies that only reported that no statistically significant effect was found and 1 high-quality study that reported a significant effect, but did not show the direction of the effect, were excluded from the evaluation of the evidence (52,73,78). Of the remaining high-quality studies, one found a statistically significant positive effect for physical activity (82), one found a statistically significant negative effect among men and no effect among women (62), and one found a nonsignificant negative effect (77). According to these inconsistent results there is no evidence for an effect of sports activities.

Total physical activity during leisure time
Four high-quality studies and 1 low-quality study examined the effect of total physical activity during leisure time (62,75,77, 86, 100). Total physical activity during leisure time includes sports activities and other physical activities such as gardening, walking, traveling to and from work, and housework. One high-quality study found a statistically significant negative effect for off-duty activities (77). The other high-quality studies only reported that no statistically significant effect was found (62, 86, 100).
One high-quality study and 2 low-quality studies examined the effect of physical activity, but did not make it explicitly clear whether this only involved sports or exercise or also included other leisure-time physical activities. The high-quality study only reported that no statistically significant effect was found (29,46). According to these results there is no evidence for an effect of total physical activity during leisure time.

Specific sports and physical activities during leisure time
Four studies determined the effect of participation in specific sports, namely, golf (49), cycling (96) and athletic training (55,103). No evidence was found that any of these were risk factors because either there was only 1 study available (49,96) or the findings were inconsistent (55,103).
Two high-quality studies focused on the effect of driving a car during leisure time. One study (82) found no statistically significant effect for annual car driving (total kilometers) and the other found no effect of driving more than 25 miles (40 km) in the off-duty period before the report of low-back pain (77). According to these results there is no evidence for an effect of driving a car during leisure time.

Selection of studies
Although a systematic approach with a large variation of key words was used and references of selected articles were screened to identify all the available literature, the possibility of selection and publication bias cannot be excluded.
An important difference between this review and previously published reviews on the same topic is the exclusion of studies with a cross-sectional design. The main argument for the exclusion of this type of study is that temporality, the only unarguable and therefore necessary criterion for causality (105), is not met in cross-sectional studies, in which exposure and outcome are assessed simultaneously. Cohort studies were only included if the follow-up period was at least 1 year. The major reason for this restriction was that the follow-up needs to be long enough to record sufficient cases of back pain.
In addition, the choice was made to include studies with a fairly broad spectrum of outcome measures. This can lead to contradictory findings if the effect of an exposure is specific to certain categories of the outcome but, on the other hand, maximum power can be achieved (106). Since symptoms, reports of back pain, sick leave, medical consultation, or treatment and disability due to back pain are all part of a continuum, it was assumed that any factor that causes the back pain itself will have an effect on all these outcome measures. However, some factors may not only affect the development, but also the prognosis of back pain. Eventually, in most studies the assessment of the outcome was based on symptom reporting, mainly due to the lack of generally accepted criteria for an objective clinical diagnosis of back pain. Unfortunately, the operationalization of back pain based on the symptom reporting used in the included studies did not make it possible to examine the risk factors for different groups of back pain, classified on the basis of characteristics such as the duration, frequency, intensity, and localization of the pain (107). Studies with a diagnosed herniated lumbar disc as the outcome measure were excluded, because a separate review of risk factors for this more homogeneous disease entity was regarded as more appropriate.

Assessment of evidence
The main difference between this review and previously published reviews on the same topic is the application of a systematic approach which includes explicitly defined criteria, on which the conclusions on the strength of the evidence were based. The review could only be qualitative, because, in many of the studies reviewed in this paper, quantitative measures of effect were missing for at least some of the exposures of interest. Moreover, the methods used to measure exposure are often so different that it is not possible to compare the evaluated contrasts of exposure.
Scoring the quality of a study plays an important role in the assessment of the strength of the evidence. However, it is only meant to distinguish between high- and low-quality studies. Criteria lists adapted from lists used in the clinical literature and in other reviews of observational studies were used to assess the methodological quality of the studies. As in the clinical literature, it is still unclear which items are especially important because of the influence of bias (16). One of the specific problems encountered in this review of observational epidemiologic studies, compared with reviews of clinical trials which usually evaluate only 1 contrast, is the fact that the relatively broad objective of this review and most of the evaluated studies resulted in a relatively nonspecific list of criteria. As the evidence for more than 1 exposure per study was evaluated, it was not possible to include a criterion on the power of each individual study. The most appropriate solution to the problem raised would be a series of reviews, each focusing on 1 specific risk factor. Only for such reviews could really specific criteria lists be developed. However, an advantage of a review like ours, with its broader focus over reviews with a more specific focus, is that it offers the possibility to compare the evidence found for different risk factors.
Criterion 5, which only pertains to the list for case-referent studies, may sound contradictory, because the exclusion of subjects with recent back pain from the reference group may be considered incompatible with the requirement that cases and referents have to be drawn from the same population. The criterion reflects that, on one hand, it is important that cases and referents be drawn from the same population and selected independently of their exposure status to make sure that the referents are representative of the source population with respect to exposure, while, on the other hand, there has to be a clear contrast between cases and referents with respect to the disease in question. For recurrent outcomes like back pain this is more difficult than for diseases like cancer. With the exclusion of subjects with low-back pain during the previous 90 days from the reference group, one can make sure that there is a real contrast in disease status between cases and referents.
Another problem which arises from the rating system applied in this review is that the synthetic approach can give a false impression of consistency across study results, because all the studies were prone to a common systematic error (106), such as (residual) confounding. With regard to the definition of the levels of evidence applied in this review, it could be argued that the conclusion could be limited evidence instead of no evidence if only 1 study evaluated the exposure. This procedure was decided against because the consistency of results, an important aspect of the definition of the other levels of evidence in this review, cannot be evaluated on the basis of 1 single study.
In spite of the limitations of defining levels of evidence, it was thought that this approach was appropriate in the present qualitative review. One important advantage is that the reader is given a lot of insight into the process used to assess the evidence. There is also the possibility to repeat the analysis and to examine how the conclusions are influenced if slight changes are made in the assessment of the findings or the methodological quality of the studies. The sensitivity analysis already included the effect of a different way of dealing with nonsignificant findings. Another means of assessing the methodological quality of a study is to use another cut-off point for the assessment of high-quality studies. The use of a cut-off point of 40% for the assessment of high-quality studies leads to an increase in the number of high-quality studies, which, in turn, leads to strong instead of moderate evidence for the effect of patient handling and heavy physical work. This change would not influence the conclusions with regard to the other exposures, and the use of a cut-off point of 60% does not affect the conclusion for any exposure. Moreover, the results of the review are rather insensitive to the exclusion of the items on the assessment of different exposures (items 8-13) from the criteria list for the methodological quality assessment. Thus the conclusions drawn in this review are also rather insensitive to a slightly different assessment of high- and low-quality studies.

Quality of the studies
An examination of the scoring of studies on the various items shows that all the studies had a clearly defined objective. However, this objective did not always include an examination of the exposures of interest in this review (53,55,73,86,104). Twenty percent of the studies failed to describe the main features of the study populations, and very few studies used standardized methods of acceptable quality for the assessment of physical load at work and back pain. Furthermore, the rate of participation at baseline was less than 80% in approximately two-thirds of the studies. Some 60% of the cohort studies collected data on the outcome at least every 3 months, most of which used registered data, and many of these studies did not report on the loss to follow-up in their registration system. Three cohort studies did not collect data on the occurrence of back pain for at least 1 year, although the follow-up period was at least 1 year. These studies used the point prevalence of back pain at the end of follow-up as an outcome measure (46,75,94). Due to the low number of case-referent studies, it was not possible to present data on the scoring of the specific criteria for this study design.
There are also a few aspects of the quality of the studies that were not included in the criteria list, but were observed during the scoring of the studies. Hardly any studies included repeated measurements of the exposure, although there were many studies with an extremely long follow-up period during which the exposure easily could have changed considerably (27,60,61,72,100,103). Moreover, some of these studies did not assess the occurrence of back pain for the entire follow-up period (61, 62, 100, 103). The studies included in the review do not provide much insight into the effect of adjustment for certain covariates, because only a few studies showed the effect estimate for a certain exposure with and without adjustment for covariates (79, 102, 103).

Evidence for aspects of physical load during work as risk factors for back pain
For manual materials handling, bending and twisting, and whole-body vibration it was concluded that there was strong evidence for an effect. For patient handling and heavy physical work it was concluded that there was moderate evidence for an effect. To exclude the possibility of a false impression of consistency of the findings, the potential lack of controlling for likely important confounders was examined for these exposures. In general, only a few studies on lifting, bending and twisting, driving or whole-body vibration, and heavy physical work had adjusted for other physical and psychosocial factors at work. None of the studies had adjusted for physical load during leisure time. The effect of driving a car has been attributed to whole-body vibration on the one hand and to prolonged sitting on the other. However, none of the studies on driving a car or whole-body vibration had adjusted for prolonged sitting.
In the sensitivity analysis, instead of moderate evidence, no evidence was found for the effect of heavy physical work. For this exposure 6 studies only reported that no statistically significant effects were found. However, it is debatable whether these studies can lead to the conclusion that there is no effect. In the original papers of 4 studies (29,38,73,104) it was emphasized that physical work load factors could not be effectively studied, due to the method of selection of the subjects in combination with the nonspecific method used for the assessment of the exposure. In the 2 other studies, the absence of an effect could be explained by the relatively long follow-up period, which probably coincided with changes in exposure (27,61). In addition, the effect of physical load was analyzed separately for white- and blue-collar workers, and the occurrence of back pain was only assessed for the last 12 months of the 10-year follow-up period (61).
For standing or walking, it was concluded that there was no evidence because of the contradictory findings. The only study that found an effect had only adjusted for prior back pain (67), and the study in which no effect was found had also adjusted for other aspects of physical load at work (78). However, it is debatable whether this difference in study results indicates the presence of confounding. The absence of an effect in the second study could also have been caused by the combination of a population of persons with similar work conditions and a badly defined measure of exposure, namely, a yes-no question about frequent prolonged standing. No evidence was found for an effect of sitting because the available information was too limited.
Prolonged sitting and standing are both assumed to be risk factors for back pain because, among other things, they are both aspects of static load. Prolonged working in awkward postures is also an aspect of static load. However, with regard to sitting, standing, and working in awkward postures, none of the studies adequately evaluated the static effect of these exposures. Appropriate measurements for static load of the trunk, which should preferably be included in future studies, are the total duration of working continuously in a certain posture for longer than a certain period of time and the number of changes in posture during a workday for all parts of the body separately and combined.
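The two static-load measures proposed above can be sketched for a single body part as follows. This is only an illustration under stated assumptions: the posture labels, the 30-minute threshold, and the representation of a workday as a sequence of (posture, minutes) bouts are hypothetical choices, not methods used in any of the reviewed studies.

```python
from typing import List, Tuple

Bout = Tuple[str, float]  # (posture label, duration in minutes); hypothetical format

def merge_adjacent(bouts: List[Bout]) -> List[Bout]:
    """Merge consecutive bouts with the same posture into one continuous bout."""
    merged: List[Bout] = []
    for posture, minutes in bouts:
        if merged and merged[-1][0] == posture:
            merged[-1] = (posture, merged[-1][1] + minutes)
        else:
            merged.append((posture, minutes))
    return merged

def sustained_duration(bouts: List[Bout], posture: str, min_minutes: float) -> float:
    """Total minutes spent in `posture` during continuous bouts longer than `min_minutes`."""
    return sum(
        minutes
        for label, minutes in merge_adjacent(bouts)
        if label == posture and minutes > min_minutes
    )

def posture_changes(bouts: List[Bout]) -> int:
    """Number of transitions between different postures over the recorded period."""
    return max(len(merge_adjacent(bouts)) - 1, 0)

# Hypothetical observation record for the trunk during part of a workday.
workday = [
    ("sitting", 50), ("standing", 10), ("sitting", 20),
    ("sitting", 45), ("trunk_flexed", 5), ("standing", 40),
]

print(sustained_duration(workday, "sitting", min_minutes=30))
print(posture_changes(workday))
```

Applying the same functions to a separate record for each body part, and to a combined record, would give the "separately and combined" measures mentioned in the text.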

Evidence for aspects of physical load during leisure time as risk factors for back pain
There appeared to be no evidence for an effect of sports, due to inconsistent findings. The available studies differed in their individual definition of back pain, the composition of the study population, the control for confounding, and also the time-period between the measurement of exposure and back pain. Moreover, no evidence was found for an effect of total physical activity during leisure time, various specific sports, or other physical activities during leisure time.
One important aspect of all the studies on physical activity during leisure time was that the operationalization of physical activity in the studies differed and was, in general, not very specific. It has been concluded that in epidemiologic studies on the role of physical activity in the etiology of diseases, the type, intensity, frequency, and duration of physical activity should be addressed and the measurement method should be in agreement with the disease in question (108). The methods used in most of the studies included in this review do not meet these criteria. Therefore, it may be worthwhile to develop new methods to measure physical load during leisure time and to evaluate more adequately the effect of this exposure. If this process results in a method involving operationalizations that correspond to the measurements of physical load at work, it may also enhance the possibility to study these exposures simultaneously.

Comparison with the results of previous reviews
It is interesting to see how the conclusions of this review compare with the conclusions of 2 other recently published reviews on the same topic (9,11). With regard to the work-related physical factors, it appears that there is no significant difference in the conclusions. Both reviews (9,11) conclude that there is evidence for an effect of lifting, bending and twisting, whole-body vibration, and heavy physical work. Burdorf & Sorock (11) also concluded that the evidence for exercise and sports is contradictory.

Concluding remarks and recommendations
According to the literature reviewed in this paper, there is moderate evidence that patient handling and heavy physical work are risk factors for back pain, and strong evidence that manual materials handling, bending and twisting, and whole-body vibration at work are risk factors for back pain. However, to determine the priorities for intervention in the workplace, it is also important to be aware of the magnitude of the effect of the various risk factors. For the purpose of evaluation, future studies should include quantitative measurements of exposure and report effect measures that reflect the risk of equivalent levels of contrast in exposure, measured in a comparable way. This procedure would make it possible to quantify the role of different risk factors in a meta-analysis.
For standing or walking, sitting, and various aspects of physical load during leisure time it was concluded that there was no evidence of an effect. For these risk factors, further well-designed research is needed if a conclusion is to be drawn on the presence or absence of an effect of these factors. With regard to physical load at work, adequate measures of static load must be related to the occurrence of back pain. Appropriate methods must also be developed to measure the relevant aspects of physical load during leisure time, and the combination of exposure to physical load during work and leisure time should also be addressed.
The results of this review are rather insensitive to slight changes in the assessment of the findings and methodological quality of the studies. The application of a systematic approach, adapted from the evaluation of randomized controlled trials on intervention for back pain, in the review of observational epidemiologic studies is shown to be worthwhile, notwithstanding the problems encountered.

Assessment of back pain
17. Positive if based on standardized methods of acceptable quality, namely, positive if one of the following criteria was met: (i) self-reported: data presented or in reference showed that the intraclass correlation coefficient was >0.60 or the kappa was >0.40 for the test-retest reliability; (ii) registered data: data presented or in reference demonstrate that the registration system is valid and reliable; (iii) physical examination blinded with respect to exposure status: data presented or in reference showed that the intraclass correlation coefficient was >0.60 or the kappa was >0.40 for the intraobserver reliability if only 1 observer is involved or the interobserver reliability if >1 observer is involved. If no intraclass correlation coefficient or kappa had been computed, but the data presented showed clearly that the reliability of the method was good, this criterion was also rated positively.
18. Positive if the time-period on which the assessment of back pain was based was at least one year
19. Positive if data were collected at least once every three months or obtained from a continuous registration system
20. Positive if incident cases were included (prospective enrollment)
Analysis and data presentation
21. Positive if the method used for the statistical analysis was appropriate for the outcome studied and the measures of association estimated according to this model (including confidence intervals) were presented
22. Positive if the analysis included a stratified or multivariable analysis
23.
This criterion was rated positively if one of the following criteria was met: direct measurement method: data presented or in reference showed that the intraclass correlation coefficient was >0.60 or the kappa was >0.40; observational method: data presented or in reference showed that the intraclass correlation coefficient was >0.60 or the kappa was >0.40 for the intraobserver reliability if only 1 observer was involved or the interobserver reliability if more than 1 observer was involved; self-reported: data presented or in reference showed that the intraclass correlation coefficient was >0.60 or the kappa was >0.40 for the test-retest reliability. If no intraclass correlation coefficient or kappa had been computed, but the data presented showed clearly that the reliability of the method was good, this criterion was also rated positively.
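As an illustration of the kappa part of the reliability criterion, the following minimal sketch computes Cohen's kappa for two sets of categorical ratings (for example, two observers, or test and retest) and checks it against the 0.40 threshold. The ratings are hypothetical, and the use of unweighted Cohen's kappa is an assumption, since the criteria list does not specify which kappa statistic the original studies reported.

```python
from collections import Counter
from typing import Hashable, Sequence

def cohen_kappa(ratings_1: Sequence[Hashable], ratings_2: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(ratings_1) != len(ratings_2) or len(ratings_1) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_1)
    # Observed proportion of agreement.
    observed = sum(a == b for a, b in zip(ratings_1, ratings_2)) / n
    # Agreement expected by chance, from the marginal frequencies of each rating set.
    counts_1, counts_2 = Counter(ratings_1), Counter(ratings_2)
    categories = set(counts_1) | set(counts_2)
    expected = sum(counts_1[c] * counts_2[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both sets contain a single identical category; agreement is complete.
        return 1.0
    return (observed - expected) / (1.0 - expected)

def meets_kappa_criterion(kappa: float, threshold: float = 0.40) -> bool:
    """True if kappa exceeds the threshold named in the criteria list."""
    return kappa > threshold

# Hypothetical yes/no (1/0) ratings of back pain from two observers.
observer_1 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
observer_2 = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

kappa = cohen_kappa(observer_1, observer_2)
print(round(kappa, 2), meets_kappa_criterion(kappa))
```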