Recommendations for individual participant data meta-analyses on work stressors and health outcomes: comments on IPD-Work Consortium papers

The IPD-Work (individual-participant data meta-analysis of working populations) Consortium has published several papers on job strain (the combination of low job control and high job demands) based on Karasek's demand-control model (1) and health-related outcomes including cardiovascular disease (CVD), cancer, obesity, diabetes as well as health-related behaviors, utilizing meta-analyses of a pooled database of study participants from 17 European cohorts. An IPD approach has some advantages over typical meta-analyses, eg, having access to all the data for each individual allows for additional analyses, compared to typical meta-analyses. However, such an approach, like other meta-analyses, is not free from errors and biases (2-6) when it is not conducted appropriately.
In our review of the IPD-Work Consortium's (hereafter called the Consortium) publications of the last two years, we have identified and pointed out several conceptual and methodological errors, as well as unsubstantiated conclusions and inappropriate recommendations for worksite public health policies (6-15). However, the Consortium has not yet appropriately addressed many of the issues we have raised. Also, several major errors and biases underlying the Consortium IPD meta-analysis publications have not been presented in a comprehensive way, nor have they been discussed widely among work stress researchers. We are concerned that the same errors and biases could be repeated in future IPD Consortium meta-analysis publications as well as by other researchers who are interested in meta-analyses on work stressors and health outcomes. It is possible that the inappropriate interpretations in the Consortium publications, which have remained uncorrected to date, may have a negative impact on the international efforts of the work stress research community to improve the health of working populations.
Recently, Dr. Töres Theorell, a principal investigator of the Consortium, responded in this journal (16) to some of our criticisms on the Consortium papers (17,18). The purpose of this article is to discuss the methodological and substantive issues that remain to be resolved and how they could be addressed in future analyses. We provide recommendations for future IPD or typical meta-analyses on work stressors and health outcomes. Finally, we discuss the inappropriate conclusions and recommendations in the Consortium publications and provide alternative recommendations, including a comprehensive perspective on worksite intervention studies.

Part 1: Unresolved methodological issues and recommendations for future research
Theorell's commentary (16) is largely consistent with our criticisms (9,12,14) of recent Consortium publications: the error of equating job strain with workplace stressors in general; ignoring the interrelationships between job strain and health-related behaviors; ignoring emerging evidence of the beneficial effects of organizational and task-level interventions on the health of working populations; the limitations of a one-time measure of job strain in most publications; and overgeneralization of the findings from the publications beyond the countries and cultures from which the Consortium cohort data originated. The methodological issues discussed in this section have not been appropriately addressed in either the Consortium analyses or the Theorell commentary. The overall net effect of the limitations and errors discussed in this section is a tendency to bias the apparent associations between work stressors and health outcomes towards the null, which will also underestimate the population attributable risk (PAR) of work stressors for health outcomes. In turn, these biases contribute to errors in interpretation of the findings discussed in the second section of this paper.

Need to follow appropriate guidelines for the reporting of IPD meta-analysis of observational data
The internal and external validity of an IPD or typical meta-analysis of observational studies largely depends on the quality of the individual studies as well as how appropriately investigators conduct the review, quantification, and characterization of the results of individual studies. Thus, many medical and public health journals have adopted standard guidelines for the reporting of meta-analysis of observational data such as PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) (19) and MOOSE (Meta-analysis Of Observational Studies in Epidemiology) (5). Adherence to the guidelines is essential because it provides consistent and transparent information about the assembly of the data and the conduct of the meta-analyses to editors, reviewers, and readers. In addition, it relates to ethical practice because inadequate reporting can give false credibility to biased results (20).
However, most of the Consortium publications on job strain and health outcomes have not adhered to the guidelines for the reporting of their meta-analyses. In one recent IPD publication by Heikkila et al (21) on job strain and tobacco smoking, the authors reported following the MOOSE guideline. However, in the paper they failed to provide basic information on each individual study, such as follow-up participation rates and attrition rates by job strain.
Furthermore, the standard for reporting meta-analyses based on the IPD approach should be even more comprehensive than the guidelines, such as PRISMA and MOOSE, for meta-analyses of aggregated data. A primary reason is that the IPD approach requires that the investigators gain access to the individual-level data for each of the included studies, while traditional meta-analysis is based on analysis of published articles only. For example, Riley et al (2) provide a checklist of items that will significantly help reviewers, editors, and readers alike assess the methodological quality of IPD meta-analyses by the Consortium: (i) The process used to identify relevant studies for the meta-analysis; (ii) How many authors (or collaborating groups) were approached for IPD and the proportion that provided such data; (iii) The number of authors who did not provide IPD, the reasons why, and the number of patients (and events) in the respective study; (iv) Whether those authors who provided IPD gave all their data or only a proportion; if the latter, then describe what information was omitted and why; (v) Whether there were any qualitative or quantitative differences between those studies providing IPD and those studies not providing IPD (if appropriate); and (vi) Whether the IPD results for each study were comparable with the published results, and, if not, why not (for example, IPD contained updated or modified information).
We recommend that the Consortium and other researchers follow the guidelines for the reporting of IPD meta-analysis of observational data such as those proposed by Riley et al (2).

Comparability of different measures of exposure
Heterogeneity in measures of work stressors has been identified as a barrier to drawing strong conclusions about the associations between work stressors and health outcomes in meta-analyses (16,22-25) and to calculating the PAR of work stressors for health outcomes (9,17,26-28). The Consortium had to address substantial heterogeneity in the measures of work stressors across the European cohorts of the Consortium (25,29). For example, despite the fact that the Job Content Questionnaire (JCQ) and the Demand-Control Questionnaire (DCQ) were based on the same work stress model (1,30), the two questionnaires for job control and job demands differ in the number of items, item wording, scale formula, and item response set (31,32). Job strain was measured with the standard scales of the JCQ or the DCQ in only 6 of the 17 cohort studies of the Consortium. Only some standard JCQ or DCQ or similar items for job control or job demands (called hereafter partial or proxy scales) were available in 11 of the 17 cohort studies (25). The Consortium attempted to indirectly test how comparable the partial or proxy measures in their 11 cohort studies would be with the standard scales of the JCQ or the DCQ in their 6 cohort studies (called "harmonization process of job strain measures" by the Consortium). However, we have identified two significant errors: one in their meta-analysis based on the harmonization process and one in the harmonization process itself. First, several individual cohort studies were included in their IPD meta-analyses that did not qualify as having a "harmonized" job strain measure according to the criterion developed by the Consortium. The authors stated in a methodological paper that "job strain indices based on one complete and one partial scale, seemed to assess the same underlying concepts as the complete survey instruments" (25, p1).
However, 4 cohorts (DWECS, NWCS, POLS, and Still Working) of the 13 in the article by Kivimäki et al (17) did not meet the criterion (ie, job strain was measured with one incomplete job control scale and one incomplete job demand scale). We calculated a stronger association between job strain and coronary heart disease (CHD) than appeared in Kivimäki et al (17) when including only the 9 qualifying cohorts [hazard ratio (HR) 1.32; with all 13 cohorts, HR 1.23]. When one excludes data from the 4 cohorts that do not meet the "qualification criterion", as stated in the Kivimäki et al paper, the PAR of job strain for CHD increases from the reported 3.4% to 4.9% (10).
We also found an error in the harmonization process of job strain measures across the European cohort data (32). The Consortium developed an approach for creating comparable job strain groups between the two questionnaires as part of the harmonization process of job strain measures across the data from the 17 cohorts in the following way: (i) they dropped three job control items from some of their cohort data [eg, data from the Belstress (33) and GAZEL (34) studies] in which job control had been assessed with the standard nine JCQ control items in order to make the number of items for job control the same as in the DCQ; (ii) they then used simple summation-based scale formulas rather than standard JCQ scale formulas for job control and job demands; and (iii) they defined high job strain based on the medians of the job control and job demands scores. The Consortium has assumed this approach to be free of major errors in their meta-analyses (17,18,35-37). Our recent analysis with a dataset from a random population sample of middle-aged Malmö men and women (32), who were given a questionnaire with the 14 JCQ and 11 DCQ items for job control and demands, indicated two major weaknesses of the Consortium approach compared to using the standard JCQ scale formulas for job control and demands: a lower (5-7%) prevalence of job strain and a lower agreement percentage of job strain between the JCQ and the DCQ. This suggests that the Consortium approach is likely to have resulted in an underestimation of the prevalence of job strain and a weaker association between job strain and health outcomes due to greater misclassification for job strain, as well as a lower PAR for health outcomes in their meta-analyses.
In addition, false negatives for job strain between the two questionnaires were much greater than false positives (37-49% versus 7-13%) (31,32). That is, there is a higher likelihood of misclassification for the job strain group between the two questionnaires than for the non-job strain group. This implies that the results of meta-analyses using job strain data with either the JCQ or the DCQ as in the meta-analyses of the Consortium's cohorts are likely to underestimate associations between job strain and health outcomes due to the differential misclassification of job strain exposure.
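The direction of this bias can be illustrated numerically. The following is a minimal sketch under stated assumptions, not a reanalysis of any cohort: it treats one questionnaire as the reference classification, assumes the misclassification is unrelated to the outcome, and uses hypothetical values for the true prevalence (20%) and the true risk ratio (2.0). The false-negative and false-positive rates are illustrative values chosen from within the ranges quoted above.

```python
def observed_risk_ratio(true_prev, true_rr, false_neg, false_pos):
    """Risk ratio observed after exposure misclassification.

    true_prev : true prevalence of exposure (hypothetical input)
    true_rr   : true risk ratio, exposed versus unexposed (hypothetical input)
    false_neg : share of truly exposed people classified as unexposed
    false_pos : share of truly unexposed people classified as exposed
    The baseline risk cancels out of the ratio, so it is set to 1.
    """
    exposed, unexposed = true_prev, 1.0 - true_prev

    # People classified as exposed: true positives plus false positives.
    tp = exposed * (1.0 - false_neg)
    fp = unexposed * false_pos
    # People classified as unexposed: false negatives plus true negatives.
    fn = exposed * false_neg
    tn = unexposed * (1.0 - false_pos)

    # Average risk in each classified group (truly exposed carry true_rr).
    risk_classified_exposed = (tp * true_rr + fp) / (tp + fp)
    risk_classified_unexposed = (fn * true_rr + tn) / (fn + tn)
    return risk_classified_exposed / risk_classified_unexposed


# Illustrative values only: 43% false negatives, 10% false positives.
attenuated = observed_risk_ratio(0.20, 2.0, false_neg=0.43, false_pos=0.10)
print(f"true risk ratio 2.00, observed risk ratio {attenuated:.2f}")
```

Under these assumptions the observed risk ratio always lies between 1 and the true value, which is the attenuation towards the null described in the text.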
The measures of effort-reward imbalance (ERI) were also heterogeneous across the 15 Consortium cohorts in the number of items, item wording, and item response set (7,29). Full ERI scales (based on the short version of the ERIQ) (38) were available in only 5 cohorts, while only partial or proxy ERI scales were available in 10 cohorts. We have previously disagreed with the conclusion of the Consortium that the partial or proxy ERI scales in the 10 cohorts were validated due to a high degree of heterogeneity in terms of item wording and response set between the partial or proxy ERI scales and the standard ERI scales; low content validity of the partial or proxy ERI scales; and low sensitivity for ERI in some of the 10 cohorts with the partial or proxy scales (7).
We recommend that the Consortium exclude unqualified job strain and ERI data in their future meta-analyses.
In addition, the Consortium should be aware of the errors and weaknesses in their harmonization processes of the measures of job strain and ERI. The Consortium needs to at least conduct sensitivity tests for examining the impact of the heterogeneity of their measures of work stressors in their future meta-analyses, for example, comparison of their meta-analysis results by the JCQ versus the DCQ (for job strain) or full versus partial or proxy scales (for both job strain and ERI). More methodological studies are needed to examine the comparability of different measures of job strain and ERI in terms of exposure prevalence and the associations between work stressors and health outcomes as well as cross-cultural (national) measurement equivalence (39,40).
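One way to carry out such a sensitivity test is to pool the cohort-specific estimates separately within each subgroup and compare the pooled results. The sketch below uses fixed-effect inverse-variance pooling of log hazard ratios, which is an assumption made for illustration and not necessarily the model used by the Consortium. All cohort values in it are hypothetical.

```python
import math


def pool_fixed_effect(studies):
    """Inverse-variance pooled HR with a 95% confidence interval.

    studies: list of (hr, ci_low, ci_high) tuples, one per cohort.
    The standard error of each log HR is recovered from its 95% CI.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for hr, ci_low, ci_high in studies:
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(hr)
        total_weight += weight
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - 1.96 * pooled_se),
        math.exp(pooled_log + 1.96 * pooled_se),
    )


# Hypothetical cohort-level results, grouped by type of job strain measure.
subgroups = {
    "standard scales": [(1.40, 1.05, 1.87), (1.30, 0.95, 1.78)],
    "partial or proxy scales": [(1.15, 0.90, 1.47), (1.10, 0.80, 1.51)],
}
for label, studies in subgroups.items():
    hr, low, high = pool_fixed_effect(studies)
    print(f"{label}: pooled HR {hr:.2f} (95% CI {low:.2f}-{high:.2f})")
```

A clear difference between the subgroup estimates would suggest that the choice of measure affects the pooled association.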
How to define the exposure group?
The comparability of the work stressor exposure group between different instruments varies to some extent by how the exposure group is defined in meta-analyses. For example, Karasek et al (31) demonstrated that the sensitivity for job strain of the DCQ against the JCQ improved to some extent when the job strain group was defined based on tertiles or quartiles of the job control and demands scores without the group of workers close to the medians or means of job control and demands scores, which is the group most vulnerable to misclassification of job strain. Thus Karasek et al (31) recommended using tertile- or quartile-based job strain definitions, in particular when greater sensitivities for job strain between the two questionnaires are needed in epidemiological studies. The quartile-based job strain definition was also more strongly associated with leisure-time physical activity (41) and had a higher PAR for mental health (42) than the median-based job strain definition. However, the Consortium defined and tested job strain using only two common methods based on the medians of job control and job demands scores in their meta-analyses: (i) two groups (job strain versus non-job strain) and (ii) four groups (high strain, low strain, passive, and active).
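The difference between the two kinds of definition can be sketched as follows. This is a simplified illustration with hypothetical scores; the handling of ties and the exact rule for setting aside workers near the middle of the distributions are assumptions, not the published scoring rules.

```python
from statistics import median, quantiles


def median_based_strain(control, demands):
    """Job strain = control below its median and demands above its median."""
    c_med, d_med = median(control), median(demands)
    return [c < c_med and d > d_med for c, d in zip(control, demands)]


def quartile_based_strain(control, demands):
    """Job strain = lowest quartile of control and highest quartile of demands.

    Workers whose control or demands score lies between the first and third
    quartiles are labelled 'excluded', since they are the group most
    vulnerable to misclassification.
    """
    c_q1, _, c_q3 = quantiles(control, n=4)
    d_q1, _, d_q3 = quantiles(demands, n=4)
    labels = []
    for c, d in zip(control, demands):
        if c_q1 < c < c_q3 or d_q1 < d < d_q3:
            labels.append("excluded")
        elif c <= c_q1 and d >= d_q3:
            labels.append("strain")
        else:
            labels.append("non-strain")
    return labels


# Hypothetical scale scores for eight workers.
control = [1, 2, 3, 4, 5, 6, 7, 8]
demands = [8, 7, 6, 5, 4, 3, 2, 1]
print(median_based_strain(control, demands))
print(quartile_based_strain(control, demands))
```

In this toy example the median split labels four workers as exposed, while the quartile rule keeps only the two most extreme workers in the job strain group and sets aside those near the middle.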
This issue is not limited to the definition of job strain. There is no clear cut-point for defining ERI with the short version of the ERIQ (38,43), while there is an official cut-point (>1.0 of the ratio of effort to reward) with the original version of the ERIQ (44). The short version has fewer items for effort and reward (3 effort and 7 reward items) than the original (5 effort and 11 reward items) and a different response set (a 4-point Likert response set versus a 2-step, 5-point response set in the original version). In the seminal paper on the short version (38), the authors defined ERI based on the quartiles of the effort-reward ratio score to examine the association between ERI and self-reported health. However, in the methodological paper on different measures of ERI across the 15 European cohorts, the Consortium used the cut-point (>1.0) to define and compare indirectly the full ERI scales (based on the short version) with the partial or proxy ERI scales using the five European cohorts.
A recent Japanese study compared the original and short versions of the ERIQ in terms of the agreement and prevalence of ERI after applying the same cut-point (>1.0) (43). The agreement level between the two versions was low, and the prevalence was very different (63.2% with the short version versus 18.9% with the original). This indicates that the Consortium approach for defining ERI (using the cut-point of >1.0) may overestimate the real prevalence of ERI in the five cohorts of the Consortium with the full ERI scales (based on the short version). Also, it is not yet certain whether the same cut-point is applicable to the partial or proxy ERI scales available in the ten cohorts of the Consortium, which differ from the full scales not only in the number of items but also in wording and response sets.
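The two ways of defining ERI discussed here can be sketched as follows. The ratio is computed with the usual correction for unequal numbers of effort and reward items; the respondent scores are hypothetical, and the top-quartile rule is one possible percentile-based definition, used here as an assumption.

```python
from statistics import quantiles


def eri_ratio(effort_sum, reward_sum, n_effort_items, n_reward_items):
    """Effort-reward ratio corrected for the number of items.

    Equivalent to effort / (reward * c) with c = n_effort_items / n_reward_items.
    """
    return (effort_sum * n_reward_items) / (reward_sum * n_effort_items)


def eri_by_cut_point(ratios, cut_point=1.0):
    """Flag ERI with a fixed cut-point on the ratio."""
    return [r > cut_point for r in ratios]


def eri_by_top_quartile(ratios):
    """Flag ERI as the highest quartile of the ratio in the sample."""
    q3 = quantiles(ratios, n=4)[2]
    return [r > q3 for r in ratios]


# Hypothetical (effort sum, reward sum) scores on a short-version layout:
# 3 effort items and 7 reward items, each scored 1-4.
respondents = [(6, 21), (9, 21), (9, 18), (10, 14),
               (12, 14), (5, 24), (8, 16), (11, 20)]
ratios = [eri_ratio(e, r, 3, 7) for e, r in respondents]
print("cut-point > 1.0:", sum(eri_by_cut_point(ratios)), "of", len(ratios))
print("top quartile:   ", sum(eri_by_top_quartile(ratios)), "of", len(ratios))
```

The fixed cut-point depends on the scale of the ratio, and therefore on the items and response set used, whereas the percentile-based rule fixes the exposed share by construction.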
We suggest that applying the quartile-based job strain definition may improve the level of agreement for job strain between the JCQ and the DCQ and be better able to detect the associations between job strain and health outcomes. The quartile-based job strain definition is not a random or arbitrary choice, but a promising choice that has been supported theoretically and empirically (31,41). Given no official or fixed cut-point for defining ERI with the short version of the ERIQ, a possible risk of overestimation of the prevalence of ERI, and uncertain applicability of the cut-point (>1.0) to the partial or proxy ERI scales, using the percentile-based definition of ERI may be an alternative way of defining ERI in the Consortium studies.

Change in exposure over time
Theorell acknowledged the limitation of using a one-time measure of job strain in most publications of the Consortium (16). However, the potential impact on the associations between job strain and health outcomes in the meta-analysis papers by the Consortium has not been adequately discussed. In addition, change in exposure over time and its impact on the longitudinal associations between job strain and health outcomes have not been discussed in the meta-analyses by the Consortium.
As we have pointed out previously (6,9,12), a one-time measure of job strain versus repeated measures "underestimates" associations between job strain and health outcomes (45,46). In addition, another study (37) by the Consortium using four European cohorts (Belstress, FPS, HeSSup, and Whitehall II) with follow-up periods of 3-9 years indicated a possibility of substantial differential exposure misclassification when using only baseline information on job strain. In this study by Nyberg et al (37), 58% of the people with job strain at baseline changed to non-job strain at follow-up, while 11% of the people in the non-job strain category at baseline changed to job strain at follow-up. Thus, significant exposure misclassification may have occurred in the meta-analysis papers using only one-time exposure information, leading to an underestimation of the true associations between job strain and health outcomes. In addition, Clays et al (33) reported that considerably more people in the job-strain group (at baseline) dropped out during the follow-up period in the Belstress study compared to the non-job-strain group. However, the Consortium neither provided nor discussed such information in their meta-analysis publications. There was also no way for readers to assess whether such differential attrition associated with job strain also occurred in the other three European cohorts (FPS, HeSSup, and Whitehall II). The differential attrition rate by job strain status during follow-up, confirmed at least in the Belstress study, can result in a significant underestimation of the associations between job strain and health outcomes.
We suggest that the Consortium and other researchers clearly discuss the possible impact of using baseline-only measures of work stressors in future papers. Also, researchers should report basic information such as follow-up and attrition rates by exposure group during follow-up to help readers assess the validity of their meta-analyses and interpretations. Furthermore, researchers should consider and discuss the impact of change in exposure over time on longitudinal associations between work stressors and health outcomes.

Distribution of exposure in working populations
The interpretation of results of the Consortium meta-analyses should be based on an understanding of the characteristics of the target and study populations in the individual studies. Kivimäki and Kawachi (47) insisted that the findings in the Lancet paper (17) by the Consortium resolved a longstanding debate about differences in the association between job strain and CHD by socioeconomic status (SES). However, as we have previously pointed out, "only three of the cohorts were randomly selected from general working populations with participation rates of more than 50%; most of the others were recruited from white-collar organizations" (9, p448).
We acknowledge that 5 cohorts (DWECS, COPSOQ-I, POLS, NWCS, HeSSup) were randomly selected from general working populations. However, participation rates of 2 (NWCS and HeSSup) were <50% (33-34% and 40%, respectively). In these cohorts, the low socioeconomic status (SES) group was underrepresented based on non-response analyses of the cohort data (48,49). Even in the POLS study, which had a participation rate >50%, the low SES group was underrepresented (50). Thus, only 2 (DWECS and COPSOQ-I) among the 13 cohorts in the Lancet paper (17) were randomly selected from general working populations and without a significant non-response bias in the low SES group.
Regarding the other 8 cohorts in the same paper, we agree with the authors that 3 (Whitehall II, Belstress, and GAZEL) are white-collar dominated samples. However, 2 other cohorts (WOLF-S and FPS) are also white-collar dominated samples. The proportion of white-collar workers was 60% in the WOLF-S study and 85% (versus "15% performing manual work" in low SES) in the FPS study (51). Only 3 cohorts (Still Working, WOLF-N, and IPAW), comprising <10% of the total study subjects in the Lancet paper (N=15 829 out of 197 473) (17), are blue-collar dominated samples. Thus, among the 13 cohorts in the Lancet paper (17), only 2 represent general working populations, 8 are white-collar dominated, and 3 are blue-collar dominated. As we pointed out elsewhere (9,10), the prevalence of job strain is generally lower in white- than blue-collar occupations (52), and workers facing job strain are less likely to participate in occupational health studies than those not facing job strain (53,54). While there is a large sample size of low SES workers, this does not necessarily indicate that the low SES group was well represented in the selection process of study subjects in the Consortium cohorts. The under-representation of the low SES group and workers with job strain in the cohorts of the Consortium needs to be discussed as a possible limitation in future publications. As we have previously stated, the debate regarding SES differences in the association between job strain and CHD cannot be resolved because of the unrepresentative Consortium data (10). A comprehensive meta-analysis based on all existing published and unpublished cohorts might generate better information to address the longstanding issues.

Assessing the PAR of job strain and CHD
Theorell (16, p93) stated in his commentary: "The fact that there is an independent relationship between job strain and MI (myocardial infarction) risk already provides an important rationale for employers to deal with psychosocial stress, regardless of the size of the association." However, we think that accurately estimating the PAR of job strain in relation to CHD is as important as estimating the association between job strain and CHD. One researcher wrote in response to the Lancet paper (17): "The small HR and PAR may make employers wonder whether implementing organizational changes to reduce job strain and CHD is the right strategy given the evidence and potential costs." (55, p53).
The PAR of job strain for CHD was calculated as 3.4% in the Lancet paper (17). We believe this calculation is an underestimate. A number of factors are likely to bias the estimated HR towards the null, thus underestimating the true PAR (9,12). In addition, due to the under-representation of low SES groups in the IPD cohorts, the prevalence of job strain in the cohorts of the Consortium is likely underestimated. In fact, the prevalence of job strain among the 13 cohort studies in the Lancet paper (17) was highest in the 2 cohort studies (DWECS and COPSOQ-I) that were sampled randomly from the general population without a significant non-response bias in the low SES group (21% and 22%, respectively), which is much higher than the average of 15% among the 13 cohort studies used in the Lancet paper (17). These higher prevalence rates are closer to the average prevalence of job strain calculated using the same measures (for job demands and control) in the 2005 European Working Conditions Survey (EWCS) (27,28): 23.9% in the 7 European countries from which the Consortium cohorts in the Lancet paper came (Belgium, Denmark, France, Finland, the Netherlands, Sweden, and the UK) and 26.9% in all 31 European countries. If the European prevalence of job strain of 23.9% (in the 7 countries) and 26.9% (in 31 countries) were used, the Consortium's resulting PAR of job strain for CHD with a HR of 1.23 would increase to 5.2% and 5.8%, respectively. In addition, the estimated PAR of job strain for CHD in the Lancet paper (17) was significantly lower than that in a previous study based on a French national representative sample: 6.5-25.5% (56).
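The dependence of the PAR on exposure prevalence can be checked with a short calculation. The sketch below assumes Levin's formula, PAR = p(RR - 1) / [1 + p(RR - 1)], with the HR used in place of the relative risk; this formula reproduces the 5.2% and 5.8% quoted above, but it is an assumption that it matches the Consortium's exact calculation.

```python
def population_attributable_risk(prevalence, relative_risk):
    """Levin's formula for the population attributable risk (as a fraction)."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


HR = 1.23  # pooled hazard ratio for job strain and CHD reported in the text
scenarios = {
    "Consortium cohorts (average prevalence 15%)": 0.15,
    "EWCS 2005, 7 countries (23.9%)": 0.239,
    "EWCS 2005, 31 countries (26.9%)": 0.269,
}
for label, prevalence in scenarios.items():
    par = population_attributable_risk(prevalence, HR)
    print(f"{label}: PAR = {par:.1%}")
```

With the rounded inputs of 15% and 1.23 the formula gives about 3.3%, close to the reported 3.4%, so the Consortium's inputs presumably differed slightly from the rounded values used here.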
If other important work stressors are included in the calculation, the PAR of workplace stressors would be greater than the one reported for job strain alone by the Consortium. Recently, Niedhammer et al (27) reported that the PAR% of ERI for CVD was 18.2% in the 31 European countries participating in the 2005 EWCS.
We suggest that the Consortium and other researchers should be more accurate in estimating and presenting the PAR of job strain for CHD with full consideration of the limitations of their cohorts and meta-analyses, and their impact on public policy and stakeholders, including public health officials, regulatory agencies, and employers and workers, and their organizations. We look forward to the results of the calculation of the PAR for multiple work stressors, including job strain, ERI, job insecurity, and long work hours, which will present a clearer picture of the impact of work stressors on CHD and other health outcomes.

Part 2: IPD-Work Consortium's public health recommendations
The Consortium publications have offered a number of conclusions and recommendations for prevention of CVD and "tackling" of standard risk factors, which we believe are inconsistent with the accumulated body of research findings and are counter to current scientific recommendations in this field (8,9,12,14,15). We believe these recommendations must be carefully scrutinized. The following are some examples of the conclusions and policy recommendations made by the Consortium authors:
• "Our findings suggest that prevention of workplace stress might decrease disease incidence; however, this strategy would have a much smaller effect than would tackling of standard risk factors, such as smoking." (17, p1491)
• "…reducing work-related psychosocial stress, operationalized as job strain … is unlikely to be an important target for any policy or intervention aiming to influence health-related lifestyle factors or overall lifestyle." (35, p2095)
• "…it is unlikely that intervention to reduce job strain would be effective in combating obesity at a population level." (37, p66)
• "For many people, avoidance of stress at work is unrealistic. The absence of strong evidence for effective interventions to reduce job strain therefore raises the challenge of identifying additional approaches for dealing with the health impact of stress in the workplace." (18, p763).

Various methodological problems leading to an underestimate of the PAR in the Consortium articles call into question their conclusions
The public policy conclusion of the Lancet article (17) regarding the "tackling of standard risk factors" such as smoking, rather than addressing work stress interventions, is based on what we believe to be an underestimation of the impact of work and work stressors (including job strain) on CVD (see Part 1). Tobacco smoking has a higher PAR for CVD than job strain alone, but not necessarily a higher PAR when all major work stressors are accounted for. Moreover, the conclusion ignores the likelihood that work stressors contribute to tobacco smoking, as well as other "standard risk factors." At the 6th ICOH conference on Cardiology and Occupational Health held in Tokyo, Japan, in March 2013, after a debate that included representatives of the Consortium, a consensus of scientists concluded in a statement that: "According to research data, about 10-20% of all causes of CVD deaths among the working age populations can be attributed to work, ie, are work-related. The loss of work days and work ability is likely to be substantially greater." (57, p4)

Consortium conclusions are based on Northern and Western European populations and may not be applicable to other working populations around the world
The Consortium authors make general recommendations not limited to the study populations of the studies on which the meta-analyses are based. These populations are largely white, white-collar, Northern and Western European populations. Their recommendations could be easily misinterpreted as applying to all working people worldwide. Many regions outside of Europe have substantially different worker populations and workplace conditions, and include countries that do not have the progressive social and workplace health policies that exist in many of the Northern and Western European countries where the Consortium populations are based.
We suggest that the Consortium state in future publications that their findings are most applicable to Northern and Western European populations and acknowledge limitations in generalizing their findings to other working populations, countries, and regions of the world.

Consortium conclusions focusing on the "tackling of standard risk factors" are not in concert with current public policy in Europe or the United States (US)
The conclusion in the Lancet paper (17) to "tackle standard risk factors" is of concern since it could be interpreted as advocating the medical treatment of standard CVD risk factors, such as hypertension, cholesterol or diabetes, as the sole approach. While there has been a reduction in mortality risk from medical treatment of CVD risk factors, treatment is not without costs. Already in the US, 1 in 6 healthcare dollars is spent on CVD, and healthcare costs in the US for CVD are predicted to increase to $818 billion by 2030 (58). While costs are lower in the socialized medical systems of many European Union countries, CVD costs are still substantial (EU €169 billion/year) (59). In addition to economic costs, medical interventions frequently have unwanted side effects (60) as well as limitations to the efficacy of treatment. For example, there is still no solid evidence that there is benefit from the treatment of individuals with mild hypertension (see recent Cochrane review) (61). In the Cochrane review, about 9% of the clinical trial participants withdrew from the trial due to side-effects of medications. In addition, there is also evidence of a J-shaped curve of benefit, indicating that there is an optimal target level of blood pressure (BP) and that more aggressive lowering of BP may result in increased morbidity and mortality (62). Since the Consortium authors mentioned "smoking" in their publication as an example of "tackling of standard risk factors," we can also assume that they meant, among many possible interventions, individual-focused behavioral or lifestyle modification. To date, programs to change smoking behavior and overeating, reduce weight, and promote exercise, etc, through mainly individual-focused workplace health promotion have met with only limited success (63-67).
While we support providing affordable medical treatments and public health education programs to at-risk populations, we do not think this need be in lieu of trying to improve working conditions. The Consortium's "either/or" perspective is consistent with current public policy efforts in neither Europe nor the US. The 2002 Barcelona Declaration on Developing Good Workplace Health in Europe pointed out that smoking and alcohol use are also work-related and "can only be tackled through health promoting workplaces" [cited by LaMontagne et al (68, p277)]. The WHO Healthy Workplace Framework and the European Network for Workplace Health Promotion's Luxembourg Declaration also define "workplace health promotion" to include: "…a combination of … improving the work organisation and the working environment… promoting active participation… encouraging personal development" (69, p2). US policies on Total Worker Health from the National Institute for Occupational Safety and Health (70,71) posit that workplace health promotion programs may be more effective when they also address the physical and organizational work environment (72)(73)(74). The American Heart Association has endorsed an integrated occupational health and safety (OSH) and health promotion (HP) program approach (75). Such policies were developed in part because of the limited success of solely individual-focused health promotion programs (64)(65)(66), which rarely reach blue-collar or clerical workers or workers in small businesses (76,77). Smoking cessation rates have been shown to be higher, especially among lower SES workers, in workplaces that implemented the integrated OSH/HP approach than in HP-only workplaces (78,79).
Consortium conclusion about individuals not being able to "avoid stress at work" ignores work stress interventions beyond the individual level
Policy recommendations by the Consortium authors imply that interventions to reduce CVD risk are only feasible at the individual level and that, absent more research demonstrating the benefit of workplace stressor interventions, there is little that can be done at the workplace level. Theorell addresses this issue by observing that work intervention research, while few and far between, is "worth a closer look." However, we would expand on Theorell's position to emphasize the widely accepted model that work stressor interventions can be implemented at four levels: the level of the individual, the job/task, the organization, or outside the organization through laws and regulations (80)(81)(82). In contrast to claims by the Consortium of "the absence of strong evidence for effective interventions to reduce job strain…." (18, p763), there is a wide range of evidence of the effectiveness of work stressor reduction interventions, especially job/task level interventions to improve job design and reduce job stressors (68,81,83-87) (see Appendix, www.sjweh.fi/data_repository.php).
Legislative/regulatory interventions in the Nordic and other Northern and Western European countries have led to a lower prevalence of exposure to work stressors. For example, the prevalence of job strain was lower in 6 of the 8 Consortium countries than the European average, and far lower in Denmark, Sweden, and the Netherlands. The prevalence of ERI was lower in 5 of 8 Consortium countries than the European average, and far lower in Denmark and the UK (27,108). Psychosocial safety climate, a measure of management concern for worker psychological health, was highest in the Nordic countries, as well as Belgium, the Netherlands, Ireland and the UK, and lowest in Eastern European and Southern European countries (109).
Compared to their Danish counterparts, Spanish workers faced higher job insecurity, lower influence and development (latitude), and lower supervisor support (although higher co-worker support) (110). Nordic countries and several other Northern and Western European countries rank highest in the world in the Labor Market Security Index of the International Labour Office (111).
Limited research also suggests that the strength of association between work stressors and ill health is weaker in the Nordic and other Northern and Western European countries, which points to a buffering effect. Dragano et al found a weaker association between work stressors (ERI and low job control) and depression symptoms in the Nordic countries compared with other European countries. The strongest associations were seen in Southern European countries and the UK (112). In another study, the most important factors explaining differences in worker self-reported health between European nations were two levels of labor protection, macro-level (union density) and organizational-level (psychosocial safety climate) (109), both of which are higher in the Nordic countries and other Northern and Western European countries than in Eastern or Southern European countries. It seems likely that improved working conditions in European countries, and in particular Northern and Western Europe, result in a reduced prevalence of work psychosocial stressors and a weaker relationship between work stressors and health outcomes compared to the rest of the world.
In contrast to claims by the Consortium of the "… absence of strong evidence for effective interventions to reduce job strain…." (18, p763), we argue on behalf of the "precautionary principle" (113). Rather than wait for strong evidence from randomized controlled trials, which are rare in occupational health research and sometimes inappropriate, workers, employers, health professionals and policy-makers need to act on the basis of information from quasi-experimental studies, observational studies, natural experiments, and case studies. When existing evidence, even if incomplete, strongly suggests that job, organizational and legislative changes are beneficial for worker and organizational health, it is imperative to act and evaluate. This was exactly the policy strategy implemented by the Nordic countries 40 years ago with the passage of the Swedish Work Environment Act of 1977 (114).

Concluding remarks
We have identified six methodological issues in the publications of the IPD-Work Consortium and/or in the commentary by Theorell that remain to be resolved in order to improve the quality of future meta-analyses on work stressors and health outcomes: (i) no or incomplete adherence to appropriate guidelines for the reporting of IPD meta-analysis of observational data; (ii) use of unqualified or highly heterogeneous measures of exposure; (iii) less comparable definitions of the exposure group; (iv) underestimated associations between job strain and health outcome due to the change of exposure over time; (v) under-representation of the low SES group and workers with job strain in the cohort data of the Consortium; and (vi) underestimated PAR of job strain and work stressors for CVD. We hope that the Consortium and other researchers take into consideration our various suggestions in their future meta-analyses.
More troubling are the public health recommendations appearing in IPD-Work Consortium publications suggesting that interventions targeted at the individual are preferable and more effective than interventions targeted at the worksite. We believe these conclusions overstep the Consortium data given the methodological problems we have described. Furthermore, they ignore the large body of research evidence on effective work organization interventions carried out to date as well as the existence of ongoing collective action by trade unions, political parties, and public policy-makers that has improved working conditions. The public health policy recommendations of the Consortium are particularly difficult to accept since these two intervention strategies (individual and worksite) are not mutually exclusive, but complementary. We conclude that the Consortium's public health recommendation of ignoring work organization change is premature and not appropriate based on the findings of their meta-analyses. Future recommendations from the Consortium should take into account the limitations discussed above and await a complete analysis of all the important work-related risk factors in their study as they impact on health.
The fact remains that serious errors in the recommendations of the publications by the IPD-Work Consortium appear, uncorrected, in a number of prestigious journals. We think that the Consortium, as responsible members of the scientific community, needs to take scientific action (eg, writing an erratum) to correct and clarify their errors in their publications. Lastly, we hope this article facilitates a dialog among researchers, journal readers, reviewers, and editors on how best to improve current and future practices of IPD and typical meta-analyses on work stressors and health outcomes.