Precarious employment and mental health: a systematic review and meta-analysis of longitudinal studies

Precarious employment and mental health: a systematic review and meta-analysis of longitudinal studies. Scand J Work Environ Health online first. Objectives Precarious employment (PE) is a term used to describe non-standard employment forms characterized by low security that may have negative effects on mental health. The objective of this review was to systematically review the evidence for effects of PE on mental health and identify important areas for further research. Methods A protocol was developed following PRISMA-P guidelines. Web of Science, PubMed and PsycINFO were searched up to 4 September 2017. All unique records were assessed for eligibility and quality by at least two reviewers. Data from included studies were summarized in forest plots and meta-analyses using a random-effects model. Evidence quality was rated using the GRADE method. Results We obtained 3328 unique records, of which 16 studies of sufficient quality met the inclusion criteria. Moderate quality evidence (GRADE score 3 of 4) was found for an adverse effect of job insecurity on mental health; summary odds ratio (OR) 1.52 [95% confidence interval (CI) 1.35–1.70]. There was very low quality (GRADE 1 of 4) evidence for effects of temporary employment or unpredictable work hours on mental health. Five studies on multidimensional exposures all showed adverse effects, weighted average OR 2.01 (95% CI 1.60–2.53). Conclusions Research on PE and mental health is growing, but high-quality prospective studies are still scarce. Job insecurity likely has an adverse effect on mental health. A clear multi-dimensional definition of PE is lacking, and harmonization efforts are needed. Further single-variable observational studies on job insecurity or temporary employment should not be prioritized.

Over the last 30 years, there have been substantial changes in the global labor market. In Western countries, factors such as globalization, neoliberal politics, technological advances, and deindustrialization have created a demand for a more flexible workforce (1).
Accentuated by the financial crisis of 2008 and the rise of the "gig economy", the prevalence of nonstandard employment forms characterized by low or no long-term security has increased (2).
"Precarious" is a term used to describe these forms of employment by means of a set of conditions such as temporary contract forms, lack of bargaining power and rights, vulnerability in the employee-employer relationship, employment insecurity, and insufficient wages. In contrast to many established concepts within the scope of occupational health and medicine, it pertains to the employee-employer relation rather than hazards of work environment or job content per se.
There is no internationally accepted definition of precarious employment, and the concept remains contested, as illustrated by the recent debate in this journal regarding multiple job-holding (3,4). However, several multidimensional constructs have been proposed, including those described by Guy Standing (5) and Vives et al (6).
The International Labor Organization (ILO) describes precarious employment as "uncertainty as to the duration of employment, multiple possible employers or a disguised or ambiguous employment relationship, a lack of access to social protection and benefits usually associated with employment, low pay, and substantial legal and practical obstacles to joining a trade union and bargaining collectively" (7). In The Precariat (2011), Guy Standing proposes several levels of security lacking in precarious employment: "employment security" (eg, government goals of full employment); "job security" (protection against arbitrary dismissal, rules for hiring and firing); "occupational security" (ability to maintain a niche on the labor market and opportunities for career development); "workplace security" (protection against occupational accidents and diseases through regulations); "competence security" (the opportunity to develop new skills through internships and job training as well as the opportunity to utilize existing skills); "income security" (guarantees of a stable and adequate income); and "representation security" (access to a collective voice in the labor market) (5). Attempts to operationalize a multidimensional concept of precarious employment for research purposes have been made, most notably the Employment Precariousness Scale, which includes six dimensions: "temporariness" (contract duration), "disempowerment" (level of negotiation of employment conditions), "vulnerability" (defenselessness to authoritarian treatment), "wages" (low or insufficient; possible economic deprivation), "rights" (access to workplace rights and social security benefits) and "exercise rights" (powerlessness, in practice, to exercise workplace rights) (6).
During the last decades, a growing number of studies have focused on the health effects of precarious employment. Previous scoping reviews on this issue have indicated links to an array of health issues (8). A recent systematic review by our group has also found that various aspects of precarious employment may be linked to occupational injuries (9).
It has been hypothesized that precarious employment acts as a stressor on the individual, predisposing to mental health problems (8). However, the effects are complicated to investigate considering the possibility of bidirectional causality, confounding, and selection effects whereby healthier employees are more likely to gain stable employment (10). In order to disentangle these relationships and examine causal effects, data from well-designed longitudinal studies controlling for confounders and baseline mental health need to be analyzed.
Many studies investigating mental health effects have relied solely on isolated aspects of precarious conditions, such as temporary employment contracts or perceived job insecurity. Although previous reviews have been published on the association between isolated aspects of precarious employment and mental health (11-14), to our knowledge no systematic review has taken a multidimensional and strictly longitudinal approach to the relation between precarious employment and mental health.

Objectives
The primary aim of this systematic review was to assess the evidence of effects of precarious employment on mental health. Both multidimensional definitions and isolated aspects of precarious employment were investigated. The purpose of this was to (i) provide summary effect estimates and grading of evidence quality, where adequate evidence can be obtained, which can be used for policy and health economics decisions; and (ii) map areas where evidence is scarce or lacking to provide directions for future research.

Protocol
A protocol for the conduct of this review was developed in accordance with the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocol (PRISMA-P) guidelines (15). Two external researchers with experience in conducting systematic reviews and in the research field independently reviewed the protocol before the start of the review process. The protocol can be read in full in the supplementary material A, www.sjweh.fi/show_abstract.php?abstract_id=3797.

Eligibility criteria
Studies were required to meet the following criteria for inclusion:
Setting and participants. Studies on human subjects of working age active in the labor market were considered. To limit the impact of diversities in economic, labor market and cultural settings and achieve a reasonable level of generalizability of the results in a Western context, the review was restricted to studies performed in the European Economic Area (Member States of the European Union, Iceland, Liechtenstein and Norway), Switzerland, Australia, New Zealand, USA and Canada.
Exposure. Studies on precarious employment, defined as exposure to single or multiple dimensions as outlined by the three sources cited above (5-7), were considered.
Comparators. Studies comparing outcomes in workers exposed to precarious employment with those in unexposed workers recruited from the same population were considered.
Study design. Only studies with a longitudinal design were included. These were required to assess: (i) outcome variable at baseline and either exclude those participants defined as cases or control for baseline levels of outcome in analyses and (ii) exposure at ≥1 time-point ≥1 year prior to the time of outcome measurement used in analysis. The results should be reported in effect sizes relative to control group [eg, odds ratios (OR) or relative risk (RR)], including a measure of dispersion/statistical confidence [eg, confidence interval (CI)].
Outcomes. Studies on any outcome that pertains specifically to the mental health of participants were considered. Direct diagnosis of mental illness such as depression or anxiety disorders (eg, through the use of diagnostic interviews), questionnaire-based scores of symptoms or self-rated mental health, and the incidence of adverse events such as suicide, self-inflicted harm or hospitalization were considered primary outcomes. Indirect effects of mental health, such as psychotropic drug prescription or sickness absence from work, were considered secondary outcomes. Outcomes where mental health constitutes a part but cannot be isolated from other factors, such as self-rated general health or wellbeing, were not included.
Study report. Original research articles written in English, Swedish, Norwegian or Danish and published (including ahead-of-print publishing) in a peer-reviewed journal between 1 January 2000 and 4 September 2017 (date of last search) were included.

Information sources and search strategy
The first and last authors drafted and piloted the search strategy, which was externally reviewed before the search was executed. The strategy was constructed using a combination of Medical Subject Headings (MeSH) and keywords based on multidimensional definitions of precarious employment (5-7). A schematic presentation of the search strategy is shown in table 1. Three electronic databases were searched: PubMed/Medline, PsycINFO and Web of Science. Full search strings with limiters used for all these can be found in supplementary material B, www.sjweh.fi/show_abstract.php?abstract_id=3797. The first search was performed on 3 May 2016; an additional search was performed on 4 September 2017, which is the latest publication date covered by this review. In addition, the third author performed a manual search for studies in the reference lists of retrieved review articles.

Study selection and data collection
EndNote X8 software was used to manage reference libraries. After removal of duplicate references, three reviewers participated in the screening of titles and abstracts, where each reference was independently assessed by at least two persons. References that met inclusion criteria on the basis of information provided in the title and/or abstract according to both reviewers proceeded to full-text assessment.
Five reviewers participated in the full-text assessment. Each full text was assigned to a pair of reviewers for independent assessment of relevance, study quality and data collection. The pairs were blinded to each other's opinion until both had made their judgments. If different judgments had been made regarding relevance and/or study quality, disagreements were discussed; if a consensus could not be reached, a third reviewer was asked to make an independent assessment. If disagreement or uncertainty remained after discussion between these three, the article was discussed by the whole group at a final meeting where a consensus decision was made. The first author extracted the data collected from articles included after final consensus, and, in the case of inconsistencies, he reassessed the full text for errors in the data collection process. Remaining inconsistencies were independently assessed by two reviewers and discussed among three reviewers until consensus was reached. Due to a conflict of interest, Canivet et al's article (16) was sent to Prof Christer Hogstedt for an external review of quality and eligibility.

Tools for assessing relevance and study quality
The first and last authors developed and piloted the form used for assessment of relevance and rating of quality and data collection from studies. The full form can be found in supplementary material C, www.sjweh.fi/show_abstract.php?abstract_id=3797. Each reviewer pair was trained in concordance of judgments, and recalibrations were regularly made during the review process to ensure consistent interrater reliability.
The rating of study quality was based on an adaptation of a form developed for systematic reviews of observational studies by the Swedish Agency for Health Technology Assessment and Assessment of Social Services (SBU) (17), also used in a recent large-scale systematic review of work environment and depressive symptoms by Theorell et al (18). The following aspects of study quality were considered:
Participants. Large study samples representative of a broad target population (eg, stratified national random samples) with well-defined inclusion criteria and methods of recruiting rendered higher ratings. Descriptions of non-response at each stage were required, and low attrition rates or analyses of non-responders/dropouts indicating low risk of selection/loss to follow-up bias rendered higher ratings.
Measurement methods. Clear descriptions of assessment methods and definitions for both exposure and outcome were required. The use of standardized and validated methods, where applicable, contributed to higher ratings.
Study design. Designs that, as far as possible, eliminate the influence of reverse causality (ie, mental health problems selecting individuals into precarious employment) on results were sought. A longitudinal design was required, with assessment of exposure and baseline mental health at a time point at least one year prior to outcome assessment. Models that used a time lag between baseline mental health assessment and first exposure assessment, thereby introducing risk that mental health problems preceded exposure in some participants, rendered lower ratings. Designs enabling investigation of dose-response relationships between exposure and outcome, such as repeated measurements of exposure prior to outcome assessment, enabled higher ratings. Very long intervals between measurements that might result in failure to detect actual effects (ie, assessing mental health problems so long after exposure that many affected may have recovered prior to time point of assessment) contributed to lower ratings.
Confounding. Description of the comparability of exposed and non-exposed groups was required. Demonstrated high similarity and/or adequate statistical handling of any substantial differences in potentially confounding variables rendered higher ratings. Results were required to be stratified or adjusted for gender, age and some aspect of socioeconomic status. Thorough presentation of the results of adjustment factors introduced in stepwise manner contributed to higher ratings.
Relevance and precision. The relevance of studies was evaluated on the basis of how well they were designed to answer the research question "does precarious employment cause mental health problems compared to non-precarious employment?". In this regard, the relevance of investigated exposures and study population combined was assessed in terms of sensitivity and specificity in capturing the precariously employed population. Outcomes were evaluated on the basis of their directness in assessing the mental health of participants and robustness of measurement methods. Direct symptom assessments by study conductors were preferred to register data of secondary outcomes such as drug prescription, and diagnostic interviews by health professionals were considered more robust than single-item questionnaire assessments.
Transferability. The results of studies should be relevant to participants in the Northern European and similar labor markets. Study samples representative of a broad target population rendered higher ratings, while more restricted samples such as individual workplace cohorts were considered to have lower transferability. Studies using exposure and outcome definitions highly dependent on local (national) rules and legislation, such as employment contract forms and records of certified sick leave/disability pensions, were considered less transferable, while those using definitions robust over different populations were given higher ratings.
Reporting bias. Reporting bias, ie, the selective reporting of outcomes or analyses post hoc on the basis of results, is difficult to assess in observational studies, since most elements of efforts made to reduce selective reporting in clinical trials (preregistered protocols, rigorous ethical approvals at different stages, CONSORT statement, trial registers etc.) are usually lacking. At the individual study level, the risk of reporting bias was assessed based on described methodology, where any sign of post hoc decisions regarding analyses made (eg, stratification or subgroup analyses without justification provided, differences between variables assessed in questionnaires and those reported in articles, data-driven cut-off points, etc.) resulted in lower ratings.
Based on these criteria, each study was graded as low, moderate or high quality. The starting point for an observational study meeting inclusion criteria was considered to be moderate quality, with considerable merits resulting in an upgrade to high quality, and significant flaws in study design or risk of bias resulting in a downgrade to low quality. Studies receiving a rating of low quality after consensus decision were excluded from the review. As the cut-off was set at a minimum of moderate quality, at least 3 reviewers assessed any borderline low/moderate quality studies before decisions were made.

Table 1. Schematic presentation of the search strategy.
"outsourced services" * "outsourced" "outsourcing" "precarious" "temporary" "layoff" "layoffs" "downsizing" "atypical" "contingent" "atypical" "flexible" "casual" "non-standard" "nonstandard" "unprotected" "insecure" "insecurity"
AND
"employment" * "work" * "job"
AND
"mental health" * "mental disorder" "depressive disorder" * "depression" "anxiety" * "dyssomnias" * "sleep initiation and maintenance disorders"

It should be noted that the quality assessment strategy used does not provide an algorithm for a quantitative determination of each study's overall quality rating based on a scoring system or similar. The process of evaluating a study's quality and risk of bias is inevitably prone to include subjective elements. The purpose of our strategy was to evaluate systematically all relevant aspects of quality in each study and provide a coherent and transparent basis for decisions based on the combined judgement and experience of the reviewer team. This aligns with the core concept of the Grading of Recommendations Assessment, Development and Evaluation (GRADE) for assessment of overall evidence quality (19), which was used in subsequent steps.

Data items and grouping
All relevant data on population, exposure and outcome, their definitions and measurement methods, study design, statistical methods, adjustment factors and results were extracted from included studies. The first author grouped exposures and outcomes and drafted the choice of analysis model for inclusion in the meta-analysis, which three reviewers then discussed and finalized. If several different multivariate models were presented in a study, the one that most closely resembled a model controlled for gender, age and some measure of socioeconomic status was chosen.

Synthesis of results
RevMan 5.3 software was used for meta-analyses, statistical tests and the construction of forest and funnel plots. Results from included studies were grouped according to type of exposure and outcome. For each exposure-outcome combination, a meta-analysis was performed if ≥2 studies provided mathematically comparable data and were assessed to be sufficiently homogeneous to make the interpretation of a summary estimate meaningful. Additionally, since all outcomes included in this review pertain to states of suboptimal mental health, overall summary estimates for each exposure category were calculated using pooled data for all outcomes. A generic inverse variance method was used to allow for the input of adjusted effect sizes. Since OR was by far the most common effect size measure in included studies and recalculation to risk ratio (a more intuitive measure) of an OR adjusted for covariates is difficult, OR was chosen as summary measure. Whenever possible, individual effect sizes were recalculated as OR with 95% CI (20). For studies providing relative risk or rate ratios, the original values were input as OR for the purpose of meta-analyses, as this is the most conservative approach when an actual recalculation of the adjusted effect sizes is not possible. Since the assumption of a single common true effect size for all studies within the same exposure-outcome combination was not plausible due to differences in operationalization, a random-effects model was chosen (21). When no CI were presented in the original study, 95% CI were derived from P-values using Altman et al's proposed method (22). When there were data for more than one relevant outcome derived from the same study population (regardless of whether they were presented in different publications), the arithmetic means of the log OR and standard errors for the different outcomes were input as a single data item in the all-outcome summary meta-analyses.
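The data-preparation and pooling steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the review's actual RevMan workflow: the function names and the example inputs are hypothetical, the random-effects step assumes the DerSimonian-Laird estimator, and the conversion from a P-value to a CI assumes the published Altman and Bland approximation (z = -0.862 + sqrt(0.743 - 2.404 ln P)).

```python
import math

Z95 = 1.96  # normal quantile for a two-sided 95% CI


def se_from_ci(lower, upper):
    """SE of log(OR), back-calculated from a 95% CI reported on the OR scale."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)


def ci_from_p(odds_ratio, p_value):
    """95% CI for a ratio measure from its two-sided P-value (assumed Altman & Bland formula)."""
    z = -0.862 + math.sqrt(0.743 - 2.404 * math.log(p_value))
    log_or = math.log(odds_ratio)
    se = abs(log_or) / z
    return math.exp(log_or - Z95 * se), math.exp(log_or + Z95 * se)


def combine_outcomes(estimates):
    """Arithmetic mean of the log ORs and of the SEs for several outcomes
    measured in the same study population."""
    n = len(estimates)
    mean_log_or = sum(log_or for log_or, _ in estimates) / n
    mean_se = sum(se for _, se in estimates) / n
    return mean_log_or, mean_se


def pool_random_effects(estimates):
    """Generic inverse-variance pooling of (log OR, SE) pairs with a
    DerSimonian-Laird between-study variance. Returns (OR, lower, upper, tau2)."""
    y = [log_or for log_or, _ in estimates]
    w = [1.0 / se ** 2 for _, se in estimates]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for _, se in estimates]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return (math.exp(pooled),
            math.exp(pooled - Z95 * se_pooled),
            math.exp(pooled + Z95 * se_pooled),
            tau2)


# Hypothetical inputs (OR, lower, upper); these are NOT data from the review.
reported = [(1.30, 1.05, 1.61), (1.80, 1.20, 2.70), (1.50, 0.90, 2.50)]
studies = [(math.log(or_), se_from_ci(lo, hi)) for or_, lo, hi in reported]
summary_or, summary_lo, summary_hi, tau2 = pool_random_effects(studies)
print(f"summary OR {summary_or:.2f} (95% CI {summary_lo:.2f}-{summary_hi:.2f})")
```

Working on the log scale keeps the CIs symmetric around the estimate, which is why every input is converted to a log OR and standard error before weighting.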

Risk of bias across studies
Funnel plots were created for all outcomes combined in each exposure group and separately for each outcome when there was an adequate number of studies. These were visually inspected to detect signs of publication bias. No adjustments of summary estimates based on these were made (eg, "trim and fill" methods), but any detection of significant signs of publication bias resulted in downgrading of overall evidence quality (see "assessment of evidence quality" below).

Additional analyses
Sensitivity tests of meta-analyses were performed using the most adjusted model available from each study to test the robustness of results to potential confounders other than gender, age and socioeconomic status.

Assessment of evidence quality
The overall quality of evidence for each exposure-outcome combination was assessed by means of the GRADE method. According to the latest guidelines, the following criteria should be explicitly considered when rating overall evidence quality: risk of bias/study limitations, directness of evidence, consistency of results, precision, signs of publication bias, magnitude of the effect, evidence of a dose-response gradient, and influence of plausible residual confounding. On the basis of these criteria, each exposure-outcome combination was assigned an overall evidence quality score of 1-4, ranging from Very Low to High quality. Evidence from observational studies starts at level 2 (Low quality), and upgrading of evidence quality may be done if the magnitude of effects is large (effect estimates from several studies OR >2.0 or <0.5), if there is evidence of a dose-response effect, or if any plausible sources of bias would influence the association in the opposite direction of the effect. Downgrading of evidence quality by one or two points may be done in the case of inconsistent effects across studies, signs of significant publication bias, indirect evidence, imprecision of results or high risk of bias in individual studies (23).
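As a rough illustration of the scoring logic described above, the sketch below starts observational evidence at level 2 and moves it up or down by the number of upgrade and downgrade points, bounded to the 1-4 scale. The function name and argument structure are hypothetical; in practice each upgrade or downgrade is a reviewer judgment, not an automatic rule.

```python
GRADE_LABELS = {1: "Very Low", 2: "Low", 3: "Moderate", 4: "High"}


def grade_observational(upgrade_points=0, downgrade_points=0):
    """Return (score, label) for evidence from observational studies.

    Starts at level 2 (Low), adds upgrade points (large effect,
    dose-response, bias working against the effect) and subtracts
    downgrade points (inconsistency, publication bias, indirectness,
    imprecision, risk of bias), then bounds the result to 1-4.
    """
    score = 2 + upgrade_points - downgrade_points
    score = max(1, min(4, score))
    return score, GRADE_LABELS[score]


# One upgrade point, eg, for a dose-response gradient
print(grade_observational(upgrade_points=1))
# One downgrade point, eg, for inconsistent effects across studies
print(grade_observational(downgrade_points=1))
```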

Reporting of review conduct and results
Guidelines for meta-analyses and systematic reviews of observational studies (MOOSE) (24), Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) (25) and, where applicable, the Methodological Expectations of Cochrane Intervention Reviews (MECIR) (26) were followed when preparing this manuscript.
Figure 1 illustrates an overview of the selection process. The database searches resulted in a total of 4260 records. Additionally, 404 records were obtained through manual search in reference lists of related reviews. From a total of 4664 identified records, 1336 duplicates were removed, leaving 3328 records to be assessed by title. Of these, 2825 were dismissed as not relevant, leaving 503 records to be assessed by abstract. Among these, 371 studies were excluded as not matching inclusion criteria, leaving 132 full-text articles to be assessed. Among these, 115 studies that did not meet the inclusion criteria and/or were assessed to be of low quality were excluded, and 16 studies were included in the final review. Of these, 14 provided result measurements suitable for inclusion in quantitative synthesis (forest plots and/or meta-analyses).

Study selection
Common reasons for exclusion were: (i) exposures insufficiently specific to precarious employment, such as job strain, work-time control, injustice at work, downsizings, social support and other items that may arguably be experienced to a higher as well as lesser degree among precarious employees than among regular workers; (ii) outcomes insufficiently specific to mental health, such as subjective well-being, self-rated health, negative affect, quality of life and similar; (iii) cross-sectional study design; and (iv) longitudinal study designs not excluding the possibility of outcomes preceding exposures, such as those assessing exposure and outcome at the same time point sometime after baseline health assessment, or those with a wide time gap between baseline health assessment and first data point for exposure.
Table 2 shows details of the included studies. The most studied exposure was job insecurity. Additionally, temporary employment, unpredictable work hours and a variety of multidimensional exposure constructs were studied. All included studies were prospective cohort studies, or cohort studies with a design equivalent to prospective. Most were performed on general population samples, some on sector-specific cohorts and one on a company-based cohort. Three study populations were used by more than one included study for identical or overlapping data years. These were handled accordingly (see "synthesis of results" under Methods). Outcomes studied were: depression, anxiety and general psychiatric morbidity ("psychological distress") as assessed by diagnostic/screening questionnaires and in one case diagnostic interviews; the use of psychotropic medication as assessed by self-report or pharmacy registers; and records of sick leave due to depression from social insurance registers.

Results of individual studies, meta-analyses and evidence quality
Figure 2 shows the results and quality ratings of individual studies as well as summary measures and grading of overall evidence quality. Four exposure groups were constructed: job insecurity, temporary employment, unpredictable work hours and multidimensional exposures.

Job insecurity
Ten studies with a total of 45 075 participants (in the case of more than one study analyzing the same sample, only the highest number was taken into account) investigated the effects of job insecurity. These provided a summary OR of 1.61 (95% CI 1.29-2.00) for depressive symptoms, with consistent effects from six studies (one not included in the quantitative synthesis due to data type) (32) and an evidence quality upgraded to Moderate (GRADE level 3 of 4) due to dose-response associations;

Temporary employment
A large number of studies on temporary employment were screened during the selection process, but only four eligible studies, with a total of 104 535 subjects [of whom 103 530 came from a single register-based study (33)], were of sufficient quality for inclusion. One study could not be included in the quantitative synthesis due to data type, but showed a weak adverse effect on depressive symptoms (35). The effects were highly inconsistent. Because of this and the limited total number of studies (three unique study populations), the overall evidence quality was downgraded to Very Low (GRADE level 1 of 4), and no meta-analysis was performed.

Unpredictable work hours
Unpredictable work hours were investigated in two studies. However, the same data set was used for both. The results showed no or negligible effects, but since the CI could not exclude either a substantial beneficial or adverse effect, the overall evidence quality was downgraded to Very Low (GRADE level 1 of 4).

Multidimensional exposures
Five studies investigated different multidimensional exposures. Hammarström et al (41) analyzed the effects of temporary employment and low education combined compared to permanent employment and high education, and found an OR of 3.13 (95% CI 1.28-7.63) for depressive symptoms, and 2.45 (95% CI 1.45-4.15) for psychological distress. Canivet et al (16) applied a definition of precarious employment using data on self-reported job insecurity, employment contract form, and unemployment, and found an incidence rate ratio of 1.4 (95% CI 1.1-2.0) for psychological distress. Virtanen et al (38) found an effect of job insecurity and temporary employment combined of OR 1.67 (95% CI 0.78-3.58) for psychological distress. Waenerlund et al (39) used an extended version of Aronsson's core-periphery model (42) to construct a cumulative measure of peripheral labor market position ("peripheral employment score"), and found an OR of 2.79 (95% CI 1.52-5.14) in men and 1.79 (95% CI 0.98-3.29) in women for major depressive episodes among those highly exposed (highest tertile) compared to those not exposed. Finally, Rugulies et al (37) found an OR for psychotropic prescription drug use of 2.38 (95% CI 1.56-3.63) for those with a history of prolonged unemployment reporting job insecurity, compared to those with neither exposure. A meta-analysis was performed for all adverse mental health outcomes combined, which gave a summary OR of 2.01 (95% CI 1.60-2.53). This is, however, not to be interpreted as an estimate of the true effect size of "precarious employment", but rather as an indication of where the effect sizes would probably fall when studying more than one dimension of precarity simultaneously. No GRADE assessment was performed for this category.

Publication bias
Funnel plots constructed to assess risk of publication bias can be found in supplementary material D, www.sjweh.fi/show_abstract.php?abstract_id=3797. Only the funnel plot for studies on temporary employment showed significant asymmetry, and for this exposure the evidence quality for all outcomes was already downgraded to Very Low (GRADE 1 of 4).

Additional analyses
Sensitivity tests in which all meta-analyses were repeated using the most adjusted models provided in each study showed only marginal decreases in effect sizes and did not alter any conclusions (data not shown).

Discussion
The main objective of this study was to review the evidence for an effect of precarious employment on the mental health of workers in longitudinal studies. After a systematic search, screening and evaluation of literature, 16 original studies were identified as relevant and of sufficient quality for inclusion in synthesis. From these, data on the effects of four different categories of exposures on five adverse mental health outcomes were extracted.
We found Moderate quality (GRADE level 3 of 4) evidence that perceived job insecurity has a detrimental effect on mental health, corresponding to a mean relative risk increase of 32-48% in a population with a baseline risk of 5-30%. The same level of evidence quality was also present for two specific outcomes, depression and anxiety, where dose-response effects were seen. Very Low quality (GRADE level 1 of 4) evidence for an effect of temporary employment on mental health was found. There were few studies of sufficient quality, and inconsistent results were seen.
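One way to arrive at risk increases of roughly this size is to convert the summary OR of 1.52 into a risk ratio at a given baseline risk, using the identity RR = OR / (1 - p0 + p0 × OR). The sketch below assumes this conversion; the text does not state how the percentages were derived, and the identity is exact only for an unadjusted OR. Under that assumption the percentages are increases relative to the baseline risk.

```python
def risk_ratio_from_or(odds_ratio, baseline_risk):
    """Convert an odds ratio to a risk ratio for a given baseline risk p0.

    Uses RR = OR / (1 - p0 + p0 * OR); for covariate-adjusted ORs this is
    only an approximation.
    """
    return odds_ratio / (1 - baseline_risk + baseline_risk * odds_ratio)


SUMMARY_OR = 1.52  # summary OR for job insecurity reported in the review

for p0 in (0.05, 0.30):
    rr = risk_ratio_from_or(SUMMARY_OR, p0)
    exposed_risk = rr * p0
    print(f"baseline risk {p0:.0%}: RR {rr:.2f}, risk among exposed {exposed_risk:.1%}")
```

The conversion shows why the same OR corresponds to a smaller relative increase as the outcome becomes more common: the OR overstates the risk ratio more strongly at higher baseline risks.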
No effect of unpredictable work hours on mental health was shown, but the evidence was very limited (GRADE level 1 of 4).
Multidimensional exposures were investigated in five studies, ranging from combinations of two survey items (38,41) to combinations of self-reported and register data (37) and more complex trajectory and cumulative exposure constructs (16,39). Adverse effects on mental health were seen for all these, and the effect sizes were generally larger than those seen for single-item variables.

Limitations
Research on health outcomes in relation to multidimensional definitions of precarious employment is still a rather new topic, and it is not surprising that we found few original papers exploring such relations. We used a wide definition of precarious work in order to identify as many relevant studies as possible, and searched three major databases to cover most scientific areas. Our search string identified the two major components that have been most extensively studied: temporary employment and job insecurity. It further identified all papers included in reviews on temporary employment and job insecurity (11-14). We additionally identified a number of studies investigating other dimensions of precarious employment (16,33,37-39,41). Still, the output of observational research is large and not as rigorously catalogued as that of clinical trials, and we cannot be certain that there are no important studies we failed to identify.
Our restriction regarding country poses several problems. First, we might have overlooked important implications of mental health effects of precarious employment in populations outside our scope. Second, although the purpose of our restriction was to limit heterogeneity, important aspects of labor market policies and societal circumstances may very well differ substantially even between countries included in our review. Caution is advised when transferring results from this review to other locations than those providing data.
We also limited the review to studies published from 2000 and onwards. The rationale behind this is similar; studies performed in widely different times will differ in setting just as those from widely different locations. There is good reason to believe that what constitutes "secure" and "insecure" employment and the impact this has on individuals will have changed along with societal transformations on a macro level. It was also evident from our search results that the bulk of literature in this research field has been published in recent decades. Still, with this approach we risk having missed important studies published before 2000.
Figure 2. Results of individual studies & meta-analyses, study quality ratings and overall evidence quality (GRADE). [IV=inverse variance; CI=confidence interval; OR=odds ratio; HR=hazard ratio; COR=cumulative OR; RR=risk ratio; JI=job insecurity; TE=temporary employment.] 1: Evidence quality upgraded by one level due to reliable evidence of a dose-response effect. 2: Evidence quality upgraded by one level due to reliable evidence of a dose-response effect. 3: Overall evidence quality upgraded by one level due to consistent effects, evidence from several studies of high quality, and evidence for dose-response effects on several outcome parameters. 4: Evidence quality downgraded by one level due to results from only one study with non-negligible potential sources of bias, and inconsistency of results within exposure group. 5: Evidence quality downgraded by one level due to results from only one study population, and inconsistency of results within exposure group. 6: Evidence quality downgraded by one level due to results from only one study with non-negligible potential sources of bias, and inconsistent results within exposure group. 7: Evidence quality downgraded by one level due to results from effectively only one study (same data set), and confidence intervals not excluding considerable benefit or harm (imprecision). a: No summative grading of evidence quality considered meaningful due to large variance of exposure operationalization. Summary odds ratio should be interpreted as a weighted average of effects.

Implications for research and society
The conclusions regarding temporary employment are in line with the results of a recent scoping review by Hunefeld et al (14), which, using no limitations on study design or quality, examined the results of 84 individual studies on fixed-term employment and mental health and found no clear direction of associations at all. Our hypothesis regarding the underlying reason for this is that temporary employment may be an overly broad category to capture precarious employment conditions. This has also been confirmed by our previous systematic review on precarious employment and occupational injuries (9). A crucial differential point regarding health outcomes is whether the arrangement is decided by the worker or employer (ie, voluntary or involuntary fixed-term employment). Worker-based flexible arrangements have been found to have neutral or positive effects, while company-based arrangements show a negative impact on health parameters, particularly psychosocial and mental health (43,44). We conclude that there is enough data by now to support that a category that includes such a wide range of employment statuses (ie, from day laborer to a four-year appointment as associate professor) clearly lacks specificity if the purpose is to study marginalized segments of the labor force. Therefore, the use of temporary employment as an exposure should be limited to specific settings, such as comparing workers within the same occupation. In defense of temporary employment research, however, it should be acknowledged as a concrete concept that can be translated directly to policy reform initiatives. Nevertheless, as long as there is work in a market economy, there will be a need for temporary employees due to fluctuations in demand for goods and services, and temporary employment is therefore difficult to abolish completely. A plausible political aim could be to limit "chronic temporariness", ie, indefinite prolonging of involuntary temporary contracts.
There is, however, still not enough research to implement such policy on the basis of health effects, and we recommend that future research focus on this area and develop tools and study designs to specifically measure cumulative effects of involuntary temporary employment.
Our results regarding job insecurity are in line with those of a recent review and meta-analysis by Kim et al (13), likewise restricted to longitudinal studies, which focused on the effect of unemployment and perceived job insecurity on depressive symptoms. In that review, stronger associations were seen in Europe than in the US, highlighting that labor market setting may influence the impact adverse employment conditions have on the individual. Job insecurity is a psychological (cognitive and/or affective) phenomenon with many proposed definitions. In summary, it could be described as "a perceived threat to the continuity and stability of employment as it is currently experienced", as proposed in a review by Shoss (45). To our knowledge, no studies have found that job insecurity predicts actual subsequent job loss; however, correlations with macro-level labor market trends have been shown (46). De Witte et al (12) conclude in their recent review that the majority of longitudinal job insecurity research concerns self-reported general mental well-being. Associations were strongest between self-reported job insecurity and self-reported general mental health, while evidence regarding specific conditions was less convincing. We also note that studies on more objective outcomes, such as doctor's diagnoses or psychotropic medications, are very scarce. This is an important limitation, since research on self-reported subjective stressors and mental health entails an inherent risk of inflated associations: individuals more likely to report mental symptoms may be more likely to report exposure to the stressor in the first place (common method bias, which arises when both exposures and outcomes are self-reported). Inconsistencies in how job insecurity is measured in different studies make comparison difficult at times.
Combinations, sometimes multiplicative, of questions into composite variables make some studies hard to interpret, and whether the exposure is a measure of prediction or fear of job loss is often not defined. Different wordings such as "how worried are you about losing your job" and "how likely is it that you will lose your job" are likely to render different results in the same population. Although interest in this field originally stemmed from investigating the effects of downsizing and lay-offs in an era when zero unemployment policy was abandoned in favor of neoliberal inflation-battling policies in welfare states, it has gradually been reduced to a psychological and mainly individualized problem with less focus on policy implications. This might be changing with new studies investigating job insecurity climate at workplaces (47), which is a step forward. However, when job insecurity is compared to the demand-control and effort-reward models (48), two concepts which emerged during the same era, the societal impact of job insecurity research has been discouraging. Demand-control and effort-reward are now integrated in parts of public policy in several countries including Sweden, demanding action from employers to reduce the exposure to these social and organizational work environment hazards (49). Given its established close relation to mental health, we conclude that job insecurity should be studied in more detail as an outcome (psychological response to an actual threat) or in relation to other outcomes such as performance and group effects. It could also be included as a mediator in the causal pathway between other policy-relevant labor market exposures and mental health outcomes.
Having questioned, on different grounds, the relevance of both temporary employment and self-reported job insecurity as exposures in future research concerning precarious employment and mental health, we would like to encourage the development of questionnaire items with high predictive value of involuntary job loss that could be integrated into the concept of precarious employment, as this is lacking today. We would also encourage researchers to explore other, more objective ways of measuring this (eg, using register data, peer-rating or group assessments) to reduce the influence of individual traits on the assessment of one's own situation in the labor market. The strong associations seen in the studies applying multidimensional exposure constructs indicate that the mental health effects of precarious employment are probably larger than those seen when studying parts of the phenomenon as single-item variables. Such constructs therefore provide a relevant way forward for research, combining subjective and objective parameters to construct more accurate representations of labor market position.
Precariously employed workers are at the bottom of the labor market pyramid. They are hard to reach and the least likely to be captured by surveys. They are, almost by definition, difficult to follow longitudinally. Many studies assessed for this review investigated employees in sectors with very low unemployment rates, such as healthcare professionals and other public servants. These are professionals that have a strong position on the labor market in general. Efforts are needed to conduct studies on more marginalized labor market groups. We call on the research community to join us in an effort to create a standard for research on this topic. Although the issue has been discussed for some time, there is still much to be done, and the potential health consequences in the population are large. We welcome correspondence from other research groups in this matter, as we believe much can be gained by coordinating our efforts.

Concluding remarks
In conclusion, this review has found evidence for an adverse effect of job insecurity on mental health. For temporary employment and unpredictable work hours, little evidence of an effect was found. We have identified several studies applying multidimensional exposure definitions. Despite being heterogeneous, these studies show that the mental health effects of precarious employment can successfully be studied, but there is a great need for formative work and harmonization efforts. We question the need for additional observational studies on temporary employment in general and perceived job insecurity in relation to mental health outcomes. Instead, we encourage development of new methods to overcome the inherent limitations of these research concepts. In this review, we had to exclude many studies due to fundamental design issues. Apparently, there is still a dire need for more high-quality prospective studies with policy-relevant results. Precarious employment conditions are becoming widespread and affect both individuals and communities. Our results indicate that this is an important emerging public health issue that urgently needs to be addressed by researchers and politicians.