Night-shift work and breast cancer – a systematic review and meta-analysis

A dose–response analysis of night-shift work and breast cancer found the evidence insufficient for a causal link. For the first time, the quality of the studies was assessed and the results of the review were viewed in the light of the quality of the evidence. The review calls for studies with prospective follow-up and objective data on night-shift work.
Objective. The aim of this review was to synthesize the evidence on the potential relationship between night-shift work and breast cancer.
Methods. We searched multiple databases for studies that compared women in shift work with women not in shift work and reported the incidence of breast cancer. We calculated incremental risk ratios (RR) per five years of night-shift work and per 300 night-shift increase in exposure and combined these in a random effects dose–response meta-analysis. We assessed study quality in ten domains of bias.
Results. We identified 16 studies: 12 case–control and 4 cohort studies. There was a 9% risk increase per five years of night-shift work exposure in case–control studies [RR 1.09, 95% confidence interval (95% CI) 1.02–1.20; I²=37%, 9 studies], but not in cohort studies (RR 1.01, 95% CI 0.97–1.05; I²=53%, 3 studies). Heterogeneity was significant overall (I²=55%, 12 studies). Results for 300 night shifts were similar (RR 1.04, 95% CI 1.00–1.10; I²=58%, 8 studies). Sensitivity analyses using exposure transformations such as cubic splines, a fixed-effect model, or including only better quality studies did not change the results. None of the 16 studies had a low risk of bias, and 6 studies had a moderate risk.
Conclusions. Based on the low quality of exposure data and the difference in effect by study design, our findings indicate insufficient evidence for a link between night-shift work and breast cancer. Objective prospective exposure measurement is needed in future studies.

In 2007, an expert group of the International Agency for Research on Cancer (IARC) convened on shift work and its association with cancer. Based on strong animal and weak human evidence, the expert group concluded that night-shift work that involves circadian disruption was probably carcinogenic for breast cancer among women (1).
Using diverse methods and a number of studies, four previous systematic reviews (2)(3)(4)(5) concluded that night-shift work could increase the risk of breast cancer, although the evidence was considered limited or weak in three of these. However, none of the reviews took the variation in exposure assessment between studies into account or made an attempt to model the relationship between night-shift work and the risk of breast cancer. In light of the increasing evidence and the lack of rigorous methods in previous reviews, an up-to-date assessment of the association between exposure to night-shift work and breast cancer was undertaken (6).

Inclusion criteria
We included studies on working women exposed to night-shift work. The comparison was women in day work. We included studies where the outcome was incidence of breast cancer confirmed by histopathology for ≥90% of the cases or where it would be reasonable to infer the same. We included both retrospective and prospective cohort and case-control studies.
We excluded: (i) airline crew studies because of the additional exposures (cosmic radiation, time-zone changes) and lifestyle factors in this occupation; (ii) studies reporting only mortality, benign breast disease, or other proxy outcomes; and (iii) cohorts where incidence was assessed without differentiating between exposed (to shift work) and non-exposed members. The protocol is available here: http://www.crd.york.ac.uk/PROSPERO/display_record.asp?ID=CRD42012002247.

Search, selection and data extraction
We searched Medline, EMBASE, CINAHL, PsycInfo, LILACS, OSH Update and ProQuest dissertation and theses database without date or language restriction. Our search strategy for Medline is presented in figure 1 (7)(8)(9).
We checked the references from included studies, existing systematic reviews, and expert commentaries and contacted subject experts and authors of included studies. Two authors independently selected the studies, extracted data, and assessed the risk of bias according to the recommended methods for systematic reviews, with a third person resolving disagreements. Through personal communication with authors, we tried to obtain missing information. Data on night or overnight work were chosen over data on only evening, early morning, or combined work. When no distinctions were made in the study, we assumed night or overnight work. We chose self-reported exposure over that assessed by a job exposure matrix alone when both were reported separately in a study. Data obtained from authors directly were chosen over data reported in publications or modeled by us.
For each included study, we assessed the risk of bias as low, high, or unclear against ten important sources (domains) of bias by following a validated checklist for measuring bias in studies of risk factors (7)(8)(9). Risk of bias was assessed in the following domains: (i) exposure definition, (ii) exposure assessments, (iii) blinding of assessors, (iv) reliability of assessments, (v) confounding, (vi) attrition, (vii) selective reporting, (viii) analysis methods in the study (research-specific bias), (ix) funding, and (x) conflict of interest.
Exposure definition. If the definition included at least two of the following three aspects recommended by IARC, exposure definition was considered to be at low risk of bias: shift system (rotating or fixed, forward or backward rotation); shift duration (in years); and shift intensity (per week or per month frequency). The study was considered to have a high risk of bias when it used a categorical definition with an arbitrary threshold (eg, 1 year, "ever done night work") or a definition that covers only one aspect of exposure (start or end time of shift or duration, intensity, or shift system).
Assessment of exposure. If objectively measured (direct measurement of exposure, such as logging data, shift schedule data from the human resources or employers' records, and prospective self-measurement of exposure, eg, diaries), a study was considered to have a low risk of bias in the assessment of exposure. The risk of bias was considered to be high if the exposure was assessed using subjective measures: reported by participants (interviews/questionnaires) or a proxy used to allocate exposure status (job matrix, job title).
[Figure 1. Medline search strategy (fragment): 2) "Light at night" OR "LAN"[tiab] OR ((circadian OR "biological clock" OR "sleep-wake cycle" OR "sleep-wake schedule") AND disrupt*)]

Blinding. A study was given a low risk judgment on blinding if assessors were reported or indicated to be blind to exposure status in cohort studies and to case status in case-control studies. A high risk judgment was given when it was reported or indicated in the report that assessors were not blind to exposure status (cohort studies) or case status (case-control studies).
Reliability of exposure estimates. When good inter/intra observer reliability was achieved with reported reliability values or when objective measures were used (such as log data), the reliability of exposure estimates in cohort studies was judged to have a low risk of bias. A study was considered to have a high risk in this domain when observer variability was reported by means of a subjective judgment of reliability. A lack of information was given a judgment of unclear.
Confounder assessment. We assessed confounding on two levels: whether 4 of the 5 major confounding factors/effect modifiers [age, body mass index (BMI), ethnicity, parity (number of children, age at first birth), and socioeconomic status] were assessed completely (low risk) or assessed partially (high risk), and if confounders were measured with valid methods (low risk) or not (high risk). As a rule, we gave a low risk judgment overall when both categories were marked low risk. However it was also marked low risk if two reviewers agreed that, even though one aspect was considered unclear or high risk, the results of the study were not affected by this factor: for example, when ethnicity was not assessed in a study but it was clear that ethnic variation in the sample was minimal.
Attrition. A total loss of participants (non-response in case-control studies) of ≥20% or a dropout/non-response difference between the compared groups of ≥10% or the reasons for dropout/non-response not given/different led to a judgment of high risk. Conversely a <20% loss in total and ≤10% difference in dropout/non-response between the two groups was considered low risk. A lack of information led to a judgment of unclear.
Selective reporting of results. This domain was given a high risk judgment if authors presented incomplete/ selective reporting of the tested hypotheses (compared to aim and objectives) and/or crude estimates only. A low risk grade was given when adjusted estimates were presented for all hypotheses tested as per aims, and unclear was given when not enough information was available or the hypothesis was unclearly stated.
Research-specific bias. This pertains to the analysis conducted in the study and includes three aspects: (i) the methods used to reduce bias due to research design (these methods include standardization, matching, adjustment in multivariate model, stratification, and propensity scoring), (ii) the assessment of dose-response in some way (subgroup, regression), and (iii) author justification of the sample size, in descending order of importance. When all three of these were at low risk of bias or two reviewers agreed that unclear or high risk in one of these aspects in a particular study did not affect the results significantly, the whole domain was given a low risk judgment. Authors were contacted to clarify any ambiguity.
Funding. This was assessed in two areas: source of funding and the involvement of the funding body in the research. When a study was funded by non-profit organization(s) and it was clear that the funding body was not involved in the conduct or interpretation of the research, it was considered to have low risk of bias. If one of these factors was high risk, the study was considered to have a high risk of bias; if one of these was not reported, the study was marked as having an unclear risk.

Conflicts of interest. A study was considered to have a (i) low risk of bias if there were no conflicts of interest to be declared or if declared interests were not deemed conflicting (as assessed by two reviewers), (ii) high risk if one or more authors had indicated a conflicting interest, and (iii) unclear when the information was not provided.

Bias prioritization
For the overall assessment of the risk of bias per study, we had a consensus that exposure to shift work schedules has the most relevant impact on the biological rhythm, circadian de-synchronization and re-adjustment, as well as sleep deprivation and recovery, and thus on health. Exposure definitions and assessments were, therefore, the most important domains for risk of bias in our review. Similarly, the analysis and the confounders taken into consideration may affect the reliability of a study in the context of the current review more than, for example, blinding. Therefore, we placed the domains into two hierarchical groups. Major domains of bias: (i) exposure definition, (ii) exposure assessment, (iii) reliability of assessments, (iv) confounding, and (v) analysis methods in the study (research-specific bias). Minor domains of bias: (i) blinding of assessors, (ii) attrition, (iii) selective reporting, (iv) funding, and (v) conflict of interest.
We then rated the study-level risk of bias as: low (low risk in all major domains and ≥2 of the minor domains), moderate (low risk of bias in ≥4 major and 2 minor domains), or high risk of bias (low risk of bias in <4 major domains). The detailed form is available in appendix A, www.sjweh.fi/data_repository.php.
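The study-level rating rule above can be written out as a short sketch. The function and domain names are hypothetical, and two readings are assumptions rather than statements from the review: "2 minor domains" in the moderate category is read as "at least 2", and a study with ≥4 major but fewer than 2 minor low-risk domains, which the stated rule does not cover, is returned as "unclassified".

```python
# Hypothetical domain labels; the grouping follows the major/minor split in the text.
MAJOR = ("exposure_definition", "exposure_assessment", "reliability",
         "confounding", "analysis")
MINOR = ("blinding", "attrition", "selective_reporting", "funding",
         "conflict_of_interest")


def overall_risk_of_bias(judgments):
    """Rate a study from per-domain judgments ('low', 'high' or 'unclear').

    Missing domains are treated as not low risk.
    """
    major_low = sum(judgments.get(d) == "low" for d in MAJOR)
    minor_low = sum(judgments.get(d) == "low" for d in MINOR)
    if major_low == len(MAJOR) and minor_low >= 2:
        return "low"
    if major_low >= 4 and minor_low >= 2:  # assumption: "2 minor" read as ">=2"
        return "moderate"
    if major_low < 4:
        return "high"
    # >=4 major but <2 minor low-risk domains: not covered by the stated rule.
    return "unclassified"
```

A study with low risk in four major domains and two minor domains would be rated moderate under this reading.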

Confounders
The complete set of confounders for the relationship between shift work and breast cancer can be seen in the directed acyclic graph (DAG) presented in figure 2. The appropriate adjustment set for estimating the total effect of shift work on breast cancer would include age, ethnicity, parity, and socioeconomic status, all of which are factors that causally influence shift work as well as breast cancer. We decided to adjust for these confounders and additionally for other potential confounders that were a major risk factor (30% increased risk) for breast cancer and were found to be differentially associated with shift work. The final adjustment set therefore was: age (1,11,12), ethnicity (1,13,14), socioeconomic status (or a proxy) (10,15-17), parity (16,18-21) with adjustment done for either number of children or age at first child, and body mass index (BMI) (overweight, obese) (22)(23)(24).
Some factors, although significant for breast cancer, were not found to be associated with shift work. Alcohol consumption, for example, a known, albeit weak, risk factor for breast cancer, was not differentially associated with night-shift work compared to day work and thus was not considered an important confounder (25)(26)(27).

Statistical analysis
We performed a dose-response analysis in a two stage procedure. First, we estimated a dose-response curve for individual studies. We started by assigning a single dose to each shift work exposure category reported in a study (28). For six studies where we got information from authors, we used doses as advised. We used STATA, release 12 (StataCorp, College Station, TX, USA) to calculate study-level incremental risks (29,30).
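The first stage can be illustrated with a simplified sketch. It fits a log-linear trend through the origin to category-specific relative risks using inverse-variance weights and then rescales the slope to a chosen increment such as five years. This is an assumption-laden illustration, not the Stata procedure used in the review: it ignores the correlation between category estimates that share a reference group, and the function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(RR) recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def trend_per_unit(doses, rrs, cis):
    """Weighted least-squares slope of log(RR) on dose, forced through the origin.

    doses: dose assigned to each non-reference category (e.g. years)
    rrs:   relative risk of each category versus the reference
    cis:   (lower, upper) 95% CI of each relative risk
    Simplification: category estimates are treated as independent.
    """
    num = den = 0.0
    for x, rr, (lo, hi) in zip(doses, rrs, cis):
        w = 1.0 / se_from_ci(lo, hi) ** 2
        num += w * x * math.log(rr)
        den += w * x * x
    return num / den, math.sqrt(1.0 / den)


def rr_per_increment(slope, se, increment=5.0, z=1.96):
    """Rescale a per-unit log-linear slope to an RR (with 95% CI) per increment."""
    return (math.exp(increment * slope),
            math.exp(increment * (slope - z * se)),
            math.exp(increment * (slope + z * se)))
```

For example, a per-year slope of 0.02 on the log scale corresponds to a relative risk of exp(5 × 0.02) per five years of exposure.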
In the second stage, we combined the study-specific estimates with a random effects meta-analysis model for trend estimation. Several mechanisms have been hypothesized in the literature with at least some evidence of biological plausibility for night-shift work having a causative link to breast cancer (31). We were limited to analyzing exposures as measured in the studies. Studies usually reported the total number of years in night-shift work without distinguishing between continuous and interspersed exposure years, indicating an assumption that the effect is because of the total exposure years. We assumed the same for this review. Furthermore, two studies reporting risk per year of night work indicated a very small effect estimate for one year. We therefore took five years (irrespective of intensity or continuity) as an exposure long enough to show a meaningful difference in effect, and 300 shifts as equivalent to maximum intensity night work for one year (6 shifts per week=288 shifts), as the best proxy for the circadian disruption related to night-shift work. We present the risks for five years and 300 night-shift increases, respectively, as the most relevant biological doses for subgroups of case-control and cohort studies.
We took both odds ratios (OR) and risk ratios (RR) as valid estimates of the relative risk because of the low incidence of breast cancer.
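The reasoning behind treating OR and RR interchangeably can be made explicit with a short derivation. The symbols p_0 and p_1 (risk of breast cancer in unexposed and exposed women) are introduced here for illustration and do not appear in the original text.

```latex
\mathrm{RR} = \frac{p_1}{p_0}, \qquad
\mathrm{OR} = \frac{p_1/(1-p_1)}{p_0/(1-p_0)}
            = \mathrm{RR}\cdot\frac{1-p_0}{1-p_1}
```

When both risks are small, the factor (1-p_0)/(1-p_1) is close to 1, so OR ≈ RR. As a purely hypothetical illustration, with p_0 = 0.02 and RR = 1.09 the odds ratio is about 1.092.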
We assessed statistical heterogeneity with the I² statistic. We performed a meta-regression analysis in STATA with the following pre-specified study level effect modifiers: occupation, site of the study, type of shift system, and study design. We tested our model assumptions in an a priori defined fixed effect analysis and by exclusion of high risk studies.
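A minimal sketch of pooling study-specific log relative risks and computing I² follows. The text does not state which between-study variance estimator was used; the DerSimonian-Laird estimator below is an assumption chosen because it is widely used, and the function name is hypothetical.

```python
import math


def random_effects_pool(log_rrs, ses):
    """Pool log(RR) estimates with a DerSimonian-Laird random effects model.

    Returns the pooled log(RR), its standard error, tau^2 and I^2 (in percent).
    """
    w = [1.0 / s ** 2 for s in ses]                      # fixed effect weights
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_rrs)) / sw
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_rrs))  # Cochran's Q
    df = len(log_rrs) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0      # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_star = [1.0 / (s ** 2 + tau2) for s in ses]        # random effects weights
    pooled = sum(wi * y for wi, y in zip(w_star, log_rrs)) / sum(w_star)
    return {"log_rr": pooled, "se": math.sqrt(1.0 / sum(w_star)),
            "tau2": tau2, "i2": i2}
```

The pooled relative risk and its 95% CI are obtained by exponentiating `log_rr` and `log_rr ± 1.96 × se`.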
We tested our choices for assigning a dose to the open ended highest categories by capping the ≥20 years highest exposure categories using the lowest bound of the category as the dose value. In some previous studies, authors reported increased risk with only very long exposures (5,16,32). Thus a linear model would not hold. We therefore tested the assumptions underlying the dose-response relationship by fitting a cubic spline model with various knots, and by using the natural logarithm of the dose for the exposure to see if this improved the goodness of fit.
We tried to avoid reporting biases by including studies irrespective of language and publication status and by contacting authors. We assessed publication bias by observing funnel plot asymmetry and performing the Egger's test to ascertain bias due to small studies (33).
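Egger's test can be sketched as an ordinary least-squares regression of the standardized effect (log RR divided by its standard error) on precision (1 divided by the standard error); the intercept is the coefficient that measures small-study asymmetry. The sketch below returns the intercept, its standard error, and the t statistic only. A P-value would come from a t distribution with k-2 degrees of freedom, which is left out to keep the example dependency-free. The function name is hypothetical.

```python
import math


def egger_intercept(log_rrs, ses):
    """Egger regression: standardized effect on precision, unweighted OLS.

    Returns (intercept, standard error of intercept, t statistic).
    Requires at least three studies with differing standard errors.
    """
    z = [y / s for y, s in zip(log_rrs, ses)]   # standardized effects
    p = [1.0 / s for s in ses]                  # precisions
    n = len(z)
    pbar, zbar = sum(p) / n, sum(z) / n
    sxx = sum((pi - pbar) ** 2 for pi in p)
    sxy = sum((pi - pbar) * (zi - zbar) for pi, zi in zip(p, z))
    slope = sxy / sxx
    intercept = zbar - slope * pbar
    rss = sum((zi - intercept - slope * pi) ** 2 for pi, zi in zip(p, z))
    se_int = math.sqrt(rss / (n - 2) * (1.0 / n + pbar ** 2 / sxx))
    t = intercept / se_int if se_int > 0 else float("nan")
    return intercept, se_int, t
```

An intercept near zero suggests a symmetric funnel plot; a clearly positive or negative intercept suggests that small studies report systematically different effects.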
We used the approach of the Scientific Committee of the Danish Society of Occupational and Environmental Medicine and the GRADE approach (supplementary appendix B, www.sjweh.fi/data_repository.php) for grading the quality of the total evidence (34).
Exposure definitions mostly included start and end times and duration in years. Frequency of shifts per week or month was part of the definition in five studies. Four studies included the shift system as part of the exposure definition. None included all three aspects advised by the IARC (shift system, years of shift work, and shift intensity) (49).
Ten studies reported exposure as binary categorical data (yes versus no shift work). Twelve studies reported categories of increasing years of exposure and two reported increasing duration with increasing frequency categories. Six studies reported cumulative lifetime number of shifts for various exposure levels. Of the five confounding factors, age and parity were adjusted for most often. Eight studies adjusted for all five confounders (table 1c).
No study had an overall low risk of bias and six studies were of moderate risk (37, 39-41, 44, 48) (table 2 and appendix D on www.sjweh.fi/data_repository.php). The same six studies had a low risk of bias in how they defined night-shift work. For method of exposure measurement, only one study used objective exposure assessment from prospectively collected records and consequently had a low risk of bias (39). Thirteen studies (16, 32, 35-37, 39-45, 48) were considered to have a low risk of bias for reliability of exposure assessment. Ten studies had a low risk of bias in adjustment for confounding factors (16, 36-38, 40, 41, 43-45, 48) and ten studies had low risk in the analysis domain (16, 32, 37, 39-41, 43-45, 48). Nine studies had a low risk of bias for blinding (16, 36, 37, 40, 43-46, 48) and nine had low risk in the domain of attrition (16, 32, 36, 39, 41, 42, 44, 45, 47). Authors confirmed that sponsors had no role in conduct or reporting of 12 studies while 13 reported no conflict of interest or this was confirmed by the authors.

Effects of exposure
No specific dose relationship between the exposure and the risk of breast cancer in the individual studies was [...] 4). The transformation of data is presented in table 3.
Three cohort studies (38,39,46) and one case-control study (36) could not be included in our meta-analysis because of insufficient data (table 4) to allow calculation of a 5-year exposure risk. These studies, with the exception of Li (39), did not report duration categories of exposure. If control numbers become available, the addition of the Li study to our analysis would improve the precision of our results.
When we looked at the effect of the type of occupation, site of study, and shift system (rotating, fixed, rotating and fixed together) simultaneously, none were significantly related to the risk for breast cancer in the meta-regression analysis. The results of fixed effect analyses were similar to random effects analyses with narrower confidence intervals. The test for non-linearity was non-significant (P>0.05) with log dose, quadratic dose, and cubic splines models fitted in all studies. The linear model fitted the data of the included studies best.
Restricting the result to the moderate risk studies (ie, those of better quality) did not change the results. Five years of night work gave a relative risk of 1.06 (95% CI 0.98-1.14) and for 300 night-shifts it was 1.05 (95% CI 0.95-1.16). The differences between case-control and cohort studies were retained.
A sensitivity analysis in which we capped the highest exposure categories to their lowest bound did not change the results.
Median exposure in the case-control studies was 4 years and the predicted relative risk at this exposure in a post-hoc analysis was 1.07 (95% CI 1.02-1.12). Median exposure in the three cohort studies was 9.5 years with a relative risk of 1.02 (95% CI 0.97-1.10).
[Figure: x-axis label "Relative risk per 5-year exposure increase (log scale)"]
The funnel plot represented in figure 7 indicates that small studies might be missing on the side of no effect. However, the Egger test was not significant (Egger's coefficient=0.94, 95% CI -0.7 to 2.6; P=0.24).
Using the GRADE method (34), we judged the evidence to be of very low quality. According to the approach of the Danish Occupational Medicine Association on grading the strength of causality, there is insufficient evidence of a causal association (grade 0).

Discussion
Based on a meta-analysis of 12 of the 16 included studies, we found an average 5% incremental relative risk increase with 5 years of night-shift work. However, cohort studies showed a very small, non-significant risk increase of 1% as opposed to a 9% average increase in case-control studies. Different exposure models or sensitivity analyses did not change these results.
Our search was comprehensive, and there was no strong indication of publication bias from Egger's test. Many studies were conducted in Scandinavia where the issue of breast cancer and night-shift work seems to be a topic of debate. Quite a few of these were register-linked studies. However, many more such registers exist worldwide that remain untapped and, therefore, we believe that the included studies alone could form an incomplete picture (50). Many studies referred to nurses and few to the general population and, therefore, it is likely that the results are more applicable to nurses.
Similarly most studies were from high income, white populations and thus the pooled results apply largely to these. Of the four studies that could not be included in the meta-analysis, only one (39) had adequate exposure assessment, which interestingly is part of a thesis not yet available as a journal publication. The addition of this large, nested, case-control study would have increased the precision of our results.
We consider the overall quality of the evidence to be low. The most important risk of bias in the studies included in the review was exposure measurement. Exposure to shift work measured by interview or questionnaires has been shown to be influenced by respondent characteristics in a recent study (51). We do not know of other validation studies on night-shift work exposure assessment by self-report. Some improvements in validity could be achieved with repeated questionnaires, as was done in two cohort studies (38,45). Use of self-report complemented by expert assessment/categorization could similarly improve the validity of exposure assessment in case-control studies (52). A job exposure matrix is a useful tool in epidemiological studies for assessing variation in exposure across jobs. However, since night-shift work exposure varies within an occupation, we believe that this method alone is too imprecise. It is conceivable that retrospective exposure assessment of shift work in interviews or questionnaires, as was the case in most case-control studies, would be subject to recall bias. Especially now, when the association of shift work and breast cancer has gained a lot of publicity, one could imagine that a woman with breast cancer better recalls and reports her shift-work exposure than a woman without breast cancer.
The cohort study design generally provides less biased results for causality especially when the exposure has been ascertained before the disease has occurred. We found exposure assessment of sufficient quality in only one study, a nested case-control study by Li (39). Knutsson et al (38) had probably the most comprehensive prospectively collected questionnaire data but this valuable information was not put to use when categorizing exposure for analysis. Therefore, Li (39) is probably the more reliable, albeit including only Chinese participants. Asian women, based on an unknown genetic disposition, may be less at risk for breast cancer (14).
It is a common assumption in observational epidemiology that it can always be predicted in which direction the effect size would change (inflate or attenuate) as a result of bias. We, however, concur with Rothman et al (53,54) that this is not the case. We had planned to adjust for bias at study level due to confounding using the methods prescribed by Greenland et al (29); however, this was possible for only one study due to lack of relevant data. We took the risk of bias into account by conducting a sensitivity analysis.

Limitations
We used an established method of modeling category-specific risk estimates into an incremental risk estimate assuming a linear dose-response. Based on additional information from study authors, we found that our model was accurate in all except the highest, usually open, category, where it overestimated the dose. However, a post-hoc sensitivity analysis using the lowest dose for these categories did not change the results. We tested our model assumptions by applying cubic splines and log transformation of dose and found the results unchanged. We could not take into account any latency period because only one study assessed it, finding no increase in risk when adjusting for a lag of 10 and 20 years of exposure to rotating night shifts (39). Over-adjustment for confounding in studies is a problem like under-adjustment. However, we consider the effect of any potential over-adjustment to be minimal because, for many of these established confounders of breast cancer, the association with night-shift work is weak.
It was not possible to examine and draw a conclusion on intensity of night-shift work or permanent night-shift work in our meta-analysis. We did not have a real cumulative index in which both duration and intensity of exposure were measured. It would be good to develop such an index. Finally, the future addition of ongoing studies to these results should improve the precision of our findings.

Agreements with other studies and reviews
In contrast to the previous reviews (2)(3)(4)(5), this review followed an a priori protocol comparing night-shift with day work. Our review includes 4-8 more studies than previous reviews. None of the previous reviews modeled the dose-response relationship appropriately, in individual studies, to inform the choice of a dose-response model or tested these assumptions.
Besides this, we performed a formal risk of bias assessment for the included studies and incorporated these assessments in the analysis and conclusions drawn where none of the other reviews did so. We consider this extremely important as the quality of the studies, especially in exposure assessments, was the major factor in coming to clear conclusions.
Our findings are different from the reviews of Megdal et al (4) and Erren et al (3) with respect to the strength of the association. Kamdar et al (2) found a 13% increase in risk for up to eight years of night work, close to our findings, but this included flight crew studies in addition to night-shift studies. Kolstad's review (5) did not include a meta-analysis, although a later publication indicated a non-significant risk (RR 1.02, 95% CI 0.92-1.13) (55). This meta-analysis included at least one study outside our inclusion criteria, and it was not clear which estimates from each study were entered in the analysis.
[Figure 7. Funnel plot with pseudo 95% confidence limits]

Implications for practice and research
Based on the low quality of evidence and the difference in effect estimate by study design, there is insufficient evidence for a link between night-shift work and breast cancer. For the same reasons we cannot rule out a relationship between the two. The uncertainty is largely due to less-than-valid exposure measurement and can only be resolved by means of better data in the future. Evidence from the two moderate risk Chinese studies indicates no increased risk for this population.
We need studies in which exposure is measured in an objective way before the disease has occurred, ideally in cohorts with long, prospective follow-up. Validation studies of interview/questionnaire data are needed as well to find out if, and to what extent, recall bias occurs.