Effects of a stress management intervention on absenteeism and return to work – results from a randomized wait-list controlled trial

Objective High levels of work-related stress are associated with increased absenteeism from work and reduced work ability. In this study, we investigated the effects of a stress management intervention on absenteeism and return to work. Methods We randomized 102 participants into either the intervention or wait-list control (WLC) group. The intervention group received the intervention in weeks 1–16 from baseline, and the WLC group received the intervention in weeks 17–32. Self-reported data on absenteeism (number of days full- or part-time absent from work within the previous three months) were obtained at 16, 32, and 48 weeks follow-up. Register-based data on long-term absence from work were drawn from the Danish public transfer payments (DREAM) database from baseline and 48 weeks onwards. The DREAM database contains weekly information on long-term sickness absence compensation. The threshold to enter DREAM is sick leave for two consecutive weeks. Results At follow-up in week 16, self-reported absenteeism in the intervention group [median 11 days (range 3–25)] was lower (P=0.02) than in the WLC group [median 45 days (range 19–60)], corresponding to a 29% [95% confidence interval (95% CI) 5–52] reduction. On register-based data (cumulated weeks in DREAM, weeks 1–16), the intervention group median [6 weeks (range 0–11)] was lower than that of the WLC group [median 12 weeks (range 8–16)], though not significantly (P=0.06), corresponding to a 21% (95% CI 0–42) reduction. For return to work, a hazard ratio of 1.58 (95% CI 0.89–2.81) favoring the intervention group was found (P=0.12). Conclusions The intervention reduces self-reported absenteeism from work. A similar trend was found from register-based records. No conclusive evidence was found for return to work.

Absenteeism from work has been associated with concurrent increasing levels of work-related stress in European countries (1) and is a global measure of workers' health (2). Evidence on the prevention of work disability from mental health problems is scarce. In three recent Cochrane reviews (3-5), only two studies targeted work ability directly (6,7).
Cognitive behavioral stress management interventions often use only psychological outcomes (8-10). In a recent review, 4 out of 36 studies used absenteeism as an endpoint; none of these used a cognitive behavioral approach (8). As pointed out by de Vente et al (11), the majority of previous studies targeted non-clinical samples. An exception to this is a string of Dutch studies (6, 11-15), of which three studies are relevant to the present study.
In a study by de Vente et al (11), contrary to the authors' hypothesis, individual- and group-format cognitive behavioral stress management interventions led to more days absent compared with care-as-usual. Workers (N=82) on >2 weeks of sick leave, with no selection on occupation, were included.
Studies by both Klink et al (12) and Blonk et al (6) have demonstrated an effect on absenteeism by approaches based on a cognitive behavioral rationale and pre-structured graded activity time schemes. The Klink et al study (12) included postal company workers (N=192) on their first sick leave, while Blonk et al (6) included self-employed people (N=122) on sick leave.
Willert et al
In our study, we conceptualized work-related stress as the experience of intense negative cognitions, emotions, and physical sensations in relation to repeated critical situations at work, typically involving perceived demands that one is not able to meet (16), and negative expectancies of coping with future situations (17).
An inherent limitation of studies of work ability and absenteeism from work is that the legislation governing the labor market differs between countries. This weakens comparability of studies across countries. In Denmark, sick leave extending beyond two weeks must be sanctioned by the worker's general practitioner. Workers are permitted sick leave for ≤52 consecutive weeks with full compensation.
It has been discussed how to measure absenteeism from work and return to work (18) and which method (ie, using self-report or register-based data) is preferred (19). Young et al (18) note that no consensus on the appropriate outcomes of return-to-work interventions exists; they advocate a multidimensional approach. Pole et al (19) suggest that researchers should carefully consider the most appropriate measure in the context of a particular study, potentially collecting both self-report and register-based data. From the Whitehall study, Ferrie et al (20) found good agreement between self-reported data and employers' registers of sickness absence.
Numerous ways of assessing absenteeism have been proposed, including (i) incidence; (ii) cumulative duration from ≥1 absence spells; (iii) time until first, or lasting return to work; and (iv) time until first recurrence of sickness absence (4,21). As a measure of absenteeism from work, we have used cumulative duration from ≥1 absence spells, since this measure is not dependent on whether or not participants were on sick leave at the time of inclusion in the study and could be measured using both self-report and register-based data. Furthermore, for those on sick leave at inclusion, we looked at time until lasting return to work. For those not on sick leave at inclusion, we looked at time until first incidence of sick leave.
The intervention was directed at workers who were either at risk of going on sick leave or returning from a period of sick leave. Returning from sick leave is a transition often feared due to the renewed exposure to work. In one recent study, fear-avoidance beliefs about work were the most important risk factor for not returning to work among workers on long-term sick leave (22). Workers typically fear not being able to cope with work, the subsequent reappearance of their symptoms, and the risk of renewed sick leave. Both for those returning to work and those already active at the workplace, the goal of the intervention was to improve the ability to cope with experienced demands at work and reduce the need for sick leave to cope with the situation. We expected the effects of the intervention to take place either from the onset of the group sessions, through the perceived help and support offered, or alternatively following the first four weekly sessions, where most of the intervention tools were introduced. Our expectation of change early rather than late in the stress management intervention was adapted from the literature on the effects of psychological interventions, where the most rapid changes in symptom relief appear in the earlier phases of treatment (23).
The objectives of this study fall into two parts. In hypothesis 1, we examine whether a group-format cognitive behavioral stress management intervention reduces absenteeism from work, measured as cumulative duration of sickness absence from ≥1 absence spells. In hypothesis 2, we examine (i) whether the intervention shortens the time to lasting return to work for those on sick leave at the time of inclusion in the study and (ii) whether the intervention reduces the incidence of new spells of sick leave for those working at inclusion.

Study design
The study used a randomized wait-list control design (figure 1). Participants were randomized to either the intervention group or to a wait-list control (WLC) group, after their baseline measurement. After three months on the wait-list, the WLC group also received the intervention. Participants in the WLC condition were not hindered in seeking supplementary help while on the wait-list, nor were the participants hindered from seeking help upon completion of the treatment.
Follow-up from baseline was 48 weeks. Questionnaires were obtained at 16, 32, and 48 weeks. Register-based data on long-term sick leave were drawn from baseline and 48 weeks onwards.

Sample size and inclusion period
An a priori power calculation, based on one of the main outcome measures of the study [ie, the Perceived Stress Scale (PSS)], estimated the necessary sample size to be 90 participants. This would allow for detection of a between-groups difference of one standard deviation (SD) from the score at baseline (24,25). The sample size calculation was based on confidence level: 95%; power: 80%; SD: 5; intra-class correlation coefficient: 0.15; and average cluster size: 9. To allow for a 10% dropout, 102 participants were included. At the time of performing the power calculation, the estimated sample size was considered adequate for all outcome measures included.
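The role of the intra-class correlation coefficient and cluster size in such a calculation can be illustrated with a minimal sketch. It assumes the common normal-approximation formula for comparing two means, inflated by the design effect 1 + (m - 1) × ICC for group-delivered treatment. The function names and the `delta` parameter (the detectable difference in PSS points) are illustrative assumptions, and the sketch is not claimed to reproduce the authors' own calculation or their figure of 90.

```python
from math import ceil
from statistics import NormalDist


def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation caused by delivering treatment in groups (clusters)."""
    return 1.0 + (cluster_size - 1.0) * icc


def n_per_group(delta: float, sd: float, alpha: float = 0.05, power: float = 0.80) -> float:
    """Unadjusted sample size per arm for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    return 2.0 * ((z_alpha + z_beta) * sd / delta) ** 2


def clustered_total(delta: float, sd: float, cluster_size: float, icc: float,
                    alpha: float = 0.05, power: float = 0.80) -> int:
    """Total sample size over both arms after applying the design effect."""
    per_group = n_per_group(delta, sd, alpha, power) * design_effect(cluster_size, icc)
    return 2 * ceil(per_group)


# Values stated in the text: SD 5, ICC 0.15, average cluster size 9.
# delta is a hypothetical input chosen by the reader.
deff = design_effect(cluster_size=9, icc=0.15)
total = clustered_total(delta=5.0, sd=5.0, cluster_size=9, icc=0.15)
```

The required total scales with the inverse square of `delta`, so the assumed detectable difference dominates the result; the design effect then multiplies it by a constant factor.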
Induction into the study took place over a period of ten months, from December 2006 through September 2007, with groups commencing in succession from January-December 2007.

Referral
Persons from the working population (18-67 years) in the municipality of Aarhus could participate in the study. Referral was available through local general practitioners, union social workers, and direct inquiry.
In total, 173 persons were referred to participate, as illustrated in figure 1. Out of this group, 156 persons were invited to an assessment interview to determine eligibility, while 17 persons were excluded (see figure 1 for reasons). From the assessment interview, 102 persons were invited and accepted to participate, while 54 persons were not included. All persons not included were informed about alternatives.

Assessment and eligibility
A clinical psychologist (>5 years' training) undertook a semi-structured assessment interview with potential participants. Inclusion criteria included persistent symptoms of work-related stress, defined by physiological and psychological symptoms of sustained animation, lasting >4 weeks, and elevated reactivity of symptoms to demands at work. Motivation to remain employed and, if on sick leave, a planned return to work within ≤4 weeks were required in order to comply with the intervention rationale of homework assignments between group sessions, in which the techniques learned in the groups were applied at work. Participants were either on sick leave following an assessment by their general practitioner or working. For the latter, a score of ≥20 points on the PSS was required [equaling 1.0 SD above the population mean reported by Cohen & Williamson (25)].
Exclusion criteria were: (i) >26 consecutive weeks of sick leave (to select individuals recently active at their workplace and deselect those at risk of falling under social service regulations); (ii) substantial psychosocial strains outside of work; (iii) bullying as the main problem; (iv) severe psychiatric condition or a history of repeated psychiatric conditions; and (v) current abuse of alcohol or psychoactive stimulants.

Allocation
The study used block randomization in blocks of six, generated using the RANNOR computer algorithm (SAS Inc, Cary, NC, USA). After the baseline measurement, an independent individual opened the envelopes containing the participants' allocation. After randomization, the intervention and WLC groups each comprised 51 participants. At the first measurement after baseline, 15 participants did not complete their follow-up measurement (figure 1).
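Block randomization in blocks of six can be sketched as follows. This is an illustration only: it uses Python's standard `random` module rather than the SAS RANNOR routine used in the study, and the function name, arm labels, and seed are assumptions.

```python
import random


def block_randomize(n_participants, block_size=6, arms=("intervention", "WLC"), seed=None):
    """Return an allocation list in which every full block is balanced across arms."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    allocation = []
    while len(allocation) < n_participants:
        # Each block holds an equal number of slots per arm, in random order.
        block = [arm for arm in arms for _ in range(per_arm)]
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_participants]


# 102 participants fill exactly 17 blocks of six, giving 51 per arm.
allocation = block_randomize(102, seed=2007)
```

Because 102 is a multiple of six, every block is complete and the two arms are of equal size, in line with the 51 participants per group reported above.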

Intervention
Each group comprised nine participants and met for eight 3-hour sessions over a period of three months. Groups were led by one of two licensed clinical psychologists, each with >5 years of clinical experience and a 1-year advanced training course in cognitive behavior therapy. Groups met for weekly sessions the first four weeks, and then every fortnight for the remaining four sessions. Treatment was manualized, and used a slide show to set the agenda for each group session, promoting uniform delivery of the intervention between groups.
A goal of the intervention was to enable the participants to cope with stressful situations at their workplace and strengthen their ability to be active at work, despite their current difficulties. This goal was underpinned by the content of the group sessions, the main topics of which were: (i) introduction to cognitive behavior therapy, (ii) psychoeducation on stress, (iii) identifying dysfunctional thinking, (iv) modifying dysfunctional thinking, (v) communication and stress, (vi) communication skills training, (vii) implementing strategies at work, and (viii) review of techniques. Between group sessions, participants completed homework assignments aimed at promoting implementation of the techniques learned in the groups at work.

Outcome measures
Two independent measures of absenteeism from work were used: one measure was a self-reported questionnaire, the other comprised data from a national database of public transfer payments. The two measures represent overlapping but not identical time periods during follow-up. The self-reported data consist of information on three-month periods in retrospect at three follow-up points that are four months apart, while the register-based data consist of continuous week-by-week registrations in three follow-up periods of 16 weeks each.

Self-reported data
At follow-up measurements, participants reported in a questionnaire the number of days on full or partial sick leave in the preceding three months. Two questions covered this dimension, worded as follows: "How many full working days have you been on sick leave from your work in the last three months?" and "How many days have you been working reduced hours in the last three months?" After each question, there was space for the participant to fill in the number of days. The numbers of days reported for the two questions were added to give a single measure of full or partial sick leave from work, which allows for comparability with the register-based data.

The DREAM database
In Denmark, 102 types of public transfer payment to Danish citizens have been registered week-by-week in a national registry since 1991 (the so-called DREAM database). Once registered in the database, it is possible to change the type of transfer payment registered between the major types of registrations (eg, "full sick leave" to "unemployment"). A limitation of the database is that changes within the "family" of sick leave registrations (eg, full and partial sick leave) cannot be distinguished within the same period of sickness absence. Termination of registration occurs following the first full week of not receiving any type of transfer payment.
Data on registrations in the DREAM database were obtained from each participant's date of randomization and 52 weeks ahead, as well as back in time.
When we examined the mean number of weeks between self-reported measurements, the three-month intervals specified in the research protocol turned out to average four months, for logistic and practical reasons. Accordingly, registrations in the DREAM database were divided into three 16-week intervals, covering 48 weeks in total, corresponding to the time intervals in the study design (see time line in figure 1).
At the onset of the trial, registration in DREAM covered either "no registration" or a registration of "part- or full-time sick leave", with the exception of one participant registered with "early disability pension". The threshold for registration in the database with full or partial sick leave compensation is two consecutive weeks on sick leave. As the trial timeframe moves through the 48 weeks, registrations of the participants diversify into five additional categories: (i) unemployment, (ii) public education grant, (iii) flexible job (a Danish labor market arrangement for people with reduced ability to work, in which the wage is partly compensated), (iv) rehabilitation, and (v) maternity leave.
Registrations in DREAM of part- or full-time sick leave were used in the analysis of cumulative weeks registered in DREAM within the different phases of the trial. For the analysis of return to work, a registration of part- or full-time sick leave in DREAM was used in conjunction with unemployment as negative outcomes, while no registration in DREAM, public education grant, flexible job, rehabilitation, and maternity leave were all defined as positive or neutral outcomes.
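The outcome coding described above can be written down as a small lookup. The category labels below are hypothetical short names chosen for illustration, not the actual DREAM codes. Categories whose classification is not stated in the text (for example, early disability pension) are deliberately left unclassified and raise an error.

```python
# Hypothetical labels for the registration types named in the text.
NEGATIVE = {"sick_leave", "unemployment"}
POSITIVE_OR_NEUTRAL = {
    "no_registration",
    "education_grant",
    "flexible_job",
    "rehabilitation",
    "maternity_leave",
}


def classify_week(registration: str) -> str:
    """Map one weekly registration to its outcome class for the return-to-work analysis."""
    if registration in NEGATIVE:
        return "negative"
    if registration in POSITIVE_OR_NEUTRAL:
        return "positive_or_neutral"
    # The text does not state how other registrations were coded.
    raise ValueError(f"classification not specified for: {registration!r}")


weeks = ["sick_leave", "sick_leave", "no_registration", "flexible_job", "unemployment"]
coded = [classify_week(w) for w in weeks]
```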

Statistical analysis
For statistical analyses, we used the STATA (Stata Corp LP, College Station, TX, USA) software package. Baseline characteristics were compared using the Chi-squared test of comparable distributions and the Student's t-test. Both self-reported and register-based data were skewed, depicting a U-shape in a histogram, reflecting many participants with either no or the maximum amount of absenteeism from work. As a result, the Mann-Whitney U-test was used to test for differences in the cumulative number of days and weeks in the different phases of the trial. Somers' D was calculated to estimate the percentage difference in sick leave registrations between two randomly chosen participants from the intervention and WLC groups.
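The relation between the Mann-Whitney U statistic and Somers' D for a two-group comparison can be sketched in a few lines. This is a plain-Python illustration on made-up data, not the Stata routine used in the study, and it computes no P-value or confidence interval. For two groups, D equals the probability that a randomly chosen member of the first group has a larger value than a randomly chosen member of the second, minus the probability of the reverse.

```python
def mann_whitney_u(x, y):
    """U statistic for x: number of (x_i, y_j) pairs with x_i > y_j, ties counting one half."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def somers_d(x, y):
    """Somers' D for two groups: P(x > y) - P(x < y), which equals 2U / (n_x * n_y) - 1."""
    return 2.0 * mann_whitney_u(x, y) / (len(x) * len(y)) - 1.0


# Made-up days absent, for illustration only.
wlc_days = [10, 20, 30]
intervention_days = [0, 20, 25]
d = somers_d(wlc_days, intervention_days)
```

A positive value of `d` here means that a randomly chosen wait-list participant tends to report more days absent than a randomly chosen intervention participant; multiplying by 100 expresses it as a percentage.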
The cumulative probability of being registered in the DREAM database over time was depicted in a Kaplan-Meier plot, and the difference between the two groups was tested with a Cox regression. "Leaving the DREAM database" was defined as four consecutive weeks with no registration in the database. The proportional hazards assumption was validated by visual inspection of a log-log plot of the survival curves and by the proportional hazards test.
For the self-reported data, participants who dropped out of the study or failed to complete their follow-up measurement for a given phase of the trial could not be included in the analyses (see figure 1 for the number of participants with incomplete data). Register-based data were not affected by dropout and were analyzed as intention-to-treat.
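The event definition for the time-to-event analysis can be made concrete with a minimal sketch. It assumes a per-participant list of weekly flags and, as a further assumption not stated in the text, dates the event to the first week of the four-week run without registration. Participants who never reach such a run are returned as `None`, that is, as censored.

```python
def weeks_to_lasting_return(weekly_registered, run_length=4):
    """Return the 1-based week in which the first run of `run_length` consecutive
    unregistered weeks starts, or None if no such run occurs (censored).

    weekly_registered[i] is True if week i + 1 carries a registration.
    """
    run = 0
    for i, registered in enumerate(weekly_registered):
        run = 0 if registered else run + 1
        if run == run_length:
            return i - run_length + 2
    return None


# A short break of two weeks (weeks 2-3) does not count as leaving the database;
# the lasting return starts in week 5.
history = [True, False, False, True, False, False, False, False]
event_week = weeks_to_lasting_return(history)
```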
When measuring the number of days or weeks of sick leave, the intervention and WLC groups can be compared in two different ways in this study design. One form of comparison is to look at the difference between the two groups in the first phase of the trial, where the intervention is compared with no intervention, represented by the WLC condition. With reference to figure 1, this means investigating differences between the two groups on the T1 reporting of days absent in the past three months for the self-reported data and in the interval from week 1-16 for the DREAM database data. Another mode of comparison is to look at the whole timeframe of the study and compare the two groups as a case of early or delayed intervention. One then investigates whether the amount used of the given resource accumulates over time, depending on whether the intervention comes early or is delayed. Referring again to figure 1, this can be achieved by looking at DREAM database registrations in weeks 1-16, 17-32, and 33-48, as well as in the whole timeframe (ie, weeks 1-48).
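The phase-wise accumulation described above can be sketched as follows, assuming one list of 48 weekly sick leave flags per participant. The data layout and names are illustrative assumptions; only the week boundaries are taken from the text.

```python
# Phase boundaries (1-based, inclusive) as given in the text.
PHASES = {
    "weeks 1-16": (1, 16),
    "weeks 17-32": (17, 32),
    "weeks 33-48": (33, 48),
    "weeks 1-48": (1, 48),
}


def cumulative_weeks(weekly_sick_leave):
    """Count weeks registered with sick leave in each phase for one participant."""
    if len(weekly_sick_leave) != 48:
        raise ValueError("expected 48 weekly flags")
    return {
        label: sum(weekly_sick_leave[start - 1:end])
        for label, (start, end) in PHASES.items()
    }


# Illustrative participant: registered for the first 20 weeks, then not at all.
flags = [True] * 20 + [False] * 28
totals = cumulative_weeks(flags)
```

Comparing the "weeks 1-16" counts between groups corresponds to the intervention versus no intervention comparison, while the "weeks 1-48" counts correspond to the early versus delayed intervention comparison.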

Baseline characteristics
Demographic characteristics of participants at the time of inclusion to the trial are presented in table 1. No significant differences were found between the two groups.
A total of 40 participants were not registered in DREAM at inclusion in the study, while 61 participants were on part- or full-time sick leave, and 1 participant was registered with early disability pension (see table 1). At the end of the trial, in week 48, a total of 75 participants were not registered in DREAM, 16 were registered with part- or full-time sick leave, and 11 participants had other registrations [unemployment (N=2), education grant (N=2), flex job (N=4), rehabilitation (N=2), early disability pension (N=1), and maternity pay (N=1)].
A total of 14 participants in the WLC group consulted a psychologist outside of the study, with a mean of 5.4 visits. Surprisingly, 13 participants from the intervention group also consulted a psychologist outside of the study while still attending their group, with a mean of 3.1 visits.

Hypothesis 1: cumulative duration of sickness absence
In table 2, results on self-reported absenteeism from work, represented by days full- or part-time absent from work in the preceding three months, are presented. Median and mean days absent are presented for both groups, and results of the Mann-Whitney U-test are displayed, comparing the intervention to the wait-list control condition. Using Somers' D, a 29% [95% confidence interval (95% CI) 5-52] reduction of reported days on sick leave was found.
For the self-reported data, a number of participants dropped out of the study and did not provide data at the follow-up measurements (see figure 1). Dropout analyses were performed and revealed no systematic differences between those dropping out of the study and those remaining in terms of gender, age, sick leave status or PSS-score at inclusion. Also, no systematic differences were found between those dropping out of the intervention and WLC groups, respectively.
Results on long-term absence from work, represented by the cumulative number of weeks registered with either part- or full-time sick leave in the DREAM database, are presented in table 3. Results are displayed for each of the phases of the trial, the entire timeframe of the trial, and the 48 weeks prior to randomization. Results of the Mann-Whitney U-tests are presented, comparing the two groups in the first phase of the trial, in the entire timeframe of the trial, and in the 48 weeks prior to randomization. Using Somers' D, a 21% (95% CI 0-42) reduction in DREAM registrations of sick leave was found.
To control for possible gender differences driving the observed effects, the analyses were re-run for women only. This only slightly affected the estimates.

Supplementary analysis
We have performed supplementary analyses, looking separately at those working and those on sick leave at inclusion to the study. We are aware this introduces a division of the study population in addition to that provided by the randomization. However, since the distribution of those on sick leave and those working is almost equal in the two groups at the time of inclusion (see table 1), we were motivated to look at these two groups separately. This may provide insight into differences in the effects of the intervention depending on the participants' starting point.
For the self-reported data of those working at time of inclusion in the study, at the first follow-up measurement (T1 in figure 1), we found a median number of 4.5 days (range 2-14) on sick leave for the intervention group, compared to a median of 7.5 days (range 1-40) for the WLC group (P=0.33). For those on sick leave at inclusion to the study, the intervention group reported a median of 32 days (range 7-66), compared to a median of 61.5 days (range 43-90) in the WLC group (P=0.07).

Hypothesis 2b: incidence of new sick leave spells
We also conducted an analysis of the incidence of new periods of sick leave for participants who were working at randomization (N=42). During the follow-up in weeks 1-16, two individuals from the intervention group (N=24) and four from the WLC group (N=18) entered a period of sick leave registered in the DREAM database. A further four individuals from the intervention group entered a period of sick leave in weeks 17-32. In total, six participants (25%) from the intervention group and four (22.2%) from the WLC group entered a period of sick leave in the 48 weeks of follow-up. There were too few cases to perform a statistical test.

Discussion
Findings in relation to hypothesis 1
In this randomized WLC trial, we found a reduction in self-reported absenteeism in the intervention group compared with the WLC condition in the first phase of the trial. The difference between the two groups on median number of days absent from work was 34 days, corresponding to a 5-55% reduction. Regarding participants' long-term absence from work in weeks 1-16, a three-week difference in the median number of weeks registered in the DREAM database was observed, corresponding to a 0-40% reduction, but falling short of statistical significance. On long-term absence from work across all phases of the trial, there was a tendency for the intervention group to have fewer weeks registered with sick leave. This was calculated considering the complete timeframe of the study, from 1-48 weeks, indicating a possible reduction in long-term absence from work from an early intervention.

Findings in relation to hypothesis 2a
The rate of return to work among participants who were sick listed was faster in the intervention group, although the difference was not statistically significant. In the first phase of the trial, both groups saw a decline in sick leave registrations, which accelerated for the intervention group compared to the WLC group in the following stages of the trial. This was contrary to our expectations of a more immediate effect of the intervention within the first month after baseline and raises the question of whether the 16-week follow-up period was long enough to capture the effects. Also, we saw a decline in sick leave registrations in the WLC group before receiving the intervention. This may be due to the inclusion criterion
Table 3. Register-based records of absenteeism from work, represented by cumulative number of weeks registered with part- or full-time sick leave in DREAM database. Results are reported for the different phases of the trial, the complete time interval, as well as the 48 weeks prior to randomization. P-values are from the Mann-Whitney U-test statistical analyses. [95% CI=95% confidence intervals].

Findings in relation to hypothesis 2b
Regarding the incidence of new spells of sick leave, for those working at the time of inclusion, one in four participants entered a new spell of sick leave during the follow-up period. There were too few cases to analyze differences between groups and we cannot formally test hypothesis 2b with the sample size in this study.

Comparison with previous studies
In their study, de Vente et al (11) found a trend towards more days absent, comparing two stress management interventions based on cognitive behavior therapy with care-as-usual. Care-as-usual was defined as consultation of an occupational physician (mean number of visits 2.56), general practitioner (mean number of visits 1.44) or a psychologist/social worker (mean number of visits 4.64, N=11). In our study, we compare a cognitive behavioral stress management intervention with a WLC condition. Participants in the latter condition were not hindered in seeking other help while on the wait-list, and reported a mean of 2.5 visits to their general practitioner, while 14 participants on the wait-list reported consultations with a psychologist outside of the study (mean of 5.4 visits). There appear to be some similarities between de Vente et al's care-as-usual condition (11) and the WLC condition employed in our study. However, contrary to the findings of de Vente et al, the cognitive behavioral intervention program we investigated was found to be effective in lowering self-reported absenteeism. The diverging findings may be explained by differences in the content of the stress management interventions, but also by the fact that they are embedded within two different sets of labor market regulations (namely, those of Denmark and the Netherlands). In the study by Blonk et al (6), a stress management intervention based on cognitive behavior therapy was not more effective than the no-intervention control group. However, a combined intervention (based on cognitive behavior therapy but with the added components of a graded activity scheme guiding the rate of return to work and workplace interventions) surpassed both the control and group format intervention. In our study, contrary to the Blonk et al study (6), we found that an intervention based on cognitive therapy is superior to a WLC group.
Blonk et al's added elements of graded activity schemes and workplace interventions were not part of the intervention manual used in our study.
The study by Klink et al (12) compares a graded activity scheme intervention, based on the cognitive behavioral approach "stress inoculation training", with care-as-usual visits to a resident occupational physician within a postal company. An effect on return to work and absenteeism was found. As in the previous study, the graded activity component is central to the intervention. This component was not explicitly part of the intervention manual in our study. Another difference between the two studies is the population sample, where the Klink et al study is situated within a specific company and reports 63% male participants. These differences reduce the comparability of the Klink et al study to our study.
Based on participants registered with sick leave in DREAM at randomization (N=60). Lasting return is defined as four consecutive weeks off on sick leave or unemployment. Note: intervention group receives the intervention between weeks 0-16. Wait-list control group receives the intervention between weeks 16-32.

Validity
There are several factors to consider when evaluating the internal and external validity of this study. In the first phase of the trial, comparing the intervention to the wait-list, observed differences may reflect an effect of the intervention. On the other hand, observed differences may also be associated with the WLC study design, which may compromise internal validity. One can speculate that a participant randomized to the waiting list may postpone a planned work resumption until the wait-list period is over. Another threat to the internal validity may also come from the WLC design, since those on the wait-list do not receive any placebo treatment. It is therefore not possible to discern whether the observed effects stem from the gesture of offering any form of help or from specific components of the intervention. From research on the efficacy of psychological treatments in general, it is known that the effects one can expect stem from both non-specific and specific factors (23).
In the study, we see a low drop-out rate in the WLC phase of the trial; the drop-out is distributed between the two groups, supporting the internal validity of the study.
Compared to the general working population, participants are weighted towards being middle-aged female workers working in the social, healthcare, education, and administration sectors. Less is known from this trial on the effects of the intervention on, for example, male or blue-collar workers, which may threaten the external validity of the study. Also, we have no measure of the extent of sickness presenteeism (ie, going to work despite not feeling fit for work), which may be more associated with some occupations than others (26).
Both the self-reported and register-based data have their strengths and limitations. The self-reported data reflect both short- and long-term spells absent from work. However, the retrospective sampling method used is vulnerable to recall bias and also to information bias in terms of a potential drive to "please the researchers" after receiving the intervention. Dropout is another source of bias, as cases are lost at follow-up measurements. On the other hand, data from the DREAM register reflect only long-term spells of absence (>2 weeks). DREAM is an administrative database and an objective source of information, influenced neither by recall bias nor by dropout.
When studying absenteeism and return to work, administrative regulations of the labor market may have powerful consequences in guiding worker behavior and actions. In Denmark, a worker can receive a maximum of 52 weeks on sick leave with full compensation. This may impose pressure on participants who are approaching the limit of 52 weeks of absence, limiting the comparability of our study with studies from other countries.
Both the self-reported and register data on absenteeism are highly skewed. The differences found between the groups may be driven by differences at the extreme ends of the distribution of the data, as proposed by Loisel et al (27). In a histogram, we see more participants with no days or weeks absent in the intervention group, and more participants with all days or weeks absent in the WLC group. In the distribution of the data between these two extremes, the differences between the groups are less pronounced.

Concluding remarks
We believe the observed reduction in absenteeism from work has potential clinical and practical implications, since costs associated with absenteeism from work are a major concern for employers and society. It is an unanswered question whether the intervention improves the health of participants, also because the concept of health has multiple definitions. The intervention aims to improve participants' motivation to face challenges experienced at work and supplies a set of tools, as well as group support, to take an active stance toward handling those challenges.
In conclusion, we have found support for our first hypothesis: the intervention reduces self-reported absenteeism from work when compared to a WLC condition. Using register-based information on long-term absence from work, a similar trend was found, but it did not reach statistical significance. With regard to the second hypothesis, no conclusive evidence was found on the rate of lasting return to work (or equivalent) for those on long-term absence from work at the onset of the trial or on the incidence of new spells of sick leave for those working at the time of the study.