Exercise to reduce work-related fatigue among employees: a randomized controlled trial

This study shows that exercise has the potential to serve as a relatively simple and inexpensive secondary prevention strategy to improve (long-term) well-being among employees experiencing high work-related fatigue. The extent to which the exercise intervention under study caused beneficial effects in work-related fatigue depended on participants’ compliance, underlining the challenge of implementing exercise interventions in practice among fatigued employees.
Objectives The present study evaluated the efficacy of an exercise intervention to reduce work-related fatigue (emotional exhaustion, overall fatigue, and need for recovery). The effects of exercise on self-efficacy, sleep, work ability, cognitive functioning and aerobic fitness (secondary outcomes) were also investigated.
Methods Employees with high levels of work-related fatigue were randomly assigned to either a 6-week exercise intervention (EI; N=49) or a wait-list control group (WLC; N=47). All participants were measured pre- (T0) and post-intervention (T1). EI participants were also measured 6 (T2) and 12 weeks (T3) after the end of the intervention. Analyses were based on intention-to-treat (ITT) and per-protocol (PP). PP analyses only included EI participants (N=31) who completed the intervention and WLC participants (N=35) who did not increase their exercise level during the wait period.
Results Analyses of covariance (ANCOVA) revealed that, at T1, the EI group reported lower emotional exhaustion and overall fatigue than the WLC group, but only according to PP analyses. According to both ITT and PP analyses, EI participants showed higher sleep quality, work ability, and self-reported cognitive functioning at T1 compared to WLC participants. Intervention effects were maintained at T2 and T3.
Conclusions The exercise intervention had enduring effects on work-related fatigue and broader indicators of employee well-being.
This study demonstrates that, in case of work-related fatigue, exercise does constitute a powerful medicine for those who comply with the treatment.

Many employees experience work-related fatigue (estimated at 22% [1,2]). Its more extreme manifestation, "burnout", is, at least partly, the result of prolonged work-related stress, resulting from excessive workload, time pressure, or organizational change (3,4). Negative consequences for employees include impaired cognitive functioning, reduced productivity at work, and health problems such as depression and cardiovascular diseases (4-6). These negative consequences have prompted calls for effective interventions to reduce work-related fatigue. The current study investigates exercise as a potential intervention to reduce work-related fatigue.
It has been proposed that a combination of psychological and physiological mechanisms underlies the beneficial effect of exercise on work-related fatigue. As regards the first, exercise may help employees to psychologically detach from work (7,8) and in this way prevent prolonged stress responses that may result in enduring fatigue (9). As regards the second, increased physical fitness may promote stress resilience through faster stress recovery (10), thus reducing the risk of persistent fatigue.
So far, only a few studies have examined the effect of exercise on work-related fatigue. Cross-sectional and longitudinal studies have reported an inverse relationship between the two (11-14), and the few available intervention studies show beneficial effects of exercise on work-related fatigue (15-19). However, these intervention studies suffered from one or more methodological shortcomings, such as the lack of an adequate control condition, no (described) randomization procedure, and a lack of non-response and intention-to-treat (ITT) analyses. Due to these shortcomings, the causality of the association between exercise and work-related fatigue remains largely unclear.
Therefore, the aim of the current study was to uncover the causal association between exercise and work-related fatigue by employing a randomized controlled trial. To this end, we selected employees with high levels of work-related fatigue and randomly assigned them to either a 6-week exercise intervention (EI) or a wait-list control group (WLC). In this way, we investigated whether exercise has beneficial effects on work-related fatigue compared to the natural course of these symptoms. We hypothesized that exercise reduces work-related fatigue. Additionally, we aimed to investigate the effect of exercise on five secondary outcomes of employee well-being: self-efficacy, sleep, work ability, cognitive functioning, and aerobic fitness. Prior work shows that fatigued employees often show deficiencies in these outcomes (20-24), while it has been suggested that exercise positively affects them (25-29; see 30 for a more extensive justification for the choice of secondary outcomes). We therefore expected that exercise improves general and work-related self-efficacy, sleep quality and quantity, work ability, cognitive functioning, and aerobic fitness.

Study design
It was investigated whether the EI group was superior to the WLC group with respect to the reduction of work-related fatigue. Participants were randomly allocated to one of the two conditions at a ratio of 1:1 and a block size of 20. A full description of the study protocol has been previously published (30). The Ethics Committee of the Faculty of Social Sciences of the Radboud University (registration number: ECSW2015-1901-278) approved the study protocol, which was preregistered at the Netherlands Trial Register (NTR5034).
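As an illustration of 1:1 allocation with a block size of 20, the following minimal Python sketch builds a permuted-block allocation sequence. It is not the procedure used in the trial (which relied on sealed opaque envelopes); the function name `block_randomize`, the seed, and the handling of an incomplete final block are assumptions made for the example.

```python
import random

def block_randomize(n_participants, block_size=20, arms=("EI", "WLC"), seed=None):
    """Return an allocation sequence made of shuffled, balanced blocks.

    Hypothetical helper: each block holds block_size / len(arms) slots per arm.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block size must be a multiple of the number of arms")
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    sequence = []
    while len(sequence) < n_participants:
        block = [arm for arm in arms for _ in range(per_arm)]
        rng.shuffle(block)          # random order within the block
        sequence.extend(block)
    # Assumption: the last block is simply cut off when enrolment stops mid-block.
    return sequence[:n_participants]

allocation = block_randomize(96, block_size=20, seed=1)
print(allocation.count("EI"), allocation.count("WLC"))
```

Within every complete block the two conditions are exactly balanced; only an incomplete final block can leave the group sizes slightly unequal.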

Participants and procedure
Participants were recruited via advertisements in newspapers, via social media, and on the intranet of large healthcare organizations. They were eligible to participate if they were currently employed and had high levels of work-related fatigue, as indicated by a high score on two validated questionnaires (ie, ≥2.2 on the emotional exhaustion scale of the Utrecht Burnout Scale [31], and ≥22 on the Fatigue Assessment Scale [32]). Exclusion criteria were (i) ≥1 hour of exercise/week; (ii) fatigue attributable to a medical condition; (iii) currently or in the past six months receiving psychological and/or pharmacological treatment; (iv) drug dependence; and (v) contra-indications to exercise. The latter were measured with the Physical Activity Readiness Questionnaire (33). The sample size calculation can be found in the study protocol (30). In total, 362 employees were screened for eligibility (see figure 1). Of these, 96 were eligible and willing to participate. After baseline assessment, randomization was carried out by the first author (JdV) or a research assistant, using sealed opaque envelopes.

Exercise intervention
The exercise intervention consisted of 1-hour low-intensity running sessions three times a week for a period of six consecutive weeks. Two running sessions were carried out in a small group of ten participants, led by a licensed running trainer, and one running session was carried out independently by the participant. More details of the intervention can be found elsewhere (30).
WLC participants were offered the intervention after six weeks of waiting. Thirty-nine of the 47 WLC participants actually followed the intervention.

Measures
All outcomes were measured among EI and WLC participants at pre- (T0) and post-intervention (T1). EI participants were also measured at follow-up: 6 (T2) and 12 (T3) weeks after the intervention period. We used self-reported data, and "objective" tests of cognitive performance and aerobic fitness. Cognitive functioning and aerobic fitness were not measured at follow-up. Full details of the materials can be found in the study protocol (30).
Emotional exhaustion was measured with the emotional exhaustion scale of the Utrecht Burnout Scale (UBOS; [31]). Example item: "I feel burned out from my work" (0=never, 6=every day). Cronbach's α was 0.80 at T0, 0.91 at T1, 0.94 at T2, and 0.90 at T3. A mean score ≥2.2 was considered as high work-related fatigue (31).
Overall fatigue represents general mental and physical fatigue (32). It was measured with the 10-item Fatigue Assessment Scale, a valid questionnaire to measure fatigue in the working population (FAS; [32]). Example item: "I get tired very quickly" (1=never, 5=always). Cronbach's α was 0.84 at T0, 0.88 at T1, 0.88 at T2, and 0.82 at T3. A sum score ≥22 signifies high overall fatigue (32).
Need for recovery is meant to represent short-term work-related fatigue (35). Conceptually, it bridges the stage between fatigue that occurs after one effortful workday and serious long-term work-related fatigue, such as burnout (35). It was assessed by the short version of the Need for Recovery Scale, including 6 items (35,36). Example item: "Because of my job, at the end of the working day I feel rather exhausted" (1=(almost) never, 4=(almost) always). A mean score was computed. Cronbach's α was 0.85 at T0, 0.89 at T1, 0.87 at T2, and 0.90 at T3.
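The scale scoring described above can be sketched in a few lines of Python. The sketch computes Cronbach's α from item-level responses and applies the two cut-off scores mentioned in the text (mean emotional exhaustion ≥2.2, FAS sum ≥22). The function names and all response data are hypothetical and serve only as an illustration; they are not taken from the study.

```python
from statistics import mean, variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])
    items = list(zip(*responses))                  # one tuple of scores per item
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var / total_var)

def meets_fatigue_criteria(exhaustion_items, fas_items):
    """Apply the cut-offs from the text: mean exhaustion >= 2.2 and FAS sum >= 22."""
    return mean(exhaustion_items) >= 2.2 and sum(fas_items) >= 22

# Hypothetical responses of four people to a three-item scale.
demo = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [3, 3, 4]]
alpha = cronbach_alpha(demo)
print(round(alpha, 3))
print(meets_fatigue_criteria([3, 2, 2, 3, 2], [3] * 10))
```

Sample variances are used for both the items and the total score, so the ratio in the formula is unaffected by the choice of denominator.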

Secondary outcomes
Sleep. Poor sleep quality was measured with the 6-item sleep quality scale of the Dutch Questionnaire on the Experience and Evaluation of Work (36). A higher sum score indicates poorer sleep quality. Example item: "I often wake up several times during the night" (0=no, 1=yes). Cronbach's α was 0.62 at T0, 0.65 at T1, 0.71 at T2, and 0.68 at T3. Sleep quantity was assessed by questioning employees' average hours and minutes of sleep.
Self-efficacy. General self-efficacy was assessed by the Dutch version of the 12-item General Self-Efficacy Scale (37). Example item: "If I made a decision to do something, I will do it" (1=strongly disagree, 5=strongly agree). Cronbach's α was 0.84 at T0, 0.87 at T1, 0.84 at T2, and 0.84 at T3. Work-related self-efficacy was measured with the "competence" subscale of the Utrecht Burnout Scale (31). Example item: "If I make plans, I am convinced I will succeed in executing them" (0=never, 6=every day). Cronbach's α was 0.81 at T0, 0.85 at T1, 0.90 at T2, and 0.89 at T3.
Work ability. Work ability was measured by means of a single item (38,39): "Can you indicate how you rate your current work ability when you compare it with your lifetime best?" (0=completely unable to work, 10=work ability at its best).
Cognitive functioning. Four indicators were used to measure participants' cognitive functioning. Self-reported cognitive functioning was assessed by the 25-item Dutch version of the Cognitive Failures Questionnaire (40). Example item: "Do you read something and find you have not been thinking about it and must read it again?" (1=never, 5=very often). Cronbach's α was 0.90 at T0, and 0.90 at T1. A sum score was computed, higher scores indicating lower cognitive functioning. Three types of executive functions (ie, updating, switching, and inhibition) were measured by means of cognitive performance tests. Updating (41) was measured with the 2-back task (42). During the task, 284 letters were presented one by one on the screen. When the displayed letter was the same as the letter that was shown two screens before, participants had to push a button (ie, correct response). Performance was measured by the number of correct responses. Switching was measured with the matching task (22,43). The task consisted of 31 task runs, each consisting of 4-8 trials. In the trials participants had to match several colored figures to each other according to shape or color (as indicated by a cue before each task run). Half of all task runs consisted of switch runs, in which the type of cue differed from the previous run. The other half consisted of repetition runs, in which the type of cue was identical to the previous run. Switch cost (ie, the difference in reaction time to switch and repetition runs) was used as an indicator of cognitive performance. Inhibition was measured with the Sustained-Attention-to-Response Test (SART; [44]). Digits were presented on a screen and participants had to push a button as fast as possible, except when the digit was 3. The number of correct inhibitions (ie, not pressing the button when 3 appeared) was taken as a measure for cognitive performance.
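The scoring rules for the updating and switching tasks can be made concrete with a short Python sketch. It identifies 2-back targets in a letter stream, counts button presses that fall on a target, and computes switch cost as the difference in mean reaction time between switch and repetition runs. The function names, the letter stream, and the reaction times are hypothetical; the actual task software may score responses differently.

```python
from statistics import mean

def two_back_targets(letters):
    """Positions where the letter equals the one shown two screens earlier."""
    return [i for i in range(2, len(letters)) if letters[i] == letters[i - 2]]

def count_correct_responses(letters, pressed_positions):
    """Number of button presses that coincide with a 2-back target."""
    targets = set(two_back_targets(letters))
    return sum(1 for i in set(pressed_positions) if i in targets)

def switch_cost(switch_rts, repetition_rts):
    """Mean reaction time on switch runs minus mean on repetition runs."""
    return mean(switch_rts) - mean(repetition_rts)

stream = list("ABABCACDD")                    # hypothetical letter stream
targets = two_back_targets(stream)
hits = count_correct_responses(stream, [2, 5, 6])
cost = switch_cost([700, 740], [600, 640])    # hypothetical reaction times (ms)
print(targets, hits, cost)
```

A larger switch cost indicates slower responding after a change of cue, that is, poorer switching performance.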
To obtain a more thorough insight into cognitive functioning, the subjective costs (fatigue, motivation, demands and effort) associated with doing the cognitive performance tests were evaluated. These subjective costs were measured using single-item measures, answered on a 10-point scale from 1 (not at all) to 10 (very much).
Aerobic fitness. VO2max was used as an indicator of aerobic fitness. It was obtained from the Urho Kaleva Kekkonen (UKK) walk test, a simple and valid method to measure aerobic fitness (45). Participants needed to walk 2 km as fast as possible. Based on heart rate, walking time, body weight, height, and gender, VO2max was estimated (45). Higher VO2max indicates a better aerobic fitness level. Subjective costs of doing the UKK walk test were also assessed. Items used for this purpose were similar to those questioned before and after the cognitive performance tests, except that another item was added about how short of breath participants were immediately after the test.
Exercise activities. During each week of the intervention period, EI participants were asked to indicate their compliance with the guided and individual exercise sessions. At T2 and T3, they were asked whether they had engaged in regular exercise during the last six weeks (type, frequency, duration). WLC participants were also asked to indicate whether they engaged in regular exercise during each week of the intervention period (type, duration, and frequency).

Statistical analysis
Results with respect to pre- and post-comparisons of primary and secondary outcomes were based on the ITT and the per-protocol (PP) principles (46). ITT is a strategy for the analysis of RCTs that compares participants in the conditions to which they were originally randomly assigned, irrespective of dropout, non-compliance or anything that happens after randomization (46). Thus, all participants who are randomized are analyzed. The ITT strategy has two main purposes: (i) it maintains intervention groups that are similar apart from random variation. If analyses are not performed on the groups produced by randomization, the principle of randomization is lost; (ii) it reflects an effect estimate of the intervention that would have been observed in practice, since dropout and non-compliance are also common in practice. ITT therefore reflects an estimate of the effectiveness of the intervention (ie, the working of the intervention in practice). The PP strategy excludes participants who deviated from the protocol (46). Thus, only a selected part of participants is analyzed, ie, only those who show high compliance. PP therefore reflects an estimate of the efficacy of the intervention (ie, the working of the intervention under "ideal circumstances"; [46]). Analyses were performed with SPSS version 23 (SPSS Institute, Cary, NC, USA). Reported P-values are two-sided with a significance-level of 0.05.
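The difference between the two analysis sets can be sketched as a simple selection rule. In the Python sketch below, the ITT set keeps every randomized participant, whereas the PP set keeps EI participants who completed the intervention and WLC participants who exercised less than one hour a week during the wait period, following the compliance definitions given in this paper. The `Participant` record, its field names, and the example data are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Participant:
    pid: int
    group: str                           # "EI" or "WLC", as randomized
    completed_intervention: bool = False # relevant for EI participants
    weekly_exercise_min: float = 0.0     # wait-period exercise, relevant for WLC

def itt_sample(participants):
    """Intention-to-treat: analyze everyone in the group they were randomized to."""
    return list(participants)

def pp_sample(participants):
    """Per-protocol: keep only participants who complied with their condition."""
    kept = []
    for p in participants:
        if p.group == "EI" and p.completed_intervention:
            kept.append(p)
        elif p.group == "WLC" and p.weekly_exercise_min < 60:
            kept.append(p)
    return kept

# Hypothetical participants, for illustration only.
sample = [
    Participant(1, "EI", completed_intervention=True),
    Participant(2, "EI", completed_intervention=False),
    Participant(3, "WLC", weekly_exercise_min=30),
    Participant(4, "WLC", weekly_exercise_min=95),
]
print(len(itt_sample(sample)), [p.pid for p in pp_sample(sample)])
```

Because the PP set is selected on post-randomization behavior, the groups it compares are no longer protected by randomization, which is why both sets are reported side by side.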
Missing data. Although we attempted to keep in contact with all randomized participants at post-intervention and follow-up (including those who withdrew from the study), not all participants completed all measures. Self-reported baseline data were available for all participants. At T0, 6 participants (6.3%; EI: N=3, WLC: N=3) did not complete the cognitive performance tests, and 8 participants (8.3%; EI: N=4, WLC: N=4) did not take part in the aerobic fitness test. At T1, for the self-reported outcomes, the attrition rate was 9.4% (EI: N=7, WLC: N=2). At this point in time, 19 participants (19.8%; EI: N=11, WLC: N=8) did not complete the cognitive performance tests and 28 (29.3%; EI: N=15, WLC: N=13) did not participate in the aerobic fitness test. At T2, 12 EI participants (24.5% of the 49 EI participants that were randomized) did not provide follow-up data. At T3, 13 EI participants (26.5%) did not provide follow-up data. Participants who did not provide post-intervention data at T1 were younger than those who provided these data (mean 31.10 versus 46.22, P<0.01), but did not differ on other demographics and baseline outcomes. Little's overall test of randomness indicated that pre- and post-data were missing completely at random. As a consequence, it was justified to use multiple imputation to estimate missing values (47). In multiple imputation, missing values are replaced by randomly chosen values that are drawn from an estimate of the distribution of the corresponding variable (48). We used 20 imputations with 100 iterations.
Participants who did not provide follow-up data at T2 and T3 did not significantly differ from those who provided data at these time points, neither on demographics nor on baseline outcomes. However, follow-up data (T2 and T3) were not imputed, since Little's overall test of randomness indicated that the data were not missing completely at random. We based our follow-up analyses on EI participants who completed all measures.
Intervention efficacy. Due to unforeseen (small) baseline imbalances (eg, in our primary outcome fatigue, see results section), we adapted the analytic strategy from the study protocol (30). Originally, we planned to test the effects of the exercise intervention by using 2×2 repeated measures (M)ANOVAs. However, as the literature suggests that univariate analysis of covariance (ANCOVA) better controls for baseline imbalance (49,50), and generally has greater statistical power to detect intervention effects than other methods such as RM-ANOVA, we preferred to use ANCOVA in the present study. This means that the EI and WLC groups were compared on all outcomes at T1, using T0 scores as covariates. Partial eta-squared (η²) was reported as effect size, and values between 0.01-0.06 were considered as small, 0.06-0.14 as medium, and ≥0.14 as large (50). For reported Cohen's d, effect sizes of 0.2-0.5 were considered as small, 0.5-0.8 as medium and ≥0.8 as large (51).
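For readers who wish to check reported effect sizes, partial η² can be recovered from an F statistic and its degrees of freedom through the standard identity η² = (F × df_effect) / (F × df_effect + df_error). The Python sketch below applies this identity to F values reported for the per-protocol analyses in this paper and labels effect sizes with the thresholds given above. The function names are hypothetical, and the label "negligible" for values below 0.01 is an assumption of this sketch, not a category used in the paper.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared recovered from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)

def label_partial_eta_squared(eta_sq):
    """Label an effect size using the thresholds stated in the text."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"   # assumption: the text defines no label below 0.01

# F statistics with 1 and 63 degrees of freedom, as reported for the PP analyses.
for f_value in (5.54, 4.58, 10.50):
    print(f_value, round(partial_eta_squared(f_value, 1, 63), 2))
```

Values that lie close to a threshold can change category through rounding, so the labels are best read alongside the unrounded estimate.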
Clinical meaningfulness. To assess the clinical meaningfulness of the EI changes in our primary outcomes (52), we performed Chi-square tests to see if the number of participants who scored below cut-off scores of work-related fatigue (UBOS <2.2; FAS <22) after the intervention period (T1) differed between the intervention and the control condition. Need for recovery was not included in this analysis due to the absence of clear cut-off scores for this scale.
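The comparison of "recovered" versus "not recovered" participants across the two conditions amounts to a test on a 2×2 table. The following Python sketch computes the Pearson Chi-square statistic without continuity correction and its P-value for one degree of freedom. It is a minimal illustration under stated assumptions: the counts are hypothetical, the function name is invented for this example, and the exact test variant used in the study (for instance, whether a correction or pooling over imputed data sets was applied) is not specified here.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows = EI / WLC, columns = recovered / not recovered.
chi2, p = chi_square_2x2(20, 10, 10, 20)
print(round(chi2, 2), round(p, 3))
```

With small expected cell counts, an exact test would usually be preferred over this large-sample approximation.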
Follow-up effects. To investigate whether intervention effects were maintained at follow-up, for each primary and secondary outcome, a repeated measures ANOVA was performed to see whether the outcome differed between pre (T0), post (T1), 6 (T2), and 12 (T3) weeks after the intervention period. If the overall time effect of the ANOVA was significant, difference contrasts were computed to determine exactly between which time points the outcome had changed. As follow-up measures were only available for the EI group, WLC participants were not included in these analyses.

Follow-up effects in relation to maintenance of exercise. To investigate whether follow-up effects differed as a function of whether EI group participants engaged in regular exercise during the follow-up period, we performed separate RM-ANOVAs for the first (1-6 weeks after the intervention; T1-T2) and second (6-12 weeks after the intervention; T2-T3) follow-up period. For each outcome, engaging in exercise (yes versus no) was added as between-subjects factor to the model, and time (T1 versus T2 or T2 versus T3) was added as within-subjects factor. Significant time×exercise interaction effects were further examined by paired-sample t-tests.

Sample characteristics
Details of participants' general, work and health characteristics are presented in table 1. We tested whether participants' work characteristics changed throughout the intervention period, since work-related fatigue is closely related to work (4). No significant change in work characteristics was observed. Detailed results can be obtained from the first author on request. Employees worked in a variety of occupations, and most were employed in the education or healthcare sector. The sample consisted primarily of females (80.2%) and most had at least a Bachelor's degree (62.5%). Multivariate analyses and Chi-square tests revealed that participants in the two conditions did not significantly differ on most pre-intervention characteristics, except that the EI group scored lower on job demands (d=0.49) and more often reported that they had irregular working hours (EI: N=17 and WLC: N=9). We checked if these differences affected our study's conclusions, but including job demands and irregular work hours as covariates in the analyses did not change our results. There were also significant baseline differences between conditions in our outcome measures (see table 2 for means and standard deviations). In general, the EI group was less fatigued than the WLC group, as they scored significantly lower on emotional exhaustion (between-group Cohen's d=0.37), overall fatigue (d=0.52), and need for recovery (d=0.53). EI group participants also reported significantly higher general self-efficacy (d=0.40) and aerobic fitness (d=0.54) than WLC participants. Given the blinded randomization to the different conditions, these differences are regarded as chance findings.
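Between-group Cohen's d values such as those reported above can be computed from raw scores as the difference in group means divided by a pooled standard deviation. The Python sketch below shows one common formulation; the paper does not state which variant of d was used, so the pooled-SD formula, the function name, and the scores are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Hypothetical baseline fatigue scores for two small groups.
ei_scores = [1, 2, 3]
wlc_scores = [2, 3, 4]
d = cohens_d(ei_scores, wlc_scores)
print(d)
```

The sign of d only reflects which group is entered first; the magnitude is what the small, medium, and large labels refer to.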

Compliance
EI group participants completed on average 11.88 (SD 5.86) of the in total 18 exercise sessions. This number includes the four participants who did not receive the exercise intervention and the 14 participants who discontinued the intervention. Participants who discontinued the intervention completed on average 5.00 (SD 4.12) exercise sessions. Reasons for dropout can be found in figure 1. The 31 participants who completed the intervention completed on average 15.12 (SD 3.46) exercise sessions. Participants who did not receive or discontinued the intervention were similar to completers as to demographics and primary and secondary outcomes at baseline.
In the WLC group, between T0 and T1, ten participants appeared to exercise ≥1 hour a week [mean=95.92 (SD 38.53) minutes]. Exercising less than one hour a week was an eligibility criterion for this study. Thus, ten participants increased their exercise level during the wait period. They did not differ from those who did not increase their exercise level as regards demographics and primary and secondary baseline outcomes.

Intention to treat: pre- and post-comparisons
Work-related fatigue. Mean and SD of the three indicators of work-related fatigue pre- and post-intervention are presented in table 2. The three indicators were significantly inter-related (at T0: emotional exhaustion and overall fatigue r=0.56; emotional exhaustion and need for recovery r=0.59; overall fatigue and need for recovery r=0.50). The EI group did not report significantly lower emotional exhaustion, overall fatigue and need for recovery at post-intervention than the WLC group.
Clinical meaningfulness. Table 3 presents the number of participants who improved, recovered, remained unimproved, or deteriorated (52) after the intervention period (T1) with respect to emotional exhaustion and overall fatigue. Chi-square tests revealed that the number of participants who scored below cut-off scores of fatigue after the intervention period (ie, "recovered") was higher among EI participants for both emotional exhaustion (χ²=5.19, P=0.03) and overall fatigue (χ²=4.78, P=0.04) compared to WLC participants.
General and work-related self-efficacy. As can be seen in table 2, the EI group did not display higher general and work-related self-efficacy levels at post-intervention when compared to the WLC group.
Sleep quality and quantity. The EI group reported significantly better sleep quality at post-intervention than the WLC group (moderate effect, see table 2). No differences at post-intervention between groups were found as regards sleep quantity.
Work ability. At post-intervention, the EI group reported significantly higher levels of work ability when compared to the WLC group (small effect, see table 2).
Cognitive functioning. Table 4 presents the means and standard deviations of self-reported and objectively measured cognitive functioning. The EI group reported better self-reported cognitive functioning at T1 compared to the WLC group (moderate effect). No differences at T1 between groups were found for the cognitive performance tasks (measuring updating, switching, and inhibition) nor for the subjective costs associated with doing the cognitive performance tasks.
Aerobic fitness. Table 5 displays the means and standard deviations of VO2max obtained from the UKK walk test and the subjective costs associated with performing this test. At T1, no differences between the EI and WLC group were found as regards aerobic fitness and related subjective costs.

Follow-up effects
Most of the participants who filled out follow-up questionnaires had completed the intervention (N=30 at T2 and at T3). Seven (T2) and six (T3) participants, respectively, had discontinued the intervention. Results of the repeated measures ANOVAs that were conducted to examine follow-up effects are displayed in table 2. This table also presents means and standard deviations of outcome measures 6 weeks (T2) and 12 weeks (T3) after the end of the intervention. EI participants showed a decrease in emotional exhaustion, overall fatigue, need for recovery and poor sleep quality from baseline (T0) to post-intervention (T1) and to follow-up (both at T2 and T3). From post-intervention (T1) to 6 weeks after the intervention (T2), we found small further improvements in emotional exhaustion and overall fatigue. For none of the primary or secondary outcomes did we find improvements from 6 weeks after the intervention (T2) to 12 weeks after the intervention (T3).
Table 2. Means and standard deviations for participants in the EI (N=49) and WLC (N=47) groups of emotional exhaustion, overall fatigue, need for recovery, sleep quality, sleep quantity, general self-efficacy, work self-efficacy and work ability pre- (T0) and post-intervention (T1), and at follow-up 6 weeks (T2) and 12 weeks (T3) after the intervention.

Follow-up effects in relation to maintenance of exercise
At T2, 23 participants indicated that they had engaged in regular exercise since they finished the intervention [mean 133.48 (SD 104.69) minutes a week]. No differences in the development of outcomes from T1 to T2 were found between those who continued exercising and those who did not continue exercising in this period (ie, no significant time×exercise interactions; F's ranging from 0.07-4.06, all P>0.05).
At T3, 21 participants reported that they had engaged in regular exercise since 6 weeks after the end of the intervention [mean 178.00 (SD 165.13) minutes a week]. A significant time×exercise interaction for need for recovery was found (F(1,33)=10.27, P<0.01, η²=0.24). T-tests revealed that participants who continued exercising in this period showed a (marginally significant) decrease in need for recovery from T2 to T3 (d=-0.39, t=2.02, P=0.06), while those who stopped exercising showed an increase in need for recovery (d=0.54, t=-2.75, P=0.02). Other outcomes were not related to whether participants engaged in exercise in the period between T2 and T3 (ie, no significant time×exercise interactions; F's ranging from 0.08-3.11, all P>0.05).

Per protocol analysis
In PP analyses, we only analyzed participants who complied with the protocol. We considered EI participants to have complied with the protocol if they completed the intervention and WLC participants if they exercised <1 hour during the wait period. Results of PP analyses are only presented when different from ITT analyses (all results can be obtained from the first author).
The EI (N=31) and WLC (N=35) groups were compared on T1 outcomes by means of ANCOVA with respective T0 scores as covariates. Similar significant results, but with (slightly) higher effect sizes, were observed for PP analyses when compared to ITT analyses as regards sleep quality (F(1,63)=5.54, P=0.02, η²=0.08), work ability (F(1,63)=4.58, P=0.04, η²=0.07) and self-reported cognitive functioning (F(1,63)=10.50, P<0.01, η²=0.14). Furthermore, in contrast to ITT analyses, PP analyses showed that part of the subjective costs of both the cognitive performance tests and the aerobic fitness test were lower for EI participants as compared to WLC

Discussion
In order to better understand the causal association between exercise and work-related fatigue, the present study evaluated the efficacy of an exercise intervention on (i) work-related fatigue (primary outcome) and (ii) self-efficacy, sleep, work ability, cognitive functioning and aerobic fitness (secondary outcomes).
As to our primary outcome, ITT analyses revealed no effects of EI on the three indicators of work-related fatigue as compared to WLC. We did find that the number of EI participants who "recovered" with respect to emotional exhaustion and overall fatigue was higher when compared to WLC participants, but this result should be interpreted with caution since EI participants were less fatigued at baseline and, as a consequence, less improvement was needed to recover. A closer examination of the EI and WLC groups reveals that some participants in the EI group did not start exercising or gave up exercising before T1, whereas some of the WLC group members did not wait but increased their exercise level between T0 and T1. The latter behavior may reflect their motivation to exercise, ie, reflect the reason why they were willing to take part in this study. PP analyses between "pure" groups of "completers" and "true controls" showed that EI participants displayed lower emotional exhaustion and overall fatigue compared to WLC at post-intervention. Together, this means that (i) exercise is effective in reducing emotional exhaustion and overall fatigue, and (ii) sufficient exposure to exercise (compliance) is needed in order to observe beneficial effects. These results are in accordance with previous intervention studies that concentrated on participants who completed the exercise intervention (15-19). Further small improvements in emotional exhaustion and overall fatigue were found 6 weeks after the end of the intervention, and these were maintained at 12 weeks. Similar to ITT analysis, PP analysis revealed no significant post-intervention between-group difference in need for recovery, but follow-up results indicated that EI participants showed a moderate improvement in need for recovery from baseline to 6 weeks and 12 weeks after the intervention, and for those who engaged in regular exercise a further improvement between 6 and 12 weeks after the intervention.
Significant effects were also found for a number of secondary outcomes. Similar to previous intervention studies (28), a moderate effect of EI on sleep quality was found in both ITT and PP analyses. Effects on sleep quality were maintained at follow-up. The small improvement in work ability in the EI group compared to the WLC group, in both ITT and PP analyses, is in accordance with previous suggestions that exercise increases employees' physical and psychological capabilities to cope with work demands effectively (25,53). EI participants also reported better cognitive functioning compared to WLC participants at T1, indicating fewer problems with attentiveness and memory in everyday life (54). PP analyses revealed larger effect sizes in sleep quality, work ability and self-reported cognitive functioning than ITT analyses. This implies that the received dose of the intervention also impacted the intervention's efficacy with respect to these outcomes.
Contrary to expectations, no improvement was found in some other secondary outcomes. We found no effects of EI on general and work-related self-efficacy. It is possible that exercise is particularly related to mastery feelings relating to the exercise domain (ie, exercise self-efficacy; [55]) and that these feelings are not (yet) transferred to other life domains. Also, no difference between groups with regard to sleep quantity was found. It is possible that a stronger dose of exercise would have resulted in improvements of sleep quantity (28). We also did not find a between-group difference at post-intervention in objective cognitive performance and aerobic fitness (VO2max). However, PP analysis revealed that EI participants considered the aerobic fitness test as less demanding than WLC participants at post-intervention, which might suggest positive changes in capacity. It is plausible that low-intensity exercise needs to be maintained longer than six weeks to observe improvements in VO2max (56).

Limitations
Six critical issues of this study deserve further attention. First, despite a state-of-the-art concealed allocation to groups, baseline imbalances with respect to work characteristics and outcome measures occurred. Baseline imbalances may result in chance bias. This means that differences in group outcomes could accidentally be due to participants' characteristics, not the intervention (57). Baseline differences in indicators of fatigue were, however, not large but small-to-moderate (Cohen's d between 0.37 and 0.54), and differences in job demands and irregular working hours were not substantially related to our primary outcomes (r ranging from -0.02 to 0.25; [57]). Furthermore, by analyzing our data by means of ANCOVA, we used the preferred method to control for baseline imbalances (49,50). It is therefore not plausible that this issue seriously impacted our findings.
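To illustrate why ANCOVA is suited to handling baseline imbalance, the following minimal sketch contrasts a raw post-intervention group difference with a baseline-adjusted one. It is not the analysis code of this study. The scores are synthetic and noise-free, the function name `ancova_adjusted_difference` is hypothetical, and the assumed "true" group effect of -0.5 is chosen purely for illustration. The adjustment uses the classic ANCOVA formula: the raw difference minus the pooled within-group slope times the baseline difference.

```python
from statistics import mean

def ancova_adjusted_difference(baseline, post, group):
    """Return (raw, adjusted) post-score difference, group 1 minus group 0.

    The adjusted difference subtracts the part of the raw difference that is
    explained by the groups' unequal baseline means (pooled within-group slope).
    """
    sxx = sxy = 0.0
    means = {}
    for g in (0, 1):
        xs = [x for x, k in zip(baseline, group) if k == g]
        ys = [y for y, k in zip(post, group) if k == g]
        mx, my = mean(xs), mean(ys)
        means[g] = (mx, my)
        sxx += sum((x - mx) ** 2 for x in xs)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx  # pooled within-group regression of post on baseline
    raw = means[1][1] - means[0][1]
    adjusted = raw - slope * (means[1][0] - means[0][0])
    return raw, adjusted

# Synthetic, noise-free illustration (NOT study data): the exercise group (1)
# starts out less fatigued than the control group (0).
group = [1, 1, 1, 1, 0, 0, 0, 0]
baseline = [3.0, 3.5, 4.0, 4.5, 3.5, 4.0, 4.5, 5.0]
# Assumed data-generating rule: post = 1.0 + 0.8 * baseline - 0.5 * group
post = [1.0 + 0.8 * b - 0.5 * g for b, g in zip(baseline, group)]

raw_diff, adjusted_diff = ancova_adjusted_difference(baseline, post, group)
print(f"raw difference:      {raw_diff:.2f}")
print(f"adjusted difference: {adjusted_diff:.2f}")
```

In this constructed example the raw difference overstates the benefit because the exercise group was already less fatigued at baseline, whereas the adjusted difference recovers the assumed effect. Real data contain noise, so the adjustment there reduces chance bias but does not remove it exactly.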
The second issue is the non-blinded waitlist design. This was considered to be the best option, given our aim to compare exercise to the natural course of work-related fatigue in the absence of a gold standard intervention to reduce it. WLC participants may not truly be untreated, since they are contacted, consented, randomized and measured (58). Such research participation effects may contribute to a change in behavior and outcomes between T0 and T1, as evidenced by the increased exercise activity among WLC participants between T0 and T1. The waitlist design also implied that WLC participants were offered the intervention following T1, and hence no comparison between groups could be made at follow-up. This precludes strong conclusions about follow-up effects. Furthermore, as a result of this design, we cannot assess the extent to which non-specific intervention factors (ie, factors other than exercise itself) contributed to the reported beneficial effects. Peer support in the group exercise sessions might have acted as a non-specific intervention factor. Future studies may compare individual and group exercise interventions to shed more light on this matter. They may also investigate the amount of social (peer) interaction between participants before, during and directly after each exercise session. Attention from the trainers/researchers might be regarded as another potential non-specific factor. WLC participants received less attention than EI participants. From our data it was not possible to assess the potential contribution of this factor to the reported beneficial effects. Future research may design the control group in such a way as to control for attention (59,60).
Third, the feasibility of our intervention requires attention. Of 180 eligible employees, 34 (18.9%) were not able or willing to participate in the exercise intervention (see figure 1). Another 50 employees (27.8%) were willing to engage in the intervention, but declined to participate because of lack of time to attend the two group-based supervised sessions, for instance due to family obligations. This is unfortunate, as it indicates that fatigued employees may well be motivated to participate in an exercise intervention, but that practical considerations prevent them from doing so. Additionally, 14 (28.6%) of EI participants dropped out during the intervention, often because of injuries (N=7, see figure 1). We tried to minimize injury risk by applying a graded running protocol, in which low-intensity running was built up slowly and walking periods gradually decreased. Because compliance was high among intervention completers (ie, 83.9%), which suggests that the intervention was appreciated, future studies might try to further tailor the intervention to participants' practical possibilities. Future studies might also consider other exercise types. For example, in case of a running injury, alternative, less demanding exercise activities (eg, cycling) could be prescribed.
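The participation percentages reported above can be reproduced with simple arithmetic. The sketch below assumes that the refusal percentages use the 180 eligible employees as denominator and that the dropout percentage uses the 49 employees randomized to EI. The variable names and the helper `pct` are illustrative only.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)

eligible = 180          # eligible employees
not_able_or_willing = 34
no_time = 50            # willing, but no time for the supervised sessions
ei_randomized = 49      # assumed denominator for dropout (EI group size)
ei_dropouts = 14

share_unwilling = pct(not_able_or_willing, eligible)
share_no_time = pct(no_time, eligible)
share_dropout = pct(ei_dropouts, ei_randomized)

print(share_unwilling, share_no_time, share_dropout)
```

Under these assumptions the computed values match the 18.9%, 27.8% and 28.6% given in the text.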
Fourth, given that we only included employees who at the study start exercised <1 hour a week, our sample consisted of (relatively) inactive employees. This implies that our study findings cannot be simply generalized to other fatigued employees who already exercise on a more regular basis. Further work is required to establish how exercise can benefit this population, for instance by further investigating the optimal exercise dose to reduce work-related fatigue.
Fifth, this study might have benefitted from other approaches to measure exercise and sleep, such as diaries or ambulatory devices (see also 61,62). Future studies may include such measures to assess exercise and sleep more thoroughly.
Sixth, the interpretation of our study's results might have been improved by conducting a process evaluation of the intervention (63). We mainly concentrated on effect evaluation of the intervention, ie, a comparison of pre- and post-intervention outcomes. However, process factors may also explain intervention results (63,64), as is indeed acknowledged in the current intervention's study protocol (30). For instance, processes such as the quality of the intervention provider and participants' mental models may have affected the results of the intervention. Future exercise trials may include relevant process factors in their evaluation of the intervention in order to better understand its outcomes. Practical frameworks may help to guide these future process evaluations (eg, 65,66).

Theoretical and practical contributions
This study contributes to the scientific evidence on the effect of exercise on work-related fatigue in a theoretical and practical sense. As regards the former, based on psychological [eg, detachment (7-9)] and physiological [eg, faster stress recovery (10)] mechanisms, we expected that exercise would reduce work-related fatigue. By adopting a randomized controlled trial design, causal inferences could be drawn. We found that exercise indeed works to reduce work-related fatigue and enhance broader indicators of employee well-being (sleep quality, work ability and cognitive functioning). Given that the mechanisms underlying the relationship between exercise and work-related fatigue have hardly been studied empirically, future research into these mechanisms would help to enhance further theory development. Because sufficient compliance played a role in whether beneficial effects of exercise were found, this study provides an example for future effectiveness trials in which (the implementation of) the exercise intervention can be further investigated (67).
As regards practical contributions, this study showed that exercise can serve as a relatively simple and inexpensive secondary prevention strategy to improve well-being among employees with high levels of work-related fatigue, especially if compliance is high. As such, it not only provides a practical tool for employees wanting to reduce their levels of fatigue, but may also help employers, health practitioners and policy-makers when aiming to implement evidence-based guidelines to reduce fatigue among employees.