Effects of graded return-to-work: a propensity-score-matched analysis

Graded return-to-work after a clinical rehabilitation program reduced the risk of permanent work disability and the duration of welfare benefits while increasing work participation. Our finding from a propensity-score-matched analysis of a large German cohort supports recent evidence indicating that workplace involvement is a promising strategy to reduce societal costs due to work disability benefits.

Objectives
Graded work exposure is deemed to have a therapeutic effect. In Germany, graded return-to-work (GRTW) is therefore frequently used following a rehabilitation program if workers are still unable to perform full job duties. The aim of the analyses was to determine long-term effects on disability pension and regular employment.

Methods
Analyses were performed with longitudinal administrative data. Patients aged 18–60 years who attended an orthopedic, cardiac, oncological, or psychosomatic rehabilitation between January and June 2007 were eligible to participate in a GRTW scheme. The effects of GRTW were analyzed by a propensity-score-matched comparison of patients with and without GRTW. Outcomes were disability pension rates, regular income, and the duration of receiving welfare benefits due to sickness absence and unemployment up to the end of 2009.

Results
The propensity-score-matched sample comprised 1875 patients on GRTW and 1875 matched controls not undergoing GRTW. The probability of a disability pension was decreased by about 40% among GRTW patients [5.4% versus 8.6%; hazard rate ratio (HR) 0.62, 95% confidence interval (95% CI) 0.49–0.80]. The three-year income (2007–2009) was €12 920 higher (95% CI €10 054–15 786) in the GRTW group. The duration of receiving welfare benefits due to sickness absence and unemployment was significantly reduced.

Conclusions
Graded work exposure supports labor participation and reduces the risk of permanent work disability.

Work disability has rising direct and indirect costs to societies (1). In Germany, current levels of sickness absence are estimated to equate to 568 million days lost per annum. This corresponds to an annual loss in production of around €59 billion. The loss in productivity due to sickness absence is estimated to be €103 billion annually (2). To meet the challenge of work disability, prevention and management of work disability have become a priority in many national health and welfare strategies (3-8). These strategies are diverse. In many countries, rehabilitation services are provided to support work-disabled patients to return to work and to achieve sustainable work participation for patients who experience limitations in work functioning. Though clinical interventions like functional restoration or work hardening (9-15) might be a major component of a rehabilitation strategy, several authors have also stressed the importance of involving workplace stakeholders in managing the return-to-work (RTW) process (16,17). Involving workplace stakeholders gives rehabilitation and occupational health professionals the opportunity to gauge whether rehabilitation can be facilitated by getting the worker back to work even though she or he is still unable to perform full job duties. Once she or he has returned to work, working hours and tasks can be increased gradually until the worker is again able to cope with regular and full demands (18). This strategy is called therapeutic work resumption, graded work exposure or graded RTW (GRTW).
In Germany, GRTW is possible if patients have finished their rehabilitation program but are still unable to perform full duties. GRTW is defined as a therapeutic measure that aims to test and practice work capacity at the workplace. It is usually initiated by the rehabilitation physician and the social worker in the rehabilitation center and needs consent from the patient, the employer, the general practitioner and the occupational physician. The patient begins to work for at least two hours/day. The rehabilitation physician develops a scheme which gradually increases the working time. The scheme ends with full RTW. In case of GRTW, employees continue to receive sickness benefits from the Pension Insurance Agency. There are no direct costs for wages for the employer. GRTW must be started within four weeks after completion of the rehabilitation program.
Graded work exposure is deemed to have a therapeutic effect among work-disabled persons as prolonged absence from work worsens physical deconditioning, negatively affects mental health, and increases the risk of receiving a disability pension (3,17,19-21). Returning to work with reduced demands allows the worker to re-experience self-efficacy and co-worker support and to challenge avoidance beliefs. Anema and colleagues (22) reported that therapeutic work resumption was a significant predictor for sustainable RTW in a six-nation cohort study of back-pain patients on long-term sick leave. Moreover, a systematic review by Krause and colleagues at the end of the 1990s summarized that graded work exposure increased work participation and reduced the number of lost working days (18). This conclusion was, however, mostly based on observational studies. Moreover, in the included studies, graded work exposure was usually only one element of a broader intervention. Specific effects of graded work exposure were difficult to separate.
In Germany, retrospective analyses demonstrated positive effects of GRTW on labor participation up to one year. One year after the rehabilitation program, 91% of GRTW participants were working, but only 78% of those without therapeutic work resumption (23). However, covariates for matching were assessed retrospectively and were prone to recall bias. Moreover, follow-up was restricted to one year. Analyses of long-term follow-up effects on disability pension and regular employment are still lacking. Therefore, the aim of this study was to explore the long-term effects of GRTW following a rehabilitation program for patients who were still work-disabled at the end of their rehabilitation program.

Study design
The German Pension Insurance Agency provided the dataset, which comprised income trajectories, disability pensions and welfare benefits due to sickness absence and unemployment for a random sample of all rehabilitation patients who finished a rehabilitation program between 2002 and 2009. We included persons aged 18-60 years who had finished an orthopedic, cardiac, oncological, or psychosomatic rehabilitation program in the first half of 2007. Persons had to be eligible for GRTW, ie, they had a regular job contract and their rehabilitation physician had given a positive RTW prognosis although they were still unable to perform full job duties at the end of the rehabilitation program. Persons were excluded if they started to receive a disability pension before the end of 2007 or died during the follow-up period.

Primary and secondary outcome
Primary endpoint was the receipt of a disability pension. Disability pensions can be approved as full or partial pensions. About 90% are full pensions. Disability pensions are usually permitted temporarily (≤3 years). Continuation of the pension requires further verification. Once approved, a later refusal is rather rare. Temporary pensions become permanent after nine years. Survival time was computed from 1 January 2008. Cases were censored until the date of starting to receive pension benefits. Non-cases were censored until 31 December 2009. Secondary outcomes were the income from regular employment from 2007 to 2009 as well as duration of receiving welfare benefits due to sickness absence and unemployment during the follow-up period (unemployment, long-term unemployment or sickness benefits).

Explanatory variables
We considered the following variables as potential confounders: age; sex; place of residence; income from regular employment in 2005 and 2006, respectively; the duration of receiving welfare benefits in 2005 and 2006 (unemployment, long-term unemployment or sickness benefits); rehabilitation following a prompt by the health insurance agency; cumulative duration of sickness absence prior to the rehabilitation program (<3 versus ≥3 months); type of rehabilitation program (postacute rehabilitation versus rehabilitation due to chronic conditions); rehabilitation diagnosis (musculoskeletal disorders, cardiovascular disorders, psychosomatic disorders or cancer); and duration of the completed rehabilitation program.

Statistical analysis
Propensity score matching was used to define a comparison group that was as comparable as possible to the group of persons with GRTW and was drawn from the large group of subjects without GRTW (24-30). The propensity score is the conditional probability of receiving the treatment (ie, GRTW) given the vector of observed background variables. Matching by propensity scores enables balanced characteristics of the treated and untreated sample if there is sufficient overlap of the propensity scores of both groups. Compared with a conventional direct matching procedure, the problem of multidimensionality in finding a corresponding control (for instance related to age, sex, sick leave duration, former income and others) is thereby reduced to one dimension only.
The propensity score was estimated by a logistic regression model including the 16 potential confounders as described above. For every person who gradually returned to work, the person without GRTW with the most similar propensity score was selected from the larger pool of potential controls. Resampling was performed without replacement. Sensitivity analyses were performed using a caliper of one quarter and one tenth of the standard deviation of the propensity score during resampling to increase the similarity of cases and controls by excluding the cases for whom it was especially difficult to find an adequate control. An additional sensitivity analysis tested whether the effects on disability pension varied over the range of the propensity score. For this purpose, the propensity score was categorized based on quartiles, and the effects of GRTW were compared over the quartile-based groups.
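The matching step described above can be illustrated with a minimal Python sketch. It is not the psmatch2 implementation used in the study: the function name `match_nearest`, the greedy processing order (highest score first), and the use of the pooled standard deviation of all scores for the caliper are assumptions made for illustration, and the propensity scores in the example are invented.

```python
from statistics import pstdev


def match_nearest(treated, controls, caliper_sd=None):
    """Greedy 1:1 nearest-neighbour matching on the propensity score.

    treated, controls: dicts mapping a person id to an estimated score.
    caliper_sd: optional maximum score distance, expressed as a fraction
        of the standard deviation of all scores (assumption: pooled SD).
    Returns a list of (treated_id, control_id) pairs. Each control is
    used at most once (matching without replacement).
    """
    max_dist = None
    if caliper_sd is not None:
        all_scores = list(treated.values()) + list(controls.values())
        max_dist = caliper_sd * pstdev(all_scores)

    available = dict(controls)  # controls that have not been used yet
    pairs = []
    # Assumption: treated persons with high scores are matched first,
    # because suitable controls are scarcest for them.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1], reverse=True):
        if not available:
            break
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if max_dist is not None and abs(available[c_id] - t_score) > max_dist:
            continue  # no control within the caliper: the case is dropped
        pairs.append((t_id, c_id))
        del available[c_id]
    return pairs


# Invented scores, for illustration only.
treated = {"t1": 0.30, "t2": 0.70}
controls = {"c1": 0.10, "c2": 0.28, "c3": 0.33, "c4": 0.05}

print(match_nearest(treated, controls))
print(match_nearest(treated, controls, caliper_sd=0.25))
```

Without a caliper every case receives a control, even a distant one. With a caliper, cases without a sufficiently close control are excluded, which is why the caliper-based sensitivity samples are smaller than the main matched sample.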
Balance of cases and controls before and after matching was checked by bivariate statistics (t test, chi-square test). As an indicator of the bias before and after matching due to differences related to the observed sample characteristics, the standardized percentage bias was calculated. This is the difference of the sample means in cases and controls relative to the square root of the average of the sample variances in both groups (31).
Analyses of treatment effects in propensity-score-matched samples can use the same statistical methods that are also used in experimental studies (24,25). Differences in the survival distribution in working life for both groups were analyzed with proportional hazards models. The hazard rate ratio (HR) was determined to estimate the relative risk reduction. Moreover, the absolute risk reduction and the number needed to treat (NNT) to avoid one additional disability pensioner were calculated. In addition, interaction terms were included in the proportional hazards model to examine if baseline characteristics, eg, rehabilitation following a prompt by the health insurance agency, moderated the treatment effect. Interactions were first tested for age, sex and diagnostic groups. Age was categorized for this purpose (18-50 versus 51-60 years). Additionally, interactions were tested for indicators of severity of work disability (prompt for rehabilitation by health insurance agency and sickness absence duration) as stronger effects were assumed for more severely restricted persons. The comparative analyses of the effects on average income from regular employment and average time of receiving welfare benefits due to sickness absence and unemployment were done with t tests. Additionally, Mann-Whitney U tests were used as a nonparametric alternative.
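The standardized percentage bias defined above can be written as a short Python function. This is a sketch under stated assumptions: the function name `standardized_bias` is hypothetical, the sample variance with an n-1 denominator is assumed, and the ages in the example are invented rather than taken from the study data.

```python
from math import sqrt
from statistics import mean, variance


def standardized_bias(cases, controls):
    """Standardized percentage bias for one covariate.

    100 * (mean of cases - mean of controls) divided by the square root
    of the average of the two sample variances.
    """
    pooled_sd = sqrt((variance(cases) + variance(controls)) / 2)
    return 100 * (mean(cases) - mean(controls)) / pooled_sd


# Invented ages, for illustration only.
age_cases = [44, 46, 48, 50, 52]
age_controls = [48, 50, 52, 54, 56]

print(round(standardized_bias(age_cases, age_controls), 1))
```

Computing this quantity for each covariate before and after matching and averaging the absolute values gives a single summary of covariate imbalance.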
Statistical differences were regarded as significant if the two-sided P-value of a test was <0.05. All analyses were performed with STATA statistical software, version 12 (StataCorp LP, College Station, TX, USA). Propensity score matching was realized by using the procedure psmatch2.

Sample
The primary sample included 11 581 persons, of whom 1875 (16.2%) gradually returned to work at the end of their rehabilitation program. Characteristics of persons with and without GRTW are shown in table 1. There were considerable covariate imbalances. Persons with GRTW were, for instance, younger and more frequently female compared with work-disabled persons who did not gradually return to work. Moreover, persons with GRTW were more severely restricted in working life as they were more frequently prompted by their health insurance agency to request a rehabilitation program, eg, due to long-term sick leave or severe chronicity, and they were more likely to have ≥3 months of cumulative sick leave within the 12 months before starting the rehabilitation program.
The median of the propensity scores of both groups clearly differed (GRTW versus non-GRTW: 0.28 versus 0.08). However, there was substantial overlap between the distributions of propensity scores in the two groups so that for every case one similar control was identified. Baseline differences between cases and controls were reduced to a minimum (mean bias before and after matching: 25.1% versus 1.4%). The one-to-one matched analytic sample included 3750 persons (with GRTW: N=1875; matched controls: N=1875). Characteristics of the matched controls are presented in the third column of table 1. Persons with GRTW and matched controls were balanced regarding all baseline scores, ie, there were no significant differences in any of the observed baseline variables.

Disability pension
The risk of a disability pension was decreased from 8.6% among patients without GRTW to 5.4% among patients with GRTW. This corresponds to a relative risk reduction of about 40% [HR 0.62, 95% confidence interval (95% CI) 0.49-0.80; figure 1]. The absolute risk reduction was 3.2%. The NNT was 31 persons, ie, 31 persons had to start a GRTW to avoid one additional disability pensioner.
There were no significant interactions with age, sex and diagnostic group. However, findings indicated that GRTW did not reduce the disability pension risk among patients with cardiovascular diseases. The effect of GRTW on a decreased risk of a disability pension was approximately twice as strong among patients who started their rehabilitation program following a prompt by their health insurance agency compared to patients who were not (prompted by health insurance agency: HR 0.34, 95% CI 0.18-0.62; not prompted by health insurance agency: HR 0.72, 95% CI 0.55-0.95; interaction: P=0.027). The NNT were 13 and 47, respectively. A similar finding was seen when comparing persons with sickness absence duration prior to the rehabilitation program <3 months (HR 0.81, 95% CI 0.51-1.30) versus ≥3 months (HR 0.57, 95% CI 0.42-0.76). However, the interaction term was not significant in this case (P=0.207).
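The absolute risk reduction and the NNT reported for the full matched sample follow from the two disability pension rates by simple arithmetic. The Python sketch below uses hypothetical function names and the rounded percentages given in the text. Because the unrounded risks are not reported, the computed value of about 31.25 only approximates the published NNT of 31.

```python
def absolute_risk_reduction(risk_control, risk_treated):
    """Difference in event risk between controls and treated persons."""
    return risk_control - risk_treated


def number_needed_to_treat(risk_control, risk_treated):
    """Reciprocal of the absolute risk reduction."""
    arr = absolute_risk_reduction(risk_control, risk_treated)
    if arr <= 0:
        raise ValueError("NNT is only defined for a positive risk reduction")
    return 1 / arr


# Rounded disability pension rates from the text: 8.6% without GRTW, 5.4% with GRTW.
arr = absolute_risk_reduction(0.086, 0.054)
nnt = number_needed_to_treat(0.086, 0.054)

# Relative risk reduction implied by the hazard rate ratio of 0.62.
rrr = 1 - 0.62

print(round(arr, 3), round(nnt, 2), round(rrr, 2))
```

The relative risk reduction of 0.38 derived from the hazard rate ratio is what the text summarizes as "about 40%".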

Both groups had comparable earnings in 2005 and 2006 prior to the start of their rehabilitation program. During follow-up, ie, from 2007 until 2009, the average annual income level among persons with GRTW was €3700-4700 higher than among those without GRTW (table 2 and figure 2). In total, the accumulated income from regular employment from 2007 to 2009 was €12 920 (95% CI €10 054-15 786) higher among GRTW patients.

Sickness absence and unemployment welfare benefits
Patients with GRTW received less welfare benefits due to sickness absence and unemployment up to the end of 2009 than patients without GRTW (table 2). The accumulated time of receiving sickness benefits was reduced by 52 days (95% CI 40-64 days), short-term unemployment benefits by 58 days (95% CI 49-67 days), and long-term unemployment benefits by 15 days (95% CI 10-20 days).

Sensitivity analyses
Sensitivity analyses that used calipers of 0.1 and 0.25 of the standard deviation of the propensity score for matching resulted in samples of 3738 and 3734 persons. Findings on the effects on disability pension were identical (HR 0.62, 95% CI 0.49-0.80). The comparative analyses of annual income yielded differences of €13 004 (95% CI €10 135-15 873) and €12 889 (95% CI €10 020-15 757) in favor of GRTW patients. The effects on disability pension varied to some extent over the range of the propensity score. Effects were strongest below the first quartile and above the third quartile. However, effects did not differ significantly.

Figure 1. Cumulative probability of a disability pension in patients with and without graded return-to-work (GRTW).

Discussion
Rehabilitation service research examines how services are routinely implemented and what is achieved by their usual application within a national health system. While clinical rehabilitation research focuses on the efficacy of an intervention in a more or less optimal setting with high treatment credibility and carefully selected patients, service research is interested in effectiveness, ie, the effects under routine conditions. This is important as findings of service research and clinical rehabilitation research might differ (30). Service research is, however, challenged by the fact that randomized controlled trials, which could provide the best evidence, are hardly feasible to perform in the case of services that have already been implemented. Moreover, comprehensive data collection in usual care is difficult to achieve (30).
In this study, administrative data and propensity score matching were used to analyze the effects of GRTW, a strategy frequently used in German rehabilitation care. The application of propensity score matching reduced bias when comparing work-disabled patients who started gradually returning to work and work-disabled patients who did not. The use of administrative data allowed us to consider a large sample and observe a follow-up period of up to three years. The findings indicate a moderate relative risk reduction of permanent work disability by about 40%. Moreover, the findings clearly show that GRTW is associated with a higher average income level and a reduction in time dependent on welfare benefits due to sickness absence and unemployment. The absolute risk reduction of permanent work disability, however, was small, and the NNT was high, with 31 persons needed to avoid one additional disability pensioner. Additional analyses nevertheless showed that the NNT decreased to only 13 persons among patients who started their rehabilitation program following a prompt by their health insurance agency due to long-term sick leave and severe chronicity. This, and the similar finding related to the sickness absence duration, indicates that the effect on a diminished disability pension risk is especially strong among patients with chronic handicaps. The potential effect of GRTW seems to be smaller if RTW would also be possible without this additional therapeutic measure. Rehabilitation physicians need to consider that recommending GRTW for less restricted patients may be of no additional benefit even if patients and employers wish to use the opportunity of GRTW.
The arena of work disability as described by Loisel (16) involves many stakeholders (eg, clinicians, workplace actors, insurance agencies, family and friends). Preventing permanent work disability and enabling RTW therefore needs a comprehensive and integrated strategy that takes account of as many of these actors as possible. This might explain, for example, why, despite the clear evidence that multimodal clinical interventions reduce pain and disability in patients with musculoskeletal disorders (32), the evidence on the effects on work participation is conflicting. While Schaafsma and colleagues (12) reported small effects on work participation among workers with subacute and chronic back pain as compared with usual care or exercise treatment, Kamper and colleagues (32) failed to identify an effect on work outcomes when they compared multidisciplinary programs to usual care. In contrast, the review by van Vilsteren and colleagues (17) on workplace interventions demonstrated a clear benefit on work outcomes. Most of the interventions in the latter review involved clinical and workplace interventions, as proposed by Loisel and his Sherbrooke model (13). We, therefore, see our findings in line with the review by van Vilsteren and colleagues (17). Workplace involvement, especially by therapeutic work resumption and GRTW, seems to be a major component of a successful work rehabilitation strategy (22).

Strengths and limitations
When interpreting our findings, the following limitations must be considered. First, though propensity score matching is a powerful tool to reduce bias, a propensity-score-matched analysis is still based on observational data only (30). It is not a randomized controlled trial. Consequently, the results are potentially biased by unobserved differences between cases and controls. Second, the risk of bias is increased as the amount of data that can be used for matching to reduce bias is clearly limited when using administrative data. Self-reported data could additionally support the estimation of the propensity scores. In the case of GRTW, factors like job strain, job satisfaction, fear avoidance beliefs, subjective RTW prognosis, self-rated work ability, RTW motivation and support from supervisors and colleagues are probably important predictors for considering a GRTW. For the additional use of self-reported data, large cohort studies are needed as described by Saltychev and colleagues in their papers on the propensity-score-matched analysis of the effects of the Finnish vocationally oriented medical rehabilitation (28,29,33). Additionally, linking of administrative and questionnaire data has to be realized to use questionnaire data for such analysis.
The limitations of the study are balanced by the following strengths. First, administrative data allows fairly complete, reliable and valid assessment of data. This was especially the case in this study for several indicators of work participation, which is the primary outcome of RTW strategies and vocational rehabilitation services. Second, using administrative data allows the inclusion of large samples. Third, using administrative data enables a long follow-up period to be observed without sample attrition. Fourth, propensity-score-matched analyses allowed us to determine figures such as the NNT, which are usually derived from randomized controlled trials and needed to appropriately communicate the benefit of an intervention.

Concluding remarks
In conclusion, this study demonstrates that the application of GRTW in usual rehabilitation practice supported return to work and sustainable work participation among patients who were still unable to perform full job duties at the end of their rehabilitation program. However, the results also indicate that the possibility of GRTW should be particularly considered for patients who started their rehabilitation program following a prompt by their health insurance agency due to long-term sick leave and severe chronicity. The additional effect on avoiding disability pensions among less-disabled patients does not seem to be clinically meaningful.