Screening manual and office workers for risk of long-term sickness absence: cut-off points for the Work Ability Index

The serious consequences of long-term sickness absence (LTSA) for workers, employers and society justify screening for risk of LTSA. The Work Ability Index (WAI) may be used as a screening tool because it accurately predicts the LTSA risk and usefully discriminates high- from low-risk workers. However, WAI cut-off points differ between manual and office workers.

Objectives The aim of this study was to investigate the Work Ability Index (WAI) as a tool to screen for risk of different durations of long-term sickness absence (LTSA) among manual and office workers.

Methods The prospective study comprised a cohort of 3049 (1710 manual and 1339 office) workers participating in occupational health surveys between 2010 and 2012. The survey date was set as baseline and incident LTSA episodes of different duration (>14, >28, >42, >60, and >90 days) were retrieved from an occupational health register in the year following the survey. Baseline WAI scores were associated with LTSA episodes occurring (no/yes) during one-year follow-up by logistic regression analysis in a random sample (N=1000) of the cohort. Predictions of LTSA risk were then validated among the workers not included in the random sample.

Results The odds of LTSA episodes at follow-up decreased with increasing baseline WAI scores (ie, better work ability). The WAI accurately predicted the risk of future LTSA episodes >28, >42, and >60 days, but over-predicted the risk of LTSA episodes >14 and >90 days. The WAI discriminated between workers at high and low risk of LTSA episodes of all durations. Office workers had higher WAI scores than manual workers. Consequently, false-negative rates were higher among office workers and false-positive rates were higher among manual workers at each WAI cut-off point.

Conclusion The WAI could be used to screen both manual and office workers for risk of LTSA episodes lasting >28, >42, and >60 days. WAI cut-off points depend on the objectives of screening and may differ for manual and office workers.

Sickness absence, particularly long-term sickness absence (LTSA), is a substantial societal and economic problem. The costs of sickness benefits average 1% of the gross domestic product of OECD countries (1) and are highest in Norway and The Netherlands, where LTSA accounts for most of these costs (2). When a worker is absent from work for a longer period of time, employers have to assign the worker's tasks to other staff or replace the absent worker. LTSA disconnects sick-listed workers from the workplace, which may ultimately lead to social marginalization and reduced income (3). The probability of resuming work decreases with increasing sickness absence duration (4). Therefore, it is important to identify workers at risk of LTSA and refer them to preventive programs that help them stay at work.
The ability to stay at work and manage work demands has been conceptualized as work ability, that is, the balance between a worker's resources and the demands of work (5). The Work Ability Index (WAI) is widely used to measure work ability. Several studies have associated poor WAI scores with an increased risk of disability pension (6)(7)(8)(9). As LTSA precedes disability pension, it is conceivable that poor WAI scores also predict LTSA. Among sick-listed workers, poor WAI scores were found to be associated with a longer duration of LTSA (10,11). Only a few studies have related the WAI scores of non-sick-listed workers to their risk of future sickness absence. Kujala et al (12) investigated the relationship between baseline work ability and sickness absence (>9 days) during a one-year follow-up of Finnish workers from the Northern Finland Birth Cohort 1966 study. Workers with poor-to-moderate WAI scores had a higher risk of sickness absence than workers with excellent WAI scores. Alavinia et al (13) associated the WAI scores of Dutch construction workers participating in a health survey in 2005 with short (<2 weeks), medium (2-12 weeks) and long (>12 weeks) duration sickness absence episodes occurring until the end of 2006. Lower WAI scores were found to be associated with higher risks of LTSA.
These prospective studies, however, do not tell us whether the WAI can be used for case-finding, which is the identification of non-sick-listed workers with an increased risk of LTSA. The serious consequences of LTSA for workers, employers, and society justify case-finding by screening for risk of LTSA. Frequently debated disadvantages of screening are prolonged morbidity when the prognosis is unaltered, over-diagnosis and over-treatment of questionable conditions, false reassurance for individuals with false-negative results, and anxiety for those with false-positive results. However, false reassurance and anxiety will be less of a problem for workers at risk of LTSA than for individuals at risk of serious disease. Furthermore, interventions may alter the prognosis and consequences of LTSA. Taimela et al (14) showed that preventive consultations reduced sickness absence, although such consultations were cost-effective only among high-risk workers (15). The cost-effectiveness of preventive consultations may increase when only high-risk workers are referred, which accentuates the need to screen for risk of LTSA. Kant et al (16) investigated the effect of preventive consultations on LTSA among high-risk workers who were identified with the Balansmeter®, an instrument developed and used among office workers to screen for risk of psychosocial sickness absence (17). They found that 9.1% of the intervention group (N=99) had LTSA episodes >28 days as compared to 18.3% of the control group (N=131).
In addition to a tool for identifying office workers at risk of psychosocial LTSA, we need an instrument to screen for risk of all types of LTSA in all kinds of occupations. Lindberg et al (18) investigated the predictive value of the WAI for LTSA episodes ≥28 days in a population-based sample (N=2252) from three Swedish municipalities. The authors found that workers with poor-to-moderate WAI scores had a 1.6 (women) to 2.1 (men) times higher risk of LTSA as compared to workers with good-to-excellent WAI scores. The present study calibrated the WAI for LTSA risk predictions and investigated its ability to discriminate between high- and low-risk workers.

Methods
The study population was recruited from a steel mill employing 10 935 workers, of whom 3674 were invited for occupational health surveys in the period 2010-2012. A total of 3049 (83%) workers participated in the health survey and completed a questionnaire including the WAI. The Medical Ethics Committee of the University Medical Center Groningen (M12.116654) granted ethical clearance for the study.

Work Ability Index (WAI)
The WAI measures work ability with seven dimensions: (i) current work ability compared with lifetime best (range 0-10), (ii) work ability in relation to the demands of work (range 2-10), (iii) current number of diagnosed diseases (range 1-7), (iv) impaired work performance due to illness (range 1-6), (v) sickness absence in the past 12 months (range 1-5), (vi) estimated work ability in the forthcoming two years (range 1-7), and (vii) mental resources (range 1-4). The dimension scores are summed to a WAI score ranging from 7-49, with higher scores reflecting better work ability. We used a short version of the WAI listing 15 medical conditions (19), which is nowadays commonly used (20). The psychometric properties of the WAI have been shown to be satisfactory for use in occupational health research and practice (21,22).
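As an illustration of how the sum score follows from the seven dimension ranges listed above, a minimal Python sketch is given here. It assumes the dimension scores have already been derived from the questionnaire items; the dictionary keys and the function name `wai_score` are hypothetical labels chosen for this example and are not part of the WAI itself.

```python
# Allowed range of each WAI dimension, as listed in the text.
# The key names are hypothetical labels used only in this sketch.
WAI_RANGES = {
    "current_vs_lifetime_best": (0, 10),
    "relation_to_demands": (2, 10),
    "diagnosed_diseases": (1, 7),
    "impairment_due_to_illness": (1, 6),
    "sickness_absence_past_year": (1, 5),
    "prognosis_two_years": (1, 7),
    "mental_resources": (1, 4),
}


def wai_score(dimensions):
    """Sum the seven dimension scores after checking each against its range."""
    missing = set(WAI_RANGES) - set(dimensions)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    total = 0
    for name, (low, high) in WAI_RANGES.items():
        value = dimensions[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} outside {low}-{high}")
        total += value
    return total


# The theoretical minimum and maximum follow directly from the ranges.
WAI_MIN = sum(low for low, _ in WAI_RANGES.values())
WAI_MAX = sum(high for _, high in WAI_RANGES.values())
```

Summing the lower and upper bounds reproduces the 7-49 range stated above, which is a quick consistency check on the listed dimension ranges.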

Outcome variable
Sickness absence episodes in the year following the health survey were retrieved from an occupational health register. Dutch sickness absence policies require medical certification of sickness absence by an occupational physician within 42 days of calling in sick. Hence, we investigated the WAI as a tool for predicting LTSA episodes lasting >42 days. There is no international consensus on the definition of LTSA and episodes >42 days may be arbitrary for other countries. To evaluate the WAI as a prognostic tool in a broader international context, we also present results for the prediction of LTSA episodes lasting >14, >28, >60, and >90 consecutive days.

Statistical analysis
Statistical analyses were done in SPSS Statistics for Windows, version 21.0 (IBM Corp, Armonk, NY, USA) and in R (Project for Statistical Computing) using the regression modeling strategies (RMS) package (23). For each duration of LTSA episodes, a logistic regression model was fitted with the baseline WAI score as continuous independent variable and the occurrence of an LTSA episode during follow-up (no=0, yes=1) as outcome variable.

Split-sample validation of the WAI
A random sample (N=1000) was drawn from the health survey participants and used to estimate the linear predictor $\mathrm{LP} = b_0 + b_1 \times \mathrm{WAI}$, in which $b_0$ is the intercept and $b_1$ the logistic regression coefficient of the WAI for each duration of LTSA episodes (24). Nagelkerke's pseudo $R^2$ of the logistic regression model was presented as a measure of the overall predictive ability of the WAI.
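The linear predictor is on the log-odds scale, so a predicted LTSA risk is obtained through the inverse logit. The sketch below shows that conversion in Python. The coefficient values are invented for illustration only; the estimated intercepts and coefficients of the study are not reproduced in this text.

```python
import math


def predicted_risk(wai, b0, b1):
    """Convert the linear predictor LP = b0 + b1 * WAI into a probability."""
    lp = b0 + b1 * wai
    return 1.0 / (1.0 + math.exp(-lp))


# Hypothetical coefficients, NOT the estimates from the study.
B0_EXAMPLE = 5.0
B1_EXAMPLE = -0.2  # negative: higher WAI scores give lower predicted risk

risk_low_wai = predicted_risk(27, B0_EXAMPLE, B1_EXAMPLE)
risk_high_wai = predicted_risk(44, B0_EXAMPLE, B1_EXAMPLE)
```

With a negative coefficient, the predicted risk decreases monotonically as the WAI score increases, which is the direction of association reported in the study.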
The WAI was calibrated for predictions of the risk for each duration of LTSA episodes among the participants not included in the random sample. Calibration was investigated by calibration graphs, plotting mean predicted LTSA risks against the observed LTSA frequencies.
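The data behind such a calibration graph can be sketched as follows, assuming workers are sorted by predicted risk and divided into equally sized groups. The function name `calibration_table` and the choice of ten groups are assumptions made for this example; the text does not state how the groups were formed.

```python
def calibration_table(predicted, observed, n_groups=10):
    """Return (mean predicted risk, observed frequency) per risk group.

    predicted: predicted LTSA risks between 0 and 1
    observed:  matching outcomes, 1 if LTSA occurred and 0 otherwise
    """
    pairs = sorted(zip(predicted, observed))
    size = len(pairs) / n_groups
    rows = []
    for g in range(n_groups):
        chunk = pairs[round(g * size):round((g + 1) * size)]
        if not chunk:
            continue  # fewer workers than groups
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_freq = sum(o for _, o in chunk) / len(chunk)
        rows.append((mean_pred, obs_freq))
    return rows
```

Points close to the diagonal (mean predicted risk equal to observed frequency) indicate good calibration; points below the diagonal indicate over-prediction.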
Calibration was considered adequate if tests for calibration intercept and slope were non-significant, ie, P≥0.05 (24). Discrimination between workers at high and low risk of different duration LTSA episodes was examined by Receiver Operating Characteristic (ROC) analysis. The area under the ROC curve (AUC) is a measure of the discriminative ability of the WAI. If we randomly select one worker from the LTSA group and one worker from the non-LTSA group, then the AUC indicates the probability that the WAI correctly identifies the worker from the LTSA group. AUC=0.50 represents no discrimination above chance and AUC≥0.75 is generally considered to reflect adequate discrimination (24). The Youden index was calculated as sensitivity + specificity − 1 to determine the cut-off point for equally important sensitivity and specificity (25). Sensitivity represents the true-positive rate and 1 − sensitivity the false-negative rate (figure 1). Similarly, specificity represents the true-negative rate and 1 − specificity the false-positive rate.
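The pairwise interpretation of the AUC and the selection of a cut-off point by the Youden index can be sketched in Python as below. The WAI scores in the example are made up for illustration and are not study data; a worker is flagged as high risk when the WAI score is at or below the cut-off, and ties between a case and a non-case count as half.

```python
def auc_pairwise(scores_cases, scores_noncases):
    """Probability that a random LTSA case has a lower WAI than a random non-case."""
    wins = 0.0
    for c in scores_cases:
        for n in scores_noncases:
            if c < n:
                wins += 1.0
            elif c == n:
                wins += 0.5  # ties count half
    return wins / (len(scores_cases) * len(scores_noncases))


def youden_cutoff(scores_cases, scores_noncases):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1."""
    best = None
    for cutoff in sorted(set(scores_cases) | set(scores_noncases)):
        sens = sum(s <= cutoff for s in scores_cases) / len(scores_cases)
        spec = sum(s > cutoff for s in scores_noncases) / len(scores_noncases)
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best


# Made-up WAI scores, for illustration only.
cases = [30, 34, 36, 40]         # workers with LTSA at follow-up
noncases = [35, 38, 42, 44, 46]  # workers without LTSA at follow-up

auc = auc_pairwise(cases, noncases)
cutoff, j = youden_cutoff(cases, noncases)
```

The pairwise loop is quadratic in the number of workers, which is acceptable for a sketch; rank-based formulas give the same value more efficiently on large samples.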

Results
A total of 3049 workers participated in the health surveys. They were 46.1 (SD 11.2) years of age and worked 36.7 (SD 6.0) hours/week at the steel mill, most of them (69%) for >10 years. Office workers were older, more often female, and worked more hours/week than manual workers (table 1). WAI scores were lower and LTSA episodes more frequent among manual compared to office workers.

Development of the WAI as prognostic tool
In the random sample (N=1000, 57% manual workers), 19 workers (2%) left employment: 9 workers resigned, 4 workers were dismissed, and 6 workers retired during follow-up. Hence, 981 workers were included in analysis, 166 (17%) of whom had LTSA at follow-up (table 2). Baseline WAI scores were negatively associated with LTSA episodes of all durations, indicating that higher WAI scores (ie, better work ability) were associated with lower odds of LTSA. Nagelkerke's pseudo $R^2$ increased from 0.295 for LTSA episodes >14 days to 0.422 for LTSA episodes >90 days, representing substantial predictive ability of the WAI.

Validation of the WAI as prognostic tool
In the validation sample (N=2049, 55% manual workers), 43 workers (2%) left employment: 20 resigned, 15 were dismissed and 8 retired. Of the 2006 remaining workers, 298 (15%) had LTSA (table 3). The LP accurately predicted the LTSA risk as reflected by non-significant calibration intercepts and slopes for LTSA episodes >28, >42 and >60 days. Calibration tests showed that the calibration intercept for LTSA episodes >14 days was significantly lower than 0, indicating systematic over-prediction of the LTSA risk. The calibration slopes for LTSA >14 days and LTSA >90 days were significantly <1, indicating that over-prediction increased with estimated risks.
Discrimination was adequate for LTSA episodes of all durations and improved from AUC 0.78 for risk of LTSA episodes >14 days to AUC 0.86 for LTSA episodes >90 days. In other words, when one worker with and one worker without LTSA were randomly selected, the WAI correctly identified the worker with LTSA in 78% of such pairs for LTSA episodes >14 days and in 86% of such pairs for LTSA episodes >90 days. The cut-off points that maximized the Youden index decreased from WAI 42 for identifying workers at risk of LTSA episodes >14 days to WAI 36 for identifying workers at risk of LTSA episodes >90 days.

WAI cut-off points
Manual workers reported lower WAI scores than office workers (table 1). Therefore, sensitivities and specificities at different WAI cut-off points were analyzed separately for manual and office workers (table 4). Sensitivities (ie, true-positive rates) at all WAI cut-off points were higher for manual than office workers. Conversely, specificities (ie, true-negative rates) were lower for manual than office workers. At WAI 34, the positive predictive value was 31% among manual workers and 37% among office workers (table 4). Given the population incidence of LTSA episodes >42 days of 7% and 5% in manual and office workers, respectively, the risk was 4.4 times higher among manual workers with WAI scores ≤34 and 7.4 times higher among office workers with WAI scores ≤34. The positive predictive values decreased with increasing WAI cut-off points to 11% among manual workers and 9% among office workers at WAI 43, corresponding to a 1.6 and 1.8 times higher risk of LTSA episodes >42 days among workers with WAI scores ≤43 as compared to the population incidence.
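The relative risks quoted above are the ratio of the positive predictive value at a cut-off point to the population incidence. A short Python check using only the percentages reported in the text (the function name `risk_ratio` is chosen for this example):

```python
def risk_ratio(ppv, incidence):
    """Risk among workers at or below the cut-off relative to the population incidence."""
    return ppv / incidence


# Incidence of LTSA episodes >42 days: 7% (manual) and 5% (office).
manual_wai34 = risk_ratio(0.31, 0.07)  # about 4.4
office_wai34 = risk_ratio(0.37, 0.05)  # about 7.4
manual_wai43 = risk_ratio(0.11, 0.07)  # about 1.6
office_wai43 = risk_ratio(0.09, 0.05)  # about 1.8
```

Because the incidence is lower among office workers, a similar positive predictive value translates into a larger relative risk for office than for manual workers.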

Discussion
The WAI accurately predicted the risk of LTSA episodes >28, >42 and >60 days and adequately discriminated between workers at high and low risk of LTSA episodes of all durations. The predictive and discriminative abilities of the WAI increased with LTSA duration, indicating that the WAI better identified workers at risk of longer duration LTSA. Cut-off points when sensitivity and specificity were equally important decreased from WAI 42 for LTSA episodes >14 days to WAI 36 for LTSA episodes >90 days. Apparently, lower WAI cut-off scores are needed to identify workers at risk of longer duration LTSA. In general, however, sensitivity and specificity are not equally important and cut-off points have to be attuned to the objectives of screening. The objective of screening for LTSA could be to identify workers at high risk of LTSA episodes for preventive consultations (14)(15)(16) or workplace health promotion programs (26,27).

WAI cut-offs for manual and office workers
The current study showed that the WAI accurately predicted the risk of LTSA and discriminated between workers at high and low risk of different duration LTSA episodes. Bethge et al (28) reported that the WAI predicted LTSA episodes >42 days and argued that rehabilitation services should be provided to workers with WAI scores <38. We also found that WAI 38 was the best balanced cut-off score for LTSA episodes >42 days. At this cut-off point, sensitivity was 0.73 and 0.59 among manual and office workers, respectively. In other words, 73% of the manual workers and 59% of the office workers with LTSA at follow-up had baseline WAI scores ≤38. Thus, 27% and 41% of the manual and office workers with LTSA, respectively, had baseline WAI scores >38. On average, office workers reported higher WAI scores than manual workers, which explains why a higher proportion of office workers with LTSA had baseline WAI scores above the cut-off point. As a result, office workers had a higher probability of being missed as an LTSA case.
In occupational healthcare, missing cases of LTSA will not be a great problem. Unnecessary utilization of services by workers falsely identified as being at risk of LTSA will be more problematic, especially when the burden or costs of interventions are high and/or resources limited. When WAI 38 was chosen as the cut-off point to identify workers at risk of LTSA episodes >42 days, specificity was 0.76 and 0.88 among manual and office workers, respectively. Consequently, 24% and 12% of the manual and office workers without LTSA, respectively, were falsely identified as LTSA cases. In other words, they had baseline WAI scores ≤38, but did not develop LTSA during follow-up. False-positive rates were higher among manual workers because they reported lower WAI scores than office workers. Occupational healthcare providers may want to choose WAI cut-off points <38 to increase specificity and reduce false-positive rates, particularly among manual workers. In that regard, it is interesting to note that preventive consultations cost-effectively reduced sickness absence only among workers at high risk of sickness absence, not among those at moderate or low risk (16).
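The false-negative and false-positive rates at the WAI 38 cut-off for LTSA episodes >42 days follow directly from the reported sensitivities and specificities, as the short Python sketch below shows. The function name `error_rates` is chosen for this example.

```python
def error_rates(sensitivity, specificity):
    """Return the false-negative and false-positive rates implied by a cut-off."""
    return {
        "false_negative_rate": 1 - sensitivity,  # cases scoring above the cut-off
        "false_positive_rate": 1 - specificity,  # non-cases scoring at or below it
    }


# Sensitivity and specificity at WAI 38, as reported in the text.
manual = error_rates(sensitivity=0.73, specificity=0.76)
office = error_rates(sensitivity=0.59, specificity=0.88)
```

The two groups err in opposite directions at the same cut-off point: office workers are missed more often (41% versus 27%), whereas manual workers are falsely flagged more often (24% versus 12%).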
A recent meta-analysis showed that the overall effect of workplace health promotion programs is small (26). High-specificity cut-off points can be used to target preventive consultations and programs at the high-risk workers who need them most, which may increase cost-effectiveness at the expense of missing LTSA cases.

Methodological considerations
The prospective design of the study, the different data sources (occupational health survey for WAI scores and register for sickness absence), and the use of registered instead of self-reported LTSA are strengths of the study.
In addition, the validation for different duration LTSA episodes enables cross-national evaluation of the WAI as a tool to identify workers at risk of LTSA. The large sample size provided sufficient statistical power for split-sample validation of the WAI, so that regression coefficients were estimated for subjects other than those used for validating risk predictions. It should be noted, however, that the study population was a male-dominated sample of workers employed at a steel mill. Van den Berg et al (29) reported an average WAI score of 40.4 for a heterogeneous sample of 10 542 workers (42.8% women) from 49 Dutch companies in commercial (41%) and non-commercial (37%) services, industry (18%) and construction (4%). We found higher WAI scores in our study population, which may indicate a healthy volunteer effect. Healthy workers are more likely to participate in health surveys than workers with health complaints (30). Such healthy volunteer bias may have led to underestimation of the associations between WAI scores and LTSA.

Concluding remarks
The WAI accurately predicted the risk of LTSA episodes >28, >42 and >60 days, but over-predicted the risk of LTSA episodes >14 and >90 days. Discrimination between workers at high and low risk of LTSA was adequate for LTSA episodes of all durations. The choice of WAI cut-off points depends on the objectives of screening rather than the Youden index. When defining cut-off points, occupational healthcare providers have to take into account that office workers report higher average WAI scores than manual workers. The current findings indicate that the WAI could be used as a surveillance tool, although further validation in other settings is needed before the WAI can be recommended to screen for risk of LTSA in occupational health care.