Short communication

Scand J Work Environ Health 2014;40(4):428-431

Work ability as prognostic risk marker of disability pension: single-item work ability score versus multi-item work ability index

by Roelen CAM, van Rhenen W, Groothoff JW, van der Klink JJL, Twisk JWR, Heymans MW

Objectives Work ability predicts future disability pension (DP). A single-item work ability score (WAS) is emerging as a measure for work ability. This study compared the single-item WAS with the multi-item work ability index (WAI) in their ability to identify workers at risk of DP.

Methods This prospective cohort study comprised 11 537 male construction workers, who completed the WAI at baseline and reported DP after a mean 2.3 years of follow-up. WAS and WAI were calibrated for DP risk predictions with the Hosmer-Lemeshow (H-L) test and their ability to discriminate between high- and low-risk construction workers was investigated with the area under the receiver operating characteristic curve (AUC).

Results At follow-up, 336 (3%) construction workers reported DP. Both WAS [odds ratio (OR) 0.72, 95% confidence interval (95% CI) 0.66–0.78] and WAI (OR 0.57, 95% CI 0.52–0.63) scores were associated with DP at follow-up. The WAS showed miscalibration (H-L model χ²=10.60; df=3; P=0.01) and poorly discriminated between high- and low-risk construction workers (AUC 0.67, 95% CI 0.64–0.70). In contrast, calibration (H-L model χ²=8.20; df=8; P=0.41) and discrimination (AUC 0.78, 95% CI 0.75–0.80) were both adequate for the WAI.

Conclusion Although associated with the risk of future DP, the single-item WAS poorly identified male construction workers at risk of DP. We recommend using the multi-item WAI to screen for risk of DP in occupational health practice.


Work disability has become a major occupational health problem in developed economies. On average 6% of the working population in OECD countries received disability pension (DP) benefits in 2009 (1). DP recipient rates are highest in Sweden (11%), Norway (10%), Finland (8%), and The Netherlands (8%). OECD countries spend on average 1.9% of their gross domestic product on DP benefits (2). As a result, governments attach importance to sustained work ability throughout working life. Work ability is developing into a more versatile concept and its definition may differ across settings, for example occupational health, insurance medicine, or rehabilitation (3). In occupational health, work ability is primarily determined by the balance between an individual’s work demands and resources.

In the 1980s, Finnish occupational clinicians developed the work ability index (WAI) to measure work ability. Several studies have demonstrated that low or declining WAI levels increase the risk of DP (4–6). The WAI is a long instrument and rather difficult to complete, especially when workers do not fully understand it. A user-friendly single-item work ability score (WAS), asking for current work ability in relation to lifetime best, is increasingly being used to assess work ability (7). The similarity in results between WAS and WAI fosters the use of the WAS for large-scale population surveys (8). The present study compares the single-item WAS with the multi-item WAI in their ability to identify workers at risk of DP and to discriminate between workers at high and low risk of DP.

Methods


ArboNed is a nationwide occupational health service in The Netherlands. ArboNed recruits workers from contracted construction companies every two years for health checks assessing work ability. All 18 093 construction workers who participated in health checks in the period 2005–2007 were included in this study. After a mean 2.3 (standard deviation [SD] 0.1) years, 11 537 (64%) of these workers participated in another health check, assessing DP with the question “At the moment, are you receiving disability pension benefits?” (no/yes). The Medical Ethics Committee of the University Medical Center Groningen approved the study.

Work ability

The single-item WAS asked construction workers “Assume that your work ability at its best has a value of 10 points. How many points would you give your current work ability?” Workers could respond on a scale ranging from 0 (completely unable to work) to 10 (work ability at its best).

The WAI is a self-administered instrument that asks for current work ability (as the WAS does), work ability in relation to physical and mental job demands, and work ability in the forthcoming two years. The WAI assesses diseases by a list of medical diagnoses. For this study, we used a short version with 15 medical diagnoses (9). Furthermore, the WAI asks for impaired work performance due to illness and sickness absence over the last 12 months. Finally, mental resources are addressed with the items: “Have you been able to enjoy your regular daily activities?”, “Have you been active and alert?”, and “Have you felt yourself to be full of hope about the future?” All WAI items are weighted and summed to a composite score ranging from 7=poor work ability to 49=excellent work ability (10).

Statistical analysis

Statistical analyses were performed in SPSS for Windows, version 20 (IBM Corp, Armonk, NY, USA). WAS and WAI scores were standardized as percentage of their maximum score and included as continuous independent variables in separate logistic regression models with DP (no=0, yes=1) at follow-up as outcome variable. Odds ratios (OR) and related 95% confidence intervals (95% CI) are presented per 10-point increase in standardized WAS and WAI scores, adjusted for age, working hours/week, and number of years employed in the construction industry.
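The standardization described above can be illustrated with a short Python sketch. It is not the SPSS procedure used in this study. The function names are hypothetical, and the coefficient is simply back-calculated from the reported WAS OR of 0.72 for demonstration purposes.

```python
import math

WAS_MAX = 10   # maximum single-item work ability score
WAI_MAX = 49   # maximum work ability index

def standardize(score, max_score):
    """Express a raw score as a percentage of its maximum."""
    return 100.0 * score / max_score

def raw_points_per_step(max_score, step=10.0):
    """Raw-score equivalent of a `step`-point change on the standardized scale."""
    return max_score * step / 100.0

def odds_ratio_per_step(beta_per_point, step=10.0):
    """Convert a logit coefficient per standardized point into an OR per `step` points."""
    return math.exp(beta_per_point * step)

# Hypothetical coefficient, back-calculated so that the OR per 10 points is 0.72.
beta_was = math.log(0.72) / 10.0

print(standardize(8, WAS_MAX))                    # 80.0
print(raw_points_per_step(WAI_MAX))               # 4.9
print(round(odds_ratio_per_step(beta_was), 2))    # 0.72
```

Under this scaling, a step of 10 standardized points corresponds to 1 raw WAS point but to 4.9 raw WAI points, which is worth keeping in mind when the two ORs are compared.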

The predictive performance of both logistic regression models was quantified in terms of calibration and discrimination. Calibration refers to the agreement between predicted and observed DP risks, and was investigated by the Hosmer-Lemeshow (H-L) test. A lower H-L model χ² indicates better calibration and H-L P≥0.05 reflects adequate calibration (11). Discrimination refers to the ability to distinguish between workers at high and low risk of DP, and was examined by receiver operating characteristic (ROC) analysis. An area under the ROC curve (AUC) ≥0.75 represents adequate discrimination (11, 12).
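Both measures can be sketched in plain Python, as follows. This is an illustrative sketch, not the SPSS output used in this study. It assumes risk groups of roughly equal size for the H-L statistic and computes the AUC as the proportion of case/non-case pairs in which the case has the higher predicted risk (ties count as one half). The function names are hypothetical. Obtaining the H-L P-value would additionally require the χ² distribution with the returned degrees of freedom, which is not implemented here.

```python
def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Return (chi-square, df) comparing observed and predicted events per risk group."""
    order = sorted(range(len(y_prob)), key=lambda i: y_prob[i])
    n = len(order)
    chi2, used = 0.0, 0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue
        expected = sum(y_prob[i] for i in idx)   # predicted number of events
        observed = sum(y_true[i] for i in idx)   # observed number of events
        mean_p = expected / len(idx)
        variance = len(idx) * mean_p * (1.0 - mean_p)
        if variance > 0:
            chi2 += (observed - expected) ** 2 / variance
        used += 1
    return chi2, used - 2

def auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    cases = [s for s, y in zip(y_score, y_true) if y == 1]
    non_cases = [s for s, y in zip(y_score, y_true) if y == 0]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in non_cases)
    return wins / (len(cases) * len(non_cases))

# Toy example with invented predicted risks (not study data).
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))   # 0.75
```

The pairwise AUC is quadratic in sample size, so a rank-based implementation would be preferable for a cohort of this size.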


Results

Men participating in both health checks were aged 45.4 (SD 9.5) years, and those who did not participate in the health check at follow-up were aged 43.1 (SD 11.1; t-test P<0.01) years. Participants had been working 39.7 (SD 7.3) hours/week for on average 24.4 (SD 12.1) years in the construction industry, while men who did not participate at follow-up worked 39.4 (SD 7.9; t-test P=0.67) hours/week for on average 23.8 (SD 11.7; t-test P=0.29) years.

Six construction workers had missing data on the WAS and 2007 (17%) workers had missing data on the WAI. The work ability scores did not differ between men who did (WAS 7.8, SD 1.3; WAI 40.1, SD 4.9) and did not (WAS 7.9, SD 1.3, t-test P=0.44; WAI 41.3, SD 5.1, t-test P=0.17) participate at follow-up.

At baseline, 81% were blue-collar workers versus 80% at follow-up. WAS scores did not differ between blue- (mean 7.8, SD 1.4) and white- (mean 8.1, SD 1.3; t-test P=0.63) collar workers, whereas WAI scores were lower among blue- (mean 39.8, SD 5.0) than white- (mean 41.3, SD 4.4; t-test P<0.01) collar workers.

A total of 336 (3%) construction workers [287 (4%) blue- and 49 (3%) white-collar workers] reported DP at follow-up. Both standardized WAS (OR 0.72, 95% CI 0.66–0.78) and WAI (OR 0.57, 95% CI 0.52–0.63) scores were associated with DP. Associations did not differ significantly between blue- and white-collar workers [data not shown].

The WAS showed miscalibration (H-L model χ²=10.60; df=3; P=0.01), indicating that it did not accurately predict the risk of DP. Figure 1 shows that the WAS poorly discriminated (AUC 0.67, 95% CI 0.64–0.70) between workers at high and low risk of DP. This was probably because 94% of workers scored WAS ≥7. Specificities were high, but sensitivities low for WAS <7 (table 1).

Figure 1

The figure shows the ability of the single-item work ability score [(WAS) grey line with area under the curve (AUC) 0.67] and the multi-item work ability index [(WAI) black line with AUC 0.78] to discriminate construction workers at high risk of disability pension from those at low risk; the diagonal indicates no discrimination above chance.

Table 1

Sensitivities and specificities at different work ability cut-off scores.


For the multi-item WAI, calibration (H-L model χ²=8.20; df=8; P=0.41) and discrimination (AUC 0.78, 95% CI 0.75–0.80) were both adequate. WAI scores of 37–40 were the optimal cut-offs (table 1).
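The trade-off between sensitivity and specificity at different cut-off scores can be sketched as follows. This is a minimal illustration with invented toy data, not the study data behind table 1. The function name is hypothetical, and the sketch assumes that a worker is flagged as being at risk when the score is at or below the cut-off.

```python
def sensitivity_specificity(scores, has_dp, cutoff):
    """Return (sensitivity, specificity) when scores <= cutoff flag a worker as at risk."""
    tp = fp = tn = fn = 0
    for score, dp in zip(scores, has_dp):
        flagged = score <= cutoff   # assumption: low work ability flags risk
        if flagged and dp:
            tp += 1                 # DP case correctly flagged
        elif flagged and not dp:
            fp += 1                 # non-case falsely flagged
        elif dp:
            fn += 1                 # DP case missed
        else:
            tn += 1                 # non-case correctly not flagged
    return tp / (tp + fn), tn / (tn + fp)

# Invented WAI scores and DP outcomes (1 = DP at follow-up), for illustration only.
scores = [30, 35, 38, 39, 42, 45]
has_dp = [1, 1, 1, 0, 0, 0]

for cutoff in (37, 40):
    sens, spec = sensitivity_specificity(scores, has_dp, cutoff)
    print(cutoff, round(sens, 2), round(spec, 2))
# 37 0.67 1.0
# 40 1.0 0.67
```

Raising the cut-off flags more workers, which increases sensitivity at the expense of specificity.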


Discussion

The multi-item WAI, but not the single-item WAS, showed adequate calibration and discrimination to identify construction workers at increased risk of DP. It should be acknowledged that our results only apply to male construction workers and may be different for women and workers in other economic sectors. Ahlstrom et al (7) reported that the WAS is a good alternative to the WAI for assessing the status and progress of work ability among sick-listed female human service workers. In line with our present results, the authors discussed that the WAI better predicted future health outcomes. El Fassi et al (8) demonstrated that the WAS collects information on work ability as validly as the WAI. Although the WAS is a valid measure of work ability, our results show that its ability to identify construction workers at increased risk of DP is poor.

Practical implications

The WAS discriminates to some extent between male construction workers at high and low risk of DP. When we want to predict a multi-factorial endpoint such as DP, an AUC of 0.67 might not be that bad. However, the multi-item WAI better discriminates between high- and low-risk workers. Critics argue that the WAI is a long and complicated instrument to complete, which might explain the 17% of missing responses on the WAI in our study. The WAS is more user-friendly and easier to interpret. Furthermore, the WAS can be implemented at lower cost in large-scale surveys. We could consider the WAS as a primary screening instrument and then distribute the WAI only to the workers with low WAS scores. The present study showed low sensitivities for WAS cut-off scores <8, but for WAS ≥9 sensitivity was acceptable. In subgroup analysis, we found that the discriminative ability of the WAI was lower among workers with WAS scores <9 than in the total population of male construction workers [data not shown]. This may be due to the reduced spread of work ability scores among the selected workers or to the so-called “incorporation bias”, as the WAS is part of the WAI. Based on this finding, we advise against using the WAS as a primary screening instrument and recommend the short version of the WAI listing 15 medical diagnoses to screen construction workers for risk of DP.

In a random sample of the German workforce, Bethge et al (13) found that WAI ≤37 identified workers in need of rehabilitation services. We recommend WAI cut-off scores of 37–40 to identify construction workers at increased risk of DP. At a cut-off point <36, about half of DP cases would be missed, while at cut-off points >40 about half of the workers are falsely identified as being at increased risk of DP. Which cut-off point between 37 and 40 should be chosen depends on the burden and costs of interventions to prevent DP.

References



Organization for Economic Cooperation and Development. (2010). Sickness, disability and work: breaking the barriers. Paris: OECD Publishing.


Organization for Economic Cooperation and Development. (2011). Society at a Glance 2011: OECD Social Indicators. Paris: OECD Publishing.


Ilmarinen, J. (2009). Work ability – a comprehensive concept for occupational health research and prevention. Scand J Work Environ Health, 35, 1-5.


Liira, J, Matikainen, E, Leino-Arjas, P, Malmivaara, A, Mutanen, P, Rytkönen, H, & Juntunen, J. (2000). Work ability of middle-aged Finnish construction workers – a follow-up study in 1991–1995. Int J Ind Ergon, 25, 477-81.


Alavinia, SM, de Boer, AGEM, van Duivenbooden, JC, Frings-Dresen, MH, & Burdorf, A. (2009). Determinants of work ability and its predictive value for disability. Occup Med, 59, 32-7.


Bethge, M, Gutenbrunner, C, & Neuderth, S. (2013). Work Ability Index predicts application for disability pension after work-related medical rehabilitation for chronic back pain. Arch Phys Med Rehabil, 94, 2262-8.


Ahlstrom, L, Grimby-Ekman, A, Hagberg, M, & Dellve, L. (2010). The work ability index and single-item question: associations with sick leave, symptoms, and health – a prospective study of women on long-term sick leave. Scand J Work Environ Health, 36, 404-12.


El Fassi, M, Bocquet, V, Majeri, N, Lair, ML, Couffignal, S, & Mairiaux, P. (2013). Work ability assessment in a worker population: comparison and determinants of Work Ability Index and Work Ability score. BMC Public Health, 13, 305.


Tuomi, K, Ilmarinen, J, Jahkola, A, Katajarinne, L, & Tulkki, A. (1998). Work ability index, second revised version. Helsinki: Finnish Institute of Occupational Health.


Steyerberg, EW. (2009). Clinical prediction models. New York: Springer.


Fan, J, Upadhye, S, & Worster, A. (2006). Understanding receiver operating characteristic (ROC) curves. CJEM, 8, 19-20.


Bethge, M, Radoschewski, FM, & Gutenbrunner, C. (2012). The work ability index as a screening tool to identify the need for rehabilitation: longitudinal findings from the second German sociomedical panel of employees. J Rehabil Med, 44, 980-7.