Analyzing sickness absence with statistical models for survival data

with statistical Objectives Sickness absence is the outcome in many epidemiologic studies and is often based on summary measures such as the number of sickness absences per year. In this study the use of modern statistical methods was examined by making better use of the available information. Since sickness absence data deal with events occurring over time, the use of statistical models for survival data has been reviewed, and the use of frailty models has been proposed for the analysis of such data. Methods Three methods for analyzing data on sickness absences were compared using a simulation study involving the following: (i) Poisson regression using a single outcome variable (number of sickness absences), (ii) analysis of time to first event using the Cox proportional hazards model, and (iii) frailty models, which are random effects proportional hazards models. Data from a study of the relation between the psychosocial work environment and sickness absence were used to illustrate the results. Results Standard methods were found to underestimate true effect sizes by approximately one-tenth [method i] and one-third [method ii] and to have lower statistical power than frailty models. Conclusions An uncritical use of standard methods may underestimate the effect of work environment exposures or leave predictors of sickness absence undiscovered.

Several recent epidemiologic studies of sickness absence report the results of Poisson regression models based on the number of sickness absences or the total number of sickness absence days for each person in a given follow-up period (1,2). While the number of workdays lost due to sickness absence is of great interest to employers and others, this contribution is limited to a discussion of the analysis of sickness absences. The problem of how to measure sickness absence has been discussed in the literature (3).
The statistical methods applied in epidemiologic studies of sickness absences are almost completely limited to studies that focus on the frequency of sickness absences using Poisson regression or the time to the first sickness absence using the Cox proportional hazards model.
Poisson regression analysis describes effects of covariates on sickness absence rates through relative effect estimates referred to as rate ratios. Overdispersion is often encountered in sickness absence data, with some people having many absences and many having none, and it is therefore often included in the Poisson regression model (4)(5)(6). Unequal follow-up time (eg, when people quit their job during the follow-up period) is also a typical situation, and the actual follow-up time should then be included as an offset. The actual time under risk can be computed when data from company registers on sickness absences are used. Cox analysis of time to the first event is a standard epidemiologic approach, but, as a consequence, information is discarded because people are censored after their first sickness absence.
Frequencies and the time to events are different expressions of the same pattern, and, if no one has more than one period of sickness absence, the two methods give the same result. This is the case in a standard survival analysis in which no person can die more than once. In studies of sickness absence, however, it is common that people have more than one sickness absence. This situation yields data with multiple events per person. Those with several sickness absences will Analyzing sickness absence data then influence the analysis more than those with few sickness absences.
It is likely that the risk changes when people return to work after a sickness absence, and the standard Poisson regression approach does not take this possibility into account. Furthermore, if this is the case, then the Cox analysis of time to first event focuses on the onset of the first sickness absence and may not be immediately comparable to the analysis of sickness absences in general.
This paper examines how modern survival analysis can be used when the first and last day of all absences (sickness absences and other absences) are recorded (eg, when data come from company registers on sickness absences).

Methods
The Intervention Project on Absence and Well-being (IPAW) is a controlled intervention study with 5 years of follow-up (7). The following three types of workplaces were included: workplaces assigned for later intervention, control workplaces with many sickness absences, and control workplaces with few sickness absences. Intervention effects were not included in our study.
Data from one of three organizations-a large pharmaceutical company-taking part in IPAW were considered in this study. A total of 731 persons participated, and for 725 of these persons sickness absence data were available. Data on exposures in their work environment came from questionnaires, ascertained between January and April 1997. Sickness absence data for the period April 1997 to April 1998 were obtained from workplace registers of sickness absence. The register data for each person consisted of information about the first and last day of every absence period and the type of absence (sickness absence or other absence). An example is shown in figure 1, where data from a person with two sickness absences and an absence due to something other than sickness are shown.
Note that an entire year, consisting of 365 days, is shown in figure 1. In other situations weekends and holidays could be excluded from the total risk time. The most important thing in connection with this approach is that the same procedure is used for all persons. The person shown in figure 1 was employed and was, therefore, at risk of being sick listed for 85+(166-89)+(310-180)+(365-330)=327 days, and two events occurred. With the use of the standard Poisson regression approach, this person would have two events as the outcome, and the total time, 327 days, at risk could be included as the offset variable. With time to first event as the outcome, this person would be at risk for 85 days and have one event.
The effect of four psychosocial work environment factors (skill discretion, decision authority, predictability, and meaning of work) on sickness absence has been studied in the examples. These factors have been described in detail elsewhere (7).
The following three methods for the analysis of sickness absences were compared: (i) the "standard" Poisson regression approach using a single outcome variable (number of sickness absences) for each person, (ii) Cox analysis of time to first event, later events being ignored, and (iii) frailty models allowing dependence between multiple events. Of these three methods, the first and second are simple and easy to implement with standard software, but they are inefficient in the sense that not all of the available information is used The third method is more efficient but requires assumptions describing the heterogeneity in the population.
The results of a simulation study of the precision, bias, type I error, and statistical power of these methods under different conditions have also been presented.

Poisson regression analysis
Poisson regression analysis assumes that the number of sickness absences, N, in a given follow-up period, say, a year, is Poisson-distributed and describes effects of covariates on the logarithm of the mean, µ=E(N). The analysis yields relative effect estimates that are sometimes referred to as rate ratios. Two problems are often encountered for sickness absence data, overdispersion and unequal follow-up time.
For Poisson distributed data the variance is equal to the mean, but for sickness absence data the variance is frequently substantially larger than the mean. A simple way to model this situation is to allow the variance to have a multiplicative overdispersion factor, V(N)=ρµ. In such a case, the parameter estimates are not affected, but   the confidence intervals are. When a scale parameter, ρ, is included, a quasi-likelihood function (8-9) is used, but since most of the asymptotic theory for log likelihoods also applies to quasi-likelihoods, computing standard errors is justified. This approach is often used in studies of sickness absence (4)(5)(6). Ignoring overdispersion leads to a model with poor fit to the data, where the standard errors will be unrealistically small. Therefore, evaluating significance based on these models can yield incorrect results.
The Poisson distribution is only reasonable if all persons are followed for the same period of time, and this approach is not always the case for sickness absence data (eg, when people quit their job or are on leave for a period of time during the follow-up period). This possibility can be included in the Poisson regression model if the logarithm of the actual follow-up time is used as an offset (a covariate with a constant regression coefficient equal to 1). Unequal follow-up time is often encountered in survival analysis, and it is referred to as censoring. Indeed, in most studies of events occurring over time, it is crucial that actual time at risk be taken into account. When data come from company registers of sickness absence, it will often be possible to compute the actual time at risk.

Cox analysis of time to first event
Events occurring over time can be described by the number of events occurring in a specified period (eg, 1 year) or by waiting times between events. A model for the occurrence of events can be specified using the intensity or hazard function λ(t) conditionally on the event history up to time t. This function specifies that the risk of an event in the interval from t to t + h is λ(t) · h if the person is at risk just before time t. The Cox proportional hazards model assumes that the intensity function for person i with covariate vector X i is given by where λ 0 (t) is a nonnegative function specifying the hazard for persons with all covariates equal to 0, called the baseline hazard, and Y i (t) is an indicator equal to 1 if person i is at risk just before time t and 0 otherwise. The ratio of hazards for two persons at risk is constant over time, and, for this reason, the model is called the proportional hazards model. For given X i , the mean of the number, N i , of sickness absences in a 1-year followup period is A Poisson regression model for the number, N i , of sickness absences will thus estimate the same parameters if all risk indicators Y i (t) are equal to one (10). This assumption implies that people are at risk again immediately after an event, and therefore it does not apply to sickness absence data because people will not return to work immediately after reporting sick.

Frailty models
Models given by equations 1 and 2 are often unrealistic because the covariate vectors cannot explain the variation in the rate of events between persons, a phenomenon referred to as overdispersion (compare with the preceding section on Poisson regression analysis). More general models, called frailty models, are based on hazard functions including random person effects, u i , called frailties, arising from some distribution. For practical uses of such a model, the standard approach (11) is to assume that u i follows a gamma distribution with mean E(u)=1 and unspecified variance V(u)=φ. Models based on the hazards (3) allow correlation between multiple events. Furthermore, the random person effect can be used to quantify the heterogeneity in the population through the variance (φ). The number, N i , of sickness absences in a 1-year follow-up period has the same mean structure as equation The frailties are characteristics of the persons and can be viewed as, for example, susceptibility. As they are unobserved variables, the EM (expectation-maximalization) algorithm (12) can be used to estimate the parameters (13). Using gamma-distributed frailties in a model such as equation 3, a maximum likelihood solution implemented in computer program R (14) is available (15). Frailty models quantify the person variation and yields an estimate of the heterogeneity, V(u)=φ. Confidence intervals for φ can also be computed (15).

Results
In the year following the exposure measurement, the 725 persons, for whom sickness absence data were available, had a total of 2357 sickness absences, and 1160 other absences (modeled as time not at risk using censoring). Furthermore, observations were censored at the end of the follow-up period (April 1998). The mean number of sickness absences was 3.2 (median 2), 158 persons had no sickness absences, and one person had 60 absences.
The three methods, Poisson regression, Cox analysis of time to first event, and frailty models, were compared using the IPAW data. The Poisson regression models included a scale parameter to account for overdispersion and the logarithm of the actual time at risk as an offset rom zero (results not shown). The heterogeneity between persons was thus substantial, but it is likely that the heterogeneity could be even larger in studies of people from many different organizations, for example, in general population studies.

Simulation study
A simulation study was used to compare the three methods (Poisson regression, Cox analysis of time to first event, and frailty models). The Poisson regression models included a scale parameter to account for overdispersion and the logarithm of the actual time at risk as an offset variable.
Data sets with 600 persons were simulated using exponentially distributed waiting times, where the log of the intensity depended on a single standard normally distributed predictor, x, and a gamma distributed frailty, u, with E(u)=1 and V(u)=φ: Time to other absence did not depend on the value of the predictor and the frailty. The same was true for the duration of sickness absence. Time until the occurrence of other absence was simulated to have a mean that was six times greater than the average duration of sickness absence. Parameter estimates β and the rejection rates for tests of the null hypothesis, H : β=0, were studied under different conditions. Data sets were simulated using the values exp(β)= 1.00, 0.97, 0.94, 0.91 for the true effect size (exp(β)=1.00 corresponding to the null hypothesis), and the values φ=0, 0.5, 1.0 for the heterogeneity (the variance in the distribution of the frailty u). The φ=0.5 was chosen based on the value found in the IPAW data, and the value φ=1.0 corresponded to situations with more heterogeneity (eg, general population studies).
The mean and the empirical standard deviation of the estimates β were computed for each combination of true effect size and heterogeneity. The corresponding effect estimates and 95% confidence intervals are shown in table 2.
All three methods provided unbiased effect estimates under the null hypothesis. Poisson regression underestimated the effects slightly for large values of the heterogeneity. The Cox analysis of time to first event provided correct effect estimates when φ=0, as would be expected, but underestimated the effects as the heterogeneity increased. The frailty model was the model used to generate the data and, of course, estimated the effects correctly. Poisson regression analysis without use of an offset variable was also considered, and it consistently yielded effect estimates that were too small. This problem increased as the heterogeneity increased (results not shown).
variable. Estimated rate ratios (RR) adjusted for the effect of gender, age, and intervention assignment are shown in table 1. Hazard ratios (HR) adjusted for the effect of gender, age, intervention assignment from the Cox analysis of time to first event, and frailty models are also shown in table 1.
In the Cox analysis of time to first event, no work environment factors were significantly associated with sickness absence.
According to the frailty models, all of the factors were significantly associated with sickness absence. An increase of one standard deviation for skill discretion was associated with a 9% lower risk of sickness absence (95% CI 1-16), whereas a corresponding increase in decision authority was associated with a 15% lower risk of sickness absence (95% CI 8-21). For predictability and meaning of work, the effects equaled those found using Poisson regression. On the average, Poisson regression thus underestimated the effects by (3=9+2=15+0+0)=4=12%.
For all four of the fitted frailty models, the estimated heterogeneity differed significantly from 0. Estimates of the heterogeneity φ and the corresponding 95% confidence intervals were computed. The estimated heterogeneity in all of the analyses was 0.45, and the confidence intervals clearly indicated that the heterogeneity differed Table 1. Impact of psychosocial work environment factors on sickness absences in a 1-year follow-up period according to Poisson regression, Cox analysis of time to first event, and frailty models-data from workplaces taking part in the Intervention Project on Absence and Well-being (N=725). (RR = rate ratio, 95% CI = 95% confidence interval, HR = hazard ratio)

Christensen et al
For φ=0.5, the value closest to that observed in the IPAW data example, average effect estimates (3%, 6%, and 9%, respectively) were reduced by one-third (to 2%, 4%, and 6%, respectively) in the Cox analysis of time to first event and by one-tenth (to 3%, 5%, and 8%, respectively) in the Poisson regression model.
The information presented in table 2 describes the estimation precision and is based solely on the effect estimates computed from the 1000 simulated data sets. Table 3 and table 4, on the other hand, describe properties of statistical tests. Table 3 shows the empirical type I error rates for the three methods (Poisson regression, Cox analysis of time to first event, and frailty models) for different values of the heterogeneity φ. Wald tests using the ratio of effect estimates and asymptotic standard errors were used.
For the Poisson regression model, the type I error was too small when φ>0. The type I error was close to the nominal 5% for the two other methods. A Poisson regression analysis without the overdispersion parameter was also considered, but the type I error was much too large when φ>0 (results not shown). Table 4 shows the empirical rejection rates for a 5% level Wald test for the three methods for different values of the true effect size and the heterogeneity.
The Cox analysis of time to first event had less statistical power than the two other methods, and the frailty model was the most powerful. For all three methods, the power decreased as the heterogeneity increased.

Discussion
Poisson regression is often used in epidemiologic studies of sickness absence. The importance of including actual time at risk as an offset variable and introducing a scale parameter to account for overdispersion in Poisson regression is well-known. Inclusion of a scale parameter addresses the problem of overdispersion, and the problem of unequal follow-up time can be addressed using an offset variable when the actual follow-up time is known.
Poisson regression implicitly assumes that people are at risk of sickness absence again immediately after a sickness absence, and, furthermore, it does not distinguish between the events occurring for different persons and multiple events occurring for the same person. These problems can be addressed using the frailty models discussed in this paper.
Models based on intensities such as those in equations 1 and 3 are natural descriptions of events occurring over time. The Cox proportional hazards model is a well-known model for estimating differences in risk when (possibly censored) survival times are analyzed. Table 3. Empirical type I error rates a for a 5%-level Wald test for the Poisson regression, the Cox analysis of time to first event, and the frailty models depending on heterogeneity φ (the variance in the distribution of the frailty).  Table 4. Empirical rejection rates a for a 5%-level Wald test for the Poisson regression, the Cox analysis of time to first event, and the frailty models depending on true effect size and heterogeneity φ (the variance in the distribution of the frailty).   When the Poisson regression approach is used, multiple events for the same person are assumed to be independent. In the frailty model, the multiple events are assumed to be conditionally independent, given the person effect (the frailty). Frailty models estimate relative risks from repeated events while taking the heterogeneity between persons into account.
In this paper we have shown that the Poisson regression analysis underestimated effects, and this problem increased as the heterogeneity (φ) increased, but was also seen for φ=0. This finding stresses the point that Poisson regression and time-to-event analyses only yield identical results if persons are at risk again immediately after an event has occurred, an assumption that is unrealistic for sickness absence data. An analysis of the time to first event also underestimated effects when φ>0. In these cases the Cox proportional hazards model is not correct because the effect of the frailty (u) is not included. These findings can be regarded as bias caused by an omitted covariate (15), and, when the effect of u is ignored the proportional hazards assumption is violated. The Poisson regression model uses all of the available information, but has two potential problems. It ignores the fact that the multiple events for each person can be correlated, and it does not distinguish between risk time before the first sickness absence and risk time upon return to work after a sickness absence. The Poisson regression model has two potential problems. It ignores the fact that the multiple events for each person can be correlated, and it does not distinguish between risk time before the first sickness absence and risk time upon return to work after a sickness absence. The first problem is addressed in frailty models, as the introduced random person effect has the same value for all the events for a person, and they are thus correlated.
The second problem is also very important, as it is likely that the risk of sickness absence is higher when a person returns to work after a sickness absence. For both the Poisson regression model and the frailty models, previous sickness absence can be included as a timedependent covariate. In reference again to figure 1, the contribution of this person to the analysis is four sets of time at risk and two events, and, while covariates like gender and (baseline) age will not change, the number of previous sickness absences could be included as a time-dependent covariate, taking the values 0, 1, 1, and 2 in the four risk time sets, respectively. With the Poisson regression model, it is, therefore, possible to determine whether the number (or the duration) of previous sickness absences affects the risk of sickness absence, but this model would be based on the assumption that the effect of previous absences is the same for all persons. Including previous sickness absence as a time-dependent covariate is also possible in frailty models; thus both a shared effect of previous sickness absence (common to every person in the study) and an individual effect model (modeled through the frailty) can be added. This approach, with both previous event history and individual frailty has been used in psychiatric epidemiology (16).
When the data from the IPAW study were analyzed, the Poisson regression model yielded effect estimates of the same magnitude as those of the frailty models, but the Cox analysis of time to first event was unable to disclose significant associations between work environment factors and sickness absence, the loss of information thus being illustrated. The simulation study yielded similar findings. The Cox analysis of time to first event was, by far, the least powerful approach. The Poisson regression model was almost as powerful as the frailty models. For all three methods the power decreased as the heterogeneity increased.
The frailty model estimates differences in risk after the heterogeneity between persons has been accounted for. Furthermore, it quantifies the heterogeneity in the population. The gamma distribution is chosen for the frailties out of mathematical convenience. Other frailty distributions are available (11), but the computationally intensive EM algorithm (12) is required in order to perform full maximum likelihood estimation.
Marginal models are an alternative to frailty models. Such models estimate population-averaged risk associated with each exposure. The risk of a random sample of exposed persons is compared with that of a random sample of unexposed persons. These models only specify the marginal means, whereas frailty models are models for the entire process. Thus marginal models can yield a simple interpretation of covariate effects on the means. However, for a simple marginal model, in general, the effect of covariates on the intensity will not be simple. On the other hand, if the frailty model is correct, the marginal model will yield biased estimates. It is well-known that, for logistic regression models with random effects, the marginal model yields attenuated effects (17)(18), and a similar result applies for gamma frailty models (19). Both types of model are capable of accounting for intraindividual correlation caused by unobserved heterogeneity. We have chosen to concentrate on frailty models, mainly because they are a rather straightforward extension of both the Poisson model and the Cox model for time until the first event. In addition, frailty models rely on less restrictive assumptions concerning the censoring pattern (10), although it should be noted that they rely critically on the assumption that times not at risk do not depend on the frailty.
The frailty models have many potential applications in work environment research. Examples of relevant fields of research are studies of accidents, critical incidents, production errors, violence, harassment, and Christensen et al psychological hassles at work. The methods could also be employed in studies of end points having the character of periodic attacks such as migraine, tension-type headache, low-back pain, cervicobrachial disorders, and allergic reactions, or in studies of variations in physiological parameters. A condition for using these methods is that the information on the events is collected in a way that the time between the events can be registered reliably.
The example presented in this paper used detailed information to calculate the time at risk. As long as information about the events is available, frailty models can be applied, and the findings are likely to be relatively robust, as relative comparisons between the risk of sickness absence is studied. However, if some persons are at risk for considerably shorter periods of time, ignoring this situation can be a source of bias. One example would be when women on maternity leave were considered to be at risk of sickness absence.
Standard methods were found to underestimate true effect sizes and to have lower statistical power than frailty models. Uncritical use of standard methods may thus underestimate the effect of work environment exposures on sickness absence or leave predictors of sickness absence undiscovered. Frailty models can be fitted using the freely available statistical software package R (14), and SAS macros for gamma frailty models using the EM algorithm are also available (http://www.biostat.mcw. edu/software/SoftMenu.html) (20, p 416).
This contribution has discussed the analysis of sickness absences. Other aspects of sickness absence, like the number of workdays lost due to sickness absence or the total cost of sickness absence, could be considered as the relevant outcome measure. The choice of outcome measure is crucial in the study design, and the choice of the statistical model should be made subsequently. Analyzing sickness absences does not yield information about the duration of the sickness absences or about the number of sickness absence days. Frailty models can also be used to address these aspects by regarding people who are sick to be at "risk" of returning and analyzing the (possibly censored) time until return.