Self-reported or register-based? A comparison of sickness absence data among 8110 public and private employees in Denmark

A survey on the Work Environment and Health in Denmark obtained higher response rates from public employees, women, and older employees. Self-reported sickness absence (SA) correlated highly with register-based SA data from the Danish SA register. In general, responders with few SA days under-reported their SA and responders with many SA days over-reported their SA.
Objectives The study aim was to examine (i) non-response bias between responders and non-responders, and (ii) whether the association between self-reported sickness absence (SA) and register-based SA differed by gender, age, sector, or physically demanding work.
Methods The responses of 8110 participants to a question on self-reported SA in the past 12 months in the Work Environment and Health in Denmark Survey (2014) were linked to 12 months of SA data from the Danish Register of Work Absence. We used logistic regression for the non-response analysis and Poisson regression to examine associations.
Results Responders had on average 0.5 days less SA per year than non-responders. Public employees had a higher response rate than private employees (approximately five percentage points), women had a higher rate than men (approximately nine percentage points), and older employees a higher rate than younger employees (approximately nine percentage points in ten years). Self-reported SA correlated highly with register-based SA (Spearman’s rank correlation=0.76). In general, responders with few SA days (<10) under-reported their SA while responders with many SA days (>30) over-reported their SA. Women under-reported significantly more than men (average difference one day); older employees under-reported significantly more than younger employees (difference between age groups 18–29 and 60–64 was 1.7 days). Differences between sectors or levels of physically demanding work were non-significant.
Conclusions Self-reported SA data may be influenced by non-response bias, and different accuracy in different demographic groups. When available, the use of register-based SA data is recommended.

Sickness absence (SA) data and research are used to guide occupational healthcare professionals and to inform policy-makers (1)(2)(3)(4). SA data can be self-reported or register-based. Self-reported SA data are generally obtained from occupational health surveys and are often measured with a single question regarding SA days in the past year (5)(6)(7)(8). Register-based SA data may be obtained from company registers, insurance data, occupational health service registers, or governmental registers (9)(10)(11).
Self-reported SA data are relatively easy to obtain (11). However, the data may be influenced by biases, eg, non-response bias and different reporting biases in different demographic groups. Non-response bias occurs when there is a systematic difference between responders and non-responders (12). For example, Martikainen and colleagues (13) estimated a 20-30% higher SA rate among non-responders compared to responders. Recall bias occurs when responders do not accurately recall past experiences. Severens and colleagues recommended a maximum recall period of 2 months after investigating different recall periods ranging from 2 weeks to 12 months (14). Still, most studies use a self-reported SA recall period of 3-12 months. Different biases in different demographic groups may lead to differential misclassification. It may occur when, eg, women under-report their SA more than men.
The use of register-based SA data in research depends on the accessibility and the quality of data. Company registers are in some cases accessible (9) and, when the data is used for salary calculations, the quality of the data is considered to be relatively high (10,11). National SA registers are only available in a few countries (10), and fewer yet have access to both short-term and long-term SA. Denmark maintains a SA register with high coverage and access for record linkage, which enables register-based follow-up of large populations.
Several studies have examined the validity of self-reported SA data among public-sector employees using different types of register-based SA data as reference (9)(10)(11). A Swedish (11) and a Danish study (10) found that responders under-reported their SA; a UK study (9) found under-reporting among women but not among men. The Swedish and UK-based studies both concluded good agreement between self-reported and register-based data (9,11), while the Danish study only concluded good agreement when the total annual length of SA was ≤1 week (10). Accuracy decreased with a higher number of SA days in all three studies. The UK study showed less accuracy among lower grade employees than higher grade employees, and the Swedish and UK studies indicated less accuracy among women compared to men (9,11). A recent meta-analysis on the reliability, validity and accuracy of self-reported SA (15) suggested that self-reported SA might serve as a valid measure in some correlational research designs, and added the cautionary note that the tendency to under-report SA could result in flawed policy decisions.
Although several original studies have compared self-reported with register-based SA, little is known about possible non-response bias in these studies. Moreover, solid knowledge on whether the association between self-reported and register-based SA differs by gender, age, sector (public/private) and physically demanding work (no/low/high) is lacking. New insights from a large study among public and private employees will help to better measure, interpret and apply SA data in research, policy and practice.
The present study investigates SA data from a large population of public and private employees in Denmark. Self-reported data from the Work Environment and Health in Denmark (WEHD) survey was linked to SA data from the Danish Register of Work Absence. The aims of the study were to (i) examine non-response bias regarding SA among responders and non-responders, and (ii) investigate the association between self-reported and register-based SA and whether the association differed by gender, age, sector and physically demanding work.

Design and study population
This study linked self-reported SA data from the WEHD survey (16) with register-based SA data from the Danish Register of Work Absence (17).
The Danish Register of Work Absence does not include all employees in Denmark. Of the N=35 023 in the random survey sample 'WEHD-2014', N=19 685 employees (56%) could be linked to the Danish Register of Work Absence. A total of N=14 171 employees (72%) could be traced full 12 months back in the register, and of those N=8308 (59%) employees had responded to the WEHD survey. The SA question was answered by N=8110 employees (see figure 1), N=5239 (65%) public employees (N=1194 state, N=4045 region and municipality) and N=2871 (35%) private employees.
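The sample flow above can be checked with a few lines of arithmetic. The sketch below only re-derives the stepwise percentages from the counts quoted in the text; the variable and function names are illustrative and not part of the study.

```python
# Counts copied from the paragraph above; names are illustrative only.
flow = [
    ("random survey sample", 35023),
    ("linked to register", 19685),
    ("traceable 12 months back", 14171),
    ("responded to survey", 8308),
]


def stepwise_shares(steps):
    """Return each step's count as a rounded percentage of the previous step."""
    shares = {}
    for (_, previous), (name, count) in zip(steps, steps[1:]):
        shares[name] = round(100 * count / previous)
    return shares


shares = stepwise_shares(flow)

# Sector split among those who answered the SA question.
answered_sa = 8110
public = 1194 + 4045   # state + region and municipality
private = 2871
public_share = round(100 * public / answered_sa)
private_share = round(100 * private / answered_sa)
```

Each percentage is taken relative to the preceding step, not to the original sample, which is how the 56%, 72% and 59% in the text are to be read.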

Data sources
The WEHD survey. The Danish National Research Centre for the Working Environment has been conducting the WEHD survey since 2012 and will continue biannually until 2020 (16). In 2014, Statistics Denmark drew a random sample of N=35 023 employees, aged 18-64 years, employed for ≥35 hours per month, with an income of ≥3000 DKK (approximately €400) per month in the past 3 months. The employees in the sample received an invitation letter to participate in the WEHD survey and complete a web-based questionnaire. Non-responders received a reminder with a paper version of the questionnaire. From 19 March to 15 August 2014, N=16 622 employees responded (response rate 47%).
The questionnaire included questions regarding the work environment and health. In the present study, we used the SA question ["In total, how many work days with sickness absence have you had in the past year? (number of days)"]. We also used information about physically demanding work ("How physically demanding do you normally perceive your current work to be?", range 0-10).
The Danish Register of Work Absence. The Danish Register of Work Absence is located at Statistics Denmark (17) and includes absence data from all public companies/institutions and a large sample of private companies (N=2600). The sample includes all private companies with >250 employees and a representative sample from companies with 10-250 employees. Data from companies with <10 employees is not collected. Start and end dates of absence periods, ie, "own sickness", "child sickness", "occupational injury" and "maternity and adoption leave" are recorded (17). In the present study, we counted SA days due to "own sickness" and "occupational injury" during the past 12 months, assuming a five-day work week from Monday to Friday. No information was available on work schedules, ie, vacations or work during weekends.
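The counting rule described above (absence periods with start and end dates, a Monday-to-Friday work week, a 12-month window) could be sketched as follows. This is a minimal illustration, not the study's actual procedure: the function name, the inclusive treatment of end dates, the handling of overlapping periods, and the example dates are all assumptions.

```python
from datetime import date, timedelta


def weekday_absence_days(periods, window_start, window_end):
    """Count Monday-Friday days covered by absence periods inside a window.

    periods: iterable of (start, end) date pairs, both ends inclusive
    (an assumption). Days covered by overlapping periods are counted once.
    """
    days = set()
    for start, end in periods:
        current = max(start, window_start)
        last = min(end, window_end)
        while current <= last:
            if current.weekday() < 5:  # Monday=0 ... Friday=4
                days.add(current)
            current += timedelta(days=1)
    return len(days)


# Hypothetical absence periods for one employee.
example_periods = [
    (date(2014, 3, 6), date(2014, 3, 11)),   # Thursday to Tuesday, spans a weekend
    (date(2014, 3, 17), date(2014, 3, 21)),  # partly outside the window below
]
example_count = weekday_absence_days(
    example_periods, date(2013, 3, 19), date(2014, 3, 18)
)
```

Because work schedules are unknown, a rule like this miscounts for employees who work weekends or part-time weeks, which is the limitation the text notes.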

Sociodemographic variables
Data on age and gender was obtained from the Danish Personal Identification Register (CPR-register); data on sector (private/state/regions & municipality) was obtained from the Danish Register of Work Absence.

Statistical analysis
The non-response analysis included WEHD responders (N=8308) and WEHD non-responders (N=5863). Descriptive analyses revealed a non-linear relationship between register-based SA and probability of response. We fitted a logistic regression model using the outcome response/non-response, and used linear splines for SA to model the non-linearity (18). The association between response/non-response and SA was described by three odds ratios (OR), that is, SA was included in the model as a binary variable "SA days (any versus none)" (value 0 if no SA, otherwise value 1), a linear spline with SA from 1-10 days, and a linear spline with SA >10 days. Age, gender, and sector were included as covariates. We performed three extra analyses to test if multiplicative interaction terms were significant for SA and gender, SA and age, or SA and sector. All interactions were non-significant; no interaction term was included in the final analysis.
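As an illustration of how SA can enter a logistic model through "three odds ratios", the sketch below builds the three terms for one employee. The exact coding is not given in the text; placing the first slope's origin at one SA day and the knot at 10 days is an assumption, and the function name is hypothetical.

```python
def sa_spline_terms(sa_days, knot=10):
    """Return (any_sa, days_from_1_to_knot, days_above_knot) for one employee.

    Assumed coding: a 0/1 indicator for any SA, a linear term that rises
    from 0 at one SA day to knot-1 at the knot, and a linear term that
    counts days beyond the knot.
    """
    any_sa = 1 if sa_days > 0 else 0
    days_from_1_to_knot = min(max(sa_days - 1, 0), knot - 1)
    days_above_knot = max(sa_days - knot, 0)
    return any_sa, days_from_1_to_knot, days_above_knot
```

With this coding, each regression coefficient corresponds to one of the three odds ratios: the jump from none to any SA, the per-day change between 1 and 10 days, and the per-day change above 10 days.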
The association between self-reported data and register-based data was examined among N=8110 WEHD responders. First, we calculated the correlation between self-reported and register-based SA. Then we performed descriptive analyses that showed a non-linear relationship between self-report and register-based SA. In particular, the self-report of employees who reported >100 SA days differed from register-based SA. A Poisson regression analysis with linear splines (18) showed a significant association between self-report and register-based SA, when self-reported sickness absence was ≤100 days. The association was no longer present when self-reported SA was >100 days. Therefore, the small group of employees reporting >100 SA days (N=94) was excluded from the Poisson regression analysis. The sample was further reduced by N=16 employees due to missing responses to the "physically demanding work" question, ie, the final analyses were conducted among N=8000 employees. Among employees with self-report SA ≤100 days, the association with register-based SA could be modelled in a Poisson regression analysis with self-report SA operationalized as: "self-reported days (any versus none)" (value 0 if answer was none, otherwise value 1) and "Log2(self-report 1-100)" (value 0 if self-report was none and the binary logarithm if self-report was from 1-100 days). We included gender, age, sector, and physically demanding work in the model to examine if any of those factors added further significant explanatory value to the model. We compensated for over-dispersion in the model using a scale parameter.
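The two self-report terms described above can be written out directly. The sketch below follows the stated definitions; only the function name and the decision to raise an error outside 0-100 days are assumptions.

```python
import math


def self_report_terms(days):
    """Return ("any versus none", "Log2(self-report 1-100)") for one response."""
    if not 0 <= days <= 100:
        # Employees reporting >100 days were excluded from the regression.
        raise ValueError("self-reported days must be between 0 and 100")
    any_days = 1 if days > 0 else 0
    log2_days = math.log2(days) if days > 0 else 0.0
    return any_days, log2_days
```

Note that one self-reported day also gives a logarithm of 0, so employees with none and with one day are separated only by the binary term.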

Ethics
The Danish Data Protection Agency approved the WEHD survey, journal number 2012-54-0017. According to Danish law, questionnaire-based and register-based studies do not need approval by committees of ethics, nor do they need informed consent (19,20).

Results
Non-response analysis of the WEHD survey (responders N=8308, non-responders N=5863)
Responders of the WEHD survey (N=8308) had on average 7.84 register-based SA days and non-responders (N=5863) 8.37 register-based SA days. The difference of 0.5 SA days per year (6% difference) was statistically significant (Wilcoxon rank sum test; P=0.04).
The probability of response, ie, response rate, depended significantly on gender, age, sector, and number of register-based sickness absence days. Table 1 shows mutually adjusted odds ratios (OR adj) for response. Employees with one SA day had the highest probability for response, ie, the OR adj of "register-based SA (any versus none)" was 1.12 [95% confidence interval (CI) 1.01-1.23], and the OR adj for responders with 1-10 register-based SA days was 0.96 (95% CI 0.95-0.97) per one-day increase. Responders with >10 register-based SA days had no change in probability of response for increase in register-based SA [OR adj 1.00 (95% CI 1.00-1.00)]. The probability of response significantly increased with age (OR adj 1.04 per year), men were significantly less likely to respond than women (OR adj 0.70), and private sector employees were significantly less likely to respond than region and municipality employees (OR adj 0.84). We further tested whether the association between SA and non-response differed by gender, age or sector; no significant interactions were found. Figure 2 illustrates the results of table 1. The figure shows that the difference in response rate between those with 1 register-based SA day and those with 10 was approximately eight percentage points. The difference between public sector employees and private sector employees was approximately five percentage points, the difference between women and men was approximately eight percentage points, and the average increase in response rate per ten years increase in age was approximately nine percentage points.
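To see how the three reported odds ratios combine, the sketch below multiplies them into a single odds ratio for response relative to employees with no register-based SA. It assumes the per-day slope starts at one SA day (consistent with one-day employees having the highest response probability) and uses the rounded estimates quoted in the text; the names are illustrative.

```python
OR_ANY = 1.12            # register-based SA, any versus none
OR_PER_DAY_1_TO_10 = 0.96  # per one-day increase between 1 and 10 days
OR_PER_DAY_ABOVE_10 = 1.00  # per one-day increase above 10 days (rounded)


def response_odds_ratio(sa_days):
    """Odds ratio for response versus employees with 0 register-based SA days."""
    if sa_days == 0:
        return 1.0
    days_1_to_10 = min(sa_days, 10) - 1
    days_above_10 = max(sa_days - 10, 0)
    return (
        OR_ANY
        * OR_PER_DAY_1_TO_10 ** days_1_to_10
        * OR_PER_DAY_ABOVE_10 ** days_above_10
    )
```

Under these assumptions the combined odds ratio peaks at one SA day (1.12), falls to roughly 0.78 at ten days, and stays flat beyond that. Converting odds ratios into percentage-point differences, as in figure 2, additionally requires a baseline response rate, which is not reproduced here.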

Association between self-reported and register-based SA
The Spearman's rank correlation between register-based and the self-reported SA was 0.76 (0.78 among region and municipality employees, 0.74 among state employees, and 0.71 among private employees).
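For readers unfamiliar with the statistic, Spearman's rank correlation is the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below is a plain-Python illustration; the function names and the example data are hypothetical and do not come from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation, handling ties via average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical self-reported and register-based SA days for four employees.
rho = spearman([0, 0, 3, 10], [0, 1, 2, 30])
```

Because only ranks matter, a high Spearman correlation shows that employees are ordered similarly by both sources; it does not show that the reported numbers of days agree.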
Descriptive statistics for the associations between self-reported and register-based SA (N=8110). Table 2 shows the descriptive data for the associations between self-reported and register-based SA. Based on self-report, 28% of the employees (N=2242) reported no SA days in the past 12 months (table 2). According to the register, 30% had no SA days in the past year (data not shown).
The difference between self-report and register-based SA increased the more SA days the employee had. For example, employees with 0 self-report days had on average 1.2 register-based days, and the difference between self-report and register-based days was at maximum 2 days for 90% of the responders. In contrast, employees with >100 self-report days had on average 146 fewer register-based days than self-reported days, and the difference between register-based and self-report was >10 days for 97% of the responders (see table 2). The relative difference between register-based and self-reported SA (ie, register-based minus self-report divided by self-report) was on average 5.8% for those who reported 11-100 days of SA, ie, the relative under-reporting was 5.8%.
When data was restricted to employees with 0-100 self-reported days (N=8016), we found an average of 1.2 excess days in register-based SA. Women under-reported more than men, region and municipality employees under-reported more than state and private sector employees, and employees with physically demanding work under-reported more than employees with no physically demanding work. Excess days in register-based sickness absence increased with age, from 0.1 days among the youngest employees (aged 18-29 years) to 1.8 days among the oldest employees (aged 60-64 years), ie, older employees under-reported more than younger employees.
Adjusted rate ratios for the association between self-report and register-based SA (N=8000). The adjusted rate ratios (RR adj) for the associations between self-report days and register-based days are shown in table 3 and illustrated in figure 3. The RR adj for "self-reported SA (any versus none)" was 1.45 (95% CI 1.20-1.75) and for "LOG2(self-reported SA 1-100)" was 1.80 (95% CI 1.76-1.84), which can be interpreted as follows: responders with any self-report days had more register-based days than those with no self-report days, and each time the self-reported days doubled, the register-based days increased by 80%. The number of register-based days was lower for men than women given the same self-reported days (men versus women RR adj 0.89) and higher for older than younger employees given the same self-reported days (RR adj 1.01 per year increase in age). No statistically significant associations were found for sector and physically demanding work. Figure 3 illustrates these results for employees with no physically demanding work in the region and municipality sector. The figure shows that employees with few self-reported days under-reported their SA and employees with many self-reported days over-reported their SA days. The degree of over-report/under-report depended on age and gender, eg, for 30-year-old women, the change from under- to over-report occurred at approximately 20 self-reported days.
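The relative difference defined above (register-based minus self-report, divided by self-report) can be expressed as a one-line function. The sketch is illustrative only; the function name and the example values are hypothetical and are not study data.

```python
def relative_difference(register_days, self_report_days):
    """(register-based - self-report) / self-report; positive means under-reporting."""
    if self_report_days <= 0:
        raise ValueError("relative difference is undefined for 0 self-reported days")
    return (register_days - self_report_days) / self_report_days


# Hypothetical employee: 20 self-reported days, 21 register-based days.
example = relative_difference(21, 20)
```

A positive value means the register holds more days than the employee reported (under-reporting); a negative value means over-reporting. The measure is undefined for employees reporting 0 days, which is why absolute differences are used for that group.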
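The interpretation of the two rate ratios can be made concrete with a short calculation. The sketch below combines the reported point estimates into a rate ratio relative to employees reporting 0 days, holding the other covariates equal; it ignores the confidence intervals, and the names are illustrative.

```python
import math

RR_ANY = 1.45           # "self-reported SA (any versus none)"
RR_PER_DOUBLING = 1.80  # per unit increase of log2(self-reported days)


def register_rate_ratio(self_reported_days):
    """Rate ratio for register-based days versus employees reporting 0 days."""
    if not 0 <= self_reported_days <= 100:
        raise ValueError("the model covers 0-100 self-reported days only")
    if self_reported_days == 0:
        return 1.0
    return RR_ANY * RR_PER_DOUBLING ** math.log2(self_reported_days)


# Equivalent power-law exponent: 1.80 ** log2(s) equals s ** log2(1.80).
exponent = math.log2(RR_PER_DOUBLING)
```

Since the exponent is about 0.85, which is below 1, register-based days grow more slowly than self-reported days. This is consistent with the pattern in figure 3, where under-reporting at few self-reported days turns into over-reporting at many.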

Discussion
This comprehensive study compared self-reported and register-based SA with a focus on non-response bias and whether the associations between self-reported and register-based SA differed by gender, age, sector, and physically demanding work. Responders were more likely than non-responders to be public employees, older, and women, and had on average less register-based SA. We found a high correlation between self-reported and register-based SA.
The association between self-reported and register-based SA showed significant biases depending on gender, age, and the total number of self-reported days.

Non-response analysis
Responders had on average 0.5 days less register-based SA days than non-responders, ie, we found a difference of 6%. This is small compared to Martikainen et al (13)  register-based work days with SA while other studies counted register-based calendar days. Both in our study and the others, the self-reported question asked for work days, but it is possible that employees with many SA days gave an answer that reflected calendar days and not work days. The change from under-to over-report could perhaps also be related to the fact that employees tend to forget SA periods of a few days and better remember long-term SA periods, which may also be related to the underlying cause of SA.

Strengths and limitations
The strengths of this study are the large sample size, including both public and private employees, and the access to a detailed, national SA register. A limitation of this study concerns the lack of registered information on work schedules. In the present study, we assumed a five-day work week from Monday to Friday, which is not true for all employees. Another limitation concerns missing data for some variables, self-reported and register-based, resulting in the exclusion of employees from some analyses. Although we still retained a large sample size, the sample is not a random sample of all Danish employees.

Concluding remarks
When using self-reported SA, the results may be influenced by non-response bias and different biases in different demographic groups. In general, employees with few SA days (1-3) had a higher response rate than employees with many SA days (≥10). On average, women under-reported their SA more than men (difference approximately 1 day), and older employees under-reported more than younger employees (difference approximately 1.5 days), but if the employee had only a few SA days the accuracy was relatively high for all employees (90% of responders with 0 self-reported days had at maximum 2 register-based SA days). The overall correlation between self-reported and register-based SA was relatively high (Spearman's rank correlation=0.76). From our study, we conclude that self-reported SA may be used when register-based SA is not available, but caution is recommended.