Since the late 1970s, job-exposure matrices (JEM) have been increasingly used to obtain exposure estimates in occupational epidemiological studies. A JEM is a cross-tabulation of job titles or occupational codes and occupational exposures, preferably for a specific time window (1–4). JEM can be used in large epidemiological studies where methods based on individual interview data, observation, or technical measurements would be very costly. Other important advantages are that JEM can be used to estimate both current and past exposures and minimize the risk of information bias compared to individual-based self-report methods (2, 5, 6).
The validity of occupational exposure estimates assigned to individuals by means of JEM depends on the quality of information about exposures in specific jobs in different time periods, as well as on correct job titles or occupational codes (7). The latter aspect of JEM validity is particularly important when occupational codes are retrieved from national registers, without occupational research as the primary objective. While the validity of exposures assigned by JEM has been examined in a number of publications (8–13), the validity of the job titles and occupational codes per se has seldom been examined (7, 14). Incorrect occupational codes in registers may be the result of erroneous reporting from the primary sources (eg, tax agents, companies) and – if classification systems have changed over time – errors in translation from one classification system to another. Therefore, the validity of registered occupational codes may vary between industries and occupations and across time periods.
The Danish Occupational Cohort with eXposure data (DOC*X) is a nationwide cohort for occupational research containing occupational histories in terms of year-by-year codes according to the Danish version of the International Standard Classification of Occupations (DISCO) on an individual level from 1970 through 2015 with ongoing updates. DOC*X is an open research resource that provides opportunities to perform register-based epidemiological studies of occupational exposures by use of JEM (15). The validity of the DISCO codes in the nationwide registers, which form the foundation of DOC*X, has not been investigated.
The overall aim of this study was to evaluate the validity of DISCO codes in DOC*X. Specific aims were to evaluate (i) the agreement between JEM-based exposure estimates according to self-reported job titles converted to DISCO codes and according to register-based DISCO codes in DOC*X; and (ii) the agreement between these two sets of DISCO codes per se.
Methods
Danish Occupational Cohort with eXposure data (DOC*X)
DOC*X is a nationwide database including 6.4 million residents in Denmark from the age of 16, who have been gainfully employed at a private or public workplace in Denmark from 1970 through 2015 (15–17). The database has been compiled and is updated on a secure platform at Statistics Denmark. The backbone of the database is the information on occupation and industry, which includes calendar-year-specific DISCO-88 codes for each individual based on the 1970 Census (16) and the Employment Classification Module (1976–2015) (17). The Employment Classification Module has used three classifications: (i) a scheme developed by Statistics Denmark based on ISCO-68 (1976–1990), (ii) DISCO-88 (1991–2009), and (iii) DISCO-08 (2010 onwards) (15). In DOC*X, the different coding versions have been harmonized to DISCO-88 codes in a code-by-code manner as described previously (15). The codes vary in detail from the 1- to the 4-digit level, the latter being the most detailed. The annual DISCO-88 code for each individual is defined by the job with the highest income during each calendar year. We extracted annual DISCO-88 codes by use of the personal identifier (18).
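The rule that the annual code follows the job with the highest income in each calendar year can be sketched as below. This is a minimal illustration only: the record layout of (person, year, code, income) tuples, the function name, and the income values are hypothetical and do not reflect the DOC*X schema. The text does not state how ties are broken; here the first record seen is kept.

```python
def annual_codes(job_records):
    """Return {(person_id, year): code} using the highest-income job per year.

    job_records: iterable of (person_id, year, disco88_code, income) tuples.
    This layout is an assumption made for illustration.
    """
    best = {}
    for person_id, year, code, income in job_records:
        key = (person_id, year)
        # Keep the record with the strictly highest income seen so far.
        if key not in best or income > best[key][1]:
            best[key] = (code, income)
    return {key: code for key, (code, _income) in best.items()}


# Hypothetical example: one person with two jobs in 1980 and one job in 1981.
records = [
    ("p1", 1980, "7412", 180_000),
    ("p1", 1980, "4222", 40_000),
    ("p1", 1981, "4222", 150_000),
]
codes = annual_codes(records)
```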
Population used for validation
For 1976–1994, we used occupational data from the Copenhagen City Heart Study (CCHS). In total, 19 698 men and women from the center of Copenhagen were randomly drawn from the Copenhagen Population Register. The sample was age-stratified within 5-year age groups from 35–70 years of age. All participants completed a self-administered questionnaire in 1976–1978, including a free-text question about current job title (N=14 223). Follow-up studies with information on job title were completed in 1981–83 (≥500 20–25-year-olds) and in 1991–94 (≥3000 20–49-year-olds) (19, 20). The proportions that responded were 73.6% at baseline and 70.2% and 61.2% at follow-up. At the beginning of 2016, the job title text strings from the stored questionnaires were digitized and assigned DISCO-88 codes by three librarians, who worked independently. The codes were cross-checked and a supervising occupational health specialist resolved discrepancies.
For 2004, we used data from the ASUSI cohort of 14 266 men and women, who completed a questionnaire in a population-based study of working environment and sickness absence (ASUSI is a Danish acronym for working environment, sickness absence, premature exit from the labor market, social inheritance, and intervention) (21). Two trained sociologists digitized the job title text strings from the questionnaires and assigned DISCO-88 codes. Only persons who had been in employment for ≥80% of the time during the previous year or had been employed for 6 out of the 12 weeks preceding 1 July 2004 were included.
Assessment of occupational exposure intensities
We assessed five types of exposure using four JEM:
Wood dust estimates were assessed using a wood dust JEM based on expert ratings and 12 704 measurements collected in 1978–2007 in wood related industries in six European countries (22, 23). We dichotomized the exposures as non-exposed and exposed because wood dust exposure was rare in the study population.
Lifting and standing/walking estimates were assessed using the Lower Body JEM (24). Five Danish occupational health physicians with a minimum of 10 years of experience rated the exposures. We categorized the lifting exposures as described previously (25–28) [0=non-exposed, 1=medium exposed (>0–<1000 kg/day), and 2=highly exposed (≥1000 kg/day)] and divided the exposure estimates for standing/walking into three groups [0=non-exposed (0 hours/day), 1=medium exposed (>0–5.9 hours/day), and 2=highly exposed (≥6.0 hours/day)] according to previously used categories (27, 28).
Work with the arms elevated >90° estimates were assessed using the Shoulder JEM, which is based on expert ratings by five Danish occupational health physicians with a minimum of 10 years of experience (29–32). The expert-rated estimates of time spent working with the arms elevated >90° (hours/day) have been validated against technical measurements (13). We divided the exposure estimates according to a previously used cut-off value for high exposure [0=non-exposed, 1=medium exposed (>0–0.4 hours/day), and 2=highly exposed (≥0.5 hours/day)] (32, 33).
Noise was assessed using the Noise JEM (35, 36), which is based on personal dosimeter measures of occupational noise exposure in the periods 2001–03 and 2009–10 among 1140 workers (1343 measurements) within the ten industries with the highest reporting of noise-induced hearing loss according to the Danish Working Environment Authority. The measurements represented 100 occupational titles according to the DISCO-88 system. Four experts rated the noise intensity levels for the remaining jobs using 35 benchmark groups. Their ratings were used to construct an expert score dependent on sex, age, and calendar time (34, 35). We used the categorical variable for noise exposure (0=<80 dB, 1=80–84 dB, 2=≥85 dB), based on ISO-1999 thresholds (35, 36).
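The three-level categorization of the continuous JEM estimates (lifting, standing/walking, and arm elevation) can be sketched as follows. This is a minimal illustration using the cut-off values stated above; the dictionary keys and the function name are hypothetical. One assumption is made explicit: the text gives the medium range for standing/walking as >0–5.9 hours/day and the high range as ≥6.0 hours/day, so any value between 5.9 and 6.0 is treated here as medium.

```python
# Lower bounds of the "highly exposed" category, taken from the text.
HIGH_CUTOFFS = {
    "lifting_kg_per_day": 1000.0,
    "standing_walking_hours_per_day": 6.0,
    "arm_elevation_hours_per_day": 0.5,
}


def categorize(exposure, value):
    """Map a continuous JEM estimate to 0 (non-exposed), 1 (medium), or 2 (high)."""
    cutoff = HIGH_CUTOFFS[exposure]
    if value <= 0:
        return 0
    if value < cutoff:  # assumption: everything below the high cut-off is medium
        return 1
    return 2


example = [
    categorize("lifting_kg_per_day", 0),
    categorize("lifting_kg_per_day", 450),
    categorize("lifting_kg_per_day", 1000),
]
```

Wood dust and noise are not covered by this sketch because, as noted in the Discussion, those JEM only provide categorical values.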
We assigned exposure estimates to individuals in the CCHS/ASUSI cohorts with DISCO-88 codes for which a JEM exposure estimate was available. The estimates were assigned by connecting the JEM with their calendar-year specific DISCO-88 codes based on self-report and their DISCO-88 codes in DOC*X for the specific calendar year.
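The linkage step can be sketched as a lookup of each person's two codes in the same JEM. This is a simplified illustration, not the study's implementation: the field names, the function name, and the exposure categories in the example are hypothetical, and the JEM is keyed by DISCO-88 code only, which ignores any dependence on sex, age, or calendar time.

```python
def assign_exposures(persons, jem):
    """Return (self_reported_category, registered_category) pairs.

    persons: iterable of dicts with 'self_code' and 'register_code' for the
             survey year (hypothetical field names).
    jem:     dict mapping a DISCO-88 code to an exposure category.
    Persons without a JEM estimate for either code are dropped.
    """
    pairs = []
    for person in persons:
        self_cat = jem.get(person["self_code"])
        reg_cat = jem.get(person["register_code"])
        if self_cat is not None and reg_cat is not None:
            pairs.append((self_cat, reg_cat))
    return pairs


# Hypothetical lifting categories (0=non-exposed, 1=medium, 2=high).
lifting_jem = {"7412": 2, "4222": 0, "5162": 1}
persons = [
    {"self_code": "7412", "register_code": "7412"},
    {"self_code": "4222", "register_code": "5162"},
    {"self_code": "2221", "register_code": "2221"},  # no JEM estimate: dropped
]
pairs = assign_exposures(persons, lifting_jem)
```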
Statistical methods
From both cohorts (CCHS and ASUSI) and each time period, we excluded persons who stated that they were unemployed or had retired. For each exposure and time period, the final population included only individuals with both sets of DISCO-88 codes and only DISCO-88 codes with ≥10 self-reported observations (37). Furthermore, we only included observations where JEM-based exposure estimates were available for both sets of codes.
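The inclusion criteria can be sketched as a two-step filter. This is a minimal illustration with hypothetical field names; it assumes that the ≥10 observations per self-reported code are counted among persons who have both sets of codes, which the text does not state explicitly.

```python
from collections import Counter


def apply_inclusion(records, min_obs=10):
    """Keep persons with both codes whose self-reported code is frequent enough."""
    # Step 1: both a self-reported and a registered code must be present.
    both = [r for r in records if r.get("self_code") and r.get("register_code")]
    # Step 2: the self-reported code must occur at least min_obs times.
    counts = Counter(r["self_code"] for r in both)
    return [r for r in both if counts[r["self_code"]] >= min_obs]


# Hypothetical data: one frequent code, one rare code, one missing register code.
records = (
    [{"self_code": "7412", "register_code": "7412"}] * 10
    + [{"self_code": "2221", "register_code": "2221"}] * 3
    + [{"self_code": "7412", "register_code": None}]
)
included = apply_inclusion(records)
```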
We computed kappa coefficients (κ) with 95% confidence intervals (CI) for exposures with two exposure categories (wood dust) and weighted κ with 95% CI for exposures with three exposure categories (all other exposures). Additionally, we computed sensitivity (the percentage of true exposure categorizations for the highest exposed individuals) and specificity (the percentage of true exposure categorizations for the non-exposed individuals) from the 3×3 tables, with self-report as the gold standard. This means that the medium exposed groups were not included in the interpretation of sensitivity and specificity. We also assessed the sensitivity and agreement (weighted κ) between the DISCO-88 codes per se (specificity was not assessed because it would always be very high due to the low frequency of persons in any DISCO-88 group compared to the total number of persons in the study). Sensitivity was calculated as the percentage of true registrations within each DISCO-88 code digit level (1–4), taking the DISCO-88 codes based on self-report as the gold standard. In addition to the agreement at 1-, 2-, 3-, and 4-digit levels, we computed weighted κ coefficients by time period (1976–78; 1981–1983; 1991–1994; 2004) at the DISCO-88 1-digit level (DISCO-88 major groups). We interpreted the κ coefficients as: <0=poor, 0.00–0.20=slight, 0.21–0.40=fair, 0.41–0.60=moderate, 0.61–0.80=substantial, and 0.81–1.00=almost perfect agreement (38). SAS software, version 9.4 (SAS Institute Inc, Cary, NC, USA), was used.
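The agreement measures can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reproduction of the SAS analysis: the weighting scheme for the weighted κ is not given in the text, so linear agreement weights are assumed here; confidence intervals are omitted; and the 3×3 table is made up. Rows represent self-report (the gold standard) and columns the register.

```python
def kappa(table, weighted=True):
    """Cohen's kappa from a square table (at least 2x2) of counts.

    With weighted=True, linear agreement weights 1 - |i - j| / (k - 1) are
    used (an assumption); on a 2x2 table both variants coincide.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def w(i, j):
        if not weighted:
            return 1.0 if i == j else 0.0
        return 1.0 - abs(i - j) / (k - 1)

    cells = [(i, j) for i in range(k) for j in range(k)]
    p_obs = sum(w(i, j) * table[i][j] for i, j in cells) / n
    p_exp = sum(w(i, j) * row_tot[i] * col_tot[j] for i, j in cells) / n**2
    return (p_obs - p_exp) / (1.0 - p_exp)


def sensitivity_specificity(table):
    """Percent agreement in the highest and in the non-exposed category."""
    sens = 100.0 * table[-1][-1] / sum(table[-1])
    spec = 100.0 * table[0][0] / sum(table[0])
    return sens, spec


def interpret_kappa(value):
    """Verbal labels as listed in the text (38)."""
    if value < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if value <= upper:
            return label
    return "almost perfect"


# Made-up 3x3 table: non-exposed, medium, and highly exposed.
table = [[40, 8, 2],
         [6, 30, 4],
         [1, 5, 24]]
kw = kappa(table)
sens, spec = sensitivity_specificity(table)
label = interpret_kappa(kw)
```

For this made-up table the weighted κ is about 0.72 ("substantial"), and both sensitivity and specificity are 80%. The medium row and column affect κ but not sensitivity or specificity, which is why the medium group drops out of their interpretation.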
Results
Table 1 presents the number of DISCO-88 codes according to time period, including all digit levels of DISCO-88 (based on self-reported job titles), that met the inclusion criteria of minimum ten observations in our final study dataset. These codes represented 29–56% of the total number of codes, including all digit levels of the DISCO-88 system, with the lowest percentage in 1991–94 and the highest in 2004. The number of individuals in each time period is also shown; their distribution across DISCO-88 groups is presented in supplementary table S1, www.sjweh.fi/show_abstract.php?abstract_id=3857.
Table 1
Time period | N (≥10) a DISCO-88 codes | % (≥10) b DISCO-88 codes | Final population number of individuals |
---|---|---|---|
1976–78 | 215 | 44 | 7707 |
1981–83 | 180 | 37 | 7193 |
1991–94 | 142 | 29 | 2664 |
2004 | 271 | 56 | 11 782 |
As seen in table 2, our data showed substantial agreement between JEM-based exposure estimates according to the two sets of DISCO-88 codes based on self-reported job titles and registrations in DOC*X, except for noise in 1981–83. Across time, both the sensitivities and κ estimates were lowest for the time period 1981–83. Overall, the specificities were high showing substantial agreement for the non-exposed individuals.
Table 2
Exposure time period | Self-reported N non/medium/high | Registered N non/medium/high | Sensitivity a | Specificity b | Agreement weighted κ, (95% CI) |
---|---|---|---|---|---|
Wood dust c | |||||
1976–1978 d | 6448/ – /119 | 6446/ – /121 | 90.9 | 99.9 | 0.91 (0.88–0.95) f |
1981–1983 d | 4295/ – /37 | 4294/ – /38 | 63.2 | 99.7 | 0.64 (0.51–0.76) f |
1991–1994 d | 1712/ – /14 | 1710/ – /16 | 85.7 | 99.8 | 0.80 (0.64–0.96) f |
2004 e | 9465/ – /230 | 9479/ – /216 | 76.1 | 99.7 | 0.78 (0.74–0.82) f |
Lifting | |||||
1976–1978 d | 2854/2755/944 | 2655/2695/1203 | 60.3 | 89.6 | 0.71 (0.70–0.72) |
1981–1983 d | 2196/1638/465 | 2108/1785/406 | 47.3 | 88.9 | 0.64 (0.63–0.66) |
1991–1994 d | 904/585/198 | 817/665/205 | 76.6 | 94.6 | 0.78 (0.75–0.81) |
2004 e | 4358/2777/1783 | 4383/2826/1371 | 80.2 | 84.8 | 0.72 (0.70–0.73) |
Standing/walking | |||||
1976–1978 d | 2776/2032/1745 | 2619/2698/1236 | 78.8 | 89.5 | 0.68 (0.67–0.70) |
1981–1983 d | 2160/1242/897 | 2070/1140/1089 | 61.5 | 89.4 | 0.68 (0.66–0.70) |
1991–1994 d | 882/507/298 | 811/575/301 | 76.4 | 94.6 | 0.78 (0.75–0.80) |
2004 e | 4347/3224/1347 | 4379/3122/1417 | 68.2 | 84.8 | 0.69 (0.67–0.70) |
Arm elevation >90° | |||||
1976–1978 d | 2790/2646/1083 | 2645/2988/886 | 86.0 | 91.4 | 0.78 (0.77–0.80) |
1981–1983 d | 2235/1503/594 | 2090/1344/898 | 57.9 | 89.8 | 0.70 (0.68–0.72) |
1991–1994 d | 941/624/179 | 891/632/179 | 71.9 | 92.1 | 0.78 (0.75–0.80) |
2004 e | 4875/3379/1384 | 5146/3016/1476 | 73.8 | 82.2 | 0.69 (0.68–0.70) |
Noise | |||||
1976–1978 d | 4713/1516/387 | 4568/1663/385 | 75.1 | 94.0 | 0.75 (0.73–0.76) |
1981–1983 d | 3482/777/73 | 3251/954/127 | 29.9 | 93.8 | 0.56 (0.53–0.58) |
1991–1994 d | 1345/386/15 | 1319/400/27 | 73.3 | 94.3 | 0.78 (0.75–0.81) |
2004 e | 6504/2634/587 | 6703/2473/549 | 60.8 | 93.2 | 0.72 (0.70–0.73) |
Table 3 shows that the agreement between the two sets of DISCO-88 codes was substantial across the 1-, 2-, 3-, and 4-digit levels. The highest κ estimates were seen for the 4-digit DISCO-88 group level, with estimates from 0.73 to 0.81. The sensitivities varied from 51.5% to 73.2% and were highest for the 1-digit DISCO-88 level. As seen in table 4, the DISCO-88 code-specific agreement at the 1-digit level varied from fair to almost perfect across time periods (κ=0.34–0.91). Group 0 (armed forces) had almost perfect agreement, whereas group 1 (legislators, senior officials, and managers) showed the lowest agreement; no time trends were evident. The sensitivities generally showed the same pattern as the κ values.
Table 3
Digit level | Self-reported a, N | Registered b, % (N) | Final population c, % (N) | Sensitivity d, % | Agreement κ (95% CI) |
---|---|---|---|---|---|
1976–1978 e | |||||
4-digit | 10 443 | 73.8 (7708) | 55.8 (5824) | 66.3 | 0.77 (0.76–0.79) |
3-digit | 10 933 | 74.3 (8124) | 65.7 (7182) | 61.4 | 0.71 (0.70–0.72) |
2-digit | 11 335 | 71.7 (8128) | 66.1 (7491) | 67.1 | 0.71 (0.70–0.73) |
1-digit | 11 688 | 72.1 (8430) | 65.9 (7707) | 70.6 | 0.73 (0.72–0.74) |
1981–1983 e | |||||
4-digit | 9319 | 55.8 (5204) | 42.6 (3973) | 59.4 | 0.74 (0.72–0.75) |
3-digit | 9744 | 69.8 (6804) | 65.2 (6352) | 54.4 | 0.69 (0.68–0.71) |
2-digit | 10 041 | 69.8 (7012) | 68.3 (6856) | 60.8 | 0.69 (0.67–0.70) |
1-digit | 10 311 | 69.8 (7199) | 69.8 (7193) | 65.5 | 0.69 (0.67–0.70) |
1991–1994 e | |||||
4-digit | 8186 | 28.4 (2322) | 17.6 (1443) | 71.4 | 0.81 (0.79–0.83) |
3-digit | 8552 | 28.6 (2447) | 23.9 (2042) | 65.5 | 0.72 (0.70–0.75) |
2-digit | 8753 | 28.8 (2522) | 28.0 (2451) | 69.4 | 0.72 (0.70–0.74) |
1-digit | 8992 | 29.7 (2668) | 29.6 (2664) | 73.2 | 0.73 (0.71–0.75) |
2004 f | |||||
4-digit | 13 858 | 77.9 (10 794) | 65.4 (9064) | 51.5 | 0.73 (0.72–0.74) |
3-digit | 13 892 | 78.4 (10 891) | 72.9 (10 134) | 56.6 | 0.72 (0.71–0.73) |
2-digit | 13 892 | 82.6 (11 469) | 75.9 (10 540) | 64.1 | 0.72 (0.71–0.73) |
1-digit | 14 266 | 84.5 (12 048) | 82.6 (11 782) | 65.8 | 0.71 (0.70–0.72) |
a Number of individuals with a DISCO-88 code based on self-reported job-titles within each DISCO-88 code digit level.
c For each exposure and time period, the final study sample includes observations with both sets of codes, and only DISCO-codes with at least 10 self-reported observations overall in the sample.
d The percentage of true registrations within each DISCO-88 code digit level based on self-reported job-title as the gold standard.
Table 4
DISCO group a | 1976–1978: N b | 1976–1978: Sensitivity c (%) | 1976–1978: Agreement d κ (95% CI) | 1981–1983: N b | 1981–1983: Sensitivity c (%) | 1981–1983: Agreement d κ (95% CI) | 1991–1994: N b | 1991–1994: Sensitivity c (%) | 1991–1994: Agreement d κ (95% CI) | 2004: N b | 2004: Sensitivity c (%) | 2004: Agreement d κ (95% CI) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 55 | 96.4 | 0.91 (0.86–0.97) | 48 | 97.9 | 0.87 (0.80–0.94) | 10 | 90.0 | 0.86 (0.70–1.00) | 83 | 90.4 | 0.71 (0.64–0.78) |
1 | 303 | 57.4 | 0.43 (0.39–0.48) | 340 | 40.0 | 0.38 (0.33–0.43) | 155 | 56.8 | 0.46 (0.40–0.53) | 994 | 33.3 | 0.41 (0.37–0.44) |
2 | 933 | 82.1 | 0.82 (0.80–0.84) | 995 | 78.7 | 0.73 (0.71–0.75) | 665 | 78.0 | 0.78 (0.75–0.80) | 2258 | 71.4 | 0.69 (0.68–0.71) |
3 | 1038 | 67.8 | 0.56 (0.53–0.59) | 1054 | 54.8 | 0.53 (0.50–0.56) | 557 | 68.4 | 0.62 (0.58–0.66) | 2487 | 67.0 | 0.52 (0.50–0.53) |
4 | 1549 | 81.7 | 0.70 (0.68–0.72) | 1481 | 75.3 | 0.72 (0.70–0.74) | 411 | 77.1 | 0.72 (0.69–0.76) | 1130 | 68.0 | 0.52 (0.50–0.55) |
5 | 1125 | 50.7 | 0.53 (0.50–0.56) | 985 | 56.4 | 0.56 (0.53–0.59) | 274 | 72.3 | 0.71 (0.67–0.76) | 1630 | 80.7 | 0.76 (0.74–0.77) |
6 | 20 | 75.0 | 0.77 (0.62–0.92) | 20 | 55.0 | 0.40 (0.23–0.56) | 11 | 54.5 | 0.54 (0.29–0.79) | 61 | 57.4 | 0.52 (0.42–0.63) |
7 | 1243 | 82.8 | 0.83 (0.81–0.84) | 903 | 75.6 | 0.79 (0.77–0.81) | 255 | 79.6 | 0.77 (0.73–0.81) | 1150 | 77.0 | 0.70 (0.68–0.72) |
8 | 489 | 68.7 | 0.60 (0.57–0.64) | 504 | 39.3 | 0.34 (0.30–0.38) | 98 | 64.3 | 0.57 (0.49–0.65) | 974 | 54.1 | 0.56 (0.54–0.59) |
9 | 952 | 55.8 | 0.57 (0.54–0.60) | 863 | 69.9 | 0.48 (0.45–0.50) | 228 | 72.4 | 0.63 (0.57–0.68) | 1015 | 52.3 | 0.52 (0.49–0.55) |
a 0=Armed forces; 1=Legislators, senior officials and managers; 2=Professionals; 3=Technicians and associate professionals; 4=Clerks; 5=Service workers and shop and market sales workers; 6=Skilled agricultural and fishery workers; 7=Craft and related trades workers; 8=Plant and machine operators and assemblers; 9=Elementary occupations.
c The proportion of true registrations within each major DISCO-88 group based on self-reported job-title as the gold standard.
Sensitivities for individual DISCO-88 codes, according to time period, are presented in supplementary table S1. The highest sensitivities across all time periods were found for dentists (2222; 96.2%); nursing associate professionals (3231; 95.0%); police officers (5162; 92.2%); medical doctors (2221; 91.5%); jewelry and precious-metal workers (7313; 91.3%); bakers, pastry-cooks and confectionery-makers (7412; 89.3%); and primary education teaching professionals (2331; 89.8%). Prison guards (5163) and travel attendants and travel stewards (5111) had 100% sensitivity in 2004, but too few observations in the other time periods. In general, low sensitivities were found across all time periods for business services agents and trade brokers not elsewhere classified (3429; 1.7%); production clerks (4132; 6.2%); other teaching associate professionals (3340; 6.5%); advertising and public relations managers (1234; 7.4%); finance and sales associate professionals not elsewhere classified (3419; 10.1%); safety, health and quality inspectors (3152; 11.7%); receptionists and information clerks (4222; 12.3%); and buyers (3416; 13.5%).
Discussion
Job titles and occupational codes constitute a crucial basis for the use of JEM, but errors in job titles and assignment of occupational codes have received minimal scientific attention. The present study benefitted from exposure data from JEM concerning five airborne, mechanical, and physical exposures. Self-reported job titles for the CCHS/ASUSI cohorts were translated into DISCO-88 codes, which were connected with the JEM to provide exposure estimates. These were then compared to JEM-based exposure estimates according to DISCO-88 codes registered in DOC*X. High sensitivities and substantial agreement were found for the JEM-based exposure estimates and for the DISCO-88 codes per se, although the DISCO-88 code-specific agreement varied across digit levels and across time periods.
The number of individuals in the study population from 1991–94 was low since only about one third of the individuals with a self-reported job title had a DISCO-88 code in DOC*X. An explanation may be the increasing mean age of the population over calendar time, as the main part of the population was included in 1976 with an age of up to 70 years at that time. For example, individuals who retired from the workforce before 1991 have no DISCO code registered in the DOC*X database for the time period 1991–94. The classification system used by Statistics Denmark changed in 1981 and 1993, which may be an explanation for lower agreement observed in the period 1981–83, and again in 1991–94. In 1981–83, the classification system was less detailed than the DISCO-88 system. This means that it was very difficult to translate specific job groups from that time period to DISCO-88 codes. Therefore, discrepancies between DISCO-88 codes may reflect translation difficulties rather than actual differences between jobs. Because of the less detailed job groups in 1981–83, the solution was to translate job titles to less detailed DISCO-88 group levels. The system for code assignment also changed in 1991, when the DISCO-88 classification system was introduced by Statistics Denmark. The DISCO-88 was based on the ISCO-88. Before 1991, the occupational codes were assigned by trained coders at Statistics Denmark based on self-reported information and union membership, but from 1991 the system was automated and based on tax records and other personal register information. This shift in code assignment led to a temporary reduction of data reporting, which probably also contributed to the low number of individuals in the final study population for 1991–94.
The variation across DISCO-88 codes probably reflected variations in the accuracy with which DISCO codes are reported to the central authorities. Reporting to Statistics Denmark from large public and private companies is undertaken by trained staff according to written guidelines, while small private companies with fewer resources may provide less accurate DISCO codes. Reporting information on occupation is mandatory only for Danish companies with ≥10 employees, and therefore considerable differences in accuracy may be expected.
The misclassification of JEM-based individual exposures assigned by using DISCO-88 codes in DOC*X seems less than might be expected based on comparison of the sensitivities for the DISCO-88 codes per se; overall, the sensitivities were higher when comparing JEM-based exposure estimates than when comparing the two sets of DISCO-88 codes (especially at the 3- and 4-digit levels). This is because DISCO-88 codes belonging to similar job groups in the JEM are assigned similar job-exposures (7, 14). For example, the noise JEM will assign the same low level of noise exposure to all types of office workers regardless of the specific DISCO-88 code. Lack of agreement between two sets of DISCO-88 codes will therefore not necessarily affect the agreement between JEM-based exposure estimates.
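The point that code disagreement need not translate into exposure disagreement can be illustrated with a toy example. The codes are taken from the Results (4222 receptionists and information clerks, 4132 production clerks, 7412 bakers), but the noise categories assigned to them and the four person-level pairs are hypothetical.

```python
# Hypothetical noise categories: both clerk codes share the lowest category.
noise_jem = {"4222": 0, "4132": 0, "7412": 1}

# (self-reported code, registered code) for four hypothetical persons.
pairs = [
    ("4222", "4132"),  # codes differ, exposure category identical
    ("4132", "4222"),  # codes differ, exposure category identical
    ("7412", "7412"),  # codes and exposure category identical
    ("4222", "7412"),  # codes and exposure category both differ
]

code_agreement = sum(s == r for s, r in pairs) / len(pairs)
exposure_agreement = sum(noise_jem[s] == noise_jem[r] for s, r in pairs) / len(pairs)
```

In this toy data, only one of four code pairs matches, while three of four exposure categories match, because the two clerk codes map to the same noise category.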
The variation in agreement between the two sets of individual DISCO-88 codes seems to depend on characteristics of the jobs covered by the code. In general, the codes with lowest sensitivities are broadly defined and not specified, eg, business services agents and trade brokers not elsewhere classified, other teaching associate professionals, and finance and sales associate professionals not elsewhere classified. The two last-mentioned groups will probably be classified as other kinds of office workers, which will reduce the effect of the misclassification on the assigned JEM-based exposure estimates (see above). Another possibility is to exclude DISCO codes with low sensitivities in epidemiological studies (at least in sensitivity analyses) as they may increase the risk of misclassification of exposures. Thus, the actual validity of the DISCO-codes per se may be significantly higher in cleaned data prepared for analysis.
Strengths and limitations
One strength of our study is that we have data from four different time periods during a 24-year long period where Statistics Denmark used different classification systems of occupations in their registers. Furthermore, we have access to self-reported job titles. It may be questioned if self-reported job titles converted to DISCO-88 codes can be taken as a gold standard, but self-reported information on the current job is generally considered to have high validity (14, 39).
One limitation of our study is that we have no self-reported job titles from the years after 2004, and therefore no validation has been performed on DOC*X registrations from 2005 onwards. This limitation particularly pertains to DISCO-88 codes after the time point when Statistics Denmark introduced the DISCO-08 system in 2010 (15). Another limitation is that the DISCO-88 codes, which were available for validation, only represented around half of the codes in the DISCO-88 system so that only frequent occupational titles were validated at the 4-digit level. If the agreements are lower for rare DISCO-88 codes, we may have overestimated the general validity of the DISCO-88 codes in DOC*X. On the other hand, the sensitivities did not seem to depend on the number of observations (all ≥10) per DISCO-code.
In our analyses of agreement between exposure levels, we used categorical variables with two or three categories. The JEM exposures for wood dust and noise only exist as categorical variables while the other JEM contain continuous measures, which we categorized to ensure comparability. It may be a limitation that we only validated the DISCO-88 codes based on categorical variables instead of using continuous scales. We chose to focus on the lowest and highest exposure categories to examine whether they were correctly categorized. To the extent that DISCO-88 codes in DOC*X are misclassified so that highly exposed individuals are categorized as medium or non-exposed, the data would not be of a quality that allows future exposure–response analyses.
Validity of DISCO-88 codes in future DOC*X studies
This study concerned selected airborne, mechanical, and physical exposures, and it remains open whether the validity of DISCO-88 codes in DOC*X is similar for other exposures, eg, chemicals. The validity varied across 4-digit DISCO-88 codes and time periods, which should be considered when planning studies in DOC*X. DOC*X also covers industry codes from 1976 and onwards (15) and it can be relevant to use those industry codes together with the DISCO-88 codes to reduce the risk of misclassification of occupations.
Concluding remarks
The validity of the DISCO-88 codes in DOC*X was generally high. Substantial agreement was found for the JEM-based exposure estimates and group-based DISCO-88 codes per se, although the DISCO-88 code-specific agreement varied across digit levels and time periods.
Funding
The Danish Working Environment Research Fund funded this study (grant no.: 43-2014-03 / 20140016763). The funding source played no role in the (i) study design, (ii) the collection, analysis and interpretation of the data, (iii) the writing of the report, or (iv) the decision to submit the paper for publication.