Performance of odds ratios obtained with a job-exposure matrix and individual exposure assessment with special reference to misclassification errors.

OBJECTIVES
Individual assessment of exposure by experts and the use of a job-exposure matrix are the two main methods of evaluating past occupational exposures in community-based case-referent studies. The objective of this study was to compare the performance of the odds ratio estimates obtained with the two methods. This paper focuses on job-exposure matrices whose entries consist of proportions of persons exposed.


METHODS
Simulations were used to compare the variances of the odds ratio estimates obtained with the two methods and to study the consequences, with respect to the bias and precision of the estimated odds ratios, of misclassifications of exposure produced by either the experts or the matrix.


RESULTS
When there was no misclassification, the results showed that the precision obtained with the job-exposure matrix was about three times lower than that achieved by the experts over a large range of practical situations. However, when potential errors of exposure assessment were taken into account, the simulations suggested that the test of the hypothesis OR = 1 against the alternative OR ≠ 1, when exposure was assessed with an unbiased job-exposure matrix, had a statistical power close to that obtained when exposure was assessed by an expert with high sensitivity and specificity.


CONCLUSIONS
The evaluation of exposure with an unbiased job-exposure matrix in studies of the association between exposure and disease had a statistical power close to that expected in practice with a good expert in the large range of practical situations which were investigated.

Ascertaining the past occupational exposure of subjects is perhaps the main problem in community-based case-referent studies of occupational risk factors. Subjects may be interviewed about their past exposures, but their answers are prone to faulty recollection, misunderstanding of questions, or misinterpretation of responses. This issue has been discussed at several recent international meetings (1)(2)(3)(4). The method generally considered to be the best consists of obtaining the most detailed job descriptions possible from the study subjects. These job descriptions are then submitted to experienced industrial chemists or hygienists, who assess exposure for each job period. This method is recognized as being time-consuming and expensive (3), but its main drawback is the shortage of experts well trained in the retrospective assessment of occupational exposures. Implementing this approach with inadequate personnel may give rise to serious misclassification of exposure. Furthermore, the accuracy and reliability of experts' assessments remain to be studied precisely (6). A different approach to assigning exposure, which may overcome some of these difficulties, is the job-exposure matrix, which can be defined as the cross-classification of a list of jobs (rows) with a list of potential exposures (columns) that provides some information about the statistical distribution of a given exposure for each job (6, 7). The difficulty with this method lies in estimating the odds ratio without bias. In a previous paper, we proposed a linear model which allows unbiased estimation of the odds ratio from a quantitative job-exposure matrix whose entries consist of proportions of persons exposed (8). The purpose of this paper is to compare the estimation of the odds ratio using this method with the estimation obtained when exposure is assessed by experts.
The following two questions are addressed: first, the quantification of the variances of the odds ratio estimates obtained with the two methods; second, the consequences, with respect to the bias and precision of the odds ratio estimates, of misclassifications of exposure by the experts or by the matrix.

Method
Two methods for estimating odds ratios

Consider a case-referent study with n1 cases and n0 referents. When exposure is evaluated by experts, each subject is classified as exposed or unexposed. We shall call OR_expert the corresponding estimate of the odds ratio (OR):

OR_expert = [P1/(1 - P1)] / [P0/(1 - P0)],

where P1 and P0 are the proportions of exposed among the cases and the referents, respectively. The variance of the logarithm of OR_expert is

var(log OR_expert) = 1/(n1 P1) + 1/(n1 (1 - P1)) + 1/(n0 P0) + 1/(n0 (1 - P0)).

With a job-exposure matrix (JEM), exposure is evaluated at the job level. We consider job-exposure matrices which provide the proportion of persons exposed (or the probability of exposure) x. Each subject is attributed the x value of the job he or she is carrying out. If the matrix is exact, that is, if x is the actual probability of exposure, the odds ratio and the variance of its logarithm can be estimated without bias by the maximum likelihood method (8) using the model

P(D|x) = P(D|x = 0) (1 + ρx),

where ρ = OR - 1 and P(D|x) is the probability of disease among subjects carrying out a job for which the proportion exposed is x. We denote the corresponding estimate OR_JEM.
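The expert-based estimate can be illustrated with a short sketch (ours, not the authors' code). It computes OR_expert from the 2×2 exposure table and Woolf's variance of its logarithm, which is the variance formula above rewritten in terms of the four cell counts a = n1 P1, b = n1 (1 - P1), c = n0 P0, d = n0 (1 - P0); the counts used are hypothetical.

```python
import math

def or_expert(a, b, c, d):
    """Expert-based odds ratio from a 2x2 table.
    a = exposed cases, b = unexposed cases,
    c = exposed referents, d = unexposed referents."""
    or_hat = (a * d) / (b * c)
    var_log = 1 / a + 1 / b + 1 / c + 1 / d  # Woolf's variance of log(OR)
    return or_hat, var_log

# Hypothetical counts for illustration only
or_hat, var_log = or_expert(30, 70, 15, 85)
ci_low = math.exp(math.log(or_hat) - 1.96 * math.sqrt(var_log))
ci_high = math.exp(math.log(or_hat) + 1.96 * math.sqrt(var_log))
```

Fitting the linear model P(D|x) = P(D|x = 0)(1 + ρx) requires maximizing the corresponding likelihood over the observed x values, which is not reproduced here.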

Comparison of the two methods
The two methods of estimating odds ratios were compared in simulations. Each simulation comprised 500 successive samples, each consisting of n cases and n referents, with n = 50, 100, 250 and 500. We simulated several types of populations with different exposure distributions. To simplify the presentation, we supposed that there were five categories of jobs corresponding to five different probabilities of exposure. Each exposure distribution was characterized by a distribution of jobs within the population and by the proportion of exposed subjects within each job (figure 1). The six exposure distributions (E1 to E6) presented in figure 1 covered a great variety of situations (8). The distributions E1 and E2 were chosen because they are commonly encountered in the field of occupational epidemiology, and they are studied in more detail in this paper. In all six situations, the population OR values were successively fixed at 1, 2, 3, 4, 5 and 10. Therefore, in all, 36 configurations were considered, with 500 samples simulated for each.
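A simulation of this kind can be sketched as follows. The job frequencies and exposure proportions below are hypothetical placeholders, not the actual E1-E6 values of figure 1; cases are drawn by reweighting each (job, exposure) cell by its relative disease risk (or_true for the exposed, 1 for the unexposed), a rare-disease approximation.

```python
import random

# Hypothetical exposure distribution in the style of E1-E6 (not the
# actual figure 1 values): five job categories, each with a population
# frequency and a proportion exposed x (the JEM entry).
job_freq = [0.30, 0.25, 0.20, 0.15, 0.10]
job_x    = [0.00, 0.10, 0.30, 0.60, 1.00]

def draw_referent():
    """Draw a referent: a job, then an exposure status within that job."""
    job = random.choices(range(5), weights=job_freq)[0]
    exposed = random.random() < job_x[job]
    return job, exposed

def draw_case(or_true):
    """Draw a case by reweighting each (job, exposure) cell by its
    relative disease risk: or_true if exposed, 1 if unexposed
    (rare-disease approximation)."""
    cells, weights = [], []
    for j in range(5):
        cells.append((j, True))
        weights.append(job_freq[j] * job_x[j] * or_true)
        cells.append((j, False))
        weights.append(job_freq[j] * (1 - job_x[j]))
    return random.choices(cells, weights=weights)[0]
```

With these placeholder values the population exposure prevalence is 27.5%, of the same order as in the distributions studied in the paper.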
Finally, three cases were successively considered for studying the precision of the two methods. We first compared the "crude" performance of the two methods of estimation (ie, the performance in the absence of exposure misclassification). In this case, both methods provided an unbiased estimation of the odds ratio, and we thus compared their precision. We then added misclassifications to the experts' assessment and to the job-exposure matrix to reproduce situations encountered in practice more accurately. In these cases, we compared the bias and the precision of the two methods.
To compare the precision of the methods, we computed the ratio of their variances (in fact, since, for each simulation, the distribution of the odds ratio estimates appeared to be log-normal, we computed the variances of the logarithms of the estimates):

VR = var(log OR_JEM) / var(log OR_expert).

The greater the VR, the greater the loss of precision with a job-exposure matrix. VR can also be interpreted as the ratio of the numbers of subjects needed to attain the same statistical precision with a job-exposure matrix as with an expert assessment. From the point of view of etiologic research, it is useful to express the same results in terms of statistical power. Thus we also calculated the power of each method (for a two-sided test of the hypothesis OR = 1 against the alternative OR ≠ 1 with a 5% level of alpha error).
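Given the simulated estimates of one configuration, VR and the empirical power can be computed directly; a minimal sketch (function names are ours):

```python
import statistics

def variance_ratio(log_or_jem, log_or_expert):
    """VR = var(log OR_JEM) / var(log OR_expert), computed over the
    simulated samples of one configuration."""
    return statistics.variance(log_or_jem) / statistics.variance(log_or_expert)

def empirical_power(log_or_hats, se_hats, z_crit=1.96):
    """Fraction of simulated samples in which the two-sided Wald test
    rejects OR = 1 at the 5% alpha level."""
    rejected = sum(1 for b, se in zip(log_or_hats, se_hats)
                   if abs(b) / se > z_crit)
    return rejected / len(log_or_hats)
```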

Comparison in the absence of misclassification of exposure
The variance ratios are given in table 1, in particular for exposure distributions E4 and E5. The VR was low for E4 and high for E5. These figures result from the proportion of subjects in the jobs where x = 0 or x = 1, this proportion being high in E4 (80%) and low in E5 (10%). For these kinds of jobs, the information on exposure given by the job-exposure matrix was equivalent to that given by the experts. The more frequent these jobs were, the more the job-exposure matrix performed like an expert. For the other situations, the VR ranged from 1 to 4, and most of the time it was lower than 3. The power of the two methods is shown in table 2 for exposure distributions E1 and E2 and for several values of n. The results for OR = 10 are not shown, as the power was very close to 100% for both methods. With n ≥ 250, the loss of statistical power was very small when OR ≥ 2.5. On the other hand, for smaller samples (n = 50 or 100) or for smaller odds ratios (OR = 2), the loss of power could be substantial.
Errors in the job-exposure matrix

When a job-exposure matrix is used, misclassifications of exposure occur when there are errors in x, the proportion of exposed subjects within a job. In order to model these errors, we considered that the job-exposure matrix does not provide the exact probability of exposure in a job but only an approximate value. For instance, if the matrix indicated 10% for a certain job (or group of jobs), the true probabilities of exposure in these jobs were supposed to lie between 5 and 15%. Moreover, we supposed that within this interval the distribution of the true probabilities of exposure was uniform. In this paper we present only the results for the exposure distributions E1 and E2. The intervals within which the true probabilities lie are shown in table 3, where the overlapping of some intervals can be observed. This model thus allowed two jobs with the same probability of exposure to be classified in different categories of the matrix.
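This error model is easy to state in code. A sketch, assuming a half-width of 5 percentage points around the reported entry (the truncation to [0, 1] is our addition for entries near the boundaries):

```python
import random

def noisy_jem_entry(x, half_width=0.05):
    """True proportion exposed when the matrix reports x: uniform on
    [x - half_width, x + half_width], truncated to [0, 1] (the
    truncation is our addition for entries near the boundaries)."""
    low = max(0.0, x - half_width)
    high = min(1.0, x + half_width)
    return random.uniform(low, high)
```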
Such misclassifications in the job-exposure matrix had two distinct consequences (table 4). On the one hand, the estimation of the odds ratio was biased towards 1, a result expected in the case of nondifferential errors of classification, although the magnitude of the bias was rather small (around 10% in relative value). On the other hand, the variance decreased, and the overall result was that the statistical power was almost unmodified.

Errors in assessment of individual exposure by the experts
In expert assessment, misclassifications of exposure occur when the sensitivity or the specificity of the evaluation of exposure differs from 100%. We supposed that misclassifications by the experts were nondifferential, a common hypothesis in well-planned studies with blind evaluation of exposure. The consequences of such misclassifications are known to be an attenuation bias in the OR estimation and a decrease in statistical power (9, 10, 11). The decrease in power is all the greater when the sensitivity and specificity of the experts' judgments are low. It may thus happen that the statistical power using experts' judgments equals or drops below that of a job-exposure matrix. We calculated (for exposure distribution E1 in figure 1) the values of the sensitivity and specificity of the experts' judgment for which the power fell as low as that obtained with an exact job-exposure matrix (table 5). For these values, we also calculated the bias in the estimation of the odds ratio using the experts' judgments. We ignored the situations in which OR > 2 for n = 500 and OR = 5 for n = 250, as the power of both the job-exposure matrix and the experts was almost 100% in these situations (see table 2). There were two situations, OR = 5 and n = 50 or 100, in which the power achieved with the experts remained greater than that with an exact job-exposure matrix even for rather low values of sensitivity and specificity. In these cases, experts' assessment of exposure could be considered always better in practice than the assessment obtained with such a job-exposure matrix. However, it should be emphasized that in these situations the job-exposure matrix gave a satisfactory level of power (76 and 97%, as shown in table 2) and that the estimation given by the experts' evaluation was rather biased (range 1.7-3.3, as shown in table 5).
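The attenuation produced by nondifferential expert misclassification can be reproduced with the classical sensitivity/specificity formulas: the observed exposure prevalence in each group is Se·P + (1 - Sp)(1 - P), and the "observed" odds ratio follows. A sketch (ours, not the authors' code):

```python
def biased_or(p1, p0, sens, spec):
    """Expected 'observed' odds ratio when the true exposure prevalences
    are p1 (cases) and p0 (referents) and the expert classifies exposure
    nondifferentially with the given sensitivity and specificity."""
    q1 = sens * p1 + (1 - spec) * (1 - p1)  # observed prevalence, cases
    q0 = sens * p0 + (1 - spec) * (1 - p0)  # observed prevalence, referents
    return (q1 / (1 - q1)) / (q0 / (1 - q0))
```

With perfect classification the true odds ratio is recovered exactly, while any loss of sensitivity or specificity attenuates the observed value towards 1.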
In all other situations, the specificity of the experts' judgment must be greater than 80% (and sometimes 85%) for the power to remain greater than or equal to that of the exact job-exposure matrix. For these values of specificity, the sensitivity must be near 100%. Consider for instance n = 250 and OR = 3. If the specificity equals 80%, the sensitivity must equal 100%, and, if the specificity equals 85%, the sensitivity must equal 93%. If the sensitivity is restricted to more realistic values (say around 80%), the results of table 5 show that the specificity must be greater than 90%.

Discussion
The problem discussed in this paper is part of the general debate on the retrospective assessment of occupational exposure (12). We have examined in detail the relative performances of the two main methods used in community-based case-referent studies, evaluation by industrial hygiene experts and evaluation through the use of job-exposure matrices. Our results concern job-exposure matrices whose entries are probabilities of exposure, thought to be more informative than dichotomous entries (6), and which are now being developed and used to analyze the results of epidemiologic studies (13)(14)(15)(16). With such matrices, the most common statistical method of analysis is the following: all of the subjects with a given job title are classified as exposed or unexposed, depending on whether their probability of exposure exceeds a given threshold (17, 18). This procedure results in misclassifications and leads to an estimation of the odds ratio biased towards 1, with an associated loss of statistical power (9, 10, 19). A method of unbiased statistical analysis with probabilistic job-exposure matrices has been proposed (6). Theoretical and practical results have already been published (8, 20, 21, 22), but until now no quantitative comparison of the OR estimations obtained from experts and job-exposure matrices has been published.

[Note to table 5. Example: n = 100 and OR = 3. In this situation, the power with an exact JEM is 67% (see table 2), and with the experts it is 92%. The table shows the combinations of specificity and sensitivity for which the power with the experts falls as low as 67%, for example, 85% and 92%, or 90% and 83%, etc. For each combination, the table also shows OR*, the corresponding biased expert estimate of the odds ratio. The power with the experts is always lower than that with an exact JEM if the specificity is less than 80% or the sensitivity is less than 58%; the bias of the odds ratio is then at least 2.6.]
Our results show that job-exposure matrices produce a loss in precision (and thus in statistical power) by a factor of about three compared with experts' assessments in a wide range of common situations. Therefore about three times more subjects are required with a job-exposure matrix to reach the same precision in the estimation of the odds ratio as that achieved with experts. This loss in precision is to be expected, as a job-exposure matrix provides information on exposure at the job level, which implies a loss of information compared with experts, who provide exposure assessments at the individual level. Nevertheless, from a quantitative point of view, the decrease in precision is not as pronounced as epidemiologists had believed (5, 19). This conclusion is reinforced when the statistical power of the odds ratio estimations is considered rather than the precision (variance) (see table 2). In practice, both experts and job-exposure matrices are prone to errors of classification, whose consequences on their performance have been studied in this paper. In measuring the performance of job-exposure matrices, we considered typical misclassification errors. The teams of industrial hygienists and epidemiologists engaged in the development of a job-exposure matrix generally have good knowledge of the different exposures potentially present in a particular job. But they cannot identify a precise percentage of persons exposed, as they do not know the variety of situations encountered in all of the jobs with the same job title. Thus they are only able to give an approximate value of the percentage, or a percentage range. The results of table 4 show that, in the situations considered in this paper, such approximations may bias the odds ratio estimation and its variance but have limited consequences on statistical power. Furthermore, the bias is small compared with that produced by experts' misclassification errors.
The results were similar for other simulations with intervals of imprecision other than those shown in table 3. In conclusion, the fact that it is often impossible to determine an exact proportion of exposed subjects within a job title should not hinder the development of job-exposure matrices.
The results of table 5 show that only the most effective experts can provide exposure assessments with a statistical power that matches that of an exact job-exposure matrix. The sensitivity and specificity of experts' judgments are not well known in practice, and further studies would be useful to determine them. Our results indicate clearly that the most important factor, from an epidemiologic point of view, is high specificity. Here again, the same kind of results was found with other exposure distributions. Thus we can conclude that the evaluation of exposure with a job-exposure matrix in studying the association between exposure and disease has a statistical power close to that which can be expected in practice with a good expert. At least two limitations of our work should be discussed. First, job-exposure matrices have not taken occupational history into account as effectively as experts. It is clear that further work is needed to propose statistical methods able to take the entire job history into account in the estimation of odds ratios, especially with a job-exposure matrix. Second, our simulations covered only a part of the possible exposure distributions. We believe the six exposure distributions are representative of the most common situations encountered in population-based case-referent studies (especially E1 and E2) (8). In particular, the exposure prevalence is not low in E1 to E6 (17 to 73%), as would be required for a population-based case-referent study to remain of manageable size. Nevertheless, we verified that the results are similar for exposure prevalences as low as 7% in E1 and E2.
One of the goals of this paper was to add to the knowledge on the advantages of job-exposure matrices when properly used. Today, job-exposure matrices are not yet considered to be sufficiently reliable tools by epidemiologists, who are reluctant to use them in their studies. As a result, little energy is devoted to building new and more efficient matrices. We hope our work will play a part in breaking this vicious circle.