Imputation of individual cancer cases to occupational causes

Objectives Many potential occupational causes of cancer have been documented. Imputation of an individual cancer to occupational or other causes is, however, difficult. A method based on the Bayes theorem is proposed for assessing causal relationships at the individual level.

Methods Causality assessment, dealing with four types of persons defined by exposure and the occurrence of cancer, was linked with imputation, which deals only with persons who have cancer and were exposed. Imputation was then formulated using the Bayes theorem, relating epidemiologic information regarding causes, a patient's exposure history, and the posterior odds that the cancer was caused by a suspected occupational exposure. Data needed to apply a Bayesian method were defined in terms of relative risks, proportion of people exposed in populations, and the frequency of a positive relevant characteristic for persons without cancer. Relevant characteristics were defined using a formal consensus between experts. The method was then illustrated with cases of mesothelioma and lung cancer in possible relation to asbestos.

Results Experts defined the relevant characteristics as being qualification of occupational exposure, intensity of exposure, latency, disease characteristics, and presence of causal agent in the body. Application to mesothelioma and lung cancer cases illustrated the potential usefulness of the method.

Conclusions The importance of occupational exposure in the formulation of imputation underscores the need for available and reliable data sources on occupational exposures. The proposed method could become a powerful tool for the expert assessment of causes of cancer cases, provided data become available in individual files and the literature.

Many potential occupational causes of cancer have been documented in the scientific literature (1)(2)(3). Understanding these causes requires good cancer epidemiology and an understanding of exposures and the susceptibility of those exposed (4). These occupational exposures have major consequences for people who develop cancer and for the insurance bodies that compensate for lost income and other financial consequences of cancer. Compensation of a person with cancer cannot, however, occur before the disease can be imputed to a specific occupational exposure.
In many countries, recognition of an occupational cause is based on a set list of diseases and criteria regarding exposure (4,5). In France, for instance, imputation only takes into account the diagnostic category, characteristics of exposure, and delay between exposure and the occurrence of the disease (6). In the United States and many other countries, expert advice is requested by courts or insurance companies for judging imputability (7). Without specific methods to assess imputability, recognition of a causal relationship for an individual remains a subjective process, often yielding disagreement between experts (7).
One way to improve an expert judgment may be to use prediction rules yielding estimates of the probability that the disease is due to a specific cause. Such probabilistic methods are currently used for clinical diagnoses (8) and were proposed for the imputation of adverse reactions to specific drugs (9). In occupational health, probabilistic methods were developed for calculating the probability that a cancer might be attributed to occupational exposure to radiation (10), a context in which the concept most often used is that of the probability of causation (10,11). In this paper, we propose an extension of existing methods, based on Bayesian reasoning, the Bayesian imputability method, to compute the odds of causation at the individual level. Our method provides a more precise means of estimating probabilities, as it takes into consideration the probability of exposure. We formulated the imputability of cancer to a specific occupational exposure using the Bayes theorem and its link to the epidemiologic assessment of associations. We then had an expert group define, using formal consensus techniques, which information would be relevant for use in the method. Finally, we illustrated the method using cases of mesothelioma or lung cancer for which an occupational exposure was suspected.

Methods

Formulation of imputability
In the classical epidemiologic assessment of associations between an occupational exposure and a given cancer, the research question is whether the exposure can be the cause of the cancer. Occupational exposure can be an occupation (eg, being an insulator) or the exposure to a substance (eg, asbestos) through an occupation. It is summarized by a simple 2 × 2 table (lower right corner of figure 1) from which association measures such as relative risks and risk or odds ratios are estimated, although the concept of causality is much more complex (12). Imputability is a clinical question regarding an exposed cancer case for which the expert must decide whether the occupational exposure is the major cause of the cancer. Imputability is therefore characterized by the fact that all concerned persons have both the cancer and the occupational exposure (individuals a in figure 1).
As in any clinical question, a conclusion regarding imputability can be reached using one or more relevant characteristics regarding the case. By relevant characteristic, we mean any characteristic of the exposure, the disease, or the person that modifies the association between the occupational exposure and the occurrence of cancer. The value of a relevant characteristic is provided by its sensitivity (proportion of cancers due to occupational exposure with the relevant characteristic) and specificity (proportion of other cancers which do not have the relevant characteristic). The value of the conclusion depends, however, on predictive values (positive, the proportion of cancers with the relevant characteristic that are due to occupational exposure; negative,

Formulation of imputability using odds
Another way to express imputability is the Bayes theorem (13). This theorem compares the likelihood of reaching a good conclusion to the likelihood of making an error. For a relevant characteristic, this comparison is expressed by the posterior odds of imputability, calculated by dividing the positive predictive value by its complement to one.
Let $O^+$ denote the fact that a case is due to a given occupational exposure and $I^+$ the fact that a case has a given relevant characteristic, for instance, the fact that the level of exposure was high (figure 1). Similarly, let $O^-$ denote the fact that a case is not due to a given occupational exposure and $I^-$ the fact that a case does not have a given relevant characteristic. The formulation of imputability can then be summarized by the 2 × 2 table crossing the presence or absence of the relevant characteristic and the true causal relationship between occupational exposure and cancer (upper left corner of figure 1). In that context, the Bayes theorem (posterior odds = likelihood ratio × prior odds) is reformulated as the relationship between the posterior odds of the imputability of the cancer to the occupational exposure, the prior odds of the imputability of cancer to that occupational exposure, and the positive likelihood ratio of the relevant characteristic:

$$\frac{\Pr(O^+ \mid I^+)}{\Pr(O^- \mid I^+)} = \frac{\Pr(I^+ \mid O^+)}{\Pr(I^+ \mid O^-)} \times \frac{\Pr(O^+)}{\Pr(O^-)} \qquad \text{(equation 1)}$$

where Pr indicates a probability and the vertical bar | means "given that". $\Pr(O^+ \mid I^+)$ is thus the probability for a case to be due to the occupational exposure given that the relevant characteristic $I$ is present. If several relevant characteristics regarding the case, indexed $i = 1, 2, \dots, k$, are used to build the reasoning, then

$$\frac{\Pr(O^+ \mid I_1^+, \dots, I_k^+)}{\Pr(O^- \mid I_1^+, \dots, I_k^+)} = \frac{\Pr(O^+)}{\Pr(O^-)} \times \prod_{i=1}^{k} \frac{\Pr(I_i^+ \mid O^+)}{\Pr(I_i^+ \mid O^-)} \qquad \text{(equation 2)}$$

Each parameter can be derived from epidemiologic data as follows (table 1): $\Pr(O^+)/\Pr(O^-)$ is the probability that an exposed case is due to the occupational exposure, or the attributable fraction of those exposed, divided by the probability that the case is not due to the occupational exposure. The attributable fraction of those exposed (AFE) can be estimated as

$$\mathrm{AFE} = \frac{RR_O - 1}{RR_O}$$

where $RR_O$ is the relative risk of cancer in people with, compared with those without, the occupational exposure. $\Pr(I_i^+ \mid O^+)/\Pr(I_i^+ \mid O^-)$ is the positive likelihood ratio of characteristic $i$ (ie, the relative frequency of the relevant characteristic $i$ in cancer cases due or not due to the occupational exposure).

In the absence of a gold standard test for imputability, these parameters cannot be directly estimated. However, under the hypothesis that the frequency of the relevant characteristic for a given occupation should not differ between cancer cases that are not due to the occupational exposure in question and people without cancer, namely, $\Pr(I_i^+ \mid O^-) = \Pr(I_i^+ \mid \text{no cancer})$, the likelihood ratio can be estimated from this frequency and from $RR_{I^+/I^-}$, where $RR_{I^+/I^-}$ is the relative risk of cancer for people with, compared with those without, the relevant characteristic within the group of exposed persons (appendix). $\Pr(I_i^+ \mid O^+)$ can be derived directly from the relative risks or attributable fractions derived from subgroup analyses (14).
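The step from a relative risk and a frequency to a likelihood ratio can be sketched numerically. The closed form below is an assumption, not an expression taken from the text: it treats the frequency of the characteristic among exposed persons without cancer as Pr(I+ | O-) and compares it with the share of exposed cancer cases carrying the characteristic. It was chosen because it reproduces the intensity likelihood ratios reported in the illustration (0.137 and 1.321). The function name and variable names are hypothetical.

```python
def positive_likelihood_ratio(relative_risk: float, freq_without_cancer: float) -> float:
    """Positive likelihood ratio of a relevant characteristic (assumed closed form).

    relative_risk: relative risk of cancer for exposed persons with, compared
        with those without, the characteristic.
    freq_without_cancer: frequency of the characteristic among exposed persons
        without cancer, used as a stand-in for Pr(I+ | O-).
    """
    if relative_risk <= 0:
        raise ValueError("relative_risk must be positive")
    if not 0.0 <= freq_without_cancer <= 1.0:
        raise ValueError("freq_without_cancer must lie in [0, 1]")
    # Share of exposed cancer cases with the characteristic: p*RR / (p*RR + 1 - p).
    # Dividing that share by p gives the ratio returned here.
    return relative_risk / (1.0 + freq_without_cancer * (relative_risk - 1.0))


# Intensity inputs as read from table 3: (relative risk, frequency without cancer).
table3_intensity = {
    "case 1 (highly exposed)": (8.500, 0.040),
    "case 2 (lightly exposed)": (0.118, 0.160),
    "case 3 (highly exposed)": (1.339, 0.040),
}

for case, (rr, freq) in table3_intensity.items():
    print(f"{case}: LR = {positive_likelihood_ratio(rr, freq):.3f}")
```

Under this assumed form, a likelihood ratio can never exceed the relative risk itself, and it approaches the relative risk when the characteristic is rare among people without cancer.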

Expert consensus
To define characteristics relevant to the application of a Bayesian method to occupational mesothelioma and lung cancer, we selected a group of 10 experts from the following disciplines: occupational health and medicine, respiratory disease, oncology, pharmacology, and epidemiology. Using a postal questionnaire, we asked all of the experts their opinion regarding the relevance of information already used in applying Bayesian methods in other areas, such as the imputability of side effects of drugs (15). Using a simplified Delphi method (16), we listed information judged by a group of experts as being relevant for the imputability of mesothelioma and lung cancer to occupational exposures. We then compiled the initial results into a second postal questionnaire asking the experts to comment upon all of the responses and modify their list of relevant characteristics, if necessary. The experts were not aware of who the other members of the group were.
We then convened a meeting of the experts to finalize the definition of relevant characteristics. Using a nominal group method (16), we first asked the experts to state how they would judge the relevance of a characteristic. Then, using a pairwise comparison method, each expert applied the definition of relevance to rank relevant characteristics issued from the Delphi method. Finally, the average ranking was used to define a list of relevant characteristics to be considered in the Bayesian method.
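The ranking step can be illustrated with a minimal sketch. It assumes that each expert's pairwise comparisons are turned into a rank by counting how often a characteristic is preferred, and that ranks are then averaged across experts; the text does not specify the exact scoring rule, so this is one plausible reading. The expert scores and function names are invented for illustration.

```python
from itertools import combinations
from statistics import mean


def ranks_from_pairwise(items, prefers):
    """Rank items (1 = most relevant) from one expert's pairwise comparisons.

    prefers(a, b) returns True when the expert judges a more relevant than b.
    Ties in the number of wins are broken by the order of `items`.
    """
    wins = {item: 0 for item in items}
    for a, b in combinations(items, 2):
        winner = a if prefers(a, b) else b
        wins[winner] += 1
    ordered = sorted(items, key=lambda item: -wins[item])
    return {item: position for position, item in enumerate(ordered, start=1)}


characteristics = ["intensity", "latency", "qualification"]

# Hypothetical relevance scores standing in for three experts' judgments.
expert_scores = [
    {"intensity": 3, "latency": 2, "qualification": 1},
    {"intensity": 3, "latency": 1, "qualification": 2},
    {"intensity": 2, "latency": 3, "qualification": 1},
]

expert_ranks = [
    ranks_from_pairwise(characteristics, lambda a, b, s=scores: s[a] > s[b])
    for scores in expert_scores
]

# Average rank across experts; a lower value means more relevant.
average_rank = {c: mean(r[c] for r in expert_ranks) for c in characteristics}
final_order = sorted(characteristics, key=lambda c: average_rank[c])
print(final_order)
```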

Illustration
To test the applicability of the method, we convened a 2-day workshop. We invited the initial group of experts and nine other experts with similar backgrounds. After a formal presentation and discussion of the method, we applied the method to a series of cases abstracted from files of the occupational medicine department of our university hospital. In this paper, we report three contrasted cases of mesothelioma and lung cancer potentially due to asbestos; when the epidemiologic data needed to apply the method were not available, we used hypothetical data extrapolated from published data that the experts considered reasonably applicable to these cases.

Results

Expert consensus
Following the nominal group discussion, the experts reached a consensus on the need for a relevant characteristic (i) to be clearly defined, (ii) to have a rational scientific basis, (iii) to have an a priori high informative value, (iv) to correspond to available data, and (v) to be adaptable to many places, times, and populations. Using these judgment items, they selected six relevant characteristics (table 2), from an initial list of eight relevant characteristics resulting from the Delphi questionnaires. Two of the relevant characteristics (duration and intensity of exposure) were finally grouped, as the experts considered that quantitative estimates of exposure should ideally combine both relevant characteristics.

Illustration
Table 2. Definition of five characteristics relevant to the imputation of cancer cases to occupational exposure, resulting from a two-stage consensus process.

| Information | Definition |
| --- | --- |
| Qualification of exposure | Notion of an exposure to a chemical agent, a family of chemical agents, products, or of a task, a profession, or circumstances of exposure (nominal or ordinal scale) |
| Intensity or duration of exposure (a) | Any estimate of the cumulative dose of exposure (quantitative scale) |
| Presence of agent in body | Qualification or quantification of the causal agent in the body |
| Latency | Delay between the beginning of exposure and the diagnosis |
| Characteristics of disease or history (b) | Clinical or laboratory characteristic specific for the cause or notion of another disease potentially due to the same cause |

(a) The intensity and duration of exposure were initially considered separate relevant characteristics.
(b) Information for which consensus was weak.

Case 1 was a 68-year-old man who worked for 33 years, from the age of 20 years, as a plumber. He was heavily exposed to asbestos (mean exposure >10 fibers per cubic centimeter) during all of this period. There was no interstitial disease on the pulmonary scan. A lung biopsy confirmed significant retention of asbestos bodies (>10⁴ asbestos bodies per gram of dry lung tissue). Malignant mesothelioma was diagnosed 48 years after the beginning of his exposure to asbestos.
Case 2 was a 50-year-old man who worked in many different industrial settings in which exposure to asbestos was possible; his level of exposure to asbestos was assessed as being low (ie, <10 fibers per cubic centimeter and per year). This person was not diagnosed with asbestosis and had no asbestos bodies in his lung tissue. However, he developed a mesothelioma 9 years after the beginning of his exposure to asbestos.
Case 3 was a 55-year-old man who worked for >10 years as an insulator; his level of exposure to asbestos was assessed to be high (ie, >10 fibers per cubic centimeter and per year). This person was diagnosed with asbestosis, had a high content of asbestos bodies (ie, more than 1000 bodies per gram of dry lung tissue), and developed lung cancer >10 years after the beginning of his exposure to asbestos.
The data needed to apply the Bayesian imputability method to these three cases are provided in table 3. In France, the estimated probability of exposure to asbestos is 0.20 (17). Given a relative risk of 100 (18), the proportion of mesothelioma attributable to asbestos in patients exposed to asbestos is 99.0%, or a prior odds of imputability of 99 to 1. Given a relative risk of 1.2 (18), the proportion of lung cancer attributable to asbestos in patients exposed to asbestos is 16.7%, or a prior odds of imputability of 1 to 5.
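The two prior odds quoted above can be checked with a short sketch. It assumes the usual attributable fraction among the exposed, AFE = (RR - 1) / RR, which matches the 99.0% and 16.7% figures; the prior odds AFE / (1 - AFE) then simplify to RR - 1. The function names are hypothetical.

```python
def attributable_fraction_exposed(relative_risk: float) -> float:
    """AFE = (RR - 1) / RR, defined here for RR >= 1."""
    if relative_risk < 1.0:
        raise ValueError("this sketch only covers relative risks >= 1")
    return (relative_risk - 1.0) / relative_risk


def prior_odds(relative_risk: float) -> float:
    """Prior odds of imputability: AFE / (1 - AFE), which equals RR - 1."""
    afe = attributable_fraction_exposed(relative_risk)
    return afe / (1.0 - afe)


for cancer, rr in [("mesothelioma", 100.0), ("lung cancer", 1.2)]:
    afe = attributable_fraction_exposed(rr)
    print(f"{cancer}: AFE = {afe:.3f}, prior odds = {prior_odds(rr):.3f}")
```

For a relative risk of 1.2 the prior odds are 0.2, that is 1 to 5, so the evidence from the relevant characteristics has to supply a total likelihood ratio above 5 before causation becomes more likely than not.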
Table 3. Application of the Bayesian imputability method to two mesothelioma cases and a lung cancer case for which imputation to asbestos was considered.

| Characteristic and parameter | Case 1 (mesothelioma) | Case 2 (mesothelioma) | Case 3 (lung cancer) |
| --- | --- | --- | --- |
| Qualification: resulting likelihood ratio | 4.465 (plumber) | none | 1.469 (insulators compared with others exposed) |
| Intensity: exposure level | High | Light | High |
| Intensity: proportion with this exposure level among people without the cancer | 0.040 | 0.160 | 0.040 |
| Intensity: relative risk of the cancer (this level compared with others exposed) | 8.500 | 0.118 | 1.339 |
| Intensity: resulting likelihood ratio | 6… | 0.137 | 1.321 |

For case 1, the likelihood ratio related to available information was 4.465 for "being a plumber" […], 5.195 for "high content of asbestos bodies" [probability of exposure derived from reference 22 and relative risk from reference 23], 9.167 for "long latency" [probability of exposure derived from reference 21 and relative risk from reference 24], and 0.328 for "no asbestosis" [probability of exposure derived from reference 25 and relative risk from reference 24]. The total likelihood ratio is 455.767, and the posterior odds is 45 120.887 (approximately 45 121 to 1), or a posterior probability of 0.999.

For case 2, the likelihood ratio related to available information was 0.137 for "low level of exposure" [probability of exposure derived from reference 21 and relative risk from reference 18], 0.172 for "low content of asbestos bodies" [probability of exposure derived from reference 22 and relative risk from reference 23], 0.011 for "short latency" [probability of exposure derived from reference 21 and relative risk from reference 24], and 0.328 for "no asbestosis" [probability of exposure derived from reference 25 and relative risk from reference 24]. The total likelihood ratio is less than 0.0001, and the posterior odds is 0.0086 (3 to 350), or a posterior probability of 0.008.
For case 3, the likelihood ratio related to available information was 1.469 for "being an insulator" [probability of exposure derived from reference 26 and relative risk from reference 27], 1.321 for "high level of exposure" [probability of exposure derived from reference 21 and relative risk from reference 18], 1.329 for "high content of asbestos bodies" [probability of exposure derived from reference 22 and relative risk from reference 28], 1.346 for "long latency" [probability of exposure derived from reference 21 and relative risk from reference 18], and 2.885 for "asbestosis" [probability of exposure derived from reference 29 and relative risk from reference 29]. The total likelihood ratio is 10.018, and the posterior odds is 2.004 (2 to 1), or a posterior probability of 0.667.
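The combination step for cases 2 and 3 can be reproduced from the rounded likelihood ratios listed in the text. The sketch below multiplies the prior odds by each likelihood ratio, as in equation 2, and converts the result to a probability. Because it starts from rounded values, its output differs slightly in the last digits from the totals reported above. The function name is hypothetical.

```python
from math import prod


def posterior(prior_odds: float, likelihood_ratios: list[float]) -> tuple[float, float]:
    """Return (posterior odds, posterior probability) for one case."""
    odds = prior_odds * prod(likelihood_ratios)
    return odds, odds / (1.0 + odds)


# Case 3 (lung cancer): prior odds 0.2 (relative risk 1.2).
case3_lrs = [1.469, 1.321, 1.329, 1.346, 2.885]
odds3, prob3 = posterior(0.2, case3_lrs)

# Case 2 (mesothelioma): prior odds 99 (relative risk 100).
case2_lrs = [0.137, 0.172, 0.011, 0.328]
odds2, prob2 = posterior(99.0, case2_lrs)

print(f"case 3: odds = {odds3:.3f}, probability = {prob3:.3f}")
print(f"case 2: odds = {odds2:.4f}, probability = {prob2:.3f}")
```

The two cases show the prior and the evidence pulling in opposite directions: case 3 starts from low prior odds and ends above 1, whereas case 2 starts from prior odds of 99 and ends far below 1.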

Discussion
In this study, we have shown how Bayesian reasoning can be adapted to the important issue of imputing cancers to occupational exposures. The two-stage consensus work allowed us to define which relevant characteristics should be used in applying the method. Although the relevant characteristics defined by the experts are the information usually sought in traditional expert assessments of occupational diseases (6, 7), the Bayesian imputability method provides an explicit way to quantify the causal relationship. The illustration showed the applicability and potential usefulness of the method in strengthening conclusions about imputability, whether by confirming a strong a priori feeling (case 1), by supporting imputability despite low prior odds (case 3), or by excluding it despite high prior odds (case 2). The last result can be counterintuitive, but it illustrates how probabilistic reasoning can modify the judgment of experts. Indeed, experts primarily use their prior knowledge of the issue, using epidemiologic data that are rarely specific enough to be applicable to a given person (30). Furthermore, these illustrations are only meant to document the applicability of the method; a definitive conclusion regarding the cases should not be reached without a critical appraisal of the studies used and an analysis testing the sensitivity of final estimates to the data used. For instance, if we had used a relative risk of 5.0 instead of 3.0 to assess the effect of asbestosis in case 3, the resulting total likelihood ratio would have been 16.079, increasing the posterior odds to 3.216 or approximately 3 to 1.
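The sensitivity check on asbestosis in case 3 can be sketched as follows. Two elements are assumptions rather than values given in the text: the closed form used for the likelihood ratio, and the frequency of asbestosis among exposed persons without lung cancer (0.02), which was back-calculated so that a relative risk of 3.0 yields the reported likelihood ratio of 2.885. The remaining inputs (total likelihood ratio 10.018, prior odds 0.2) are those of the illustration.

```python
def positive_likelihood_ratio(relative_risk: float, freq_without_cancer: float) -> float:
    """Assumed form: RR / (1 + p * (RR - 1)), with p the frequency without cancer."""
    return relative_risk / (1.0 + freq_without_cancer * (relative_risk - 1.0))


FREQ_ASBESTOSIS = 0.02      # assumption: back-calculated, not stated in the text
PRIOR_ODDS_CASE3 = 0.2      # relative risk of 1.2 for lung cancer
OTHER_LRS = 10.018 / 2.885  # product of the four likelihood ratios other than asbestosis

results = {}
for rr in (3.0, 5.0):
    lr_asbestosis = positive_likelihood_ratio(rr, FREQ_ASBESTOSIS)
    total_lr = OTHER_LRS * lr_asbestosis
    results[rr] = (lr_asbestosis, total_lr, PRIOR_ODDS_CASE3 * total_lr)
    print(f"RR = {rr}: asbestosis LR = {lr_asbestosis:.3f}, "
          f"total LR = {total_lr:.3f}, posterior odds = {results[rr][2]:.3f}")
```

Under these assumptions the sketch lands within rounding error of the figures quoted above, and it shows that a single relative risk drawn from one study can move the posterior odds by more than half.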
The use of probabilistic approaches similar to that developed in this paper is well accepted in other fields. For instance, in the diagnosis and prognosis of cancer, many prediction rules are available and currently used at the bedside (8). The concept of the probability of causation, discussed for instance by Armstrong & Thériault (11), is a direct expression of the attributable fraction of the exposed (AFE). The probability of causation is computed from relative risks specific to groups having the same characteristics (age, smoking, and the like) as the patient for whom it is computed. Existing methods are similar to ours, whenever the probability of exposure is set at 1, which is not reasonable in many instances.
One issue we did not specifically address in developing and illustrating the method is that of multiple causes. This issue has been considered in assessments of the probability of causation by radiation. In the absence of knowledge about interaction (multiplicative or additive), it is assumed that the risk at a given age is the sum of the risks from each exposure (31). Developments allow for an extension of this approach to any situation with multiple causes, for instance, for apportioning risk to asbestos and smoking exposures when interaction testing for the additivity or multiplicativity of risks is taken into account (32,33). To compensate for the hypothesis of independent exposures, our method can be extended in two ways. First, a posterior odds of imputability can be derived for each of the suspected causes; for instance, a posterior odds of imputability of the cancer to smoking can be derived from the attributable fraction of smokers and from likelihood ratios of characteristics of the exposure to smoking. Second, the fact that the patient smokes can be considered an informative characteristic when the odds for an occupational exposure are calculated. Although the resulting posterior odds are not only odds of imputation, but also odds of developing the cancer, this extension may help when rules are defined for the sharing of compensation when several causes have high posterior odds. This would be a dramatic change to current systems of compensation in which one has to use a dichotomized distinction between due or not due to a single cause for which the claim is made.
Many of the data needed to estimate prior odds are available from the epidemiologic literature. Indeed, a demonstration of a causal relationship, at the collective level, starts with the estimation of the relative risks needed to compute the prior odds (12). Data on the frequency of exposure are scarcer; in France, for instance, such data are routinely collected by social security agencies, but they are seldom published. Register data can be an alternative for use in estimating the impact of occupational exposures on cancer in a population (34,35). Occupation and industry classifications in general population studies could also be used to study risk by job or to infer exposure to specific agents through job-exposure matrices (36). Some exposure data, furthermore, can be extracted from case series in representative case-control studies (37,38). Data regarding the frequency of relevant characteristics for a person without cancer are less accessible. For most occupational exposures, indeed, data are either missing or derived from weak studies; the use of such data in applying our method, or any other probabilistic method, could give a false impression of precision and objectivity. Therefore, when the method is applied with weak data, it would be important to provide some measures of uncertainty. At the least, instead of using only point estimates, one could use upper or lower confidence limits, as suggested for instance by Armstrong et al (39).
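The use of confidence limits instead of point estimates alone can be sketched as below. The confidence limits (1.05 and 1.40) around the lung cancer relative risk of 1.2 are hypothetical values chosen for illustration only; the total likelihood ratio is the one reported for case 3. The sketch relies on the prior odds being equal to the relative risk minus 1, which follows from the attributable fraction among the exposed.

```python
def posterior_odds(relative_risk: float, total_likelihood_ratio: float) -> float:
    """Posterior odds when prior odds = RR - 1 (from AFE = (RR - 1) / RR)."""
    if relative_risk < 1.0:
        raise ValueError("this sketch only covers relative risks >= 1")
    return (relative_risk - 1.0) * total_likelihood_ratio


TOTAL_LR_CASE3 = 10.018  # total likelihood ratio reported for case 3

# Point estimate from the illustration; both limits are hypothetical.
rr_scenarios = {
    "lower limit": 1.05,
    "point estimate": 1.20,
    "upper limit": 1.40,
}

odds_by_scenario = {
    label: posterior_odds(rr, TOTAL_LR_CASE3) for label, rr in rr_scenarios.items()
}

for label, odds in odds_by_scenario.items():
    print(f"{label}: posterior odds = {odds:.2f}, "
          f"probability = {odds / (1.0 + odds):.2f}")
```

With these hypothetical limits the posterior odds range from below 1 to about 4, so the conclusion about imputability could change depending on which end of the interval is used. Reporting the whole range avoids the false impression of precision mentioned above.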
Like any measurement tool, our method needs external validation. Information defined as relevant by our expert group may be considered differently in other contexts; for instance, the presence of the agent in the body is not relevant for the imputation of cancer to nontraceable agents. The way the method was developed, however, including the use of consensus techniques to define the content, can be considered an appropriate first approach to content validation (40). Obviously, an assessment of criterion validity is impossible, due to the absence of a gold standard test. Nevertheless, construct validity should be assessed formally by applying the method to a set of observations extracted from national compensation systems; this assessment should include both cases for which the occupational cause has been accepted and cases for which it has been rejected. Furthermore, the intra- and interobserver reliability must be formally tested, as well as the agreement with expert judgment. The latter study, however, would be more a study of acceptability than of reliability, as a basic assumption of any prediction rule is that expert judgment allows for too much variation (40). Acceptability of the method will strongly depend on the familiarity of potential users with epidemiologic concepts. Most clinicians, occupational health physicians, and researchers who deal with cancer are familiar with the epidemiologic literature and related concepts of relative risk and causality. The fact that causality reasoning is not limited to estimating a relative risk is also known (12). Other important concepts, such as effect modification and likelihood ratios, are more difficult to grasp. Moreover, the literature is poorer with respect to estimates reflecting the role of these effect modifiers, as they imply subgroup analyses and are confronted with problems of statistical power and accuracy of measurement.
For example, the assessment of "intensity or duration of exposure" implies a well-documented dose-response relationship, available only for a few exposures such as asbestos (25) or radiation (29). We have shown that these are the more informative characteristics. This lack of data adds to the current difficulty of using epidemiologic information to analyze individual cases and underscores the need to adapt information systems to secure the availability of relevant data (41). The societal acceptability of our method will depend on the level of acceptability of using probabilities in circumstances of uncertainty. A possible obstacle to the use of probabilistic models might be the difficulty of law courts in accepting uncertainty. In such contexts, the expected response to the question of imputability is a yes or no answer, and the trust in expert opinion is higher than that in explicit rules. Furthermore, the generalization of our method and its extensions to multiple causes cannot be considered without the key issue of how to deal with shared responsibility when a patient receives compensation.