Effect of incomplete exposure assessment on epidemiologic dose-response analyses.

OBJECTIVES - When potentially hazardous agents have multiple environmental sources, failure to include all exposure sources can constitute a type of measurement error. In addition, the effects of exposure from one source can be confounded by exposure to other sources of the same agent. In this study clarification of these concepts is sought, and the direction and magnitude of the resulting bias in epidemiologic measures of association are examined.
METHODS - The bias in dose-response functions when the exposure data omit some sources of the agent was estimated with linear and log-linear models to compute risk differences and risk ratios under different assumptions about the magnitude and correlation of exposures from measured and unmeasured sources.
RESULTS - With unmeasured exposure of constant magnitude, there is no bias when a measure of association of the appropriate form (difference measures for additive dose-response processes, ratios for multiplicative ones) is selected. When the magnitude of unmeasured exposure varies, the result is nondifferential measurement error that can bias observed dose-response relations upward or downward, depending on the pattern of measurement error and the measure of association.
CONCLUSIONS - Failure to measure all sources of exposure to an agent and account for them in the analysis can bias the results of epidemiologic studies. When it is not feasible to measure all exposure sources, the magnitude of bias can be predicted by estimating the distribution of omitted exposures from external data or substudies. Sensitivity analyses are particularly useful for estimating the direction and magnitude of potential bias from incomplete exposure assessment.

workplaces and residential environments, and within each of these areas there are many discrete sources of exposure, including building wiring, household appliances, power lines, video display terminals, and electrical machinery, of which none is clearly predominant over the others. Yet among the dozens of epidemiologic studies conducted on electromagnetic fields, few have simultaneously considered appliances and other residential exposures, and none have considered both occupational and nonoccupational exposures (2,3).
Epidemiologic studies whose goal is to quantify dose-response relations between specific agents and health outcomes require accurate measurements of exposure. When the agent itself is of interest, without regard to the source of exposure, epidemiologic studies should clearly incorporate all sources of exposure to the agent in order to yield accurate results. Constructing dose-response functions using some but not all sources constitutes measurement error.
Epidemiologic studies may also seek to isolate the effects of a particular source of exposure to an agent. Regulation and individual decision making frequently focus on the benefits of changing a single source of exposure; occupational health standards, in particular, consider only workplace exposures. If exposure from the source of concern is measured accurately, omission of other sources does not, in principle, constitute measurement error. Nevertheless, exposures from sources other than the one of primary interest are a potential cause of the same health effects. In such situations, exposures from such secondary sources can act as confounders of the effect of the source of primary concern. The potential importance of the omitted sources of exposure clearly increases when their magnitude is large, relative to exposures from the primary source.
The preceding goals of measuring the overall effect of exposure and measuring the effect of a single source are often blended in practice. This kind of mixture is particularly common in occupational studies, in which workers' experience is observed to develop dose-response data for the agent in general, and not only for workplace exposures.
Several types of measurement error can be involved when the assessment of exposure is incomplete. The simplest type is systematic omission of an exposure component of constant magnitude, such that the true level of exposure is uniformly understated for every person or group in a study. Such an omission might occur, for example, if exposures to cosmic radiation were omitted from estimates of exposure to ionizing radiation from other sources.
The other type of measurement error considered in this study is the systematic omission of a component of exposure whose magnitude varies among persons or groups with different measured exposures and which therefore results in an underestimation of true exposure by a nonconstant amount. This type of error might occur, for example, if, in addition to occupational exposure to formaldehyde, workers also receive a variable contribution from residential formaldehyde exposure. The magnitude of this type of measurement error can be positively or negatively correlated with true exposure, or independent of it.
In this paper, we examine the consequences of failure to account for multiple sources of exposure to a single agent. Using simple dose-response models and hypothetical examples from studies of exposure to electromagnetic fields, we have first considered the problem of empirically estimating an unknown dose-response function when the available exposure data omit some sources and then examined the potential for multiple exposure sources to act as confounders when the independent effect of a single source is of interest. Although the principles are simple, these issues have not to our knowledge been directly addressed in the epidemiologic literature.

Risk models
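Two simple mathematical models are frequently used to describe the exposure-disease relationship in a population exposed to a specific, disease-causing agent. In linear form, population risk is modeled as $R = r_0 + \beta X$, and in log-linear form as $R = e^{r_0 + \beta X}$, where $X$ is the exposure, $\beta$ is the change in risk (or in log risk) per unit of exposure, and $r_0$ is the intercept for an unexposed population.

Scand J Work Environ Health 1994, vol 20, no 3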

Measures of association
The effect of agents on disease occurrence is often assessed by estimating the parameters of linear or log-linear risk models from population data. The risk difference (RD) and risk ratio (RR) are common measures of association in epidemiologic studies. Under the linear dose-response model, $RD = \beta X$ and $RR = 1 + \beta X / r_0$ for the comparison of a population with exposure $X$ to an unexposed referent population. The same comparison under a log-linear dose-response model yields $RR = e^{\beta X}$ and $RD = e^{r_0}(e^{\beta X} - 1)$.
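The four expressions for RD and RR can be checked numerically. The following Python sketch evaluates both measures directly from the two risk models, assuming the linear model has the form r0 + beta * X and the log-linear model the form exp(r0 + beta * X). The function names and all parameter values are illustrative assumptions, not values from the study.

```python
import math

def linear_risk(x, r0, beta):
    """Risk under the (assumed) linear model: r0 + beta * x."""
    return r0 + beta * x

def loglinear_risk(x, r0, beta):
    """Risk under the (assumed) log-linear model: exp(r0 + beta * x)."""
    return math.exp(r0 + beta * x)

def risk_difference(risk, x, r0, beta):
    """RD for exposure x versus an unexposed referent (x = 0)."""
    return risk(x, r0, beta) - risk(0.0, r0, beta)

def risk_ratio(risk, x, r0, beta):
    """RR for exposure x versus an unexposed referent (x = 0)."""
    return risk(x, r0, beta) / risk(0.0, r0, beta)

# Hypothetical parameter values, chosen only for illustration.
X = 4.0
lin_r0, lin_beta = 0.001, 0.0005           # linear model
log_r0, log_beta = math.log(0.001), 0.1    # log-linear model

rd_lin = risk_difference(linear_risk, X, lin_r0, lin_beta)     # beta * X
rr_lin = risk_ratio(linear_risk, X, lin_r0, lin_beta)          # 1 + beta * X / r0
rd_log = risk_difference(loglinear_risk, X, log_r0, log_beta)  # exp(r0) * (exp(beta * X) - 1)
rr_log = risk_ratio(loglinear_risk, X, log_r0, log_beta)       # exp(beta * X)
```

Computing the measures from the risk functions, rather than hard-coding the closed forms, makes it easy to confirm that the two agree for any choice of parameters.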

Components of exposure
When the measurement of exposure is incomplete, a person's true exposure, $x_t$, can be viewed as $x_m + x_u$, the sum of measured and unmeasured components of exposure, respectively. Since there may be multiple sources of unmeasured exposures, the average true exposure of a population with measured exposure $X_m$ can be written as $X_m + X_u$, where $X_u = \sum_i P_i X_{ui}$, with $P_i$ as the prevalence of unmeasured exposure source $i$, and $X_{ui}$ as its average magnitude. In the slightly modified form $X_1 + \sum_i P_i X_{ni}$, where $X_{ni}$ is the intensity of each of $n$ secondary exposure sources and $P_i$ is its prevalence, this expression can also describe the situation in which multiple sources of exposure exist, but one is considered to be of primary interest. In such cases, variations in the prevalence and magnitude of the secondary sources in relation to the level of the primary exposure are of concern, with confounding being present when $P_i$ or $X_{ni}$ vary across levels of the primary source, $X_1$.

The preceding expressions treat exposures as continuous variables, and this treatment allows them to be conveniently summarized by population means. However, we have focused subsequent discussion on examples using discrete, ordinal exposure data; they are more typical of epidemiologic analyses and make the illustrations clearer.
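As a small worked illustration of the expression for average true exposure, the sketch below sums prevalence times average magnitude over a set of unmeasured sources and adds the result to the measured exposure. The sources, prevalences, and magnitudes are hypothetical and serve only to show the arithmetic.

```python
def mean_unmeasured_exposure(sources):
    """Sum of prevalence * average magnitude over the unmeasured sources."""
    return sum(prevalence * magnitude for prevalence, magnitude in sources)

def mean_true_exposure(measured, sources):
    """Average true exposure: measured component plus unmeasured component."""
    return measured + mean_unmeasured_exposure(sources)

# Hypothetical unmeasured sources as (prevalence, average magnitude in mG).
unmeasured_sources = [
    (1.0, 0.5),  # a source present for everyone
    (0.4, 1.0),  # a source affecting 40% of the population
    (0.1, 2.0),  # a source affecting 10% of the population
]

x_u = mean_unmeasured_exposure(unmeasured_sources)  # 0.5 + 0.4 + 0.2
x_t = mean_true_exposure(2.0, unmeasured_sources)   # 2.0 + x_u
```

If the prevalences or magnitudes differed between groups defined by the measured (or primary) exposure, the same function would be called once per group with group-specific inputs.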

Effect of unmeasured exposure
In practice, measurements of exposure are often used to estimate dose, and these exposures themselves may be measured with error, as when some sources of the agent are not taken into account. In these situations it is of interest to know how severely observed risk coefficients or related effect measures like the risk difference and risk ratio are biased relative to the values that would be obtained with complete, accurate exposure data.
To examine the impact of incomplete exposure measurement on observed dose-response relations, we used the linear and log-linear risk models to calculate measures of association based on varying assumptions about the magnitude and relationship of measured and unmeasured exposures. The following two types of systematic measurement error resulting from the omission of some sources of an agent were considered: (i) underestimation of exposure by a constant amount and (ii) underestimation by an amount which varies with the measured level of exposure and which may be positively or negatively correlated with it or independent of it. We have assumed, in addition, that the agent causes disease, that there is no random error in measuring exposure nor any bias from sources other than incomplete treatment of exposure, and that exposures from all sources act quantitatively and qualitatively in the same way.

Constant exposure component omitted
The consequences of measuring only one among several sources of exposure to an agent are easily predicted when the quantity of unmeasured exposure is constant across the study population and simply adds an increment to the average dose. If the true dose-response function is assumed to be linear, it can be shown algebraically that the observed risk difference is equal to $\beta X$, the true risk difference. Underestimation of exposure by a constant amount therefore has no effect on risk differences observed for measured exposures. However, risk ratios are consistently underestimated in this situation. (See the appendix.) The results are reversed for a log-linear dose-response relation in that observed risk ratios equal the true risk ratio $e^{\beta X}$, whereas risk difference measures overstate the excess risk at any level of measured exposure. (See the appendix.) Since the coefficient $\beta$ represents the change in risk or relative risk per unit of exposure, the same principles apply whether exposure is considered on a continuous or categorical scale.

Now consider a hypothetical study of leukemia among electrical workers (table 1). Workers with measured occupational exposures to magnetic fields of 1, 2, 4, 8, or 16 mG, as shown in the first column of table 1, have cumulative exposures ranging from 2 to 29 mG-years, on the assumption of 40 h of exposure per week in the measured field for 10 years. If all of the workers were also exposed over the same interval to average background fields of 1 mG during the 128 h they spent away from work each week, the occupational exposure data would underestimate their total exposure by an equal amount (about 7 mG-years) for workers in each of the five exposure groups.

Table 1. Example of the effect of constant 1-mG unmeasured exposure on a dose-response relation observed in a hypothetical study of leukemia among workers exposed to magnetic fields

Comparison of the risk ratios and risk differences in table 1 shows that uniform underestimation of exposure under the log-linear risk model causes the risk difference for each exposure category to be overstated and steepens the dose-response curve it describes (table 1). The risk ratios and the slope they describe are measured correctly, however. The observed risk coefficient estimated from the data in continuous, rather than categorized, form is also unbiased (table 1).
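The pattern described for a constant omitted component can be verified with a short calculation. In the sketch below, the "observed" measures compare groups whose true exposures are the measured amount plus a constant and the constant alone, while the "true" measures compare the measured amount with a genuinely unexposed referent. The model forms (r0 + beta * x and exp(r0 + beta * x)), the function names, and all numerical values are assumptions for illustration; the code does not reproduce table 1.

```python
import math

def linear_risk(x, r0, beta):
    return r0 + beta * x

def loglinear_risk(x, r0, beta):
    return math.exp(r0 + beta * x)

def observed_and_true(risk, measured, omitted, r0, beta):
    """Measures of association with and without a constant omitted exposure."""
    exposed = risk(measured + omitted, r0, beta)   # true risk in the exposed group
    referent = risk(omitted, r0, beta)             # referent is not truly unexposed
    true_exposed = risk(measured, r0, beta)
    true_referent = risk(0.0, r0, beta)
    return {
        "rd_obs": exposed - referent,
        "rr_obs": exposed / referent,
        "rd_true": true_exposed - true_referent,
        "rr_true": true_exposed / true_referent,
    }

# Hypothetical values: 8 mG-years measured, 7 mG-years omitted for everyone.
measured, omitted = 8.0, 7.0
lin = observed_and_true(linear_risk, measured, omitted, r0=1e-4, beta=1e-5)
loglin = observed_and_true(loglinear_risk, measured, omitted,
                           r0=math.log(1e-4), beta=0.05)

# Linear model: RD is unbiased, RR is understated.
# Log-linear model: RR is unbiased, RD is overstated.
```

Under the linear model the constant cancels in the difference but inflates the referent risk in the ratio; under the log-linear model it cancels in the ratio but scales the difference by exp(beta * omitted).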

Nonconstant exposure component omitted
Failure to measure exposures from all sources has more complex effects when the intensity or prevalence of unmeasured exposures is not constant but varies relative to the measured exposure. The direction and magnitude of the resulting bias then depend on the magnitude of the unmeasured exposure and its relationship to the measured component (appendix). In general, unmeasured exposures whose magnitude is positively correlated with measured exposure tend to amplify dose-response relationships, while negatively correlated unmeasured exposures diminish them.
As an illustration of this type of bias, consider again the hypothetical electrical workers in the previous example. The intensity of measured and unmeasured exposures remains as before, but this time the duration and intensity of occupational exposure are inversely correlated, and the workers are followed for 20 years (table 2). With this pattern of employment, workers with the most intense occupational exposures tend to remain on the job for the shortest time and consequently have the largest component of unmeasured cumulative exposure. Risk differences in this situation are consistently overstated at every level of exposure (table 2). However, this particular pattern of unmeasured exposure results in negligible overestimation of risk ratios in the three intermediate exposure groups, with a more substantial overestimate in the high-exposure category (table 2).
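The direction of the bias can be sketched with a deliberately simplified case: a log-linear model in which the unmeasured exposure is an exact linear function of the measured exposure. Under that assumption, regressing log risk on measured exposure alone gives a slope of beta * (1 + d), where d is the change in unmeasured exposure per unit of measured exposure. The linear dependence, the parameter values, and the function names below are assumptions for illustration and do not correspond to the pattern in table 2.

```python
import math

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def observed_slope(measured, unmeasured, r0, beta):
    """Slope of true log risk regressed on the measured exposure alone."""
    log_risk = [r0 + beta * (m + u) for m, u in zip(measured, unmeasured)]
    return ols_slope(measured, log_risk)

measured = [1.0, 2.0, 4.0, 8.0, 16.0]
r0, beta = math.log(1e-4), 0.05  # hypothetical log-linear parameters

constant = [7.0 for _ in measured]            # same omission for every group
positive = [7.0 + 0.5 * m for m in measured]  # omission rises with measured exposure
negative = [7.0 - 0.25 * m for m in measured] # omission falls with measured exposure

slope_constant = observed_slope(measured, constant, r0, beta)  # beta
slope_positive = observed_slope(measured, positive, r0, beta)  # beta * 1.5
slope_negative = observed_slope(measured, negative, r0, beta)  # beta * 0.75
```

The positively correlated omission steepens the observed slope and the negatively correlated one flattens it, even though true exposure is understated in every group.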

Confounding by other exposure sources
If an agent causes disease and there are several correlated sources of that agent, then each source satisfies the definition of a confounder for the effects of the others. Thus the effects of exposure from one source considered in isolation may be confounded by exposures received from other sources (which may be measured or unmeasured). Where the focus of a study is the effect of exposure from a specified source, exposures to the same agent are conceptually identical to confounding exposures to other disease-causing agents.
A hypothetical study of childhood cancer in relation to exposure to magnetic fields from electric blankets during gestation provides a useful illustration of this problem (table 3). Suppose that high exposures from electric blankets are correlated with high ambient magnetic field exposures from other residential sources because geographic conditions that encourage extended use of electric blankets through much of the year also lead to high use of electric heat, lighting, and the like. In the example, estimated exposures from blankets and other sources (4) are dichotomized into high and low categories, the prevalence of high ambient exposure being assumed to be 80% among those with high blanket exposure and 20% among those with low blanket exposure.
Considering only exposure from electric blankets, ignoring exposures from other residential sources, is equivalent to evaluating the crude risk ratio for high versus low blanket exposure, which has a value of 2.7 under the log-linear risk model (table 3). However, the risk ratio for the same comparison stratified by the level of exposure to ambient magnetic fields is 2.0 in each stratum of ambient exposure (table 3). Since the stratum-specific ratios are identical, a risk ratio for high versus low blanket exposure adjusted for other residential exposures would also equal 2.0. Thus, like other confounders (5), exposures from different sources produce differences between the crude and adjusted estimates. The risk ratio for blanket exposure adjusted for other residential exposures is the more valid estimate of the independent effect of magnetic fields from electric blankets. Evaluation of the health consequences of modifying exposures from electric blankets would be distorted by failure to consider ambient magnetic fields, just as it would by confounding by any other factor.
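The gap between crude and stratified risk ratios can be reproduced in outline with a few lines of Python. The sketch uses the prevalences stated in the text (80% and 20%) and a blanket risk ratio of 2.0, but the risk ratio for ambient exposure is not given in the text, so the value of 2.0 used here is a hypothetical assumption. The crude result it yields therefore need not match the 2.7 reported for table 3.

```python
def blanket_risk_ratios(p_amb_high_blanket, p_amb_low_blanket,
                        rr_blanket, rr_ambient):
    """Return (crude RR, stratum-specific RRs) for high vs low blanket exposure.

    Risks are multiplicative (log-linear). The baseline risk is set to 1
    because it cancels in every ratio.
    """
    def risk(blanket_high, ambient_high):
        return ((rr_blanket if blanket_high else 1.0)
                * (rr_ambient if ambient_high else 1.0))

    def average_risk(blanket_high, p_ambient_high):
        # Mix the two ambient strata by their prevalence within this blanket group.
        return (p_ambient_high * risk(blanket_high, True)
                + (1.0 - p_ambient_high) * risk(blanket_high, False))

    crude = (average_risk(True, p_amb_high_blanket)
             / average_risk(False, p_amb_low_blanket))
    stratified = {
        "low ambient": risk(True, False) / risk(False, False),
        "high ambient": risk(True, True) / risk(False, True),
    }
    return crude, stratified

# rr_ambient=2.0 is an assumed value, not taken from the study.
crude_rr, stratum_rr = blanket_risk_ratios(0.8, 0.2, rr_blanket=2.0, rr_ambient=2.0)
```

With these inputs the stratum-specific ratios both equal the blanket effect, while the crude ratio is larger because high ambient exposure is concentrated among those with high blanket exposure.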

Discussion
Table 2. Example of the effect of 1.0-mG unmeasured exposure with varying duration of measured exposure on a dose-response relation observed in a hypothetical study of leukemia among workers exposed to magnetic fields (log-linear risk model, estimation of risk as in ...)

We have shown that failure to account fully for exposure to the agent of interest can challenge the quantitative interpretation of exposure-disease associations obtained from epidemiologic data. When the true exposure is understated by a constant amount, bias can be avoided by specifying the correct analytical model: ratio measures for log-linear dose-response relationships and difference measures for linear dose-response relationships. Unfortunately, the appropriate model may not always be apparent. The biological action of toxic agents is rarely well enough understood to dictate the correct model uniquely (6), and measurement error makes it more difficult to distinguish empirically between several alternatives.
When the magnitude or prevalence of the unmeasured exposure components varies depending on the magnitude of measured exposure, the effect on the observed results is not as easily predicted, and selecting an appropriate analytic model will not necessarily remove the bias due to measurement error. This observation adds to the evidence from recent papers (7-9) that it is not universally true that nondifferential error in measuring exposure attenuates exposure-disease associations (10).
The tendency for purely random nondifferential exposure measurement error to attenuate observed exposure-disease associations is well known (11). However, our analysis of situations in which the magnitude of nondifferential measurement error is associated with measured exposure indicates that the slope of dose-response relations can be biased upward or downward, even though the measurement error is independent of individual disease status and always understates the true level of exposure. Although standard texts note that coefficients in regression models can be biased in a similar way by correlated errors in measuring predictor variables (12), the consequences for epidemiologic research do not appear to be widely appreciated.
We have also shown that exposures received from multiple sources of a single agent may act as confounders of an association that has been chosen as the one of primary interest. Exposures received at different times would act analogously. For example, given a focus on exposure within particular "windows" of age (13), the effect may be confounded by earlier or later exposures that are correlated with the one of interest (14).
Our analysis suggests some strategies for eliminating or reducing the potential bias from incomplete assessment and analysis of exposure. When multiple exposure sources are known or suspected, the preferred solution is obviously to identify and measure all of the sources. Some biological markers, blood lead, for example, provide a measure of exposure from all sources combined, although they also make it impossible to separate their effects for assigning risk to specific sources of exposure, or benefits to their elimination.
When data describing exposure from all sources are obtained, they must also be analyzed with methods that are appropriate to the scientific goals. If the principal goal is to characterize a dose-response relation for a particular agent-disease combination, it can be met most directly by summing measured exposures from all sources to estimate total exposure. In studies that seek to isolate the independent effect of a particular exposure source, control for potential confounding by exposures from other sources is necessary.
When direct information cannot be obtained for all exposure sources, external data or a smaller substudy may help to characterize the relative contributions of the measured and unmeasured sources and also the relationship between them. If the exposure sources are uncorrelated, then, ideally, knowledge of the underlying dose-response relation would allow selection of an analytic model not likely to be biased by the unmeasured source. At a minimum, estimates of the magnitude of exposures from various sources and their relationships can be used to estimate the bias from studying the measured source alone. A sensitivity analysis that considers a range of possibilities would place bounds on the uncertainty in measuring the dose-response relation in the population. The principles developed in this study provide the basis for more systematic speculation about potential bias from incompletely evaluated exposure.
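One way such a sensitivity analysis might be organized is sketched below. It assumes a log-linear model and, as a simplification not taken from the study, that unmeasured exposure changes by a fixed amount d per unit of measured exposure, so that the observed slope equals the true slope times (1 + d). A range of plausible values of d then yields a range of corrected risk ratios. The function names, the observed slope, and the scenario values are all hypothetical.

```python
import math

def corrected_beta(observed_beta, dependence):
    """True log-linear slope implied by an observed slope, assuming unmeasured
    exposure changes by `dependence` units per unit of measured exposure."""
    if dependence <= -1.0:
        raise ValueError("dependence must exceed -1 for the correction to be defined")
    return observed_beta / (1.0 + dependence)

def sensitivity_bounds(observed_beta, dependences, exposure):
    """Smallest and largest corrected risk ratio at `exposure` across scenarios."""
    risk_ratios = [math.exp(corrected_beta(observed_beta, d) * exposure)
                   for d in dependences]
    return min(risk_ratios), max(risk_ratios)

observed_beta = 0.05                 # hypothetical slope from measured exposure only
scenarios = [-0.25, 0.0, 0.25, 0.5]  # assumed dependence of unmeasured on measured
exposure = 8.0                       # exposure level at which to report the RR

naive_rr = math.exp(observed_beta * exposure)
low_rr, high_rr = sensitivity_bounds(observed_beta, scenarios, exposure)
```

In practice the scenario values would come from external data or a substudy describing how the omitted sources relate to the measured one; the sketch only shows how such estimates translate into bounds on the dose-response relation.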