Statistical evaluation of the results of measurements of occupational exposure to air contaminants

ULFVARSON, U. Statistical evaluation of the results of measurements of occupational exposure to air contaminants: A suggested method of dealing with short-period samples taken during one whole shift when noncompliance with occupational health standards is being determined. Scand. j. work environ. & health 3 (1977) 109-115. The problem of the representative sampling of short-period samples of air contaminants in the breathing zone of one employee during one shift is discussed. Different types of error and different sources of variation in sampling and the analysis of such samples are dealt with. It is suggested that noncompliance with the time-weighted average limit can be established by a combined test procedure: (a) when a direct comparison between the observed mean and the standard is made and the observed mean is above the standard, noncompliance is established; (b) if the observed mean ($\bar{x}$) is below or equal to the standard, noncompliance is also established if the observed standard deviation is too large when compared to $\mu - \bar{x}$, where $\mu$ is a value above the standard. The consequences of the suggested combined test in establishing noncompliance are discussed. The criteria are considered to be simple to use and comprehensible, which is a demand, since the results of observations of air contaminant concentrations in the workroom are used by employees and laymen. The choice of the criteria implies certain tolerance limits of the population of observations during a shift. These tolerance limits are considered to be in reasonable accord with excursion factors above the time-weighted average limits.

[...] of the air contaminant in the breathing zone of exposed employees. The information obtained is usually compared to an occupational health standard, but it may also be used in epidemiologic studies for the determination of the correlation between exposure and illness.
The measurements are subject to errors of observation, which are to some extent due to analytical errors, i.e., errors in the sampling procedure (handling of sampling equipment) and the analysis. When the sampling period or periods do not cover the whole period of investigation, an important sampling error is added. The objective of this paper is to describe the errors of observation and to suggest a statistical test that establishes noncompliance with occupational health standards.

DEFINITION OF THRESHOLD LIMIT VALUE
The concentrations of air contaminants undoubtedly vary according to a most complicated pattern. Regulations for limiting concentrations of air contaminants, i.e., occupational health standards, have to cope with this complexity and also with the biological effects of different pollutants, especially in regard to the rapidity with which various substances act. Therefore the concepts of the time-weighted average limit (TWAL) and the ceiling limit have been developed for specific concentrations in the breathing zone of an exposed person.
In Swedish and American practice the TWAL refers to the time-weighted average concentration during an 8-h work shift. Excursions above the TWAL are permitted according to a "rule of thumb." This rule states that excursions are allowed up to a certain multiple of the TWAL. The larger the standard, the lower the multiple, or excursion factor. The excursion factor for standards above 100 (ppm or mg/m³) is 1.25. For standards between 10 and 100 it is 1.5. Standards between 1 and 10 are given the factor 2, and for substances with standards below 1 the factor is 3 (1, 6). The excursion factors are also important when short-term exposure limits are being determined.
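The rule of thumb can be written as a small lookup. The following is only a sketch: the function names are hypothetical, and the treatment of standards lying exactly on a boundary (1, 10 or 100) is an assumption, since the text does not say to which band such values belong.

```python
def excursion_factor(twal: float) -> float:
    """Rule-of-thumb excursion factor for a TWAL given in ppm or mg/m3.

    Assumption: a standard exactly on a band boundary (1, 10, 100)
    is assigned to the upper band, i.e., the smaller factor.
    """
    if twal <= 0:
        raise ValueError("TWAL must be positive")
    if twal < 1:
        return 3.0
    if twal < 10:
        return 2.0
    if twal < 100:
        return 1.5
    return 1.25


def max_excursion(twal: float) -> float:
    """Highest concentration permitted by the rule of thumb."""
    return twal * excursion_factor(twal)


# A standard of 30 ppm falls in the 10-100 band.
print(excursion_factor(30), max_excursion(30))
```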
The ceiling limit is the time-weighted average limit during a much shorter period than 8 h. Generally the ceiling limit is defined for 15 min, but 5 min is also common.
It should be observed that neither the definitions of the standards of the American Conference of Governmental Industrial Hygienists nor the definitions derived from them in Sweden contain any directives regarding the number of samples that should be taken or how errors in the measurements should be treated. Practical advice regarding the interpretation of measurements in the absence of strictly statistical evaluation is given. In the Swedish directives a statistical treatment is recommended as one possibility for considering errors of observation.

REPRESENTATIVITY OF THE SAMPLING
It is obvious that the representativity of the sampling depends on the choice of the employee to be studied and the period of investigation.
The sampling period is defined as the period during which sampling is actually performed, i.e., the period from the time of starting the sampling equipment to the time of switching it off.
The period of investigation is defined as the period under study, during which one or several samples may be taken. It is often the purpose of the investigation to find an average concentration value over the whole period of investigation. The distribution of periods of sampling over the period of investigation gives rise to three main types of samples.
According to the terminology of Leidel and Busch (4), a full-period sample is taken during the whole period of investigation, generally 8 h. When several consecutive samples are taken that cover the whole period of investigation, the term to be used is full-period, consecutive samples. A short-period sample is taken during a small part of the period of investigation (e.g., 5-60 min). A number of short-period samples is distributed over the period of investigation, usually randomly. This type of sample is sometimes referred to as a grab sample.
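As an illustration of how short-period samples might be spread randomly over a shift, the sketch below divides the shift into equal slots and draws some of them without replacement. The slot scheme, the function name and the default values (15-min samples, 480-min shift) are assumptions made for the example; the text does not prescribe any particular randomization scheme.

```python
import random


def random_sampling_starts(n_samples, sample_min=15, shift_min=480, seed=None):
    """Return sorted start times (minutes from shift start) for
    non-overlapping short-period samples.

    Assumption: the shift is cut into consecutive slots of length
    sample_min and slots are drawn at random without replacement.
    """
    n_slots = shift_min // sample_min
    if not 1 <= n_samples <= n_slots:
        raise ValueError("number of samples does not fit into the shift")
    rng = random.Random(seed)
    slots = rng.sample(range(n_slots), n_samples)
    return sorted(slot * sample_min for slot in slots)


# Six 15-min samples within one 8-h shift.
starts = random_sampling_starts(6, seed=1)
print(starts)
```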
According to Coenen (2), the length of the sampling period has little influence on the standard deviation since air contaminant concentrations usually show a strong trend, or autocorrelation. From this point of view it can be assumed that the length of the sampling period in short-period sampling is, within reasonable limits, optional. Other demands on the length of the sampling period can therefore be satisfied. The most important is the need to sample a detectable amount of the air contaminant in question.
Different considerations have to be made when the sampling period covers the whole period of investigation, that is, usually the time for which the standard is defined, and when the period of sampling covers only part of the period of investigation.
In the first case, the sample is of course representative of the period of investigation and the individual employee carrying the sampling equipment. In such a case only analytical errors, which are generally small when compared to sampling error, remain to be considered. The choice of the period of investigation and the employee to be studied is a matter of personal judgement. If enough observations of air contaminant concentrations are accumulated for many employees, shifts, seasons, etc., statistical treatment may be used for the determination of variations and differences. Such a large-scale measuring program is very rare, however, and is beyond the scope of this paper.
The costs of sampling and analyses set a practical limit for the sampling strategies that can be applied. The populations of observations treated in this paper refer to short-period samples taken in the breathing zone of one employee, during one shift only. Samples from the breathing zone of more than one employee or taken during more than one day can be treated as one population of samples only if there is no reason to believe that there are differences between the exposures of the individual employees or between days.

DEFINITIONS OF VARIOUS TYPES OF ERRORS
The spread of the results of repeated measurements is caused by occasional (random) errors. The larger the spread, the lower the precision. Systematic errors may depend upon erroneous calibration of an instrument or, for example, erroneous readings. The larger the systematic errors, the lower the accuracy. Large errors or "gross errors" may be due to accidents of some kind during the sampling and analysis. When exposure measurements are statistically treated, it is generally assumed that systematic errors can be kept small and gross errors very low in frequency in comparison to other errors.
The occasional errors are divided into analytical errors and sampling errors. The analytical ones are due to the handling and analysis of the sample. Important parts of the variations in air contaminant concentrations depend on the production routine and the movements of the air within the premises. A further source of variation in air contaminant concentrations, which is superimposed on the variations mentioned, is the variation of the air movements in the locality due to the season. The employee under study moves within this "landscape of concentration" during the day. Short-period samples taken in his breathing zone at different times reflect all these variations in air contaminant concentrations.
The concentration observed in short-period sampling obviously depends upon when the sample is taken, where it is taken, and the duration of the sampling. The concentration measured during a short period of sampling differs from the mean concentration during a longer period. This difference is one important cause of the total sampling error. It should be observed that this type of sampling error (type 1) is not an error in that which is actually being measured, i.e., in the concentration during the period of sampling at the place of sampling. It is an error compared to the mean concentration during a longer period of time, e.g., the 8-h mean, and to the average in the whole locality.
The total sampling error is also due to variations within the breathing zone (type 2). The breathing zone has a complicated structure with concentration gradients of the substance under study which are continuously changing under the influence of air movements, dilution by expired air, and sometimes also due to the release of the substance from clothes. The air movements are partly caused by the temperature difference between the body of the human being and its environment. Lewis et al. (5) state that air movements caused by temperature differences close to the body of man may be of the order of magnitude of 0.5 m/s.
In spite of the importance of a more exact definition of breathing zone, no further specifications of the place of sampling are generally given. The breathing zone error is partly systematic, and it is recommended that the uncertainty be eliminated by standardization. This type of error is not considered in the following treatment.

STATISTICAL TREATMENT OF OBSERVATIONS FROM SHORT-PERIOD SAMPLING FOR ONE EMPLOYEE AND ONE WORK SHIFT
One of the most important demands for the statistical treatment of observations of workroom concentrations of air contaminants is simplicity and comprehensibility because the calculations often have to be made by personnel without knowledge of statistics and the results are used by laymen. A conventional way of establishing noncompliance with a standard is a test of the hypothesis: "True mean is equal to the standard or less." This hypothesis is rejected at a suitable level of significance, e.g., 95 % one-sided. The practical procedure is to subtract a certain amount from the observed mean before comparison with the standard. This procedure tends to create suspicions of the measurements among the users of the information. Therefore a direct comparison of the observed mean with the standard is to be preferred, corresponding to the testing of the hypothesis on the 50 % level of significance. A large spread of real variations above the standard is to be avoided (cf. aforementioned excursion factors). One method of doing this is to supplement the preceding test with a further rejection rule that guarantees that, if the true mean exceeds a certain value ($\mu$) above the standard, the probability of rejection ($1-\beta$) will be sufficiently high.
When values are chosen for the parameters $\mu$ and $\beta$, the excursion factors should also be considered. One possibility is the comparison of the upper limit of a certain proportion of the population distribution (tolerance limit) with the excursion factors.

TEST PROCEDURE
The combined test procedure may be described as follows. If $\bar{x} > \mathrm{TWAL}$, the hypothesis of compliance is rejected, i.e., noncompliance is established. Unless $\bar{x} \le \mu - t_\beta s/\sqrt{n}$, the hypothesis "True mean $\ge \mu$" is accepted and noncompliance is also established ($\mu$ = a fixed chosen value above the TWAL; $s$ = observed standard deviation; $n$ = number of short-period samples; $t_\beta$ = Student's $t$ for level of significance $\beta$, one-sided).
Noncompliance is established if either $\bar{x} > \mathrm{TWAL}$ or $s > (\mu - \bar{x})\sqrt{n}/t_\beta$. Alternatively the procedure may be described in the following manner: Compliance with the standard is established if $\bar{x} \le \mathrm{TWAL}$, unless the observed standard deviation is too large in comparison to $\mu - \bar{x}$.
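The two rejection rules can be sketched in a few lines. The function name and return format are hypothetical, and the caller is assumed to supply $\mu$ and the one-sided Student's $t$ value for $n-1$ degrees of freedom from a table.

```python
import math


def combined_test(mean, sd, n, twal, mu, t_beta):
    """Combined test for noncompliance with the TWAL.

    mean, sd : observed mean and standard deviation of n short-period samples
    mu       : fixed chosen value above the TWAL
    t_beta   : one-sided Student's t for significance level beta, n - 1 d.f.
    Returns (noncompliant, reason).
    """
    if mu <= twal:
        raise ValueError("mu must lie above the TWAL")
    # Rule 1: direct comparison of the observed mean with the standard.
    if mean > twal:
        return True, "observed mean above TWAL"
    # Rule 2: spread too large in comparison to mu - mean.
    s_max = (mu - mean) * math.sqrt(n) / t_beta
    if sd > s_max:
        return True, "standard deviation too large"
    return False, "compliance"


# Illustrative numbers only: six samples, TWAL 30, mu 37.5, t = 2.015 (5 d.f.).
print(combined_test(mean=20.0, sd=5.0, n=6, twal=30.0, mu=37.5, t_beta=2.015))
```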
From the construction of the combined test, one can establish the following results concerning the behavior of the test.
Case 1. True mean < TWAL. In case 1 the probability of establishing compliance is at least 50 % unless the true mean is close to the TWAL and the true standard deviation $\sigma$ is large.
Case 2. True mean > TWAL but < $\mu$. In case 2 the probability of establishing noncompliance will always be at least 50 %, and it will increase with the value of the true mean.
Case 3. True mean > $\mu$. In case 3 the probability of establishing noncompliance will always be at least ($1-\beta$), and it will increase with the value of the true mean.
(The behavior of a test is conveniently described by the so-called power function, i.e., the probability of rejection as a function of the parameters involved. In the present paper the power function would be the probability of establishing noncompliance as a function of the true mean, the true standard deviation and the number of samples. Unfortunately this power function cannot be computed exactly or even approximately without extensive numerical integrations, and therefore only the qualitative description of its behavior in the three aforementioned cases is presented.)
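Although the paper gives only a qualitative description, the power function can be approximated by simulation. The sketch below is not part of the original method: it substitutes a Monte Carlo estimate for the numerical integration, assumes normally distributed observations, and uses a hypothetical function name. The $t$ value must be supplied for $n-1$ degrees of freedom.

```python
import math
import random
import statistics


def estimate_power(true_mean, true_sd, n, twal, mu, t_beta, trials=5000, seed=0):
    """Monte Carlo estimate of the probability of establishing noncompliance.

    Assumption: observations within a shift are independent and normal.
    """
    rng = random.Random(seed)
    rejected = 0
    for _ in range(trials):
        obs = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.mean(obs)
        sd = statistics.stdev(obs)
        # Combined test: mean above TWAL, or spread too large.
        if mean > twal or sd > (mu - mean) * math.sqrt(n) / t_beta:
            rejected += 1
    return rejected / trials


# One point from each of the three cases (TWAL 30, mu 37.5, n = 5, t = 2.132).
for true_mean in (25.0, 33.0, 40.0):
    print(true_mean, estimate_power(true_mean, 5.0, 5, 30.0, 37.5, 2.132))
```

Repeating the call over a grid of true means and standard deviations would trace out the whole power surface for a given number of samples.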

CHOICE OF VALUES OF TEST PARAMETERS
As was already mentioned, there is reason to consider the excursion factors when the most suitable choice of $\mu$ and $\beta$ is being discussed. The excursion factors vary between 3 and 1.25: 3 for TWAL < 1 and 1.25 for TWAL $\ge$ 100 (ppm or mg/m³). Tolerance limits cover a fixed portion of the population distribution with a specified confidence. There should be some reasonable relation between the upper tolerance limit ($T_u$) and the excursion factor. If $T_u = \bar{x} + Ks$ and $\bar{x} = \mathrm{TWAL}$ and $s = (\mu - \bar{x})\sqrt{n}/t_\beta$, the following relationship can be derived:
$$\frac{T_u}{\mathrm{TWAL}} = 1 + \frac{K\sqrt{n}}{t_\beta}\left(\frac{\mu}{\mathrm{TWAL}} - 1\right)$$
Now $\beta$ is taken to be 0.05, $T_u/\mathrm{TWAL}$ to be 2.0 (as a compromise between different demands on the highest permissible excursions at different TWALs), the portion of the population distribution below the tolerance limit to be 0.90, and the confidence with which this is true to be 0.90. The corresponding value of $\mu/\mathrm{TWAL}$ will then be 1.25, i.e., the value of $\mu$ will be 25 % above the TWAL. The conclusion of this discussion is that, with $\mu = 1.25 \cdot \mathrm{TWAL}$ and $\beta = 0.05$, a reasonable upper tolerance limit of the population distribution is guaranteed (provided the population distribution is normal).
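Substituting $\bar{x} = \mathrm{TWAL}$ and $s = (\mu - \mathrm{TWAL})\sqrt{n}/t_\beta$ into $T_u = \bar{x} + Ks$ gives $T_u/\mathrm{TWAL} = 1 + K\sqrt{n}(\mu/\mathrm{TWAL} - 1)/t_\beta$, which the sketch below evaluates in both directions. The function names are hypothetical. The tolerance factor $K = 3.494$ used in the example is an assumption: it is the value commonly tabulated for a two-sided normal tolerance interval with 90 % coverage, 90 % confidence and five observations, and the paper does not state which table was used.

```python
import math


def tolerance_ratio(k, n, t_beta, mu_ratio):
    """Upper tolerance limit divided by the TWAL on the borderline of
    compliance, with the observed mean equal to the TWAL."""
    return 1.0 + k * math.sqrt(n) * (mu_ratio - 1.0) / t_beta


def mu_ratio_for(target_ratio, k, n, t_beta):
    """Inverse relation: mu/TWAL that yields a given T_u/TWAL."""
    return 1.0 + (target_ratio - 1.0) * t_beta / (k * math.sqrt(n))


# Five observations, beta = 0.05 (t = 2.132 for 4 d.f.), assumed K = 3.494.
print(tolerance_ratio(3.494, 5, 2.132, 1.25))
print(mu_ratio_for(2.0, 3.494, 5, 2.132))
```

With these assumed inputs the ratio comes out near 1.9, and the inverse gives a value for $\mu/\mathrm{TWAL}$ slightly above 1.25, which is in line with the rounded figures quoted in the text.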

PROCEDURE TO ESTABLISH COMPLIANCE
With $\beta = 0.05$ and $\mu = 1.25 \cdot \mathrm{TWAL}$, a maximum permitted value of $s/(\sqrt{n} \cdot \mathrm{TWAL})$ can be calculated for a certain number of observations and the value of the quantity $\bar{x}/\mathrm{TWAL}$. In table 1 such values are given.
Example: Trichloroethylene was sampled and analyzed in the breathing zone of one employee during one shift. Each sampling period lasted 15 min. In one series of observations the following results were obtained in parts per million: 2, 5, 7, 15, 20, 125. The TWAL standard is 30 ppm; $\bar{x}/\mathrm{TWAL} = 0.97$; $s/(\sqrt{n} \cdot \mathrm{TWAL}) = 0.65$. According to criterion 2 noncompliance is established. This result seems reasonable since there is one very high observation in the series. In a second series the following concentrations were observed: 2, 5, 7, 15, 20, 40, 45 ppm, and $\bar{x}/\mathrm{TWAL} = 0.64$, and $s/(\sqrt{n} \cdot \mathrm{TWAL}) = 0.24$. In this situation compliance is established according to both criteria. It is interesting to note that in the last series no observation had an excursion factor above 1.5.
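The example can be recomputed with a short script. This is a sketch under stated assumptions: the permitted value of $s/(\sqrt{n} \cdot \mathrm{TWAL})$ is taken directly from the test inequality as $(1.25 - \bar{x}/\mathrm{TWAL})/t_\beta$ rather than from table 1, the $t$ values are the usual one-sided 5 % points, and the function and variable names are hypothetical.

```python
import math
import statistics

# One-sided 5 % points of Student's t, keyed by degrees of freedom.
T_05 = {3: 2.353, 4: 2.132, 5: 2.015, 6: 1.943, 7: 1.895, 8: 1.860, 9: 1.833}


def evaluate_series(obs, twal, mu_ratio=1.25):
    """Return x/TWAL, s/(sqrt(n) TWAL), its permitted maximum, and the verdict."""
    n = len(obs)
    t_beta = T_05[n - 1]
    x_ratio = statistics.mean(obs) / twal
    s_ratio = statistics.stdev(obs) / (math.sqrt(n) * twal)
    s_ratio_max = (mu_ratio - x_ratio) / t_beta
    noncompliant = x_ratio > 1.0 or s_ratio > s_ratio_max
    return x_ratio, s_ratio, s_ratio_max, noncompliant


series_1 = [2, 5, 7, 15, 20, 125]
series_2 = [2, 5, 7, 15, 20, 40, 45]
for obs in (series_1, series_2):
    x_r, s_r, s_max, bad = evaluate_series(obs, twal=30)
    verdict = "noncompliance" if bad else "compliance"
    print(f"x/TWAL={x_r:.2f}  s/(sqrt(n) TWAL)={s_r:.2f}  permitted={s_max:.2f}  {verdict}")
```

For the first series the script reproduces the quoted 0.97 and 0.65. For the second series the recomputed spread ratio is about 0.22 rather than the quoted 0.24; either value lies below the permitted maximum, so the conclusion of compliance is unchanged.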

DISCUSSION
As has been pointed out by Leidel (3) and Leidel and Busch (4), there is justification for using the log normal distribution to explain the skewness towards high values of concentration when air contaminants are being dealt with. On the other hand it is more practical to use the normal distribution, since the calculations are simpler. Furthermore a disadvantage of the log transformation is that observations below the detection limit cannot be handled properly. They cannot be set equal to zero.
Earlier I suggested (7) using the normal distribution under the assumption that the null hypothesis is tested one-sidedly on the 95 % level of confidence and the power of this test is specified. The spread of observations permitted will therefore be limited. The error made when the normal distribution is used instead of a more adequate model of distribution is thereby limited as well.
The combined test procedure suggested will have the same effect. When the mean is directly compared with the TWAL, the use of the normal distribution is actually more conservative than the use of the log normal distribution since the geometric mean of skewed populations of air contaminants is lower than the arithmetic mean.
A consequence of the application of the combined test procedure when noncompliance with a standard is being judged is that the number of required samples is directly related to the spread of the observations. If the standard deviation is too large, more samples may contribute to a smaller standard deviation in comparison to $\mu - \bar{x}$ since the influence of the analytical error is somewhat decreased. The real variation in concentrations of course prevails.
As already pointed out, the choice of $\mu$ and $\beta$ should be governed to some extent by the correspondence between the permitted excursion factors above the TWAL and the tolerance limit of the population distribution implied in this choice. The upper tolerance limit, $T_u/\mathrm{TWAL} = (\bar{x}/\mathrm{TWAL}) + K s/\mathrm{TWAL}$, for this discussion was arbitrarily chosen so that 90 % of the population of observations would be below the tolerance limit with 90 % confidence. On the assumption of a normal distribution, $T_u/\mathrm{TWAL} = 2.0$ when compliance is just on the borderline of being established with five observations, $\mu = 1.25 \cdot \mathrm{TWAL}$, $\beta = 0.05$ and the observed mean = TWAL. When the observed mean is close to the TWAL, the upper tolerance limit of the population distribution is found to be in reasonable accord with the excursion factors for the majority of substances in the list of standards (1, 6). When the observed mean is lower, a population with the "permitted" standard deviation according to table 1 will show an increasing skewness. This result is obvious if a symmetrical range (difference between highest and lowest observation) is considered. If the lowest observation, and only that one, is below the detection limit, then the range $W/\mathrm{TWAL}$ will be close to $2\bar{x}/\mathrm{TWAL}$ provided the observations are symmetrically distributed on both sides of the mean. Estimates of the standard deviation based on the range confirm that skewness becomes obvious when $\bar{x}/\mathrm{TWAL}$ decreases below 0.75 when the number of observations is 5-10. When the distribution is skewed, the prediction of tolerance limits is, in any case, uncertain.
As already pointed out, the rules for determining compliance with the standard according to the Swedish National Board of Occupational Safety and Health (6) and the American Conference of Governmental Industrial Hygienists (1) have not been worked out with a statistical concept or with any model of distribution of observations in mind. It is therefore not possible to make unambiguous comparisons between the consequences of a statistical approach and the consequences of applying the rules and directions in question.
A correspondence with the relation between the excursion factors and the size of the standard is not possible to arrange.
The use of the suggested combined test procedure regarding noncompliance with the TWAL may to some extent replace the use of excursion factors. Such factors may still be important in the determination of short-term exposure limits, however.

SUMMARY OF THE PROCEDURE
1. Take at least 4, preferably between 5 and 10, short-period samples (5-60 min depending on the detection limit of the substance, the air flow through the sampling equipment and the need of checking ceiling limits or excursions above the TWAL). The sampling periods should be distributed either randomly during the period of investigation (one shift) or regularly if the period between sampling does not coincide with regular production cycles.
2. Calculate the mean of the observations ($\bar{x}$) and compare it directly with the TWAL. If the mean is above the standard, noncompliance is established.
3. If the mean is below the standard, calculate the standard deviation. If the quotient $s/(\sqrt{n} \cdot \mathrm{TWAL})$ is higher than the value permitted according to table 1, noncompliance is established.