Who needs selection bias?

Low participation rates in parts of present-day research are concerning. It is not unusual for half (or even more) of those invited to participate in a research project to decline the invitation. Many think this will always lead to selection bias regardless of the type of study in question. Fortunately, that is not the case. Selection bias is a reason for concern in studies that aim to obtain a representative sample with respect to risk factors or outcomes. Those who decline participation will often have a higher risk profile (lower social status, smoking, etc) and more chronic disorders. The most serious type of selection bias arises when participation is related to the specific hypothesis and the hypothesis is known to participants. Being asked to participate in a study on oral contraceptives (OC) and breast cancer, for example, may be more appealing to women with breast cancer who have used OC. Selection is therefore often of concern in case–control studies. Non-response and selection bias are of less concern in follow-up studies where the outcome is not known at the time of recruitment.
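
The OC and breast cancer example can be made concrete with a small calculation. The sketch below is illustrative only: the counts and participation probabilities are hypothetical and not taken from any study. It assumes a source population with no association (odds ratio of 1) and lets exposed cases participate more readily than everyone else.

```python
def odds_ratio(table):
    """Odds ratio from a 2x2 table given as a dict of cell counts."""
    return (table["exposed_cases"] * table["unexposed_controls"]) / (
        table["unexposed_cases"] * table["exposed_controls"]
    )

# Hypothetical source population with no OC-breast cancer association.
source = {
    "exposed_cases": 200,
    "unexposed_cases": 200,
    "exposed_controls": 2000,
    "unexposed_controls": 2000,
}

# Hypothetical participation probabilities: cases who have used OC
# find the study more appealing than the other three groups do.
participation = {
    "exposed_cases": 0.8,
    "unexposed_cases": 0.5,
    "exposed_controls": 0.5,
    "unexposed_controls": 0.5,
}

# Expected counts among those who agree to participate.
observed = {cell: source[cell] * participation[cell] for cell in source}

true_or = odds_ratio(source)
observed_or = odds_ratio(observed)

# The distortion equals the cross-product ratio of the participation
# probabilities, so it vanishes when participation does not depend
# jointly on exposure and outcome.
bias_factor = odds_ratio(participation)

print(f"source OR = {true_or:.2f}, observed OR = {observed_or:.2f}, "
      f"bias factor = {bias_factor:.2f}")
```

Under these assumed numbers the observed odds ratio is about 1.6 although the source odds ratio is 1. If participation depended on exposure alone or on disease alone, the bias factor would be 1 and the odds ratio would be unaffected.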

The reason for limited interest in “representativeness” is that scientific inference addresses a potential cause–effect relation in general, not in a specific population. Whether an established cause–effect relation is present in a specific population with a given effect estimate is a matter of effect measure modification and the distribution of component causes in that population at that given time.

Non-response in a follow-up study is expected to affect the structure of the population and influence the confounder distribution, often by producing less confounding because those with health problems and extreme lifestyle factors are more likely to decline participation. Since these factors may act as confounders, selection may change effect sizes without introducing bias, simply because effect sizes are population-specific. One should, however, be aware of a selection bias that would not be present with complete case ascertainment, as illustrated in figure 1.

Without selection (S), C3 would not cause bias, but S (conditioned upon) will link C3 to E, and establish the backdoor E-C3-D that can be closed by adjusting for C3. In a like manner, C1 and C2 have confounding potential because conditioning on S will produce the backdoor paths C1-C3-D and C1-C2-D. Adjustments for C2 and C3 will be needed. This will happen even in situations where S is not directly linked to D (because D has not occurred at the time of recruitment). Direct selection bias relating S to both E and D is only expected if D can be predicted by study participants (eg, by using the family history of disease occurrence).
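
A small simulation can illustrate the first of these paths. The sketch below does not reproduce figure 1. It assumes only a fragment of such a diagram in which E and C3 both influence participation S, C3 influences D, and E has no effect on D at all. All probabilities, the sample size and the function names are hypothetical choices made for illustration.

```python
import random

def risk_difference(rows):
    """Risk of D among the exposed minus risk of D among the unexposed."""
    exposed = [d for e, c3, d in rows if e == 1]
    unexposed = [d for e, c3, d in rows if e == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)

def simulate(n=200_000, seed=1):
    """Simulate an invited population and the subset that participates."""
    rng = random.Random(seed)
    # Assumed participation probabilities, indexed by (E, C3):
    # those with both E and C3 are the least likely to take part.
    p_select = {(0, 0): 0.9, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.1}
    population, participants = [], []
    for _ in range(n):
        e = int(rng.random() < 0.5)             # exposure
        c3 = int(rng.random() < 0.5)            # independent of E before selection
        d = int(rng.random() < 0.1 + 0.3 * c3)  # D depends on C3 only, not on E
        row = (e, c3, d)
        population.append(row)
        if rng.random() < p_select[(e, c3)]:    # S depends on E and C3, not on D
            participants.append(row)
    return population, participants

population, participants = simulate()

rd_population = risk_difference(population)      # everyone invited
rd_participants = risk_difference(participants)  # conditioned on S, unadjusted
rd_by_c3 = {                                     # conditioned on S, within levels of C3
    level: risk_difference([row for row in participants if row[1] == level])
    for level in (0, 1)
}

print(f"all invited:       {rd_population:+.3f}")
print(f"participants only: {rd_participants:+.3f}")
for level, rd in rd_by_c3.items():
    print(f"participants, C3={level}: {rd:+.3f}")
```

Under these assumptions the risk difference is close to zero among all invited, clearly negative among participants, and close to zero again within each level of C3. Selection here never depends on D, which matches the point that the path opens even when D has not occurred at recruitment.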

Selection in a cohort will not in itself lead to selection bias but will often produce a different confounder structure and therefore call for a different strategy when deciding on which confounders to include and how to treat them in the analysis.
