Endotoxin exposure in asthmatic children and matched healthy controls


Recall that the goal of the course is to not only improve critical appraisal skills but also to think about research questions, designs, and the necessary compromises that are often required in research. Hopefully this adds another dimension to typical journal clubs where either an article in uncritically endorsed enthusiastically or trashed unmercifully, although this is sometimes quite merited

For reasons I am somewhat unclear about, this week’s selected article was the 2005 publication Endotoxin exposure in asthmatic children and matched healthy controls: results of IPEADAM study. Although not described as such, this was apparently a nested case-control study from a 1999 crosssectional study by the same authors, of over 1500 completed questionnaires. Cases were children between the ages of 4 and 17 who had a study diagnosis of probable asthma. These asthmatic children were matched for sex, age and sib-ship size with children living in asthma free households and the following main conclusions reported.

A critical analysis should consider bias and statistical issues


It is helpful to consider biases in observational studies according to the following grouping
- selection
- misinformation
- confounding

Selection bias

Selection bias is a distortion in the measure of association due to a sample selection such the measured association is not representative of what exists in the target population.
If the selection of cases and controls is not done independently of the exposure, then when you compare exposure distributions any observed differences are a combination of a true effect and an artifact of the way the data was collected.
This figure may help explain. Where would you put the reinforcements?

After identification of the research question, the next step is to distinguish among the various populations, including the
- target population (all children in the UK)
- study population (all those with a chance of being in the study, ie the 1500 children who completed the original questionnaire)
- sample population (the 200 children in Figure 1)
- analyzed sample population (the 90 matched pairs of children)

In their previous work, the authors report an asthma prevalence of approximately 20%, or about 300 children based on their study population of 1500. The following questions need answering.
- Does the study population (?1500) reflect the target population?
- Do the 200 sample children reflect the total study population ?
- Do the 90 matched pairs, the analytical sample population reflect the overall sample population? What about the missing cases?

Many reasons to be concerned about possible selection bias in this study.
Let’s begin by examining their Figure 1 which is reproduced below.

How many asthma cases (and controls) did they identify in this Figure?

The figure shows 105 cases and 93 controls, a total of 198 children. Yet the authors claim this figure shows the age distribution of 200 children. While the difference is small, it does raise further concerns about the reliability of their data analysis when there is a discordance in a simple counting of individuals.

Why do they only use 90 cases? Presumably to meet their matching criteria, age (+/- 1 year). Below is the best matching I could create from their data.

The question then becomes why match?
Most people think matching is to control for confounding but this is only partly true (and indeed may be more than offset by potential selection bias) and the best reason is to improve precision in situations of sparse data.
Moreover, in this case a better alternative would have been not to match but to control for age in regression analysis as this would have allowed an estimation of the effect of age on the detection of asthma, while using the whole study sample of 198 children, including all 105 cases.

Performing logistic regression glm(stat~ages, data = data_long_uncount, family = "binomial") shows that each year of additional age is associated with a odds ration decrease in asthma diagnosis = OR = 0.92, 95%CI 0.89 - 0.96


The authors don’t report the raw data for the endotoxin exposure but assuming the following distribution among cases and controls gives an OR that approximates their results

## --Observed data-- 
##          Outcome: Exposed+ 
##        Comparing: Cases vs. Controls 
##          Cases Controls
## Exposed+    40       50
## Exposed-    27       63
##                             2.5% 97.5%
## Observed Relative Risk: 1.3  1.0   1.8
##    Observed Odds Ratio: 1.9  1.0   3.4
## ---
## Misclassification Bias Corrected Relative Risk: 1.3
##    Misclassification Bias Corrected Odds Ratio: 1.9

As the authors report from their previous work that their asthma detection questionnaire has only a 70% sensitivity and 91% specificity misclassification is present.
The effect of misclassification can be quantified using quantitative bias analysis (QBA) via the episensr package and misclassification function. This seems an improvement over purely qualitative heuristics, such as “non-differential misclassification biases toward the null”.

misclassification(matrix(c(40, 50, 27, 63),
                          dimnames = list(c("Exposed+", "Exposed-"), c("Cases", "Controls")),
                         nrow = 2, byrow = TRUE),
                  type = "outcome",
                  bias_parms = c(.70,.70,.91,.91))
## --Observed data-- 
##          Outcome: Exposed+ 
##        Comparing: Cases vs. Controls 
##          Cases Controls
## Exposed+    40       50
## Exposed-    27       63
##                             2.5% 97.5%
## Observed Relative Risk: 1.3  1.0   1.8
##    Observed Odds Ratio: 1.9  1.0   3.4
## ---
## Misclassification Bias Corrected Relative Risk: 1.4
##    Misclassification Bias Corrected Odds Ratio: 3.6

Conceptually, this misclassification of cases and controls suggest any association between indoor air pollutants and asthma will likely be underestimated and this analysis permits an estimate of the order of magnitude.


How much residual or unmeasured confounding would be required to wipe out the observed effect? This can be determined using the EValue package as described in this paper.

##          point lower upper
## RR         1.4   1.1   1.8
## E-values   2.1   1.3    NA

This suggests that a moderate confounder with a risk of 2.1 fold to exposure and a 2.1 fold increase in the outcome would be required to eliminate the observed risk. Of course if the true OR is greater due to misclassification bias the size of unmeasured confounder needed to eradicate the observed association would be even larger.
In conclusion, there are multiple potential biases (selection, misclassification and confounding) associated with this study. Unmeasured confounding is unlikely to explain the observed association and misclassification suggest the association may be underestimated, but the inability to evaluate the magnitude and direction of the selection bias limits definitive determination of the overall direction of the bias assessment.


The authors they performed a logistic regression where presumably probable asthma is the dependent variable. In this case understanding their table 2 is difficult.

From Table 2 the authors make causal statements not only about endotoxins and asthma but also about other variables, including for example, dampness and single parent home.
Making assertions based on regression coefficients from a multivariable analysis is subject to Table 2 Fallacy. These covariate effect estimates may also be confounded even though the effect estimate for the main exposure is not confounded and their proper interpretation far from obvious.

The authors state “There was no difference in the levels of exposure to house dust major allergen Der p 1, expressed in quartiles of exposure between asthmatic and non-asthmatic matched controls (Table 3)”

Looking at the confidence intervals we see they are very wide and, for example, don’t exclude that for quartile 2 there is a 180% increase or 64% decrease in asthma prevalence compared to quartile 1. The authors have conflated an absence of evidence with evidence of absence. Aslo see this previous post.
The authors make the same error in Table 4

when they conclude there are no interactions. The more reasonable conclusion being even if one ignores the potential biases, they lack sufficient power to draw conclusions about interactions.


A critical appraisal of this publication has shown several interesting avenues for discussion, perhaps explaining why it was suggested for review. Based on the critical appraisal it would seem there is little to support their original conclusions.

Professor of Medicine & Epidemiology

I am a tenured (full) professor with a joint appointment in the Departments of Medicine and Epidemiology and Biostatistics where I work as a clinical cardiologist and do research in cardiovascular epidemiology.