1  Brief introduction

1.1 Comparison within cluster

The data come from a small cross-over trial of asthma medications in children aged 7-14 years. The outcome is peak expiratory flow.

Research question:

  • Is there evidence that mean peak flow is different between treatment and control?

1.1.1 Data

Variable name   Description
sub             Subject identifier
pef0            Peak flow measurement under control
pef1            Peak flow measurement under treatment
seq             Order in which treatments were received (1 = treatment followed by control; 2 = control followed by treatment)
library(haven)
pef <- read_dta("./data/pef.dta")
pef
# A tibble: 10 × 4
     sub  pef0  pef1 seq                
   <dbl> <dbl> <dbl> <dbl+lbl>          
 1     1   330   335 1 [Treatment first]
 2     2   270   310 2 [Control first]  
 3     3   350   360 2 [Control first]  
 4     4   240   290 1 [Treatment first]
 5     5   320   330 1 [Treatment first]
 6     6   360   350 2 [Control first]  
 7     7   340   380 1 [Treatment first]
 8     8   280   335 2 [Control first]  
 9     9   365   370 1 [Treatment first]
10    10   320   340 2 [Control first]  

1.1.2 Ignoring the paired structure

If we ignore the paired structure and use an independent two-sample t-test, we obtain:

library(tidyr); library(dplyr)
library(ggplot2)
pef.long <- pivot_longer(pef, 
                         cols = c("pef0", "pef1"), 
                         names_to = "trt", 
                         values_to = "pef") 

pef.long$trt <- ifelse(pef.long$trt=="pef0","control","treatment")
# two ways to implement two samples independent t-test in R
## t-test function
t.test(pef ~ trt, data = pef.long, var.equal = TRUE)

    Two Sample t-test

data:  pef by trt
t = -1.4387, df = 18, p-value = 0.1674
alternative hypothesis: true difference in means between group control and group treatment is not equal to 0
95 percent confidence interval:
 -55.35666  10.35666
sample estimates:
  mean in group control mean in group treatment 
                  317.5                   340.0 
## using linear regression
summary(lm(pef~trt,data=pef.long))

Call:
lm(formula = pef ~ trt, data = pef.long)

Residuals:
   Min     1Q Median     3Q    Max 
-77.50 -15.00   2.50  24.38  47.50 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)    317.50      11.06  28.711   <2e-16 ***
trttreatment    22.50      15.64   1.439    0.167    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 34.97 on 18 degrees of freedom
Multiple R-squared:  0.1031,    Adjusted R-squared:  0.05331 
F-statistic:  2.07 on 1 and 18 DF,  p-value: 0.1674
confint(lm(pef~trt,data=pef.long), level=0.95)
                 2.5 %    97.5 %
(Intercept)  294.26684 340.73316
trttreatment -10.35666  55.35666
  • The estimated treatment effect is 340-317.5 = 22.5 (95% CI = [-10.36, 55.36])
  • Standard error is 15.64

1.1.3 Consider the paired structure

t.test(pef$pef1, pef$pef0, paired = TRUE)

    Paired t-test

data:  pef$pef1 and pef$pef0
t = 3.2134, df = 9, p-value = 0.0106
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
  6.660412 38.339588
sample estimates:
mean difference 
           22.5 
sd(pef$pef1-pef$pef0)/sqrt(10) # standard error of estimate 
[1] 7.001984

The estimated treatment effect is the same (22.5), but it is more precise, with a lower standard error of 7.002.

Note: Conclusion

The independent two-sample t-test treats observations from the same subject as independent. This is not valid here, because observations from the same subject are correlated (paired data).

1.1.4 When some pairs have missing data?

Suppose some pairs have one ‘half’ of the data missing: are these cases still informative?

In this section we only introduce the linear mixed effects model; for paired data with no missing values, fitting this model is equivalent to the paired t-test.

\[ Y_{ij} = \beta_0 + \beta_1 X_{ij} + S_i + e_{ij} \]

Where:

  • \(\beta_0\): the expected response in the control group
  • \(\beta_1\): the treatment difference
  • \(S_i\): systematic differences between individuals (assumed \(\sim N(0,\sigma_s^2)\))
  • \(e_{ij}\): within-individual error (assumed \(\sim N(0,\sigma_e^2)\))

The output of the model includes the fixed effects (\(\beta_0, \beta_1\)) and the random-effect variances (\(\sigma_s^2, \sigma_e^2\)).

library(lme4)                  # mixed effects models

## fit lme model for peak flow data

fit.lmer1 <- lmer(pef ~ trt + (1|sub), data = pef.long)
summary(fit.lmer1)
Linear mixed model fit by REML ['lmerMod']
Formula: pef ~ trt + (1 | sub)
   Data: pef.long

REML criterion at convergence: 174.5

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.33176 -0.45244  0.00889  0.58564  1.22464 

Random effects:
 Groups   Name        Variance Std.Dev.
 sub      (Intercept) 977.8    31.27   
 Residual             245.1    15.66   
Number of obs: 20, groups:  sub, 10

Fixed effects:
             Estimate Std. Error t value
(Intercept)   317.500     11.059  28.711
trttreatment   22.500      7.002   3.213

Correlation of Fixed Effects:
            (Intr)
trttreatmnt -0.317
confint(fit.lmer1, level=0.95)
                  2.5 %    97.5 %
.sig01        18.086710  51.50392
.sigma        10.131426  24.80399
(Intercept)  295.142588 339.85741
trttreatment   8.124216  36.87578
Note: Conclusion

The estimated treatment effect (mean difference, treatment vs control) and its standard error are:

  • Estimated treatment effect = 22.5

  • SE = 7.002

These are exactly the same numbers as those obtained for the paired t-test. (N.B. You may see a difference of sign, simply because the t-test and mixed model estimation treat the two groups in the opposite order when defining the comparison.) So in the case of fully observed paired data, the mixed model (estimated by the “REML” method) is equivalent to the paired t-test.

Using the gtsummary and broom.mixed packages to produce a summary table of the model output:

library(gtsummary)
model <- lmer(pef ~ trt + (1|sub), data = pef.long)  

model |>
  tbl_regression(
    intercept = TRUE,
    tidy_fun = broom.mixed::tidy,
    estimate_fun = purrr::partial(style_ratio, digits = 3)
  ) |>
  as_gt() |>
  gt::summary_rows(
    groups = NULL,
    columns = everything(),
    fns = list(
      "Variance (subject)"  = ~ sprintf("%.3f", as.data.frame(VarCorr(model))$vcov[1]),
      "Variance (residual)" = ~ sprintf("%.3f", as.data.frame(VarCorr(model))$vcov[2])
    )
  )
Characteristic              Beta     95% CI
(Intercept)                 317.5    295.8, 339.2
trt
    control
    treatment               22.50    8.776, 36.22
sub.sd__(Intercept)         31.27
Residual.sd__Observation    15.66
Variance (subject)          977.778
Variance (residual)         245.139
Abbreviation: CI = Confidence Interval

The output of this model:

  • \(\beta_0\): (Intercept)
  • \(\beta_1\): the treatment row of trt
  • \(\sigma_s\): sub.sd__(Intercept)
  • \(\sigma_e\): Residual.sd__Observation
  • \(\sigma_s^2\): Variance (subject)
  • \(\sigma_e^2\): Variance (residual)

Estimating the ICC (intraclass correlation)

  • \(\sigma_s^2\) = 977.78: represents the between-subject variation, i.e. the variability in “underlying” peak flow between subjects.

  • \(\sigma_e^2\) = 245.14: represents the within-subject variation, i.e. the variability among peak flow measurements on the same person (under the same treatment)

The (estimated) intraclass correlation coefficient is:

\[ICC = \frac{977.78}{977.78+245.14} = 0.7995\]
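As a quick arithmetic check of this ICC (shown in Python rather than R, purely to verify the numbers taken from the lmer output above):

```python
# Variance components taken from the REML fit above
var_subject = 977.778   # between-subject variance (sigma_s^2)
var_residual = 245.139  # within-subject variance (sigma_e^2)

# ICC = proportion of the total variance that is between subjects
icc = var_subject / (var_subject + var_residual)
print(round(icc, 4))  # 0.7995
```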

Note: Explanation of why accounting for correlation in comparisons within a cluster increases the precision of estimation

Linear mixed model and paired t-test

\[ Y_{ij} = \beta_0 + \beta_1 X_{ij} + S_i + e_{ij} \]

For an individual:

\[ \begin{align} Y_{i1} - Y_{i0} &= (\beta_0 + \beta_1 X_{i1} + S_i + e_{i1}) - (\beta_0 + \beta_1 X_{i0} + S_i + e_{i0}) \\ &= \beta_0 + \beta_1 X_{i1} + S_i + e_{i1} - \beta_0 - \beta_1 X_{i0} - S_i - e_{i0} \\ &= \beta_1 + e_{i1} - e_{i0} \end{align} \]

Averaged over all individuals:

  • \(\widehat{\beta} = \frac{\sum_{i = 1}^{N} (Y_{i1} - Y_{i0})}{N} = \overline{Y_1} - \overline{Y_0}\)

  • \(Var(\widehat{\beta}) = \frac{2\sigma_e^2}{N}\)
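The variance expression follows because the subject effect \(S_i\) cancels in the within-pair difference, leaving only the two independent error terms:

\[ Var(Y_{i1} - Y_{i0}) = Var(e_{i1}) + Var(e_{i0}) = 2\sigma_e^2 \quad\Rightarrow\quad Var(\widehat{\beta}) = \frac{2\sigma_e^2}{N} \]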

Two-sample t-test

  • \(\widehat{\beta} = \frac{\sum\nolimits_{i = 1}^{N} (Y_{i1} - Y_{i0})}{N} = \overline{Y_1} - \overline{Y_0}\)

  • \(Var(\widehat{\beta}) = Var(\overline{Y_1}) + Var(\overline{Y_0})\)

In each group, observations vary about the treatment-group mean with total variance \(\sigma_s^2 + \sigma_e^2\), so \(Var(\overline{Y_1}) = \frac{\sigma_s^2 + \sigma_e^2}{N}\).

So \(Var(\widehat{\beta}) = 2 \times \frac{(\sigma_s^2 + \sigma_e^2)}{N}\)

=> The variance is higher and the precision of estimation is lower than in the paired analysis
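Plugging the REML variance estimates from the mixed-model output (\(\sigma_s^2 = 977.78\), \(\sigma_e^2 = 245.14\), \(N = 10\) pairs) into these two formulas reproduces the standard errors seen earlier; a quick numerical check, in Python for brevity:

```python
import math

var_s, var_e, n = 977.778, 245.139, 10  # REML estimates and number of pairs

# Paired analysis: subject effects cancel, so only error variance remains
se_paired = math.sqrt(2 * var_e / n)

# Unpaired analysis: between-subject variance inflates each group's variance
se_unpaired = math.sqrt(2 * (var_s + var_e) / n)

print(round(se_paired, 3), round(se_unpaired, 2))  # 7.002 15.64
```

These match the standard errors reported by the paired t-test (7.002) and the two-sample t-test (15.64).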

1.2 Comparison between clusters

Cluster-randomised trials: all members of each cluster receive the same treatment

  • Anti-malarial trial (Lec 1): Cluster = village, treatment given to all individuals in a village

  • Comparison of treatment involves between cluster comparisons

  • Analysis estimates difference between (population) mean of treatment and control groups

Marginal models

Consider only the treatment group in a cluster RCT.

Here \(i\) indexes clusters, and \(Y_{ij}\) is the \(j^{th}\) outcome measurement in cluster \(i\).

\[ \begin{align} E[Y_{ij}] = \mu \quad\quad Var(Y_{ij}) = \sigma_T^2 \quad\quad Corr(Y_{ij},Y_{ik}) = \rho \end{align} \]

This is called a marginal model or population-average model: it specifies only the mean, variance, and correlation structure.
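For a cluster of size \(n\), this specification corresponds to an exchangeable (compound-symmetric) covariance matrix: \(\sigma_T^2\) on the diagonal and \(\rho\sigma_T^2\) everywhere off the diagonal. A small illustrative sketch (in Python; the function name is ours, not part of any package):

```python
def exchangeable_cov(n, sigma2, rho):
    """n x n covariance matrix: variance sigma2 on the diagonal,
    covariance rho * sigma2 for every pair of observations in the cluster."""
    return [[sigma2 if i == j else rho * sigma2 for j in range(n)]
            for i in range(n)]

V = exchangeable_cov(4, 2.0, 0.3)
print(V[0][0])  # diagonal entry: sigma2 = 2.0
print(V[0][1])  # off-diagonal entry: rho * sigma2 = 0.6
```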

Variance of cluster mean for a cluster of size \(n_i\):

\(Var(\overline{Y_i}) = \frac{\sigma^2_T (1+(n_i - 1)\rho)}{n_i}\)

Each cluster mean estimates \(\mu\)

\[ \begin{align} \widehat{\mu} = \frac{\sum\nolimits_{i = 1}^{N} w_i \overline{Y_i}}{\sum\nolimits_{i = 1}^{N} w_i} \quad where \ w_i = \frac{n_i}{1+(n_i-1)\rho} \end{align} \]

The generalized estimating equations (GEE) can be written as

\[ \sum\nolimits_{i = 1}^{N} w_i(\overline{Y_i} - \widehat{\mu}) = 0 \]
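The weight formula shows how much information a cluster contributes. A quick numerical illustration (in Python; the helper name is ours): with \(\rho = 0\) every member counts fully (\(w_i = n_i\)), while as \(\rho \to 1\) the whole cluster counts as a single observation (\(w_i \to 1\)).

```python
def cluster_weight(n_i, rho):
    """GEE weight for a cluster of size n_i under exchangeable correlation rho."""
    return n_i / (1 + (n_i - 1) * rho)

# Independent observations: each cluster member counts fully
print(cluster_weight(5, 0.0))              # 5.0
# Perfect within-cluster correlation: the cluster counts as one observation
print(cluster_weight(5, 1.0))              # 1.0
# Intermediate correlation
print(round(cluster_weight(5, 0.3), 3))    # 2.273
```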

Note: Conclusion

  • GEE estimates the population mean of each treatment arm; the model takes the correlation structure into account but treats it as a nuisance

  • Variance of sample mean INCREASES with (positively) correlated observations (LESS precise), see \(Var(\overline{Y_i}) = \frac{\sigma^2_T (1+(n_i - 1)\rho)}{n_i}\), \(\rho = Corr(Y_{ij},Y_{ik})\)

  • Can be used to estimate population average odds ratios, rate ratios, etc.

1.2.1 Example data

The Melbourne Water Quality Study was undertaken to assess whether filtering the household water supply for viruses, bacteria and protozoa was associated with a reduced burden of gastroenteritis. The study was a cluster-randomised double-blind clinical trial comparing real versus sham water filters. The primary outcome was gastroenteritis during a 68-week follow-up period.

We will use a subset of the data consisting of 2437 individuals in 533 families. Because we have so far only explored methods for continuous data, we will examine a secondary outcome measure – total water consumption.

water <- read_dta("./data/water.dta")

water$logwater <- log(water$watertot)

water |> 
  mutate(filter = factor(filter,labels = c("Control","Filter"))) |> 
  filter(house <= 20) |> 
  ggplot(aes(x = as.factor(house), y = logwater))+
  geom_point() +
  facet_wrap(~filter)+
  labs(x = "House", y  = "Log of total water consumption")+
  theme_bw()

The plot shows the log of total water consumption for members of the first 20 houses in the dataset, separately for the treatment (Filter) and control groups; whole houses, not individuals, were randomised.

Use a GEE approach to estimate the effect of the filter on water consumption

Note that the corstr argument will be explained in more detail in Section 2.4

library(geepack)

fit.geeglm2 <- geeglm(logwater ~ filter,
                      data = water, 
                      id = house, 
                      corstr = "exchangeable")
summary(fit.geeglm2)

Call:
geeglm(formula = logwater ~ filter, data = water, id = house, 
    corstr = "exchangeable")

 Coefficients:
            Estimate Std.err     Wald Pr(>|W|)    
(Intercept)  0.99718 0.02821 1249.198   <2e-16 ***
filter       0.06083 0.03781    2.588    0.108    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation structure = exchangeable 
Estimated Scale Parameters:

            Estimate Std.err
(Intercept)   0.3878 0.01943
  Link = identity 

Estimated Correlation Parameters:
      Estimate Std.err
alpha   0.3236 0.02639
Number of clusters:   533  Maximum cluster size: 9 
confint.default(fit.geeglm2)
               2.5 % 97.5 %
(Intercept)  0.94189 1.0525
filter      -0.01328 0.1349

The output of the model (on the logscale of water consumption):

  • The estimated treatment effect 0.061 (-0.013,0.135)

  • The standard error 0.038

Because the model is fitted on the log scale, we exponentiate the estimates; the gtsummary package provides the summary table:

library(gtsummary)

fit.geeglm2 |>
  tbl_regression(
    intercept = TRUE,
    exponentiate = TRUE,
    tidy_fun = broom.mixed::tidy,
    estimate_fun = purrr::partial(style_ratio, digits = 2)
  )
Characteristic                      exp(Beta)   95% CI       p-value
(Intercept)                         2.71        2.56, 2.86   <0.001
Randomised group (filter/control)   1.06        0.99, 1.14   0.11
Abbreviation: CI = Confidence Interval

=> Mean water consumption in the real-filter group is estimated to be 6% greater than in the sham-filter group (95% CI: from a 1% decrease to a 14% increase), with p = 0.11. So there is little evidence that the real filter changed mean water consumption.
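The exponentiated table can be verified directly from the geeglm coefficients above; a quick check in Python:

```python
import math

# Filter coefficient and its 95% CI from the geeglm output above (log scale)
beta_filter = 0.06083
ci_lower, ci_upper = -0.01328, 0.1349

# Back-transform from the log scale to ratios of (geometric) mean consumption
print(round(math.exp(beta_filter), 2))                          # 1.06
print(round(math.exp(ci_lower), 2), round(math.exp(ci_upper), 2))  # 0.99 1.14
```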

Summary

The correlation structure of your data affects your standard error estimates:

  • When the comparison is within a cluster/individual, as in a cross-over trial, accounting for clustering improves precision (smaller standard errors)

  • When the comparison is between clusters, as in a cluster RCT, modelling the clustering reduces precision

Linear mixed models (conditional models) produce estimates that take individual variation into account

GEE models (marginal models) produce population-average estimates, but need many clusters