In this post I am going to describe how to compare EC50s from two or more GLMM curves. To begin, run the code in my recent post

#### Automated three factor dose-response with binomial GLMMs and ECx’s“

In the post above we calculated the 95% CI around the ECx using an approximation by multiplying the standard error by 1.96 and adding or subtracting it from the mean ECx. But this makes it difficult to compare the ECx’s except to say that the 95% CI do or not overlap. In this example, we bootstrap the data which creates a distributions around the ECx. From there we can visually present the distributions or compare the distributions statistically.

Let’s first check out our old EC50 of ‘factor a’. The EC50 = 46.12 (95% CI 36.85 – 55.39)

> ec.df1

ecx1 lower1 upper1

(Intercept) 46.11674 36.84638 55.3871

Now let’s bootstrap the EC50. Remember in the previous code I had “ecx1 <- (eta1 – betas1[1])/betas1[2] “, which interpolates the 50% response onto the curve? I will run this 200 times, but resampling the data each time. This will give 200 EC50 values that we can make a distribution out of. I will use ‘lme4’ again, ignoring the initial convergence issue. The 200 EC50s will be stored in ‘bb1$t’. We can also order then and get the bottom 2.5% and 97.5% i.e the 95% CI

#Bootstrapped confidence intervals on ECx values

library(lme4)

md1 =glmer(cbind(suc,(tot-suc))~raw.x*factor+(1|obs) ,data1,family=’binomial’ )

sims = 200 #number of simulations.

lowerCI = 0.025*sims

median.CI = 0.5*sims

upperCI = 0.975*sims

# ecx <- (logit(0.5) – coef(md1)[1])/coef(md1)[2]

predFun2<-function(.)((eta1 – fixef(.)[1])/fixef(.)[2]) #define the function

bb1<-bootMer(md1,FUN=predFun2,nsim=200, parallel = “multicore”, .progress = ‘txt’ ) #get bootstrapping, 200 sims

bb_se1<-apply(bb1$t,2,function(X) X[order(X)]) #apply function

median<-bb_se1[median.CI,] #find the 50%

low<-bb_se1[lowerCI,] #find the bottom 2.5%

up<-bb_se1[upperCI,] #find the top 2.5%ec.df1.boot = data.frame(median, low, up)

> ec.df1.boot

median low up

(Intercept) 45.22589 36.04048 55.48538

You can see they are similar to the approximated confidence intervals above but slightly different. Generally when random effects are involved, bootstrapped confidence intervals are the way to go. Now let’s see the distribution using a density plot

plot(density(bb1$t))

So here is the EC50 distribution for ‘factor a’ that we want to compare factors b and c against. We will need to again relevel the model for the other factors and repeat the code.

#Bootstrapped confidence intervals for factor b and c

library(lme4)

data1$factor <- relevel(data1$factor, ref = “b”) #set reference levels for GLMs

md2 <- glmer(cbind(suc,(tot-suc))~raw.x*factor+(1|obs) ,data1,family=’binomial’ )

bb2<-bootMer(md2,FUN=predFun2,nsim=200, parallel = “multicore”, .progress = ‘txt’ ) #get bootstrapping, 200 sims

data1$factor <- relevel(data1$factor, ref = “c”) #set reference levels for GLMs

md3 <- glmer(cbind(suc,(tot-suc))~raw.x*factor+(1|obs) ,data1,family=’binomial’ )

bb3<-bootMer(md3,FUN=predFun2,nsim=200, parallel = “multicore”, .progress = ‘txt’ ) #get bootstrapping, 200 sims

all.dist = data.frame(‘a’ = as.vector(bb1$t), ‘b’ = as.vector(bb2$t), ‘c’ = as.vector(bb3$t))

all.dist.long <- gather(all.dist, factor, ecx, a:c, factor_key=TRUE)#Plotting

library(ggplot2)

library(tidybayes)

p1 = ggplot(all.dist.long, aes(x=ecx))+geom_density(aes(group=factor, color =factor , fill=factor), alpha=0.3)+

stat_pointintervalh(aes(y = 0.00, x = ecx, group=factor),.width = c(.66, .95))+#+facet_wrap(~contrast+time, nrow = 3, ncol = 2)+

theme_light()

p1 = p1+scale_fill_manual( values = c(“steelblue4”, “orange”, ‘red’))+

scale_color_manual( values = c(“steelblue4″,”grey”, “steelblue1″,”steelblue4”, “grey”,”grey”, “grey”,”grey”))+theme(legend.position=”none”)#nice

p1 = p1 + scale_y_continuous(name =”Density”)

p1 = p1 + coord_cartesian(xlim = c(0.0, 50))

p1 = p1 + coord_cartesian(ylim = c(0.0, 0.1))

p1

Starting to look pretty. The blue is ‘factor ‘a’, yellow ‘factor b’, and red ‘factor c’. By looking at the 95% CI on the bottom, the red looks statistically different than the factor a EC50.

Now let’s compare the distributions against ‘factor a’ by subtracting the two distributions.

#Differences posterior

df4.s = data.frame(atob= all.dist$a – all.dist$b, atoc = all.dist$a – all.dist$c)

df4.s.long <- gather(df4.s, factor, diff, atob:atoc, factor_key=TRUE)

plot(density(df4.s$atob))

p2 = ggplot(df4.s.long, aes(x=diff))+geom_density(aes(group=factor, color =factor , fill=factor), alpha=0.3)+

stat_pointintervalh(aes(y = 0.00, x = diff, group=factor),.width = c(.66, .95))+#+facet_wrap(~contrast+time, nrow = 3, ncol = 2)+

theme_light()

p2 = p2+scale_fill_manual( values = c(“steelblue4”, “orange”, ‘red’))+

scale_color_manual( values = c(“steelblue4″,”grey”, “steelblue1″,”steelblue4”, “grey”,”grey”, “grey”,”grey”))+theme(legend.position=”none”)#nice

p2 = p2 + scale_y_continuous(name =”Density”)

p2 = p2 +geom_vline(xintercept = 0, color = “red”, lty = 2)+ theme_light()

p2 = p2 + coord_cartesian(xlim = c(0.0, 50))

p2 = p2 + coord_cartesian(ylim = c(0.0, 0.1))

# p2 = p2 + scale_x_continuous(name =”Standardized effect size”)

p2 = p2+facet_wrap(~factor)

p2

I’ve added a line through zero, which represents ‘no difference’ i.e if both distribution had the same ECx, the black dot for the median ECx would be on zero. Values less than zero mean that factor b or c are greater than factor a, and vice versa for positive numbers. So the interpretation is that if the 95% interval see by the thin black line overlaps with the red line, the two distributions are not different. Here ‘a’ and ‘b’ are similar, and ‘factor a’ and ‘factor c’ are different.

You can of course get the values for these 95% CI.

atob.diff = sort(df4.s$atob)

nrow(df4.s)

atob.diff[median.CI]

median.ab<-atob.diff[median.CI] #find the 50%

low.ab<-atob.diff[lowerCI] #find the bottom 2.5%

up.ab<-atob.diff[upperCI] #find the top 2.5%

ab.diff = data.frame(median.ab, low.ab, up.ab)

ab.diff> ab.diff

median.ab low.ab up.ab

3.940983 -8.379134 16.5668

You may also be interested in making a probability type statement. This probably should be reserved to Bayesian analysis (and yes you can do this exact analysis above using Bayesian model parameter outputs). But in a frequentist sense, we could say, ‘*if this experiment was repeated 100 times, the EC50 of factor b would be greater than the EC50 of factor a, X many times*‘.

length(which (all.dist$a < all.dist$b))/sims*100

length(which (all.dist$a < all.dist$b))/sims*100

[1] 28length(which (all.dist$a < all.dist$c))/sims*100

length(which (all.dist$a < all.dist$c))/sims*100

[1]0.5

*‘if this experiment was repeated 100 times, the EC50 of factor b would be greater than the EC50 of ‘factor a’ 28 of the times’.*

*‘if this experiment was repeated 100 times, the EC50 of factor c would be greater than the EC50 of ‘factor a’ less than 1 time’.*