Unstandardizing coefficients from a GLMM
This post was last updated on 2020-06-26.
Fitting generalized linear mixed models (GLMMs) can be tricky. For example, having explanatory variables with very different magnitudes can cause problems with model convergence. Standardizing the explanatory variables by subtracting the mean and dividing by the standard deviation prior to model fitting often fixes this issue. However, a standardized coefficient is interpreted as the change in mean response for a 1 standard deviation increase in the variable, which is not particularly intuitive. Most often the coefficients need to be converted back to their unstandardized values for interpretation.
It turns out the math for converting the standardized slope estimates to unstandardized ones is fairly straightforward. Coefficients for each variable need to be divided by the standard deviation of that variable (this is only true for slopes, not intercepts). A nice write-up of the math involved is shown in the answer to this Cross Validated question.
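In brief, if a variable x with sample mean x̄ and standard deviation s_x is standardized as (x − x̄)/s_x, then a standardized slope β_s contributes

$$\beta_s \, \frac{x - \bar{x}}{s_x} \;=\; \frac{\beta_s}{s_x}\, x \;-\; \frac{\beta_s \bar{x}}{s_x}$$

to the linear predictor. The slope on the original scale is therefore β_s/s_x, and the leftover constant is absorbed into the intercept, which is why the intercept needs different handling.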
The first time I went through the process of unstandardizing many explanatory variables it felt pretty clunky. Since then I’ve managed to tidy things up quite a bit and make it more automated. I’ll outline that process in this post.
R packages
Model fitting will be done via lme4. Data manipulation tools from dplyr will be useful for getting results tidied up. I’ll also use helper functions from purrr to loop through variables and broom.mixed for the tidy extraction of fixed-effects coefficients from mixed models.
library(lme4) # v. 1.1-23
suppressPackageStartupMessages( library(dplyr) ) # v. 1.0.0
library(purrr) # 0.3.4
library(broom.mixed) # 0.2.6
The dataset
The dataset I’ll use is named cbpp, and comes with lme4. The response variable in this dataset is counted proportions, so the data will be analyzed with a binomial generalized linear mixed model.
glimpse(cbpp)
# Rows: 56
# Columns: 4
# $ herd <fct> 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, ...
# $ incidence <dbl> 2, 3, 4, 0, 3, 1, 1, 8, 2, 0, 2, 2, 0, 2, 0, 5, 0, 0, 1, ...
# $ size <dbl> 14, 12, 9, 5, 22, 18, 21, 22, 16, 16, 20, 10, 10, 9, 6, 1...
# $ period <fct> 1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, ...
This dataset has no continuous explanatory variables in it, so I’ll add some to demonstrate standardizing/unstandardizing. I create three new variables with very different ranges.
set.seed(16)
cbpp = mutate(cbpp,
y1 = rnorm(56, 500, 100),
y2 = runif(56, 0, 1),
y3 = runif(56, 10000, 20000) )
glimpse(cbpp)
# Rows: 56
# Columns: 7
# $ herd <fct> 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, ...
# $ incidence <dbl> 2, 3, 4, 0, 3, 1, 1, 8, 2, 0, 2, 2, 0, 2, 0, 5, 0, 0, 1, ...
# $ size <dbl> 14, 12, 9, 5, 22, 18, 21, 22, 16, 16, 20, 10, 10, 9, 6, 1...
# $ period <fct> 1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, ...
# $ y1 <dbl> 547.6413, 487.4620, 609.6216, 355.5771, 614.7829, 453.158...
# $ y2 <dbl> 0.87758754, 0.33596115, 0.11495285, 0.26466003, 0.9994220...
# $ y3 <dbl> 14481.07, 11367.88, 14405.16, 18497.73, 17955.66, 10068.8...
Analysis
Unstandardized variables
I will first fit the binomial GLMM without standardizing the explanatory variables to demonstrate the resulting warning messages. The first warning tells me that I should consider rescaling the variables since they are on very different scales.
fit1 = glmer( cbind(incidence, size - incidence) ~ y1 + y2 + y3 + (1|herd),
data = cbpp, family = binomial)
# Warning: Some predictor variables are on very different scales: consider
# rescaling
# Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
# Model failed to converge with max|grad| = 0.947357 (tol = 0.002, component 1)
# Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
# - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
# - Rescale variables?
Standardizing the variables
To get the model to converge without warnings I’ll need to standardize the three explanatory variables. You can manually subtract the mean and divide by the standard deviation, but the scale() function conveniently does this all at once.

I do the work inside mutate_at() so I can standardize all three variables at once. I add “s” as the suffix by assigning a name within the .funs list. Adding the suffix allows me to keep the original variables in the dataset. I use as.numeric() to convert the matrix that scale() returns into a vector.
cbpp = mutate_at(cbpp,
.vars = vars( y1:y3 ),
.funs = list(s = ~as.numeric( scale(.) ) ) )
glimpse(cbpp)
# Rows: 56
# Columns: 10
# $ herd <fct> 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, ...
# $ incidence <dbl> 2, 3, 4, 0, 3, 1, 1, 8, 2, 0, 2, 2, 0, 2, 0, 5, 0, 0, 1, ...
# $ size <dbl> 14, 12, 9, 5, 22, 18, 21, 22, 16, 16, 20, 10, 10, 9, 6, 1...
# $ period <fct> 1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, ...
# $ y1 <dbl> 547.6413, 487.4620, 609.6216, 355.5771, 614.7829, 453.158...
# $ y2 <dbl> 0.87758754, 0.33596115, 0.11495285, 0.26466003, 0.9994220...
# $ y3 <dbl> 14481.07, 11367.88, 14405.16, 18497.73, 17955.66, 10068.8...
# $ y1_s <dbl> 0.34007250, -0.32531152, 1.02536895, -1.78352139, 1.08243...
# $ y2_s <dbl> 1.4243017, -0.4105502, -1.1592536, -0.6520950, 1.8370368,...
# $ y3_s <dbl> -0.2865770, -1.3443309, -0.3123665, 1.0781427, 0.8939670,...
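As a quick sanity check, the new y1_s column should match a manual calculation of subtracting the mean and dividing by the standard deviation.

all.equal( cbpp$y1_s,
           ( cbpp$y1 - mean(cbpp$y1) )/sd(cbpp$y1) )
# Should return TRUE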
Standardized model
The model with standardized variables converges without any problems.
fit2 = glmer( cbind(incidence, size - incidence) ~ y1_s + y2_s + y3_s + (1|herd),
data = cbpp, family = binomial)
Unstandardizing slope coefficients
Extract coefficients and profile confidence intervals
If I want to use this model for inference I’ll need to unstandardize the coefficients and confidence interval limits before reporting them so they are more easily interpretable.
The first step in the process is to get the standardized estimates and confidence intervals from the model fit2. I use tidy() from package broom.mixed for this, which returns a data.frame of coefficients, statistical tests, and confidence intervals. The help page is at ?tidy.merMod if you want to explore some of the options.

Using tidy(), I extract the fixed effects along with profile likelihood confidence intervals.
coef_st = tidy(fit2,
effects = "fixed",
conf.int = TRUE,
conf.method = "profile")
# Computing profile confidence intervals ...
coef_st
# # A tibble: 4 x 8
# effect term estimate std.error statistic p.value conf.low conf.high
# <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 fixed (Intercept) -2.096 0.2161 -9.699 3.037e-22 -2.566 -1.654
# 2 fixed y1_s 0.2640 0.1396 1.892 5.851e- 2 -0.009310 0.5473
# 3 fixed y2_s 0.1032 0.1295 0.7968 4.256e- 1 -0.1515 0.3636
# 4 fixed y3_s 0.1570 0.1229 1.277 2.016e- 1 -0.08629 0.4033
Calculate standard deviations for each variable
Next I’ll calculate the standard deviations for each variable. If I do this right I’ll be able to put the standard deviations into a data.frame that I can join to coef_st. That way dividing the estimated slopes and confidence interval limits by the standard deviation will be straightforward.
I will calculate the standard deviations per variable with map() from purrr, as it is a convenient way to loop through columns. I pull out the variables I want to calculate standard deviations for via select(). An alternative approach would have been to take the variables from columns and put them in rows (i.e., put the data in long format) and then summarize by groups; a sketch of that alternative is shown below, after the map() version.

map() returns a list, which can be stacked into a long format data.frame via utils::stack(). This results in a two-column data.frame, with a column for the standard deviations (called values) and a column with the variable names (called ind).
cbpp %>%
select(y1:y3) %>%
map(sd) %>%
stack()
# values ind
# 1 90.4430192 y1
# 2 0.2951881 y2
# 3 2943.2098667 y3
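For reference, the long format alternative mentioned above could look something like this. It’s just a sketch, and assumes the tidyr package is installed for pivot_longer(); tidyr isn’t loaded elsewhere in this post.

cbpp %>%
     select(y1:y3) %>%
     tidyr::pivot_longer(everything(), names_to = "ind", values_to = "value") %>%
     group_by(ind) %>%
     summarise(sd = sd(value) )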
The variables in my model output end with _s, so I need to add that suffix to the variable names. I also decided to rename values to sd so the column name is more useful.
sd_all = cbpp %>%
select(y1:y3) %>%
map(sd) %>%
stack() %>%
rename(sd = values) %>%
mutate(ind = paste(ind, "s", sep = "_") )
sd_all
# sd ind
# 1 90.4430192 y1_s
# 2 0.2951881 y2_s
# 3 2943.2098667 y3_s
Joining the standard deviations to the coefficients table
Once the names of the variables match between the datasets I can join the “standard deviations” data.frame to the “coefficients” data.frame. I’m not unstandardizing the intercept at this point, so I use inner_join() to keep only rows that have a match in both data.frames. Notice that the columns I’m joining by have different names in the two data.frames.
coef_st %>%
inner_join(., sd_all, by = c("term" = "ind") )
# # A tibble: 3 x 9
# effect term estimate std.error statistic p.value conf.low conf.high sd
# <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 fixed y1_s 0.2640 0.1396 1.892 0.05851 -0.009310 0.5473 9.044e+1
# 2 fixed y2_s 0.1032 0.1295 0.7968 0.4256 -0.1515 0.3636 2.952e-1
# 3 fixed y3_s 0.1570 0.1229 1.277 0.2016 -0.08629 0.4033 2.943e+3
Once everything is in one data.frame I can divide estimate, conf.low, and conf.high by sd via mutate_at(). I will round the results as well, although I am ignoring the vast differences in the variable ranges when I do this rounding.
coef_st %>%
inner_join(., sd_all, by = c("term" = "ind") ) %>%
mutate_at( .vars = vars(estimate, conf.low, conf.high),
.funs = list(~round( ./sd, 4) ) )
# # A tibble: 3 x 9
# effect term estimate std.error statistic p.value conf.low conf.high sd
# <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 fixed y1_s 0.00290 0.1396 1.892 0.05851 -0.0001 0.0061 90.44
# 2 fixed y2_s 0.3495 0.1295 0.7968 0.4256 -0.5131 1.232 0.2952
# 3 fixed y3_s 0.0001 0.1229 1.277 0.2016 0 0.0001 2943.
I’ll get rid of the extra variables via select(), so I end up with only the unstandardized coefficients and confidence interval limits along with the variable names.
coef_unst = coef_st %>%
inner_join(., sd_all, by = c("term" = "ind") ) %>%
mutate_at( .vars = vars(estimate, conf.low, conf.high),
.funs = list(~round( ./sd, 4) ) ) %>%
select(-effect, -(std.error:p.value), -sd)
coef_unst
# # A tibble: 3 x 4
# term estimate conf.low conf.high
# <chr> <dbl> <dbl> <dbl>
# 1 y1_s 0.00290 -0.0001 0.0061
# 2 y2_s 0.3495 -0.5131 1.232
# 3 y3_s 0.0001 0 0.0001
Estimates from the unstandardized model
Note that the manually unstandardized coefficients (above) match the estimates from the model fit with unstandardized variables.
round( fixef(fit1)[2:4], 4)
# y1 y2 y3
# 0.0029 0.3495 0.0001
Given that the estimates are the same, couldn’t we simply go back and fit the unstandardized model and ignore the warning messages? That would be nice, but unfortunately the convergence issues can cause problems when calculating profile likelihood confidence intervals, so this simpler approach doesn’t always work.
In this case there are a bunch of warnings (not shown), and the profile likelihood confidence interval limits aren’t successfully calculated for some of the coefficients.
tidy(fit1,
effects = "fixed",
conf.int = TRUE,
conf.method = "profile")
# Computing profile confidence intervals ...
# # A tibble: 4 x 8
# effect term estimate std.error statistic p.value conf.low conf.high
# <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 fixed (Interc~ -4.583e+0 1.095 -4.186 2.845e-5 NA NA
# 2 fixed y1 2.919e-3 0.001520 1.920 5.486e-2 NA NA
# 3 fixed y2 3.495e-1 0.4371 0.7997 4.239e-1 NA NA
# 4 fixed y3 5.335e-5 0.00004049 1.317 1.877e-1 -2.932e-5 1.370e-4
Further (important!) work
These results are all on the scale of log-odds, and I would exponentiate the unstandardized coefficients to the odds scale when reporting.
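As a rough sketch of that step, the unstandardized estimates and confidence interval limits in coef_unst could be exponentiated with another mutate_at().

# Convert unstandardized estimates and CI limits from log-odds to odds
coef_unst %>%
     mutate_at( .vars = vars(estimate, conf.low, conf.high),
                .funs = exp )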
One thing I didn’t discuss in this post that is important to consider is the appropriate and practically interesting unit increase for each variable. Clearly the effect of a 1 unit increase in the variable may not be of interest for at least y2 (range between 0 and 1) and y3 (range between 10000 and 20000). In the first case, 1 unit encompasses the entire range of the variable, and in the second case 1 unit appears to be much smaller than the scale of the measurement.
The code to calculate the change in odds for a practically interesting increase in each explanatory variable would be similar to what I’ve done above. I would create a data.frame with the unit increase of interest for each variable, join this to the “coefficients” dataset, and multiply all estimates and CI limits by those values. The multiplication step can occur before or after unstandardizing but must happen before exponentiating/inverse-linking. I’d report the unit increase used for each variable in any tables of results so the reader can see that the reported estimate is the change in estimated odds/mean for the given practically important increase in the variable.
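Here’s a rough sketch of what that could look like. The unit increases below (100 for y1, 0.1 for y2, and 1000 for y3) are made up purely for illustration; the practically interesting values would depend on the subject matter.

# Hypothetical unit increases of interest for each variable
unit_incr = data.frame(term = c("y1_s", "y2_s", "y3_s"),
                       incr = c(100, 0.1, 1000) )

# Multiply unstandardized slopes and CI limits by the chosen increase,
# then exponentiate to get the multiplicative change in odds
coef_unst %>%
     inner_join(., unit_incr, by = "term") %>%
     mutate_at( .vars = vars(estimate, conf.low, conf.high),
                .funs = list(~exp(. * incr) ) )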
Just the code, please
Here’s the code without all the discussion. Copy and paste the code below, or download an R script of uncommented code from here.
library(lme4) # v. 1.1-23
suppressPackageStartupMessages( library(dplyr) ) # v. 1.0.0
library(purrr) # 0.3.4
library(broom.mixed) # 0.2.6
glimpse(cbpp)
set.seed(16)
cbpp = mutate(cbpp,
y1 = rnorm(56, 500, 100),
y2 = runif(56, 0, 1),
y3 = runif(56, 10000, 20000) )
glimpse(cbpp)
fit1 = glmer( cbind(incidence, size - incidence) ~ y1 + y2 + y3 + (1|herd),
data = cbpp, family = binomial)
cbpp = mutate_at(cbpp,
.vars = vars( y1:y3 ),
.funs = list(s = ~as.numeric( scale(.) ) ) )
glimpse(cbpp)
fit2 = glmer( cbind(incidence, size - incidence) ~ y1_s + y2_s + y3_s + (1|herd),
data = cbpp, family = binomial)
coef_st = tidy(fit2,
effects = "fixed",
conf.int = TRUE,
conf.method = "profile")
coef_st
cbpp %>%
select(y1:y3) %>%
map(sd) %>%
stack()
sd_all = cbpp %>%
select(y1:y3) %>%
map(sd) %>%
stack() %>%
rename(sd = values) %>%
mutate(ind = paste(ind, "s", sep = "_") )
sd_all
coef_st %>%
inner_join(., sd_all, by = c("term" = "ind") )
coef_st %>%
inner_join(., sd_all, by = c("term" = "ind") ) %>%
mutate_at( .vars = vars(estimate, conf.low, conf.high),
.funs = list(~round( ./sd, 4) ) )
coef_unst = coef_st %>%
inner_join(., sd_all, by = c("term" = "ind") ) %>%
mutate_at( .vars = vars(estimate, conf.low, conf.high),
.funs = list(~round( ./sd, 4) ) ) %>%
select(-effect, -(std.error:p.value), -sd)
coef_unst
round( fixef(fit1)[2:4], 4)
tidy(fit1,
effects = "fixed",
conf.int = TRUE,
conf.method = "profile")