library(tidyverse)
library(ggfortify)
library(lme4)
library(saemix)
library(lattice)
theme_set(theme_bw())
Nonlinear Mixed Effects Models
Preliminary
Functions from R-base and stats (preloaded) are required, plus packages from the tidyverse for data representation and manipulation. We also need the lme4 and saemix packages for fitting (nonlinear) mixed models. lattice is used for graphical representation of quantities such as random and fixed effects in the mixed models.
1 Introduction: the theophylline data set
The theophyllineData file reports data from a study of the kinetics of the anti-asthmatic drug theophylline. Twelve subjects were given oral doses of theophylline, then serum concentrations were measured at 11 time points over the next 25 hours.
theophylline <- read_csv("../../data/theophyllineData.csv") %>% mutate(id = factor(id))
theophylline %>% rmarkdown::paged_table()
Here, time is the time since drug administration when the sample was drawn (h), concentration is the measured theophylline concentration (mg/L) and weight is the weight of the subject (kg).
The 12 subjects received 320 mg of theophylline at time 0.
Let us plot the data, i.e. the concentration versus time:
theo_plot <- theophylline %>%
  ggplot() + aes(x = time, y = concentration) + geom_point(color="#993399", size=2) +
  xlab("time (h)") + ylab("concentration (mg/l)")
theo_plot + geom_line(color="#993399", aes(group = id))
We can also plot the 12 individual concentration profiles on 12 separate plots:
theo_plot + geom_line() + facet_wrap(~ id)
The pattern is similar for the 12 individuals: the concentration first increases during the absorption phase and then decreases during the elimination phase. Nevertheless, we clearly see some differences between these profiles which are not only due to the residual errors. In particular, we see that the patients absorb and eliminate the drug more or less rapidly.
A population approach and the use of mixed effects models will allow us to take this inter-individual variability into account.
2 Fitting nonlinear models to the data
2.1 Fitting a nonlinear model to a single subject
Let us consider the first subject of this study (id=1):
subject1 <- theophylline %>%
  filter(id == 1) %>% select("time","concentration")
subject1_plot <- subject1 %>%
  ggplot() + aes(x = time, y = concentration) + geom_point(color="#993399", size=3) +
  xlab("time (h)") + ylab("concentration (mg/l)") + ylim(c(0,11))
subject1_plot + geom_line(color="#993399")
We may want to fit a nonlinear model to this data of the form
y_j = f(t_j ,\psi) + \varepsilon_j \quad , \quad 1\leq j \leq n
where (y_j, 1\leq j \leq n) are the n measurements for this subject, f is the model (e.g. from pharmacokinetics), \psi is the vector of parameters for this subject and (\varepsilon_j, 1\leq j \leq n) are residual errors.
A one-compartment model with first-order absorption and linear elimination for these data reads
f(t_j ,\psi) = \frac{ D \, k_a}{V(k_a-k_e)}\, \left( e^{- k_e \, t_j} - e^{- k_a \, t_j} \right)
where \psi=(k_a,V,k_e) are the PK parameters of the model and D is the amount of drug given to the patient (here, D=320mg).
Let us compute the least squares estimate of \psi defined as
\hat\psi = \arg\min_{\psi} \sum_{j=1}^{n} (y_{j} - f(t_{j} ,\psi))^2
We first need to implement the pharmacokinetics (PK) model:
f1 <- function(psi, t){
  D <- 320; ka <- psi[1]; V <- psi[2]; ke <- psi[3]
  f <- D*ka/V/(ka-ke)*(exp(-ke*t)-exp(-ka*t))
  f
}
We can then use the nls function for fitting this (nonlinear) model to the data:
model_1 <- nls(concentration ~ f1(psi, time), start = list(psi=c(ka=1, V=40, ke=0.1)), data=subject1)
coef(model_1)
psi1 psi2 psi3
1.77741125 29.39415966 0.05395458
and plot the predicted concentration f(t,\hat\psi)
dplot <- data.frame(time = seq(0, 40, by=0.2))
dplot$pred_1 <- predict(model_1, newdata = dplot)
subject1_plot + geom_line(data = dplot, aes(x = time, y = pred_1), colour = "#339900", linewidth=1)
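As a sanity check, we can also minimize the residual sum of squares defined above directly with a general-purpose optimizer; this is only a sketch reusing f1 and subject1, and the minimizer should land close to the nls estimate.
rss <- function(psi) sum((subject1$concentration - f1(psi, subject1$time))^2)
optim(c(ka = 1, V = 40, ke = 0.1), rss)$par   # Nelder-Mead minimization of the RSS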
2.2 Fitting a unique nonlinear model to several subjects
Instead of fitting this model to a single patient, we may want to fit the same model to all the patients:
y_{ij} = f(t_{ij} ,\psi) + \varepsilon_{ij} \quad , \quad 1\leq i \leq N \ , \ 1\leq j \leq n_i
where (y_{ij}, 1\leq j \leq n_i) are the n_i measurements for subject i. Here, \psi is the vector of parameters shared by the N subjects.
In this model, the least squares estimate of \psi is defined as
\hat\psi = \arg\min_{\psi} \sum_{i=1}^N \sum_{j=1}^{n_i} (y_{ij} - f(t_{ij} ,\psi))^2
Let us use the nls function with the pooled data from the 12 subjects.
model_all <- nls(concentration ~ f1(psi, time), start = list(psi=c(ka=1, V=40, ke=0.1)), data=theophylline)
coef(model_all)
psi1 psi2 psi3
1.57975379 33.42183648 0.07931028
dplot$pred_all <- predict(model_all, newdata = dplot)
theo_plot + geom_line(data = dplot, aes(x = time, y = pred_all), colour="#339900", linewidth=1)
These estimated parameters are typical parameters and this profile is a typical profile for this sample of patients: by definition, they do not take into account the variability between the patients and therefore do not provide good individual predictions.
theo_plot +
  geom_line(data = dplot, aes(x = time, y = pred_all), colour="#339900", linewidth=1) +
  facet_wrap(~ id)
2.3 Fitting several nonlinear models to several subjects
We can instead fit the same PK model with different parameters to each subject, exactly as we did above with the first patient:
y_{ij} = f(t_{ij} ,\psi_i) + \varepsilon_{ij} \quad , \quad 1\leq i \leq N \ , \ 1\leq j \leq n_i
where \psi_i is the vector of parameters for patient i.
In this model, the least squares estimate of \psi_i is defined as
\hat\psi_i = \arg\min_{\psi} \sum_{j=1}^{n_i} \Big(y_{ij} - f(t_{ij} ,\psi)\Big)^2 \quad , \quad 1\leq i \leq N
Each individual predicted concentration f(t,\hat\psi_i) seems to predict the observed concentrations for the 12 subjects quite well:
res <- split(theophylline, theophylline$id) %>%
  map(~{
    model_i <- nls(concentration ~ f1(psi, time),
                   start = list(psi=c(ka=1, V=40, ke=0.08)),
                   data = .x)
    list(psi = coef(model_i),
         y_hat = predict(model_i, newdata = dplot),
         id = unique(.x$id))
  })
psi_hat <- map_df(res, "psi") %>%
  setNames(c("ka","V","ke")) %>%
  add_column(id = factor(names(res)))
theo_pred <- map_df(res, ~ tibble(id = .x$id,
                                  time = dplot$time,
                                  concentration = .x$y_hat))
theo_plot + geom_line(data = theo_pred, aes(x = time, y = concentration), colour="#339900", linewidth=0.75) + facet_wrap(~ id)
For instance, for individual 9:
model_9 <- nls(concentration ~ f1(psi, time), start = list(psi=c(ka=1, V=40, ke=0.08)),
               data = filter(theophylline, id == 9))
model_9
Nonlinear regression model
model: concentration ~ f1(psi, time)
data: filter(theophylline, id == 9)
psi1 psi2 psi3
8.86564 38.94820 0.08663
residual sum-of-squares: 2.489
Number of iterations to convergence: 13
Achieved convergence tolerance: 3.749e-06
3 Nonlinear mixed effects (NLME) model
3.1 A first basic model
Until now, the individual parameters (\psi_i) were considered as fixed effects: we didn't make any assumption about their possible values.
In a population approach, the N subjects are assumed to be randomly sampled from the same population of individuals. Then, each individual parameter \psi_i is treated as a random variable.
We will start assuming that the \psi_i’s are independent and normally distributed:
\psi_i \sim^{\mathrm{iid}} \mathcal{N}(\psi_{\rm pop} , \Omega)
where \psi_{\rm pop} is a p-vector of population parameters and \Omega a p\times p variance-covariance matrix.
Remark. This normality assumption allows us to decompose each individual parameter \psi_i into a fixed effect \psi_{\rm pop} and a random effect \eta_i:
\psi_i = \psi_{\rm pop} + \eta_i
where \eta_i \sim^{\mathrm{iid}} \mathcal{N}(0 , \Omega).
We will also start assuming that the residual errors (\varepsilon_{ij}) are independent and normally distributed: \varepsilon_{ij} \sim^{\mathrm{iid}} \mathcal{N}(0 , a^2).
In summary, we can equivalently represent a (nonlinear) mixed effects model
i) using equations:
\begin{aligned} y_{ij} & = f(t_{ij} ,\psi_i) + \varepsilon_{ij} \\ \psi_i & = \psi_{\rm pop} + \eta_i \end{aligned} where \varepsilon_{ij} \sim^{\mathrm{iid}} \mathcal{N}(0 , a^2) and \eta_i \sim^{\mathrm{iid}} \mathcal{N}(0 , \Omega),
ii) or using probability distributions:
\begin{aligned} y_{ij} &\sim \mathcal{N}(f(t_{ij} ,\psi_i) \ , \ a^2) \\ \psi_i & \sim \mathcal{N}(\psi_{\rm pop} , \Omega) \end{aligned}
The model is the joint probability distribution of (y,\psi), where y=(y_{ij}, 1\leq i \leq N, 1 \leq j \leq n_i) is the complete set of observations and \psi = (\psi_i, 1\leq i \leq N) the N vectors of individual parameters,
\begin{aligned} \mathbb{P}(y,\psi;\theta) &= \prod_{i=1}^N \mathbb{P}(y_i,\psi_i;\theta) \\ &= \prod_{i=1}^N \mathbb{P}(y_i|\psi_i;\theta)\mathbb{P}(\psi_i;\theta) \end{aligned}
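To make the hierarchical structure concrete, here is a minimal simulation sketch from this joint model, reusing the function f1 defined above; the values of \psi_{\rm pop}, the standard deviations and a are illustrative, not estimates.
set.seed(1)
N <- 12
times <- c(0.5, 1, 2, 4, 6, 8, 12, 24)
psi_pop <- c(ka = 1.5, V = 32, ke = 0.09)
omega <- c(ka = 0.3, V = 4, ke = 0.01)   # standard deviations (Omega diagonal here)
a <- 0.7                                 # residual standard deviation
sim <- map_df(1:N, function(i) {
  eta_i <- rnorm(3, mean = 0, sd = omega)   # eta_i ~ N(0, Omega)
  psi_i <- psi_pop + eta_i                  # psi_i = psi_pop + eta_i
  tibble(id = factor(i), time = times,
         concentration = f1(psi_i, times) + rnorm(length(times), sd = a))
})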
3.2 Tasks, methods and algorithms
A detailed presentation of all the existing methods and algorithms for nonlinear mixed effect models goes far beyond the scope of this course. We will restrict ourselves to the methods and algorithms implemented in the saemix package.
3.2.1 Estimation of the population parameters
The parameters of the model are \theta=(\psi_{\rm pop}, \Omega, a^2). Maximum likelihood estimation of \theta consists of maximizing with respect to \theta the observed likelihood function defined by
\begin{aligned} \ell(\theta,y) &\triangleq \mathbb{P}(y ; \theta) \\ &= \int \mathbb{P}(y,\psi ;\theta) \, d \psi \\ &= \prod_{i=1}^N\int \mathbb{P}(y_i|\psi_i ;\theta)\mathbb{P}(\psi_i ;\theta) \, d \psi_i . \end{aligned}
If f is a nonlinear function of \psi_i, then y_i is not a Gaussian vector and the likelihood function \ell(\theta,y) cannot be computed in a closed form.
Several algorithms exist for maximum likelihood estimation in nonlinear mixed effects models. In particular, the stochastic approximation EM algorithm (SAEM) is an iterative algorithm that converges to a maximum of the likelihood function under general conditions.
3.2.2 Estimation of the individual parameters
Once \theta has been estimated, the conditional distribution \mathbb{P}(\psi_i | y_i ; \hat{\theta}) can be used for each individual i for estimating the vector of individual parameters \psi_i.
The mode of this conditional distribution is defined as
\begin{aligned} \hat{\psi}_i &= \arg\max_{\psi_i}\mathbb{P}(\psi_i | y_i ; \hat{\theta}) \\ &= \arg\min_{\psi_i} \left\{ -2\log(\mathbb{P}(\psi_i | y_i ; \hat{\theta})) \right\}\\ &= \arg\min_{\psi_i} \left\{-2\log\mathbb{P}( y_i | \psi_i ; \hat{\theta}) -2 \log\mathbb{P}(\psi_i ; \hat{\theta}) \right\}\\ &= \arg\min_{\psi_i} \left\{ \frac{1}{\hat{a}^2}\|y_i - f(t_i,\psi_i)\|^2 + (\psi_i-\hat\psi_{\rm pop})^\top\hat\Omega^{-1}(\psi_i-\hat\psi_{\rm pop}) \right\} \end{aligned}
This estimate is called the maximum a posteriori (MAP) estimate, or the empirical Bayes estimate (EBE) of \psi_i.
Remark. Since f is a nonlinear function of \psi_i, there is no analytical expression for \hat\psi_i. A Newton-type algorithm should then be used to solve this minimization problem.
We can then use the conditional mode for computing predictions, taking the philosophy that the most likely values of the individual parameters are the most suited for computing the most likely predictions:
\widehat{f(t_{ij} , \psi_i)} = f(t_{ij} , \hat\psi_i).
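As an illustration, here is a sketch of this minimization with a general-purpose optimizer (not how saemix implements it); psi_pop_hat, Omega_hat and a_hat stand for the estimated population parameters, assumed available, and f1 is the PK model implemented above.
map_estimate <- function(y_i, t_i, psi_pop_hat, Omega_hat, a_hat) {
  crit <- function(psi_i)                     # -2 log conditional density, up to a constant
    drop(sum((y_i - f1(psi_i, t_i))^2) / a_hat^2 +
         t(psi_i - psi_pop_hat) %*% solve(Omega_hat) %*% (psi_i - psi_pop_hat))
  optim(psi_pop_hat, crit)$par                # numerical minimization (Nelder-Mead by default)
}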
3.2.3 Estimation of the likelihood function
Performing likelihood ratio tests and computing information criteria for a given model requires computation of the log-likelihood \log\ell(\hat{\theta};y) \triangleq \log(\mathbb{P}(y;\hat{\theta})).
The log-likelihood cannot be computed in closed-form for nonlinear mixed effects models. In the case of continuous data, approximation of the model by a Gaussian linear one allows us to approximate the log-likelihood.
Indeed, we can linearize the model for the observations (y_{ij},\, 1\leq j \leq n_i) of individual i around the vector of predicted individual parameters \hat\psi_i.
Let \partial_{\psi}{f(t , \psi)} be the row vector of derivatives of f(t , \psi) with respect to \psi. Then,
\begin{aligned} y_{ij} &\simeq f(t_{ij} , \hat\psi_i) + \partial_{\psi}{f(t_{ij} , \hat\psi_i)} \, (\psi_i - \hat\psi_i) + \varepsilon_{ij} \\ &\simeq f(t_{ij} , \hat\psi_i) + \partial_{\psi}{f(t_{ij} , \hat\psi_i)} \, (\hat\psi_{\rm pop} - \hat\psi_i) + \partial_{\psi}{f(t_{ij} , \hat\psi_i)} \, \eta_i + \varepsilon_{ij} . \end{aligned}
Following this, we can approximate the marginal distribution of the vector y_i by a normal one:
y_{i} \approx \mathcal{N}\left(\mu_i\, , \, \Gamma_i \right),
where
\begin{aligned} \mu_i & = f(t_{i} , \hat\psi_i) + \partial_{\psi}{f(t_{i} , \hat\psi_i)} \, (\hat\psi_{\rm pop} - \hat\psi_i) \\ \Gamma_i &= \partial_{\psi}{f(t_{i} , \hat\psi_i)} \, \hat\Omega \, \partial_{\psi}{f(t_{i} , \hat\psi_i)}^{\top} + \hat{a}^2\, I_{n_i} \end{aligned}
Here, \partial_{\psi}{f(t_{i} , \hat\psi_i)} denotes the n_i \times p matrix whose jth row is \partial_{\psi}{f(t_{ij} , \hat\psi_i)}.
The log-likelihood function is then approximated by
\begin{aligned} \log\ell(\hat{\theta};y) & = \sum_{i=1}^N \log\ell(\hat{\theta};y_i)\\ & \simeq \sum_{i=1}^N \left\{ -\frac{n_i}{2}\log(2 \pi) -\frac{1}{2}\log(|\Gamma_i|) -\frac{1}{2} (y_i - \mu_i)^\top \Gamma_i^{-1} (y_i - \mu_i) \right\} \end{aligned}
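As a sketch, the contribution of individual i to this approximation is easily coded, assuming \mu_i and \Gamma_i have already been computed:
loglik_i <- function(y_i, mu_i, Gamma_i) {
  r <- y_i - mu_i
  as.numeric(-length(y_i)/2 * log(2*pi) -
             determinant(Gamma_i, logarithm = TRUE)$modulus / 2 -
             t(r) %*% solve(Gamma_i, r) / 2)
}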
3.2.4 Estimation of the Fisher information matrix
Using the linearized model, the variance of the maximum likelihood estimate (MLE) \hat{\theta}, and thus confidence intervals, can be derived from the observed Fisher information matrix (FIM), itself derived from the observed likelihood:
\begin{aligned} I({\hat\theta}) & \triangleq \ - \frac{\partial^2}{\partial \theta \partial \theta^\top} \log\ell(\hat{\theta};y) \\ & \simeq \frac{1}{2}\sum_{i=1}^N \frac{\partial^2}{\partial \theta \partial \theta^\top}\left\{ \log(|\Gamma_i|) + (y_i - \mu_i)^\top \Gamma_i^{-1} (y_i - \mu_i) \right\} \end{aligned}
The variance-covariance matrix of \hat\theta can then be estimated by the inverse of the observed FIM. Standard errors (s.e.) for each component of \hat\theta are the standard deviations, i.e., the square root of the diagonal elements of the variance-covariance matrix.
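In code this is a one-liner; here I_hat stands for a hypothetical p x p observed FIM computed beforehand:
V_theta <- solve(I_hat)          # estimated variance-covariance matrix of theta_hat
se_theta <- sqrt(diag(V_theta))  # standard errors: square roots of the diagonal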
4 Fitting a NLME model to the theophylline data
Let us see how to use the saemix package for fitting our model to the theophylline data.
We first need to create a SaemixData object, defining which columns of the data file should be used and their roles. In our example, concentration is the response variable y, time is the explanatory variable (or predictor) t and id is the grouping variable.
saemix_data <- saemixData(name.data = theophylline,
                          name.group = "id",
                          name.predictors = "time",
                          name.response = "concentration")
The structural model is the one-compartment model with first-order absorption and linear elimination previously used:
model1_nlme <- function(psi, id, x) {
  D <- 320
  t <- x[,1]
  ka <- psi[id,1]
  V <- psi[id,2]
  ke <- psi[id,3]
  fpred <- D*ka/(V*(ka-ke))*(exp(-ke*t)-exp(-ka*t))
  fpred
}
The model is defined in a saemixModel object. The structural model and some initial values for the vector of population parameters \psi_{\rm pop} are required:
saemix_model <- saemixModel(model = model1_nlme,
                            psi0 = c(ka=1, V=20, ke=0.5))
Several options for selecting and running the algorithms can be defined, including the estimation of the individual parameters (map=TRUE), the estimation of the Fisher information matrix and the log-likelihood by linearization (fim=TRUE), or the estimation of the log-likelihood by importance sampling (ll.is=TRUE, ignored for this course).
seed is an integer used for the random number generator: running the algorithms several times with the same seed ensures that the results will be the same.
saemix_options <- list(map=TRUE, fim=TRUE, ll.is=FALSE, displayProgress=FALSE, save=FALSE, seed=632545)
saemix_fit1 <- saemix(saemix_model, saemix_data, saemix_options)
A summary of the results of the estimation algorithm can be displayed
saemix_fit1@results
Fixed effects
Parameter Estimate SE CV(%)
ka 1.8247 0.34144 18.71
V 32.5408 1.59864 4.91
ke 0.0884 0.00629 7.11
a.1 0.7474 0.05658 7.57
Variance of random effects
Parameter Estimate SE CV(%)
omega2.ka 1.21e+00 5.50e-01 45.5
omega2.V 2.27e+01 1.12e+01 49.4
omega2.ke 2.02e-04 1.72e-04 85.4
Statistical criteria
Likelihood computed by linearisation
-2LL= 345.101
AIC= 359.101
BIC= 362.4954
The individual parameter estimates are also available:
psi <- psi(saemix_fit1)
psi
ka V ke
1 1.6618502 28.23167 0.06372162
2 2.0150874 32.84552 0.09448832
3 2.2772015 33.42197 0.08637440
4 1.1978851 31.37165 0.08668642
5 1.5244794 27.50193 0.08575651
6 1.1193182 39.55880 0.09839426
7 0.7350038 34.06061 0.09255150
8 1.3723612 35.37493 0.09168638
9 4.3235277 36.38783 0.09534774
10 0.7027648 25.58049 0.07625125
11 3.2181003 36.72171 0.09811233
12 0.9502211 26.08340 0.09081927
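Since these estimates form a plain table, we can, for instance, glance at the spread of the individual parameters across the 12 subjects; this is a sketch using the tidyverse functions already loaded (as_tibble() guards against psi being returned as a matrix):
as_tibble(psi) %>%
  pivot_longer(everything(), names_to = "parameter", values_to = "estimate") %>%
  ggplot() + aes(x = estimate) + geom_histogram(bins = 8) +
  facet_wrap(~ parameter, scales = "free")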
These individual parameter estimates can be used for computing and plotting individual predictions
saemix_fit <- saemix.predict(saemix_fit1)
saemix.plot.fits(saemix_fit1)
Several diagnostic fit plots can be displayed, including the plot of the observations versus individual predictions
saemix.plot.obsvspred(saemix_fit1,level=1)
and the plot of the residuals versus time, and versus individual predictions,
saemix.plot.scatterresiduals(saemix_fit1, level=1)
4.1 Some extensions of the model
4.1.1 The residual error model
In the model y_{ij} = f(t_{ij} ,\psi_i) + \varepsilon_{ij}, the residual errors (\varepsilon_{ij}) are assumed to be Gaussian random variables with mean 0. Different models can be used for the variance of the (\varepsilon_{ij}) in a nonlinear mixed effects model; the following are some of them.
Constant error model:
The residual errors (\varepsilon_{ij}) are independent and identically distributed:
\varepsilon_{ij} \sim^{\mathrm{iid}} \mathcal{N}(0 \ , \ a^2)
The variance of y_{ij} is therefore constant over time. Equivalently,
y_{ij} = f(t_{ij} ,\psi_i) + a \tilde{\varepsilon}_{ij}
where \tilde{\varepsilon}_{ij} \sim^{\mathrm{iid}} \mathcal{N}(0, 1).
The error model can be defined as an argument of saemixModel (default is the constant error model):
saemix_model <- saemixModel(model=model1_nlme, psi0=c(ka=1,V=20,ke=0.5), error.model="constant")
fit.constant <- saemix(saemix_model, saemix_data, saemix_options)
fit.constant@results
Fixed effects
Parameter Estimate SE CV(%)
ka 1.8247 0.34144 18.71
V 32.5408 1.59864 4.91
ke 0.0884 0.00629 7.11
a.1 0.7474 0.05658 7.57
Variance of random effects
Parameter Estimate SE CV(%)
omega2.ka 1.21e+00 5.50e-01 45.5
omega2.V 2.27e+01 1.12e+01 49.4
omega2.ke 2.02e-04 1.72e-04 85.4
Statistical criteria
Likelihood computed by linearisation
-2LL= 345.101
AIC= 359.101
BIC= 362.4954
Proportional error model:
Proportional error models assume that the standard deviation of the residual error is proportional to the predicted response: \varepsilon_{ij} = b\, f(t_{ij} ,\psi_i) \, \tilde{\varepsilon}_{ij} where \tilde{\varepsilon}_{ij} \sim^{\mathrm{iid}} \mathcal{N}(0, 1). Then,
y_{ij} = f(t_{ij} ,\psi_i) + b\, f(t_{ij} ,\psi_i) \, \tilde{\varepsilon}_{ij}
saemix_model <- saemixModel(model=model1_nlme, psi0=c(ka=1,V=20,ke=0.5), error.model="proportional")
fit.proportional <- saemix(saemix_model, saemix_data, saemix_options)
fit.proportional@results
Fixed effects
Parameter Estimate SE CV(%)
ka 1.7430 0.32323 18.55
V 32.9585 1.59411 4.84
ke 0.0871 0.00471 5.41
b.1 0.1608 0.01225 7.62
Variance of random effects
Parameter Estimate SE CV(%)
omega2.ka 1.07e+00 4.94e-01 46.3
omega2.V 2.12e+01 1.16e+01 55.0
omega2.ke 1.81e-04 1.02e-04 56.3
Statistical criteria
Likelihood computed by linearisation
-2LL= 339.6003
AIC= 353.6003
BIC= 356.9947
Combined error model:
A combined error model additively combines a constant and a proportional error model: \varepsilon_{ij} = (a + b\, f(t_{ij} ,\psi_i)) \, \tilde{\varepsilon}_{ij} where \tilde{\varepsilon}_{ij} \sim^{\mathrm{iid}} \mathcal{N}(0, 1). Then,
y_{ij} = f(t_{ij} ,\psi_i) + (a + b\, f(t_{ij} ,\psi_i)) \, \tilde{\varepsilon}_{ij}
saemix_model <- saemixModel(model=model1_nlme, psi0=c(ka=1,V=20,ke=0.5), error.model="combined")
fit.combined <- saemix(saemix_model, saemix_data, saemix_options)
fit.combined@results
Fixed effects
Parameter Estimate SE CV(%)
ka 1.7912 0.34267 19.13
V 32.5252 1.65719 5.10
ke 0.0886 0.00578 6.53
a.1 0.5947 0.13123 22.07
b.1 0.0769 0.02423 31.49
Variance of random effects
Parameter Estimate SE CV(%)
omega2.ka 1.22e+00 5.54e-01 45.3
omega2.V 2.44e+01 1.20e+01 49.3
omega2.ke 1.61e-04 1.46e-04 90.7
Statistical criteria
Likelihood computed by linearisation
-2LL= 341.426
AIC= 357.426
BIC= 361.3053
Exponential error model:
If y is known to take positive values, a log transformation can be used. We can then write the model with one of two equivalent representations: \begin{aligned} \log(y_{ij}) & = \log(f(t_{ij} ,\psi_i)) + a \varepsilon_{ij} \\ y_{ij} & = f(t_{ij} ,\psi_i) \ e^{a \varepsilon_{ij}} \end{aligned} where \varepsilon_{ij} \sim^{\mathrm{iid}} \mathcal{N}(0, 1).
saemix_model <- saemixModel(model=model1_nlme, psi0=c(ka=1,V=20,ke=0.5), error.model="exponential")
fit.exponential <- saemix(saemix_model, saemix_data, saemix_options)
fit.exponential@results
Fixed effects
Parameter Estimate SE CV(%)
ka 1.4848 0.27580 18.58
V 32.3074 1.60279 4.96
ke 0.0887 0.00493 5.56
a.1 0.1766 0.01341 7.60
Variance of random effects
Parameter Estimate SE CV(%)
omega2.ka 7.73e-01 3.58e-01 46.3
omega2.V 1.95e+01 1.14e+01 58.6
omega2.ke 1.85e-04 1.09e-04 59.1
Statistical criteria
Likelihood computed by linearisation
-2LL= 364.281
AIC= 378.281
BIC= 381.6754
4.1.2 Transformation of the individual parameters
Clearly, not all distributions are Gaussian. To begin with, the normal distribution has support \mathbb{R}, while many parameters take values in restricted intervals. For instance, some variables take only positive values (e.g., volumes and transfer rate constants) and others are restricted to bounded intervals.
Furthermore, the Gaussian distribution is symmetric, which is not a property shared by all distributions. One way to extend the use of Gaussian distributions is to consider that some transformation of the parameters in which we are interested is Gaussian, i.e., to assume the existence of a monotonic function h such that h(\psi_i) is normally distributed. For the sake of simplicity, we will consider here a scalar parameter \psi_i. We then assume that
h(\psi_i) \sim \mathcal{N}(h(\psi_{\rm pop}) \ , \ \omega^2).
Or, equivalently,
h(\psi_i) = h(\psi_{\rm pop}) + \eta_i
where \eta_i \sim \mathcal{N}(0,\omega^2).
Let us now see some examples of transformed normal pdfs.
Log-normal distribution:
A log-normal distribution ensures nonnegative values and is widely used for describing distributions of physiological parameters.
If \psi_i is log-normally distributed, the three following representations are equivalent:
\begin{aligned} \log(\psi_i) & \sim \mathcal{N}( \log(\psi_{\rm pop}) \ , \omega^2) \\ \log(\psi_i) &= \log(\psi_{\rm pop}) + \eta_i \\ \psi_i &= \psi_{\rm pop} e^{\eta_i} \end{aligned}
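As a quick numerical illustration (the values \psi_{\rm pop} = 32 and \omega = 0.2 are arbitrary), draws built as \psi_{\rm pop}\, e^{\eta_i} are always positive:
eta <- rnorm(1000, mean = 0, sd = 0.2)   # eta_i ~ N(0, omega^2)
psi_sim <- 32 * exp(eta)                 # psi_i = psi_pop * exp(eta_i)
min(psi_sim) > 0                         # TRUE: the support is (0, +infinity)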
Logit-normal distribution:
The logit function is defined on (0,1) and takes its values in \mathbb{R}: for any x in (0,1),
\mathrm{logit}(x) = \log \left(\frac{x}{1-x}\right) \Longleftrightarrow x = \frac{1}{1+e^{-\mathrm{logit}(x)}}
An individual parameter \psi_i with a logit-normal distribution takes its values in (0,1). The logit of \psi is normally distributed, i.e.,
\mathrm{logit}(\psi_i) = \log \left(\frac{\psi_i}{1-\psi_i}\right) \ \sim \ \ \mathcal{N}( \mathrm{logit}(\psi_{\rm pop}) \ , \ \omega^2).
Probit-normal distribution:
The probit function is the inverse cumulative distribution function (quantile function) \Phi^{-1} associated with the standard normal distribution \mathcal{N}(0,1). For any x in (0,1), \mathrm{probit}(x) = \Phi^{-1}(x) \ \Longleftrightarrow \mathbb{P}\left(\mathcal{N}(0,1) \leq \mathrm{probit}(x)\right) = x
An individual parameter \psi_i with a probit-normal distribution takes its values in (0,1). The probit of \psi_i is normally distributed:
\mathrm{probit}(\psi_i) = \Phi^{-1}(\psi_i) \ \sim \ \mathcal{N}( \mathrm{probit}(\psi_{\rm pop}) \ , \ \omega^2) .
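As a side note, both transformations are available in base R: qlogis()/plogis() implement the logit and its inverse, and qnorm()/pnorm() the probit and its inverse. A quick check:
x <- 0.3
qlogis(x)            # logit(x) = log(x/(1-x))
plogis(qlogis(x))    # inverse logit: back to 0.3
qnorm(x)             # probit(x) = Phi^{-1}(x)
pnorm(qnorm(x))      # inverse probit: back to 0.3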
The distribution of each individual parameter can be defined using the argument transform.par (0=normal, 1=log-normal, 2=probit, 3=logit). The default is a vector of 0s, i.e. normal distributions for all parameters.
If we want to use, for example, a normal distribution for V and log-normal distributions for k_a and k_e, then transform.par should be the vector c(1,0,1):
saemix_model <- saemixModel(model = model1_nlme,
                            psi0 = c(ka=1, V=20, ke=0.5),
                            transform.par = c(1,0,1))
saemix_fit2 <- saemix(saemix_model, saemix_data, saemix_options)
saemix_fit2@results
Fixed effects
Parameter Estimate SE CV(%)
ka 1.5778 0.30230 19.16
V 32.4602 1.68263 5.18
ke 0.0881 0.00589 6.69
a.1 0.7324 0.05556 7.59
Variance of random effects
Parameter Estimate SE CV(%)
omega2.ka 0.391 0.1740 44.5
omega2.V 26.041 12.4119 47.7
omega2.ke 0.019 0.0196 102.6
Statistical criteria
Likelihood computed by linearisation
-2LL= 338.2846
AIC= 352.2846
BIC= 355.6789
Remark. Here, \omega^2_{ka} and \omega^2_{ke} are the variances of \log(k_{a_i}) and \log(k_{e_i}) while \omega^2_{V} is the variance of V_i.
4.1.3 Models with covariates
Let c_i = (c_{i1},c_{i2}, \ldots , c_{iL}) be a vector of individual covariates, i.e. a vector of additional individual variables available with the data. We may want to explain part of the variability of the non-observed individual parameters (\psi_i) using these covariates.
We will only consider linear models of the covariates. More precisely, assuming that h(\psi_i) is normally distributed, we will decompose h(\psi_i) into fixed and random effects:
h(\psi_i) = h(\psi_{\rm pop}) + \sum_{\ell=1}^L c_{i\ell}\beta_{\ell} + \eta_i
Remark. \psi_{\rm pop} is the typical value of \psi_i if the covariates c_{i1}, …, c_{iL} are zero for a typical individual of the population.
Let us consider a model where the volume V_i is normally distributed and is a linear function of the weight w_i:
V_i = \beta_0 + \beta \, w_i + \eta_{V,i}
Assuming that the weight of a typical individual of the population is w_{\rm pop}, the predicted volume for this individual is not \beta_0 but \beta_0 + \beta w_{\rm pop}.
If we use instead the centered weight w_i-w_{\rm pop}, we can now write the model as
V_i = V_{\rm pop} + \beta \, (w_i-w_{\rm pop}) + \eta_{V,i} \ .
Indeed, the predicted volume for a typical individual is now V_{\rm pop}.
Assume that we decide to use 70 kg as the typical weight in the theophylline study. The saemixData object now needs to include w_i-70:
theophylline$w70 <- theophylline$weight - 70
saemix_data <- saemixData(name.data = theophylline,
                          name.group = c("id"),
                          name.predictors = c("time"),
                          name.response = c("concentration"),
                          name.covariates = c("w70"))
Here, only the volume V is a function of the weight. The covariate model is therefore encoded as the vector c(0,1,0).
saemix_model <- saemixModel(model = model1_nlme,
                            psi0 = c(ka=1, V=20, ke=0.5),
                            transform.par = c(1,0,1),
                            covariate.model = c(0,1,0))
saemix_fit3 <- saemix(saemix_model, saemix_data, saemix_options)
saemix_fit3@results
Fixed effects
Parameter Estimate SE CV(%) p-value
ka 1.5958 0.313 19.59 -
V 32.7181 1.414 4.32 -
beta_w70(V) 0.3345 0.141 42.25 0.00898
ke 0.0871 0.006 6.89 -
a.1 0.7270 0.055 7.57 -
Variance of random effects
Parameter Estimate SE CV(%)
omega2.ka 0.4102 0.1820 44.4
omega2.V 16.1399 8.4093 52.1
omega2.ke 0.0227 0.0202 88.9
Statistical criteria
Likelihood computed by linearisation
-2LL= 333.2821
AIC= 349.2821
BIC= 353.1613
Here, \hat\beta_{w70} = 0.33 means that an increase of the weight of 1 kg leads to an increase of the predicted volume of 0.33 L.
The p-value of the test H_0: \ \beta_{w70}=0 versus H_1: \ \beta_{w70}\neq 0 is 0.01. We can therefore reject the null hypothesis and conclude that the predicted volume significantly increases with the weight.
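Equivalently, a rough 95% Wald confidence interval built from the estimate and standard error reported above excludes zero; this is a back-of-the-envelope check using the rounded values from the output:
0.3345 + c(-1.96, 1.96) * 0.141   # roughly (0.06, 0.61): 0 is excluded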
Imagine that we now use a log-normal distribution for the volume V_i. It is now the log-volume which is a linear function of the transformed weight.
We can assume, for instance, that the log-volume is a linear function of the centered log-weight:
\log(V_i) = \log(V_{\rm pop}) + \beta \, \log(w_i/w_{\rm pop}) + \eta_{V,i} \ .
Or, equivalently,
V_i = V_{\rm pop} \left(\frac{w_i}{w_{\rm pop}}\right)^{\beta} e^{\eta_{V,i}} \ .
We see that, using this model, the predicted volume for a typical individual is V_{\rm pop}.
The saemixData object now needs to include \log(w_i/70) as a covariate:
theophylline$lw70 <- log(theophylline$weight/70)
saemix_data <- saemixData(name.data = theophylline,
                          name.group = c("id"),
                          name.predictors = c("time"),
                          name.response = c("concentration"),
                          name.covariates = c("lw70"))
The covariate model is again encoded as the (row) vector c(0,1,0), but the transformation is now encoded as 1 for the three parameters:
saemix_model <- saemixModel(model = model1_nlme,
                            psi0 = c(ka=1, V=20, ke=0.5),
                            transform.par = c(1,1,1),
                            covariate.model = c(0,1,0))
saemix_fit4 <- saemix(saemix_model, saemix_data, saemix_options)
saemix_fit4@results
Fixed effects
Parameter Estimate SE CV(%) p-value
ka 1.5906 0.31021 19.50 -
V 32.5590 1.41765 4.35 -
beta_lw70(V) 0.7689 0.30123 39.18 0.00535
ke 0.0872 0.00601 6.89 -
a.1 0.7274 0.05502 7.56 -
Variance of random effects
Parameter Estimate SE CV(%)
omega2.ka 0.4061 0.18030 44.4
omega2.V 0.0153 0.00795 52.1
omega2.ke 0.0225 0.02007 89.2
Statistical criteria
Likelihood computed by linearisation
-2LL= 333.0968
AIC= 349.0968
BIC= 352.976
4.1.4 Correlations between random effects
Until now, the random effects were assumed to be uncorrelated, i.e. the variance-covariance matrix \Omega was a diagonal matrix (default covariance model for saemix).
Correlations between random effects can be introduced with the input argument covariance.model, a square matrix of size equal to the number of parameters in the model, giving the variance-covariance structure of the model: 1s correspond to estimated variances (on the diagonal) or covariances (off-diagonal elements). The structure of the matrix \Omega should be block-diagonal.
Consider, for instance, a model where k_a is fixed in the population, i.e. \omega_{k_a}=0 (and thus k_{a_i}={k_a}_{\text{pop}} for all i), and where \log(V) and \log(k_e) are correlated, i.e. \eta_V and \eta_{k_e} are correlated:
saemix_model <- saemixModel(model = model1_nlme,
                            psi0 = c(ka=1, V=20, ke=0.5),
                            transform.par = c(1,1,1),
                            covariate.model = t(c(0,1,0)),
                            covariance.model = matrix(c(0,0,0,
                                                        0,1,1,
                                                        0,1,1), nrow=3))
saemix_fit5 <- saemix(saemix_model, saemix_data, saemix_options)
saemix_fit5@results
Fixed effects
Parameter Estimate SE CV(%) p-value
ka 1.5403 0.13275 8.62 -
V 33.1993 1.72352 5.19 -
beta_lw70(V) 0.3234 0.30586 94.59 0.145
ke 0.0826 0.00982 11.89 -
a.1 1.0787 0.07779 7.21 -
Variance of random effects
Parameter Estimate SE CV(%)
omega2.V 0.0165 0.00962 58.5
omega2.ke 0.0911 0.05680 62.3
Correlation matrix of random effects
omega2.V omega2.ke
omega2.V 1.000 -0.254
omega2.ke -0.254 1.000
Statistical criteria
Likelihood computed by linearisation
-2LL= 392.5382
AIC= 408.5382
BIC= 412.4175