# Introduction to metapack

### Table of Contents

The metapack package contains two main functions interfacing C++ using Rcpp (Eddelbuettel and Balamuta 2017) and RcppArmadillo (Eddelbuettel and Sanderson 2014): bayes.parobs and bayes.nmr. bayes.parobs fits multivariate meta-regression models with a partially observed sample covariance matrix (Yao et al. 2015), and bayes.nmr fits the network meta-regression models. We refer the readers to the Appendix for the mathematical formulations of the models.

To use either of our functions, bayes.parobs and bayes.nmr, the data set must be preprocessed into the following format, with the only exception that bayes.nmr has ZCovariate instead of WCovariate to be consistent with the mathematical notation in the original paper.

Outcome SD XCovariate WCovariate Trial (k) Treat (t) Npt
$$\bar{y}_{\cdot 13}$$ $$S_{13}$$ $$\boldsymbol{x}_{13}^\prime$$ $$\boldsymbol{w}_{13}^\prime$$ 1 3 1000
$$\bar{y}_{\cdot 10}$$ $$S_{10}$$ $$\boldsymbol{x}_{10}^\prime$$ $$\boldsymbol{w}_{20}^\prime$$ 1 0 1000
$$\bar{y}_{\cdot 20}$$ $$S_{20}$$ $$\boldsymbol{x}_{20}^\prime$$ $$\boldsymbol{w}_{20}^\prime$$ 2 0 1000
$$\vdots$$ $$\vdots$$ $$\vdots$$ $$\vdots$$ $$\vdots$$ $$\vdots$$ $$\vdots$$

The functions requires the data to be formatted in a way that every row corresponds to a unique pair of $$(t,k)$$ which stands for $$t$$th treatment and $$k$$th trial. Therefore, $$\bar{y}_{\cdot tk}$$ and $$S_{tk}$$ are $$J$$-dimensional row vectors for bayes.parobs, and scalars for bayes.nmr.

## Fitting multivariate meta-regression with bayes.parobs

The variables included in the Table are required for bayes.parobs. The key argument in bayes.parobs is fmodel which specifies the model for the error covariance matrix. The objects enclosed in parentheses at the end of every bullet point are the hyperparameters associated with each model.

• fmodel=1 - $$\Sigma_{tk} = \mathrm{diag}(\sigma_{tk,11}^2,\ldots,\sigma_{tk,JJ}^2)$$ where $$\sigma_{tk,jj}^2 \sim \mathcal{IG}(a_0,b_0)$$ and $$\mathcal{IG}(a,b)$$ is the inverse-gamma distribution. This specification is useful if the user does not care about the correlation recovery. (c0, dj0, a0, b0, Omega0)
• fmodel=2 - $$\Sigma_{tk}=\Sigma$$ for every combination of $$(t,k)$$ and $$\Sigma^{-1}\sim\mathcal{W}_{s_0}(\Sigma_0)$$. This specification assumes that the user has prior knowledge that the correlation structure does not change across the arms included. (c0, dj0, s0, Omega0, Sigma0)
• fmodel=3 - $$\Sigma_{tk}=\Sigma_t$$ and $$\Sigma_t^{-1}\sim \mathcal{W}_{s_0}(\Sigma_0)$$. This is a relaxed version of fmodel=2, allowing the correlation structure to differ across trials but forcing it to stay identical within a trial. (c0, dj0, s0, Omega0, Sigma0)
• fmodel=4 - $$\Sigma_{tk}=\boldsymbol{\delta}_{tk}\boldsymbol{\rho}\boldsymbol{\delta}_{tk}$$ where $$\boldsymbol{\delta}_{tk}=\mathrm{diag}(\Sigma_{tk,11}^{1/2},\ldots,\Sigma_{tk,JJ}^{1/2})$$, and $$\boldsymbol{\rho}$$ is the correlation matrix. This specification allows the variances to vary across arms but requires that the correlations be the same. This is due to the lack of correlation information in the data, which would in turn lead to the nonidentifiability of the correlations if they were allowed to vary. However, this still is an ambitious model which permits maximal degrees of freedom in terms of variance and correlation estimation. (c0, dj0, a0, b0, Omega0)
• fmodel=5 - The fifth model is hierarchical and thus may require more data than the others: $$(\Sigma_{tk}^{-1}\mid \Sigma)\sim \mathcal{W}_{\nu_0}((\nu_0-J-1)^{-1}\Sigma^{-1})$$ and $$\Sigma \sim \mathcal{W}_{d_0}(\Sigma_0)$$. $$\Sigma_{tk}$$ encodes the within-treatment-arm variation while $$\Sigma$$ captures the between-treatment-arm variation. The hierarchical structure allows the “borrowing of strength” across treatment arms. (c0, dj0, d0, nu0, Sigma0, Omega0)

The remaining optional arguments are

• mcmc - a list for MCMC specification. ndiscard is the number of burn-in iterations. nskip configures the thinning of the MCMC. For instance, if nskip=5, bayes.parobs will save the posterior every 5 iterations. nkeep is the size of the posterior sample. The total number of iterations will be ndiscard + nskip * nkeep.
• group - a vector containing binary variables for $$u_{tk}$$. If not provided, bayes.parobs will assume that there is no grouping and set $$u_{tk}=0$$ for all $$(t,k)$$.
• prior - a list for hyperparameters. Despite $$\boldsymbol{\theta}$$ in every model, each fmodel, along with the group argument, requires a different set of hyperparameters. Refer to the itemized model specifications.
• init - a list of initial values for the parameters to be sampled: theta, gamR, Omega, and Rho. The initial value for Rho will be effective only if fmodel=4.
• control - a list of tuning parameters for the Metropolis-Hastings algorithm. Rho, R, and delta are sampled through either localized Metropolis algorithm or delayed rejection robust adaptive Metropolis algorithm. *_stepsize with the asterisk replaced with one of the names above specifies the stepsize for determining the sample evaluation points in the localized Metropolis algorithm. sample_Rho can be set to FALSE to suppress the sampling of Rho for fmodel=4. When sample_Rho is FALSE, $$\boldsymbol{\rho}$$ will be fixed using the value given by the init argument, which defaults to $$\boldsymbol{\rho} = 0.5\boldsymbol{I}+0.5\boldsymbol{1}\boldsymbol{1}^\prime$$ where $$\boldsymbol{1}$$ is the vector of ones.
• scale_x - a Boolean indicating whether XCovariate should be scaled/standardized. The effect of setting this to TRUE is not limited to merely standardizing XCovariate. The following generic functions will scale the posterior sample of theta back to its original unit: plot, fitted, summary, and print. That is, $$\theta_j \gets \theta_j/\mathrm{sd}(X_j^*)$$ where $$X_j^*$$ indicates the $$j$$th column of $$\boldsymbol{X}^*$$.
• Treat_order - a vector of unique treatments to be used for renumbering the Treat vector. The first element will be assigned treatment zero, potentially indicating placebo. If not provided, the numbering will default to an alphabetical/numerical order.
• Trial_order - a vector of unique trials. The first element will be assigned zero. If not provided, the numbering will default to an alphabetical/numerical order.
• group_order - a vector of unique group labels. The first element will be assigned zero. If not provided, the numbering will default to an alphabetical/numerical order
• verbose - a Boolean indicating whether to print the progress bar during the MCMC sampling.

#### Values

bayes.parobs returns

• Outcome - the aggregate response used in the function call.
• SD - the standard deviation used in the function call.
• Npt - the number of participants for (t,k) used in the function call.
• XCovariate - the aggregate design matrix for fixed effects used in the function call. Depending on scale_x, this may differ from the matrix provided at function call.
• WCovariate - the aggregate design matrix for random effects.
• Treat - the renumbered treatment indicators. Depending on Treat_order, it may differ from the vector provided at function call.
• Trial - the renumbered trial indicators. Depending on Trial_order, it may differ from the vector provided at function call.
• group - the renumbered grouping indicators in the function call. Depending on group_order, it may differ from the vector provided at function call. If group was missing at function call, bayes.parobs will assign NULL for group.
• TrtLabels - the vector of treatment labels corresponding to the renumbered Treat. This is equivalent to Treat_order if it was given at function call.
• TrialLabels - the vector of trial labels corresponding to the renumbered Trial. This is equivalent to Trial_order if it was given at function call.
• GroupLabels - the vector of group labels corresponding to the renumbered group. This is equivalent to group_order if it was given at function call. If group was missing at function call, bayes.parobs will assign NULL for GroupLabels.
• K - the total number of trials.
• T - the total number of treatments.
• fmodel - the model number as described here.
• scale_x - a Boolean indicating whether XCovariate has been scaled/standardized.
• prior - the list of hyperparameters used in the function call.
• control - the list of tuning parameters used for MCMC in the function call.
• mcmctime - the elapsed time for the MCMC algorithm in the function call. This does not include all the other preprocessing and post-processing outside of MCMC.
• mcmc - the list of MCMC specification used in the function call.
• mcmc.draws - the list containing the MCMC draws. The posterior sample will be accessible here.

#### Example

The following boilerplate code demonstrates how bayes.parobs can be used:

Rho_init <- diag(1, nrow = J)
Rho_init[upper.tri(Rho_init)] <- Rho_init[lower.tri(Rho_init)] <- 0.5
bayes.parobs(Outcome, SD, XCovariate, WCovariate, Trial, Treat, Npt,
fmodel = 4, prior = list(c0 = 1e6),
mcmc = list(ndiscard = 1, nskip = 1, nkeep = 1),
control = list(Rho_stepsize=0.05, R_stepsize=0.05),
group = Group,
scale_x = TRUE, verbose = TRUE)

fmodel can be a different number from 1 to 5.

## Fitting network meta-regression with bayes.nmr

Unlike bayes.parobs, the aggregate design matrix for random effects, ZCovariate, is optional. If missing, bayes.nmr will assign a vector of ones for ZCovariate, i.e., $$\boldsymbol{Z}_k = \boldsymbol{1}$$ for all $$k$$. This reduces the model to have $$\mathrm{Var}(\bar{y}_{tk}) = \sigma_{tk}^2/n_{tk} + \exp(2\phi)$$ for every $$(t,k)$$.

The remaining optional arguments are

• mcmc - a list for MCMC specification. ndiscard is the number of burn-in iterations. nskip configures the thinning of the MCMC. For instance, if nskip=5, bayes.nmr will save the posterior every 5 iterations. nkeep is the size of the posterior sample. The total number of iterations will be ndiscard + nskip * nkeep.
• prior - a list of hyperparameters. The hyperparameters include df, c01, c02, a4, b4, a5, and b5. df indicates the degrees of freedom whose default value is 20. The hyperparameters a* and b* will take effect only if sample_df=TRUE. See control.
• control - a list of tuning parameters for the Metropolis-Hastings algorithm. lambda, phi, and Rho are sampled through the localized Metropolis algorithm. *_stepsize with the asterisk replaced with one of the names above specifies the stepsize for determining the sample evaluation points in the localized Metropolis algorithm. sample_Rho can be set to FALSE to suppress the sampling of Rho. When sample_Rho is FALSE, $$\boldsymbol{\rho}$$ will instantiated using the value given by the init argument, which defaults to $$\boldsymbol{\rho} = 0.5\boldsymbol{I}+0.5\boldsymbol{1}\boldsymbol{1}^\prime$$ where $$\boldsymbol{1}$$ is the vector of ones. When sample_df is TRUE, $$\nu$$ (df) will be sampled.
• scale_x - a Boolean whether XCovariate should be scaled/standardized. The effect of setting this to TRUE is not limited to merely standardizing XCovariate. The following generic functions will scale the posterior sample of theta back to its original unit: plot, fitted, summary, and print. That is, $$\theta_j \gets \theta_j/\mathrm{sd}(X_j^*)$$ where $$X_j^*$$ indicates the $$j$$th column of $$\boldsymbol{X}^*$$.
• init - a list of initial values for the parameters to be sampled: theta, phi, sig2, and Rho.
• Treat_order - a vector of unique treatments to be used for renumbering the Treat vector. The first element will be assigned treatment zero, potentially indicating placebo. If not provided, the numbering will default to an alphabetical/numerical order.
• Trial_order - a vector of unique trials. The first element will be assigned trial zero. If not provided, the numbering will default to an alphabetical/numerical order.
• verbose - a Boolean indicating whether to print the progress bar during the MCMC sampling.

#### Values

bayes.nmr returns

• Outcome - the aggregate response used in the function call.
• SD - the standard deviation used in the function call.
• Npt - the number of participants for (t,k) used in the function call.
• XCovariate - the aggregate design matrix for fixed effects used in the function call. Depending on scale_x, this may differ from the matrix provided at function call.
• ZCovariate - the aggregate design matrix for random effects. bayes.nmr will assign rep(1, length(Outcome)) if it was not provided at function call.
• Trial - the renumbered trial indicators. Depending on Trial_order, it may differ from the vector provided at function call.
• Treat - the renumbered treatment indicators. Depending on Treat_order, it may differ from the vector provided at function call.
• TrtLabels - the vector of treatment labels corresponding to the renumbered Treat. This is equivalent to Treat_order if it was given at function call.
• TrialLabels - the vector of trial labels corresponding to the renumbered Trial. This is equivalent to Trial_order if it was given at function call.
• K - the total number of trials.
• nT - the total number of treatments.
• scale_x - a Boolean indicating whether XCovariate has been scaled/standardized.
• prior - the list of hyperparameters used in the function call.
• control - the list of tuning parameters used for MCMC in the function call.
• mcmctime - the elapsed time for the MCMC algorithm in the function call. This does not include all the other preprocessing and post-processing outside of MCMC.
• mcmc - the list of MCMC specification used in the function call.
• mcmc.draws - the list containing the MCMC draws. The posterior sample will be accessible here.

#### Example

The following boilerplate code demonstrates how bayes.nmr can be used:

fit <- bayes.nmr(Outcome, SD, XCovariate, ZCovariate, Trial, Treat, Npt,
prior = list(c01 = 1.0e05, c02 = 4, df = 3),
mcmc = list(ndiscard = 1, nskip = 1, nkeep = 1),
Treat_order = c("PBO", "S", "A", "L", "R", "P",
"E", "SE", "AE", "LE", "PE"),
scale_x = TRUE, verbose = TRUE)

## Appendix

#### Model descriptions for bayes.parobs

When there are multiple endpoints, the correlations thereof are oftentimes unreported. In a meta-regression setting, the correlations are something we want to know. In the case of subject-level meta-analyses (where individual participant/patient data are available), the correlations may only be one function call away. However, for study-level meta-analyses, the correlations must be estimated, which is not an easy task. bayes.parobs provides five different models to estimate and recover the missing correlations.

Since the aforementioned setting regards the correlations as missing, the reported data are $$(\overline{\boldsymbol{y}}_{\cdot tk}, \boldsymbol{s}_{tk}) \in \mathbf{R}^J \times \mathbf{R}^J$$ for the $$t$$th treatment arm and $$k$$th trial. Every trial out of $$K$$ trials includes $$T$$ treatment arms. The sample size of the $$t$$th treatment arm and $$k$$th trial is denoted by $$n_{tk}$$. If we write $$\boldsymbol{x}_{tkj}\in \mathbf{R}^{p_j}$$ to be the treatment-within-trial level regressor corresponding to the $$j$$th response, reflecting the fixed effects of the $$t$$th treatment arm, and $$\boldsymbol{w}_{tkj}\in \mathbf{R}^{q_j}$$ to be the same for the random effects, the model becomes

$\begin{equation}\label{eq:reduced-model} \overline{\boldsymbol{y}}_{\cdot tk} = \boldsymbol{X}_{tk}\boldsymbol{\beta} + \boldsymbol{W}_{tk}\boldsymbol{\gamma}_k + \overline{\boldsymbol{\epsilon}}_{\cdot tk}, \end{equation}$

and $$(n_{tk}-1)S_{tk} \sim \mathcal{W}_{n_{tk}-1}(\Sigma_{tk})$$ where $$\boldsymbol{X}_{tk} = \mathrm{blockdiag}(\boldsymbol{x}_{tk1}^\prime,\boldsymbol{x}_{tk2}^\prime,\ldots,\boldsymbol{x}_{tkJ}^\prime)$$, $$\boldsymbol{\beta} = (\boldsymbol{\beta}_1^\prime, \cdots, \boldsymbol{\beta}_J^\prime)^\prime$$, $$\boldsymbol{W}_{tk}=\mathrm{blockdiag}(\boldsymbol{w}_{tk1}^\prime, \ldots,\boldsymbol{w}_{tkJ}^\prime)$$, $$\boldsymbol{\gamma}_k = (\boldsymbol{\gamma}_{k1}^\prime,\boldsymbol{\gamma}_{k2}^\prime, \ldots,\boldsymbol{\gamma}_{kJ}^\prime)^\prime$$, $$\overline{\boldsymbol{\epsilon}}_{\cdot tk} \sim \mathcal{N}(\boldsymbol{0}, \Sigma_{tk}/n_{tk})$$, $$S_{tk}$$ is the full-rank covariance matrix whose diagonal entries are the observed $$\boldsymbol{s}_{tk}$$, and $$\mathcal{W}_\nu(\Sigma)$$ is the Wishart distribution with $$\nu$$ degrees of freedom and a $$J\times J$$ scale matrix $$\Sigma$$ whose density function is

$f(X\mid \nu,\Sigma) = \dfrac{1}{2^{J\nu}|\Sigma|^{\nu/2}\Gamma_J(\nu/2)}|X|^{(\nu-J-1)/2}\exp\left(-\dfrac{1}{2}\mathrm{tr}(\Sigma^{-1}X) \right)I(X\in \mathcal{S}_{++}^J),$

where $$\Gamma_p$$ is the multivariate gamma function defined by

$\Gamma_p(z) = \pi^{p(p-1)/4}\prod_{j=1}^p \Gamma[z+(1-j)/2],$

and $$\mathcal{S}_{++}^J$$ is the space of $$J\times J$$ symmetric positive definite matrices. The statistical independence of $$\overline{\boldsymbol{\epsilon}}_{\cdot tk}$$ and $$S_{tk}$$ follows naturally from the Basu’s theorem.

The patients can sometimes be grouped by a factor that will generate disparate random effects. Although an arbitrary number of groups can exist in theory, metapack restricts the number of groups to two for practicality. Denoting the binary group indicates by $$u_{tk}$$ yields

$\overline{y}_{\cdot tkj} = \boldsymbol{x}_{tkj}^\prime\boldsymbol{\beta} + (1-u_{tk})\boldsymbol{w}_{tkj}^\prime \boldsymbol{\gamma}_{kj}^0 + u_{tk}\boldsymbol{w}_{tkj}^\prime \boldsymbol{\gamma}_{kj}^1 + \overline{\epsilon}_{\cdot tkj}.$

The random effects are modeled as $$\boldsymbol{\gamma}_{kj}^l \overset{\text{ind}}{\sim}\mathcal{N}(\boldsymbol{\gamma}_j^{l*},\Omega_j^l)$$ and $$(\Omega_j^l)^{-1} \sim \mathcal{W}_{d_{0j}}(\Omega_{0j})$$. Stacking the vectors, $$\boldsymbol{\gamma}_k^l = ((\boldsymbol{\gamma}_{k1}^l)^\prime, \ldots, (\boldsymbol{\gamma}_{kJ}^l)^\prime)^\prime \sim \mathcal{N}(\boldsymbol{\gamma}^{l*},\Omega^l)$$ where $$\boldsymbol{\gamma}^{l*} = ((\boldsymbol{\gamma}_{1}^{l*})^\prime,\ldots,(\boldsymbol{\gamma}_{J}^{l*})^\prime)^\prime$$, $$\Omega_j = \mathrm{blockdiag}(\Omega_j^0,\Omega_j^l)$$, and $$\Omega = \mathrm{blockdiag}(\Omega_1,\ldots,\Omega_J)$$ for $$l \in \{0,1\}$$. Adopting the non-centered reparametrization (Bernardo et al. 2003), define $$\boldsymbol{\gamma}_{k,o}^l = \boldsymbol{\gamma}_k^l - \boldsymbol{\gamma}^{l*}$$. Denoting $$\boldsymbol{W}_{tk}^* = [(1-u_{tk})\boldsymbol{W}_{tk}, u_{tk}\boldsymbol{W}_{tk}]$$, $$\boldsymbol{X}_{tk}^* = [\boldsymbol{X}_{tk},\boldsymbol{W}_{tk}^*]$$, and $$\boldsymbol{\theta} = (\boldsymbol{\beta}^\prime, {\boldsymbol{\gamma}^{0*}}^\prime, {\boldsymbol{\gamma}^{1*}}^\prime)^\prime$$, the model is written as follows:

$\overline{\boldsymbol{y}}_{\cdot tk} = \boldsymbol{X}_{tk}^*\boldsymbol{\theta}+\boldsymbol{W}_{tk}^*\boldsymbol{\gamma}_{k,o} + \overline{\boldsymbol{\epsilon}}_{\cdot tk},$ where $$\boldsymbol{\gamma}_{k,o} = ((\boldsymbol{\gamma}_{k,o}^0)^\prime, (\boldsymbol{\gamma}_{k,o}^1)^\prime)^\prime$$. If there is no grouping in the patients, setting $$u_{tk}=0$$ for all $$(t,k)$$ reduces the model back to $$\eqref{eq:reduced-model}$$.

The conditional distribution of $$(R_{tk} \mid V_{tk}, \Sigma_{tk})$$ where $$R_{tk} = V_{tk}^{-\frac{1}{2}}S_{tk}V_{tk}^{-\frac{1}{2}}$$ and $$V_{tk} = \mathrm{diag}(S_{tk11},\ldots,S_{tkJJ})$$ becomes

$f(R_{tk}\mid V_{tk},\Sigma_{tk}) \propto |R_{tk}|^{(n_{tk}-J-2)/2}\exp\left\{-\dfrac{(n_{tk}-1)}{2}\mathrm{tr}\left(V_{tk}^{\frac{1}{2}}\Sigma_{tk}^{-1}V_{tk}^{\frac{1}{2}}R_{tk} \right) \right\}.$

#### Model descriptions for bayes.nmr

Network meta-analysis is an extension of meta-analysis where more than two treatments are compared. Unlike the traditional meta-analyses that restrict the number of treatments to be equal across trials, network meta-analysis allows varying numbers of treatments. This achieves a unique benefit that two treatments that have not been compared head-to-head can be assessed as a pair.

Start by denoting the comprehensive list of treatments in all $$K$$ trials by $$\mathcal{T}=\{1,\ldots,T\}$$. It is rarely the case that all $$T$$ treatments are included in the data but we drop the subscripts $$t_k$$ and replace it with $$t$$ for notational simplicity. Now, consider the model

$\begin{equation}\label{eq:nmr-basic} \bar{y}_{\cdot tk} = \boldsymbol{x}_{tk}^\prime\boldsymbol{\beta} + \tau_{tk}\gamma_{tk} + \bar{\epsilon}_{\cdot tk}, \quad \bar{\epsilon}_{\cdot tk} \sim \mathcal{N}(0,\sigma_{tk}^2/n_{tk}), \end{equation}$ where $$\bar{y}_{\cdot tk}$$ is the univariate aggregate response of the $$k$$th trial for which treatment $$t$$ was assigned, $$\boldsymbol{x}_{tk}$$ is the aggregate covariate vector for the fixed effects, and $$\gamma_{tk}$$ is the random effects term. The observed standard deviation, $$s_{tk}^2$$ is modeled by

$\dfrac{(n_{tk}-1)s_{tk}^2}{\sigma_{tk}^2} \sim \chi_{n_{tk}-1}^2.$

$$\tau_{tk}$$ in Equation $$\eqref{eq:nmr-basic}$$ encapsulates the variance of the random effect for the $$t$$th treatment in the $$k$$th trial, which is modeled as a deterministic function of a related covariate. That is,

$\log \tau_{tk} = \boldsymbol{z}_{tk}^\prime\boldsymbol{\phi},$

where $$\boldsymbol{z}_{tk}$$ is the $$q$$-dimensional aggregate covariate vector and $$\boldsymbol{\phi}$$ is the corresponding coefficient vector.

For the $$k$$th trial, we define a selection/projection matrix $$E_k = (e_{t_{1k}},e_{t_{2k}},\ldots, e_{t_{T_k k}})$$, where $$e_{t_{lk}} = (0,\ldots,1,\ldots,0)^\prime$$, $$l=1,\ldots,T_k$$, with $$t_{lk}$$th element set to 1 and 0 otherwise, and $$T_k$$ is the number of treatments included in the $$k$$th trial. Let the scaled random effects $$\boldsymbol{\gamma}_k = (\gamma_{1k},\ldots,\gamma_{Tk})^\prime$$. Then, $$\boldsymbol{\gamma}_{k,o}=E_k^\prime\boldsymbol{\gamma}_k$$ is the vector of $$T_k$$-dimensional scaled random effects for the $$k$$th trial. The scaled random effects $$\boldsymbol{\gamma}_k \sim t_T(\boldsymbol{\gamma},\boldsymbol{\rho},\nu)$$ where $$t_T(\boldsymbol{\mu},\Sigma,\nu)$$ denotes a multivariate $$t$$ distribution with $$\nu$$ degrees of freedom, a location parameter vector $$\boldsymbol{\mu}$$, and a scale matrix $$\Sigma$$.

The non-centered reparametrization (Bernardo et al. 2003) gives $$\boldsymbol{\gamma}_{k,o} = E_k^\prime(\boldsymbol{\gamma}_k - \boldsymbol{\gamma})$$. Then, with $$\bar{\boldsymbol{y}}_k = (\bar{y}_{kt_{k1}},\ldots,\bar{y}_{kt_{kT_k}})^\prime$$, $$\boldsymbol{X}_k = (\boldsymbol{x}_{kt_{k1}},\ldots, \boldsymbol{x}_{kt_{kT_k}})$$, and $$\boldsymbol{Z}_k(\boldsymbol{\phi}) = \mathrm{diag}(\exp(\boldsymbol{z}_{kt_{k1}}^\prime \boldsymbol{\phi}),\ldots, \exp(\boldsymbol{z}_{kt_{kT_k}}^\prime \boldsymbol{\phi}))$$, the model is recast as

$\bar{\boldsymbol{y}}_k = \boldsymbol{X}_k^* \boldsymbol{\theta} + \boldsymbol{Z}_k(\boldsymbol{\phi}) \boldsymbol{\gamma}_{k,o} + \bar{\boldsymbol{\epsilon}}_k,$ where $$\boldsymbol{X}_k^* = (\boldsymbol{X}_k, E_k^\prime)$$, $$\boldsymbol{\theta} = (\boldsymbol{\beta}^\prime, \boldsymbol{\gamma}^\prime)^\prime$$, and $$\bar{\boldsymbol{\epsilon}}_k \sim \mathcal{N}_{T_k}(\boldsymbol{0},\Sigma_k)$$, $$\Sigma_k = \mathrm{diag}(\sigma_{kt_{k1}}^2/n_{kt_{k1}}, \ldots, \sigma_{kt_{kT_k}}^2/n_{kt_{kT_k}})$$. This allows the random effects $$\boldsymbol{\gamma}_{k,o} \sim t_{T_k}(\boldsymbol{0},E_k^\prime \boldsymbol{\rho}E_k, \nu)$$ to be centered at zero.

Since the multivariate $$t$$ random effects are not analytically marginalizable, we represent it as a scale mixture of normals as follows:

$(\boldsymbol{\gamma}_{k,o}\mid \lambda_k) \overset{\text{ind}}{\sim} \mathcal{N}_{T_k}\left(\boldsymbol{0}, \lambda_k^{-1}(E_k^\prime\boldsymbol{\rho}E_k) \right), \quad \lambda_k \overset{\text{iid}}{\sim}\mathcal{G}a\left(\dfrac{\nu}{2},\dfrac{\nu}{2} \right),$ where $$\mathcal{G}a(a,b)$$ indicates the gamma distribution with mean $$a/b$$ and variance $$a/b^2$$.