The **metapack** package contains two main functions interfacing C++ using **Rcpp** (Eddelbuettel and Balamuta 2017) and **RcppArmadillo** (Eddelbuettel and Sanderson 2014): `bayes.parobs`

and `bayes.nmr`

. `bayes.parobs`

fits multivariate meta-regression models with a partially observed sample covariance matrix (Yao et al. 2015), and `bayes.nmr`

fits the network meta-regression models. We refer the readers to the Appendix for the mathematical formulations of the models.

To use either of our functions, `bayes.parobs`

and `bayes.nmr`

, the data set must be preprocessed into the following format, with the only exception that `bayes.nmr`

has `ZCovariate`

instead of `WCovariate`

to be consistent with the mathematical notation in the original paper.

Outcome | SD | XCovariate | WCovariate | Trial (`k` ) |
Treat (`t` ) |
Npt |
---|---|---|---|---|---|---|

\(\bar{y}_{\cdot 13}\) | \(S_{13}\) | \(\boldsymbol{x}_{13}^\prime\) | \(\boldsymbol{w}_{13}^\prime\) | 1 | 3 | 1000 |

\(\bar{y}_{\cdot 10}\) | \(S_{10}\) | \(\boldsymbol{x}_{10}^\prime\) | \(\boldsymbol{w}_{20}^\prime\) | 1 | 0 | 1000 |

\(\bar{y}_{\cdot 20}\) | \(S_{20}\) | \(\boldsymbol{x}_{20}^\prime\) | \(\boldsymbol{w}_{20}^\prime\) | 2 | 0 | 1000 |

\(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) |

The functions requires the data to be formatted in a way that every row corresponds to a unique pair of \((t,k)\) which stands for \(t\)th treatment and \(k\)th trial. Therefore, \(\bar{y}_{\cdot tk}\) and \(S_{tk}\) are \(J\)-dimensional row vectors for `bayes.parobs`

, and scalars for `bayes.nmr`

.

`bayes.parobs`

The variables included in the Table are required for `bayes.parobs`

. The key argument in `bayes.parobs`

is `fmodel`

which specifies the model for the error covariance matrix. The objects enclosed in parentheses at the end of every bullet point are the hyperparameters associated with each model.

`fmodel=1`

- \(\Sigma_{tk} = \mathrm{diag}(\sigma_{tk,11}^2,\ldots,\sigma_{tk,JJ}^2)\) where \(\sigma_{tk,jj}^2 \sim \mathcal{IG}(a_0,b_0)\) and \(\mathcal{IG}(a,b)\) is the inverse-gamma distribution. This specification is useful if the user does not care about the correlation recovery. (`c0`

,`dj0`

,`a0`

,`b0`

,`Omega0`

)`fmodel=2`

- \(\Sigma_{tk}=\Sigma\) for every combination of \((t,k)\) and \(\Sigma^{-1}\sim\mathcal{W}_{s_0}(\Sigma_0)\). This specification assumes that the user has prior knowledge that the correlation structure does not change across the arms included. (`c0`

,`dj0`

,`s0`

,`Omega0`

,`Sigma0`

)`fmodel=3`

- \(\Sigma_{tk}=\Sigma_t\) and \(\Sigma_t^{-1}\sim \mathcal{W}_{s_0}(\Sigma_0)\). This is a relaxed version of`fmodel=2`

, allowing the correlation structure to differ across trials but forcing it to stay identical within a trial. (`c0`

,`dj0`

,`s0`

,`Omega0`

,`Sigma0`

)`fmodel=4`

- \(\Sigma_{tk}=\boldsymbol{\delta}_{tk}\boldsymbol{\rho}\boldsymbol{\delta}_{tk}\) where \(\boldsymbol{\delta}_{tk}=\mathrm{diag}(\Sigma_{tk,11}^{1/2},\ldots,\Sigma_{tk,JJ}^{1/2})\), and \(\boldsymbol{\rho}\) is the correlation matrix. This specification allows the variances to vary across arms but requires that the correlations be the same. This is due to the lack of correlation information in the data, which would in turn lead to the nonidentifiability of the correlations if they were allowed to vary. However, this still is an ambitious model which permits maximal degrees of freedom in terms of variance and correlation estimation. (`c0`

,`dj0`

,`a0`

,`b0`

,`Omega0`

)`fmodel=5`

- The fifth model is hierarchical and thus may require more data than the others: \((\Sigma_{tk}^{-1}\mid \Sigma)\sim \mathcal{W}_{\nu_0}((\nu_0-J-1)^{-1}\Sigma^{-1})\) and \(\Sigma \sim \mathcal{W}_{d_0}(\Sigma_0)\). \(\Sigma_{tk}\) encodes the within-treatment-arm variation while \(\Sigma\) captures the between-treatment-arm variation. The hierarchical structure allows the “borrowing of strength” across treatment arms. (`c0`

,`dj0`

,`d0`

,`nu0`

,`Sigma0`

,`Omega0`

)

The remaining optional arguments are

`mcmc`

- a list for MCMC specification.`ndiscard`

is the number of burn-in iterations.`nskip`

configures the thinning of the MCMC. For instance, if`nskip=5`

,`bayes.parobs`

will save the posterior every 5 iterations.`nkeep`

is the size of the posterior sample. The total number of iterations will be`ndiscard + nskip * nkeep`

.`group`

- a vector containing binary variables for \(u_{tk}\). If not provided,`bayes.parobs`

will assume that there is no grouping and set \(u_{tk}=0\) for all \((t,k)\).`prior`

- a list for hyperparameters. Despite \(\boldsymbol{\theta}\) in every model, each`fmodel`

, along with the`group`

argument, requires a different set of hyperparameters. Refer to the itemized model specifications.`init`

- a list of initial values for the parameters to be sampled:`theta`

,`gamR`

,`Omega`

, and`Rho`

. The initial value for`Rho`

will be effective only if`fmodel=4`

.`control`

- a list of tuning parameters for the Metropolis-Hastings algorithm.`Rho`

,`R`

, and`delta`

are sampled through either localized Metropolis algorithm or delayed rejection robust adaptive Metropolis algorithm.`*_stepsize`

with the asterisk replaced with one of the names above specifies the stepsize for determining the sample evaluation points in the localized Metropolis algorithm.`sample_Rho`

can be set to`FALSE`

to suppress the sampling of`Rho`

for`fmodel=4`

. When`sample_Rho`

is`FALSE`

, \(\boldsymbol{\rho}\) will be fixed using the value given by the`init`

argument, which defaults to \(\boldsymbol{\rho} = 0.5\boldsymbol{I}+0.5\boldsymbol{1}\boldsymbol{1}^\prime\) where \(\boldsymbol{1}\) is the vector of ones.`scale_x`

- a Boolean indicating whether`XCovariate`

should be scaled/standardized. The effect of setting this to`TRUE`

is not limited to merely standardizing`XCovariate`

. The following generic functions will scale the posterior sample of`theta`

back to its original unit:`plot`

,`fitted`

,`summary`

, and`print`

. That is, \(\theta_j \gets \theta_j/\mathrm{sd}(X_j^*)\) where \(X_j^*\) indicates the \(j\)th column of \(\boldsymbol{X}^*\).`Treat_order`

- a vector of unique treatments to be used for renumbering the`Treat`

vector. The first element will be assigned treatment zero, potentially indicating placebo. If not provided, the numbering will default to an alphabetical/numerical order.`Trial_order`

- a vector of unique trials. The first element will be assigned zero. If not provided, the numbering will default to an alphabetical/numerical order.`group_order`

- a vector of unique group labels. The first element will be assigned zero. If not provided, the numbering will default to an alphabetical/numerical order`verbose`

- a Boolean indicating whether to print the progress bar during the MCMC sampling.

`bayes.parobs`

returns

`Outcome`

- the aggregate response used in the function call.`SD`

- the standard deviation used in the function call.`Npt`

- the number of participants for`(t,k)`

used in the function call.`XCovariate`

- the aggregate design matrix for fixed effects used in the function call. Depending on`scale_x`

, this may differ from the matrix provided at function call.`WCovariate`

- the aggregate design matrix for random effects.`Treat`

- the*renumbered*treatment indicators. Depending on`Treat_order`

, it may differ from the vector provided at function call.`Trial`

- the*renumbered*trial indicators. Depending on`Trial_order`

, it may differ from the vector provided at function call.`group`

- the*renumbered*grouping indicators in the function call. Depending on`group_order`

, it may differ from the vector provided at function call. If`group`

was missing at function call,`bayes.parobs`

will assign`NULL`

for`group`

.`TrtLabels`

- the vector of treatment labels corresponding to the renumbered`Treat`

. This is equivalent to`Treat_order`

if it was given at function call.`TrialLabels`

- the vector of trial labels corresponding to the renumbered`Trial`

. This is equivalent to`Trial_order`

if it was given at function call.`GroupLabels`

- the vector of group labels corresponding to the renumbered`group`

. This is equivalent to`group_order`

if it was given at function call. If`group`

was missing at function call,`bayes.parobs`

will assign`NULL`

for`GroupLabels`

.`K`

- the total number of trials.`T`

- the total number of treatments.`fmodel`

- the model number as described here.`scale_x`

- a Boolean indicating whether`XCovariate`

has been scaled/standardized.`prior`

- the list of hyperparameters used in the function call.`control`

- the list of tuning parameters used for MCMC in the function call.`mcmctime`

- the elapsed time for the MCMC algorithm in the function call. This does not include all the other preprocessing and post-processing outside of MCMC.`mcmc`

- the list of MCMC specification used in the function call.`mcmc.draws`

- the list containing the MCMC draws. The posterior sample will be accessible here.

The following boilerplate code demonstrates how `bayes.parobs`

can be used:

```
Rho_init <- diag(1, nrow = J)
Rho_init[upper.tri(Rho_init)] <- Rho_init[lower.tri(Rho_init)] <- 0.5
bayes.parobs(Outcome, SD, XCovariate, WCovariate, Trial, Treat, Npt,
fmodel = 4, prior = list(c0 = 1e6),
mcmc = list(ndiscard = 1, nskip = 1, nkeep = 1),
control = list(Rho_stepsize=0.05, R_stepsize=0.05),
group = Group,
scale_x = TRUE, verbose = TRUE)
```

`fmodel`

can be a different number from 1 to 5.

`bayes.nmr`

Unlike `bayes.parobs`

, the aggregate design matrix for random effects, `ZCovariate`

, is optional. If missing, `bayes.nmr`

will assign a vector of ones for `ZCovariate`

, i.e., \(\boldsymbol{Z}_k = \boldsymbol{1}\) for all \(k\). This reduces the model to have \(\mathrm{Var}(\bar{y}_{tk}) = \sigma_{tk}^2/n_{tk} + \exp(2\phi)\) for every \((t,k)\).

The remaining optional arguments are

`mcmc`

- a list for MCMC specification.`ndiscard`

is the number of burn-in iterations.`nskip`

configures the thinning of the MCMC. For instance, if`nskip=5`

,`bayes.nmr`

will save the posterior every 5 iterations.`nkeep`

is the size of the posterior sample. The total number of iterations will be`ndiscard + nskip * nkeep`

.`prior`

- a list of hyperparameters. The hyperparameters include`df`

,`c01`

,`c02`

,`a4`

,`b4`

,`a5`

, and`b5`

.`df`

indicates the degrees of freedom whose default value is 20. The hyperparameters`a*`

and`b*`

will take effect only if`sample_df=TRUE`

. See`control`

.`control`

- a list of tuning parameters for the Metropolis-Hastings algorithm.`lambda`

,`phi`

, and`Rho`

are sampled through the localized Metropolis algorithm.`*_stepsize`

with the asterisk replaced with one of the names above specifies the stepsize for determining the sample evaluation points in the localized Metropolis algorithm.`sample_Rho`

can be set to`FALSE`

to suppress the sampling of`Rho`

. When`sample_Rho`

is`FALSE`

, \(\boldsymbol{\rho}\) will instantiated using the value given by the`init`

argument, which defaults to \(\boldsymbol{\rho} = 0.5\boldsymbol{I}+0.5\boldsymbol{1}\boldsymbol{1}^\prime\) where \(\boldsymbol{1}\) is the vector of ones. When`sample_df`

is`TRUE`

, \(\nu\) (`df`

) will be sampled.`scale_x`

- a Boolean whether`XCovariate`

should be scaled/standardized. The effect of setting this to`TRUE`

is not limited to merely standardizing`XCovariate`

. The following generic functions will scale the posterior sample of`theta`

back to its original unit:`plot`

,`fitted`

,`summary`

, and`print`

. That is, \(\theta_j \gets \theta_j/\mathrm{sd}(X_j^*)\) where \(X_j^*\) indicates the \(j\)th column of \(\boldsymbol{X}^*\).`init`

- a list of initial values for the parameters to be sampled:`theta`

,`phi`

,`sig2`

, and`Rho`

.`Treat_order`

- a vector of unique treatments to be used for renumbering the`Treat`

vector. The first element will be assigned treatment zero, potentially indicating placebo. If not provided, the numbering will default to an alphabetical/numerical order.`Trial_order`

- a vector of unique trials. The first element will be assigned trial zero. If not provided, the numbering will default to an alphabetical/numerical order.`verbose`

- a Boolean indicating whether to print the progress bar during the MCMC sampling.

`bayes.nmr`

returns

`Outcome`

- the aggregate response used in the function call.`SD`

- the standard deviation used in the function call.`Npt`

- the number of participants for`(t,k)`

used in the function call.`XCovariate`

- the aggregate design matrix for fixed effects used in the function call. Depending on`scale_x`

, this may differ from the matrix provided at function call.`ZCovariate`

- the aggregate design matrix for random effects.`bayes.nmr`

will assign`rep(1, length(Outcome))`

if it was not provided at function call.`Trial`

- the*renumbered*trial indicators. Depending on`Trial_order`

, it may differ from the vector provided at function call.`Treat`

- the*renumbered*treatment indicators. Depending on`Treat_order`

, it may differ from the vector provided at function call.`TrtLabels`

- the vector of treatment labels corresponding to the renumbered`Treat`

. This is equivalent to`Treat_order`

if it was given at function call.`TrialLabels`

- the vector of trial labels corresponding to the renumbered`Trial`

. This is equivalent to`Trial_order`

if it was given at function call.`K`

- the total number of trials.`nT`

- the total number of treatments.`scale_x`

- a Boolean indicating whether`XCovariate`

has been scaled/standardized.`prior`

- the list of hyperparameters used in the function call.`control`

- the list of tuning parameters used for MCMC in the function call.`mcmctime`

- the elapsed time for the MCMC algorithm in the function call. This does not include all the other preprocessing and post-processing outside of MCMC.`mcmc`

- the list of MCMC specification used in the function call.`mcmc.draws`

- the list containing the MCMC draws. The posterior sample will be accessible here.

The following boilerplate code demonstrates how `bayes.nmr`

can be used:

`bayes.parobs`

When there are multiple endpoints, the correlations thereof are oftentimes unreported. In a meta-regression setting, the correlations are something we want to know. In the case of subject-level meta-analyses (where individual participant/patient data are available), the correlations may only be one function call away. However, for study-level meta-analyses, the correlations must be estimated, which is not an easy task. `bayes.parobs`

provides five different models to estimate and recover the missing correlations.

Since the aforementioned setting regards the correlations as missing, the reported data are \((\overline{\boldsymbol{y}}_{\cdot tk}, \boldsymbol{s}_{tk}) \in \mathbf{R}^J \times \mathbf{R}^J\) for the \(t\)th treatment arm and \(k\)th trial. Every trial out of \(K\) trials includes \(T\) treatment arms. The sample size of the \(t\)th treatment arm and \(k\)th trial is denoted by \(n_{tk}\). If we write \(\boldsymbol{x}_{tkj}\in \mathbf{R}^{p_j}\) to be the treatment-within-trial level regressor corresponding to the \(j\)th response, reflecting the fixed effects of the \(t\)th treatment arm, and \(\boldsymbol{w}_{tkj}\in \mathbf{R}^{q_j}\) to be the same for the random effects, the model becomes

\[\begin{equation}\label{eq:reduced-model} \overline{\boldsymbol{y}}_{\cdot tk} = \boldsymbol{X}_{tk}\boldsymbol{\beta} + \boldsymbol{W}_{tk}\boldsymbol{\gamma}_k + \overline{\boldsymbol{\epsilon}}_{\cdot tk}, \end{equation}\]

and \((n_{tk}-1)S_{tk} \sim \mathcal{W}_{n_{tk}-1}(\Sigma_{tk})\) where \(\boldsymbol{X}_{tk} = \mathrm{blockdiag}(\boldsymbol{x}_{tk1}^\prime,\boldsymbol{x}_{tk2}^\prime,\ldots,\boldsymbol{x}_{tkJ}^\prime)\), \(\boldsymbol{\beta} = (\boldsymbol{\beta}_1^\prime, \cdots, \boldsymbol{\beta}_J^\prime)^\prime\), \(\boldsymbol{W}_{tk}=\mathrm{blockdiag}(\boldsymbol{w}_{tk1}^\prime, \ldots,\boldsymbol{w}_{tkJ}^\prime)\), \(\boldsymbol{\gamma}_k = (\boldsymbol{\gamma}_{k1}^\prime,\boldsymbol{\gamma}_{k2}^\prime, \ldots,\boldsymbol{\gamma}_{kJ}^\prime)^\prime\), \(\overline{\boldsymbol{\epsilon}}_{\cdot tk} \sim \mathcal{N}(\boldsymbol{0}, \Sigma_{tk}/n_{tk})\), \(S_{tk}\) is the full-rank covariance matrix whose diagonal entries are the observed \(\boldsymbol{s}_{tk}\), and \(\mathcal{W}_\nu(\Sigma)\) is the Wishart distribution with \(\nu\) degrees of freedom and a \(J\times J\) scale matrix \(\Sigma\) whose density function is

\[ f(X\mid \nu,\Sigma) = \dfrac{1}{2^{J\nu}|\Sigma|^{\nu/2}\Gamma_J(\nu/2)}|X|^{(\nu-J-1)/2}\exp\left(-\dfrac{1}{2}\mathrm{tr}(\Sigma^{-1}X) \right)I(X\in \mathcal{S}_{++}^J), \]

where \(\Gamma_p\) is the multivariate gamma function defined by

\[ \Gamma_p(z) = \pi^{p(p-1)/4}\prod_{j=1}^p \Gamma[z+(1-j)/2], \]

and \(\mathcal{S}_{++}^J\) is the space of \(J\times J\) symmetric positive definite matrices. The statistical independence of \(\overline{\boldsymbol{\epsilon}}_{\cdot tk}\) and \(S_{tk}\) follows naturally from the Basu’s theorem.

The patients can sometimes be grouped by a factor that will generate disparate random effects. Although an arbitrary number of groups can exist in theory, **metapack** restricts the number of groups to two for practicality. Denoting the binary group indicates by \(u_{tk}\) yields

\[ \overline{y}_{\cdot tkj} = \boldsymbol{x}_{tkj}^\prime\boldsymbol{\beta} + (1-u_{tk})\boldsymbol{w}_{tkj}^\prime \boldsymbol{\gamma}_{kj}^0 + u_{tk}\boldsymbol{w}_{tkj}^\prime \boldsymbol{\gamma}_{kj}^1 + \overline{\epsilon}_{\cdot tkj}. \]

The random effects are modeled as \(\boldsymbol{\gamma}_{kj}^l \overset{\text{ind}}{\sim}\mathcal{N}(\boldsymbol{\gamma}_j^{l*},\Omega_j^l)\) and \((\Omega_j^l)^{-1} \sim \mathcal{W}_{d_{0j}}(\Omega_{0j})\). Stacking the vectors, \(\boldsymbol{\gamma}_k^l = ((\boldsymbol{\gamma}_{k1}^l)^\prime, \ldots, (\boldsymbol{\gamma}_{kJ}^l)^\prime)^\prime \sim \mathcal{N}(\boldsymbol{\gamma}^{l*},\Omega^l)\) where \(\boldsymbol{\gamma}^{l*} = ((\boldsymbol{\gamma}_{1}^{l*})^\prime,\ldots,(\boldsymbol{\gamma}_{J}^{l*})^\prime)^\prime\), \(\Omega_j = \mathrm{blockdiag}(\Omega_j^0,\Omega_j^l)\), and \(\Omega = \mathrm{blockdiag}(\Omega_1,\ldots,\Omega_J)\) for \(l \in \{0,1\}\). Adopting the non-centered reparametrization (Bernardo et al. 2003), define \(\boldsymbol{\gamma}_{k,o}^l = \boldsymbol{\gamma}_k^l - \boldsymbol{\gamma}^{l*}\). Denoting \(\boldsymbol{W}_{tk}^* = [(1-u_{tk})\boldsymbol{W}_{tk}, u_{tk}\boldsymbol{W}_{tk}]\), \(\boldsymbol{X}_{tk}^* = [\boldsymbol{X}_{tk},\boldsymbol{W}_{tk}^*]\), and \(\boldsymbol{\theta} = (\boldsymbol{\beta}^\prime, {\boldsymbol{\gamma}^{0*}}^\prime, {\boldsymbol{\gamma}^{1*}}^\prime)^\prime\), the model is written as follows:

\[ \overline{\boldsymbol{y}}_{\cdot tk} = \boldsymbol{X}_{tk}^*\boldsymbol{\theta}+\boldsymbol{W}_{tk}^*\boldsymbol{\gamma}_{k,o} + \overline{\boldsymbol{\epsilon}}_{\cdot tk}, \] where \(\boldsymbol{\gamma}_{k,o} = ((\boldsymbol{\gamma}_{k,o}^0)^\prime, (\boldsymbol{\gamma}_{k,o}^1)^\prime)^\prime\). If there is no grouping in the patients, setting \(u_{tk}=0\) for all \((t,k)\) reduces the model back to \(\eqref{eq:reduced-model}\).

The conditional distribution of \((R_{tk} \mid V_{tk}, \Sigma_{tk})\) where \(R_{tk} = V_{tk}^{-\frac{1}{2}}S_{tk}V_{tk}^{-\frac{1}{2}}\) and \(V_{tk} = \mathrm{diag}(S_{tk11},\ldots,S_{tkJJ})\) becomes

\[ f(R_{tk}\mid V_{tk},\Sigma_{tk}) \propto |R_{tk}|^{(n_{tk}-J-2)/2}\exp\left\{-\dfrac{(n_{tk}-1)}{2}\mathrm{tr}\left(V_{tk}^{\frac{1}{2}}\Sigma_{tk}^{-1}V_{tk}^{\frac{1}{2}}R_{tk} \right) \right\}. \]

`bayes.nmr`

Network meta-analysis is an extension of meta-analysis where more than two treatments are compared. Unlike the traditional meta-analyses that restrict the number of treatments to be equal across trials, network meta-analysis allows varying numbers of treatments. This achieves a unique benefit that two treatments that have not been compared head-to-head can be assessed as a pair.

Start by denoting the comprehensive list of treatments in all \(K\) trials by \(\mathcal{T}=\{1,\ldots,T\}\). It is rarely the case that all \(T\) treatments are included in the data but we drop the subscripts \(t_k\) and replace it with \(t\) for notational simplicity. Now, consider the model

\[\begin{equation}\label{eq:nmr-basic} \bar{y}_{\cdot tk} = \boldsymbol{x}_{tk}^\prime\boldsymbol{\beta} + \tau_{tk}\gamma_{tk} + \bar{\epsilon}_{\cdot tk}, \quad \bar{\epsilon}_{\cdot tk} \sim \mathcal{N}(0,\sigma_{tk}^2/n_{tk}), \end{equation}\] where \(\bar{y}_{\cdot tk}\) is the univariate aggregate response of the \(k\)th trial for which treatment \(t\) was assigned, \(\boldsymbol{x}_{tk}\) is the aggregate covariate vector for the fixed effects, and \(\gamma_{tk}\) is the random effects term. The observed standard deviation, \(s_{tk}^2\) is modeled by

\[ \dfrac{(n_{tk}-1)s_{tk}^2}{\sigma_{tk}^2} \sim \chi_{n_{tk}-1}^2. \]

\(\tau_{tk}\) in Equation \(\eqref{eq:nmr-basic}\) encapsulates the variance of the random effect for the \(t\)th treatment in the \(k\)th trial, which is modeled as a deterministic function of a related covariate. That is,

\[ \log \tau_{tk} = \boldsymbol{z}_{tk}^\prime\boldsymbol{\phi}, \]

where \(\boldsymbol{z}_{tk}\) is the \(q\)-dimensional aggregate covariate vector and \(\boldsymbol{\phi}\) is the corresponding coefficient vector.

For the \(k\)th trial, we define a selection/projection matrix \(E_k = (e_{t_{1k}},e_{t_{2k}},\ldots, e_{t_{T_k k}})\), where \(e_{t_{lk}} = (0,\ldots,1,\ldots,0)^\prime\), \(l=1,\ldots,T_k\), with \(t_{lk}\)th element set to 1 and 0 otherwise, and \(T_k\) is the number of treatments included in the \(k\)th trial. Let the scaled random effects \(\boldsymbol{\gamma}_k = (\gamma_{1k},\ldots,\gamma_{Tk})^\prime\). Then, \(\boldsymbol{\gamma}_{k,o}=E_k^\prime\boldsymbol{\gamma}_k\) is the vector of \(T_k\)-dimensional scaled random effects for the \(k\)th trial. The scaled random effects \(\boldsymbol{\gamma}_k \sim t_T(\boldsymbol{\gamma},\boldsymbol{\rho},\nu)\) where \(t_T(\boldsymbol{\mu},\Sigma,\nu)\) denotes a multivariate \(t\) distribution with \(\nu\) degrees of freedom, a location parameter vector \(\boldsymbol{\mu}\), and a scale matrix \(\Sigma\).

The non-centered reparametrization (Bernardo et al. 2003) gives \(\boldsymbol{\gamma}_{k,o} = E_k^\prime(\boldsymbol{\gamma}_k - \boldsymbol{\gamma})\). Then, with \(\bar{\boldsymbol{y}}_k = (\bar{y}_{kt_{k1}},\ldots,\bar{y}_{kt_{kT_k}})^\prime\), \(\boldsymbol{X}_k = (\boldsymbol{x}_{kt_{k1}},\ldots, \boldsymbol{x}_{kt_{kT_k}})\), and \(\boldsymbol{Z}_k(\boldsymbol{\phi}) = \mathrm{diag}(\exp(\boldsymbol{z}_{kt_{k1}}^\prime \boldsymbol{\phi}),\ldots, \exp(\boldsymbol{z}_{kt_{kT_k}}^\prime \boldsymbol{\phi}))\), the model is recast as

\[ \bar{\boldsymbol{y}}_k = \boldsymbol{X}_k^* \boldsymbol{\theta} + \boldsymbol{Z}_k(\boldsymbol{\phi}) \boldsymbol{\gamma}_{k,o} + \bar{\boldsymbol{\epsilon}}_k, \] where \(\boldsymbol{X}_k^* = (\boldsymbol{X}_k, E_k^\prime)\), \(\boldsymbol{\theta} = (\boldsymbol{\beta}^\prime, \boldsymbol{\gamma}^\prime)^\prime\), and \(\bar{\boldsymbol{\epsilon}}_k \sim \mathcal{N}_{T_k}(\boldsymbol{0},\Sigma_k)\), \(\Sigma_k = \mathrm{diag}(\sigma_{kt_{k1}}^2/n_{kt_{k1}}, \ldots, \sigma_{kt_{kT_k}}^2/n_{kt_{kT_k}})\). This allows the random effects \(\boldsymbol{\gamma}_{k,o} \sim t_{T_k}(\boldsymbol{0},E_k^\prime \boldsymbol{\rho}E_k, \nu)\) to be centered at zero.

Since the multivariate \(t\) random effects are not analytically marginalizable, we represent it as a scale mixture of normals as follows:

\[ (\boldsymbol{\gamma}_{k,o}\mid \lambda_k) \overset{\text{ind}}{\sim} \mathcal{N}_{T_k}\left(\boldsymbol{0}, \lambda_k^{-1}(E_k^\prime\boldsymbol{\rho}E_k) \right), \quad \lambda_k \overset{\text{iid}}{\sim}\mathcal{G}a\left(\dfrac{\nu}{2},\dfrac{\nu}{2} \right), \] where \(\mathcal{G}a(a,b)\) indicates the gamma distribution with mean \(a/b\) and variance \(a/b^2\).

Bernardo, JM, MJ Bayarri, JO Berger, AP Dawid, D Heckerman, AFM Smith, and M West. 2003. “Non-Centered Parameterisations for Hierarchical Models and Data Augmentation.” In *Bayesian Statistics 7: Proceedings of the Seventh Valencia International Meeting*. Vol. 307. Oxford University Press, USA.

Eddelbuettel, Dirk, and James Joseph Balamuta. 2017. “Extending R with C++: A Brief Introduction to Rcpp.” *PeerJ Preprints* 5 (August): e3188v1. https://doi.org/10.7287/peerj.preprints.3188v1.

Eddelbuettel, Dirk, and Conrad Sanderson. 2014. “RcppArmadillo: Accelerating R with High-Performance C++ Linear Algebra.” *Computational Statistics and Data Analysis* 71: 1054–63. http://dx.doi.org/10.1016/j.csda.2013.02.005.

Yao, Hui, Sungduk Kim, Ming-Hui Chen, Joseph G Ibrahim, Arvind K Shah, and Jianxin Lin. 2015. “Bayesian Inference for Multivariate Meta-Regression with a Partially Observed Within-Study Sample Covariance Matrix.” *Journal of the American Statistical Association* 110 (510): 528–44.