For the matrix variate normal distribution, maximum likelihood estimates exist for \(N > max(p/q,q/p)+1\) and are unique for \(N > max(p,q)\). The number necessary for the matrix variate t has not been worked out but this is a lower bound. This implements an ECME algorithm to estimate the mean, covariance, and degrees of freedom parameters. An AR(1), compound symmetry, or independence restriction can be proposed for either or both variance matrices. However, if they are inappropriate for the data, they may fail with a warning.

MLmatrixt(
  data,
  row.mean = FALSE,
  col.mean = FALSE,
  row.variance = "none",
  col.variance = "none",
  df = 10,
  fixed = TRUE,
  tol = .Machine$double.eps^0.5,
  max.iter = 5000,
  U,
  V,
  ...
)

Arguments

data

Either a list of matrices or a 3-D array with matrices in dimensions 1 and 2, indexed by dimension 3.

row.mean

By default, FALSE. If TRUE, will fit a common mean within each row. If both this and col.mean are TRUE, there will be a common mean for the entire matrix.

col.mean

By default, FALSE. If TRUE, will fit a common mean within each row. If both this and row.mean are TRUE, there will be a common mean for the entire matrix.

row.variance

Imposes a variance structure on the rows. Either 'none', 'AR(1)', 'CS' for 'compound symmetry', 'Correlation' for a correlation matrix, or 'Independence' for independent and identical variance across the rows. Only positive correlations are allowed for AR(1) and CS and these restrictions may not be guaranteed to converge. Note that while maximum likelihood estimators are available (and used) for the unconstrained variance matrices, optim is used for any constraints so it may be considerably slower.

col.variance

Imposes a variance structure on the columns. Either 'none', 'AR(1)', 'CS', 'Correlation', or 'Independence'. Only positive correlations are allowed for AR(1) and CS.

df

Starting value for the degrees of freedom. If fixed = TRUE, then this is required and not updated. By default, set to 10.

fixed

Whether df is estimated or fixed. By default, TRUE.

tol

Convergence criterion. Measured against square deviation between iterations of the two variance-covariance matrices.

max.iter

Maximum possible iterations of the algorithm.

U

(optional) Can provide a starting point for the U matrix. By default, an identity matrix.

V

(optional) Can provide a starting point for the V matrix. By default, an identity matrix.

...

(optional) additional arguments can be passed to optim if using restrictions on the variance.

Value

Returns a list with the following elements:

mean

the mean matrix

U

the between-row covariance matrix

V

the between-column covariance matrix

var

the scalar variance parameter (the first entry of the covariances are restricted to unity)

nu

the degrees of freedom parameter

iter

the number of iterations

tol

the squared difference between iterations of the variance matrices at the time of stopping

logLik

log likelihood of result.

convergence

a convergence flag, TRUE if converged.

call

The (matched) function call.

References

Thompson, G Z.  R Maitra, W Q Meeker, A Bastawros (2019),
"Classification with the matrix-variate-t distribution", arXiv
e-prints arXiv:1907.09565 <https://arxiv.org/abs/1907.09565>

Dickey, James M. 1967. “Matricvariate Generalizations of the
Multivariate t Distribution and the Inverted Multivariate t
Distribution.” Ann. Math. Statist. 38 (2): 51118.
\doi{10.1214/aoms/1177698967}

Liu, Chuanhai, and Donald B. Rubin. 1994. “The ECME Algorithm:
A Simple Extension of EM and ECM with Faster Monotone Convergence.”
Biometrika 81 (4): 63348.
      \doi{10.2307/2337067}

Meng, Xiao-Li, and Donald B. Rubin. 1993. “Maximum Likelihood Estimation via the ECM Algorithm: A General Framework.” Biometrika 80 (2): 267–78. doi:10.1093/biomet/80.2.267

Rubin, D.B. 1983. “Encyclopedia of Statistical Sciences.” In, 4th ed.,
  2725. John Wiley.

Examples

set.seed(20180202)
# drawing from a distribution with specified mean and covariance
A <- rmatrixt(
  n = 100, mean = matrix(c(100, 0, -100, 0, 25, -1000), nrow = 2),
  L = matrix(c(2, 1, 0, .1), nrow = 2), list = TRUE, df = 5
)
# fitting maximum likelihood estimates
results <- MLmatrixt(A, tol = 1e-5, df = 5)
print(results)
#> $mean
#>             [,1]          [,2]        [,3]
#> [1,] 99.86681274 -100.08579303    24.96235
#> [2,] -0.05923668   -0.05035992 -1000.01499
#> 
#> $U
#>           [,1]      [,2]
#> [1,] 1.0000000 0.5053587
#> [2,] 0.5053587 0.2585040
#> 
#> $V
#>             [,1]       [,2]        [,3]
#> [1,]  1.00000000  0.1299735 -0.06949657
#> [2,]  0.12997346  1.1392257 -0.07063750
#> [3,] -0.06949657 -0.0706375  0.87998071
#> 
#> $var
#> [1] 3.666694
#> 
#> $nu
#> [1] 5
#> 
#> $iter
#> [1] 30
#> 
#> $tol
#> [1] 8.100817e-06
#> 
#> $logLik
#> [1] 3.492195
#> 
#> $convergence
#> [1] TRUE
#> 
#> $call
#> MLmatrixt(data = A, df = 5, tol = 1e-05)
#>