Maximum likelihood estimation for matrix variate t distributions

For the matrix variate normal distribution, maximum likelihood estimates exist for \(N > max(p/q,q/p)+1\) and are unique for \(N > max(p,q)\). The number necessary for the matrix variate t has not been worked out but this is a lower bound. This implements an ECME algorithm to estimate the mean, covariance, and degrees of freedom parameters. An AR(1), compound symmetry, or independence restriction can be proposed for either or both variance matrices. However, if they are inappropriate for the data, they may fail with a warning.

MLmatrixt(
  data,
  row.mean = FALSE,
  col.mean = FALSE,
  row.variance = "none",
  col.variance = "none",
  df = 10,
  fixed = TRUE,
  tol = .Machine$double.eps^0.5,
  max.iter = 5000,
  U,
  V,
  ...
)

Arguments

data: Either a list of matrices or a 3-D array with matrices in dimensions 1 and 2, indexed by dimension 3.
row.mean: By default, FALSE. If TRUE, will fit a common mean within each row. If both this and col.mean are TRUE, there will be a common mean for the entire matrix.
col.mean: By default, FALSE. If TRUE, will fit a common mean within each row. If both this and row.mean are TRUE, there will be a common mean for the entire matrix.
row.variance: Imposes a variance structure on the rows. Either 'none', 'AR(1)', 'CS' for 'compound symmetry', 'Correlation' for a correlation matrix, or 'Independence' for independent and identical variance across the rows. Only positive correlations are allowed for AR(1) and CS and these restrictions may not be guaranteed to converge. Note that while maximum likelihood estimators are available (and used) for the unconstrained variance matrices, optim is used for any constraints so it may be considerably slower.
col.variance: Imposes a variance structure on the columns. Either 'none', 'AR(1)', 'CS', 'Correlation', or 'Independence'. Only positive correlations are allowed for AR(1) and CS.
df: Starting value for the degrees of freedom. If fixed = TRUE, then this is required and not updated. By default, set to 10.
fixed: Whether df is estimated or fixed. By default, TRUE.
tol: Convergence criterion. Measured against square deviation between iterations of the two variance-covariance matrices.
max.iter: Maximum possible iterations of the algorithm.
U: (optional) Can provide a starting point for the U matrix. By default, an identity matrix.
V: (optional) Can provide a starting point for the V matrix. By default, an identity matrix.
...: (optional) additional arguments can be passed to optim if using restrictions on the variance.

Value

Returns a list with the following elements:

mean: the mean matrix
U: the between-row covariance matrix
V: the between-column covariance matrix
var: the scalar variance parameter (the first entry of the covariances are restricted to unity)
nu: the degrees of freedom parameter
iter: the number of iterations
tol: the squared difference between iterations of the variance matrices at the time of stopping
logLik: log likelihood of result.
convergence: a convergence flag, TRUE if converged.
call: The (matched) function call.

References

Thompson, G Z.  R Maitra, W Q Meeker, A Bastawros (2019),
"Classification with the matrix-variate-t distribution", arXiv
e-prints arXiv:1907.09565 <https://arxiv.org/abs/1907.09565>

Dickey, James M. 1967. “Matricvariate Generalizations of the
Multivariate t Distribution and the Inverted Multivariate t
Distribution.” Ann. Math. Statist. 38 (2): 511–18.
\doi{10.1214/aoms/1177698967}

Liu, Chuanhai, and Donald B. Rubin. 1994. “The ECME Algorithm:
A Simple Extension of EM and ECM with Faster Monotone Convergence.”
Biometrika 81 (4): 633–48.
      \doi{10.2307/2337067}

Meng, Xiao-Li, and Donald B. Rubin. 1993. “Maximum Likelihood Estimation via the ECM Algorithm: A General Framework.” Biometrika 80 (2): 267–78. doi:10.1093/biomet/80.2.267

Rubin, D.B. 1983. “Encyclopedia of Statistical Sciences.” In, 4th ed.,
  272–5. John Wiley.

Examples

set.seed(20180202)
# drawing from a distribution with specified mean and covariance
A <- rmatrixt(
  n = 100, mean = matrix(c(100, 0, -100, 0, 25, -1000), nrow = 2),
  L = matrix(c(2, 1, 0, .1), nrow = 2), list = TRUE, df = 5
)
# fitting maximum likelihood estimates
results <- MLmatrixt(A, tol = 1e-5, df = 5)
print(results)
#> $mean
#>             [,1]          [,2]        [,3]
#> [1,] 99.86681274 -100.08579303    24.96235
#> [2,] -0.05923668   -0.05035992 -1000.01499
#> 
#> $U
#>           [,1]      [,2]
#> [1,] 1.0000000 0.5053587
#> [2,] 0.5053587 0.2585040
#> 
#> $V
#>             [,1]       [,2]        [,3]
#> [1,]  1.00000000  0.1299735 -0.06949657
#> [2,]  0.12997346  1.1392257 -0.07063750
#> [3,] -0.06949657 -0.0706375  0.87998071
#> 
#> $var
#> [1] 3.666694
#> 
#> $nu
#> [1] 5
#> 
#> $iter
#> [1] 30
#> 
#> $tol
#> [1] 8.100817e-06
#> 
#> $logLik
#> [1] 3.492195
#> 
#> $convergence
#> [1] TRUE
#> 
#> $call
#> MLmatrixt(data = A, df = 5, tol = 1e-05)
#>

Maximum likelihood estimation for matrix variate t distributions

Arguments

Value

References

See also

Examples