vignettes/matrixnormal.Rmd
matrixnormal.Rmd
Suppose there is a two-dimensional grid of observations: each observation is randomly distributed somehow, perhaps each spot on the grid has its own mean, thereβs a covariance relation across the rows that doesnβt vary across the columns and a covariance relation across the columns that doesnβt vary across the rows. One case where this might come up is a multivariate time series: at each point in time we are dealing with a multivariate observation (perhaps normally distributed or not) and then there is a covariance structure across time to account for. We may be interested in drawing from such distributions, calculating densities, or estimating parameters based on observations. This package presents some functions for doing so in the case of matrix variate normal distributions. For more details about matrix variate distributions in general and matrix variate normal distributions in particular, see Gupta and Nagar (1999), whose results I rely on for much of this presentation.
A random matrix with a matrix variate normal distribution is parameterized by a mean matrix and covariance matrices (determining covariance among rows) and (determining covariance among columns) and has the following probability density function (Wikipedia (2018)): Here is a useful fact for drawing observations from a matrix variate normal distribution: suppose is distributed as a matrix variate normal distribution with mean matrix and covariance matrices and . Then for a mean matrix and linear transformations and of appropriate dimensions, if then is matrix variate normally distributed with parameters .
Matrix variate random variables can then be generated by specifying and and or by specifying and . If the covariance matrices are provided, then Cholesky decomposition is performed to create the and matrices. If and matrices are provided, they are used to create and matrices which are then decomposed. If these are not specified, the variances are presumed to be identity matrices by default and the means are presumed to be zero. If or are not full rank linear transformations, degenerate matrix normal distributions can be produced.
set.seed(20180203)
x <- matrix(rnorm(6),nrow=3)
x
#> [,1] [,2]
#> [1,] 0.04234611 1.7072872
#> [2,] -1.05186238 0.2182058
#> [3,] 0.36199161 0.6759820
set.seed(20180203)
y <- rmatrixnorm(n = 1, mean=matrix(rep(0,6),nrow=3))
y
#> [,1] [,2]
#> [1,] 0.04234611 1.7072872
#> [2,] -1.05186238 0.2182058
#> [3,] 0.36199161 0.6759820
U <- 5 * diag(3) + 1
V <- matrix(c(2,0,0,.1),nrow=2)
mu = matrix(1:6,nrow=3)
set.seed(20180203)
z <- rmatrixnorm(n = 1, mean=mu,U=U,V=V)
z
#> [,1] [,2]
#> [1,] 1.146691 5.322459
#> [2,] -1.568345 5.387067
#> [3,] 3.734947 6.755212
mu + t(chol(U)) %*% y %*% chol(V)
#> [,1] [,2]
#> [1,] 1.146691 5.322459
#> [2,] -1.568345 5.387067
#> [3,] 3.734947 6.755212
When
,
by default this returns a matrix. For
,
by default, it will return a three-dimensional array indexed by the last
coordinate (array convention similar to rWishart
).
set.seed(20180203)
x <- rmatrixnorm(n = 1, mean=matrix(rep(0,6),nrow=3))
x
#> [,1] [,2]
#> [1,] 0.04234611 1.7072872
#> [2,] -1.05186238 0.2182058
#> [3,] 0.36199161 0.6759820
set.seed(20180203)
y <- rmatrixnorm(n = 100, mean=matrix(rep(0,6),nrow=3),list = TRUE)
y[[1]]
#> [,1] [,2]
#> [1,] 0.04234611 1.7072872
#> [2,] -1.05186238 0.2182058
#> [3,] 0.36199161 0.6759820
set.seed(20180203)
z <- rmatrixnorm(n = 100, mean=matrix(rep(0,6),nrow=3),array = TRUE)
z[ , , 1]
#> [,1] [,2]
#> [1,] 0.04234611 1.7072872
#> [2,] -1.05186238 0.2182058
#> [3,] 0.36199161 0.6759820
Densities are computed in much the same way as for other
distributions. The defaults for the mean parameters are 0 and identity
for the variance matrices. The log
parameter sets whether
to return the density on the log scale - by default this is
FALSE
.
Maximum likelihood estimation is possible for the matrix variate normal distribution (see Dutilleul (1999) or Glanz and Carvalho (2013) for details). The pair of variance matrices are only identifiable up to a constant, so this function sets the first element of to . There is a closed-form solution for the mean matrix and the and have an iterative solution.
set.seed(20180202)
A = rmatrixnorm(n=100,mean=matrix(c(100,0,-100,0,25,-1000),nrow=2),
L=matrix(c(2,1,0,.1),nrow=2),list=TRUE)
results=MLmatrixnorm(A)
print(results)
#> $mean
#> [,1] [,2] [,3]
#> [1,] 99.7692446 -100.2587675 25.0921
#> [2,] -0.1010964 -0.1537989 -999.9448
#>
#> $U
#> [,1] [,2]
#> [1,] 1.0000000 0.5011833
#> [2,] 0.5011833 0.2542832
#>
#> $V
#> [,1] [,2] [,3]
#> [1,] 1.000000000 0.08886027 0.003307182
#> [2,] 0.088860275 0.99216701 -0.048960852
#> [3,] 0.003307182 -0.04896085 0.808693731
#>
#> $var
#> [1] 3.984766
#>
#> $iter
#> [1] 5
#>
#> $tol
#> [1] 9.830401e-11
#>
#> $logLik
#> [1] -415.1583 -376.4587 -376.4574 -376.4574 -376.4574
#>
#> $convergence
#> [1] TRUE
#>
#> $call
#> MLmatrixnorm(data = A)
There are two restrictions possible for the mean matrices:
row.mean = TRUE
will force a common mean within a row and
col.mean = TRUE
will force a common mean within a column.
Setting both will ensure a constant mean for the entire system.
Restrictions on
and
are possible with row.variance
and
col.variance
commands.
results.fixrows = MLmatrixnorm(A, row.mean = TRUE, max.iter = 5)
#> Warning in MLmatrixnorm(A, row.mean = TRUE, max.iter = 5): Failed to converge
print(results.fixrows)
#> $mean
#> [,1] [,2] [,3]
#> [1,] 8.20086 8.20086 8.20086
#> [2,] -333.39989 -333.39989 -333.39989
#>
#> $U
#> [,1] [,2]
#> [1,] 1.0000000 0.4732438
#> [2,] 0.4732438 0.2752951
#>
#> $V
#> [,1] [,2] [,3]
#> [1,] 1.000000 1.287832 -2.287827
#> [2,] 1.287832 1.717146 -3.004973
#> [3,] -2.287827 -3.004973 5.292803
#>
#> $var
#> [1] 441310.5
#>
#> $iter
#> [1] 5
#>
#> $tol
#> [1] 2.815079e+12
#>
#> $logLik
#> [1] -3125.019 -3093.986 -3062.230 -3030.199 -2998.804
#>
#> $convergence
#> [1] FALSE
#>
#> $call
#> MLmatrixnorm(data = A, row.mean = TRUE, max.iter = 5)
# this failure is expected with misspecification! The number of iterations is also
# fixed to be low so the vignette compiles quickly.
Currently the options for variance restrictions are to impose an
AR(1) structure by providing the AR(1)
option, a compound
symmetry structure by providing the CS
option, to impose a
correlation matrix structure by specifying correlation
or
corr
, or to impose an identical and independent structure
by specifying Independent
or I
. This works by
using uniroot
to find the appropriate
which sets the derivative of the log-likelihood to zero for the
AR
and CS
options - it is not fast but if this
is the true structure it will be better in some sense than an
unstructured variance matrix. The
parameter should be
and is forced to be non-negative. If the data behaves incompatibly with
those restrictions, the function will provide a warning and exit with
the current model fit.
# tolerance and maximum iterations are limited here to make the vignette compile faster
B = rmatrixnorm(n = 50, mean = matrix(1:15,nrow = 5),
U = 3*stats::toeplitz(.6^(1:5)))
MLmatrixnorm(B, row.variance = "AR(1)", tol = 1e-5)
#> $mean
#> [,1] [,2] [,3]
#> [1,] 0.9879945 5.976905 10.88718
#> [2,] 1.8732469 6.852212 12.18793
#> [3,] 2.9224880 7.939933 13.30729
#> [4,] 4.2194033 9.036333 14.06759
#> [5,] 5.2340302 9.857787 14.63149
#>
#> $U
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1.0000000 0.5839991 0.3410550 0.1991758 0.1163185
#> [2,] 0.5839991 1.0000000 0.5839991 0.3410550 0.1991758
#> [3,] 0.3410550 0.5839991 1.0000000 0.5839991 0.3410550
#> [4,] 0.1991758 0.3410550 0.5839991 1.0000000 0.5839991
#> [5,] 0.1163185 0.1991758 0.3410550 0.5839991 1.0000000
#>
#> $V
#> [,1] [,2] [,3]
#> [1,] 1.0000000000 0.0001430499 0.005716386
#> [2,] 0.0001430499 0.8494982020 0.035590626
#> [3,] 0.0057163858 0.0355906255 0.979331742
#>
#> $var
#> [1] 1.853556
#>
#> $iter
#> [1] 12
#>
#> $tol
#> [1] 8.864026e-06
#>
#> $logLik
#> [1] -1156.249 -1149.718 -1148.332 -1147.872 -1147.712 -1147.651 -1147.627
#> [8] -1147.617 -1147.614 -1147.612 -1147.611 -1147.611
#>
#> $convergence
#> [1] TRUE
#>
#> $call
#> MLmatrixnorm(data = B, row.variance = "AR(1)", tol = 1e-05)
MLmatrixnorm(B, tol = 1e-5)
#> $mean
#> [,1] [,2] [,3]
#> [1,] 0.9879945 5.976905 10.88718
#> [2,] 1.8732469 6.852212 12.18793
#> [3,] 2.9224880 7.939933 13.30729
#> [4,] 4.2194033 9.036333 14.06759
#> [5,] 5.2340302 9.857787 14.63149
#>
#> $U
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1.0000000 0.5088212 0.3116836 0.2418073 0.1278731
#> [2,] 0.5088212 0.9878158 0.6383707 0.3910373 0.3114872
#> [3,] 0.3116836 0.6383707 1.1270834 0.6297656 0.4495454
#> [4,] 0.2418073 0.3910373 0.6297656 1.0046405 0.6505338
#> [5,] 0.1278731 0.3114872 0.4495454 0.6505338 1.0516576
#>
#> $V
#> [,1] [,2] [,3]
#> [1,] 1.000000000 0.005092835 0.009893861
#> [2,] 0.005092835 0.836800036 0.034013446
#> [3,] 0.009893861 0.034013446 0.972453799
#>
#> $var
#> [1] 1.8111
#>
#> $iter
#> [1] 4
#>
#> $tol
#> [1] 2.457002e-06
#>
#> $logLik
#> [1] -1145.534 -1143.775 -1143.767 -1143.766
#>
#> $convergence
#> [1] TRUE
#>
#> $call
#> MLmatrixnorm(data = B, tol = 1e-05)
In order to fit parameters for an matrix variate normal distribution, at least observations are needed. Maximum likelihood estimates are unique if there are more than observations.
The function expects data in a format produced by the
rmatrixnorm
function. The results are only the parameter
estimates and do not include measures of uncertainty.
This vignette was compiled with rmarkdown
.
sessionInfo()
#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.5 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] MixMatrix_0.2.8
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.37 desc_1.4.3 R6_2.5.1 fastmap_1.2.0
#> [5] xfun_0.47 glue_1.7.0 cachem_1.1.0 knitr_1.48
#> [9] htmltools_0.5.8.1 rmarkdown_2.28 lifecycle_1.0.4 cli_3.6.3
#> [13] pkgdown_2.1.1 sass_0.4.9 textshaping_0.4.0 jquerylib_0.1.4
#> [17] systemfonts_1.1.0 compiler_4.4.1 tools_4.4.1 ragg_1.3.3
#> [21] evaluate_1.0.0 bslib_0.8.0 Rcpp_1.0.13 yaml_2.3.10
#> [25] jsonlite_1.8.9 rlang_1.1.4 fs_1.6.4
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
set.seed(20180203)
x <- matrix(rnorm(6),nrow=3)
x
set.seed(20180203)
y <- rmatrixnorm(n = 1, mean=matrix(rep(0,6),nrow=3))
y
U <- 5 * diag(3) + 1
V <- matrix(c(2,0,0,.1),nrow=2)
mu = matrix(1:6,nrow=3)
set.seed(20180203)
z <- rmatrixnorm(n = 1, mean=mu,U=U,V=V)
z
mu + t(chol(U)) %*% y %*% chol(V)
set.seed(20180203)
x <- rmatrixnorm(n = 1, mean=matrix(rep(0,6),nrow=3))
x
set.seed(20180203)
y <- rmatrixnorm(n = 100, mean=matrix(rep(0,6),nrow=3),list = TRUE)
y[[1]]
set.seed(20180203)
z <- rmatrixnorm(n = 100, mean=matrix(rep(0,6),nrow=3),array = TRUE)
z[ , , 1]
set.seed(20180202)
A = rmatrixnorm(n=1,mean=matrix(c(100,0,-100,0,25,-1000),nrow=2),
L=matrix(c(2,1,0,.1),nrow=2))
dmatrixnorm(A,mean=matrix(c(100,0,-100,0,25,-1000),nrow=2),
L=matrix(c(2,1,0,.1),nrow=2),log=TRUE )
set.seed(20180202)
A = rmatrixnorm(n=100,mean=matrix(c(100,0,-100,0,25,-1000),nrow=2),
L=matrix(c(2,1,0,.1),nrow=2),list=TRUE)
results=MLmatrixnorm(A)
print(results)
results.fixrows = MLmatrixnorm(A, row.mean = TRUE, max.iter = 5)
print(results.fixrows)
# this failure is expected with misspecification! The number of iterations is also
# fixed to be low so the vignette compiles quickly.
# tolerance and maximum iterations are limited here to make the vignette compile faster
B = rmatrixnorm(n = 50, mean = matrix(1:15,nrow = 5),
U = 3*stats::toeplitz(.6^(1:5)))
MLmatrixnorm(B, row.variance = "AR(1)", tol = 1e-5)
MLmatrixnorm(B, tol = 1e-5)
sessionInfo()