Finding blocks in covariance matrices

Edgar C. Merkle

Given a covariance matrix with some covariances fixed to 0, is it possible to permute the rows and columns so that the matrix is block diagonal? This question is important in Bayesian latent variable modeling and elsewhere, because it influences the prior distributions that we can use. To be specific, if our covariance matrix consists of unrestricted blocks, then the 0 entries don’t really matter. We can specify an inverse Wishart or LKJ or other prior on each block of the covariance matrix. On the other hand, if our covariance matrix does not consist of unrestricted blocks, then the prior distribution becomes complicated (e.g., Merkle et al. 2023). This is because the possible values of unrestricted covariances are influenced by the zero entries, along with the constraint that the entire covariance matrix is positive definite.

Example

To make this more concrete, let’s consider the correlation matrix below.

R
Python

origmat

import numpy as np
print(origmat)

[[1. 1. 0. 1. 0.]
 [1. 1. 0. 1. 0.]
 [0. 0. 1. 0. 1.]
 [1. 1. 0. 1. 0.]
 [0. 0. 1. 0. 1.]]

The entries of 0 are fixed to 0, and the entries of 1 are free parameters. For most of us, it is not clear whether or not this is a block diagonal matrix. But if we reorder the rows and columns so that the first, second, and fourth come first, followed by the third and fifth, we obtain:

R
Python

origmat[c(1,2,4,3,5), c(1,2,4,3,5)]

print(origmat[[0,1,3,2,4]][:, [0,1,3,2,4]])

[[1. 1. 1. 0. 0.]
 [1. 1. 1. 0. 0.]
 [1. 1. 1. 0. 0.]
 [0. 0. 0. 1. 1.]
 [0. 0. 0. 1. 1.]]

which is a block diagonal matrix. We now know we can use one prior distribution on the block in rows/columns one to three, and a separate prior distribution on the block in rows/columns four to five. In general, we want to automatically permute the rows and columns to block diagonal form where possible. This is because we can’t be sure that the matrix will be pre-specified in block diagonal form.

Automation

For a time, I wasn’t sure how to automate this. I could have coded something myself, but I was pretty sure that I would be re-inventing the wheel.

I found an answer in the graph theory concept of connectivity. In short, we treat each unrestricted entry of the covariance matrix as an undirected edge in a graph. We then search for the connected components of the graph. Each component is a block of the covariance matrix. We can then determine whether or not each block is unrestricted.

Functionality for doing this comes from the igraph package (Csárdi and Nepusz 2006; Csárdi et al. 2025). Below, we create a graph of our covariance matrix, where each free parameter is an undirected edge. The components() function then finds blocks of the covariance matrix; in network language, these are the maximal connected components of the graph.

R
Python

library("igraph")

g <- graph_from_adjacency_matrix(origmat, mode = "undirected")
blks <- components(g)
blks

$membership
          
1 1 2 1 2 

$csize
[1] 3 2

$no
[1] 2

import igraph as ig

g = ig.Graph.Adjacency(origmat)
blks = g.connected_components(mode = 'weak')
vars(blks)

{'_membership': [0, 0, 1, 0, 1], '_len': 2, '_graph': <igraph.Graph object at 0x7fe3237dd940>, '_modularity': None, '_modularity_dirty': True, '_modularity_params': {}}

The membership entry of the results shows that rows 1, 2, and 4 belong to one block, and rows 3 and 5 belong to the other block. After that, we see that the first block contains 3 variables and the second block contains 2 variables. We can reorder the covariance matrix using this membership information, via

R
Python

order(blks$membership)

[1] 1 2 4 3 5

print(np.argsort(blks.membership))

[0 1 3 2 4]

Larger Example

Let’s consider a larger example with an \(11 \times 11\) matrix. Again, entries of 0 are fixed to 0 and entries of 1 are free parameters.

R
Python

bigmat

                      
 1 0 0 0 0 0 0 0 0 0 0
 0 1 0 0 0 0 0 0 0 0 0
 0 0 1 0 0 0 0 0 0 0 0
 0 0 0 1 0 0 0 1 0 0 0
 0 0 0 0 1 0 1 0 1 0 0
 0 0 0 0 0 1 0 0 0 1 0
 0 0 0 0 1 0 1 0 0 0 1
 0 0 0 1 0 0 0 1 0 0 0
 0 0 0 0 1 0 0 0 1 0 1
 0 0 0 0 0 1 0 0 0 1 0
 0 0 0 0 0 0 1 0 1 0 1

print(bigmat)

[[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 1. 0. 1. 0. 0.]
 [0. 0. 0. 0. 0. 1. 0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1. 0. 1. 0. 0. 0. 1.]
 [0. 0. 0. 1. 0. 0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 1. 0. 1.]
 [0. 0. 0. 0. 0. 1. 0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 0. 0. 1. 0. 1. 0. 1.]]

Here is the same matrix with just the 1s, so that it is easier to see the free entries:

                      
 1                    
   1                  
     1                
       1       1      
         1   1   1    
           1       1  
         1   1       1
       1       1      
         1       1   1
           1       1  
             1   1   1

The free entries look structured, but it is not clear whether this could be a block diagonal matrix. So we examine connectivity:

R
Python

g <- graph_from_adjacency_matrix(bigmat, mode = "undirected")
blks <- components(g)
blks

$membership
                      
1 2 3 4 5 6 5 4 5 6 5 

$csize
[1] 1 1 1 2 4 2

$no
[1] 6

g = ig.Graph.Adjacency(bigmat)
blks = g.connected_components(mode = 'weak')
vars(blks)

{'_membership': [0, 1, 2, 3, 4, 5, 4, 3, 4, 5, 4], '_len': 6, '_graph': <igraph.Graph object at 0x7fe3237df740>, '_modularity': None, '_modularity_dirty': True, '_modularity_params': {}}

These results indicate 6 blocks in the covariance matrix, some of which are of size 1. The blocks of size 1 indicate standalone variables, which could receive inverse gamma (or other) priors on variances.

To examine whether or not the blocks are fully unrestricted, we permute the rows and columns of our matrix:

R
Python

reorder <- order(blks$membership)
bigmat[reorder, reorder]

                      
 1 0 0 0 0 0 0 0 0 0 0
 0 1 0 0 0 0 0 0 0 0 0
 0 0 1 0 0 0 0 0 0 0 0
 0 0 0 1 1 0 0 0 0 0 0
 0 0 0 1 1 0 0 0 0 0 0
 0 0 0 0 0 1 1 1 0 0 0
 0 0 0 0 0 1 1 0 1 0 0
 0 0 0 0 0 1 0 1 1 0 0
 0 0 0 0 0 0 1 1 1 0 0
 0 0 0 0 0 0 0 0 0 1 1
 0 0 0 0 0 0 0 0 0 1 1

reorder = np.argsort(blks.membership)
print(bigmat[reorder][:, reorder])

[[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 1. 1. 1. 0. 0. 0.]
 [0. 0. 0. 0. 0. 1. 1. 0. 1. 0. 0.]
 [0. 0. 0. 0. 0. 1. 0. 1. 1. 0. 0.]
 [0. 0. 0. 0. 0. 0. 1. 1. 1. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1.]]

The output shows two \(2 \times 2\) blocks that are unrestricted, but there remains a \(4 \times 4\) block with some covariances fixed to 0:

R
Python

bigmat[reorder, reorder][6:9, 6:9]

print(bigmat[reorder][:, reorder][5:9][:, 5:9])

[[1. 1. 1. 0.]
 [1. 1. 0. 1.]
 [1. 0. 1. 1.]
 [0. 1. 1. 1.]]

We need to specially consider the prior distribution for this \(4 \times 4\) block due to the 0 restrictions. But the priors for the other seven rows and columns are “easy” because they are unrestricted.

Even though we were not able to permute this matrix to have fully unrestricted blocks, we have reduced the dimension of the problematic block from 11 to 4. We are likely to see improvements in the computational efficiency of MCMC methods due to this reduced dimension.

Summary

Connectivity is a well known concept in graph theory, and it is not really surprising that it can be applied to covariance matrices. This post shows exactly how the covariance matrix can be translated to a graph, which then opens the door to connectivity. Under this representation, we can easily see whether our matrix is block diagonal. This information can then be used to automatically determine appropriate prior distributions for the covariance matrix.

License

The code on this page is copyrighted by Edgar Merkle and licensed under the GPLv3 license:

https://www.gnu.org/licenses/gpl-3.0.en.html

The text and figures on this page are copyrighted by Edgar Merkle and licensed under the CC BY-NC 4.0 license:

https://creativecommons.org/licenses/by-nc/4.0/

sessionInfo()

R version 4.4.3 (2025-02-28)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3 
LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/Chicago
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] igraph_2.1.4

loaded via a namespace (and not attached):
 [1] digest_0.6.37       fastmap_1.2.0       xfun_0.51          
 [4] Matrix_1.7-2        lattice_0.22-6      magrittr_2.0.3     
 [7] reticulate_1.41.0.1 bspm_0.5.5          knitr_1.49         
[10] pkgconfig_2.0.3     htmltools_0.5.8.1   png_0.1-8          
[13] rmarkdown_2.29      lifecycle_1.0.4     cli_3.6.4          
[16] grid_4.4.3          compiler_4.4.3      tools_4.4.3        
[19] evaluate_1.0.3      Rcpp_1.0.14         yaml_2.3.10        
[22] rlang_1.1.5         jsonlite_1.9.1      htmlwidgets_1.6.4

import session_info
session_info.show()

-----
igraph              0.11.8
numpy               2.2.4
session_info        1.0.0
-----
Python 3.10.12 (main, Feb  4 2025, 14:57:36) [GCC 11.4.0]
Linux-5.15.0-126-generic-x86_64-with-glibc2.35
-----
Session information updated at 2025-03-20 17:19

References

Csárdi, Gábor, and Tamás Nepusz. 2006. “The igraph Software Package for Complex Network Research.” InterJournal Complex Systems: 1695. https://igraph.org.

Csárdi, Gábor, Tamás Nepusz, Vincent Traag, Szabolcs Horvát, Fabio Zanini, Daniel Noom, and Kirill Müller. 2025. igraph: Network Analysis and Visualization in R. https://doi.org/10.5281/zenodo.7682609.

Merkle, Edgar C., Oludare Ariyo, Sonja D. Winter, and Mauricio Garnier-Villarreal. 2023. “Opaque Prior Distributions in Bayesian Latent Variable Models.” Methodology 19: 228–55. https://doi.org/10.5964/meth.11167.

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{merkle2025,
  author = {Merkle, Edgar C.},
  title = {Finding Blocks in Covariance Matrices},
  date = {2025-03-20},
  url = {https://ecmerkle.github.io/cs/blockmat.html},
  langid = {en}
}

For attribution, please cite this work as:

Merkle, Edgar C. 2025. “Finding Blocks in Covariance Matrices.” March 20, 2025. https://ecmerkle.github.io/cs/blockmat.html.