matSpD local version

Matrix Spectral Decomposition (matSpD) – estimates the equivalent number of independent variables in a correlation (r) matrix.

I recommend limiting matSpD to correlation matrices containing between 2 and 200 rows!

Because the sign of a correlation is not important for multiple-testing correction, matSpD now automatically analyses the absolute values of your correlation matrix.

If you have larger matrices, use the matSpDlite.R script, which calculates ONLY Veff and VeffLi (i.e., it does not perform the time-consuming varimax/promax rotations), allowing users to obtain Veff/VeffLi values for large numbers (1000s) of variables.

Below I provide a downloadable R script to perform matSpD analysis on your local machine.

Analogous to my SNPSpD approach, matSpD provides a measure of the equivalent number of independent variables in a correlation (r) matrix by examining the ratio of the observed eigenvalue variance (after spectral decomposition) to its theoretical maximum. Please refer to the SNPSpD publication for further information.
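As a rough illustration of the eigenvalue-variance ratio described above, here is a minimal R sketch. It is not the matSpD.R script itself: the function names veff_nyholt and veff_li are hypothetical, and the formulas are the ones published in Nyholt (2004) and Li and Ji (2005), assuming the sample variance of the eigenvalues is used.

```r
# Sketch only; function names are hypothetical and not taken from matSpD.R.

# Nyholt (2004): Veff = 1 + (M - 1) * (1 - Var(lambda) / M),
# where M is the number of variables and M is also the maximum
# possible variance of the eigenvalues.
veff_nyholt <- function(r) {
  r <- abs(as.matrix(r))   # the sign of r is ignored
  M <- nrow(r)
  lambda <- eigen(r, symmetric = TRUE, only.values = TRUE)$values
  1 + (M - 1) * (1 - var(lambda) / M)
}

# Li and Ji (2005): sum over eigenvalues of I(x >= 1) + (x - floor(x)),
# applied to the absolute eigenvalues.
veff_li <- function(r) {
  r <- abs(as.matrix(r))
  lambda <- abs(eigen(r, symmetric = TRUE, only.values = TRUE)$values)
  sum(as.numeric(lambda >= 1) + (lambda - floor(lambda)))
}

# Toy example: two variables correlated at r = 0.5
r_toy <- matrix(c(1.0, 0.5,
                  0.5, 1.0), nrow = 2, byrow = TRUE)
veff_nyholt(r_toy)
veff_li(r_toy)
```

For fully independent variables (an identity matrix) both functions return the number of variables, and for the toy matrix above the first estimate falls between 1 and 2.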

It is worth noting that the SNPSpD approach utilises the square root of a linkage disequilibrium (LD) correlation measure (r²) estimated from haplotype frequencies. However, correlations (r) based on genotype allele counts will suffice for the estimation of SNP independence (i.e., the correlation between two SNP variables, coded 0, 1 or 2 to represent the number of non-reference alleles at each SNP).
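A genotype-based correlation matrix of this kind could be computed in R along the following lines. This is only a sketch: the genotype values and SNP names are made-up toy data, and Pearson correlation via cor() is assumed to be an acceptable estimator.

```r
# Hypothetical toy data: 6 individuals (rows) x 3 SNPs (columns),
# coded as the number of non-reference alleles (0, 1 or 2).
geno <- matrix(c(0, 1, 2, 1, 0, 2,    # snp1
                 0, 1, 2, 1, 0, 2,    # snp2 (identical to snp1)
                 2, 0, 1, 1, 2, 0),   # snp3
               ncol = 3,
               dimnames = list(NULL, c("snp1", "snp2", "snp3")))

# Pearson correlation (r) between every pair of SNP columns
r <- cor(geno)

# Absolute values, since the sign of r is not relevant here
r_abs <- abs(r)
```

If some genotypes are missing (NA), cor() accepts use = "pairwise.complete.obs", although a matrix built that way is not guaranteed to be positive semi-definite.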

Therefore, to perform a SNPSpD analysis, you could run the matSpD script on a pre-calculated LD correlation (r) matrix for your SNPs (e.g., generated using the PLINK --r option).

The matSpD.R script takes one file as input (please use short simple file names [i.e., no spaces or special characters]).

The file should be a space- or tab-delimited plain-text (ASCII/ANSI) file in PC/UNIX (not Mac) format, with no hidden characters, containing your correlation (r) matrix without row or column names (e.g., file name = “corr.matrix”).
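If your correlation matrix already exists as an R object, a file in this layout could be produced roughly as follows. This is a sketch with a made-up 3 x 3 matrix; the temporary directory is used only so the example does not overwrite anything, and the read-back step is an optional sanity check rather than part of matSpD.

```r
# Hypothetical 3 x 3 correlation matrix
r <- matrix(c(1.0, 0.5, 0.2,
              0.5, 1.0, 0.3,
              0.2, 0.3, 1.0), nrow = 3, byrow = TRUE)

# Write a space-delimited plain-text file with no row or column names
infile <- file.path(tempdir(), "corr.matrix")
write.table(r, file = infile, sep = " ", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Optional check: read the file back and confirm it is a square,
# symmetric matrix with 1s on the diagonal
chk <- unname(as.matrix(read.table(infile, header = FALSE)))
stopifnot(nrow(chk) == ncol(chk),
          isSymmetric(chk),
          all(abs(diag(chk) - 1) < 1e-8))
```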

The following links provide an example of such an input file and its corresponding output:
corr.matrix – input file,
results – output file.

References:

Cheverud JM (2001) A simple correction for multiple comparisons in interval mapping genome scans. Heredity 87:52-58

Li J, Ji L (2005) Adjusting multiple testing in multilocus analyses using the eigenvalues of a correlation matrix. Heredity 95:221-227

Nyholt DR (2004) A simple correction for multiple testing for SNPs in linkage disequilibrium with each other. Am J Hum Genet 74(4):765-769.

R Development Core Team (2003) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, http://www.R-project.org (accessed March 1, 2004)

Slightly newer R scripts, matSpDliteNewAbsMatrix.R and matSpDliteNewMatrix.R, are also available to download for performing matSpDlite analysis on your local machine.

Page last updated August 28, 2019.