| Title: | Shrinkage Covariance Matrix Estimators |
|---|---|
| Description: | Provides nonparametric Steinian shrinkage estimators of the covariance matrix that are suitable in high-dimensional settings, that is, when the number of variables is larger than the sample size. |
| Authors: | Anestis Touloumis [aut, cre] |
| Maintainer: | Anestis Touloumis <[email protected]> |
| License: | GPL-2 \| GPL-3 |
| Version: | 2.1.0 |
| Built: | 2024-11-07 03:56:45 UTC |
| Source: | https://github.com/anestistouloumis/shrinkcovmat |
Provides nonparametric Stein-type shrinkage estimators of the covariance matrix that are suitable and statistically efficient when the number of variables is larger than the sample size. These estimators are non-singular and well-conditioned regardless of the dimensionality.
Each of the implemented shrinkage covariance matrix estimators is a convex linear combination of the sample covariance matrix and of a target matrix.
The function shrinkcovmat implements three options for the target matrix: (a) the spherical sample covariance matrix ('spherical'), i.e. the diagonal matrix whose diagonal elements all equal the average of the sample variances, (b) the diagonal sample covariance matrix ('diagonal'), i.e. the diagonal matrix whose diagonal elements are the corresponding sample variances, and (c) the identity matrix ('identity'). The optimal shrinkage intensity determines how much the sample covariance matrix is shrunk towards the selected target matrix. Estimation of the corresponding optimal shrinkage intensities is discussed in Touloumis (2015). The function targetselection is designed to ease the selection of the target matrix.
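The three target matrices and the convex combination can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the data are simulated, the helper shrink_towards is a hypothetical name, and lambda is a hand-picked value rather than the optimal shrinkage intensity that shrinkcovmat estimates.

```r
# Illustrative sketch only: simulated data and a hand-picked lambda.
set.seed(1)
p <- 50  # number of variables (rows)
n <- 10  # number of subjects (columns), so p > n
x <- matrix(rnorm(p * n), nrow = p, ncol = n)

# Sample covariance matrix of the p variables (cov() expects subjects in rows).
s <- cov(t(x))
sample_variances <- diag(s)

# The three target matrices described in the text.
target_spherical <- mean(sample_variances) * diag(p)
target_diagonal <- diag(sample_variances)
target_identity <- diag(p)

# Hypothetical helper: convex combination of the sample covariance matrix
# and a target matrix, with lambda the weight placed on the target.
shrink_towards <- function(s, target, lambda) {
  stopifnot(lambda >= 0, lambda <= 1)
  (1 - lambda) * s + lambda * target
}

lambda <- 0.5  # assumed value in [0, 1], not an estimated intensity
sigma_shrunk <- shrink_towards(s, target_spherical, lambda)

# With p > n the sample covariance matrix is singular (smallest eigenvalue
# numerically zero), while the shrunk matrix has a strictly positive one.
min(eigen(s, symmetric = TRUE, only.values = TRUE)$values)
min(eigen(sigma_shrunk, symmetric = TRUE, only.values = TRUE)$values)
```

Setting lambda to 0 returns the sample covariance matrix and setting it to 1 returns the target, which is why an intensity strictly between the two keeps information from both.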
Anestis Touloumis
Maintainer: Anestis Touloumis <[email protected]>
Touloumis, A. (2015) Nonparametric Stein-type Shrinkage Covariance Matrix Estimators in High-Dimensional Settings. Computational Statistics & Data Analysis 83, 251–261.
Useful links:
Report bugs at https://github.com/AnestisTouloumis/ShrinkCovMat/issues
The dataset describes a colon cancer study (Alon et al., 1999) in which gene expression levels were measured on 40 normal tissues and on 22 tumor colon tissues. Note that a logarithmic (base 10) transformation has been applied to the gene expression levels.
colon
A data frame in which the rows correspond to 2000 genes and the columns to 62 tissues. The first 40 columns belong to the normal tissue group while the last 22 columns to the tumor colon tissue group.
http://genomics-pubs.princeton.edu/oncology/affydata/ [Last Accessed: 2016-05-21]
Alon, U., Barkai, N., Notterman, D.A., Gish, K., Ybarra, S., Mack, D. and Levine, A.J. (1999) Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proceedings of the National Academy of Sciences of the United States of America 96, 6745–6750.
library(ShrinkCovMat)
data("colon")
summary(colon)
Provides a nonparametric Stein-type shrinkage estimator of the covariance matrix that is a linear combination of the sample covariance matrix and of a target matrix.
shrinkcovmat(data, target = "spherical", centered = FALSE)
| Argument | Description |
|---|---|
| data | a numeric matrix containing the data. |
| target | a character indicating the target matrix. Options include 'spherical', 'identity' or 'diagonal'. |
| centered | a logical indicating if the mean vector is the zero vector. |
Options for the target matrix include the spherical sample covariance matrix (the diagonal matrix whose diagonal elements all equal the average of the sample variances), the diagonal sample covariance matrix (the diagonal matrix whose diagonal elements are the corresponding sample variances), and the identity matrix.
The rows of the data matrix data correspond to variables/features and the columns to subjects.
To select the target covariance matrix, see targetselection.
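Because rows are variables and columns are subjects, the expected layout is the transpose of the one that base R's cov() uses. A minimal sketch of the orientation check, using simulated data and hypothetical object names:

```r
# Hypothetical data stored the usual R way: subjects in rows, variables in columns.
set.seed(2)
subjects_by_variables <- matrix(rnorm(8 * 100), nrow = 8, ncol = 100)

# Transpose so that rows are variables and columns are subjects,
# which is the layout described in the text.
variables_by_subjects <- t(subjects_by_variables)
dim(variables_by_subjects)

# In this layout the p x p sample covariance matrix of the variables
# is obtained in base R with cov(t(.)).
sigma_sample <- cov(t(variables_by_subjects))
dim(sigma_sample)
```

Checking dim() before estimation is a cheap way to catch a transposed input, since both layouts are valid numeric matrices.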
Returns an object of the class 'shrinkcovmathat' that has components:
| Component | Description |
|---|---|
| Sigmahat | The Stein-type shrinkage estimator of the covariance matrix. |
| lambdahat | The estimated optimal shrinkage intensity. |
| Sigmasample | The sample covariance matrix. |
| Target | The target covariance matrix. |
| centered | Whether the data are centered around their mean vector. |
Anestis Touloumis
Touloumis, A. (2015) Nonparametric Stein-type Shrinkage Covariance Matrix Estimators in High-Dimensional Settings. Computational Statistics & Data Analysis 83, 251–261.
library(ShrinkCovMat)
data(colon)
normal_group <- colon[, 1:40]
sigma_hat_normal_group <- shrinkcovmat(normal_group, target = "spherical")
sigma_hat_normal_group
Implements the rule of thumb proposed by Touloumis (2015) for target matrix selection. If the estimated optimal shrinkage intensities of the three target matrices are of similar magnitude, then the average and the range of the sample variances should be inspected in order to adopt the most plausible target matrix.
targetselection(data, centered = FALSE)
| Argument | Description |
|---|---|
| data | a numeric matrix containing the data. |
| centered | a logical indicating if the mean vector is the zero vector. |
The rows of the data matrix data correspond to variables and the columns to subjects.
Prints the estimated optimal shrinkage intensities, the range and average of the sample variances and returns an object of the class 'targetsel' that has components:
| Component | Description |
|---|---|
| lambda_hat_spherical | The estimated optimal shrinkage intensity for the spherical target matrix. |
| lambda_hat_identity | The estimated optimal shrinkage intensity for the identity target matrix. |
| lambda_hat_diagonal | The estimated optimal shrinkage intensity for the diagonal target matrix. |
| range | The range of the sample variances. |
| average | The average of the sample variances. |
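The two sample-variance summaries can be reproduced in base R to see what is being inspected when the intensities are of similar magnitude. This is a sketch on simulated data with hypothetical object names; it does not assume whether targetselection reports the range as the smallest and largest values or as their difference, so both are computed.

```r
# Simulated data: rows are variables, columns are subjects.
set.seed(3)
p <- 30
n <- 12
x <- matrix(rnorm(p * n, sd = 2), nrow = p, ncol = n)

# Sample variance of each variable (each row).
sample_variances <- apply(x, 1, var)

# Average of the sample variances.
average_variance <- mean(sample_variances)

# Range of the sample variances, in two common forms.
variance_limits <- range(sample_variances)  # smallest and largest value
variance_width <- diff(variance_limits)     # largest minus smallest

average_variance
variance_limits
variance_width
```

In the package examples, a small range together with an average that is not close to one is read as pointing to the spherical target matrix.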
Anestis Touloumis
Touloumis, A. (2015) Nonparametric Stein-type Shrinkage Covariance Matrix Estimators in High-Dimensional Settings. Computational Statistics & Data Analysis 83, 251–261.
library(ShrinkCovMat)
data(colon)
normal_group <- colon[, 1:40]
targetselection(normal_group)
## Similar intensities, the range of the sample variances is small and the
## average is not close to one. The spherical matrix seems to be the
## most suitable target matrix for the normal group.
tumor_group <- colon[, 41:62]
targetselection(tumor_group)
## Similar intensities, the range of the sample variances is small and the
## average is not close to one. The spherical matrix seems to be the
## most suitable target matrix for the tumor group.