Package 'ShrinkCovMat'

Title: Shrinkage Covariance Matrix Estimators
Description: Provides nonparametric Steinian shrinkage estimators of the covariance matrix that are suitable in high-dimensional settings, that is, when the number of variables is larger than the sample size.
Authors: Anestis Touloumis [aut, cre]
Maintainer: Anestis Touloumis <[email protected]>
License: GPL-2 | GPL-3
Version: 2.1.0
Built: 2024-09-08 03:56:43 UTC
Source: https://github.com/anestistouloumis/shrinkcovmat


Shrinkage Covariance Matrix Estimators

Description

Provides nonparametric Stein-type shrinkage estimators of the covariance matrix that are suitable and statistically efficient when the number of variables is larger than the sample size. These estimators are non-singular and well-conditioned regardless of the dimensionality.

Details

Each of the implemented shrinkage covariance matrix estimators is a convex linear combination of the sample covariance matrix and of a target matrix.

The function shrinkcovmat implements three options for the target matrix: (a) the spherical sample covariance matrix (spherical), i.e. the diagonal matrix whose diagonal elements all equal the average of the sample variances, (b) the diagonal sample covariance matrix (diagonal), i.e. the diagonal matrix whose diagonal elements are the corresponding sample variances, and (c) the identity matrix (identity). The optimal shrinkage intensity determines how much the sample covariance matrix will be shrunk towards the selected target matrix.
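The convex combination described above can be sketched in base R. This is a minimal illustration and not the package's implementation: the helper `shrink_towards()`, the matrix `toy_data` and the fixed intensity `lambda = 0.3` are hypothetical, whereas shrinkcovmat estimates the intensity from the data.

```r
# Hypothetical helper: convex combination of the sample covariance matrix
# and a target matrix, for a user-supplied intensity lambda in [0, 1].
shrink_towards <- function(data,
                           target = c("spherical", "diagonal", "identity"),
                           lambda) {
  target <- match.arg(target)
  stopifnot(lambda >= 0, lambda <= 1)
  # Rows are variables and columns are subjects, so transpose for cov().
  sigma_sample <- cov(t(data))
  p <- nrow(sigma_sample)
  sample_variances <- diag(sigma_sample)
  target_matrix <- switch(target,
    spherical = diag(mean(sample_variances), p),
    diagonal = diag(sample_variances, p),
    identity = diag(1, p)
  )
  # lambda = 0 returns the sample covariance, lambda = 1 returns the target.
  (1 - lambda) * sigma_sample + lambda * target_matrix
}

set.seed(1)
toy_data <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5) # 20 variables, 5 subjects
sigma_sample <- cov(t(toy_data))
sigma_shrunk <- shrink_towards(toy_data, target = "spherical", lambda = 0.3)
```

With more variables than subjects the sample covariance matrix is singular, while the combination with a positive definite target is not, which is the property the Description refers to.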

Estimation of the corresponding optimal shrinkage intensities is discussed in Touloumis (2015). The function targetselection is designed to ease the selection of the target matrix.

Author(s)

Anestis Touloumis

Maintainer: Anestis Touloumis <[email protected]>

References

Touloumis, A. (2015) Nonparametric Stein-type Shrinkage Covariance Matrix Estimators in High-Dimensional Settings. Computational Statistics & Data Analysis 83, 251–261.


Colon Cancer Dataset

Description

The dataset describes a colon cancer study (Alon et al., 1999) in which gene expression levels were measured on 40 normal tissues and on 22 tumor colon tissues. Note that a logarithmic (base 10) transformation has been applied to the gene expression levels.

Usage

colon

Format

A data frame in which the rows correspond to 2000 genes and the columns to 62 tissues. The first 40 columns belong to the normal tissue group while the last 22 columns to the tumor colon tissue group.

Source

http://genomics-pubs.princeton.edu/oncology/affydata/ [Last Accessed: 2016-05-21]

References

Alon, U., Barkai, N., Notterman, D.A., Gish, K., Ybarra, S., Mack, D. and Levine, A.J. (1999) Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proceedings of the National Academy of Sciences of the United States of America 96, 6745–6750.

Examples

data("colon")
summary(colon)

Linear Shrinkage of the Sample Covariance

Description

Provides a nonparametric Stein-type shrinkage estimator of the covariance matrix that is a linear combination of the sample covariance matrix and of a target matrix.

Usage

shrinkcovmat(data, target = "spherical", centered = FALSE)

Arguments

data

a numeric matrix containing the data.

target

a character indicating the target matrix. Options include 'spherical', 'identity' or 'diagonal'.

centered

a logical indicating if the mean vector is the zero vector.
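As a hedged illustration of what a zero mean vector typically implies for the sample covariance matrix: the mean need not be estimated, so the cross-products are averaged over all subjects without subtracting the row means. The sketch uses base R only; whether shrinkcovmat computes the centered case in exactly this way is an assumption, and `toy_data` is a hypothetical matrix.

```r
set.seed(2)
toy_data <- matrix(rnorm(6 * 10), nrow = 6, ncol = 10) # 6 variables, 10 subjects
n_subjects <- ncol(toy_data)

# Unknown mean vector: cov() subtracts the means and divides by N - 1.
sigma_unknown_mean <- cov(t(toy_data))

# Mean vector assumed to be zero: no centering, divide by N.
sigma_zero_mean <- tcrossprod(toy_data) / n_subjects

# The two versions differ by a rank-one term built from the row means.
row_means <- rowMeans(toy_data)
sigma_rebuilt <- (n_subjects * sigma_zero_mean -
  n_subjects * tcrossprod(row_means)) / (n_subjects - 1)
```

Here `sigma_rebuilt` recovers `sigma_unknown_mean` up to floating-point error, which makes explicit that the two estimators differ only through the estimated mean.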

Details

Options for the target matrix include (a) the spherical sample covariance matrix (the diagonal matrix with diagonal elements the average of the sample variances), (b) the diagonal sample covariance matrix (the diagonal matrix with diagonal elements the corresponding sample variances), and (c) the identity matrix.

The rows of the data matrix data correspond to variables/features and the columns to subjects.
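Data sets are often stored with subjects in rows, in which case they need to be transposed first. A minimal sketch with a hypothetical matrix `subjects_by_variables`; the call to shrinkcovmat is left commented out so that the snippet runs without the package installed.

```r
set.seed(3)
# Hypothetical input: 8 subjects in rows, 30 variables in columns.
subjects_by_variables <- matrix(rnorm(8 * 30), nrow = 8, ncol = 30)

# Transpose so that rows are variables and columns are subjects.
variables_by_subjects <- t(subjects_by_variables)
dim(variables_by_subjects)

# sigma_hat <- shrinkcovmat(variables_by_subjects, target = "diagonal")
```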

To select the target covariance matrix see targetselection.

Value

Returns an object of the class 'shrinkcovmathat' that has components:

Sigmahat

The Stein-type shrinkage estimator of the covariance matrix.

lambdahat

The estimated optimal shrinkage intensity.

Sigmasample

The sample covariance matrix.

Target

The target covariance matrix.

centered

A logical indicating whether the mean vector was taken to be the zero vector.

Author(s)

Anestis Touloumis

References

Touloumis, A. (2015) Nonparametric Stein-type Shrinkage Covariance Matrix Estimators in High-Dimensional Settings. Computational Statistics & Data Analysis 83, 251–261.

See Also

targetselection.

Examples

data(colon)
normal_group <- colon[, 1:40]
sigma_hat_normal_group <- shrinkcovmat(normal_group, target = "spherical")
sigma_hat_normal_group

Target Matrix Selection

Description

Implements the rule of thumb proposed by Touloumis (2015) for target matrix selection. If the estimated optimal shrinkage intensities of the three target matrices are of similar magnitude, then the average and the range of the sample variances should be inspected in order to adopt the most plausible target matrix.
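The two summaries of the sample variances mentioned above can be computed directly in base R. This sketch reproduces only those summaries for a hypothetical matrix `toy_data`; it does not estimate the shrinkage intensities that targetselection reports. Taking the range as the difference between the largest and the smallest sample variance is an assumption.

```r
set.seed(4)
# Hypothetical data: 50 variables in rows, 12 subjects in columns.
toy_data <- matrix(rnorm(50 * 12, sd = 2), nrow = 50, ncol = 12)

# Sample variance of each variable (one value per row).
sample_variances <- apply(toy_data, 1, var)

variance_summary <- c(
  range = diff(range(sample_variances)),
  average = mean(sample_variances)
)
variance_summary
```

In the Examples of this help page, a small range together with an average that is not close to one is read as support for the spherical target matrix.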

Usage

targetselection(data, centered = FALSE)

Arguments

data

a numeric matrix containing the data.

centered

a logical indicating if the mean vector is the zero vector.

Details

The rows of the data matrix data correspond to variables and the columns to subjects.

Value

Prints the estimated optimal shrinkage intensities, the range and average of the sample variances and returns an object of the class 'targetsel' that has components:

lambda_hat_spherical

The estimated optimal shrinkage intensity for the spherical target matrix.

lambda_hat_identity

The estimated optimal shrinkage intensity for the identity target matrix.

lambda_hat_diagonal

The estimated optimal shrinkage intensity for the diagonal target matrix.

range

The range of the sample variances.

average

The average of the sample variances.

Author(s)

Anestis Touloumis

References

Touloumis, A. (2015) Nonparametric Stein-type Shrinkage Covariance Matrix Estimators in High-Dimensional Settings. Computational Statistics & Data Analysis 83, 251–261.

Examples

data(colon)
normal_group <- colon[, 1:40]
targetselection(normal_group)
## Similar intensities, the range of the sample variances is small and the
## average is not close to one. The spherical matrix seems to be the
## most suitable target matrix for the normal group.

tumor_group <- colon[, 41:62]
targetselection(tumor_group)
## Similar intensities, the range of the sample variances is small and the
## average is not close to one. The spherical matrix seems to be the
## most suitable target matrix for the tumor group.