Title: | Nonparametric Estimation of Toeplitz Covariance Matrices |
Version: | 0.2 |
Description: | A nonparametric method to estimate Toeplitz covariance matrices from a sample of n independently and identically distributed p-dimensional vectors with mean zero. The data is preprocessed with the discrete cosine matrix and a variance stabilization transformation to obtain an approximate Gaussian regression setting for the log-spectral density function. Estimates of the spectral density function and the inverse of the covariance matrix are provided as well. Functions for simulating data and a protein data example are included. For details see (Klockmann, Krivobokova; 2023), <doi:10.48550/arXiv.2303.10018>. |
License: | GPL-2 |
Encoding: | UTF-8 |
RoxygenNote: | 7.2.1 |
Imports: | dtt, MASS, nlme |
Suggests: | testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
Depends: | R (≥ 3.5.0) |
LazyData: | true |
LazyDataCompression: | xz |
NeedsCompilation: | no |
Packaged: | 2023-07-04 20:14:35 UTC; klockmann |
Author: | Karolina Klockmann [aut, cre], Tatyana Krivobokova [aut] |
Maintainer: | Karolina Klockmann <karolina.klockmann@gmx.de> |
Repository: | CRAN |
Date/Publication: | 2023-07-06 07:30:02 UTC |
Periodic Demmler-Reinsch Basis
Description
Calculates the periodic Demmler-Reinsch basis for a given smoothness and a given vector of grid points. For details see (Schwarz, Krivobokova; 2016).
Usage
DR.basis(x, n, q)
Arguments
x |
|
n |
dimension of the basis |
q |
penalization order, |
Value
m x n dimensional matrix with the n DR basis functions evaluated at grid points x
Examples
DR.basis(seq(1,10)/10,5,2)
Data Examples
Description
example1, example2 and example3 generate i.i.d. vectors from a given distribution with different Toeplitz covariance matrices.
The covariance function \sigma of the Toeplitz covariance matrix of
example1: has a polynomial decay, \sigma(\tau) = sd^2 (1+|\tau|)^{-gamma},
example2: follows an ARMA(2,2) model with coefficients (0.7,-0.4,-0.2,0.2) and innovations variance sd^2,
example3: yields a Lipschitz continuous spectral density f that is not differentiable, i.e. f(x) = sd^2 ({|\sin(x+0.5\pi)|^{gamma}+0.45})
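The formulas for example1 and example3 can be evaluated directly in base R. The following is a minimal sketch and does not call the package: the helper names acf_poly and sdf_ex3, the grid on [0, pi] and the small dimension p = 5 are assumptions made for illustration only.

```r
# Hypothetical helpers (not part of the package) that evaluate the
# formulas stated in the description.

# Polynomial-decay covariance function of example1
acf_poly <- function(tau, sd, gamma) sd^2 * (1 + abs(tau))^(-gamma)

# Spectral density of example3
sdf_ex3 <- function(x, sd, gamma) sd^2 * (abs(sin(x + 0.5 * pi))^gamma + 0.45)

p <- 5
sigma <- acf_poly(0:(p - 1), sd = 1, gamma = 1.2)

# A Toeplitz matrix is fully determined by its first row/column
Sigma <- toeplitz(sigma)

# Evaluate the example3 density on an assumed grid in [0, pi]
f_vals <- sdf_ex3(seq(0, pi, length.out = 5), sd = 1, gamma = 2)
```

With sd = 1 the diagonal of Sigma equals 1, and the density reaches its minimum sd^2 * 0.45 at x = pi/2, which is where it has its kink.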
Usage
example1(p, n, sd, gamma, family = "Gaussian")
example2(p, n, sd, family = "Gaussian")
example3(p, n, sd, gamma, family = "Gaussian")
Arguments
p |
vector length |
n |
sample size |
sd |
standard deviation |
gamma |
polynomial decay of covariance function for |
family |
distribution of the simulated data. Available distributions are " |
Value
A list containing the following elements:
Y: p x n dimensional data matrix
sdf: true spectral density function
acf: true covariance function
Examples
example1(p=10, n=1, sd=1, gamma=1.2, family="Gaussian")
example2(p=10,n=1,sd=1,family="Gaussian")
example3(p=10, n=1, sd=1, gamma=2,family="Gaussian")
Data Transformation
Description
Applies the Discrete Cosine I transform, data binning and the variance stabilizing transform function to the data.
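For orientation, one common orthonormal form of the DCT-I matrix can be written down in a few lines of base R. This is a sketch under assumptions: the package imports dtt for the transform and may use a different scaling, and the function name dct1_matrix is hypothetical.

```r
# Hypothetical helper: orthonormal DCT-I matrix of dimension p (p >= 2).
# The scaling below is one common convention and is an assumption here;
# it is not taken from the package.
dct1_matrix <- function(p) {
  idx <- 0:(p - 1)
  w <- rep(1, p)
  w[c(1, p)] <- 1 / sqrt(2)  # down-weight the first and last index
  sqrt(2 / (p - 1)) * outer(w, w) * cos(pi * outer(idx, idx) / (p - 1))
}

D <- dct1_matrix(6)

# With this scaling D is symmetric and orthogonal, so it is its own inverse
max_dev <- max(abs(D %*% D - diag(6)))
```

Because the matrix is its own inverse under this scaling, applying it twice to a vector returns the original vector up to rounding error.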
Usage
Data.trafo(y, Te, dct.out = FALSE)
Arguments
y |
|
Te |
number of bins for data binning. |
dct.out |
logical. If |
Value
A list containing the following elements:
m: number of data points per bin, that is m=n*round(p/Te). If p/Te is not an integer, the first/last bin may contain more than m data points.
y.star: 2Te-2 dimensional vector with binned, variance stabilized and mirrored data. The bin number Te may be modified to guarantee at least two data points per bin. If p/Te is not an integer, the vector dimension is 2*floor(p/round(p/Te))-2.
dct.matrix: p-dim. DCT-I matrix (if dct.out=TRUE)
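The size formulas stated for m and y.star can be checked with a little arithmetic. The sketch below only evaluates those formulas. It does not call Data.trafo, the helper name bin_sizes is hypothetical, and it ignores the possible adjustment of Te that guarantees at least two data points per bin.

```r
# Hypothetical helper: evaluates the documented formulas for m and for
# the length of y.star. It does not reproduce any internal adjustment of Te.
bin_sizes <- function(p, n, Te) {
  width <- round(p / Te)  # grid points per bin
  m <- n * width          # data points per bin
  dim_y_star <- if (p %% Te == 0) {
    2 * Te - 2
  } else {
    2 * floor(p / width) - 2
  }
  list(m = m, dim_y_star = dim_y_star)
}

r1 <- bin_sizes(p = 100, n = 1, Te = 10)  # m = 10, dim_y_star = 18
r2 <- bin_sizes(p = 100, n = 1, Te = 30)  # m = 3,  dim_y_star = 64
```

In the second call p/Te is not an integer, so the stated formula gives a vector of length 64 rather than 2*30-2 = 58.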
Toeplitz Covariance and Precision Matrix Estimator
Description
Estimates the Toeplitz covariance matrix, the inverse matrix and the spectral density from a sample of n i.i.d. p-dimensional vectors with mean zero.
Usage
Toep.estimator(y, Te, q, method, f.true = NULL)
Arguments
y |
|
Te |
number of bins for data binning. |
q |
penalization order, |
method |
method used to select the smoothing parameter of the smoothing spline. Available methods are restricted maximum likelihood " |
f.true |
Te-dimensional vector with the true spectral density function evaluated at equispaced points in [0, |
Value
A list containing the following elements:
toep: p-dim. Toeplitz covariance matrix
toep.inv: p-dim. precision matrix
acf: p-dim. vector with the covariance function
sdf: p-dim. vector with the spectral density in the interval [0,1]
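The relation between a covariance function, its Toeplitz covariance matrix and the precision matrix can be illustrated in base R. This is a sketch only and not a description of how Toep.estimator computes its output: the covariance function acf_vals is a made-up AR(1)-type sequence, and solve() stands in for whatever inversion the package actually uses.

```r
# Hypothetical covariance function with geometric decay (AR(1)-type);
# the values are illustrative and are not output of Toep.estimator.
acf_vals <- 1.44 * 0.5^(0:4)

# Toeplitz covariance matrix built from the covariance function
toep <- toeplitz(acf_vals)

# Precision matrix by direct inversion (assumption: plain solve())
toep.inv <- solve(toep)

# Sanity checks: positive definiteness and inverse property
min_eig <- min(eigen(toep, symmetric = TRUE)$values)
inv_err <- max(abs(toep %*% toep.inv - diag(5)))
```

For this particular geometric decay the precision matrix is tridiagonal up to rounding error, which shows that a dense Toeplitz covariance can have a sparse inverse.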
Examples
# EXAMPLE 1: Simulate Gaussian ARMA(2,2)
library(nlme)
library(MASS)
p = 100  # vector length
n = 1    # sample size
# True Toeplitz matrix: ARMA(2,2) correlation matrix scaled by 1.44
Sigma = 1.44 * corMatrix(Initialize(corARMA(c(0.7, -0.4, -0.2, 0.2), p = 2, q = 2), data = diag(1:p)))
Y = matrix(mvrnorm(n, mu = numeric(p), Sigma = Sigma), n, p)
fit.toep = Toep.estimator(y = Y, Te = 10, q = 2, method = "GCV")$toep
# EXAMPLE 2: AQUAPORIN DATA
data(aquaporin)
n = length(aquaporin$Y)
# Use the first 1% of the observations and center them
y.train = aquaporin$Y[1:(0.01 * n)]
y.train = y.train - mean(y.train)
fit.toep = Toep.estimator(y = y.train, Te = 10, q = 1, method = "ML")$toep
Aquaporin Dataset
Description
Dataset with molecular dynamics simulations for the yeast aquaporin (Aqy1), the gated water channel of the yeast Pichia pastoris. The dataset contains only the diameter Y of the channel, which is used in the data analysis in (Klockmann, Krivobokova; 2023). The diameter Y is measured as the distance between the centers of mass of certain residues of the protein. The dataset covers a 100 nanosecond time frame, split into 20000 equidistant observations. The full dataset, including the Euclidean coordinates of all 783 atoms, is available from the authors. For more details see (Klockmann, Krivobokova; 2023).
Usage
aquaporin
Format
A data frame with 20000 rows and 1 variable:
Y: the diameter of the channel
Source
See (Klockmann, Krivobokova; 2023).
Examples
data(aquaporin)