Type: | Package |
Title: | Analysis of Proportional Hazards Model with Sparse Longitudinal Covariates |
Version: | 1.5 |
Date: | 2025-05-04 |
Author: | Hongyuan Cao [aut], Mathew M. Churpek [aut], Donglin Zeng [aut], Jason P. Fine [aut], Shannon T. Holloway [aut, cre] |
Maintainer: | Shannon T. Holloway <shannon.t.holloway@gmail.com> |
Description: | Provides kernel weighting methods for estimation of proportional hazards models with intermittently observed longitudinal covariates. Cao H., Churpek M. M., Zeng D., and Fine J. P. (2015) <doi:10.1080/01621459.2014.957289>. |
License: | GPL-2 |
Depends: | R (≥ 4.1.0) |
Imports: | stats |
NeedsCompilation: | no |
Encoding: | UTF-8 |
RoxygenNote: | 7.2.3 |
Collate: | 'betaEst.R' 'dataset.R' 'local_kernel.R' 'scoreNVCF.R' 'scoreLVCF.R' 'scoreHalf.R' 'scoreFull.R' 'preprocessInputs.R' 'kernelFixed.R' 'kernelAuto.R' 'fullKernel.R' 'halfKernel.R' 'lastValue.R' 'nearValue.R' |
Packaged: | 2025-05-04 15:47:01 UTC; 19194 |
Repository: | CRAN |
Date/Publication: | 2025-05-04 16:00:02 UTC |
Generated Sparse Longitudinal Data
Description
For the purposes of the package examples, the dataset was adapted from the numerical simulations of the original manuscript.
Format
X is a data frame with 400 observations on the following 3 variables.
ID
patient identifier; there are 400 patients.
Time
the time to event or censoring
Delta
a numeric vector with 0 denoting censoring and 1 event
Z is a data frame with 3237 observations on the following 3 variables.
ID
patient identifier; there are 400 patients.
obsTime
the covariate observation times.
x1
the covariate generated through a piecewise constant function.
Details
Data were generated for 400 subjects. The total number of covariate observation times was Poisson distributed with intensity rate 8. The covariate observation times were generated independently from a uniform distribution Unif(0,1). The covariate process is piecewise constant, with values being multivariate normal with mean 0, variance 1, and correlation \exp(-|i - j|/20). The survival times were generated from the Cox model \lambda(t | Z(r), r \le t) = \lambda_0 \exp(\beta Z(t)), where \beta = 1.5 and \lambda_0 = 1.0. Covariates are in dataset Z. Event times and indicators are in dataset X.
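The generating mechanism described above can be sketched in base R. This is an illustrative sketch only, not the script used to build the packaged dataset. Several details are assumptions that the text does not state: the covariate process is taken to be constant on 100 equal-width pieces of [0, 1], the Poisson count is drawn per subject, subjects without an event by time 1 are censored at 1, and the names n_grid, simulate_subject, X_sim, and Z_sim are hypothetical.

```r
set.seed(1)

n_grid  <- 100L   # assumed number of constant pieces on [0, 1]
beta    <- 1.5
lambda0 <- 1.0

# Correlation exp(-|i - j| / 20) between the piecewise covariate values
idx   <- seq_len(n_grid)
Sigma <- exp(-abs(outer(idx, idx, "-")) / 20)
R     <- chol(Sigma)   # upper triangular, t(R) %*% R equals Sigma

simulate_subject <- function(id) {
  # One draw from N(0, Sigma): the covariate value on each piece
  z <- drop(crossprod(R, rnorm(n_grid)))

  # Piecewise-constant hazard and its cumulative value at each piece end
  haz <- lambda0 * exp(beta * z)
  H   <- cumsum(haz) / n_grid

  # Invert the cumulative hazard at a unit-exponential draw
  e <- rexp(1)
  if (e > H[n_grid]) {
    time  <- 1   # assumed administrative censoring at time 1
    delta <- 0
  } else {
    k      <- which(H >= e)[1]
    H_prev <- if (k > 1L) H[k - 1L] else 0
    time   <- (k - 1) / n_grid + (e - H_prev) / haz[k]
    delta  <- 1
  }

  # Sparse observation times: Poisson(8) count, Unif(0, 1) locations
  m     <- rpois(1, 8)
  obs   <- sort(runif(m))
  piece <- pmin(floor(obs * n_grid) + 1L, n_grid)

  list(
    X = data.frame(ID = id, Time = time, Delta = delta),
    Z = data.frame(ID = rep(id, m), obsTime = obs, x1 = z[piece])
  )
}

sim   <- lapply(seq_len(400L), simulate_subject)
X_sim <- do.call(rbind, lapply(sim, `[[`, "X"))
Z_sim <- do.call(rbind, lapply(sim, `[[`, "Z"))
```

The resulting X_sim and Z_sim have the same column layout as X and Z, so they can stand in for the packaged data when experimenting with other parameter values.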
References
Cao H., Churpek M. M., Zeng D., Fine J. P. (2015). Analysis of the proportional hazards model with sparse longitudinal covariates. Journal of the American Statistical Association, 110, 1187-1196.
Full Kernel Estimation with Forward and Backward Lagged Covariates
Description
A kernel weighting scheme to evaluate the effects of longitudinal covariates on the occurrence of events when the time-dependent covariates are measured intermittently. Regression parameter estimation uses full kernel imputation of missing values with both forward and backward lagged covariates.
Usage
fullKernel(
X,
Z,
tau,
kType = c("epan", "uniform", "gauss"),
bw = NULL,
tol = 0.001,
maxiter = 100L,
verbose = TRUE
)
Arguments
X |
An object of class data.frame. The structure of the data.frame must be {patient ID, event time, event indicator}. Patient IDs must be of class integer or be able to be coerced to class integer without loss of information. Missing values must be indicated as NA. The event indicator is 1 if the event occurred; 0 if censored. |
Z |
An object of class data.frame. The structure of the data.frame must be {patient ID, time of measurement, measurement(s)}. Patient IDs must be of class integer or be able to be coerced to class integer without loss of information. Missing values must be indicated as NA. |
tau |
An object of class numeric. The desired time point. |
kType |
An object of class character indicating the type of smoothing kernel to use in the estimating equation. Must be one of {"epan", "uniform", "gauss"}, where "epan" is the Epanechnikov kernel and "gauss" is the Gaussian kernel. |
bw |
NULL or a numeric vector. If provided, the bandwidths for which
parameter estimates are to be obtained. If NULL, an optimal bandwidth will
be determined using an adaptive selection procedure. The range of the
bandwidth search space is taken to be
|
tol |
An object of class numeric. The minimum change in the regression parameters deemed to indicate convergence of the Newton-Raphson method. |
maxiter |
An object of class integer. The maximum number of iterations used to estimate regression parameters. |
verbose |
An object of class logical. TRUE results in progress screen prints. |
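As an illustration of the input layout described for X and Z, the following sketch builds two small data frames in the required column order and runs a few sanity checks. The data values, the names X_toy and Z_toy, and the helper ids_ok are hypothetical. The checks are a suggestion only and are not necessarily the ones the package performs internally.

```r
# Hypothetical toy inputs: {patient ID, event time, event indicator}
X_toy <- data.frame(
  ID    = c(1L, 2L, 3L),
  Time  = c(0.42, 1.00, 0.77),
  Delta = c(1, 0, 1)
)

# Hypothetical toy inputs: {patient ID, time of measurement, measurement(s)}
Z_toy <- data.frame(
  ID      = c(1L, 1L, 2L, 3L, 3L),
  obsTime = c(0.10, 0.35, 0.50, 0.20, 0.70),
  x1      = c(-0.4, 1.2, 0.3, 0.8, -1.1)
)

# TRUE when numeric IDs can be coerced to integer without loss of information
ids_ok <- function(id) {
  is.numeric(id) && !anyNA(id) && all(id == as.integer(id))
}

stopifnot(
  ids_ok(X_toy$ID),
  ids_ok(Z_toy$ID),
  all(X_toy$Delta %in% c(0, 1)),   # 1 = event, 0 = censored
  all(Z_toy$ID %in% X_toy$ID)      # assumed: every measurement belongs to a subject in X
)
```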
Value
A list is returned. If bandwidths are provided, each element is a matrix, where the ith row corresponds to the ith bandwidth of input argument bw, and the columns correspond to the model parameters. If the bandwidth is determined internally, each element of the list is a named vector calculated at the optimal bandwidth.
betaHat: The estimated model coefficients.
stdErr: The standard error for each coefficient.
zValue: The estimated z-value for each coefficient.
pValue: The p-value for each coefficient.
If the bandwidth is determined internally, three additional list elements are returned:
optBW: The estimated optimal bandwidth.
minMSE: The mean squared error at the optimal bandwidth.
MSE: The vector of MSE for each bandwidth.
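The returned elements betaHat, stdErr, zValue, and pValue are related in the usual way for a Wald-type test. The sketch below assumes a z statistic equal to the estimate divided by its standard error and a two-sided normal p-value. The source does not state the exact formulas, and the numbers are hypothetical rather than package output.

```r
# Hypothetical estimate and standard error for a single covariate
betaHat <- c(x1 = 1.42)
stdErr  <- c(x1 = 0.21)

# Assumed Wald-type z statistic and two-sided normal p-value
zValue <- betaHat / stdErr
pValue <- 2 * pnorm(-abs(zValue))

print(rbind(betaHat, stdErr, zValue, pValue))
```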
References
Cao H., Churpek M. M., Zeng D., Fine J. P. (2015). Analysis of the proportional hazards model with sparse longitudinal covariates. Journal of the American Statistical Association, 110, 1187-1196.
See Also
halfKernel, lastValue, nearValue
Examples
data(SurvLongData)
exp <- fullKernel(X = X, Z = Z, tau = 1.0, bw = 0.015)
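To illustrate the kernel choices named by kType, the sketch below writes out the standard textbook forms of the Epanechnikov, uniform, and Gaussian kernels and a bandwidth-scaled version. The function names kernel_fun and scaled_kernel are hypothetical, and the package's internal definitions may differ in scaling or support.

```r
# Standard textbook kernels; not taken from the package source
kernel_fun <- function(u, kType = c("epan", "uniform", "gauss")) {
  kType <- match.arg(kType)
  switch(kType,
    epan    = 0.75 * (1 - u^2) * (abs(u) <= 1),
    uniform = 0.5 * (abs(u) <= 1),
    gauss   = dnorm(u)
  )
}

# Scaled kernel K_h(u) = K(u / h) / h for bandwidth h
scaled_kernel <- function(u, h, kType = "epan") {
  kernel_fun(u / h, kType) / h
}

# Weights for measurements at times s_obs around an event time t_event.
# With a full (two-sided) kernel, measurements both before and after
# t_event receive weight when they fall within the bandwidth.
t_event <- 0.50
s_obs   <- c(0.47, 0.495, 0.505, 0.60)
w_full  <- scaled_kernel(s_obs - t_event, h = 0.015)
print(w_full)
```

With a compactly supported kernel such as "epan", measurements farther than one bandwidth from the event time receive zero weight, which is why the bandwidth matters so much when covariates are observed sparsely.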
Half Kernel Estimation with Backward Lagged Covariates
Description
A kernel weighting scheme to evaluate the effects of longitudinal covariates on the occurrence of events when the time-dependent covariates are measured intermittently. Regression parameter estimation uses half kernel imputation of missing values with backward lagged covariates.
Usage
halfKernel(
X,
Z,
tau,
kType = c("epan", "uniform", "gauss"),
bw = NULL,
tol = 0.001,
maxiter = 100L,
verbose = TRUE
)
Arguments
X |
An object of class data.frame. The structure of the data.frame must be {patient ID, event time, event indicator}. Patient IDs must be of class integer or be able to be coerced to class integer without loss of information. Missing values must be indicated as NA. The event indicator is 1 if the event occurred; 0 if censored. |
Z |
An object of class data.frame. The structure of the data.frame must be {patient ID, time of measurement, measurement(s)}. Patient IDs must be of class integer or be able to be coerced to class integer without loss of information. Missing values must be indicated as NA. |
tau |
An object of class numeric. The desired time point. |
kType |
An object of class character indicating the type of smoothing kernel to use in the estimating equation. Must be one of {"epan", "uniform", "gauss"}, where "epan" is the Epanechnikov kernel and "gauss" is the Gaussian kernel. |
bw |
NULL or a numeric vector. If provided, the bandwidths for which
parameter estimates are to be obtained. If NULL, an optimal bandwidth will
be determined using an adaptive selection procedure. The range of the
bandwidth search space is taken to be
|
tol |
An object of class numeric. The minimum change in the regression parameters deemed to indicate convergence of the Newton-Raphson method. |
maxiter |
An object of class integer. The maximum number of iterations used to estimate regression parameters. |
verbose |
An object of class logical. TRUE results in progress screen prints. |
Value
A list is returned. If bandwidths are provided, each element is a matrix, where the ith row corresponds to the ith bandwidth of input argument bw, and the columns correspond to the model parameters. If the bandwidth is determined internally, each element of the list is a named vector calculated at the optimal bandwidth.
betaHat: The estimated model coefficients.
stdErr: The standard error for each coefficient.
zValue: The estimated z-value for each coefficient.
pValue: The p-value for each coefficient.
If the bandwidth is determined internally, three additional list elements are returned:
optBW: The estimated optimal bandwidth.
minMSE: The mean squared error at the optimal bandwidth.
MSE: The vector of MSE for each bandwidth.
References
Cao H., Churpek M. M., Zeng D., Fine J. P. (2015). Analysis of the proportional hazards model with sparse longitudinal covariates. Journal of the American Statistical Association, 110, 1187-1196.
See Also
fullKernel, lastValue, nearValue
Examples
data(SurvLongData)
exp <- halfKernel(X = X, Z = Z, tau = 1.0, bw = 0.015)
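The idea of using only backward lagged covariates can be sketched as a one-sided kernel weight: measurements taken at or before the time of interest are weighted by their lag, and later measurements receive zero weight. This is a hedged illustration with an Epanechnikov kernel. The names epan and half_kernel_weights are hypothetical, and the package may normalize the one-sided kernel differently.

```r
# Textbook Epanechnikov kernel
epan <- function(u) 0.75 * (1 - u^2) * (abs(u) <= 1)

# One-sided weights: only measurements at or before time t contribute
half_kernel_weights <- function(s, t, h) {
  lag <- t - s   # non-negative when the measurement precedes t
  ifelse(lag >= 0, epan(lag / h) / h, 0)
}

w_half <- half_kernel_weights(
  s = c(0.47, 0.495, 0.505, 0.60),
  t = 0.50,
  h = 0.015
)
print(w_half)
```

Only the measurement at 0.495 receives positive weight here: 0.47 is more than one bandwidth before the event time, and 0.505 and 0.60 come after it.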
Last Value Carried Forward Method
Description
A simple approach to evaluate the effects of longitudinal covariates on the occurrence of events when the time-dependent covariates are measured intermittently. Regression parameters are estimated using last value carried forward imputation of missing values.
Usage
lastValue(X, Z, tau, tol = 0.001, maxiter = 100L, verbose = TRUE)
Arguments
X |
An object of class data.frame. The structure of the data.frame must be {patient ID, event time, event indicator}. Patient IDs must be of class integer or be able to be coerced to class integer without loss of information. Missing values must be indicated as NA. The event indicator is 1 if the event occurred; 0 if censored. |
Z |
An object of class data.frame. The structure of the data.frame must be {patient ID, time of measurement, measurement(s)}. Patient IDs must be of class integer or be able to be coerced to class integer without loss of information. Missing values must be indicated as NA. |
tau |
An object of class numeric. The desired time point. |
tol |
An object of class numeric. The minimum change in the regression parameters deemed to indicate convergence of the Newton-Raphson method. |
maxiter |
An object of class integer. The maximum number of iterations used to estimate regression parameters. |
verbose |
An object of class logical. TRUE results in progress screen prints. |
Value
A list containing:
betaHat: The estimated model coefficients.
stdErr: The standard error for each coefficient.
zValue: The estimated z-value for each coefficient.
pValue: The p-value for each coefficient.
References
Cao H., Churpek M. M., Zeng D., Fine J. P. (2015). Analysis of the proportional hazards model with sparse longitudinal covariates. Journal of the American Statistical Association, 110, 1187-1196.
See Also
fullKernel, halfKernel, nearValue
Examples
data(SurvLongData)
# A truncated dataset to keep example run time brief
exp <- lastValue(X = X[1:200,], Z = Z, tau = 1.0)
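Last value carried forward imputation for a single subject can be sketched in base R as follows. The data and the helper name last_value are hypothetical, and this is not the package's internal code. Times that precede the first measurement have no earlier value to carry forward, so the sketch returns NA for them.

```r
# Hypothetical measurements for one subject
obsTime <- c(0.10, 0.35, 0.80)
x1      <- c(-0.4, 1.2, 0.3)

# Covariate value from the most recent measurement at or before each time t
last_value <- function(t, obsTime, x) {
  ord <- order(obsTime)
  k   <- findInterval(t, obsTime[ord])   # count of measurements at or before t
  out <- rep(NA_real_, length(t))
  out[k > 0] <- x[ord][k[k > 0]]
  out
}

lvcf <- last_value(c(0.05, 0.35, 0.60, 0.95), obsTime, x1)
print(lvcf)
# NA 1.2 1.2 0.3
```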
Nearest Value Method
Description
A simple approach to evaluate the effects of longitudinal covariates on the occurrence of events when the time-dependent covariates are measured intermittently. Regression parameters are estimated using the nearest value to impute missing values.
Usage
nearValue(X, Z, tau, tol = 0.001, maxiter = 100L, verbose = TRUE)
Arguments
X |
An object of class data.frame. The structure of the data.frame must be {patient ID, event time, event indicator}. Patient IDs must be of class integer or be able to be coerced to class integer without loss of information. Missing values must be indicated as NA. The event indicator is 1 if the event occurred; 0 if censored. |
Z |
An object of class data.frame. The structure of the data.frame must be {patient ID, time of measurement, measurement(s)}. Patient IDs must be of class integer or be able to be coerced to class integer without loss of information. Missing values must be indicated as NA. |
tau |
An object of class numeric. The desired time point. |
tol |
An object of class numeric. The minimum change in the regression parameters deemed to indicate convergence of the Newton-Raphson method. |
maxiter |
An object of class integer. The maximum number of iterations used to estimate regression parameters. |
verbose |
An object of class logical. TRUE results in progress screen prints. |
Value
A list containing:
betaHat: The estimated model coefficients.
stdErr: The standard error for each coefficient.
zValue: The estimated z-value for each coefficient.
pValue: The p-value for each coefficient.
References
Cao H., Churpek M. M., Zeng D., Fine J. P. (2015). Analysis of the proportional hazards model with sparse longitudinal covariates. Journal of the American Statistical Association, 110, 1187-1196.
See Also
fullKernel, halfKernel, lastValue
Examples
data(SurvLongData)
# A truncated dataset to keep example run time brief
exp <- nearValue(X = X[1:100,], Z = Z, tau = 1.0)
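Nearest value imputation for a single subject can be sketched in base R as follows. The data and the helper name nearest_value are hypothetical, and this is not the package's internal code. In this sketch, ties between an earlier and a later measurement resolve to the first one in the vector, which is an arbitrary choice.

```r
# Hypothetical measurements for one subject
obsTime <- c(0.10, 0.35, 0.80)
x1      <- c(-0.4, 1.2, 0.3)

# Covariate value from the measurement closest in time to each t,
# whether that measurement was taken before or after t
nearest_value <- function(t, obsTime, x) {
  vapply(t, function(ti) x[which.min(abs(obsTime - ti))], numeric(1))
}

nv <- nearest_value(c(0.05, 0.20, 0.60, 0.95), obsTime, x1)
print(nv)
# -0.4 -0.4  0.3  0.3
```

Unlike last value carried forward, this approach can use a measurement taken after the time of interest, as at t = 0.05 and t = 0.60 above.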