Type: | Package |
Title: | Change Point Detection with Missing Values |
Version: | 0.1.1 |
Author: | Yanxi Liu [aut, cre], Abolfazl Safikhani [aut] |
Maintainer: | Yanxi Liu <liuyanxi@ufl.edu> |
Description: | A four-step change point detection method that detects break points in the presence of missing values, proposed by Liu and Safikhani (2023) https://drive.google.com/file/d/1a8sV3RJ8VofLWikTDTQ7W4XJ76cEj4Fg/view?usp=drive_link. |
License: | GPL-2 |
Encoding: | UTF-8 |
Imports: | stats, graphics, mvtnorm, factoextra, Rcpp, ggplot2, glmnet |
LinkingTo: | Rcpp, RcppArmadillo |
Suggests: | knitr, rmarkdown |
VignetteBuilder: | knitr |
RoxygenNote: | 7.2.3 |
NeedsCompilation: | yes |
Packaged: | 2025-02-17 11:48:51 UTC; yanxiliu |
Repository: | CRAN |
Date/Publication: | 2025-02-17 12:30:07 UTC |
BIC
Description
BIC and HBIC function
Usage
BIC(residual, phi)
Arguments
residual |
residual matrix |
phi |
estimated coefficient matrix of the model |
Value
A list object containing the following:
- BIC
BIC value
- HBIC
HBIC value
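The exact formula used by BIC() is not stated on this page. As a rough illustration only, the sketch below computes one common Gaussian form of the BIC from a residual matrix and a coefficient matrix. The helper name bic_sketch, the n x p layout of the residuals, and the rule of counting nonzero entries of phi as parameters are all assumptions, and HBIC is not covered.

```r
# Hypothetical helper, not the package's BIC(): a generic Gaussian BIC.
# Assumes `residual` is an n x p matrix (rows = time points) and that the
# nonzero entries of `phi` are the free parameters of the model.
bic_sketch <- function(residual, phi) {
  n <- nrow(residual)
  sigma_hat <- crossprod(residual) / n    # p x p residual covariance estimate
  log_det <- as.numeric(determinant(sigma_hat, logarithm = TRUE)$modulus)
  n_par <- sum(phi != 0)                  # number of nonzero coefficients
  log_det + log(n) * n_par / n            # fit term plus complexity penalty
}

set.seed(1)
res <- matrix(rnorm(200 * 3), nrow = 200, ncol = 3)
phi_sparse <- matrix(c(1, 0, 0), nrow = 3)
phi_dense  <- matrix(c(1, 2, 3), nrow = 3)
bic_sparse <- bic_sketch(res, phi_sparse)
bic_dense  <- bic_sketch(res, phi_dense)
```

With the same residuals, the denser coefficient matrix receives the larger value, which is the behavior a selection rule based on BIC relies on.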
BIC_threshold
Description
BIC threshold for final parameter estimation
Usage
BIC_threshold(
beta.final,
k,
m.hat,
brk,
data_y,
data_x = NULL,
b_n = 2,
nlam = 20
)
Arguments
beta.final |
estimated parameter coefficient matrices |
k |
dimensions of parameter coefficient matrices |
m.hat |
number of estimated change points |
brk |
vector of estimated change points |
data_y |
input data matrix (response), with each column representing the time series component |
data_x |
input data matrix (predictor), with each column 1 |
b_n |
the block size |
nlam |
number of hyperparameters for grid search |
Value
lambda.val.best, the tuning parameter lambda selected by BIC.
BTIE
Description
Perform the BTIE algorithm to detect the structural breaks in large scale high-dimensional mean shift models.
Usage
BTIE(
data_y,
lambda.1.cv = NULL,
lambda.2.cv = NULL,
max.iteration = 100,
tol = 10^(-2),
block.size = NULL,
refit = FALSE,
optimal.block = TRUE,
optimal.gamma.val = 1.5,
block.range = NULL
)
Arguments
data_y |
input data matrix (response), with each column representing the time series component |
lambda.1.cv |
tuning parameter lambda_1 for fused lasso |
lambda.2.cv |
tuning parameter lambda_2 for fused lasso |
max.iteration |
maximum number of iterations for the fused lasso |
tol |
tolerance for the fused lasso |
block.size |
the block size |
refit |
logical; if TRUE, refit the model, if FALSE, use BIC to find a thresholding value and then output the parameter estimates without refitting. Default is FALSE. |
optimal.block |
logical; if TRUE, grid search to find optimal block size, if FALSE, directly use the default block size. Default is TRUE. |
optimal.gamma.val |
hyperparameter for the optimal block size search, used when optimal.block == TRUE. Default is 1.5. |
block.range |
the search domain for optimal block size. |
Value
A list object. The example below reads its cp.final component.
Examples
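set.seed(1)
n <- 1000                          # sample size
p <- 50                            # data dimension
brk <- c(333, 666, n + 1)          # break points; the last entry is n + 1
m <- length(brk)
d <- 5                             # number of nonzero coefficients
constant.full <- constant_generation(n, p, d, 50, brk)   # jump size 50
e.sigma <- as.matrix(1 * diag(p))  # identity noise covariance
data_y <- data_generation(n = n, mu = constant.full, sigma = e.sigma, brk = brk)
data_y <- as.matrix(data_y)
data_y_miss <- MCAR(data_y, 0.3)   # introduce missing values with alpha = 0.3
temp <- BTIE(data_y_miss, optimal.block = FALSE, block.size = 30)
temp$cp.final                      # estimated change points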
Heter_missing
Description
Function to introduce missing values into the data, assuming they are missing completely at random.
Usage
Heter_missing(data, alpha)
Arguments
data |
the data before missing values are introduced |
alpha |
the list of missing percentages, relative to the whole data |
Value
the data matrix with missing values
MCAR
Description
Function to introduce missing values into the data, assuming they are missing completely at random.
Usage
MCAR(data, alpha)
Arguments
data |
the data before missing values are introduced |
alpha |
the percentage of missing values, relative to the whole data |
Value
the data matrix with missing values
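To make the missing-completely-at-random mechanism concrete, the sketch below masks a fixed fraction of entries chosen uniformly at random. It is not the package's MCAR(). The helper name mcar_sketch and the choice to mask exactly round(alpha * length(data)) entries are assumptions.

```r
# Hypothetical helper, not the package's MCAR(): mask a fraction `alpha`
# of all entries, chosen uniformly at random, by setting them to NA.
mcar_sketch <- function(data, alpha) {
  stopifnot(is.matrix(data), alpha >= 0, alpha <= 1)
  n_missing <- round(alpha * length(data))     # number of entries to mask
  idx <- sample.int(length(data), n_missing)   # positions, without replacement
  data[idx] <- NA
  data
}

set.seed(1)
x <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)
x_miss <- mcar_sketch(x, 0.3)
mean(is.na(x_miss))   # fraction of masked entries
```

Because the masked positions do not depend on the values of the data, the observed entries remain a random subsample of the full matrix.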
constant_generation
Description
Function to generate the constants given the jump size and break points.
Usage
constant_generation(n, p, d, vns, brk)
Arguments
n |
the sample size |
p |
the data dimension |
d |
the number of nonzero coefficients |
vns |
the jump size. It can be a vector or a single value. If a single value, it is the same for all break points |
brk |
the break points' locations |
Value
the parameter matrix used to generate data
data_generation
Description
The function to generate mean shift data
Usage
data_generation(n, mu, sigma, brk = n + 1)
Arguments
n |
the number of data points |
mu |
the matrix of mean parameter |
sigma |
covariance matrix of the white noise |
brk |
vector of change points |
Value
data_y, the matrix of generated mean shift data
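The following sketch shows what a mean shift model looks like in code: a piecewise-constant mean plus Gaussian white noise. It is not the package's data_generation(). The helper name mean_shift_sketch, the p x m layout of mu (one column per segment), and the reading of brk as the first time point of each following segment with n + 1 last are assumptions.

```r
# Hypothetical helper, not the package's data_generation().
# Assumes `mu` is a p x m matrix whose k-th column is the mean vector of
# segment k, and that `brk` lists the segment boundaries with n + 1 last.
mean_shift_sketch <- function(n, mu, sigma, brk = n + 1) {
  p <- nrow(mu)
  starts <- c(1, brk[-length(brk)])         # first time point of each segment
  seg <- findInterval(seq_len(n), starts)   # segment label of each time point
  noise <- matrix(rnorm(n * p), n, p) %*% chol(sigma)   # rows ~ N(0, sigma)
  t(mu[, seg, drop = FALSE]) + noise        # n x p data matrix
}

set.seed(1)
mu <- cbind(c(0, 0), c(5, 0), c(5, -5))     # three segments, p = 2
y <- mean_shift_sketch(300, mu, diag(2), brk = c(101, 201, 301))
dim(y)
```

With the default brk = n + 1 there is a single segment, so the whole series shares the first column of mu as its mean.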
first.step
Description
Perform the block fused lasso with thresholding to detect candidate break points.
Usage
first.step(
data_y,
data_x,
lambda1,
lambda2,
max.iteration = max.iteration,
tol = tol,
blocks,
cv.index,
fixed_index = NULL,
nonfixed_index = NULL
)
Arguments
data_y |
input data matrix Y, with each column representing the time series component |
data_x |
input data matrix X |
lambda1 |
tuning parameter lambda_1 for fused lasso |
lambda2 |
tuning parameter lambda_2 for fused lasso |
max.iteration |
maximum number of iterations for the fused lasso |
tol |
tolerance for the fused lasso |
blocks |
the blocks |
cv.index |
the index of time points for cross-validation |
fixed_index |
index for the linear regression model in which only some components change. |
nonfixed_index |
index for the linear regression model in which only some components change. |
Value
A list object containing the following:
- jump.l2
estimated jump size in L2 norm
- jump.l1
estimated jump size in L1 norm
- pts.list
estimated change points in the first step
- beta.full
estimated parameters in the first step
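As an illustration of how jump sizes between consecutive blocks can lead to candidate break points, the sketch below computes L1 and L2 jump norms from block-wise parameter estimates and keeps the blocks whose L2 jump exceeds a threshold. It is not the package's first.step(). The helper name jump_sketch, the one-row-per-block layout, and the user-supplied threshold are assumptions, and the fused lasso fit itself is not shown.

```r
# Hypothetical helper, not the package's first.step(): given block-wise
# parameter estimates (one row per block), measure the jump between
# consecutive blocks and flag blocks whose jump exceeds a threshold.
jump_sketch <- function(beta_blocks, threshold) {
  d <- diff(beta_blocks)                 # row i: block i + 1 minus block i
  jump.l2 <- sqrt(rowSums(d^2))          # L2 norm of each jump
  jump.l1 <- rowSums(abs(d))             # L1 norm of each jump
  list(jump.l2 = jump.l2,
       jump.l1 = jump.l1,
       candidates = which(jump.l2 > threshold) + 1)   # block after the jump
}

beta_blocks <- rbind(c(0, 0), c(0, 0), c(3, 4), c(3, 4))
out <- jump_sketch(beta_blocks, threshold = 1)
out$candidates
```

Here only the move from the second to the third block has a nonzero jump, so the third block is the single candidate.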
imputation
Description
function to do the imputation based on block size
Usage
imputation(data, block.size)
Arguments
data |
data before the imputation |
block.size |
the block size used to impute the missing values |
Value
the data matrix without missing values after imputation
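One simple way to impute by block is to replace each missing entry with the mean of the observed values in the same column and block. The sketch below does this. It is not the package's imputation(), whose rule is not described on this page. The helper name impute_block_sketch and the use of consecutive blocks of block.size rows are assumptions. A block with no observed value in a column would yield NaN in this sketch.

```r
# Hypothetical helper, not the package's imputation(): fill each NA with
# the mean of the observed values in the same column and the same block
# of `block.size` consecutive rows.
impute_block_sketch <- function(data, block.size) {
  block <- (seq_len(nrow(data)) - 1) %/% block.size   # block label per row
  for (j in seq_len(ncol(data))) {
    block_mean <- ave(data[, j], block,
                      FUN = function(v) mean(v, na.rm = TRUE))
    miss <- is.na(data[, j])
    data[miss, j] <- block_mean[miss]                 # fill only the gaps
  }
  data
}

x <- cbind(c(1, NA, 3, 10, NA, 30), c(NA, 2, 2, 5, 5, NA))
x_filled <- impute_block_sketch(x, block.size = 3)
x_filled
```

Imputing within blocks rather than over the whole series keeps a mean shift from being averaged across a break, provided the blocks are short relative to the segments.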
imputation2
Description
function to do the imputation based on change point candidate
Usage
imputation2(data, cp.candidate)
Arguments
data |
data before the imputation |
cp.candidate |
the change point candidates used to impute the missing values |
Value
the data matrix without missing values after imputation
pred
Description
function to do the prediction
Usage
pred(X, phi, j, p.x, p.y, h = 1)
Arguments
X |
data for prediction |
phi |
parameter matrix |
j |
the start time point for prediction |
p.x |
the dimension of data X |
p.y |
the dimension of data Y |
h |
the number of observations to predict |
Value
prediction matrix
pred.block
Description
Prediction function (block)
Usage
pred.block(X, phi, j, p.x, p.y, h)
Arguments
X |
data for prediction |
phi |
parameter matrix |
j |
the start time point for prediction |
p.x |
the dimension of data X |
p.y |
the dimension of data Y |
h |
the number of observations to predict |
Value
prediction matrix
second.step
Description
Re-impute the missing values and perform the exhaustive search to "thin out" redundant break points.
Usage
second.step(
data_y,
data_x,
max.iteration = max.iteration,
tol = tol,
cp.first,
beta.est,
blocks,
data_y_miss
)
Arguments
data_y |
input data matrix, with each column representing the time series component |
data_x |
input data matrix |
max.iteration |
maximum number of iterations for the fused lasso |
tol |
tolerance for the fused lasso |
cp.first |
the selected break points after the first step |
beta.est |
the parameters estimated by the block fused lasso |
blocks |
the blocks |
data_y_miss |
the data y matrix before the first imputation |
Value
A list object containing the following:
- cp.final
the set of break points selected after the exhaustive search step
- beta.hat.list
the estimated coefficient matrix for each segment