Type: | Package |
Title: | Interval Censored Multi-State Models |
Version: | 0.2.0 |
Maintainer: | Daniel Gomon <dgstatsoft@gmail.com> |
Description: | Allows for the non-parametric estimation of transition intensities in interval-censored multi-state models using the approach of Gomon and Putter (2024) <doi:10.48550/arXiv.2409.07176> or Gu et al. (2023) <doi:10.1093/biomet/asad073>. |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Imports: | Rcpp, mstate, prodlim, igraph (≥ 1.3.0), checkmate, ggplot2, deSolve, msm, survival, JOPS |
LinkingTo: | Rcpp |
Suggests: | testthat (≥ 3.0.0), icenReg, profvis, knitr, rmarkdown, bookdown, latex2exp |
Config/testthat/edition: | 3 |
RoxygenNote: | 7.3.2 |
Encoding: | UTF-8 |
VignetteBuilder: | knitr |
NeedsCompilation: | yes |
Packaged: | 2025-07-04 09:30:27 UTC; danie |
Author: | Daniel Gomon |
Repository: | CRAN |
Date/Publication: | 2025-07-04 10:00:02 UTC |
icmstate
Description
Non-parametric estimation of transition intensities in interval-censored multi-state models
Details
Allows for the estimation of transition intensities in interval-censored multi-state models using the approach of Gomon and Putter (2024) (Multinomial likelihood) or Gu et al. (2023) (Poisson likelihood).
Author(s)
Maintainer: Daniel Gomon <dgstatsoft@gmail.com> [aut, cre]
Hein Putter [aut]
References
Y. Gu, D. Zeng, G. Heiss, and D. Y. Lin, Maximum likelihood estimation for semiparametric regression models with interval-censored multistate data, Biometrika, Nov. 2023, doi:10.1093/biomet/asad073
D. Gomon and H. Putter, Non-parametric estimation of transition intensities in interval censored Markov multi-state models without loops, arXiv, 2024, doi:10.48550/arXiv.2409.07176
Check if half-open intervals intersect with event times
Description
Function which takes as input a 2 column matrix of half-open intervals and a vector of event times and returns the event times that intersect with the half-open intervals.
Usage
Aintersectb(A, b, A.left.open = FALSE)
Arguments
A |
Two column matrix containing intervals |
b |
Vector of event times |
A.left.open |
Are the intervals in A open on the left side? Default = FALSE. |
Value
Numeric vector of event times from b that intersect with A.
Check if closed interval is contained in half-open infinite interval
Description
Function which takes as input a matrix with 2 columns and a vector indicating left points of intervals [b, Infinity) and checks whether each interval in the matrix is contained within the corresponding interval derived from b.
Usage
Alargerb(A, b)
Arguments
A |
Two column matrix containing intervals to be checked for being contained in b |
b |
Vector indicating left point in corresponding [b, Infinity) interval |
Value
Matrix of size (nrow(A) * length(b)) with binary values indicating whether the intervals in A are contained in the ones induced by b
Check if closed interval is contained in other closed interval
Description
Function which takes as input two matrices with 2 columns each and checks whether each interval in the first matrix is contained within each interval in the second matrix.
Usage
AsubsetB(A, B, B.left.open = FALSE, B.right.open = FALSE)
Arguments
A |
Two column matrix containing intervals to be checked for being contained in B |
B |
Two column matrix containing intervals possibly overlapping the intervals in A |
B.left.open |
Are the intervals in B left-open? |
B.right.open |
Are the intervals in B right-open? |
Value
Matrix of size (nrow(A) * nrow(B)) with binary values indicating whether the intervals in A are contained in B
Function to use
Description
Function to use
Usage
ChapKolm_fwd_mat(t, state, parameters, transMat)
Chapman-Kolmogorov functions
Description
Functions which can be used to solver the Chapman-Kolmogorov equations. The functions are in the form as expected by deSolve:ode().
Usage
ChapKolm_fwd_smooth(t, state, parms, fix_pars, subject)
Arguments
t |
Time at which the derivative is required |
state |
Values for the state in which the system resides at time t (current "estimate" of transition matrix P). Must be a vector of length n_states * n_states containing the stacked columns of P: c(P_11, P_21, ... P_(n_states 1), P_12, ..., P_(n_states n_states)). |
parms |
Parameters to derive the derivative. For P-splines, this is a list of coefficients, with each list entry (corresponding to a transition number) containing a vector of n_splines coefficients. |
fix_pars |
A list of fixed parameters in the EM procedure |
subject |
A subject identifier for risk-adjustment |
Value
Returns the derivative of the transition probability matrix P(s,t) with respect to time (forward: t, backward: s) as a list.
Helper function for npmsm()
Description
For a general Markov chain multi-state model with interval censored transitions calculate the NPMLE using an EM algorithm with multinomial approach
Usage
EM_multinomial(
gd,
tmat,
tmat2,
inits,
beta_params,
support_manual,
exact,
maxit,
tol,
conv_crit,
manual,
verbose,
newmet,
include_inf,
checkMLE,
checkMLE_tol,
prob_tol,
remove_bins,
init_int = init_int,
...
)
Arguments
gd |
A
The true transition time between states is then interval censored between the times. |
tmat |
A transition matrix as created by |
inits |
Which distribution should be used to generate the initial estimates
of the intensities in the EM algorithm. One of c("equalprob", "unif", "beta"),
with "equalprob" assigning 1/K to each intensity, with K the number of distinct
observation times ( |
beta_params |
A vector of length 2 specifying the beta distribution parameters
for initial distribution generation. First entry will be used as |
support_manual |
Used for specifying a manual support region for the transitions.
A list of length the number of transitions in |
exact |
Numeric vector indicating to which states transitions are observed at exact times.
Must coincide with the column number in |
maxit |
Maximum number of iterations. |
tol |
Tolerance of the procedure. |
conv_crit |
Convergence criterion. Stops procedure when the difference
in the chosen quantity between two consecutive iterations is smaller
than the tolerance level
Default is "haz". The options "haz" and "lik" can be compared across different
|
manual |
Manually specify starting transition intensities? |
verbose |
Should iteration messages be printed? Default is FALSE |
newmet |
Should contributions after last observation time also be used in the likelihood? Default is FALSE. |
include_inf |
Should an additional bin from the largest observed time to infinity be included in the algorithm? Default is FALSE. |
checkMLE |
Should a check be performed whether the estimate has converged towards a true Maximum Likelihood Estimate? Default is TRUE. |
checkMLE_tol |
Tolerance for checking whether the estimate has converged to MLE. Whenever an estimated transition intensity is smaller than the tolerance, it is assumed to be zero. |
prob_tol |
If an estimated probability is smaller than |
remove_bins |
Should a bin be removed during the algorithm if all
estimated intensities are zero for a single bin? Can improve
computation speed for large data sets. Note that zero means the estimated intensities
are smaller than |
init_int |
A vector of length 2, with the first entry indicating what percentage of mass should be distributed over (second entry) what percentage of all first bins. Default is c(0, 0), in which case the argument is ignored. This argument has no practical uses and only exists for demonstration purposes in the related article. |
... |
Not used yet |
References
Michael G. Hudgens, On Nonparametric Maximum Likelihood Estimation with Interval Censoring and Left Truncation, Journal of the Royal Statistical Society Series B: Statistical Methodology, Volume 67, Issue 4, September 2005, Pages 573-587, doi:10.1111/j.1467-9868.2005.00516.x
Helper function for npmsm()
Description
For a general Markov chain multi-state model with interval censored transitions calculate the NPMLE using an EM algorithm with Poisson latent variable approach
Usage
EM_poisson(
gd,
tmat,
tmat2,
inits,
beta_params,
support_manual,
exact,
maxit,
tol,
conv_crit,
manual,
verbose,
newmet,
include_inf,
checkMLE,
checkMLE_tol,
prob_tol,
remove_bins,
init_int = init_int,
...
)
Arguments
gd |
A
The true transition time between states is then interval censored between the times. |
tmat |
A transition matrix as created by |
inits |
Which distribution should be used to generate the initial estimates
of the intensities in the EM algorithm. One of c("equalprob", "unif", "beta"),
with "equalprob" assigning 1/K to each intensity, with K the number of distinct
observation times ( |
beta_params |
A vector of length 2 specifying the beta distribution parameters
for initial distribution generation. First entry will be used as |
support_manual |
Used for specifying a manual support region for the transitions.
A list of length the number of transitions in |
exact |
Numeric vector indicating to which states transitions are observed at exact times.
Must coincide with the column number in |
maxit |
Maximum number of iterations. |
tol |
Tolerance of the procedure. |
conv_crit |
Convergence criterion. Stops procedure when the difference
in the chosen quantity between two consecutive iterations is smaller
than the tolerance level
Default is "haz". The options "haz" and "lik" can be compared across different
|
manual |
Manually specify starting transition intensities? |
verbose |
Should iteration messages be printed? Default is FALSE |
newmet |
Should contributions after last observation time also be used in the likelihood? Default is FALSE. |
include_inf |
Should an additional bin from the largest observed time to infinity be included in the algorithm? Default is FALSE. |
checkMLE |
Should a check be performed whether the estimate has converged towards a true Maximum Likelihood Estimate? Default is TRUE. |
checkMLE_tol |
Tolerance for checking whether the estimate has converged to MLE. Whenever an estimated transition intensity is smaller than the tolerance, it is assumed to be zero. |
prob_tol |
If an estimated probability is smaller than |
remove_bins |
Should a bin be removed during the algorithm if all
estimated intensities are zero for a single bin? Can improve
computation speed for large data sets. Note that zero means the estimated intensities
are smaller than |
init_int |
A vector of length 2, with the first entry indicating what percentage of mass should be distributed over (second entry) what percentage of all first bins. Default is c(0, 0), in which case the argument is ignored. This argument has no practical uses and only exists for demonstration purposes in the related article. |
... |
Not used yet. |
References
Y. Gu, D. Zeng, G. Heiss, and D. Y. Lin, Maximum likelihood estimation for semiparametric regression models with interval-censored multistate data, Biometrika, Nov. 2023, doi:10.1093/biomet/asad073
EM solver for extended illness-death model (Frydman 1995)
Description
Solves for the cdf and transition intensities using the EM algorithm described in Frydman (1995).
Usage
EM_solver(data_idx, supportMSM, z, lambda, tol = 1e-08)
Arguments
data_idx |
List containing data, outputted from |
supportMSM |
List containing data on the support of the 1->2 transition, output from supportMSM() |
z |
Initial values for |
lambda |
Initial values for |
tol |
Tolerance of the EM algorithm. When the change in sum(abs(z)) and sum(abs(lambda)) no longer exceeds tol, the algorithm stops. |
Value
beta: Indicator whether Q subset A mu: Value used in the EM algorithm, see Frydman (1995) and Notes. I_Q_in_Lm_tn_star: Indicator whether Q is in [L_m, t_n^*] gamma: Value used in the EM algorithm, see Frydman (1995) and Notes. alpha: Indicator whether Q subset [s_j, Infinity) mu_overline: Value used in the EM algorithm, see Frydman (1995) and Notes. lambda: Intensity for the 2->3 transition z: Mass assigned to the 1->2 and 1->3 transitions iter: Number of iterations required for convergence
References
Frydman, H. (1995). Nonparametric Estimation of a Markov 'Illness-Death' Process from Interval- Censored Observations, with Application to Diabetes Survival Data. Biometrika, 82(4), 773-789. doi:10.2307/2337344
E-step of the EM algorithm for smooth estimation of transition intensities
Description
E-step of the EM algorithm for smooth estimation of transition intensities
Usage
Estep_smooth(fix_pars, subject_slices, EM_est, it_num)
Function for fitting proportional hazard model with baseline hazard
Description
Function for fitting proportional hazard model with baseline hazard
Usage
Mstep_smooth(fix_pars, EM_est, transno, from, Pen = Pen)
Arguments
fix_pars |
A list of fixed parameters during the EM procedure |
EM_est |
A list of estimated quantities that change during the EM procedure |
transno |
Transition number for the maximization procedure |
from |
The state from which the relevant transition is taking place. |
Pen |
Penalization matrix for the B-splines and regression coefficients. |
Value
An object with fields:
H
= hazards (matrix),
cbx
= coefficient estimates (vector),
lambda
= proposal for new lambda,
ed
= effective dimension,
G
= G matrix,
ll
= log-likelihood,
pen
= penalized part of log-likelihood,
Mpen
= penalized M matrix
Check if event time is larger/equal than other event time
Description
Function which takes as input a vector a with event times and a vector b with event times and checks whether each entry in a is larger or equal than the entries in b.
Usage
ageqb(a, b)
Arguments
a |
Vector of event times |
b |
Vector of event times |
Value
Matrix of size (length(a) * length(b)) with binary values indicating whether the event times in a are larger than the event times in b
Check if event time is larger than other event time
Description
Function which takes as input a vector a with event times and a vector b with event times and checks whether each entry in a is larger than the entries in b.
Usage
agreaterb(a, b)
Arguments
a |
Vector of event times |
b |
Vector of event times |
Value
Matrix of size (length(a) * length(b)) with binary values indicating whether the event times in a are larger than the event times in b
Check if event time is contained within half-open interval
Description
Function which takes as input a vector a with event times and a 2 column matrix B representing half-open intervals (l, R] and checks whether each event time is contained in each half-open interval.
Usage
ainB(a, B)
Arguments
a |
Vector of event times |
B |
Two column matrix containing intervals |
Value
Matrix of size (length(a) * nrow(B)) with binary values indicating whether the event times in a are contained in the intervals of B
Recover baseline intensities (in the form of intensity_matrices) from a coxph() fit on multi-state data.
Description
Recover baseline intensities (in the form of intensity_matrices) from a coxph() fit on multi-state data.
Usage
baseline_intensities_from_coxmod(object, tmat)
Arguments
object |
A |
tmat |
A transition matrix as created by |
Compute a B-spline basis
Description
Copied from icpack/bases.R, modified diff() function for speed.
Usage
bbase_D(x, xl = min(x), xr = max(x), nseg = 10, bdeg = 3)
Arguments
x |
The vector of values for which the basis is to be evaluated |
xl |
The left boundary of the domain |
xr |
The right boundary of the domain |
nseg |
The number of inter-knot segments on the domain |
bdeg |
The degree of the B-splines (2 means quadratic, 3 means cubic, and so on) |
Value
A matrix containing the basis
Examples
x = runif(100)
B = bbase_D(x, 0, 1, 20, 3)
Compute a B-spline basis for a single time point
Description
Similar to bbase_D, but sped up for single time points.
Usage
bbase_singletime(x, xl = min(x), xr = max(x), nseg = 10, bdeg = 3)
Arguments
x |
The value for which the basis is to be evaluated (ONLY SINGLE VALUE ALLOWED) |
xl |
The left boundary of the domain |
xr |
The right boundary of the domain |
nseg |
The number of inter-knot segments on the domain |
bdeg |
The degree of the B-splines (2 means quadratic, 3 means cubic, and so on) |
Value
A vector containing the basis
Examples
x = 0.02
B = bbase_singletime(x, 0, 1, 20, 3)
Binary search - Larger
Description
Find index of first value larger than x in a sorted array
Usage
binary_search_larger(v, data)
Arguments
v |
- sorted vector instance |
data |
- value to search |
Value
index of first value larger than data -1, -1 otherwise
Binary search - Larger or equal to
Description
Find index of first value larger or equal to x in a sorted array
Usage
binary_search_larger_equal(v, data)
Arguments
v |
- sorted vector instance |
data |
- value to search |
Value
index of first value larger or equal than data, -1 otherwise
Translate observed transition intervals into direct transition intervals
Description
Given observed transition intervals, determine the "worst" (least informative) possible direct transition intervals that could have occurred to form this sample.
Usage
direct_from_observed_intervals(observed_intervals, tmat, gd)
Arguments
observed_intervals |
Output from |
tmat |
A transition matrix as created by |
gd |
A
The true transition time between states is then interval censored between the times. |
Value
A data.frame
with the following named columns
entry_time
:Time of entry into "from" state;
time_from
:Last time subject(id) was seen in state "from";
time_to
:First time subject(id) was seen in state "to";
from
:State from which a transition was observed;
to
:State to which the transition was observed;
id
:Subject identifier;
For right-censored observations, entry_time denotes the first time
seen in the censored state, time_from the last time seen in the censored state,
time_to is Inf
, from the censored state and to is NA
.
Estimate the support of a general Markov interval-censored Multi-state model without loops.
Description
Given a realisation of a multi-state model, estimate the support of the different transitions possible in that MSM. The estimation is performed by viewing each possible state in a competing risks setting and applying the result of Hudgens (2001) to determine the support and left-truncation intervals and Hudgens (2005) to check whether a solution is possible.
Usage
estimate_support_msm(gd, tmat)
Arguments
gd |
A
The true transition time between states is then interval censored between the times. |
tmat |
A transition matrix as created by |
Value
TODO
References
Michael G. Hudgens, On Nonparametric Maximum Likelihood Estimation with Interval Censoring and Left Truncation, Journal of the Royal Statistical Society Series B: Statistical Methodology, Volume 67, Issue 4, September 2005, Pages 573-587, doi:10.1111/j.1467-9868.2005.00516.x
M. G. Hudgens, G. A. Satten, and I. M. Longini, Nonparametric Maximum Likelihood Estimation for Competing Risks Survival Data Subject to Interval Censoring and Truncation, Biometrics, vol. 57, no. 1, Pages 74-80, March 2001, doi:10.1111/j.0006-341x.2001.00074.x
Sample from a markov chain multi state model with exactly observed transition times
Description
Given a markov chain multi state model with exactly observed transition times, sample from this chain at the observation times, giving interval censored observations (panel data).
Usage
evalstep(time, stepf, newtime, subst = -Inf, to.data.frame = FALSE)
Arguments
time |
Times at which a transition occurs |
stepf |
States at which the chain is in at |
newtime |
Observation times of the chain, to create observed states |
subst |
State to return if observation time is before first transition time. Default = -Inf. |
to.data.frame |
Should the result be returned as a |
Value
A numeric vector
or data.frame
(if to.data.frame = TRUE
)
containing either the observed states or the named columns newtime
and
res
, representing the observation times and observed states.
Author(s)
Hein Putter
Examples
obs_states <- evalstep(time = seq(0, 20, 2), stepf = sample(1:9, 11, replace = TRUE),
newtime = c(-1, 1, 7, 9, 19))
obs_states
Check existence of NPMLE
Description
Using Theorem 1 from Hudgens (2005) we can check whether an NPMLE exists. This procedure is implemented in this function.
Usage
existenceNPMLE(intervals, supportdf)
Arguments
intervals |
A data.frame with 4 columns containing half-open intervals (left open, right closed) and an indicator whether the interval results from a censored transition or truncation:
Note that the truncation intervals need to be in the form (N, Inf] with N a numeric value. |
supportdf |
A data from containing 2 columns indicating the support intervals:
|
Expand covariates for a data frame so that covariates can be transition specific.
Description
Expand covariates for a data frame so that covariates can be transition specific.
Usage
expand_covariates_long_data(newdata)
Arguments
newdata |
A
Note that newdata must contain a column containing the variable which was
used to determine the stratum of a transition in |
Given a msfit
object, extend the times considered in the object
Description
After using this function, use probtrans to get interpolated transition probabilities. This function is useful when you want to obtain transition probabilities at more than just the minimal number of times that strictly have to be considered. The inserted hazard values are simply the hazards at the nearest time that is smaller or equal.
Usage
extend_msfit(msfit, times)
Arguments
msfit |
A |
times |
Times at which to extend the |
Value
An msfit
object containing the extended hazards
Examples
library(mstate)
tmat <- trans.illdeath()
times <- seq(0, 5, 0.1)
ms_fit <- list(Haz = data.frame(time = rep(times, 3),
Haz = c(replicate(3, cumsum(runif(length(times), 0, 0.02)))),
trans = rep(1:3, each = length(times))),
trans = tmat)
class(ms_fit) <- "msfit"
ms_fit_interpolated <- extend_msfit(ms_fit, seq(0, 5, 0.01))
Get transition intervals from specified data
Description
Given a sample from a multi-state model, summarize the transitions that have been observed.
Usage
get_trans_intervals(gd, tmat)
Arguments
gd |
A
The true transition time between states is then interval censored between the times. |
tmat |
A transition matrix as created by |
Value
A data.frame
with the following named columns
entry_time
:Time of entry into "from" state;
time_from
:Last time subject(id) was seen in state "from";
time_to
:First time subject(id) was seen in state "to";
from
:State from which a transition was observed;
to
:State to which the transition was observed;
id
:Subject identifier;
For right-censored observations, entry_time denotes the first time
seen in the censored state, time_from the last time seen in the censored state,
time_to is Inf
, from the censored state and to is NA
.
Construct Graph from censoring/truncation intervals
Description
Given intervals, construct a graph containing vertices representing these intervals and edges between the vertices if the intervals intersect. See Hudgens (2005).
Usage
graphfromIntervals(intervals)
Arguments
intervals |
A data.frame with 3 columns containing half-open intervals (left open, right closed) and an indicator whether the interval results from a censored transition or truncation: #'
Note that the truncation intervals need to be in the form (N, Inf] with N a numeric value. |
Value
Returns an 'igraph'
object containing the graph with vertices
representing the intervals and edges between the vertices if the intervals
intersect. The vertices will be named accordingly, starting with a 'T' when
representing a truncation interval and 'C' when representing a censoring
interval.
References
Michael G. Hudgens, On Nonparametric Maximum Likelihood Estimation with Interval Censoring and Left Truncation, Journal of the Royal Statistical Society Series B: Statistical Methodology, Volume 67, Issue 4, September 2005, Pages 573-587, doi:10.1111/j.1467-9868.2005.00516.x
Given a msfit
object, linearly interpolate the cumulative hazard
taking into account the support sets for msfit
objects.
Description
After using this function, use probtrans to get interpolated transition probabilities.
Usage
interpol_msfit(msfit, times)
Arguments
msfit |
A |
times |
Times at which to interpolate the |
Value
An msfit
object containing the interpolated hazards
Examples
library(mstate)
tmat <- trans.illdeath()
times <- seq(0, 5, 0.1)
ms_fit <- list(Haz = data.frame(time = rep(times, 3),
Haz = c(replicate(3, cumsum(runif(length(times), 0, 0.02)))),
trans = rep(1:3, each = length(times))),
trans = tmat)
class(ms_fit) <- "msfit"
ms_fit_interpolated <- interpol_msfit(ms_fit, seq(0, 5, 0.01))
Determine NPMLE for Multi State illness death Markov model using Frydman (1995)
Description
Determine NPMLE for Multi State illness death Markov model using Frydman (1995)
Usage
msm_frydman(data, tol = 1e-08)
Arguments
data |
A
|
tol |
Tolerance of the EM algorithm. Algorithm will stop when the absolute difference between current mass estimates and new estimates is smaller than the tolerance |
Details
For an illness death model (1 = healthy, 2 = ill, 3 = dead) estimate the NPMLE in the following form:
F12
:Cumulative distribution function of 1->2 transition;
F13
:Cumulative distribution function of 1->3 transition;
Lambda23
:Cumulative intensity of 2->3 transition;
Value
A list with the following entries:
data_idx
:A list containing the data used for the fit (
matdata
), the indices for which group a subject belongs to (GroupX_idx
), some computational parameters (see Frydman(1995)) and the unique failure times of the 2->3 and 1->3 transitions respectively int_n_star
ande_k_star
;supportMSM
:A list containing all transition intervals in
A
and the theoretical support intervals inQ_mat
;z_lambda
:Computational quantities, see Frydman(1995);
cdf
:A list of functions that allow to recover the cdf for the 1->3 (
F13
) and 1->2 (F12
) transition and the cumulative hazard for the 2->3 (Lambda23
) transition.;
References
Frydman, H. (1995). Nonparametric Estimation of a Markov 'Illness-Death' Process from Interval- Censored Observations, with Application to Diabetes Survival Data. Biometrika, 82(4), 773-789. doi:10.2307/2337344
Examples
data <- data.frame(delta = c(0, 0, 1, 1), Delta = c(0, 1, 0, 1),
L = c(NA, NA, 1, 1.5), R = c(NA, 3, 2, 3),
time = c(4, 5, 6, 7))
mod_frydman <- msm_frydman(data)
visualise_data(data, mod_frydman)
NPMLE for general multi-state model with interval censored transitions
Description
For a general Markov chain multi-state model with interval censored
transitions calculate the NPMLE of the transition intensities. The estimates
are returned as an msfit
object.
Usage
npmsm(
gd,
tmat,
method = c("multinomial", "poisson"),
inits = c("equalprob", "homogeneous", "unif", "beta"),
beta_params,
support_manual,
exact,
maxit = 100,
tol = 1e-04,
conv_crit = c("haz", "prob", "lik"),
verbose = FALSE,
manual = FALSE,
newmet = FALSE,
include_inf = FALSE,
checkMLE = TRUE,
checkMLE_tol = 1e-04,
prob_tol = tol/10,
remove_redundant = TRUE,
remove_bins = FALSE,
estimateSupport = FALSE,
init_int = c(0, 0),
...
)
Arguments
gd |
A
The true transition time between states is then interval censored between the times. |
tmat |
A transition matrix as created by |
method |
Which method should be used for the EM algorithm. Choices are
|
inits |
Which distribution should be used to generate the initial estimates
of the intensities in the EM algorithm. One of c("equalprob", "unif", "beta"),
with "equalprob" assigning 1/K to each intensity, with K the number of distinct
observation times ( |
beta_params |
A vector of length 2 specifying the beta distribution parameters
for initial distribution generation. First entry will be used as |
support_manual |
Used for specifying a manual support region for the transitions.
A list of length the number of transitions in |
exact |
Numeric vector indicating to which states transitions are observed at exact times.
Must coincide with the column number in |
maxit |
Maximum number of iterations. Default = 100. |
tol |
Tolerance of the convergence procedure. A change in the value of
|
conv_crit |
Convergence criterion. Stops procedure when the difference
in the chosen quantity between two consecutive iterations is smaller
than the tolerance level
Default is "haz". The options "haz" and "lik" can be compared across different
|
verbose |
Should iteration messages be printed? Default is FALSE |
manual |
Manually specify starting transition intensities? If |
newmet |
Should contributions after last observation time also be used in the likelihood? Default is FALSE. |
include_inf |
Should an additional bin from the largest observed time to infinity be included in the algorithm? Default is FALSE. |
checkMLE |
Should a check be performed whether the estimate has converged
towards a local maximum? This is done by comparing
the reduced gradient to the value of |
checkMLE_tol |
Tolerance for checking whether the estimate has converged to local maximum.
Whenever an estimated transition intensity is smaller than |
prob_tol |
If an estimated probability is smaller than |
remove_redundant |
Should redundant observations be removed before running the algorithm? An observation is redundant when the same state has been observed more than 3 times consecutively, or if it is a repeat observation of an absorbing state. Default is TRUE. |
remove_bins |
Should a bin be removed during the algorithm if all
estimated intensities are zero for a single bin? Can improve
computation speed for large data sets. Note that zero means the estimated intensities
are smaller than |
estimateSupport |
Should the support of the transitions be estimated using
the result of Hudgens (2005)? Currently produces incorrect support sets -
DO NOT USE. Default = |
init_int |
A vector of length 2, with the first entry indicating what percentage of mass should be distributed over (second entry) what percentage of all first bins. Default is c(0, 0), in which case the argument is ignored. This argument has no practical uses and only exists for demonstration purposes in the related article. |
... |
Further arguments to |
Details
Denote the unique observation times in the data as 0 = \tau_0, \tau_1, \ldots, \tau_K
Let g, h \in H
denote the possible states in the model and X(t)
the state of the process at time t.
Then this function can be used to estimate the transition intensities
\alpha_{gh}^k = \alpha_{gh}(\tau_k)
.
Having obtained these estimated, it is possible to recover the transition probabilities
\mathbf{P}(X(t) = h | X(s) = g)
for t > s
using
the transprob
functions.
Value
A list with the following entries:
A: |
A list of class |
Ainit: |
Initial intensities, in an object of class |
gd: |
Data used for the estimation procedure; |
ll: |
Log-likelihood value of the procedure at the last iteration; |
delta: |
Change in log-likelihood value at the last iteration; |
it: |
Number of iterations of the procedure; |
taus: |
Unique time points of the data, the cumulative intensity only makes jumps at these time points.; |
tmat: |
The transition matrix used, see |
tmat2: |
A summary of the transitions in the model, see |
ll_history: |
The log-likelihood value at every iteration of the procedure; |
KKT_violated: |
How often were KKT conditions violated during maximisation of the likelihood? In other words, how often did we hit the optimization boundary during the procedure?; |
min_time: |
The smallest time of an observation in the used data. Note that the smallest time in the data is used as zero reference; |
reduced_gradient: |
The reduced gradient at the last iteration. Rows indicate the transitions and columns the unique observation times; |
isMLE: |
Has the procedure converged to the NPMLE? Checked
using |
langrangemultiplier: |
The lagrange multipliers at the last iteration; |
aghmat: |
A matrix representation of the transition intensities in |
Ygk: |
The summed at-risk indicator for all subjects in the data at the last iteration. Rows represent transitions and columns unique observation times; |
Dmk: |
The summed probability of making a transition for all subjects at the last iteration. Rows represent transitions and columns unique observation times; |
method: |
Method used for the optimization procedure; |
maxit: |
Maximum number of allowed iterations; |
tol: |
Tolerance of the convergence procedure; |
conv_crit: |
Convergence criterion of the procedure; |
checkMLE: |
Was the reduced gradient checked at the last iteration to determine convergence?; |
checkMLE_tol: |
The tolerance of the checkMLE procedure; |
prob_tol: |
Tolerance for probabilities to be set to zero; |
remove_redundant: |
Were redundant observations removed before performing the procedure?; |
References
D. Gomon and H. Putter, Non-parametric estimation of transition intensities in interval censored Markov multi-state models without loops, arXiv, 2024, doi:10.48550/arXiv.2409.07176
Y. Gu, D. Zeng, G. Heiss, and D. Y. Lin, Maximum likelihood estimation for semiparametric regression models with interval-censored multistate data, Biometrika, Nov. 2023, doi:10.1093/biomet/asad073
Michael G. Hudgens, On Nonparametric Maximum Likelihood Estimation with Interval Censoring and Left Truncation, Journal of the Royal Statistical Society Series B: Statistical Methodology, Volume 67, Issue 4, September 2005, Pages 573-587, doi:10.1111/j.1467-9868.2005.00516.x
See Also
transprob
for calculating transition probabilities,
plot.npmsm
for plotting the cumulative intensities,
print.npmsm
for printing some output summaries,
visualise_msm
for visualising data,
msfit
for details on the output object.
Examples
#Create transition matrix using mstate functionality: illness-death
if(require(mstate)){
tmat <- mstate::trans.illdeath()
}
#Write a function for evaluation times: observe at 0 and uniform inter-observation times.
eval_times <- function(n_obs, stop_time){
cumsum( c( 0, runif( n_obs-1, 0, 2*(stop_time-4)/(n_obs-1) ) ) )
}
#Use built_in function to simulate illness-death data
#from Weibull distributions for each transition
sim_dat <- sim_id_weib(n = 50, n_obs = 6, stop_time = 15, eval_times = eval_times,
start_state = "stable", shape = c(0.5, 0.5, 2), scale = c(5, 10, 10/gamma(1.5)))
tmat <- mstate::trans.illdeath()
#Fit the model using method = "multinomial"
mod_fit <- npmsm(gd = sim_dat, tmat = tmat, tol = 1e-2)
#Plot the cumulative intensities for each transition
plot(mod_fit)
Plot a "npmsm" object
Description
Plot the cumulative intensities of a 'npmsm'
objects.
A wrapper for plot.msfit
from the
mstate
package.
Usage
## S3 method for class 'npmsm'
plot(x, ...)
Arguments
x |
An object of class |
... |
Additional arguments to |
Value
A plot will be produced in the plotting window
Plot an object of class "probtrans.subjects"
Description
Plots the transition probabilities for a specific subject. Wrapper for
plot.probtrans
Usage
## S3 method for class 'probtrans.subjects'
plot(x, id, ...)
Arguments
x |
An object of class |
id |
Subject identifier |
... |
Further arguments to |
Details
Note that
Author(s)
Hein Putter and Daniel Gomon
Examples
if(require("mstate")){
data(ebmt3)
n <- nrow(ebmt3)
tmat <- transMat(x = list(c(2, 3), c(3), c()), names = c("Tx",
"PR", "RelDeath"))
ebmt3$prtime <- ebmt3$prtime/365.25
ebmt3$rfstime <- ebmt3$rfstime/365.25
covs <- c("dissub", "age", "drmatch", "tcd", "prtime")
msbmt <- msprep(time = c(NA, "prtime", "rfstime"), status = c(NA,
"prstat", "rfsstat"), data = ebmt3, trans = tmat, keep = covs)
#Expand covariates so that we can have transition specific covariates
msbmt <- expand.covs(msbmt, covs, append = TRUE, longnames = FALSE)
#Simple model, transition specific covariates, each transition own baseline hazard
c1 <- coxph(Surv(Tstart, Tstop, status) ~ dissub1.1 + dissub2.1 +
age1.1 + age2.1 + drmatch.1 + tcd.1 + dissub1.2 + dissub2.2 +
age1.2 + age2.2 + drmatch.2 + tcd.2 + dissub1.3 + dissub2.3 +
age1.3 + age2.3 + drmatch.3 + tcd.3 + strata(trans), data = msbmt,
method = "breslow")
#We need to make a data.frame containing all subjects of interest
ttmat <- to.trans2(tmat)[, c(2, 3, 1)]
names(ttmat)[3] <- "trans"
nd_n <- NULL
for (j in 1:30) {
# Select global covariates of subject j
cllj <- ebmt3[j, covs]
nd2 <- cbind(ttmat, rep(j, 3), rbind(cllj, cllj, cllj))
colnames(nd2)[4] <- "id"
# Make nd2 of class msdata to use expand.covs
attr(nd2, "trans") <- tmat
class(nd2) <- c("msdata", "data.frame")
nd2 <- expand.covs(nd2, covs=covs, longnames = FALSE)
nd_n <- rbind(nd_n, nd2)
}
icmstate_pt <- probtrans_coxph(c1, predt = 0, direction = "forward",
newdata = nd_n, trans = tmat)
#plot transition probabilities for subject 2
plot(icmstate_pt, id = 2)
}
Plot a "smoothmsm" object
Description
Plot the cumulative intensities of a 'npmsm'
objects.
A wrapper for plot.msfit
from the
mstate
package.
Usage
## S3 method for class 'smoothmsm'
plot(x, ...)
Arguments
x |
An object of class |
... |
Additional arguments to |
Value
A plot will be produced in the plotting window
Plot the transition probabilities for a fitted npmsm
model
Description
For a fitted npmsm
model plot the transition probabilities
from a certain state for all possible (direct and indirect) transitions.
Usage
plot_probtrans(
npmsmlist,
from = NULL,
to = NULL,
transitions = NULL,
landmark,
interpolate = TRUE,
facet = TRUE,
times_interpol = NULL,
c.legend = TRUE,
c.names = NULL
)
Arguments
npmsmlist |
An "npmsm" object or a list containing multiple "npmsm" objects |
from |
A numeric value indicating the state from which we consider the transition probabilities. Default is NULL, meaning we consider transition probabilities from all states from which a direct transition is possible. |
to |
A numeric vector indicating to which states we consider the transition
probabilities. Only states that can be reached from the |
transitions |
A numeric vector indicating which transitions to consider (plot).
Can only be used if |
landmark |
A landmark time indicating from which time on survival should be determined. If missing, the smallest between the time in the first "npmsm" object or 0 will be used. |
interpolate |
Should the cumulative hazard be linearly interpolated before determining transition probabilities? Default is TRUE. |
facet |
Should the resulting plot be faceted (one panel per transition)? Default is TRUE. |
times_interpol |
At which times should the cumulative hazard be interpolated? Only necessary to specify if interpolate = TRUE. |
c.legend |
Should legend be displayed for colour (different entries in
|
c.names |
A character vector indicating the names to display in the legend.
These names should represent the entries in |
Value
A plot will be produced in the plotting window. When assigning
the output to an object, the underlying data frame used for plotting
and a 'ggplot'
object will be returned in a list.
Examples
require(mstate)
require(ggplot2)
#Generate from an illness-death model with exponential transitions with
#rates 1/2, 1/10 and 1 for 10 subjects over a time grid.
gd <- sim_weibmsm(tmat = trans.illdeath(), shape = c(1,1,1),
scale = c(2, 10, 1), n_subj = 10, obs_pars = c(2, 0.5, 20),
startprobs = c(0.9, 0.1, 0))
#Fit 2 models: 1 with at most 4 iterations and 1 with at most 20
mod1 <- npmsm(gd, trans.illdeath(), maxit = 4)
mod2 <- npmsm(gd, trans.illdeath(), maxit = 20)
#Plot the transition probabilities from state 1, without interpolating
#the cumulative hazard for the npmsm runs with max 4 and 20 iterations.
plot_probtrans(list(mod1, mod2), from = 1, interpolate = FALSE,
c.names = c("4 iterations", "20 iterations"))
Plot the transition specific survival probabilities for a fitted npmsm
model
Description
For a fitted npmsm
model plot the transition specific
survival probabilities. These are given by the product integral of the hazard
increments estimated for a single transition. This is equivalent to a Kaplan-Meier
estimator ignoring the existence of all other transitions.
Usage
plot_surv(npmsmlist, landmark, support = FALSE, sup_cutoff = 1e-08)
Arguments
npmsmlist |
An "npmsm" object or a list containing multiple "npmsm" objects |
landmark |
A landmark time indicating from which time on survival should be determined. If missing, the smallest time in the first "npmsm" object will be used. |
support |
Should the support regions be displayed as rectangles? |
sup_cutoff |
Cutoff to be used for determining the support intervals. |
Value
A plot will be produced in the plotting window. When assigning
the output to an object, the underlying data frame used for plotting
and a 'ggplot'
object will be returned in a list.
Examples
require(mstate)
require(ggplot2)
#Generate from an illness-death model with exponential transitions with
#rates 1/2, 1/10 and 1 for 10 subjects over a time grid.
gd <- sim_weibmsm(tmat = trans.illdeath(), shape = c(1,1,1),
scale = c(2, 10, 1), n_subj = 10, obs_pars = c(2, 0.5, 20),
startprobs = c(0.9, 0.1, 0))
mod1 <- npmsm(gd, trans.illdeath(), maxit = 4)
mod2 <- npmsm(gd, trans.illdeath(), maxit = 20)
#Plot the transition specific Kaplan-Meier estimators and their numerically
#determined support sets.
plot_surv(list(mod1, mod2), support = TRUE)
Calculate subject specific transition probabilities from a multi-state proportional hazards model.
Description
Given a coxph model fit on multi-state data (prepared with msprep
),
determine transition probabilities for subjects in newdata
.
Usage
predict_tp(
object,
predt,
direction = c("forward", "fixedhorizon"),
newdata,
trans
)
Arguments
object |
A |
predt |
A positive number indicating the prediction time. This is either
the time at which the prediction is made (if |
direction |
One of |
newdata |
A
Note that newdata must contain a column containing the variable which was
used to determine the stratum of a transition in |
trans |
A transition matrix as created by |
Details
When using this function for newdata
with many subjects, consider running the
function multiple times for parts of newdata
to negate the risk of
running our of memory.
Using this function, it is only possible to consider models with transition specific covariates.
If you would like to have covariates shared over transitions or proportional
hazards assumptions between transitions, see probtrans_coxph
.
Value
An object of class "probtrans.subjects"
.
This is a list of length n (number of subjects in newdata), with each list element
an object of class probtrans
for the associated
subject. List elements can be accessed using [[x]]
, with x
ranging from 1 to n. Additionally, each list element
has an element $id
, representing the subject id and the output object
also has an element $subject_ids
representing the subject ids in order.
See Also
probtrans_coxph
, plot.probtrans.subjects
,
summary.probtrans.subjects
Examples
#Example from the mstate vignette
#We determine the subject specific transition probabilities for subjects
#in the ebmt3 data-set
if(require("mstate")){
data(ebmt3)
n <- nrow(ebmt3)
tmat <- transMat(x = list(c(2, 3), c(3), c()), names = c("Tx",
"PR", "RelDeath"))
#From days to years
ebmt3$prtime <- ebmt3$prtime/365.25
ebmt3$rfstime <- ebmt3$rfstime/365.25
#Covariates we will use
covs <- c("dissub", "age", "drmatch", "tcd", "prtime")
msbmt <- msprep(time = c(NA, "prtime", "rfstime"), status = c(NA,
"prstat", "rfsstat"), data = ebmt3, trans = tmat, keep = covs)
#Expand covariates so that we can have transition specific covariates
msbmt2 <- expand.covs(msbmt, covs, append = TRUE, longnames = FALSE)
#-------------Model---------------------#
#Simple model, transition specific covariates, each transition own baseline hazard
c1 <- coxph(Surv(Tstart, Tstop, status) ~ dissub1.1 + dissub2.1 +
age1.1 + age2.1 + drmatch.1 + tcd.1 + dissub1.2 + dissub2.2 +
age1.2 + age2.2 + drmatch.2 + tcd.2 + dissub1.3 + dissub2.3 +
age1.3 + age2.3 + drmatch.3 + tcd.3 + strata(trans), data = msbmt2,
method = "breslow")
#Predict transition probabilities for first 30 subjects.
tp_subjects <- predict_tp(c1, predt = 0, direction = "forward",
newdata = ebmt3[1:30,], trans = tmat)
#Now we can plot the transition probabilities for each subject separately:
plot(tp_subjects, id = 1)
#tp_subjects has length number of subjects in newdata + 1
#And tp_subjects[[i]] is an object of class "probtrans", so you can
#use all probtrans functions: summary, plot etc.
}
Print a "npmsm" object
Description
Print some details of a npmsm
fit
Usage
## S3 method for class 'npmsm'
print(x, ...)
Arguments
x |
An object of class |
... |
Additional arguments to print |
Value
A summary of the fitted model will be displayed in the console
Print method for a summary.probtrans.subjects object
Description
Print method for a summary.probtrans.subjects object
Usage
## S3 method for class 'summary.probtrans.subjects'
print(x, complete = FALSE, ...)
Arguments
x |
Object of class 'summary.probtrans.subjects', to be printed |
complete |
Whether or not the complete estimated transition
probabilities should be printed ( |
... |
Further arguments to print |
Examples
## Not run:
# If all time points should be printed, specify complete=TRUE in the print statement
print(x, complete=TRUE)
## End(Not run)
Calculate subject specific transition probabilities from a
multi-state coxph
model.
Description
Given a coxph model fit on multi-state data (prepared with msprep
),
determine transition probabilities for subjects in newdata
.
Usage
probtrans_coxph(
object,
predt,
direction = c("forward", "fixedhorizon"),
newdata,
trans
)
Arguments
object |
A |
predt |
A positive number indicating the prediction time. This is either
the time at which the prediction is made (if |
direction |
One of |
newdata |
A
Note that newdata must contain a column containing the variable which was
used to determine the stratum of a transition in |
trans |
A transition matrix as created by |
Details
When using this function for newdata
with many subjects, consider running the
function multiple times for parts of newdata
to negate the risk of
running our of memory.
Value
An object of class "probtrans.subjects"
.
This is a list of length n (number of subjects in newdata), with each list element
an object of class probtrans
for the associated
subject. List elements can be accessed using [[x]]
, with x
ranging from 1 to n. Additionally, each list element
has an element $id
, representing the subject id and the output object
also has an element $subject_ids
representing the subject ids in order.
Examples
#Example from the mstate vignette
#We determine the subject specific transition probabilities for subjects
#in the ebmt3 data-set
if(require("mstate")){
data(ebmt3)
n <- nrow(ebmt3)
tmat <- transMat(x = list(c(2, 3), c(3), c()), names = c("Tx",
"PR", "RelDeath"))
ebmt3$prtime <- ebmt3$prtime/365.25
ebmt3$rfstime <- ebmt3$rfstime/365.25
covs <- c("dissub", "age", "drmatch", "tcd", "prtime")
msbmt <- msprep(time = c(NA, "prtime", "rfstime"), status = c(NA,
"prstat", "rfsstat"), data = ebmt3, trans = tmat, keep = covs)
#Expand covariates so that we can have transition specific covariates
msbmt <- expand.covs(msbmt, covs, append = TRUE, longnames = FALSE)
#Create extra variable to allow gender mismatch to have the same effect
#for transitions 2 and 3.
msbmt$drmatch.2.3 <- msbmt$drmatch.2 + msbmt$drmatch.3
#Introduce pr covariate for proportionality assumption of transitions 2 and 3
#(only used in models 2 and 4)
msbmt$pr <- 0
msbmt$pr[msbmt$trans == 3] <- 1
#-------------Models---------------------#
#Simple model, transition specific covariates, each transition own baseline hazard
c1 <- coxph(Surv(Tstart, Tstop, status) ~ dissub1.1 + dissub2.1 +
age1.1 + age2.1 + drmatch.1 + tcd.1 + dissub1.2 + dissub2.2 +
age1.2 + age2.2 + drmatch.2 + tcd.2 + dissub1.3 + dissub2.3 +
age1.3 + age2.3 + drmatch.3 + tcd.3 + strata(trans), data = msbmt,
method = "breslow")
#Model with same baseline hazards for transitions 2 (1->3) and 3(2->3)
#pr then gives the ratio of the 2 hazards for these transitions
c2 <- coxph(Surv(Tstart, Tstop, status) ~ dissub1.1 + dissub2.1 +
age1.1 + age2.1 + drmatch.1 + tcd.1 + dissub1.2 + dissub2.2 +
age1.2 + age2.2 + drmatch.2 + tcd.2 + dissub1.3 + dissub2.3 +
age1.3 + age2.3 + drmatch.3 + tcd.3 + pr + strata(to), data = msbmt,
method = "breslow")
#Same as c2, but now Gender mismatch has the same effect for both
c3 <- coxph(Surv(Tstart, Tstop, status) ~ dissub1.1 + dissub2.1 +
age1.1 + age2.1 + drmatch.1 + tcd.1 + dissub1.2 + dissub2.2 +
age1.2 + age2.2 + drmatch.2.3 + tcd.2 + dissub1.3 + dissub2.3 +
age1.3 + age2.3 + tcd.3 + pr + strata(to), data = msbmt,
method = "breslow")
#Predict for first 30 people in ebmt data
tmat2 <- to.trans2(tmat)[, c(2,3,1)]
names(tmat2)[3] <- "trans"
n_transitions <- nrow(tmat2)
newdata <- ebmt3[rep(seq_len(30), each = nrow(tmat2)),]
newdata <- cbind(tmat2[rep(seq_len(nrow(tmat2)), times = 30), ], newdata)
#Make of class "msdata"
attr(newdata, "trans") <- tmat
class(newdata) <- c("msdata", "data.frame")
#Covariate names to expand
covs <- names(newdata)[5:ncol(newdata)]
newdata <- expand.covs(newdata, covs, append = TRUE, longnames = FALSE)
newdata$drmatch.2.3 <- newdata$drmatch.2 + newdata$drmatch.3
newdata$pr <- 0
newdata$pr[newdata$trans == 3] <- 1
#Calculate transition probabilities for the Cox fits
icmstate_pt1 <- probtrans_coxph(c1, predt = 0, direction = "forward",
newdata = newdata, trans = tmat)
icmstate_pt2 <- probtrans_coxph(c2, predt = 0, direction = "forward",
newdata = newdata, trans = tmat)
icmstate_pt3 <- probtrans_coxph(c3, predt = 0, direction = "forward",
newdata = newdata, trans = tmat)
#Now we can plot the transition probabilities for each subject separately:
plot(icmstate_pt1[[1]])
#icmstate_pt has length number of subjects in newdata
#And icmstate_pt1[[i]] is an object of class "probtrans", so you can
#use all probtrans functions: summary, plot etc.
#Alternatively, use the plotting function directly:
plot(icmstate_pt2, id = 2)
}
Determine transition probabilities for a multi-state model with Weibull hazards for the transitions.
Description
Determine transition probabilities for a multi-state model with Weibull hazards for the transitions.
Usage
probtrans_weib(transMat, times, shapes, scales, type = c("prodint", "ODE"))
Arguments
transMat |
A transition matrix as generated by |
times |
The times at which the transition probabilities should be
determined. Will always determine the probabilities forward in time starting
from |
shapes |
The Weibull shapes corresponding to the numbered transitions
in |
scales |
The Weibull scales corresponding to the numbered transitions
in |
type |
Should the transition probabilities be determined using
product integration |
Value
An object containing the "true" transition probabilities for the specified Weibull hazards.
Examples
#Illness-death model
tmat <- mstate::trans.illdeath()
IDM <- probtrans_weib(tmat, seq(0, 15, 0.01), shapes = c(0.5, 0.5, 2),
scales = c(5, 10, 10/gamma(1.5)), type = "prodint")
IDM2 <- probtrans_weib(tmat, seq(0, 15, 0.01), shapes = c(0.5, 0.5, 2),
scales = c(5, 10, 10/gamma(1.5)), type = "ODE")
plot(IDM)
plot(IDM2)
#Extended illness-death model
tmat <- mstate::transMat(list(c(2, 3), c(4), c(), c()))
IDM <- probtrans_weib(tmat, seq(0, 15, 0.01), shapes = c(0.5, 0.5, 2),
scales = c(5, 10, 10/gamma(1.5)), type = "prodint")
IDM2 <- probtrans_weib(tmat, seq(0, 15, 0.01), shapes = c(0.5, 0.5, 2),
scales = c(5, 10, 10/gamma(1.5)), type = "ODE")
plot(IDM)
plot(IDM2)
Calculate the product of intensities over interval decided by failure times
Description
This function calculates \prod_{\lambda}
G as defined in
Frydman (1995), with t_n
the failure times. Note that length(t_n) must be
equal to length(lambda)
Usage
prod_lambda_G_base(lambda, t_n, Q, A)
Arguments
lambda |
Intensities of the 2->3 transition |
t_n |
Unique failure times, same length as lambda |
Q |
Matrix (2 column) containing support intervals as rows |
A |
Matrix (2 column) containing censoring intervals as rows |
Remove redundant observations from supplied data frame
Description
Remove redundant observed states from a supplied data frame. Observations are redundant either when we observe an absorbing state multiple times (as we cannot leave an absorbing state), or when a transient state is observed multiple times between transitions (as we cannot have loops, therefore no extra information is provided when we observe a transient state multiple times).
Usage
remove_redundant_observations(gd, tmat)
Arguments
gd |
A
The true transition time between states is then interval censored between the times. |
tmat |
A transition matrix as created by |
Value
A data.frame
containing the information contained in the
input data.frame
gd
,
but without redundant observations. Depending on whether tmat
was
specified the function may remove more observations.
Examples
#We simulate some data
#Function to generate evaluation times: at 0 and uniform inter-observation
eval_times <- function(n_obs, stop_time){
cumsum( c( 0, runif( n_obs-1, 0, 2*(stop_time-4)/(n_obs-1) ) ) )
}
#Simulate illness-death model data with Weibull transitions
sim_dat <- sim_id_weib(n = 20, n_obs = 6, stop_time = 15, eval_times = eval_times,
start_state = "stable", shape = c(0.5, 0.5, 2),
scale = c(5, 10, 10/gamma(1.5)))
visualise_msm(sim_dat)
require(mstate)
sim_dat_clean <- remove_redundant_observations(sim_dat, trans.illdeath())
visualise_msm(sim_dat_clean)
Simulate panel data from an illness-death model with Weibull transition hazards
Description
An illness-death model has 3 transitions:
1
:State 1 (Healthy) to State 2 (Illness);
2
:State 1 (Healthy) to State 3 (Death);
3
:State 2 (Illness) to State 3 (Death);
Using this function, it is possible to simulate data from an illness-death model with Weibull transition intensities. Requires the use of an external (self-written) function to generate observation times.
Usage
sim_id_weib(
n,
n_obs,
stop_time,
eval_times,
start_state = c("stable", "equalprob"),
shape,
scale,
...
)
Arguments
n |
Number of subjects to generate paths for. |
n_obs |
Number of observations in time period for each subject. |
stop_time |
Largest time at which the model is considered. |
eval_times |
A function which returns the evaluation times for a subject.
Must have as arguments at least |
start_state |
In which states can subjects start? Either everyone starts in state 1 ("stable") or equal probability to start in state 1 or 2 ("equalprob"). |
shape |
Vector of shape parameters for the 3 transitions. See |
scale |
Vector of scale parameters for the 3 transitions. See |
... |
Further parameters to |
Details
Taking shape = 1
we get an exponential distribution with rate
1/scale
Value
Panel data in the form of a data.frame
with 3 named columns
id, time and state. These represent the subject identifier, the observation
time and the state at the
observation time.
Examples
#Function to generate evaluation times: at 0 and uniform inter-observation
eval_times <- function(n_obs, stop_time){
cumsum( c( 0, runif( n_obs-1, 0, 2*(stop_time-4)/(n_obs-1) ) ) )
}
#Simulate illness-death model data with Weibull transitions
sim_dat <- sim_id_weib(n = 20, n_obs = 6, stop_time = 15, eval_times = eval_times,
start_state = "stable", shape = c(0.5, 0.5, 2), scale = c(5, 10, 10/gamma(1.5)))
visualise_msm(sim_dat)
Simulate multiple trajectories from an interval-censored multi-state model with Weibull transition intensities
Description
Simulate multiple trajectories from a multi-state model quantified by a transition matrix, with interval-censored transitions and Weibull distributed transition intensities. Allows for Weibull censoring in each of the states.
Usage
sim_weibmsm(
data,
tmat,
startprobs,
exact,
shape,
scale,
censshape,
censscale,
n_subj,
obs_pars,
true_trajec = FALSE
)
Arguments
data |
A |
tmat |
A transition matrix as created by |
startprobs |
A numeric vector of length H indicating the probability of each subject to start in any of the possible states. Must sum to 1. By default, all subjects will start in state 1. |
exact |
A numeric vector indicating which states are exactly observed.
The transition time to exact states will be observed at exact times, regardless
of the times in |
shape |
A numeric vector of length M indicating the shape of the Weibull
transition intensity for the corresponding transition in |
scale |
A numeric vector of length M indicating the scale of the Weibull
transition intensity for the corresponding transition in |
censshape |
A numeric vector of length H indicating the Weibull
censoring shape in each of the states. If no censoring is required in some states,
set corresponding entries to |
censscale |
A numeric vector of length H indicating the Weibull censoring
scale in each of the states. If no censoring is required in some states,
set corresponding entries to |
n_subj |
(Optional) Instead of specifying |
obs_pars |
(Optional) A numeric vector of length 3 specifying what the
time is between planned assessments, what the uniform deviation from this
time is at the visits and the maximum visit time.
Specifying |
true_trajec |
Should the true (right-censored) trajectory be returned for
the subjects as well? Default = |
Details
Taking (cens)shape
to be 1 for all transitions, we obtain exponential
(censoring)/transitions with rate 1/(cens)scale
.
If right-censoring parameters are specified, a right-censoring time is generated in
each of the visited states. If the subject is right-censored, we assume the subject
is no longer observed at later obstimes
. Due to the interval-censored
nature of the generation process, it may therefore appear as if the subject
was right-censored in an earlier state.
Suppose a subject arrives in state g at time s. If we wish to generate
a survival time from that state according to a Weibull intensity in a clock forward
model, we can use the inverse transform of the conditional Weibull intensity.
More specifically, letting a
denote the shape and \sigma
denote the scale,
the conditional survival function for t > s
is given by
S(t|s) = \mathbf{P}(T \geq t | T \geq s) = \exp(\left( \frac{s}{\sigma} \right)^a - \left( \frac{t}{\sigma} \right)^a)
The corresponding cumulative intensity is then given by:
A(t|s) = -\log(S(t|s)) = \left( \frac{t}{\sigma} \right)^a - \left( \frac{s}{\sigma} \right)^a
And the inverse cumulative intensity is then:
A^{-1}(t|s) = \sigma \sqrt[a]{t + \left( \frac{s}{\sigma} \right)^a}
A conditional survival time is then generated by:
T|s = A^{-1}(-\log(U)|s)
with U
a sample from the standard uniform distribution.
If we additionally have covariates (or frailties), the -\log(U)
above should be replaced by \frac{-\log(U)}{\exp(\beta X)}
with \beta
and X
the coefficients and covariates respectively.
Value
A matrix
with 3 columns time
, state
and id
, indicating
the observation time, the corresponding state and subject identifier. If
true_trajec = TRUE
, a list
with the matrix described above and a matrix
representing the underlying right-censored trajectory.
Examples
require(mstate)
require(ggplot2)
#Generate from an illness-death model with exponential transitions with
#rates 1/2, 1/10 and 1 for 10 subjects over a time grid.
gd <- sim_weibmsm(tmat = trans.illdeath(), shape = c(1,1,1),
scale = c(2, 10, 1), n_subj = 10, obs_pars = c(2, 0.5, 20),
startprobs = c(0.9, 0.1, 0), true_trajec = TRUE)
#Observed trajectories
visualise_msm(gd$observed)
#True trajectories
visualise_msm(gd$true)
#Can supply data-frame with specified observation times
obs_df <- data.frame(time = c(0, 1, 3, 5, 0.5, 6, 9),
id = c(1, 1, 1, 1, 2, 2, 2))
gd <- sim_weibmsm(data = obs_df, tmat = trans.illdeath(), shape = c(1, 1, 1),
scale = c(2, 10, 1))
visualise_msm(gd)
Smooth hazard estimation for general multi-state model with interval censored transitions
Description
For a general Markov chain multi-state model with interval censored
transitions calculate the NPMLE of the transition intensities. The estimates
are returned as an msfit
object. The smallest time
in the data will be set to zero.
Usage
smoothmsm(
gd,
tmat,
exact,
formula,
data,
deg_splines = 3,
n_segments = 20,
ord_penalty = 2,
maxit = 100,
tol = 1e-04,
Mtol = 1e-04,
conv_crit = c("haz", "prob", "lik"),
verbose = FALSE,
prob_tol = tol/10,
ode_solver = "lsoda",
ridge_penalty = 1e-06
)
Arguments
gd |
A
The true transition time between states is then interval censored between the times. |
tmat |
A transition matrix as created by |
exact |
Numeric vector indicating to which states transitions are observed at exact times.
Must coincide with the column number in |
formula |
Formula to interpret in data for covariates. |
data |
A |
deg_splines |
Degree to use for the B-spline basis functions. Defaults to 3 (cubic B-splines). |
n_segments |
Number of segments to use for the P-splines. The segments will space the domain evenly. According to Eilers & Marx (2021), it it OK to choose this number very large. Default = 20. |
ord_penalty |
Order of the P-spline penalty (penalty on the difference between d-order differences of spline coefficients). See Eilers & Marx Section 2.3. Defaults to 2. |
maxit |
Maximum number of iterations. Default = 100. |
tol |
Tolerance of the convergence procedure in the E-step. A change in the value of
|
Mtol |
Tolerance of the convergence procedure of the M-step. The M-step
consists of an Iteratively Reweighted Least Squares (IRLS) procedure, where
the (unobserved) complete-data likelihood is maximized. Default is |
conv_crit |
Convergence criterion. Stops procedure when the difference
in the chosen quantity between two consecutive iterations is smaller
than the tolerance level
Default is "haz". The options "haz" and "lik" can be compared across different
|
verbose |
Should iteration messages be printed? Default is FALSE |
prob_tol |
If an estimated probability is smaller than |
ode_solver |
The integrator to use for solving the ODE's. See
|
ridge_penalty |
The ridge penalty to use for estimating risk-adjustment coefficients. Default = 1e-06. |
References
Eilers, P.H.C. and Marx, B.D., Practical Smoothing: The Joys of P-splines, Cambridge University Press (2021)
Calculate the subject specific intensity matrices
Description
For each subject, calculate a 3D array containing the states (from, to) in the first two dimension and the times in the third. This is then stored in a 4D array, with the 4th dimension indicating each unique subject.
Usage
subject_specific_intensity_matrices(
subject_specific_risks,
baseline_intensities,
trans
)
Determine subject specific intensity matrix for a single subject
Description
Determine subject specific intensity matrix for a single subject
Usage
subject_specific_intensity_matrix(
subject_specific_risk,
intensity_matrices_zero_diag,
trans,
n_transitions,
n_times
)
Summary method for a probtrans.subjects object
Description
Summary method for an object of class 'probtrans.subjects'. It prints a selection of
the estimated transition probabilities. Wrapper for
summary.probtrans
.
Usage
## S3 method for class 'probtrans.subjects'
summary(object, id, times, from = 1, to = 0, extend = FALSE, ...)
Arguments
object |
Object of class 'probtrans.subjects', containing estimated transition probabilities from and to all states in a multi-state model |
id |
Subject identifier |
times |
Time points at which to evaluate the transition probabilites |
from |
Specifies from which state the transition probabilities are to be printed. Should be subset of 1:S, with S the number of states in the multi-state model. Default is print from state 1 only. User can specify from=0 to print transition probabilities from all states |
to |
Specifies the transition probabilities to which state are to be printed. User can specify to=0 to print transition probabilities to all states. This is also the default |
extend |
logical value: if |
... |
Further arguments to |
Value
Function summary.probtrans
returns an object of class
"summary.probtrans.subjects", which is a list (for each from
state) of
transition probabilities at the specified (or all) time points. The
print
method of a summary.probtrans.subjects
doesn't return a value.
Author(s)
Hein Putter and Daniel Gomon
See Also
Examples
if(require("mstate")){
data(ebmt3)
n <- nrow(ebmt3)
tmat <- transMat(x = list(c(2, 3), c(3), c()), names = c("Tx",
"PR", "RelDeath"))
ebmt3$prtime <- ebmt3$prtime/365.25
ebmt3$rfstime <- ebmt3$rfstime/365.25
covs <- c("dissub", "age", "drmatch", "tcd", "prtime")
msbmt <- msprep(time = c(NA, "prtime", "rfstime"), status = c(NA,
"prstat", "rfsstat"), data = ebmt3, trans = tmat, keep = covs)
#Expand covariates so that we can have transition specific covariates
msbmt <- expand.covs(msbmt, covs, append = TRUE, longnames = FALSE)
#Simple model, transition specific covariates, each transition own baseline hazard
c1 <- coxph(Surv(Tstart, Tstop, status) ~ dissub1.1 + dissub2.1 +
age1.1 + age2.1 + drmatch.1 + tcd.1 + dissub1.2 + dissub2.2 +
age1.2 + age2.2 + drmatch.2 + tcd.2 + dissub1.3 + dissub2.3 +
age1.3 + age2.3 + drmatch.3 + tcd.3 + strata(trans), data = msbmt,
method = "breslow")
#We need to make a data.frame containing all subjects of interest
ttmat <- to.trans2(tmat)[, c(2, 3, 1)]
names(ttmat)[3] <- "trans"
nd_n <- NULL
for (j in 1:30) {
# Select global covariates of subject j
cllj <- ebmt3[j, covs]
nd2 <- cbind(ttmat, rep(j, 3), rbind(cllj, cllj, cllj))
colnames(nd2)[4] <- "id"
# Make nd2 of class msdata to use expand.covs
attr(nd2, "trans") <- tmat
class(nd2) <- c("msdata", "data.frame")
nd2 <- expand.covs(nd2, covs=covs, longnames = FALSE)
nd_n <- rbind(nd_n, nd2)
}
icmstate_pt <- probtrans_coxph(c1, predt = 0, direction = "forward",
newdata = nd_n, trans = tmat)
#Obtain summary of probtrans.subjects object
plot(icmstate_pt, id = 2)
}
Determine the support of the NPMLE for interval censored data.
Description
Given censoring/truncation intervals, find the maxcliques and determine the support of the interval censored problem.
Usage
supportHudgens(intervals, reduction = TRUE, existence = FALSE)
Arguments
intervals |
A data.frame with 3 columns containing half-open intervals (left open, right closed) and an indicator whether the interval results from a censored transition or truncation:
Note that the truncation intervals need to be in the form (N, Inf] with N a numeric value. |
reduction |
Should the support be reduced using Lemma 3 from Hudgens (2005)? This requires checking an extra condition. Default is TRUE. |
existence |
Should the existence of the NPMLE be checked using Theorem 1/Lemma 4 from
Hudgens (2005)? Requires |
Value
-
graph
: Anigraph
object representing the censoring/truncation intervals -
support
: Support estimated from the censoring intervals -
dir_graph
: A directedigraph
object used to determine whether the NPMLE exists in the presence of left-truncation. -
exist_mle
: Logical output indicating whether the NPMLE exists.
References
Michael G. Hudgens, On Nonparametric Maximum Likelihood Estimation with Interval Censoring and Left Truncation, Journal of the Royal Statistical Society Series B: Statistical Methodology, Volume 67, Issue 4, September 2005, Pages 573-587, doi:10.1111/j.1467-9868.2005.00516.x
Estimate support of multiple transitions given direct transition intervals
Description
Given only direct transition intervals, determine the support for each transition separately using Hudgens(2001) result. Each state is considered from a competing-risks viewpoint. Hudgens(2005) result is applied to see if the NPMLE for any of the transitions does not exist.
Usage
support_from_direct_intervals(direct_intervals, tmat)
Arguments
direct_intervals |
Output from |
tmat |
A transition matrix as created by |
Value
A list containing the estimated support sets for each possible transition
in tmat
.
Find support for Illness-death model. Only applicable to Frydman (1995) setting!!
Description
Find support for Illness-death model. Only applicable to Frydman (1995) setting!!
Usage
support_frydman(data_idx)
Arguments
data_idx |
List containing all the necessary data, output from the
|
Details
Finds the support sets Q for the interval censored transition 1 -> 2. The transitions 2->3 and 1->3 are exactly observed in this setting, therefore do not require determination of the support.
Value
A: Matrix containing the intervals indicating when the transitions were censored. unique_L_times: Union of failure times in Group 4 and censoring times in Group 1. L_tilde: Left intervals for constructing the Q intervals R_tilde: Right intervals for constructing the Q intervals Q_mat: Matrix containing the support set for the 1->2 transition I: number of intervals in the support set Q.
References
Frydman, H. (1995). Nonparametric Estimation of a Markov 'Illness-Death' Process from Interval- Censored Observations, with Application to Diabetes Survival Data. Biometrika, 82(4), 773-789. doi:10.2307/2337344
Numerically find the support of the transitions from a converged npmsm algorithm
Description
For each transition in tmat
, determine all consecutive
bins with non-zero (higher than cutoff
) transition intensities.
These then determine the numerical support of the transition.
Usage
support_npmsm(npmsm, cutoff = 1e-08)
Arguments
npmsm |
|
cutoff |
Above which value is a mass in a bin considered to be non-zero? Default = 1e-8. Note that this is independent of bin size, so can be tricky!! |
Value
A list containing a list for each transition. Each transition specific list contains the support intervals for that transition in a matrix with 3 named columns L, R and dA, indicating the left/right-endpoints of the support intervals and the change in the estimated intensities over this support interval.
Examples
require(mstate)
require(ggplot2)
#Generate from an illness-death model with exponential transitions with
#rates 1/2, 1/10 and 1 for 10 subjects over a time grid.
gd <- sim_weibmsm(tmat = trans.illdeath(), shape = c(1,1,1),
scale = c(2, 10, 1), n_subj = 10, obs_pars = c(2, 0.5, 20),
startprobs = c(0.9, 0.1, 0))
#Fit 2 models: 1 with at most 4 iterations and 1 with at most 20
mod1 <- npmsm(gd, trans.illdeath(), maxit = 4)
#Determine support numerically:
mod1_supp <- support_npmsm(mod1)
mod1_supp[[1]]
Calculate subject specific risks for subjects in newdata
Description
Return a 2 dimensional array, with subjects in the first dimension and transition (numbers) in the second. Can expand later to include time-dependent covariates by introducing extra dimension. Entries of the matrix are exp(lp)_i^m, with i denoting the subject and m the transition.
Usage
trans_specific_risks(object, newdata, trans)
Arguments
object |
A |
newdata |
A
Note that newdata must contain a column containing the variable which was
used to determine the stratum of a transition in |
trans |
A transition matrix as created by |
Wrapper for the probtrans
function
Description
For 'msm'
objects: determine transition probabilities
(as in probtrans
) from an
msm
object. Currently only direction = "forward" is supported.
For 'npmsm'
objects: Determine transition probabilities for an 'npmsm'
object
using the probtrans
function.
For 'msfit'
objects: Wrapper for probtrans
Usage
## S3 method for class 'msm'
transprob(object, predt, times, ...)
## S3 method for class 'npmsm'
transprob(object, ...)
## S3 method for class 'msfit'
transprob(object, ...)
transprob(object, ...)
Arguments
object |
Object of compatible class |
predt |
A positive number indicating the prediction time. This is
the time at which the prediction is made. If missing, smallest time of
|
times |
A vector of times at which the transition probabilities should be determined. |
... |
Further arguments to |
Details
Can be used for objects of class 'npmsm', 'msm' and 'msfit'
Value
A probtrans
object containing the estimated transition probabilities.
Visualise data for illness-death model, only applicable to Frydman(1995) setting.
Description
Visualise data for illness-death model, only applicable to Frydman(1995) setting.
Usage
visualise_data(data, msmFrydman)
Arguments
data |
A
|
msmFrydman |
A fitted model from |
Value
Returns a visualisation of illness-death data, with the transition
from healthy to illness interval-censored and the other two transitions
observed exactly or right-censored. If msmFrydman
is specified, the
support intervals from the fit are additionally plotted at the top of the
data visualisation.
References
Frydman, H. (1995). Nonparametric Estimation of a Markov 'Illness-Death' Process from Interval- Censored Observations, with Application to Diabetes Survival Data. Biometrika, 82(4), 773-789. doi:10.2307/2337344
See Also
See msm_frydman
for fitting a model.
Examples
data <- data.frame(delta = c(0, 0, 1, 1), Delta = c(0, 1, 0, 1),
L = c(NA, NA, 1, 1.5), R = c(NA, 3, 2, 3),
time = c(4, 5, 6, 7))
mod_frydman <- msm_frydman(data)
visualise_data(data, mod_frydman)
Visualise multi-state data
Description
Produce a plot with the y-axis representing subjects in the data and the x-axis representing the time at which states have been observed.
Usage
visualise_msm(gd, npmsm, tmat, neat = TRUE, cutoff)
Arguments
gd |
A
|
npmsm |
Output from |
tmat |
A transition matrix as created by |
neat |
Boolean indicating whether redundant observations should be removed in the plot. Default is TRUE |
cutoff |
cutoff value for numerically determining the support using
|
Value
A plot will be produced in the plotting window.
Examples
#Write a function for evaluation times: observe at 0 and uniform inter-observation times.
eval_times <- function(n_obs, stop_time){
cumsum( c( runif(1, 0, 0.5), runif( n_obs-1, 0, 2*(stop_time-4)/(n_obs-1) ) ) )
}
#Use built_in function to simulate illness-death data
#from Weibull distributions for each transition
sim_dat <- sim_id_weib(n = 50, n_obs = 6, stop_time = 15, eval_times = eval_times,
start_state = "stable", shape = c(0.5, 0.5, 2),
scale = c(5, 10, 10/gamma(1.5)))
#Visualise the data
visualise_msm(sim_dat)