% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/smdi_diagnose.R
\name{smdi_diagnose}
\alias{smdi_diagnose}
\title{Computes summary metrics of all three missing data group diagnostics}
\usage{
smdi_diagnose(
  data = NULL,
  covar = NULL,
  median = TRUE,
  includeNA = FALSE,
  train_test_ratio = c(0.7, 0.3),
  set_seed = 42,
  ntree = 1000,
  n_cores = 1,
  model = c("logistic", "linear", "cox"),
  form_lhs = NULL,
  exponentiated = FALSE
)
}
\arguments{
\item{data}{dataframe or tibble object with partially observed/missing variables}

\item{covar}{character vector with the name(s) of the partially observed covariate(s)/column(s) to investigate. If NULL, the function automatically includes all columns with at least one missing observation, and all remaining covariates are used as predictors}

\item{median}{logical, whether the median (TRUE; recommended default) or the mean (FALSE) of all absolute standardized mean differences (asmd) should be computed (see smdi_asmd)}

\item{includeNA}{logical, should missingness of other partially observed covariates be explicitly modeled for computation of absolute standardized mean differences (default is FALSE)}

\item{train_test_ratio}{numeric vector indicating the train/test split ratio for the random forest missingness prediction model; the default is c(0.7, 0.3)}

\item{set_seed}{seed for reproducibility of the random forest missingness prediction model (defaults to 42)}

\item{ntree}{integer, number of trees for the random forest missingness prediction model (defaults to 1000 trees)}

\item{n_cores}{integer, if >1, computations will be parallelized across the number of cores specified in n_cores (UNIX systems only)}

\item{model}{character describing which outcome model to fit to assess the association between the missingness indicator of covar and the outcome. Currently supported model types are logistic, linear and cox (see smdi_outcome)}

\item{form_lhs}{string specifying the left-hand side of the outcome formula (see smdi_outcome)}

\item{exponentiated}{logical, should the results of the outcome regression assessing the association between missingness and outcome be exponentiated (default is FALSE)}
}
\value{
smdi object including a summary table of all three smdi group diagnostics:

\strong{Group 1 diagnostic:}
\itemize{
\item asmd_{mean/median}: average/median absolute standardized mean difference (and min, max) of patient characteristics between patients without (1) and with (0) an observed value of the covariate
\item hotteling_p: p-value of Hotelling's test. Rejecting H0 means that Hotelling's test detects a significant difference in the distribution of patient characteristics between patients without (1) and with (0) an observed value of the covariate
}

\strong{Group 2 diagnostic:}
\itemize{
\item rf_auc: the area under the receiver operating characteristic curve (AUC) as a measure of the ability to predict the missingness of the partially observed covariate
}

\strong{Group 3 diagnostic:}
\itemize{
\item estimate_univariate: univariate association between the missingness indicator of covar and the outcome
\item estimate_adjusted: association between the missingness indicator of covar and the outcome, conditional on the other fully observed covariates and the missingness indicator variables of other partially observed covariates
}
}
\description{
This function bundles and calls all three group diagnostics and returns their most important summary metrics.
For more information and details, please refer to the individual functions.

Important: do not include variables such as ID variables, ZIP codes, or dates.
}
\details{
Wrapper for the individual diagnostics functions.
}
\examples{
library(smdi)

smdi_diagnose(
 data = smdi_data,
 covar = "egfr_cat",
 model = "cox",
 form_lhs = "Surv(eventtime, status)"
 )
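
# A further sketch, assuming the same example data and outcome as above:
# leaving covar at its default (NULL) should let the function investigate
# every column with at least one missing observation, as described for the
# covar argument. Wrapped in dontrun because the runtime of the random
# forest step on this data is not known here.
\dontrun{
smdi_diagnose(
 data = smdi_data,
 model = "cox",
 form_lhs = "Surv(eventtime, status)"
 )
}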

}
\seealso{
\code{\link{smdi_asmd}}
\code{\link{smdi_hotelling}}
\code{\link{smdi_little}}
\code{\link{smdi_rf}}
\code{\link{smdi_outcome}}
}
