% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/StSignificanceTestingCadVsRad.R
\name{StSignificanceTestingCadVsRad}
\alias{StSignificanceTestingCadVsRad}
\title{Significance testing: standalone CAD vs. radiologists}
\usage{
StSignificanceTestingCadVsRad(
  dataset,
  FOM,
  FPFValue = 0.2,
  method = "1T-RRRC",
  alpha = 0.05,
  plots = FALSE
)
}
\arguments{
\item{dataset}{\strong{The dataset to be analyzed; must be a single-treatment
multiple-reader dataset, where the first reader is CAD.}}

\item{FOM}{The desired FOM; for ROC data it must be \code{"Wilcoxon"}; for FROC data 
it can be any valid FOM, e.g., \code{"HrAuc"}, \code{"wAFROC"}, etc.; 
for LROC data it must be \code{"Wilcoxon"}, \code{"PCL"} or \code{"ALROC"}.}

\item{FPFValue}{Only needed for \code{LROC} data \strong{and} FOM = "PCL" or "ALROC":
the FPF value at which the partial-curve-based figure of merit is evaluated. The default is 0.2.}

\item{method}{The desired analysis: "1T-RRFC", "1T-RRRC" (the default) or "2T-RRRC";
see manuscript for details.}

\item{alpha}{Significance level of the test, defaults to 0.05.}

\item{plots}{Flag, default is FALSE, i.e., a plot is not displayed. 
If TRUE, it displays the appropriate operating characteristic for all 
readers and CAD.}
}
\value{
If \code{method = "1T-RRRC"} the return value is a 
   list with the following elements:

\item{fomCAD}{The observed FOM for CAD.}

\item{fomRAD}{The observed FOM array for the readers.}

\item{avgRadFom}{The average FOM of the readers.}

\item{avgDiffFom}{The mean of the difference FOM, RAD - CAD.}

\item{ciAvgDiffFom}{The 95-percent CI of the average difference, RAD - CAD.}

\item{varR}{The variance of the radiologists.}

\item{varError}{The variance of the error term in the single-treatment 
   multiple-reader OR model.}

\item{cov2}{The covariance of the error term.}

\item{tstat}{The observed value of the t-statistic; its square is 
   equivalent to an F-statistic.}

\item{df}{The degrees of freedom of the t-statistic.}

\item{pval}{The p-value for rejecting the null hypothesis (NH).}

\item{Plots}{If argument plots = TRUE, a \pkg{ggplot2} object 
   containing the empirical operating characteristics 
   corresponding to the specified FOM. For example, if \code{FOM} = 
   \code{"Wilcoxon"} an ROC plot object
   is produced where reader 1 is CAD. If an LROC FOM is selected, an LROC
   plot is produced.}

If \code{method = "2T-RRRC"} the return value is a list 
   with the following elements:

\item{fomCAD}{The observed FOM for CAD.}

\item{fomRAD}{The observed FOM array for the readers.}

\item{avgRadFom}{The average FOM of the readers.}

\item{avgDiffFom}{The mean of the difference FOM, RAD - CAD.}

\item{ciDiffFom}{A data frame containing the statistics associated 
   with the average difference, RAD - CAD.}

\item{ciAvgRdrEachTrt}{A data frame containing the statistics 
   associated with the average FOM in each "treatment".}

\item{varR}{The variance of the pure reader term in the OR model.}

\item{varTR}{The variance of the treatment-reader 
   term in the OR model.}

\item{cov1}{The covariance1 of the error term - same reader, 
   different treatments.}

\item{cov2}{The covariance2 of the error term  - 
   different readers, same treatment.}

\item{cov3}{The covariance3 of the error term  - different readers, 
   different treatments.}

\item{varError}{The variance of the pure error term in the OR model.}

\item{FStat}{The observed value of the F-statistic.}

\item{ndf}{The numerator degrees of freedom of the F-statistic.}

\item{df}{The denominator degrees of freedom of the F-statistic.}

\item{pval}{The p-value for rejecting the NH.}

\item{Plots}{see above.}
}
\description{
Comparing standalone CAD vs. a group of radiologists interpreting 
   the same cases; (ideally) \strong{standalone CAD} means that all the 
   \bold{designer-level} mark-rating pairs provided by the CAD algorithm 
   are available, not just the one or two marks usually displayed to the 
   radiologist. At the very minimum, location-level information, such as in
   the LROC paradigm, should be used. Ideally the FROC paradigm should be used.
   A severe statistical power penalty is paid if one uses the ROC paradigm. 
   Details of the method are in a PDF file that will be uploaded to GitHub and 
   in Chakraborty (2017).
}
\details{
\itemize{
   \item{\strong{PCL} is the probability of a correct localization.} 
   \item{The LROC is the plot of PCL (ordinate) vs. FPF.} 
   \item{For LROC data, FOM = "PCL" means the interpolated PCL value 
   at the specified \code{FPFValue}.}
   \item{For FOM = "ALROC" the trapezoidal area under the LROC
   from FPF = 0 to FPF = \code{FPFValue} is used.} 
   \item{If \code{method = "1T-RRRC"} the first \strong{reader} is assumed to be CAD.} 
   \item{If \code{method = "2T-RRRC"} the first \strong{treatment} is assumed to be CAD.} 
   \item{The NH is that the FOM of CAD equals the average of the readers.} 
   \item{The \code{method = "1T-RRRC"} analysis uses an adaptation of the 
   single-treatment multiple-reader Obuchowski-Rockette (OR) model described in 
   Hillis (2007), section 5.3. It is characterized by three parameters,
   \code{VarR}, \code{Var} and \code{Cov2}, where the latter two are estimated 
   using the jackknife.} 
   \item{For \code{method = "2T-RRRC"} the analysis replicates the CAD data as many times as
   necessary to form one "treatment" of an MRMC pairing, the other 
   "treatment" being the radiologists. Standard ORH analysis is then applied. The 
   method is described in Kooi et al. (2016). It gives exactly the same final results 
   (F-statistic, denominator degrees of freedom and p-value) as \code{"1T-RRRC"}, but the intermediate quantities 
   are meaningless.}
   }
}
\examples{
ret1M <- StSignificanceTestingCadVsRad (dataset09, 
FOM = "Wilcoxon", method = "1T-RRRC")
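
## Inspect the main results of the fit above. A minimal sketch, assuming the
## list element names documented in the Value section for method = "1T-RRRC".
print(ret1M$avgDiffFom)    # mean FOM difference, RAD - CAD
print(ret1M$ciAvgDiffFom)  # confidence interval of that difference
print(ret1M$pval)          # p-value for rejecting the NH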

StSignificanceTestingCadVsRad(datasetCadLroc, 
FOM = "Wilcoxon", method = "1T-RRFC")

retLroc1M <- StSignificanceTestingCadVsRad (datasetCadLroc, 
FOM = "PCL", method = "1T-RRRC", FPFValue = 0.05)

## test with fewer readers
dataset09a <- DfExtractDataset(dataset09, rdrs = 1:7)
ret1M7 <- StSignificanceTestingCadVsRad (dataset09a, 
FOM = "Wilcoxon", method = "1T-RRRC")

datasetCadLroc7 <- DfExtractDataset(datasetCadLroc, rdrs = 1:7)
ret1MLroc7 <- StSignificanceTestingCadVsRad (datasetCadLroc7, 
FOM = "PCL", method = "1T-RRRC", FPFValue = 0.05)

\donttest{
## takes longer than 5 sec on OSX
## retLroc2M <- StSignificanceTestingCadVsRad (datasetCadLroc, 
## FOM = "PCL", method = "2T-RRRC", FPFValue = 0.05)

## ret2MLroc7 <- StSignificanceTestingCadVsRad (datasetCadLroc7, 
## FOM = "PCL", method = "2T-RRRC", FPFValue = 0.05)
}
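
## Illustrative sketch only, not the package's internal code: how the
## interpolated PCL and the trapezoidal ALROC described in Details could be
## computed from LROC operating points. The operating points and the names
## fpf, pcl, pclAtFpf and alroc are hypothetical.
fpf <- c(0, 0.05, 0.1, 0.3, 1)
pcl <- c(0, 0.30, 0.45, 0.60, 0.70)
FPFValue <- 0.2
## PCL: linear interpolation of the LROC at FPF = FPFValue
pclAtFpf <- approx(fpf, pcl, xout = FPFValue)$y
## ALROC: trapezoidal area under the LROC from FPF = 0 to FPF = FPFValue
x <- c(fpf[fpf < FPFValue], FPFValue)
y <- c(pcl[fpf < FPFValue], pclAtFpf)
alroc <- sum(diff(x) * (y[-1] + y[-length(y)]) / 2)
print(c(PCL = pclAtFpf, ALROC = alroc))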

}
\references{
Hillis SL (2007) A comparison of denominator degrees of freedom methods 
for multiple observer ROC studies, Statistics in Medicine. 26:596-619.

Chakraborty DP (2017) \emph{Observer Performance Methods for Diagnostic Imaging - Foundations, 
Modeling, and Applications with R-Based Examples}, CRC Press, Boca Raton, FL. 

Hupse R, Samulski M, Lobbes M, et al. (2013) Standalone computer-aided detection compared to radiologists' 
performance for the detection of mammographic masses, Eur Radiol. 23(1):93-100.

Kooi T, Gubern-Merida A, et al. (2016) A comparison between a deep convolutional 
neural network and radiologists for classifying regions of interest in mammography. 
Paper presented at: International Workshop on Digital Mammography, Malmo, Sweden.
}
