--- title: "Hypothesis Tests in lsReg" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Hypothesis Tests in lsReg} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ## Available test statistics `lsReg` supports five hypothesis tests for the added covariate(s), specified via the `testtype` argument to `lsReg()`: | `testtype` | Description | |---|---| | `"lrt"` | Likelihood ratio test — twice the log-likelihood difference between the full and base models | | `"score"` | Score (Rao) test — evaluated at the base-model estimates, no full model fit required | | `"wald"` | Wald test — based on the maximum likelihood estimate and its standard error | | `"robustscore"` | Score test with Huber-White sandwich variance | | `"robustwald"` | Wald test with Huber-White sandwich variance | The **standard** tests (LRT, score, Wald) assume the model is correctly specified. The **robust** tests use the Huber-White sandwich estimator for the variance-covariance matrix, providing valid inference when the variance function is mis-specified (e.g., overdispersion in count data, heteroscedasticity in linear models). ### Return values - `"lrt"` returns a chi-square statistic with degrees of freedom equal to the number of added columns. P-values use `pchisq(..., lower.tail = FALSE)`. - All other tests return a z-score. P-values use `2 * pnorm(-abs(statistic))`. Under correct model specification all five tests are asymptotically equivalent, so their statistics should be similar in large samples. ## Example We use the simulated dataset with `ylin` as the outcome and `ylin ~ x1 + x2` as the base model. We test `z5` (a true predictor) and `z9` (a true predictor) separately under all five test types and compare the resulting p-values. ```{r load-data} library(lsReg) datafile <- system.file("extdata", "simulated_data.rds", package = "lsReg") dat <- readRDS(datafile) basemdl <- glm(ylin ~ x1 + x2, data = dat, family = gaussian) ``` ```{r run-all-tests} testtypes <- c("lrt", "score", "robustscore", "wald", "robustwald") zvars <- c("z5", "z9") results <- expand.grid(testtype = testtypes, variable = zvars, stringsAsFactors = FALSE) results$statistic <- NA_real_ results$pvalue <- NA_real_ for (tt in testtypes) { mem <- lsReg(basemdl, colstoadd = 1, testtype = tt) for (zv in zvars) { xr <- as.matrix(dat[, zv, drop = FALSE]) addcovar(mem, xr) stat <- if (tt == "lrt") mem$testvalue[1] else mem$testvalue[1, 1] pval <- if (tt == "lrt") pchisq(stat, df = 1, lower.tail = FALSE) else 2 * pnorm(-abs(stat)) idx <- results$testtype == tt & results$variable == zv results$statistic[idx] <- stat results$pvalue[idx] <- pval } } ``` ## Results ```{r results} # Reshape for a readable side-by-side comparison res_z5 <- results[results$variable == "z5", c("testtype", "statistic", "pvalue")] res_z9 <- results[results$variable == "z9", c("testtype", "statistic", "pvalue")] cat("=== z5 ===\n") print(res_z5, digits = 5, row.names = FALSE) cat("\n=== z9 ===\n") print(res_z9, digits = 5, row.names = FALSE) ``` For both `z5` and `z9` all five tests yield similar statistics and p-values, consistent with the asymptotic equivalence of the tests under a correctly specified model. Minor differences arise because each test uses a different approximation to the same quantity. The LRT statistic is on the chi-square scale; the others are z-scores. 
To compare the statistics directly, the square root of the LRT statistic should approximate the absolute value of the z-scores from the other tests:

```{r compare-scales}
for (zv in zvars) {
  lrt_z   <- sqrt(results$statistic[results$testtype == "lrt" & results$variable == zv])
  wald_z  <- results$statistic[results$testtype == "wald" & results$variable == zv]
  score_z <- results$statistic[results$testtype == "score" & results$variable == zv]
  cat(zv,
      "- sqrt(LRT):", round(lrt_z, 4),
      " |Wald|:", round(abs(wald_z), 4),
      " |Score|:", round(abs(score_z), 4), "\n")
}
```
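Since the p-values are already on a common scale across all five tests, one way to scan the agreement at a glance is to reshape them into a wide table. This sketch uses only base R and the `results` data frame built above:

```{r pvalue-table}
# One row per test type, one p-value column per tested variable
pmat <- reshape(results[, c("testtype", "variable", "pvalue")],
                idvar = "testtype", timevar = "variable",
                direction = "wide")
print(pmat, digits = 4, row.names = FALSE)
```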