--- title: "Getting Started with likelihood.model" output: rmarkdown::html_vignette: toc: true vignette: > %\VignetteIndexEntry{Getting Started with likelihood.model} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 4 ) old_opts <- options(digits = 4) ``` ## Introduction The `likelihood.model` package provides a flexible framework for specifying and using likelihood models for statistical inference in R. It supports: - Generic likelihood model interface (`loglik`, `score`, `hess_loglik`, `fim`) - Maximum likelihood estimation with confidence intervals - Bootstrap sampling distributions - Pure likelihood-based (Fisherian) inference - Likelihood ratio tests This vignette provides a quick introduction to the main features using the built-in `exponential_lifetime` model. For contribution-based models with heterogeneous observation types (exact, censored, etc.), see the companion package `likelihood.contr`. ## Installation ```{r install, eval=FALSE} # Install from GitHub if (!require(devtools)) install.packages("devtools") devtools::install_github("queelius/likelihood.model") ``` ## Loading Packages ```{r load, message=FALSE, warning=FALSE} library(likelihood.model) ``` ## Example 1: Closed-Form MLE with Exponential Lifetime Model The `exponential_lifetime` model demonstrates a key design feature: specialized models can override `fit()` to bypass `optim` entirely when the MLE has a closed-form solution. For Exponential(lambda) data, the MLE is simply lambda_hat = d / T, where d is the number of uncensored observations and T is the total observation time. ```{r exponential-example} # Generate exponential survival data set.seed(99) df_exp <- data.frame(t = rexp(200, rate = 3.0)) # Create model and fit -- no initial guess needed! model_exp <- exponential_lifetime("t") # View model assumptions assumptions(model_exp) # Fit the model mle_exp <- fit(model_exp)(df_exp) # View results summary(mle_exp) ``` ### Extracting Results ```{r exponential-results} # Parameter estimates cat("Estimated parameters:\n") print(coef(mle_exp)) # Confidence intervals (Wald-based) cat("\n95% Confidence Intervals:\n") print(confint(mle_exp)) # Standard errors cat("\nStandard Errors:\n") print(se(mle_exp)) # Log-likelihood value cat("\nLog-likelihood:", as.numeric(logLik(mle_exp)), "\n") # AIC for model selection cat("AIC:", AIC(mle_exp), "\n") # The score at the MLE is exactly zero (by construction) cat("Score at MLE:", score_val(mle_exp), "\n") ``` ## Example 2: Right-Censored Exponential Data The model handles right-censoring naturally through the sufficient statistics. ```{r exponential-censored} # Generate data with right-censoring at time 0.5 set.seed(99) true_lambda <- 3.0 x <- rexp(200, rate = true_lambda) censored <- x > 0.5 df_cens <- data.frame( t = pmin(x, 0.5), status = ifelse(censored, "right", "exact") ) cat("Censoring rate:", mean(censored) * 100, "%\n") model_cens <- exponential_lifetime("t", censor_col = "status") mle_cens <- fit(model_cens)(df_cens) cat("MLE (censored):", coef(mle_cens), "(true:", true_lambda, ")\n") cat("95% CI:", confint(mle_cens)[1, ], "\n") ``` ## Example 3: Verifying the Score at MLE At the MLE, the score function (gradient of log-likelihood) should be approximately zero. This provides a useful diagnostic. ```{r score-check} model <- exponential_lifetime("t") df <- data.frame(t = rexp(100, rate = 2.0)) # Fit MLE mle <- fit(model)(df) # Evaluate score at MLE score_func <- score(model) score_at_mle <- score_func(df, coef(mle)) cat("Score at MLE (should be near zero):\n") print(score_at_mle) # The score is also stored in the MLE object cat("\nScore stored in MLE object:\n") print(score_val(mle)) ``` The score is exactly zero at the MLE for `exponential_lifetime` because the closed-form solution satisfies the score equation by construction. ## Example 4: Fisherian Likelihood Inference The package supports pure likelihood-based inference in the Fisherian tradition. Instead of confidence intervals, we can compute likelihood intervals that make no probability statements about parameters. ```{r fisherian-example} # Fit a model model <- exponential_lifetime("t") df <- data.frame(t = rexp(200, rate = 2.0)) mle <- fit(model)(df) # Support function: log relative likelihood # S(theta) = logL(theta) - logL(theta_hat) # Support at MLE is always 0 s_at_mle <- support(mle, coef(mle), df, model) cat("Support at MLE:", s_at_mle, "\n") # Support at alternative parameter values (negative = less supported) s_alt <- support(mle, c(lambda = 1.0), df, model) cat("Support at lambda=1.0:", s_alt, "\n") # Relative likelihood = exp(support) rl <- relative_likelihood(mle, c(lambda = 1.0), df, model) cat("Relative likelihood at lambda=1.0:", rl, "\n") ``` ### Likelihood Intervals ```{r likelihood-intervals} # Compute 1/8 likelihood interval (roughly equivalent to 95% CI) # This is the set of theta where R(theta) >= 1/8 li <- likelihood_interval(mle, df, model, k = 8) print(li) # Compare with Wald confidence interval cat("\nWald 95% CI:\n") print(confint(mle)) ``` ## Example 5: Evidence Function The evidence function computes the log-likelihood ratio between two parameter values, providing a pure likelihood-based measure of support. ```{r evidence-example} model <- exponential_lifetime("t") set.seed(42) df <- data.frame(t = rexp(100, rate = 2.0)) # Evidence for lambda=2 vs lambda=1 ev <- evidence(model, df, c(lambda = 2), c(lambda = 1)) cat("Evidence for lambda=2 vs lambda=1:", ev, "\n") # Positive evidence supports the first hypothesis if (ev > log(8)) { cat("Strong evidence favoring lambda=2\n") } ``` ## Example 6: Bootstrap Sampling Distribution For more robust inference, especially with small samples, we can use bootstrap methods to estimate the sampling distribution of the MLE. ```{r bootstrap-example, cache=TRUE} model <- exponential_lifetime("t") set.seed(42) df <- data.frame(t = rexp(100, rate = 2.0)) # Create bootstrap sampler boot_sampler <- sampler(model, df = df, par = c(lambda = 2)) # Generate bootstrap samples (100 replicates for speed) boot_result <- boot_sampler(n = 100) # Compare asymptotic vs bootstrap standard errors mle <- fit(model)(df) asymp_se <- se(mle) boot_se <- se(boot_result) cat("Standard Error Comparison:\n") cat(" Asymptotic Bootstrap\n") cat(sprintf("Lambda: %10.4f %9.4f\n", asymp_se[1], boot_se[1])) # Bootstrap bias estimate cat("\nBootstrap Bias Estimate:\n") print(bias(boot_result)) ``` ## Summary of Key Functions ### Model Creation | Function | Description | |----------|-------------| | `exponential_lifetime()` | Exponential model with closed-form MLE and optional censoring | For contribution-based models and standard distribution wrappers, see the `likelihood.contr` package. ### Likelihood Calculus | Function | Description | |----------|-------------| | `loglik()` | Get log-likelihood function | | `score()` | Get score (gradient) function | | `hess_loglik()` | Get Hessian of log-likelihood | | `fim()` | Fisher information matrix | ### Estimation and Inference | Function | Description | |----------|-------------| | `fit()` | Create MLE solver (returns `fisher_mle`) | | `sampler()` | Create bootstrap sampler (returns `fisher_boot`) | | `coef()` | Extract parameter estimates | | `vcov()` | Variance-covariance matrix | | `confint()` | Confidence intervals | | `se()` | Standard errors | ### Fisherian Inference | Function | Description | |----------|-------------| | `support()` | Log relative likelihood: S(theta) = logL(theta) - logL(theta_hat) | | `relative_likelihood()` | Likelihood ratio: R(theta) = L(theta)/L(theta_hat) | | `likelihood_interval()` | Interval where R(theta) >= 1/k | | `profile_loglik()` | Profile log-likelihood | | `evidence()` | Evidence for theta_1 vs theta_2 | ### Model Comparison | Function | Description | |----------|-------------| | `lrt()` | Likelihood ratio test | | `AIC()` | Akaike Information Criterion | | `BIC()` | Bayesian Information Criterion | | `deviance()` | Deviance statistic | ## Next Steps For more details, see the other vignettes: - **Exponential Lifetime Model**: Detailed coverage of `exponential_lifetime` with analytical derivatives and right-censoring - **Integration with algebraic.mle**: Three-package ecosystem demonstration ## Session Info ```{r session} sessionInfo() ``` ```{r cleanup, include=FALSE} options(old_opts) ```