% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/get_predicted.R
\name{get_predicted}
\alias{get_predicted}
\alias{get_predicted.lm}
\alias{get_predicted.stanreg}
\title{Model Predictions (robust)}
\usage{
get_predicted(x, ...)

\method{get_predicted}{lm}(
  x,
  data = NULL,
  predict = "expectation",
  iterations = NULL,
  verbose = TRUE,
  ...
)

\method{get_predicted}{stanreg}(
  x,
  data = NULL,
  predict = "expectation",
  iterations = NULL,
  include_random = TRUE,
  include_smooth = TRUE,
  verbose = TRUE,
  ...
)
}
\arguments{
\item{x}{A statistical model (can also be a data.frame, in which case the
second argument has to be a model).}

\item{...}{Other arguments to be passed, for instance, to
\code{\link[=get_predicted_ci]{get_predicted_ci()}}.}

\item{data}{An optional data frame in which to look for variables with which
to predict. If omitted, the data used to fit the model is used.}

\item{predict}{string or \code{NULL}
\itemize{
\item \code{"link"} returns predictions on the model's link-scale (for logistic models, that means the log-odds scale) with a confidence interval (CI).
\item \code{"expectation"} (default) also returns confidence intervals, but this time the output is on the response scale (for logistic models, that means probabilities).
\item \code{"prediction"} also gives an output on the response scale, but this time associated with a prediction interval (PI), which is wider than a confidence interval (though it mostly makes sense for linear models).
\item \code{"classification"} only differs from \code{"prediction"} for binomial models where it additionally transforms the predictions into the original response's type (for instance, to a factor).
\item Other strings are passed directly to the \code{type} argument of the \code{predict()} method supplied by the modelling package.
\item When \code{predict = NULL}, alternative arguments such as \code{type} will be captured by the \code{...} ellipsis and passed directly to the \code{predict()} method supplied by the modelling package.
\item Note: The four prediction options can be seen as lying on a gradient from "close to the model" to "close to the response data": "link", "expectation", "prediction", "classification". The \code{predict} argument modulates two things: the scale of the output and the type of uncertainty interval. Read more about it in the \strong{Details} section below.
}}

\item{iterations}{For Bayesian models, this corresponds to the number of
posterior draws. If \code{NULL}, will return all the draws (one for each
iteration of the model). For frequentist models, if not \code{NULL}, will
generate bootstrapped draws, from which bootstrapped CIs will be computed.
Iterations can be accessed by running \code{as.data.frame()} on the output.}

\item{verbose}{Toggle warnings.}

\item{include_random}{If \code{TRUE} (default), include all random effects in
the prediction. If \code{FALSE}, don't take them into account. Can also be
a formula to specify which random effects to condition on when predicting
(passed to the \code{re.form} argument). If \code{include_random = TRUE}
and \code{data} is provided, make sure to include the random effect
variables in \code{data} as well.}

\item{include_smooth}{For Generalized Additive Models (GAMs). If \code{FALSE},
will fix the value of the smooth to its average, so that the predictions
do not depend on it.}
}
\value{
The fitted values (i.e. predictions for the response). For Bayesian
or bootstrapped models (when \code{iterations != NULL}), iterations (as
columns, with observations as rows) can be accessed via \code{as.data.frame()}.
}
\description{
The \code{get_predicted()} function is a robust, flexible and user-friendly alternative to the base R \code{\link[=predict]{predict()}} function. Additional features and advantages include the availability of uncertainty intervals (CI), bootstrapping, a more intuitive API and support for more models than base R's \code{predict()} function. However, although the interface is simplified, it is still very important to read the documentation of the arguments. This is because making "predictions" (a loose term for a variety of things) is a non-trivial process, with lots of caveats and complications. Read the \strong{Details} section for more information.
}
\details{
In \code{insight::get_predicted()}, the \code{predict} argument jointly
modulates two separate concepts, the \strong{scale} and the \strong{uncertainty interval}.

\subsection{Confidence Interval (CI) vs. Prediction Interval (PI)}{
\itemize{
\item \strong{Linear models} - \code{lm()}: For linear models, prediction
intervals (\code{predict="prediction"}) show the range that likely
contains the value of a new observation (in what range it is likely to
fall), whereas confidence intervals (\code{predict="expectation"} or
\code{predict="link"}) reflect the uncertainty around the estimated
parameters (and give the range of uncertainty of the regression line). In
general, prediction intervals (PIs) account for both the uncertainty in the
model's parameters and the random variation of the individual values.
Thus, prediction intervals are always wider than confidence intervals.
Moreover, prediction intervals will not necessarily become narrower as the
sample size increases (as they reflect not only the quality of the fit,
but also the variability within the data).
\item \strong{Generalized linear models} - \code{glm()}: For binomial models,
prediction intervals are of little use: for a binomial (Bernoulli) model
whose dependent variable is a vector of 1s and 0s, the prediction interval
is simply \verb{[0, 1]}.
}}

\subsection{Link scale vs. Response scale}{
When users set the \code{predict} argument to \code{"expectation"}, the predictions
are returned on the response scale, which is arguably the most convenient
way to understand and visualize relationships of interest. When users set
the \code{predict} argument to \code{"link"}, predictions are returned on the link
scale, and no transformation is applied. For instance, for a logistic
regression model, the response scale corresponds to the predicted
probabilities, whereas the link scale corresponds to the predicted log-odds
(probabilities on the logit scale). Note that when users select
\code{predict="classification"} in binomial models, the \code{get_predicted()}
function will first calculate predictions as if the user had selected
\code{predict="expectation"}. Then, it will round the responses in order to
return the most likely outcome.
}
}
\examples{
data(mtcars)
x <- lm(mpg ~ cyl + hp, data = mtcars)

predictions <- get_predicted(x)
predictions
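
# Predictions for new data: a minimal sketch (the object names `m_new` and
# `new_cars` are made up for this illustration). For a linear model, the
# point predictions are expected to match those of stats::predict().
m_new <- lm(mpg ~ cyl + hp, data = mtcars)
new_cars <- data.frame(cyl = c(4, 6, 8), hp = c(90, 120, 200))
pred_new <- as.data.frame(get_predicted(m_new, data = new_cars))
pred_new
all.equal(pred_new$Predicted, unname(predict(m_new, newdata = new_cars)))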

# Options and methods ---------------------
get_predicted(x, predict = "prediction")

# Get CI
as.data.frame(predictions)
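
# CI vs. PI for a linear model: a minimal sketch, assuming that
# as.data.frame() returns the interval bounds in the columns `CI_low` and
# `CI_high` for both prediction types (object names are illustrative).
m_int <- lm(mpg ~ cyl + hp, data = mtcars)
ci_df <- as.data.frame(get_predicted(m_int, predict = "expectation"))
pi_df <- as.data.frame(get_predicted(m_int, predict = "prediction"))
# Compare interval widths: the prediction interval should be the wider one
summary(ci_df$CI_high - ci_df$CI_low)
summary(pi_df$CI_high - pi_df$CI_low)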

# Bootstrapped
as.data.frame(get_predicted(x, iterations = 4))
summary(get_predicted(x, iterations = 4)) # Same as as.data.frame(..., keep_iterations = FALSE)

# Different prediction types ------------------------
data(iris)
data <- droplevels(iris[1:100, ])

# Fit a logistic model
x <- glm(Species ~ Sepal.Length, data = data, family = "binomial")

# Expectation (default): response scale + CI
pred <- get_predicted(x, predict = "expectation")
head(as.data.frame(pred))

# Prediction: response scale + PI
pred <- get_predicted(x, predict = "prediction")
head(as.data.frame(pred))

# Link: link scale + CI
pred <- get_predicted(x, predict = "link")
head(as.data.frame(pred))
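
# Link vs. response scale: a minimal sketch checking that applying the
# inverse logit (stats::plogis()) to the link-scale predictions recovers the
# response-scale ones (the model and object names are illustrative).
m_bin <- glm(vs ~ mpg, data = mtcars, family = "binomial")
p_link <- as.data.frame(get_predicted(m_bin, predict = "link"))
p_resp <- as.data.frame(get_predicted(m_bin, predict = "expectation"))
all.equal(plogis(p_link$Predicted), p_resp$Predicted)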

# Classification: classification "type" + PI
pred <- get_predicted(x, predict = "classification")
head(as.data.frame(pred))
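
# Classification vs. expectation: a minimal sketch, assuming that the
# classification is obtained by rounding the predicted probabilities
# (as described in Details); object names are illustrative.
m_cls <- glm(vs ~ mpg, data = mtcars, family = "binomial")
p_exp <- as.data.frame(get_predicted(m_cls, predict = "expectation"))
p_cls <- as.data.frame(get_predicted(m_cls, predict = "classification"))
# Cross-tabulate predicted classes against rounded probabilities
table(classified = as.character(p_cls$Predicted), rounded = round(p_exp$Predicted))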
}
\seealso{
\code{\link[=get_predicted_ci]{get_predicted_ci()}}
}
