% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/major_mutate_variations.R
\name{mutate_subset}
\alias{mutate_subset}
\title{Propagate a calculation performed on a subset of data to the rest of the data}
\usage{
mutate_subset(.df, ..., .filter, .group_i = TRUE, .i = NULL,
  .t = NULL, .d = NA, .uniqcheck = FALSE, .setpanel = TRUE)
}
\arguments{
\item{.df}{Data frame or tibble.}

\item{...}{Specification to be passed to \code{dplyr::summarize()}.}

\item{.filter}{Unquoted logical condition selecting the observations on which the \code{dplyr::summarize()} operations are run.}

\item{.group_i}{By default, if \code{.i} is specified or found in the data, \code{mutate_subset} will group the data by \code{.i}, overwriting any grouping already implemented. Set \code{.group_i = FALSE} to avoid this.}

\item{.i}{Quoted or unquoted variables that identify the individual cases. Note that setting any one of \code{.i}, \code{.t}, or \code{.d} will override all three already applied to the data, and will return data that is \code{as_pibble()}d with all three, unless \code{.setpanel=FALSE}.}

\item{.t}{Quoted or unquoted variable indicating the time. \code{pmdplyr} accepts two kinds of time variables: numeric variables where a fixed distance \code{.d} will take you from one observation to the next, or, if \code{.d=0}, any standard variable type with an order. Consider using the \code{time_variable()} function to create the necessary variable if your data uses a \code{Date} variable for time.}

\item{.d}{Number indicating the gap in \code{.t} between one period and the next. For example, if \code{.t} indicates a single day but data is collected once a week, you might set \code{.d=7}. To ignore gap length and assume that "one period ago" is always the most recent prior observation in the data, set \code{.d=0}. The default \code{.d = NA} here will become \code{.d = 1} if either \code{.i} or \code{.t} is declared.}

\item{.uniqcheck}{Logical parameter. Set to TRUE to always check whether \code{.i} and \code{.t} uniquely identify observations in the data. By default this is set to FALSE and the check is only performed once per session, and only if at least one of \code{.i}, \code{.t}, or \code{.d} is set.}

\item{.setpanel}{Logical parameter. \code{TRUE} by default, so if \code{.i}, \code{.t}, and/or \code{.d} are declared, the function returns a \code{pibble} with those panel settings.}
}
\description{
This function performs \code{dplyr::summarize} on a \code{.filter}ed subset of data. Then it applies the result to all observations (or all observations in the group, if applied to grouped data), filling in columns of the data with the summarize results, as though \code{dplyr::mutate} had been run.
}
\details{
One application of this is to partially widen data. For example, if your analysis uses childhood height as a control variable in all years, \code{mutate_subset()} could be used to easily generate a \code{height_age10} variable from a \code{height} variable.
}
\examples{

data(SPrail)
# To prepare for fitting a model of how people choose ticket type,
# compute the mean price of a "Promo" ticket on each route
# so that the price of every other ticket type can be compared against it
SPrail <- SPrail \%>\%
  mutate_subset(
    promo_price = mean(price, na.rm = TRUE),
    .filter = fare == "Promo",
    .i = c(origin, destination)
  )
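
# A minimal sketch of the partial widening described in Details.
# The `kids` data frame and its columns (id, age, height) are made up
# for illustration and are not shipped with the package.
kids <- data.frame(
  id = c(1, 1, 1, 2, 2, 2),
  age = c(9, 10, 11, 9, 10, 11),
  height = c(130, 135, 141, 128, 134, 139)
)
# Take each child's height at age 10 and copy it to all of that child's rows
kids <- mutate_subset(
  kids,
  height_age10 = height,
  .filter = age == 10,
  .i = id
)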
}
