% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/utils.R
\name{pulse_summarise}
\alias{pulse_summarise}
\title{Summarise PULSE heartbeat rate estimates over new time windows}
\usage{
pulse_summarise(
  heart_rates,
  FUN = stats::median,
  span_mins = 10,
  min_data_points = 2
)
}
\arguments{
\item{heart_rates}{the output from \code{\link[=PULSE]{PULSE()}}, \code{\link[=pulse_heart]{pulse_heart()}}, \code{\link[=pulse_doublecheck]{pulse_doublecheck()}} or \code{\link[=pulse_choose_keep]{pulse_choose_keep()}}.}

\item{FUN}{a custom function, defaults to \code{stats::median}; note that \code{FUN} must take a vector of \code{numeric} values and return a single \code{numeric} value.}

\item{span_mins}{integer, in minutes, defaults to \code{10}; the width of the new summary windows.}

\item{min_data_points}{numeric, defaults to \code{2}; the minimum number of data points required in each new summarizing window. Windows covering fewer data points are discarded. If set to \code{0} (zero), no window is ever discarded.}
}
\value{
A tibble similar to the one provided as input, but with fewer columns and rows. Among the columns now absent is the \code{data} column (raw data is no longer available). IMPORTANT NOTE: Despite retaining the same names, several columns present in the output now provide slightly different information (because they are recalculated for each summarizing window): \code{time} corresponds to the first time stamp of the summarizing window; \code{n} shows the number of valid original windows used by the summary function; \code{sd} represents the standard deviation of all heartbeat rate estimates within each summarizing window (and not the standard deviation of the intervals between each identified wave peak, as was the case in \code{heart_rates}); \code{ci} is the confidence interval of the new value for \code{hz}.
}
\description{
Take the output from \code{\link[=PULSE]{PULSE()}} and summarise \code{hz} estimates over new user-defined time windows using \code{FUN} (a summary function). In effect, this procedure reduces the number of data points available over time.

Note that the output of \code{pulse_summarise()} can be inspected with \code{\link[=pulse_plot]{pulse_plot()}} but not \code{\link[=pulse_plot_raw]{pulse_plot_raw()}}.
}
\section{Details}{

The PULSE multi-channel system captures data continuously. When processing those data, users should aim to obtain estimates of heartbeat frequency at a rate that conforms to their system's natural temporal variability, or risk running into oversampling (which has important statistical implications and must be avoided or explicitly handled).

With this in mind, users can follow two strategies:

\emph{If, for example, users are targeting 1 data point every 5 mins...}
\itemize{
\item If the raw data is of good quality (i.e., minimal noise, signal wave with large amplitude), users can opt for a relatively narrow split window (e.g., by setting \code{window_width_secs} in \code{\link[=PULSE]{PULSE()}} (or \code{\link[=pulse_split]{pulse_split()}}) to \code{30} secs) and sample split windows only every 5 mins with \code{window_shift_secs = 300}. This means that, out of every 5 mins of data, 30 secs are analysed and the remaining four and a half mins are skipped, yielding our target of 1 data point every 5 mins. Doing so greatly speeds up the processing of the data (fewer and smaller windows to work on), and the final output immediately has the desired sampling frequency. However, if any of the split windows analysed features a gap in the data or happens to coincide with an occasional drop in signal quality, the corresponding estimates of heartbeat rate will reflect that lack of quality (even if \emph{better} data may be present in the four and a half mins that are being skipped). This strategy is usually used at the beginning to assess the dataset and, depending on the results, the more time-consuming strategy described next may have to be used instead.
\item If sufficient computing power is available and/or the raw data can't be guaranteed to be of high quality from beginning to end, users can opt for a strategy that scans the entire dataset without skipping any data. This can be achieved by setting \code{window_width_secs} and \code{window_shift_secs} in \code{\link[=PULSE]{PULSE()}} (or \code{\link[=pulse_split]{pulse_split()}}) to the same low value. In this case, if both parameters are set to \code{30} secs, processing will take significantly longer and each 5 mins of data will result in \code{10} data points. Then, \code{pulse_summarise()} can be used with \code{span_mins = 5} to summarise the data points back to the target sampling frequency. More importantly, if the right summary function is used, this strategy can greatly reduce the negative impact of spurious \emph{bad} readings. For example, setting \code{FUN = median} will reduce the contribution of values of \code{hz} that deviate from the center ("wrong" values) to the final heartbeat estimate for a given time window. Thus, if the computational penalty is bearable, this more robust strategy can prove useful.
}
}

\examples{
## Begin prepare data ----
paths <- pulse_example()
heart_rates <- PULSE(
  paths,
  discard_channels = c(paste0("c0", c(1:7, 9)), "c10"),
  show_progress = FALSE
  )
## End prepare data ----

# Summarise heartbeat estimates (1 data point every 5 mins)
nrow(heart_rates) # == 13
summarised_5mins <- pulse_summarise(heart_rates, span_mins = 5)
nrow(summarised_5mins) # == 3
summarised_5mins
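
# Rough arithmetic relating 'span_mins' to the number of original
#  windows that feed each summary window (a base-R sketch, not package
#  code). The shift values are the ones used in Details and are only
#  illustrative for any given dataset
span_secs <- 5 * 60
span_secs / 300 # window_shift_secs = 300 -> 1 window per 5 mins
span_secs / 30  # window_shift_secs = 30  -> 10 windows per 5 mins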

# using a custom function
pulse_summarise(heart_rates, span_mins = 5, FUN = function(x) quantile(x, 0.2))
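
# 'FUN' must reduce a numeric vector to a single numeric value; below
#  is a quick base-R check of a candidate function. The 'hz_demo'
#  values are made up for illustration (not PULSE output) and include
#  one spurious reading
hz_demo <- c(1.02, 0.98, 1.01, 0.99, 1.00, 1.03, 0.97, 1.00, 1.01, 2.50)
q20 <- function(x) quantile(x, 0.2)
is.numeric(q20(hz_demo)) && length(q20(hz_demo)) == 1
# the mean is pulled towards the spurious value, the median much less so
mean(hz_demo)
median(hz_demo)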

# normalized data is supported automatically
pulse_summarise(pulse_normalize(heart_rates))

# Note that visualizing the output from 'pulse_summarise()' with
#  'pulse_plot()' may result in many warnings
pulse_plot(summarised_5mins)
# > There were 44 warnings (use warnings() to see them)

# That happens when the value chosen for 'span_mins' is such
#  that the output from 'pulse_summarise()' doesn't contain
#  enough data points for the smoothing curve to be computed
# To avoid the warnings, do one of the following:

# reduce 'span_mins' to still get enough data points
pulse_plot(pulse_summarise(heart_rates, span_mins = 2, min_data_points = 0))

# or disable the smoothing curve
pulse_plot(summarised_5mins, smooth = FALSE)
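
# A hedged sketch of the second strategy described in Details: scan the
#  example data without skipping any (width equal to shift), then
#  summarise back to 1 data point every 5 mins. It assumes the example
#  files are long enough for 30-sec windows; adjust both values if not
heart_rates_dense <- PULSE(
  paths,
  window_width_secs = 30,
  window_shift_secs = 30,
  discard_channels = c(paste0("c0", c(1:7, 9)), "c10"),
  show_progress = FALSE
  )
pulse_summarise(heart_rates_dense, span_mins = 5, FUN = median)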
}
\seealso{
\itemize{
\item \code{\link[=pulse_heart]{pulse_heart()}}, \code{\link[=pulse_doublecheck]{pulse_doublecheck()}}, \code{\link[=pulse_choose_keep]{pulse_choose_keep()}}, and \code{\link[=pulse_normalize]{pulse_normalize()}} are the functions that generate the input for \code{pulse_summarise()}.
\item \code{\link[=pulse_plot]{pulse_plot()}} can be called to visualize the output from \code{pulse_summarise()} (but not \code{\link[=pulse_plot_raw]{pulse_plot_raw()}}).
\item \code{\link[=PULSE]{PULSE()}} is a wrapper function that executes all the steps needed to process PULSE data at once, and its output can also be passed on to \code{pulse_summarise()}.
}
}
