% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/pk_methods_do_preprocess.R
\name{do_preprocess.pk}
\alias{do_preprocess.pk}
\title{Do pre-processing}
\usage{
\method{do_preprocess}{pk}(obj, ...)
}
\arguments{
\item{obj}{A `pk` object}

\item{...}{Additional arguments. Not in use currently.}
}
\value{
The same `pk` object, with added elements `data` (containing the
  cleaned, gap-filled data) and `data_info` (containing summary information
  about the data, e.g. number of observations by route, media, and
  detect/nondetect; empirical tmax, i.e. the time of peak concentration, for
  oral data; and number of observations before and after empirical tmax).
}
\description{
Pre-process data for a `pk` object
}
\details{
Data pre-processing for an object `obj` includes the following steps, in order:

- Coerce data to class `data.frame` (if it is not already)
- Rename variables to harmonized "`invivopkfit` aesthetic" variable names, using `obj$mapping`
- Check that the data includes only routes in `obj$settings_preprocess$routes_keep` and media in `obj$settings_preprocess$media_keep`
- Check that the data includes only one unit for concentration, one unit for time, and one unit for dose.
- Coerce `Value`, `Value_SD`, `LOQ`, `Dose`, and `Time` to numeric, if they are not already.
- Coerce `Species`, `Route`, and `Media` to lowercase.
- Replace any negative `Value`, `Value_SD`, `Dose`, or `Time` with `NA`
- If any non-NA `Value` is less than its non-NA `LOQ`, replace it with `NA`
- Impute any NA `LOQ` as `calc_loq_factor` * minimum non-NA `Value` in each `loq_group`
- For any cases where `N_Subjects` is NA, impute `N_Subjects` = 1
- For anything with `N_Subjects` == 1, set `Value_SD` to 0
- Impute missing `Value_SD` as follows: For observations with `N_Subjects` > 1, take the minimum non-missing `Value_SD` for each `sd_group`. If all SDs are missing in an `sd_group`, then `Value_SD` for each observation in that group will be imputed as 0.
- Mark data for exclusion according to the following criteria:
    - Exclude any remaining observations where both `Value` and `LOQ` are NA
    - Exclude any remaining observations with `N_Subjects` > 1 and `Value_SD` still NA. (This should not occur if SD imputation is performed; the check is a safeguard.)
    - Exclude any observations with `N_Subjects` > 1 where reported `Value` is NA, because log-likelihood for non-detect multi-subject observations has not been implemented.
    - Exclude any observations with NA `Time` values
    - Exclude any observations with `Dose` = 0
- Apply any time transformations specified by the user
- Scale concentration by `ratio_conc_dose`
- Apply any concentration transformations specified by the user.
- If `Series_ID` is not included, then assign it as NA
- Create variable `pLOQ` and set it equal to `LOQ`
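
The cleaning, imputation, and exclusion steps above can be illustrated on a toy data frame. This is a minimal sketch, not the package's implementation: the data values, the `calc_loq_factor` value of 0.5, and the `exclude` column name are hypothetical, and all rows are assumed to fall in a single `loq_group` and a single `sd_group`. It also assumes that the minimum `Value_SD` is taken over multi-subject observations only, which the description above does not state explicitly.

```r
# Toy data with harmonized column names; all values are made up.
dat <- data.frame(
  Value      = c(5, -1, 0.2, NA, 3),
  Value_SD   = c(NA, 0.5, NA, NA, 1),
  LOQ        = c(NA, NA, 0.5, NA, NA),
  N_Subjects = c(NA, 1, 1, 3, 3),
  Dose       = c(1, 1, 1, 1, 1),
  Time       = c(0.5, 1, 2, 4, 8)
)
calc_loq_factor <- 0.5  # hypothetical value, chosen for illustration

# Replace negative Value, Value_SD, Dose, or Time with NA
for (v in c("Value", "Value_SD", "Dose", "Time")) dat[[v]][which(dat[[v]] < 0)] <- NA

# Replace any non-NA Value below its non-NA LOQ with NA
dat$Value[which(dat$Value < dat$LOQ)] <- NA

# Impute NA LOQ as calc_loq_factor * minimum non-NA Value (one loq_group assumed)
dat$LOQ[is.na(dat$LOQ)] <- calc_loq_factor * min(dat$Value, na.rm = TRUE)

# Impute NA N_Subjects as 1, then set Value_SD to 0 for single-subject rows
dat$N_Subjects[is.na(dat$N_Subjects)] <- 1
dat$Value_SD[dat$N_Subjects == 1] <- 0

# Impute missing Value_SD for multi-subject rows as the minimum non-missing
# Value_SD among multi-subject rows (one sd_group assumed), or 0 if none exist
multi <- dat$N_Subjects > 1
sd_known <- dat$Value_SD[multi & !is.na(dat$Value_SD)]
sd_fill <- if (length(sd_known) > 0) min(sd_known) else 0
dat$Value_SD[multi & is.na(dat$Value_SD)] <- sd_fill

# Mark rows for exclusion (the column name `exclude` is hypothetical)
dat$exclude <- (is.na(dat$Value) & is.na(dat$LOQ)) |
  (dat$N_Subjects > 1 & is.na(dat$Value_SD)) |
  (dat$N_Subjects > 1 & is.na(dat$Value)) |
  is.na(dat$Time) |
  dat$Dose == 0
```

In this sketch only the fourth row is marked for exclusion: it is a multi-subject observation whose `Value` is NA.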
}
\author{
John Wambaugh, Caroline Ring, Christopher Cook, Gilberto Padilla Mercado
}
