% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/pertRates.R
\name{pertRates}
\alias{pertRates}
\title{Calculate perturbation rates for a data set overall and for specific variables}
\usage{
pertRates(obs_data, new_data, imp_vars, desc = FALSE, sig = 4)
}
\arguments{
\item{obs_data}{The original (observed) data set to which \code{new_data} will be compared, as a \code{data.frame}.}

\item{new_data}{The fully or partially synthetic data set to be compared to the observed data, as a \code{data.frame}.}

\item{imp_vars}{A vector of the names of the variables that were imputed and are to be used in the overall perturbation rate calculation.}

\item{desc}{Logical; if \code{TRUE}, the variable perturbation rates are output in descending order of rate.  Defaults to \code{FALSE}.}

\item{sig}{The number of significant digits desired for the overall perturbation rate.  Defaults to 4.}
}
\value{
Returns the overall perturbation rate of the synthetic data set and the variable-specific perturbation rates as percentages, rounded to the nearest 0.1.  The results are also output as a list with the following components:

\item{overall}{The overall perturbation rate.}

\item{variable}{A vector of variable perturbation rates.}
}
\description{
Calculates the overall perturbation rate of an imputed data set and the perturbation rates of the specific variables requested.
}
\details{
A record in a data set is considered "perturbed" when at least one value in the record differs from the corresponding value in the observed data.  The overall perturbation rate is therefore the number of perturbed records divided by the total number of records in the data set.

The variable perturbation rate is the proportion of records in which the value of a given variable differs from the value in the observed data set.

This function was developed with the intention of making the job of researching synthetic data utility a bit easier by quickly calculating perturbation rates.
}
\examples{
# PPA is the observed data set; PPAps2 is a partially synthetic data set derived from it.
# age17plus, marriage, and vet are three categorical variables in both data sets.

pertRates(PPA, PPAps2, c("age17plus", "marriage", "vet"))
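
# A minimal base-R sketch of the definitions given in Details, using toy data.
# The objects obs, syn, and vars are hypothetical and exist only for
# illustration; this is not necessarily how pertRates() computes its results.
obs <- data.frame(a = c(1, 2, 3, 4), b = c("x", "y", "x", "y"),
                  stringsAsFactors = FALSE)
syn <- data.frame(a = c(1, 2, 5, 4), b = c("x", "x", "y", "y"),
                  stringsAsFactors = FALSE)
vars <- c("a", "b")

# Logical matrix: TRUE where a synthetic value differs from the observed value.
changed <- obs[vars] != syn[vars]

# Variable perturbation rates: percentage of records changed, per variable.
var_rates <- colMeans(changed) * 100

# Overall perturbation rate: percentage of records with at least one change.
overall_rate <- mean(rowSums(changed) > 0) * 100

var_rates
overall_rate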
}
\keyword{imputation}
\keyword{perturbation}
\keyword{synthetic}
