% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/textcleaner.R
\name{textcleaner}
\alias{textcleaner}
\title{Text Cleaner}
\usage{
textcleaner(data, miss = 99, partBY = c("row", "col"), ID = NULL,
  database = NULL)
}
\arguments{
\item{data}{A dataset of linguistic data}

\item{miss}{Value for missing data.
Defaults to 99}

\item{partBY}{Are participants by row or column?
Set to "row" for by row.
Set to "col" for by column}

\item{ID}{If subject IDs are included in the data file,
then the row or column containing them must be specified
(e.g., if partBY = "row" and the IDs are in the first column, then set ID = 1).
Defaults to NULL}

\item{database}{Database for more efficient text cleaning.
Defaults to NULL.
Can be a vector of a corpus or any text for comparison.
Currently, the only option is "animals"}
}
\value{
This function returns a list containing the following objects:

\item{binary}{A matrix of responses where each row represents a participant
and each column represents a unique response. A response that a participant has provided is a '1'
and a response that a participant has not provided is a '0'}

\item{responses}{A response matrix that has been spell-checked and de-pluralized, with duplicates removed}

\item{spellcheck}{A list containing two objects: full and unique. \strong{full} contains
all responses regardless of spellcheck changes and \strong{unique} contains only responses that were
changed during the spell check}

\item{removed}{A list containing two objects: rows and ids.
\strong{rows} identifies removed participants by their row location in the original data file
and \strong{ids} identifies removed participants by their ID}

\item{partChanges}{A list where each participant is an object with their
responses that have been changed. Participants are identified by their ID.
This can be used to replicate the cleaning process and to keep track of changes more generally.
Participants with \strong{NA} did not have any changes from the original data
and participants with \strong{NULL} were removed due to missing data (see \emph{removed$ids})}
}
\description{
An automated cleaning function for spell-checking, de-pluralizing,
removing duplicates, and binarizing text data
}
\examples{

\donttest{
rmat <- textcleaner(trial, partBY = "col")
}
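
# A minimal base R sketch of the layout described for the binary output.
# This is an illustration only, not the internal code of textcleaner.
# The objects toy, uniq, and toy_binary are hypothetical names, and the
# responses are assumed to be already cleaned (no spell-checking is done here).
toy <- rbind(p1 = c("cat", "dog", NA),
             p2 = c("dog", "horse", "cat"))

# Unique non-missing responses become the columns
uniq <- sort(unique(na.omit(as.vector(toy))))

# One row per participant: 1 if the response was given, 0 otherwise
toy_binary <- t(apply(toy, 1, function(r) as.integer(!is.na(match(uniq, r)))))
colnames(toy_binary) <- uniq
toy_binary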

}
\references{
Hornik, K., & Murdoch, D. (2011).
Watch your spelling!
\emph{The R Journal}, \emph{3}(2), 22-28.
}
\author{
Alexander Christensen <alexpaulchristensen@gmail.com>
}
