% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/create_informant.R
\name{create_informant}
\alias{create_informant}
\title{Create a \strong{pointblank} \emph{informant} object}
\usage{
create_informant(
  tbl = NULL,
  tbl_name = NULL,
  label = NULL,
  agent = NULL,
  lang = NULL,
  locale = NULL,
  read_fn = NULL
)
}
\arguments{
\item{tbl}{\emph{Table or expression for reading in one}

\verb{obj:<tbl_*>|<tbl reading expression>} // \strong{required}

The input table. This can be a data frame, a tibble, a \code{tbl_dbi} object, or
a \code{tbl_spark} object. Alternatively, an expression can be supplied to serve
as instructions on how to retrieve the target table at incorporation-time.
There are two ways to specify an association to a target table: (1) as a
table-prep formula, which is a right-hand side (RHS) formula expression
(e.g., \verb{~ \{ <tbl reading code>\}}), or (2) as a function (e.g.,
\verb{function() \{ <tbl reading code>\}}).}

\item{tbl_name}{\emph{A table name}

\verb{scalar<character>} // \emph{default:} \code{NULL} (\code{optional})

An optional name to assign to the input table object. If no value is
provided, a name will be generated based on whatever information is
available.}

\item{label}{\emph{An optional label for the information report}

\verb{scalar<character>} // \emph{default:} \code{NULL} (\code{optional})

An optional label for the information report. If no value is provided, a
label will be generated based on the current system time. Markdown can be
used here to make the label more visually appealing (it will appear in the
header area of the information report).}

\item{agent}{\emph{The pointblank agent object}

\verb{obj:<ptblank_agent>} // \emph{default:} \code{NULL} (\code{optional})

A pointblank \emph{agent} object. The table from this object can be extracted
and used in the new informant instead of supplying a table in \code{tbl}.}

\item{lang}{\emph{Reporting language}

\verb{scalar<character>} // \emph{default:} \code{NULL} (\code{optional})

The language to use for the information report (a summary table that
provides all of the available information for the table). By default, \code{NULL}
will create English (\code{"en"}) text. Other options include French (\code{"fr"}),
German (\code{"de"}), Italian (\code{"it"}), Spanish (\code{"es"}), Portuguese (\code{"pt"}),
Turkish (\code{"tr"}), Chinese (\code{"zh"}), Russian (\code{"ru"}), Polish (\code{"pl"}),
Danish (\code{"da"}), Swedish (\code{"sv"}), and Dutch (\code{"nl"}).}

\item{locale}{\emph{Locale for value formatting within reports}

\verb{scalar<character>} // \emph{default:} \code{NULL} (\code{optional})

An optional locale ID to use for formatting values in the information
report according to the locale's rules. Examples include \code{"en_US"} for English
(United States) and \code{"fr_FR"} for French (France); more simply, this can be
a language identifier without a country designation, like \code{"es"} for Spanish
(Spain, same as \code{"es_ES"}).}

\item{read_fn}{\emph{[Deprecated] Table reading function}

\code{function} // \emph{default:} \code{NULL} (\code{optional})

The \code{read_fn} argument is deprecated. Instead, supply a table-prep formula
or function to \code{tbl}.}
}
\value{
A \code{ptblank_informant} object.
}
\description{
The \code{create_informant()} function creates an \emph{informant} object, which is
used in an \emph{information management} workflow. The overall aim of this
workflow is to record, collect, and generate useful information on data
tables. We can supply any information that is useful for describing a
particular data table. The \emph{informant} object created by
\code{create_informant()} is then passed to the information-focused functions:
\code{\link[=info_columns]{info_columns()}}, \code{\link[=info_tabular]{info_tabular()}}, \code{\link[=info_section]{info_section()}}, and \code{\link[=info_snippet]{info_snippet()}}.

The \verb{info_*()} series of functions allows for a progressive build-up of
information about the target table. The \code{\link[=info_columns]{info_columns()}} and \code{\link[=info_tabular]{info_tabular()}}
functions facilitate the entry of \emph{info text} that concerns the table columns
and the table proper; the \code{\link[=info_section]{info_section()}} function allows for the creation
of arbitrary sections that can have multiple subsections full of additional
\emph{info text}. Dynamic values can be drawn from the target table with
\code{\link[=info_snippet]{info_snippet()}}, which defines named text extracts from queries; each
extract is inserted wherever \verb{\{<snippet_name>\}} appears in the \emph{info text}. To
make \code{\link[=info_snippet]{info_snippet()}} more convenient for common queries, a set of
\verb{snip_*()} functions is provided in the package (\code{\link[=snip_list]{snip_list()}},
\code{\link[=snip_stats]{snip_stats()}}, \code{\link[=snip_lowest]{snip_lowest()}}, and \code{\link[=snip_highest]{snip_highest()}}), though you are free to
use your own expressions.

Because snippets need to query the target table to return fragments of \emph{info
text}, the \code{\link[=incorporate]{incorporate()}} function needs to be used to initiate this action.
This is also necessary for the \emph{informant} to update other metadata elements
such as row and column counts. Once the incorporation process is complete,
snippets and other metadata will be updated. Printing the \emph{informant} itself
produces a reporting table. This report can also be accessed with
the \code{\link[=get_informant_report]{get_informant_report()}} function, where there are more reporting
options.
}
\section{Supported Input Tables}{


The types of data tables that are officially supported are:
\itemize{
\item data frames (\code{data.frame}) and tibbles (\code{tbl_df})
\item Spark DataFrames (\code{tbl_spark})
\item the following database tables (\code{tbl_dbi}):
\itemize{
\item \emph{PostgreSQL} tables (using \code{RPostgres::Postgres()} as the driver)
\item \emph{MySQL} tables (with \code{RMySQL::MySQL()})
\item \emph{Microsoft SQL Server} tables (via \strong{odbc})
\item \emph{BigQuery} tables (using \code{bigrquery::bigquery()})
\item \emph{DuckDB} tables (through \code{duckdb::duckdb()})
\item \emph{SQLite} tables (with \code{RSQLite::SQLite()})
}
}

Other database tables may work to varying degrees but they haven't been
formally tested (so be mindful of this when using unsupported backends with
\strong{pointblank}).
}

\section{YAML}{


A \strong{pointblank} informant can be written to YAML with \code{\link[=yaml_write]{yaml_write()}} and the
resulting YAML can be used to regenerate an informant (with
\code{\link[=yaml_read_informant]{yaml_read_informant()}}) or perform the 'incorporate' action using the target
table (via \code{\link[=yaml_informant_incorporate]{yaml_informant_incorporate()}}). Here is an example of how a
complex call of \code{create_informant()} is expressed in R code and in the
corresponding YAML representation.

R statement:

\if{html}{\out{<div class="sourceCode r">}}\preformatted{create_informant(
  tbl = ~ small_table,
  tbl_name = "small_table",
  label = "An example.",
  lang = "fr", 
  locale = "fr_CA"
)
}\if{html}{\out{</div>}}

YAML representation:

\if{html}{\out{<div class="sourceCode yaml">}}\preformatted{type: informant
tbl: ~small_table
tbl_name: small_table
info_label: An example.
lang: fr
locale: fr_CA
table:
  name: small_table
  _columns: 8
  _rows: 13.0
  _type: tbl_df
columns:
  date_time:
    _type: POSIXct, POSIXt
  date:
    _type: Date
  a:
    _type: integer
  b:
    _type: character
  c:
    _type: numeric
  d:
    _type: numeric
  e:
    _type: logical
  f:
    _type: character
}\if{html}{\out{</div>}}

The generated YAML includes some top-level keys, of which \code{type} and \code{tbl} are
mandatory, and two metadata sections: \code{table} and \code{columns}. Keys that begin
with an underscore character are those that are updated whenever
\code{\link[=incorporate]{incorporate()}} is called on an \emph{informant}. The \code{table} metadata section can
have multiple subsections with \emph{info text}. The \code{columns} metadata section
can similarly have multiple subsections, so long as they are children of
each of the column keys (in the above YAML example, \code{date_time} and \code{date}
are column keys and they match the table's column names). Additional sections
can be added but they must have key names on the top level that don't
duplicate the default set (i.e., \code{type}, \code{table}, \code{columns}, etc. are treated
as reserved keys).
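
The round trip described above can be sketched as follows. This is a minimal
sketch, not a definitive recipe: it assumes a recent version of
\strong{pointblank} in which \code{\link[=yaml_write]{yaml_write()}} accepts the \emph{informant} as its first
argument, and the file name and the use of \code{tempdir()} are purely illustrative.
A table-prep formula is given to \code{tbl} so that the YAML file carries
instructions for retrieving the target table.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(pointblank)

# Use a table-prep formula so the table can be retrieved again later
informant <-
  create_informant(
    tbl = ~ pointblank::small_table,
    tbl_name = "small_table",
    label = "An example."
  )

# Illustrative file location (an assumption, not a package default)
yaml_file <- file.path(tempdir(), "informant-small_table.yml")

# Write the informant to a YAML file
yaml_write(
  informant,
  filename = "informant-small_table.yml",
  path = tempdir()
)

# Regenerate the informant from the YAML file
informant_from_yaml <- yaml_read_informant(filename = yaml_file)

# Alternatively, read the YAML file and incorporate in a single step
informant_incorporated <- yaml_informant_incorporate(filename = yaml_file)
}\if{html}{\out{</div>}}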
}

\section{Writing an Informant to Disk}{


An \emph{informant} object can be written to disk with the \code{\link[=x_write_disk]{x_write_disk()}}
function. Informants are stored in the serialized RDS format and can be
easily retrieved with the \code{\link[=x_read_disk]{x_read_disk()}} function.

It's recommended that table-prep formulas are supplied to the \code{tbl} argument
of \code{create_informant()}. In this way, when an \emph{informant} is read from disk
through \code{\link[=x_read_disk]{x_read_disk()}}, it can be reused to access the target table (which
may have changed, hence the need to use an expression for this).
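
As a minimal sketch of this workflow (the file name and the use of
\code{tempdir()} are illustrative assumptions), an \emph{informant} created with a
table-prep formula can be written to disk, read back, and then incorporated
so that it queries the current state of the target table.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(pointblank)

# A table-prep formula tells the informant how to retrieve the table
informant <-
  create_informant(
    tbl = ~ pointblank::small_table,
    tbl_name = "small_table",
    label = "An example."
  )

# Write the informant to an RDS file (illustrative name and location)
x_write_disk(
  informant,
  filename = "informant-small_table.rds",
  path = tempdir()
)

# Read the informant back from disk
informant_restored <-
  x_read_disk(
    filename = "informant-small_table.rds",
    path = tempdir()
  )

# Re-read the target table through the formula and refresh the metadata
informant_restored <- incorporate(informant_restored)
}\if{html}{\out{</div>}}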
}

\section{Examples}{


Let's walk through how we can generate some useful information for a really
small table. It's actually called \code{small_table} and we can find it as a
dataset in this package.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{small_table
#> # A tibble: 13 x 8
#>    date_time           date           a b             c      d e     f    
#>    <dttm>              <date>     <int> <chr>     <dbl>  <dbl> <lgl> <chr>
#>  1 2016-01-04 11:00:00 2016-01-04     2 1-bcd-345     3  3423. TRUE  high 
#>  2 2016-01-04 00:32:00 2016-01-04     3 5-egh-163     8 10000. TRUE  low  
#>  3 2016-01-05 13:32:00 2016-01-05     6 8-kdg-938     3  2343. TRUE  high 
#>  4 2016-01-06 17:23:00 2016-01-06     2 5-jdo-903    NA  3892. FALSE mid  
#>  5 2016-01-09 12:36:00 2016-01-09     8 3-ldm-038     7   284. TRUE  low  
#>  6 2016-01-11 06:15:00 2016-01-11     4 2-dhe-923     4  3291. TRUE  mid  
#>  7 2016-01-15 18:46:00 2016-01-15     7 1-knw-093     3   843. TRUE  high 
#>  8 2016-01-17 11:27:00 2016-01-17     4 5-boe-639     2  1036. FALSE low  
#>  9 2016-01-20 04:30:00 2016-01-20     3 5-bce-642     9   838. FALSE high 
#> 10 2016-01-20 04:30:00 2016-01-20     3 5-bce-642     9   838. FALSE high 
#> 11 2016-01-26 20:07:00 2016-01-26     4 2-dmx-010     7   834. TRUE  low  
#> 12 2016-01-28 02:51:00 2016-01-28     2 7-dmx-010     8   108. FALSE low  
#> 13 2016-01-30 11:23:00 2016-01-30     1 3-dka-303    NA  2230. TRUE  high
}\if{html}{\out{</div>}}

Create a pointblank \code{informant} object with \code{create_informant()} and the
\code{small_table} dataset.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{informant <- 
  create_informant(
    tbl = pointblank::small_table,
    tbl_name = "small_table",
    label = "`create_informant()` example."
  )
}\if{html}{\out{</div>}}

This function creates some information without any extra help by profiling
the supplied table object. It adds the \code{COLUMNS} section with stubs for each
of the target table's columns. We can use \code{\link[=info_columns]{info_columns()}} or
\code{\link[=info_columns_from_tbl]{info_columns_from_tbl()}} to provide descriptions for each of the columns.
The \code{informant} object can be printed to see the information report in the
Viewer.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{informant
}\if{html}{\out{</div>}}

\if{html}{

\out{
<img src="https://raw.githubusercontent.com/rstudio/pointblank/main/images/man_create_informant_1.png" alt="This image was generated from the first code example in the `create_informant()` help file." style="width:100\%;">
}
}

If we want to make use of more report display options, we can alternatively
use the \code{\link[=get_informant_report]{get_informant_report()}} function.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{report <- 
  get_informant_report(
    informant,
    title = "Data Dictionary for `small_table`"
  )
  
report
}\if{html}{\out{</div>}}

\if{html}{

\out{
<img src="https://raw.githubusercontent.com/rstudio/pointblank/main/images/man_create_informant_2.png" alt="This image was generated from the second code example in the `create_informant()` help file." style="width:100\%;">
}
}
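
The \verb{info_*()} functions and snippets described earlier can be combined in a
single pipeline. The following is a minimal sketch that assumes R 4.1 or
later (for the native pipe); the description text, the choice of column
\code{d}, and the snippet name \code{max_d} are illustrative and not part of the package.
The \code{\link[=incorporate]{incorporate()}} call at the end is what queries the table and fills in the
snippet value.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(pointblank)

informant <-
  create_informant(
    tbl = pointblank::small_table,
    tbl_name = "small_table",
    label = "`create_informant()` example."
  ) |>
  # Info text about the table as a whole
  info_tabular(
    description = "A small table used for examples."
  ) |>
  # Info text for one column, with a placeholder for a snippet
  info_columns(
    columns = "d",
    info = "Numeric values; the largest is \{max_d\}."
  ) |>
  # Define the snippet (the name `max_d` is illustrative)
  info_snippet(
    snippet_name = "max_d",
    fn = snip_highest(column = "d")
  ) |>
  # Query the table to resolve snippets and update metadata
  incorporate()

informant
}\if{html}{\out{</div>}}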
}

\section{Function ID}{

1-3
}

\seealso{
Other Planning and Prep: 
\code{\link{action_levels}()},
\code{\link{create_agent}()},
\code{\link{db_tbl}()},
\code{\link{draft_validation}()},
\code{\link{file_tbl}()},
\code{\link{scan_data}()},
\code{\link{tbl_get}()},
\code{\link{tbl_source}()},
\code{\link{tbl_store}()},
\code{\link{validate_rmd}()}
}
\concept{Planning and Prep}
