dataProfilerR: Automated Exploratory Data Analysis and Dataset Profiling
Profiles a data frame with minimal input: column type inference,
missing-value analysis, distributional summary statistics (including
skewness and kurtosis), normality tests, outlier detection, correlation and
categorical-association analysis, date-column profiling, grouped comparisons
and an overall data-quality score, alongside a set of 'ggplot2'
visualisations. A single entry point, profile_data(), returns a structured
S3 object holding metadata, statistics, diagnostics and plots, with
print(), summary() and plot() methods; report() renders the whole
profile to a self-contained HTML file. Statistical methods include the
Shapiro-Wilk normality test as implemented by Royston (1995)
<doi:10.2307/2986146> and the Anderson-Darling test following Stephens
(1974) <doi:10.1080/01621459.1974.10480196>, with power comparisons of these
tests in Yap and Sim (2011) <doi:10.1080/00949655.2010.520163>, and the
Cramér's V measure of categorical association from Cramér (1946, ISBN:9780691080048).
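
Two of the statistics named above can be illustrated with base R alone. The sketch below is not taken from the package source and does not call any dataProfilerR function; the helper name cramers_v() is hypothetical and the mtcars columns are arbitrary example choices. It computes Cramér's V from a Pearson chi-squared statistic and runs the Shapiro-Wilk test through stats::shapiro.test().

```r
# Hypothetical helper (not part of dataProfilerR): Cramér's V for two
# categorical vectors, V = sqrt(chi2 / (n * (min(rows, cols) - 1))).
cramers_v <- function(x, y) {
  tab <- table(x, y)
  # Pearson chi-squared statistic without continuity correction;
  # warnings about small expected counts are suppressed for the sketch.
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  n <- sum(tab)
  k <- min(nrow(tab), ncol(tab)) - 1  # assumes both variables have >= 2 levels
  as.numeric(sqrt(chi2 / (n * k)))
}

# Association between two discrete columns of a built-in data set
v <- cramers_v(mtcars$cyl, mtcars$gear)

# Shapiro-Wilk normality test on a numeric column (base R, stats package)
sw <- shapiro.test(mtcars$mpg)

print(v)
print(sw$p.value)
```

V ranges from 0 (no association) to 1 (one variable fully determines the other). The helper does not guard against a variable with a single level, in which case the denominator is zero.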
Linking:
Please use the canonical form
https://CRAN.R-project.org/package=dataProfilerR
to link to this page.