Type: | Package |
Title: | Correspondence Analysis Variants |
Version: | 6.0 |
Date: | 2023-10-19 |
Author: | Rosaria Lombardo and Eric J Beh |
Maintainer: | Rosaria Lombardo <rosaria.lombardo@unicampania.it> |
Description: | Provides six variants of two-way correspondence analysis (ca): simple ca, singly ordered ca, doubly ordered ca, non symmetrical ca, singly ordered non symmetrical ca, and doubly ordered non symmetrical ca. |
Depends: | R (> 3.0.1), methods, tools, ggforce, ggrepel, gridExtra |
Imports: | ggplot2, plotly |
LazyData: | true |
License: | GPL (> 2) |
URL: | https://www.R-project.org |
NeedsCompilation: | no |
Packaged: | 2023-10-19 16:57:15 UTC; rosar |
Repository: | CRAN |
Date/Publication: | 2023-10-19 17:20:06 UTC |
Six variants of correspondence analysis
Description
It performs
1) simple correspondence analysis
2) doubly ordered correspondence analysis
3) singly ordered correspondence analysis
4) non symmetrical correspondence analysis
5) doubly ordered non symmetrical correspondence analysis
6) singly ordered non symmetrical correspondence analysis
Usage
CAvariants(Xtable, mj = NULL, mi = NULL, firstaxis = 1, lastaxis = 2,
catype = "CA", M = min(nrow(Xtable), ncol(Xtable)) - 1, alpha = 0.05)
Arguments
Xtable |
The two-way contingency table. |
mi |
The assigned ordered scores for the row categories. By default, |
mj |
The assigned ordered scores for the column categories, By default, |
firstaxis |
The horizontal polynomial, or principal, axis. It is used for the construction of the Inner product table. By default |
lastaxis |
The vertical polynomial, or principal, axis. It is used for the construction of the Inner product table. By default |
catype |
The input parameter for specifying what variant of correspondence analysis is to be performed. By default, |
M |
The number of axes used for determining the structure of the elliptical confidence regions.
By default, |
alpha |
The level of significance for the elliptical regions. By default, |
Value
Description of the output returned
Xtable |
The two-way contingency table. |
rows |
The number of rows of the two-way contingency table. |
cols |
The number of columns of the two-way contingency table. |
r |
The rank of the two-way contingency table. |
n |
The total number of observations of the two-way contingency table. |
rowlabels |
The labels of the row variable. |
collabels |
The labels of the column variable. |
Rprinccoord |
The row principal coordinates. When the input parameter |
Cprinccoord |
The column principal coordinates. When the input parameter |
Rstdcoord |
The row standard coordinates. When the input parameter |
Cstdcoord |
The column standard coordinates. When the input parameter |
tauden |
The denominator of the Goodman-Kruskal tau index is given when the input parameter |
tau |
The index of Goodman and Kruskal is given when the input parameter |
inertiasum |
The total inertia of the analysis based on Pearson's chi-squared when catype is |
singvalue |
The singular values of the two-way contingency table. |
inertias |
The inertia in absolute value and percentage, in the row space for each principal or polynomial axis. |
inertias2 |
The inertia in absolute value and percentage, in the column space for each principal or polynomial axis.
When |
t.inertia |
The total inertia of the two-way contingency table. |
comps |
The polynomial components of inertia when the variables are ordered. |
catype |
The type of correspondence analysis chosen by the analyst. By default, |
mj |
The ordered scores of the column variable. When |
mi |
The ordered scores of the row variable. When |
pcc |
The weighted centered column profile matrix. |
Jmass |
The weight matrix of the column variable. |
Imass |
The weight matrix of the row variable. |
Innprod |
The inner product, |
Z |
The generalised correlation matrix when |
M |
The number of axes used for determining the structure of the elliptical confidence regions.
By default, |
eccentricity |
When |
row.summ |
When |
col.summ |
When |
Note
This function internally calls other functions: depending on the setting of the input parameter catype, it calls one of the six functions that performs a variant of correspondence analysis.
After performing a variant of correspondence analysis, it returns the output object needed for printing and plotting the results. The two functions for these tasks are print.CAvariants and plot.CAvariants.
Author(s)
Rosaria Lombardo and Eric J Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Lombardo R Beh EJ and Kroonenberg PM 2016 Modelling Trends in Ordered Correspondence Analysis Using Orthogonal
Polynomials. Psychometrika, 81(2), 325–349.
Examples
data(asbestos)
CAvariants(asbestos, catype = "CA")
CAvariants(asbestos, catype = "DOCA", mi = 1:nrow(asbestos),
mj = c(4.5, 14.5, 24.5, 34.5, 44.5), firstaxis = 1, lastaxis = 2,
M = min(nrow(asbestos), ncol(asbestos)) - 1)
CAvariants(asbestos, catype = "DONSCA")
data(shopdataM)
CAvariants(shopdataM, catype = "NSCA")
CAvariants(shopdataM, catype = "SONSCA")
CAvariants(shopdataM, catype = "SOCA")
Selikoff's data, a two-way contingency table.
Description
The data set consists of 4 rows and 5 columns. The rows represent the degree of severity of asbestosis, and the columns represent the workers' time of exposure to asbestos, in years.
Usage
data(asbestos)
Format
The format is:
row names [1:4] "None" "grade1" "grade2" "grade3"
col names [1:5] "0-9" "10-19" "20-29" "30-39" "40+"
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Selikoff IJ 1981 Household risks with inorganic fibers. Bulletin of the New York Academy of Medicine, 57, 947 – 961.
Examples
asbestos <- structure(c(310, 36, 0, 0, 212, 158, 9, 0, 21, 35, 17, 4, 25,
102, 49, 18, 7, 35, 51, 28), .Dim = 4:5, .Dimnames = list(c("none",
"grade1", "grade2", "grade3"), c("0-9", "10-19", "20-29", "30-39",
"40+")))
dim(asbestos)
dimnames(asbestos)
Row isometric biplot or Column isometric biplot
Description
This function is used in the main plot function when the plot type parameter is plottype = "biplot". It can produce a row biplot or a column biplot.
Usage
caRbiplot(frows, gcols, firstaxis, lastaxis, inertiapc, bip="row", size1,size2)
Arguments
frows |
The row principal or standard coordinates. |
gcols |
The column principal or standard coordinates. |
firstaxis |
The first axis number. |
lastaxis |
The second axis number. |
inertiapc |
The percentage of the explained inertia by each dimension. |
bip |
The type of biplot. One may specify a row-isometric biplot or a column-isometric biplot (when using
in the function |
size1 |
The size of the plotted symbol for categories in biplot. |
size2 |
The size of the plotted text for categories in biplot. |
Author(s)
Rosaria Lombardo and Eric J. Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Classical two-way correspondence analysis
Description
This function is used in the main function CAvariants when the input parameter is catype = "CA".
It performs the singular value decomposition of Pearson's ratio and
computes principal axes, coordinates, the weights of rows and columns, the total inertia (equal to Pearson's index)
and the rank of the matrix.
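The steps described above can be sketched in base R. This is a minimal illustration, not the internal code of cabasic: it assumes the decomposed matrix is the centred Pearson ratio weighted by the square roots of the row and column masses, and the names N, S, Fprinc and Gprinc are chosen here for illustration only.

```r
# Selikoff's asbestos table, entered by hand so the sketch is self-contained.
N <- matrix(c(310, 36, 0, 0, 212, 158, 9, 0, 21, 35, 17, 4, 25, 102, 49, 18,
              7, 35, 51, 28), nrow = 4,
            dimnames = list(c("none", "grade1", "grade2", "grade3"),
                            c("0-9", "10-19", "20-29", "30-39", "40+")))
n  <- sum(N)
P  <- N / n                      # correspondence matrix
r  <- rowSums(P)                 # row masses
cm <- colSums(P)                 # column masses

# Centred Pearson ratios, weighted by the square roots of the masses
# (assumed form of the matrix that is decomposed).
S   <- (P - outer(r, cm)) / sqrt(outer(r, cm))
dec <- svd(S)

total_inertia <- sum(dec$d^2)                  # Pearson's chi-squared divided by n
Fprinc <- (dec$u / sqrt(r)) %*% diag(dec$d)    # row principal coordinates
Gprinc <- (dec$v / sqrt(cm)) %*% diag(dec$d)   # column principal coordinates

# Pearson's chi-squared statistic computed directly, for comparison.
chi2 <- sum((N - n * outer(r, cm))^2 / (n * outer(r, cm)))
round(c(inertia = total_inertia, chi2_over_n = chi2 / n), 4)
```

The last line prints the two quantities side by side; they agree because the sum of the squared singular values of S is the chi-squared statistic divided by the sample size.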
Usage
cabasic(Xtable)
Arguments
Xtable |
The two-way contingency table. |
Note
This function belongs to the R object class called cabasicresults.
Author(s)
Rosaria Lombardo and Eric J. Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Lombardo R Beh EJ and Kroonenberg PM 2016 Modelling Trends in Ordered Correspondence Analysis Using Orthogonal
Polynomials. Psychometrika, 81(2), 325–349.
Examples
data(asbestos)
cabasic(asbestos)
Three dimensional correspondence plot
Description
This function is used in the plot function plot.CAvariants when the logical parameter is plot3d = TRUE.
It produces a 3-dimensional visualization of the association.
Usage
caplot3d(coordR, coordC, inertiaper, firstaxis = 1, lastaxis = 2, thirdaxis = 3)
Arguments
coordR |
The row principal or standard coordinates. |
coordC |
The column principal or standard coordinates. |
inertiaper |
The percentage of the total inertia explained by each dimension. |
firstaxis |
The first axis number. By default, |
lastaxis |
The second axis number. By default, |
thirdaxis |
The third axis number. By default, |
Note
This function depends on the R library plotly.
Author(s)
Rosaria Lombardo and Eric J. Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Row isometric or column isometric biplot for ordered variants of correspondence analysis
Description
This function is used in the main plot function when the plot type parameter is plottype = "biplot". It can produce a row polynomial biplot or a column polynomial biplot.
Usage
caplotord(frows, gcols, firstaxis, lastaxis, nseg, inertiapc, thingseg, col1,
col2, col3, size1, size2)
Arguments
frows |
The row principal or standard coordinates. |
gcols |
The column principal or standard coordinates. |
firstaxis |
The first polynomial axis number. |
lastaxis |
The second polynomial axis number. |
nseg |
The number of vectors (arrows) onto which the principal (or standard) coordinates are projected. |
inertiapc |
The percentage of the explained inertia by each dimension. |
thingseg |
The principal or standard coordinates used to draw vectors (arrows). |
col1 |
The colour for the row variable labels. |
col2 |
The colour for the column variable labels. |
col3 |
The colour for the vectors (arrows) used in biplots. |
size1 |
The size of the plotted symbol for categories in biplot. |
size2 |
The size of the plotted text for categories in biplot. |
Note
This function depends on the R library plotly.
Author(s)
Rosaria Lombardo and Eric J. Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Polynomial component of inertia in column space
Description
This function allows the analyst to compute the contribution that the polynomial components make to the inertia
(Pearson's chi-squared statistic or the Goodman-Kruskal tau index).
The ordered variable should be the column variable that is transformed by polynomials.
The polynomial components are the column polynomial components.
The given input matrix is the Z matrix of generalised correlations from the hybrid decomposition.
It is called by CAvariants when catype = "SOCA" or catype = "SONSCA".
Usage
compsonetable.exe(Z)
Arguments
Z |
The matrix of generalised correlations between the polynomial and principal axes. |
Value
The value returned is the matrix
comps |
The matrix of the column polynomial component of inertia. |
Note
This function belongs to the class called cacorporateplus.
Author(s)
Rosaria Lombardo and Eric J. Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Lombardo R Beh EJ and Kroonenberg PM 2016 Modelling Trends in Ordered Correspondence Analysis Using Orthogonal
Polynomials. Psychometrika, 81(2), 325–349.
Polynomial component of inertia for the row and column spaces
Description
This function allows the analyst to compute the contribution of the polynomial components to the inertia
(Pearson's chi-squared statistic or the Goodman-Kruskal tau index).
The ordered variable should be both the row and column variables that are transformed by the polynomials.
The polynomial components are the row and column polynomial components.
The given input matrix is the Z matrix of generalised correlations from the bivariate moment decomposition.
It is called by CAvariants when catype = "DOCA" or catype = "DONSCA".
Usage
compstable.exe(Z)
Arguments
Z |
The matrix of generalised correlations between the polynomial axes. |
Value
The value returned is the matrix
comps |
The matrix of the polynomial components of the inertia. |
Author(s)
Rosaria Lombardo and Eric J. Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Lombardo R Beh EJ and Kroonenberg PM 2016 Modelling Trends in Ordered Correspondence Analysis Using Orthogonal
Polynomials. Psychometrika, 81(2), 325–349.
Doubly, or two-way, ordered correspondence analysis: for two ordered variables
Description
This function is used by the main function CAvariants when the input parameter is catype = "DOCA".
It performs the bivariate moment decomposition of the Pearson ratio and
computes the polynomial axes, coordinates, weights of rows and columns, total inertia (based on Pearson's chi-squared statistic) and the rank of the matrix.
It also decomposes the inertia into row and column polynomial components.
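The decomposition described above can be illustrated with a short base R sketch. It is not the code of docabasic: the helper ortho_poly is hypothetical and builds the orthogonal polynomials with a weighted QR factorisation, and the scaling of Z is an assumption (the package may scale its generalised correlations differently, for example by the sample size).

```r
N <- matrix(c(310, 36, 0, 0, 212, 158, 9, 0, 21, 35, 17, 4, 25, 102, 49, 18,
              7, 35, 51, 28), nrow = 4)
mi <- 1:4                                  # natural scores for the rows
mj <- c(4.5, 14.5, 24.5, 34.5, 44.5)       # midpoints for the columns
P  <- N / sum(N)
r  <- rowSums(P)
cm <- colSums(P)

# Hypothetical helper: polynomials orthonormal with respect to the weights p,
# with the trivial (constant) polynomial removed.
ortho_poly <- function(scores, p) {
  z <- scores - sum(p * scores)            # centre the scores
  z <- z / sqrt(sum(p * z^2))              # scale them for numerical stability
  V <- outer(z, 0:(length(z) - 1), `^`)    # columns 1, z, z^2, ...
  (qr.Q(qr(sqrt(p) * V)) / sqrt(p))[, -1, drop = FALSE]
}

A <- ortho_poly(mi, r)                     # 4 x 3 row polynomials
B <- ortho_poly(mj, cm)                    # 5 x 4 column polynomials
Z <- t(A) %*% P %*% B                      # generalised correlations (3 x 4)

row_comps <- rowSums(Z^2)                  # row polynomial components of inertia
col_comps <- colSums(Z^2)                  # column polynomial components of inertia
phi2 <- sum((P - outer(r, cm))^2 / outer(r, cm))   # chi-squared / n
round(c(sum_Z2 = sum(Z^2), phi2 = phi2), 4)
```

Under these assumptions the squared entries of Z add up to the total inertia, so their row sums and column sums partition it into the row and column polynomial components (linear, quadratic, and so on).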
Usage
docabasic(Xtable, mi, mj)
Arguments
Xtable |
The two-way contingency table. |
mi |
The set of ordered row scores. By default, |
mj |
The set of ordered column scores. By default, |
Author(s)
Rosaria Lombardo and Eric J. Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Lombardo R Beh EJ and Kroonenberg PM 2016 Modelling Trends in Ordered Correspondence Analysis Using Orthogonal
Polynomials. Psychometrika, 81(2), 325–349.
Examples
data(asbestos)
mi <- c(1,2,3,4) #natural scores for rows
mj <- c(4.5,14.5,24.5,34.5,44.5) #midpoints for columns
docabasic(asbestos, mi, mj)
Doubly, or two-way ordered, non symmetrical correspondence analysis: for two ordered variables
Description
This function is used in the main function CAvariants when the input parameter is catype = "DONSCA".
It performs the bivariate moment decomposition of the numerator of the Goodman-Kruskal tau index for a contingency table consisting of two ordered variables.
It computes the polynomial axes, coordinates, weights of the rows and columns, total inertia (equal to the numerator of the tau index) and the rank of the matrix.
It also decomposes the inertia into row and column polynomial components.
Usage
donscabasic(Xtable, mi, mj)
Arguments
Xtable |
The two-way contingency table. |
mi |
The set of ordered row scores. By default, |
mj |
The set of ordered column scores. By default, |
Author(s)
Rosaria Lombardo and Eric J. Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Lombardo R Beh EJ and Kroonenberg PM 2016 Modelling Trends in Ordered Correspondence Analysis Using Orthogonal
Polynomials. Psychometrika, 81(2), 325–349.
Examples
data(asbestos)
mi <- c(1, 2, 3, 4) # natural scores for the rows
mj <- c(4.5, 14.5, 24.5, 34.5, 44.5) #midpoints for the columns
donscabasic(asbestos, mi, mj)
Orthogonal polynomials
Description
This function is called from the functions docabasic, socabasic, sonscabasic and donscabasic.
It computes the orthogonal polynomials for the ordered categorical variables.
The number of the polynomials is equal to the number of categories for that variable less one.
The function computes the polynomial transformation of the ordered categorical variable.
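As an illustration of what such polynomials look like, the sketch below builds them in base R. It does not use Emerson's recurrence formula: it swaps in a weighted QR factorisation of the powers of the scores, which spans the same polynomials, although the signs may differ from those returned by emerson.poly. The function name ortho_poly and the proportions pj are hypothetical.

```r
ortho_poly <- function(scores, p) {
  stopifnot(length(scores) == length(p), all(p > 0), abs(sum(p) - 1) < 1e-8)
  k <- length(scores)
  z <- scores - sum(p * scores)          # centre the scores (weighted mean zero)
  z <- z / sqrt(sum(p * z^2))            # scale them for numerical stability
  V <- outer(z, 0:(k - 1), `^`)          # columns 1, z, z^2, ..., z^(k - 1)
  Q <- qr.Q(qr(sqrt(p) * V))             # orthonormalise in the weighted metric
  B <- Q / sqrt(p)                       # now t(B) %*% diag(p) %*% B is the identity
  B[, -1, drop = FALSE]                  # remove the trivial (constant) polynomial
}

mj <- c(4.5, 14.5, 24.5, 34.5, 44.5)     # column midpoints used in the examples
pj <- c(0.20, 0.30, 0.25, 0.15, 0.10)    # hypothetical marginal proportions (sum to 1)
B  <- ortho_poly(mj, pj)
round(t(B) %*% diag(pj) %*% B, 10)       # orthonormality check
round(colSums(pj * B), 10)               # each polynomial has weighted mean zero
```

For five categories the result has four columns, one fewer than the number of categories, in line with the description above.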
Usage
emerson.poly(mj, pj)
Arguments
mj |
The ordered scores of the ordered variable. By default, |
pj |
The marginal relative frequencies of the ordered variable. |
Value
The value returned is the matrix
B |
The matrix of the orthogonal polynomials with the trivial polynomial removed. |
Note
Note that the sum of the marginal relative frequencies of the ordered variables must be one.
Author(s)
Rosaria Lombardo and Eric J Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Emerson PL 1968 Numerical construction of orthogonal polynomials from a general recurrence formula. Biometrics, 24 (3), 695-701.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Lombardo R Beh EJ and Kroonenberg PM 2016 Modelling Trends in Ordered Correspondence Analysis Using Orthogonal
Polynomials. Psychometrika, 81(2), 325-349.
Two-way non symmetrical correspondence analysis
Description
This function is used in the main function CAvariants when the input parameter is catype = "NSCA".
It calculates the singular value decomposition of the numerator of the Goodman-Kruskal tau index (index of predictability),
computes principal axes, coordinates, weights of the rows and columns, total inertia (numerator of the tau index) and the rank of the matrix.
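A base R sketch of this computation follows. It is not the code of nscabasic and assumes that the rows are the response variable and the columns are the predictor, so that the matrix decomposed is the centred column profile matrix weighted by the square roots of the column masses. The names Pi, tau_num and tau_den are chosen here for illustration.

```r
N <- matrix(c(310, 36, 0, 0, 212, 158, 9, 0, 21, 35, 17, 4, 25, 102, 49, 18,
              7, 35, 51, 28), nrow = 4)
P  <- N / sum(N)
r  <- rowSums(P)                           # row (response) masses
cm <- colSums(P)                           # column (predictor) masses

# Centred column profiles: p_ij / p_.j - p_i.
Pi <- sweep(P, 2, cm, "/") - r

# Weight each column by the square root of its mass, then decompose.
dec <- svd(sweep(Pi, 2, sqrt(cm), "*"))

tau_num <- sum(dec$d^2)                    # numerator of the Goodman-Kruskal tau index
tau_den <- 1 - sum(r^2)                    # denominator of the tau index
tau     <- tau_num / tau_den
round(c(numerator = tau_num, denominator = tau_den, tau = tau), 4)
```

The total inertia in this variant is the numerator of the tau index rather than a chi-squared based quantity, which is why the sum of the squared singular values is compared with tau_num and not with Pearson's statistic.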
Usage
nscabasic(Xtable)
Arguments
Xtable |
The two-way contingency table. |
Author(s)
Rosaria Lombardo and Eric J. Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Lombardo R Beh EJ and Kroonenberg PM 2016 Modelling Trends in Ordered Correspondence Analysis Using Orthogonal
Polynomials. Psychometrika, 81(2), 325–349.
Examples
data(asbestos)
nscabasic(asbestos)
Main plot function
Description
This function produces the graphical display for the selected variant of correspondence analysis.
When catype = "CA" or catype = "NSCA" and plottype = "classic", the function produces a plot of the principal coordinates for the row and column categories.
When plottype = "biplot", it produces a biplot graphical display, or a polynomial biplot in the case of ordered variables.
For an ordered analysis only the polynomial biplots are constructed. In particular, for the singly ordered variants only the row isometric polynomial biplot is appropriate.
When the parameter catype defines an ordered variant of CA, the input parameter plottype should be set to plottype = "biplot". If biptype = "row", it will produce a row isometric polynomial biplot.
Usage
## S3 method for class 'CAvariants'
plot(x, firstaxis = 1, lastaxis = 2, thirdaxis = 3, cex = 0.8,
cex.lab = 0.8, plottype = "biplot", biptype = "row",
scaleplot = 1, posleg = "right", pos = 2, ell = FALSE,
alpha = 0.05, plot3d = FALSE, size1 = 1.5, size2 = 3, ...)
Arguments
x |
The name of the output object used with the main function |
firstaxis |
The horizontal polynomial, or principal, axis. By default, |
lastaxis |
The vertical polynomial, or principal, axis. By default, |
thirdaxis |
The third polynomial, or principal, axis in tridimensional plot. By default, |
cex |
The parameter for setting the size of the character labels for the points in a graphical display. By default, |
cex.lab |
The parameter for setting the size of the character labels of axes in graphical displays. By default, |
plottype |
The type of graphical display to be constructed (either a correspondence plot or a biplot). By default, |
biptype |
The parameter for specifying the type of biplot.
One may specify a row-isometric biplot ( |
scaleplot |
The parameter for scaling the classic plot and biplot coordinates. See Gower et al. (2011), section 2.3.1, or
page 135 of Beh and Lombardo (2014). By default, |
posleg |
The position of the legend when portraying trends of the categories
for ordered variants of correspondence analysis.
By default, |
pos |
The parameter that specifies the position of label of each point in the graphical display. By default, |
ell |
The logical parameter which specifies whether algebraic confidence ellipses are to be included in the plot or not.
Setting the input parameter to |
alpha |
The level of significance for the elliptical regions. By default, |
plot3d |
The logical parameter that specifies whether a 3D plot is to be included in the output or not. By default, |
size1 |
The size of the plotted symbol. By default, |
size2 |
The size of the plotted text. By default, |
... |
Further arguments passed to, or from, other functions. |
Details
It produces either a classical or biplot graphical display. Further, when catype = "DOCA", catype = "SOCA", catype = "DONSCA" or catype = "SONSCA", the trends of the row and column variables (after the reconstruction of column profiles by the polynomials) are portrayed.
For classical biplot displays, it superimposes the algebraic confidence ellipses. It uses the secondary plot function caellipse (or nscaellipse) for the symmetrical (or non symmetrical) CA variants.
Note
For the classical plots, row and column principal coordinates are plotted. For biplots, one set of coordinates is the standard coordinates and the other is the principal coordinates. When an ordered variant of correspondence analysis is performed, the biplot is constructed where one set of coordinates consists of the standard polynomial coordinates and the other one is the principal polynomial coordinates.
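The reason a biplot pairs principal coordinates with standard coordinates can be checked numerically. The base R sketch below covers simple CA only and assumes the convention in which the rows are in principal coordinates and the columns in standard coordinates; it is not the plotting code of the package, and the object names are illustrative.

```r
N <- matrix(c(310, 36, 0, 0, 212, 158, 9, 0, 21, 35, 17, 4, 25, 102, 49, 18,
              7, 35, 51, 28), nrow = 4)
P  <- N / sum(N)
r  <- rowSums(P)
cm <- colSums(P)

dec <- svd((P - outer(r, cm)) / sqrt(outer(r, cm)))

Fprinc <- (dec$u / sqrt(r)) %*% diag(dec$d)   # row principal coordinates
Gstd   <- dec$v / sqrt(cm)                    # column standard coordinates

# Inner products of the two sets of coordinates over all axes ...
inner <- Fprinc %*% t(Gstd)
# ... compared with the centred Pearson ratios p_ij / (p_i. p_.j) - 1.
ratio <- P / outer(r, cm) - 1
max(abs(inner - ratio))                       # numerically zero
```

With all axes retained the inner products reproduce the centred Pearson ratios exactly; a two-dimensional biplot shows the best rank-two approximation of them.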
Author(s)
Rosaria Lombardo and Eric J Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Gower J, Lubbe S, and le Roux, N 2011 Understanding Biplots. Wiley.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Lombardo R Beh EJ and Kroonenberg PM 2016 Modelling Trends in Ordered Correspondence Analysis Using Orthogonal
Polynomials. Psychometrika, 81(2), 325–349.
Examples
data(asbestos)
# Simple CA; M = 2 axes are used for the confidence ellipses
resasbestosCA <- CAvariants(asbestos, catype = "CA", M = 2)
plot(resasbestosCA, plottype = "classic", plot3d = TRUE)
plot(resasbestosCA, plottype = "classic", ell = TRUE)
plot(resasbestosCA, plottype = "biplot", biptype = "column", scaleplot = 1.5)
# Doubly ordered CA: column isometric polynomial biplot
resasbestosDOCA <- CAvariants(asbestos, catype = "DOCA")
plot(resasbestosDOCA, plottype = "biplot", biptype = "column")
# Non symmetrical CA: biplot with an additional 3D display
resasbestosNSCA <- CAvariants(asbestos, catype = "NSCA")
plot(resasbestosNSCA, plottype = "biplot", biptype = "column", plot3d = TRUE)
Main printing function for numerical summaries
Description
This function prints the numerical output for any of the six variants of correspondence analysis called by catype.
The input parameter is the name of the output of the main function CAvariants.
Usage
## S3 method for class 'CAvariants'
print(x, printdims = 2, ellcomp = TRUE, digits = 3,...)
Arguments
x |
The name of the output object from the main function |
printdims |
The number of dimensions that are used for summarising the numerical output of the analysis. By default, |
ellcomp |
This parameter specifies whether the characteristics of the confidence ellipses (eccentricity, semi-axis, area, p-values)
are to be computed. By default, |
digits |
The number of decimal places used for displaying the numerical summaries of the analysis.
By default, |
... |
Further arguments passed to, or from, other functions. |
Details
This function uses another function (called printwithaxes) for specifying the number of columns of a matrix to print.
Value
The output returned depends on the type of correspondence analysis that is performed
Xtable |
The two-way contingency table. |
Row weights: Imass |
The row weight matrix. These weights depend on the type of analysis that is performed. |
Column weights: Jmass |
The column weight matrix. These weights are equal to the column marginal relative frequencies for all types of analysis performed. |
Total inertia |
The total inertia of the analysis performed. For example, for variants of non symmetrical correspondence analysis, the output produced includes the numerator of the Goodman-Kruskal tau index, its C-statistic and p-value. |
Inertias |
The inertia values, their percentage contribution to the total inertia and the cumulative percent inertias for the row and column variables. |
Generalised correlation matrix |
The matrix of generalised correlations when performing
an ordered correspondence analysis, |
Row principal coordinates |
The row principal coordinates when |
Column principal coordinates |
The column principal coordinates when |
Row standard coordinates |
The row standard coordinates when |
Column standard coordinates |
The column standard coordinates when |
Row principal polynomial coordinates |
The row principal polynomial coordinates when performing an ordered correspondence analysis. |
Column principal polynomial coordinates |
The column principal polynomial coordinates when performing a doubly ordered correspondence analysis. |
Row standard polynomial coordinates |
The row standard polynomial coordinates, when performing an ordered variant of correspondence analysis. |
Column standard polynomial coordinates |
The column standard polynomial coordinates, when performing an ordered variant of correspondence analysis. |
Row distances from the origin of the plot |
The squared Euclidean distance of the row categories from the origin of the plot. |
Column distances from the origin of the plot |
The squared Euclidean distance of the column categories from the origin of the plot. |
Polynomial components |
The polynomial components of the total inertia and their p-values.
The total inertia of the column space is partitioned to identify polynomial components.
when |
Inner product |
The inner product of the biplot coordinates for the two-dimensional plot. |
eccentricity |
The eccentricity of the ellipse, the distance between its center and either of its two foci. It can be thought of as a measure of how much the conic section deviates from being circular. |
HL Axis 1 |
The length of semi-axis 1 of the ellipse for each row and column point. |
HL Axis 2 |
The length of semi-axis 2 of the ellipse for each row and column point. |
Area |
The area of the ellipse for each row and column point. |
pvalcol |
The p-value for each row and column point. |
Author(s)
Rosaria Lombardo and Eric J. Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Examples
data(asbestos)
resasbestos <- CAvariants(asbestos, catype = "DOCA")
print(resasbestos)
Secondary printing function
Description
The function is called from the main print function print.CAvariants.
It adds the names to objects.
Usage
printwithaxes(x, thenames,digits=3)
Arguments
x |
A matrix. |
thenames |
A character vector of the same length as |
digits |
The number of decimal places used for displaying the numerical summaries of the analysis.
By default, |
Author(s)
Rosaria Lombardo and Eric J. Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Two-way contingency table of Dutch shoplifting (1977-1978)
Description
This two-way contingency table summarises, in part, the results of a survey of the Dutch Central Bureau of Statistics (Israels, 1987). The table considers a sample of 20819 men who were suspected of shoplifting in stores of the Netherlands between 1977 and 1978.
Usage
data(shopdataM)
Format
The format is:
row names [1:13] "clothing" "accessories" "tobacco" "stationary" ...
col names [1:9] "M12<" "M13" "M16" "M19" ...
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Israels A 1987 Eigenvalue Techniques for Qualitative Data. DSWO Press, Leiden.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Examples
shopdataM <- structure(c(81, 66, 150, 667, 67, 24, 47, 430, 743, 132, 32,
197, 209, 138, 204, 340, 1409, 259, 272, 117, 637, 684, 408,
57, 547, 550, 304, 193, 229, 527, 258, 368, 98, 246, 116, 298,
61, 402, 454, 384, 149, 151, 84, 146, 141, 61, 40, 13, 71, 52,
138, 252, 942, 297, 313, 92, 251, 167, 193, 30, 16, 130, 111,
280, 624, 359, 109, 136, 36, 96, 67, 75, 11, 16, 31, 54, 200,
195, 178, 53, 121, 36, 48, 29, 50, 5, 6, 14, 41, 152, 88, 137,
68, 171, 37, 56, 27, 55, 17, 3, 11, 50, 211, 90, 45, 28, 145,
17, 41, 7, 29, 28, 8, 10, 28, 111, 34), .Dim = c(13L,9L), .Dimnames = list(
c("clothing", "accessories", "tobacco", "stationary", "books",
"records", "household", "candy", "toys", "jewelry", "perfumes",
"hobby", "other"), c("M12<", "M13", "M16", "M19", "M25",
"M35", "M45", "M57", "M65+")))
dim(shopdataM)
Singly, or one-way, ordered correspondence analysis: for an ordered column variable
Description
This function is used by the main function CAvariants when the input parameter is catype = "SOCA".
It performs the hybrid decomposition of Pearson's ratios and
computes the principal axes for the rows and polynomial axes for the columns. It also gives
the coordinates, row and column weights, total inertia (based on Pearson's chi-squared statistic)
and the rank of the matrix. It decomposes the inertia in terms of the column polynomial components.
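The hybrid decomposition described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code of socabasic: the row axes are taken from the singular value decomposition of the weighted, centred Pearson ratios, the column polynomials come from a hypothetical helper ortho_poly based on a weighted QR factorisation, and the scaling of Z may differ from the package's.

```r
N <- matrix(c(310, 36, 0, 0, 212, 158, 9, 0, 21, 35, 17, 4, 25, 102, 49, 18,
              7, 35, 51, 28), nrow = 4)
mj <- 1:5                                  # natural scores for the ordered columns
P  <- N / sum(N)
r  <- rowSums(P)
cm <- colSums(P)

# Hypothetical helper: polynomials orthonormal with respect to the weights p,
# with the trivial (constant) polynomial removed.
ortho_poly <- function(scores, p) {
  z <- scores - sum(p * scores)
  z <- z / sqrt(sum(p * z^2))
  V <- outer(z, 0:(length(z) - 1), `^`)
  (qr.Q(qr(sqrt(p) * V)) / sqrt(p))[, -1, drop = FALSE]
}

S <- (P - outer(r, cm)) / sqrt(outer(r, cm))
k <- min(dim(N)) - 1                       # number of non-trivial principal axes
A <- svd(S)$u[, 1:k] / sqrt(r)             # row standard coordinates (principal axes)
B <- ortho_poly(mj, cm)                    # column orthogonal polynomials

Z <- t(A) %*% P %*% B                      # generalised correlations (3 x 4)
col_comps <- colSums(Z^2)                  # column polynomial components of inertia
round(c(sum_Z2 = sum(Z^2), inertia = sum(S^2)), 4)
```

In this sketch the column sums of the squared entries of Z split the total inertia into a linear, a quadratic and higher-order column components, while the rows of Z remain tied to the principal axes.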
Usage
socabasic(Xtable, mj)
Arguments
Xtable |
The two-way contingency table. |
mj |
The set of ordered column scores. By default, |
Author(s)
Rosaria Lombardo and Eric J. Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Lombardo R Beh EJ and Kroonenberg PM 2016 Modelling Trends in Ordered Correspondence Analysis Using Orthogonal
Polynomials. Psychometrika, 81(2), 325–349.
Examples
data(asbestos)
mj <- c(1, 2, 3, 4, 5)
socabasic(asbestos, mj)
Singly, or one-way, ordered non symmetrical correspondence analysis: for an ordered column predictor variable
Description
This function is used by the main function CAvariants when the input parameter is catype = "SONSCA".
It performs the hybrid decomposition of the numerator of the Goodman-Kruskal tau index and assumes an ordered (column) variable.
It calculates the principal axes for the rows, the polynomial axes for the columns, and the coordinates.
It also calculates the row and column weights, the inertia (based on the numerator of the tau index) and the rank of the matrix.
It decomposes the inertia into column polynomial components.
Usage
sonscabasic(Xtable, mj)
Arguments
Xtable |
The two-way contingency table. |
mj |
The set of ordered column scores. By default, |
Author(s)
Rosaria Lombardo and Eric J. Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Lombardo R Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Lombardo R Beh EJ and Kroonenberg PM 2016 Modelling Trends in Ordered Correspondence Analysis Using Orthogonal
Polynomials. Psychometrika, 81(2), 325-349.
Examples
data(asbestos)
mj<-c(1, 2, 3, 4, 5)
sonscabasic(asbestos, mj)
Summary of numerical results from CA variants
Description
This function prints a numerical summary of the results from any of the six variants of correspondence analysis.
The input parameter is the name of the output of the main function CAvariants.
Usage
## S3 method for class 'CAvariants'
summary(object, printdims, digits, ...)
Arguments
object |
The output of the main function |
printdims |
The number of dimensions that are used for summarising the numerical output of the analysis. By default, |
digits |
The number of decimal places used for displaying the numerical summaries of the analysis.
By default, |
... |
Further arguments passed to, or from, other functions. |
Value
The value of output returned depends on the type of correspondence analysis that is performed.
Inertias |
The inertia values, their percentage contribution to the total inertia and the cumulative percent inertias for the row and column variables. |
Generalised correlation matrix |
The matrix of generalised correlations when performing
an ordered correspondence analysis, |
Row principal coordinates |
The row principal coordinates when |
Column principal coordinates |
The column principal coordinates when |
Row standard coordinates |
The row standard coordinates when |
Column standard coordinates |
The column standard coordinates when |
Row principal polynomial coordinates |
The row principal polynomial coordinates when |
Column principal polynomial coordinates |
The column principal polynomial coordinates when |
Row standard polynomial coordinates |
The row standard polynomial coordinates when |
Column standard polynomial coordinates |
The column standard polynomial coordinates when |
Total inertia |
The total inertia. For example, for non symmetrical correspondence analysis the numerator of the Goodman-Kruskal tau index, its C-statistic and p-value are returned. |
Polynomial components |
The polynomial components of the total inertia and their p-values.
The total inertia of the column space is partitioned to identify polynomial components.
when |
Inner product |
The inner product of the biplot coordinates for the two-dimensional plot. |
Author(s)
Rosaria Lombardo and Eric J. Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Lombardo R and Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Lombardo R, Beh EJ and Kroonenberg PM 2016 Modelling Trends in Ordered Correspondence Analysis Using Orthogonal Polynomials. Psychometrika, 81 (2), 325–349.
Examples
asbestos <- matrix(c(310, 36, 0, 0, 212, 158, 9, 0, 21, 35, 17, 4, 25, 102,
                     49, 18, 7, 35, 51, 28), nrow = 4, ncol = 5,
                   dimnames = list(c("none", "grade1", "grade2", "grade3"),
                                   c("0-9", "10-19", "20-29", "30-39", "40")))
risasbestos <- CAvariants(asbestos, catype = "DOCA", firstaxis = 1, lastaxis = 2)
summary(risasbestos)
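To make the "Inertias" part of the summary more concrete, the sketch below builds a table of inertia values, percentages and cumulative percentages in base R for the simple (symmetrical) case, using the singular value decomposition of the standardised residuals. It is an assumption-laden illustration rather than the package's own code: the ordered variants partition the same total inertia through polynomial components instead, and the name inertia_table is introduced here for illustration only.

```r
# Asbestos counts as given in the example above
asbestos <- matrix(c(310, 36, 0, 0, 212, 158, 9, 0, 21, 35, 17, 4, 25, 102,
                     49, 18, 7, 35, 51, 28), nrow = 4, ncol = 5)

P <- asbestos / sum(asbestos)   # joint proportions
rmass <- rowSums(P)             # row masses
cmass <- colSums(P)             # column masses
E <- outer(rmass, cmass)        # proportions expected under independence
S <- (P - E) / sqrt(E)          # standardised residuals

k <- min(dim(S)) - 1            # maximum number of non-trivial axes
inertia <- svd(S)$d[seq_len(k)]^2  # principal inertias

inertia_table <- data.frame(
  axis = seq_len(k),
  inertia = inertia,
  percent = 100 * inertia / sum(inertia),
  cumulative = cumsum(100 * inertia / sum(inertia))
)
print(inertia_table, digits = 3)
```

The sum of the inertia column equals Pearson's chi-squared statistic divided by the total number of observations.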
Trends of matrix rows and columns
Description
This function portrays the row and column trends of the centred column profile matrix reconstructed by means of orthogonal polynomials and/or principal axes.
Usage
trendplot(f, g, cex = 1, cex.lab = 0.8, main = " ", prop = 0.5,
posleg = "right", xlab = "First Axis",
ylab = "Second Axis")
Arguments
f |
The row coordinates. |
g |
The column coordinates. |
cex |
The parameter for setting the size of character labels of points in graphical displays. By default, |
cex.lab |
The parameter for setting the size of character labels of axes in graphical displays. By default, |
main |
The title of the graphical display. |
prop |
The scaling parameter for specifying the limits of the plotting area. By default, |
posleg |
The parameter for specifying the position of the legend in the graphical function |
xlab |
The parameter for setting the character label of the horizontal axis in graphical displays. |
ylab |
The parameter for setting the character label of the vertical axis in graphical displays. |
Note
This function is called from the main plot function plot.CAvariants.
Author(s)
Rosaria Lombardo and Eric J. Beh
References
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. John Wiley & Sons.
Lombardo R and Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.
Lombardo R, Beh EJ and Kroonenberg PM 2016 Modelling Trends in Ordered Correspondence Analysis Using Orthogonal Polynomials. Psychometrika, 81 (2), 325–349.
Algebraic elliptical confidence regions for symmetrical variants of correspondence analysis
Description
It produces elliptical confidence regions when symmetrical or ordered symmetrical correspondence analysis is performed.
This function allows the analyst to superimpose confidence ellipses onto a graphical display when the input parameter catype of the main function CAvariants is set to "CA", "SOCA" or "DOCA".
It is called internally from the main plot function plot.CAvariants.
It uses the function ellipse.
Usage
vcaellipse(t.inertia, inertias, inertiapc, cord1, cord2, a, b, firstaxis=1,
lastaxis = 2, n, M = 2, Imass, Jmass)
Arguments
t.inertia |
The total inertia of the two-way contingency table (Pearson's chi-squared or Goodman and Kruskal's index, depending on the CA variant). |
inertias |
The explained inertia of each dimension. |
inertiapc |
The percentage of explained inertia for each dimension. |
cord1 |
The row principal coordinates. |
cord2 |
The column principal coordinates. |
a |
The row standard coordinates or, in case of the ordered variants of CA, the row standard polynomial coordinates. |
b |
The column standard coordinates or, in case of the ordered variants of CA, the column standard polynomial coordinates. |
firstaxis |
The horizontal polynomial, or principal, axis. By default, |
lastaxis |
The vertical polynomial, or principal, axis. By default, |
n |
The total number of observations. |
M |
The number of axes considered in determining the structure of the elliptical confidence regions. |
Imass |
The weight matrix of the row variable. |
Jmass |
The weight matrix of the column variable. |
Details
The values returned by this function are described under Value.
Value
eccentricity |
The eccentricity of the ellipses. This is the ratio of the distance between the centre of the ellipse and either of its foci to the length of the semi-major axis, and can be thought of as a measure of how much the conic section deviates from being circular (when the region is perfectly circular, the eccentricity is zero). |
HL Axis 1 |
Value of the semi-major axis length for each row and column point. |
HL Axis 2 |
Value of the semi-minor axis length for each row and column point. |
Area |
Area of the ellipse for each row and column point. |
pvalcol |
Approximate p-value for each of the row and column points. |
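The relationship between the semi-axis lengths, the eccentricity and the area listed above can be sketched with standard ellipse geometry. The helper ellipse_summary and the input values are hypothetical and serve only to illustrate the formulas; this is not the code used inside vcaellipse.

```r
# Hypothetical helper: eccentricity and area of ellipses from their
# semi-major (a) and semi-minor (b) axis lengths
ellipse_summary <- function(a, b) {
  stopifnot(length(a) == length(b), all(b > 0), all(a >= b))
  data.frame(
    semi_major = a,
    semi_minor = b,
    eccentricity = sqrt(1 - (b / a)^2),  # zero for a circle
    area = pi * a * b
  )
}

# Illustrative axis lengths for two points (made-up values)
ellipse_summary(a = c(0.30, 0.20), b = c(0.15, 0.20))
```

In this illustration the second point has equal semi-axes, so its region is circular and its eccentricity is zero.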
Note
This function is called from the main plot function plot.CAvariants and is executed when the parameter ell = TRUE is set in that function.
Author(s)
Rosaria Lombardo and Eric J Beh
References
Beh EJ 2010 Elliptical confidence regions for simple correspondence analysis. J. Stat. Plan. Inference, 140, 2582–2588.
Beh EJ and Lombardo R 2014 Correspondence Analysis: Theory, Practice and New Strategies. Wiley.
Beh EJ and Lombardo R 2015 Confidence regions and approximate p-values for classical and non-symmetric correspondence analysis. Communications in Statistics - Theory and Methods, 44, 95–114.
Lombardo R and Beh EJ 2016 Variants of Simple Correspondence Analysis. The R Journal, 8 (2), 167–184.