Type: | Package |
Title: | Multidimensional Scaling of Asymmetric Proximities |
Version: | 2.0.4 |
Date: | 2022-06-17 |
Author: | Berrie Zielman |
Maintainer: | Berrie Zielman <berrie.zielman@gmail.com> |
Description: | Multidimensional scaling models and methods for the visualization and analysis of asymmetric proximity data <doi:10.1111/j.2044-8317.1996.tb01078.x>. An asymmetric data matrix has the same number of rows and columns, and these rows and columns refer to the same set of objects. At least some elements in the upper-triangle are different from the corresponding elements in the lower triangle. An example of an asymmetric matrix is a student migration table, where the rows correspond to the countries of origin of the students and the columns to the destination countries. This package provides algorithms for three multidimensional scaling models. These are the slide-vector model <doi:10.1007/BF02294474>, a scaling model with unique dimensions and the asymscal model for asymmetric multidimensional scaling. Furthermore, a heat map for skew-symmetric data, and the decomposition of asymmetry are provided for the exploratory analysis of asymmetric tables. |
License: | GPL (≥ 3) |
Imports: | gplots, stats, methods, smacof |
Suggests: | knitr, rmarkdown, RColorBrewer |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2022-06-21 06:46:09 UTC; berriez |
Repository: | CRAN |
Date/Publication: | 2022-06-22 08:20:17 UTC |
Multidimensional Scaling of Asymmetric Proximities
Description
Multidimensional scaling models and methods for the visualization and analysis of asymmetric proximity data <doi:10.1111/j.2044-8317.1996.tb01078.x>. An asymmetric data matrix has the same number of rows and columns, and these rows and columns refer to the same set of objects. At least some elements in the upper triangle differ from the corresponding elements in the lower triangle. An example of an asymmetric matrix is a student migration table, where the rows correspond to the countries of origin of the students and the columns to the destination countries. This package provides algorithms for three multidimensional scaling models: the slide-vector model <doi:10.1007/BF02294474>, a scaling model with unique dimensions, and the asymscal model for asymmetric multidimensional scaling. Furthermore, a heat map for skew-symmetric data and the decomposition of asymmetry are provided for the exploratory analysis of asymmetric tables.
Author(s)
Berrie Zielman
Maintainer: Berrie Zielman <berrie.zielman@gmail.com>
References
Zielman, B., and Heiser, W. J. (1993), The analysis of asymmetry by a slide-vector, Psychometrika, 58, 101-114.
Distance Matrix of Eight English Towns
Description
A data matrix with 8 rows and 8 columns. The data are distances between eight English towns; this data matrix is made asymmetric by adding a linear skew-symmetric matrix. The asymmetry in this dataset is therefore imposed by perturbing the data.
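The construction described above can be illustrated with a small base R sketch. The coordinates and the row effects `a` below are made up for illustration; they are not the values behind `Englishtowns`.

```r
# Hypothetical example: symmetric distances between four made-up points
coords <- cbind(c(0, 3, 0, 3), c(0, 0, 4, 4))
D <- as.matrix(dist(coords))          # symmetric distance matrix

# Linear skew-symmetric matrix: entry (i, j) equals a[i] - a[j]
a <- c(-0.3, -0.1, 0.1, 0.3)          # illustrative row effects (assumed)
L <- outer(a, a, "-")

Q <- D + L                            # perturbed, asymmetric matrix
isSymmetric(unname(D))                # TRUE
isSymmetric(unname(Q))                # FALSE
```

The skew-symmetric part of `Q`, computed as `(Q - t(Q)) / 2`, equals `L`, because the symmetric distances cancel.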
Usage
data("Englishtowns")
References
Constantine, A.G. & Gower, J.C. (1978). Graphical Representation of Asymmetric Matrices. Appl. Statist, 27, 297-304.
Examples
data(Englishtowns)
Weighted Euclidean Model for Asymmetric Matrices
Description
This function fits a weighted multidimensional scaling model that is known as the asymscal model. This model is an extension of the symmetric Euclidean distance model proposed by Young (1975). The model is fitted in a stress majorization framework called SMACOF, whereas Young fitted this model using a least squares algorithm. Asymmetry is modelled by differential weighting of the dimensions of a multidimensional scaling configuration: when a subject compares object i to j, he or she may use different weights than when comparing object j to i. In addition to these weights, the locations of the objects are jointly estimated from the data. With v_{is} denoting the weight of row object i on dimension s, the distance is defined as:
d_{ij}(X)=\sqrt{\sum_{s=1}^pv_{is}(x_{is}-x_{js})^2}.
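As a rough illustration of this distance formula, the following base R sketch evaluates it for a made-up configuration. The helper `asymscal_dist` and the values of `X` and `V` are hypothetical and are not part of the package.

```r
# Weighted Euclidean distances with row-specific weights (asymscal-type model).
# X: n x p configuration; V: n x p non-negative weights, one row per object.
asymscal_dist <- function(X, V) {
  n <- nrow(X)
  D <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # the weights of the row object i are used for the distance from i to j
      D[i, j] <- sqrt(sum(V[i, ] * (X[i, ] - X[j, ])^2))
    }
  }
  D
}

X <- cbind(c(0, 1, 0), c(0, 0, 2))            # made-up coordinates
V <- rbind(c(1, 1), c(1, 1), c(1.35, 0.25))   # object 3 weights the dimensions differently
D <- asymscal_dist(X, V)
D[1, 3]   # 2: uses the weights of object 1
D[3, 1]   # 1: uses the weights of object 3
```

Objects with identical weights, such as objects 1 and 2 here, have symmetric distances; asymmetry arises only where the row weights differ.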
Usage
asymscal(data, ndim = 2, start = NULL, verbose = FALSE, itmax = 10000, eps = 1e-10)
Arguments
data |
Asymmetric dissimilarity matrix |
ndim |
Number of dimensions |
start |
Optional configuration with starting values, the default is a random start configuration |
verbose |
If TRUE, stress values during the iterations are printed |
itmax |
Maximum number of iterations |
eps |
Convergence criterion for Stress |
Details
This function exploits a connection between the INDSCAL model and the asymscal model. It inherits the methods for plotting and printing from smacofIndDiff in the smacof package. In essence, asymscal takes two steps. First, it sets up the appropriate dissimilarity and missing data structure for a three-way multidimensional scaling model; then it calls smacofIndDiff from the imported package smacof. After correcting for the normalization applied to the data by smacofIndDiff, the results can be displayed and plotted by the methods in the package smacof.
The original algorithm for fitting the asymscal model fits squared distances. This function is based on majorization, and fits distances rather than squared distances. The configuration matrix is normalized so that the sum of squares of each of its columns equals one.
Value
delta |
Observed dissimilarities |
obsdiss |
List of observed dissimilarities, normalized |
gspace |
Joint configurations aka group stimulus space |
cweights |
Configuration weights |
stress |
Stress-1 value |
resmat |
Matrix with residuals |
rss |
Residual sum-of-squares |
spp |
Stress per point |
ndim |
Number of dimensions |
model |
Type of the asymmetric scaling model |
niter |
Number of iterations |
nobj |
Number of objects |
References
Young, F. W. (1975). An asymmetric Euclidean model for multi-process asymmetric data. Paper presented at the U.S.-Japan Seminar on Multidimensional scaling, San Diego, U.S.A.
Examples
## Not run:
data("asymscalexample")
fit <- asymscal(asymscalexample, ndim = 2, itmax = 10000, eps = 1e-10)
fit$cweights                          # configuration weights
round(fit$cweights, 3)                # rounded for readability
plot(fit, plot.type = "confplot")
plot(fit, plot.type = "bubbleplot")
plot(fit, plot.type = "stressplot")
## End(Not run)
Asymscal Example Data
Description
This is an artificial dataset. The data are distances from a two-dimensional model, and because of this construction the asymscal model fits these data exactly. Two rows of this matrix have weights different from (1, 1): the fifth subject has weights (1.35, .25), and the 15th subject has weights (1.65, .425).
Usage
data("asymscalexample")
Format
A matrix with 15 rows and 15 columns.
Heatmap for skew-symmetric data
Description
This heatmap displays the values of a skew-symmetric matrix by colors. The option dominance orders the rows and columns of the matrix in such a way that the values in the upper triangle are positive and the values in the lower triangle are negative. The order is calculated from the row sums of the signs obtained from the skew-symmetric matrix.
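The ordering by row sums of signs can be sketched in base R as follows. The table `Q` is made up, and sorting the scores in decreasing order is an assumption about how the ordering is applied; this is not the code used inside `hmap`.

```r
# Made-up asymmetric table for four objects
Q <- matrix(c(0, 2, 9, 4,
              5, 0, 8, 6,
              1, 3, 0, 2,
              7, 1, 5, 0), nrow = 4, byrow = TRUE,
            dimnames = list(letters[1:4], letters[1:4]))

A <- (Q - t(Q)) / 2                     # skew-symmetric part
score <- rowSums(sign(A))               # row sums of the signs
ord <- order(score, decreasing = TRUE)  # most dominant object first (assumed direction)
sign(A[ord, ord])                       # reordered sign pattern
```

In this example the reordered matrix has only positive signs above the diagonal. When the dominance relations are not transitive, some negative values will remain in the upper triangle.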
Usage
hmap(x, dominance = FALSE, ...)
Arguments
x |
A square matrix, either skew-symmetric or asymmetric, or an object of class |
dominance |
If TRUE, the signs of the skew-symmetric matrix are shown in the heatmap; if FALSE, the values of this matrix are shown. |
... |
Further plot arguments: see |
Examples
data(studentmigration)
hmap(studentmigration, dominance = TRUE, col = c("red", "white", "blue"))
MDS Model with Unique Dimensions
Description
This asymmetric MDS model, proposed by Holman (1979) and further analyzed by Bentler & Weeks (1982), has both common and unique dimensions. The common dimensions are shared by all objects, whereas each unique dimension applies to one object only. A unique dimension has a nonzero value for only one object; the coordinates of the other objects are zero. There are as many unique dimensions as there are objects. An asymmetric version of this model has two sets of unique dimensions: one for the rows and one for the columns. The distance in this model is defined as:
d_{ij}(X)=\sqrt{\sum_{s=1}^p (x_{is}-x_{js})^2 + r_{i}^{2}+c_{j}^{2}}.
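A minimal base R sketch of this distance formula is given below. The function `unique_dist` and the values of `X`, `r` and `cc` are hypothetical; the formula is applied to every pair, including the diagonal.

```r
# Distances for a model with common and unique dimensions.
# X: n x p common coordinates; r, cc: unique row and column dimensions.
unique_dist <- function(X, r, cc) {
  common <- as.matrix(dist(X))^2          # squared distances in the common space
  sqrt(common + outer(r^2, cc^2, "+"))    # add r_i^2 + c_j^2 to entry (i, j)
}

X  <- cbind(c(0, 3, 0), c(0, 0, 4))   # made-up common coordinates
r  <- c(1, 0, 0)                      # only object 1 has a unique row dimension
cc <- c(0, 0, 2)                      # only object 3 has a unique column dimension
D <- unique_dist(X, r, cc)
D[1, 2]   # sqrt(9 + 1 + 0)
D[2, 1]   # sqrt(9 + 0 + 0)
```

The two values differ only through the unique dimensions, which is how this model represents asymmetry.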
Usage
mdsunique(data, weight = NULL, ndim = 2, verbose = FALSE, itmax = 125, eps = 1e-12)
Arguments
data |
Asymmetric dissimilarity matrix |
weight |
Optional non-negative matrix with weights, if no weights are given all weights are set equal to one |
ndim |
Number of dimensions |
verbose |
If true, prints the iteration history to screen |
itmax |
Maximum number of iterations |
eps |
Convergence criterion for Stress |
Value
ndim |
Number of dimensions of the configuration |
fulldim |
Number of dimensions of the full model, this equals |
stress |
The raw stress for this model |
confi |
Returns the configuration matrix of shared dimensions of this multidimensional scaling model |
X |
Returns the configuration matrix of the full model consisting of shared and unique dimensions |
niter |
The number of iterations for the algorithm to converge |
nobj |
The number of objects in this model |
resid |
A matrix with raw residuals |
model |
Name of this asymmetric multidimensional scaling model |
row |
The unique dimensions for the rows |
col |
The unique dimensions for the columns |
unique |
The unique dimensions |
Examples
## Not run:
data("studentmigration")
mm<-studentmigration
mm[mm==0]<-.5 # replace zeroes by a small number
mm <- -log(mm/sum(mm)) # convert similarities to dissimilarities
v<-mdsunique(mm, ndim = 2, itmax = 2100, verbose=FALSE, eps = .0000000001)
plot(v, yplus = .3)
## End(Not run)
Plot Method for Multidimensional Scaling Models
Description
Method for a two-dimensional plot of the model. Available rownames are plotted as labels above the points. The slide-vector is shown as an arrow.
Usage
## S3 method for class 'slidevector'
plot(x, plot.dim = c(1, 2), yplus = 0, xlab, ylab, ...)
## S3 method for class 'mdsunique'
plot(x, plot.dim = c(1, 2), yplus = 0, xlab, ylab, ...)
Arguments
x |
Object of class |
plot.dim |
A vector with dimensions to be plotted |
yplus |
Parameter to adjust the vertical position of the label |
xlab |
Label of x-axis. |
ylab |
Label of y-axis. |
... |
Further plot arguments: see |
Examples
## 2D plot for the slide-vector model on generated data
dis <- matrix(c(1, 2, 3, 4, 5, 6, 2, 8, 9, 3), nrow = 5, ncol = 2) #configuration
a <- rbind(dis, dis+1.5) #generate slide-vector
test <- as.matrix(dist(a))[1:5, 6:10] #extract data
v <- slidevector(test, ndim = 2, itmax = 250, eps = .001)
plot(v)
Plotting method for the skew-symmetric part of an asymmetric matrix
Description
This plotting method provides a multidimensional representation of skew-symmetry based on the singular value decomposition (SVD). The properties of the SVD of a skew-symmetric matrix were given by Gower (1977), who also described guidelines for the interpretation of diagrams obtained by plotting pairs of singular vectors. The singular vectors of a skew-symmetric matrix come in pairs with equal singular values. The diagrams are not interpreted by comparing distances between points, as is usual in multidimensional scaling, but by comparing areas formed by two points and the origin. Each pair of singular vectors spans a plane, and the area of the triangle between two points and the origin represents skew-symmetry. The sign of the skew-symmetry between two points is modelled by a direction in the plane. Going clockwise, the area between two points and the origin is negative; going counterclockwise, the area is positive.
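The pairing of singular values and the interpretation by triangle areas can be checked numerically with a short base R sketch. The matrix `A` is made up, and using the first two left singular vectors as plotting coordinates is an assumption for illustration, not necessarily the scaling used by the plot method.

```r
# Made-up skew-symmetric matrix
A <- matrix(c( 0,  2, -1,  3,
              -2,  0,  4, -2,
               1, -4,  0,  1,
              -3,  2, -1,  0), nrow = 4, byrow = TRUE)
stopifnot(all(A == -t(A)))

s <- svd(A)
s$d                                  # singular values come in equal pairs

# Coordinates of the objects in the first plane
u <- s$u[, 1]
v <- s$u[, 2]

# Twice the signed area of the triangle (origin, point i, point j)
area2 <- outer(u, v) - outer(v, u)

# Part of A that belongs to the first pair of singular values
P  <- s$u[, 1:2] %*% t(s$u[, 1:2])
A1 <- P %*% A %*% P

# Up to an overall sign (the orientation of the plane), A1 is
# proportional to the triangle areas, with the singular value as factor
max(abs(abs(A1) - s$d[1] * abs(area2)))   # numerically zero
```

A large triangle between two points and the origin therefore corresponds to a large skew-symmetric value for that pair in this plane.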
Usage
## S3 method for class 'skewsymmetry'
plot(x, plot.plane = 1, yplus = 0, xlab, ylab, ...)
Arguments
x |
An object of class skewsymmetry |
plot.plane |
Integer indicating which plane to plot |
yplus |
Offset for the labels above the object points |
xlab |
Label for the x-axis |
ylab |
Label for the y-axis |
... |
Further plot arguments |
References
Gower, J.C. (1977). The analysis of asymmetry and orthogonality. In: Recent Developments in Statistics (J. Barra, F. Brodeau, G. Romier & B. van Cutsem, Eds.), 109-123. North Holland, Amsterdam.
Intercountry Notification of Unsafe Products
Description
The Rapid Alert System for dangerous non-food products (RAPEX) notifies EU member states about risks of products to the health and safety of consumers. Risks for the consumer include choking, strangulation and fire, to name just a few. Examples of products in this database are powerbanks, clothing, toys and lighters, among others. Dozens of products in the EU are withdrawn from the market every month because they pose a risk to users' health and safety. Market surveillance authorities in EU member states are expected to inform other countries about dangerous products, so that these products are removed from the market in other countries as well. These data are maintained in an exchange system known as RAPEX. Countries can register unsafe products in the RAPEX database; this process is called notification. Other countries may then act on a notification made by one of the other countries. This table is derived from the RAPEX database. The entries in the table give the number of products removed from the market in the row country that were acted upon by the column country.
Decompose an Asymmetric Matrix into Symmetric and Skew-symmetric Components
Description
The decomposition of an asymmetric matrix into a symmetric matrix and a skew-symmetric matrix is an elementary result from mathematics that is the cornerstone of this package. The decomposition into a skew-symmetric and a symmetric component is written as Q = S + A, where Q is an asymmetric matrix, S is a symmetric matrix, and A is a skew-symmetric matrix. This decomposition provides a justification for separate analyses of S and A. This decomposition is a useful tool for data analysis and graphical representation by areas. A second application is to the study of an asymmetric matrix of residuals, obtained after fitting an MDS model.
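The decomposition can be reproduced in a few lines of base R, using the standard formulas S = (Q + Q')/2 and A = (Q - Q')/2. The matrix `Q` below is made up for illustration.

```r
# Made-up asymmetric matrix
Q <- matrix(c(0, 2, 9,
              5, 0, 8,
              1, 3, 0), nrow = 3, byrow = TRUE)

S <- (Q + t(Q)) / 2     # symmetric part
A <- (Q - t(Q)) / 2     # skew-symmetric part

all.equal(S + A, Q)     # TRUE: the two parts add up to Q
all.equal(S, t(S))      # TRUE: S is symmetric
all.equal(A, -t(A))     # TRUE: A is skew-symmetric
```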
Usage
skewsymmetry(x)
Arguments
x |
Asymmetric matrix |
Value
S |
The symmetric part of the matrix |
A |
The skew-symmetric part of the matrix |
linear |
The row means of the skew-symmetric matrix; this amounts to fitting a linear model with row and column effects to the skew-symmetric matrix |
sv |
The singular vectors of the skew-symmetric matrix |
sval |
A vector containing the singular values of the skew-symmetric part of the data matrix |
nobj |
The number of objects |
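The relation between the row means and a linear model for skew-symmetry can be illustrated with a small base R sketch. The effects `a0` are made up; for an exactly linear skew-symmetric matrix, the row means recover them.

```r
a0 <- c(-0.3, -0.1, 0.1, 0.3)             # made-up effects that sum to zero
A <- outer(a0, a0, "-")                   # linear skew-symmetric matrix: a0[i] - a0[j]

a_hat <- rowMeans(A)                      # estimated effects
all.equal(a_hat, a0)                      # TRUE
all.equal(outer(a_hat, a_hat, "-"), A)    # TRUE: the linear model reproduces A
```

For real data the linear part will not reproduce the skew-symmetric matrix exactly, and the difference can be inspected as a residual.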
Examples
data("Englishtowns")
Q <- skewsymmetry(Englishtowns)
# the skew-symmetric part
Q$A
The slide-vector model
Description
The slide-vector model is a multidimensional scaling model for asymmetric proximity data. Here, an asymmetric distance model is fitted to the data, where the asymmetry in the data is represented by the projections of the coordinates of the objects onto the slide-vector. The slide-vector points in the direction of large asymmetries in the data. The distance from i to j is larger than the distance from j to i if point i has a higher projection onto the slide-vector than point j. If the line connecting two points is perpendicular to the slide-vector, the difference between the two projections is zero and the distance between the two points is symmetric. The algorithm for fitting this model is derived from the majorization approach to multidimensional scaling. The distance in this model is defined as:
d_{ij}(X)=\sqrt{\sum_{s=1}^p(x_{is}-x_{js}+z_{s})^2}.
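The role of the projections follows from this formula: expanding the squares gives d_{ij}^2 - d_{ji}^2 = 4 \sum_s (x_{is} - x_{js}) z_s. The base R sketch below checks this numerically; the helper `slide_dist` and the values of `X` and `z` are hypothetical.

```r
# Slide-vector distances for a configuration X (n x p) and slide-vector z
slide_dist <- function(X, z) {
  n <- nrow(X)
  D <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      D[i, j] <- sqrt(sum((X[i, ] - X[j, ] + z)^2))
    }
  }
  D
}

X <- cbind(c(0, 2, 0), c(0, 0, 1))   # made-up coordinates
z <- c(0.5, 0)                       # made-up slide-vector
D <- slide_dist(X, z)

proj <- drop(X %*% z)                # inner products of the points with z
all.equal(D^2 - t(D)^2, 4 * outer(proj, proj, "-"))   # TRUE

c(D[2, 1], D[1, 2])   # 2.5 and 1.5: object 2 has the higher projection
all.equal(D[1, 3], D[3, 1])          # TRUE: objects 1 and 3 lie perpendicular to z
```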
Usage
slidevector(data, weight = NULL, ndim = 2, verbose = FALSE, itmax = 125, eps = 1e-12)
Arguments
data |
Asymmetric dissimilarity matrix |
weight |
Optional non-negative matrix with weights, if no weights are given all weights are set equal to one |
ndim |
Number of dimensions |
verbose |
If TRUE, print the history of iterations |
itmax |
Maximum number of iterations |
eps |
Convergence criterion for the algorithm |
Details
The slide-vector model is a special case of the unfolding model. Therefore, the model is fitted by a constrained unfolding algorithm. The coordinates of the objects are calculated by minimizing a least squares loss function. This loss function is called stress in the multidimensional scaling literature. The stress is minimized by a version of the SMACOF algorithm. The main outputs are the configuration of points and the slide-vector.
Value
ndim |
Number of dimensions |
stress |
The raw stress for this model |
confi |
Returns the configuration matrix of this multidimensional scaling model |
niter |
The number of iterations for the algorithm to converge |
nobj |
The number of observations in this model |
resid |
A matrix with raw residuals |
slvec |
Coordinates of the slide-vector |
model |
Name of this asymmetric multidimensional scaling model |
References
Zielman, B., and Heiser, W. J. (1993), The analysis of asymmetry by a slide-vector, Psychometrika, 58, 101-114.
Examples
## asymmetric distances between English towns
data(Englishtowns)
v <- slidevector(Englishtowns, ndim = 2, itmax = 250, eps = .001)
plot(v)
Student Mobility in the Erasmus Program
Description
The table lists the home and destination country of 268,142 students participating in the Erasmus program in the academic year 2012-2013. The 33 rows of this table refer to the home countries, whereas the 33 columns refer to the destination countries. The table gives the number of inbound and outbound students between every pair of countries, and the entries in the table are read as follows: 32 students from Bulgaria studied in the Netherlands, and 18 students from the Netherlands studied in Bulgaria. Macedonia (MK) was excluded from the published table because only one student from Macedonia studied abroad and this country did not receive any students.
Usage
data(studentmigration)
Format
A matrix of 33 rows by 33 columns
Details
The Erasmus program is a student exchange program of the European Union. Three million students had taken part since the start of the program in 1987. To join the program, a student has to study for at least three months or do an internship of at least two months in another country. The two-letter country codes are supplied by the ISO (International Organization for Standardization).
Note
Macedonia has been removed from this table because only one student from this country participated in the program, and no students moved to Macedonia.
Source
https://education.ec.europa.eu
Examples
data(studentmigration)
hmap(studentmigration)
Summary method of the decomposition
Description
Prints a decomposition of the sum of squares of an asymmetric matrix. The first column gives the sum of squares, and the second column gives the percentages of the two components. This decomposition can be applied to data, but also to a matrix of residuals obtained from a fitted model.
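The idea behind this table can be sketched in base R: because the symmetric and skew-symmetric parts are orthogonal, their sums of squares add up to the sum of squares of the original matrix. The matrix `Q` is made up, and whether the summary method treats the diagonal in the same way is an assumption.

```r
# Made-up asymmetric matrix
Q <- matrix(c(0, 2, 9,
              5, 0, 8,
              1, 3, 0), nrow = 3, byrow = TRUE)

S <- (Q + t(Q)) / 2                          # symmetric part
A <- (Q - t(Q)) / 2                          # skew-symmetric part

ss <- c(Symmetry = sum(S^2), Skew = sum(A^2))
cbind(SS = ss, Percent = 100 * ss / sum(Q^2))

all.equal(sum(ss), sum(Q^2))                 # TRUE: the sums of squares add up
```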
Usage
## S3 method for class 'skewsymmetry'
summary(object, ...)
Arguments
object |
An object of class |
... |
Further parameters |
Examples
data(Englishtowns)
q <- skewsymmetry(Englishtowns)
summary(q)