Version: 1.1.3
Date: 2024-03-04
Title: Accelerated Sparse Discriminant Analysis
Imports: MASS (≥ 7.3.45), ggplot2 (≥ 2.1.0), grid (≥ 3.2.2), gridExtra (≥ 2.2.1)
Depends: R (≥ 3.2)
Description: Implementation of sparse linear discriminant analysis, which is a supervised classification method for multiple classes. Various novel optimization approaches to this problem are implemented, including alternating direction method of multipliers ('ADMM'), proximal gradient ('PG') and accelerated proximal gradient ('APG') (see Atkins et al., <doi:10.48550/arXiv.1705.07194>). Functions for performing cross validation are also supplied, along with basic prediction and plotting functions. Sparse zero variance discriminant analysis ('SZVD') is also included in the package (see Ames and Hong, <doi:10.48550/arXiv.1401.5492>). See the 'github' wiki for a more extended description.
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
URL: https://github.com/gumeo/accSDA/wiki
BugReports: https://github.com/gumeo/accSDA/issues
Encoding: UTF-8
RoxygenNote: 7.2.3
NeedsCompilation: no
Packaged: 2024-03-06 18:34:56 UTC; gudmundureinarsson
Author: Gudmundur Einarsson [aut, cre, trl], Line Clemmensen [aut, ths], Brendan Ames [aut], Summer Atkins [aut]
Maintainer: Gudmundur Einarsson <gumeo140688@gmail.com>
Repository: CRAN
Date/Publication: 2024-03-06 18:50:02 UTC
ADMM on l1 regularized quadratic program
Description
Applies Alternating Direction Method of Multipliers to the l1-regularized quadratic program
f(\mathbf{x}) + g(\mathbf{x}) = \frac{1}{2}\mathbf{x}^TA\mathbf{x} - d^T\mathbf{x} + \lambda |\mathbf{x}|_1
Usage
ADMM_EN2(R, d, x0, lam, mu, maxits, tol, quiet, selector = rep(1, dim(x0)[1]))
Arguments
R |
Upper triangular matrix in the Cholesky decomposition. |
d |
n by 1 dimensional column vector. |
lam |
Regularization parameter for l1 penalty, must be greater than zero. |
mu |
Augmented Lagrangian penalty parameter, must be greater than zero. |
maxits |
Number of iterations to run. |
tol |
Vector of stopping tolerances, first value is absolute, second is relative tolerance. |
quiet |
Logical controlling display of intermediate statistics. |
selector |
Vector to choose which parameters in the discriminant vector will be used to calculate the regularization terms. The size of the vector must be *p*, the number of predictors. The default value is a vector of all ones. This is currently only used for ordinal classification. |
Details
This function is used by other functions and should only be called explicitly for debugging purposes.
Value
ADMM_EN2 returns an object of class "ADMM_EN2" including a list with the following named components:
call
The matched call.
x
Found solution.
y
Dual solution.
z
Slack variables.
k
Number of iterations used.
See Also
Used by: SDAD and the SDADcv cross-validation version.
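To make the iteration concrete, here is a minimal, hypothetical sketch of ADMM applied to this problem (illustrative only, not the package's internal code), splitting x = z so the quadratic part and the l1 part are handled separately; the Cholesky factor plays the role of the R argument above.
# Illustration only: ADMM for 0.5*x'Ax - d'x + lam*||x||_1 via the split x = z.
admm_l1qp_sketch <- function(A, d, lam, mu = 1, maxits = 1000,
                             tol = c(1e-5, 1e-5)) {
  p <- length(d)
  R <- chol(A + mu * diag(p))   # upper triangular factor, as in ADMM_EN2
  soft <- function(v, a) sign(v) * pmax(abs(v) - a, 0)
  x <- z <- u <- numeric(p)
  for (k in 1:maxits) {
    # x-update: solve (A + mu*I) x = d + mu*(z - u) via the Cholesky factor
    x <- drop(backsolve(R, forwardsolve(t(R), d + mu * (z - u))))
    z_old <- z
    z <- soft(x + u, lam / mu)            # z-update: soft thresholding
    u <- u + x - z                        # scaled dual update
    r <- sqrt(sum((x - z)^2))             # primal residual
    s <- mu * sqrt(sum((z - z_old)^2))    # dual residual
    # absolute/relative stopping rule, mirroring the two-element tol
    eps_pri  <- sqrt(p) * tol[1] + tol[2] * max(sqrt(sum(x^2)), sqrt(sum(z^2)))
    eps_dual <- sqrt(p) * tol[1] + tol[2] * mu * sqrt(sum(u^2))
    if (r < eps_pri && s < eps_dual) break
  }
  list(x = x, y = mu * u, z = z, k = k)
}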
ADMM on l1 regularized quadratic program
Description
Applies Alternating Direction Method of Multipliers to the l1-regularized quadratic program
f(\mathbf{x}) + g(\mathbf{x}) = \frac{1}{2}\mathbf{x}^TA\mathbf{x} - d^T\mathbf{x} + \lambda |\mathbf{x}|_1
Usage
ADMM_EN_SMW(Ainv, V, R, d, x0, lam, mu, maxits, tol, quiet, selector)
Arguments
Ainv |
Diagonal of |
V |
Matrix from SMW formula. |
R |
Upper triangular matrix in Cholesky decomposition of |
d |
n by 1 dimensional column vector. |
lam |
Regularization parameter for l1 penalty, must be greater than zero. |
mu |
Augmented Lagrangian penalty parameter, must be greater than zero. |
maxits |
Number of iterations to run. |
tol |
Vector of stopping tolerances, first value is absolute, second is relative tolerance. |
quiet |
Logical controlling display of intermediate statistics. |
selector |
Vector to choose which parameters in the discriminant vector will be used to calculate the regularization terms. The size of the vector must be *p*, the number of predictors. The default value is a vector of all ones. This is currently only used for ordinal classification. |
Details
This function is used by other functions and should only be called explicitly for debugging purposes.
Value
ADMM_EN_SMW returns an object of class "ADMM_EN_SMW" including a list with the following named components:
call
The matched call.
x
Found solution.
y
Dual solution.
z
Slack variables.
k
Number of iterations used.
See Also
Used by: SDAD and the SDADcv cross-validation version.
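For intuition, here is a small, hypothetical illustration (names are not from the package) of the Sherman-Morrison-Woodbury identity that lets the x-update be computed from a diagonal inverse and a much smaller factorization, which is the point of the Ainv and V arguments.
# Illustration only: solve (D + U V) x = b using SMW, where D is diagonal.
set.seed(1)
p <- 200; k <- 5
Dinv <- 1 / runif(p, 1, 2)          # diagonal entries of D^{-1}
U <- matrix(rnorm(p * k), p, k)
V <- t(U)
b <- rnorm(p)
# Only a k by k system is solved instead of a p by p one:
small <- diag(k) + V %*% (Dinv * U)
x_smw <- Dinv * b - Dinv * (U %*% solve(small, V %*% (Dinv * b)))
# Agreement with the direct p by p solve (up to rounding):
x_direct <- solve(diag(1 / Dinv) + U %*% V, b)
max(abs(x_smw - x_direct))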
Accelerated Proximal Gradient on l1 regularized quadratic program
Description
Applies accelerated proximal gradient algorithm to the l1-regularized quadratic program
f(\mathbf{x}) + g(\mathbf{x}) = \frac{1}{2}\mathbf{x}^TA\mathbf{x} - d^T\mathbf{x} + \lambda |\mathbf{x}|_1
Usage
APG_EN2(A, d, x0, lam, alpha, maxits, tol, selector = rep(1, dim(x0)[1]))
Arguments
A |
p by p positive definite coefficient matrix. |
d |
n by 1 dimensional column vector. |
lam |
Regularization parameter for l1 penalty, must be greater than zero. |
alpha |
Step length. |
maxits |
Number of iterations to run. |
tol |
Stopping tolerance for proximal gradient algorithm. |
selector |
Vector to choose which parameters in the discriminant vector will be used to calculate the regularization terms. The size of the vector must be *p*, the number of predictors. The default value is a vector of all ones. This is currently only used for ordinal classification. |
Details
This function is used by other functions and should only be called explicitly for debugging purposes.
Value
APG_EN2 returns an object of class "APG_EN2" including a list with the following named components:
call
The matched call.
x
Found solution.
k
Number of iterations used.
See Also
Used by: SDAAP and the SDAAPcv cross-validation version.
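As a point of reference, a minimal FISTA-style sketch of the accelerated proximal gradient iteration for this objective follows; it is illustrative only (not the package's internal code) and assumes a fixed step length alpha, as in the signature above.
# Illustrative accelerated proximal gradient (FISTA-style) iteration for
# 0.5*x'Ax - d'x + lam*||x||_1 with a fixed step length alpha.
apg_l1qp_sketch <- function(A, d, x0, lam, alpha, maxits = 1000, tol = 1e-6) {
  soft <- function(v, a) sign(v) * pmax(abs(v) - a, 0)
  x <- x_old <- x0
  t_old <- 1
  for (k in 1:maxits) {
    t_new <- (1 + sqrt(1 + 4 * t_old^2)) / 2
    y <- x + ((t_old - 1) / t_new) * (x - x_old)   # momentum/extrapolation
    x_old <- x
    x <- drop(soft(y - alpha * (A %*% y - d), alpha * lam))  # prox step
    t_old <- t_new
    if (sqrt(sum((x - x_old)^2)) < tol) break      # change-based stopping
  }
  list(x = x, k = k)
}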
Accelerated Proximal Gradient (with backtracking) on l1 regularized quadratic program
Description
Applies accelerated proximal gradient algorithm (with backtracking) to the l1-regularized quadratic program
f(\mathbf{x}) + g(\mathbf{x}) = \frac{1}{2}\mathbf{x}^TA\mathbf{x} - d^T\mathbf{x} + \lambda |\mathbf{x}|_1
Usage
APG_EN2bt(
A,
Xt,
Om,
gamma,
d,
x0,
lam,
L,
eta,
maxits,
tol,
selector = rep(1, dim(x0)[1])
)
Arguments
A |
p by p positive definite coefficient matrix. |
Xt |
Same as X above; passed separately to make calculations faster. |
Om |
Same reason as for the above parameter. |
gamma |
l2 regularizing parameter. |
d |
n by 1 dimensional column vector. |
lam |
Regularization parameter for l1 penalty, must be greater than zero. |
L |
Initial value of the backtracking Lipschitz constant. |
eta |
Backtracking scaling parameter. |
maxits |
Number of iterations to run. |
tol |
Stopping tolerance for proximal gradient algorithm. |
selector |
Vector to choose which parameters in the discriminant vector will be used to calculate the regularization terms. The size of the vector must be *p*, the number of predictors. The default value is a vector of all ones. This is currently only used for ordinal classification. |
Details
This function is used by other functions and should only be called explicitly for debugging purposes.
Value
APG_EN2bt returns an object of class "APG_EN2bt" including a list with the following named components:
call
The matched call.
x
Found solution.
k
Number of iterations used.
See Also
Used by: SDAAP and the SDAAPcv cross-validation version.
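The backtracking rule itself is simple; a hedged sketch of one backtracked proximal step follows (illustrative names, not the package internals): L is inflated by eta until the quadratic model at y upper-bounds the smooth part at the prox point.
# Illustration of a single backtracked proximal gradient step for
# f(x) = 0.5*x'Ax - d'x plus lam*||x||_1.
bt_prox_step <- function(y, A, d, lam, L, eta) {
  soft <- function(v, a) sign(v) * pmax(abs(v) - a, 0)
  f <- function(x) 0.5 * sum(x * (A %*% x)) - sum(d * x)
  grad <- drop(A %*% y - d)
  repeat {
    x <- soft(y - grad / L, lam / L)
    # Quadratic upper model of f at y, evaluated at the candidate x
    Q <- f(y) + sum(grad * (x - y)) + (L / 2) * sum((x - y)^2)
    if (f(x) <= Q) break
    L <- L * eta                  # backtrack: increase Lipschitz estimate
  }
  list(x = x, L = L)
}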
Accelerated Proximal Gradient on l1 regularized quadratic program
Description
Applies accelerated proximal gradient algorithm to the l1-regularized quadratic program (with rank reduced Omega inside A)
f(\mathbf{x}) + g(\mathbf{x}) = \frac{1}{2}\mathbf{x}^TA\mathbf{x} - d^T\mathbf{x} + \lambda |\mathbf{x}|_1
Usage
APG_EN2rr(A, d, x0, lam, alpha, maxits, tol, selector = rep(1, dim(x0)[1]))
Arguments
A |
Object containing everything needed for calculating A; X and Omega are factored. |
d |
n by 1 dimensional column vector. |
lam |
Regularization parameter for l1 penalty, must be greater than zero. |
alpha |
Step length. |
maxits |
Number of iterations to run. |
tol |
Stopping tolerance for proximal gradient algorithm. |
selector |
Vector to choose which parameters in the discriminant vector will be used to calculate the regularization terms. The size of the vector must be *p*, the number of predictors. The default value is a vector of all ones. This is currently only used for ordinal classification. |
Details
This function is used by other functions and should only be called explicitly for debugging purposes.
Value
APG_EN2rr returns an object of class "APG_EN2rr" including a list with the following named components:
call
The matched call.
x
Found solution.
k
Number of iterations used.
See Also
Used by: SDAAP and the SDAAPcv cross-validation version.
Accelerated Sparse Discriminant Analysis
Description
Applies the accelerated proximal gradient algorithm, the proximal gradient algorithm or the alternating direction method of multipliers to the optimal scoring formulation of sparse discriminant analysis proposed by Clemmensen et al. (2011).
\underset{\beta,\theta}{\mathrm{argmin}}\{|Y_t\theta - X_t\beta|_2^2 + \lambda|\beta|_1 + \gamma\beta^T\Omega\beta\}
Usage
ASDA(Xt, ...)
## Default S3 method:
ASDA(
Xt,
Yt,
Om = diag(p),
gam = 0.001,
lam = 1e-06,
q = K - 1,
method = "SDAAP",
control = list(),
...
)
Arguments
Xt |
n by p data matrix (can also be a data.frame that can be coerced to a matrix) |
... |
Additional arguments for |
Yt |
n by K matrix of indicator variables (Yij = 1 if i is in class j). This will later be changed to handle factor variables as well. Each observation belongs to a single class, so for a given row/observation, only one element is 1 and the rest are 0. |
Om |
p by p parameter matrix Omega in generalized elastic net penalty. |
gam |
Regularization parameter for elastic net penalty. |
lam |
Regularization parameter for l1 penalty, must be greater than zero. If cross-validation is used (CV = TRUE in the control list), then lam should be a vector of the values to be tested. |
q |
Desired number of discriminant vectors. |
method |
This parameter selects which optimization method to use. It is specified as a character vector which can be one of the three values "SDAAP" (accelerated proximal gradient), "SDAP" (proximal gradient) or "SDAD" (ADMM). Note that further parameters are passed to the function in the control argument. |
control |
List of control arguments. See Details. |
Details
The control list contains the following entries to further tune the algorithms.
PGsteps
Maximum number of inner proximal gradient/ADMM iterations for finding beta. Default value is 1000.
PGtol
Stopping tolerance for the inner method. If the method is SDAD, then this must be a vector of two values, absolute (first element) and relative tolerance (second element). Default value is 1e-5 for both absolute and relative tolerances.
maxits
Number of iterations to run. Default value is 250.
tol
Stopping tolerance. Default value is 1e-3.
mu
Penalty parameter for the augmented Lagrangian term; must be greater than zero and only needs to be specified when using method SDAD. Default value is 1.
CV
Logical value which is TRUE if cross-validation is to be performed. If cross-validation is performed, then lam should be specified as a vector containing the regularization values to be tested. Default value is FALSE.
folds
Integer determining the number of folds in cross-validation. Not needed if CV is not specified. Default value is 5.
feat
Maximum fraction of nonzero features desired in the validation scheme. Not needed if CV is not specified. Default value is 0.15.
quiet
Set to FALSE if status updates are supposed to be printed to the R console. Default value is TRUE. Note that setting it to FALSE triggers a lot of printing to the console.
ordinal
Set to TRUE if the labels are ordinal. Only available for methods SDAAP and SDAD.
initTheta
Option to set the initial theta vector; by default it is a vector of all ones for the first theta.
bt
Logical indicating whether backtracking should be used; only applies to the proximal gradient based methods. By default, backtracking is not used.
L
Initial estimate for the Lipschitz constant used for backtracking. Default value is 0.25.
eta
Scaling factor for the Lipschitz constant. Default value is 1.25.
rankRed
Logical indicating whether Om is factorized such that R^T R = Om; currently only applicable to the accelerated proximal gradient method.
Value
ASDA returns an object of class "ASDA" including a list with the following named components:
call
The matched call.
B
p by q matrix of discriminant vectors, i.e. sparse loadings.
Q
K by q matrix of scoring vectors, i.e. optimal scores.
varNames
Names of the predictors used, i.e. column names of Xt.
origP
Number of variables in Xt.
fit
Output from the function lda on the projected data. This is NULL if the trivial solution is found, i.e. B is all zeroes. Use lower values of lam if that is the case.
classes
The classes in Yt.
lambda
The lambda/lam used; the best value found by cross-validation if CV is TRUE.
Note
The input matrix Xt should be normalized, i.e. each column corresponding to a variable should have its mean subtracted and be scaled to unit length. The functions normalize and normalizetest are supplied for this purpose in the package.
Examples
set.seed(123)
# Prepare training and test set
train <- c(1:40,51:90,101:140)
Xtrain <- iris[train,1:4]
nX <- normalize(Xtrain)
Xtrain <- nX$Xc
Ytrain <- iris[train,5]
Xtest <- iris[-train,1:4]
Xtest <- normalizetest(Xtest,nX)
Ytest <- iris[-train,5]
# Define parameters for Alternating Direction Method of Multipliers (SDAD)
Om <- diag(4)+0.1*matrix(1,4,4) #elNet coef mat
gam <- 0.0001
lam <- 0.0001
method <- "SDAD"
q <- 2
control <- list(PGsteps = 100,
PGtol = c(1e-5,1e-5),
mu = 1,
maxits = 100,
tol = 1e-3,
quiet = FALSE)
# Run the algorithm
res <- ASDA(Xt = Xtrain,
Yt = Ytrain,
Om = Om,
gam = gam,
lam = lam,
q = q,
method = method,
control = control)
# Can also just use the defaults, which is Accelerated Proximal Gradient (SDAAP):
resDef <- ASDA(Xtrain,Ytrain)
# Some example on simulated data
# Generate Gaussian data on three classes with plenty of redundant variables
# This example shows the basic steps on how to apply this to data, i.e.:
# 1) Setup training data
# 2) Normalize
# 3) Train
# 4) Predict
# 5) Plot projected data
# 6) Accuracy on test set
P <- 300 # Number of variables
N <- 50 # Number of samples per class
# Mean for classes, they are zero everywhere except the first 3 coordinates
m1 <- rep(0,P)
m1[1] <- 3
m2 <- rep(0,P)
m2[2] <- 3
m3 <- rep(0,P)
m3[3] <- 3
# Sample dummy data
Xtrain <- rbind(MASS::mvrnorm(n=N,mu = m1, Sigma = diag(P)),
MASS::mvrnorm(n=N,mu = m2, Sigma = diag(P)),
MASS::mvrnorm(n=N,mu = m3, Sigma = diag(P)))
Xtest <- rbind(MASS::mvrnorm(n=N,mu = m1, Sigma = diag(P)),
MASS::mvrnorm(n=N,mu = m2, Sigma = diag(P)),
MASS::mvrnorm(n=N,mu = m3, Sigma = diag(P)))
# Generate the labels
Ytrain <- factor(rep(1:3,each=N))
Ytest <- Ytrain
# Normalize the data
Xt <- accSDA::normalize(Xtrain)
Xtrain <- Xt$Xc # Use the centered and scaled data
Xtest <- accSDA::normalizetest(Xtest,Xt)
# Train the classifier and increase the sparsity parameter from the default
# so we penalize more for non-sparse solutions.
res <- accSDA::ASDA(Xtrain,Ytrain,lam=0.01)
# Plot the projected training data; it is projected to
# 2 dimensions because we have 3 classes. The number of discriminant
# vectors is at most the number of classes minus 1.
XtrainProjected <- Xtrain%*%res$B
plot(XtrainProjected[,1],XtrainProjected[,2],col=Ytrain)
# Predict on the test data
preds <- predict(res, newdata = Xtest)
# Plot projected test data with predicted and correct labels
XtestProjected <- Xtest%*%res$B
plot(XtestProjected[,1],XtestProjected[,2],col=Ytest,
main="Projected test data with original labels")
plot(XtestProjected[,1],XtestProjected[,2],col=preds$class,
main="Projected test data with predicted labels")
# Calculate accuracy
sum(preds$class == Ytest)/(3*N) # We have N samples per class, so total 3*N
barplot for ASDA objects
Description
This is a function to visualize the discriminant vectors from the ASDA method. The plot is constructed as a ggplot barplot and its main purpose is to visually inspect the sparsity of the discriminant vectors. The main things to look for are how many parameters are non-zero and whether there is any structure in the ones that are non-zero; note that the structure depends on the order in which you specify your variables. For time-series data this could mean that a chunk of variables that are close in time are non-zero, indicating some particular event that is best for discriminating between the classes that you have.
Usage
ASDABarPlot(asdaObj, numDVs = 1, xlabel, ylabel, getList = FALSE, main, ...)
Arguments
asdaObj |
Object from the ASDA function. |
numDVs |
Number of discriminant vectors (DVs) to plot. This is limited by the number of DVs outputted from the ASDA function. |
xlabel |
Label to put under every plot |
ylabel |
Vector of y-axis labels for each plot, e.g. if there are three DVs, then three labels are needed. |
getList |
Logical value indicating whether the output should be a list of the plots or the plots stacked in one plot using the gridExtra package. By default the function produces a single plot combining all plots of the DVs. |
main |
Main title for the plots; this is not used if getList is set to TRUE. |
... |
Extra arguments to |
Value
barplot.ASDA
returns either a single combined plot or a list of
individual ggplot objects.
Note
This function is used as a quick diagnostics tool for the output from the ASDA function. Feel free to look at the code to customize the plots in any way you like.
Examples
# Generate an ASDA object with your data, e.g.
# Prepare training and test set
# This is a very small data set, I advise you to try it on something with more
# variables, e.g. something from this source: http://www.cs.ucr.edu/~eamonn/time_series_data/
# or possibly run this on the Gaussian data example from the ASDA function
train <- c(1:40,51:90,101:140)
Xtrain <- iris[train,1:4]
nX <- normalize(Xtrain)
Xtrain <- nX$Xc
Ytrain <- iris[train,5]
Xtest <- iris[-train,1:4]
Xtest <- normalizetest(Xtest,nX)
Ytest <- iris[-train,5]
# Run the method
resIris <- ASDA(Xtrain,Ytrain)
# Look at the barplots of the DVs
ASDABarPlot(resIris)
Sparse Discriminant Analysis solved via Accelerated Proximal Gradient
Description
Applies the accelerated proximal gradient algorithm to the optimal scoring formulation of sparse discriminant analysis proposed by Clemmensen et al. (2011).
Usage
SDAAP(Xt, ...)
## Default S3 method:
SDAAP(
Xt,
Yt,
Om,
gam,
lam,
q,
PGsteps,
PGtol,
maxits,
tol,
selector = rep(1, dim(Xt)[2]),
initTheta,
bt = FALSE,
L,
eta,
rankRed = FALSE,
...
)
Arguments
Xt |
n by p data matrix (not a data frame, but a matrix) |
Yt |
n by K matrix of indicator variables (Yij = 1 if i is in class j). This will later be changed to handle factor variables as well. Each observation belongs to a single class, so for a given row/observation, only one element is 1 and the rest are 0. |
Om |
p by p parameter matrix Omega in generalized elastic net penalty. |
gam |
Regularization parameter for elastic net penalty. |
lam |
Regularization parameter for l1 penalty, must be greater than zero. |
q |
Desired number of discriminant vectors. |
PGsteps |
Maximum number of inner proximal gradient iterations for finding beta. |
PGtol |
Stopping tolerance for inner APG method. |
maxits |
Number of iterations to run. |
tol |
Stopping tolerance for proximal gradient algorithm. |
selector |
Vector to choose which parameters in the discriminant vector will be used to calculate the regularization terms. The size of the vector must be *p*, the number of predictors. The default value is a vector of all ones. This is currently only used for ordinal classification. |
initTheta |
Option to set the initial theta vector, by default it is a vector of all ones for the first theta. |
bt |
Logical indicating whether backtracking should be used; default FALSE. |
L |
Initial estimate for the Lipschitz constant used for backtracking. |
eta |
Scaling factor for the Lipschitz constant. |
rankRed |
Logical indicating whether Om is in factorized form, such that R^T R = Om. |
Value
SDAAP returns an object of class "SDAAP" including a list with the following named components:
call
The matched call.
B
p by q matrix of discriminant vectors.
Q
K by q matrix of scoring vectors.
subits
Total number of iterations in proximal gradient subroutine.
totalits
Number of coordinate descent iterations for all discriminant vectors.
Sparse Discriminant Analysis solved via ADMM
Description
Applies the alternating direction method of multipliers algorithm to the optimal scoring formulation of sparse discriminant analysis proposed by Clemmensen et al. (2011).
Usage
SDAD(Xt, ...)
## Default S3 method:
SDAD(
Xt,
Yt,
Om,
gam,
lam,
mu,
q,
PGsteps,
PGtol,
maxits,
tol,
selector = rep(1, dim(Xt)[2]),
initTheta,
...
)
Arguments
Xt |
n by p data matrix (not a data frame, but a matrix) |
Yt |
n by K matrix of indicator variables (Yij = 1 if i is in class j). This will later be changed to handle factor variables as well. Each observation belongs to a single class, so for a given row/observation, only one element is 1 and the rest are 0. |
Om |
p by p parameter matrix Omega in generalized elastic net penalty. |
gam |
Regularization parameter for elastic net penalty. |
lam |
Regularization parameter for l1 penalty, must be greater than zero. |
mu |
Penalty parameter for augmented Lagrangian term, must be greater than zero. |
q |
Desired number of discriminant vectors. |
PGsteps |
Maximum number of inner proximal gradient iterations for finding beta. |
PGtol |
Two stopping tolerances for inner ADMM method, first is absolute tolerance, second is relative. |
maxits |
Number of iterations to run. |
tol |
Stopping tolerance for proximal gradient algorithm. |
selector |
Vector to choose which parameters in the discriminant vector will be used to calculate the regularization terms. The size of the vector must be *p*, the number of predictors. The default value is a vector of all ones. This is currently only used for ordinal classification. |
initTheta |
Initial first theta, default value is a vector of ones. |
Value
SDAD returns an object of class "SDAD" including a list with the following named components (more will be added later to handle the predict function):
call
The matched call.
B
p by q matrix of discriminant vectors.
Q
K by q matrix of scoring vectors.
subits
Total number of iterations in proximal gradient subroutine.
totalits
Number of coordinate descent iterations for all discriminant vectors.
Sparse Discriminant Analysis solved via Proximal Gradient
Description
Applies the proximal gradient algorithm to the optimal scoring formulation of sparse discriminant analysis proposed by Clemmensen et al. (2011).
Usage
SDAP(Xt, ...)
## Default S3 method:
SDAP(
Xt,
Yt,
Om,
gam,
lam,
q,
PGsteps,
PGtol,
maxits,
tol,
initTheta,
bt = FALSE,
L,
eta,
...
)
Arguments
Xt |
n by p data matrix (not a data frame, but a matrix) |
Yt |
n by K matrix of indicator variables (Yij = 1 if i is in class j). This will later be changed to handle factor variables as well. Each observation belongs to a single class, so for a given row/observation, only one element is 1 and the rest are 0. |
Om |
p by p parameter matrix Omega in generalized elastic net penalty. |
gam |
Regularization parameter for elastic net penalty. |
lam |
Regularization parameter for l1 penalty, must be greater than zero. |
q |
Desired number of discriminant vectors. |
PGsteps |
Maximum number of inner proximal gradient iterations for finding beta. |
PGtol |
Stopping tolerance for the inner proximal gradient method. |
maxits |
Number of iterations to run. |
tol |
Stopping tolerance for proximal gradient algorithm. |
initTheta |
Initial first theta, default value is a vector of ones. |
bt |
Logical indicating whether backtracking should be used; default FALSE. |
L |
Initial estimate for the Lipschitz constant used for backtracking. |
eta |
Scaling factor for the Lipschitz constant. |
Value
SDAP returns an object of class "SDAP" including a list with the following named components (more will be added later to handle the predict function):
call
The matched call.
B
p by q matrix of discriminant vectors.
Q
K by q matrix of scoring vectors.
subits
Total number of iterations in proximal gradient subroutine.
totalits
Number of coordinate descent iterations for all discriminant vectors.
Sparse Zero Variance Discriminant Analysis
Description
Applies the SZVD heuristic for sparse zero-variance discriminant analysis to a given training set.
Usage
SZVD(train, ...)
## Default S3 method:
SZVD(
train,
gamma,
D,
penalty = TRUE,
scaling = TRUE,
tol = list(abs = 1e-04, rel = 1e-04),
maxits = 2000,
beta = 1,
quiet = TRUE,
...
)
Arguments
train |
Data matrix where first column is the response class. |
... |
Parameters passed to SZVD.default. |
gamma |
Set of regularization parameters controlling l1-penalty. |
D |
Dictionary/basis matrix. |
penalty |
Controls whether to apply reweighting of l1-penalty (using sigma = within-class std devs). |
scaling |
Logical indicating whether to scale data such that each feature has variance 1. |
tol |
Stopping tolerances for ADMM algorithm, must include tol$rel and tol$abs. |
maxits |
Maximum number of iterations used in the ADMM algorithm. |
beta |
Penalty term controlling the splitting constraint. |
quiet |
Print intermediate output or not. |
Details
This function currently runs as a standalone function in accSDA for timing comparisons. A wrapper function like ASDA may be created later to provide the plotting and related functionality, possibly called ASZDA; for that purpose the individual ZVD function will need to be implemented.
Value
SZVD returns an object of class "SZVD" including a list with the following named components:
DVs
Discriminant vectors.
its
Number of iterations required to find DVs.
pen_scal
Weights used in reweighted l1-penalty.
N
Basis for the null-space of the sample within-class covariance.
means
Training class-means.
mus
Training mean and variance scaling/centering terms.
w0
Unpenalized zero-variance discriminants (initial solutions) plus B and W, etc.
See Also
Used by: SZVDcv.
Examples
set.seed(123)
P <- 300 # Number of variables
N <- 50 # Number of samples per class
# Mean for classes, they are zero everywhere except the first 3 coordinates
m1 <- rep(0,P)
m1[1] <- 3
m2 <- rep(0,P)
m2[2] <- 3
m3 <- rep(0,P)
m3[3] <- 3
# Sample dummy data
Xtrain <- rbind(MASS::mvrnorm(n=N,mu = m1, Sigma = diag(P)),
MASS::mvrnorm(n=N,mu = m2, Sigma = diag(P)),
MASS::mvrnorm(n=N,mu = m3, Sigma = diag(P)))
# Generate the labels
Ytrain <- rep(1:3,each=N)
# Normalize the data
Xt <- accSDA::normalize(Xtrain)
Xtrain <- Xt$Xc
# Train the classifier and increase the sparsity parameter from the default
# so we penalize more for non-sparse solutions.
res <- accSDA::SZVD(cbind(Ytrain,Xtrain),beta=2.5,
maxits=1000,tol = list(abs = 1e-04, rel = 1e-04))
Alternating Direction Method of Multipliers for SZVD
Description
Iteratively solves the problem
\min\{-\frac{1}{2}\mathbf{x}^TB^T\mathbf{x} + \gamma p(\mathbf{y}) : |\mathbf{x}|_2 \leq 1, DN\mathbf{x} = \mathbf{y}\}
Usage
SZVD_ADMM(B, N, D, sols0, pen_scal, gamma, beta, tol, maxits, quiet = TRUE)
Arguments
B |
Between class covariance matrix for objective (in space defined by N). |
N |
Basis matrix for the null space of the covariance matrix W. |
D |
Penalty dictionary/basis. |
sols0 |
Initial solutions: sols0$x, sols0$y, sols0$z. |
pen_scal |
Penalty scaling term. |
gamma |
l1 regularization parameter. |
beta |
Penalty term controlling the splitting constraint. |
tol |
tol$abs = absolute error, tol$rel = relative error to be achieved to declare convergence of the algorithm. |
maxits |
Maximum number of iterations of the algorithm to run. |
quiet |
Toggles display of intermediate statistics. |
Details
This function is used by other functions and should only be called explicitly for debugging purposes.
Value
SZVD_ADMM returns an object of class "SZVD_ADMM" including a list with the following named components:
x,y,z
Iterates at termination.
its
Number of iterations required to converge.
errtol
Stopping error bound at termination.
See Also
Used by: SZVDcv.
Cross-validation of sparse zero variance discriminant analysis
Description
Applies the alternating direction method of multipliers to solve sparse zero variance discriminant analysis.
Usage
SZVD_kFold_cv(X, ...)
## Default S3 method:
SZVD_kFold_cv(
X,
Y,
folds,
gams,
beta,
D,
q,
maxits,
tol,
ztol,
feat,
penalty,
quiet,
...
)
Arguments
X |
n by p data matrix; variables should be scaled to unit standard deviation. |
... |
Parameters passed to SZVD.default. |
Y |
n by K indicator matrix. |
folds |
Number of folds to use in K-fold cross-validation. |
gams |
Number of regularly spaced regularization parameters to try in [0,1]*max_gamma. See details for how max_gamma is computed in the function. |
beta |
Augmented Lagrangian parameter. Must be greater than zero. |
D |
Penalty dictionary basis matrix. |
q |
Desired number of discriminant vectors. |
maxits |
Number of iterations to run ADMM algorithm. |
tol |
Stopping tolerances for ADMM, must have tol$rel and tol$abs. |
ztol |
Rounding tolerance for truncating entries to 0. |
feat |
Maximum fraction of nonzero features desired in validation scheme. |
penalty |
Controls whether to apply reweighting of l1-penalty (using sigma = within-class std devs). |
quiet |
Toggles display of intermediate statistics. |
Details
How max_gamma is calculated from the ZVD solution remains to be documented here. This function might require a wrapper similar to ASDA.
Value
SZVDcv returns an object of class "SZVDcv" including a list with the named components DVs and gambest, where DVs are the discriminant vectors for the best l1 regularization parameter and gambest is the best regularization parameter found in the cross-validation.
See Also
Used by: SZVDcv.
Examples
P <- 150 # Number of variables
N <- 20 # Number of samples per class
# Mean for classes, they are zero everywhere except the first 3 coordinates
m1 <- rep(0,P)
m1[1] <- 3
m2 <- rep(0,P)
m2[2] <- 3
m3 <- rep(0,P)
m3[3] <- 3
# Sample dummy data
Xtrain <- rbind(MASS::mvrnorm(n=N,mu = m1, Sigma = diag(P)),
MASS::mvrnorm(n=N,mu = m2, Sigma = diag(P)),
MASS::mvrnorm(n=N,mu = m3, Sigma = diag(P)))
# Generate the labels
Ytrain <- cbind(c(rep(1,N),rep(0,2*N)),
c(rep(0,N),rep(1,N),rep(0,N)),
c(rep(0,2*N),rep(1,N)))
# Normalize the data
Xt <- accSDA::normalize(Xtrain)
Xtrain <- Xt$Xc
# Train the classifier and increase the sparsity parameter from the default
# so we penalize more for non-sparse solutions.
res <- accSDA::SZVD_kFold_cv(Xtrain,Ytrain,folds=2,gams=2,beta=2.5,q=1, D=diag(P),
maxits=50,tol=list(abs=1e-2,rel=1e-2),
ztol=1e-3,feat=0.3,quiet=FALSE,penalty=TRUE)
Cross-validation of sparse zero variance discriminant analysis
Description
Applies the alternating direction method of multipliers to solve sparse zero variance discriminant analysis.
Usage
SZVDcv(Atrain, ...)
## Default S3 method:
SZVDcv(
Atrain,
Aval,
k,
num_gammas,
g_mults,
D,
sparsity_pen,
scaling,
penalty,
beta,
tol,
ztol,
maxits,
quiet,
...
)
Arguments
Atrain |
Training data set. |
... |
Parameters passed to SZVD.default. |
Aval |
Validation set. |
k |
Number of classes within training and validation sets. |
num_gammas |
Number of gammas to train on. |
g_mults |
Parameters defining range of gammas to train, g_max*(c_min, c_max). Note that it is an array/vector with two elements. |
D |
Penalty dictionary basis matrix. |
sparsity_pen |
Weight defining the validation criterion as a weighted sum of misclassification error and cardinality of the discriminant vectors. |
scaling |
Whether to rescale data so each feature has variance 1. |
penalty |
Controls whether to apply reweighting of l1-penalty (using sigma = within-class std devs). |
beta |
Parameter for augmented Lagrangian term in the ADMM algorithm. |
tol |
Stopping tolerances for the ADMM algorithm, must have tol$rel and tol$abs. |
ztol |
Threshold for truncating values in DVs to zero. |
maxits |
Maximum number of iterations used in the ADMM algorithm. |
quiet |
Controls display of intermediate results. |
Details
This function might require a wrapper similar to ASDA.
Value
SZVDcv returns an object of class "SZVDcv" including a list with the following named components:
DVs
Discriminant vectors for the best choice of gamma.
all_DVs
Discriminant vectors for all choices of gamma.
l0_DVs
Discriminant vectors for gamma minimizing cardinality.
mc_DVs
Discriminant vector minimizing misclassification.
gamma
Choice of gamma minimizing validation criterion.
gammas
Set of all gammas trained on.
max_g
Maximum value of gamma guaranteed to yield a nontrivial solution.
ind
Index of best gamma.
w0
Unpenalized zero-variance discriminants (initial solutions) plus B and W, etc. from ZVD.
See Also
Non-CV version: SZVD.
Examples
P <- 300 # Number of variables
N <- 50 # Number of samples per class
# Mean for classes, they are zero everywhere except the first 3 coordinates
m1 <- rep(0,P)
m1[1] <- 3
m2 <- rep(0,P)
m2[2] <- 3
m3 <- rep(0,P)
m3[3] <- 3
# Sample dummy data
Xtrain <- rbind(MASS::mvrnorm(n=N,mu = m1, Sigma = diag(P)),
MASS::mvrnorm(n=N,mu = m2, Sigma = diag(P)),
MASS::mvrnorm(n=N,mu = m3, Sigma = diag(P)))
Xval <- rbind(MASS::mvrnorm(n=N,mu = m1, Sigma = diag(P)),
MASS::mvrnorm(n=N,mu = m2, Sigma = diag(P)),
MASS::mvrnorm(n=N,mu = m3, Sigma = diag(P)))
# Generate the labels
Ytrain <- rep(1:3,each=N)
Yval <- rep(1:3,each=N)
# Train the classifier and increase the sparsity parameter from the default
# so we penalize more for non-sparse solutions.
res <- accSDA::SZVDcv(cbind(Ytrain,Xtrain),cbind(Yval,Xval),num_gammas=4,
g_mults = c(0,1),beta=2.5,
D=diag(P), maxits=100,tol=list(abs=1e-3,rel=1e-3), k = 3,
ztol=1e-4,sparsity_pen=0.3,quiet=FALSE,penalty=TRUE,scaling=TRUE)
Zero Variance Discriminant Analysis
Description
Implements the ZVD algorithm to solve for discriminant vectors.
Usage
ZVD(A, ...)
## Default S3 method:
ZVD(A, scaling = FALSE, get_DVs = FALSE, ...)
Arguments
A |
Matrix, where first column corresponds to class labels. |
... |
Parameters passed to ZVD.default. |
scaling |
Logical whether to rescale data so each feature has variance 1. |
get_DVs |
Logical whether to obtain unpenalized zero-variance discriminant vectors. |
Details
This function should potentially be made internal for the release.
Value
ZVD returns an object of class "ZVD" including a list with the following named components:
dvs
Discriminant vectors (optional).
B
Sample between-class covariance.
W
Sample within-class covariance.
N
Basis for the null space of the sample within-class covariance.
mu
Training mean and variance scaling/centering terms.
means
Vectors of sample class-means.
k
Number of classes in the given data set.
labels
List of classes.
obs
Matrix of data observations.
class_obs
Matrices of observations of each class.
See Also
Used by: SZVDcv.
Examples
# Generate Gaussian data on three classes with a bunch of redundant variables
P <- 300 # Number of variables
N <- 50 # Number of samples per class
# Mean for classes, they are zero everywhere except the first 3 coordinates
m1 <- rep(0,P)
m1[1] <- 3
m2 <- rep(0,P)
m2[2] <- 3
m3 <- rep(0,P)
m3[3] <- 3
# Sample dummy data
Xtrain <- rbind(MASS::mvrnorm(n=N,mu = m1, Sigma = diag(P)),
MASS::mvrnorm(n=N,mu = m2, Sigma = diag(P)),
MASS::mvrnorm(n=N,mu = m3, Sigma = diag(P)))
# Generate the labels
Ytrain <- rep(1:3,each=N)
# Normalize the data
Xt <- accSDA::normalize(Xtrain)
Xtrain <- Xt$Xc
# Train the classifier and increase the sparsity parameter from the default
# so we penalize more for non-sparse solutions.
res <- accSDA::ZVD(cbind(Ytrain,Xtrain))
accSDA: A package for performing sparse discriminant analysis in various ways.
Description
The accSDA package provides functions to perform sparse discriminant analysis using a selection of three optimization methods: proximal gradient (PG), accelerated proximal gradient (APG) and the alternating direction method of multipliers (ADMM). The package is intended to extend the available tools for performing sparse discriminant analysis in R. The three methods can be called from the function ASDA. Cross-validation is also implemented for the l1 regularization parameter. Functions for doing predictions, summary, printing and simple plotting are also provided. The sparse discriminant functions perform lda on the projected data by default, using the lda function in the MASS package. The functions return an object of the same class as the name of the function and provide the lda solution, along with the projected data, so that other kinds of classification algorithms can be employed on the projected data.
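A compact end-to-end call sequence, assembled from the iris examples given with ASDA and predict.ASDA elsewhere in this manual:
# Normalize, train with the default method ("SDAAP"), and evaluate:
train <- c(1:40, 51:90, 101:140)
nX <- accSDA::normalize(iris[train, 1:4])
fit <- accSDA::ASDA(nX$Xc, iris[train, 5])
Xtest <- accSDA::normalizetest(iris[-train, 1:4], nX)
mean(predict(fit, newdata = Xtest)$class == iris[-train, 5])  # test accuracy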
Generate data for ordinal examples in the package
Description
Given the parameters, the function creates a dataset for testing the ordinal functionality of the package. The data consists of samples from multivariate Gaussians with different means, where the mean varies along a sinusoidal curve w.r.t. the class label.
Usage
genDat(numClasses, numObsPerClass, mu, sigma)
Arguments
numClasses |
Positive integer specifying the number of classes for the dataset. |
numObsPerClass |
Number of observations sampled per class. |
mu |
Mean of the first class. |
sigma |
2 by 2 covariance matrix. |
Details
This function is used to demonstrate the usage of the ordinal classifier.
Value
genDat returns a list with the following attributes:
X
A matrix with two columns and numObsPerClass*numClasses rows.
Y
Labels for the rows of X.
Author(s)
Gudmundur Einarsson
Examples
set.seed(123)
# You can play around with these values to generate some 2D data to test on
numClasses <- 15
sigma <- matrix(c(1,-0.2,-0.2,1),2,2)
mu <- c(0,0)
numObsPerClass <- 5
# Generate the data, can access with train$X and train$Y
train <- accSDA::genDat(numClasses,numObsPerClass,mu,sigma)
test <- accSDA::genDat(numClasses,numObsPerClass*2,mu,sigma)
# Visualize it, only using the first variable gives very good separation
plot(train$X[,1],train$X[,2],col = factor(train$Y),asp=1,main="Training Data")
Normalize training data
Description
Normalize a vector or matrix to zero mean and unit length columns.
Usage
normalize(X)
Arguments
X |
a matrix with the training data with observations down the rows and variables in the columns. |
Details
This function can e.g. be used for the training data in the ASDA function.
Value
normalize returns a list with the following attributes:
Xc
The normalized data.
mx
Mean of the columns of X.
vx
Length of the columns of X.
Id
Logical vector indicating which variables are included in X. If some of the columns have zero length they are omitted.
Author(s)
Line Clemmensen
References
Clemmensen, L., Hastie, T. and Ersboell, K. (2008) "Sparse discriminant analysis", Technical report, IMM, Technical University of Denmark
See Also
normalizetest, predict.ASDA, ASDA
Examples
## Data
X<-matrix(sample(seq(3),12,replace=TRUE),nrow=3)
## Normalize data
Nm<-normalize(X)
print(Nm$Xc)
## See if any variables have been removed
which(!Nm$Id)
Normalize test data
Description
Normalize test data using the output from normalize() applied to the training data.
Usage
normalizetest(Xtst, Xn)
Arguments
Xtst |
a matrix with the test data with observations down the rows and variables in the columns. |
Xn |
List with the output from normalize(Xtr) of the training data. |
Details
This function can e.g. be used for the test data in the predict.ASDA function.
Value
normalizetest returns the normalized test data Xtst.
Author(s)
Line Clemmensen
References
Clemmensen, L., Hastie, T. and Ersboell, K. (2008) "Sparse discriminant analysis", Technical report, IMM, Technical University of Denmark
Examples
## Data
Xtr<-matrix(sample(seq(3),12,replace=TRUE),nrow=3)
Xtst<-matrix(sample(seq(3),12,replace=TRUE),nrow=3)
## Normalize training data
Nm<-normalize(Xtr)
## Normalize test data
Xtst<-normalizetest(Xtst,Nm)
Finding null space of linear operator
Description
Finds the null space of a linear operator A in R^{n \times m}. The null space is given as a matrix whose columns form an orthonormal basis for the null space. This function emulates the null function in MATLAB; it works the same way, but the basis vectors may be different, i.e. rotated.
Usage
nullSp(A)
Arguments
A |
m by n matrix |
Details
This function is used by other functions and should only be called explicitly for debugging purposes.
Value
nullSp
returns a matrix whose columns span the nullspace of A.
See Also
Alternative: the Null function in the MASS package.
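For illustration, here is a small hypothetical sketch (not the package's internal code) of how such a basis can be obtained with the SVD, mirroring MATLAB's null(): the right singular vectors whose singular values are numerically zero.
# Illustrative null-space computation via the SVD.
null_sp_sketch <- function(A, tol = max(dim(A)) * .Machine$double.eps) {
  s <- svd(A, nv = ncol(A))
  rank <- sum(s$d > tol * max(s$d))     # numerical rank of A
  if (rank == ncol(A)) return(matrix(0, ncol(A), 0))  # trivial null space
  s$v[, (rank + 1):ncol(A), drop = FALSE]
}
A <- cbind(c(1, 2), c(2, 4))            # rank-1 matrix
N <- null_sp_sketch(A)
max(abs(A %*% N))                       # columns lie in the null space of A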
Ordinal Accelerated Sparse Discriminant Analysis
Description
Applies an accelerated proximal gradient algorithm to the optimal scoring formulation of sparse discriminant analysis proposed by Clemmensen et al. (2011). The problem is further cast to a binary classification problem, as described in "Learning to Classify Ordinal Data: The Data Replication Method" by Cardoso and da Costa, to handle the ordinal labels. This function serves as a wrapper for the ASDA function, where the appropriate data augmentation is performed. Since the problem is cast into a binary classification problem, only a single discriminant vector comes from the result. The first *p* entries correspond to the variables/coefficients for the predictors, while the following K-1 entries correspond to biases for the found hyperplane, to separate the classes. The resulting object is of class ordASDA and has an accompanying predict function. The paper by Cardoso and da Costa can be found here: (http://www.jmlr.org/papers/volume8/cardoso07a/cardoso07a.pdf). A hedged sketch of the replication step is given below.
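Here is a minimal, hypothetical sketch of that replication step (illustrative only; the names and details are not the package's internals, and the s argument, which limits how many neighboring classes are copied per cutpoint, is omitted):
# Illustration of the data replication idea for ordinal labels y in 1..K:
# each of the K-1 cutpoints yields a binary copy of the data, and K-1
# indicator columns become the per-cutpoint biases of one shared hyperplane.
replicate_ordinal <- function(X, y, K) {
  reps <- lapply(1:(K - 1), function(cut) {
    bias <- matrix(0, nrow(X), K - 1)
    bias[, cut] <- 1                              # indicator for this cutpoint
    cbind(X, bias, label = as.numeric(y > cut))   # binary target: above cut?
  })
  do.call(rbind, reps)                            # stacked replicated data
}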
Usage
ordASDA(Xt, ...)
## Default S3 method:
ordASDA(
Xt,
Yt,
s = 1,
Om,
gam = 0.001,
lam = 1e-06,
method = "SDAAP",
control,
...
)
Arguments
Xt |
n by p data matrix, (can also be a data.frame that can be coerced to a matrix) |
... |
Additional arguments for |
Yt |
vector of length n, equal to the number of samples. The classes should be 1,2,...,K where K is the number of classes. Yt needs to be a numeric vector. |
s |
We need to find a hyperplane that separates all classes with different biases. For each new bias we define a binary classification problem, where a maximum of s ordinal classes are contained in each of the two classes. A higher value of s means that more data will be copied in the data augmentation step. By default s is 1. |
Om |
p by p parameter matrix Omega in generalized elastic net penalty, where p is the number of variables. |
gam |
Regularization parameter for elastic net penalty, must be greater than zero. |
lam |
Regularization parameter for l1 penalty, must be greater than zero. |
method |
String to select method, now either SDAD or SDAAP, see ?ASDA for more info. |
control |
List of control arguments further passed to ASDA. See ASDA. |
Value
ordASDA returns an object of class "ordASDA" including a list with the same components as an ASDA object and:
h
Scalar value for biases.
K
Number of classes.
Note
Remember to normalize the data.
See Also
ASDA.
Examples
set.seed(123)
# You can play around with these values to generate some 2D data to test on
numClasses <- 5
sigma <- matrix(c(1,-0.2,-0.2,1),2,2)
mu <- c(0,0)
numObsPerClass <- 5
# Generate the data, can access with train$X and train$Y
train <- accSDA::genDat(numClasses,numObsPerClass,mu,sigma)
test <- accSDA::genDat(numClasses,numObsPerClass*2,mu,sigma)
# Visualize it, only using the first variable gives very good separation
plot(train$X[,1],train$X[,2],col = factor(train$Y),asp=1,main="Training Data")
# Train the ordinal based model
res <- accSDA::ordASDA(train$X,train$Y,s=2,h=1, gam=1e-6, lam=1e-3)
vals <- predict(object = res,newdata = test$X) # Takes a while to run ~ 10 seconds
sum(vals==test$Y)/length(vals) # Get accuracy on test set
#plot(test$X[,1],test$X[,2],col = factor(test$Y),asp=1,
# main="Test Data with correct labels")
#plot(test$X[,1],test$X[,2],col = factor(vals),asp=1,
# main="Test Data with predictions from ordinal classifier")
Predict method for sparse discriminant analysis
Description
Predicted values based on fit from the function ASDA. This function is used to classify new observations based on their explanatory variables/features.
Usage
## S3 method for class 'ASDA'
predict(object, newdata = NULL, ...)
Arguments
object |
Object of class ASDA. This object is returned from the function ASDA. |
newdata |
A matrix of new observations to classify. |
... |
Arguments passed to |
Value
A list with components:
class
The classification (a factor).
posterior
Posterior probabilities for the classes.
x
The scores.
Note
The input matrix newdata should be normalized w.r.t. the normalization of the training data.
Examples
# Prepare training and test set
train <- c(1:40,51:90,101:140)
Xtrain <- iris[train,1:4]
nX <- normalize(Xtrain)
Xtrain <- nX$Xc
Ytrain <- iris[train,5]
Xtest <- iris[-train,1:4]
Xtest <- normalizetest(Xtest,nX)
Ytest <- iris[-train,5]
# Define parameters for SDAD
Om <- diag(4)+0.1*matrix(1,4,4) #elNet coef mat
gam <- 0.01
lam <- 0.01
method <- "SDAD"
q <- 2
control <- list(PGsteps = 100,
PGtol = c(1e-5,1e-5),
mu = 1,
maxits = 100,
tol = 1e-3,
quiet = FALSE)
# Run the algorithm
res <- ASDA(Xt = Xtrain,
Yt = Ytrain,
Om = Om,
gam = gam,
lam = lam,
q = q,
method = method,
control = control)
# Do the predictions on the test set
preds <- predict(object = res, newdata = Xtest)
Predict method for ordinal sparse discriminant analysis
Description
Predicted values based on fit from the function ordASDA. This function is used to classify new observations based on their explanatory variables/features. There is no need to normalize the data; the data is normalized based on the normalization data from the ordASDA object.
Usage
## S3 method for class 'ordASDA'
predict(object, newdata = NULL, ...)
Arguments
object |
Object of class ordASDA. This object is returned from the function ordASDA. |
newdata |
A matrix of new observations to classify. |
... |
Arguments passed to |
Value
A vector of predictions.
Examples
set.seed(123)
# You can play around with these values to generate some 2D data to test on
numClasses <- 5
sigma <- matrix(c(1,-0.2,-0.2,1),2,2)
mu <- c(0,0)
numObsPerClass <- 5
# Generate the data, can access with train$X and train$Y
train <- accSDA::genDat(numClasses,numObsPerClass,mu,sigma)
test <- accSDA::genDat(numClasses,numObsPerClass*2,mu,sigma)
# Visualize it, only using the first variable gives very good separation
plot(train$X[,1],train$X[,2],col = factor(train$Y),asp=1,main="Training Data")
# Train the ordinal based model
res <- accSDA::ordASDA(train$X,train$Y,s=2,h=1, gam=1e-6, lam=1e-3)
vals <- predict(object = res,newdata = test$X) # Takes a while to run ~ 10 seconds
sum(vals==test$Y)/length(vals) # Get accuracy on test set
#plot(test$X[,1],test$X[,2],col = factor(test$Y),asp=1,
# main="Test Data with correct labels")
#plot(test$X[,1],test$X[,2],col = factor(vals),asp=1,
# main="Test Data with predictions from ordinal classifier")
Print method for ASDA object
Description
Prints a summary of the output from the ASDA function. The output summarizes the discriminant analysis in human readable format.
Usage
## S3 method for class 'ASDA'
print(x, digits = max(3, getOption("digits") - 3), numshow = 5, ...)
Arguments
x |
Object of class ASDA. This object is returned from the function ASDA. |
digits |
Number of digits to show in printed numbers. |
numshow |
Number of best ranked variables w.r.t. their absolute coefficients. |
... |
arguments passed to or from other methods. |
Value
An invisible copy of x.
See Also
ASDA, predict.ASDA and SDAD
Examples
# Prepare training and test set
train <- c(1:40,51:90,101:140)
Xtrain <- iris[train,1:4]
nX <- normalize(Xtrain)
Xtrain <- nX$Xc
Ytrain <- iris[train,5]
Xtest <- iris[-train,1:4]
Xtest <- normalizetest(Xtest,nX)
Ytest <- iris[-train,5]
# Run the algorithm
resDef <- ASDA(Xtrain,Ytrain)
# Print
print(resDef)
Proximal Gradient on l1 regularized quadratic program
Description
Applies the proximal gradient algorithm to the l1-regularized quadratic program
f(\mathbf{x}) + g(\mathbf{x}) = \frac{1}{2}\mathbf{x}^TA\mathbf{x} - d^T\mathbf{x} + \lambda |\mathbf{x}|_1
Usage
prox_EN(A, d, x0, lam, alpha, maxits, tol)
Arguments
A |
p by p positive definite coefficient matrix. |
d |
n by 1 dimensional column vector. |
lam |
Regularization parameter for l1 penalty, must be greater than zero. |
alpha |
Step length. |
maxits |
Number of iterations to run. |
tol |
Stopping tolerance for proximal gradient algorithm. |
Details
This function is used by other functions and should only be called explicitly for debugging purposes.
Value
prox_EN returns an object of class "prox_EN" including a list with the following named components:
call
The matched call.
x
Found solution.
k
Number of iterations used.
See Also
Used by: SDAP and the SDAPcv cross-validation version.
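For contrast with the accelerated variant in APG_EN2, here is a plain proximal gradient (ISTA-style) sketch of the same update; it is illustrative only and not the package's internal code.
# Illustrative plain proximal gradient iteration for
# 0.5*x'Ax - d'x + lam*||x||_1: gradient step, then soft thresholding.
prox_l1qp_sketch <- function(A, d, x0, lam, alpha, maxits = 1000, tol = 1e-6) {
  soft <- function(v, a) sign(v) * pmax(abs(v) - a, 0)
  x <- x0
  for (k in 1:maxits) {
    x_new <- drop(soft(x - alpha * (A %*% x - d), alpha * lam))
    if (sqrt(sum((x_new - x)^2)) < tol) { x <- x_new; break }
    x <- x_new
  }
  list(x = x, k = k)
}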
Proximal Gradient on l1 regularized quadratic program with backtracking
Description
Applies the proximal gradient algorithm (with backtracking) to the l1-regularized quadratic program
f(\mathbf{x}) + g(\mathbf{x}) = \frac{1}{2}\mathbf{x}^TA\mathbf{x} - d^T\mathbf{x} + \lambda |\mathbf{x}|_1
Usage
prox_ENbt(A, Xt, Om, gamma, d, x0, lam, L, eta, maxits, tol)
Arguments
A |
p by p positive definite coefficient matrix. |
Xt |
Same as X above; passed separately to make calculations faster. |
Om |
Same reason as for the above parameter. |
gamma |
l2 regularizing parameter. |
d |
n by 1 dimensional column vector. |
lam |
Regularization parameter for l1 penalty, must be greater than zero. |
L |
Initial value of the backtracking Lipschitz constant. |
eta |
Backtracking scaling parameter. |
maxits |
Number of iterations to run. |
tol |
Stopping tolerance for proximal gradient algorithm. |
Details
This function is used by other functions and should only be called explicitly for debugging purposes.
Value
prox_ENbt returns an object of class "prox_ENbt" including a list with the following named components:
call
The matched call.
x
Found solution.
k
Number of iterations used.
See Also
Used by: SDAP and the SDAPcv cross-validation version.
Classify test data using nearest centroid classification and discriminant vectors learned from the training set.
Description
This function is used in SZVDcv and is only meant for internal use at this stage. Will potentially be released in future versions.
Usage
test_ZVD(w, test, classMeans, mus, scaling, ztol)
Arguments
w |
Matrix with columns equal to discriminant vectors. |
test |
Matrix containing the test set. |
classMeans |
Means of each class in the training set (used for computing centroids for classification). |
mus |
Means/standard deviations of the training set (used for centering/normalizing the test data appropriately). |
scaling |
Logical indicating whether scaling should be done on the test set. |
ztol |
Threshold for setting values in DVs to zero. |
Details
This function is used by other functions and should only be called explicitly for debugging purposes. It may be released publicly or made internal in a future version.
Value
test_ZVD returns an object of class "test_ZVD" including a list with the following named components:
stats
List containing the number of misclassified observations, and the l0 and l1 norms of the discriminants.
pred_labs
Predicted class labels according to nearest centroid and the discriminants.
See Also
Used by: SZVDcv.
Softmax for SZVD ADMM iterations
Description
Applies the soft-thresholding shrinkage operator to v with tolerance a. That is, the output is the vector whose entries have absolute value |v_i| - a if |v_i| > a and are zero otherwise, with sign pattern matching that of v.
Usage
vec_shrink(v, a)
Arguments
v |
Vector to be thresholded. |
a |
Vector of tolerances. |
Details
This function is used by other functions and should only be called explicitly for debugging purposes.
Value
Thresholded v vector.
See Also
Used by: SZVD_ADMM.
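The operator is essentially a one-liner in R; an illustrative equivalent (not the package's internal code):
# Entrywise soft thresholding with the sign pattern of v preserved;
# a is recycled if given as a single tolerance.
vec_shrink_sketch <- function(v, a) sign(v) * pmax(abs(v) - a, 0)
vec_shrink_sketch(c(-3, -0.5, 0.2, 2), a = 1)   # yields -2, 0, 0, 1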