Type: | Package |
Title: | Fast Algorithm for Support Vector Machine |
Version: | 1.0.1 |
Date: | 2025-02-05 |
Maintainer: | Yikai Zhang <yikai-zhang@uiowa.edu> |
Description: | Implements an efficient algorithm to fit and tune penalized Support Vector Machine models using the generalized coordinate descent algorithm. Designed to handle high-dimensional datasets effectively, with emphasis on precision and computational efficiency. This package implements the algorithms proposed in Tang, Q., Zhang, Y., & Wang, B. (2022) https://openreview.net/pdf?id=RvwMTDYTOb. |
License: | GPL-2 |
Encoding: | UTF-8 |
Depends: | R (≥ 3.5.0) |
Imports: | stats, Matrix, methods |
Suggests: | knitr, rmarkdown |
VignetteBuilder: | knitr |
NeedsCompilation: | yes |
RoxygenNote: | 7.2.3 |
Packaged: | 2025-02-05 20:41:31 UTC; qtang7 |
Author: | Yikai Zhang [aut, cre], Qian Tang [aut], Boxiang Wang [aut] |
Repository: | CRAN |
Date/Publication: | 2025-02-11 13:50:02 UTC |
Extract Coefficients from a 'cv.hdsvm' Object
Description
Retrieves coefficients from a cross-validated 'hdsvm()' model, using the stored '"hdsvm.fit"' object and the optimal 'lambda' value determined during cross-validation.
Usage
## S3 method for class 'cv.hdsvm'
coef(object, s = c("lambda.1se", "lambda.min"), ...)
Arguments
object |
A fitted 'cv.hdsvm()' object from which coefficients are to be extracted. |
s |
Specifies the value(s) of the penalty parameter 'lambda' for which coefficients are desired. The default is 's = "lambda.1se"', which corresponds to the largest value of 'lambda' such that the cross-validation error estimate is within one standard error of the minimum. Alternatively, 's = "lambda.min"' can be used, corresponding to the minimum of the cross-validation error estimate. If 's' is numeric, these are taken as the actual values of 'lambda' to use. |
... |
Not used. |
Value
Returns the coefficients at the specified 'lambda' values.
Examples
set.seed(315)
n <- 100
p <- 400
x1 <- matrix(rnorm(n / 2 * p, -0.25, 0.1), n / 2)
x2 <- matrix(rnorm(n / 2 * p, 0.25, 0.1), n / 2)
x <- rbind(x1, x2)
beta <- 0.1 * rnorm(p)
prob <- plogis(c(x %*% beta))
y <- 2 * rbinom(n, 1, prob) - 1
cv.fit <- cv.hdsvm(x, y, lam2 = 0.01)
coef(cv.fit, s = c(0.02, 0.03))
Extract Coefficients from a 'cv.nc.hdsvm' Object
Description
Retrieves coefficients at specified values of 'lambda' from a fitted 'cv.nc.hdsvm()' model. Utilizes the stored '"nchdsvm.fit"' object and the optimal 'lambda' values determined during the cross-validation process.
Usage
## S3 method for class 'cv.nc.hdsvm'
coef(object, s = c("lambda.1se", "lambda.min"), ...)
Arguments
object |
A fitted 'cv.nc.hdsvm()' object from which coefficients are to be extracted. |
s |
Specifies the 'lambda' values at which coefficients are requested. The default is 's = "lambda.1se"', representing the largest 'lambda' such that the cross-validation error estimate is within one standard error of the minimum. Alternatively, 's = "lambda.min"' corresponds to the 'lambda' yielding the minimum cross-validation error. If 's' is numeric, these values are directly used as the 'lambda' values for coefficient extraction. |
... |
Not used. |
Value
Returns a vector or matrix of coefficients corresponding to the specified 'lambda' values.
See Also
cv.nc.hdsvm, predict.cv.nc.hdsvm
Examples
set.seed(315)
n <- 100
p <- 400
x1 <- matrix(rnorm(n / 2 * p, -0.25, 0.1), n / 2)
x2 <- matrix(rnorm(n / 2 * p, 0.25, 0.1), n / 2)
x <- rbind(x1, x2)
beta <- 0.1 * rnorm(p)
prob <- plogis(c(x %*% beta))
y <- 2 * rbinom(n, 1, prob) - 1
lam2 <- 0.01
lambda <- 10^(seq(1,-4, length.out = 30))
cv.nc.fit <- cv.nc.hdsvm(x = x, y = y, lambda = lambda, lam2 = lam2, pen = "scad")
coef(cv.nc.fit, s = c(0.02, 0.03))
Extract Model Coefficients from a 'hdsvm' Object
Description
Retrieves the coefficients at specified values of 'lambda' from a fitted 'hdsvm()' model.
Usage
## S3 method for class 'hdsvm'
coef(object, s = NULL, type = c("coefficients", "nonzero"), ...)
Arguments
object |
Fitted 'hdsvm()' object. |
s |
Values of the penalty parameter 'lambda' for which coefficients are requested. Defaults to the entire sequence used during the model fit. |
type |
Type of output required. Type '"coefficients"' computes the coefficients at the requested values of 's'. Type '"nonzero"' returns a list of the indices of the nonzero coefficients for each value of |
... |
Not used. |
Details
This function extracts coefficients for specified 'lambda' values from a 'hdsvm()' object. If 's', the vector of 'lambda' values, contains values not originally used in the model fitting, the 'coef' function employs linear interpolation between the closest 'lambda' values from the original sequence to estimate coefficients at the new 'lambda' values.
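The interpolation described above can be sketched in base R. This is an illustration only, not the package's internal code: the helper name 'interp_coef', the toy coefficient path, and the choice to clamp 's' to the fitted range are all assumptions of this sketch.

```r
# Hypothetical helper (not part of the package): linearly interpolate a
# coefficient path. Rows of 'beta' are variables, columns are fitted lambdas.
interp_coef <- function(beta, lambda, s) {
  stopifnot(ncol(beta) == length(lambda), length(lambda) >= 2)
  # Clamp requested values to the fitted range (an assumption of this sketch)
  s <- pmin(pmax(s, min(lambda)), max(lambda))
  out <- sapply(s, function(s0) {
    # Interpolate each variable's coefficient along the lambda axis
    apply(beta, 1, function(b) approx(lambda, b, xout = s0)$y)
  })
  matrix(out, nrow = nrow(beta), dimnames = list(rownames(beta), NULL))
}

# Toy path with three fitted lambda values and two variables
lambda <- c(0.4, 0.2, 0.1)
beta <- rbind(v1 = c(0, 0.5, 1.0),
              v2 = c(0, 0.0, -0.4))
res <- interp_coef(beta, lambda, s = c(0.3, 0.15))
res
```

Each requested value falls halfway between two fitted 'lambda' values here, so every interpolated coefficient is the midpoint of its two neighbours on the path.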
Value
Returns a matrix or vector of coefficients corresponding to the specified 'lambda' values.
Examples
set.seed(315)
n <- 100
p <- 400
x1 <- matrix(rnorm(n / 2 * p, -0.25, 0.1), n / 2)
x2 <- matrix(rnorm(n / 2 * p, 0.25, 0.1), n / 2)
x <- rbind(x1, x2)
beta <- 0.1 * rnorm(p)
prob <- plogis(c(x %*% beta))
y <- 2 * rbinom(n, 1, prob) - 1
lam2 <- 0.01
fit <- hdsvm(x, y, lam2=lam2)
coefs <- coef(fit, s = fit$lambda[3:5])
Extract Model Coefficients from a 'nc.hdsvm' Object
Description
Retrieves the coefficients at specified values of 'lambda' from a fitted 'nc.hdsvm()' model.
Usage
## S3 method for class 'nc.hdsvm'
coef(object, s = NULL, type = c("coefficients", "nonzero"), ...)
Arguments
object |
Fitted 'nc.hdsvm()' object. |
s |
Values of the penalty parameter 'lambda' for which coefficients are requested. Defaults to the entire sequence used during the model fit. |
type |
Type of output required. Type '"coefficients"' computes the coefficients at the requested values of 's'. Type '"nonzero"' returns a list of the indices of the nonzero coefficients for each value of |
... |
Not used. |
Details
This function extracts coefficients for specified 'lambda' values from a 'nc.hdsvm()' object. If 's', the vector of 'lambda' values, contains values not originally used in the model fitting, the 'coef' function employs linear interpolation between the closest 'lambda' values from the original sequence to estimate coefficients at the new 'lambda' values.
Value
Returns a matrix or vector of coefficients corresponding to the specified 'lambda' values.
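For 'type = "nonzero"', the idea of listing nonzero indices per 'lambda' value can be sketched from a plain coefficient matrix. The matrix below is made up, and whether the package counts the intercept row is not stated here, so dropping it is an assumption of this sketch.

```r
# Made-up coefficient matrix: first row is the intercept, columns are lambdas
coefs <- rbind("(Intercept)" = c(0.1, 0.2, 0.3),
               v1 = c(0, 0.5, 1.0),
               v2 = c(0, 0.0, -0.4),
               v3 = c(0, 0.0, 0.0))

# Drop the intercept (assumption), then collect nonzero indices per column
nz <- lapply(seq_len(ncol(coefs)), function(j) {
  unname(which(coefs[-1, j] != 0))
})
nz
```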
Examples
set.seed(315)
n <- 100
p <- 400
x1 <- matrix(rnorm(n / 2 * p, -0.25, 0.1), n / 2)
x2 <- matrix(rnorm(n / 2 * p, 0.25, 0.1), n / 2)
x <- rbind(x1, x2)
beta <- 0.1 * rnorm(p)
prob <- plogis(c(x %*% beta))
y <- 2 * rbinom(n, 1, prob) - 1
lam2 <- 0.01
lambda <- 10^(seq(1,-4, length.out = 30))
nc.fit <- nc.hdsvm(x = x, y = y, lambda = lambda, lam2 = lam2, pen = "scad")
nc.coefs <- coef(nc.fit, s = nc.fit$lambda[3:5])
Cross-validation for Selecting the Tuning Parameter in the Penalized SVM
Description
Performs k-fold cross-validation for hdsvm.
Usage
cv.hdsvm(x, y, lambda = NULL, nfolds = 5L, foldid, ...)
Arguments
x |
A numerical matrix with |
y |
Response variable. |
lambda |
Optional; a user-supplied sequence of |
nfolds |
Number of folds for cross-validation. Defaults to 5. |
foldid |
Optional vector specifying the indices of observations in each fold.
If provided, it overrides |
... |
Additional arguments passed to |
Details
This function computes the average cross-validation error and provides the standard error.
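As a rough illustration of the averaging step, the sketch below assigns observations to folds and summarizes a made-up matrix of per-fold errors. The balanced random fold assignment and the formula 'sd / sqrt(nfolds)' for the standard error are common conventions assumed here; the package may compute these quantities differently.

```r
set.seed(1)
n <- 20
nfolds <- 5

# Balanced random fold assignment (assumed convention for this sketch)
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# Made-up per-fold errors: rows are folds, columns are lambda values
fold_err <- matrix(c(0.30, 0.20, 0.25,
                     0.35, 0.22, 0.24,
                     0.28, 0.18, 0.27,
                     0.32, 0.21, 0.26,
                     0.30, 0.19, 0.23), nrow = nfolds, byrow = TRUE)

# Mean cross-validation error and its standard error for each lambda
cvm  <- colMeans(fold_err)
cvsd <- apply(fold_err, 2, sd) / sqrt(nfolds)
```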
Value
An object with S3 class cv.hdsvm consisting of
lambda |
Candidate |
cvm |
Mean cross-validation error. |
cvsd |
Standard error of the mean cross-validation error. |
cvup |
Upper confidence curve: |
cvlo |
Lower confidence curve: |
lambda.min |
|
lambda.1se |
Largest |
cv.min |
Cross-validation error at |
cv.1se |
Cross-validation error at |
hdsvm.fit |
a fitted |
nzero |
Number of non-zero coefficients at each |
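The relationship between 'cvm', 'cvsd', 'lambda.min' and 'lambda.1se' can be sketched with made-up numbers. The one-standard-error rule below follows the description given for the 's' argument of the 'coef' and 'predict' methods; defining the confidence curves as 'cvm' plus or minus 'cvsd' is an assumption of this sketch.

```r
# Made-up cross-validation summary over a decreasing lambda sequence
lambda <- c(1, 0.5, 0.25, 0.125, 0.0625)
cvm    <- c(0.40, 0.24, 0.21, 0.20, 0.23)
cvsd   <- c(0.03, 0.03, 0.02, 0.02, 0.03)

# Confidence curves (assumed to be mean plus/minus one standard error)
cvup <- cvm + cvsd
cvlo <- cvm - cvsd

# lambda.min: the lambda with the smallest mean CV error
idx.min    <- which.min(cvm)
lambda.min <- lambda[idx.min]

# lambda.1se: the largest lambda whose error is within one SE of the minimum
lambda.1se <- max(lambda[cvm <= cvup[idx.min]])
```

In this toy case 'lambda.1se' is larger than 'lambda.min', which is the usual outcome: the one-standard-error rule trades a small increase in error for a more heavily penalized model.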
Examples
set.seed(315)
n <- 100
p <- 400
x1 <- matrix(rnorm(n / 2 * p, -0.25, 0.1), n / 2)
x2 <- matrix(rnorm(n / 2 * p, 0.25, 0.1), n / 2)
x <- rbind(x1, x2)
beta <- 0.1 * rnorm(p)
prob <- plogis(c(x %*% beta))
y <- 2 * rbinom(n, 1, prob) - 1
lam2 <- 0.01
fit <- cv.hdsvm(x, y, lam2=lam2)
Cross-validation for Selecting the Tuning Parameter of Nonconvex Penalized SVM
Description
Conducts k-fold cross-validation for the 'nc.hdsvm()' function.
Usage
cv.nc.hdsvm(x, y, lambda = NULL, nfolds = 5L, foldid, ...)
Arguments
x |
A numerical matrix with dimensions ( |
y |
Response variable. |
lambda |
Optional user-supplied sequence of |
nfolds |
Number of folds in the cross-validation, default is 5. |
foldid |
An optional vector that assigns each observation to a specific fold.
If provided, this parameter overrides |
... |
Additional arguments passed to |
Details
This function estimates the average cross-validation error and its standard error across folds. It is primarily used to identify the optimal lambda value for fitting nonconvex penalized SVM models.
Value
An object of class cv.nc.hdsvm is returned, which is a list with the ingredients of the cross-validated fit.
lambda |
the values of |
cvm |
the mean cross-validated error - a vector of length |
cvsd |
estimate of standard error of |
cvupper |
upper curve = |
cvlower |
lower curve = |
nzero |
number of non-zero coefficients at each |
name |
a text string indicating type of measure (for plotting purposes). |
nchdsvm.fit |
a fitted |
lambda.min |
The optimal value of |
lambda.1se |
The largest value of |
Examples
set.seed(315)
n <- 100
p <- 400
x1 <- matrix(rnorm(n / 2 * p, -0.25, 0.1), n / 2)
x2 <- matrix(rnorm(n / 2 * p, 0.25, 0.1), n / 2)
x <- rbind(x1, x2)
beta <- 0.1 * rnorm(p)
prob <- plogis(c(x %*% beta))
y <- 2 * rbinom(n, 1, prob) - 1
lam2 <- 0.01
lambda <- 10^(seq(1,-4, length.out=30))
cv.nc.fit <- cv.nc.hdsvm(x=x, y=y, lambda=lambda, lam2=lam2, pen="scad")
Solve Penalized SVM
Description
Fits a penalized support vector machine (SVM) model using a range of lambda values, allowing for detailed control over regularization parameters and model complexity.
Usage
hdsvm(
x,
y,
nlambda = 100,
lambda.factor = ifelse(nobs < nvars, 0.01, 1e-04),
lambda = NULL,
lam2 = 0,
hval = 1,
pf = rep(1, nvars),
pf2 = rep(1, nvars),
exclude,
dfmax = nvars + 1,
pmax = min(dfmax * 1.2, nvars),
standardize = TRUE,
eps = 1e-08,
maxit = 1e+06,
sigma = 0.9,
is_exact = FALSE
)
Arguments
x |
Matrix of predictors, with dimensions ( |
y |
Response variable vector of length |
nlambda |
Number of |
lambda.factor |
The factor for getting the minimal value in the |
lambda |
A user-supplied |
lam2 |
Regularization parameter |
hval |
Smoothing parameter for the smoothed hinge loss, default is 1. |
pf |
L1 penalty factor of length |
pf2 |
L2 penalty factor of length |
exclude |
Indices of variables to be excluded from the model. Default is none. Equivalent to an infinite penalty factor. |
dfmax |
The maximum number of variables allowed in the model.
Useful for very large |
pmax |
Maximum count of non-zero coefficients across the solution path. |
standardize |
Logical flag for variable standardization, prior to fitting the model sequence. The coefficients are always returned to the original scale. Default is |
eps |
Convergence criterion for stopping the algorithm. |
maxit |
Maximum number of iterations permitted. |
sigma |
Penalty parameter in the quadratic term of the augmented Lagrangian. |
is_exact |
If |
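The 'standardize' argument states that coefficients are returned on the original scale. A minimal sketch of such a back-transformation is shown below, assuming each column is centered by its mean and scaled by its sample standard deviation; the package's exact standardization convention is not given here, and all names and values in the sketch are illustrative.

```r
set.seed(1)
x <- matrix(rnorm(30), nrow = 10, ncol = 3)

# Assumed standardization: center by column mean, scale by column sd
mu  <- colMeans(x)
sdv <- apply(x, 2, sd)
xs  <- scale(x, center = mu, scale = sdv)

# Illustrative coefficients on the standardized scale
b0_std   <- 0.2
beta_std <- c(1, -2, 0.5)

# Map back to the original scale of x
beta <- beta_std / sdv
b0   <- b0_std - sum(beta * mu)

# Both parameterizations give the same linear predictor
score_std  <- drop(b0_std + xs %*% beta_std)
score_orig <- drop(b0 + x %*% beta)
```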
Details
The function utilizes the hinge loss function combined with elastic net penalization:
\frac{1}{N} \sum_{i=1}^{N} \max\{1 - y_i (\beta_0 + X_i^\top \beta), 0\} + \lambda_1 \cdot \|pf_1 \circ \beta\|_1 + 0.5 \cdot \lambda_2 \cdot \|\sqrt{pf_2} \circ \beta\|_2^2,
where \circ denotes the Hadamard product.
For faster computation, if the algorithm is not converging or is running slowly, consider increasing eps, increasing sigma, decreasing nlambda, or increasing lambda.factor before increasing maxit.
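To make the objective in these Details concrete, the following sketch evaluates the hinge loss plus elastic net penalty for given coefficients. The function name 'svm_objective' and the toy data are hypothetical. It evaluates the unsmoothed objective only and is not the package's solver, which works with a smoothed hinge loss controlled by 'hval'.

```r
# Hypothetical helper: evaluate the penalized hinge-loss objective
svm_objective <- function(x, y, b0, beta, lam1, lam2,
                          pf = rep(1, ncol(x)), pf2 = rep(1, ncol(x))) {
  margin <- y * (b0 + drop(x %*% beta))       # y_i * (beta_0 + X_i' beta)
  hinge  <- mean(pmax(1 - margin, 0))         # average hinge loss
  l1     <- lam1 * sum(abs(pf * beta))        # weighted L1 penalty
  l2     <- 0.5 * lam2 * sum(pf2 * beta^2)    # weighted squared L2 penalty
  hinge + l1 + l2
}

# Toy data: three observations, two predictors, labels in {-1, 1}
x <- rbind(c(1, 0), c(0, 1), c(-1, -1))
y <- c(1, -1, -1)
obj <- svm_objective(x, y, b0 = 0, beta = c(0.5, -0.5),
                     lam1 = 0.1, lam2 = 0.01)
obj
```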
Value
An object with S3 class hdsvm consisting of
call |
the call that produced this object |
b0 |
intercept sequence of length |
beta |
a |
lambda |
the actual sequence of |
df |
the number of nonzero coefficients for each value of |
npasses |
the number of iterations for every lambda value |
jerr |
error flag, for warnings and errors, 0 if no error. |
Examples
set.seed(315)
n <- 100
p <- 400
x1 <- matrix(rnorm(n / 2 * p, -0.25, 0.1), n / 2)
x2 <- matrix(rnorm(n / 2 * p, 0.25, 0.1), n / 2)
x <- rbind(x1, x2)
beta <- 0.1 * rnorm(p)
prob <- plogis(c(x %*% beta))
y <- 2 * rbinom(n, 1, prob) - 1
lam2 <- 0.01
fit <- hdsvm(x, y, lam2=lam2)
Solve the Penalized SVM with Nonconvex Penalties
Description
This function fits the penalized SVM using nonconvex penalties such as SCAD or MCP. It allows for flexible control over the regularization parameters and offers advanced options for initializing and optimizing the fit.
Usage
nc.hdsvm(
x,
y,
lambda,
pen = "scad",
aval = NULL,
lam2 = 1,
ini_beta = NULL,
lla_step = 3,
...
)
Arguments
x |
Matrix of predictors, with dimensions (nobs * nvars); each row represents an observation. |
y |
Response variable, with length |
lambda |
Optional user-supplied sequence of |
pen |
Specifies the type of nonconvex penalty: SCAD or MCP. Default is "scad". |
aval |
The parameter value for the SCAD or MCP penalty. Default is 3.7 for SCAD and 2 for MCP. |
lam2 |
Regularization parameter |
ini_beta |
Optional initial coefficients to start the fitting process. |
lla_step |
Number of Local Linear Approximation (LLA) steps. Default is 3. |
... |
Additional arguments passed to |
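The roles of 'pen', 'aval' and 'lla_step' can be illustrated with the standard SCAD and MCP penalty derivatives, which a local linear approximation typically uses as per-coefficient L1 weights at each step. The function names below are hypothetical, and how the package uses these weights internally is an assumption of this sketch; only the default values 3.7 and 2 come from the argument descriptions.

```r
# Derivative of the SCAD penalty at |t| (standard form, a > 2)
scad_deriv <- function(t, lambda, a = 3.7) {
  t <- abs(t)
  ifelse(t <= lambda, lambda, pmax(a * lambda - t, 0) / (a - 1))
}

# Derivative of the MCP penalty at |t| (standard form, a > 1)
mcp_deriv <- function(t, lambda, a = 2) {
  pmax(lambda - abs(t) / a, 0)
}

# Weights implied by an illustrative initial estimate
beta_init <- c(0, 0.05, 0.2, 1)
w_scad <- scad_deriv(beta_init, lambda = 0.1)
w_mcp  <- mcp_deriv(beta_init, lambda = 0.1)
```

Small coefficients keep the full weight 'lambda', while large coefficients receive a weight of zero and are left unpenalized in the next weighted L1 fit. This is how the nonconvex penalties reduce the shrinkage bias of the lasso.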
Value
An object with S3 class nc.hdsvm consisting of
call |
the call that produced this object |
b0 |
intercept sequence of length |
beta |
a |
lambda |
the actual sequence of |
df |
the number of nonzero coefficients for each value of |
npasses |
the number of iterations for every lambda value |
jerr |
error flag, for warnings and errors, 0 if no error. |
Examples
set.seed(315)
n <- 100
p <- 400
x1 <- matrix(rnorm(n / 2 * p, -0.25, 0.1), n / 2)
x2 <- matrix(rnorm(n / 2 * p, 0.25, 0.1), n / 2)
x <- rbind(x1, x2)
beta <- 0.1 * rnorm(p)
prob <- plogis(c(x %*% beta))
y <- 2 * rbinom(n, 1, prob) - 1
lam2 <- 0.01
lambda <- 10^(seq(1,-4, length.out = 30))
nc.fit <- nc.hdsvm(x = x, y = y, lambda = lambda, lam2 = lam2, pen = "scad")
Make Predictions from a 'cv.hdsvm' Object
Description
Generates predictions using a fitted 'cv.hdsvm()' object. This function utilizes the stored 'hdsvm.fit' object and an optimal value of 'lambda' determined during the cross-validation process.
Usage
## S3 method for class 'cv.hdsvm'
predict(
object,
newx,
s = c("lambda.1se", "lambda.min"),
type = c("class", "loss"),
...
)
Arguments
object |
A fitted 'cv.hdsvm()' object from which predictions are to be made. |
newx |
Matrix of new predictor values for which predictions are desired. This must be a matrix and is a required argument. |
s |
Specifies the value(s) of the penalty parameter 'lambda' at which predictions are desired. The default is 's = "lambda.1se"', representing the largest value of 'lambda' such that the cross-validation error estimate is within one standard error of the minimum. Alternatively, 's = "lambda.min"' can be used, corresponding to the minimum of the cross-validation error estimate. If 's' is numeric, these are taken as the actual values of 'lambda' to use for predictions. |
type |
Type of prediction required. Type '"class"' produces the predicted binary class labels and type '"loss"' returns the fitted values. Default is |
... |
Not used. |
Value
Returns a matrix or vector of predicted values corresponding to the specified 'lambda' values.
Examples
set.seed(315)
n <- 100
p <- 400
x1 <- matrix(rnorm(n / 2 * p, -0.25, 0.1), n / 2)
x2 <- matrix(rnorm(n / 2 * p, 0.25, 0.1), n / 2)
x <- rbind(x1, x2)
beta <- 0.1 * rnorm(p)
prob <- plogis(c(x %*% beta))
y <- 2 * rbinom(n, 1, prob) - 1
cv.fit <- cv.hdsvm(x, y, lam2 = 0.01)
predict(cv.fit, newx = x[50:60, ], s = "lambda.min")
Make Predictions from a 'cv.nc.hdsvm' Object
Description
Generates predictions using a fitted 'cv.nc.hdsvm()' object. This function utilizes the stored 'nchdsvm.fit' object and an optimal value of 'lambda' determined during the cross-validation process.
Usage
## S3 method for class 'cv.nc.hdsvm'
predict(
object,
newx,
s = c("lambda.1se", "lambda.min"),
type = c("class", "loss"),
...
)
Arguments
object |
A fitted 'cv.nc.hdsvm()' object from which predictions are to be made. |
newx |
Matrix of new predictor values for which predictions are desired. This must be a matrix and is a required argument. |
s |
Specifies the value(s) of the penalty parameter 'lambda' at which predictions are desired. The default is 's = "lambda.1se"', representing the largest value of 'lambda' such that the cross-validation error estimate is within one standard error of the minimum. Alternatively, 's = "lambda.min"' can be used, corresponding to the minimum of the cross-validation error estimate. If 's' is numeric, these are taken as the actual values of 'lambda' to use for predictions. |
type |
Type of prediction required. Type '"class"' produces the predicted binary class labels and type '"loss"' returns the fitted values. Default is |
... |
Not used. |
Value
Returns a matrix or vector of predicted values corresponding to the specified 'lambda' values.
See Also
cv.nc.hdsvm, predict.cv.nc.hdsvm
Examples
set.seed(315)
n <- 100
p <- 400
x1 <- matrix(rnorm(n / 2 * p, -0.25, 0.1), n / 2)
x2 <- matrix(rnorm(n / 2 * p, 0.25, 0.1), n / 2)
x <- rbind(x1, x2)
beta <- 0.1 * rnorm(p)
prob <- plogis(c(x %*% beta))
y <- 2 * rbinom(n, 1, prob) - 1
lam2 <- 0.01
lambda <- 10^(seq(1,-4, length.out = 30))
cv.nc.fit <- cv.nc.hdsvm(x = x, y = y, lambda = lambda, lam2 = lam2, pen = "scad")
predict(cv.nc.fit, newx = x[50:60, ], s = "lambda.min")
Make Predictions from a 'hdsvm' Object
Description
Produces fitted values for new predictor data using a fitted 'hdsvm()' object.
Usage
## S3 method for class 'hdsvm'
predict(object, newx, s = NULL, type = c("class", "loss"), ...)
Arguments
object |
Fitted 'hdsvm()' object from which predictions are to be derived. |
newx |
Matrix of new predictor values for which predictions are desired. This must be a matrix and is a required argument. |
s |
Values of the penalty parameter 'lambda' for which predictions are requested. Defaults to the entire sequence used during the model fit. |
type |
Type of prediction required. Type '"class"' produces the predicted binary class labels and type '"loss"' returns the fitted values. Default is |
... |
Not used. |
Details
This function generates predictions at specified 'lambda' values from a fitted 'hdsvm()' object. It is essential to provide a new matrix of predictor values ('newx') at which these predictions are to be made.
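As a sketch of what class prediction amounts to for a linear SVM, the code below computes the linear score for each 'lambda' value and takes its sign. The helper 'predict_class' and the toy coefficients are hypothetical, and assigning a score of exactly zero to class -1 is an arbitrary choice made for this sketch rather than documented package behavior.

```r
# Hypothetical helper: class labels from intercepts and a coefficient matrix.
# 'beta' has one column per lambda value; 'b0' has one intercept per column.
predict_class <- function(newx, b0, beta) {
  score <- sweep(newx %*% beta, 2, b0, "+")  # add each intercept to its column
  ifelse(score > 0, 1, -1)                   # sign of the linear score
}

# Two new observations, two predictors, two lambda values
newx <- rbind(c(1, 0), c(0, 1))
beta <- cbind(c(1, -1), c(0, 0))
b0   <- c(0, 0.5)
pred <- predict_class(newx, b0, beta)
pred
```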
Value
Returns a vector or matrix of predicted values corresponding to the specified 'lambda' values.
Examples
set.seed(315)
n <- 100
p <- 400
x1 <- matrix(rnorm(n / 2 * p, -0.25, 0.1), n / 2)
x2 <- matrix(rnorm(n / 2 * p, 0.25, 0.1), n / 2)
x <- rbind(x1, x2)
beta <- 0.1 * rnorm(p)
prob <- plogis(c(x %*% beta))
y <- 2 * rbinom(n, 1, prob) - 1
lam2 <- 0.01
fit <- hdsvm(x, y, lam2=lam2)
preds <- predict(fit, newx = tail(x), s = fit$lambda[3:5])
Make Predictions from a 'nc.hdsvm' Object
Description
Produces fitted values for new predictor data using a fitted 'nc.hdsvm()' object.
Usage
## S3 method for class 'nc.hdsvm'
predict(object, newx, s = NULL, type = c("class", "loss"), ...)
Arguments
object |
Fitted 'nc.hdsvm()' object from which predictions are to be derived. |
newx |
Matrix of new predictor values for which predictions are desired. This must be a matrix and is a required argument. |
s |
Values of the penalty parameter 'lambda' for which predictions are requested. Defaults to the entire sequence used during the model fit. |
type |
Type of prediction required. Type '"class"' produces the predicted binary class labels and type '"loss"' returns the fitted values. Default is |
... |
Not used. |
Details
This function generates predictions at specified 'lambda' values from a fitted 'nc.hdsvm()' object. It is essential to provide a new matrix of predictor values ('newx') at which these predictions are to be made.
Value
Returns a vector or matrix of predicted values corresponding to the specified 'lambda' values.
Examples
set.seed(315)
n <- 100
p <- 400
x1 <- matrix(rnorm(n / 2 * p, -0.25, 0.1), n / 2)
x2 <- matrix(rnorm(n / 2 * p, 0.25, 0.1), n / 2)
x <- rbind(x1, x2)
beta <- 0.1 * rnorm(p)
prob <- plogis(c(x %*% beta))
y <- 2 * rbinom(n, 1, prob) - 1
lam2 <- 0.01
lambda <- 10^(seq(1,-4, length.out = 30))
nc.fit <- nc.hdsvm(x = x, y = y, lambda = lambda, lam2 = lam2, pen = "scad")
nc.preds <- predict(nc.fit, newx = tail(x), s = nc.fit$lambda[3:5])