Coordinate Descent Lasso

Fit lasso for regression using coordinate descent.

cdlasso(formula,
      data,
      nfolds = 0,
      weights = NULL,
      nlambda = 100,
      lambda.min.ratio = ifelse(n < n.xvar, 0.01, 1e-04),
      lambda = NULL,
      threshold = 1e-7,
      eps = .0001,
      maxit = 5000,
      efficiency = ifelse(n.xvar < 500, "covariance", "naive"),
      seed = NULL,
      do.trace = FALSE)

Arguments

formula: Formula describing the model to be fit.
data: Data frame containing response and features.
nfolds: Number of cross-validation folds. The default, 0, corresponds to no cross-validation.
weights: Observation weights. Default is 1 for each observation.
nlambda: The number of lambda values; default is 100.
lambda.min.ratio: Smallest value for lambda, as a fraction of lambda.max, the smallest value of lambda for which all coefficients are zero. A very small value of lambda.min.ratio will lead to a saturated fit if the number of observations n is less than the number of features n.xvar.
lambda: Lasso lambda sequence. Default is an internally selected sequence based on nlambda and lambda.min.ratio. For experts only.
threshold: Convergence threshold for coordinate descent. Each inner coordinate-descent loop continues until the maximum change in the objective after any coefficient update is less than threshold times the null deviance.
eps: Multiplication factor applied to lambda.min.ratio used to define the smallest lambda value.
maxit: Maximum number of passes over the data for all lambda values.
efficiency: Switches the algorithm between "covariance" and "naive" mode; the default depends on the number of variables. "covariance" saves all inner products and can be significantly faster in certain settings than "naive", which loops over all n observations each time an inner product is formed.
seed: Negative integer specifying seed for the random number generator.
do.trace: Number of seconds between updates to the user on approximate time to completion.
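
As an illustration of how nlambda and lambda.min.ratio could define the default lambda sequence, here is a minimal sketch. It assumes the common convention of standardized features, a centered response, and an objective scaled by 1/(2n); the helper name lambda.path is hypothetical, and the package's internal construction may differ (for example in how eps enters).

```r
## Hypothetical helper, not part of the package: builds a log-spaced
## lambda grid from lambda.max down to lambda.min.ratio * lambda.max.
lambda.path <- function(x, y, nlambda = 100,
                        lambda.min.ratio = ifelse(nrow(x) < ncol(x), 0.01, 1e-04)) {
  n <- nrow(x)
  ## Assumption: standardize columns using the 1/n variance convention
  ## (scale() uses 1/(n - 1), hence the correction) and center y.
  x <- scale(x) * sqrt(n / (n - 1))
  y <- y - mean(y)
  ## Smallest lambda for which all coefficients are zero
  lambda.max <- max(abs(crossprod(x, y))) / n
  ## Decreasing sequence, equally spaced on the log scale
  exp(seq(log(lambda.max), log(lambda.max * lambda.min.ratio),
          length.out = nlambda))
}

set.seed(1)
x <- matrix(rnorm(50 * 5), 50, 5)
y <- x[, 1] - 2 * x[, 2] + rnorm(50)
lam <- lambda.path(x, y, nlambda = 10)
```

Because n is not less than the number of features here, the default ratio is 1e-04, so the last value of lam is 1e-04 times the first.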

Details

Use coordinate descent to fit lasso to a regression model.
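
The following is a minimal sketch of cyclic coordinate descent for the lasso at a single lambda value, in the spirit of Friedman et al. (2010). It is not the package's implementation: the names soft and cd.lasso are hypothetical, the objective is assumed to be (1/(2n)) RSS + lambda * sum(abs(beta)) with centered data and no intercept, and the "covariance" branch simply precomputes the full Gram matrix rather than caching inner products on demand.

```r
## Soft-thresholding operator
soft <- function(z, g) sign(z) * pmax(abs(z) - g, 0)

## Hypothetical sketch: coordinate descent for a single lambda
cd.lasso <- function(x, y, lambda, threshold = 1e-7, maxit = 5000,
                     efficiency = c("covariance", "naive")) {
  efficiency <- match.arg(efficiency)
  n <- nrow(x)
  p <- ncol(x)
  beta <- numeric(p)
  xsq <- colSums(x^2) / n              # per-column scale
  nulldev <- mean((y - mean(y))^2)     # null deviance (per observation)
  if (efficiency == "covariance") {
    xy <- drop(crossprod(x, y)) / n    # inner products with the response
    xx <- crossprod(x) / n             # inner products between features
  } else {
    r <- y                             # residual for beta = 0
  }
  for (pass in seq_len(maxit)) {
    delta <- 0
    for (j in seq_len(p)) {
      ## Inner product of column j with the current residual
      grad <- if (efficiency == "covariance") {
        xy[j] - sum(xx[, j] * beta)    # uses saved inner products
      } else {
        sum(x[, j] * r) / n            # loops over all n observations
      }
      b.new <- soft(grad + xsq[j] * beta[j], lambda) / xsq[j]
      d <- b.new - beta[j]
      if (d != 0) {
        if (efficiency == "naive") r <- r - x[, j] * d
        beta[j] <- b.new
        delta <- max(delta, xsq[j] * d^2)
      }
    }
    ## Stop when the largest change is below threshold times null deviance
    if (delta < threshold * nulldev) break
  }
  beta
}

set.seed(3)
x <- scale(matrix(rnorm(60 * 4), 60, 4))
y <- x[, 1] - x[, 2] + rnorm(60)
y <- y - mean(y)
b.cov <- cd.lasso(x, y, lambda = 0.1, efficiency = "covariance")
b.naive <- cd.lasso(x, y, lambda = 0.1, efficiency = "naive")
```

Both modes perform the same coefficient updates and differ only in how the inner product is obtained, so b.cov and b.naive should agree up to numerical error.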

Value

Lasso solution path with the following values.

beta: Matrix containing beta values for the lasso solution path.
lambda: The sequence of lambda values used.
lambda.min.indx: Index for value of lambda that gives the minimum cross-validation error. Only applies if nfolds is greater than 1.
lambda.1se.min.indx: Index for the minimum lambda value within 1 standard error of the minimum cross-validation error. This is the more liberal choice (less shrinkage). Only applies if nfolds is greater than 1.
lambda.1se.max.indx: Index for the maximum lambda value within 1 standard error of the minimum cross-validation error. This is the more conservative choice (more shrinkage). Only applies if nfolds is greater than 1.
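
To make the three indices concrete, the sketch below derives them from per-lambda cross-validation summaries. The function cv.lambda.index and its inputs cv.mean and cv.se are hypothetical and not returned by the package; "within 1 standard error" is assumed to mean a mean error no larger than the minimum plus its standard error, which is the usual convention.

```r
## Hypothetical helper: pick lambda indices from CV summaries.
## cv.mean and cv.se are the mean CV error and its standard error
## for each value in lambda (all three vectors have the same length).
cv.lambda.index <- function(cv.mean, cv.se, lambda) {
  i.min <- which.min(cv.mean)
  ## Candidates whose error is within 1 SE of the minimum
  within <- which(cv.mean <= cv.mean[i.min] + cv.se[i.min])
  list(lambda.min.indx     = i.min,
       lambda.1se.min.indx = within[which.min(lambda[within])],
       lambda.1se.max.indx = within[which.max(lambda[within])])
}

## Made-up numbers for illustration only
lambda  <- c(1, 0.5, 0.25, 0.125, 0.0625)
cv.mean <- c(2.0, 1.2, 1.0, 1.1, 1.5)
cv.se   <- rep(0.25, 5)
idx <- cv.lambda.index(cv.mean, cv.se, lambda)
```

With these numbers the minimum error is at index 3, and indices 2 to 4 fall within one standard error, so the liberal choice is index 4 (lambda = 0.125) and the conservative choice is index 2 (lambda = 0.5).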

Author

Hemant Ishwaran and Udaya B. Kogalur

References

Friedman, J., Hastie, T. and Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1):1-22.

Examples