Class "QuadrupenFit"

Class of object returned by any fitting function of the quadrupen package (elastic_net or bounded_reg).

This class comes with the usual predict(), fitted(), coef(), residuals(), show(), print() and deviance() S3 methods.

Specific R6 methods are available for model extraction QuadrupenFit$get_model(), cross-validation QuadrupenFit$cross_validate(), stability selection QuadrupenFit$stability(), criteria derivation QuadrupenFit$criteria() and plotting QuadrupenFit$plot(). They come with equivalent S3 methods: cross_validate(), stability() and plot().

The "path" plot is available as soon as a fit has been performed. For the others, the appropriate post-treatments must first have been run via the methods QuadrupenFit$criteria(), QuadrupenFit$cross_validate() or QuadrupenFit$stability().

All plot functions are called with their default arguments, except for labels and log_scale. If you need more control, please use the dedicated methods: QuadrupenFit$plot_path(), InformationCriteria$plot(), CrossValidation$plot(), StabilityPath$plot() or the corresponding S3 methods.

Plot method for regularization path

Active bindings

nvar: number of coefficients (excluding the intercept)
nobs: sample size
dataModel: an object with class DataModel storing the data
major_tuning: vector of "leading" tuning parameters (either l1, linf or l2)
minor_tuning: vector of "minor" tuning parameters (either l1 or l2)
is_l2_regularized: Boolean indicating if l2 regularization is applied
optim_monitoring: list monitoring the optimization
optim_config: list with low level options used for optimization.
fitted: Matrix of fitted values, each column corresponding to a value of lambda1.
coefficients: Matrix (class "dgCMatrix") of coefficients with respect to the original input. The number of rows corresponds to the length of lambda1.
intercept: A vector containing the successive values of the (unpenalized) intercept. Equal to zero if intercept has been set to FALSE.
debias: logical; whether to rely on the debiased coefficients of the regularizer (if available)
residuals: Matrix of residuals, each column corresponding to a value of lambda1.
deviance: the model deviance
degrees_freedom: Estimated degrees of freedom for the successive lambda1.
r_squared: vector giving the coefficient of determination as a function of lambda1.
information_criteria: object with class InformationCriteria storing various information criteria (AIC, BIC, GCV, etc) for the current fit.
cross_validation: object with class CrossValidation storing output of CV job. Only available once method $cross_validate() has been called.
stability_path: object with class StabilityPath storing output of stability selection. Only available once method $stability() has been called.

Methods

`QuadrupenFit$new()`

Initialize a QuadrupenFit model

Usage

QuadrupenFit$new(data, intercept, regParam)

Arguments

data: a DataModel object
intercept: a logical; should an intercept be included in the model?
regParam: a list with two elements, a vector and a scalar, for the regularization

`QuadrupenFit$show()`

User friendly print method

Usage

QuadrupenFit$show()

`QuadrupenFit$print()`

User friendly print method

Usage

QuadrupenFit$print()

`QuadrupenFit$fit()`

function performing the optimization

Usage

QuadrupenFit$fit(control)

Arguments

control: list controlling the optimization process

`QuadrupenFit$get_model()`

Model extraction

Usage

QuadrupenFit$get_model(selection, type = c("coefficients", "penalty", "index"))

Arguments

selection: either a character (model selection criterion) or a scalar (lambda value)
type: character for the desired output

Returns

either a vector of coefficients, a scalar or the model index

`QuadrupenFit$predict()`

Predict response for new sample based on the current model

Usage

QuadrupenFit$predict(newx = NULL, selection = NULL)

Arguments

newx: matrix of new values for the regressor with which to predict. If omitted, the fitted values are used.
selection: either a character (model selection criterion) or a scalar (lambda value)

Returns

a vector of predicted values

`QuadrupenFit$cross_validate()`

Function that computes K-fold cross-validated error of a quadrupen fit, possibly on a grid of lambda1, lambda2.

Usage

QuadrupenFit$cross_validate(
  K = 10,
  folds = split(sample(1:self$nobs), rep(1:K, length = self$nobs)),
  lambda2 = self$minor_tuning,
  verbose = TRUE,
  cores = 1
)

Arguments

K: integer indicating the number of folds. Default is 10.
folds: list of K vectors that describes the folds to use for the cross-validation. By default, the folds are randomly sampled with the specified K. The same folds are used for each value of lambda2.
lambda2: tunes the $\ell_2$-penalty (ridge-like) of the fit. If none is provided, a vector of values is generated and a CV is performed on a grid of lambda2 and lambda1, using the same folds for each lambda2.
verbose: logical; indicates if the progression (the current lambda2) should be displayed. Default is TRUE.
cores: the number of cores to use. The default uses 1 core (safer in case your BLAS/LAPACK libraries are multithreaded)

Returns

an object with class CrossValidation is sent back and stored as a field of the original QuadrupenFit object.
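
The default value of folds shown in the Usage above can be reproduced outside the class to see what it builds. The sketch below uses base R only; n and K are illustrative stand-ins for self$nobs and the K argument, not values taken from the package.

```r
set.seed(1)            # only to make the illustration reproducible
n <- 23                # stand-in for self$nobs (assumed value)
K <- 5                 # number of folds (assumed value)

# Same expression as the documented default: shuffle the indices 1..n,
# then deal them into K groups of near-equal size.
folds <- split(sample(1:n), rep(1:K, length = n))

# Each observation lands in exactly one fold.
stopifnot(length(folds) == K)
stopifnot(identical(sort(unlist(folds, use.names = FALSE)), 1:n))

# Fold sizes differ by at most one observation.
fold_sizes <- lengths(folds)
print(fold_sizes)
```

A list built this way could presumably be passed as folds to compare several fits on identical splits; this assumes the method accepts any list of index vectors of this shape, as the argument description suggests.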

`QuadrupenFit$stability()`

Compute the stability path of a (possibly randomized) fitting procedure as introduced by Meinshausen and Bühlmann (2010).

Usage

QuadrupenFit$stability(
  n_subsamples = 50,
  subsample_size = floor(self$nobs/2),
  subsamples = replicate(n_subsamples, sample(1:self$nobs, subsample_size), simplify =
    FALSE),
  weakness = 1,
  verbose = TRUE,
  cores = 1
)

Arguments

n_subsamples: integer indicating the number of subsamplings used to estimate the selection probabilities. Default is 50.
subsample_size: integer indicating the size of each subsample. Default is floor(n/2).
subsamples: list with n_subsamples entries, each a vector of indices describing one subsample to use for the stability procedure. By default, the subsamples are randomly drawn according to the n_subsamples and subsample_size arguments.
weakness: coefficient used for randomizing the weights of each feature. Default is 1 for no randomization. See details below.
verbose: logical; indicates if the progression should be displayed. Default is TRUE.
cores: the number of cores to use. The default uses 1 core (safer in case your BLAS/LAPACK libraries are multithreaded)

Returns

an object with class StabilityPath is sent back and stored as a field of the original QuadrupenFit object.
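
As a rough illustration of the default subsamples argument, the expression from the Usage above can be evaluated on its own in base R. The value of n below is an arbitrary stand-in for self$nobs; nothing here calls the package itself.

```r
set.seed(42)                       # only to make the illustration reproducible
n <- 40                            # stand-in for self$nobs (assumed value)
n_subsamples <- 50                 # documented default
subsample_size <- floor(n / 2)     # documented default

# Same expression as the documented default: a list of n_subsamples
# index vectors, each drawn without replacement from 1..n.
subsamples <- replicate(n_subsamples,
                        sample(1:n, subsample_size),
                        simplify = FALSE)

# Sanity checks on the structure of the list.
stopifnot(length(subsamples) == n_subsamples)
stopifnot(all(lengths(subsamples) == subsample_size))
# No observation is repeated inside a given subsample.
stopifnot(all(vapply(subsamples, anyDuplicated, integer(1)) == 0L))
```

Building the list explicitly, rather than relying on the default, is one way to reuse the same subsamples across several fits.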

`QuadrupenFit$criteria()`

Produce a plot or send back the values of some penalized criteria accompanied with the vector(s) of parameters selected accordingly. The default behavior plots the BIC and the AIC (with respective factor $\log(n)$ and $2$) yet the user can specify any penalty.

Usage

QuadrupenFit$criteria(
  penalty = setNames(c(2, log(self$nobs), log(self$nvar), log(self$nobs) + 2 *
    log(self$nvar)), c("AIC", "BIC", "mBIC", "eBIC")),
  sigma = NULL
)

Arguments

penalty: a vector with as many penalties as desired. The default contains the penalties corresponding to the AIC ($2$), the BIC ($\log(n)$), the mBIC ($\log(p)$) and the eBIC ($\log(n) + 2\log(p)$), where $n$ is nobs and $p$ is nvar. Setting the "names" attribute, as done in the default definition, leads to outputs which are easier to read.
sigma: scalar: an estimate of the residual variance. When available, it is plugged into the criteria, which may be more relevant. If NULL (the default), it is estimated as usual (see details).

Returns

an object with class InformationCriteria is sent back and stored as a field of the original QuadrupenFit object.
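
The default penalty vector can be rebuilt by hand to check the factors involved or to derive a custom one. This is a base R sketch only: n and p stand in for self$nobs and self$nvar (the values match the simulated data in the Examples), and the name "heavy" is a hypothetical label chosen for illustration.

```r
n <- 50   # stand-in for self$nobs (assumed value)
p <- 95   # stand-in for self$nvar (assumed value)

# Same expression as the documented default, with n and p substituted.
default_penalty <- setNames(
  c(2, log(n), log(p), log(n) + 2 * log(p)),
  c("AIC", "BIC", "mBIC", "eBIC")
)

# A user-defined vector: keep AIC and BIC, add one custom factor.
# Naming the entries keeps the resulting output readable.
custom_penalty <- c(default_penalty[c("AIC", "BIC")], heavy = 3 * log(n))

print(round(default_penalty, 3))
print(names(custom_penalty))
```

A named vector such as custom_penalty has the shape described for the penalty argument.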

`QuadrupenFit$plot()`

Plot method for QuadrupenFit

Usage

QuadrupenFit$plot(
  type = c("path", "criteria", "crossval", "stability"),
  log_scale = TRUE,
  labels = NULL
)

Arguments

type: the type of plot, either "path" for regularization path; "criteria" for BIC-like information criteria; "crossval" for cross-validation plot; and "stability" for stability path.
log_scale: logical; indicates if a log-scale should be used when xvar="lambda". Default is TRUE.
labels: vector indicating the names associated to the plotted variables. When specified, a legend is drawn in order to identify each variable. Only relevant when the number of predictors is small. Note that the intercept does not count. Default is NULL.

`QuadrupenFit$plot_path()`

Produce a plot of the solution path of a QuadrupenFit object.

Usage

QuadrupenFit$plot_path(
  xvar = c("lambda", "fraction", "df"),
  log_scale = TRUE,
  title = paste("Path for", self$penalty),
  standardize = TRUE,
  labels = NULL
)

Arguments

xvar: variable to plot on the X-axis: either "lambda" ($\ell_1$ penalty level, or $\ell_2$ for ridge and $\ell_\infty$), "fraction" ($\ell_1$-norm of the coefficients) or "df" for estimated degrees of freedom. Default is set to "lambda".
log_scale: logical; indicates if a log-scale should be used when xvar="lambda". Default is TRUE.
title: the title. Default is set to the model name followed by what is on the Y-axis.
standardize: logical; standardize the coefficients before plotting (with the norm of the predictor). Default is TRUE.
labels: vector indicating the names associated to the plotted variables. When specified, a legend is drawn in order to identify each variable. Only relevant when the number of predictors is small. Note that the intercept does not count. Default is NULL.

Returns

a ggplot2 object.

Examples

## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
library(Matrix)    ## provides bdiag()
library(quadrupen)
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
cor <- 0.75
Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables
Sww  <- matrix(cor,10,10) ## block correlation between active variables
Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo)
diag(Sigma) <- 1
n <- 50
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)

## Plot the Lasso path
plot(lasso(x,y), title="Lasso solution path")
## Plot the Elastic-net path
plot(elastic_net(x,y), title = "Elastic-net solution path")
## Plot the Elastic-net path (fraction on X-axis, unstandardized coefficient)
plot(elastic_net(x,y, lambda2=10), standardize=FALSE, xvar="fraction")
## Plot the Bounded regression path (fraction on X-axis)
plot(bounded_reg(x,y, lambda2=10), xvar="fraction")

`QuadrupenFit$clone()`

The objects of this class are cloneable with this method.

Usage

QuadrupenFit$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.
