Class "QuadrupenFit"
Details
Class of object returned by any fitting function of the
quadrupen package (elastic.net or
bounded.reg).
This class comes with the usual predict(), fitted(), coef(),
residuals(), show(), print() and deviance() S3 methods.
Specific R6 methods are available for model extraction (QuadrupenFit$get_model()),
cross-validation (QuadrupenFit$cross_validate()), stability selection
(QuadrupenFit$stability()), criteria derivation (QuadrupenFit$criteria())
and plotting (QuadrupenFit$plot()). Equivalent S3 methods are provided: cross_validate(),
stability() and plot().
The "path" plot is available as soon as a fit has been performed.
For the other plot types, the corresponding post-treatment must first have been run via
QuadrupenFit$criteria(), QuadrupenFit$cross_validate() or
QuadrupenFit$stability().
All plot functions are called with their default arguments, except for labels and log_scale.
If you need more control, please use the dedicated methods: QuadrupenFit$plot_path(),
InformationCriteria$plot(), CrossValidation$plot(),
StabilityPath$plot() or the corresponding S3 methods.
Plot method for regularization path
See also
InformationCriteria, CrossValidation and StabilityPath
cross_validate()
Stability selection for Quadrupen object
stability()
Penalized criteria based on estimation of degrees of freedom
Active bindings
nvar: number of coefficients (without intercept)
nobs: sample size
dataModel: an object with class DataModel storing the data
major_tuning: vector of "leading" tuning parameters (either l1, linf or l2)
minor_tuning: vector of "minor" tuning parameters (either l1 or l2)
optim_monitoring: list monitoring the optimization
optim_config: list with low-level options used for optimization
fitted: matrix of fitted values, each column corresponding to a value of lambda1
coefficients: matrix (class "dgCMatrix") of coefficients with respect to the original input. The number of rows corresponds to the length of lambda1.
intercept: a vector containing the successive values of the (unpenalized) intercept. Equal to zero if intercept has been set to FALSE.
debias: logical; whether to rely on the debiased coefficients of the regularizer (if available)
residuals: matrix of residuals, each column corresponding to a value of lambda1
deviance: the model deviance
degrees_freedom: estimated degrees of freedom for the successive values of lambda1
r_squared: vector giving the coefficient of determination as a function of lambda1
information_criteria: object with class InformationCriteria storing various information criteria (AIC, BIC, GCV, etc.) for the current fit
cross_validation: object with class CrossValidation storing the output of the CV job. Only available once method $cross_validate() has been called.
stability_path: object with class StabilityPath storing the output of stability selection. Only available once method $stability() has been called.
Methods
Method new()
Initialize a QuadrupenFit model
Usage
QuadrupenFit$new(data, intercept, regParam)
Arguments
data: a DataModel object
intercept: a logical; should an intercept be included in the model?
regParam: a list with two elements, a vector and a scalar, for the regularization
Method get_model()
Model extraction
Usage
QuadrupenFit$get_model(selection, type = c("coefficients", "penalty", "index"))
Method predict()
Predict response for new sample based on the current model
Method cross_validate()
Function that computes K-fold cross-validated error of a
quadrupen fit, possibly on a grid of lambda1, lambda2.
Arguments
K: integer indicating the number of folds. Default is 10.
folds: list of K vectors that describe the folds to use for the cross-validation. By default, the folds are randomly sampled with the specified K. The same folds are used for each value of lambda2.
lambda2: tunes the \(\ell_2\)-penalty (ridge-like) of the fit. If none is provided, a vector of values is generated and a CV is performed on a grid of lambda2 and lambda1, using the same folds for each lambda2.
verbose: logical; indicates if the progression (the current lambda2) should be displayed. Default is TRUE.
cores: the number of cores to use. The default uses all the cores available.
Returns
an object with class CrossValidation is sent back and stored as a field of the original QuadrupenFit object.
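The folds argument is described as a list of K index vectors. As a minimal sketch in base R of what such a list could look like, the helper below (make_folds, a hypothetical name that is not part of quadrupen) shuffles the observation indices and deals them into K groups of near-equal size. How the package samples its default folds internally is not stated here, so this is only one plausible construction.

```r
## Hypothetical helper (not part of quadrupen): build K random folds
make_folds <- function(n, K = 10) {
  # shuffle 1..n, then deal the indices into K groups of near-equal size
  unname(split(sample(n), rep(seq_len(K), length.out = n)))
}

set.seed(1)
folds <- make_folds(n = 50, K = 10)
length(folds)   # number of folds
lengths(folds)  # number of observations held out in each fold
```

Assuming the argument names listed above, such a list would be supplied through the folds argument of cross_validate(), which lets the same partition be reused across several fits.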
Method stability()
Compute the stability path of a (possibly randomized) fitting procedure as introduced by Meinshausen and Buhlmann (2010).
Arguments
n_subsamples: integer indicating the number of subsamplings used to estimate the selection probabilities. Default is 100.
subsample_size: integer indicating the size of each subsample. Default is floor(n/2).
subsamples: list with n_subsamples entries, each a vector describing the observations to use in one run of the stability procedure. By default, they are randomly sampled according to the n_subsamples and subsample_size arguments.
weakness: coefficient used for randomizing the weights of each feature. Default is 1 for no randomization. See details below.
verbose: logical; indicates if the progression should be displayed. Default is TRUE.
cores: the number of cores to use. The default uses all the cores available.
Returns
an object with class StabilityPath is sent back and stored as a field of the original QuadrupenFit object.
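To make the subsamples and weakness arguments more concrete, here is a minimal base R sketch. The helper make_subsamples is a hypothetical name, not a quadrupen function. The randomization scheme (one weight per feature drawn uniformly on [weakness, 1]) is an assumption in the spirit of the randomized lasso of Meinshausen and Buhlmann, not a statement of what the package does internally.

```r
## Hypothetical helper (not part of quadrupen): draw random subsamples
make_subsamples <- function(n, n_subsamples = 100,
                            subsample_size = floor(n / 2)) {
  # each entry holds the row indices of one subsample, drawn without replacement
  replicate(n_subsamples, sample(n, subsample_size), simplify = FALSE)
}

set.seed(1)
subsamples <- make_subsamples(n = 50)

## Assumed randomization scheme: one weight per feature, uniform on [weakness, 1]
p <- 95
weakness <- 1
weights <- runif(p, min = weakness, max = 1)  # all equal to 1 when weakness = 1
```

With weakness = 1 the weights are all equal to one, which matches the "no randomization" default described above.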
Method criteria()
Produce a plot or send back the values of some penalized criteria accompanied with the vector(s) of parameters selected accordingly. The default behavior plots the BIC and the AIC (with respective factor \(\log(n)\) and \(2\)) yet the user can specify any penalty.
Arguments
penalty: a vector with as many penalties as desired. The default contains the penalties corresponding to the AIC and the BIC (\(2\) and \(\log(n)\)). Setting the "names" attribute, as done in the default definition, leads to outputs which are easier to read.
sigma: scalar; an estimate of the residual variance. When available, it is plugged into the criteria, which may be more relevant. If NULL (the default), it is estimated as usual (see details).
Returns
an object with class InformationCriteria is sent back and stored as a field of the original QuadrupenFit object.
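As an illustration of how a named penalty vector turns into one criterion per name, the sketch below computes penalized criteria of the form "fit term + penalty * degrees of freedom" along a toy path. The helper penalized_criteria is hypothetical, and the Gaussian fit terms used (n * log(RSS / n) when sigma is NULL, RSS / sigma otherwise, with sigma treated as a variance as described above) are common textbook choices assumed for the example. The exact formula used by the package is not given in this page.

```r
## Hypothetical helper (not part of quadrupen)
penalized_criteria <- function(rss, df, n,
                               penalty = setNames(c(2, log(n)), c("AIC", "BIC")),
                               sigma = NULL) {
  # goodness-of-fit term (assumed Gaussian form); sigma is a variance estimate
  fit_term <- if (is.null(sigma)) n * log(rss / n) else rss / sigma
  # one column per penalty; column names come from the "names" attribute
  sapply(penalty, function(pen) fit_term + pen * df)
}

## toy path: the residual sum of squares decreases as degrees of freedom grow
rss <- c(100, 60, 45, 40, 39)
df  <- c(0, 1, 2, 3, 4)
crit <- penalized_criteria(rss, df, n = 50)
apply(crit, 2, which.min)  # position along the path selected by each criterion
```

Because the penalty vector carries names, the resulting matrix has readable column labels, which is the point made about the "names" attribute above.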
Method plot()
Plot method for QuadrupenFit
Usage
QuadrupenFit$plot(
type = c("path", "criteria", "crossval", "stability"),
log_scale = TRUE,
labels = NULL
)
Arguments
type: the type of plot, either "path" for the regularization path; "criteria" for BIC-like information criteria; "crossval" for the cross-validation plot; or "stability" for the stability path.
log_scale: logical; indicates if a log-scale should be used when xvar="lambda". Default is TRUE.
labels: vector indicating the names associated with the plotted variables. When specified, a legend is drawn in order to identify each variable. Only relevant when the number of predictors is small. Remember that the intercept does not count. Default is NULL.
Method plot_path()
Produce a plot of the solution path of a QuadrupenFit object.
Arguments
xvar: variable to plot on the X-axis: either "lambda" (\(\ell_1\) penalty level, or \(\ell_2\) for ridge and \(\ell_\infty\)), "fraction" (\(\ell_1\)-norm of the coefficients) or "df" for the estimated degrees of freedom. Default is set to "lambda".
log_scale: logical; indicates if a log-scale should be used when xvar="lambda". Default is TRUE.
title: the title. Default is set to the model name followed by what is on the Y-axis.
standardize: logical; standardize the coefficients before plotting (with the norm of the predictor). Default is TRUE.
labels: vector indicating the names associated with the plotted variables. When specified, a legend is drawn in order to identify each variable. Only relevant when the number of predictors is small. Remember that the intercept does not count. Default is NULL.
Examples
## ------------------------------------------------
## Method `QuadrupenFit$plot_path`
## ------------------------------------------------
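if (FALSE) { # \dontrun{
library(Matrix)    ## provides bdiag()
library(quadrupen) ## provides lasso(), elastic.net() and bounded.reg()
## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters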
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
cor <- 0.75
Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables
Sww <- matrix(cor,10,10) ## block correlation between active variables
Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo)
diag(Sigma) <- 1
n <- 50
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)
## Plot the Lasso path
plot(lasso(x,y), title="Lasso solution path")
## Plot the Elastic-net path
plot(elastic.net(x,y), title = "Elastic-net solution path")
## Plot the Elastic-net path (fraction on X-axis, unstandardized coefficient)
plot(elastic.net(x,y, lambda2=10), standardize=FALSE, xvar="fraction")
## Plot the Bounded regression path (fraction on X-axis)
plot(bounded.reg(x,y, lambda2=10), xvar="fraction")
} # }