SkorchRegressor
- class skactiveml.regressor.SkorchRegressor(module, criterion=<class 'torch.nn.modules.loss.MSELoss'>, forward_outputs=None, criterion_output_keys=None, neural_net_param_dict=None, sample_dtype=<class 'numpy.float32'>, include_unlabeled_samples=False, missing_label=nan, random_state=None)
Bases: SkactivemlRegressor, SkorchMixin
Regression wrapper that makes it possible to use torch modules with skactiveml. It wraps torch behind a skactiveml interface that can handle missing labels. The wrapper is based on the open-source library skorch [1].
- Parameters:
- module : torch.nn.Module.__class__ or torch.nn.Module
A PyTorch torch.nn.Module. In general, the uninstantiated class should be passed, although instantiated modules will also work.
- criterion : torch.nn.Module or torch.nn.Module.__class__, default=torch.nn.MSELoss
The loss (criterion) used to optimize the module.
If a class (subclass of torch.nn.Module) is passed (e.g. torch.nn.MSELoss), it is instantiated internally.
If an instance is passed (e.g. torch.nn.MSELoss()), that instance (or a wrapped copy of it) is used.
By default, torch.nn.MSELoss is used as criterion.
- forward_outputs : dict[str, tuple[int, Callable | None]] or None, default=None
Dictionary that describes how to get and post-process the outputs of module.forward for prediction. This parameter replaces the functionality of predict_nonlinearity in a skorch.net.NeuralNet (see documentation of neural_net_param_dict).
Given raw_outputs = module.forward(x), each entry name -> (idx, transform) in forward_outputs is interpreted as:
idx : int
    Index into raw_outputs (0-based).
transform : callable or None
    If not None, it is applied to the selected raw tensor raw_outputs[idx]. Otherwise, the raw tensor is used.
This allows multiple named outputs to reference the same raw tensor with different transforms, for example:
    forward_outputs = {
        "raw-pred": (0, None),       # raw predicted targets
        "log-pred": (0, torch.log),  # log predicted targets
        "emb": (1, None),            # embeddings
    }
The first entry in forward_outputs defines the primary scores used for prediction:
In predict, the transformed first output is interpreted as predicted targets.
If forward_outputs is None, a sensible default is chosen for common single-output regressors based on the criterion:
If criterion is torch.nn.MSELoss, torch.nn.L1Loss, or torch.nn.SmoothL1Loss, it is assumed that module.forward returns the regression predictions directly and the effective mapping is:
    {"output": (0, torch.ravel)}
For all other criteria, a single-output module is assumed to already produce values in the target space, and the effective mapping is:
    {"output": (0, None)}
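The mapping semantics described for forward_outputs can be illustrated with a minimal pure-Python sketch. The helper apply_forward_outputs is hypothetical and not part of skactiveml; plain lists stand in for tensors and math.log stands in for torch.log, so this only mirrors the documented behavior rather than the library's implementation.

```python
import math


def apply_forward_outputs(raw_outputs, forward_outputs):
    """Hypothetical helper: select raw_outputs[idx] and apply the optional transform."""
    results = {}
    for name, (idx, transform) in forward_outputs.items():
        selected = raw_outputs[idx]
        results[name] = selected if transform is None else transform(selected)
    return results


# Stand-in for the tuple returned by module.forward(x): (predictions, embeddings).
raw_outputs = ([1.0, math.e], [[0.1, 0.2], [0.3, 0.4]])

forward_outputs = {
    "raw-pred": (0, None),
    "log-pred": (0, lambda values: [math.log(v) for v in values]),
    "emb": (1, None),
}

outputs = apply_forward_outputs(raw_outputs, forward_outputs)

# The first entry of the mapping defines what predict treats as predicted targets.
primary_name = next(iter(forward_outputs))
y_pred = outputs[primary_name]
```

Because dictionaries preserve insertion order, the first key ("raw-pred" here) plays the role of the primary output, while "log-pred" reuses the same raw index with a different transform.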
- criterion_output_keys : str or sequence of str or None, default=None
Name or names of the forward outputs that are passed to the loss / criterion during training. Use this when module.forward returns multiple outputs (e.g. (logits, embeddings, …)), but the criterion expects a single tensor input or a specific tuple of inputs.
The names must refer to keys of the effective forward_outputs mapping. If criterion_output_keys is not None and forward_outputs is None, a ValueError is raised because the names cannot be resolved.
If a str, the corresponding named output of module.forward (i.e., the raw tensor selected via its index in forward_outputs before applying the transform) is passed to the criterion (e.g. “raw-pred” to use only the raw predicted targets).
If a sequence of str, the selected named outputs are passed to the criterion in that order. Each raw forward output index may appear at most once: using multiple names that resolve to the same underlying index (e.g. “raw-pred” and “log-pred” both pointing to index 0) is not allowed and results in a ValueError.
If None, the first output defined by the effective forward_outputs mapping is used as criterion input.
To pass all distinct forward outputs to the criterion in the same order as forward_outputs, choose one representative name per raw output index and set, for example:
    # assuming that each key refers to a different raw index
    criterion_output_keys = tuple(forward_outputs.keys())
If forward_outputs contains multiple names that refer to the same raw output index (aliases such as “raw-pred” and “log-pred” both mapping to index 0), you must select at most one name per raw index in criterion_output_keys.
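The resolution rules for criterion_output_keys can be sketched as follows. The function resolve_criterion_indices is a hypothetical illustration, not the library's code; it assumes the effective forward_outputs mapping is already available, and raising a ValueError for an unknown name is an assumption of this sketch.

```python
def resolve_criterion_indices(forward_outputs, criterion_output_keys=None):
    """Hypothetical helper: map criterion_output_keys to raw output indices."""
    if criterion_output_keys is None:
        # Default: the first output of the effective mapping is the criterion input.
        first_name = next(iter(forward_outputs))
        return [forward_outputs[first_name][0]]
    if isinstance(criterion_output_keys, str):
        criterion_output_keys = [criterion_output_keys]
    indices = []
    for name in criterion_output_keys:
        if name not in forward_outputs:
            # Assumption of this sketch: unknown names are rejected.
            raise ValueError(f"Unknown forward output name: {name!r}")
        idx = forward_outputs[name][0]
        if idx in indices:
            # Aliases of the same raw index must not be combined.
            raise ValueError(f"Raw output index {idx} is selected more than once.")
        indices.append(idx)
    return indices


forward_outputs = {
    "raw-pred": (0, None),
    "log-pred": (0, None),
    "emb": (1, None),
}

default_indices = resolve_criterion_indices(forward_outputs)
single_index = resolve_criterion_indices(forward_outputs, "emb")
ordered_indices = resolve_criterion_indices(forward_outputs, ["emb", "raw-pred"])
```

Passing ["raw-pred", "log-pred"] to this helper raises a ValueError, since both names resolve to raw index 0.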
- neural_net_param_dict : dict or None, default=None
Additional arguments for skorch.net.NeuralNet. If neural_net_param_dict is None, no additional arguments are added.
- sample_dtype : str or type or None, default=np.float32
Dtype to which input samples are cast inside the estimator. If set to None, the input dtype is preserved. The label data type is always cast to np.float32.
- include_unlabeled_samples : bool, default=False
If False, only labeled samples are passed to the fit method of the estimator.
If True, all samples including the unlabeled ones are passed to the fit method of the estimator. Ensure that the criterion is able to handle unlabeled samples marked by missing_label. Otherwise, missing_label is interpreted as a regular target value.
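As an illustration of what "able to handle unlabeled samples" can mean, here is a minimal pure-Python sketch of a mean squared error that skips targets equal to the missing label (NaN). The name masked_mse is hypothetical; an actual criterion would be a torch.nn.Module operating on tensors, and returning 0.0 when no labeled sample is present is an assumption of this sketch.

```python
import math


def masked_mse(y_pred, y_true):
    """Mean squared error over labeled samples only; NaN targets are skipped."""
    squared_errors = [
        (p - t) ** 2 for p, t in zip(y_pred, y_true) if not math.isnan(t)
    ]
    if not squared_errors:
        # Assumption: a batch without labeled samples contributes no loss.
        return 0.0
    return sum(squared_errors) / len(squared_errors)


y_true = [1.0, float("nan"), 3.0]  # the second sample is unlabeled
y_pred = [1.5, 10.0, 2.0]
loss = masked_mse(y_pred, y_true)
```

Without such masking, the NaN target would enter the loss like any other value and propagate NaN through the average.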
- missing_label : scalar or string or np.nan or None, default=np.nan
Value to represent a missing label.
- random_state : int or RandomState instance or None, default=None
Determines random number generation for the predict method. Pass an int for reproducible results across multiple method calls.
Notes
Adjust your criterion and module.forward outputs consistently. See the documentation of the parameters forward_outputs and criterion_output_keys for further details.
References
[1] Marian Tietz, Thomas J. Fan, Daniel Nouri, Benjamin Bossan, and skorch Developers. skorch: A scikit-learn compatible neural network library that wraps PyTorch, July 2017.
Methods
fit(X, y, **fit_params): Initialize and fit the module.
get_metadata_routing(): Get metadata routing of this object.
get_params([deep]): Get parameters for this estimator.
initialize([X, y, enforce_check_X_y]): Initialize the wrapper and (optionally) validate inputs.
partial_fit(X, y, **fit_params): Fit the module without re-initialization.
predict(X[, extra_outputs]): Return predicted targets for the test data X.
score(X, y[, sample_weight]): Return coefficient of determination on test data.
set_fit_request(*[, sample_weight]): Configure whether metadata should be requested to be passed to the fit method.
set_params(**params): Set the parameters of this estimator.
set_predict_request(*[, extra_outputs]): Configure whether metadata should be requested to be passed to the predict method.
set_score_request(*[, sample_weight]): Configure whether metadata should be requested to be passed to the score method.
- fit(X, y, **fit_params)
Initialize and fit the module.
If the module was already initialized, calling fit re-initializes it (unless warm_start is True).
- Parameters:
- X : matrix-like of shape (n_samples, n_features)
Training data set, usually complete, i.e., including the labeled and unlabeled samples.
- y : array-like of shape (n_samples,)
Labels of the training data set (possibly including unlabeled ones indicated by self.missing_label).
- fit_params : dict-like
Further parameters as input to the fit method of the skorch.net.NeuralNet.
- Returns:
- self : SkorchRegressor
SkorchRegressor fitted on the training data.
- get_metadata_routing()
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routing : MetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
- initialize(X=None, y=None, enforce_check_X_y=False)
Initialize the wrapper and (optionally) validate inputs.
If any data is provided or enforce_check_X_y is True, inputs are validated via _validate_data. A new skorch.NeuralNet is then created and assigned to self.neural_net_.
- Parameters:
- X : array-like of shape (n_samples, …), default=None
Input samples for optional validation.
- y : array-like of shape (n_samples, …), default=None
Target values for optional validation.
- enforce_check_X_y : bool, default=False
Whether to validate even if both X and y are None.
- Returns:
- self : SkorchMixin
Returned when no input data was supplied (both X and y are None).
- X_out, y_out : tuple of numpy.ndarray, optional
Validated X and y as a tuple, returned when enforce_check_X_y=True.
- partial_fit(X, y, **fit_params)
Fit the module without re-initialization.
If the module was already initialized, calling partial_fit does not re-initialize it.
- Parameters:
- X : matrix-like of shape (n_samples, n_features)
Training data set, usually complete, i.e., including the labeled and unlabeled samples.
- y : array-like of shape (n_samples,)
Labels of the training data set (possibly including unlabeled ones indicated by self.missing_label).
- fit_params : dict-like
Further parameters as input to the partial_fit method of the skorch.net.NeuralNet.
- Returns:
- self : SkorchRegressor
SkorchRegressor object fitted on the training data.
- predict(X, extra_outputs=None)
Return predicted targets for the test data X.
By default, this method returns only the predicted targets y_pred. If extra_outputs is provided, a tuple is returned whose first element is y_pred and whose remaining elements are the requested additional forward outputs, in the order specified by extra_outputs.
- Parameters:
- X : array-like of shape (n_samples, …)
Test samples.
- extra_outputs : None or str or sequence of str, default=None
Names of additional outputs to return next to y_pred. The names must be a subset of the keys of the effective forward_outputs mapping.
For example, if:
    self.forward_outputs = {
        "raw-pred": (0, None),
        "log-pred": (0, None),
        "emb": (1, None),
    }
then valid values for extra_outputs include “emb” or [“emb”, “log-pred”].
If extra_outputs is None, only y_pred is returned.
If extra_outputs is a string, e.g. “emb”, the return value is (y_pred, emb).
If extra_outputs is a sequence of strings, the return value is (y_pred, out_1, out_2, …), where out_i corresponds to the i-th name in extra_outputs.
- Returns:
- y_pred : numpy.ndarray of shape (n_samples,)
Predicted targets of the test samples.
- *extras : numpy.ndarray, optional
Additional outputs. Only present if extra_outputs is not None. In that case, the method returns a single tuple whose first element is y_pred and whose remaining elements (extras) correspond to the requested forward outputs in the order given by extra_outputs.
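The return shapes of predict described above can be summarized in a small pure-Python sketch. The function assemble_predict_output and the outputs dictionary are hypothetical stand-ins for the already transformed forward outputs; lists replace numpy arrays, and this is not the library's implementation.

```python
def assemble_predict_output(outputs, primary_name, extra_outputs=None):
    """Hypothetical helper mirroring the documented return value of predict."""
    y_pred = outputs[primary_name]
    if extra_outputs is None:
        # Only the predicted targets are returned.
        return y_pred
    if isinstance(extra_outputs, str):
        extra_outputs = [extra_outputs]
    # A single tuple: y_pred first, then the extras in the requested order.
    return (y_pred, *(outputs[name] for name in extra_outputs))


# Stand-in for transformed forward outputs of two test samples.
outputs = {
    "raw-pred": [0.5, 1.5],
    "log-pred": [-0.69, 0.41],
    "emb": [[0.1], [0.2]],
}

only_pred = assemble_predict_output(outputs, "raw-pred")
pred_and_emb = assemble_predict_output(outputs, "raw-pred", "emb")
pred_emb_log = assemble_predict_output(outputs, "raw-pred", ["emb", "log-pred"])
```

Callers therefore need to unpack the result only when extra_outputs is given, for example y_pred, emb = reg.predict(X, extra_outputs="emb") for a fitted regressor reg.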
- score(X, y, sample_weight=None)
Return coefficient of determination on test data.
The coefficient of determination, \(R^2\), is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
- y : array-like of shape (n_samples,) or (n_samples, n_outputs)
True values for X.
- sample_weight : array-like of shape (n_samples,), default=None
Sample weights.
- Returns:
- score : float
\(R^2\) of self.predict(X) w.r.t. y.
Notes
The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of r2_score(). This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).
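The definition of \(R^2\) given for score can be checked by hand with a short pure-Python sketch. The function r2_sketch is a hypothetical name; it covers the unweighted, single-output case only and assumes y_true is not constant, so that \(v\) is non-zero.

```python
def r2_sketch(y_true, y_pred):
    """Unweighted single-output R^2 = 1 - u / v, assuming y_true is not constant."""
    mean_true = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    v = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return 1.0 - u / v


y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r2_sketch(y_true, [1.0, 2.0, 3.0, 4.0])   # u = 0
constant = r2_sketch(y_true, [2.5, 2.5, 2.5, 2.5])  # always predicts the mean, u = v
reversed_ = r2_sketch(y_true, [4.0, 3.0, 2.0, 1.0])  # worse than the constant model
```

The three cases correspond to the statements above: a perfect fit gives 1.0, the mean predictor gives 0.0, and a model worse than the mean predictor gives a negative value.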
- set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> SkorchRegressor
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for sample_weight parameter in fit.
- Returns:
- self : object
The updated object.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
- set_predict_request(*, extra_outputs: bool | None | str = '$UNCHANGED$') -> SkorchRegressor
Configure whether metadata should be requested to be passed to the predict method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- extra_outputs : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for extra_outputs parameter in predict.
- Returns:
- self : object
The updated object.
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> SkorchRegressor
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for sample_weight parameter in score.
- Returns:
- self : object
The updated object.