SkorchClassifier

class skactiveml.classifier.SkorchClassifier(module, criterion=<class 'torch.nn.modules.loss.CrossEntropyLoss'>, forward_outputs=None, criterion_output_keys=None, neural_net_param_dict=None, sample_dtype=<class 'numpy.float32'>, include_unlabeled_samples=False, classes=None, cost_matrix=None, missing_label=nan, random_state=None)

Bases: SkactivemlClassifier, SkorchMixin

Classification wrapper that makes PyTorch modules usable within skactiveml. It exposes a torch module through the skactiveml classifier interface and can handle missing labels. The wrapper is based on the open-source library skorch [1].

Parameters:

module : torch.nn.Module.__class__ or torch.nn.Module

A PyTorch torch.nn.Module. In general, the uninstantiated class should be passed, although instantiated modules will also work.

criterion : torch.nn.Module or torch.nn.Module.__class__, default=torch.nn.CrossEntropyLoss

The loss (criterion) used to optimize the module.

If a class (subclass of torch.nn.Module) is passed (e.g. torch.nn.CrossEntropyLoss), it is instantiated internally.
If an instance is passed (e.g. torch.nn.CrossEntropyLoss()), that instance (or a wrapped copy of it) is used.

By default, torch.nn.CrossEntropyLoss is used as criterion.

forward_outputs : dict[str, tuple[int, Callable | None]] or None, default=None

Dictionary that describes how to get and post-process the outputs of module.forward for prediction. This parameter replaces the functionality of predict_nonlinearity in a skorch.net.NeuralNet (see documentation of neural_net_param_dict).

Given raw_outputs = module.forward(x), each entry name -> (idx, transform) in forward_outputs is interpreted as:

idx : int: Index into raw_outputs (0-based).
transform : callable or None: If not None, it is applied to the selected raw tensor raw_outputs[idx]. Otherwise, the raw tensor is used.

This allows multiple named outputs to reference the same raw tensor with different transforms, for example:

forward_outputs = {
    "proba":  (0, torch.nn.Softmax(dim=-1)),  # probabilities
    "logits": (0, None),                      # raw scores
    "emb":    (1, None),                      # embeddings
}

The first entry in forward_outputs defines the primary scores used for prediction:

In predict_proba, the transformed first output is interpreted as class probabilities P.
In predict, the class probabilities P returned by predict_proba are used to infer class label predictions.
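The selection-and-transform rule above can be sketched in plain Python. This is only an illustration of the documented semantics, not the library's implementation: `softmax` and `resolve_outputs` are hypothetical helpers, and nested lists stand in for tensors.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def resolve_outputs(raw_outputs, forward_outputs):
    """Apply each name -> (idx, transform) entry to the raw forward outputs."""
    resolved = {}
    for name, (idx, transform) in forward_outputs.items():
        selected = raw_outputs[idx]
        resolved[name] = selected if transform is None else transform(selected)
    return resolved


# Stand-in for module.forward(x): (logits, embeddings) for two samples.
raw_outputs = (
    [[2.0, 0.0], [0.0, 0.0]],
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
)
forward_outputs = {
    "proba": (0, lambda rows: [softmax(r) for r in rows]),  # probabilities
    "logits": (0, None),                                     # raw scores
    "emb": (1, None),                                        # embeddings
}

outputs = resolve_outputs(raw_outputs, forward_outputs)
# The first entry of the mapping defines the primary scores.
primary_name = next(iter(forward_outputs))
```

Because Python dictionaries preserve insertion order, the first key is well defined, which is why the order of entries in forward_outputs matters.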

If forward_outputs is None, a sensible default is chosen for common single-output classifiers based on the criterion:

If criterion is torch.nn.CrossEntropyLoss, it is assumed that module.forward returns logits and the effective mapping is:
```
{"proba": (0, torch.nn.Softmax(dim=-1))}
```
If criterion is torch.nn.NLLLoss, it is assumed that module.forward returns log-probabilities and the effective mapping is:
```
{"proba": (0, torch.exp)}
```
For all other criteria, a single-output module is assumed to already produce values in probability space, and the effective mapping is:
```
{"proba": (0, None)}
```
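The three default cases can be pictured with a small stand-alone sketch. It assumes, purely for illustration, that the criterion is identified by its class name as a string; `default_forward_outputs`, `softmax`, and `exp_row` are hypothetical helpers operating on a single row of scores rather than on tensors.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def exp_row(row):
    """Element-wise exponential, turning log-probabilities into probabilities."""
    return [math.exp(v) for v in row]


def default_forward_outputs(criterion_name):
    """Return the effective mapping for a criterion class name (hypothetical helper)."""
    if criterion_name == "CrossEntropyLoss":
        return {"proba": (0, softmax)}  # module is assumed to return logits
    if criterion_name == "NLLLoss":
        return {"proba": (0, exp_row)}  # module is assumed to return log-probabilities
    return {"proba": (0, None)}         # module is assumed to return probabilities


logits = [1.0, 2.0, 3.0]
log_probs = [math.log(p) for p in softmax(logits)]

_, from_logits = default_forward_outputs("CrossEntropyLoss")["proba"]
_, from_log_probs = default_forward_outputs("NLLLoss")["proba"]
_, passthrough = default_forward_outputs("MSELoss")["proba"]

proba_a = from_logits(logits)
proba_b = from_log_probs(log_probs)
```

Both defaults yield the same probabilities when the module output matches the criterion's expectation, which is the consistency the Notes section asks for.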

criterion_output_keys : str or sequence of str or None, default=None

Name or names of the forward outputs that are passed to the loss / criterion during training. Use this when module.forward returns multiple outputs (e.g. (logits, embeddings, …)), but the criterion expects a single tensor input or a specific tuple of inputs.

The names must refer to keys of the effective forward_outputs mapping. If criterion_output_keys is not None and forward_outputs is None, a ValueError is raised because the names cannot be resolved.

If a str, the corresponding named output of module.forward (i.e., the raw tensor selected via its index in forward_outputs before applying the transform) is passed to the criterion (e.g. “logits” to use only the class scores).
If a sequence of str, the selected named outputs are passed to the criterion in that order. Each raw forward output index may appear at most once: using multiple names that resolve to the same underlying index (e.g. “proba” and “logits” both pointing to index 0) is not allowed and results in a ValueError.
If None, the first output defined by the effective forward_outputs mapping is used as criterion input.

To pass all distinct forward outputs to the criterion in the same order as forward_outputs, choose one representative name per raw output index and set, for example:

# assuming that each key refers to a different raw index
criterion_output_keys = tuple(forward_outputs.keys())

If forward_outputs contains multiple names that refer to the same raw output index (aliases such as “proba” and “logits” both mapping to index 0), you must select at most one name per raw index in criterion_output_keys.
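The resolution rules for criterion_output_keys can be summarized in a short sketch. It is not the library's code: `resolve_criterion_indices` is a hypothetical helper, and two details are assumptions made here, namely that an unknown name raises a ValueError and that raw index 0 is used when both parameters are None.

```python
def resolve_criterion_indices(forward_outputs, criterion_output_keys=None):
    """Map criterion_output_keys to raw output indices, following the documented rules."""
    if forward_outputs is None:
        if criterion_output_keys is not None:
            # Names cannot be resolved without an explicit mapping.
            raise ValueError("criterion_output_keys requires forward_outputs.")
        return [0]  # assumption: the default mappings only use raw output 0
    if criterion_output_keys is None:
        first_idx, _ = next(iter(forward_outputs.values()))
        return [first_idx]
    if isinstance(criterion_output_keys, str):
        keys = [criterion_output_keys]
    else:
        keys = list(criterion_output_keys)
    indices = []
    for key in keys:
        if key not in forward_outputs:
            # Assumption: unknown names are rejected with a ValueError.
            raise ValueError(f"Unknown forward output name: {key!r}")
        idx, _ = forward_outputs[key]
        if idx in indices:
            raise ValueError(f"Raw output index {idx} is selected more than once.")
        indices.append(idx)
    return indices


forward_outputs = {"proba": (0, None), "logits": (0, None), "emb": (1, None)}

default_indices = resolve_criterion_indices(forward_outputs)
single_index = resolve_criterion_indices(forward_outputs, "logits")
ordered_indices = resolve_criterion_indices(forward_outputs, ("emb", "logits"))

try:
    # "proba" and "logits" are aliases of raw index 0, so this is rejected.
    resolve_criterion_indices(forward_outputs, ("proba", "logits"))
    alias_error = None
except ValueError as exc:
    alias_error = str(exc)
```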

neural_net_param_dict : dict or None, default=None

Additional arguments for skorch.net.NeuralNet. If neural_net_param_dict is None, no additional arguments are added. module, criterion, and predict_nonlinearity are not allowed in this dictionary.

sample_dtype : str or type or None, default=np.float32

Dtype to which input samples are cast inside the estimator. If set to None, the input dtype is preserved. The encoded label data type is always np.int64.

include_unlabeled_samples : bool, default=False

If False, only labeled samples are passed to the fit method of the estimator.
If True, all samples including the unlabeled ones are passed to the fit method of the estimator. Ensure that the criterion is able to handle unlabeled samples marked by missing_label. Otherwise, missing_label is interpreted as a regular class label.
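The effect of this flag can be sketched without any dependencies. The helpers `is_missing` and `select_training_samples` are hypothetical and only mimic the documented behavior on plain lists; the real estimator works on arrays and tensors.

```python
import math


def is_missing(label, missing_label=float("nan")):
    """Check whether a label equals missing_label, treating NaN as equal to NaN."""
    if isinstance(missing_label, float) and math.isnan(missing_label):
        return isinstance(label, float) and math.isnan(label)
    return label == missing_label


def select_training_samples(X, y, include_unlabeled_samples=False,
                            missing_label=float("nan")):
    """Return the samples and labels that would be forwarded to training."""
    if include_unlabeled_samples:
        # Everything is passed on; the criterion must cope with missing_label.
        return list(X), list(y)
    kept = [(x, label) for x, label in zip(X, y)
            if not is_missing(label, missing_label)]
    return [x for x, _ in kept], [label for _, label in kept]


X = [[0.0], [1.0], [2.0]]
y = [0, float("nan"), 1]

X_labeled, y_labeled = select_training_samples(X, y)
X_all, y_all = select_training_samples(X, y, include_unlabeled_samples=True)
```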

classes : array-like of shape (n_classes,), default=None

Holds the label for each class. If None, the classes are determined during the fit.

missing_label : scalar or str or np.nan or None, default=np.nan

Value to represent a missing label.

cost_matrix : array-like of shape (n_classes, n_classes), default=None

Cost matrix with cost_matrix[i, j] indicating the cost of predicting class classes[j] for a sample of class classes[i]. Can only be set if classes is not None.
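To make the indexing convention concrete, the sketch below computes expected misclassification costs from class probabilities and picks the cheapest class. This page does not state how the classifier applies cost_matrix internally, so the minimum-expected-cost rule shown here is an assumption, and `expected_costs` and `min_cost_prediction` are hypothetical helpers.

```python
def expected_costs(proba, cost_matrix):
    """Expected cost of predicting each class j: sum_i proba[i] * cost_matrix[i][j]."""
    n_classes = len(cost_matrix)
    return [
        sum(proba[i] * cost_matrix[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]


def min_cost_prediction(proba, cost_matrix, classes):
    """Return the class label with the smallest expected cost."""
    costs = expected_costs(proba, cost_matrix)
    best = min(range(len(costs)), key=costs.__getitem__)
    return classes[best]


classes = ["healthy", "sick"]
# Row i is the true class, column j the predicted class:
# predicting "healthy" for a "sick" sample is five times as costly.
cost_matrix = [
    [0.0, 1.0],
    [5.0, 0.0],
]
proba = [0.7, 0.3]

costs = expected_costs(proba, cost_matrix)
prediction = min_cost_prediction(proba, cost_matrix, classes)
```

In this example the most probable class is "healthy", yet the asymmetric costs make "sick" the cheaper prediction.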

random_state : int or RandomState instance or None, default=None

Determines random number generation for methods that rely on randomness (e.g. predict for stochastic models). Pass an int for reproducible results across multiple method calls.

Notes

Adjust your criterion and module.forward outputs consistently. See the documentation of the parameters forward_outputs and criterion_output_keys for further details.

References

[1] Marian Tietz, Thomas J. Fan, Daniel Nouri, Benjamin Bossan, and skorch Developers. skorch: A scikit-learn compatible neural network library that wraps PyTorch, July 2017.

Methods

`fit`(X, y, **fit_params)	Initialize and fit the module.
`get_metadata_routing`()	Get metadata routing of this object.
`get_params`([deep])	Get parameters for this estimator.
`initialize`([X, y, enforce_check_X_y])	Initialize the wrapper and (optionally) validate inputs.
`partial_fit`(X, y, **fit_params)	Fit the module without re-initialization.
`predict`(X[, extra_outputs])	Return class predictions for the test samples X.
`predict_proba`(X[, extra_outputs])	Return class probability estimates for the test samples X.
`score`(X, y[, sample_weight])	Return the mean accuracy on the given test data and labels.
`set_fit_request`(*[, sample_weight])	Configure whether metadata should be requested to be passed to the `fit` method.
`set_params`(**params)	Set the parameters of this estimator.
`set_predict_proba_request`(*[, extra_outputs])	Configure whether metadata should be requested to be passed to the `predict_proba` method.
`set_predict_request`(*[, extra_outputs])	Configure whether metadata should be requested to be passed to the `predict` method.
`set_score_request`(*[, sample_weight])	Configure whether metadata should be requested to be passed to the `score` method.

fit(X, y, **fit_params)

Initialize and fit the module.

If the module was already initialized, calling fit re-initializes it (unless warm_start is True).

Parameters:

X : matrix-like, shape (n_samples, n_features): Training data set, usually complete, i.e. including the labeled and unlabeled samples.
y : array-like of shape (n_samples,): Labels of the training data set (possibly including unlabeled ones indicated by self.missing_label).
fit_params : dict-like: Further parameters as input to the ‘fit’ method of the skorch.net.NeuralNet.

Returns:

self : SkorchClassifier: SkorchClassifier object fitted on the training data.

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:

routing : MetadataRequest: A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True: If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict: Parameter names mapped to their values.

initialize(X=None, y=None, enforce_check_X_y=False)

Initialize the wrapper and (optionally) validate inputs.

If any data is provided or enforce_check_X_y is True, inputs are validated via _validate_data. A new skorch.NeuralNet is then created and assigned to self.neural_net_.

Parameters:

X : array-like of shape (n_samples, …), default=None: Input samples for optional validation.
y : array-like of shape (n_samples, …), default=None: Target values for optional validation.
enforce_check_X_y : bool, default=False: Whether to validate even if both X and y are None.

Returns:

self : SkorchMixin: Returned when no input data was supplied (both X and y are None).
X_out, y_out : tuple of numpy.ndarray, optional: Validated X and y as a tuple, returned when enforce_check_X_y=True.

partial_fit(X, y, **fit_params)

Fit the module without re-initialization.

If the module was already initialized, calling partial_fit does not re-initialize it; training continues from the current state.

Parameters:

X : matrix-like, shape (n_samples, n_features): Training data set, usually complete, i.e. including the labeled and unlabeled samples.
y : array-like of shape (n_samples,): Labels of the training data set (possibly including unlabeled ones indicated by self.missing_label).
fit_params : dict-like: Further parameters as input to the ‘partial_fit’ method of the skorch.net.NeuralNet.

Returns:

self : SkorchClassifier: SkorchClassifier object fitted on the training data.
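The contrast between fit and partial_fit can be pictured with a toy estimator. This is a plain-Python sketch, not SkorchClassifier itself: `ToyEstimator`, `n_initializations_`, and `n_updates_` are hypothetical names, and warm_start is deliberately ignored.

```python
class ToyEstimator:
    """Counts (re-)initializations to contrast fit with partial_fit."""

    def __init__(self):
        self.initialized_ = False
        self.n_initializations_ = 0
        self.n_updates_ = 0

    def initialize(self):
        # Reset all learned state, as creating a fresh network would.
        self.initialized_ = True
        self.n_initializations_ += 1
        self.n_updates_ = 0
        return self

    def fit(self, X, y):
        self.initialize()          # always start from scratch
        self.n_updates_ += 1
        return self

    def partial_fit(self, X, y):
        if not self.initialized_:  # initialize only on first use
            self.initialize()
        self.n_updates_ += 1       # continue from the current state
        return self


X, y = [[0.0], [1.0]], [0, 1]
incremental = ToyEstimator().partial_fit(X, y).partial_fit(X, y)
refit = ToyEstimator().fit(X, y).fit(X, y)
```

In an active learning loop, partial_fit keeps what was learned in earlier cycles, whereas fit discards it and retrains on the current labeled set.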

predict(X, extra_outputs=None)

Return class predictions for the test samples X.

By default, this method returns only the predicted classes y_pred. The predictions are derived from the class probabilities P returned by predict_proba. If extra_outputs is provided, a tuple is returned whose first element is y_pred and whose remaining elements are the requested additional forward outputs, in the order specified by extra_outputs.

Parameters:

X : array-like of shape (n_samples, …)

Test samples.

extra_outputs : None or str or sequence of str, default=None

Names of additional outputs to return next to y_pred. The names must be a subset of the keys of the effective forward_outputs mapping.

For example, if:

self.forward_outputs = {
    "proba":  (0, torch.nn.Softmax(dim=-1)),
    "logits": (0, None),
    "emb":    (1, None),
}

then valid values for extra_outputs include “emb” or [“emb”, “logits”].

If extra_outputs is None, only y_pred is returned.
If extra_outputs is a string, e.g. “emb”, the return value is (y_pred, emb).
If extra_outputs is a sequence of strings, the return value is (y_pred, out_1, out_2, …), where out_i corresponds to the i-th name in extra_outputs.

Returns:

y_pred : numpy.ndarray of shape (n_samples,): Predicted class labels of the test samples.
*extras : numpy.ndarray, optional: Additional outputs. Only present if extra_outputs is not None. In that case, the method returns a single tuple whose first element is y_pred and whose remaining elements (extras) correspond to the requested forward outputs in the order given by extra_outputs.
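The return convention for extra_outputs can be sketched as follows. The helpers `predict_from_proba` and `with_extras` are hypothetical, lists stand in for arrays, and the arg-max rule with ties resolved to the first class is an assumption that ignores any cost matrix.

```python
def predict_from_proba(P, classes):
    """Pick the most probable class per row (first maximum wins on ties)."""
    return [classes[max(range(len(row)), key=row.__getitem__)] for row in P]


def with_extras(primary, named_outputs, extra_outputs=None):
    """Return primary alone, or a tuple (primary, extra_1, extra_2, ...)."""
    if extra_outputs is None:
        return primary
    if isinstance(extra_outputs, str):
        names = [extra_outputs]
    else:
        names = list(extra_outputs)
    return (primary, *(named_outputs[name] for name in names))


classes = ["a", "b"]
named_outputs = {
    "proba": [[0.2, 0.8], [0.9, 0.1]],
    "logits": [[0.0, 1.4], [2.2, 0.0]],
    "emb": [[1.0], [2.0]],
}
y_pred = predict_from_proba(named_outputs["proba"], classes)

only_labels = with_extras(y_pred, named_outputs)
labels_and_emb = with_extras(y_pred, named_outputs, "emb")
labels_emb_logits = with_extras(y_pred, named_outputs, ["emb", "logits"])
```

The same convention applies to predict_proba, with P taking the place of y_pred as the first element.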

predict_proba(X, extra_outputs=None)

Return class probability estimates for the test samples X.

By default, this method returns only the predicted class probabilities P. If extra_outputs is provided, a tuple is returned whose first element is P and whose remaining elements are the requested additional forward outputs, in the order specified by extra_outputs.

Parameters:

X : array-like of shape (n_samples, …)

Test samples.

extra_outputs : None or str or sequence of str, default=None

Names of additional outputs to return next to P. The names must be a subset of the keys of the effective forward_outputs mapping.

For example, if:

self.forward_outputs = {
    "proba":  (0, torch.nn.Softmax(dim=-1)),
    "logits": (0, None),
    "emb":    (1, None),
}

then valid values for extra_outputs include “emb” or [“emb”, “logits”].

If extra_outputs is None, only P is returned.
If extra_outputs is a string, e.g. “logits”, the return value is (P, logits).
If extra_outputs is a sequence of strings, the return value is (P, out_1, out_2, …), where out_i corresponds to the i-th name in extra_outputs.

Returns:

P : numpy.ndarray of shape (n_samples, n_classes): Class probabilities of the test samples. Classes are ordered according to self.classes_.
*extras : numpy.ndarray, optional: Additional outputs. Only present if extra_outputs is not None. In that case, the method returns a single tuple whose first element is P and whose remaining elements (extras) correspond to the requested forward outputs in the order given by extra_outputs.

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

Parameters:

X : array-like of shape (n_samples, n_features): Test samples.
y : array-like of shape (n_samples,): True labels for X.
sample_weight : array-like of shape (n_samples,), default=None: Sample weights.

Returns:

score : float: Mean accuracy of self.predict(X) regarding y.
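As a reminder of what mean accuracy with sample weights means, here is a minimal sketch. `mean_accuracy` is a hypothetical stand-alone function and not the method's implementation; it computes the weighted fraction of correct predictions.

```python
def mean_accuracy(y_true, y_pred, sample_weight=None):
    """Weighted fraction of samples whose prediction matches the true label."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    correct = sum(w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p)
    return correct / sum(sample_weight)


y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]

unweighted = mean_accuracy(y_true, y_pred)
# The misclassified third sample counts double, the fourth is ignored.
weighted = mean_accuracy(y_true, y_pred, sample_weight=[1.0, 1.0, 2.0, 0.0])
```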

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → SkorchClassifier

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED: Metadata routing for sample_weight parameter in fit.

Returns:

self : object: The updated object.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params : dict: Estimator parameters.

Returns:

self : estimator instance: Estimator instance.

set_predict_proba_request(*, extra_outputs: bool | None | str = '$UNCHANGED$') → SkorchClassifier

Configure whether metadata should be requested to be passed to the predict_proba method.

The options for each parameter are:

True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict_proba.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

extra_outputs : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED: Metadata routing for extra_outputs parameter in predict_proba.

Returns:

self : object: The updated object.

set_predict_request(*, extra_outputs: bool | None | str = '$UNCHANGED$') → SkorchClassifier

Configure whether metadata should be requested to be passed to the predict method.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

extra_outputs : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED: Metadata routing for extra_outputs parameter in predict.

Returns:

self : object: The updated object.

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → SkorchClassifier

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED: Metadata routing for sample_weight parameter in score.

Returns:

self : object: The updated object.
