skactiveml.stream.StreamProbabilisticAL

class skactiveml.stream.StreamProbabilisticAL(budget_manager=None, budget=None, metric=None, metric_dict=None, random_state=None, prior=0.001, m_max=2)

Bases: SingleAnnotatorStreamQueryStrategy

Probabilistic Active Learning in Datastreams (StreamProbabilisticAL) is an extension of Multi-Class Probabilistic Active Learning (McPAL) (see pool.ProbabilisticAL). It uses McPAL to assess the spatial utility. The Balanced Incremental Quantile Filter (BalancedIncrementalQuantileFilter), which is implemented within the default budget manager, is used to evaluate the temporal utility (see stream.budgetmanager.BalancedIncrementalQuantileFilter).

Parameters
budget : float, optional (default=None)

The budget which models the budgeting constraint used in the stream-based active learning setting.

budget_manager : BudgetManager, optional (default=None)

The BudgetManager which models the budgeting constraint used in the stream-based active learning setting. If set to None, BalancedIncrementalQuantileFilter is used by default. The budget manager is initialized based on the following conditions:

- If only a budget is given, the default budget manager is initialized with the given budget.
- If only a budget manager is given, that budget manager is used.
- If neither is given, the default budget manager is initialized with the default budget.
- If both are given and the budget differs from budget_manager.budget, a warning is issued.

metric : str or callable, optional (default=None)

The metric must be None or a valid kernel as defined by the function sklearn.metrics.pairwise.pairwise_kernels. The kernel is used to calculate the frequency of labels near the candidates, which is multiplied by the probabilities returned by the clf to get a kernel frequency estimate for each class. If metric is set to None, the predict_freq function of the clf is used instead. If the clf does not define predict_freq, an Exception is raised.

metric_dict : dict, optional (default=None)

Any further parameters are passed directly to the kernel function. If metric_dict is None and metric is 'rbf', metric_dict is set to {'gamma': 'mean'}.

random_state : int, RandomState instance, optional (default=None)

Controls the randomness of the query strategy.

prior : float, optional (default=1.0e-3)

The prior value that is passed onto ProbabilisticAL (see pool.ProbabilisticAL).

m_max : float, optional (default=2)

The m_max value that is passed onto ProbabilisticAL (see pool.ProbabilisticAL).
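The four budget_manager initialization cases described above can be sketched in plain Python. This is only an illustration of the documented decision logic, not the library's code: `SketchBudgetManager`, `resolve_budget_manager`, and the `DEFAULT_BUDGET` value are hypothetical names and values introduced here. The text does not state which budget is used when both arguments disagree, so the sketch assumes the supplied manager is kept.

```python
import warnings

# Assumed placeholder value, not taken from the library.
DEFAULT_BUDGET = 0.1


class SketchBudgetManager:
    """Hypothetical stand-in for a budget manager exposing `budget`."""

    def __init__(self, budget=None):
        self.budget = DEFAULT_BUDGET if budget is None else budget


def resolve_budget_manager(budget=None, budget_manager=None):
    """Mirror the four documented (budget, budget_manager) cases."""
    if budget_manager is None:
        # Only a budget, or nothing at all: build the default manager.
        return SketchBudgetManager(budget=budget)
    if budget is not None and budget != budget_manager.budget:
        # Both given but inconsistent: warn the caller.
        warnings.warn("budget differs from budget_manager.budget")
    # A manager was supplied: use it (assumption for the conflicting case).
    return budget_manager
```

Calling `resolve_budget_manager(budget=0.2)` yields a default manager with budget 0.2, while passing an existing manager returns that same object.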

References

[1] Kottke, D., Krempl, G., & Spiliopoulou, M. (2015). Probabilistic Active Learning in Datastreams. In Advances in Intelligent Data Analysis XIV (pp. 145–157). Springer.

Methods

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

query(candidates, clf[, X, y, ...])

Ask the query strategy which instances in candidates to acquire.

set_params(**params)

Set the parameters of this estimator.

update(candidates, queried_indices[, ...])

Updates the budget manager.

get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
params : dict

Parameter names mapped to their values.

query(candidates, clf, X=None, y=None, sample_weight=None, fit_clf=False, utility_weight=None, return_utilities=False)

Ask the query strategy which instances in candidates to acquire.

Parameters
candidates : {array-like, sparse matrix} of shape (n_samples, n_features)

The instances which may be queried. Sparse matrices are accepted only if they are supported by the base query strategy.

clf : SkactivemlClassifier

Model implementing the methods fit and predict_proba. If self.metric is None, the clf must also implement predict_freq.

X : array-like of shape (n_samples, n_features), optional (default=None)

Input samples used to fit the classifier.

y : array-like of shape (n_samples,), optional (default=None)

Labels of the input samples X. There may be missing labels.

sample_weight : array-like of shape (n_samples,), optional (default=None)

Sample weights for X, used to fit the clf.

fit_clf : bool, optional (default=False)

If True, the classifier is refit before querying. This also requires X and y to be given.

utility_weight : array-like of shape (n_candidate_samples,), optional (default=None)

Densities for each sample in candidates.

return_utilities : bool, optional (default=False)

If True, also return the utilities based on the query strategy.

Returns
queried_indices : ndarray of shape (n_queried_instances,)

The indices of instances in candidates which should be queried, with 0 <= n_queried_instances <= n_samples.

utilities : ndarray of shape (n_samples,), optional

The utilities based on the query strategy. Only provided if return_utilities is True.
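When metric is set, the class description says a kernel is used to estimate how many labels lie near each candidate, and that this count is multiplied by the classifier's probabilities. The arithmetic can be sketched as follows. The toy data, the fixed `gamma=1.0`, the hard-coded `probas` array standing in for `clf.predict_proba(candidates)`, and the choice to sum similarity over labeled samples only are all assumptions made for illustration; the library's internal computation may differ.

```python
import numpy as np
from sklearn.metrics.pairwise import pairwise_kernels

# Toy data: four labeled training samples and two candidates (illustrative).
X_labeled = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
candidates = np.array([[0.05, 0.0], [0.5, 0.5]])

# Stand-in for clf.predict_proba(candidates); the values are made up.
probas = np.array([[0.9, 0.1], [0.5, 0.5]])

# Kernel similarity of every candidate to every labeled sample.
K = pairwise_kernels(candidates, X_labeled, metric="rbf", gamma=1.0)

# The summed similarity acts as the number of labels "near" each candidate.
n_near = K.sum(axis=1, keepdims=True)

# Scaling the class probabilities gives per-class frequency estimates.
freq_estimate = n_near * probas
print(freq_estimate.shape)  # (2, 2)
```

Each row of `freq_estimate` sums to the corresponding entry of `n_near`, since each row of `probas` sums to one.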

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters
**params : dict

Estimator parameters.

Returns
self : estimator instance

Estimator instance.

update(candidates, queried_indices, budget_manager_param_dict=None)

Updates the budget manager.

Parameters
candidates : {array-like, sparse matrix} of shape (n_samples, n_features)

The instances which could be queried. Sparse matrices are accepted only if they are supported by the base query strategy.

queried_indices : array-like of shape (n_samples,)

Indicates which instances from candidates have been queried.

budget_manager_param_dict : kwargs, optional (default=None)

Optional keyword arguments passed to the budget manager.

Returns
self : StreamProbabilisticAL

The StreamProbabilisticAL instance itself, after it has been updated.

Examples using skactiveml.stream.StreamProbabilisticAL

Probabilistic Active Learning in Datastreams
