skactiveml.stream.budgetmanager.DensityBasedSplitBudgetManager

class skactiveml.stream.budgetmanager.DensityBasedSplitBudgetManager(theta=1.0, s=0.01, delta=1.0, random_state=None, budget=None)

Bases: BudgetManager

Budget Manager for DBALStream

This budget manager is an adaptation of RandomVariableUncertaintyBudgetManager for DBALStream [1]. It differs mainly in how the available budget is estimated. Instead of using the budget estimate proposed by Žliobaitė et al. [2], this budget manager counts the numbers of seen and queried instances, so that the number of available queries is given as n_seen_samples*budget-n_queried_samples.
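The counting idea can be sketched in a few lines of plain Python. This is a minimal illustration rather than the library's implementation: the function name `available_queries` is hypothetical, and it assumes that the allowance is the budget share of all seen samples minus the queries already made.

```python
def available_queries(n_seen_samples, n_queried_samples, budget):
    """Return how many label queries are still affordable.

    Assumption: `budget` is the fraction of seen samples that may be
    queried, so the total allowance grows with every seen sample.
    """
    return n_seen_samples * budget - n_queried_samples


# After 200 seen samples with a budget of 0.25, 50 queries are allowed
# in total; 30 were already spent, so 20 remain.
print(available_queries(200, 30, 0.25))  # 20.0
```

A non-positive result would mean that the budget is currently exhausted and further samples have to be seen before another label can be requested.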

Parameters
theta : float, default=1.0

Specifies the initial value for theta_ that is used for calculating the threshold.

s : float, default=0.01

Specifies the relative increase or decrease of the threshold if a sample is queried or not, respectively.

delta : float, default=1.0

Specifies the standard deviation of the normal distribution used for randomization of the threshold.

random_state : int or RandomState instance or None, default=None

Controls the randomness of the budget manager.

budget : float, default=None

Specifies the ratio of samples whose labels may be queried, with 0 <= budget <= 1. If budget is None, the default budget of 0.1 is used.
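How theta, s, delta, and budget interact can be illustrated with a small, self-contained sketch. It is not the library's code. The function name `randomized_threshold_stream` is hypothetical, and three points are assumptions: a sample is queried when its utility reaches the randomized threshold, the random factor is drawn from a normal distribution with mean 1 and standard deviation delta (as in the randomized strategy of [2]), and the threshold is left unchanged while no budget is available.

```python
import random


def randomized_threshold_stream(utilities, budget=0.1, theta=1.0,
                                s=0.01, delta=1.0, seed=0):
    """Return the indices that would be queried under the assumed scheme."""
    rng = random.Random(seed)
    n_seen = n_queried = 0
    queried_indices = []
    for i, utility in enumerate(utilities):
        n_seen += 1
        # Assumed budget check: query only while the counted allowance is positive.
        if n_seen * budget - n_queried <= 0:
            continue
        # Randomize the threshold with a factor drawn from N(1, delta).
        theta_randomized = theta * rng.gauss(1.0, delta)
        if utility >= theta_randomized:  # assumed comparison direction
            queried_indices.append(i)
            n_queried += 1
            theta *= 1 + s  # relative increase after a query
        else:
            theta *= 1 - s  # relative decrease after a skipped sample
    return queried_indices


stream = [0.2, 0.8, 0.5, 0.9] * 50
picked = randomized_threshold_stream(stream, budget=0.1)
print(len(picked), "of", len(stream), "samples queried")
```

Because a query is only possible while the counted allowance is positive, the number of queries in this sketch stays below budget * n_seen + 1 at every point of the stream.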

References

[1] D. Ienco, I. Žliobaitė, and B. Pfahringer. High density-focused uncertainty sampling for active learning over evolving stream data. In Int. Workshop Big Data Streams Heterog. Source Min. Algorithms Syst. Program. Models Appl., pages 133–148, 2014.

[2] I. Žliobaitė, A. Bifet, B. Pfahringer, and G. Holmes. Active Learning With Drifting Streaming Data. IEEE Trans. Neural Netw. Learn. Syst., 25(1):27–39, 2014.

Methods

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

query_by_utility(utilities)

Ask the budget manager which utilities are sufficient to query the corresponding labels.

set_params(**params)

Set the parameters of this estimator.

update(candidates, queried_indices)

Updates the budget manager.

get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
params : dict

Parameter names mapped to their values.

query_by_utility(utilities)

Ask the budget manager which utilities are sufficient to query the corresponding labels.

Parameters
utilities : array-like of shape (n_samples,)

The utilities provided by the stream-based active learning strategy, which are used to determine whether querying a sample is worth it given the budgeting constraint.

Returns
queried_indices : np.ndarray of shape (n_queried_indices,)

The indices of the samples in utilities whose labels are queried, with 0 <= queried_indices < n_samples.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters
**params : dict

Estimator parameters.

Returns
self : estimator instance

Estimator instance.
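The `<component>__<parameter>` naming convention can be illustrated with a short sketch. The helper `split_nested_param` and the parameter name `budget_manager__theta` are hypothetical and serve only to show how a double underscore separates a component from its parameter. This is not the code that set_params runs.

```python
def split_nested_param(name):
    """Split '<component>__<parameter>' into (component, parameter).

    A name without '__' addresses the estimator itself, so the
    component part is returned as None.
    """
    component, delimiter, parameter = name.partition("__")
    if not delimiter:
        return None, name
    return component, parameter


print(split_nested_param("budget"))                 # (None, 'budget')
print(split_nested_param("budget_manager__theta"))  # ('budget_manager', 'theta')
```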

update(candidates, queried_indices)

Updates the budget manager.

Parameters
candidates : {array-like, sparse matrix} of shape (n_samples, n_features)

The samples which may be queried. Sparse matrices are accepted only if they are supported by the base query strategy.

queried_indices : np.ndarray of shape (n_queried_indices,)

The indices of samples in candidates whose labels are queried, with 0 <= queried_indices < n_samples.

Returns
self : DensityBasedSplitBudgetManager

The budget manager returns itself, after it is updated.
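The bookkeeping that such an update implies can be sketched as follows. The class `CountingBudgetSketch` is a hypothetical stand-in, not the library class, and it assumes that every candidate counts as seen while only the entries of queried_indices count as queried.

```python
class CountingBudgetSketch:
    """Hypothetical stand-in that only tracks seen and queried counts."""

    def __init__(self, budget=0.1):
        self.budget = budget
        self.n_seen_samples = 0
        self.n_queried_samples = 0

    def update(self, candidates, queried_indices):
        # Every candidate counts as seen; only the selected ones count as queried.
        self.n_seen_samples += len(candidates)
        self.n_queried_samples += len(queried_indices)
        return self  # returning self allows call chaining


candidates = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
sketch = CountingBudgetSketch(budget=0.5).update(candidates, [1, 3])
print(sketch.n_seen_samples, sketch.n_queried_samples)  # 4 2
```

Returning self mirrors the documented return value and lets an update be chained directly with a following call.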