skactiveml.stream.StreamDensityBasedAL
- class skactiveml.stream.StreamDensityBasedAL(budget_manager=None, budget=None, random_state=None, window_size=1000, dist_func=None, dist_func_dict=None)
Bases: SingleAnnotatorStreamQueryStrategy
The StreamDensityBasedAL [1] query strategy extends the uncertainty-based query strategies proposed by Žliobaitė et al. [2]. In addition to the uncertainty assessment, StreamDensityBasedAL assesses the local density and only allows querying the label of a candidate if that local density is sufficiently high. The local density is measured over a sliding window: it is the number of instances in the window for which the new instance becomes the new nearest neighbor.
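The local density described above can be illustrated with a small, library-independent sketch. It is not the skactiveml implementation: the names `local_density`, `window`, and `min_dists` are hypothetical, the distance is plain Euclidean, and the bookkeeping is simplified (nearest-neighbor distances are not recomputed when an instance leaves the window).

```python
import math
from collections import deque


def local_density(window, min_dists, x_new, dist_func=math.dist):
    """Count the window instances for which x_new becomes the nearest neighbor.

    window and min_dists are parallel deques: min_dists[i] stores the distance
    from window[i] to its current nearest neighbor. Both are updated in place.
    """
    density = 0
    new_min = math.inf
    for i, x_old in enumerate(window):
        d = dist_func(x_old, x_new)
        new_min = min(new_min, d)
        if d < min_dists[i]:
            # x_new is closer than the previous nearest neighbor of x_old.
            min_dists[i] = d
            density += 1
    # Slide the window: with maxlen set, the oldest entry is dropped when full.
    window.append(x_new)
    min_dists.append(new_min)
    return density


window_size = 3  # illustrative value only
window = deque(maxlen=window_size)
min_dists = deque(maxlen=window_size)

stream = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.5, 0.0)]
densities = [local_density(window, min_dists, x) for x in stream]
print(densities)
```

In this toy stream the last point lies between the first two, so it becomes the new nearest neighbor of both and receives the highest density. The isolated point `(5.0, 5.0)` is the nearest neighbor of no one.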
- Parameters
- budgetfloat, optional (default=None)
The budget which models the budgeting constraint used in the stream-based active learning setting.
- budget_managerBudgetManager, optional (default=None)
The BudgetManager which models the budgeting constraint used in the stream-based active learning setting. If set to None, DensityBasedBudgetManager will be used by default. The budget manager will be initialized based on the following conditions:
If only a budget is given, the default budget manager is initialized with the given budget. If only a budget manager is given, that budget manager is used. If neither is given, the default budget manager is initialized with the default budget. If both are given and the budget differs from budget_manager.budget, a warning is issued.
- window_sizeint, optional (default=1000)
Determines the sliding window size of the local density window.
- random_stateint, RandomState instance, optional (default=None)
Controls the randomness of the estimator.
- dist_funccallable, optional (default=None)
The distance function used to calculate the distances within the local density window. If None, sklearn.metrics.pairwise.pairwise_distances will be used by default.
- dist_func_dictdict, optional (default=None)
Additional parameters for dist_func.
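The four conditions listed for budget and budget_manager amount to a small decision procedure, sketched below in plain Python. This is an illustration only: `SketchBudgetManager`, `resolve_budget_manager`, and the value of `DEFAULT_BUDGET` are hypothetical stand-ins, and the choice to keep the given manager when both arguments conflict is an assumption, since the description only states that a warning is raised.

```python
import warnings

DEFAULT_BUDGET = 0.1  # placeholder value, not taken from the library


class SketchBudgetManager:
    """Hypothetical stand-in for a budget manager with a budget attribute."""

    def __init__(self, budget=None):
        self.budget = DEFAULT_BUDGET if budget is None else budget


def resolve_budget_manager(budget=None, budget_manager=None):
    """Mirror the documented conditions for combining the two arguments."""
    if budget_manager is None:
        # Only a budget, or neither: build the default manager.
        return SketchBudgetManager(budget=budget)
    if budget is not None and budget != budget_manager.budget:
        # Both given but inconsistent: warn (assumption: the manager wins).
        warnings.warn("budget differs from budget_manager.budget")
    return budget_manager


only_budget = resolve_budget_manager(budget=0.2)
neither = resolve_budget_manager()
print(only_budget.budget, neither.budget)
```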
References
- [1] Ienco, D., Pfahringer, B., & Žliobaitė, I. (2014). High density-focused
uncertainty sampling for active learning over evolving stream data. In BigMine 2014 (pp. 133-148).
- [2] Žliobaitė, I., Bifet, A., Pfahringer, B., & Holmes, G. (2014). Active
Learning With Drifting Streaming Data. IEEE Transactions on Neural Networks and Learning Systems, 25(1), 27-39.
Methods
get_metadata_routing
()Get metadata routing of this object.
get_params
([deep])Get parameters for this estimator.
query
(candidates, clf[, X, y, ...])Ask the query strategy which instances in candidates to acquire.
set_params
(**params)Set the parameters of this estimator.
update
(candidates, queried_indices[, ...])Updates the budget manager and the counts of seen and queried instances.
- get_metadata_routing()
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns
- routingMetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
- paramsdict
Parameter names mapped to their values.
- query(candidates, clf, X=None, y=None, sample_weight=None, fit_clf=False, return_utilities=False)
Ask the query strategy which instances in candidates to acquire.
- Parameters
- candidates{array-like, sparse matrix} of shape (n_samples, n_features)
The instances which may be queried. Sparse matrices are accepted only if they are supported by the base query strategy.
- clfSkactivemlClassifier
Model implementing the methods fit and predict_freq.
- Xarray-like of shape (n_samples, n_features), optional (default=None)
Input samples used to fit the classifier.
- yarray-like of shape (n_samples,), optional (default=None)
Labels of the input samples X. There may be missing labels.
- sample_weightarray-like of shape (n_samples,), optional (default=None)
Sample weights for X, used to fit the clf.
- fit_clfbool, optional (default=False)
If True, refit the classifier before querying. This also requires X and y to be given.
- return_utilitiesbool, optional (default=False)
If True, also return the utilities based on the query strategy.
- Returns
- queried_indicesndarray of shape (n_queried_instances,)
The indices of instances in candidates which should be queried, with 0 <= n_queried_instances <= n_samples.
- utilities: ndarray of shape (n_samples,), optional
The utilities based on the query strategy. Only provided if return_utilities is True.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
- **paramsdict
Estimator parameters.
- Returns
- selfestimator instance
Estimator instance.
- update(candidates, queried_indices, budget_manager_param_dict=None)
Updates the budget manager and the counts of seen and queried instances.
- Parameters
- candidates{array-like, sparse matrix} of shape (n_samples, n_features)
The instances which could be queried. Sparse matrices are accepted only if they are supported by the base query strategy.
- queried_indicesarray-like of shape (n_samples,)
Indicates which instances from candidates have been queried.
- budget_manager_param_dictkwargs, optional (default=None)
Optional kwargs for budget_manager.
- Returns
- selfStreamDensityBasedAL
The StreamDensityBasedAL returns itself, after it is updated.
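The interplay of query and update in a stream loop can be sketched without the library. The class below is a hypothetical stand-in that mimics only the documented call pattern: query returns indices into candidates (plus utilities on request), and update tracks seen and queried instances and returns self. Its utility rule and threshold are arbitrary, and it omits the clf argument that the real query requires.

```python
class StandInStrategy:
    """Hypothetical stand-in; not the StreamDensityBasedAL logic."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.n_seen = 0
        self.n_queried = 0

    def query(self, candidates, return_utilities=False):
        # Arbitrary utility for illustration: the largest feature value.
        utilities = [max(x) for x in candidates]
        queried_indices = [
            i for i, u in enumerate(utilities) if u > self.threshold
        ]
        if return_utilities:
            return queried_indices, utilities
        return queried_indices

    def update(self, candidates, queried_indices):
        # Track how many instances were seen and how many were queried.
        self.n_seen += len(candidates)
        self.n_queried += len(queried_indices)
        return self


strategy = StandInStrategy()
stream = [[[0.1, 0.2]], [[0.9, 0.3]], [[0.4, 0.6]], [[0.2, 0.2]]]
labeled = []
for candidates in stream:
    queried_indices, utilities = strategy.query(
        candidates, return_utilities=True
    )
    # queried_indices index into candidates, so they select what to label.
    labeled.extend(candidates[i] for i in queried_indices)
    # Call update after every query so the internal counters stay in sync.
    strategy.update(candidates, queried_indices)

print(strategy.n_seen, strategy.n_queried, labeled)
```

With the real strategy, the same loop would pass a fitted classifier to query and could forward extra information to the budget manager through budget_manager_param_dict.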