skactiveml.stream.StreamProbabilisticAL
- class skactiveml.stream.StreamProbabilisticAL(metric=None, metric_dict=None, prior=0.001, m_max=2, budget_manager=None, budget=None, random_state=None)
Bases: SingleAnnotatorStreamQueryStrategy
Probabilistic Active Learning in Datastreams
StreamProbabilisticAL [1] extends Multi-Class Probabilistic Active Learning [2] (McPAL) (see pool.ProbabilisticAL) to data streams. It uses McPAL to assess the spatial utility of each candidate. The temporal utility is evaluated by the Balanced Incremental Quantile Filter, which is implemented within the default budget manager (see stream.budgetmanager.BalancedIncrementalQuantileFilter).
- Parameters
- metric : str or callable, default=None
The metric must be None or a valid kernel as defined by the function sklearn.metrics.pairwise.pairwise_kernels. The kernel is used to calculate the frequency of labels near the candidates, which is multiplied with the probabilities returned by the clf to get a kernel frequency estimate for each class. If metric is set to None, the predict_freq function of the clf is used instead. If the clf does not define predict_freq, an Exception is raised.
- metric_dict : dict, default=None
Any further parameters are passed directly to the kernel function. If metric_dict is None and metric is 'rbf', metric_dict is set to {'gamma': 'mean'}.
- prior : float, default=1.0e-3
The prior value that is passed onto ProbabilisticAL (see pool.ProbabilisticAL).
- m_max : float, default=2
The m_max value that is passed onto ProbabilisticAL (see pool.ProbabilisticAL).
- budget_manager : BudgetManager, default=None
The BudgetManager which models the budgeting constraint used in the stream-based active learning setting. If set to None, BalancedIncrementalQuantileFilter will be used by default. The budget manager will be initialized based on the following conditions:
If only a budget is given, the default budget manager is initialized with the given budget.
If only a budget manager is given, the budget manager is used as is.
If neither is given, the default budget manager is initialized with the default budget.
If both are given and the budget differs from budgetmanager.budget, a warning is raised and the budget manager is used as is.
- budget : float, default=None
Specifies the ratio of samples which are allowed to be sampled, with 0 <= budget <= 1. If budget is None, it is replaced with the default budget 0.1.
- random_state : int or RandomState instance, default=None
Controls the randomness of the estimator.
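The four budget_manager conditions listed above can be sketched in plain Python. This is only an illustration of the documented rules, not the library's implementation: SketchBudgetManager and resolve_budget_manager are hypothetical names introduced here, and only the default budget of 0.1 is taken from the parameter description.

```python
import warnings
from dataclasses import dataclass

DEFAULT_BUDGET = 0.1  # default budget stated in the parameter description


@dataclass
class SketchBudgetManager:
    """Hypothetical stand-in for a budget manager (not a skactiveml class)."""

    budget: float = DEFAULT_BUDGET


def resolve_budget_manager(budget=None, budget_manager=None):
    """Hypothetical helper mirroring the four documented conditions."""
    if budget_manager is None:
        # Only a budget given, or neither given: build the default manager
        # with the given budget or with the default budget.
        return SketchBudgetManager(DEFAULT_BUDGET if budget is None else budget)
    if budget is not None and budget != budget_manager.budget:
        # Both given and the budgets differ: warn, keep the manager unchanged.
        warnings.warn(
            "budget differs from the budget manager's budget; "
            "the budget manager is used as is."
        )
    # Only a manager given, or both given: the manager is used as is.
    return budget_manager
```

In this sketch, resolve_budget_manager(budget=0.3) builds a manager with budget 0.3, while passing an existing manager always returns that same object.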
References
- [1] D. Kottke, G. Krempl, and M. Spiliopoulou. Probabilistic Active Learning in Datastreams. In Adv. Intell. Data Anal., pages 145–157, 2015.
- [2] D. Kottke, G. Krempl, D. Lang, J. Teschner, and M. Spiliopoulou. Multi-class Probabilistic Active Learning. In Eur. Conf. Artif. Intell., pages 586–594, 2016.
Methods
- get_metadata_routing()
Get metadata routing of this object.
- get_params([deep])
Get parameters for this estimator.
- query(candidates, clf[, X, y, ...])
Determines for which candidate samples labels are to be queried.
- set_params(**params)
Set the parameters of this estimator.
- update(candidates, queried_indices[, ...])
Updates the budget manager and the count for seen and queried labels.
- get_metadata_routing()
Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns
- routing : MetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
- params : dict
Parameter names mapped to their values.
- query(candidates, clf, X=None, y=None, sample_weight=None, fit_clf=False, utility_weight=None, return_utilities=False)
Determines for which candidate samples labels are to be queried.
The query strategy determines the most useful samples in candidates that can be acquired within the budgeting constraint specified by budget. Note that this method does not change the internal state of the query strategy. To adapt the query strategy to the selected candidates, use update(…).
- Parameters
- candidates : {array-like, sparse matrix} of shape (n_candidates, n_features)
The samples which may be queried. Sparse matrices are accepted only if they are supported by the base query strategy.
- clf : skactiveml.base.SkactivemlClassifier
Model implementing the methods fit and predict_proba.
- X : array-like of shape (n_samples, n_features), default=None
Training data set used to fit the classifier.
- y : array-like of shape (n_samples,), default=None
Labels of the training data set (possibly including unlabeled ones indicated by self.missing_label).
- sample_weight : array-like of shape (n_samples,), default=None
Weights of training samples in X.
- fit_clf : bool, default=False
Defines whether the classifier should be fitted on X, y, and sample_weight.
- utility_weight : array-like of shape (n_candidate_samples,), default=None
Densities for each sample in candidates.
- return_utilities : bool, default=False
If True, also return the utilities based on the query strategy.
- Returns
- queried_indices : np.ndarray of shape (n_queried_indices,)
The indices of samples in candidates whose labels are queried, with 0 <= queried_indices < n_candidates.
- utilities : np.ndarray of shape (n_candidates,)
The utilities based on the query strategy. Only provided if return_utilities is True.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
- **params : dict
Estimator parameters.
- Returns
- self : estimator instance
Estimator instance.
- update(candidates, queried_indices, budget_manager_param_dict=None)
Updates the budget manager and the count for seen and queried labels. This function should be used in conjunction with the query function.
- Parameters
- candidates : {array-like, sparse matrix} of shape (n_candidates, n_features)
The samples which may be queried. Sparse matrices are accepted only if they are supported by the base query strategy.
- queried_indices : np.ndarray of shape (n_queried_indices,)
The indices of samples in candidates whose labels are queried, with 0 <= queried_indices < n_candidates.
- budget_manager_param_dict : dict, default=None
Optional kwargs for budget_manager.
- Returns
- self : SingleAnnotatorStreamQueryStrategy
The query strategy returns itself, after it is updated.