skactiveml.pool.batch_bald

skactiveml.pool.batch_bald(probas, batch_size, n_MC_samples=None, random_state=None, eps=1e-07)

BatchBALD: Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning

BatchBALD [1] extends BALD [2] (Bayesian Active Learning by Disagreement) to batch selection: candidate points are scored jointly by estimating the mutual information between the joint predictions for multiple data points and the model parameters.
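To make the mutual-information idea concrete, the following is a minimal NumPy sketch of the single-point BALD score [2], computed as the entropy of the ensemble-averaged prediction minus the average entropy of the individual members. It is not the library's implementation and does not perform the joint scoring over batches that BatchBALD adds. The helper name `bald_scores` is hypothetical, and the way `eps` is applied here (clipping before taking logarithms) is an assumption.

```python
import numpy as np


def bald_scores(probas, eps=1e-7):
    """Per-sample BALD score: H[mean over members] - mean over members of H.

    `probas` has shape (n_estimators, n_samples, n_classes).
    Hypothetical helper for illustration only.
    """
    probas = np.asarray(probas, dtype=float)
    # Clip so that log(0) never occurs (assumed role of `eps`).
    clipped = np.clip(probas, eps, 1.0)
    # Entropy of the ensemble-averaged prediction, shape (n_samples,).
    mean_probas = clipped.mean(axis=0)
    entropy_of_mean = -(mean_probas * np.log(mean_probas)).sum(axis=-1)
    # Average entropy of the individual members, shape (n_samples,).
    mean_of_entropy = -(clipped * np.log(clipped)).sum(axis=-1).mean(axis=0)
    return entropy_of_mean - mean_of_entropy


# Two ensemble members, two samples, two classes.
probas = np.array([
    [[1.0, 0.0], [0.5, 0.5]],  # member 0
    [[0.0, 1.0], [0.5, 0.5]],  # member 1
])
scores = bald_scores(probas)
print(scores)
```

In this toy input the members confidently disagree on the first sample and agree on an uncertain prediction for the second, so the first sample receives the higher score: disagreement between members, not plain predictive uncertainty, is what the score rewards.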

Parameters

probas: array-like of shape (n_estimators, n_samples, n_classes): The probability estimates of all estimators, samples, and classes.
batch_size: int: The number of samples to be selected in one active learning (AL) cycle.
n_MC_samples: int > 0, default=n_estimators: The number of Monte Carlo samples used for label estimation.
eps: float > 0, default=1e-7: Minimum probability threshold used when computing log-probabilities.
random_state: int or np.random.RandomState, default=None: The random state to use.

Returns

utilities: numpy.ndarray of shape (batch_size, n_samples): Sample utilities computed according to BatchBALD [1].
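As a hedged usage sketch, the snippet below builds a `probas` array of the documented shape from synthetic Dirichlet draws (standing in for the class-probability outputs of an ensemble) and passes it to `batch_bald` using the signature shown above. The synthetic data and the chosen sizes are assumptions for illustration. The import is guarded so the snippet also runs where scikit-activeml is not installed, in which case only the input array is built.

```python
import numpy as np

try:
    from skactiveml.pool import batch_bald
except ImportError:  # scikit-activeml not available in this environment
    batch_bald = None

rng = np.random.default_rng(0)
n_estimators, n_samples, n_classes = 5, 6, 3

# Synthetic stand-in for ensemble outputs: each (estimator, sample) row is a
# valid probability vector over the classes.
probas = rng.dirichlet(np.ones(n_classes), size=(n_estimators, n_samples))

utilities = None
if batch_bald is not None:
    # Request utilities for a batch of two samples.
    utilities = batch_bald(probas, batch_size=2, random_state=0)
    print(utilities.shape)
```

With a real model, `probas` would typically be obtained by stacking the per-member class-probability predictions along a new first axis so that the result has shape (n_estimators, n_samples, n_classes).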

References

1: Kirsch, Andreas, Joost Van Amersfoort, and Yarin Gal. “BatchBALD: Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning.” Advances in Neural Information Processing Systems 32 (2019).
2: Houlsby, Neil, Ferenc Huszár, Zoubin Ghahramani, and Máté Lengyel. “Bayesian Active Learning for Classification and Preference Learning.” arXiv preprint arXiv:1112.5745 (2011).