In-depth Tutorials

The following sections summarize a selection of our in-depth tutorials. Each entry lists the data modality and models used in the tutorial, while the active learning scenario and prediction task are reflected by the subsections.

🏊 Pool-based Active Learning

In pool-based active learning, a model has access to a large pool of unlabeled samples. In each iteration it selects one or several informative samples from this pool, queries their labels, and retrains on the enlarged labeled set. This setting is common when data can be stored and queried flexibly, while labeling is the main bottleneck.
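The loop described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the tutorials: the nearest-centroid classifier, the margin-based uncertainty score, and all function names (`fit_centroids`, `margin`, `pool_based_active_learning`) are hypothetical stand-ins chosen to keep the example free of dependencies.

```python
def fit_centroids(samples, labels):
    """Train a 1-D nearest-centroid classifier: one mean per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids, x):
    """Return the class whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def margin(centroids, x):
    """Distance gap between the two nearest centroids (small = uncertain)."""
    distances = sorted(abs(x - c) for c in centroids.values())
    return distances[1] - distances[0]


def pool_based_active_learning(pool, oracle, initial_idx, n_iterations, batch_size=1):
    """Repeatedly query the most uncertain pool samples and retrain."""
    # Assumption: the initial labeled set contains at least two classes.
    labeled = {i: oracle(i) for i in initial_idx}
    for _ in range(n_iterations):
        # Retrain on the current labeled set.
        centroids = fit_centroids([pool[i] for i in labeled], labeled.values())
        # Rank the remaining unlabeled samples by uncertainty.
        candidates = [i for i in range(len(pool)) if i not in labeled]
        if not candidates:
            break
        candidates.sort(key=lambda i: margin(centroids, pool[i]))
        # Query labels for the selected batch and enlarge the labeled set.
        for i in candidates[:batch_size]:
            labeled[i] = oracle(i)
    centroids = fit_centroids([pool[i] for i in labeled], labeled.values())
    return centroids, labeled


# Toy pool: 60 points on a line; the hidden label is the sign of the value.
pool = [i / 10 for i in range(-30, 31) if i != 0]
true_labels = [int(x > 0) for x in pool]

centroids, labeled = pool_based_active_learning(
    pool,
    oracle=lambda i: true_labels[i],     # stands in for a human annotator
    initial_idx=[0, len(pool) - 1],      # one labeled sample per class
    n_iterations=10,
)
accuracy = sum(predict(centroids, x) == y for x, y in zip(pool, true_labels)) / len(pool)
print(f"labeled {len(labeled)} of {len(pool)} samples, accuracy {accuracy:.2f}")
```

Setting `batch_size` above 1 corresponds to selecting several samples per iteration. In practice, the toy classifier and score would be replaced by the models and query strategies covered in the tutorials below.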

Classification

| Tutorial | Data | Models |
|---|---|---|
| Pool-based Active Learning: Getting Started | Synthetic | Logistic Regression |
| Pool-based Active Learning: Simple Evaluation Study | Tabular | Gaussian Process Classifier, Parzen Window Classifier |
| Deep Active Learning for Fine-tuning Vision Transformers | Image | Vision Transformer with Full Fine-tuning |
| Deep Active Learning with Frozen Vision Transformers | Image | Vision Transformer with Linear Probing |
| Semi-supervised Active Learning | Image | Vision Transformer with Linear Probing, Self-training |
| Bayesian Active Learning | Audio | Wav2Vec with Multi-layer Perceptron Probing |
| Image Annotation Tool | Image | Multi-layer Perceptron |
| Paper Annotation Tool | Text | Text Transformer with Linear Probing |

Regression

| Tutorial | Data | Models |
|---|---|---|
| Pool-based Active Learning for Regression: Getting Started | Synthetic | Kernel Regressor |
| Advanced Active Learning for Regression Tasks | Tabular | Extreme Gradient Boosted Tree, Multi-layer Perceptron, Random Forest |

Multi-annotator Learning

| Tutorial | Data | Models |
|---|---|---|
| Multi-annotator Active Learning: Getting Started | Synthetic | Logistic Regression |
| Advanced Multi-annotator Active Learning | Image | Convolutional Neural Network |

🌊 Stream-based Active Learning

In stream-based active learning, samples arrive sequentially as a data stream. For each incoming sample, the learner must immediately decide whether to query its label or discard it, typically under a strict labeling budget. This setting is relevant when data cannot be stored indefinitely or when decisions need to be made online.
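The per-sample decision described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the strategy used in the tutorials: the incremental nearest-centroid model, the fixed margin threshold, the budget rule (queried samples never exceed a fixed fraction of the samples seen so far), and all names (`RunningCentroids`, `stream_based_active_learning`) are hypothetical.

```python
import random


class RunningCentroids:
    """1-D nearest-centroid classifier that is updated one sample at a time."""

    def __init__(self):
        self.sums = {}
        self.counts = {}

    def partial_fit(self, x, label):
        self.sums[label] = self.sums.get(label, 0.0) + x
        self.counts[label] = self.counts.get(label, 0) + 1

    def margin(self, x):
        """Distance gap between the two nearest centroids (small = uncertain)."""
        if len(self.counts) < 2:
            return 0.0  # fewer than two known classes: treat as maximally uncertain
        distances = sorted(abs(x - self.sums[c] / self.counts[c]) for c in self.counts)
        return distances[1] - distances[0]


def stream_based_active_learning(stream, oracle, budget, threshold):
    """Decide for each arriving sample whether to query its label or discard it."""
    model = RunningCentroids()
    queried = []
    for t, x in enumerate(stream):
        n_seen = t + 1
        # Budget rule (assumption): one more query must keep the queried
        # fraction at or below `budget` of everything seen so far.
        within_budget = len(queried) + 1 <= budget * n_seen
        if within_budget and model.margin(x) < threshold:
            model.partial_fit(x, oracle(t))  # query the label and update online
            queried.append(t)
        # Otherwise the sample is discarded and never revisited.
    return model, queried


# Toy stream: 60 points in random arrival order; the hidden label is the sign.
values = [i / 10 for i in range(-30, 31) if i != 0]
random.Random(0).shuffle(values)
stream_labels = [int(x > 0) for x in values]

budget = 0.2
model, queried = stream_based_active_learning(
    values,
    oracle=lambda t: stream_labels[t],  # stands in for a human annotator
    budget=budget,
    threshold=1.0,
)
print(f"queried {len(queried)} of {len(values)} samples")
```

Unlike the pool-based setting, the learner never ranks samples against each other here. Each decision uses only the current sample, the current model, and the remaining budget.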

Classification

| Tutorial | Data | Models |
|---|---|---|
| Stream-based Active Learning: Getting Started | Text | Sentence Transformer with Parzen Window Classifier |
| Stream-based Active Learning in Batches | Synthetic | Parzen Window Classifier |