
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "generated/sphinx_gallery_examples/1-pool-classification/plot-CostEmbeddingAL-Active_Learning_with_Cost_Embedding_(ALCE).py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_generated_sphinx_gallery_examples_1-pool-classification_plot-CostEmbeddingAL-Active_Learning_with_Cost_Embedding_(ALCE).py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_generated_sphinx_gallery_examples_1-pool-classification_plot-CostEmbeddingAL-Active_Learning_with_Cost_Embedding_(ALCE).py:


Active Learning with Cost Embedding (ALCE)
==========================================

.. GENERATED FROM PYTHON SOURCE LINES 7-8

**Idea:** ALCE embeds the misclassification cost matrix into a hidden space via
non-metric multidimensional scaling (with a mirroring trick for asymmetric
costs) and trains a multi-target regressor to predict each sample's hidden
point. It then defines cost-sensitive uncertainty as the distance from the
predicted hidden point to the nearest class point and selects the unlabeled
instance with the largest distance, directly prioritizing high-cost confusions.

.. GENERATED FROM PYTHON SOURCE LINES 10-20

| **Google Colab Note**: If the notebook fails to run after installing the
  needed packages, try to restart the runtime (Ctrl + M) under
  Runtime -> Restart session.

.. image:: https://colab.research.google.com/assets/colab-badge.svg
   :target: https://colab.research.google.com/github/scikit-activeml/scikit-activeml.github.io/blob/gh-pages/latest/generated/sphinx_gallery_notebooks//1-pool-classification/plot-CostEmbeddingAL-Active_Learning_with_Cost_Embedding_(ALCE).ipynb

| **Notebook Dependencies**
| Uncomment the following cell to install all dependencies for this
  tutorial.

.. GENERATED FROM PYTHON SOURCE LINES 20-23

.. code-block:: Python


    # !pip install scikit-activeml


.. GENERATED FROM PYTHON SOURCE LINES 24-126

.. code-block:: Python


    import numpy as np
    from matplotlib import pyplot as plt, animation
    from sklearn.datasets import make_blobs
    from sklearn.model_selection import train_test_split

    from skactiveml.utils import MISSING_LABEL, labeled_indices
    from skactiveml.visualization import plot_utilities, plot_decision_boundary

    from skactiveml.classifier import ParzenWindowClassifier
    from skactiveml.pool import CostEmbeddingAL

    random_state = np.random.RandomState(0)

    # Build a dataset.
    X_true, y_clusters = make_blobs(
        n_samples=400,
        n_features=2,
        centers=[[0, 1], [-3, 0.5], [-1, -1], [2, 1], [1, -0.5]],
        cluster_std=0.7,
        random_state=random_state,
    )
    y_true = y_clusters % 2
    X_pool, X_test, y_pool, y_test = train_test_split(
        X_true, y_true, test_size=0.25, random_state=random_state
    )
    X = X_pool
    y = np.full(shape=y_pool.shape, fill_value=MISSING_LABEL)

    # Initialise the classifier.
    clf = ParzenWindowClassifier(classes=[0, 1], random_state=random_state)
    # Initialise the query strategy.
    qs = CostEmbeddingAL(classes=[0, 1])

    # Preparation for plotting.
    fig, ax = plt.subplots()
    feature_bound = [
        [min(X[:, 0]), min(X[:, 1])],
        [max(X[:, 0]), max(X[:, 1])]
    ]
    artists = []

    # Active learning cycle:
    n_cycles = 20
    for c in range(n_cycles):
        # Fit the classifier with current labels.
        clf.fit(X, y)

        # Query the next sample(s).
        query_idx = qs.query(X=X, y=y)

        # Capture the current plot state.
        coll_old = list(ax.collections)
        title = ax.text(
            0.5,
            1.05,
            f"Decision boundary after acquiring {c} labels\n"
            f"Test Accuracy: {clf.score(X_test, y_test):.4f}",
            size=plt.rcParams["axes.titlesize"],
            ha="center",
            transform=ax.transAxes,
        )

        # Update plot with utility values, samples, and decision boundary.
        X_labeled = X[labeled_indices(y)]
        ax = plot_utilities(
            qs,
            X=X,
            y=y,
            candidates=None,
            res=25,
            feature_bound=feature_bound,
            ax=ax,
        )
        ax.scatter(
            X[:, 0], X[:, 1], c=y_pool, cmap="coolwarm", marker=".", zorder=2
        )
        ax.scatter(
            X_labeled[:, 0],
            X_labeled[:, 1],
            c="grey",
            alpha=0.8,
            marker=".",
            s=300,
        )
        ax = plot_decision_boundary(clf, feature_bound, ax=ax)
        ax.set_xlabel('Feature 1')
        ax.set_ylabel('Feature 2')

        coll_new = list(ax.collections)
        coll_new.append(title)
        artists.append([x for x in coll_new if x not in coll_old])

        # Update labels based on query.
        y[query_idx] = y_pool[query_idx]

    ani = animation.ArtistAnimation(fig, artists, interval=1000, blit=True)


.. container:: sphx-glr-animation

    .. raw:: html

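
The selection rule described under **Idea** can be illustrated with a small
NumPy sketch. It is not the implementation used by ``CostEmbeddingAL``: it
swaps in classical (metric) MDS for non-metric MDS, assumes a symmetric cost
matrix so that no mirroring trick is needed, and replaces the trained
regressor with hand-picked hidden points. The helper names
``classical_mds`` and ``nearest_class_distance`` as well as the example cost
matrix are hypothetical and chosen only for illustration.

.. code-block:: Python

    import numpy as np


    def classical_mds(cost, dim):
        """Embed a symmetric cost matrix with classical MDS (assumption:
        ALCE itself uses non-metric MDS with mirroring)."""
        cost = np.asarray(cost, dtype=float)
        n = cost.shape[0]
        # Double-center the squared costs to obtain a Gram matrix.
        J = np.eye(n) - np.ones((n, n)) / n
        B = -0.5 * J @ (cost ** 2) @ J
        eigvals, eigvecs = np.linalg.eigh(B)
        # Keep the `dim` largest eigenvalues; clip tiny negative values.
        order = np.argsort(eigvals)[::-1][:dim]
        scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
        return eigvecs[:, order] * scale


    def nearest_class_distance(hidden_pred, class_points):
        """Distance from each predicted hidden point to its nearest class
        point, used here as the cost-sensitive uncertainty."""
        diffs = hidden_pred[:, None, :] - class_points[None, :, :]
        return np.linalg.norm(diffs, axis=2).min(axis=1)


    # Hypothetical symmetric costs: confusing class 0 with class 2 is the
    # most expensive mistake.
    cost_matrix = np.array(
        [[0.0, 1.0, 3.0],
         [1.0, 0.0, 2.0],
         [3.0, 2.0, 0.0]]
    )
    class_points = classical_mds(cost_matrix, dim=2)

    # Stand-ins for regressor outputs (assumption: in ALCE these are
    # predicted by a multi-target regressor trained on labeled samples).
    hidden_pred = np.vstack(
        [
            class_points[0],                            # clearly class 0
            0.5 * (class_points[0] + class_points[1]),  # between 0 and 1
            0.5 * (class_points[1] + class_points[2]),  # between 1 and 2
        ]
    )

    uncertainty = nearest_class_distance(hidden_pred, class_points)
    query_idx = int(np.argmax(uncertainty))

In this toy setting the third candidate is selected: it lies between the
class points of classes 1 and 2, whose confusion cost (2) is larger than the
cost between classes 0 and 1 (1), so its distance to the nearest class point
is the largest.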

.. GENERATED FROM PYTHON SOURCE LINES 127-128

.. image:: ../../examples/pool_classification_legend.png

.. GENERATED FROM PYTHON SOURCE LINES 130-135

.. rubric:: References:

The implementation of this strategy is based on :footcite:t:`huang2016novel`.

.. footbibliography::


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 9.609 seconds)


.. _sphx_glr_download_generated_sphinx_gallery_examples_1-pool-classification_plot-CostEmbeddingAL-Active_Learning_with_Cost_Embedding_(ALCE).py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot-CostEmbeddingAL-Active_Learning_with_Cost_Embedding_(ALCE).ipynb <plot-CostEmbeddingAL-Active_Learning_with_Cost_Embedding_(ALCE).ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot-CostEmbeddingAL-Active_Learning_with_Cost_Embedding_(ALCE).py <plot-CostEmbeddingAL-Active_Learning_with_Cost_Embedding_(ALCE).py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot-CostEmbeddingAL-Active_Learning_with_Cost_Embedding_(ALCE).zip <plot-CostEmbeddingAL-Active_Learning_with_Cost_Embedding_(ALCE).zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_