Train a simple convnet on the MNIST dataset and evaluate the BALD (Bayesian Active Learning by Disagreement) acquisition function.
The acquisition function can then drive active learning by selecting the images the model is most uncertain about, i.e. those with the highest acquisition value.
See "Deep Bayesian Active Learning with Image Data" (Gal, Islam, and Ghahramani, 2017). This code extends the Keras CNN example.
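
As a rough illustration of the scoring step, the sketch below computes BALD as the entropy of the mean prediction minus the mean entropy of the individual predictions. It assumes the class probabilities come from T stochastic forward passes (for example, with dropout kept active at test time) stacked into an array of shape (T, N, C). The names `bald_scores` and `select_most_uncertain` are hypothetical helpers, not part of Keras or of this script.

```python
import numpy as np


def bald_scores(probs, eps=1e-12):
    """Approximate BALD for each input.

    probs: array of shape (T, N, C) holding softmax outputs from
    T stochastic forward passes over N inputs with C classes (assumed layout).
    Returns an array of shape (N,).
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Entropy of the prediction averaged over the T passes.
    mean_probs = probs.mean(axis=0)
    predictive_entropy = -np.sum(mean_probs * np.log(mean_probs + eps), axis=1)
    # Average of the per-pass entropies.
    per_pass_entropy = -np.sum(probs * np.log(probs + eps), axis=2)
    expected_entropy = per_pass_entropy.mean(axis=0)
    # High when the passes are individually confident but disagree.
    return predictive_entropy - expected_entropy


def select_most_uncertain(probs, k):
    """Indices of the k inputs with the highest BALD score, best first."""
    scores = bald_scores(probs)
    return np.argsort(scores)[::-1][:k]


# Toy example: 2 passes, 2 images, 2 classes.
toy_probs = np.array([
    [[0.5, 0.5], [1.0, 0.0]],  # pass 1
    [[0.5, 0.5], [0.0, 1.0]],  # pass 2
])
toy_scores = bald_scores(toy_probs)
toy_pick = select_most_uncertain(toy_probs, k=1)
```

In the toy example the first image gets the same uncertain prediction on every pass, so its score is near zero, while the second image gets confident but contradictory predictions and scores close to ln 2. That disagreement is what BALD is meant to reward, so the second image would be queried first.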