A Monte Carlo approximation to the adjusted and standardized mutual information for faster clustering comparisons. Use this package as a drop-in replacement for sklearn.metrics.adjusted_mutual_info_score when the exact calculation is too slow, e.g., because of large datasets and large numbers of clusters. You can find more details in our publication.
fastami requires Python >=3.8. You can install fastami via pip from PyPI:
pip install fastami
You can use FastAMI as you would use adjusted_mutual_info_score from scikit-learn, except that it returns the estimate together with its estimated error:
from fastami import adjusted_mutual_info_mc
labels_true = [0, 0, 1, 1, 2]
labels_pred = [0, 1, 1, 2, 2]
ami, ami_error = adjusted_mutual_info_mc(labels_true, labels_pred)
# Output: AMI = -0.255 +- 0.008
print(f"AMI = {ami:.3f} +- {ami_error:.3f}")
Note that the output may vary slightly between runs because of the Monte Carlo approach. To get reproducible results, use the seed argument. By default, the algorithm terminates when it reaches an accuracy of 0.01. You can adjust this behavior using the accuracy_goal argument.
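To illustrate why the result carries an error estimate and how a seed and an accuracy goal interact, here is a minimal standard-library sketch of a naive permutation-based Monte Carlo estimate of the AMI. It is not the algorithm implemented in fastami; the function naive_ami_mc and its min_samples and max_samples parameters are hypothetical names chosen for this illustration, and degenerate labelings (for example, a single cluster) are not handled.

```python
import math
import random
from collections import Counter


def mutual_info(a, b):
    """Mutual information (in nats) between two label sequences."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (nij / n) * math.log(n * nij / (count_a[i] * count_b[j]))
        for (i, j), nij in joint.items()
    )


def entropy(labels):
    """Shannon entropy (in nats) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def naive_ami_mc(labels_true, labels_pred, seed=None, accuracy_goal=0.01,
                 min_samples=100, max_samples=100_000):
    """Illustrative only: estimate the AMI by shuffling one labeling."""
    rng = random.Random(seed)  # a fixed seed makes the run reproducible
    mi = mutual_info(labels_true, labels_pred)
    mean_h = 0.5 * (entropy(labels_true) + entropy(labels_pred))
    shuffled = list(labels_pred)
    total = total_sq = 0.0
    ami_err = math.inf
    for k in range(1, max_samples + 1):
        rng.shuffle(shuffled)  # one sample from the permutation model
        sample = mutual_info(labels_true, shuffled)
        total += sample
        total_sq += sample * sample
        emi = total / k  # running estimate of the expected MI
        if k < max(min_samples, 2):
            continue
        var = max(total_sq / k - emi * emi, 0.0) * k / (k - 1)
        emi_err = math.sqrt(var / k)  # standard error of the mean
        # First-order error propagation through (mi - emi) / (mean_h - emi).
        ami_err = abs(mi - mean_h) / (mean_h - emi) ** 2 * emi_err
        if ami_err <= accuracy_goal:
            break  # stop as soon as the accuracy goal is met
    return (mi - emi) / (mean_h - emi), ami_err


ami, ami_error = naive_ami_mc([0, 0, 1, 1, 2], [0, 1, 1, 2, 2], seed=42)
print(f"AMI = {ami:.3f} +- {ami_error:.3f}")
```

Calling the sketch twice with the same seed returns identical values, while omitting the seed gives results that scatter within roughly the reported error. A smaller accuracy_goal costs more samples, since the standard error only shrinks with the square root of the sample count.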
FastSMI works similarly:
from fastami import standardized_mutual_info_mc
labels_true = [0, 0, 1, 1, 2]
labels_pred = [0, 1, 1, 2, 2]
smi, smi_error = standardized_mutual_info_mc(labels_true, labels_pred)
# Output: SMI = -0.673 +- 0.035
print(f"SMI = {smi:.3f} +- {smi_error:.3f}")
While FastSMI is usually faster than an exact calculation of the SMI, it is still orders of magnitude slower than FastAMI. Since the SMI, unlike the AMI, is not confined to the interval [-1,1], FastSMI by default terminates once either the absolute or the relative error reaches 0.1, whichever happens first. You can adjust this behavior using the precision_goal argument.
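The "absolute or relative, whichever is reached first" rule can be read as the following check. This is a sketch of one plausible interpretation, not the library's internal code; precision_reached is a hypothetical helper introduced only for this example.

```python
def precision_reached(estimate, error, precision_goal=0.1):
    """Return True once the absolute OR the relative error meets the goal."""
    absolute_ok = error <= precision_goal
    # Guard against division by zero when the estimate is exactly 0.
    relative_ok = estimate != 0 and error / abs(estimate) <= precision_goal
    return absolute_ok or relative_ok


# Small SMI: the absolute criterion is met (0.035 <= 0.1).
print(precision_reached(-0.673, 0.035))
# Large SMI: the absolute error is 2.0, but the relative error is 0.04.
print(precision_reached(50.0, 2.0))
# Neither criterion is met: absolute 2.0, relative 0.4.
print(precision_reached(5.0, 2.0))
```

Under this reading, the relative criterion keeps sampling from running needlessly long when the SMI is large, where an absolute error of 0.1 would be stricter than the comparison requires.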
If you use fastami in your research work, please cite our paper:
@article{Klede_Schwinn_Zanca_Eskofier_2023,
title={FastAMI – a Monte Carlo Approach to the Adjustment for Chance in Clustering Comparison Metrics},
volume={37},
url={https://ojs.aaai.org/index.php/AAAI/article/view/26003},
DOI={10.1609/aaai.v37i7.26003},
number={7},
journal={Proceedings of the AAAI Conference on Artificial Intelligence},
author={Klede, Kai and Schwinn, Leo and Zanca, Dario and Eskofier, Björn},
year={2023},
month={Jun.},
pages={8317-8324}
}