Inconsistent default values for average argument in classification metrics #2320
Labels: bug / fix, good first issue, help wanted, question, v1.3.x
🐛 Bug
When instantiating the multiclass (or multilabel) accuracy metric through the Accuracy wrapper class (legacy), the default value for average is micro. When instantiating directly through MulticlassAccuracy (the new way since 0.11, I believe), the default value is macro. This is inconsistent, which can lead to very unexpected results. The same is true for all metrics that are subclasses of MulticlassStatScores, BinaryStatScores or MultilabelStatScores, as well as their respective functional interfaces.

To Reproduce
Code sample
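A sketch along these lines should reproduce the inconsistency. The tensor values are an arbitrary example, chosen so that the classes are imbalanced and the single error falls on the majority class, which makes the micro and macro averages disagree; the outputs in the comments are what the differing defaults should yield:

```python
import torch
from torchmetrics import Accuracy
from torchmetrics.classification import MulticlassAccuracy

preds = torch.tensor([0, 0, 0, 1, 1, 2])
target = torch.tensor([0, 0, 0, 0, 1, 2])

# Legacy wrapper class: average defaults to "micro"
legacy_acc = Accuracy(task="multiclass", num_classes=3)

# Task-specific class: average defaults to "macro"
new_acc = MulticlassAccuracy(num_classes=3)

# Same inputs, different results, purely because of the differing defaults
print(legacy_acc(preds, target))  # tensor(0.8333) -> 5/6 predictions correct
print(new_acc(preds, target))     # tensor(0.9167) -> mean of per-class accuracies
```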
Expected behavior
Consistency between the different interfaces.
Environment

TorchMetrics version (and how you installed TM, e.g. conda, pip, build from source): >=0.11 (1.3 in my case)

Additional context
I would argue that in the case of accuracy the default being macro in the task-specific classes is not only inconsistent with legacy but actually wrong. The common definition of accuracy is

$$\text{Accuracy} = \frac{\text{number of correct predictions}}{\text{total number of predictions}},$$

which is how accuracy is computed when setting average="micro". Setting average="macro" can still be useful, as it is less prone to class imbalance. However, I think TorchMetrics should adhere to common definitions with the default settings, and would therefore argue for making micro the default.

The same is kind of true for precision and recall, which are also commonly defined as micro averages, if they are defined globally at all. Usually we encounter recall and precision as class-wise metrics.
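To make the difference between the two averages concrete, a quick check with the functional interface on the same example tensors as above (expected outputs in the comments, assuming the macro default described in this report):

```python
import torch
from torchmetrics.functional.classification import multiclass_accuracy

preds = torch.tensor([0, 0, 0, 1, 1, 2])
target = torch.tensor([0, 0, 0, 0, 1, 2])

# micro: correct predictions / total predictions = 5 / 6
print(multiclass_accuracy(preds, target, num_classes=3, average="micro"))  # tensor(0.8333)

# macro: mean of per-class accuracies = (3/4 + 1/1 + 1/1) / 3
print(multiclass_accuracy(preds, target, num_classes=3, average="macro"))  # tensor(0.9167)
```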