We are excited to announce that TorchMetrics v0.6 is now publicly available. TorchMetrics v0.6 does not focus on specific domains but adds a ton of new metrics to several domains, increasing the number of metrics in the repository to over 60! Not only does v0.6 add metrics within already covered domains, it also adds support for two new domains: pairwise metrics and detection.
Pairwise Metrics
TorchMetrics v0.6 offers a new set of metrics in its functional backend for calculating pairwise distances. Given a tensor X with shape [N,d] (N observations, each in d dimensions), a pairwise metric calculates the [N,N] matrix containing the metric evaluated on every possible pair of rows of X.
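As a quick sketch of how the new functional interface can be used (assuming the pairwise functions listed in the changelog below, such as pairwise_cosine_similarity, are imported from torchmetrics.functional):

```python
import torch
from torchmetrics.functional import pairwise_cosine_similarity, pairwise_euclidean_distance

x = torch.randn(4, 8)  # N=4 observations, each in d=8 dimensions

# Each pairwise function returns an [N, N] matrix where entry (i, j) is the
# metric evaluated on rows i and j of x.
cos_sim = pairwise_cosine_similarity(x)  # shape [4, 4]
dist = pairwise_euclidean_distance(x)    # shape [4, 4]
print(cos_sim.shape, dist.shape)
```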
Detection
TorchMetrics v0.6 now includes a detection package that provides the MAP (mean average precision) metric. The implementation essentially wraps pycocotools to ensure we get the correct values, but with the benefit of being able to scale to multiple devices (like any other metric in TorchMetrics).
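A minimal sketch of the class-based interface, assuming the MAP class added in v0.6 (later renamed MeanAveragePrecision) and that pycocotools is installed:

```python
import torch
from torchmetrics.detection.map import MAP  # v0.6 import path; renamed MeanAveragePrecision later

# One image with a single predicted box (plus confidence score) and one ground-truth box.
preds = [dict(
    boxes=torch.tensor([[258.0, 41.0, 606.0, 285.0]]),  # [xmin, ymin, xmax, ymax]
    scores=torch.tensor([0.54]),
    labels=torch.tensor([0]),
)]
target = [dict(
    boxes=torch.tensor([[214.0, 41.0, 562.0, 285.0]]),
    labels=torch.tensor([0]),
)]

metric = MAP()           # wraps pycocotools under the hood
metric.update(preds, target)
print(metric.compute())  # mAP results: map, map_50, map_75, ...
```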
New additions
In the audio package, we have two new metrics: Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI). Both metrics can be used to assess speech quality.
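A hedged sketch of how both metrics can be called, assuming the v0.6 class names PESQ and STOI (renamed in later releases) and that the third-party pesq and pystoi packages are installed:

```python
import torch
from torchmetrics.audio import PESQ, STOI  # v0.6 class names; renamed in later releases

torch.manual_seed(1)
preds = torch.randn(8000)   # degraded / processed speech waveform (1 s at 8 kHz)
target = torch.randn(8000)  # clean reference waveform

nb_pesq = PESQ(8000, "nb")  # narrow-band PESQ; requires the `pesq` package
stoi = STOI(8000)           # requires the `pystoi` package
print(nb_pesq(preds, target), stoi(preds, target))
```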
In the retrieval package, we also have two new metrics: R-precision and hit rate. R-precision corresponds to recall at the R-th position, where R is the number of relevant documents for a given query. The hit rate is the fraction of queries for which at least one relevant document appears among the top retrieved results.
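A small sketch of the retrieval interface, assuming RetrievalHitRate from v0.6; as with the other retrieval metrics, an indexes tensor groups predictions by query:

```python
import torch
from torchmetrics import RetrievalHitRate

# Two queries (grouped by `indexes`), each with a few scored documents.
indexes = torch.tensor([0, 0, 0, 1, 1, 1, 1])
preds = torch.tensor([0.2, 0.3, 0.5, 0.1, 0.3, 0.5, 0.2])              # predicted relevance scores
target = torch.tensor([False, False, True, False, True, False, True])  # true relevance

hr2 = RetrievalHitRate(k=2)  # was a relevant document retrieved in the top 2?
print(hr2(preds, target, indexes=indexes))
```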
The text package also receives an update in the form of two new metrics: SacreBLEU score and character error rate. The SacreBLEU score provides a more systematic way of comparing BLEU scores across tasks. The character error rate is similar to the word error rate, but instead measures how well a given algorithm has predicted a sentence based on a character-by-character comparison.
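For example, the character error rate can be computed directly on strings (a sketch; the argument names may differ slightly between versions, but the order is predictions first, references second):

```python
from torchmetrics import CharErrorRate

preds = ["this is the prediction", "there is an other sample"]
refs = ["this is the reference", "there is another one"]

cer = CharErrorRate()
print(cer(preds, refs))  # fraction of character edits needed to match the references
```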
The regression package gets a single new metric: the Tweedie deviance score. Deviance scores are generally a better measure of fit than measures such as squared error when modelling data coming from highly skewed distributions.
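A minimal sketch, assuming the TweedieDevianceScore class added in v0.6; the power argument selects the member of the Tweedie family (0 recovers squared error, 1 Poisson deviance, 2 Gamma deviance):

```python
import torch
from torchmetrics import TweedieDevianceScore

preds = torch.tensor([4.0, 3.0, 2.0, 1.0])
target = torch.tensor([1.0, 2.0, 3.0, 4.0])

deviance = TweedieDevianceScore(power=1.5)  # 1 < power < 2: compound Poisson-Gamma
print(deviance(preds, target))
```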
Finally, we have added five new metrics for simple aggregation: SumMetric, MeanMetric, MinMetric, MaxMetric, and CatMetric. All five metrics take a single input (either native Python floats or torch.Tensor) and keep track of the sum, average, min, etc. These new aggregation metrics are especially useful in combination with self.log from Lightning if you want to log something other than the average of the metric you are tracking.
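A short sketch of the aggregation metrics, which accept plain floats as well as tensors:

```python
import torch
from torchmetrics import MeanMetric, MaxMetric

mean = MeanMetric()
mean.update(1.0)                       # plain Python floats are accepted
mean.update(torch.tensor([2.0, 3.0]))  # ... as are tensors
print(mean.compute())                  # running mean of all values seen so far -> tensor(2.)

largest = MaxMetric()
largest.update(torch.tensor([0.3, 0.9]))
largest.update(0.5)
print(largest.compute())               # tensor(0.9000)
```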
[0.6.0] - 2021-10-28
https://devblog.pytorchlightning.ai/torchmetrics-v0-6-more-metrics-than-ever-e98c3983621e
Detail changes
Added
- RetrievalRPrecision (Implemented R-Precision for IR #577)
- RetrievalHitRate (Implemented HitRate for IR #576)
- SacreBLEUScore (Add SacreBLEUScore #546)
- CharErrorRate (Character Error Rate #575)
- MAP (mean average precision) metric to new detection package (Add mean average precision metric for object detection #467)
- Float target support for the nDCG metric (Add float target support to class & functional NDCG #437)
- average argument to AveragePrecision metric for reducing multi-label and multi-class problems (Adds average argument to AveragePrecision metric #477)
- MultioutputWrapper (Implement MultioutputWrapper #510)
- higher_is_better as constant attribute (Metric sweeping #544)
- higher_is_better to rest of codebase (Add missing higher_is_better attribute to metrics #584)
- SumMetric, MeanMetric, CatMetric, MinMetric, MaxMetric (Simple aggregation metrics #506)
- pairwise_cosine_similarity
- pairwise_euclidean_distance
- pairwise_linear_similarity
- pairwise_manhatten_distance
Changed
- AveragePrecision will now by default output the macro average for multilabel and multiclass problems (Adds average argument to AveragePrecision metric #477)
- half, double and float will no longer change the dtype of the metric states. Use metric.set_dtype instead (Fix dtype issues #493)
- AverageMeter renamed to MeanMetric (Simple aggregation metrics #506)
- is_differentiable changed from property to a constant attribute (make is_differentiable as attribute #551)
- ROC and AUROC will no longer throw an error when either the positive or negative class is missing. Instead, they return 0 scores and give a warning
Deprecated
- torchmetrics.functional.self_supervised.embedding_similarity in favour of the new pairwise submodule
Removed
- dtype property (Fix dtype issues #493)
Fixed
- F1 with average='macro' and ignore_index != None (Fix f1 score for macro and ignore index #495)
- pit by using the returned first result to initialize device and type (make metric_mtx type and device correct #533)
- SSIM metric using too much memory (Fix SSIM memory #539)
- device property was not properly updated when the metric was a child of a module (Fix child device #542)
Contributors
@an1lam, @Borda, @karthikrangasai, @lucadiliello, @mahinlma, @Obus, @quancs, @SkafteNicki, @stancld, @tkupek
If we forgot someone due to not matching commit email with GitHub account, let us know :]