Multimedia - audio & image quality

@Borda Borda released this 29 Jun 13:01

Overview

https://devblog.pytorchlightning.ai/torchmetrics-v0-4-introducing-multimedia-metrics-e6380a3ad354

Audio

The first highlight of v0.4.0 is a set of three new metrics for evaluating audio data: scale-invariant signal-to-distortion ratio (SI-SDR), scale-invariant signal-to-noise ratio (SI-SNR), and signal-to-noise ratio (SNR). All three take a predicted audio tensor and a target tensor, both of shape [..., time], and compute the metric over the time axis.
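
As a rough illustration of what these metrics measure, here is a minimal pure-Python sketch for a single 1-D signal. It is not the TorchMetrics implementation: the function names, the zero_mean flag, and the sample values are assumptions made for this example, and the library itself operates on tensors of shape [..., time].

```python
import math


def snr(preds, target):
    """Signal-to-noise ratio in dB for one 1-D signal (illustrative only)."""
    signal_power = sum(t * t for t in target)
    noise_power = sum((t - p) ** 2 for p, t in zip(preds, target))
    return 10 * math.log10(signal_power / noise_power)


def si_sdr(preds, target, zero_mean=False):
    """Scale-invariant SDR in dB for one 1-D signal (illustrative only).

    Assumption: zero_mean=True removes the mean of both signals first,
    which is how SI-SNR is commonly derived from SI-SDR.
    """
    if zero_mean:
        mean_p = sum(preds) / len(preds)
        mean_t = sum(target) / len(target)
        preds = [p - mean_p for p in preds]
        target = [t - mean_t for t in target]
    # Project the prediction onto the target to find the optimal scaling.
    alpha = sum(p * t for p, t in zip(preds, target)) / sum(t * t for t in target)
    scaled_target = [alpha * t for t in target]
    target_power = sum(s * s for s in scaled_target)
    distortion_power = sum((p - s) ** 2 for p, s in zip(preds, scaled_target))
    return 10 * math.log10(target_power / distortion_power)


# Hypothetical sample data.
target = [1.0, 2.0, 3.0, 4.0]
preds = [1.1, 1.9, 3.2, 3.8]

snr_db = snr(preds, target)
si_sdr_db = si_sdr(preds, target)
# Rescaling the prediction leaves SI-SDR unchanged, but would change SNR.
si_sdr_rescaled_db = si_sdr([2 * p for p in preds], target)
```

The last two values agree up to floating-point error, which is the sense in which the metric is scale-invariant.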

Image

Version v0.4.0 also includes a completely new image package. Since its initial 0.2.0 release, TorchMetrics has included PSNR and SSIM, two metrics for evaluating image quality, in its regression module.
With the image module, we are adding three new metrics for evaluating the quality of generative models (such as GANs): Inception score (IS), Fréchet inception distance (FID), and kernel inception distance (KID).
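
To give a feel for the distance that FID is built on, the sketch below computes the squared Fréchet distance between two Gaussians fitted to 1-D samples. This is a deliberately simplified assumption: the real metric fits multivariate Gaussians to Inception network features and uses full covariance matrices, none of which is shown here. The function name and the sample values are hypothetical.

```python
import statistics


def frechet_distance_1d(real, fake):
    """Squared Fréchet distance between 1-D Gaussians fitted to two samples.

    In one dimension the formula reduces to
    (mu_r - mu_f)^2 + (sigma_r - sigma_f)^2.
    Population standard deviation is an arbitrary choice for this sketch.
    """
    mu_r, mu_f = statistics.fmean(real), statistics.fmean(fake)
    sd_r, sd_f = statistics.pstdev(real), statistics.pstdev(fake)
    return (mu_r - mu_f) ** 2 + (sd_r - sd_f) ** 2


# Hypothetical 1-D "features" of real and generated samples.
real_features = [0.0, 1.0, 2.0, 3.0]
shifted_features = [3.0, 4.0, 5.0, 6.0]

same_distance = frechet_distance_1d(real_features, real_features)
shifted_distance = frechet_distance_1d(real_features, shifted_features)
```

Identical samples give a distance of zero, and the distance grows as the generated distribution drifts away from the real one in mean or spread.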

More Functionality

In addition to the new audio and image package, we also want to highlight a couple of features:

  • Addition of MeanAbsolutePercentageError (MAPE) metric to the regression package. Useful in regression settings where you want to focus on the relative instead of absolute error.
  • Addition of KLDivergence metric to the classification package. Useful for measuring the divergence between probability distributions, such as the ones produced by variational auto-encoders.
  • Addition of CosineSimilarity metric to the regression package. Useful for calculating the angle between two embedding vectors in domains such as metric learning.
  • As requested by multiple users, Accuracy, Precision, Recall, FBeta, F1, StatScore, Hamming, ConfusionMatrix now directly accept unnormalized predictions, e.g. logits from your model. No need to call .softmax(dim=-1) anymore!
  • All modular metrics now have both sync and sync_context methods that give the user full control over when metric states are synced. Note that syncing still happens automatically whenever the compute method is called.
  • The is_differentiable property has been adopted by many more of our metrics!
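
The quantities behind several of the bullets above can be sketched in a few lines of plain Python. These are textbook formulas written for illustration, not the library's code: the function names, the eps guard, and the sample values are all assumptions. The final lines show one reason logits can be used directly for top-1 metrics: softmax does not change which class has the largest score.

```python
import math


def mean_absolute_percentage_error(preds, target, eps=1e-12):
    """Mean of |pred - target| / |target|; eps guards against division by zero."""
    return sum(abs(p - t) / max(abs(t), eps) for p, t in zip(preds, target)) / len(target)


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical values for each metric.
mape = mean_absolute_percentage_error([110.0, 90.0], [100.0, 100.0])
kl = kl_divergence([0.5, 0.5], [0.25, 0.75])
cos_parallel = cosine_similarity([1.0, 2.0], [2.0, 4.0])
cos_orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])

# The predicted class is the same whether taken from logits or probabilities.
logits = [2.0, -1.0, 0.5]
probs = softmax(logits)
class_from_logits = logits.index(max(logits))
class_from_probs = probs.index(max(probs))
```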

Thanks

Big thanks to all community members for their contributions and feedback.
A special thanks to @quancs for leading the development of the new audio package.

[0.4.0] - 2021-06-24

Added

  • Added Cosine Similarity metric (#305)
  • Added Specificity metric (#210)
  • Added add_metrics method to MetricCollection for adding additional metrics after initialization (#221)
  • Added pre-gather reduction in the case of dist_reduce_fx="cat" to reduce communication cost (#217)
  • Added better error message for AUROC when num_classes is not provided for multiclass input (#244)
  • Added support for unnormalized scores (e.g. logits) in Accuracy, Precision, Recall, FBeta, F1, StatScore, Hamming, ConfusionMatrix metrics (#200)
  • Added MeanAbsolutePercentageError (MAPE) metric (#248)
  • Added squared argument to MeanSquaredError for computing RMSE (#249)
  • Added FID metric (#213)
  • Added is_differentiable property to ConfusionMatrix, F1, FBeta, Hamming, Hinge, IOU, MatthewsCorrcoef, Precision, Recall, PrecisionRecallCurve, ROC, StatScores (#253)
  • Added audio metrics: SNR, SI_SDR, SI_SNR (#292)
  • Added Inception Score metric to image module (#299)
  • Added KID metric to image module (#301)
  • Added sync and sync_context methods for manually controlling when metric states are synced (#302)
  • Added KLDivergence metric (#247)
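
For the new squared argument on MeanSquaredError, the idea can be sketched as follows. This is a plain-Python stand-in rather than the library class; the function name and sample values are assumptions, with squared=False returning the root of the mean squared error.

```python
import math


def mean_squared_error(preds, target, squared=True):
    """Return MSE when squared=True, otherwise RMSE (illustrative stand-in)."""
    mse = sum((p - t) ** 2 for p, t in zip(preds, target)) / len(target)
    return mse if squared else math.sqrt(mse)


# Hypothetical sample data.
preds = [2.5, 0.0, 2.0, 8.0]
target = [3.0, -0.5, 2.0, 7.0]

mse = mean_squared_error(preds, target)
rmse = mean_squared_error(preds, target, squared=False)
```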

Changed

  • Forward cache is reset when reset method is called (#260)
  • Improved per-class metric handling for imbalanced datasets for precision, recall, precision_recall, fbeta, f1, accuracy, and specificity (#204)
  • Decorated MetricCollection forward with torch.jit.unused (#307)
  • Renamed thresholds argument to binned metrics for manually controlling the thresholds (#322)

Deprecated

  • Deprecated torchmetrics.functional.mean_relative_error (#248)
  • Deprecated num_thresholds argument in BinnedPrecisionRecallCurve (#322)

Removed

  • Removed argument is_multiclass (#319)

Fixed

  • AUC now also supports multi-dimensional inputs when all but one dimension have size 1 (#242)
  • Fixed dtype of modular metrics after reset has been called (#243)
  • Fixed calculation in matthews_corrcoef to correctly match formula (#321)
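
For reference, the standard binary form of the Matthews correlation coefficient is sketched below from confusion-matrix counts. It is not the library's code (which also handles the multiclass case); the function name, the zero-denominator convention, and the counts are assumptions for this example.

```python
import math


def binary_matthews_corrcoef(tp, tn, fp, fn):
    """Binary MCC: (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returning 0.0 for a zero denominator is a convention chosen for this sketch.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical confusion-matrix counts.
mcc_perfect = binary_matthews_corrcoef(tp=5, tn=5, fp=0, fn=0)
mcc_inverted = binary_matthews_corrcoef(tp=0, tn=0, fp=5, fn=5)
mcc_mixed = binary_matthews_corrcoef(tp=6, tn=3, fp=1, fn=2)
```

The coefficient ranges from -1 (every prediction wrong) to 1 (every prediction right).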

Contributors

@AnselmC, @arvindmuralie77, @bhadreshpsavani, @Borda, @GiannisVagionakis, @hassiahk, @IgorHoholko, @johannespitz, @justusschock, @maximsch2, @pranjaldatta, @quancs, @simran2905, @SkafteNicki, @tchaton

If we forgot someone due to not matching commit email with GitHub account, let us know :]