We are excited to announce that TorchMetrics v0.7 is now publicly available. This is a significant release: it includes several new metrics (mainly for NLP), naming and import changes, general improvements to the API, and other great features. TorchMetrics now has more than 60 metrics, and the package is more user-friendly than ever.
NLP metrics - Text package
The text package has been part of TorchMetrics since v0.5. With the growing capability of language-generation models, there is a real need for reliable evaluation metrics. With several added metrics and a unified API, TorchMetrics makes using them easier than ever. TorchMetrics v0.7 adds several machine-translation metrics, such as chrF, chrF++, Translation Edit Rate, and Extended Edit Distance. It also adds Match Error Rate, Word Information Lost, Word Information Preserved, and the SQuAD evaluation metrics. Last but not least, the ROUGE score can now be evaluated against multiple references.
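For example, scoring a candidate translation against multiple references with chrF and TER looks roughly like this (a minimal sketch assuming the v0.7 text API and default metric settings):

```python
from torchmetrics import CHRFScore, TranslationEditRate

# One predicted sentence, scored against two acceptable references.
preds = ["the cat is on the mat"]
target = [["there is a cat on the mat", "a cat sits on the mat"]]

chrf = CHRFScore()           # character n-gram F-score
ter = TranslationEditRate()  # edit operations per reference word

print(chrf(preds, target))   # higher is better
print(ter(preds, target))    # lower is better
```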
Argument unification
Importantly, all text metrics now assume the preds, target input order, with these explicit keyword argument names. Any different naming used before v0.7 is deprecated and will be removed completely in v0.8.
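In practice, every text metric is called the same way, positionally or with the explicit keywords (a small sketch assuming the v0.7 API):

```python
from torchmetrics import WordErrorRate

preds = ["hello world"]
target = ["hello beautiful world"]

wer = WordErrorRate()
# Predictions first, target second ...
print(wer(preds, target))
# ... or spelled out with the unified keyword names.
print(wer(preds=preds, target=target))  # 1 deletion / 3 reference words = 0.33
```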
Import and naming changes
TorchMetrics v0.7 brings changes, both extensive and minor, to how metrics should be imported. The import changes take effect in v0.7 directly, meaning that you will most likely need to change the import statements for some specific metrics. All naming changes follow our standard deprecation process: in v0.7, any metric that is renamed still works but raises a deprecation warning asking you to use the new metric name. From v0.8, the old metric names will no longer be available.
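For instance, the F-score metrics were renamed; the old names still import and run in v0.7, only with a deprecation warning (a sketch; the exact warning text may differ):

```python
import torch
from torchmetrics import F1Score  # new name; `from torchmetrics import F1` still works in v0.7 but warns

preds = torch.tensor([0, 2, 1, 2])
target = torch.tensor([0, 1, 1, 2])

f1 = F1Score(num_classes=3)  # micro-averaged by default
print(f1(preds, target))     # 3 of 4 correct -> tensor(0.7500)
```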
[0.7.0] - 2022-01-17
Added
- `MatchErrorRate` (MER - Match Error Rate #619)
- `WordInfoLost` and `WordInfoPreserved` (Word Information Lost and Preserved - ASR metrics #630)
- `SQuAD` (Add SQuAD Metric. #623)
- `CHRFScore` (Add ChrF++ #641)
- `TranslationEditRate` (Add TER #646)
- `ExtendedEditDistance` (add Extended Edit Distance (EED) metric #668)
- `MultiScaleSSIM` into image metrics (Add MultiScaleStructuralSimilarityIndexMeasure #679)
- Signal to Distortion Ratio (`SDR`) to audio package (adding SDR [audio] #565)
- `MinMaxMetric` to wrappers (min max wrapper #556)
- `ignore_index` to retrieval metrics (Add ignore_idx to retrieval metrics #676)
- Multi-reference support for `ROUGEScore` (Multi Reference ROUGEScore #680)

Changed
- `BLEUScore` now expects untokenized input, to stay consistent with all the other text metrics (Untokenized Bleu score to stay consistent with all the other text metrics #640)
- `TER`, `BLEUScore`, `SacreBLEUScore`, and `CHRFScore` now expect the input order predictions first, target second (Unify the input order for text (NLG) metrics - BLEU, SacreBLEU, TER, CHRF #696)
- Changed dtype from `torch.float` to `torch.long` in `ConfusionMatrix` to accommodate larger values (bugfix: change dtype of confmat to int64 #715)
- Unified `preds`, `target` input-argument naming across all text metrics: `bert`, `bleu`, `chrf`, `sacre_bleu`, `wip`, `wil` (#723) and `cer`, `ter`, `wer`, `mer`, `rouge`, `squad` (#727)
Deprecated
- Renamed text WER metric:
  - `functional.wer` -> `functional.word_error_rate`
  - `WER` -> `WordErrorRate`
- Renamed correlation coefficient classes:
  - `MatthewsCorrcoef` -> `MatthewsCorrCoef`
  - `PearsonCorrcoef` -> `PearsonCorrCoef`
  - `SpearmanCorrcoef` -> `SpearmanCorrCoef`
- Renamed audio STOI metric (#758):
  - `audio.STOI` -> `audio.ShortTimeObjectiveIntelligibility`
  - `functional.audio.stoi` -> `functional.audio.short_time_objective_intelligibility`
- Renamed audio PESQ metric:
  - `functional.audio.pesq` -> `functional.audio.perceptual_evaluation_speech_quality`
  - `audio.PESQ` -> `audio.PerceptualEvaluationSpeechQuality`
- Renamed audio SDR metrics (see the snippet after this list):
  - `functional.sdr` -> `functional.signal_distortion_ratio`
  - `functional.si_sdr` -> `functional.scale_invariant_signal_distortion_ratio`
  - `SDR` -> `SignalDistortionRatio`
  - `SI_SDR` -> `ScaleInvariantSignalDistortionRatio`
- Renamed audio SNR metrics:
  - `functional.snr` -> `functional.signal_noise_ratio`
  - `functional.si_snr` -> `functional.scale_invariant_signal_noise_ratio`
  - `SNR` -> `SignalNoiseRatio`
  - `SI_SNR` -> `ScaleInvariantSignalNoiseRatio`
- Renamed F-score metrics:
  - `functional.f1` -> `functional.f1_score`
  - `F1` -> `F1Score`
  - `functional.fbeta` -> `functional.fbeta_score`
  - `FBeta` -> `FBetaScore`
- Renamed Hinge metric (hinge to hinge_loss #734):
  - `functional.hinge` -> `functional.hinge_loss`
  - `Hinge` -> `HingeLoss`
- Renamed PSNR metrics (peak_signal_noise_ratio #732):
  - `functional.psnr` -> `functional.peak_signal_noise_ratio`
  - `PSNR` -> `PeakSignalNoiseRatio`
- Renamed PIT metrics (permutation_invariant_training #737):
  - `functional.pit` -> `functional.permutation_invariant_training`
  - `PIT` -> `PermutationInvariantTraining`
- Renamed SSIM metrics:
  - `functional.ssim` -> `functional.structural_similarity_index_measure`
  - `SSIM` -> `StructuralSimilarityIndexMeasure`
- Renamed `MAP` to `MeanAveragePrecision` metric (rename MeanAveragePrecision #754)
- Renamed image metrics:
  - `image.FID` -> `image.FrechetInceptionDistance`
  - `image.KID` -> `image.KernelInceptionDistance`
  - `image.LPIPS` -> `image.LearnedPerceptualImagePatchSimilarity`
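To illustrate one of the audio renames in action, the snippet below computes SI-SDR under its new name (a sketch assuming the v0.7 API; random tensors stand in for real waveforms):

```python
import torch
from torchmetrics import ScaleInvariantSignalDistortionRatio  # formerly SI_SDR

preds = torch.randn(8000)   # estimated waveform
target = torch.randn(8000)  # reference waveform

si_sdr = ScaleInvariantSignalDistortionRatio()
print(si_sdr(preds, target))  # value in dB; higher is better
```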
Removed
- `embedding_similarity` metric (Remove deprecated code #638)
- `concatenate_texts` argument from `wer` metric (Remove deprecated code #638)
- `newline_sep` and `decimal_places` arguments from `rouge` metric (Remove deprecated code #638)

Fixed
- `MetricCollection` kwargs filtering when no `kwargs` are present in the update signature (Fix Collection kwargs filtering #707)

Contributors
@ashutoshml, @Borda, @cuent, @Fariborzzz, @getgaurav2, @janhenriklambrechts, @justusschock, @karthikrangasai, @lucadiliello, @mahinlma, @mathemusician, @mona0809, @mrleu, @puhuk, @quancs, @SkafteNicki, @stancld, @twsl
If we forgot someone due to not matching commit email with GitHub account, let us know :]
This discussion was created from the release New NLP metrics and improved API.