
Fix/multiclass recall macro avg ignore index #2710

Draft
wants to merge 61 commits into
base: master

Conversation

@rittik9 rittik9 (Contributor) commented Sep 1, 2024

What does this PR do?

Fixes #2441

Details
  • Was this discussed/agreed via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

Did you have fun?

Yes

Issue:

  • The root of the problem seems to be that the ignore_index information is not properly propagated to the final averaging step, i.e. the _adjust_weights_safe_divide function does not know which class should be ignored (a minimal reproduction sketch follows below).
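For reference, a minimal reproduction sketch of the behaviour described above (the tensors are illustrative and not taken verbatim from issue #2441):

```python
import torch
from torchmetrics.classification import MulticlassRecall

# one sample is labelled with the ignored class, and one non-ignored sample
# is (wrongly) predicted as the ignored class
preds = torch.tensor([0, 1, 2, 2])
target = torch.tensor([1, 1, 2, 0])

metric = MulticlassRecall(num_classes=3, average="macro", ignore_index=0)
# expected: mean recall over classes 1 and 2 only -> (0.5 + 1.0) / 2 = 0.75;
# before the fix, class 0 still entered the averaging step and pulled the value down
print(metric(preds, target))
```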

Solution:

  • To address this, the code now preserves the ignore_index information throughout the entire process, passing it through all intermediate steps up to the final averaging stage, i.e. the _adjust_weights_safe_divide function.
  • The _adjust_weights_safe_divide function now accepts an additional ignore_index parameter, which is passed in via the _precision_recall_reduce function called in the compute method of the MulticlassRecall class. With this information, _adjust_weights_safe_divide sets the weight of the ignored class to 0 so that it no longer contributes to the macro average (see the sketch after this list).
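A simplified sketch of the adjusted weighting (the real torchmetrics function has a slightly different signature and covers more averaging modes; this only illustrates the ignore_index handling this PR adds):

```python
from typing import Optional

import torch
from torch import Tensor


def _adjust_weights_safe_divide(
    score: Tensor,
    average: str,
    multilabel: bool,
    tp: Tensor,
    fp: Tensor,
    fn: Tensor,
    ignore_index: Optional[int] = None,  # new argument carrying the ignored class
) -> Tensor:
    # "weighted" uses the class support as weights, "macro" uses uniform weights
    weights = (tp + fn).float() if average == "weighted" else torch.ones_like(score)
    if not multilabel:
        # classes with no statistics at all do not contribute to the average
        weights[tp + fp + fn == 0] = 0.0
    if ignore_index is not None:
        # the fix: zero the weight of the ignored class so it is excluded from the average
        weights[..., ignore_index] = 0.0
    return (weights * score).sum(-1) / weights.sum(-1).clamp(min=1e-12)
```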

📚 Documentation preview 📚: https://torchmetrics--2710.org.readthedocs.build/en/2710/

@Borda Borda (Member) left a comment


looks good, can we also add a test for this case...

@rittik9 rittik9 marked this pull request as draft September 2, 2024 19:12
@rittik9 rittik9 marked this pull request as ready for review September 2, 2024 22:15
@rittik9 rittik9 (Contributor, Author) commented Sep 2, 2024

looks good, can we also add a test for this case...

Sure

@rittik9 rittik9 requested a review from Borda September 4, 2024 11:26
Borda previously approved these changes Sep 4, 2024
@Borda Borda self-requested a review September 6, 2024 09:10
@Borda Borda dismissed their stale review September 6, 2024 09:10

pending on adding test

@Borda Borda added the bug / fix Something isn't working label Sep 6, 2024
@rittik9 rittik9 (Contributor, Author) commented Sep 6, 2024

@Borda What do I have to modify?

@Borda Borda (Member) commented Sep 11, 2024

@rittik9 mind checking the changed doctest values and whether they are correct?

@Borda Borda enabled auto-merge (squash) October 31, 2024 11:58
@Borda Borda mentioned this pull request Oct 31, 2024
@@ -661,6 +661,37 @@ def test_corner_case():
assert res == 1.0


def test_multiclass_recall_ignore_index():
Member:

Seems we are already testing various ignore_index values against the reference metric, so if we had it wrong, those tests should not have passed... is it possible that we also have a bug in the reference metric?
cc: @SkafteNicki

Member:

Looking at the code, the ignore index is already applied in _multilabel_stat_scores_format, which reduces the preds/target size the same way as the reference metric does, so calling it with null weights in fact already ignores the additional index.

@rittik9 rittik9 (Contributor, Author) Nov 3, 2024:

The problem is that we are using sklearn's recall_score as a reference for our unit tests. Even though the _reference_sklearn_precision_recall_multiclass() function uses remove_ignore_index to drop the predictions whose true values belong to the ignore_index class before passing them to recall_score, that does not matter: whenever average='macro', sklearn's recall_score returns the mean over the total number of classes, because we pass all the classes in recall_score()'s labels argument. That is why the unit tests failed in the first place. I think we need to fix the unit tests to take care of ignore_index using the labels argument of sklearn's recall_score() function. I've prepared a notebook for explanation. cc: @Borda
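To make the above concrete, a small illustration with made-up values of how the labels argument changes the macro average in sklearn:

```python
import numpy as np
from sklearn.metrics import recall_score

ignore_index = 0
target = np.array([1, 1, 2, 0])
preds = np.array([0, 1, 2, 2])

# what remove_ignore_index does: drop samples whose true label is the ignored class
mask = target != ignore_index
t, p = target[mask], preds[mask]

# macro over all classes: the ignored class still contributes a 0 to the mean
print(recall_score(t, p, labels=[0, 1, 2], average="macro", zero_division=0))  # 0.5

# macro restricted to the non-ignored classes: the intended reference value
print(recall_score(t, p, labels=[1, 2], average="macro", zero_division=0))  # 0.75
```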

@rittik9 rittik9 marked this pull request as draft November 1, 2024 20:31
auto-merge was automatically disabled November 1, 2024 20:31

Pull request was converted to draft

@rittik9 rittik9 marked this pull request as ready for review November 2, 2024 00:05
@rittik9 rittik9 marked this pull request as draft November 2, 2024 00:16
@rittik9 rittik9 marked this pull request as ready for review November 2, 2024 00:47
@rittik9 rittik9 marked this pull request as draft November 2, 2024 01:22
@DimitrisMantas

Just to chime in, I think this issue is present in pretty much all metrics that make use of _adjust_weights_safe_divide.

I see this PR fixes some of them, but others, such as JaccardIndex, are left as is.

Labels: bug / fix (Something isn't working), topic: Classif

Successfully merging this pull request may close these issues.

Incorrect result in computing MulticlassRecall macro average when ignore_index is specified
5 participants