Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CV and NLP submodule refactor #27

Merged
merged 25 commits into from
Jun 20, 2023
Merged

CV and NLP submodule refactor #27

merged 25 commits into from
Jun 20, 2023

Conversation

rbavery
Copy link

@rbavery rbavery commented Jun 10, 2023

@NISH1001 can you give this a look and let me know what you think of the new structure? I moved all _base.py files in the subfolders models, evaluators, etc. to their own _base folder so that the nlp and cv submodules can share those base classes. Still need to refactor, get tests passing, and then make sure this picks up all changes from #26 once that is merged.

addresses #25 and #24

weiji14 and others added 9 commits June 6, 2023 13:15
Setting up GitHub Actions Continuous Integration workflow to run unit tests. Initially testing on Ubuntu-22.04 and Python 3.8.
Needed to get Python 3.11 wheels.
Needed to get Python 3.11 wheels.
Contains bugfix for Python 3.11 compatibility, see huggingface/datasets#5238.
Using the HuggingFace datasets library to load squad_v2 and imdb datasets instead of from a data/*.json file. Only taking 10 rows as a sample.
Should have just used the `get_squad_v2` and `get_imdb` functions from evalem/misc/datasets.py directly. Needed to get the correct data schema of a Python dictionary with keys 'inputs' and 'references'.
@rbavery rbavery marked this pull request as draft June 10, 2023 02:12
@NISH1001
Copy link
Collaborator

@rbavery I am looking at this by running some of the pipelines I have. Seems like we need to fix some imports. I will try to push the fixes. Thank you very much for the PR.

@NISH1001
Copy link
Collaborator

@rbavery I think the restructuring needs more refactoring primary in the _base.models where we are pivoting with NLP only models in that. I think we could move that to downstream at nlp.models? For instance we could still have nlp.models._base where we define nlp.models._base.HFWrapper which is derived from evalem._base.models.ModelWrapper.

Also, I think we could improve the base ModelWrapper so that downstream model will inherit from that? What do you think?

Also, the imports aren't working because of 2 level of parent scoping like in evalem.nlp.models.__init__ from .._base.models import <X> should be ..._base.models (3 level scoping)

@NISH1001
Copy link
Collaborator

Let's first fix the imports and then figure out how we're going to abstract the wrappers?

`evalem.nlp.models._base`.

Also fixes import bugs for `evalem.nlp.models`
Now `evalem._base.structures` have generic prediction instance types

`evalem.nlp.structures` now has nlp specific instances

Also, `evalem._base.structures.ClassificationDTO` is added for any
classification task.
@NISH1001
Copy link
Collaborator

NISH1001 commented Jun 15, 2023

@weiji14 @rbavery I have fixed some of the packaging issues involved with evalem.nlp.models. Also, I have refactored evalem._base.structures by segregating nlp-specific structure to evalem.nlp.structures.

See:
94991b4

@weiji14
Copy link
Member

weiji14 commented Jun 15, 2023

@weiji14 I have fixed some of the packaging issues involved with evalem.nlp.models. Also, I have refactored evalem._base.structures by segregating nlp-specific structure to evalem.nlp.structures.

See: 94991b4

Cool, can you also resolve the merge conflict on tests/conftest.py?

NISH1001 and others added 6 commits June 16, 2023 10:19
Now we can do
```
from evalem import BaseMetrics
from evalem import Evaluator

Evaluator(metrics=[
    BertScore(device="mps"),
    ExactMatchMetricNLP(),
    BaseMetrics.ConfusionMatrix()
])(
    predictions=[1, 2, 3],
    references=[2, 1, 3]
)
```
Now we have:
- evalem.nlp.metrics.NLPMetric
- evalem.cv.metrics.CVMetric
- evalem.nlp.evaluators.NLPEvaluator
- evalem.cv.evaluators.CVEvaluator
@rbavery
Copy link
Author

rbavery commented Jun 16, 2023

@NISH1001 I think the refactor is complete. Can you review this as is and I'll start the addition of the dice metric and hls FM evaluation in another PR?

I think most of the changes I did to get tests passing are inline with the refactor. The only thing I'm unsure about is the jury test for single ints. I updated that following the comment to reflect that the func only works for strings right now and so I only tested for that.

def test_single_ints():
    # right now, the jury conversion only works for only strings
    refs = ["1", "2", "3", "4"]
    assert format_to_jury(refs) == refs

@NISH1001
Copy link
Collaborator

@muthukumaranR @rbavery @weiji14 @xhagrg
This restructuring passed the final tests. the evalem.nlp sub is working fine. I will go on and start merging.

@rbavery Are there any metrics we should be adding to evalem.cv.metrics?

@NISH1001 NISH1001 marked this pull request as ready for review June 20, 2023 18:54
@NISH1001
Copy link
Collaborator

@rbavery I think we can add new cv metrics by inheriting from evalem.cv.metrics._base.CVMetric base class. Let's have evalem.cv.metrics.defaults.py module and add the dice metric there.

@rbavery
Copy link
Author

rbavery commented Jun 20, 2023

I think we can add Dice for CV in a separate PR and get this merged. I'm going to be at a workshop this week but will have some time to work on this Friday.

@NISH1001
Copy link
Collaborator

@rbavery That sounds good. I will merge this then. Thanks for the help with this.

@NISH1001
Copy link
Collaborator

I am gonna squash-merge because a lot of redundant commits.

@NISH1001 NISH1001 merged commit 0e96113 into main Jun 20, 2023
@NISH1001 NISH1001 deleted the cv-submodule branch June 20, 2023 19:09
@NISH1001
Copy link
Collaborator

Test coverage so far:
image

@weiji14
Copy link
Member

weiji14 commented Jun 20, 2023

Test coverage so far:

Do you want us to set up code coverage on CI?

@NISH1001
Copy link
Collaborator

@weiji14 Would be nice to have I guess...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants