CV and NLP submodule refactor #27
Conversation
Setting up GitHub Actions Continuous Integration workflow to run unit tests. Initially testing on Ubuntu-22.04 and Python 3.8.
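For context, a minimal workflow along these lines would live under .github/workflows/; the file name, action versions, and install step here are assumptions, not taken from this PR:
```
# Hypothetical .github/workflows/test.yml sketch
name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.8"
      # Assumes the project defines a "test" extra; adjust to the actual setup.
      - run: pip install -e ".[test]"
      - run: pytest
```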
Needed to get Python 3.11 wheels.
Contains bugfix for Python 3.11 compatibility, see huggingface/datasets#5238.
Using the HuggingFace datasets library to load the squad_v2 and imdb datasets instead of loading from a data/*.json file. Only taking 10 rows as a sample.
Should have just used the `get_squad_v2` and `get_imdb` functions from evalem/misc/datasets.py directly. Needed to produce the expected data schema: a Python dictionary with keys 'inputs' and 'references'.
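Roughly what that looks like with the `datasets` library for squad_v2; the exact field mapping used in evalem/misc/datasets.py may differ, so treat this as a sketch:
```
from datasets import load_dataset

# Only take 10 rows as a sample.
data = load_dataset("squad_v2", split="validation[:10]")

# Map into the schema evalem expects: a dict with 'inputs' and 'references' keys.
squad = {
    "inputs": [{"context": d["context"], "question": d["question"]} for d in data],
    "references": [d["answers"]["text"] for d in data],
}
```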
@rbavery I am looking at this by running some of the pipelines I have. Seems like we need to fix some imports. I will try to push the fixes. Thank you very much for the PR.
@rbavery I think the restructuring needs more refactoring, primarily in the base modules. Also, I think we could improve the base wrappers. And the imports aren't working because of the two levels of parent scoping in the relative imports.
Let's first fix the imports and then figure out how we're going to abstract the wrappers?
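For reference, the failing pattern is the relative import that climbs two parent levels; a sketch of one possible fix (the module and class names here are assumptions for illustration):
```
# Illustrative only. Inside a file like evalem/nlp/models/_base.py,
# a relative import that climbs two parent levels is easy to get wrong:
#
#     from ..._base.structures import PredictionDTO
#
# One straightforward fix is an absolute import from the top-level package:
from evalem._base.structures import PredictionDTO
```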
`evalem.nlp.models._base`. Also fixes import bugs for `evalem.nlp.models`.
Now `evalem._base.structures` has generic prediction instance types, while `evalem.nlp.structures` has NLP-specific instances. Also, `evalem._base.structures.ClassificationDTO` is added for any classification task.
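Purely illustrative: the actual fields of `evalem._base.structures.ClassificationDTO` are not shown in this thread, so this dataclass is an assumption about its rough shape:
```
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class ClassificationDTO:
    value: Any                     # predicted label (e.g., an int class id or a string)
    score: Optional[float] = None  # optional confidence for the prediction
```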
Now we can do:
```
from evalem import BaseMetrics
from evalem import Evaluator
# Assumed import paths for the metrics below; the thread does not show them:
from evalem.nlp.metrics import BertScore, ExactMatchMetricNLP

Evaluator(metrics=[
    BertScore(device="mps"),
    ExactMatchMetricNLP(),
    BaseMetrics.ConfusionMatrix()
])(
    predictions=[1, 2, 3],
    references=[2, 1, 3]
)
```
Now we have:
- evalem.nlp.metrics.NLPMetric
- evalem.cv.metrics.CVMetric
- evalem.nlp.evaluators.NLPEvaluator
- evalem.cv.evaluators.CVEvaluator
@NISH1001 I think the refactor is complete. Can you review this as is, and I'll start the addition of the Dice metric and HLS FM evaluation in another PR? I think most of the changes I made to get tests passing are in line with the refactor. The only thing I'm unsure about is the jury test for single ints. I updated that test, following the comment, to reflect that the function only works for strings right now, so I only tested for that.
@muthukumaranR @rbavery @weiji14 @xhagrg Are there any metrics we should be adding to the CV submodule?
@rbavery I think we can add new CV metrics by inheriting from `evalem.cv.metrics.CVMetric`.
I think we can add Dice for CV in a separate PR and get this merged. I'm going to be at a workshop this week but will have some time to work on this Friday.
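For the record, a rough sketch of how a Dice metric might slot in by inheriting from `CVMetric`; the base-class interface (a `compute`-style hook taking predictions and references) is an assumption here, not the actual evalem API:
```
import numpy as np
from evalem.cv.metrics import CVMetric  # import path taken from the list above

class DiceMetric(CVMetric):
    """Dice coefficient for binary segmentation masks (illustrative only)."""

    def compute(self, predictions, references, **kwargs):
        scores = []
        for pred, ref in zip(predictions, references):
            pred, ref = np.asarray(pred, dtype=bool), np.asarray(ref, dtype=bool)
            intersection = np.logical_and(pred, ref).sum()
            denom = pred.sum() + ref.sum()
            # Dice = 2|A ∩ B| / (|A| + |B|); define empty-vs-empty as a perfect match.
            scores.append(2.0 * intersection / denom if denom else 1.0)
        return {"dice": float(np.mean(scores))}
```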
@rbavery That sounds good. I will merge this then. Thanks for the help with this.
I am gonna squash-merge because of all the redundant commits.
Do you want us to set up code coverage on CI?
@weiji14 Would be nice to have I guess...
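For later reference, one common way to wire that up would be a few extra steps in the test job sketched earlier; pytest-cov and the Codecov action are assumed tool choices here, not decisions from this thread:
```
# Hypothetical additions to the CI test job:
      - run: pip install pytest-cov
      - run: pytest --cov=evalem --cov-report=xml
      - uses: codecov/codecov-action@v3
        with:
          files: coverage.xml
```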
@NISH1001 can you give this a look and let me know what you think of the new structure? I moved all the _base.py files from the subfolders (models, evaluators, etc.) to their own _base folder so that the nlp and cv submodules can share those base classes. Still need to refactor, get tests passing, and then make sure this picks up all changes from #26 once that is merged.
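For orientation, the layout implied by the module paths discussed in this thread would look roughly like this (only paths mentioned above are shown; everything else is omitted):
```
evalem/
├── _base/            # shared base classes for models, evaluators, metrics
│   └── structures.py # generic DTOs, e.g. ClassificationDTO
├── nlp/
│   ├── models/       # e.g. evalem.nlp.models._base
│   ├── metrics/      # e.g. NLPMetric
│   ├── evaluators/   # NLPEvaluator
│   └── structures.py # NLP-specific instances
├── cv/
│   ├── metrics/      # e.g. CVMetric
│   └── evaluators/   # CVEvaluator
└── misc/
    └── datasets.py   # get_squad_v2, get_imdb
```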
addresses #25 and #24