CV and NLP submodule refactor #27
Conversation
Setting up GitHub Actions Continuous Integration workflow to run unit tests. Initially testing on Ubuntu-22.04 and Python 3.8.
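For context, a minimal workflow along these lines would live under .github/workflows/; the file name, action versions, and install step here are assumptions, not taken from this PR:
```
# Hypothetical .github/workflows/test.yml sketch
name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.8"
      # Assumes the project defines a "test" extra; adjust to the actual setup.
      - run: pip install -e ".[test]"
      - run: pytest
```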
Needed to get Python 3.11 wheels.
Contains bugfix for Python 3.11 compatibility, see huggingface/datasets#5238.
Using the HuggingFace datasets library to load the squad_v2 and imdb datasets instead of loading from a data/*.json file. Only taking 10 rows as a sample.
Should have just used the `get_squad_v2` and `get_imdb` functions from evalem/misc/datasets.py directly. Needed to produce the expected data schema: a Python dictionary with keys 'inputs' and 'references'.
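Roughly what that looks like with the `datasets` library for squad_v2; the exact field mapping used in evalem/misc/datasets.py may differ, so treat this as a sketch:
```
from datasets import load_dataset

# Only take 10 rows as a sample.
data = load_dataset("squad_v2", split="validation[:10]")

# Map into the schema evalem expects: a dict with 'inputs' and 'references' keys.
squad = {
    "inputs": [{"context": d["context"], "question": d["question"]} for d in data],
    "references": [d["answers"]["text"] for d in data],
}
```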
@rbavery I am looking at this by running some of the pipelines I have. Seems like we need to fix some imports. I will try to push the fixes. Thank you very much for the PR.
@rbavery I think the restructuring needs more refactoring, primarily in the base modules. Also, I think we could improve the base wrappers. And the imports aren't working because of the two levels of parent scoping in the relative imports.
Let's first fix the imports and then figure out how we're going to abstract the wrappers?
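For reference, the failing pattern is the relative import that climbs two parent levels; a sketch of one possible fix (the module and class names here are assumptions for illustration):
```
# Illustrative only. Inside a file like evalem/nlp/models/_base.py,
# a relative import that climbs two parent levels is easy to get wrong:
#
#     from ..._base.structures import PredictionDTO
#
# One straightforward fix is an absolute import from the top-level package:
from evalem._base.structures import PredictionDTO
```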
`evalem.nlp.models._base`. Also fixes import bugs for `evalem.nlp.models`.
Now `evalem._base.structures` has generic prediction instance types, while `evalem.nlp.structures` has NLP-specific instances. Also, `evalem._base.structures.ClassificationDTO` is added for any classification task.
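Purely illustrative: the actual fields of `evalem._base.structures.ClassificationDTO` are not shown in this thread, so this dataclass is an assumption about its rough shape:
```
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class ClassificationDTO:
    value: Any                     # predicted label (e.g., an int class id or a string)
    score: Optional[float] = None  # optional confidence for the prediction
```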
Now we can do:
```
from evalem import BaseMetrics
from evalem import Evaluator
# Assumed import paths for the metrics below; the thread does not show them:
from evalem.nlp.metrics import BertScore, ExactMatchMetricNLP

Evaluator(metrics=[
    BertScore(device="mps"),
    ExactMatchMetricNLP(),
    BaseMetrics.ConfusionMatrix()
])(
    predictions=[1, 2, 3],
    references=[2, 1, 3]
)
```
Now we have:
- evalem.nlp.metrics.NLPMetric
- evalem.cv.metrics.CVMetric
- evalem.nlp.evaluators.NLPEvaluator
- evalem.cv.evaluators.CVEvaluator
@NISH1001 I think the refactor is complete. Can you review this as is, and I'll start the addition of the Dice metric and HLS FM evaluation in another PR? I think most of the changes I made to get tests passing are in line with the refactor. The only thing I'm unsure about is the jury test for single ints. I updated that test, following the comment, to reflect that the function only works for strings right now, so I only tested for that.
@muthukumaranR @rbavery @weiji14 @xhagrg Are there any metrics we should be adding to the CV submodule?
@rbavery I think we can add new CV metrics by inheriting from `evalem.cv.metrics.CVMetric`.
I think we can add Dice for CV in a separate PR and get this merged. I'm going to be at a workshop this week but will have some time to work on this Friday.
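For the record, a rough sketch of how a Dice metric might slot in by inheriting from `CVMetric`; the base-class interface (a `compute`-style hook taking predictions and references) is an assumption here, not the actual evalem API:
```
import numpy as np
from evalem.cv.metrics import CVMetric  # import path taken from the list above

class DiceMetric(CVMetric):
    """Dice coefficient for binary segmentation masks (illustrative only)."""

    def compute(self, predictions, references, **kwargs):
        scores = []
        for pred, ref in zip(predictions, references):
            pred, ref = np.asarray(pred, dtype=bool), np.asarray(ref, dtype=bool)
            intersection = np.logical_and(pred, ref).sum()
            denom = pred.sum() + ref.sum()
            # Dice = 2|A ∩ B| / (|A| + |B|); define empty-vs-empty as a perfect match.
            scores.append(2.0 * intersection / denom if denom else 1.0)
        return {"dice": float(np.mean(scores))}
```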
@rbavery That sounds good. I will merge this then. Thanks for the help with this.
I am gonna squash-merge because of all the redundant commits.
Do you want us to set up code coverage on CI?
@weiji14 Would be nice to have I guess...
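For later reference, one common way to wire that up would be a few extra steps in the test job sketched earlier; pytest-cov and the Codecov action are assumed tool choices here, not decisions from this thread:
```
# Hypothetical additions to the CI test job:
      - run: pip install pytest-cov
      - run: pytest --cov=evalem --cov-report=xml
      - uses: codecov/codecov-action@v3
        with:
          files: coverage.xml
```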
@NISH1001 can you give this a look and let me know what you think of the new structure? I moved all the _base.py files from the subfolders (models, evaluators, etc.) to their own _base folder so that the nlp and cv submodules can share those base classes. Still need to refactor, get tests passing, and then make sure this picks up all changes from #26 once that is merged.
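For orientation, the layout implied by the module paths discussed in this thread would look roughly like this (only paths mentioned above are shown; everything else is omitted):
```
evalem/
├── _base/            # shared base classes for models, evaluators, metrics
│   └── structures.py # generic DTOs, e.g. ClassificationDTO
├── nlp/
│   ├── models/       # e.g. evalem.nlp.models._base
│   ├── metrics/      # e.g. NLPMetric
│   ├── evaluators/   # NLPEvaluator
│   └── structures.py # NLP-specific instances
├── cv/
│   ├── metrics/      # e.g. CVMetric
│   └── evaluators/   # CVEvaluator
└── misc/
    └── datasets.py   # get_squad_v2, get_imdb
```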
addresses #25 and #24