-
Notifications
You must be signed in to change notification settings - Fork 217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added GODEL support with T5 model type #376
base: main
Are you sure you want to change the base?
Conversation
Also, we should upload the Rust |
- Skip truncation of prompt for encoder-decoder models - Add right padding logic for encoder-decoder models
Thank you @Emulator000 for identifying the issue with T5 models and submitting this PR. I have opened a small PR to address the issue without commenting out the input prompt filtering that is needed for decoder models. The suggested changes should work for both encoder-decoders (such as T5) and decoders (such as DialoGPT) - please have a look and merge if this is fine - this should update this PR at the same time. It would be good to have the rust-version of the GODEL models pushed to the Hugging Face model hub before this gets merged since they are registered in the library. Can you please submit a PR to add the rust weights to the upstream repositories? When this is completed it would also be great to have a test with the smallest GODEL model, please consider including one as part of the T5 test suite. Thanks! |
Minor GODEL fixes
Definitely good and it works, I just tested it!
I've already converted it from Microsoft's original weights but how can I push into their upstream repository? Where can I open the PR?
Sure, I'll add tests for this right now 😄 |
@guillaume-be tests added! The only weird thing that I see here is the output from GODEL (you can see in the tests), for some strange reason, every generated texts starts with a blank space |
@guillaume-be should we have to ping Microsoft someway in order to get PRs merged? |
@Emulator000 yes it seems notifications sometimes don't get through in the model hub. What has worked well in the past is leaving an issue on the model repo (in this case https://github.com/microsoft/GODEL/issues) and hope to be able to reach the Hugging Face repository organization owners. |
Just opened an issue here 💪🏻 |
This PR adds GODEL support as mentioned in #324 issue.
Notes
I just tested with a local download copy of the GODEL base model and it works except if I change this line:
rust-bert/src/pipelines/conversation.rs
Line 918 in 9fd7983
Unfortunately it panics with:
If instead for testing purposes I put something like:
It works greatly!
Could someone understand why we have this different behavior between GPT2 and T5 models? Is like we don't have the input sequence in the generated sequence so we don't have to remove it with padding.