attempt_download_from_hub, using newly uploaded https://huggingface.co/Ultralytics/YOLOv5/tree/main #261
Summary
Updating attempt_download_from_hub to make use of the newly uploaded Ultralytics YOLOv5 models on Hugging Face.
Why Draft?
Hello,
I've been using this for a few months now, so thanks for creating it! I could not find any contribution guidelines, so I was wary of submitting any sort of PR, but I saw a few recent merges from external contributors, so I thought I'd give it a shot. I'm happy to rework this as necessary.
I also note that I have not accounted for how this will affect the current usage of attempt_download_from_hub, so I would have to consider how to make it backward compatible if that is desired. I know this does not pass the unit tests as is, so I expect to make some changes after discussing with you.
I need to clarify a few things before I can make any meaningful refactors, though. Is the current implementation, which passes file as repo_id, intentional? And if so, why? I would like to align this PR with your design goals.
In attempt_download:
result = attempt_download_from_hub(file, hf_token=hf_token)
it seems to pass a file, which according to your README should be a filename like yolov5.pt, despite the attempt_download_from_hub function definition specifying the first arg as repo_id rather than a file:
def attempt_download_from_hub(repo_id, hf_token=None, revision=None):
Furthermore, a few lines below, this file is used in list_repo_files as the repo_id:
repo_files = list_repo_files(repo_id=repo_id, repo_type='model', token=hf_token)
despite the HfApi calling for the actual repo_id, not a filename.
Ideally, for my own purposes I'd like to see:
model_file = [f for f in repo_files if f.endswith('.pt')][0]
replaced by something that can specify the actual file, rather than just the first in a list of many. Replacing it completely would likely cause further issues when refactoring existing usage, which is why I made some attempt to retain it in this PR.
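A rough sketch of the kind of selection this is asking for, keeping the current "first .pt file" behaviour as the fallback. The helper name pick_weights_file and its filename parameter are hypothetical and not part of the existing code:

```python
from typing import List, Optional


def pick_weights_file(repo_files: List[str], filename: Optional[str] = None) -> str:
    """Hypothetical helper: choose a .pt file from a repo file listing."""
    pt_files = [f for f in repo_files if f.endswith(".pt")]
    if not pt_files:
        raise FileNotFoundError("no .pt files found in repo listing")
    if filename is None:
        # Preserve the existing behaviour: fall back to the first .pt file.
        return pt_files[0]
    if filename in pt_files:
        # An explicit request wins over list order.
        return filename
    raise FileNotFoundError(f"{filename!r} not found; available: {pt_files}")
```

With a listing such as the output of list_repo_files, calling pick_weights_file(repo_files) keeps today's result, while pick_weights_file(repo_files, "yolov5s.pt") selects one model out of a multi-model repo.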
I see in your tests that you use
Setting a breakpoint above
result = attempt_download_from_hub(file, hf_token=hf_token)
here, it appears that for all TestConstants this returns None, so I'm guessing this is simply an error, but let me know if it serves some other purpose.
Looking at how you set up the unit tests, it seems you are trying to handle the input to yolov5.load() using either a repo_id or a weights file as a single str arg. This seems quite complicated to me. Why not just align the input args with the hf_hub_download that you are wrapping, so as to take in repo_id and filename separately? If not to align with hf, the ultralytics yolov5 README also seems to separate them in its example of loading a model:
torch.hub.load("ultralytics/yolov5", "yolov5s")
If you provided a similar interface, this would seem to work for both GitHub and hf in a way that supports the multiple-models-in-one-repo layout of the ultralytics huggingface repo. I know hf has a dogma of one repo, one model, but if you are already implementing the search/match for the GitHub releases, I'm not sure why you wouldn't use the existing hf_hub_download's ability to do the same, especially since you already have both file and repo_id in scope at this point in the code.
After I understand your design goals surrounding this, I can update this draft PR to include some ideas for refactoring in ways that should keep existing usages from breaking.
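For discussion, a minimal sketch of what taking repo_id and filename separately could look like. The function name download_weights_from_hub and the downloader parameter are hypothetical (the latter exists only so the sketch can be exercised without network access). It also assumes a huggingface_hub version whose hf_hub_download accepts the token keyword:

```python
from typing import Callable, Optional


def download_weights_from_hub(
    repo_id: str,
    filename: str,
    hf_token: Optional[str] = None,
    revision: Optional[str] = None,
    downloader: Optional[Callable[..., str]] = None,
) -> str:
    """Hypothetical wrapper: fetch one named file from one repo, return its local path."""
    if downloader is None:
        # Imported lazily so the dependency is only needed for real downloads.
        from huggingface_hub import hf_hub_download

        downloader = hf_hub_download
    # repo_id and filename are passed through unchanged, mirroring hf_hub_download.
    return downloader(
        repo_id=repo_id,
        filename=filename,
        revision=revision,
        token=hf_token,
    )
```

A call such as download_weights_from_hub("Ultralytics/YOLOv5", "yolov5s.pt") would then mirror the torch.hub.load("ultralytics/yolov5", "yolov5s") shape, and the existing single-argument entry point could delegate to it once it has worked out which repo and file were meant.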