My sample size is only anecdotal, but I've found that pget works as-is for that model when run on an A100 instance.
Download time with pget was between 21 and 24 seconds. I tried tweaking -c and found that values from -c 10 to -c 12 seemed to slightly improve on the speed obtained with the default.
Tests with gcloud yielded downloads between 16 and 24 seconds (with download speeds ranging from 1.1 to 1.7 GB/s).
It may also make sense to compare the file size against available RAM: if the whole file cannot be buffered in memory, write the chunks to scratch space and join the files together afterwards.
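To make the RAM-versus-scratch idea concrete, here is a minimal Python sketch. It is not how pget is implemented; the function names (`choose_buffer`, `plan_ranges`, `join_parts`) and the `headroom` factor are hypothetical, and the available-RAM figure is passed in by the caller rather than measured.

```python
import shutil


def choose_buffer(size, available_ram, headroom=0.5):
    """Return "memory" if the file fits in a fraction of free RAM, else "scratch".

    `headroom` is an assumed safety factor, not a value from the thread.
    """
    return "memory" if size <= available_ram * headroom else "scratch"


def plan_ranges(size, chunks):
    """Split [0, size) into up to `chunks` contiguous, inclusive byte ranges."""
    base, extra = divmod(size, chunks)
    ranges, start = [], 0
    for i in range(chunks):
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # more chunks requested than bytes available
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def join_parts(part_paths, dest_path):
    """Concatenate chunk files, in order, into one destination file."""
    with open(dest_path, "wb") as dest:
        for path in part_paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, dest)
```

Each range from `plan_ranges` could be fetched with an HTTP `Range: bytes=start-end` header and written to its own part file when `choose_buffer` returns "scratch"; `join_parts` then binds the parts together.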
@daanelson has shared https://storage.googleapis.com/replicate-weights/llama-13b-fp16.tensors, which is ~24 GB.
To compare, gcloud can download this in parallel at 1-2 GB/s.
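As a rough sanity check on the numbers in this thread, the sketch below converts the reported download times into average throughput. It assumes the file is exactly 24 GB (the thread only says ~24 GB) and uses decimal gigabytes; the helper name is hypothetical.

```python
FILE_SIZE_GB = 24  # assumption: "~24 GB" taken as exactly 24 decimal GB


def throughput_gb_per_s(size_gb, seconds):
    """Average throughput over the whole download, in GB/s."""
    return size_gb / seconds


# (slowest, fastest) average throughput from the reported timings
pget_range = (throughput_gb_per_s(FILE_SIZE_GB, 24),
              throughput_gb_per_s(FILE_SIZE_GB, 21))
gcloud_range = (throughput_gb_per_s(FILE_SIZE_GB, 24),
                throughput_gb_per_s(FILE_SIZE_GB, 16))

print("pget:   %.2f-%.2f GB/s" % pget_range)
print("gcloud: %.2f-%.2f GB/s" % gcloud_range)
```

These are whole-download averages, so they can sit below the peak rates a tool reports while it is running.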