
Runpod problem with starting Pygmalion-7B #446

Open
skaba04 opened this issue Aug 28, 2023 · 5 comments

skaba04 commented Aug 28, 2023

I have been running the Pygmalion 13B model on RunPod fine, but whenever I try to load Pygmalion 7B or 6B (the model is located here: https://huggingface.co/TehVenom/Pygmalion-7b-4bit-GPTQ-Safetensors), this error message shows up:
2023-08-28T23:04:34.578170425+02:00 File "/opt/koboldai/modeling/inference_models/gptq_hf_torch/class.py", line 238, in _load
2023-08-28T23:04:34.578171637+02:00 self.model = self._get_model(self.get_local_model_path())
2023-08-28T23:04:34.578173024+02:00 │ │ │ │ │ └ <function HFInferenceModel.get_local_model_path at 0x7f543f713af0>
2023-08-28T23:04:34.578174162+02:00 │ │ │ │ └ <modeling.inference_models.gptq_hf_torch.class.model_backend object at 0x7f543f6b2130>
2023-08-28T23:04:34.578175309+02:00 │ │ │ └ <function model_backend._get_model at 0x7f543f70d670>
2023-08-28T23:04:34.578176464+02:00 │ │ └ <modeling.inference_models.gptq_hf_torch.class.model_backend object at 0x7f543f6b2130>
2023-08-28T23:04:34.578177571+02:00 │ └ None
2023-08-28T23:04:34.578178686+02:00 └ <modeling.inference_models.gptq_hf_torch.class.model_backend object at 0x7f543f6b2130>
2023-08-28T23:04:34.578179815+02:00
2023-08-28T23:04:34.578180982+02:00 File "/opt/koboldai/modeling/inference_models/gptq_hf_torch/class.py", line 392, in _get_model
2023-08-28T23:04:34.578182526+02:00 model = AutoGPTQForCausalLM.from_quantized(location, model_basename=Path(gptq_file).stem, use_safetensors=gptq_file.endswith(".safetensors"), device_map=device_map, inject_fused_attention=False)
2023-08-28T23:04:34.578184667+02:00 │ │ │ │ │ │ │ └ {'model.layers.0': 0, 'model.layers.1': 0, 'model.layers.2': 0, 'model.layers.3': 0, 'model.layers.4': 0, 'model.layers.5': 0...
2023-08-28T23:04:34.578185871+02:00 │ │ │ │ │ │ └ <method 'endswith' of 'str' objects>
2023-08-28T23:04:34.578187264+02:00 │ │ │ │ │ └ 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors/Pygmalion-7B-GPTQ-4bit.act-order.safetensors'
2023-08-28T23:04:34.578188433+02:00 │ │ │ │ └ 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors/Pygmalion-7B-GPTQ-4bit.act-order.safetensors'
2023-08-28T23:04:34.578189572+02:00 │ │ │ └ <class 'pathlib.Path'>
2023-08-28T23:04:34.578190775+02:00 │ │ └ 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors'
2023-08-28T23:04:34.578191927+02:00 │ └ <classmethod object at 0x7f53c847efd0>
2023-08-28T23:04:34.578194231+02:00 └ <class 'auto_gptq.modeling.auto.AutoGPTQForCausalLM'>
2023-08-28T23:04:34.578195447+02:00
2023-08-28T23:04:34.578196541+02:00 File "/opt/koboldai/runtime/envs/koboldai/lib/python3.8/site-packages/auto_gptq/modeling/auto.py", line 108, in from_quantized
2023-08-28T23:04:34.578197683+02:00 return quant_func(
2023-08-28T23:04:34.578198884+02:00 └ <bound method BaseGPTQForCausalLM.from_quantized of <class 'auto_gptq.modeling.llama.LlamaGPTQForCausalLM'>>
2023-08-28T23:04:34.578200011+02:00 File "/opt/koboldai/runtime/envs/koboldai/lib/python3.8/site-packages/auto_gptq/modeling/base.py", line 757, in from_quantized
2023-08-28T23:04:34.578201126+02:00 quantize_config = BaseQuantizeConfig.from_pretrained(model_name_or_path, **cached_file_kwargs, **kwargs)
2023-08-28T23:04:34.578202302+02:00 │ │ │ │ └ {}
2023-08-28T23:04:34.578203428+02:00 │ │ │ └ {'cache_dir': None, 'force_download': False, 'proxies': None, 'resume_download': False, 'local_files_only': False, 'use_auth ...
2023-08-28T23:04:34.578204560+02:00 │ │ └ 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors'
2023-08-28T23:04:34.578205729+02:00 │ └ <classmethod object at 0x7f53ed2c8be0>
2023-08-28T23:04:34.578206890+02:00 └ <class 'auto_gptq.modeling._base.BaseQuantizeConfig'>
2023-08-28T23:04:34.578207977+02:00 File "/opt/koboldai/runtime/envs/koboldai/lib/python3.8/site-packages/auto_gptq/modeling/_base.py", line 93, in from_pretrained
2023-08-28T23:04:34.578209155+02:00 with open(resolved_config_file, "r", encoding="utf-8") as f:
2023-08-28T23:04:34.578210551+02:00 └ 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors/quantize_config.json'
2023-08-28T23:04:34.578211717+02:00
2023-08-28T23:04:34.578212846+02:00 FileNotFoundError: [Errno 2] No such file or directory: 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors/quantize_config.json'
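
For context, the traceback shows auto_gptq's from_quantized() looking for quantize_config.json next to the weights, a file that older GPTQ-for-LLaMa conversions like this one do not include. Below is a minimal sketch (an illustration, not an official fix) of writing that file by hand; the settings are guesses inferred from the filename Pygmalion-7B-GPTQ-4bit.act-order.safetensors and must match how the checkpoint was actually quantized:

```python
import json
from pathlib import Path

# Path taken from the error message above (assumes the model is already downloaded there).
model_dir = Path("models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors")

# Assumed settings: "4bit" and ".act-order" in the weights filename suggest
# bits=4, desc_act=True, and no group size (-1). These are assumptions and
# must match the original quantization, or the model will load incorrectly.
quantize_config = {
    "bits": 4,
    "group_size": -1,
    "desc_act": True,
    "damp_percent": 0.01,
    "sym": True,
    "true_sequential": True,
}

# Write the quantize_config.json that AutoGPTQ's from_quantized() is searching for.
with open(model_dir / "quantize_config.json", "w", encoding="utf-8") as f:
    json.dump(quantize_config, f, indent=2)
```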

henk717 (Owner) commented Aug 28, 2023

This is a known issue with this model since it's super old.
If occam's GPTQ module gets updated to work on newer Huggingface versions we can continue to support it; otherwise I suggest you find a newer conversion of the model that is compatible with AutoGPTQ, or load the 16-bit version instead so KoboldAI quantizes it for you.

skaba04 (Author) commented Aug 28, 2023

OK, so if nothing works and I can't seem to find the AutoGPTQ version, the only thing left for me is to just wait.

skaba04 (Author) commented Aug 28, 2023

But it says that it is quantized with GPTQ-for-LLaMa. Is that any different?

henk717 (Owner) commented Aug 28, 2023

Yes, because it lacks the config file that our AutoGPTQ fallback needs. I can't legally tell you where to get reuploads, but uploads compatible with the current KoboldAI United versions do exist, so you don't have to wait.
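
A rough way to check a downloaded folder before trying to load it is sketched below. The helper is illustrative only, not KoboldAI's actual detection logic, and assumes an AutoGPTQ-style layout (quantize_config.json plus the quantized weights in the same directory):

```python
from pathlib import Path

def looks_autogptq_compatible(model_dir: str) -> bool:
    """Heuristic check: AutoGPTQ's from_quantized() expects quantize_config.json
    alongside the quantized weights (.safetensors or .bin)."""
    d = Path(model_dir)
    has_config = (d / "quantize_config.json").is_file()
    has_weights = any(d.glob("*.safetensors")) or any(d.glob("*.bin"))
    return has_config and has_weights

# The folder from the traceback fails this check because quantize_config.json is missing.
print(looks_autogptq_compatible("models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors"))
```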

skaba04 (Author) commented Aug 28, 2023

OK, so where are those uploads that are compatible with the current KoboldAI United versions?
