Replies: 1 comment, 1 reply
-
Partial offloading is tracked in #1562. This being a feature request, it doesn't belong in "Discussions" anyway. I am close to disabling this tab altogether because it seems like nobody understands what it's for.
-
Most CPUs these days come with an iGPU.
The question is whether it can load your model, partially or entirely, onto the iGPU regardless of whether the GPU has enough VRAM. For an iGPU, system RAM is the VRAM, so this would give you GPU acceleration on a laptop.
Just provide an option to load a selected number of layers onto the GPU, like llama.cpp does with its command-line flag.
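For reference, llama.cpp exposes this via the `-ngl` / `--n-gpu-layers` flag. A rough sketch of the kind of invocation meant here (the binary name, model path, and layer count are illustrative, not from this thread):

```shell
# Offload 20 transformer layers to the GPU; remaining layers stay on the CPU,
# so VRAM usage is bounded even for models larger than the GPU's memory.
./llama-cli -m ./models/model.Q4_K_M.gguf -ngl 20 -p "Hello"
```

Raising or lowering the layer count trades GPU memory use against how much of the forward pass runs accelerated.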