
[Feature Request]: Question about GPU memory and LLM x MapReduce #208

Closed
wciq1208 opened this issue Sep 6, 2024 · 5 comments
Labels
feature New features

Comments

@wciq1208

wciq1208 commented Sep 6, 2024

Feature request

I deployed the model with vLLM using the following command:

vllm serve /hestia/model/MiniCPM3-4B --trust-remote-code --max-model-len 12288 --num-gpu-blocks-override 768 --port 8001 --max-num-seqs 32 --served-model-name minicpm --swap-space 0

A context length of 12288 alone consumes 22 GB of GPU memory. The README mentions that LLM x MapReduce can process unlimited context with low GPU memory usage. How do I enable it?
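Much of the context-dependent memory goes to the KV cache, which vLLM preallocates in fixed-size blocks. The sketch below estimates KV-cache size per token; the layer/head/dim numbers are generic placeholders, not MiniCPM3-4B's actual architecture:

```python
# Rough KV-cache size estimate for a transformer served in fp16.
# All architecture numbers are hypothetical placeholders, NOT MiniCPM3's config.
def kv_cache_bytes(num_tokens, num_layers=32, num_kv_heads=8,
                   head_dim=128, dtype_bytes=2):
    # Two tensors (K and V) cached per layer, per token.
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes * num_tokens

tokens = 12288  # the --max-model-len from the command above
print(f"{kv_cache_bytes(tokens) / 1024**3:.2f} GiB")  # → 1.50 GiB
```

Note also that vLLM preallocates a fraction of total GPU memory up front (controlled by `gpu_memory_utilization`, default 0.9), so the observed usage reflects preallocation for serving throughput, not just weights plus active KV cache.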

@wciq1208 wciq1208 added the feature New features label Sep 6, 2024
@wciq1208 wciq1208 changed the title from [Feature Request]: Question about GPU memory to [Feature Request]: Question about GPU memory and LLM x MapReduce Sep 6, 2024
@ahkimkoo

ahkimkoo commented Sep 6, 2024

Same question here; this was a bit beyond my expectations. Seeing 4B, I naturally assumed about 8 GB of GPU memory, not 22 GB.

@LDLINGLINGLING
Collaborator

Hi, the long context mentioned here still consumes GPU memory. That is, the unquantized 4B model occupies around 8 GB, but as the context grows it incurs additional memory usage. You cannot run unlimited context within 8 GB of memory.
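Given that context growth drives the extra usage, a few vLLM options can typically shrink the footprint of the command from the original question. Flag availability depends on the vLLM version installed, so this is a sketch to adapt, not a verified recipe:

```shell
# Settings that typically lower the vLLM serving footprint
# (check `vllm serve --help` for your installed version):
#   --max-model-len           cap the context to what you actually need
#   --kv-cache-dtype fp8      store the KV cache in 8 bits instead of 16
#   --gpu-memory-utilization  preallocate a smaller share of the GPU
vllm serve /hestia/model/MiniCPM3-4B --trust-remote-code \
  --max-model-len 4096 \
  --kv-cache-dtype fp8 \
  --gpu-memory-utilization 0.6 \
  --port 8001 --served-model-name minicpm
```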

@shuo-git
Collaborator

shuo-git commented Sep 9, 2024

Hi, the current code does not yet include the MapReduce functionality. The MiniCPM3 x MapReduce code will be open-sourced within a week.

@sycamore792

> Hi, the current code does not yet include the MapReduce functionality. The MiniCPM3 x MapReduce code will be open-sourced within a week.

Hi, is there any progress on this?

@shuo-git
Collaborator

shuo-git commented Sep 28, 2024

Hi, please refer to the open-source repository: https://github.com/thunlp/LLMxMapReduce
The detailed technical report will be published soon.

@zh-zheng zh-zheng closed this as completed Oct 8, 2024