
[Feature Request]: Question about GPU memory and LLM x MapReduce #208

Closed
wciq1208 opened this issue Sep 6, 2024 · 5 comments
Labels
feature New features

Comments

@wciq1208

wciq1208 commented Sep 6, 2024

Feature request

I deployed the model with vLLM using the following command:

vllm serve /hestia/model/MiniCPM3-4B --trust-remote-code --max-model-len 12288 --num-gpu-blocks-override 768 --port 8001 --max-num-seqs 32 --served-model-name minicpm --swap-space 0

A context length of 12288 alone consumes 22 GB of GPU memory. The README mentions that LLM x MapReduce can process unlimited context with low GPU memory usage. How do I enable it?
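Much of the context-dependent memory goes to the KV cache, which vLLM preallocates in fixed-size blocks. The sketch below estimates KV-cache size per token; the layer/head/dim numbers are generic placeholders, not MiniCPM3-4B's actual architecture:

```python
# Rough KV-cache size estimate for a transformer served in fp16.
# All architecture numbers are hypothetical placeholders, NOT MiniCPM3's config.
def kv_cache_bytes(num_tokens, num_layers=32, num_kv_heads=8,
                   head_dim=128, dtype_bytes=2):
    # Two tensors (K and V) cached per layer, per token.
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes * num_tokens

tokens = 12288  # the --max-model-len from the command above
print(f"{kv_cache_bytes(tokens) / 1024**3:.2f} GiB")  # → 1.50 GiB
```

Note also that vLLM preallocates a fraction of total GPU memory up front (controlled by `gpu_memory_utilization`, default 0.9), so the observed usage reflects preallocation for serving throughput, not just weights plus active KV cache.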

@wciq1208 wciq1208 added the feature New features label Sep 6, 2024
@wciq1208 wciq1208 changed the title from [Feature Request]: Question about GPU memory to [Feature Request]: Question about GPU memory and LLM x MapReduce Sep 6, 2024
@ahkimkoo

ahkimkoo commented Sep 6, 2024

Same question here; this was a bit beyond my expectations. Seeing 4B, I naturally assumed about 8 GB of GPU memory, not 22 GB.

@LDLINGLINGLING
Collaborator

Hi, the long context mentioned here still consumes GPU memory. That is, the unquantized 4B model occupies around 8 GB, but as the context grows it incurs additional memory usage. You cannot run unlimited context within 8 GB of memory.
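Given that context growth drives the extra usage, a few vLLM options can typically shrink the footprint of the command from the original question. Flag availability depends on the vLLM version installed, so this is a sketch to adapt, not a verified recipe:

```shell
# Settings that typically lower the vLLM serving footprint
# (check `vllm serve --help` for your installed version):
#   --max-model-len           cap the context to what you actually need
#   --kv-cache-dtype fp8      store the KV cache in 8 bits instead of 16
#   --gpu-memory-utilization  preallocate a smaller share of the GPU
vllm serve /hestia/model/MiniCPM3-4B --trust-remote-code \
  --max-model-len 4096 \
  --kv-cache-dtype fp8 \
  --gpu-memory-utilization 0.6 \
  --port 8001 --served-model-name minicpm
```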

@shuo-git
Collaborator

shuo-git commented Sep 9, 2024

Hi, the current code does not yet include the MapReduce functionality. The MiniCPM3 x MapReduce code will be open-sourced within a week.

@sycamore792

> Hi, the current code does not yet include the MapReduce functionality. The MiniCPM3 x MapReduce code will be open-sourced within a week.

Hi, is there any progress on this?

@shuo-git
Collaborator

shuo-git commented Sep 28, 2024

Hi, please refer to the open-source repository: https://github.com/thunlp/LLMxMapReduce
The detailed technical report will be published soon.

@zh-zheng zh-zheng closed this as completed Oct 8, 2024