Pull requests: vllm-project/vllm

[Doc] Add the start of an arch overview page (labels: documentation)
#10368 opened Nov 15, 2024 by russellb

[Misc] Medusa supports custom bias (labels: ready)
#10361 opened Nov 15, 2024 by skylee-01

[Platform][Refactor] Extract func get_default_attn_backend to Platform (labels: ready)
#10358 opened Nov 15, 2024 by MengqingCao

[Hardware][CPU] Support chunked-prefill and prefix-caching on CPU (labels: ci/build, documentation, x86 CPU)
#10355 opened Nov 15, 2024 by bigPYJ1151

Add KV-Cache int8 quant support
#10354 opened Nov 15, 2024 by YanyunDuanIEI

[Core] Interface for accessing model from engine
#10353 opened Nov 15, 2024 by DarkLight1337

DistServe Prototype (draft)
#10321 opened Nov 14, 2024 by Jocn2020

[Model] Support telechat2
#10311 opened Nov 14, 2024 by shunxing12345

[TPU] Implement prefix caching for TPUs (labels: ci/build, tpu; draft)
#10307 opened Nov 13, 2024 by WoosukKwon

[torch.compile] PostGradPassManager, Inductor code caching fix, fix_functionalization pass refactor + tests (labels: ready)
#10273 opened Nov 12, 2024 by ProExpertProg