Pull requests: vllm-project/vllm
Pull requests list
[Bugfix] Fix M-RoPE position calculation when chunked prefill is enabled
#10388
opened Nov 16, 2024 by
imkero
[CI/Build] Fix IDC hpu [Device not found] issue
ci/build
#10384
opened Nov 16, 2024 by
xuechendi
[2/N][torch.compile] make compilation cfg part of vllm cfg
#10383
opened Nov 15, 2024 by
youkaichao
[v1] V1EngineArgs for better config handling
ci/build
#10382
opened Nov 15, 2024 by
rickyyx
[Bugfix] Ignore ray reinit error when current platform is ROCm or XPU
#10375
opened Nov 15, 2024 by
HollowMan6
[V1] Refactor model executable interface for all text-only language models
#10374
opened Nov 15, 2024 by
ywang96
[Doc] Add the start of an arch overview page
documentation
Improvements or additions to documentation
#10368
opened Nov 15, 2024 by
russellb
[Misc] Medusa supports custom bias
ready
ONLY add when PR is ready to merge/full CI is needed
#10361
opened Nov 15, 2024 by
skylee-01
[Platform][Refactor] Extract func get_default_attn_backend to Platform
ready
ONLY add when PR is ready to merge/full CI is needed
#10358
opened Nov 15, 2024 by
MengqingCao
[Hardware][CPU] Support chunked-prefill and prefix-caching on CPU
ci/build
documentation
Improvements or additions to documentation
x86 CPU
#10355
opened Nov 15, 2024 by
bigPYJ1151
[Core] Interface for accessing model from engine
#10353
opened Nov 15, 2024 by
DarkLight1337
[DRAFT] Cutlass 2:4
ci/build
needs-rebase
#10346
opened Nov 15, 2024 by
robertgshaw2-neuralmagic
Draft
Rahul quant merged
ci/build
needs-rebase
#10341
opened Nov 14, 2024 by
robertgshaw2-neuralmagic
Draft
[Kernel] Add CUTLASS sparse support, heuristics, and torch operators
ci/build
#10340
opened Nov 14, 2024 by
Faraz9877
[TPU] Implement prefix caching for TPUs
ci/build
tpu
Related to Google TPUs
#10307
opened Nov 13, 2024 by
WoosukKwon
Draft
[Model] Add Support for Multimodal Granite Models
#10291
opened Nov 13, 2024 by
alex-jw-brooks
[Core][Frontend] Add faster-outlines as guided decoding backend
ci/build
#10277
opened Nov 13, 2024 by
unaidedelf8777
[torch.compile] PostGradPassManager, Inductor code caching fix, fix_functionalization pass refactor + tests
ready
ONLY add when PR is ready to merge/full CI is needed
#10273
opened Nov 12, 2024 by
ProExpertProg