Pull requests: intel/xFasterTransformer

Pull requests list

add bf16_int8 support for invokeLayerLLaMA API
#470 opened Jul 22, 2024 by miaojinc
[Layers] Increased the threshold for enabling flashAttn
#428 opened Jun 3, 2024 by abenmao · Labels: performance (performance related)
[Kernel] Add dynamic onednn matmul.
#425 opened May 28, 2024 by changqi1 · Labels: performance (performance related)
[Model] Achieve whole pipeline parallel.
#355 opened Apr 28, 2024 by changqi1 · Draft · Labels: enhancement (New feature or request), gpu (Related to GPU)
[Eval] Add eval test with opencompass.
#325 opened Apr 17, 2024 by marvin-Yu · Draft · Labels: benchmark (performance or accuracy benchmark), enhancement (New feature or request)
Update AWQ GPTQ quantization guide
#306 opened Apr 10, 2024 by miaojinc · Labels: documentation (Improvements or additions to documentation)