Issues: abetlen/llama-cpp-python
#1818 · AttributeError: function 'llama_sampler_init_tail_free' not found after compiling llama.cpp with hipBLAS · opened Oct 30, 2024 by Micromanner
#1809 · Setting seed to -1 (random) or using the default LLAMA_DEFAULT_SEED generates a deterministic reply chain · opened Oct 24, 2024 by m-from-space (see the first sketch after this list)
#1805 · Assistant message with tool_calls and without content raises an error · opened Oct 21, 2024 by feloy (4 tasks done; see the second sketch after this list)
#1803 · Low-level examples broken after "feat: Update sampling API for llama.cpp (#1742)" · opened Oct 20, 2024 by mite51
#1801 · Llama.from_pretrained should work with HF_HUB_OFFLINE=1 · opened Oct 16, 2024 by davidgilbertson (see the third sketch after this list)
#1787 · server: chat completions returns wrong logprobs model · opened Oct 6, 2024 by domdomegg
#1784 · Tool parser cannot parse the tool-call string from Qwen2.5 · opened Oct 5, 2024 by hpx502766238 (4 tasks done)
#1781 · Why is this not working for the current release? Unable to use the GPU · opened Oct 2, 2024 by AnirudhJM24
#1773 · Setting temperature to 100000000000000000 does not affect output · opened Oct 1, 2024 by ivanstepanovftw
#1769 · Error when passing model to deepcopy in llama_cpp_python>=0.3.0 · opened Sep 28, 2024 by sergey21000 (see the fourth sketch after this list)
#1767 · Inference speed is extremely slow for a 72B model with long contexts · opened Sep 27, 2024 by wrench1997
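
The entries above are issue titles only; the sketches below are minimal reproduction attempts written against the llama-cpp-python Python API, not code taken from the issues themselves.

For #1809, a sketch assuming a local GGUF file at a placeholder path: with seed=-1 each run is supposed to be seeded randomly, so two generations from the same prompt would normally differ.

```python
# Sketch for #1809 (hypothetical model path): seed=-1 should re-seed the
# sampler randomly, so these two completions would normally differ.
from llama_cpp import Llama

llm = Llama(model_path="./models/model.gguf", seed=-1, verbose=False)

out1 = llm("Q: Name a colour. A:", max_tokens=16, temperature=0.8)
out2 = llm("Q: Name a colour. A:", max_tokens=16, temperature=0.8)

# The issue reports that the replies come out identical run after run.
print(out1["choices"][0]["text"])
print(out2["choices"][0]["text"])
```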
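
For #1805, a sketch of the kind of conversation that triggers the report, using the OpenAI-style message layout that create_chat_completion accepts; the model path, tool name, arguments, and call id are placeholders.

```python
# Sketch for #1805: an assistant message carrying tool_calls but no
# "content" key (hypothetical model path, tool name, and call id).
from llama_cpp import Llama

llm = Llama(model_path="./models/model.gguf", verbose=False)

messages = [
    {"role": "user", "content": "What is the weather in Paris?"},
    {
        "role": "assistant",
        # No "content" field here -- per the issue, this raises an error.
        "tool_calls": [
            {
                "id": "call_1",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": '{"city": "Paris"}',
                },
            }
        ],
    },
    {"role": "tool", "tool_call_id": "call_1", "content": "18 C, sunny"},
]

response = llm.create_chat_completion(messages=messages)
print(response["choices"][0]["message"])
```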
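
For #1801, Llama.from_pretrained downloads through huggingface_hub, which honours the standard HF_HUB_OFFLINE environment variable; a sketch with a placeholder repo id and filename, assuming the file is already in the local cache.

```python
# Sketch for #1801: forcing huggingface_hub offline before loading a model
# assumed to be cached already (repo id and filename are placeholders).
import os

os.environ["HF_HUB_OFFLINE"] = "1"  # set before anything touches huggingface_hub

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="someuser/some-model-GGUF",
    filename="some-model.Q4_K_M.gguf",
    verbose=False,
)
# The issue argues this should load from the cache rather than fail.
```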
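
For #1769, the report is that deep-copying a Llama instance started raising an error in llama_cpp_python 0.3.0 and later; a minimal sketch with a placeholder model path.

```python
# Sketch for #1769: deep-copying a Llama instance (hypothetical model path).
# Reported to raise an error on llama_cpp_python >= 0.3.0.
import copy

from llama_cpp import Llama

llm = Llama(model_path="./models/model.gguf", verbose=False)
llm_copy = copy.deepcopy(llm)  # reported to fail on >= 0.3.0
```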