Releases · 3Simplex/llama.cpp

18 Oct 15:24

afd9909

b3942

rpc : backend refactoring (#9912)

* rpc : refactor backend

Use structs for RPC request/response messages

* rpc : refactor server
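
The idea of using structs for RPC request/response messages can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the struct names (`alloc_buffer_req`, `alloc_buffer_rsp`), their fields, and the helpers `to_bytes`, `from_bytes`, and `demo_round_trip` are all hypothetical. It assumes fixed-size, trivially copyable messages exchanged between hosts with the same endianness.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical message layouts (assumption: not the actual structs from the PR).
// Packing removes padding so both sides agree on the wire size.
#pragma pack(push, 1)
struct alloc_buffer_req {
    uint64_t size;         // requested buffer size in bytes
};
struct alloc_buffer_rsp {
    uint64_t remote_ptr;   // opaque handle chosen by the server
    uint64_t remote_size;  // size the server actually reserved
};
#pragma pack(pop)

static_assert(sizeof(alloc_buffer_req) == 8,  "unexpected padding");
static_assert(sizeof(alloc_buffer_rsp) == 16, "unexpected padding");

// Copy a fixed-size message into a byte buffer for sending.
template <typename T>
std::vector<uint8_t> to_bytes(const T & msg) {
    static_assert(std::is_trivially_copyable<T>::value, "message must be a plain struct");
    std::vector<uint8_t> out(sizeof(T));
    std::memcpy(out.data(), &msg, sizeof(T));
    return out;
}

// Decode a received buffer; reject it if the size does not match the struct.
template <typename T>
bool from_bytes(const std::vector<uint8_t> & in, T & msg) {
    static_assert(std::is_trivially_copyable<T>::value, "message must be a plain struct");
    if (in.size() != sizeof(T)) {
        return false;
    }
    std::memcpy(&msg, in.data(), sizeof(T));
    return true;
}

// Simulated round trip: the client encodes a request, the server decodes it and replies.
inline alloc_buffer_rsp demo_round_trip(uint64_t size) {
    std::vector<uint8_t> wire = to_bytes(alloc_buffer_req{size});
    alloc_buffer_req req{};
    bool ok = from_bytes(wire, req);
    assert(ok);
    (void) ok;
    return alloc_buffer_rsp{0x1000, req.size};  // 0x1000 is a placeholder handle
}
```

Compared with hand-written offset arithmetic on raw byte vectors, a struct per message lets the size check happen in one place and keeps field names visible at the call site.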

07 Oct 18:13

github-actions

b3895

f1af42f

Update building for Android (#9672)

* docs : clarify building Android on Termux

* docs : update building Android on Termux

* docs : add cross-compiling for Android

* cmake : link dl explicitly for Android

01 Oct 14:03

github-actions

b3855

a90484c

llama : print correct model type for Llama 3.2 1B and 3B

09 Sep 13:47

github-actions

b3711

8e6e2fb

CUDA: fix variable name conflict for Windows build (#9382)

03 Sep 17:50

github-actions

b3660

b69a480

readme : refactor API section + remove old hot topics

28 Aug 16:33

github-actions

b3640

66b039a

docker : update CUDA images (#9213)

21 Aug 17:13

github-actions

b3613

fc54ef0

server : support reading arguments from environment variables (#9105)

* server : support reading arguments from environment variables

* add -fa and -dt

* readme : specify non-arg env var
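
Reading an argument from the environment when it is not given on the command line can be sketched as below. This is an illustrative helper, not the server's actual implementation: the function name `resolve_option`, the variable name `EXAMPLE_ARG_THREADS` in the usage note, and the precedence order (command line first, then environment, then default) are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Resolve one option value.
// Assumed precedence in this sketch: CLI value, then environment variable, then default.
// cli_value is nullptr when the option was not passed on the command line.
inline std::string resolve_option(const char * cli_value,
                                  const char * env_name,
                                  const std::string & fallback) {
    assert(env_name != nullptr);
    if (cli_value != nullptr) {
        return cli_value;
    }
    if (const char * env = std::getenv(env_name)) {
        return env;
    }
    return fallback;
}
```

With a hypothetical variable `EXAMPLE_ARG_THREADS=8` exported, `resolve_option(nullptr, "EXAMPLE_ARG_THREADS", "4")` yields the environment value, while passing an explicit command-line value takes priority over it.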

12 Aug 14:24

github-actions

b3576

84eb2f4

docs: introduce gpustack and gguf-parser (#8873)

* readme: introduce gpustack

GPUStack is an open-source GPU cluster manager for running large
language models, which uses llama.cpp as the backend.

* readme: introduce gguf-parser

GGUF Parser is a tool for inspecting a GGUF file and estimating its
memory usage without downloading the whole model.

Signed-off-by: thxCode <[email protected]>
