Releases · 3Simplex/llama.cpp
b3942
rpc : backend refactoring (#9912)
* rpc : refactor backend
  Use structs for RPC request/response messages
* rpc : refactor server
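The refactor above replaces ad-hoc byte packing with plain structs for each RPC request/response message. A minimal sketch of that idea, assuming trivially-copyable message structs; the type and field names here are illustrative, not llama.cpp's actual identifiers:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Illustrative request/response pair for an "allocate buffer" RPC call.
// Each message is a plain struct with a fixed wire layout.
struct rpc_msg_alloc_buffer_req {
    uint64_t size;          // requested buffer size in bytes
};

struct rpc_msg_alloc_buffer_rsp {
    uint64_t remote_ptr;    // handle to the buffer on the server
    uint64_t remote_size;   // size actually allocated
};

// Serialize a trivially-copyable struct into a byte vector for the wire.
template <typename T>
std::vector<uint8_t> to_bytes(const T & msg) {
    std::vector<uint8_t> buf(sizeof(T));
    std::memcpy(buf.data(), &msg, sizeof(T));
    return buf;
}

// Deserialize a struct back out of a received byte vector.
template <typename T>
T from_bytes(const std::vector<uint8_t> & buf) {
    T msg{};
    std::memcpy(&msg, buf.data(), sizeof(T));
    return msg;
}
```

Keeping one struct per message makes the client and server agree on the payload layout by construction, instead of each side packing fields by hand.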
b3895
Update building for Android (#9672)
* docs : clarify building Android on Termux
* docs : update building Android on Termux
* docs : add cross-compiling for Android
* cmake : link dl explicitly for Android
b3855
llama : print correct model type for Llama 3.2 1B and 3B
b3711
CUDA: fix variable name conflict for Windows build (#9382)
b3660
readme : refactor API section + remove old hot topics
b3640
docker : update CUDA images (#9213)
b3613
server : support reading arguments from environment variables (#9105)
* server : support reading arguments from environment variables
* add -fa and -dt
* readme : specify non-arg env var
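The change above lets server options fall back to environment variables when the corresponding flag is not passed. A hedged sketch of that precedence pattern; `arg_or_env` and the variable name used in the comment are illustrative, not necessarily the identifiers llama.cpp uses:

```cpp
#include <cstdlib>
#include <string>

// Resolve an option value with the usual precedence:
// explicit CLI flag > environment variable > built-in default.
// Example: a context-size option might consult LLAMA_ARG_CTX_SIZE
// (name assumed here for illustration).
static std::string arg_or_env(const char * cli_value,
                              const char * env_name,
                              const char * fallback) {
    if (cli_value != nullptr) {
        return cli_value;                        // an explicit flag always wins
    }
    if (const char * v = std::getenv(env_name)) {
        return v;                                // then the environment variable
    }
    return fallback;                             // finally the built-in default
}
```

This keeps flags authoritative while letting container or service deployments configure the server purely through the environment.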
b3576
docs: introduce gpustack and gguf-parser (#8873)
* readme : introduce gpustack
  GPUStack is an open-source GPU cluster manager for running large language models, which uses llama.cpp as the backend.
* readme : introduce gguf-parser
  GGUF Parser is a tool to review/check a GGUF file and estimate its memory usage without downloading the whole model.
Signed-off-by: thxCode <[email protected]>
b3569
flake.lock: Update (#8979)
b3549
scripts : sync cann files (#0)