Releases · 3Simplex/llama.cpp
b4125
Skip searching root path for cross-compile builds (#10383)
b4100
server: (web UI) Add samplers sequence customization (#10255)

* Samplers sequence: simplified and input field.
* Removed unused function
* Modify and use `settings-modal-short-input`
* rename "name" --> "label"

Co-authored-by: Xuan Son Nguyen <[email protected]>
b4067
vulkan: Throttle the number of shader compiles during the build step.…
b4061
metal : reorder write loop in mul mat kernel + style (#10231)

* metal : reorder write loop
* metal : int -> short, style

ggml-ci
b4042
DRY: Fixes clone functionality (#10192)
b4007
server : fix smart selection of available slot (#10120)

* Fix smart selection of available slot
* minor fix
* replace vectors of tokens with shorthands
b3987
llama : Add IBM granite template (#10013)

* Add granite template to llama.cpp
* Add granite template to test-chat-template.cpp
* Update src/llama.cpp
* Update tests/test-chat-template.cpp
* Added proper template and expected output
* Small change to \n
* Add code space
* Fix spacing
* Apply suggestions from code review
* Update src/llama.cpp

Co-authored-by: Xuan Son Nguyen <[email protected]>
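For context on how a named template like this is consumed, here is a minimal sketch using `llama_chat_apply_template` as declared in llama.h around this release line (the model pointer was still the first parameter; newer releases dropped it). The `render_granite` helper and the 1024-byte starting buffer are illustrative choices, not code from the PR.

```cpp
#include <string>
#include <vector>

#include "llama.h"

// Render a short conversation with the built-in "granite" template.
// Passing nullptr as the model uses the template named in `tmpl`.
std::string render_granite(const std::vector<llama_chat_message> & msgs) {
    std::string buf(1024, '\0');
    int32_t n = llama_chat_apply_template(
        nullptr, "granite", msgs.data(), msgs.size(),
        /*add_ass=*/true, buf.data(), (int32_t) buf.size());
    if (n < 0) {
        return ""; // template not found or rendering failed
    }
    if ((size_t) n > buf.size()) {
        // buffer too small: the return value is the required size, so retry
        buf.resize(n);
        n = llama_chat_apply_template(nullptr, "granite", msgs.data(), msgs.size(),
                                      true, buf.data(), (int32_t) buf.size());
    }
    buf.resize(n);
    return buf;
}
```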
b3959
lora : warn user if new token is added in the adapter (#9948)
b3949
rpc : pack only RPC structs (#9959)
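The gist of packing only the RPC structs, sketched below with a hypothetical message type (the real ggml-rpc wire schema differs): scoping the pragma with push/pop gives the wire structs a fixed, padding-free layout without leaking 1-byte packing into unrelated declarations later in the file.

```cpp
#include <cstdint>

// Wire-format messages need a layout both peers agree on byte-for-byte,
// so padding is disabled just for this block.
#pragma pack(push, 1)
struct rpc_msg_hello { // illustrative, not the actual RPC schema
    uint8_t  version;
    uint64_t free_mem;
    uint64_t total_mem;
};
#pragma pack(pop)

// Types declared after the pop get normal alignment again.
struct local_state {
    uint8_t  flag;
    uint64_t counter; // aligned/padded as usual
};

static_assert(sizeof(rpc_msg_hello) == 1 + 8 + 8, "packed wire layout");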
b3943
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745)

* refactor llama_batch_get_one
* adapt all examples
* fix simple.cpp
* fix llama_bench
* fix
* fix context shifting
* free batch before return
* use common_batch_add, reuse llama_batch in loop
* null terminated seq_id list
* fix save-load-state example
* fix perplexity
* correct token pos in llama_batch_allocr
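The calling pattern from the "use common_batch_add, reuse llama_batch in loop" item looks roughly like this. With all_pos_0/all_pos_1/all_seq_id gone, callers fill positions and sequence ids explicitly via the `common_batch_clear`/`common_batch_add` helpers from common.h; `decode_prompt` and its arguments are illustrative, not code from the PR.

```cpp
#include <vector>

#include "common.h"
#include "llama.h"

void decode_prompt(llama_context * ctx, const std::vector<llama_token> & tokens) {
    llama_batch batch = llama_batch_init((int32_t) tokens.size(), 0, 1);

    common_batch_clear(batch);
    for (size_t i = 0; i < tokens.size(); ++i) {
        // token id, explicit position, sequence ids, logits only for the last token
        common_batch_add(batch, tokens[i], (llama_pos) i, { 0 }, i == tokens.size() - 1);
    }

    if (llama_decode(ctx, batch) != 0) {
        // handle decode failure
    }

    llama_batch_free(batch); // free the batch before returning, as the PR notes
}
```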