Releases · 3Simplex/llama.cpp
b4125
Skip searching root path for cross-compile builds (#10383)
b4100
server: (web UI) Add samplers sequence customization (#10255)

* Samplers sequence: simplified and input field.
* Removed unused function
* Modify and use `settings-modal-short-input`
* rename "name" --> "label"

Co-authored-by: Xuan Son Nguyen <[email protected]>
b4067
vulkan: Throttle the number of shader compiles during the build step.…
b4061
metal : reorder write loop in mul mat kernel + style (#10231)

* metal : reorder write loop
* metal : int -> short, style

ggml-ci
b4042
DRY: Fixes clone functionality (#10192)
b4007
server : fix smart selection of available slot (#10120)

* Fix smart selection of available slot
* minor fix
* replace vectors of tokens with shorthands
b3987
llama : Add IBM granite template (#10013)

* Add granite template to llama.cpp
* Add granite template to test-chat-template.cpp
* Update src/llama.cpp
* Update tests/test-chat-template.cpp
* Added proper template and expected output
* Small change to \n
* Add code space
* Fix spacing
* Apply suggestions from code review
* Update src/llama.cpp

Co-authored-by: Xuan Son Nguyen <[email protected]>
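For context on how a named template like this is consumed, here is a minimal sketch using `llama_chat_apply_template` as declared in llama.h around this release line (the model pointer was still the first parameter; newer releases dropped it). The `render_granite` helper and the 1024-byte starting buffer are illustrative choices, not code from the PR.

```cpp
#include <string>
#include <vector>

#include "llama.h"

// Render a short conversation with the built-in "granite" template.
// Passing nullptr as the model uses the template named in `tmpl`.
std::string render_granite(const std::vector<llama_chat_message> & msgs) {
    std::string buf(1024, '\0');
    int32_t n = llama_chat_apply_template(
        nullptr, "granite", msgs.data(), msgs.size(),
        /*add_ass=*/true, buf.data(), (int32_t) buf.size());
    if (n < 0) {
        return ""; // template not found or rendering failed
    }
    if ((size_t) n > buf.size()) {
        // buffer too small: the return value is the required size, so retry
        buf.resize(n);
        n = llama_chat_apply_template(nullptr, "granite", msgs.data(), msgs.size(),
                                      true, buf.data(), (int32_t) buf.size());
    }
    buf.resize(n);
    return buf;
}
```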
b3959
lora : warn user if new token is added in the adapter (#9948)
b3949
rpc : pack only RPC structs (#9959)
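The gist of packing only the RPC structs, sketched below with a hypothetical message type (the real ggml-rpc wire schema differs): scoping the pragma with push/pop gives the wire structs a fixed, padding-free layout without leaking 1-byte packing into unrelated declarations later in the file.

```cpp
#include <cstdint>

// Wire-format messages need a layout both peers agree on byte-for-byte,
// so padding is disabled just for this block.
#pragma pack(push, 1)
struct rpc_msg_hello { // illustrative, not the actual RPC schema
    uint8_t  version;
    uint64_t free_mem;
    uint64_t total_mem;
};
#pragma pack(pop)

// Types declared after the pop get normal alignment again.
struct local_state {
    uint8_t  flag;
    uint64_t counter; // aligned/padded as usual
};

static_assert(sizeof(rpc_msg_hello) == 1 + 8 + 8, "packed wire layout");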
b3943
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745)

* refactor llama_batch_get_one
* adapt all examples
* fix simple.cpp
* fix llama_bench
* fix
* fix context shifting
* free batch before return
* use common_batch_add, reuse llama_batch in loop
* null terminated seq_id list
* fix save-load-state example
* fix perplexity
* correct token pos in llama_batch_allocr
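The calling pattern from the "use common_batch_add, reuse llama_batch in loop" item looks roughly like this. With all_pos_0/all_pos_1/all_seq_id gone, callers fill positions and sequence ids explicitly via the `common_batch_clear`/`common_batch_add` helpers from common.h; `decode_prompt` and its arguments are illustrative, not code from the PR.

```cpp
#include <vector>

#include "common.h"
#include "llama.h"

void decode_prompt(llama_context * ctx, const std::vector<llama_token> & tokens) {
    llama_batch batch = llama_batch_init((int32_t) tokens.size(), 0, 1);

    common_batch_clear(batch);
    for (size_t i = 0; i < tokens.size(); ++i) {
        // token id, explicit position, sequence ids, logits only for the last token
        common_batch_add(batch, tokens[i], (llama_pos) i, { 0 }, i == tokens.size() - 1);
    }

    if (llama_decode(ctx, batch) != 0) {
        // handle decode failure
    }

    llama_batch_free(batch); // free the batch before returning, as the PR notes
}
```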