Releases · yuiseki/llama.cpp

21 Sep 23:26

c4510dc

b6532 Latest

opencl: initial `q8_0` mv support (#15732)

Assets 15

cudart-llama-bin-win-cuda-12.4-x64.zip
sha256:8c79a9b226de4b3cacfd1f83d24f962d0773be79f1e7b75c6af4ded7e32ae1d6
373 MB 2025-09-21T23:26:07Z

llama-b6532-bin-macos-arm64.zip
sha256:f93aa627fa10975eb4378cde0624fda50cff291effc5b4749d50b33ac0c3608b
10.2 MB 2025-09-21T23:26:17Z

llama-b6532-bin-macos-x64.zip
sha256:6f43c4110e7a3e45e973f64b04a250c5226058603f4e034e4d0dcb1129ec5a5d
28.4 MB 2025-09-21T23:26:17Z

llama-b6532-bin-ubuntu-vulkan-x64.zip
sha256:cd2de81bedccf6a3bb50205ff9c52b65e2932d404de1cf747ce7e27727a1edc0
25.4 MB 2025-09-21T23:26:19Z

llama-b6532-bin-ubuntu-x64.zip
sha256:e6e3f0f7a64b72257139c4949ef655b6caa4103e7f4f77c16baface6a37a2f29
12.3 MB 2025-09-21T23:26:20Z

llama-b6532-bin-win-cpu-arm64.zip
sha256:1667d69470edaa83a958438e3ea11618e6e9517f00f520c4e729653bb7d2f652
10.4 MB 2025-09-21T23:26:20Z

llama-b6532-bin-win-cpu-x64.zip
sha256:f578858175a5c3eb0c9086a0193bc423819e79e1487439f3b6d77dd9b5235afb
13.4 MB 2025-09-21T23:26:21Z

llama-b6532-bin-win-cuda-12.4-x64.zip
sha256:ba3316405ed5cd4940b72515431de7ab04c1a944b9d8e62c83e85812a623c152
146 MB 2025-09-21T23:26:22Z

llama-b6532-bin-win-hip-radeon-x64.zip
sha256:460a0bee117a37b4434ec59bd6f7e9907f5279a3540853d4babfc2710a2a0f9a
318 MB 2025-09-21T23:26:26Z

llama-b6532-bin-win-opencl-adreno-arm64.zip
sha256:6ef5a631d9cfb8bf5f88e4ed7abbd326f34894e27c1354bbd13c941a49a245d5
10.8 MB 2025-09-21T23:26:34Z

Source code (zip)
2025-09-21T21:48:44Z

Source code (tar.gz)
2025-09-21T21:48:44Z
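
Each binary asset above is listed with a `sha256:` digest. A downloaded archive can be checked against that value before unpacking. The following is a minimal sketch using only the Python standard library; the function names `sha256_of_file` and `matches_published_digest` are illustrative and not part of llama.cpp or GitHub tooling, and the local file path is whatever you saved the download as.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large archives are not loaded whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published):
    """Compare a file's digest with a value such as 'sha256:<hex>'.

    The optional 'sha256:' prefix is stripped; comparison is
    case-insensitive on the hex string.
    """
    expected = published.split(":", 1)[-1].strip().lower()
    return sha256_of_file(path) == expected
```

For example, `matches_published_digest("llama-b6532-bin-ubuntu-x64.zip", "sha256:e6e3f0f7a64b72257139c4949ef655b6caa4103e7f4f77c16baface6a37a2f29")` should return `True` for an intact download of that asset, assuming the file was saved under that name in the current directory.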

02 Sep 22:59

github-actions

b6360

3de0082

fix: resolve unsigned int initialization warning for n_dims/size in g…

Assets 15

29 Jul 23:19

github-actions

b6027

aa79524

HIP: remove the use of __HIP_PLATFORM_AMD__, explicitly support only …

Assets 15

25 Jul 04:09

github-actions

b5985

3f4fc97

musa: upgrade musa sdk to rc4.2.0 (#14498)

* musa: apply mublas API changes

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: update musa version to 4.2.0

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: restore MUSA graph settings in CMakeLists.txt

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: disable mudnnMemcpyAsync by default

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: switch back to non-mudnn images

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* minor changes

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: restore rc in docker image tag

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

---------

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

Assets 15

25 Jun 14:23

github-actions

b5753

73e53dc

opencl: ref count `ggml_backend_opencl_context` and refactor profilin…

Assets 15

23 Jun 12:31

github-actions

b5743

defe215

CUDA: mul_mat_v support for batch sizes > 1 (#14262)

* CUDA: mul_mat_v support for batch sizes > 1

* use 64 bit math for initial offset calculation

Assets 15

23 Jun 11:42

github-actions

b5742

7b50d58

kv-cells : fix tracking of seq_pos (#14339)

* kv-cells : fix tracking of seq_pos during cache reuse

ggml-ci

* cont : improve error message

ggml-ci

* cont : add more comments

Assets 15

22 Jun 12:08

github-actions

b5734

40bfa04

common : use std::string_view now that we target c++17 (#14319)

Assets 15

18 Jun 13:26

github-actions

b5697

ef03580

ggml: Add Apple support for GGML_CPU_ALL_VARIANTS (#14258)

Assets 15

17 Jun 00:25

github-actions

b5686

e434e69

common : suggest --jinja when autodetection fails (#14222)

Assets 15
