Tags: xyc/llama.cpp

b2489

cuda : disable host register by default (ggml-org#6206)

b2487

tests : disable system() calls (ggml-org#6198)

ggml-ci

b2481

Add ability to use Q5_0, Q5_1, and IQ4_NL for quantized K cache (ggml-org#6183)

* k_cache: be able to use Q5_0

* k_cache: be able to use Q5_1 on CUDA

* k_cache: be able to use Q5_0 on Metal

* k_cache: be able to use Q5_1 on Metal

* k_cache: be able to use IQ4_NL - just CUDA for now

* k_cache: be able to use IQ4_NL on Metal

* k_cache: add the newly supported types to llama-bench and the CUDA supports_op check

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
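
A usage note on the change above: the new K-cache types are selected the same way as the previously supported quantized cache types, through the context parameters in llama.h, or via the -ctk cache-type flag that llama-bench exposes (per the last bullet above), e.g. -ctk q5_0. Below is a minimal sketch, assuming the llama.h API of this era (llama_context_default_params, the type_k/type_v fields, and the GGML_TYPE_* enums); the model path is a placeholder and error handling is trimmed.

    // C++ sketch: request a Q5_0-quantized K cache when creating a context.
    // Illustrative only; assumes llama.h around release b2481.
    #include "llama.h"
    #include <cstdio>

    int main() {
        llama_backend_init();

        llama_model_params mparams = llama_model_default_params();
        llama_model * model = llama_load_model_from_file("model.gguf", mparams); // placeholder path
        if (model == NULL) {
            fprintf(stderr, "failed to load model\n");
            return 1;
        }

        llama_context_params cparams = llama_context_default_params();
        cparams.type_k = GGML_TYPE_Q5_0;      // one of the newly supported K-cache types
        // cparams.type_k = GGML_TYPE_IQ4_NL; // also possible after this change (CUDA/Metal)

        llama_context * ctx = llama_new_context_with_model(model, cparams);
        if (ctx == NULL) {
            fprintf(stderr, "failed to create context\n");
            return 1;
        }

        // ... evaluate tokens as usual ...

        llama_free(ctx);
        llama_free_model(model);
        llama_backend_free();
        return 0;
    }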

b2480

Add nvidia and amd backends (ggml-org#6157)

b2479

cuda : fix conflict with std::swap (ggml-org#6186)

b2478

cuda : print the returned error when CUDA initialization fails (ggml-org#6185)
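
In practice this change amounts to surfacing the CUDA runtime's error string when device initialization fails, rather than reporting only a numeric status. A minimal illustrative sketch of that pattern (not the actual ggml-cuda code):

    // C++ sketch: print a human-readable reason when CUDA initialization fails.
    // Illustrative only; the real change lives in the ggml CUDA backend.
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int n_devices = 0;
        cudaError_t err = cudaGetDeviceCount(&n_devices);
        if (err != cudaSuccess) {
            fprintf(stderr, "CUDA initialization failed: %s\n", cudaGetErrorString(err));
            return 1;
        }
        printf("found %d CUDA device(s)\n", n_devices);
        return 0;
    }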

b2476

llava : add MobileVLM_V2 backup (ggml-org#6175)

* Add MobileVLM_V2 backup

* Update MobileVLM-README.md

* Update examples/llava/MobileVLM-README.md

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update examples/llava/convert-image-encoder-to-gguf.py

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* clip : fix whitespace

* fix definition mistake in clip.cpp

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

b2475

cuda : refactor to remove global resources (ggml-org#6170)

b2474

Server: version bump for httplib and json (ggml-org#6169)

* server: version bump for httplib and json

* fix build

* bring back content_length

b2471

Revert "llava : add a MobileVLM_V2-1.7B backup (ggml-org#6152)"

This reverts commit f8c4e74.