Pull requests: ggml-org/llama.cpp
Pull requests list
llama-fit-params: lower ctx size for multi GPU
#18101
by JohannesGaessler
was merged Dec 16, 2025
gguf-py: allow converting multi-tensor models from read-only locations
python
python script changes
#18100
by ykhrustalev
was merged Dec 17, 2025
llama-fit-params: fix underflow for dense models
#18095
by JohannesGaessler
was merged Dec 16, 2025
ggml : use WARP_SIZE/2 for argmax reduction offset
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#18092
by Aadeshveer
was merged Dec 17, 2025
llama-fit-params: QoL impr. for prints/errors
examples
#18089
by JohannesGaessler
was merged Dec 16, 2025
model-conversion : remove -fa option in model card template [no ci]
examples
#18088
by danbev
was merged Dec 16, 2025
model-conversion : add note about verifying previous models
examples
#18082
by danbev
was merged Dec 16, 2025
model-conversion : use CONVERTED_EMBEDDING_MODEL for embedding_verify_logits
examples
#18079
by danbev
was merged Dec 16, 2025
llama: Include algorithm header needed for C++23
#18078
by cpeterso
was merged Dec 16, 2025
NVIDIA Nemotron 3 parsing
testing
Everything test related
#18077
by aldehir
was merged Dec 16, 2025
Update README.md incorrect argument
examples
server
#18073
by 2114L3
was merged Dec 16, 2025
ci : separate webui from server
devops
improvements to build systems and github actions
#18072
by CISC
was merged Dec 16, 2025
llama: fix early stop in params_fit if ctx is set
#18070
by JohannesGaessler
was merged Dec 16, 2025
convert : move rope_parameters to TextModel class
python
python script changes
#18061
by CISC
was merged Dec 15, 2025
llama : add support for NVIDIA Nemotron 3 Nano
model
Model specific
python
python script changes
#18058
by danbev
was merged Dec 16, 2025
server: Fix router proxying to child processes when --host is specified
examples
server
#18054
by wbtek
was closed Dec 17, 2025
arch: refactor LLM_TENSOR_NAMES
documentation
Improvements or additions to documentation
refactoring
Refactoring
#18051
by ngxson
was merged Dec 16, 2025
preset: handle negated arg, reverse the meaning if needed
#18041
by ngxson
was merged Dec 14, 2025
model: add KORMo model
model
Model specific
python
python script changes
#18032
by HelloKS
was merged Dec 15, 2025