
Conversation

cebtenzzre (Member) commented on Feb 22, 2024

These are Baichuan, Bert and Nomic Bert, CodeShell, GPT-2, InternLM, MiniCPM, Orion, Qwen, and StarCoder. The output of each of these has been checked subjectively for accuracy in Q4_0 format.

The only remaining models are unsupported because of missing ops: GGML_OP_ALIBI blocks BLOOM, MPT, and Refact, and GGML_OP_CONCAT blocks Persimmon. Upstream support for PLaMo appears to be broken (ggml-org/llama.cpp#5669).
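As a rough illustration of how a missing op gates GPU support, here is a minimal C++ sketch, not the actual gpt4all or llama.cpp code: the architecture/op pairs come straight from the paragraph above, while every identifier in the sketch is hypothetical.

```cpp
// Hypothetical sketch only -- not the gpt4all or llama.cpp implementation.
// The architectures and the ops they are blocked on come from the PR
// description; all identifiers here are invented for illustration.
#include <iostream>
#include <map>
#include <set>
#include <string>

// Ops the GPU backend does not implement yet (per the PR description).
static const std::set<std::string> missing_ops = {"GGML_OP_ALIBI", "GGML_OP_CONCAT"};

// The op each still-blocked architecture depends on.
static const std::map<std::string, std::string> arch_blocking_op = {
    {"bloom",     "GGML_OP_ALIBI"},
    {"mpt",       "GGML_OP_ALIBI"},
    {"refact",    "GGML_OP_ALIBI"},
    {"persimmon", "GGML_OP_CONCAT"},
};

// An architecture can be offloaded only if none of the ops it needs are
// missing from the backend.
static bool gpu_supported(const std::string &arch) {
    auto it = arch_blocking_op.find(arch);
    return it == arch_blocking_op.end() || !missing_ops.count(it->second);
}

int main() {
    for (const char *arch : {"gpt2", "qwen", "mpt", "persimmon"})
        std::cout << arch << ": " << (gpu_supported(arch) ? "GPU offload" : "CPU only") << '\n';
}
```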

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
@cebtenzzre cebtenzzre requested a review from manyoso February 22, 2024 19:18
@cebtenzzre cebtenzzre merged commit 88e330e into main Feb 22, 2024
@cebtenzzre cebtenzzre deleted the add-gpu-model-arches branch February 10, 2025 16:38