
Conversation

cebtenzzre (Member) commented on Feb 22, 2024

These are Baichuan, Bert and Nomic Bert, CodeShell, GPT-2, InternLM, MiniCPM, Orion, Qwen, and StarCoder. The output of each of these has been checked subjectively for accuracy in Q4_0 format.

The only remaining models are unsupported because of missing ops: GGML_OP_ALIBI blocks BLOOM, MPT, and Refact, and GGML_OP_CONCAT blocks Persimmon. Upstream support for PLaMo appears to be broken (ggml-org/llama.cpp#5669).
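As a rough illustration of how a missing op gates GPU support, here is a minimal C++ sketch, not the actual gpt4all or llama.cpp code: the architecture/op pairs come straight from the paragraph above, while every identifier in the sketch is hypothetical.

```cpp
// Hypothetical sketch only -- not the gpt4all or llama.cpp implementation.
// The architectures and the ops they are blocked on come from the PR
// description; all identifiers here are invented for illustration.
#include <iostream>
#include <map>
#include <set>
#include <string>

// Ops the GPU backend does not implement yet (per the PR description).
static const std::set<std::string> missing_ops = {"GGML_OP_ALIBI", "GGML_OP_CONCAT"};

// The op each still-blocked architecture depends on.
static const std::map<std::string, std::string> arch_blocking_op = {
    {"bloom",     "GGML_OP_ALIBI"},
    {"mpt",       "GGML_OP_ALIBI"},
    {"refact",    "GGML_OP_ALIBI"},
    {"persimmon", "GGML_OP_CONCAT"},
};

// An architecture can be offloaded only if none of the ops it needs are
// missing from the backend.
static bool gpu_supported(const std::string &arch) {
    auto it = arch_blocking_op.find(arch);
    return it == arch_blocking_op.end() || !missing_ops.count(it->second);
}

int main() {
    for (const char *arch : {"gpt2", "qwen", "mpt", "persimmon"})
        std::cout << arch << ": " << (gpu_supported(arch) ? "GPU offload" : "CPU only") << '\n';
}
```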

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
@cebtenzzre cebtenzzre requested a review from manyoso February 22, 2024 19:18
@cebtenzzre cebtenzzre merged commit 88e330e into main Feb 22, 2024
@cebtenzzre cebtenzzre deleted the add-gpu-model-arches branch February 10, 2025 16:38