feat: Introduce K/V Context Quantisation (vRAM improvements) #4422

Triggered via pull request September 10, 2024 02:03

sammcj

synchronize #6279

sammcj:feature/kv-quant

Status Success

Total duration 35m 46s

Billable time 1h 29m

Artifacts 8

test.yaml

on: pull_request

Matrix: lint

Matrix: test

generate-windows-rocm

generate-windows-cuda

Matrix: generate-cuda

Matrix: generate-rocm

Matrix: generate

Annotations

3 warnings

test (windows-2019, amd64)

No files were found with the provided path: ollama. No artifacts will be uploaded.

generate-cuda (11.8.0)

The following actions use a deprecated Node.js version and will be forced to run on node20: actions/setup-go@v4. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/

generate-rocm (6.1.2)

The following actions use a deprecated Node.js version and will be forced to run on node20: actions/setup-go@v4. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
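The two Node.js warnings above both point at actions/setup-go@v4, which was built for node16. One way to clear them is to move the step to actions/setup-go@v5, which targets node20. The fragment below is only a sketch: the contents of test.yaml are not shown on this page, so the step layout and the `go-version-file` value are assumptions, not the repository's actual configuration.

```yaml
# Hypothetical excerpt of a job in test.yaml; the real steps may differ.
steps:
  - uses: actions/checkout@v4
  # v5 of setup-go runs on node20, so the deprecation warning no longer applies.
  - uses: actions/setup-go@v5
    with:
      # Assumed: the Go version is read from the module file.
      go-version-file: go.mod
```

The same change would need to be applied in every job that currently references actions/setup-go@v4, including the generate-cuda and generate-rocm matrices named in the warnings.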

Artifacts

Produced during runtime

Name	Size
cuda-11.8.0-libraries	205 MB
macos-latest-amd64-libraries	9.7 MB
macos-latest-arm64-libraries	4.09 MB
macos-latest-binaries	17.5 MB
rocm-6.1.2-libraries	124 MB
ubuntu-latest-amd64-libraries	7.5 MB
ubuntu-latest-binaries	16.2 MB
windows-2019-amd64-libraries	4.46 MB