
Contributor

@catboxanon catboxanon commented Apr 11, 2025

Adds support for CublasOps, allowing for faster fp16 performance.

On my system with an RTX 3090, running the included SDXL Simple workflow, this extension library raises throughput from 4 it/s to 5 it/s (roughly a 25% speedup).

Guarded behind --fast, since this affects generation output in a manner similar to #6453.
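
For readers unfamiliar with the pattern, here is a minimal sketch of gating an optional accelerated backend behind an opt-in flag. It is not the code from this PR. The module name `cublas_ops`, the helper names, and the treatment of --fast as a plain boolean switch are all assumptions made for illustration.

```python
import argparse
import importlib.util


def cublas_ops_installed() -> bool:
    # Assumption: the extension is importable as a top-level module named
    # "cublas_ops". find_spec returns None when no such module is found,
    # and does not import the module itself.
    return importlib.util.find_spec("cublas_ops") is not None


def parse_fast(argv) -> bool:
    # Assumption: --fast is modelled as a simple on/off switch here.
    parser = argparse.ArgumentParser()
    parser.add_argument("--fast", action="store_true")
    args, _unknown = parser.parse_known_args(argv)
    return args.fast


def choose_linear_backend(fast: bool, installed: bool) -> str:
    # Use the accelerated path only when the user opted in AND the
    # optional library is present; otherwise keep the default path so
    # generation output is unchanged.
    if fast and installed:
        return "cublas_ops"
    return "torch"


if __name__ == "__main__":
    import sys

    backend = choose_linear_backend(parse_fast(sys.argv[1:]), cublas_ops_installed())
    print(f"linear backend: {backend}")
```

Requiring both conditions means users who never pass --fast get the default behavior even when the extension is installed, which matters when the faster path changes generation output.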

@comfyanonymous comfyanonymous merged commit 1714a4c into comfyanonymous:master Apr 12, 2025
5 checks passed
@catboxanon catboxanon deleted the cublas branch April 13, 2025 16:51