Tags: ROCm/triton

POST_IFU_0911

remove get_git_version_suffix() (#866)

(cherry picked from commit 03ec239)
(cherry picked from commit 89e8370)

PRE_IFU_0911

code cleanup

aotriton-0.11-dev2

make torch optional

ifu-231117-2

Merge pull request #410 from ROCmSoftwarePlatform/ifu-231117

Ifu 231117

ifu-231117-prev

add bitcode for gfx941 and gfx942 (#403)

Co-authored-by: Aleksandr Efimov <130555951+alefimov-amd@users.noreply.github.com>

ifu-231108

Merge pull request #395 from ROCmSoftwarePlatform/ifu-231108

Ifu 231108

ifu-231108-prev

[Tutorial] Fix post IFU issues with FA (#398)

* [Tutorial] Fix post IFU issues with FA

* Remove redundant kernels in 06-fused-attention.py

* Added README for scripts in perf-kernels dir

* Fix bwd kernel

---------

Co-authored-by: Lixun Zhang <lixun.zhang@amd.com>

ifu-231005

Merge pull request #382 from ROCmSoftwarePlatform/ifu231005-rebase

Ifu231005

ifu-231005-prev

Add OptimizeEpilogue pass. (#346)

* optimize_epilogue

* Add config

* Remove licenses

* Comment out Hopper specific parameters when printing out configs

* Add benchmark parameters from flash-attention repo

* Add Z and H in the key of autotuner

---------

Co-authored-by: Lixun Zhang <lixun.zhang@amd.com>

third-party-merge-prev

use different int8 mfma instructions on different GPUs. (#368)

* changes support to choose different int8 instructions

* rename an instruction name

Co-authored-by: Aleksandr Efimov <efimov.alexander@gmail.com>