Tags: ROCm/MIOpen

rocm-7.2.2

ROCm release vrocm-7.2.2

rocm-7.2.1

ROCm release vrocm-7.2.1

rocm-7.2.4

[gfx12][Solvers][Winograd] Winograd Base for gfx12 v40.6.0 (#3000) (#3846)

**Cherry-pick from develop branch**

## Motivation

Add Winograd Base 40.6.0 for gfx12 to improve the performance of
convolution operations.

Issues related:
* ROCm/rocm-libraries#2567
* ROCm/rocm-libraries#897
* SWDEV-549814

## Technical Details

Added kernels that implement the Winograd convolution operation:
* Conv_Winograd_v40_6_0_gfx12_fp16_dot2_f2x3_dilation2.inc
* Conv_Winograd_v40_6_0_gfx12_fp16_dot2_f2x3_stride1.inc
* Conv_Winograd_v40_6_0_gfx12_fp16_dot2_f2x3_stride2.inc
* Conv_Winograd_v40_6_0_gfx12_fp16_dot2_f3x2_dilation2.inc
* Conv_Winograd_v40_6_0_gfx12_fp16_dot2_f3x2_stride1.inc
* Conv_Winograd_v40_6_0_gfx12_fp16_dot2_f3x2_stride2.inc
* Conv_Winograd_v40_6_0_gfx12_fp32_f2x3_dilation2.inc
* Conv_Winograd_v40_6_0_gfx12_fp32_f2x3_stride1.inc
* Conv_Winograd_v40_6_0_gfx12_fp32_f2x3_stride2.inc
* Conv_Winograd_v40_6_0_gfx12_fp32_f3x2_dilation2.inc
* Conv_Winograd_v40_6_0_gfx12_fp32_f3x2_stride1.inc
* Conv_Winograd_v40_6_0_gfx12_fp32_f3x2_stride2.inc
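
The twelve file names above appear to follow a regular pattern: two precision variants, two tile variants, and three stride/dilation variants. The sketch below regenerates the list from those three axes. Splitting the names this way is an interpretation of the file list, not something the PR states, and `winograd_kernel_files` is a hypothetical helper that is not part of MIOpen.

```python
from itertools import product

# Name components read off the file list in this PR. Treating them as three
# independent axes is an assumption made for illustration only.
PRECISIONS = ["fp16_dot2", "fp32"]
TILES = ["f2x3", "f3x2"]
MODES = ["dilation2", "stride1", "stride2"]


def winograd_kernel_files(version="v40_6_0", arch="gfx12"):
    """Hypothetical helper: build the kernel include file names."""
    return [
        f"Conv_Winograd_{version}_{arch}_{precision}_{tile}_{mode}.inc"
        for precision, tile, mode in product(PRECISIONS, TILES, MODES)
    ]


kernel_files = winograd_kernel_files()
for name in kernel_files:
    print(name)
```

Because `product` varies the last axis fastest, the generated order matches the order in which the files are listed above.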

## Test Plan

- [x] Unit tests for the solver on gfx1201/gfx1200 and gfx1100
- [x] Performance tests based on resnet50 problems
- [x] e2e test with models ->
https://amd.atlassian.net/wiki/spaces/VPGFXAT/pages/1232601634/Support+of+Winograd+Convolution+kernels+for+gfx120x

## Test Result

For resnet50 problems:
`MIOpenDriver convfp16 -n 128 -c 3 -H 224 -W 224 -k 64 -y 7 -x 7 -p 3 -q 3 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -F 1 -t 1`
* gfx1201
  * GFLOPs improved from 6621 to 21745
  * Time reduced from 4.562853 to 1.389383
* gfx1200
  * GFLOPs improved from 4641 to 11753
  * Time reduced from 6.509030 to 2.58354

## Submission Checklist

- Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

rocm-7.2.3

[gfx12][Solvers][Winograd] Winograd Base for gfx12 v40.6.0 (#3000) (#3846)

rocm-7.2.0

ROCm release vrocm-7.2.0

rocm-7.1.1

ROCm release vrocm-7.1.1

rocm-7.1.0

ROCm release vrocm-7.1.0

rocm-7.0.2

ROCm release rocm-7.0.2

rocm-6.4.4

ROCm release vrocm-6.4.4

rocm-7.0.1

ROCm release vrocm-7.0.1