Tags: intel/sycl-tla

v0.9.1

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Update version (#826)

v0.9

Fix sub-byte pointer arithmetic and zero buffer allocation in grouped gemm (#790)

For sub-byte types (uint4_t/int4_t), sizeof(T)=1 but packed storage uses
sizeof_bits/8=0.5 bytes per element. Plain pointer arithmetic
(base+offset) over-advances, causing out-of-bounds access for group>0.
Add packed_ptr() helper to compute correct byte offsets.

Also fix zero buffer under-allocation when scale_k <
zero_elements_packed_along_k by using max(zero_elements_packed_along_k,
scale_k).
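
The two fixes above can be illustrated with a minimal sketch. It is not the library's implementation: `int4_tag`, `bits_of`, `packed_byte_offset`, `packed_ptr_sketch`, and `zero_buffer_extent_k` are hypothetical names chosen for illustration, and only the name `packed_ptr()` and the `max(...)` expression come from the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 4-bit element type: sizeof() reports 1,
// but packed storage holds two elements per byte.
struct int4_tag { std::uint8_t storage; };

// Assumed trait giving the packed width in bits (sizeof_bits in the commit text).
template <class T> struct bits_of { static constexpr std::size_t value = sizeof(T) * 8; };
template <> struct bits_of<int4_tag> { static constexpr std::size_t value = 4; };

// Byte offset of element `elem_offset` in packed storage.
template <class T>
std::size_t packed_byte_offset(std::size_t elem_offset) {
  std::size_t bits = elem_offset * bits_of<T>::value;
  assert(bits % 8 == 0 && "offset must land on a byte boundary");
  return bits / 8;
}

// Sketch of a packed_ptr-style helper: advance by bytes, not by sizeof(T).
template <class T>
T* packed_ptr_sketch(T* base, std::size_t elem_offset) {
  auto* bytes = reinterpret_cast<unsigned char*>(base);
  return reinterpret_cast<T*>(bytes + packed_byte_offset<T>(elem_offset));
}

// Extent along k used to size the zero buffer, so it is never smaller than scale_k.
inline std::size_t zero_buffer_extent_k(std::size_t zero_elements_packed_along_k,
                                        std::size_t scale_k) {
  return std::max(zero_elements_packed_along_k, scale_k);
}
```

With these definitions, `packed_ptr_sketch(buf, 8)` on an `int4_tag` buffer advances 4 bytes, whereas `buf + 8` advances 8 bytes, which is the over-advance the commit describes.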

---------

Co-authored-by: Jacky, Deng <jacky.deng@intel.com>

v0.8

Fix for python EVT BMG tests (#762)

## Description
Fixes the following issues:
- Detection of B60 and B70 hosts as BMG
- Compile command causing test failures: the compile command now uses
spirv64_gen and sets devices correctly for BMG
```
Error Message:
icpx: error: cannot deduce implicit triple value for '-Xspirv-translator', specify triple using '-Xspirv-translator=<triple>'
icpx: error: cannot deduce implicit triple value for '-Xspirv-translator', specify triple using '-Xspirv-translator=<triple>' 
```

## Type
- [x] Bug  - [ ] Feature  - [ ] Performance  - [ ] Refactor

## Testing
- [ ] Tests pass  - [ ] Xe12  - [ ] Xe20

### Testing on G31

(G21 test covered by CI)

```
python3 test/python/cutlass/evt/run_xe_evt_tests.py -j all
======================================================================
Test Report Summary
======================================================================
Suite: all
Total tests run: 65
Passed: 59
Failed: 0
Errors: 0
Skipped: 6
======================================================================
Test suite 'all' passed!
```

## Checklist
- [ ] Copyright  - [ ] Co-pilot Review  - [ ] Deprecated APIs not used

---------

Co-authored-by: Vance, Antony <antony.vance@intel.com>

v0.7

Python version changes. (#714)

## Description
Update for version 0.7.

Co-authored-by: Jacky, Deng <jacky.deng@intel.com>

v0.6

v0.6.0 update (#606)

Changes for:
1. Python package version
2. Documentation changes
3. Changelog

---------
Co-authored-by: Vance, Antony <antony.vance@intel.com>

v0.5

Move compat to cute/util (#538)

v3.9-0.3

June release changelog (#451)

This PR updates the changelog to reflect the changes and new features
included in the June release.

---------

Co-authored-by: Mehdi Goli <mehdi.goli@codeplay.com>

v3.9-0.2

Cutlass 3.9.2 SYCL backend Version 0.2 (#402)

Update the SYCL CUTLASS changelog to include the changes for
Cutlass 3.9.2 SYCL backend Version 0.2.

v3.9-0.1

fixing the versioning (#347)

The release is based on CUTLASS
[3.9.0](https://github.com/NVIDIA/cutlass/releases/tag/v3.9.0)
(2025-03-20).

---------

Co-authored-by: Ruyman <ruyman@codeplay.com>