Tags: ROCm/FlyDSL

v0.2.0.dev645

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
[Docs] Onboarding notebooks (2/n): layout algebra (#665)

* [Docs] Onboarding notebooks (2/n): layout algebra

Picks up after #635 (1/n, expr foundation) with the layout half of the
series:

- 04_layout: layout = (shape, stride) as a coord->index function; crd2idx
  (row- vs column-major), logical_divide tiling, and the two tensor kinds
  (memref vs coord, the latter via the identity layout).
- 05_tiled_copy_and_swizzle: thread-value layouts, make_tiled_copy +
  partitioning into per-thread views, the LDS bank swizzle, and an inline
  print_typst -> SVG showcase (typst optional; falls back to source).
- 01_numeric_types: split DSL programming from the MLIR mapping — the
  type->MLIR detail moves into an optional section that explains and
  demonstrates it (.ir_type), instead of an unexplained column. Fixes the
  unsigned-int row: MLIR integers are signless (Uint32 and Int32 are both
  i32; signedness is carried by the op).
- README: 04/05 in the index + a layout cheat-sheet; note the typst dep.

Run-verified on MI350X (gfx950); notebooks are committed with outputs cleared.
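
The "layout = (shape, stride) as a coord->index function" idea from the 04 notebook can be sketched in plain Python. This is a minimal illustration only: the `crd2idx` below is a hypothetical stand-in written for this note, not the FlyDSL function, and the 4x8 shape is an assumed example.

```python
# Illustrative model of a layout as (shape, stride); not the FlyDSL API.

def crd2idx(coord, stride):
    """Map a coordinate tuple to a linear index: sum of coord_i * stride_i."""
    return sum(c * s for c, s in zip(coord, stride))

shape = (4, 8)       # assumed example: 4 rows, 8 columns
row_major = (8, 1)   # moving one column over advances the index by 1
col_major = (1, 4)   # moving one row down advances the index by 1

coord = (2, 3)       # the same coordinate in both cases

print(crd2idx(coord, row_major))  # 2*8 + 3*1 = 19
print(crd2idx(coord, col_major))  # 2*1 + 3*4 = 14
```

The coordinate is identical in both calls; only the stride changes which element of the underlying buffer it selects.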

* review fixes (#665): JIT-cache robustness, pedagogy, arch-labeling

Self-review + a warm-cache re-run surfaced one real bug and several
clarity/labeling fixes:

- JIT-cache trap (the important one): the layout cells print at *trace*
  time and print_typst writes its .typ files at trace time, so a warm
  JIT disk cache skipped the re-trace and the output vanished on a re-run
  (04 silently blanked; 05 errored on the missing .typ). 04/05 and 01 §6
  now set FLYDSL_RUNTIME_ENABLE_CACHE=0 before importing flydsl so re-runs
  always re-trace. Verified on a warm cache.
- 04: sharpen the coord-tensor explanation (crd2idx through an identity
  layout returns the coordinate tuple -> that is why it has no element
  type); note the coordinate is fixed and the stride decides the index.
- 05: decode the thread/value layouts explicitly; reframe coord_swizzle
  as a deliberate internal (not "missing"); disclose the device path is
  AMD CDNA (rocdl.*) while the layout algebra stays portable; fix a
  file-handle leak in render_typst.
- 01 §6: lead with the concrete use case (reading IR dumps / type errors).
- README: label 05 as the AMD CDNA path; add the trace-time/cache gotcha.

Re-verified on MI350X (gfx950); outputs cleared.
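
The JIT-cache trap above can be reproduced with a toy model. The sketch below is an assumption-laden illustration, not FlyDSL internals: `toy_jit`, `TOY_ENABLE_CACHE`, and `kernel` are hypothetical names, and only the `FLYDSL_RUNTIME_ENABLE_CACHE=0` setting comes from the commit message.

```python
import os

# From the commit message: set this before importing flydsl so re-runs re-trace.
os.environ["FLYDSL_RUNTIME_ENABLE_CACHE"] = "0"
# import flydsl  # in the notebooks, the import would follow here

# Toy model (hypothetical) of why trace-time output vanishes on a warm cache.
_cache = {}
trace_log = []

def toy_jit(fn):
    def run(*args):
        use_cache = os.environ.get("TOY_ENABLE_CACHE", "1") != "0"
        if use_cache and fn.__name__ in _cache:
            return _cache[fn.__name__](*args)  # warm cache: no re-trace
        compiled = fn()                        # "trace": side effects run here
        _cache[fn.__name__] = compiled
        return compiled(*args)
    return run

@toy_jit
def kernel():
    trace_log.append("layout printed")  # stands in for a trace-time print
    return lambda x: x + 1

kernel(1)
kernel(1)                             # served from cache, nothing is logged
print(len(trace_log))                 # 1

os.environ["TOY_ENABLE_CACHE"] = "0"  # disable the toy cache
kernel(1)                             # re-traces, so the side effect reappears
print(len(trace_log))                 # 2
```

The second call returns the right value but skips the trace, which mirrors how notebook 04 silently blanked on a re-run.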

* docs: clarity pass on the layout notebooks (beginner walkthrough)

Simulated a junior engineer fresh from 00-03 running 04/05/01 and tightened
the spots that tripped a first-time reader. Wording/markdown only — all
cells still run clean on gfx950:

- 04: gloss "memref" and "identity layout"; de-jargon "cosize"; explain the
  logical_divide result (8 tiles x 8) and how to read the coord-tensor repr
  (base coord + the 1E0/1E1 basis stride).
- 05: fix the thread-layout orientation ((4,1) is 4 threads down a column,
  not across a row); read the TV-layout stride; note the opaque per-thread
  view repr is just a per-thread slice of the same buffer; gloss the swizzle
  (mask, base, shift) knobs; say what the diagram colours mean.
- 01 §6: define "signless" plainly (no sign bit in the type; the op carries
  it); drop layout jargon from the vector<4xf32> closing.
- README: nbconvert runs against your active env (flydsl must import there);
  list torch + the build step; note 00-03 come first.

Re-verified on MI350X (gfx950); outputs cleared.
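
The "signless" point from 01 §6 can be illustrated without any DSL: one 32-bit pattern has no sign of its own, and the operation applied to it picks the interpretation. This sketch uses only Python's standard `struct` module; the variable names are illustrative.

```python
import struct

bits = 0xFFFFFFFF  # one 32-bit pattern; an i32 stores only these bits

raw = struct.pack("<I", bits)                # the 4 raw bytes
as_signed = struct.unpack("<i", raw)[0]      # read as two's-complement
as_unsigned = struct.unpack("<I", raw)[0]    # read as unsigned

print(as_signed, as_unsigned)  # -1 4294967295

# The same bits compare differently depending on the operation chosen,
# much like MLIR's arith.cmpi has separate slt and ult predicates.
print(as_signed < 1)    # True  (signed less-than)
print(as_unsigned < 1)  # False (unsigned less-than)
```

This is why Uint32 and Int32 can both map to `i32`: the storage is the same, and signedness only shows up in which op is emitted.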

v0.2.0

v0.2.0 ready: fix cache key issues; ast_while ready; fix att trace code map; enable gfx1151

v0.1.9.dev599

[Enh] Enable fast dispatch for PointerAdaptor

v0.1.8

1, refactor cluster.py; 2, refactor auto dynamic in if_; 3, change backend config

v0.1.7

1, expand 450 tdm support; 2, fix mori shmem err; 3, for range fix; 4, support cache status return

v0.1.6

support glibc 2.27; support sub module; support backend discover; support runonly mode

v0.1.5

Verified

docs: Update FlyDSL agent guidance (#463)

Made-with: Cursor

Signed-off-by: sixifang <sixifang@amd.com>
Co-authored-by: sixifang <sixifang@amd.com>

v0.1.5.dev504

Verified

Support pypi dev version and update to 0.1.5 (#441)

v0.1.4.2

hot fix2, fix aot cpu tensor error and cache err

v0.1.4

1, fix version switch; 2, more atom support; 3, convert atom to ssa form; 4, fix grad tensor to dlpack