v1.19.1

yuanyao-nv released this 10 Oct 02:06
b751946

Note

This patch release fixes bugs in the function definition of Attention-23/24 in Group Query Attention mode and in the reference implementation of RotaryEmbedding-23.

All changes

  • Avoid unnecessary re-generating of proto files (#7253) in #7306
  • Require ml_dtypes>=0.5.0 (#7254) in #7307
  • Cherry pick four attention PRs in #7315
  • Update rotary_embedding reference implementation and tests (#7304, #7316) in #7313
  • Override __repr__ for some proto classes (#7259) in #7314
  • Add check for rc-candidates (Update create_release.yml) (#7261) in #7323
  • Implement repr methods for Model/Graph/Function (#7320) in #7325
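
The new `ml_dtypes>=0.5.0` requirement listed above can be checked in an existing environment before upgrading. The following is a minimal sketch using only the Python standard library. The helper names (`parse_version`, `meets_minimum`, `check_ml_dtypes`) are hypothetical and not part of ONNX or ml_dtypes, and the parser is deliberately simplified: it ignores pre-release suffixes such as `rc1`.

```python
from importlib import metadata

# Minimum version taken from the release note: ml_dtypes>=0.5.0
MIN_ML_DTYPES = (0, 5, 0)


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple.

    Simplification (assumption): suffixes such as 'rc1' or '.dev0' are dropped,
    so '0.5.0rc1' is treated the same as '0.5.0'.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if digits != piece:
            # A non-numeric suffix ends the numeric part of the version.
            break
    return tuple(parts)


def meets_minimum(installed, minimum=MIN_ML_DTYPES):
    """Compare an installed version string against the minimum tuple."""
    # Pad with zeros so that '1.0' compares as (1, 0, 0).
    padded = parse_version(installed) + (0,) * len(minimum)
    return padded[: len(minimum)] >= minimum


def check_ml_dtypes():
    """Return True/False for the installed ml_dtypes, or None if it is absent."""
    try:
        installed = metadata.version("ml_dtypes")
    except metadata.PackageNotFoundError:
        return None
    return meets_minimum(installed)


if __name__ == "__main__":
    print("ml_dtypes satisfies >=0.5.0:", check_ml_dtypes())
```

For strict comparisons that account for pre-releases, a full version parser such as the one in the `packaging` library would be a more robust choice than this sketch.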