
Tags: chdspm/pytorch

viable/strict/1761893641

[CI][BE] Factor out repeated test code (pytorch#166481)

Into `_run_single_arg_fwd`

Pull Request resolved: pytorch#166481
Approved by: https://github.com/Skylion007

viable/strict/1761886268

Remove AT_USE_HIPSPARSE_GENERIC_API (pytorch#166393)

This macro is not used in OSS anymore.
Pull Request resolved: pytorch#166393
Approved by: https://github.com/ezyang

trunk/121235956bab7430fb8d080cee209607f8387ead

Update Node.is_impure check if subgraph contains impure ops (pytorch#166609)

Summary:
## Context
When `const_fold.split_const_subgraphs` sees a `call_module` node whose target is a GraphModule, the existing implementation can mark the node as const-foldable when it shouldn't.

For example, consider a parent graph containing a `call_module` call to a subgraph that has no inputs but contains impure ops:
```
parent graph():
    %sub : [num_users=1] = call_module[target=sub](args = (), kwargs = {})
    %getitem : [num_users=1] = call_function[target=operator.getitem](args = (%sub, slice(None, None, None)), kwargs = {})
    return (getitem,)

submodule graph():
    %randn : [num_users=1] = call_function[target=torch.ops.aten.randn.default](args = ([5, 10],), kwargs = {device: cpu, pin_memory: False})
    %add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%randn, 1), kwargs = {})
    return (add,)
```
When the `submodule` graph itself is fed to `const_fold.split_const_subgraphs`, it comes out unmodified, since `randn` is impure.

But if `submodule` is called by a `parent` graph, feeding `parent` to `const_fold.split_const_subgraphs` results in it being folded:
```
parent after fold graph():
    %_fx_const_folded_attrs : [num_users=1] = get_attr[target=_FX_CONST_FOLDED_ATTRS]
    return (_fx_const_folded_attrs,)
```

This is because the `node.is_impure()` check inside `const_fold.split_const_subgraphs` falls through, so the `call_module` node is treated as pure.

## Fix

We can update the `fx.node.Node.is_impure` function to check for ops inside a `call_module` node with an additional `subgraph_has_impure_ops` check:
- if a `call_module` node calls a GraphModule,
- check whether any of its `call_function` nodes are impure ops, and
- recursively check any `call_module` nodes that themselves call a GraphModule.

If the `call_module` subgraph has impure ops, `is_impure` returns True.
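
A minimal sketch of that recursion (illustrative only, not the exact code landed in this PR; the standalone `subgraph_has_impure_ops` helper here stands in for the check added to `Node.is_impure`):

```
import torch.fx as fx

def subgraph_has_impure_ops(gm: fx.GraphModule) -> bool:
    # Return True if any node in `gm`, or in a nested GraphModule it calls, is impure.
    for node in gm.graph.nodes:
        # Impure call_function targets (e.g. aten.randn) make the whole subgraph impure.
        if node.op == "call_function" and node.is_impure():
            return True
        # Recurse into call_module nodes whose target is itself a GraphModule.
        if node.op == "call_module":
            submod = gm.get_submodule(node.target)
            if isinstance(submod, fx.GraphModule) and subgraph_has_impure_ops(submod):
                return True
    return False
```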

Test Plan: added tests to test_fx_const_fold.py

Differential Revision: D85798483

Pull Request resolved: pytorch#166609
Approved by: https://github.com/blaine-rister

trunk/32066772b3dee643b1657b8957f32b5ac8b1390a

Fix torch.full with dynamic tensor fill_value in torch.compile (pytorch#166554)

Fixes pytorch#166253

## Summary
When `torch.full` is called with a 0-D tensor as `fill_value` inside a `torch.compile`'d function, the value was being incorrectly cached, causing subsequent calls with different values to return the first value.

## Root Cause
The Dynamo handler for `torch.full` was calling `aten._local_scalar_dense` to convert tensor fill_values to Python scalars at compile time, which baked the value into the compiled graph as a constant.

## Solution
Modified the Dynamo handler to decompose `torch.full(size, tensor_fill_value)` into `empty(size).fill_(tensor_fill_value)` when `fill_value` is a `TensorVariable`, keeping the fill value dynamic in the compiled graph.
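
A rough illustration of the decomposition (a sketch for intuition only, not the actual Dynamo handler; the size and fill value below are made up):

```
import torch

size, fill = (3, 4), torch.tensor(2.5)   # fill is a 0-D tensor

# What the compiled graph now does conceptually: the fill value stays a
# tensor input to the graph instead of being baked in as a constant.
out = torch.empty(size).fill_(fill)

# Eager reference using a plain Python scalar, for comparison.
assert torch.equal(out, torch.full(size, fill.item()))
```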

## Testing
Added test case that verifies torch.full works correctly with dynamic tensor fill_values across multiple calls and dtypes.

Pull Request resolved: pytorch#166554
Approved by: https://github.com/Lucaskabela

trunk/12577064dddfc6f5daf66c5b5a73cb418a588f20

[MPS] Fix crash when max/min ops called for complex types (pytorch#166214)

Raise an exception, as the operation is meaningless for complex types and otherwise results in a segfault:
```
% python -c "import torch;torch.rand(10, dtype=torch.cfloat, device='mps').amax()"
(mpsFileLoc): /AppleInternal/Library/BuildRoots/4~B6shugDBannYeMBGCfhw7wjvNJOfy4BrawZ7TdI/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm:176:0: error: 'mps.reduction_max' op operand #0 must be tensor of mps native type values, but got 'tensor<10xcomplex<f32>>'
(mpsFileLoc): /AppleInternal/Library/BuildRoots/4~B6shugDBannYeMBGCfhw7wjvNJOfy4BrawZ7TdI/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm:176:0: note: see current operation: %2 = "mps.reduction_max"(%arg0, %1) <{keep_dims, propagate_nans}> : (tensor<10xcomplex<f32>>, tensor<1xsi32>) -> tensor<1xcomplex<f32>>
(mpsFileLoc): /AppleInternal/Library/BuildRoots/4~B6shugDBannYeMBGCfhw7wjvNJOfy4BrawZ7TdI/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm:176:0: error: 'mps.reduction_max' op operand #0 must be tensor of mps native type values, but got 'tensor<10xcomplex<f32>>'
(mpsFileLoc): /AppleInternal/Library/BuildRoots/4~B6shugDBannYeMBGCfhw7wjvNJOfy4BrawZ7TdI/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm:176:0: note: see current operation: %2 = "mps.reduction_max"(%arg0, %1) <{keep_dims, propagate_nans}> : (tensor<10xcomplex<f32>>, tensor<1xsi32>) -> tensor<1xcomplex<f32>>
/AppleInternal/Library/BuildRoots/4~B6shugDBannYeMBGCfhw7wjvNJOfy4BrawZ7TdI/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphExecutable.mm:1347: failed assertion `original module failed verification'
zsh: abort      python -c
```

To be tested by `test_ops.py`
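
A quick way to check the intended post-fix behavior (a hedged sketch: it assumes the new check surfaces as a RuntimeError and requires an MPS-capable machine):

```
import torch

if torch.backends.mps.is_available():
    x = torch.rand(10, dtype=torch.cfloat, device="mps")
    try:
        x.amax()  # previously aborted the whole process; now expected to raise
    except RuntimeError as e:
        print("amax on complex MPS tensors is rejected:", e)
```
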
Pull Request resolved: pytorch#166214
Approved by: https://github.com/dcci, https://github.com/kulinseth, https://github.com/Skylion007
ghstack dependencies: pytorch#166272

trunk/26534e9809eb2f7cd804fde5152cdd13dda2293f

Revert "[GraphPartition] cache get_free_symbol_uses (pytorch#166338)"

This reverts commit a6b1ef1.

Reverted pytorch#166338 on behalf of https://github.com/atalman due to Failure: test/nn/test_convolution.py::TestConvolutionNN::test_conv3d_overflow_values [GH job link](https://github.com/pytorch/pytorch/actions/runs/18961173726/job/54149112920) [HUD commit link](https://hud.pytorch.org/pytorch/pytorch/commit/a6b1ef17173f56ba93ac97ff4384fa4060b5e41e) ([comment](pytorch#166338 (comment)))

trunk/797cd80b2670a51601f997f8c67387bd30440a36

[dynamo, nested graph breaks] codegen dead nested cells correctly (pytorch#166476)

Pull Request resolved: pytorch#166476
Approved by: https://github.com/Lucaskabela

trunk/657f8c3e21bd8901dd8ce79ca9a54a45b27f604f

Revert "Fix torch.full with dynamic tensor fill_value in torch.compile (

pytorch#166554)"

This reverts commit 3206677.

Reverted pytorch#166554 on behalf of https://github.com/atalman due to Failure: test/nn/test_pooling.py::TestPoolingNNDeviceTypeCPU::test_max_pool_nan_inf_cpu_float32 [GH job link](https://github.com/pytorch/pytorch/actions/runs/18959368975/job/54144148546) [HUD commit link](https://hud.pytorch.org/pytorch/pytorch/commit/32066772b3dee643b1657b8957f32b5ac8b1390a) ([comment](pytorch#166554 (comment)))

trunk/267d0197bfca0232488d51dd1ff735d619adc2cf

[dynamo] fix error_on_graph_break bug where non-empty checkpoint results in unwanted graph break resumption (pytorch#166586)

Fixes pytorch#166589

Pull Request resolved: pytorch#166586
Approved by: https://github.com/Lucaskabela
ghstack dependencies: pytorch#166476, pytorch#166477

trunk/160ab53dd57e67b3574763615cf8b33249e9afa5

Update weight tensor initialization in RMSNormalization (pytorch#166550)

Ensure the weight is a >1-d tensor for ORT compatibility.

Pull Request resolved: pytorch#166550
Approved by: https://github.com/titaiwangms