Tags: anfedotoff/pytorch
Update on "Remove pow and float_power TestGradient Skips" [ghstack-poisoned]
Update on "Add linalg.lu"
This PR modifies `lu_unpack` by:
- Using less memory when unpacking `L` and `U`
- Fusing the subtraction by `-1` with `unpack_pivots_stub`
- Defining tensors of the correct types to avoid copies
- Porting `lu_unpack` to be a structured kernel so that its `_out` version does not incur extra copies
Then we implement `linalg.lu` as a structured kernel, as we want to compute its derivative manually. We do so because composing the derivatives of `torch.lu_factor` and `torch.lu_unpack` would be less efficient.
This new function and `lu_unpack` come with everything they can: forward and backward AD, decent docs, correctness tests, OpInfo, complex support, support for meta tensors, and support for vmap and vmap over the gradients. I really hope we don't continue adding more features.
This PR also avoids saving some of the tensors that were previously saved unnecessarily for the backward in `lu_factor_ex_backward` and `lu_backward`, and makes some other general improvements to the forward and backward AD formulae of other related functions.
cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano [ghstack-poisoned]
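As a rough illustration of what unpacking an LU factorization involves, the pure-Python sketch below splits a packed square LU matrix into `L` and `U` and turns 1-based sequential row swaps into a 0-based permutation, with the shift by one done inside the same loop. The helper names `unpack_pivots` and `unpack_lu` are hypothetical and chosen for this example; this is not PyTorch's kernel, and it assumes LAPACK-style 1-based pivots and a square input.

```python
# Illustrative sketch only: these helpers are hypothetical and are not
# PyTorch's implementation of lu_unpack.

def unpack_pivots(pivots):
    """Convert 1-based sequential row swaps into a 0-based row permutation.

    perm[k] is the index of the original row that ends up in row k
    after all swaps have been applied in order.
    """
    perm = list(range(len(pivots)))
    for i, p in enumerate(pivots):
        j = p - 1  # shift to 0-based inside the same loop, no separate pass
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def unpack_lu(packed):
    """Split a packed square LU matrix into unit-lower L and upper U."""
    n = len(packed)
    lower = [
        [packed[i][j] if j < i else (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    upper = [
        [packed[i][j] if j >= i else 0.0 for j in range(n)]
        for i in range(n)
    ]
    return lower, upper


perm = unpack_pivots([3, 3, 3])
L, U = unpack_lu([[4.0, 3.0], [0.5, 1.5]])
```

The strictly lower part of the packed matrix holds `L` (its unit diagonal is implied), and the rest holds `U`, which is why no extra storage is needed for the factorization itself.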
[FSDP] Relax exec order valid. to only fwd [ghstack-poisoned]
[ROCm] default tests use 1 GPU, distributed tests use 2 GPUs
Add optional timeout argument for RpcAgent join() (pytorch#76194)
Summary: This PR was created to resolve the issue brought up in https://fb.workplace.com/groups/319878845696681/permalink/741428653541696/
Changes:
- Add a timeout argument to RpcAgent.join()
- Add an optional timeout argument to ThriftRpcAgent barrier()
- When shutdown (ThriftRpcAgent join) calls the barrier, the agent uses the timeout passed to shutdown and passes it into join().
- Update API.py to also fix a bug (missing timeout for signal)
- Change the default shutdown timeout to 0 (no timeout). Existing functionality in _all_gather remains the same and waits indefinitely for the signal if no timeout is set for the function. The new functionality lets the user specify a timeout for both the signal and the RPC calls.
Pull Request resolved: pytorch#76194
Test Plan: Modified barrier test
buck test torch/fb/distributed/thriftRpcBackend/test:ThriftRpcAgentTest -- BarrierTest
Differential Revision: D35825382
fbshipit-source-id: 2195a4350c07accceec52905ef1f7990534d0ec8
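The "0 means no timeout" convention described in this entry can be sketched with the standard library alone. The function `wait_for_signal` below is a hypothetical stand-in written for this example; it is not the RpcAgent or `_all_gather` code, and it only shows how a zero timeout might be mapped to an indefinite wait.

```python
import threading


def wait_for_signal(event: threading.Event, timeout: float = 0) -> bool:
    """Wait for `event`; a timeout of 0 is treated as "wait indefinitely".

    Returns True if the event was set, False if the timeout expired first.
    """
    # threading.Event.wait takes None to block without a deadline.
    return event.wait(None if timeout == 0 else timeout)


signal = threading.Event()
timed_out = not wait_for_signal(signal, timeout=0.01)  # nobody set the event
signal.set()
received = wait_for_signal(signal)  # default timeout of 0, returns at once
```

Keeping the default at 0 preserves the old blocking behavior for callers that never pass a timeout, while callers that do pass one get a bounded wait.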
Update on "[NVFuser] Opinfos for extremal values" Added slow tests for comparing the eager & fused outputs for given extremal inputs. [ghstack-poisoned]
Rebase and fix merge conflicts on "cuDNN/miopen: Use per-operator headers" Differential Revision: [D33949898](https://our.internmc.facebook.com/intern/diff/D33949898) [ghstack-poisoned]