🐛 Bug
The BWD nn test with a dynamic input and without sigmoid fails with a new error.
A similar model, the BWD nn test with a dynamic input and with sigmoid, fails with an autograd error: #4322. So I replaced the sigmoid with relu, and the new model failed with the following error:
Traceback (most recent call last):
File "pytorch/xla/test/test_dynamic_shape_backward_models.py", line 82, in <module>
train(model, loss_fn=criterion, optimizer=optimizer)
File "pytorch/xla/test/test_dynamic_shape_backward_models.py", line 69, in train
loss.backward()
File "/home/ptxla/.local/lib/python3.8/site-packages/torch/_tensor.py", line 484, in backward
torch.autograd.backward(
File "/home/ptxla/.local/lib/python3.8/site-packages/torch/autograd/__init__.py", line 197, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: torch_xla/csrc/helpers.cpp:273 : Check failed: out_size <= size_at_dyndim / input_shape.dimensions( input_dynamic_dimension) (10 vs. 1)
*** Begin stack trace ***
tsl::CurrentStackTrace[abi:cxx11]()
torch_xla::XlaHelpers::GetDynamicReshapeInfo(xla::Shape const&, absl::lts_20220623::Span<long const>)
torch_xla::XlaHelpers::GetDynamicReshape(xla::Shape const&, absl::lts_20220623::Span<long const>)
torch_xla::Permute::MakePermuteShape(xla::Shape const&, absl::lts_20220623::Span<long const>)
torch_xla::ViewInfo::ViewInfo(torch_xla::ViewInfo::Type, xla::Shape, std::vector<long, std::allocator<long> >)
torch_xla::tensor_methods::transpose(c10::intrusive_ptr<torch_xla::XLATensor, c10::detail::intrusive_target_default_null_type<torch_xla::XLATensor> > const&, long, long)
torch_xla::XLANativeFunctions::t(at::Tensor const&)
at::_ops::t::redispatch(c10::DispatchKeySet, at::Tensor const&)
at::_ops::t::redispatch(c10::DispatchKeySet, at::Tensor const&)
at::_ops::t::call(at::Tensor const&)
torch::autograd::generated::AddmmBackward0::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&)
torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&, std::shared_ptr<torch::autograd::ReadyQueue> const&)
torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&)
torch::autograd::Engine::thread_init(int, std::shared_ptr<torch::autograd::ReadyQueue> const&, bool)
torch::autograd::python::PythonEngine::thread_init(int, std::shared_ptr<torch::autograd::ReadyQueue> const&, bool)
clone
*** End stack trace ***
Unable to map dynamic dimension of shape f32[<=80,10]{1,0} to output sizes (10, 80)
Full error output with print statements.
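For context, here is a minimal sketch of the kind of setup that appears to trigger this, pieced together from the test name and the traceback above (the actual contents of test_dynamic_shape_backward_models.py, the layer sizes, the loss, and the construction of the dynamic input are not reproduced here): in the backward pass of nn.Linear, AddmmBackward0 transposes the saved forward input to compute the weight gradient, and when that input has a bounded dynamic batch dimension such as f32[<=80,10], the t() lowering goes through XlaHelpers::GetDynamicReshapeInfo, which is the check that fails.
# Hypothetical sketch, not the actual test file.
import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm

device = xm.xla_device()
linear = nn.Linear(10, 1).to(device)

# Placeholder: in the real test the leading dimension of this tensor is
# dynamic (f32[<=80,10]), produced via nonzero/masked_select under
# XLA_EXPERIMENTAL="nonzero:masked_select"; its construction is elided here.
dyn_x = torch.rand(80, 10, device=device)

out = torch.relu(linear(dyn_x))  # relu instead of sigmoid, per the description above
loss = out.sum()
loss.backward()  # AddmmBackward0 calls t() on the saved input -> check fails
xm.mark_step()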
To Reproduce
Run the script from the PR on a TPU VM:
export XRT_TPU_CONFIG="localservice;0;localhost:51011"
export XLA_EXPERIMENTAL="nonzero:masked_select"
python3 pytorch/xla/test/test_dynamic_shape_backward_models.py
Expected behavior
It shouldn't crash.
Environment
- Reproducible on XLA backend [CPU/TPU]: TPU
- torch_xla version: HEAD
Additional context