Conversation

@alanwaketan alanwaketan commented Apr 16, 2024

Summary:
This pull request enables SPMDShardToFullShape. The trickiest part is how to obtain the full shape, and here are a couple of options:

  • Bookkeeping the full shape of the tensor that enters SPMDFullToShardShape. This is not selected because the output could be created on the fly.
  • Constructing the full shape from the local shard and the sharding spec. This is not selected because there is no way to deal with padding; we can't examine the data at tracing time.
  • Letting users pass the full shape in. This is selected because it is the soundest path; a sketch of the resulting usage follows this list.
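To illustrate the selected option, here is a minimal sketch pieced together from the test excerpts quoted in the review comments below. The tensor setup, shapes, and the xs import path are assumptions; only the _mark_manual_sharding and _spmd_shard_to_full_shape calls mirror the tests, and in the e2e test the tensor goes through SPMDFullToShardShape before reaching this point.

# Hedged sketch of the selected option: the caller bookkeeps the full shape and
# passes it back in when leaving manual-sharding mode. Setup and import paths
# are assumptions; the _XLAC calls mirror the test excerpts in this PR.
import torch
import torch_xla
import torch_xla.core.xla_model as xm
import torch_xla.distributed.spmd as xs  # assumed path for the xs alias used in the tests

xt = torch.randn(8, 16).to(xm.xla_device())
full_shape, full_dtype = xt.shape, xt.dtype  # user-side bookkeeping of the full shape

# Enter manual-sharding mode; the wrapped tensor exposes the local shard via
# xt.global_tensor.
xt = xs._mark_manual_sharding(xt)

# ... manually sharded computation on xt.global_tensor would happen here ...

# Leave manual sharding: the user supplies the sharding to re-apply plus the
# full shape and dtype, since neither can be inferred from the local shard alone.
xx = torch_xla._XLAC._spmd_shard_to_full_shape(
    xt.global_tensor,
    torch_xla._XLAC.OpSharding([], [], [], xs.ShardingType.REPLICATED),
    full_shape, full_dtype)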

Test Plan:
PJRT_DEVICE=TPU python test/spmd/test_xla_sharding.py -v -k test_manual_sharding_e2e -k test_spmd_shard_to_full_shape

@alanwaketan alanwaketan force-pushed the alanwaketan/resume_spmd branch from 9c8643b to 50f4c66 on April 16, 2024 20:39
@alanwaketan alanwaketan requested review from yeounoh and jonb377 April 16, 2024 20:43
@alanwaketan alanwaketan marked this pull request as ready for review April 16, 2024 20:44
xt = xs._mark_manual_sharding(xt)
xx = torch_xla._XLAC._spmd_shard_to_full_shape(
    xt.global_tensor,
    torch_xla._XLAC.OpSharding([], [], [], xs.ShardingType.REPLICATED),
Contributor

So it uses the passed sharding type when it returns back to SPMD?

Collaborator Author

Yup!

with self.assertRaises(RuntimeError):
  x = torch_xla._XLAC._spmd_shard_to_full_shape(
      x, torch_xla._XLAC.OpSharding([], [], [], xs.ShardingType.REPLICATED),
      x.shape, x.dtype)
Contributor

So we've decided to pass the original full shape as an argument, right?

Collaborator Author

At least I have decided haha; see the PR description for the reasoning behind it.

Contributor

@yeounoh yeounoh left a comment

LGTM, left some questions -- thanks @alanwaketan ❤️

@alanwaketan
Collaborator Author

Thanks @yeounoh for the quick review!

@alanwaketan alanwaketan merged commit 6e34bbe into master Apr 17, 2024
@alanwaketan alanwaketan deleted the alanwaketan/resume_spmd branch April 17, 2024 00:13
lausannel pushed a commit to AlibabaPAI/xla that referenced this pull request Aug 6, 2024
baoleai pushed a commit to AlibabaPAI/xla that referenced this pull request Aug 6, 2024