Conversation

@alanwaketan alanwaketan commented Apr 16, 2024

Summary:
This pull request enables SPMDShardToFullShape. The trickiest part is how to obtain the full shape, and here are a couple of options:

  • Bookkeeping the full shape of the tensor that enters SPMDFullToShardShape. This is not selected because the output could be created on the fly.
  • Constructing the full shape from the local shard and the sharding spec. This is not selected because there is no way to deal with padding; we can't examine the data at tracing time.
  • Letting users pass the full shape in. This is selected because it is the soundest path; a sketch of the resulting usage follows this list.
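To illustrate the selected option, here is a minimal sketch pieced together from the test excerpts quoted in the review comments below. The tensor setup, shapes, and the xs import path are assumptions; only the _mark_manual_sharding and _spmd_shard_to_full_shape calls mirror the tests, and in the e2e test the tensor goes through SPMDFullToShardShape before reaching this point.

# Hedged sketch of the selected option: the caller bookkeeps the full shape and
# passes it back in when leaving manual-sharding mode. Setup and import paths
# are assumptions; the _XLAC calls mirror the test excerpts in this PR.
import torch
import torch_xla
import torch_xla.core.xla_model as xm
import torch_xla.distributed.spmd as xs  # assumed path for the xs alias used in the tests

xt = torch.randn(8, 16).to(xm.xla_device())
full_shape, full_dtype = xt.shape, xt.dtype  # user-side bookkeeping of the full shape

# Enter manual-sharding mode; the wrapped tensor exposes the local shard via
# xt.global_tensor.
xt = xs._mark_manual_sharding(xt)

# ... manually sharded computation on xt.global_tensor would happen here ...

# Leave manual sharding: the user supplies the sharding to re-apply plus the
# full shape and dtype, since neither can be inferred from the local shard alone.
xx = torch_xla._XLAC._spmd_shard_to_full_shape(
    xt.global_tensor,
    torch_xla._XLAC.OpSharding([], [], [], xs.ShardingType.REPLICATED),
    full_shape, full_dtype)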

Test Plan:
PJRT_DEVICE=TPU python test/spmd/test_xla_sharding.py -v -k test_manual_sharding_e2e -k test_spmd_shard_to_full_shape

@alanwaketan alanwaketan force-pushed the alanwaketan/resume_spmd branch from 9c8643b to 50f4c66 on April 16, 2024 20:39
@alanwaketan alanwaketan requested review from yeounoh and jonb377 April 16, 2024 20:43
@alanwaketan alanwaketan marked this pull request as ready for review April 16, 2024 20:44
xt = xs._mark_manual_sharding(xt)
xx = torch_xla._XLAC._spmd_shard_to_full_shape(
    xt.global_tensor,
    torch_xla._XLAC.OpSharding([], [], [], xs.ShardingType.REPLICATED),
Contributor

So it uses the passed sharding type when it returns back to SPMD?

Collaborator Author

Yup!

with self.assertRaises(RuntimeError):
  x = torch_xla._XLAC._spmd_shard_to_full_shape(
      x, torch_xla._XLAC.OpSharding([], [], [], xs.ShardingType.REPLICATED),
      x.shape, x.dtype)
Contributor

So we've decided to pass the original full shape as an argument, right?

Collaborator Author

At least I have decided haha; see the PR description for the reasoning behind it.

Contributor

@yeounoh yeounoh left a comment

LGTM, left some questions -- thanks @alanwaketan ❤️

@alanwaketan
Collaborator Author

Thanks @yeounoh for the quick review!

@alanwaketan alanwaketan merged commit 6e34bbe into master Apr 17, 2024
@alanwaketan alanwaketan deleted the alanwaketan/resume_spmd branch April 17, 2024 00:13
lausannel pushed a commit to AlibabaPAI/xla that referenced this pull request Aug 6, 2024
baoleai pushed a commit to AlibabaPAI/xla that referenced this pull request Aug 6, 2024