Add fake impl for aten.unique_dim #126561
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/126561
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV: there is 1 currently active SEV. If your PR is affected, please view it below.
❌ 1 New Failure, 1 Unrelated Failure as of commit b1455fa with merge base 3f28906:
NEW FAILURE - the following job has failed.
UNSTABLE - the following job failed but was likely due to flakiness present on trunk and has been marked as unstable.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
torch/_subclasses/fake_impls.py (Outdated)

    if dim is None:
        arg_dim = arg
    else:
        arg_dim = arg.new_empty((arg.shape[dim],))
While I can believe you get correct results this way, this is written in a needlessly circuitous way. When writing a meta, you should prefer to allocate the result tensors directly; do not allocate intermediate tensors that never actually get returned.
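For illustration, a minimal sketch of the direct-allocation style being suggested (the function name and the nnz parameter are placeholders, not code from this PR):

    import torch

    def fake_unique_dim_result(arg: torch.Tensor, dim: int, nnz: int) -> torch.Tensor:
        # Allocate the returned tensor at its final shape in one step;
        # no intermediate arg_dim tensor is materialized along the way.
        return arg.new_empty(*arg.shape[:dim], nnz, *arg.shape[dim + 1 :])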
Addressed in e040b23
torch/_subclasses/fake_impls.py (Outdated)

        arg_dim = arg
    else:
        arg_dim = arg.new_empty((arg.shape[dim],))

    if (nnz := arg.unique_memo) is None:
The memo strategy needs to be adjusted here, because unique along one dimension differs from unique along other dims. You are sharing the memo across all of them.
I'll admit I haven't dug deep enough to understand what is going on with the memos yet.
Are the memos even necessary? This comment implies to me that they may not be needed for unique.
I think the simplest resolution here is to just not use the memo for unique dim
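A rough sketch of that resolution, using identifiers from the diffs above (how the unbacked symint is created and constrained is assumed, not quoted from the PR):

    if dim is None:
        # Flattened unique: safe to share one memo per tensor.
        if (nnz := arg.unique_memo) is None:
            nnz = fake_mode.shape_env.create_unbacked_symint()
            arg.unique_memo = nnz
    else:
        # unique_dim: skip the memo entirely, since a single memo cannot
        # distinguish results computed along different dims.
        nnz = fake_mode.shape_env.create_unbacked_symint()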
Adopted the suggested resolution in e040b23
It also needs tests, including for the memoization problem I reported.
torch/_subclasses/fake_impls.py (Outdated)

    if dim is None:
        ret = [arg.new_empty((nnz,))]
    else:
        ret = [arg.new_empty(*arg.shape[:dim], nnz, *arg.shape[dim + 1:])]

    if return_inverse:
CPU/CUDA differ for unique_dim in how they handle return_inverse and return_counts: CPU ignores the arguments and always returns each, whereas CUDA does not. I'm not sure if that distinction is important at this level of abstraction, but if it is, I assume we should favor the CUDA implementation?
Hmm, this sounds like an eager mode bug. But we have the ability to query the device using fake_device, so you can give the exact behavior that eager has (and yes, it would be good to do that now).
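As a hedged sketch of what that device-dependent behavior could look like inside the fake impl (the exact flag handling is an assumption, not the PR's final code):

    # A FakeTensor reports the device it is emulating via fake_device.
    if arg.fake_device.type == "cpu":
        # Mirror eager: the CPU kernel ignores these flags and always
        # materializes inverse indices and counts.
        return_inverse = True
        return_counts = True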
To clarify, this difference arises from their source implementations: CPU / CUDA. One can note that the return_inverse and return_counts arguments are unused for _unique_dim_cpu_template. Perhaps this is still something to do with eager mode; I don't know enough about the inner workings and dispatches to rule that out myself.
Since the memoization issue is sidestepped by e040b23, I believe the existing tests in […]
I'm hoping an xfail test starts succeeding, otherwise a manual test will be needed.
    @@ -2522,8 +2522,8 @@ def map_to_fake(e):
                 or name in sometimes_dynamic_output_op_test
             )
             self.assertTrue(
    -            mode.shape_env is None
    -            or not mode.shape_env.allow_dynamic_output_shape_ops
    +            fake_mode.shape_env is None
> I'm hoping an xfail test starts succeeding, otherwise a manual test will be needed
Previously, a DynamicOutputShapeException was raised when dim was not None, but it was handled here (Line 2519 in b948b1a):

    except torch._subclasses.fake_tensor.DynamicOutputShapeException:
Traceback
Traceback (most recent call last):
File "test/test_ops.py", line 2480, in run_with_fake_mode_and_verify
res_fake = op(input, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "torch/testing/_internal/opinfo/core.py", line 1132, in __call__
return self.op(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "torch/_jit_internal.py", line 502, in fn
return if_false(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "torch/_jit_internal.py", line 502, in fn
return if_false(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "torch/functional.py", line 996, in _return_output
output, _, _ = _unique_impl(input, sorted, return_inverse, return_counts, dim)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "torch/functional.py", line 902, in _unique_impl
output, inverse_indices, counts = _VF.unique_dim(
^^^^^^^^^^^^^^^
File "torch/utils/_stats.py", line 20, in wrapper
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "torch/_subclasses/fake_tensor.py", line 973, in __torch_dispatch__
return self.dispatch(func, types, args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "torch/_subclasses/fake_tensor.py", line 1362, in dispatch
return self._cached_dispatch_impl(func, types, args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "torch/_subclasses/fake_tensor.py", line 1065, in _cached_dispatch_impl
output = self._dispatch_impl(func, types, args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "torch/_subclasses/fake_tensor.py", line 1642, in _dispatch_impl
op_impl_out = op_impl(self, func, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "torch/_subclasses/fake_impls.py", line 258, in dyn_shape
raise DynamicOutputShapeException(func)
torch._subclasses.fake_tensor.DynamicOutputShapeException: aten.unique_dim.default
However, it was erroneously handled because the incorrect mode was used in the exception handler. Switching to fake_mode instead of mode caused two failures and two errors prior to the change in this PR. The exception is no longer raised because an implementation for unique_dim can be found.
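To make the described fix concrete, a rough sketch assembled only from identifiers visible in this thread (the surrounding test harness in test/test_ops.py is assumed):

    try:
        res_fake = op(input, *args, **kwargs)
    except torch._subclasses.fake_tensor.DynamicOutputShapeException:
        # Before this PR, `mode` was consulted here instead of `fake_mode`,
        # so the assertion silently checked the wrong object.
        self.assertTrue(
            fake_mode.shape_env is None
            or not fake_mode.shape_env.allow_dynamic_output_shape_ops
        )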
nice catch!
ayyy, just have to remove the xfail now
@pytorchbot merge
Merge failed. Reason: This PR needs a "release notes:" label. If your changes are user facing and intended to be a part of release notes, please use a label starting with "release notes:". If not, please add the "topic: not user facing" label. To add a label, you can comment to pytorchbot, for example @pytorchbot label "topic: not user facing". For more information, see the PyTorch AutoLabel Bot wiki. Details for Dev Infra team: raised by workflow job.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 mandatory check(s) failed. Dig deeper by viewing the failures on hud.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 mandatory check(s) failed. Dig deeper by viewing the failures on hud.
@pytorchbot merge -i
Merge started. Your change will be merged while ignoring the following 3 checks: pull / linux-focal-cuda12.4-py3.10-gcc9-sm86 / build, pull / linux-focal-cuda12.4-py3.10-gcc9 / build, trunk / macos-13-py3-arm64 / test (default, 1, 3, macos-m1-stable). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge -f "only unrelated failures"
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as a last resort; instead consider -i/--ignore-current to continue the merge while ignoring current failures. Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Follow-up to #113118 and #124306.
Developed in coordination with the solution to microsoft/onnxscript#1547
This PR adds the missing fake tensor implementation for aten.unique_dim, thus enabling tracing and compilation of torch.unique when dim is not None. Local testing has proceeded with the following simple script (provided that one has checked out the changes in microsoft/onnxscript#1547):
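The script itself is not included in this scrape; the following is a hypothetical reconstruction of what such local testing could look like (the model, input shapes, and export call are assumptions):

    import torch

    # Data-dependent output shapes must be captured for torch.unique to trace.
    torch._dynamo.config.capture_dynamic_output_shape_ops = True

    class Model(torch.nn.Module):
        def forward(self, x):
            return torch.unique(x, dim=0, return_inverse=True, return_counts=True)

    model = Model()
    x = torch.randint(0, 3, (4, 5))

    # Exercises the new fake impl for aten.unique_dim under compilation.
    compiled = torch.compile(model, fullgraph=True)
    print(compiled(x))

    # ONNX export path that motivated microsoft/onnxscript#1547.
    onnx_program = torch.onnx.dynamo_export(model, x)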
cc @ezyang @msaroufim @bdhirsh @anijain2305 @chauhang