rewrite_data_files does not respect table sort order
#10346
Comments
What plan was used when the sort order wasn't specified? We should be able to see which sort order was used without checking any data files. This should be easily visible in the Spark UI.
[Spark plan screenshots: with explicit sort | without explicit sort]
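For anyone trying to reproduce the two plans, here is a minimal sketch of how the two calls might differ. It only builds the SQL strings for Iceberg's `rewrite_data_files` Spark procedure. The catalog name `my_catalog`, the table `db.events`, and the column `event_ts` are hypothetical, and using `strategy => 'sort'` is an assumption, since the thread does not show the exact call.

```python
from typing import Optional


def rewrite_call(catalog: str, table: str, sort_order: Optional[str] = None) -> str:
    """Build a CALL statement for Iceberg's rewrite_data_files procedure.

    When sort_order is omitted, the statement leaves it to the procedure
    to fall back to the sort order defined on the table.
    """
    args = [f"table => '{table}'", "strategy => 'sort'"]
    if sort_order is not None:
        args.append(f"sort_order => '{sort_order}'")
    return f"CALL {catalog}.system.rewrite_data_files({', '.join(args)})"


# Hypothetical names, for illustration only.
with_explicit_sort = rewrite_call("my_catalog", "db.events", "event_ts ASC NULLS LAST")
without_explicit_sort = rewrite_call("my_catalog", "db.events")

print(with_explicit_sort)
print(without_explicit_sort)
```

Each string can be passed to `spark.sql(...)` in a session that has the Iceberg extensions enabled, and the resulting plans can then be compared in the Spark UI.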
Okay, I admit that calling this non-working was a bit premature. Still, the partition has a lot of overlapping files after a rewrite that does not explicitly set the sort order.
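To make "overlapping files" concrete, here is a rough sketch of one way to count them. It assumes the lower and upper bounds of the sort column have already been extracted per data file as plain integers (for example from the table's file metadata). The sample bounds are made up and are not taken from this issue.

```python
from typing import List, Tuple


def count_overlapping_files(bounds: List[Tuple[int, int]]) -> int:
    """Count files whose [lower, upper] range intersects another file's range."""
    overlapping = set()
    for i, (lo_i, hi_i) in enumerate(bounds):
        for j in range(i + 1, len(bounds)):
            lo_j, hi_j = bounds[j]
            # Two closed intervals intersect when each starts before the other ends.
            if lo_i <= hi_j and lo_j <= hi_i:
                overlapping.update((i, j))
    return len(overlapping)


# Made-up bounds: a cleanly sorted partition versus a poorly sorted one.
well_sorted = [(0, 9), (10, 19), (20, 29)]
poorly_sorted = [(0, 25), (5, 19), (20, 29)]

print(count_overlapping_files(well_sorted))
print(count_overlapping_files(poorly_sorted))
```

After a sort that worked as intended, the count for a partition should be at or near zero; a high count suggests the rewrite did not order the data by that column.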
So it is definitely using the default sort order, as evidenced by the plan, but something in our sort request to Spark isn't working properly. While the two plans are different, I feel like they should both produce correct output. This will probably need a bit more debugging.
It's a little odd that the first plan doesn't have the partitioning transform, which it probably should have...
Apache Iceberg version
1.5.2 (latest release)
Query engine
Spark
Please describe the bug 🐞
Output