Search before asking
I searched the issues and found nothing similar.
Motivation
Dynamic bucket writing does two shuffles. The first, repartitionByKeyPartitionHash, seems unnecessary: it appears to be used only to determine assignId. However, assignId can be computed from partitionHash/keyHash/numParallelism/numAssigners, so the extra shuffle should not be needed. Can we remove it?
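To illustrate the idea, here is a minimal sketch of how an assign id could in principle be derived locally from the two hashes, without any repartitioning. The method name and the exact formula below are assumptions for illustration only, not Paimon's actual implementation:

```java
// Illustrative sketch: derive an assign id purely from partitionHash and
// keyHash, so no upfront shuffle by key/partition hash would be required.
// NOTE: this formula is a hypothetical example, not Paimon's real logic.
public class AssignIdSketch {

    static int assignId(int partitionHash, int keyHash,
                        int numParallelism, int numAssigners) {
        // Pick a starting task for the partition, then offset within the
        // numAssigners assigner tasks responsible for that partition.
        int start = Math.abs(partitionHash % numParallelism);
        int offset = Math.abs(keyHash % numAssigners);
        return (start + offset) % numParallelism;
    }

    public static void main(String[] args) {
        // Every writer task can compute this independently from the row's
        // hashes plus two static job parameters.
        System.out.println(assignId(7, 13, 4, 2));
    }
}
```

Since the inputs are just the row's hashes plus two static job parameters, every task could evaluate this independently, which is the basis for questioning the first shuffle.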
Code reference: paimon/paimon-spark/paimon-spark-common/src/main/scala/org/apache/paimon/spark/commands/PaimonSparkWriter.scala, line 143 at commit e27ceb4.
Solution
No response
Anything else?
No response
Are you willing to submit a PR?