[SUPPORT] java.lang.ClassCastException: class org.apache.spark.sql.catalyst.expressions.UnsafeRow cannot be cast to class org.apache.spark.sql.vectorized.ColumnarBatch #11106
Comments
It looks like a known issue reported in #9305.
Hi @danny0405, yes, it does look similar. However, the table was already running on Spark 3.3.2 and Hudi 0.13.1 without errors. The only changes were that we upgraded Hudi to 0.14.1 and turned on the metadata table. The cast also seems to go in the opposite direction, and the first run with Hudi 0.14.1, which did not have metadata enabled, succeeded. Do you think the issue is related to how the metadata table is saved? In other words, is the metadata table not supported with Spark 3.3.2? Thanks for the help!
It is supported. Can you share your config options related to the metadata table?
Hi @danny0405, we are using the defaults only. All Hudi configs we specified are listed above. Is there something we should configure specifically?
I'm pretty sure it is a jar conflict. Can you check which jar provides the reported class?
@vicuna96 How many columns are there in your dataset? If it's more than 100, did you try setting spark.sql.codegen.maxFields?
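For reference, that setting can be raised through the normal Spark SQL configuration mechanisms, e.g. `spark-defaults.conf` or a `--conf` flag on `spark-submit`. The value 200 below is an arbitrary illustration; Spark's default for this setting is 100:

```properties
# spark-defaults.conf — raise the field limit before whole-stage codegen is skipped
# (default: 100; 200 here is only an example value)
spark.sql.codegen.maxFields  200
```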
Hi @danny0405, the class seems to live in the spark-catalyst_2.12-3.3.2.jar package, but org.apache.spark.sql.catalyst.expressions.UnsafeRow does not extend org.apache.spark.sql.vectorized.ColumnarBatch. Is this expected in other versions? Hi @ad1happy2go, I can give it a try, but the table should have fewer than 100 columns, and this seems like a Spark property rather than a Hudi property, and the Spark version has not changed. I will update once I get a chance to test it.
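One way to confirm which artifact a class was actually loaded from at runtime (rather than guessing from the classpath) is to query its `CodeSource`. A minimal, self-contained sketch follows; on the Spark driver you would pass the class named in the exception, e.g. `org.apache.spark.sql.catalyst.expressions.UnsafeRow`, but here the snippet resolves its own class so it runs anywhere:

```java
import java.security.CodeSource;

public class FindJar {
    /** Path of the jar or directory a class was loaded from, or null for JDK bootstrap classes. */
    static String sourceOf(String className) throws ClassNotFoundException {
        CodeSource cs = Class.forName(className).getProtectionDomain().getCodeSource();
        return cs == null ? null : cs.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // On the Spark driver, pass the class reported in the ClassCastException:
        //   sourceOf("org.apache.spark.sql.catalyst.expressions.UnsafeRow")
        System.out.println("FindJar loaded from: " + sourceOf("FindJar"));
        // JDK bootstrap classes have no CodeSource, so this prints null:
        System.out.println("java.lang.String loaded from: " + sourceOf("java.lang.String"));
    }
}
```

If two different jars report the same package prefix for related classes, that is a strong sign of the jar conflict suspected above.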
@vicuna96 Did you get a chance to test it out?
Tips before filing an issue
Have you gone through our FAQs?
Join the mailing list to engage in conversations and get faster support at dev-subscribe@hudi.apache.org.
If you have triaged this as a bug, then file an issue directly.
Describe the problem you faced
Upgrading Hudi from 0.13.1 with the metadata table turned off to 0.14.1 with the metadata table turned on.
The first run went through fine and created the metadata table.
On the second run I am facing the issue shown below.
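For context, "metadata turned on" refers to Hudi's metadata table, which is toggled by the `hoodie.metadata.enable` write option (the fragment below is illustrative, not the reporter's actual configuration):

```properties
# Illustrative Hudi write option — enables the metadata table
hoodie.metadata.enable=true
```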
To Reproduce
Steps to reproduce the behavior:
Expected behavior
A partial update on the table that succeeds consistently, not only once.
Environment Description
Hudi version : 0.14.1
Spark version : 3.3.2
Hive version : 3.1.3
Hadoop version : 3.3.6
Storage (HDFS/S3/GCS..) : GCS
Running on Docker? (yes/no) : Dataproc
Additional context
Add any other context about the problem here.
Hudi configurations are as follows:
Stacktrace
Add the stacktrace of the error.
The error stack trace.
And a prior warning was found: