[Bug] Incorrect Handling of String Encoding in MultiTableCommittable Serialization #3027
Closed
Labels
bug
Something isn't working
Search before asking
Paimon version
0.7
Compute Engine
1.17
Minimal reproduce step
Issue description
When using `MultiTableCommittableSerializer` to handle serialized data, specifically metadata containing non-ASCII characters (e.g., Chinese characters in table names), incorrect handling of byte buffers leads to missing characters or potential `BufferOverflowException` errors.
Steps to reproduce
The issue arises when deserializing byte arrays that represent strings containing multibyte characters. For instance, deserializing metadata for tables with Chinese names can throw exceptions or return incorrect strings.
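One common way this class of bug shows up is when a buffer is sized by the string's character count instead of its encoded byte count. The sketch below is illustrative only and is not taken from the Paimon source: the class `LengthMismatchDemo`, the method `serializeByCharCount`, and the sample table name are all hypothetical, and the actual cause in `MultiTableCommittableSerializer` may differ.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class LengthMismatchDemo {

    // Hypothetical buggy serializer: sizes the buffer by char count,
    // although the payload is written as UTF-8 bytes.
    static byte[] serializeByCharCount(String name) {
        byte[] encoded = name.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES + name.length());
        buffer.putInt(name.length());
        // Throws BufferOverflowException whenever encoded.length > name.length().
        buffer.put(encoded);
        return buffer.array();
    }

    public static void main(String[] args) {
        // Sample Chinese table name (three CJK characters), written with
        // Unicode escapes so the source file encoding does not matter.
        String tableName = "\u8ba2\u5355\u8868";

        System.out.println(tableName.length());                                // 3 chars
        System.out.println(tableName.getBytes(StandardCharsets.UTF_8).length); // 9 bytes

        try {
            serializeByCharCount(tableName);
        } catch (BufferOverflowException e) {
            System.out.println("BufferOverflowException for multibyte name");
        }

        // ASCII-only names hide the problem because chars == bytes.
        System.out.println(serializeByCharCount("orders").length);             // 10
    }
}
```

With ASCII-only table names the two lengths coincide, which would explain why such a defect can go unnoticed until a multibyte name is used.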
What doesn't meet your expectations?
The deserialization process should correctly convert byte arrays back to strings, accommodating multibyte characters without errors or data loss.
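The expected round-trip behavior can be sketched as follows, assuming a simple length-prefixed layout in which the prefix records the number of encoded bytes. The class `Utf8RoundTrip` and its methods are hypothetical names used for illustration; this is not the serializer's actual wire format or the fix applied in Paimon.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class Utf8RoundTrip {

    // Encode first, then size the buffer and the length prefix
    // from the encoded byte count (assumed layout: int length + bytes).
    static byte[] serialize(String value) {
        byte[] encoded = value.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES + encoded.length);
        buffer.putInt(encoded.length);
        buffer.put(encoded);
        return buffer.array();
    }

    // Read exactly the recorded number of bytes and decode them
    // with the same explicit charset used for encoding.
    static String deserialize(byte[] serialized) {
        ByteBuffer buffer = ByteBuffer.wrap(serialized);
        int byteLength = buffer.getInt();
        byte[] encoded = new byte[byteLength];
        buffer.get(encoded);
        return new String(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String tableName = "\u8ba2\u5355\u8868"; // sample Chinese table name
        String restored = deserialize(serialize(tableName));
        System.out.println(restored.equals(tableName)); // true
    }
}
```

Using an explicit charset on both sides also avoids depending on the JVM's platform default encoding, which can differ between the machine that serializes and the one that deserializes.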
Anything else?
Here is the stacktrace from the exception:
Are you willing to submit a PR?