Use CONV to apply proper mysql chunking #983

Merged (1 commit) on May 21, 2024

Conversation

@jogrogan (Collaborator) commented May 16, 2024

The MySQL MD5() function returns a binary string of 32 hex digits.
MySQL MOD() expects two decimal numbers.
If the MD5 value contains any hex character that is not a decimal digit, MOD() always returns 0. This causes a huge imbalance in data downstream, since all of these records get bucketed under the task reading from 'partition' 0.

The solution is to apply MySQL CONV() to convert the MD5 value from hexadecimal (base 16) to decimal (base 10) before passing it to MOD().

This was all tested locally via MySQL Workbench.
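
For readers who want to see the effect outside of MySQL, here is a minimal Python sketch of the two bucketing approaches described above. It is an illustration only, not the code in this PR: the function names, the key format, and `NUM_PARTITIONS` are all hypothetical, and `bucket_without_conv` simply models the behavior as described in this comment (a result of 0 whenever the digest contains a non-decimal character) rather than MySQL's actual string-to-number coercion rules.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 4  # hypothetical number of reading tasks


def md5_hex(key: str) -> str:
    """Return the MD5 digest of key as 32 hex characters."""
    return hashlib.md5(key.encode("utf-8")).hexdigest()


def bucket_without_conv(key: str, n: int) -> int:
    """Model of the problem as described above (an assumption, not MySQL's
    exact coercion): any digest containing a-f lands in bucket 0."""
    digest = md5_hex(key)
    if not digest.isdigit():
        return 0
    return int(digest) % n


def bucket_with_conv(key: str, n: int) -> int:
    """Interpret the digest as base 16 first, then take the modulus."""
    return int(md5_hex(key), 16) % n


# Hypothetical record keys, used only to compare the two distributions.
keys = [f"row-{i}" for i in range(1000)]
skewed = Counter(bucket_without_conv(k, NUM_PARTITIONS) for k in keys)
balanced = Counter(bucket_with_conv(k, NUM_PARTITIONS) for k in keys)

print("without base-16 conversion:", dict(skewed))
print("with base-16 conversion:   ", dict(balanced))
```

Almost every MD5 digest contains at least one of the characters a-f, so in this model nearly all keys fall into bucket 0 unless the digest is converted first. Python integers are arbitrary precision, so the sketch does not reflect any width limits that MySQL's CONV() may apply.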

@jogrogan jogrogan merged commit 05eb6d1 into linkedin:master May 21, 2024
1 check passed