refactor: Migration of json utilities from core #28522

Merged: 2 commits, May 20, 2024

Conversation

eyalezer
Contributor

@eyalezer eyalezer commented May 15, 2024

SUMMARY

Initial step of refactoring the JSON utilities:

  • Migrated all JSON utils from utils.core to utils.json
  • Updated every reference to those utils to use the new json module
  • Added and fixed some tests
  • Removed unused code
  • Following "fix: utf-16 json encoder support" (#28486), found and fixed more instances in:
    • SQL Lab results API
    • Charts data API
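
For readers unfamiliar with the UTF-16 issue referenced above, the general idea can be pictured roughly as follows. This is a minimal sketch and not the code from this PR: the helper name `bytes_to_text` and the `"[bytes]"` placeholder are assumptions made for illustration, and it uses the standard library `json` module only.

```python
import json
from typing import Any


def bytes_to_text(obj: Any) -> str:
    """Hypothetical `default` hook: decode bytes as UTF-8, then as UTF-16."""
    if isinstance(obj, (bytes, bytearray)):
        for encoding in ("utf-8", "utf-16"):
            try:
                return bytes(obj).decode(encoding)
            except UnicodeDecodeError:
                continue
        # Neither encoding worked; emit a placeholder instead of failing.
        return "[bytes]"
    raise TypeError(f"Unserializable object of type {type(obj).__name__}")


# A value that is not valid UTF-8 but is valid UTF-16 (with a BOM).
utf16_payload = {"name": "héllo".encode("utf-16")}
print(json.dumps(utf16_payload, default=bytes_to_text))
```

Keeping the fallback in a single hook is what lets every API endpoint share the same behavior instead of repeating the try/except locally.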

@github-actions github-actions bot added the api Related to the REST API label May 15, 2024
@dosubot dosubot bot added sqllab Namespace | Anything related to the SQL Lab viz:charts Namespace | Anything related to viz types labels May 15, 2024
@eyalezer
Contributor Author

@mistercrunch - the tests added in #28486 still cover the encoding part.


codecov bot commented May 15, 2024

Codecov Report

Attention: Patch coverage is 86.76471%, with 18 lines in your changes missing coverage. Please review.

Project coverage is 83.50%. Comparing base (76d897e) to head (929fc12).
Report is 144 commits behind head on master.

Files                          Patch %  Lines
superset/utils/json.py         86.95%   12 Missing ⚠️
superset/views/chart/views.py  33.33%   2 Missing ⚠️
superset/models/core.py        66.66%   1 Missing ⚠️
superset/views/api.py          50.00%   1 Missing ⚠️
superset/views/tags.py         66.66%   1 Missing ⚠️
superset/viz.py                66.66%   1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #28522       +/-   ##
===========================================
+ Coverage   60.48%   83.50%   +23.01%     
===========================================
  Files        1931      522     -1409     
  Lines       76236    37485    -38751     
  Branches     8568        0     -8568     
===========================================
- Hits        46114    31301    -14813     
+ Misses      28017     6184    -21833     
+ Partials     2105        0     -2105     
Flag        Coverage Δ
hive        49.11% <50.00%> (-0.06%) ⬇️
javascript  ?
mysql       77.18% <83.82%> (?)
postgres    77.29% <84.55%> (?)
presto      53.66% <50.73%> (-0.15%) ⬇️
python      83.50% <86.76%> (+20.01%) ⬆️
sqlite      76.74% <84.55%> (?)
unit        58.96% <65.44%> (+1.34%) ⬆️


@mistercrunch
Member

It's starting to feel like we need a refactor of some kind here, as this isn't DRY. Maybe create a new superset/utils/json.py, move all the JSON-related code from superset/utils/core.py into it, and move all of the recent UTF-16 handling there too.

I think I started this refactor before in a branch but never carried it through. Let me push the branch for reference. Are you interested in taking this on?

@mistercrunch
Member

mistercrunch commented May 15, 2024

Here's the branch: https://github.com/apache/superset/compare/refactor_json?expand=1. It looks like I lost the json.py file along the way. For now, the goal would be to move the JSON-related functions from utils/core to utils/json and refactor the recent UTF-16 handling in there. Eventually, I'd also like to make sure that all usage of json.(load|dump)(s)() goes through this module and uses the proper encoders.
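
To make the "everything goes through one module" idea concrete, here is a minimal sketch of what such a wrapper module could look like. It is an illustration under stated assumptions, not the contents of superset/utils/json.py: the names `iso_dttm_default`, `dumps`, and `loads` and their signatures are hypothetical, and the sketch uses only the standard library `json` module (so the `ignore_nan` argument seen in the reviewed snippet, which is not a standard-library option, is left out).

```python
import json as _stdlib_json
from datetime import date, datetime
from typing import Any, Callable, Optional


def iso_dttm_default(obj: Any) -> str:
    """Hypothetical default hook: serialize dates and datetimes as ISO 8601."""
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    raise TypeError(f"Unserializable object of type {type(obj).__name__}")


def dumps(
    obj: Any,
    default: Optional[Callable[[Any], Any]] = iso_dttm_default,
    **kwargs: Any,
) -> str:
    """Single entry point for serialization, so encoders are applied uniformly."""
    return _stdlib_json.dumps(obj, default=default, **kwargs)


def loads(text: str, **kwargs: Any) -> Any:
    """Single entry point for deserialization."""
    return _stdlib_json.loads(text, **kwargs)


print(dumps({"merged_at": datetime(2024, 5, 20, 12, 0)}))
```

With wrappers like these, call sites import the project module instead of the standard library, and a change to the default encoder only has to be made in one place.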

@eyalezer
Contributor Author

Definitely felt the same, seeing things repeated more than once while dealing with these non-UTF-8 bytes.
For starters, let's try to move all of the recent UTF-16 handling to this module and then see how it goes.

@mistercrunch
Member

If you run git grep json.dumps, you'll see there's quite a bit to centralize. We don't have to do it all at once, though.
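
As a complement to git grep, direct calls can also be located syntactically. The following is a rough sketch, assuming a hypothetical helper named `find_direct_json_calls`; it only matches calls written literally as `json.<func>(...)`, so it cannot tell whether the name `json` refers to the standard library or to a project module imported under that name.

```python
import ast

JSON_FUNCS = {"dump", "dumps", "load", "loads"}


def find_direct_json_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, call name) for each `json.<func>(...)` call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "json"
            and node.func.attr in JSON_FUNCS
        ):
            hits.append((node.lineno, f"json.{node.func.attr}"))
    return sorted(hits)


sample = (
    "import json\n"
    "x = json.dumps({})\n"
    "y = json_utils.dumps({})\n"
    "z = json.loads(x)\n"
)
for lineno, name in find_direct_json_calls(sample):
    print(lineno, name)
```

Running a helper like this per file gives a checklist that can be split across several smaller pull requests.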

Comment on lines +156 to +158
json_utils.dumps(
    payload, default=json_utils.json_iso_dttm_ser, ignore_nan=True
),

Check warning

Code scanning / CodeQL

Information exposure through an exception Medium

Stack trace information
flows to this location and may be exposed to an external user.
@eyalezer eyalezer changed the title fix: added missing implementation for utf-16 refactor: Migration of json utils from core to a json module May 18, 2024
@eyalezer eyalezer changed the title refactor: Migration of json utils from core to a json module refactor: Migration of json utilities from core May 18, 2024
Added missing implementation for utf-16
@eyalezer
Contributor Author

@mistercrunch - I've tried to keep it as small as possible for this first phase, but it ended up at 30 files changed after all 😏

The next phase (as mentioned) needs to be refactoring every json.(load/s|dump/s) call to use this module, which looks like it's going to be a huge refactoring: at least ~250 files. So it's probably better to split it up anyway.

Member

@mistercrunch mistercrunch left a comment

Did a quick pass and overall LGTM. I'm looking to do another, more thorough pass on it, but it seems safe overall.

Member

@mistercrunch mistercrunch left a comment

Did another, more thorough pass and everything LGTM. Thanks for the refactor!

@mistercrunch mistercrunch merged commit 56f0fc4 into apache:master May 20, 2024
34 checks passed
@eyalezer
Contributor Author

@mistercrunch - with pleasure.
Since I've got my head into this, I've begun working on the next stage of the refactoring, and I will likely submit another pull request for it in the near future.

Vitor-Avila pushed a commit to Vitor-Avila/superset that referenced this pull request May 28, 2024
Co-authored-by: Eyal Ezer <eyal.ezer@ge.com>
@eyalezer eyalezer deleted the utf-16-support branch May 29, 2024 02:50
EnxDev pushed a commit to EnxDev/superset that referenced this pull request May 31, 2024
Co-authored-by: Eyal Ezer <eyal.ezer@ge.com>
Labels
  • api: Related to the REST API
  • size/XL
  • sqllab: Namespace | Anything related to the SQL Lab
  • viz:charts: Namespace | Anything related to viz types

2 participants