-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow quotes to be escaped in JSON path #12033
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! Looks good - some comments:
ptr++; | ||
key[key_len++] = *ptr++; | ||
} | ||
if (ptr == end || backslash) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this not be an error if backslash
is true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we ignore other backslashes as we do for regular (non quoted) JSON paths
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, but then returning an empty result here doesn't seem correct to me - shouldn't we match x\
as the string literal 'x\'
in that case, instead of returning an empty result? Could we add such a test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, I understand the confusion. If ptr == end || backslash
, we return an empty result, which will cause a descriptive error, pointing to the location where the error occurred. So, if we have $."x\
as a path, you will get an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems a bit inconsistent, no? We accept a backslash as a regular backslash except when it is the last character? I would say we should turn it into a regular backslash in this scenario perhaps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you have the path $."x\
, we need to throw an error, because the path is "escaped", i.e., the object key starts with a "
, so it needs to end with a "
too. If it ends with a backslash, it's wrong. Actually, I just realised that it can be removed from the check, since, ptr == end
will be true
Thanks! |
Merge pull request duckdb/duckdb#12086 from gitccl/fix_list_zip Merge pull request duckdb/duckdb#12158 from lnkuiper/simple_external_join_test Merge pull request duckdb/duckdb#12155 from Tishj/python_sqllogictest_rm_testdir Merge pull request duckdb/duckdb#12033 from lnkuiper/json_extract_escape
We conveniently support escaping JSON object keys that contain the special
[
and.
characters by surrounding them with quotes.The path
$.duck
points to[42, 43]
, and the path$.duck[0]
points to42
.If the key was
"du[ck"
instead, we wouldn't be able to extract because we would interpret[
as the syntax for extracting from an array. We already implemented escaping these special characters by surrounding them with quotes:$."du[ck"
.If the key was
"du\"ck"
instead, we would also be able to extract, by just specifying the path$.du"ck
. However, if it is necessary to escape special characters like[
, but keys also contain quotes, you were out of luck.So, we would not be able to extract if the key was, for example,
"du[\"ck"
. Surrounding the path with quotes does not work because there is a quote in the middle of the key.This PR solves this problem by allowing quotes within quoted paths to be escaped with a backslash:
Fixes #11997