-
Notifications
You must be signed in to change notification settings - Fork 540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: add test for impure function correlation behavior #9014
base: main
Are you sure you want to change the base?
Conversation
0992a3f
to
48d4a51
Compare
oooh, these CI failures are revealing differences in the backend behaviors... wheeee into the 🐰 🕳️ we go! |
48d4a51
to
1fea07f
Compare
I can think of two reasons a user might care about the semantics here, and I hope we can support both of their needs:
So I think this means that we as ibis authors can't assume what the goals of the user is, and how many times they want an expression executed. Therefore, we shouldn't do any clever rewrites or mergings of selects. I think we need to keep a more 1:1 correspondence between what the user writes and the SQL we produce: Every time the user does a .select(), .mutate(), etc, (except for simple column renamings, and maybe a few other cases) that leads to exactly one more IDK, what do you think of this train of reasoning? I think I'm fairly convinced that those two use cases are the requirements for success, but perhaps there is a different/better way of accomplishing that goal |
5d9a1ec
to
21b180c
Compare
@NickCrews please rebase to test with #9023 change included |
@NickCrews My only objection would be that
Is not something we can enforce even if we never merge any select statements. This kind of guarantee is at the level of the query engine. |
@cpcloud yup you are right with guarantees, "suggesting" to the backend is the best we can do. What do you think in general of my proposal of "one CTE per .select"? I'm not sure if you're skeptical of the whole thing or just the performance claims... Thanks! |
I'm trying to decide what is higher priority: An implementation that is faster 90% of the time, but does clever things and therefore isn't able to be tuned by the user that 10% of the time they need it Vs An implementation that is a bit slower in the majority of cases, but is always fine tunable to get the perf you need in the edge cases. |
7700941
to
d14384c
Compare
slowly going through the backends and adding the correct marks for each kind of failure... |
53c6941
to
ee0ae0c
Compare
Need to fix the a few broken cases. Related to ibis-project#8921, trying to write down exactly what the expected behavior is.
ee0ae0c
to
37ae5dd
Compare
Anything I can do here to help move this forward? |
Related to #8921,
trying to write down exactly what the expected behavior is.
I figure we can use this PR to hash out exaclty what we want the semantics to be, and then the other discussions might be easier because our goal is written down precisely somewhere. Please let me know if you agree or disagree with this behavior, or if there are other tests we should add.
Need to fix the UDF test case in a followup. Also wasn't sure where to put these tests, I put them in their own little file but if you point me elsewhere Iwill move them.