-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql/lexbase: remove map access overhead when encoding SQL strings #124294
sql/lexbase: remove map access overhead when encoding SQL strings #124294
Conversation
Release note: None
Benchmark results of the new test case:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some nits to consider, but
Reviewed 1 of 1 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
pkg/sql/lexbase/encode.go
line 117 at r2 (raw file):
const minQuoteChar = ' ' // 0x20 const maxQuoteChar = '~' // 0x7E var mustQuoteMap = [maxQuoteChar]bool{
nit: array needs to have maxQuoteChar + 1
elements (or maxQuoteChar
needs to be 0x7F
). (We could also use const minQuoteChar = byte(0x20)
to be more explicit.)
pkg/sql/lexbase/encode.go
line 157 at r2 (raw file):
ch := byte(r) if r >= minQuoteChar && r <= maxQuoteChar { if mustQuoteMap[ch] {
nit: might be worth trying switch
with 4 cases to see whether it'd be faster.
pkg/sql/lexbase/encode.go
line 161 at r2 (raw file):
bareStrings = false } if !stringencoding.NeedEscape(ch) && ch != '\'' {
nit: we can pull this if
to be above mustQuoteMap
check (we could avoid mustQuoteMap
lookup if this condition is true
).
0d2bf82
to
1a41d27
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @rytaft and @yuzefovich)
pkg/sql/lexbase/encode.go
line 117 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: array needs to have
maxQuoteChar + 1
elements (ormaxQuoteChar
needs to be0x7F
). (We could also useconst minQuoteChar = byte(0x20)
to be more explicit.)
Oops, great catch! I switch this to use unicode.MaxASCII
, let me know what you think. Using const minQuoteChar = byte(0x20)
seems less obvious because you need to know that ' ' == 0x20
to make the connection to the ' ': true
entry in the array.
pkg/sql/lexbase/encode.go
line 157 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: might be worth trying
switch
with 4 cases to see whether it'd be faster.
I tested a simple benchmark and the array is the fastest: https://gist.github.com/mgartner/f1b0fbd7f05faf37038c8545dabc7b80
pkg/sql/lexbase/encode.go
line 161 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: we can pull this
if
to be abovemustQuoteMap
check (we could avoidmustQuoteMap
lookup if this condition istrue
).
I think the semantics would change if any chars satisfy both the conditional and the mustQuoteMap—in that case we wouldn't set baseStrings = false
. It looks like that is impossible currently, but not guaranteed by anything. So I'm hesitant to change it.
Previously, mgartner (Marcus Gartner) wrote…
Hmm, maybe not the most fair test because the array likely sits in the CPU cache for the entire benchmark, which might be less likely in a production workload. 🤷 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @rytaft)
pkg/sql/lexbase/encode.go
line 157 at r2 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Hmm, maybe not the most fair test because the array likely sits in the CPU cache for the entire benchmark, which might be less likely in a production workload. 🤷
Thanks for trying (I thought I remembered reading somewhere that switch
in recent Go versions became faster, perhaps not fast enough :) )
pkg/sql/lexbase/encode.go
line 161 at r2 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
I think the semantics would change if any chars satisfy both the conditional and the mustQuoteMap—in that case we wouldn't set
baseStrings = false
. It looks like that is impossible currently, but not guaranteed by anything. So I'm hesitant to change it.
Good point, I missed that.
Map accesses have been replaced with more efficient array accesses for determining characters to quote in `EncodeSQLStringWithFlags`. In a CPU profile from a recent POC these map access accounted for a significant portion of CPU time spent in query optimization because the optimizer's interner converts `tree.DJSON` objects to strings using this encoding function when hashing and checking equality of JSON constants. Release note: None
1a41d27
to
1ea9ba6
Compare
My previous attempts were actually broken in subtle ways. @yuzefovich do you mind taking one more look and letting me know if this still LGTY? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @rytaft)
TFTRs! bors r+ |
122864: sql: add randomized test for plan-gist logic r=yuzefovich a=yuzefovich This commit adds a randomized test for plan-gist logic. It utilizes sqlsmith to generate random stmts, then runs EXPLAIN (GIST) for it to retrieve the gist followed by decoding that gist, and ensures that no internal errors occur. All generated stmts are also executed with a short timeout to allow for DB state to evolve. Additionally, all gists are accumulated and decoded against an empty DB. The test is placed in the ccl package to exercise features gated behind CCL license. The test quickly discovered bug fixed in 1315dda (when the corresponding fix is reverted) but haven't found anything new for plan-gist logic (but did hit a few issues elsewhere which are currently ignored in the test). Fixes: #117634. Release note: None 124284: kvserver: rebalance ranges with one voter using joint configurations r=nvanbenschoten a=kvoli The allocator would add a voter, instead of both adding and removing the existing voter when rebalancing ranges with one replica. Removing the leaseholder replica was not possible prior to #74077, so the addition only was necessary. This restriction is no longer necessary. Allow rebalancing a one voter range between stores using joint configurations, where the lease will be transferred to the incoming voter store, from the outgoing demoting voter. Scattering ranges with one voter will now leave the range with exactly one voter, where previously both the leaseholder voter evaluating the scatter, and the new voter would be left. Before this patch, scattering 1000 ranges with RF=1 on a 5 store cluster: ``` store_id | replica_count | replica_distribution | lease_count | lease_distribution -----------+---------------+----------------------+-------------+--------------------- 1 | 1001 | ########## | 500 | ########## 5 | 291 | ### | 147 | ### 4 | 275 | ### | 137 | ### 3 | 229 | ### | 118 | ### 2 | 206 | ### | 99 | ## ``` After: ``` store_id | replica_count | replica_distribution | lease_count | lease_distribution -----------+---------------+----------------------+-------------+--------------------- 2 | 242 | ########## | 241 | ########## 4 | 227 | ########## | 227 | ########## 5 | 217 | ######### | 216 | ######### 3 | 209 | ######### | 208 | ######### 1 | 106 | ##### | 109 | ##### ``` Fixes: #108420 Fixes: #124171 Release note (bug fix): Scattering a range with replication factor=1, no longer erroneously up-replicates the range to two replicas. Leases will also no longer thrash between nodes when perturbed with replication factor=1. 124294: sql/lexbase: remove map access overhead when encoding SQL strings r=mgartner a=mgartner #### opt: add benchmark for simple INSERT with JSON column Release note: None #### sql/lexbase: remove map access overhead when encoding SQL strings Map accesses have been replaced with more efficient array accesses for determining characters to quote in `EncodeSQLStringWithFlags`. In a CPU profile from a recent POC these map access accounted for a significant portion of CPU time spent in query optimization because the optimizer's interner converts `tree.DJSON` objects to strings using this encoding function when hashing and checking equality of JSON constants. Epic: None Release note: None 124444: packer: move `add-apt-repository` call to after `sleep` r=jlinder a=rickystewart Updates are still going on on the VM while this is happening. Epic: none Release note: None Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com> Co-authored-by: Austen McClernon <austen@cockroachlabs.com> Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com> Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
Build failed (retrying...): |
122864: sql: add randomized test for plan-gist logic r=yuzefovich a=yuzefovich This commit adds a randomized test for plan-gist logic. It utilizes sqlsmith to generate random stmts, then runs EXPLAIN (GIST) for it to retrieve the gist followed by decoding that gist, and ensures that no internal errors occur. All generated stmts are also executed with a short timeout to allow for DB state to evolve. Additionally, all gists are accumulated and decoded against an empty DB. The test is placed in the ccl package to exercise features gated behind CCL license. The test quickly discovered bug fixed in 1315dda (when the corresponding fix is reverted) but haven't found anything new for plan-gist logic (but did hit a few issues elsewhere which are currently ignored in the test). Fixes: #117634. Release note: None 124294: sql/lexbase: remove map access overhead when encoding SQL strings r=mgartner a=mgartner #### opt: add benchmark for simple INSERT with JSON column Release note: None #### sql/lexbase: remove map access overhead when encoding SQL strings Map accesses have been replaced with more efficient array accesses for determining characters to quote in `EncodeSQLStringWithFlags`. In a CPU profile from a recent POC these map access accounted for a significant portion of CPU time spent in query optimization because the optimizer's interner converts `tree.DJSON` objects to strings using this encoding function when hashing and checking equality of JSON constants. Epic: None Release note: None Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com> Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Build failed (retrying...): |
Build succeeded: |
opt: add benchmark for simple INSERT with JSON column
Release note: None
sql/lexbase: remove map access overhead when encoding SQL strings
Map accesses have been replaced with more efficient array accesses for
determining characters to quote in
EncodeSQLStringWithFlags
. In a CPUprofile from a recent POC these map access accounted for a significant
portion of CPU time spent in query optimization because the optimizer's
interner converts
tree.DJSON
objects to strings using this encodingfunction when hashing and checking equality of JSON constants.
Epic: None
Release note: None