-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvserver: rebalance ranges with one voter using joint configurations #124284
kvserver: rebalance ranges with one voter using joint configurations #124284
Conversation
9bb5dff
to
0a12f1c
Compare
d3edeaa
to
e6eaca8
Compare
The existing leaseholder store is never selected as the rebalance target. That is to be expected initially, as scatter only jitters replica count stats, it doesn't completely randomize them. The leaseholder store has 1k+ replicas. However, after the replica counts get closer, we'd expect scatter to also do nothing 1/n of the time but it is still continuing to only choose other stores with this patch. |
e6eaca8
to
79f3177
Compare
Add a simulation test where the replication factor is set to 1. This highlights lease thrashing, which causes replica and lease count convergence to take upwards of 20 minutes with two nodes. Informs: cockroachdb#108420 Release note: None
db23e58
to
0ecb487
Compare
Turns out this is just due to timing. The allocator rebalance targets are returned before any rebalancing activity occurs. The disparity in replica counts initially causes the initial node to not be selected. If we iteratively ran scatter, the results would be closer to expectation w/ an even distribution. The lease/replicate queue will also adjust lease/replica counts as the scatter occurs. |
0ecb487
to
06635fc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r5, 6 of 7 files at r6, 1 of 1 files at r7, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @kvoli)
pkg/kv/kvserver/asim/tests/testdata/non_rand/one_voter
line 0 at r7 (raw file):
🔥
pkg/kv/kvserver/allocator/plan/util_test.go
line 56 at r7 (raw file):
{ChangeType: roachpb.REMOVE_VOTER, Target: roachpb.ReplicationTarget{NodeID: 1, StoreID: 1}}, }, expectedErrStr: "",
nit: expectedPerformingSwap: false,
pkg/sql/scatter_test.go
line 215 at r7 (raw file):
t, tc.ServerConn(0), "t", "k INT PRIMARY KEY, v INT", 500,
nit: /* numRows */
pkg/sql/scatter_test.go
line 262 at r7 (raw file):
} // Expect that the number of replicas on store 1 to have changed. We can't
It feels like we're leaving test flakiness up to chance here, but I guess (1/3)^50
is a small enough number that we don't think it's at risk of flaking.
The allocator would add a voter, instead of both adding and removing the existing voter when rebalancing ranges with one replica. Removing the leaseholder replica was not possible prior to cockroachdb#74077, so the addition only was necessary. This restriction is no longer necessary. Allow rebalancing a one voter range between stores using joint configurations, where the lease will be transferred to the incoming voter store, from the outgoing demoting voter. Scattering ranges with one voter will now leave the range with exactly one voter, where previously both the leaseholder voter evaluating the scatter, and the new voter would be left. Before this patch, scattering 1000 ranges with RF=1 on a 5 store cluster: ``` store_id | replica_count | replica_distribution | lease_count | lease_distribution -----------+---------------+----------------------+-------------+--------------------- 1 | 1001 | ########## | 500 | ########## 5 | 291 | ### | 147 | ### 4 | 275 | ### | 137 | ### 3 | 229 | ### | 118 | ### 2 | 206 | ### | 99 | ## ``` After: ``` store_id | replica_count | replica_distribution | lease_count | lease_distribution -----------+---------------+----------------------+-------------+--------------------- 2 | 242 | ########## | 241 | ########## 4 | 227 | ########## | 227 | ########## 5 | 217 | ######### | 216 | ######### 3 | 209 | ######### | 208 | ######### 1 | 106 | ##### | 109 | ##### ``` Fixes: cockroachdb#108420 Fixes: cockroachdb#124171 Release note (bug fix): Scattering a range with replication factor=1, no longer erroneously up-replicates the range to two replicas. Leases will also no longer thrash between nodes when perturbed with replication factor=1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TYFTR!
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/kv/kvserver/asim/tests/testdata/non_rand/one_voter
line at r7 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
🔥
Thanks 😄
pkg/kv/kvserver/allocator/plan/util_test.go
line 56 at r7 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit:
expectedPerformingSwap: false,
Updated.
pkg/sql/scatter_test.go
line 215 at r7 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit:
/* numRows */
Updated.
pkg/sql/scatter_test.go
line 262 at r7 (raw file):
Seems reasonably small, in addition to:
Turns out this is just due to timing. The allocator rebalance targets are returned before any rebalancing activity occurs. The disparity in replica counts initially causes the initial node to not be selected. If we iteratively ran scatter, the results would be closer to expectation w/ an even distribution. The lease/replicate queue will also adjust lease/replica counts as the scatter occurs.
06635fc
to
1995005
Compare
bors r=nvanbenschoten |
122864: sql: add randomized test for plan-gist logic r=yuzefovich a=yuzefovich This commit adds a randomized test for plan-gist logic. It utilizes sqlsmith to generate random stmts, then runs EXPLAIN (GIST) for it to retrieve the gist followed by decoding that gist, and ensures that no internal errors occur. All generated stmts are also executed with a short timeout to allow for DB state to evolve. Additionally, all gists are accumulated and decoded against an empty DB. The test is placed in the ccl package to exercise features gated behind CCL license. The test quickly discovered bug fixed in 1315dda (when the corresponding fix is reverted) but haven't found anything new for plan-gist logic (but did hit a few issues elsewhere which are currently ignored in the test). Fixes: #117634. Release note: None 124284: kvserver: rebalance ranges with one voter using joint configurations r=nvanbenschoten a=kvoli The allocator would add a voter, instead of both adding and removing the existing voter when rebalancing ranges with one replica. Removing the leaseholder replica was not possible prior to #74077, so the addition only was necessary. This restriction is no longer necessary. Allow rebalancing a one voter range between stores using joint configurations, where the lease will be transferred to the incoming voter store, from the outgoing demoting voter. Scattering ranges with one voter will now leave the range with exactly one voter, where previously both the leaseholder voter evaluating the scatter, and the new voter would be left. Before this patch, scattering 1000 ranges with RF=1 on a 5 store cluster: ``` store_id | replica_count | replica_distribution | lease_count | lease_distribution -----------+---------------+----------------------+-------------+--------------------- 1 | 1001 | ########## | 500 | ########## 5 | 291 | ### | 147 | ### 4 | 275 | ### | 137 | ### 3 | 229 | ### | 118 | ### 2 | 206 | ### | 99 | ## ``` After: ``` store_id | replica_count | replica_distribution | lease_count | lease_distribution -----------+---------------+----------------------+-------------+--------------------- 2 | 242 | ########## | 241 | ########## 4 | 227 | ########## | 227 | ########## 5 | 217 | ######### | 216 | ######### 3 | 209 | ######### | 208 | ######### 1 | 106 | ##### | 109 | ##### ``` Fixes: #108420 Fixes: #124171 Release note (bug fix): Scattering a range with replication factor=1, no longer erroneously up-replicates the range to two replicas. Leases will also no longer thrash between nodes when perturbed with replication factor=1. 124294: sql/lexbase: remove map access overhead when encoding SQL strings r=mgartner a=mgartner #### opt: add benchmark for simple INSERT with JSON column Release note: None #### sql/lexbase: remove map access overhead when encoding SQL strings Map accesses have been replaced with more efficient array accesses for determining characters to quote in `EncodeSQLStringWithFlags`. In a CPU profile from a recent POC these map access accounted for a significant portion of CPU time spent in query optimization because the optimizer's interner converts `tree.DJSON` objects to strings using this encoding function when hashing and checking equality of JSON constants. Epic: None Release note: None 124444: packer: move `add-apt-repository` call to after `sleep` r=jlinder a=rickystewart Updates are still going on on the VM while this is happening. Epic: none Release note: None Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com> Co-authored-by: Austen McClernon <austen@cockroachlabs.com> Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com> Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
Build failed (retrying...): |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 1995005 to blathers/backport-release-23.1-124284: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.1.x failed. See errors above. error creating merge commit from 1995005 to blathers/backport-release-23.2-124284: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
The allocator would add a voter, instead of both adding and removing the
existing voter when rebalancing ranges with one replica. Removing the
leaseholder replica was not possible prior to #74077, so the addition
only was necessary.
This restriction is no longer necessary. Allow rebalancing a one voter
range between stores using joint configurations, where the lease will be
transferred to the incoming voter store, from the outgoing demoting
voter.
Scattering ranges with one voter will now leave the range with exactly
one voter, where previously both the leaseholder voter evaluating the
scatter, and the new voter would be left.
Before this patch, scattering 1000 ranges with RF=1 on a 5 store
cluster:
After:
Fixes: #108420
Fixes: #124171
Release note (bug fix): Scattering a range with replication factor=1, no
longer erroneously up-replicates the range to two replicas. Leases will
also no longer thrash between nodes when perturbed with replication
factor=1.