-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use a dedicated scheduling group for schema commitlog #15566
Comments
I'm not thrilled about adding a new scheduling group but it's the right thing to do. Better is to have nested groups (and so commitlog/data and commitlog/schema). |
nested sched groups: scylladb/seastar#1070 |
First, extremely important start, stop and config-reread code all run in main sg :) |
@kostja please decide who should work on this -- storage group or your group? |
@kbr-scylla depends on how you suggest to fix it. Nested scheduling groups? |
Not necessarily, if it's too much effort we could simply have a separate normal scheduling group. |
So this issue actually breaks down into 3 or 4 related items. Let me first write what the goal of this issue is: we want to make sure that there is no priority inversion between the data path and the schema path. E.g. when a node needs to make a read barrier to fetch the latest schema, and this barrier needs to write schema mutations to the schema commit log, these writes are not queued behind data path writes to data commit log and compaction. At the end of the day it manifests with latency spikes in clusters which alter tables under heavy load (todo: we need a link to the original issue here). In order to fix that we need to do multiple things:
|
The issue is fixed. Schema commit log is already using a separate scheduling group from regular commitlog -- the main scheduling group, but as @avikivity said, "main should be left alone so things we forget have somewhere to live." The above issue was caused by a priority inversion as you described, but the problem was not commitlog as initially suspected, it was using the user data memory pool for Raft tables -- so schema writes were waiting for user memtable flushes. I think there is no longer a concept of IO priority classes, they were merged with scheduling groups (?) There really is nothing else to be done than creating a separate scheduling group for schema commitlog, I think. |
This is correct. @kbr-scylla if the issue is fixed why don't we close this one? |
Sorry for confusion @eliransin I meant that #15622 is fixed (which was the cause of latency spikes that Kostja mentioned) The current issue is not fixed, schema commitlog is in |
Will it make to 6.0? |
See also #1926 |
Currently, we do not explicitly set a scheduling group for the schema commitlog which causes it to run in the default scheduling group (called "main"). However: - It is important and significant enough that it should run in a scheduling group that is separate from the main one, - It should not run in the existing "commitlog" group as user writes may sometimes need to wait for schema commitlog writes (e.g. read barrier done to learn the schema necessary to interpret the user write) and we want to avoid priority inversion issues. Therefore, introduce a new scheduling group dedicated to the schema commitlog. Fixes: scylladb#15566
Schema commitlog uses the default scheduling group,
main
.Maybe we should use a dedicated scheduling group for schema commitlog, with appropriately adjusted number of shares?
Also, what other tasks are running on
main
? (Maybemain
is already exclusive to schema commitlog...)cc @xemul @kostja @gleb-cloudius @tgrabiec @avikivity
The text was updated successfully, but these errors were encountered: