sql: determine a reasonable default for server.max_open_transactions_per_gateway #124318
Labels
C-investigation
Further steps needed to qualify. C-label will change.
O-postmortem
Originated from a Postmortem action item.
P-3
Issues/test failures with no fix SLA
T-sql-foundations
SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Is your feature request related to a problem? Please describe.
We recently added the `server.max_open_transactions_per_gateway` cluster setting. When set, it limits the number of concurrently open transactions on a single node. It was added because CockroachDB does not perform well under high concurrency (though the performance issues may mostly have been due to the bug described in #123235). The default is currently 0, which means that most users will not see value from the cluster setting.

For similar reasons, to guard against overloading the cluster, our docs recommend that users size their connection pools to `4 * CPUs`.

Describe the solution you'd like
Set the default to something like `GOMAXPROCS * constant`. A reasonable constant may be hard to identify. We shouldn't pick something too low, since that will cause breakage and errors for clusters that temporarily reach high concurrency. Picking a value that's too high may mean that the setting doesn't really do anything, but it's definitely safer.

One issue is that different nodes may have different GOMAXPROCS values. So rather than persisting the computed default, we probably want the `GOMAXPROCS * constant` calculation to be used whenever the setting is not set.

We could define a negative setting as the scale factor for GOMAXPROCS. A value of `-32` would then mean that `32 * GOMAXPROCS` concurrent open transactions are allowed.

Describe alternatives you've considered
Keep it opt-in. Users can set it if they are aware of it.
Also, as the admission control system improves, this setting will become less important.
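
For comparison with the opt-in alternative, here is a minimal Go sketch of how the negative-value encoding from the proposed solution might be resolved on each node. The helper name `effectiveMaxOpenTxns` is hypothetical and not taken from the CockroachDB codebase, and treating 0 as "no limit" is an assumption based on the description of the current default.

```go
package main

import (
	"fmt"
	"runtime"
)

// effectiveMaxOpenTxns is a hypothetical helper (not CockroachDB code) that
// turns the raw setting value into a per-node limit:
//
//	setting > 0: absolute limit on open transactions
//	setting < 0: scale factor, limit = -setting * gomaxprocs
//	setting == 0: assumed here to mean "no limit", reported as 0
func effectiveMaxOpenTxns(setting int64, gomaxprocs int) int64 {
	switch {
	case setting > 0:
		return setting
	case setting < 0:
		return -setting * int64(gomaxprocs)
	default:
		return 0
	}
}

func main() {
	// GOMAXPROCS(0) reads the current value without changing it. Each node
	// evaluates this locally, so nodes with different GOMAXPROCS end up with
	// different limits even though the stored setting is identical.
	procs := runtime.GOMAXPROCS(0)
	fmt.Println(effectiveMaxOpenTxns(-32, procs))
}
```

Because only the scale factor is stored, nothing node-specific has to be persisted in the setting itself.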
Additional context
See the takeaways from an outage where this setting would have helped. https://cockroachlabs.atlassian.net/wiki/x/zICD1
Jira issue: CRDB-38820