rgw: add shard reduction ability to dynamic resharding #57538
Conversation
👍
src/common/options/rgw.yaml.in (Outdated)
desc: Whether dynamic resharding can reduce the number of shards
long_desc: If true, RGW's dynamic resharding ability is allowed to
  reduce the number of shards if it appears there are too many.
default: false
is there a reason this shouldn't default to true?
Where do we sit with changes in default behaviour between releases? Asking because when users upgrade their clusters, this feature would be disabled by default, which is preferable, IMO - that way clusters will continue to behave as they did prior to the upgrade, and users can opt in if they choose to by enabling the new feature.
thanks @nhoad. conversely, if we invest in general enhancements that aren't enabled by default, the majority of users won't benefit because they don't know to enable them. ceph has too many knobs already and can be difficult to tune correctly. we really should strive for optimal behavior by default
i think all clusters should benefit from appropriately-sized bucket indices. if this change adds new risks to existing deployments, then i'd like to explore those and try to resolve them at the design level
force-pushed from fc59dd4 to 70fc0a7
force-pushed from 70fc0a7 to af0101e
  if (reshard_log) {
-   ret = reshard_log->update(dpp, bucket_info, y);
+   ret = reshard_log->update(dpp, bucket_info, initiator, y);
    if (ret < 0) {
      return ret;
    }
@cbodley : I'm trying to think of why when having decided to execute a reshard we would update the reshard queue (log). In testing I've commented out this logic and I've seen no adverse impact, although maybe I haven't brushed up against the reason it's there. Do you have any thoughts?
@cbodley: Following your insight during the first review, I've added a commit that now tracks the initiator of the entry on the reshard queue (log). It adds a field to the at-rest data structure. Valid values are Admin and Dynamic, and it's an enum class, so other options can be added in the future. And there are logic updates that depend on this setting.
I approve the content and the name of this PR
force-pushed from af0101e to 05ff067
force-pushed from 05ff067 to c2e5f76
Adds the ability to prevent overwriting a reshard queue (log) entry for a given bucket with a newer entry. This adds a flag to the op, so it will either CREATE or make no changes. If an entry already exists when this flag is set, -EEXIST will be returned. This is a preparatory step to adding shard reduction to dynamic resharding.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
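The create-or-fail semantics this commit describes can be sketched as follows. This is purely illustrative: the real change is a flag on the RADOS class op, and the function and container here are hypothetical stand-ins.

```cpp
#include <cerrno>
#include <map>
#include <string>

// Sketch (not the actual cls_rgw op): an "exclusive create" that refuses
// to overwrite an existing reshard-queue entry for a bucket.
// Returns 0 on success, -EEXIST if an entry already exists and the
// create-only flag is set.
int queue_add_entry(std::map<std::string, std::string>& queue,
                    const std::string& bucket, const std::string& entry,
                    bool create_only) {
  auto it = queue.find(bucket);
  if (it != queue.end()) {
    if (create_only) {
      return -EEXIST;  // leave the existing entry untouched
    }
    it->second = entry;  // overwrite, as before this change
    return 0;
  }
  queue.emplace(bucket, entry);
  return 0;
}
```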
force-pushed from c2e5f76 to 2d6e9d6
jenkins test make check
force-pushed from 2d6e9d6 to a7e2f62
Previously, dynamic resharding could only *increase* the number of bucket index shards for a given bucket. This adds the ability to also *reduce* the number of shards. So in addition to the existing trigger of 100,000 entries per shard (the current default) for an increase, there's a new trigger of 10,000 entries per shard for a decrease.

However, for buckets with object counts that go up and down regularly, we don't want to keep resharding up and down to chase the number of objects. So for shard reduction to take place there's also a time delay (default 5 days). Once the entry for a reduction is added to the reshard queue (log), processing the queue within this delay period will not result in a shard reduction. Only when the reshard entry is processed after this delay can it perform the shard reduction. However, if at any point the entry is processed and there are *not* few enough entries to trigger a shard reduction, the entry on the reshard queue will be discarded. So using the defaults, this effectively means the bucket must have few enough objects for a shard reduction for 5 consecutive days before the reshard will take place.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
Adds a config option rgw_reshard_debug_interval that allows the resharding algorithms to run on a faster schedule by simulating one day with a set number of seconds.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
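The effect of such a debug interval can be sketched as below. The option name comes from the commit, but the scaling function and its signature are hypothetical, for illustration only.

```cpp
#include <cstdint>

constexpr uint64_t seconds_per_day = 24 * 60 * 60;

// Sketch: when rgw_reshard_debug_interval is set (non-zero), one simulated
// "day" lasts that many seconds, so a delay expressed in days is rescaled.
// With the interval at 0 (disabled), real days are used.
uint64_t effective_delay_secs(uint64_t delay_days,
                              uint64_t debug_interval_secs) {
  if (debug_interval_secs > 0) {
    // e.g. a 5-day delay with a 10-second "day" becomes 50 seconds
    return delay_days * debug_interval_secs;
  }
  return delay_days * seconds_per_day;
}
```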
The logic for managing the reshard queue (log) can vary depending on whether the entry was added by an admin or by dynamic resharding. For example, if it's a reshard reduction, dynamic resharding won't overwrite the queue entry so as not to disrupt the reduction wait period. On the other hand, an admin should be able to overwrite the entry at will. So we now track the initiator of each entry on the queue. This adds another field to the at-rest data structure, and it updates the logic to make use of it.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
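The initiator tracking described above might look roughly like this. The enum values match the ones named in the review discussion, but the function and field names are illustrative, not the actual RGW code.

```cpp
// Sketch: record who added a reshard-queue entry. An enum class leaves
// room for additional initiators in the future.
enum class Initiator { Admin, Dynamic };

// Overwrite policy sketched from the commit message: an admin may always
// overwrite; dynamic resharding must not disturb a pending reduction's
// wait period.
bool may_overwrite(Initiator requester, bool pending_is_reduction) {
  if (requester == Initiator::Admin) {
    return true;  // an admin can overwrite the entry at will
  }
  return !pending_is_reduction;  // dynamic: only if not a pending reduction
}
```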
force-pushed from a7e2f62 to 9302fbb
jenkins test api
jenkins test windows
seeing a new test_rgw_reshard.py failure on main that i suspect is related: https://tracker.ceph.com/issues/66367
p.s. this change deserves a mention in PendingReleaseNotes
Previously, dynamic resharding could only increase the number of bucket index shards for a given bucket. This adds the ability to also reduce the number of shards.
So in addition to the existing trigger of 100,000 entries per shard (the current default) for an increase, there's a new trigger of 10,000 entries per shard for a decrease.
However, for buckets with object-counts that go up and down regularly, we don't want to keep resharding up and down to chase the number of objects. So for shard reduction to take place there's also a time delay (default 5 days). Once the entry on the reshard queue (log) is added for reduction, processing will not result in a reshard reduction within this delay period as the queue is processed. Only when the reshard entry is processed after this delay can it perform the shard reduction.
However, if at any point the entry is processed and there are not few enough entries to trigger a shard reduction, the entry on the reshard queue will be discarded.
So using the defaults, this effectively means the bucket must have few enough objects for a shard reduction for 5 consecutive days before the reshard will take place.
Fixes: https://tracker.ceph.com/issues/66104
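The two triggers described above can be sketched as a decision function. The thresholds are the stated defaults; the names and exact conditions are illustrative, not the actual RGW implementation.

```cpp
#include <cstdint>

enum class ReshardAction { None, Increase, Reduce };

// Sketch of the dual-trigger check: grow when the bucket exceeds
// 100,000 objects per shard; shrink (if enabled, and only after the
// separate delay period elapses) when it falls below 10,000 per shard.
ReshardAction check_bucket(uint64_t num_objects, uint32_t num_shards,
                           bool reduction_enabled,
                           uint64_t max_objs_per_shard = 100000,
                           uint64_t min_objs_per_shard = 10000) {
  if (num_objects > max_objs_per_shard * num_shards) {
    return ReshardAction::Increase;
  }
  if (reduction_enabled && num_shards > 1 &&
      num_objects < min_objs_per_shard * num_shards) {
    // queue the reduction now; it only proceeds if this condition still
    // holds once the delay (default 5 days) has passed
    return ReshardAction::Reduce;
  }
  return ReshardAction::None;
}
```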
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an `x` between the brackets: `[x]`. Spaces and capitalization matter when checking off items this way.
Checklist
Show available Jenkins commands
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard cephadm
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox
jenkins test windows
jenkins test rook e2e