Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In-Memory Engine: Eviction, Algorithmic loading #17023

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

afeinberg
Copy link
Contributor

@afeinberg afeinberg commented May 16, 2024

What is changed and how it works?

Issue Number: Ref #16764 Ref #16141

What's Changed:

In-Memory Engine: Algorithmic Load and Eviction

Related changes

  • PR to update pingcap/docs/pingcap/docs-cn:
  • Need to cherry-pick to the release branch

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
    Configure with range cache engine enabled and load a dataset larger than the soft limit.
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Release note

None

Copy link
Contributor

ti-chi-bot bot commented May 16, 2024

[REVIEW NOTIFICATION]

This pull request has not been approved.

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot bot added release-note-none contribution Type: PR - From contributors labels May 16, 2024
Copy link
Contributor

ti-chi-bot bot commented May 16, 2024

Hi @afeinberg. Thanks for your PR.

I'm waiting for a tikv member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@afeinberg
Copy link
Contributor Author

/cc @tonyxuqqi @SpadeA-Tang

@afeinberg
Copy link
Contributor Author

/assign @tonyxuqqi

@afeinberg afeinberg force-pushed the afeinberg/in-memory/eviction-clean branch from 7a41a13 to 4a7dad7 Compare May 16, 2024 00:09
Copy link
Contributor Author

@afeinberg afeinberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor comments that I expect: I know I will need to add some documents and an attempt at tests.

if remaining == 0 {
break;
}
if self.engine.write().mut_range_manager().evict_range(range) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: I need to add a check to see if this was recently loaded. It should be easy here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should call engine.evict_range directly as it will do some cleanup works

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should call engine.evict_range directly as it will do some cleanup works

@afeinberg

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.engine.write().mut_range_manager().evict_range(range) returns true means the range can be deleted from the skiplist directly and thus not need to wait for the drop of some snapshots. So we have to delete the range if it return true, and this logic is already have in RangeCacheMemoryEngine::evict_range. It seems we cannot get the engine here, so we have to delete the range mannually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bringing this to my attention. Will do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SpadeA-Tang Done. Is what I am doing now what you had in mind?

@afeinberg
Copy link
Contributor Author

/cc @overvenus

@ti-chi-bot ti-chi-bot bot requested a review from overvenus May 16, 2024 02:19
@afeinberg afeinberg force-pushed the afeinberg/in-memory/eviction-clean branch from e0ea397 to d203840 Compare May 16, 2024 22:13
}
if self.engine.write().mut_range_manager().evict_range(range) {
info!("evict on soft limit reached"; "range" => ?&range, "approx_size" => approx_size, "remaining" => remaining);
remaining = remaining
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There maybe a huge difference between the memory usage of the region in in-memory engine and the region approximate size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is true. I understand this is imprecise, so there is a best effort approaches. Approximate size is via region info. Do we have a preferred way to check this that is more direct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SpadeA-Tang Looks like there's no clean API to get memory usage of a range directly from the range cache engine. I do have an idea on this, but it might take some time to implement. On the other hand, approximate size is most likely to be smaller than the actual size - which would lead to over eviction, which seems safer than under eviction. Any thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's indeed over eviction. Actually, the approximate region size is generally estimazted by including all MVCC versions as well as rocksdb tombstones where there's may be only a small fraction of them that is in the memory engine. So this difference may be significantly huge.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SpadeA-Tang How difficult would it to be to add an API for accurate size? Seems like adding a quick lock free way to keep track of a region (perhaps by using thread local to set region id (or start and end keys) when allocating memory for that region, or keeping track of all allocations for a specific range some other way?) would be the right way to do so. I can create an issue for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SpadeA-Tang What do you think about making delete_range return the number of bytes it has deleted? I can make this change in this PR. It might have slight overhead. I don't see any drawback to that.

@SpadeA-Tang
Copy link
Member

We may add integration tests for this.

@afeinberg afeinberg force-pushed the afeinberg/in-memory/eviction-clean branch 2 times, most recently from 5efa46b to 5db0bea Compare May 20, 2024 17:57
@ti-chi-bot ti-chi-bot bot added size/XXL and removed size/XL labels May 24, 2024
@afeinberg afeinberg force-pushed the afeinberg/in-memory/eviction-clean branch 2 times, most recently from dc7fb92 to a44d420 Compare May 24, 2024 16:43
@afeinberg afeinberg force-pushed the afeinberg/in-memory/eviction-clean branch from a44d420 to 6a53d68 Compare May 24, 2024 16:45
@afeinberg
Copy link
Contributor Author

We may add integration tests for this.

Any thoughts on where to do this? We would need to have real heartbeats going out, ideally, although I can also use a mock region info provider.

@afeinberg
Copy link
Contributor Author

afeinberg commented May 28, 2024

/run-all-tests

@afeinberg afeinberg force-pushed the afeinberg/in-memory/eviction-clean branch from 9559df9 to 56964ed Compare May 28, 2024 18:01
Load/evict based on top regions.

Signed-off-by: Alex Feinberg <alex@strlen.net>
Signed-off-by: Alex Feinberg <alex@strlen.net>
Signed-off-by: Alex Feinberg <alex@strlen.net>
Signed-off-by: Alex Feinberg <alex@strlen.net>
Signed-off-by: Alex Feinberg <alex@strlen.net>
Signed-off-by: Alex Feinberg <alex@strlen.net>
Signed-off-by: Alex Feinberg <alex@strlen.net>
Signed-off-by: Alex Feinberg <alex@strlen.net>
Signed-off-by: Alex Feinberg <alex@strlen.net>
Signed-off-by: Alex Feinberg <alex@strlen.net>
Signed-off-by: Alex Feinberg <alex@strlen.net>
@afeinberg afeinberg force-pushed the afeinberg/in-memory/eviction-clean branch from 56964ed to e8dfe12 Compare May 31, 2024 18:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants