Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mvlog: log basics for segment rolls #18521

Merged
merged 2 commits into from
May 24, 2024

Conversation

andrwng
Copy link
Contributor

@andrwng andrwng commented May 16, 2024

Introduces the beginnings of a log implementation that is able to roll
new segments with segment.bytes or segment.ms.

The log, similar to the disk_log_impl, is backed by a circular buffer of
segments, with the noteable difference that the active segment is
managed explicitly and separately from the others.

I'm not fully sold on this approach (e.g. vs only managing the segment
appender separately but keeping the underlying readable segment with the
rest of the segments), but for now it makes reasoning about segment
rolling concurrency a bit more straightforward, and down the line it
will make it easier to reason about segment removals (e.g. the active
segment cannot be removed).

This implementation is still missing quite a bit (e.g. it doesn't flush,
doesn't implement the exact log interface), but at least begins to
introduce some ideas that can back the new log implementation.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

  • none

@andrwng
Copy link
Contributor Author

andrwng commented May 16, 2024

/ci-repeat

@andrwng andrwng force-pushed the mvlog-segment-rolls branch 2 times, most recently from bc6a94c to 4651b15 Compare May 16, 2024 17:37
This will be used to uniquely identify new segments.
dotnwat
dotnwat previously approved these changes May 21, 2024
src/v/storage/mvlog/CMakeLists.txt Show resolved Hide resolved
: segment_file(std::move(f))
, appender(std::make_unique<segment_appender>(segment_file.get()))
, readable_seg(std::make_unique<readable_segment>(segment_file.get()))
, construct_time(ss::lowres_clock::now())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would construction time be attached to the file instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean whether construction time should be something tracked in the mvlog::file? I think probably not -- the construction time is only really useful for now in determining if we should apply segment.ms

Or are you asking whether it should be attached to something other than the file construction (e.g. first data write)? I think that wouldn't be a bad idea -- maybe it'd be important to ensure no "empty" segments e.g. if a segment contains only truncate entries maybe we wouldn't want it being rolled? Though I'm not sold that such empty segments are a problem

Copy link
Member

@dotnwat dotnwat May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that ctime is a normal posix file property, so associating it with the actual file made some sense. but maybe it's just a naming collision? we probably want to track our own creation timestamp in mvlog if that timestamp is going to be used for retention purposes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah gotcha. Yeah it probably makes sense for retention. Will ponder a bit on how to resolve the naming conflict. I agree it might be confusing down the line

src/v/storage/mvlog/versioned_log.cc Show resolved Hide resolved
Comment on lines 63 to 64
auto ro_seg = std::make_unique<readonly_segment>(std::move(active_seg_));
segs_.emplace_back(std::move(ro_seg));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool. i wonder if we should call this like extend rather than roll? i mean, it doesn't really matter, but I am reading correctly that this roll is not like the roll we do in disk_log_impl--it's a logical roll?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, a relatively self-describing name might be close_active_seg_unlocked() or something. You're right in disk_log_impl "roll" typically refers to creating an additional segment, which this doesn't do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a huge fan of extend because it might be mistaken for adding data to the log, which this isn't doing, unless you're referring to the create_unlocked() method?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it. i think it's fine as-is. i'm getting used to the naming

vassert(
!active_segment_lock_.ready(),
"create_unlocked() must be called with active segment lock held");
vassert(active_seg_ == nullptr, "Expected no active segment");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe there isn't, but it would be nice if this were impossible by construction. sort of like, a log always has an active segment--it's the last one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree, though I haven't landed yet on an impl that I found easy to reason about. It's possible that everything covered by the active segment lock could be encapsulated into some active_segment_manager class that handles rolling and appends and such. I suspect it'll make things harder to reason about on the read path, but I wasn't sure so I opted to keep this a little more raw

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

log.info,
"Rolling segment file {}",
active_seg_->segment_file->filepath().c_str());
co_await active_seg_->segment_file->flush();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wonder if the flush is necessary? if we extend the log and keep writing to the next segment at some point the "log" will be flushed, at which point we could flush all unflushed segments. is there a particular reason for flushing here before rolling?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I felt it seemed natural for applying segment.ms for a flush to occur because it indicates we have some data that's otherwise just sitting in memory. But for segment.bytes maybe it's more worth it to be lazy about flushing

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeh. we can always measure it and see if it is worth keeping or removing.

src/v/storage/mvlog/versioned_log.cc Outdated Show resolved Hide resolved
src/v/storage/mvlog/versioned_log.h Show resolved Hide resolved
@Lazin
Copy link
Contributor

Lazin commented May 21, 2024

Generally, the change looks good. I'm curious about the versioning. Previously, we had a lot of complexity created by use of locking. This code also relies on locking instead of versioning. Or maybe versioning will be used for truncation/readers?

@andrwng
Copy link
Contributor Author

andrwng commented May 21, 2024

Previously, we had a lot of complexity created by use of locking. This code also relies on locking instead of versioning. Or maybe versioning will be used for truncation/readers?

Good observation. I think the complexity of segment locking is that it allows for there to be many combinations of orderings of lock acquisitions, which makes it very easy to deadlock. I'm hoping that by more narrowly scoping the areas that we have locks that we'll reduce complexity.

And that's right, we'll use versioning on the read path and in handling truncations.

}

ss::future<> versioned_log::apply_segment_ms() {
auto lock = active_segment_lock_.get_units();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing co_await? i thought this would fail 🤔 or we never added this diagnostic? llvm/llvm-project#76101

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes! Fixed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a test for this as well.

@andrwng andrwng force-pushed the mvlog-segment-rolls branch 2 times, most recently from d27b749 to 8b291b3 Compare May 22, 2024 20:05
@andrwng andrwng requested a review from nvartolomei May 23, 2024 00:36
"Rolling segment file {}",
active_seg_->segment_file->filepath().c_str());
co_await active_seg_->segment_file->flush();
auto ro_seg = std::make_unique<readonly_segment>(std::move(active_seg_));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Is auto ro_seg = std::move(active_seg_) on its own sufficient?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The active segment and readonly segments have different types, so it doesn't seem like it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. Eyes deceived me on this one.

return log_.get();
}

ss::future<ss::circular_buffer<model::record_batch>>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: For future proofing, maybe this should use the version_log::segments_t type (you'll have to make it public).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a group of batches, not segments

Introduces the beginnings of a log implementation that is able to roll
new segments with segment.bytes or segment.ms.

The log, similar to the disk_log_impl, is backed by a circular buffer of
segments, with the noteable difference that the active segment is
managed explicitly and separately from the others.

I'm not fully sold on this approach (e.g. vs only managing the segment
appender separately but keeping the underlying readable segment with the
rest of the segments), but for now it makes reasoning about segment
rolling concurrency a bit more straightforward, and down the line it
will make it easier to reason about segment removals (e.g. the active
segment cannot be removed).

This implementation is still missing quite a bit (e.g. it doesn't flush,
doesn't implement the exact log interface), but at least begins to
introduce some ideas that can back the new log implementation.
@andrwng andrwng merged commit 122097b into redpanda-data:dev May 24, 2024
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants