-
Notifications
You must be signed in to change notification settings - Fork 24.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow RA metrics to be reported upon parsing completed or accumulated #108726
Conversation
expose size observer plugin to stateless integ test remove unused assertions spotless spotless
RAmetric can be implemented so that they could be reported before they are being indexed (like with a new field being added) or they could be accumulated and reported upon shard commit as an additional metadata This commit addes new method to DocumentSizeReporter#onParsingComplted DocumentSizeAccumulator that is being used to accumulate the size inbetween the commits DocumentSizeReporter can be parametrised with a DocumentSizeAccumulator based on elastic#108449
Pinging @elastic/es-core-infra (Team:Core/Infra) |
Hi @pgomulka, I've created a changelog YAML for you. |
* @param map a map with previous value of size | ||
* @return an map with a new value of size | ||
*/ | ||
Map<String, String> totalDocSizeMap(Map<String, String> map); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That signature seems match a very specific use case only, making it hard to understand in this context. Could you remind me again how we are using it? Could this just be getAndReset()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are using this in IndexEngine. we are returning this map so that it can be stored in a commit userdata.
getAndReset
feels fine to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to bother again about naming, but looking at what EMPTY_INSTANCE does and from what I remember, this is supposed to enrich an existing map passed via argument, adding the accumulated size, right?
If so, getAndReset
seems a bit off. I know enrichMapAndResetAccumulator
is a bit much, but if this is what it does..
(I might be wrong here, please correct me!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from what I remember
From the POC on which this PR is based, I mean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enrich an existing map passed via argument
not really enrich, but fetch a previous value of ra_storage_key and add a new accumulated size
We should not try to enrich it as it will contain other values (TRANSLOG_RECOVERY_START_FILE) which will be overriden later in getCommitExtraUserData
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I just noticed the change in serverless. That was a misunderstanding, I had int getAndReset();
in mind, but I see why you want to pass the map through here. My bad :(
How about the following? That might better capture the intention of this?
Map<String, String> incrementCommitExtraUserData(SegmentInfos segmentInfos)
Or more "precise", though clunky (and not really a serious recommendation)
Optional<Entry<String, String>> incrementCommitExtraUserData(SegmentInfos segmentInfos)
Or
Map<String, String> incrementAndReset(Map<String, String>)
Though I honestly don't like the signature that passes in a map. It makes it look like the input map is modified, which isn't really the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or
Map<String, String> incrementSizeAndReset(SegmentInfos segmentInfos)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah yes, I int getAndReset();
would mean we would have to 'leak' the logic of the aggregation of previous value into the IndexEngine
I think we don't really have to explicitly say in a method name that we will reset the accumulator. We could keep this as an implementation detail, it does not matter from IndexEngine
point of view what happens with the value in the accumulator - but let's leave this for the implementation Accumulator interface.
how about
Map<String, String> getAsCommitUserDataMap(SegmentInfos segmentInfos)
?
returning an empty map (/just noticed this bug/) should act as a null object pattern
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, what confused me was that in EMPTY_INSTANCE we were passing the map in and out for the "null object". So I thought: we get an existing map, and we change it. The no-op/null object would just not change it.
If we have instead Map<String, String> getAsCommitUserDataMap(SegmentInfos segmentInfos)
, and the no-op implementation returning an empty map, that would be clearer to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sound good to me, but I think we can drop the redundant Map
suffix 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpicking on name :) but otherwise looks good.
👍 Still good, even better after the renaming |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
…elastic#108726) RAmetric can be implemented so that they could be reported before they are being indexed (like with a new field being added) or they could be accumulated and reported upon shard commit as an additional metadata This commit addes new method to DocumentSizeReporter#onParsingComplted DocumentSizeAccumulator that is being used to accumulate the size inbetween the commits DocumentSizeReporter can be parametrised with a DocumentSizeAccumulator based on elastic#108449
RAmetric can be implemented so that they could be reported before they are being indexed (like with a new field being added)
or they could be accumulated and reported upon shard commit as an additional metadata
This commit addes new method to DocumentSizeReporter#onParsingComplted DocumentSizeAccumulator that is being used to accumulate the size inbetween the commits DocumentSizeReporter can be parametrised with a DocumentSizeAccumulator
based on #108449