Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OAK-10801: Introduce functions to remove nodes that are older than a timestamp #1451

Open
wants to merge 1 commit into
base: trunk
Choose a base branch
from

Conversation

Joscorbe
Copy link
Member

This function removes all the children of a node that are older than a given timestamp. To know if a node was modified recently, the value of the internal property _modified is used.

The function doesn't remove the given path. Only its children and their subtrees.

A helper method is added to find the candidates that would be deleted.

Copy link
Contributor

@stefan-egli stefan-egli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand the context a bit more. Would it be possible to add some comment here or in the jira as to use cases for which this is needed?

Naming wise : I'm wondering if the name is clear enough. On the one hand it uses the term "descendants" which implies it removes the entire subtree (which it ultimately does, for those matching the query conditions, as it uses removeDescendantsAndSelf). On the other hand it actually checks only direct children (depth++) and then removes their descendants. So this combination of descendants-of-matching-direct-children I'm not sure if the name reflects that entirely. Maybe something like removeDescendantsOf(Direct)ChildrenOlderThan (but yes, the name gets ... long)

Was also wondering if the two methods could be combined into one - to reduce risk of bugs and maintenance. I.e. there could be.an internal "worker function" that accepts a dryRun flag - and two caller functions that is exposed in the api. that calls either dryRun true or false?

/**
* Removes all the descendants with a _modified timestamp older than the
* given timestamp.
* Use removeDescendantsAndSelf() first to get the list of nodes to remove.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Use removeDescendantsAndSelf() first to get the list of nodes to remove.
* Use findDescendantsOlderThan() first to get the list of nodes to remove.

Was this indeed meant to refer to removeDescendantsAndSelf, or not findDescendantsOlderThan?

@rishabhdaim
Copy link
Contributor

Do we need to add that we use the _modified field to find documents older than the given timestamp inside the documentation?

IIUC, this is an internal detail and can be avoided, @stefan-egli wydt?

@stefan-egli
Copy link
Contributor

stefan-egli commented May 14, 2024

Good point @rishabhdaim , we don't need to refer to the underlying key - what we might want to mention is that the timestamp is at 5 sec resolution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants