Associate restore snapshot task to parent mount task #108705

nicktindall · 2024-05-16T07:51:01Z

Implemented the test as an integration test that polls the tasks API to confirm the association of parent/child tasks. Thanks for the feedback @DaveCTurner.

Closes #105830

...g/elasticsearch/xpack/searchablesnapshots/action/TransportMountSearchableSnapshotAction.java

...sticsearch/xpack/searchablesnapshots/action/TransportMountSearchableSnapshotActionTests.java

DaveCTurner

Yeah mocking this much stuff is pretty bad IMO, it will get in the way of future changes/refactorings to these APIs I'm sure.

I would suggest doing this as an org.elasticsearch.xpack.searchablesnapshots.BaseSearchableSnapshotsIntegTestCase, which gives you real implementations of all this stuff. Submit a task to the master which blocks, something like this:

elasticsearch/server/src/test/java/org/elasticsearch/cluster/routing/BatchedRerouteServiceTests.java

Lines 89 to 101 in 69ff7df

    
    clusterService.submitUnbatchedStateUpdateTask("block master service", new ClusterStateUpdateTask() {
        @Override
        public ClusterState execute(ClusterState currentState) throws Exception {
            safeAwait(cyclicBarrier); // notify test that we are blocked
            safeAwait(cyclicBarrier); // wait to be unblocked by test
            return currentState;
        }

        @Override
        public void onFailure(Exception e) {
            throw new AssertionError("block master service", e);
        }
    });

(get the clusterService using internalCluster().getCurrentMasterNodeInstance(ClusterService.class))

That will give you time to call the list-tasks API and verify that the mount task is the parent of the restore task.
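For readers unfamiliar with the two-await handshake in the snippet above, here is a minimal stand-alone sketch of the same idea using only java.util.concurrent. It is not Elasticsearch code: the class BarrierBlockDemo, its methods and the 10-second timeout are hypothetical names and values chosen for illustration, and a plain executor thread stands in for the master service.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-alone demo; none of these names are Elasticsearch APIs.
final class BarrierBlockDemo {
    private static final long TIMEOUT_SECONDS = 10; // assumed bound, for illustration only

    // Bounded barrier wait, so a bug fails the run instead of hanging it.
    static void await(CyclicBarrier barrier) {
        try {
            barrier.await(TIMEOUT_SECONDS, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new AssertionError("interrupted at barrier", e);
        } catch (BrokenBarrierException | TimeoutException e) {
            throw new AssertionError("barrier wait failed", e);
        }
    }

    // Returns true if the worker was still blocked while the caller inspected it
    // and then finished once it was released.
    static boolean observeWhileBlocked() {
        CyclicBarrier barrier = new CyclicBarrier(2);
        AtomicBoolean finished = new AtomicBoolean(false);
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Future<?> task = worker.submit(() -> {
                await(barrier); // notify the caller that the worker is blocked
                await(barrier); // wait to be unblocked by the caller
                finished.set(true);
            });
            await(barrier); // from here the worker cannot pass its second await without us
            boolean stillBlocked = finished.get() == false; // the inspection window
            await(barrier); // release the worker
            task.get(TIMEOUT_SECONDS, TimeUnit.SECONDS);
            return stillBlocked && finished.get();
        } catch (Exception e) {
            throw new AssertionError("worker did not finish cleanly", e);
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("observed blocked worker: " + observeWhileBlocked());
    }
}
```

The inspection window between the caller's two awaits corresponds to the point where the test would call the list-tasks API.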

...sterTest/java/org/elasticsearch/xpack/searchablesnapshots/SearchableSnapshotsIntegTests.java

elasticsearchmachine · 2024-05-17T05:24:55Z

Hi @nicktindall, I've created a changelog YAML for you.

Closes elastic#105830

nicktindall · 2024-05-17T08:14:14Z

...sterTest/java/org/elasticsearch/xpack/searchablesnapshots/SearchableSnapshotsIntegTests.java

+            safeAwait(cyclicBarrier);   // Unblock the master thread
+            if (response != null) {
+                response.actionGet();
+                assertAcked(indicesAdmin().prepareDelete(indexName));


The actionGet and delete index here were an attempt to fix a failure that occurred in CI in a teardown; one or both of these may be necessary. Happy to remove the delete if it's not necessary.

DaveCTurner

Test looks great. I left a few pointers but nothing substantial.

DaveCTurner · 2024-05-17T09:23:00Z

...sterTest/java/org/elasticsearch/xpack/searchablesnapshots/SearchableSnapshotsIntegTests.java

+            safeAwait(cyclicBarrier);   // Unblock the master thread
+            if (response != null) {
+                response.actionGet();
+                assertAcked(indicesAdmin().prepareDelete(indexName));


Nice catch, yeah, it seems we have to wait for the mount to complete before finishing the test or else there's a race between cleaning up the test cluster and the mount process creating the internal .snapshot-blob-cache index.

I expect there's no need to manually delete the test index here tho but it doesn't hurt either. Would you add a code comment about why we need these nonobvious cleanup steps?
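The ordering problem described here can be sketched outside Elasticsearch. The following is a hypothetical stand-alone illustration, not the test from this PR: a set of strings stands in for the indices in a test cluster, a CompletableFuture stands in for the mount, and the class and resource names are made up. The point is only that the finally block unblocks the background work, waits for it with a bound, and cleans up afterwards.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-alone demo; the "resources" set stands in for indices in a test cluster.
final class CleanupOrderingDemo {
    static Set<String> runAndCleanUp() {
        Set<String> resources = ConcurrentHashMap.newKeySet();
        CountDownLatch unblock = new CountDownLatch(1);
        CompletableFuture<Void> background = null;
        try {
            // Background operation that creates a side-effect resource late in its life.
            background = CompletableFuture.runAsync(() -> {
                try {
                    if (unblock.await(10, TimeUnit.SECONDS) == false) {
                        throw new IllegalStateException("never unblocked");
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                }
                resources.add("late-side-effect");
            });
            resources.add("test-resource");
            // ... assertions against the blocked operation would go here ...
        } finally {
            unblock.countDown(); // let the background operation proceed
            if (background != null) {
                // Wait (with a bound) before cleanup; otherwise the late side effect can race with it.
                background.orTimeout(10, TimeUnit.SECONDS).join();
            }
            resources.clear(); // stands in for the framework's teardown
        }
        return resources;
    }
}
```

If the join were dropped, the clear could run before the late add, leaving a resource behind after teardown, which is the kind of failure the cleanup steps guard against.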

DaveCTurner · 2024-05-17T09:26:20Z

...sterTest/java/org/elasticsearch/xpack/searchablesnapshots/SearchableSnapshotsIntegTests.java

+        } finally {
+            safeAwait(cyclicBarrier);   // Unblock the master thread
+            if (response != null) {
+                response.actionGet();


This'd be as good a time as any to stop using these infinite waits in tests, doing something like this and then using safeGet() instead:

diff --git a/test/framework/src/main/java/org/elasticsearch/test/ESTestCase.java b/test/framework/src/main/java/org/elasticsearch/test/ESTestCase.java
index 80f9f2abea1..94c89318718 100644
--- a/test/framework/src/main/java/org/elasticsearch/test/ESTestCase.java
+++ b/test/framework/src/main/java/org/elasticsearch/test/ESTestCase.java
@@ -2154,6 +2154,10 @@ public abstract class ESTestCase extends LuceneTestCase {
     public static <T> T safeAwait(SubscribableListener<T> listener) {
         final var future = new PlainActionFuture<T>();
         listener.addListener(future);
+        return safeGet(future);
+    }
+
+    private static <T> T safeGet(Future<T> future) {
         try {
             return future.get(SAFE_AWAIT_TIMEOUT.millis(), TimeUnit.MILLISECONDS);
         } catch (InterruptedException e) {
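The diff is cut off at the first catch clause, so as a rough illustration of what a bounded get can look like, here is a stand-alone sketch. It is not the ESTestCase code: the class BoundedWait, the 10-second default and the failure messages are assumptions made for this example, and the real SAFE_AWAIT_TIMEOUT is defined in ESTestCase.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-alone helper modelled on the idea in the diff; not the ESTestCase implementation.
final class BoundedWait {
    // Assumed value for illustration only.
    static final long DEFAULT_TIMEOUT_MILLIS = 10_000;

    static <T> T safeGet(Future<T> future) {
        return safeGet(future, DEFAULT_TIMEOUT_MILLIS);
    }

    // Waits for the future with a bound and turns every failure mode into an AssertionError,
    // so a stuck operation fails the test rather than hanging it.
    static <T> T safeGet(Future<T> future, long timeoutMillis) {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new AssertionError("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new AssertionError("future completed exceptionally", e);
        } catch (TimeoutException e) {
            throw new AssertionError("future did not complete within " + timeoutMillis + "ms", e);
        }
    }
}
```

Compared with an unbounded actionGet(), a helper like this converts a hang into a test failure with a usable cause.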

comments addressed

- use assertBusy to reduce task-get complexity
- don't wait forever for mount to complete
- document additional required cleanup
- add ESTestCase#safeGet and use it
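For context on the assertBusy item, the general shape of a retrying assertion can be sketched as follows. This is a hypothetical stand-alone helper, not the ESTestCase#assertBusy implementation: the name assertEventually, the backoff cap of 100 ms and the Runnable-based signature are assumptions for this example.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-alone sketch of a retrying assertion; not the ESTestCase#assertBusy implementation.
final class PollingAssert {
    // Re-runs the assertion until it passes or maxWaitMillis has elapsed, then rethrows the last failure.
    static void assertEventually(Runnable assertion, long maxWaitMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        long sleepMillis = 1;
        while (true) {
            try {
                assertion.run();
                return; // the assertion held
            } catch (AssertionError e) {
                if (System.nanoTime() - deadline >= 0) {
                    throw e; // out of time: surface the last failure
                }
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new AssertionError("interrupted while polling", e);
            }
            sleepMillis = Math.min(sleepMillis * 2, 100); // capped exponential backoff
        }
    }
}
```

A test that polls for a task to appear would put the lookup and its assertion inside the Runnable, so the retry loop does not have to be written by hand.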

elasticsearchmachine · 2024-05-20T00:14:19Z

Pinging @elastic/es-distributed (Team:Distributed)

test/framework/src/main/java/org/elasticsearch/test/ESTestCase.java

DaveCTurner

Looks good, just a few tiny style points.

DaveCTurner · 2024-05-20T13:46:09Z

...sterTest/java/org/elasticsearch/xpack/searchablesnapshots/SearchableSnapshotsIntegTests.java

+    }
+
+    private TaskInfo getTaskForActionFromMaster(String action) {
+        var ltr = new ListTasksRequest().setDetailed(true).setNodes(internalCluster().getMasterName()).setActions(action);


Would rather not use abbrs. for var. names wherever poss. :)

Here I think I'd just inline the variable, it's only used in one place - saves having to think up a name at all.

DaveCTurner · 2024-05-20T13:47:02Z

...sterTest/java/org/elasticsearch/xpack/searchablesnapshots/SearchableSnapshotsIntegTests.java

+        var ltr = new ListTasksRequest().setDetailed(true).setNodes(internalCluster().getMasterName()).setActions(action);
+        ListTasksResponse response = client().execute(TransportListTasksAction.TYPE, ltr).actionGet();
+        int matchingTasks = response.getTasks().size();
+        assertEquals(String.format(Locale.ROOT, "Expected a single task for action %s, got %d", action, matchingTasks), 1, matchingTasks);


I'd use assertThat(..., hasSize(1)) here - that should give the same information on failure as what you've got here but without needing a hand-crafted message.

test/framework/src/main/java/org/elasticsearch/test/ESTestCase.java

DaveCTurner

LGTM

elasticsearchmachine added the v8.15.0 label May 16, 2024

DaveCTurner previously requested changes May 16, 2024

nicktindall added the :Distributed/Distributed and >bug labels May 17, 2024

nicktindall added the Supportability label May 17, 2024

nicktindall requested a review from DaveCTurner May 17, 2024 05:28

nicktindall and others added 5 commits May 17, 2024 15:44

Associate restore snapshot task to parent mount task

f3894db

Closes elastic#105830

Implement as integration test instead of unit test

34cc0b0

Tidy up polling logic

21ad489

Remove dead code

f8a6567

Update docs/changelog/108705.yaml

c5bdd8b

nicktindall force-pushed the fix/105830_link_restore_snapshot_task_to_parent_mount branch from a34015d to c5bdd8b Compare May 17, 2024 05:44

Wait for mount to complete, delete index at end of test

59b8243

Tidy up test

770bd05

- use assertBusy to reduce task-get complexity
- don't wait forever for mount to complete
- document additional required cleanup
- add ESTestCase#safeGet and use it

nicktindall marked this pull request as ready for review May 20, 2024 00:13

elasticsearchmachine added the Team:Distributed label May 20, 2024

Tidy up from feedback

4f3b7b6

DaveCTurner approved these changes May 21, 2024

nicktindall merged commit deb3ef9 into elastic:main May 21, 2024
15 checks passed

nicktindall deleted the fix/105830_link_restore_snapshot_task_to_parent_mount branch May 21, 2024 07:28

jedrazb pushed a commit to jedrazb/elasticsearch that referenced this pull request May 21, 2024

Associate restore snapshot task to parent mount task (elastic#108705)

6ca9d42

* Associate restore snapshot task to parent mount task Closes elastic#105830
