TestUpdateClusterLayout.test_change_node_ip fails (when getting host-id from nodetool) #18693

Open

xemul opened this issue May 15, 2024 · 9 comments

Labels: area/nodetool, good first issue, P3 Medium Priority, symptom/ci stability (Issues that failed in ScyllaDB CI - tests and framework)

@xemul commented May 15, 2024

https://jenkins.scylladb.com/job/scylla-master/job/gating-dtest-release-with-consistent-topology-changes/1589/testReport/junit/update_cluster_layout_tests/TestUpdateClusterLayout/FullDtest___full_split001___test_change_node_ip/

16:51:47,577 747     ccm                            DEBUG    cluster.py          :754  | test_change_node_ip: node2: checking 
16:51:47,577 747     update_cluster_layout_tests    DEBUG    update_cluster_layout_tests.py:2459 | test_change_node_ip: nodetool status from node2: {'Datacenter': 'datacenter1', 'nodes': [{'status': 'UN', 'address': '127.0.74.1', 'load': '?', 'tokens': '256', 'owns': '?', 'host id': '38c2e454-2ce0-4557-868b-67efde6e1c22', 'rack': 'rack1'}, {'status': 'UN', 'address': '127.0.74.2', 'load': '407.32 KB', 'tokens': '256', 'owns': '?', 'host id': '22bce9b5-9c00-4367-a132-e624593a2de0', 'rack': 'rack1'}, {'status': 'UN', 'address': '127.0.74.33', 'load': '?', 'tokens': '256', 'owns': '?', 'host id': '?', 'rack': 'rack1'}]}
16:51:47,799 747     errors                         ERROR    conftest.py         :202  | test_change_node_ip: test failed: 
        for node in cluster.nodelist():
            status = nodetool_status(node)
            logger.debug(f"nodetool status from {node.name}: {status}")
            for n in status["nodes"]:
                if n["address"] == ip3:
>                   assert n["host id"] == hostid3
E                   AssertionError: assert '?' == 'c4dc72ab-76c9-4941-a82c-2d63ac5e167b'
E                     - c4dc72ab-76c9-4941-a82c-2d63ac5e167b
E                     + ?

The test was re-enabled recently by https://github.com/bhalevy/scylla-dtest/commit/73fd1b731a74ad4a6c00ef51a17a50b08464dbce (see https://github.com/scylladb/scylla-dtest/issues/4273).


EDIT from @kbr-scylla: explanation here: #18693 (comment)

@xemul added the symptom/ci stability (Issues that failed in ScyllaDB CI - tests and framework) label on May 15, 2024
@kbr-scylla commented

Could it be another manifestation of #17857?

@kbr-scylla commented

#17857 sounds very plausible (cc @bhalevy). In this test we have a node which changes its IP address. Once it restarts, other nodes see it in gossip, but until they perform storage_service::handle_state_normal, they don't put it in token_metadata. So nodetool status, which uses token_metadata::get_endpoint_to_host_id_map_for_reading() to assign a host ID to every IP in the status list, does not find the host ID for this IP for some time.

Idea to reproduce: start two nodes A and B, then inject a sleep at the beginning of handle_state_normal on A, then restart B with a different IP address. During the sleep, nodetool status called on A should show a missing host ID for B for some time.
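
A rough sketch of that reproduction (assuming Scylla's error-injection REST API is enabled on node A; the injection name "handle_state_normal_sleep" is hypothetical, and such an injection point would have to be added to Scylla):

    # Hypothetical reproduction sketch; the injection name below is made up.
    import subprocess
    import time

    import requests

    NODE_A_API = "http://127.0.74.1:10000"

    # Enable the (hypothetical) sleep injection on node A.
    requests.post(f"{NODE_A_API}/v2/error_injection/injection/handle_state_normal_sleep")

    # ... restart node B with a different IP address here (e.g. via ccm) ...

    # While node A sleeps in handle_state_normal, its status output should
    # show '?' as the host ID for node B's new address.
    for _ in range(10):
        out = subprocess.check_output(
            ["nodetool", "-h", "127.0.74.1", "-p", "10000", "status"], text=True)
        print(out)
        time.sleep(1)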

@kbr-scylla commented

Actually, it cannot be #17857, and what I wrote above doesn't make sense.

#17857 was observed in a test run #16902 in gossip-topology mode.

This issue was observed in a dtest run with raft-topology, where we don't call storage_service::handle_state_normal.

In raft-topology mode, the function responsible for reacting to new endpoints appearing in gossip is storage_service::raft_ip_address_updater::on_endpoint_change. So I suspect a bug could be somewhere in this function.

However, I analyzed the code and I don't see how this bug could happen. The function does nothing if it sees that the host ID is missing:

    future<>
    on_endpoint_change(gms::inet_address endpoint, gms::endpoint_state_ptr ep_state, gms::permit_id permit_id, const char* ev) {
        auto app_state_ptr = ep_state->get_application_state_ptr(gms::application_state::HOST_ID);
        if (!app_state_ptr) {
            co_return;
        }
        raft::server_id id(utils::UUID(app_state_ptr->value()));

and when it updates token_metadata, it does so in an isolated way using storage_service::mutate_token_metadata / storage_service::replicate_to_all_cores (which atomically swaps a pointer from the old version to the new one), so the REST API endpoint /storage_service/host_id should not be able to observe a "partial" token_metadata with missing host IDs. It should see either the full old token_metadata with the host ID assigned to the old IP, or the full new token_metadata with the host ID assigned to the new IP. (And the same is true for handle_state_normal.)
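
To illustrate the pattern, here is a minimal Python sketch of that copy-and-swap scheme (the names are illustrative, not Scylla's actual API): readers always grab a complete snapshot, and writers publish a new complete snapshot with a single atomic reference swap.

    import threading

    class TokenMetadataHolder:
        """Copy-and-swap: readers see either the full old or the full new version."""
        def __init__(self):
            self._current = {}                 # immutable snapshot: IP -> host ID
            self._write_lock = threading.Lock()

        def get_for_reading(self):
            return self._current               # grab the current snapshot

        def mutate(self, fn):
            with self._write_lock:
                new = dict(self._current)      # copy ...
                fn(new)                        # ... mutate the copy ...
                self._current = new            # ... then swap atomically

    tm = TokenMetadataHolder()
    tm.mutate(lambda m: m.update({"127.0.74.3": "c4dc72ab-..."}))

    def change_ip(m):
        m["127.0.74.33"] = m.pop("127.0.74.3")  # old IP out, new IP in, one swap

    tm.mutate(change_ip)
    print(tm.get_for_reading())                # {'127.0.74.33': 'c4dc72ab-...'}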

Something fishy is going on here. Perhaps we still have some other place which is incorrectly updating token_metadata?

@kbr-scylla commented

This is what the nodes saw when the test executed nodetool:

16:51:47,000 747     update_cluster_layout_tests    DEBUG    update_cluster_layout_tests.py:2454 | test_change_node_ip: Node3 is now up
16:51:47,001 747     urllib3.connectionpool         DEBUG    connectionpool.py   :246  | test_change_node_ip: Starting new HTTP connection (1): 127.0.74.33:10000
16:51:47,001 747     urllib3.connectionpool         DEBUG    connectionpool.py   :474  | test_change_node_ip: http://127.0.74.33:10000 "GET /storage_service/hostid/local HTTP/1.1" 200 38
16:51:47,113 747     ccm                            DEBUG    cluster.py          :754  | test_change_node_ip: node1: nodetool cmd=['/jenkins/workspace/scylla-master/gating-dtest-release-with-consistent-topology-changes/scylla/.dtest/dtest-vql32m_i/test/node1/bin/scylla', 'nodetool', '-h', '127.0.74.1', '-p', '10000', 'status'] wait=True timeout=None
16:51:47,315 747     ccm                            DEBUG    cluster.py          :754  | test_change_node_ip: node1: checking 
16:51:47,316 747     update_cluster_layout_tests    DEBUG    update_cluster_layout_tests.py:2459 | test_change_node_ip: nodetool status from node1: {'Datacenter': 'datacenter1', 'nodes': [{'status': 'UN', 'address': '127.0.74.1', 'load': '253.86 KB', 'tokens': '256', 'owns': '?', 'host id': '38c2e454-2ce0-4557-868b-67efde6e1c22', 'rack': 'rack1'}, {'status': 'UN', 'address': '127.0.74.2', 'load': '407.32 KB', 'tokens': '256', 'owns': '?', 'host id': '22bce9b5-9c00-4367-a132-e624593a2de0', 'rack': 'rack1'}, {'status': 'DN', 'address': '127.0.74.3', 'load': '497.95 KB', 'tokens': '256', 'owns': '?', 'host id': 'c4dc72ab-76c9-4941-a82c-2d63ac5e167b', 'rack': 'rack1'}]}
16:51:47,421 747     ccm                            DEBUG    cluster.py          :754  | test_change_node_ip: node2: nodetool cmd=['/jenkins/workspace/scylla-master/gating-dtest-release-with-consistent-topology-changes/scylla/.dtest/dtest-vql32m_i/test/node2/bin/scylla', 'nodetool', '-h', '127.0.74.2', '-p', '10000', 'status'] wait=True timeout=None
16:51:47,577 747     ccm                            DEBUG    cluster.py          :754  | test_change_node_ip: node2: checking 
16:51:47,577 747     update_cluster_layout_tests    DEBUG    update_cluster_layout_tests.py:2459 | test_change_node_ip: nodetool status from node2: {'Datacenter': 'datacenter1', 'nodes': [{'status': 'UN', 'address': '127.0.74.1', 'load': '?', 'tokens': '256', 'owns': '?', 'host id': '38c2e454-2ce0-4557-868b-67efde6e1c22', 'rack': 'rack1'}, {'status': 'UN', 'address': '127.0.74.2', 'load': '407.32 KB', 'tokens': '256', 'owns': '?', 'host id': '22bce9b5-9c00-4367-a132-e624593a2de0', 'rack': 'rack1'}, {'status': 'UN', 'address': '127.0.74.33', 'load': '?', 'tokens': '256', 'owns': '?', 'host id': '?', 'rack': 'rack1'}]}

@kbr-scylla commented

BTW, there is a slight difference between how the Scylla nodetool and the Cassandra nodetool behave: if a host ID is missing from the /storage_service/host_id endpoint, the old nodetool will print null:

        hostID = hostIDMap.get(endpoint);

(get returns null for a missing key)

The new nodetool will print ?:

    endpoint_host_id.contains(ep) ? endpoint_host_id.at(ep) : "?"

cc @denesb

(probably doesn't matter)

@denesb commented May 22, 2024

Yes, I know about this difference, and I thought "?" makes more sense than "null". Note that other columns in the status output also use "?" when there is no data available (e.g. tokens, load).

@kbr-scylla commented

OK, I see how the issue can happen.

It's just an artifact of how nodetool is implemented. It performs two REST API calls: one to get the "token to endpoint map", another to get the "host ID map". The IP change notification inside Scylla can be handled in between those calls.

The new nodetool gets the host ID map first and the token-to-endpoint map second:

    const auto endpoint_host_id = rjson_to_map<sstring>(client.get("/storage_service/host_id"));
...
    const auto tokens_endpoint_res = client.get("/storage_service/tokens_endpoint", std::move(tokens_endpoint_params));

then it iterates over the endpoints found in tokens_endpoint_res and, for each one, looks up the host ID in endpoint_host_id to print it in the status output (and if it doesn't find the ID, it prints ?).

So if the IP change notification is handled in between, tokens_endpoint_res will contain the new IP, but endpoint_host_id will not contain a host ID for that IP.
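
A toy Python model of this race (all names here are illustrative): the server state changes between the two calls, so the new IP shows up in the token-to-endpoint map but not in the previously fetched host ID map.

    # Server-side state, as the two REST endpoints would report it.
    server = {
        "host_id": {"127.0.74.3": "c4dc72ab-76c9-4941-a82c-2d63ac5e167b"},
        "tokens_endpoint": ["127.0.74.3"],
    }

    def handle_ip_change(old_ip, new_ip):
        # The IP change is applied atomically to the server state ...
        server["host_id"][new_ip] = server["host_id"].pop(old_ip)
        server["tokens_endpoint"] = [new_ip if ip == old_ip else ip
                                     for ip in server["tokens_endpoint"]]

    # ... but nodetool reads it in two separate calls:
    endpoint_host_id = dict(server["host_id"])       # call 1: host ID map
    handle_ip_change("127.0.74.3", "127.0.74.33")    # IP change lands in between
    tokens_endpoint_res = server["tokens_endpoint"]  # call 2: token map

    for ep in tokens_endpoint_res:
        print(ep, endpoint_host_id.get(ep, "?"))     # -> 127.0.74.33 ?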

Curiously, it seems that the old nodetool has the reverse problem:

        Map<String, String> tokensToEndpoints = probe.getTokenToEndpointMap();
        liveNodes = probe.getLiveNodes();
        unreachableNodes = probe.getUnreachableNodes();
        hostIDMap = probe.getHostIdMap();

so it would sometimes observe the old IP without a host ID assigned. But the way the test logic is written, it won't catch this scenario, because it only looks for the new IP, and only if the new IP is found does it verify the ID:

        hostid3 = node3.hostid()
        # Verify all nodes in the cluster see node3 is using the new ip address
        for node in cluster.nodelist():
            status = nodetool_status(node)
            logger.debug(f"nodetool status from {node.name}: {status}")
            for n in status["nodes"]:
                if n["address"] == ip3:
                    assert n["host id"] == hostid3

Summarizing: this failure is a small artifact of the new nodetool implementation, but the old nodetool had a "reverse" version of this bug; the test just wouldn't catch it due to how it's written.
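
Reusing handle_ip_change from the sketch above with the two calls flipped shows the old nodetool's "reverse" race; since the test only asserts on the new IP, this variant goes unnoticed.

    # Old nodetool's order (token map first, host ID map second), starting
    # again from a server that still holds the old IP.
    server = {
        "host_id": {"127.0.74.3": "c4dc72ab-76c9-4941-a82c-2d63ac5e167b"},
        "tokens_endpoint": ["127.0.74.3"],
    }

    tokens_endpoint_res = list(server["tokens_endpoint"])  # call 1: old IP
    handle_ip_change("127.0.74.3", "127.0.74.33")          # change lands here
    endpoint_host_id = dict(server["host_id"])             # call 2: new IP only

    for ep in tokens_endpoint_res:
        print(ep, endpoint_host_id.get(ep, "?"))           # -> 127.0.74.3 ?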

@kbr-scylla added the P4 Low Priority and P3 Medium Priority labels and removed the P4 Low Priority label on May 22, 2024
@kbr-scylla commented

This can be easily "fixed" (in reality, masked, just to shut the test up) by reversing the order of the REST endpoint calls so that it matches the old nodetool. @denesb
