NATS Cluster - Dynamically del node #5421

Open
throwbear opened this issue May 15, 2024 · 3 comments
Labels
defect Suspected defect such as a bug or regression

Comments

@throwbear

Observed behavior

I start four nats-server instances, each with the following configuration file:

nats-server.conf

server_name: node1[2|3|4]

http_port: 6222

accounts: {
  SYS: {
    users: [
      {user: adm, password: admin123}
    ]
  }
}

system_account: SYS

cluster: {
  name: test-cluster
  listen: 0.0.0.0:4248
  routes: [
    nats-route://192.168.3.101:4248,
    nats-route://192.168.3.102:4248,
    nats-route://192.168.3.103:4248,
    nats-route://192.168.3.104:4248,
  ]
}

The command “nats server list” returns the following:

╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Server Overview │
├─────────┬──────────────┬──────┬─────────┬─────┬───────┬──────┬────────┬─────┬────────┬───────┬───────┬──────┬────────┬─────┤
│ Name │ Cluster │ Host │ Version │ JS │ Conns │ Subs │ Routes │ GWs │ Mem │ CPU % │ Cores │ Slow │ Uptime │ RTT │
├─────────┼──────────────┼──────┼─────────┼─────┼───────┼──────┼────────┼─────┼────────┼───────┼───────┼──────┼────────┼─────┤
│ node101 │ test-cluster │ 0 │ 2.10.10 │ no │ 1 │ 233 │ 12 │ 0 │ 15 MiB │ 0 │ 43 │ 0 │ 8m7s │ 2ms │
│ node103 │ test-cluster │ 0 │ 2.10.10 │ no │ 0 │ 233 │ 12 │ 0 │ 15 MiB │ 1 │ 20 │ 0 │ 6m27s │ 2ms │
│ node104 │ test-cluster │ 0 │ 2.10.10 │ no │ 0 │ 233 │ 12 │ 0 │ 14 MiB │ 0 │ 48 │ 0 │ 7m54s │ 2ms │
│ node102 │ test-cluster │ 0 │ 2.10.12 │ no │ 0 │ 233 │ 12 │ 0 │ 14 MiB │ 0 │ 56 │ 0 │ 25.71s │ 2ms │
├─────────┼──────────────┼──────┼─────────┼─────┼───────┼──────┼────────┼─────┼────────┼───────┼───────┼──────┼────────┼─────┤
│ │ 1 │ 4 │ X │ 4 │ 1 │ 932 │ │ │ 57 MiB │ │ │ 0 │ │ │
╰─────────┴──────────────┴──────┴─────────┴─────┴───────┴──────┴────────┴─────┴────────┴───────┴───────┴──────┴────────┴─────╯

╭─────────────────────────────────────────────────────────────────────────────────╮
│ Cluster Overview │
├──────────────┬────────────┬───────────────────┬───────────────────┬─────────────┤
│ Cluster │ Node Count │ Outgoing Gateways │ Incoming Gateways │ Connections │
├──────────────┼────────────┼───────────────────┼───────────────────┼─────────────┤
│ test-cluster │ 4 │ 0 │ 0 │ 1 │
├──────────────┼────────────┼───────────────────┼───────────────────┼─────────────┤
│ │ 4 │ 0 │ 0 │ 1 │
╰──────────────┴────────────┴───────────────────┴───────────────────┴─────────────╯

After the cluster had been running for a while, I shut down one of the cluster servers (node103). The nats-server logs on the remaining servers keep printing:

[ERR] Error trying to connect to route (attempt 13062): dial tcp 192.168.3.103:4248: connect: connection refused

Expected behavior

How can the routing table of a NATS cluster be refreshed dynamically when a node is removed?

Server and client version

nats-server: v2.10.10
nats --version
0.1.3

Host environment

No response

Steps to reproduce

No response

@throwbear throwbear added the defect Suspected defect such as a bug or regression label May 15, 2024
@ripienaar
Contributor

If you explicitly list all cluster members in the configuration, the server will keep trying them forever, so in that case you should edit the config and reload the servers.

The server does support learning the network topology dynamically if you list only some of the servers; in that case it should not retry forever.
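
A minimal sketch of what that might look like, assuming node101 and node102 are chosen as the seed servers (that choice is an assumption for illustration; any stable subset should behave the same way). Every server lists only the seeds and learns about the remaining members through the cluster itself:

```
# nats-server.conf on every node (sketch; the choice of seed servers is an assumption)
cluster: {
  name: test-cluster
  listen: 0.0.0.0:4248
  routes: [
    # Only the seed servers are listed explicitly.
    nats-route://192.168.3.101:4248
    nats-route://192.168.3.102:4248
  ]
}
```

With this layout node103 and node104 are learned rather than configured, so shutting one of them down should not cause the endless reconnect attempts seen in the report.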

@throwbear
Author

> If you explicitly list all cluster members in the configuration, the server will keep trying them forever, so in that case you should edit the config and reload the servers.
>
> The server does support learning the network topology dynamically if you list only some of the servers; in that case it should not retry forever.

Besides restarting the servers, are there any commands for adding or removing entries in the routing table?

@ripienaar
Contributor

You can just add a server that lists one or two routes and it will learn the topology, and removing it later will be fine. But if you list every server in the config, or remove one that is listed in the config, you need to reload.

Note that dynamic cluster adjustments aren't compatible with JetStream, so if you want to use it you need a more static setup.
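
For the fully listed configuration from the original report, retiring node103 could look roughly like the sketch below. It assumes the same file is deployed to the three remaining servers and that nothing else in the configuration changes; the reload is a configuration reload, not a restart.

```
# nats-server.conf on node101, node102 and node104 after node103 is retired (sketch)
cluster: {
  name: test-cluster
  listen: 0.0.0.0:4248
  routes: [
    nats-route://192.168.3.101:4248
    nats-route://192.168.3.102:4248
    nats-route://192.168.3.104:4248
  ]
}

# After editing, signal each remaining server to re-read its configuration:
#   nats-server --signal reload
# If several nats-server processes run on one host, a PID can be given:
#   nats-server --signal reload=<pid>
```

Once the reload has been applied on each server, the route to 192.168.3.103 is no longer configured, so the "Error trying to connect to route" messages for it should stop.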
