Problem with metrics in daemonize (-d) mode #2279

Open
serge-r opened this issue May 2, 2024 · 4 comments

Labels
Good First Issue SEO for Github Search Linux Linux-related issue

Comments

serge-r commented May 2, 2024

Hello, I have found a problem with ZeroTier metrics when the zerotier-one daemon runs with the -d flag.

Here is my current configuration:

ZeroTier Version: 1.12.2
OS: AL2 4.14.336-257.562.amzn2.x86_64

ZeroTier was installed via the standard method:

curl -s https://install.zerotier.com | sudo bash
systemctl start zerotier-one

The metrics file is empty:

[root@ip-10-111-248-242 ~]# ls -lah /var/lib/zerotier-one/metrics*
-rwxrwxrwx 1 zerotier-one zerotier-one  0 May  2 13:47 /var/lib/zerotier-one/metrics.prom
-rw------- 1 zerotier-one zerotier-one 24 Mar 21 07:53 /var/lib/zerotier-one/metricstoken.secret

You can also reproduce this issue simply by running /usr/sbin/zerotier-one -d

Upon investigating further, I found that on AL2 you use an init.d script that runs zerotier-one with the -d flag. With this flag, zerotier-one calls fork() and the parent process then exits, as seen here:

long p = (long)fork();

When the parent process exits, the destructor of the SaveToFile class runs. It is defined here:

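The destructor sets must_die to true, which stops the worker_function thread in the parent. Note that fork() does not share memory between the two processes: the child gets its own copy, and only the thread that called fork() is duplicated there. So a worker_function thread started before the fork most likely never runs in the daemonized child, and no metrics are written.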
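To check how a thread started before fork() behaves in the child, here is a minimal standalone sketch. It is not ZeroTier code: the function name `worker_runs_in_child` and the tick counter are made up for illustration. It relies on the POSIX rule that the child of a multithreaded process is created with a single thread, the one that called fork().

```cpp
#include <atomic>
#include <thread>
#include <chrono>
#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not part of ZeroTier): returns true only if a worker
// thread started before fork() is still making progress inside the child.
// Also returns false if fork() itself fails.
bool worker_runs_in_child()
{
    std::atomic<bool> stop{false};
    std::atomic<long> ticks{0};

    // Worker started BEFORE fork(), like a background writer thread.
    std::thread worker([&] {
        while (!stop.load()) {
            ticks.fetch_add(1);
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    });

    // Wait until the worker has demonstrably started.
    while (ticks.load() == 0) {
        std::this_thread::yield();
    }

    pid_t pid = fork();
    if (pid == 0) {
        // Child: works on its own copy of `ticks`.
        long before = ticks.load();
        poll(nullptr, 0, 200);          // sleep ~200 ms
        long after = ticks.load();
        _exit(after != before ? 1 : 0); // 1 = counter moved, worker is alive
    }

    int status = 0;
    if (pid > 0) {
        waitpid(pid, &status, 0);
    }

    // Parent: the worker is still running here and must be stopped and joined.
    stop.store(true);
    worker.join();

    return pid > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 1;
}
```

On Linux I would expect this to return false: the counter keeps moving in the parent, but the child's copy stays frozen because no worker thread exists there.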

Here is an example of a GDB session:

gdb --args zerotier-one -d
GNU gdb (GDB) Red Hat Enterprise Linux 8.0.1-36.amzn2.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from zerotier-one...(no debugging symbols found)...done.
(gdb) b prometheus::SaveToFile::~SaveToFile()
Breakpoint 1 at 0x4e3ff0
(gdb) run
Starting program: /usr/sbin/zerotier-one -d
[New LWP 32438]
Detaching after fork from child process 32439.

Thread 1 "zerotier-one" hit Breakpoint 1, 0x00000000004e3ff0 in prometheus::SaveToFile::~SaveToFile() ()
(gdb) Starting Control Plane...
Starting V6 Control Plane...
info threads
  Id   Target Id         Frame
* 1    LWP 32434 "zerotier-one" 0x00000000004e3ff0 in prometheus::SaveToFile::~SaveToFile() ()
  2    LWP 32438 "zerotier-one" 0x0000000000e7df06 in sccp ()
(gdb) c
Continuing.
[LWP 32438 exited]
[Inferior 1 (process 32434) exited normally]
(gdb)
The program is not being run.
(gdb)
The program is not being run.
(gdb) b prometheus::SaveToFile::save_data()
Breakpoint 2 at 0x4e6eb0
(gdb) attach 32439
Attaching to program: /usr/sbin/zerotier-one, process 32439
[New LWP 32440]
[New LWP 32441]
[New LWP 32442]
[New LWP 32443]
[New LWP 32444]
[New LWP 32445]
[New LWP 32446]
[New LWP 32447]
[New LWP 32448]
[New LWP 32449]
[New LWP 32450]
[New LWP 32451]
[New LWP 32452]
[New LWP 32453]
[New LWP 32454]
[New LWP 32455]
[New LWP 32456]
[New LWP 32461]
[New LWP 32462]
[New LWP 32463]
[New LWP 32464]
[New LWP 32465]
[New LWP 32466]
0x0000000000e7df06 in sccp ()
(gdb)
(gdb) c
Continuing.
# GDB waits here indefinitely; breakpoint 2 is never hit


# list of running processes during the GDB session
[root@ip-10-111-248-242 ~]# ps aux | grep zero
root      1296  0.0  0.0 119392   956 pts/1    S+   14:06   0:00 grep --color=auto zero
root     32418  0.0  1.0 197108 42500 pts/0    S+   13:47   0:00 gdb --args zerotier-one -d
root     32434  0.0  0.0  16220  2488 pts/0    tl   13:47   0:00 /usr/sbin/zerotier-one -d
zerotie+ 32439  1.8  0.3  49292 12872 pts/0    Sl   13:47   0:22 /usr/sbin/zerotier-one -d

As a result, prometheus::SaveToFile::save_data() is never called in this case.
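One possible direction, sketched under assumptions rather than taken from the ZeroTier or prometheus code: do not start the writer thread in the constructor, and start it explicitly from the process that keeps running after daemonizing. The class `DeferredFileSaver`, its methods, the 50 ms interval, and the placeholder file content are all hypothetical.

```cpp
#include <atomic>
#include <chrono>
#include <fstream>
#include <string>
#include <thread>
#include <utility>

// Hypothetical class, loosely modeled on the behavior described above.
// Constructing it does NOT start a thread, so it is safe to create before fork().
class DeferredFileSaver {
public:
    explicit DeferredFileSaver(std::string path) : path_(std::move(path)) {}
    ~DeferredFileSaver() { stop(); }

    // Call this after fork(), in the process that will keep running.
    void start()
    {
        if (worker_.joinable()) {
            return; // already running
        }
        must_die_.store(false);
        worker_ = std::thread([this] {
            while (!must_die_.load()) {
                save_data();
                std::this_thread::sleep_for(std::chrono::milliseconds(50));
            }
        });
    }

    // Ask the worker to finish and wait for it.
    void stop()
    {
        must_die_.store(true);
        if (worker_.joinable()) {
            worker_.join();
        }
    }

    long saves() const { return saves_.load(); }

private:
    void save_data()
    {
        std::ofstream out(path_, std::ios::trunc);
        out << "# placeholder metrics\n"; // real code would serialize the registry
        saves_.fetch_add(1);
    }

    std::string path_;
    std::atomic<bool> must_die_{false};
    std::atomic<long> saves_{0};
    std::thread worker_;
};
```

With this layout, the parent's destructor at exit only joins a thread if start() was ever called in the parent, and the child calls start() itself once the fork is done.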

Contributor

laduke commented May 2, 2024

Does the http endpoint still work?

Author

serge-r commented May 2, 2024

Does the http endpoint still work?

It doesn't work. The request returns 200 with an empty body:

[root@ip-10-111-248-242 ~]# curl -I -XGET -H "X-ZT1-Auth: $(sudo cat /var/lib/zerotier-one/metricstoken.secret)" http://localhost:9993/metrics
HTTP/1.1 200 OK
Content-Length: 0
Content-Type: text/plain
Keep-Alive: timeout=5, max=5

If I understand correctly, that is because the metrics.prom file is the source for the HTTP endpoint's metrics:

auto metricsGet = [this](const httplib::Request &req, httplib::Response &res) {
    std::string statspath = _homePath + ZT_PATH_SEPARATOR + "metrics.prom";
    std::string metrics;
    if (OSUtils::readFile(statspath.c_str(), metrics)) {
        res.set_content(metrics, "text/plain");
    } else {
        res.set_content("{}", "application/json");
        res.status = 500;
    }
};
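The handler's logic explains the 200 response with Content-Length: 0. A standalone sketch of the same branch follows, with assumptions spelled out: `Response`, `read_file`, and `metrics_response` are hypothetical stand-ins for httplib::Response, OSUtils::readFile, and the lambda, and `read_file` is assumed to report success for a file that exists but is empty, which matches the curl output shown.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the parts of httplib::Response used by the handler.
struct Response {
    int status;
    std::string body;
    std::string content_type;
};

// Hypothetical stand-in for OSUtils::readFile: true if the file could be opened,
// even when it contains zero bytes (assumption, consistent with the curl output).
bool read_file(const std::string &path, std::string &out)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return false;
    }
    std::ostringstream buf;
    buf << in.rdbuf();
    out = buf.str();
    return true;
}

// Same branching as the metricsGet lambda, minus the HTTP plumbing.
Response metrics_response(const std::string &statspath)
{
    std::string metrics;
    if (read_file(statspath, metrics)) {
        return {200, metrics, "text/plain"};  // empty file -> 200 with empty body
    }
    return {500, "{}", "application/json"};   // unreadable file -> 500
}
```

So an empty metrics.prom is served as a successful but empty response, and only a missing or unreadable file produces the 500.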

@laduke laduke added Linux Linux-related issue Good First Issue SEO for Github Search labels May 2, 2024
Contributor

laduke commented May 2, 2024

Hmm, does this mean it doesn't work on any type of Red Hat?

Author

serge-r commented May 2, 2024

Probably. I don't have any other Red Hat nodes to test on.
On Ubuntu VMs, we don't have this problem.
