Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is the log level correct for ERROR 'failed to acquire reusable stream' #201

Closed
xushichangdesmond opened this issue Apr 12, 2024 · 6 comments
Assignees
Labels
bug Something isn't working enhancement New feature or request

Comments

@xushichangdesmond
Copy link

Hi

See the error 'failed to acquire reusable stream' in
pingora-core/src/connectors/mod.rs

    pub async fn reused_stream<P: Peer + Send + Sync>(&self, peer: &P) -> Option<Stream> {
        match self.connection_pool.get(&peer.reuse_hash()) {
            Some(s) => {
                debug!("find reusable stream, trying to acquire it");
                {
                    let _ = s.lock().await;
                } // wait for the idle poll to release it
                match Arc::try_unwrap(s) {
                    Ok(l) => {
                        let mut stream = l.into_inner();
                        // test_reusable_stream: we assume server would never actively send data
                        // first on an idle stream.
                        if peer.matches_fd(stream.id()) && test_reusable_stream(&mut stream) {
                            Some(stream)
                        } else {
                            None
                        }
                    }
                    Err(_) => {
                        error!("failed to acquire reusable stream");
                        None
                    }
                }
            }
            None => {
                debug!("No reusable connection found for {peer}");
                None
            }
        }
    }

If I understand correctly, this should be maybe logged on DEBUG level more appropriately instead of error, since it isn't really an error if the other end drops the connection every now and then if theres no inflight requests.

@andrewhavck
Copy link
Contributor

Makes sense, feel free to create a PR.

xushichangdesmond pushed a commit to xushichangdesmond/pingora that referenced this issue Apr 13, 2024
@xushichangdesmond
Copy link
Author

Sure, please check #205

@gideontong gideontong added the enhancement New feature or request label Apr 18, 2024
@andrewhavck
Copy link
Contributor

Sorry for the confusion, I misread above and that log is actually due to the Arc::try_unwrap as there is more than one strong reference to the Arc which shouldn't happen. This is technically a bug and that's why we log an error, it's worth investigating further though.

@xushichangdesmond
Copy link
Author

ConnectionPool::idle_poll(...) owns the lock via a OwnedMutexGuard, which holds the same arc reference of topic here(whose try_unwrap is failing) via OwnedMutexGuard::lock

after idle_poll exits, that guard is dropped and the lock released, but it is not clear to me that OwnedMutexLock::lock is released before or in atomic fashion with the release of that lock as in reused_stream(...)'s let _ = s.lock().await;

If this sounds possible, maybe we should try to reproduce this in a small test case just using the Arc<Mutex<()>> pattern together with an async use of the mutex's try_lock_owned() or otherwise

@drcaramelsyrup drcaramelsyrup added the bug Something isn't working label May 3, 2024
@andrewhavck
Copy link
Contributor

Have you been able to reproduce this issue?

@andrewhavck
Copy link
Contributor

Closing this issue as the log level seems appropriate and we're only seeing this log on machines that are in a bad state. If you're able to consistently repro please open a new issue, thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants