Add integration tests for upgrades that include endpoint security #4720

Open
cmacknz opened this issue May 8, 2024 · 3 comments
Labels
Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team

Comments

@cmacknz
Member

cmacknz commented May 8, 2024

We need to add upgrade integration tests that run with Endpoint Security installed. The tests must cover both the tamper-protected and the unprotected case.

As part of this work, the logging around forwarding the upgrade action to Endpoint needs to be raised to the info level. It is currently at debug level:

if h.tamperProtectionFn() {
	// Find inputs that want to receive UPGRADE action
	// Endpoint needs to receive a signed UPGRADE action in order to be able to uncontain itself
	state := h.coord.State()
	ucs := findMatchingUnitsByActionType(state, a.Type())
	if len(ucs) > 0 {
		h.log.Debugf("handlerUpgrade: proxy/dispatch action '%+v'", a)
		err := notifyUnitsOfProxiedAction(ctx, h.log, action, ucs, h.coord.PerformAction)
		h.log.Debugf("handlerUpgrade: after action dispatched '%+v', err: %v", a, err)
		if err != nil {
			return err
		}
	} else {
		// Log and continue
		h.log.Debugf("No components running for %v action type", a.Type())
	}
}

There have been several recent cases where upgrades of tamper-protected agents failed because of invalid uninstall tokens. The root cause is still unknown, but missing logging and inadequate automated testing are clearly contributing to these problems.

@cmacknz cmacknz added the Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team label May 8, 2024
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@intxgo
Contributor

intxgo commented May 8, 2024

It would also help if Agent logged its PID, at least at startup. In the endpoint logs we can see when Agent disconnects and reconnects, and we log its PID each time it connects, but right now we can only find the matching Agent logs by timestamp.

@cmacknz
Member Author

cmacknz commented May 8, 2024

We should already do that. The log lines contain "process.pid":25920 and look like:

{"log.level":"info","@timestamp":"2024-04-16T09:33:06.250Z","log.origin":{"file.name":"cmd/run.go","file.line":155},"message":"Elastic Agent started","log":{"source":"elastic-agent"},"process.pid":25920,"agent.version":"8.11.2","ecs.version":"1.6.0"}
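
To match Agent logs against the PID seen on the endpoint side, the `process.pid` field can be used as a filter. A minimal sketch, assuming one JSON object per line as in the sample above; `agentLogLine`, `linesForPID`, and `sampleLog` are hypothetical names introduced for this example.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// sampleLog is the log line quoted above.
const sampleLog = `{"log.level":"info","@timestamp":"2024-04-16T09:33:06.250Z","log.origin":{"file.name":"cmd/run.go","file.line":155},"message":"Elastic Agent started","log":{"source":"elastic-agent"},"process.pid":25920,"agent.version":"8.11.2","ecs.version":"1.6.0"}`

// agentLogLine holds the fields of interest from one Agent JSON log line.
// The keys are flat and contain dots, so the struct tags name them literally.
type agentLogLine struct {
	Timestamp string `json:"@timestamp"`
	Message   string `json:"message"`
	PID       int    `json:"process.pid"`
}

// linesForPID returns the log lines whose "process.pid" equals pid.
// Lines that are not valid JSON are skipped.
func linesForPID(logs string, pid int) ([]agentLogLine, error) {
	var out []agentLogLine
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		var l agentLogLine
		if err := json.Unmarshal(sc.Bytes(), &l); err != nil {
			continue
		}
		if l.PID == pid {
			out = append(out, l)
		}
	}
	return out, sc.Err()
}

func main() {
	matches, err := linesForPID(sampleLog, 25920)
	if err != nil {
		fmt.Println("scan error:", err)
		return
	}
	for _, m := range matches {
		fmt.Printf("%s pid=%d %s\n", m.Timestamp, m.PID, m.Message)
	}
}
```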

@ycombinator ycombinator assigned pchila and unassigned pchila May 10, 2024

4 participants