Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix flaky TestCredentialsCache #41680

Merged
merged 2 commits into from
May 21, 2024

Conversation

nklaassen
Copy link
Contributor

@nklaassen nklaassen commented May 17, 2024

Fixes #40619

This commit fixes the flakiness in TestCredentialsCache (for real this time).

The cache runs a loop in a goroutine that waits on a ticker and refreshes credentials as necessary based on some criteria. This is a design choice that makes the credential cache able to tolerate brief outages of the Teleport Proxy or AWS STS even if they happen to coincide with the time the credential would usually be refreshed.

The test repeatedly advances a fake clock and makes sure that the credential cache is able/not able to get creds as expected. The flakiness arises when the testcase advances the clock before the run loop gets a chance to create the ticker and wait on it. This is solved by waiting to advance the clock until there is at least one goroutine waiting on the clock.

The test now passes 50000 times in a docker container with 0.25 cpus and 100000 times on my laptop. Without the fix it usually failed within around 5000 runs.

This commit fixes the flakiness in TestCredentialsCache (for real this
time).

The cache runs a loop in a goroutine that waits on a ticker and
refreshes credentials as necessary based on some criteria. This is a
design choice that makes the credential cache able to tolerate brief
outages of the Teleport Proxy or AWS STS even if they happen to coincide
with the time the credential would usually be refreshed.

The test repeatedly advances a fake clock and makes sure that the
credential cache is able/not able to get creds as expected. The
flakiness arises when the testcase advances the clock before the run
loop gets a chance to create the ticker and wait on it. This is solved
by waiting to advance the clock until there is at least one goroutine
waiting on the clock.

The test now passes 20000 times in a docker container with 0.25 cpus.
Without the fix it usually failed after around 5000 runs.
@nklaassen nklaassen added backport/branch/v14 no-changelog Indicates that a PR does not require a changelog entry backport/branch/v15 labels May 17, 2024
@github-actions github-actions bot requested review from jakule and justinas May 17, 2024 02:40
@public-teleport-github-review-bot public-teleport-github-review-bot bot removed the request for review from justinas May 18, 2024 16:30
@nklaassen nklaassen added this pull request to the merge queue May 21, 2024
Merged via the queue into master with commit b3ce581 May 21, 2024
38 checks passed
@nklaassen nklaassen deleted the nklaassen/fix-flaky-testcredentialscache branch May 21, 2024 18:56
@public-teleport-github-review-bot

@nklaassen See the table below for backport results.

Branch Result
branch/v14 Create PR
branch/v15 Create PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/branch/v14 backport/branch/v15 no-changelog Indicates that a PR does not require a changelog entry size/sm
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TestCredentialsCache flakiness
3 participants