Bot account hitting rate limit (again?) #5032

Open
huydhn opened this issue Mar 20, 2024 · 4 comments

huydhn commented Mar 20, 2024

We are seeing more {"message":"API rate limit exceeded for user ID 1617424. If you reach out to GitHub Support for help, please include the request ID 9E6E:0A32:178BB2:2A26F7:65FB3545.","documentation_url":"https://docs.github.com/rest/overview/rate-limits-for-the-rest-api"} errors in CI logs, e.g. https://ossci-raw-job-status.s3.amazonaws.com/log/22898061143. This is a new issue, and we need to take a closer look to find the root cause and potential fixes.

cc @clee2000 @PaliC @kit1980 @malfet

huydhn added the bug and high priority labels Mar 20, 2024

clee2000 commented Mar 20, 2024

I'm pretty sure this is suo's token in the github-status-test lambda.

PATs have a rate limit of 5000 requests per user per hour.

The limit resets every hour, so this problem will resolve itself and then possibly reappear an hour later. A look at Rockset shows that we occasionally hit 4900+ workflow jobs per hour on pytorch during peak working hours (this doesn't include the other repos, and I don't know which ones send webhooks here), but this averages to ~1600/hr over the entire week.
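To make the hourly window concrete, here is a minimal Python sketch of reading the rate-limit headers that GitHub returns on REST responses (`x-ratelimit-remaining` and `x-ratelimit-reset`). The function name and the sample values are hypothetical; this is not the lambda's actual code, only an illustration of how a caller could tell when the quota comes back.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(
    headers: Mapping[str, str], now: Optional[float] = None
) -> Optional[float]:
    """Return how long to wait before retrying, or None if quota remains.

    Assumes the standard GitHub REST headers are present:
    x-ratelimit-remaining (requests left in the current window) and
    x-ratelimit-reset (window reset time, in UTC epoch seconds).
    """
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {key.lower(): value for key, value in headers.items()}
    remaining = int(lowered["x-ratelimit-remaining"])
    if remaining > 0:
        return None
    reset_at = int(lowered["x-ratelimit-reset"])
    current = time.time() if now is None else now
    # Never return a negative wait if the window has already reset.
    return max(0.0, reset_at - current)
```

A caller could log or alert on a non-None result instead of waiting for the "API rate limit exceeded" error to show up in CI logs.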

Here are some possible solutions:

  • Move log download to the pytorchbot app (rate limit of 15000?) - pros: larger limit; cons: pytorchbot also needs the API for other things, and I'm not sure how we would know when the bot hits a rate limit
  • Add more tokens (we have at least two bot accounts whose tokens we could use instead of suo's) - cons: scales only by adding accounts
  • Reduce log downloads in general - pros: permanent decrease in the number of log downloads; cons: delayed log downloads
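As a rough illustration of the "add more tokens" option, the sketch below picks whichever token currently has the most quota left. It is only a sketch under assumptions: `TokenState`, `pick_token`, and the bot labels are hypothetical names, the bookkeeping is assumed to come from the `x-ratelimit-remaining` and `x-ratelimit-reset` response headers, and every token is assumed to have the 5000/hr PAT limit mentioned above.

```python
from dataclasses import dataclass
from typing import List, Optional

PAT_HOURLY_LIMIT = 5000  # per-user PAT limit quoted in this thread


@dataclass
class TokenState:
    name: str       # hypothetical label for the account that owns the token
    remaining: int  # last observed x-ratelimit-remaining for this token
    reset_at: int   # last observed x-ratelimit-reset (UTC epoch seconds)


def effective_remaining(token: TokenState, now: int) -> int:
    """Treat a token whose window has already reset as fully refreshed."""
    return PAT_HOURLY_LIMIT if token.reset_at <= now else token.remaining


def pick_token(tokens: List[TokenState], now: int) -> Optional[TokenState]:
    """Return the token with the most quota left, or None if all are exhausted."""
    best = max(tokens, key=lambda t: effective_remaining(t, now), default=None)
    if best is None or effective_remaining(best, now) <= 0:
        return None
    return best
```

Choosing the token with the most headroom spreads load across accounts, but as noted in the list above, total capacity still grows only by adding accounts.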


huydhn commented Mar 20, 2024

I guess we need to stop using suo's token anyway; that's not a good practice to keep. So we can do the second point and observe whether we need a new bot account?

The second approach, using our other bot accounts that already have PATs, seems easier. We don't need to go through the pytorchbot app route, which would require an OIDC connection from Vercel to our AWS account, and that is not supported at the moment (see #4789 (comment)).



clee2000 commented Apr 2, 2024

Added the ability to use more tokens in #5033.
We still need to find another bot token to add.
Other options:

  • Switch from the lambda to GitHub Actions (GHA) to take advantage of repo-level token limits (and also avoid using PATs)
  • Create another bot at the org level just for this
