Add rate limiting for API endpoints #65
Conversation
```js
// Handle Rate Limiting
if (response.status === 429) {
  const reset = response.headers.get('X-Ratelimit-Reset') ?? Date.now() + 10_000;
  // ...
}
```
TIL numeric separators are a thing in JavaScript!?
What do you think about using something like `ms` to make these magic numbers human-friendly? e.g. `ms('10s')` instead of `10_000`
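For context, the substitution would look something like this (a sketch assuming the `ms` npm package):

```js
import ms from 'ms';

// ms('10s') === 10_000, so the fallback reads as a duration rather than a raw number
const reset = response.headers.get('X-Ratelimit-Reset') ?? Date.now() + ms('10s');
```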
No strong opinion honestly; I'd generally err on the side of not sending more stuff to the client unless we used this a lot. But happy to add it in a follow-up.
Thanks for adding this, @aron!
Is there a way to easily test this? Or should we just ship it and see what happens?
Gave it a quick try in the deployment preview and it looks good! Thank you!
This adds some basic rate limiting in production for the /api/predictions/[id] endpoint. We allow a sliding window of 20 requests within 10 seconds to fetch prediction metadata. If the rate limit is hit, the client will wait 10 seconds for the window to ease up before retrying the request. This may make the query experience slower, but we need to keep requests reasonable until we can move to a client-side implementation.
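For illustration, a minimal sketch of that wait-and-retry behaviour on the client (the function name is illustrative, and we assume X-Ratelimit-Reset carries an epoch-milliseconds timestamp; the actual code in the diff may differ):

```js
// Sketch: fetch prediction metadata, waiting out the window on a 429.
async function fetchPrediction(id) {
  const response = await fetch(`/api/predictions/${id}`);
  if (response.status === 429) {
    // Assumed: the header is an epoch-ms timestamp; fall back to a 10s wait.
    const reset = Number(response.headers.get('X-Ratelimit-Reset')) || Date.now() + 10_000;
    await new Promise((resolve) => setTimeout(resolve, Math.max(0, reset - Date.now())));
    return fetchPrediction(id);
  }
  return response.json();
}
```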
We also attempt to reduce the load by introducing a backoff for the pings. Instead of pinging every 500ms indefinitely, we now slowly increase the time between requests. Looking at the distribution of traffic, most predictions take between 1 and 3 seconds, so this shouldn't have a noticeable effect; but for longer predictions (>10 seconds) we'll greatly reduce the number of requests made.
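Roughly, the backoff amounts to something like this (the growth factor, cap, and status values are illustrative assumptions, not the exact values in the diff):

```js
// Sketch: poll with a slowly growing delay instead of a fixed 500ms interval.
async function pollPrediction(id) {
  let delay = 500; // the old fixed interval becomes the starting point
  while (true) {
    const prediction = await fetchPrediction(id); // from the sketch above
    if (prediction.status === 'succeeded' || prediction.status === 'failed') {
      return prediction;
    }
    await new Promise((resolve) => setTimeout(resolve, delay));
    delay = Math.min(delay * 1.5, 5_000); // back off, capped at 5s (illustrative)
  }
}
```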