Prevent Rate-Limiting
To protect system performance and availability, API requests can be rate-limited during periods of unexpectedly high load.
API rate-limiting generally happens at either the API authentication layer or the service layer.
For OAuth 2.0 API routes, the system can apply a rate limit on the number of token creation requests for an app if it detects an unexpectedly high load.
In addition to API authentication rate-limiting, some services enforce their own service-level rate-limiting. To help minimize rate-limiting, check the REST reference documentation for a batch route you can use.
If your account is rate-limited, you can’t make API requests on the rate-limited route until the number of requests is reduced. We recommend that you monitor the response codes and errors, such as 429 errors, for your API requests.
Because rate-limiting affects your API integrations, we recommend preventing it as much as possible by following these best practices.
- Don’t request more than one access token every 20 minutes: Requesting lots of tokens in a short time period, such as for every API request, doesn’t offer a performance advantage and can cause rate-limiting. Instead, consider caching the token and reusing it until it expires.
- Use the expires_in token response parameter: When you request an access token using OAuth 2.0, the expires_in response parameter returns the number of seconds for which the access token is valid. Request a new token based on that value. For example, you can refresh the access token or request a new one five minutes before the lifetime given by expires_in runs out.
Use offline scope: For web and public app integrations, offline scope allows you to refresh a token even without an active user session. With offline scope, you can renew an access token that is expired or about to expire. Offline scope works by using a valid refresh token, which has a longer lifetime than the access token.
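The token practices above can be sketched as a small cache that reuses one access token and renews it shortly before it expires. This is a minimal illustration, not the platform's own client: the TokenCache class, the fetch_token callable, and the 300-second margin are assumptions made for the example, and fetch_token stands in for whatever OAuth 2.0 token request your integration already makes.

```python
import time
from typing import Callable, Optional, Tuple

# Assumed safety margin: renew five minutes before the token expires.
REFRESH_MARGIN_SECONDS = 300


class TokenCache:
    """Caches one access token and renews it shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, int]],
        clock: Callable[[], float] = time.monotonic,
        margin: float = REFRESH_MARGIN_SECONDS,
    ) -> None:
        # fetch_token is a placeholder for your own OAuth 2.0 token request.
        # It must return (access_token, expires_in_seconds).
        self._fetch_token = fetch_token
        self._clock = clock
        self._margin = margin
        self._token: Optional[str] = None
        self._renew_at = 0.0

    def get(self) -> str:
        """Return the cached token, requesting a new one only when needed."""
        now = self._clock()
        if self._token is None or now >= self._renew_at:
            token, expires_in = self._fetch_token()
            self._token = token
            # Never schedule the renewal in the past, even for short-lived tokens.
            self._renew_at = now + max(expires_in - self._margin, 0)
        return self._token
```

Calling get() before every API request then costs one token request per token lifetime instead of one per API call.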
Retry with exponential backoff: Because you can encounter transient errors or timeouts, your API client probably contains logic to handle failures. If that logic retries continuously on failure, however, you can be rate-limited. Instead of retrying continuously, consider using exponential backoff for retries against web services or any cloud-based services.
Exponential backoff increases the wait time between successive retries exponentially. For example, the first retry happens after 2 seconds, the second retry happens 4 seconds after that, the third retry happens 8 seconds after that, and so on until a maximum wait time is reached.
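The doubling schedule described above can be sketched as follows. The names TransientError, backoff_delays, and call_with_backoff are hypothetical, and the 60-second cap and five-retry limit are assumed values chosen for illustration, not limits documented for this API.

```python
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical error type for failures that are worth retrying."""


def backoff_delays(base: float = 2.0, cap: float = 60.0, retries: int = 5) -> Iterator[float]:
    """Yield wait times that double on each retry (2, 4, 8, ...), limited to cap."""
    for attempt in range(retries):
        yield min(base * (2 ** attempt), cap)


def call_with_backoff(
    operation: Callable[[], T],
    retries: int = 5,
    base: float = 2.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on TransientError with exponential backoff."""
    delays = backoff_delays(base, cap, retries)
    while True:
        try:
            return operation()
        except TransientError:
            delay = next(delays, None)
            if delay is None:
                raise  # Retries exhausted: surface the last error to the caller.
            sleep(delay)
```

With the defaults, a call that keeps failing waits 2, 4, 8, 16, and 32 seconds between attempts and then gives up, rather than retrying in a tight loop.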
Honor the HTTP 429 error code: To find out whether you’re rate-limited, look for the HTTP 429 Too Many Requests response status code. This response status code indicates that you sent too many requests in a given amount of time and must reduce your request rate.
If you receive the 429 error code, you can use exponential backoff logic. As part of the error response, the Retry-After HTTP response header indicates how long the user agent must wait before making a follow-up request:
Retry-After: <delay-seconds>
Delay-seconds is a non-negative decimal integer that indicates how many seconds to delay after the response is received.
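One way to act on a 429 response is to read the Retry-After value and wait at least that long, falling back to a doubling delay when the header is missing. The sketch below works on a plain status code and header mapping so that it does not depend on any particular HTTP library. The function names, the 2-second base, and the 60-second cap are assumptions made for the example.

```python
from typing import Mapping, Optional


def parse_retry_after(headers: Mapping[str, str]) -> Optional[int]:
    """Return the Retry-After delay in seconds, or None if absent or not an integer."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "retry-after":
            value = value.strip()
            # Accept only a non-negative decimal integer made of ASCII digits.
            if value.isascii() and value.isdigit():
                return int(value)
            return None
    return None


def wait_before_retry(
    status: int,
    headers: Mapping[str, str],
    attempt: int,
    base: float = 2.0,
    cap: float = 60.0,
) -> Optional[float]:
    """Return the seconds to wait before retrying, or None if the status is not 429.

    attempt is the zero-based count of retries already made for this request.
    """
    if status != 429:
        return None
    server_delay = parse_retry_after(headers)
    if server_delay is not None:
        # The server stated how long to wait, so honor that value.
        return float(server_delay)
    # No usable header: fall back to an exponentially growing delay.
    return min(base * (2 ** attempt), cap)
```

For example, wait_before_retry(429, {"Retry-After": "30"}, attempt=0) returns 30.0, while a 429 with no header on the third retry (attempt=2) returns 8.0.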