Usage Limits
Each API consumer has usage limits for how frequently (rate limit) and how many times a day (daily quota) it is allowed to invoke Firstbeat Cloud API endpoints. These limits are in place to prevent abuse and ensure that the API remains available for all users.
Usage limits are API consumer specific. If several customer accounts are linked to the same API consumer, the quota is shared.
Usage Monitoring
The API client must track the rate limit and daily quota on the client side and implement the required throttling itself. Cloud API currently offers no way to query the remaining quota, and response headers include neither the remaining quota nor the time of the next allowed call.
When the rate limit or daily quota is exceeded, the API returns HTTP status code 429. The response includes an x-amzn-ErrorType header that indicates the cause:
| Header value | Cause |
|---|---|
| LimitExceededException | Daily quota exceeded |
| ThrottledException | Per-minute rate limit exceeded |
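A client can branch on this header to decide whether a short wait is worthwhile. The following is a minimal sketch, not an official client: the function name `classify_429` and its return labels are illustrative assumptions, and it only inspects the status code and header values described above.

```python
from typing import Mapping, Optional


def classify_429(status_code: int, headers: Mapping[str, str]) -> Optional[str]:
    """Map a response to 'daily_quota', 'rate_limit', 'unknown', or None.

    None means the response was not a 429 at all. The labels are
    hypothetical names chosen for this sketch.
    """
    if status_code != 429:
        return None
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    error_type = normalized.get("x-amzn-errortype", "").strip()
    if error_type == "LimitExceededException":
        return "daily_quota"   # waiting a few seconds will not help
    if error_type == "ThrottledException":
        return "rate_limit"    # a short wait should free up a token
    return "unknown"
```

A `rate_limit` result suggests retrying after a short delay, while a `daily_quota` result suggests pausing until the quota resets.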
Rate Limits and Quotas
Currently, the default limits are as follows:
| Usage Limit | Value | Description |
|---|---|---|
| Rate Limit | 1 request / second | Token bucket refilling at 1 token/sec |
| Burst | 60 requests / minute | Token bucket with 60 token capacity |
| Daily Quota | 5000 requests / day | Fixed window that resets every day at 00:00:00 (midnight) UTC |
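Because the API does not report the remaining quota, the client has to count requests itself. The sketch below shows one possible way to mirror the fixed daily window locally. The class name `DailyQuotaTracker` and its methods are hypothetical, and the counter only knows about requests sent by this process, so it can drift from the server-side count if several clients share the same API consumer.

```python
from datetime import date, datetime, timezone
from typing import Optional


class DailyQuotaTracker:
    """Local counter that resets when the UTC calendar date changes."""

    def __init__(self, daily_limit: int = 5000) -> None:
        self.daily_limit = daily_limit
        self._day: Optional[date] = None
        self._used = 0

    def _roll_over(self, now: datetime) -> None:
        # `now` should be timezone-aware; it is converted to UTC here.
        today = now.astimezone(timezone.utc).date()
        if today != self._day:
            self._day = today
            self._used = 0

    def remaining(self, now: Optional[datetime] = None) -> int:
        self._roll_over(now or datetime.now(timezone.utc))
        return self.daily_limit - self._used

    def try_consume(self, now: Optional[datetime] = None) -> bool:
        """Record one request; return False if the local count is exhausted."""
        self._roll_over(now or datetime.now(timezone.utc))
        if self._used >= self.daily_limit:
            return False
        self._used += 1
        return True
```

Calling `try_consume()` before each request and skipping the call when it returns `False` keeps the client from spending requests that would only produce HTTP 429 responses.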
Exceeding Usage Limits
If the API consumer does not follow the usage limits, requests will be throttled and denied access to the resources (HTTP 429 responses). If, despite being cautioned by Firstbeat, the API consumer continues to misuse the API in ways that stress the service excessively, Firstbeat reserves the right to disable the consumer's access permanently.
Custom Limits
API usage limits can be changed on request. For example, if you have many customer accounts (teams) under your API consumer, you may need a higher daily quota. Contact Firstbeat support to discuss custom rate limits.
Throttling Mechanism
The API uses a token bucket algorithm for rate limiting. This mechanism works as follows:
- A "bucket" holds tokens, with each token representing permission for one API request
- The bucket has a maximum capacity (the burst limit)
- Tokens are added to the bucket at a steady rate (refill rate)
- Each API request consumes one token from the bucket
- If the bucket is empty when a request arrives, the request is throttled
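The steps above can be modeled in a few lines. This is an illustrative sketch of a generic token bucket, assuming the default capacity of 60 tokens and a refill of 1 token per second. It is not Firstbeat's server implementation, and the class name `TokenBucket` is hypothetical. Time is passed in explicitly (in seconds) so the behavior is deterministic.

```python
class TokenBucket:
    """Generic token bucket with continuous refill."""

    def __init__(self, capacity: float = 60.0, refill_per_sec: float = 1.0) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity  # the bucket starts full
        self.last = 0.0         # time of the last refill, in seconds

    def _refill(self, now: float) -> None:
        # Add tokens for the elapsed time, never exceeding the capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now

    def try_acquire(self, now: float) -> bool:
        """Consume one token if available; False means 'throttled'."""
        self._refill(now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

For example, 60 calls to `try_acquire(0.0)` on a fresh bucket all return `True`, a 61st call at the same instant returns `False`, and a call at `now=1.0` succeeds again because one token has refilled. The same model can run on the client side to pace outgoing requests.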
Key characteristics of the token bucket approach:
- Burst handling: If the bucket is full, you can make a burst of requests up to the bucket capacity
- Smooth refill: Tokens refill gradually over time, allowing for sustained API usage
- Fair allocation: Prevents any single consumer from monopolizing API resources
When the rate limit is exceeded, the API will respond with HTTP status code 429 (Too Many Requests). Your client should implement exponential backoff and retry logic to handle these responses gracefully.
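One way to structure such retry logic is sketched below. The function `call_with_backoff` and its parameters are assumptions made for illustration. `send` stands for any caller-supplied function that performs one request and returns a `(status_code, headers)` pair. The sketch stops retrying immediately when the header indicates an exhausted daily quota, since a short wait cannot help in that case.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it stops returning a retryable 429."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status, headers
        # Header names are case-insensitive, so normalize before lookup.
        normalized = {k.lower(): v for k, v in headers.items()}
        if normalized.get("x-amzn-errortype") == "LimitExceededException":
            return status, headers  # daily quota: retrying soon is pointless
        if attempt < max_attempts - 1:
            # Double the delay on each attempt, capped at max_delay,
            # and add random jitter so clients do not retry in lockstep.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay + random.uniform(0, delay / 2))
    return status, headers
```

With the default arguments the waits are roughly 1, 2, 4, and 8 seconds plus jitter. The `sleep` parameter exists only so the delay can be replaced in tests.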
Examples
Here's how the token bucket works in practice with the default limits (a bucket capacity of 60 tokens, refilling at 1 token per second):
Scenario 1: Burst requests
Time 00:00 - Bucket starts full with 60 tokens
↓
Make 60 requests immediately → All succeed (bucket now empty)
↓
Make 1 more request → HTTP 429 (throttled - no tokens available)
↓
Wait 1 second → 1 token refills
↓
Make 1 request → Success (consumes the refilled token)
Scenario 2: Sustained usage
Time 00:00 - Bucket has 60 tokens
↓
Make 30 requests → 30 tokens consumed (30 tokens remain)
↓
Wait 30 seconds → 30 tokens refilled (bucket now full at 60)
↓
Make 1 request per second for 60 seconds → All succeed
(consuming 1 token/sec while 1 token/sec refills)
Scenario 3: Optimal client behavior