Usage Limits

Each API consumer has usage limits for how frequently (rate limit) and how many times a day (daily quota) it is allowed to invoke Firstbeat Cloud API endpoints. These limits are in place to prevent abuse and ensure that the API remains available for all users.

Usage limits are API consumer specific. If several customer accounts are linked to the same API consumer, the quota is shared.

Usage Monitoring

The API client must track the rate limit and daily quota itself and implement whatever throttling it needs. Cloud API currently offers no way to query the remaining quota, and responses carry no headers reporting the remaining quota or the time of the next allowed call.
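
Since the remaining quota cannot be queried, the client has to count its own requests. The following is a minimal sketch of such a counter, assuming the documented default of 5000 requests per day and the midnight UTC reset. `DailyQuotaCounter` and its methods are hypothetical names, not part of Cloud API, and the count drifts from the server's view if several clients share one API consumer.

```python
from datetime import datetime, timezone


class DailyQuotaCounter:
    """Counts requests per UTC day (hypothetical client-side helper)."""

    def __init__(self, daily_quota=5000, utc_now=None):
        self.daily_quota = daily_quota
        # utc_now is injectable so the counter can be tested without waiting.
        self._utc_now = utc_now or (lambda: datetime.now(timezone.utc))
        self._day = self._utc_now().date()
        self._used = 0

    def _roll_over(self):
        # Start a fresh count when the UTC date changes.
        today = self._utc_now().date()
        if today != self._day:
            self._day = today
            self._used = 0

    def remaining(self):
        self._roll_over()
        return self.daily_quota - self._used

    def record_request(self):
        """Call once per request; returns False if the quota is already spent."""
        self._roll_over()
        if self._used >= self.daily_quota:
            return False
        self._used += 1
        return True
```

A client would call `record_request()` before each API call and skip the call when it returns `False`.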

When the rate limit or daily quota is exceeded, the API returns HTTP status code 429. The response includes an `x-amzn-ErrorType` header that indicates the cause:

| Header value | Cause |
| --- | --- |
| `LimitExceededException` | Daily quota exceeded |
| `ThrottledException` | Per-minute rate limit exceeded |
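
A client can branch on this header to decide how to react. The sketch below assumes the response headers are available as a plain mapping of names to values. `classify_429` and its return labels are hypothetical. HTTP header names are case-insensitive, so the lookup normalizes them first.

```python
def classify_429(status_code, headers):
    """Map a 429 response to the documented cause (hypothetical helper)."""
    if status_code != 429:
        return None
    # Header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    error_type = lowered.get("x-amzn-errortype", "").strip()
    if error_type == "LimitExceededException":
        return "daily_quota"
    if error_type == "ThrottledException":
        return "rate_limit"
    return "unknown"
```

A `rate_limit` result is worth retrying after a short wait, whereas a `daily_quota` result will not clear until the quota resets.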

Rate Limits and Quotas

Currently, the default limits are as follows:

| Usage Limit | Value | Description |
| --- | --- | --- |
| Rate Limit | 1 request per second | Token bucket refilling at 1 token/sec |
| Burst | 60 requests / minute | Token bucket with 60 token capacity |
| Daily Quota | 5000 requests / day | Fixed window that resets every day at 00:00:00 (midnight) UTC |
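
These defaults interact: a client running at the full sustained rate spends the daily quota long before the day ends. The arithmetic below is an illustrative sketch using only the default values above. The constant and function names are hypothetical, and `seconds_until_quota_reset` expects a timezone-aware `datetime`.

```python
from datetime import datetime, timedelta, timezone

DAILY_QUOTA = 5000            # requests per UTC day (documented default)
REFILL_PER_SEC = 1.0          # sustained rate limit (documented default)
SECONDS_PER_DAY = 24 * 60 * 60

# At the full sustained rate the quota lasts 5000 s, a little over 83 minutes.
seconds_to_exhaust = DAILY_QUOTA / REFILL_PER_SEC

# Spreading the quota evenly over 24 hours allows one request every 17.28 s.
even_spacing = SECONDS_PER_DAY / DAILY_QUOTA


def seconds_until_quota_reset(now):
    """Seconds from `now` (timezone-aware) until the next midnight UTC."""
    now = now.astimezone(timezone.utc)
    midnight_today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return (midnight_today + timedelta(days=1) - now).total_seconds()
```

For example, `seconds_until_quota_reset(datetime.now(timezone.utc))` gives the wait time after a daily-quota 429.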

Exceeding Usage Limits

If the API consumer does not stay within the usage limits, requests are throttled and denied access to the resources (HTTP 429 responses). If, despite being cautioned by Firstbeat, the API consumer continues to misuse the API in ways that stress the service excessively, Firstbeat reserves the right to disable the consumer's access permanently.

Custom Limits

API usage limits can be changed on request. For example, if you have many customer accounts (teams) under your API consumer, you may need a higher daily quota. Contact Firstbeat support to discuss custom limits.

Throttling Mechanism

The API uses a token bucket algorithm for rate limiting. This mechanism works as follows:

- A "bucket" holds tokens, with each token representing permission for one API request
- The bucket has a maximum capacity (the burst limit)
- Tokens are added to the bucket at a steady rate (the refill rate)
- Each API request consumes one token from the bucket
- If the bucket is empty when a request arrives, the request is throttled
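
The steps above can be modelled in a few lines. This is an illustrative sketch of the algorithm as described here, using the default capacity of 60 and refill rate of 1 token per second. It is not Firstbeat's server implementation, and `TokenBucket` is a hypothetical name. Time is passed in explicitly, in seconds, so the model can be stepped through by hand.

```python
class TokenBucket:
    """Minimal token bucket model (illustrative only)."""

    def __init__(self, capacity=60, refill_per_sec=1.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)   # the bucket starts full
        self.last = 0.0                 # time of the previous request

    def allow(self, now):
        """Return True if a request arriving at time `now` gets a token."""
        # Add the tokens earned since the last request, up to the capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1            # each request consumes one token
            return True
        return False                    # empty bucket: request is throttled
```

With a fresh `TokenBucket()`, sixty calls to `allow(0.0)` succeed, the sixty-first fails, and `allow(1.0)` succeeds again because one token has refilled.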

Key characteristics of the token bucket approach:

- Burst handling: If the bucket is full, you can make a burst of requests up to the bucket capacity
- Smooth refill: Tokens refill gradually over time, allowing for sustained API usage
- Fair allocation: Prevents any single consumer from monopolizing API resources

When the rate limit is exceeded, the API will respond with HTTP status code 429 (Too Many Requests). Your client should implement exponential backoff and retry logic to handle these responses gracefully.
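
One way to implement the retry logic is sketched below. It assumes a hypothetical zero-argument callable `send` that performs the request and returns a `(status_code, body)` tuple. The function name, its parameters, and the delay values are illustrative choices, not Firstbeat recommendations. The sketch retries every 429, so a caller that can tell a daily-quota 429 apart should stop early, because that limit does not clear until the quota resets.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # out of attempts: hand the last 429 back to the caller
        # Double the delay on each retry, cap it, and add random jitter so
        # that several clients do not retry in lockstep.
        delay = min(max_delay, base_delay * 2 ** attempt)
        sleep(delay + random.uniform(0, delay / 2))
    return status, body
```

The `sleep` parameter exists only so the waiting can be replaced in tests.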

Examples

Here's how the token bucket works in practice with the default limits (a 60-token bucket that refills at 1 token per second):

Scenario 1: Burst requests

Time 00:00 - Bucket starts full with 60 tokens
         ↓
Make 60 requests immediately → All succeed (bucket now empty)
         ↓
Make 1 more request → HTTP 429 (throttled - no tokens available)
         ↓
Wait 1 second → 1 token refills
         ↓
Make 1 request → Success (consumes the refilled token)

Scenario 2: Sustained usage

Time 00:00 - Bucket has 60 tokens
         ↓
Make 30 requests → 30 tokens consumed (30 tokens remain)
         ↓
Wait 30 seconds → 30 tokens refilled (bucket now full at 60)
         ↓
Make 1 request per second for 60 seconds → All succeed
(consuming 1 token/sec while 1 token/sec refills)

Scenario 3: Optimal client behavior

To maximize throughput without hitting rate limits:
- Make requests at ≤1 request/second for sustained operations
- For batch operations, send up to 60 requests in a burst
- After a full burst, wait 60 seconds for the bucket to refill before the next burst
- Or use a pacing strategy: request → wait 1 second → repeat
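
The pacing strategy can be sketched as a small helper that enforces a minimum gap between requests. `Pacer` is a hypothetical name, and the 1-second default mirrors the default refill rate. The `clock` and `sleep` parameters are injectable only to make the helper testable.

```python
import time


class Pacer:
    """Spaces calls at least min_interval seconds apart (illustrative)."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None   # no request has been made yet

    def wait(self):
        """Block until the next request may be sent, then reserve the slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._clock()
        self._next_allowed = now + self.min_interval
```

Calling `pacer.wait()` immediately before each request keeps a single client at or below one request per second. It does not account for the daily quota or for other clients sharing the same API consumer.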