Rate Limits

Understanding and working with API rate limits.

Overview

Rate limits protect our infrastructure and ensure fair usage for all users. Limits are applied per API key.

Default Limits

Limit Type	Default	Description
Requests per minute	60	Maximum API calls per minute
Concurrent predictions	5	Maximum running predictions at once
Daily spending	No limit	Optional, configurable per key
Monthly spending	No limit	Optional, configurable per key

Rate Limit Headers

Every response includes headers showing your current rate limit status:

Header	Description
X-RateLimit-Limit	Maximum requests allowed per minute
X-RateLimit-Remaining	Requests remaining in current window
X-RateLimit-Reset	Unix timestamp when the limit resets
Retry-After	Seconds to wait (only on 429 errors)

Response Headers Example

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1705312800
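
Reading These Headers in Code

The headers above can be turned into numbers your client can act on. The following is a minimal sketch, not an official client: parseRateLimit is a hypothetical helper name, and it assumes you pass it the headers object of a fetch Response (or anything else with a get(name) method).

```javascript
// Hypothetical helper: extracts rate limit status from response headers.
// `headers` is assumed to expose get(name), as fetch's Headers does.
function parseRateLimit(headers, nowMs = Date.now()) {
  const num = (name) => {
    const raw = headers.get(name);
    // Treat a missing or empty header as "unknown" rather than 0.
    if (raw == null || String(raw).trim() === '') return null;
    const value = Number(raw);
    return Number.isFinite(value) ? value : null;
  };

  const limit = num('X-RateLimit-Limit');
  const remaining = num('X-RateLimit-Remaining');
  const resetAt = num('X-RateLimit-Reset'); // Unix timestamp in seconds

  // Milliseconds until the window resets, never negative.
  const msUntilReset =
    resetAt === null ? null : Math.max(0, resetAt * 1000 - nowMs);

  return { limit, remaining, resetAt, msUntilReset };
}
```

A caller might check remaining after each response and, when it reaches 0, pause for msUntilReset before sending the next request.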

Handling Rate Limits

When you exceed the rate limit, you'll receive a 429 Too Many Requests response. Implement exponential backoff to handle this gracefully:

Node.js - Exponential Backoff

async function apiRequest(url, options, maxRetries = 3) {
  let retries = 0;

  while (retries < maxRetries) {
    const response = await fetch(url, options);

    if (response.status === 429) {
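      // Retry-After arrives as a string of seconds and may be missing.
      const retryAfter = Number(response.headers.get('Retry-After'));
      const backoff = Math.pow(2, retries) * 1000;
      // Never wait less than the server asked for; fall back to backoff alone.
      const delay = Number.isFinite(retryAfter) && retryAfter > 0
        ? Math.max(retryAfter * 1000, backoff)
        : backoff;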

      console.log(`Rate limited. Retrying in ${delay}ms...`);
      await new Promise(resolve => setTimeout(resolve, delay));

      retries++;
      continue;
    }

    return response;
  }

  throw new Error('Max retries exceeded');
}

Best Practices

✓ Do

Implement exponential backoff for retries
Monitor rate limit headers in responses
Use webhooks for long-running predictions
Cache responses when possible
Batch requests where supported

✗ Don't

Retry immediately after a 429 error
Poll predictions more than once per second
Create multiple API keys to bypass limits
Ignore the Retry-After header
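
Client-Side Throttling Sketch

One way to follow the practices above is to throttle requests on the client so that 429 responses are rare in the first place. The sketch below is an illustration under assumptions, not part of the API: RequestThrottle and tryAcquire are hypothetical names, and it uses a sliding 60-second window because the server's exact windowing is not described here. It does not account for clock skew or network latency.

```javascript
// Hypothetical client-side throttle: allows at most `limit` requests
// in any `windowMs`-long span of time.
class RequestThrottle {
  constructor(limit = 60, windowMs = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.sent = []; // timestamps (ms) of requests still inside the window
  }

  // Returns 0 and records the request if it may be sent now;
  // otherwise returns how many milliseconds to wait before trying again.
  tryAcquire(nowMs = Date.now()) {
    // Drop timestamps that have left the window.
    while (this.sent.length > 0 && nowMs - this.sent[0] >= this.windowMs) {
      this.sent.shift();
    }
    if (this.sent.length < this.limit) {
      this.sent.push(nowMs);
      return 0;
    }
    // The oldest request leaves the window at sent[0] + windowMs.
    return this.sent[0] + this.windowMs - nowMs;
  }
}
```

Before each API call, a client would call tryAcquire() and, if the result is greater than 0, sleep that long and try again. The same class with limit = 1 and windowMs = 1000 can keep polling to at most once per second.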

Concurrent Prediction Limits

The concurrent limit restricts how many predictions can run simultaneously. A prediction counts against your limit from creation until completion.
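
Because a prediction holds its slot from creation until completion, a client can avoid hitting this limit by capping how many predictions it has in flight. The following is a minimal sketch of that idea; PredictionSlots is a hypothetical name, and the task you pass to run() is assumed to cover the whole lifetime of one prediction (create it, then wait until it finishes).

```javascript
// Hypothetical in-process semaphore that caps in-flight predictions.
class PredictionSlots {
  constructor(limit = 5) {
    this.limit = limit;
    this.active = 0;   // slots currently held
    this.waiters = []; // resolve callbacks of callers waiting for a slot
  }

  // Resolves once a slot is held by the caller.
  acquire() {
    if (this.active < this.limit) {
      this.active++;
      return Promise.resolve();
    }
    return new Promise((resolve) => this.waiters.push(resolve));
  }

  // Frees a slot, or hands it directly to the next waiter if there is one.
  release() {
    const next = this.waiters.shift();
    if (next) {
      next(); // slot changes owner, so `active` stays the same
    } else {
      this.active--;
    }
  }

  // Runs `task` while holding a slot; the slot is freed even if it throws.
  async run(task) {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}
```

This only coordinates work inside a single process. If several processes share one API key, they would need a shared counter, which this sketch does not provide.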

429 Concurrent Limit Response

{
  "error": {
    "type": "rate_limit_error",
    "message": "Concurrent prediction limit exceeded. 5/5 predictions running.",
    "code": "CONCURRENT_LIMIT_EXCEEDED",
    "details": {
      "limit": 5,
      "active": 5
    }
  }
}
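
Telling the Two Kinds of 429 Apart

A client may want to react differently to this response than to a requests-per-minute 429: waiting for a running prediction to finish rather than backing off on a timer. The sketch below checks the code field shown above. classify429 is a hypothetical helper, and treating every other 429 as a request-rate limit is an assumption, not documented behavior.

```javascript
// Hypothetical helper: classifies a response by status and parsed JSON body.
// Returns null when the response is not a 429 at all.
function classify429(status, body) {
  if (status !== 429) return null;

  const error = body && body.error;
  if (error && error.code === 'CONCURRENT_LIMIT_EXCEEDED') {
    return {
      kind: 'concurrent',
      limit: error.details?.limit ?? null,
      active: error.details?.active ?? null,
    };
  }

  // Assumption: any other 429 is the requests-per-minute limit.
  return { kind: 'requests' };
}
```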

Increasing Your Limits

Need higher limits? Enterprise plans include increased rate limits and dedicated support. Contact us to discuss your requirements.

Contact Sales

Next Steps

View all error codes
