Rate Limiting Concepts

Protect your services from abuse and ensure fair resource allocation with Crate’s distributed rate limiting engine. Powered by the high-performance Token Bucket algorithm, our system provides sub-millisecond latency for limit checks.


Token Bucket Algorithm

Crate implements the Token Bucket algorithm, which keeps traffic smooth while permitting controlled bursts. This avoids the spike that simple fixed-window counters allow at window boundaries, where many clients resume at the same reset instant (a “thundering herd”).

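  • Limit: The total capacity of the bucket (e.g., 100 requests). This is also the largest burst a client can send at once.
  • Rate: The speed at which tokens are added back to the bucket (e.g., 10 per minute). This sets the sustained request rate once the bucket is empty.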
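
To make the interaction between Limit and Rate concrete, here is a minimal single-process sketch of a token bucket in Python. It is an illustration only, not Crate's implementation: the class name TokenBucket, the allow method, and the explicit now argument are all hypothetical, and a distributed engine would keep this state in shared storage rather than in memory.

```python
class TokenBucket:
    """Illustrative token bucket; names and structure are assumptions."""

    def __init__(self, limit: int, refill_per_minute: float, now: float = 0.0):
        self.limit = limit                              # bucket capacity (max burst)
        self.refill_per_second = refill_per_minute / 60.0
        self.tokens = float(limit)                      # start with a full bucket
        self.updated_at = now                           # time of the last refill

    def allow(self, now: float, cost: int = 1) -> bool:
        """Refill for the elapsed time, then try to spend `cost` tokens."""
        elapsed = max(0.0, now - self.updated_at)
        # Add tokens earned since the last call, capped at the capacity.
        self.tokens = min(float(self.limit),
                          self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With limit=100 and refill_per_minute=10, a client can send 100 requests immediately, after which roughly one further request is admitted every 6 seconds. Passing the clock in as now (in seconds) keeps the sketch deterministic; in real use it could be time.monotonic().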

Identification Strategies

Limits can be applied globally or scoped to specific users using various identifiers:

  • IP Address: The simplest form of protection against generic crawlers and bots.
  • API Keys: Enforce plan limits based on the X-API-Key header or a custom header.
  • Cookies & JWTs: Extract user IDs from session cookies or decoded JWT claims for true user-level limiting.
  • Custom Headers: Identify traffic via partner IDs, platform tags, or any other metadata.
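
The strategies above all reduce to choosing one string key per request, so that each key gets its own bucket. The following Python sketch shows one possible way to pick that key. Everything in it is an assumption made for illustration: the function name rate_limit_key, the priority order, the key prefixes, and the X-Partner-Id header are hypothetical and do not describe Crate's configuration. It also expects JWT claims that have already been signature-verified elsewhere.

```python
from typing import Mapping, Optional


def rate_limit_key(
    remote_ip: str,
    headers: Mapping[str, str],
    verified_claims: Optional[Mapping[str, str]] = None,
    partner_header: str = "X-Partner-Id",   # hypothetical custom header
) -> str:
    """Return the most specific identifier available for this request."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    # 1. A user ID from verified JWT claims gives true user-level limiting.
    if verified_claims and verified_claims.get("sub"):
        return f"user:{verified_claims['sub']}"

    # 2. An API key identifies the plan or customer.
    api_key = lowered.get("x-api-key")
    if api_key:
        return f"key:{api_key}"

    # 3. Any other metadata header, such as a partner ID.
    partner = lowered.get(partner_header.lower())
    if partner:
        return f"partner:{partner}"

    # 4. Fall back to the client IP address.
    return f"ip:{remote_ip}"
```

Prefixing each key with its type keeps an API key and an IP address that happen to share the same text from landing in the same bucket.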

Response Headers

Crate follows the widely used convention of returning X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers on every response, so clients can track their remaining quota and back off gracefully.
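
As a rough sketch of client-side backoff, the Python function below computes how long to pause before the next request. It assumes that X-RateLimit-Reset carries a Unix timestamp in seconds, which this page does not specify; some services send the number of seconds remaining instead, in which case the subtraction should be dropped. The function name seconds_to_wait is hypothetical, and headers is taken to be a plain dict with the exact header names shown.

```python
from typing import Mapping


def seconds_to_wait(headers: Mapping[str, str], now: float) -> float:
    """Return how long the client should pause before its next request."""
    # If the header is missing, assume quota is still available.
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0

    # ASSUMPTION: the reset value is an absolute Unix time in seconds.
    reset_at = float(headers.get("X-RateLimit-Reset", now))

    # Never return a negative delay if the reset time has already passed.
    return max(0.0, reset_at - now)
```

A client could call this after each response, passing time.time() as now, and sleep for the returned duration before retrying.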