Rate limit

Definition

Maximum number of requests a client can make per unit of time against an API; protects the service and prevents abuse.

Rate limiting is a policy defining how many requests an API accepts per client and per time window. It prevents abuse, mitigates denial-of-service attacks, and ensures stable performance for all consumers. APIs typically expose rate limit state in response headers: - `X-RateLimit-Limit`: total quota in the current window. - `X-RateLimit-Remaining`: requests remaining before throttling. - `X-RateLimit-Reset`: Unix timestamp at which the counter resets. When the limit is exceeded, the API responds with `429 Too Many Requests` and a `Retry-After` header (seconds to wait before retrying). Well-implemented clients use exponential backoff and respect that header. At Normadata, rate limits apply **per API key**, not per IP. Each plan has its quota: the early access plan allows generous bursts so integrators can experiment without friction. The `X-RateLimit-*` headers are present on every response, so the client can anticipate throttling and self-regulate its requests. Best practices: pipeline requests, cache idempotent responses, and monitor `X-RateLimit-Remaining` to avoid production outages.

Related terms