Rate limits

Limits per surface

Admin API (`/api/v1/...`)

Plan	Per-shop limit	Per-token limit
Free	60/min	60/min
Starter	300/min	300/min
Pro	600/min	600/min
Agency	1,500/min	1,500/min
Enterprise	Custom	Custom

Burst tolerance: 2× steady-state for ~10 seconds.

Public endpoints (`/api/public/...`)

Higher limits to support storefront traffic patterns:

Endpoint pattern	Per shop	Per IP
Read endpoints	10,000/min	100/min
`/pixel` (event ingest)	100,000/min	1,000/min
`/quiz`, `/referral` (writes)	1,000/min	20/min

Outbound webhooks

Sumeru delivers webhooks to your endpoint at a configurable rate:

Plan	Default	Max configurable
Free	60/min	60/min
Starter	300/min	600/min
Pro	1,000/min	5,000/min
Agency	5,000/min	20,000/min
Enterprise	Custom	Custom

If your endpoint can't keep up, configure a lower rate and a larger backoff to avoid retry storms.

Headers

Every response (success or 429) includes:

X-RateLimit-Limit: 600          # Cap per minute
X-RateLimit-Remaining: 421      # Remaining in current window
X-RateLimit-Reset: 1715346000   # Unix timestamp when window resets

A 429 response additionally includes Retry-After:

HTTP/1.1 429 Too Many Requests
Retry-After: 60                 # seconds to wait
X-RateLimit-Limit: 600
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1715346000
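
The headers above can also be read proactively, before a 429 ever occurs. The following is a minimal sketch, not part of any official SDK: the helper name `msUntilSafeToSend` is hypothetical, and it assumes X-RateLimit-Reset is a Unix timestamp in seconds, as in the example response.

```javascript
// Sketch: decide how long to pause based on the rate-limit headers.
// `headers` is anything with a get(name) method, e.g. the Headers object from fetch.
function msUntilSafeToSend(headers, nowMs = Date.now()) {
  const rawRemaining = headers.get('X-RateLimit-Remaining');
  const rawReset = headers.get('X-RateLimit-Reset');

  // Headers missing: do not guess, just proceed.
  if (rawRemaining == null || rawReset == null) return 0;

  const remaining = Number(rawRemaining);
  const resetSec = Number(rawReset);

  // Headers malformed: same policy, proceed without waiting.
  if (!Number.isFinite(remaining) || !Number.isFinite(resetSec)) return 0;

  // Budget left in the current window: no need to wait.
  if (remaining > 0) return 0;

  // Budget exhausted: wait until the window resets (never a negative delay).
  return Math.max(0, resetSec * 1000 - nowMs);
}
```

A caller could sleep for the returned number of milliseconds before sending the next request, which avoids spending a request on a predictable 429.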

Back-off strategy (recommended client behavior)

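async function callWithRetry(url, options, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const res = await fetch(url, options);
    if (res.status !== 429) return res;

    // Out of attempts: don't sleep before giving up
    if (attempt === maxRetries - 1) break;

    // Honor Retry-After (seconds); fall back to 5s if missing or not a number
    const parsed = parseInt(res.headers.get('Retry-After'), 10);
    const retryAfter = Number.isNaN(parsed) ? 5 : parsed;
    const jitter = Math.random() * 1000; // up to 1s, so clients don't retry in lockstep
    await new Promise(r => setTimeout(r, retryAfter * 1000 + jitter));
  }
  throw new Error('Max retries exceeded');
}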

Don't:

Tight-loop retry on 429 (you'll just hit the limit again)
Ignore Retry-After

Do:

Honor Retry-After
Add jitter
Cap retries
Surface error to caller after max retries

What counts as a request?

Action	Counts as
Single GET / POST	1
Bulk operation	N (one per item)
Streaming endpoint	1 per chunk
Cached response (HIT)	0 (free)
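
To budget a job against a plan limit, the table above can be turned into simple arithmetic. This is an illustrative sketch only: the `kind` labels and the `requestCost` / `totalCost` helpers are hypothetical names, not part of the API.

```javascript
// Sketch: estimate how many requests a planned set of actions will count as,
// following the accounting table (single = 1, bulk = N, stream = 1 per chunk, cache hit = 0).
function requestCost(action) {
  switch (action.kind) {
    case 'single':
      return 1;
    case 'bulk':
      return action.items; // one per item
    case 'stream':
      return action.chunks; // one per chunk
    case 'cache-hit':
      return 0; // free
    default:
      throw new Error(`Unknown action kind: ${action.kind}`);
  }
}

function totalCost(actions) {
  return actions.reduce((sum, action) => sum + requestCost(action), 0);
}

// Example: 1 single call + a 100-item bulk operation + a 3-chunk stream + 1 cache hit
const planned = [
  { kind: 'single' },
  { kind: 'bulk', items: 100 },
  { kind: 'stream', chunks: 3 },
  { kind: 'cache-hit' },
];
console.log(totalCost(planned)); // 104
```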

Avoiding limits proactively

Use bulk endpoints

# Bad: 100 calls
for id in $IDS; do
  curl ".../api/v1/customers/$id"
done

# Good: 1 call
curl ".../api/v1/customers?filter[id][in]=id1,id2,...,id100"
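
The same idea in JavaScript, as a hedged sketch: it splits a long ID list into chunks and builds one filter URL per chunk. The `bulkCustomerUrls` helper, the `https://example.test` base URL, and the chunk size of 100 are assumptions for illustration; check the endpoint's documented maximum before relying on a particular size.

```javascript
// Sketch: build one bulk-filter URL per chunk of IDs instead of one URL per ID.
function bulkCustomerUrls(baseUrl, ids, chunkSize = 100) {
  const urls = [];
  for (let i = 0; i < ids.length; i += chunkSize) {
    const chunk = ids.slice(i, i + chunkSize);
    // Encode each ID individually; the commas separating IDs stay literal.
    const idList = chunk.map((id) => encodeURIComponent(id)).join(',');
    urls.push(`${baseUrl}/api/v1/customers?filter[id][in]=${idList}`);
  }
  return urls;
}

// 250 IDs become 3 requests (100 + 100 + 50) instead of 250.
const ids = Array.from({ length: 250 }, (_, i) => `cus_${i}`);
const urls = bulkCustomerUrls('https://example.test', ids); // hypothetical host
console.log(urls.length); // 3
```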

Cache reads

/api/v1/customers/cus_abc is cached at the edge for 60s. Requesting it 1,000 times within 60s results in 1 origin call and 999 cache hits, so only 1 request counts against the limit.

Use webhooks instead of polling

If you're polling to ask "any new orders?", subscribe to the order.created webhook instead. Events are pushed in real time and don't count against your API limits.

Stagger high-volume jobs

For example, spread a daily warehouse ETL over 30+ minutes rather than 30 seconds.
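
The arithmetic behind staggering is worth making explicit. The sketch below uses hypothetical helper names and an assumed job size of 12,000 requests; it compares the average rate of the same job over 30 minutes and over 30 seconds.

```javascript
// Sketch: how fast does a job run if spread evenly over a time window?
function averageRatePerMinute(totalRequests, windowMinutes) {
  return totalRequests / windowMinutes;
}

// Delay to leave between consecutive requests to spread them evenly.
function spacingMs(totalRequests, windowMinutes) {
  if (totalRequests <= 0) return 0;
  return (windowMinutes * 60 * 1000) / totalRequests;
}

// 12,000 requests over 30 minutes: 400/min, one request every 150 ms.
console.log(averageRatePerMinute(12000, 30)); // 400
console.log(spacingMs(12000, 30)); // 150

// The same 12,000 requests over 30 seconds (0.5 min): 24,000/min.
console.log(averageRatePerMinute(12000, 0.5)); // 24000
```

At 400/min the job fits under the Pro plan's 600/min Admin API limit; at 24,000/min it exceeds every listed plan limit many times over.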

Burst credits (Enterprise)

Enterprise plans get burst credits: a temporary 5× rate allowance for up to 1 hour (e.g. for an annual data export). Contact support to enable them.

Monitoring

Admin → API → Usage shows:

Requests per hour, last 7 days
429 count
Top consumers (which token, which endpoint)
Per-endpoint latency

Alerts on:

429 responses reach 10% of requests
Rate-limit usage > 80% of cap

SDK behavior

Official SDKs handle 429 automatically:

Honor Retry-After
Exponential backoff with jitter
Configurable max retries
Surface unrecoverable errors

import { SumeruES } from '@sumeru/sdk';

const sumeru = new SumeruES({
  apiKey: process.env.API_KEY,
  retries: 5,            // default
  retryDelay: 'exponential' // default
});
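
For readers not using the SDK, "exponential backoff with jitter" can be sketched as below. This is not the SDK's actual implementation: the function name, the 500 ms base, the 30 s ceiling, and the choice of "full jitter" (a random delay between 0 and the exponential ceiling) are all assumptions for illustration.

```javascript
// Sketch: delay before retry number `attempt` (0-based), using full jitter.
// `random` is injectable so the function can be tested deterministically.
function backoffDelayMs(attempt, { baseMs = 500, maxMs = 30000, random = Math.random } = {}) {
  // Exponential growth: base, 2x base, 4x base, ... capped at maxMs.
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  // Full jitter: pick a random delay in [0, ceiling).
  return Math.floor(random() * ceiling);
}

// With a fixed random value of 0.5 the delays are half of each ceiling:
const half = () => 0.5;
console.log(backoffDelayMs(0, { random: half })); // 250
console.log(backoffDelayMs(3, { random: half })); // 2000
console.log(backoffDelayMs(10, { random: half })); // 15000 (capped at maxMs)
```

When the server sends Retry-After, that value should still take precedence; a computed delay like this is a fallback for responses that lack it.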

Special cases

Atomic operations

Some endpoints (campaign launch, journey enroll) must take effect exactly once. Using the SDK together with an Idempotency-Key makes retries safe: the action is not performed twice, even when a request is retried after a 429.
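
Without the SDK, the important detail is that the same key must be reused for every retry of one logical action. A minimal sketch, assuming the key is sent as an `Idempotency-Key` request header; the helper name `withIdempotencyKey` is hypothetical.

```javascript
// Sketch: attach one idempotency key to the request options without mutating them.
function withIdempotencyKey(options, key) {
  return {
    ...options,
    headers: { ...(options.headers || {}), 'Idempotency-Key': key },
  };
}

// Usage (not executed here): generate the key ONCE, outside the retry loop,
// so every attempt made by callWithRetry carries the same key.
//
//   const key = crypto.randomUUID();
//   const res = await callWithRetry(url, withIdempotencyKey({ method: 'POST' }, key));

const opts = withIdempotencyKey({ method: 'POST', headers: { Accept: 'application/json' } }, 'key-123');
console.log(opts.headers['Idempotency-Key']); // key-123
```

Generating a fresh key inside the retry loop would defeat the purpose, because each attempt would then look like a new action.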

Long-running async

Endpoints that kick off async work (large bulk import, catalog regeneration) return 202 Accepted with a job ID. Poll /api/v1/jobs/<id> for status; the polling endpoint has its own (much higher) rate limit.
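
A polling loop for such jobs might look like the sketch below. The job response shape is not specified here, so the `status` field and its `'succeeded'` / `'failed'` values are assumptions, as are the helper name, the 2-second interval, and the poll budget. The status lookup is passed in as a function so the loop stays independent of how the request is made.

```javascript
// Sketch: poll a job until it finishes, fails, or the polling budget runs out.
// `getStatus` is a caller-supplied async function that fetches /api/v1/jobs/<id>
// and resolves to the parsed job object (assumed to have a `status` field).
async function waitForJob(getStatus, options = {}) {
  const {
    intervalMs = 2000,
    maxPolls = 150,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  for (let poll = 0; poll < maxPolls; poll++) {
    const job = await getStatus();
    if (job.status === 'succeeded') return job; // assumed terminal state
    if (job.status === 'failed') throw new Error('Job failed'); // assumed terminal state
    await sleep(intervalMs); // fixed interval keeps polling well below the limit
  }
  throw new Error('Job did not finish within the polling budget');
}
```

Capping the number of polls keeps a stuck job from polling forever, for the same reason retries are capped in the back-off strategy.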
