Rate limits
Limits per surface
Admin API (/api/v1/...)
| Plan | Per-shop limit | Per-token limit |
|---|---|---|
| Free | 60/min | 60/min |
| Starter | 300/min | 300/min |
| Pro | 600/min | 600/min |
| Agency | 1,500/min | 1,500/min |
| Enterprise | Custom | Custom |
Burst tolerance: 2× steady-state for ~10 seconds.
Public endpoints (/api/public/...)
Higher limits to support storefront traffic patterns:
| Endpoint pattern | Per shop | Per IP |
|---|---|---|
| Read endpoints | 10,000/min | 100/min |
| /pixel (event ingest) | 100,000/min | 1,000/min |
| /quiz, /referral (writes) | 1,000/min | 20/min |
Outbound webhooks
Sumeru sends to your endpoint at a configurable rate:
| Plan | Default | Max configurable |
|---|---|---|
| Free | 60/min | 60/min |
| Starter | 300/min | 600/min |
| Pro | 1,000/min | 5,000/min |
| Agency | 5,000/min | 20,000/min |
| Enterprise | Custom | Custom |
If your endpoint can't keep up, configure a lower rate and a larger backoff to avoid retry storms.
Headers
Every response (success or 429) includes:
X-RateLimit-Limit: 600 # Cap per minute
X-RateLimit-Remaining: 421 # Remaining in current window
X-RateLimit-Reset: 1715346000 # Unix timestamp when window resets
When the response is a 429:
HTTP/1.1 429 Too Many Requests
Retry-After: 60 # seconds to wait
X-RateLimit-Limit: 600
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1715346000
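One way to use these headers is to slow down before the window runs out instead of waiting for a 429. The sketch below assumes the header names shown above; `readRateLimit`, `msUntilReset`, and the safety margin of 5 remaining requests are hypothetical choices for illustration, not part of the API.

```javascript
// Parse the rate-limit headers from a fetch Response-like object.
// Anything exposing headers.get(name) works here.
function readRateLimit(res) {
  const num = (name) => {
    const raw = res.headers.get(name);
    if (raw == null || String(raw).trim() === '') return null;
    const n = Number(raw);
    return Number.isFinite(n) ? n : null;
  };
  return {
    limit: num('X-RateLimit-Limit'),
    remaining: num('X-RateLimit-Remaining'),
    resetAt: num('X-RateLimit-Reset'), // Unix timestamp in seconds
  };
}

// Milliseconds to wait before the next call. Returns 0 while the remaining
// budget is above minRemaining (an arbitrary margin), else the time to reset.
function msUntilReset(rate, nowMs = Date.now(), minRemaining = 5) {
  if (rate.remaining === null || rate.resetAt === null) return 0;
  if (rate.remaining > minRemaining) return 0;
  return Math.max(0, rate.resetAt * 1000 - nowMs);
}
```

A caller could sleep for `msUntilReset(readRateLimit(res))` milliseconds after each response before issuing the next request.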
Back-off strategy (recommended client behavior)
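async function callWithRetry(url, options, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const res = await fetch(url, options);
    if (res.status !== 429) return res;
    // Out of attempts: stop here rather than sleeping before the throw
    if (attempt === maxRetries - 1) break;
    // Honor Retry-After (seconds); fall back to 5s if missing or not a number
    const parsed = parseInt(res.headers.get('Retry-After') ?? '', 10);
    const retryAfter = Number.isNaN(parsed) ? 5 : parsed;
    // Add up to 1s of jitter so parallel clients don't retry in lockstep
    const jitter = Math.random() * 1000;
    await new Promise(r => setTimeout(r, retryAfter * 1000 + jitter));
  }
  throw new Error('Max retries exceeded');
}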
Don't:
- Tight-loop retry on 429 (you'll just hit the limit again)
- Ignore Retry-After
Do:
- Honor Retry-After
- Add jitter
- Cap retries
- Surface the error to the caller after max retries
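The delay calculation behind these rules can be isolated into a small helper. This is a minimal sketch, not the SDK's implementation: `retryDelayMs` is a hypothetical name, the 60-second cap on the fallback is an assumption, and the HTTP-date branch is purely defensive (the API is documented above as sending seconds, but HTTP also permits a date in Retry-After).

```javascript
// Compute how long to wait before retry number `attempt` (0-based).
// headerValue is the raw Retry-After header, or null if absent.
function retryDelayMs(headerValue, attempt, nowMs = Date.now(), random = Math.random) {
  const capMs = 60000; // assumed upper bound for the fallback backoff
  let baseMs;
  if (headerValue != null && /^\d+$/.test(headerValue.trim())) {
    // Delay-seconds form, e.g. "60": honor it as given
    baseMs = Number(headerValue.trim()) * 1000;
  } else {
    // HTTP-date form, or no usable header at all
    const dateMs = headerValue == null ? NaN : Date.parse(headerValue);
    baseMs = Number.isNaN(dateMs)
      ? Math.min(capMs, 1000 * 2 ** attempt) // exponential fallback: 1s, 2s, 4s, ...
      : Math.max(0, dateMs - nowMs);
  }
  // Up to 1s of jitter so parallel clients do not retry in lockstep
  return baseMs + random() * 1000;
}
```

Passing `nowMs` and `random` as parameters keeps the helper deterministic under test; production callers can rely on the defaults.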
What counts as a request?
| Action | Counts as |
|---|---|
| Single GET / POST | 1 |
| Bulk operation | N (one per item) |
| Streaming endpoint | 1 per chunk |
| Cached response (HIT) | 0 (free) |
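To estimate how much of a per-minute budget a planned job consumes, the table can be turned into a small cost function. This is only a sketch of the accounting described above: `requestCost` and the `kind` labels are hypothetical names, and it assumes a cache hit costs nothing regardless of request type.

```javascript
// Estimate how many requests one call counts as, following the table above.
function requestCost({ kind, items = 1, chunks = 1, cacheHit = false }) {
  if (cacheHit) return 0; // cached responses are free
  switch (kind) {
    case 'single': return 1;      // single GET / POST
    case 'bulk':   return items;  // one per item
    case 'stream': return chunks; // one per chunk
    default: throw new Error(`Unknown request kind: ${kind}`);
  }
}

// Total cost of a planned batch of calls.
function totalCost(calls) {
  return calls.reduce((sum, call) => sum + requestCost(call), 0);
}
```

For example, one single call, one 50-item bulk operation, and one cached read together would count as 51 requests under these assumptions.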
Avoiding limits proactively
Use bulk endpoints
# Bad: 100 calls
for id in $IDS; do
  curl ".../api/v1/customers/$id"
done
# Good: 1 call (-g stops curl from treating [ ] as glob ranges)
curl -g ".../api/v1/customers?filter[id][in]=id1,id2,...,id100"
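When the ID list is longer than one call can carry, the same idea applies in batches. The sketch below builds one filter URL per batch using the `filter[id][in]` syntax from the example above; `batchedCustomerUrls` and the base URL are hypothetical, and the batch size of 100 is an assumption taken from the example, not a documented maximum.

```javascript
// Split a list of customer IDs into batches and build one URL per batch.
function batchedCustomerUrls(baseUrl, ids, batchSize = 100) {
  const urls = [];
  for (let i = 0; i < ids.length; i += batchSize) {
    const batch = ids.slice(i, i + batchSize);
    // Encode each ID individually so the comma separators stay literal
    const list = batch.map((id) => encodeURIComponent(id)).join(',');
    urls.push(`${baseUrl}/api/v1/customers?filter[id][in]=${list}`);
  }
  return urls;
}
```

With 250 IDs this produces three URLs instead of 250 individual calls.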
Cache reads
/api/v1/customers/cus_abc is cached for 60s at the edge.
Hitting it 1000× in 60s = 1 origin call + 999 cache hits =
1 request against the limit.
Use webhooks instead of polling
If you're polling for "any new orders?", subscribe to
the order.created webhook instead. Events are pushed in real
time and don't count against your API limits.
Stagger high-volume jobs
Daily warehouse ETL: spread over 30+ minutes, not 30 seconds.
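A quick calculation shows why the window matters. The helpers below are an illustrative sketch with hypothetical names (`staggerOffsets`, `requestsPerMinute`); they spread a fixed number of requests evenly across a window and report the resulting rate.

```javascript
// Evenly spaced start offsets (in ms) for `count` requests across a window.
function staggerOffsets(count, windowMs) {
  if (count <= 0) return [];
  const step = windowMs / count;
  return Array.from({ length: count }, (_, i) => Math.round(i * step));
}

// Effective request rate if `count` requests are spread over `windowMs`.
function requestsPerMinute(count, windowMs) {
  return count / (windowMs / 60000);
}
```

For instance, 9,000 requests spread over 30 minutes work out to 300/min, while the same 9,000 requests sent within 30 seconds are equivalent to 18,000/min.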
Burst credits (Enterprise)
Enterprise plans get burst credits: a temporary 5× rate allowance for up to 1 hour (e.g. for an annual data export). Contact support to enable.
Monitoring
Admin → API → Usage shows:
- Requests per hour, last 7 days
- 429 count
- Top consumers (which token, which endpoint)
- Per-endpoint latency
Alerts on:
- More than 10% of requests returning 429
- Rate-limit usage > 80% of cap
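The same thresholds can be checked client-side against your own request logs. This is a sketch only: `usageAlerts` and its field names are hypothetical, and it assumes both alerts fire when the measured ratio is strictly above the threshold (10% of requests throttled, 80% of the cap used).

```javascript
// Evaluate the two alert conditions against locally collected counters.
function usageAlerts({ total, throttled, peakPerMinute, capPerMinute }) {
  const alerts = [];
  // Share of requests that came back as 429
  if (total > 0 && throttled / total > 0.10) alerts.push('429-rate');
  // Peak per-minute usage relative to the plan cap
  if (capPerMinute > 0 && peakPerMinute / capPerMinute > 0.80) alerts.push('near-cap');
  return alerts;
}
```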
SDK behavior
Official SDKs handle 429 automatically:
- Honor Retry-After
- Exponential backoff with jitter
- Configurable max retries
- Surface unrecoverable errors
import { SumeruES } from '@sumeru/sdk';

const sumeru = new SumeruES({
  apiKey: process.env.API_KEY,
  retries: 5,               // default
  retryDelay: 'exponential' // default
});
Special cases
Atomic operations
Some endpoints (campaign launch, journey enroll) must
succeed exactly once. The SDK combined with an Idempotency-Key
makes retries safe: a request that gets a 429 and is then
retried does not trigger the action twice.
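Without the SDK, the important detail is that every retry of the same logical action carries the same key. The sketch below illustrates that pattern under stated assumptions: `sendOnce` is a hypothetical helper, `send` is a caller-supplied function that performs the request with the given extra headers, and the linear wait between attempts is an arbitrary choice.

```javascript
// Retry a request on 429 while reusing one Idempotency-Key for all attempts.
// `send(headers)` must perform the HTTP call and resolve to { status, ... }.
async function sendOnce(
  send,
  idempotencyKey,
  maxAttempts = 5,
  sleep = (ms) => new Promise((r) => setTimeout(r, ms))
) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Same key on every attempt, so the server can deduplicate the action
    const res = await send({ 'Idempotency-Key': idempotencyKey });
    if (res.status !== 429) return res;
    if (attempt < maxAttempts) await sleep(1000 * attempt);
  }
  throw new Error('Max retries exceeded');
}
```

Generate the key once per logical action (for example with `crypto.randomUUID()` in runtimes that provide it), not once per attempt.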
Long-running async
Endpoints that kick off async work (large bulk import,
catalog regeneration) return 202 Accepted with a job ID.
Poll /api/v1/jobs/<id> for status. The polling endpoint
has its own (much higher) rate limit.
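A polling loop for such a job might look like the following. This is a sketch under assumptions: `pollJob` is a hypothetical helper, `getJob` is a caller-supplied function that fetches /api/v1/jobs/<id> and returns the parsed body, and the `status` values `'succeeded'` and `'failed'` are placeholders since the job payload is not specified here.

```javascript
// Poll a job until it reaches a terminal state or the polling budget runs out.
async function pollJob(getJob, options = {}) {
  const {
    intervalMs = 2000, // wait between polls
    maxPolls = 30,     // give up after this many checks
    sleep = (ms) => new Promise((r) => setTimeout(r, ms)),
  } = options;
  for (let i = 0; i < maxPolls; i++) {
    const job = await getJob();
    // Assumed terminal states; adjust to the real job payload
    if (job.status === 'succeeded' || job.status === 'failed') return job;
    if (i < maxPolls - 1) await sleep(intervalMs);
  }
  throw new Error('Job did not finish within the polling budget');
}
```

Even with a higher limit on the polling endpoint, a fixed interval and a cap on the number of polls keep a stuck job from turning into an endless request loop.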