Prerequisites: You should have a working FrontMCP server with at least one tool. See Your First Tool if you need to get started.
What You’ll Build
By the end of this guide, your server will have:
- Per-user rate limiting on tools
- Concurrency control to prevent resource exhaustion
- Execution timeouts to catch hanging requests
- IP filtering for production security
- Redis-backed distributed rate limiting
Step 1: Add Rate Limiting to a Tool
Configure rate limiting on a tool
Add a rateLimit option to your tool decorator. In this guide's example, each user is limited to 30 search requests per minute.
Step 2: Add Concurrency Control
Prevent expensive tools from running too many instances simultaneously.
Add concurrency to a tool
Understand queue behavior
When all slots are occupied:
- With queueTimeoutMs: 0 (default), the request is immediately rejected with ConcurrencyLimitError (429).
- With queueTimeoutMs: 15_000, the request waits up to 15 seconds. If a slot opens, it proceeds. If not, it fails with QueueTimeoutError (429).
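The queue behavior described above can be sketched as a small in-process semaphore. This is an illustration only, not FrontMCP's implementation: the Semaphore class and its acquire/release methods are hypothetical, and the two error classes are local stand-ins that only borrow the names used in this guide.

```typescript
// Local stand-ins for the errors named in this guide (hypothetical definitions).
class ConcurrencyLimitError extends Error {
  name = "ConcurrencyLimitError";
}
class QueueTimeoutError extends Error {
  name = "QueueTimeoutError";
}

// Hypothetical semaphore: at most `maxConcurrent` holders at a time.
class Semaphore {
  private active = 0;
  private waiters: Array<() => void> = [];
  private maxConcurrent: number;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  acquire(queueTimeoutMs: number = 0): Promise<void> {
    // A free slot is taken immediately.
    if (this.active < this.maxConcurrent) {
      this.active += 1;
      return Promise.resolve();
    }
    // No queueing requested: reject right away.
    if (queueTimeoutMs <= 0) {
      return Promise.reject(new ConcurrencyLimitError("all slots are occupied"));
    }
    // Otherwise wait in FIFO order until a slot is handed over or the timer fires.
    return new Promise<void>((resolve, reject) => {
      const grant = () => {
        clearTimeout(timer);
        resolve();
      };
      const timer = setTimeout(() => {
        this.waiters = this.waiters.filter((w) => w !== grant);
        reject(new QueueTimeoutError(`no slot within ${queueTimeoutMs} ms`));
      }, queueTimeoutMs);
      this.waiters.push(grant);
    });
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) {
      next(); // hand the slot directly to the oldest waiter; `active` is unchanged
    } else if (this.active > 0) {
      this.active -= 1;
    }
  }
}
```

A caller would pair every successful acquire with a release in a finally block so that a failing tool does not leak its slot.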
maxConcurrent: 1:
Step 3: Add Execution Timeout
Protect against hanging requests by setting a maximum execution time.
Add timeout to a tool
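As a rough sketch of what an execution timeout does, the helper below races a unit of work against a timer. The withTimeout function is hypothetical, and ExecutionTimeoutError is defined locally as a stand-in that only borrows the name used in this guide; FrontMCP's own mechanism may differ.

```typescript
// Local stand-in for the error named in this guide (hypothetical definition).
class ExecutionTimeoutError extends Error {
  name = "ExecutionTimeoutError";
}

// Hypothetical helper: settles with the work's result, or rejects after `timeoutMs`.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(
      () => reject(new ExecutionTimeoutError(`execution exceeded ${timeoutMs} ms`)),
      timeoutMs,
    );
  });
  // Whichever promise settles first wins; the timer is cleared either way.
  // Note: losing the race does not cancel the underlying work.
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer));
}
```

Because the losing promise keeps running, long operations would additionally need their own cancellation signal to actually free resources.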
ExecutionTimeoutError (408).
Step 4: Global Rate Limiting
Add a server-wide rate limit that applies to all requests, regardless of which tool is called.
Configure global limits
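To make the difference between a per-user limit and a server-wide limit concrete, here is a minimal fixed-window counter sketch. It is not FrontMCP's implementation: FixedWindowLimiter, admit, and the field names are hypothetical, and the server-wide figure of 1000 requests per minute is an arbitrary illustrative value. Only the 30-per-minute per-user limit comes from this guide.

```typescript
type RateLimitConfig = { maxRequests: number; windowMs: number };

// Hypothetical fixed-window limiter keyed by an arbitrary string.
class FixedWindowLimiter {
  private windows = new Map<string, { start: number; count: number }>();
  private config: RateLimitConfig;

  constructor(config: RateLimitConfig) {
    this.config = config;
  }

  /** Returns true if the request identified by `key` is allowed at time `now` (ms). */
  allow(key: string, now: number = Date.now()): boolean {
    let w = this.windows.get(key);
    // Start a fresh window on first use or once the previous one has elapsed.
    if (!w || now - w.start >= this.config.windowMs) {
      w = { start: now, count: 0 };
      this.windows.set(key, w);
    }
    if (w.count >= this.config.maxRequests) return false;
    w.count += 1;
    return true;
  }
}

// Per-user limit from this guide; the server-wide number is an assumption.
const perUser = new FixedWindowLimiter({ maxRequests: 30, windowMs: 60_000 });
const serverWide = new FixedWindowLimiter({ maxRequests: 1000, windowMs: 60_000 });

function admit(userId: string, now: number = Date.now()): boolean {
  // The server-wide limit shares one key across all callers and tools;
  // the per-user limit uses one key per user.
  // A request rejected by the per-user check still consumes a server-wide slot here.
  return serverWide.allow("global", now) && perUser.allow(`user:${userId}`, now);
}
```

The only difference between the two limits in this sketch is the key: a single shared key yields a global limit, while a key derived from the caller yields a per-user limit.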
Step 5: IP Filtering
Block malicious IPs and restrict access to known networks.
Understand filter precedence
The deny list is always checked first:
- IP on deny list → blocked (403, IpBlockedError)
- IP on allow list → allowed
- IP on neither list → defaultAction applies ('allow' or 'deny')
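The precedence rules above can be expressed as a short decision function. This is a sketch under assumptions: the denyList and allowList field names and the decide function are hypothetical, only IPv4 addresses and CIDR ranges are handled, and just defaultAction is taken from this guide.

```typescript
type IpFilterConfig = {
  denyList: string[]; // hypothetical field name
  allowList: string[]; // hypothetical field name
  defaultAction: "allow" | "deny";
};

// Converts dotted-quad IPv4 text to an unsigned integer.
function ipv4ToInt(ip: string): number {
  const octets = ip.split(".");
  const valid =
    octets.length === 4 && octets.every((o) => /^\d{1,3}$/.test(o) && Number(o) <= 255);
  if (!valid) throw new Error(`invalid IPv4 address: ${ip}`);
  return octets.reduce((acc, o) => acc * 256 + Number(o), 0);
}

// True if `ip` equals `entry` or falls inside it when `entry` is a CIDR range.
function matches(ip: string, entry: string): boolean {
  const [base, prefix] = entry.split("/");
  const bits = prefix === undefined ? 32 : Number(prefix);
  if (!Number.isInteger(bits) || bits < 0 || bits > 32) {
    throw new Error(`invalid prefix length in: ${entry}`);
  }
  // Two addresses share a /bits network when they fall in the same block.
  const blockSize = 2 ** (32 - bits);
  return Math.floor(ipv4ToInt(ip) / blockSize) === Math.floor(ipv4ToInt(base) / blockSize);
}

function decide(ip: string, config: IpFilterConfig): "allow" | "deny" {
  if (config.denyList.some((e) => matches(ip, e))) return "deny"; // deny list wins
  if (config.allowList.some((e) => matches(ip, e))) return "allow";
  return config.defaultAction; // on neither list
}
```

Checking the deny list first means a single blocked address can be carved out of an otherwise allowed network.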
With defaultAction: 'deny', only IPs explicitly on the allow list can access your server.
Step 6: Production Setup with Redis
In-memory storage works for development but does not persist across restarts or share state between server instances. Use Redis for production.
Configure Redis storage
Verify distributed behavior
With Redis storage:
- Rate limit counters are shared across instances — a user hitting different instances still sees a single limit.
- Semaphore tickets use atomic operations — concurrency is enforced globally.
- Pub/sub notifications make semaphore slot release detection near-instant.
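The shared-counter point can be illustrated without a real Redis connection. In the sketch below, CounterStore is a hypothetical interface and InMemoryStore is a stand-in for a shared backend; a Redis-backed version could plausibly implement increment with the INCR and PEXPIRE commands, but that wiring is an assumption and is not shown. ServerInstance is likewise hypothetical and only models two processes consulting the same store.

```typescript
// Hypothetical storage abstraction for rate limit counters.
interface CounterStore {
  /** Increments `key` and returns the new value; the first increment starts the TTL. */
  increment(key: string, ttlMs: number): Promise<number>;
}

// Stand-in for a shared backend such as Redis (assumption: not a real client).
class InMemoryStore implements CounterStore {
  private entries = new Map<string, { value: number; expiresAt: number }>();

  async increment(key: string, ttlMs: number): Promise<number> {
    const now = Date.now();
    let entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= now) {
      entry = { value: 0, expiresAt: now + ttlMs };
      this.entries.set(key, entry);
    }
    entry.value += 1;
    return entry.value;
  }
}

// Models one server process; several instances may share the same store.
class ServerInstance {
  private store: CounterStore;
  private maxRequests: number;
  private windowMs: number;

  constructor(store: CounterStore, maxRequests: number, windowMs: number) {
    this.store = store;
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  async allow(userId: string): Promise<boolean> {
    const count = await this.store.increment(`ratelimit:${userId}`, this.windowMs);
    return count <= this.maxRequests;
  }
}
```

Because both instances increment the same key in the same store, a user alternating between them exhausts one budget rather than one budget per instance.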