## Scaling Dimensions
| Dimension | Single Server | Distributed |
|---|---|---|
| Concurrency | Worker pool | Multiple runtime pods |
| Sessions | In-memory | Redis-backed |
| Tool execution | Local | Distributed via broker |
| State | Process memory | Redis/external store |
## Single-Server Scaling

### Worker Pool
For CPU-bound workloads, use the worker pool adapter.

#### Pool Sizing Guidelines
| Workload Type | Recommended Pool Size |
|---|---|
| CPU-heavy scripts | CPU cores |
| I/O-heavy (tool calls) | CPU cores × 2 |
| Mixed | CPU cores × 1.5 |
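The sizing table above can be expressed as a small helper. `os.cpus().length` gives the core count on Node.js; the workload labels and function name are illustrative, not part of the library's API:

```typescript
import * as os from "node:os";

type Workload = "cpu-heavy" | "io-heavy" | "mixed";

// Multipliers from the sizing table above.
const factors: Record<Workload, number> = {
  "cpu-heavy": 1,   // avoid oversubscribing the CPU
  "io-heavy": 2,    // workers spend most of their time waiting on tool calls
  "mixed": 1.5,
};

// Recommended worker-pool size for a given workload type.
function recommendedPoolSize(
  workload: Workload,
  cores: number = os.cpus().length,
): number {
  return Math.ceil(cores * factors[workload]);
}
```

On an 8-core machine this yields pools of 8, 16, and 12 workers respectively.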
### Enclave Pooling
Reuse Enclave instances to avoid initialization overhead.

## Distributed Scaling
### Architecture
### Broker Configuration
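A configuration sketch for the broker. The option names below (`port`, `redisUrl`, `maxQueueDepth`, `heartbeatIntervalMs`) are assumptions for illustration; check the EnclaveJS Broker documentation for the real option names:

```typescript
// Hypothetical broker options -- names are illustrative, not the actual API.
const brokerConfig = {
  port: 8080,                     // where runtimes and clients connect
  redisUrl: "redis://redis:6379", // shared session/state backend
  maxQueueDepth: 1000,            // reject work past this point (backpressure)
  heartbeatIntervalMs: 5000,      // how often runtimes must report liveness
};
```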
### Runtime Configuration
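A sketch of what each runtime pod needs to know. Again, the option names are hypothetical placeholders, not the library's actual configuration surface:

```typescript
// Hypothetical runtime-pod options; real option names may differ.
const runtimeConfig = {
  brokerUrl: "ws://broker:8080", // broker to register with
  poolSize: 8,                   // local worker pool (see sizing table above)
  labels: { tier: "cpu" },       // used for workload-based routing
  maxMemoryMb: 512,              // per-execution memory cap
};
```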
### Client Configuration
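On the client side, the main concerns are timeouts and retries so that a saturated broker fails fast instead of queueing forever. These option names are likewise illustrative:

```typescript
// Hypothetical client options; consult the actual client API for names.
const clientConfig = {
  brokerUrl: "ws://broker:8080",
  requestTimeoutMs: 30_000, // fail fast rather than wait in a long queue
  retries: 2,               // retry idempotent executions on transient errors
};
```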
## Redis Configuration
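A sketch of connection options in the shape accepted by ioredis (the assumption that ioredis is the client is mine; adapt to whichever Redis client the deployment actually uses). The key point is bounded retries and a capped reconnect backoff:

```typescript
// Options in the shape ioredis accepts (assumed client, not confirmed).
const redisOptions = {
  host: "redis",
  port: 6379,
  maxRetriesPerRequest: 3,
  // Back off on reconnect instead of hammering a recovering server,
  // capped at 2 seconds between attempts.
  retryStrategy: (attempt: number) => Math.min(attempt * 100, 2000),
};
```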
### Session State
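A minimal sketch of TTL-based session state. A `Map` stands in for Redis here so the example is self-contained; in production each write would be a `SET key value EX ttl` against the shared store, and expiry would be handled by Redis itself:

```typescript
// Session store sketch: a Map stands in for Redis (SET ... EX ttl).
class SessionStore {
  private sessions = new Map<string, { data: unknown; expiresAt: number }>();

  constructor(private ttlMs: number) {}

  set(id: string, data: unknown): void {
    this.sessions.set(id, { data, expiresAt: Date.now() + this.ttlMs });
  }

  get(id: string): unknown | undefined {
    const entry = this.sessions.get(id);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      // Expired: drop it, mirroring Redis key expiry.
      this.sessions.delete(id);
      return undefined;
    }
    return entry.data;
  }
}
```

Keeping sessions in a shared store like this is what lets any runtime pod pick up any session.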
### Redis Cluster
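The minimal `redis.conf` settings to run a node in cluster mode look like this (apply to every node in the cluster):

```conf
# redis.conf -- minimal cluster-mode settings, one copy per node
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
```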
For high availability, run Redis in cluster mode so that session state survives the loss of a single node.

## Kubernetes Scaling
### Horizontal Pod Autoscaler
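A sketch of an `autoscaling/v2` HPA that scales runtime pods on CPU utilization. The deployment name `enclave-runtime` and the replica bounds are illustrative:

```yaml
# HPA for the runtime deployment (names and bounds are illustrative).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: enclave-runtime
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: enclave-runtime
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Scaling on a custom metric such as queue depth is usually a better fit than CPU once monitoring is in place.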
### Pod Disruption Budget
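A PDB keeps a minimum number of runtime pods available during voluntary disruptions such as node drains (labels are illustrative):

```yaml
# Keep at least 2 runtime pods up during voluntary disruptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: enclave-runtime
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: enclave-runtime
```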
### Resource Quotas
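A namespace-level `ResourceQuota` caps the total CPU and memory the runtimes can claim, so a scaling event cannot starve the rest of the cluster (namespace and values are illustrative):

```yaml
# Cap total resources the runtime namespace can claim (values illustrative).
apiVersion: v1
kind: ResourceQuota
metadata:
  name: enclave-quota
  namespace: enclave
spec:
  hard:
    requests.cpu: "16"
    requests.memory: 32Gi
    limits.cpu: "24"
    limits.memory: 48Gi
```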
## Load Balancing
### Sticky Sessions
For WebSocket connections, use sticky sessions so that a client's reconnects land on the same runtime pod.

### Runtime Routing
Route requests to specific runtimes based on workload type.

## Performance Benchmarks
### Single Server (8 cores, 16GB RAM)
| Metric | Value |
|---|---|
| Simple scripts | 500 req/s |
| With tool calls | 200 req/s |
| P50 latency | 15ms |
| P99 latency | 85ms |
### Distributed (3 runtime pods)
| Metric | Value |
|---|---|
| Simple scripts | 1200 req/s |
| With tool calls | 500 req/s |
| P50 latency | 25ms |
| P99 latency | 120ms |
## Monitoring at Scale
### Key Metrics
### Alerting Rules
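A sketch of Prometheus alerting rules for the two signals that matter most at scale: queue backlog and tail latency. The metric names (`enclave_queue_depth`, `enclave_exec_duration_seconds_bucket`) are assumptions, not the runtime's actual exported metrics:

```yaml
# Prometheus alerting rules; metric names are illustrative assumptions.
groups:
  - name: enclave-scaling
    rules:
      - alert: ExecutionQueueBacklog
        expr: enclave_queue_depth > 100
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Execution queue is backing up; consider scaling runtimes"
      - alert: HighP99Latency
        expr: histogram_quantile(0.99, rate(enclave_exec_duration_seconds_bucket[5m])) > 0.5
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "P99 execution latency above 500ms"
```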
## Best Practices
- Start simple - Use worker pool before going distributed
- Monitor queue depth - Scale based on pending executions
- Set memory limits - Prevent runaway scripts
- Use connection pooling - Reuse Redis/DB connections
- Implement backpressure - Reject requests when overloaded
- Regional deployment - Deploy runtimes close to users
- Graceful degradation - Fallback when components fail
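The backpressure practice above can be sketched as a simple admission check: track in-flight work and refuse new requests past a limit, so the caller can return an immediate 503 instead of queueing. The class and threshold are illustrative:

```typescript
// Backpressure sketch: reject new work once pending executions pass a limit.
class AdmissionController {
  private pending = 0;

  constructor(private maxPending: number) {}

  // Returns false when the server is saturated; caller should shed the load
  // (e.g. respond 503) rather than enqueue.
  tryAdmit(): boolean {
    if (this.pending >= this.maxPending) return false;
    this.pending++;
    return true;
  }

  // Call when an execution finishes, freeing a slot.
  complete(): void {
    this.pending = Math.max(0, this.pending - 1);
  }
}
```

Rejecting early keeps latency bounded for the requests you do accept, which matters more than raw throughput under overload.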
## Related
- Production Deployment - Deployment guide
- Worker Pool - Worker pool details
- EnclaveJS Broker - Broker configuration