Production Checklist
Before deploying CodeCall to production, complete these steps:
- Choose the right VM preset. Use secure for most production workloads and locked_down for sensitive data. Never use experimental in production.
- Configure tool access control. Set up includeTools filtering and per-tool codecall metadata to limit which tools are accessible.
- Enable audit logging. Configure audit sinks to track script execution, tool calls, and security events.
- Configure monitoring and alerting. Track execution latency, error rates, and security blocks with your observability stack.
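The checklist above can be turned into a pre-deploy gate. The following is a minimal sketch, not CodeCall's API: the `DeployConfig` shape and `checkProductionReadiness` are hypothetical names invented for illustration, and only the preset names come from this page.

```typescript
// Hypothetical summary of a deployment's settings; the real CodeCall
// configuration object may be shaped differently.
type VmPreset = "secure" | "locked_down" | "experimental";

interface DeployConfig {
  preset: VmPreset;
  includeToolsConfigured: boolean; // true when an includeTools filter is set
  auditSinks: string[];            // names of configured audit sinks
  alertingEnabled: boolean;        // true when monitoring alerts are wired up
}

// Returns human-readable problems; an empty array means every checklist item passes.
function checkProductionReadiness(cfg: DeployConfig): string[] {
  const problems: string[] = [];
  if (cfg.preset === "experimental") {
    problems.push("experimental preset must not be used in production");
  }
  if (!cfg.includeToolsConfigured) {
    problems.push("no includeTools filter configured");
  }
  if (cfg.auditSinks.length === 0) {
    problems.push("no audit sink configured");
  }
  if (!cfg.alertingEnabled) {
    problems.push("monitoring and alerting not configured");
  }
  return problems;
}
```

A CI step could call this with values derived from the deployment manifest and fail the build when the returned array is non-empty.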
Performance Characteristics
Latency Breakdown
| Stage | Typical Time | Notes |
|---|---|---|
| AST Parsing | 1-5ms | Scales with code size |
| AST Validation | 2-10ms | Depends on rule count |
| Code Transformation | 1-3ms | One-time per script |
| VM Execution | Variable | Depends on script complexity |
| Tool Calls | Variable | Network/database bound |
| Output Sanitization | 1-5ms | Scales with output size |
Throughput
| Configuration | Requests/sec | Notes |
|---|---|---|
| Single instance, TF-IDF | ~500 | Bottleneck: VM isolation |
| Single instance, ML | ~200 | Bottleneck: Model inference |
| Multi-instance (4 pods) | ~1,500+ | Near-linear scaling |
Throughput depends heavily on script complexity and tool call latency. These numbers assume simple scripts with 1-3 tool calls.
Performance Optimization
1. Use TF-IDF for Most Cases
Unless you have 100+ tools with similar descriptions, TF-IDF provides excellent relevance with minimal overhead.
2. Enable ML for Large Toolsets
For 100+ tools with similar descriptions, the ML strategy provides better semantic matching.
3. Use Direct Invoke for Simple Calls
Bypass VM overhead for single-tool operations.
4. Cache Describe Results
Tool schemas rarely change. CodeCall internally caches describe and search results to reduce overhead on repeated calls.
Multi-Instance Deployment
CodeCall is stateless and scales horizontally.
Architecture
Kubernetes Deployment
Resource Recommendations
| Workload | CPU | Memory | Instances |
|---|---|---|---|
| Light (<100 req/min) | 0.5 core | 512MB | 1-2 |
| Medium (100-500 req/min) | 1 core | 1GB | 2-4 |
| Heavy (500+ req/min) | 2 cores | 2GB | 4+ |
Monitoring
Metrics to Track
- Execution Latency: Track p50, p95, and p99 of codecall:execute duration.
- Error Rate: Monitor validation errors, timeouts, and tool failures.
- Tool Call Count: Average tool calls per script execution.
- Search Latency: Track search response times for index health.
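Most observability stacks compute percentiles for you, but it helps to know what p50, p95, and p99 mean when reading dashboards. The sketch below uses the nearest-rank method on raw duration samples. The `percentile` helper is illustrative and not part of CodeCall.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("percentile requires at least one sample");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Example: ten execute durations in milliseconds with one slow outlier.
const durationsMs = [12, 15, 11, 240, 18, 14, 13, 16, 17, 19];
const p50 = percentile(durationsMs, 50); // 15
const p95 = percentile(durationsMs, 95); // 240
const p99 = percentile(durationsMs, 99); // 240
```

The example shows why tail percentiles matter: the median looks healthy while p95 and p99 expose the slow execution.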
Logging
CodeCall’s internal AuditLoggerService emits structured log events for observability. These events can be consumed by your logging infrastructure.
Log events:
Alerting Recommendations
| Metric | Warning | Critical |
|---|---|---|
| Execute p99 latency | > 2s | > 5s |
| Error rate | > 5% | > 15% |
| Timeout rate | > 1% | > 5% |
| Security blocks | Any | High volume |
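The numeric rows of the table above map directly onto a small classifier. This is a sketch under the assumption that rates are expressed as fractions (0.05 for 5%) and latency in seconds. The names `thresholds` and `classify` are hypothetical. The security-blocks row is left out because its thresholds ("Any", "High volume") are not numeric.

```typescript
type Severity = "ok" | "warning" | "critical";

interface Threshold {
  warning: number;
  critical: number;
}

// Thresholds copied from the alerting table; units are an assumption.
const thresholds = {
  executeP99Seconds: { warning: 2, critical: 5 },
  errorRate: { warning: 0.05, critical: 0.15 },
  timeoutRate: { warning: 0.01, critical: 0.05 },
};

// A value strictly above a threshold triggers that severity, matching the ">" in the table.
function classify(value: number, t: Threshold): Severity {
  if (value > t.critical) return "critical";
  if (value > t.warning) return "warning";
  return "ok";
}
```

For instance, a p99 of 3 seconds classifies as a warning, while an error rate of 20% classifies as critical.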
Cost Optimization
Token Savings
CodeCall reduces token usage by roughly 88-92% in these scenarios:
| Scenario | Without CodeCall | With CodeCall | Savings |
|---|---|---|---|
| 100 tools in context | ~25,000 tokens | ~3,000 tokens | 88% |
| Multi-tool workflow (5 calls) | ~50,000 tokens | ~5,000 tokens | 90% |
| Complex filtering | ~100,000 tokens | ~8,000 tokens | 92% |
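The savings column follows from the two token counts in each row. A short worked check, using a hypothetical `savingsPercent` helper:

```typescript
// Percentage of tokens saved, rounded to the nearest whole percent.
function savingsPercent(withoutCodeCall: number, withCodeCall: number): number {
  return Math.round(((withoutCodeCall - withCodeCall) / withoutCodeCall) * 100);
}

const toolsInContext = savingsPercent(25_000, 3_000);    // 88
const multiToolWorkflow = savingsPercent(50_000, 5_000);  // 90
const complexFiltering = savingsPercent(100_000, 8_000);  // 92
```

The same function can be applied to your own measured token counts to compare against the approximate figures in the table.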
Compute Costs
| Factor | Impact | Optimization |
|---|---|---|
| VM isolation | ~10ms overhead | Use codecall:invoke for simple calls |
| Embedding inference | ~50ms/query | Use TF-IDF for fewer than 100 tools |
| Tool calls | Dominant cost | Optimize underlying tools |
Cost vs. Performance Tradeoffs
Minimize Latency
- Use TF-IDF search
- Enable caching for describe/search
- Use direct invoke for simple calls
- Increase VM timeout for complex scripts
Minimize Compute
- Use locked_down preset (shorter timeouts)
- Limit maxToolCalls aggressively
- Cache aggressively
- Use fewer instances with more resources
Minimize Tokens
- Use codecall_only mode
- Hide all tools from list_tools
- Return minimal data from tools
- Let scripts filter server-side
Security in Production
Checklist
Rate Limiting
Rate limiting should be handled at the infrastructure level (reverse proxy, API gateway) or with middleware. Configure limits on codecall:execute to prevent abuse.
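If you choose the middleware route, a token bucket is one common way to cap calls per client. The sketch below is a generic in-memory limiter, not something CodeCall provides. The `TokenBucket` class is illustrative, and a multi-instance deployment would need shared state (for example in a gateway or external store) rather than per-process memory.

```typescript
// In-memory token bucket: allows bursts up to `capacity`, refilled at a steady rate.
class TokenBucket {
  private capacity: number;
  private refillPerSecond: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full so the first burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Returns true and consumes one token if the request is allowed.
  tryConsume(nowMs: number = Date.now()): boolean {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

A middleware could keep one bucket per client or tenant and reject a codecall:execute request whenever `tryConsume()` returns false.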
Multi-Tenancy Patterns
CodeCall supports multiple isolation strategies for multi-tenant deployments.
Tenant Context
Pass tenant information via codecallContext.
Per-Tenant Tool Filtering
Restrict tools based on tenant using the includeTools filter.
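The exact signature of includeTools is not shown on this page, so the sketch below only illustrates the decision logic such a filter could delegate to. The `tenantAllowlist` map, the tenant IDs, the tool names, and `isToolAllowed` are all hypothetical.

```typescript
// Hypothetical per-tenant allowlist. A trailing "*" matches any tool with that prefix.
const tenantAllowlist: Record<string, string[]> = {
  "tenant-a": ["users:*", "orders:list"],
  "tenant-b": ["users:get"],
};

// Returns true when the tenant may see the tool; unknown tenants get no tools.
function isToolAllowed(tenantId: string, toolName: string): boolean {
  const patterns = tenantAllowlist[tenantId] ?? [];
  return patterns.some((pattern) =>
    pattern.endsWith("*")
      ? toolName.startsWith(pattern.slice(0, -1))
      : toolName === pattern
  );
}
```

Defaulting unknown tenants to an empty allowlist keeps the filter closed by default, which is usually the safer choice in shared deployments.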
Isolation Strategies
| Strategy | Isolation Level | Cost | Use Case |
|---|---|---|---|
| Shared instance | Low | $ | Dev/staging |
| Tenant-specific limits | Medium | $$ | SaaS standard |
| Dedicated instances | Maximum | $$$$ | Compliance-heavy |
Troubleshooting
Common Issues
Scripts timing out
Symptoms: Frequent TIMEOUT errors
Causes:
- Script too complex
- Tool calls too slow
- Timeout too aggressive
Solutions:
- Profile tool call latency
- Increase vm.timeoutMs if tools are slow
- Break complex scripts into smaller pieces
- Use Promise.all() for independent tool calls
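The last item is often the cheapest fix. When two tool calls do not depend on each other, awaiting them one after another adds their latencies, while Promise.all() overlaps them. The sketch below uses a hypothetical `callTool` stand-in that simulates latency with a timer. Inside a real script you would use whatever tool-calling function the CodeCall runtime exposes.

```typescript
// Hypothetical stand-in for a tool call: resolves after `delayMs` milliseconds.
async function callTool(name: string, delayMs: number): Promise<string> {
  await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  return `${name}:done`;
}

// Sequential: total time is roughly the sum of both delays.
async function sequential(): Promise<string[]> {
  const user = await callTool("users:get", 50);
  const orders = await callTool("orders:list", 50);
  return [user, orders];
}

// Parallel: both calls start immediately, so total time is roughly the slower delay.
async function parallel(): Promise<string[]> {
  return Promise.all([
    callTool("users:get", 50),
    callTool("orders:list", 50),
  ]);
}
```

Both functions return results in the same order. Note that Promise.all() rejects as soon as any call fails, so calls that may fail independently might be better served by Promise.allSettled().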
Search returning irrelevant results
Symptoms: Low relevance scores, wrong tools returned
Causes:
- Poor tool descriptions
- Threshold too low
- TF-IDF limitations
Solutions:
- Improve tool descriptions
- Switch to ml strategy for semantic matching
- Add more specific keywords to descriptions
High memory usage
Symptoms: OOM errors, pod restarts
Causes:
- Embedding model loaded
- Large tool index
- Scripts returning large data
Solutions:
- Use TF-IDF instead of embeddings
- Increase memory limits
- Configure output sanitization limits
- Enable HNSW for large indexes
Validation errors for valid code
Symptoms: Scripts rejected that should work
Causes:
- Using blocked constructs
- Reserved prefix collision
- Unicode issues
Solutions:
- Check for eval, Function, etc.
- Avoid __ag_ and __safe_ prefixes
- Use ASCII identifiers
- Review AST Guard rules
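To spot reserved-prefix collisions before submitting a script, a quick text scan can help. The sketch below is a rough regex-based check, not the validator CodeCall uses: it does not parse the code, so it will also flag matches inside strings and comments. `findReservedIdentifiers` is a hypothetical helper.

```typescript
// Prefixes named in the troubleshooting list above.
const RESERVED_PREFIXES = ["__ag_", "__safe_"];

// Returns the distinct identifier-like tokens in `source` that start with a reserved prefix.
function findReservedIdentifiers(source: string): string[] {
  const identifiers = source.match(/[A-Za-z_$][A-Za-z0-9_$]*/g) ?? [];
  const hits = identifiers.filter((id) =>
    RESERVED_PREFIXES.some((prefix) => id.startsWith(prefix))
  );
  // Deduplicate while preserving first-seen order.
  return hits.filter((id, index) => hits.indexOf(id) === index);
}
```

For example, scanning `const __ag_tmp = 1; const ok = __safe_call();` reports `__ag_tmp` and `__safe_call`, which should be renamed before the script is submitted.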
Migration & Rollback
Gradual Rollout
- Phase 1: Deploy with mode: 'metadata_driven'
  - All tools visible normally
  - Mark select tools for CodeCall
  - Monitor for issues
- Phase 2: Switch to mode: 'codecall_opt_in'
  - Tools opt into CodeCall
  - Both access methods work
  - Measure token savings
- Phase 3: Move to mode: 'codecall_only'
  - Hide tools from list_tools
  - Full CodeCall experience
  - Maximum token savings
Rollback Plan
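One way to keep rollback cheap is to make the mode a deploy-time setting, so stepping back a phase is a configuration change and not a code change. The sketch below assumes rollback means returning to the previous rollout phase and that the mode arrives as a raw string (for instance from an environment variable of your choosing). `previousMode` and `resolveMode` are hypothetical helpers, and only the three mode names come from this page.

```typescript
type CodeCallMode = "metadata_driven" | "codecall_opt_in" | "codecall_only";

// Rollout phases in the order described under Gradual Rollout.
const ROLLOUT_ORDER: CodeCallMode[] = ["metadata_driven", "codecall_opt_in", "codecall_only"];

// Mode to fall back to when the current phase misbehaves; phase 1 has nothing earlier.
function previousMode(current: CodeCallMode): CodeCallMode {
  const index = ROLLOUT_ORDER.indexOf(current);
  return ROLLOUT_ORDER[Math.max(index - 1, 0)];
}

// Validates a raw setting and falls back to a known-safe mode when it is missing or invalid.
function resolveMode(raw: string | undefined, fallback: CodeCallMode): CodeCallMode {
  return ROLLOUT_ORDER.includes(raw as CodeCallMode) ? (raw as CodeCallMode) : fallback;
}
```

With this shape, rolling back from phase 3 means redeploying with the value returned by `previousMode("codecall_only")`, and a typo in the setting degrades to the fallback instead of an undefined mode.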
Related
Configuration
All configuration options including VM presets and embedding strategies
Security Model
Defense-in-depth security architecture and settings
API Reference
Meta-tool schemas, error codes, and debugging guide
Deployment Guide
General FrontMCP production deployment