Troubleshooting MCP Servers
This guide provides solutions to common issues encountered when working with MCP servers, along with diagnostic procedures and best practices.
Common Issues and Solutions
Deployment Issues
Deployment Stuck in "Deploying" State
Symptoms:
- Server shows "Deploying" status for more than 5 minutes
- No logs appear in the logs tab
Possible Causes:
- Network connectivity issues
- Resource quota exhaustion
- Template configuration errors
Solutions:
- Check your account resource limits
- Verify your network connectivity to MCP-Cloud
- Review the deployment logs under "Advanced Logs"
- Try deploying in a different region
- If the issue persists, delete the server and redeploy it
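The five-minute threshold from the symptoms above can be checked in a script. This is a minimal sketch that assumes you obtain the status string and the deployment start time yourself, for example by reading them off the dashboard. `isDeploymentStuck` and its parameters are hypothetical names, not part of any MCP-Cloud API.

```javascript
// Hypothetical helper: flags a deployment that has stayed in "Deploying"
// longer than the threshold (5 minutes, per the symptom listed above).
const STUCK_THRESHOLD_MS = 5 * 60 * 1000;

function isDeploymentStuck(status, startedAtMs, nowMs = Date.now()) {
  if (status !== 'Deploying') return false;
  return nowMs - startedAtMs > STUCK_THRESHOLD_MS;
}

// Usage (values are illustrative):
// isDeploymentStuck('Deploying', Date.parse('2024-01-01T10:00:00Z'));
```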
Deployment Failed
Symptoms:
- Server shows "Failed" status
- Error message in deployment logs
Possible Causes:
- Invalid environment variables
- Resource constraints
- Template compatibility issues
Solutions:
- Check the error message in the deployment logs
- Verify all required environment variables are set correctly
- Try increasing the memory allocation
- Deploy with a different template version
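Before redeploying, it can help to confirm locally that every required variable has a non-empty value. The sketch below assumes a Node.js environment. The list of required names is whatever your template documents, and `findMissingEnvVars` is a hypothetical helper name.

```javascript
// Returns the names of required variables that are unset or blank.
// `env` defaults to the current process environment (Node.js).
function findMissingEnvVars(required, env = process.env) {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || String(value).trim() === '';
  });
}

// Usage (the variable list is an example, not a definitive requirement):
// const missing = findMissingEnvVars(['OPENAI_API_KEY']);
// if (missing.length > 0) console.error('Missing:', missing.join(', '));
```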
Authentication Issues
API Key Not Working
Symptoms:
- 401 Unauthorized responses
- "Invalid API key" errors
Solutions:
- Verify the API key is correct and not expired
- Check if the API key has the necessary permissions
- Ensure you're including the correct prefix (Bearer or Key)
- Check for any trailing whitespace in the API key
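Whitespace and prefix mistakes are easy to rule out in code. This sketch trims the key and adds a prefix only when one is not already present. Which prefix your server expects is not specified here, so the `Bearer` default is an assumption, and `buildAuthHeader` is a hypothetical helper name.

```javascript
// Builds an Authorization header value from a raw key.
// Trims stray whitespace and avoids doubling an existing prefix.
function buildAuthHeader(rawKey, prefix = 'Bearer') {
  const key = rawKey.trim();
  if (key === '') throw new Error('API key is empty');
  const alreadyPrefixed = /^(Bearer|Key)\s+/i.test(key);
  return alreadyPrefixed ? key : `${prefix} ${key}`;
}

// Usage:
// fetch(url, { headers: { Authorization: buildAuthHeader(process.env.MY_KEY) } });
```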
Unable to Access Server
Symptoms:
- 403 Forbidden responses
- "Access denied" errors
Solutions:
- Verify your subscription status
- Check the server's access control settings
- Ensure your IP is not blocked by any firewall rules
- Verify your account permissions
Performance Issues
High Latency
Symptoms:
- Requests taking longer than expected
- Timeout errors
Possible Causes:
- Insufficient resources
- High server load
- Network latency
- Large context windows
Solutions:
- Increase the memory allocation
- Increase the CPU allocation
- Deploy in a region closer to your users
- Optimize prompt length and complexity
- Check if there are concurrent requests overloading the server
- Increase the timeout setting for long operations
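If concurrent requests are overloading the server, one option is to cap concurrency on the client. The limiter below is a generic sketch, not an MCP-Cloud feature. `createLimiter` is a hypothetical name, and a suitable cap depends on your server's resources.

```javascript
// Runs at most `maxConcurrent` async tasks at once; extra tasks wait in a queue.
function createLimiter(maxConcurrent) {
  let active = 0;
  const queue = [];

  const startNext = () => {
    if (active >= maxConcurrent || queue.length === 0) return;
    active += 1;
    const { task, resolve, reject } = queue.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active -= 1;
        startNext(); // a slot is free, so start the next queued task
      });
  };

  const run = (task) =>
    new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject });
      startNext();
    });

  // Exposes counters for inspection and debugging.
  run.stats = () => ({ active, queued: queue.length });
  return run;
}

// Usage:
// const limited = createLimiter(4);
// const response = await limited(() => fetch(url, options));
```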
Out of Memory Errors
Symptoms:
- Server crashes with OOM errors
- Incomplete responses
Solutions:
- Increase the memory allocation
- Reduce the complexity of requests
- Optimize context window size
- Process large requests in smaller batches
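Splitting a large workload into smaller batches can be sketched as follows. `toBatches` is a hypothetical helper name, and the right batch size depends on your memory allocation and payload sizes.

```javascript
// Splits an array into consecutive batches of at most `batchSize` items.
function toBatches(items, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Usage: send each batch as its own request instead of one large request.
// for (const batch of toBatches(documents, 10)) { await sendBatch(batch); }
```

Here `sendBatch` stands in for whatever request function your integration already uses.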
Specific Template Issues
Default Template
Common Issues:
- Missing API keys
- Model configuration errors
Solutions:
- Ensure OPENAI_API_KEY is provided and valid
- Check the model name for compatibility
- Verify connection to OpenAI API
Playwright Template
Common Issues:
- Browser launch failures
- Screenshot errors
Solutions:
- Check DISPLAY environment variable
- Verify command arguments are correctly formatted
- Increase memory allocation (at least 1 GB recommended)
- Check if URLs are properly formatted and accessible
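URL formatting can be validated before a URL is handed to the browser. This sketch uses the standard `URL` constructor and checks format only. It does not confirm that the page is reachable. `isHttpUrl` is a hypothetical helper name.

```javascript
// Returns true only for well-formed absolute http(s) URLs.
function isHttpUrl(candidate) {
  try {
    const url = new URL(candidate);
    return url.protocol === 'http:' || url.protocol === 'https:';
  } catch {
    return false; // new URL() throws a TypeError on malformed input
  }
}
```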
Diagnostic Procedures
Checking Server Logs
Server logs are the primary diagnostic tool:
- Navigate to your server's dashboard
- Select the "Logs" tab
- Filter by log level (INFO, ERROR, etc.)
- Look for error patterns and stack traces
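If you copy log output into a file, recurring errors can be tallied with a short script. The line shape assumed here (timestamp, level, message) is a guess at a typical format and may not match your server's logs, so adjust the regular expression as needed. `countErrorMessages` is a hypothetical helper name.

```javascript
// Assumed line shape: "<timestamp> <LEVEL> <message>".
const LOG_LINE = /^(\S+)\s+(DEBUG|INFO|WARN|ERROR)\s+(.*)$/;

// Returns [message, count] pairs for ERROR lines, most frequent first.
function countErrorMessages(logText) {
  const counts = new Map();
  for (const line of logText.split('\n')) {
    const match = LOG_LINE.exec(line);
    if (!match || match[2] !== 'ERROR') continue;
    const message = match[3];
    counts.set(message, (counts.get(message) || 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}
```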
Testing Connectivity
If you suspect network issues:
# Test basic connectivity
curl -I https://your-server-url.mcp-cloud.ai/health
# Test with authentication
curl -I -H "Authorization: Bearer YOUR_API_KEY" \
https://your-server-url.mcp-cloud.ai/v1/models
Memory Usage Analysis
To diagnose memory issues:
- Navigate to the "Metrics" tab
- Check the memory usage graph
- Look for patterns in memory consumption
- Identify memory spikes and correlate with specific operations
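Spike detection can also be done offline on samples you have recorded yourself. In this sketch a spike is any sample above a multiple of the median. The sample shape, the 1.5 factor, and the name `findMemorySpikes` are all assumptions for illustration.

```javascript
// samples: [{ timestamp, memoryMb }] collected from your own metrics.
// Returns the samples whose memory exceeds `factor` times the median.
function findMemorySpikes(samples, factor = 1.5) {
  if (samples.length === 0) return [];
  const sorted = samples.map((s) => s.memoryMb).sort((a, b) => a - b);
  // For an even number of samples this picks the upper middle value.
  const median = sorted[Math.floor(sorted.length / 2)];
  return samples.filter((s) => s.memoryMb > median * factor);
}
```

The timestamps of the returned samples can then be matched against your request logs to find the operations responsible.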
Latency Analysis
To analyze latency issues:
- Review the response time metrics
- Check for correlation with request payload size
- Test with different request parameters
- Compare performance across different times of day
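When comparing runs, percentiles are usually more informative than averages because a few slow requests can hide behind a healthy mean. This sketch computes a nearest-rank percentile over response times you have measured yourself. `percentile` is a hypothetical helper name.

```javascript
// Nearest-rank percentile over response times in milliseconds.
function percentile(latenciesMs, p) {
  if (latenciesMs.length === 0) throw new RangeError('no samples');
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Usage: compare p50 and p95 for small and large payloads, or for
// different times of day.
// console.log(percentile(samples, 50), percentile(samples, 95));
```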
Advanced Troubleshooting
Custom Health Checks
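Use a short script to query your server's health endpoint:
// Simple health check: logs the HTTP status and the parsed response body
fetch('https://your-server-url.mcp-cloud.ai/health')
  .then(response => {
    console.log('Status:', response.status);
    // Treat non-2xx responses as failures instead of parsing them
    if (!response.ok) throw new Error(`Unexpected status ${response.status}`);
    return response.json();
  })
  .then(data => console.log('Health data:', data))
  .catch(error => console.error('Health check failed:', error));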
Minimal Reproduction
When reporting issues to support, create a minimal reproduction:
- Isolate the specific request that causes the issue
- Remove any unnecessary parameters
- Document exact steps to reproduce
- Include full error messages and logs
Live Debug Mode
For persistent issues, you can enable live debug mode:
- Navigate to server settings
- Enable "Debug Mode"
- Use the real-time console to monitor operations
- Disable Debug Mode after troubleshooting to avoid performance overhead
Common Error Messages
Error Message | Likely Cause | Solution |
---|---|---|
"Rate limit exceeded" | Too many requests | Reduce request frequency or upgrade plan |
"Context length exceeded" | Input + output tokens too large | Reduce input size or use a model with larger context |
"Invalid API key" | Authentication issue | Verify the API key; rotate it if it is expired or invalid |
"Service unavailable" | Server offline or restarting | Check server status and wait, or restart if needed |
"Resource not found" | Incorrect endpoint or deleted resource | Verify URL and resource existence |
"Validation error" | Incorrect request format | Check API documentation for correct request structure |
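Of the errors in the table, "Rate limit exceeded" and "Service unavailable" are the ones where waiting and retrying is likely to help. The others need a change to the request or configuration. This is a sketch of that distinction with an exponential backoff schedule. The delay values and helper names are assumptions, not documented MCP-Cloud behavior.

```javascript
// Error messages from the table above that are worth retrying.
const RETRYABLE = ['Rate limit exceeded', 'Service unavailable'];

function isRetryable(errorMessage) {
  return RETRYABLE.some((text) => errorMessage.includes(text));
}

// Exponential backoff without jitter: base, 2x base, 4x base, ...
// capped at maxMs. `attempt` starts at 0.
function backoffDelayMs(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}
```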
Contacting Support
If you can't resolve the issue:
Collect relevant information:
- Server ID
- Error messages
- Timestamps of issues
- Steps to reproduce
- Recently made changes
Submit a support ticket:
- Email: [email protected]
- Support portal: https://support.mcp-cloud.ai
For urgent issues, use the emergency support channel available to Business and Enterprise tier customers.
Preventative Measures
Monitoring
Set up proactive monitoring:
- Configure alerts for critical metrics
- Set up uptime monitoring
- Implement error rate thresholds
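An error rate threshold can be as simple as the share of failed responses in a recent window. The sketch below counts 5xx status codes as failures. The 5% default and the helper names are illustrative assumptions.

```javascript
// statuses: HTTP status codes observed in the window being evaluated.
// Returns the fraction of responses that were server errors (5xx).
function errorRate(statuses) {
  if (statuses.length === 0) return 0;
  const failures = statuses.filter((code) => code >= 500).length;
  return failures / statuses.length;
}

// True when the error rate in the window exceeds the threshold.
function shouldAlert(statuses, threshold = 0.05) {
  return errorRate(statuses) > threshold;
}
```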
Testing
Implement testing practices:
- Create test suites for your integration
- Run periodic health checks
- Run load tests before production use
Updates
Stay current with updates:
- Follow release notes
- Plan for template version upgrades