Troubleshooting MCP Servers

This guide provides solutions to common issues encountered when working with MCP servers, along with diagnostic procedures and best practices.

Common Issues and Solutions

Deployment Issues

Deployment Stuck in "Deploying" State

Symptoms:

Possible Causes:

  1. Network connectivity issues
  2. Resource quota exhaustion
  3. Template configuration errors

Solutions:

  1. Check your account resource limits
  2. Verify your network connectivity to MCP-Cloud
  3. Review the deployment logs under "Advanced Logs"
  4. Try deploying in a different region
  5. If the issue persists, delete the deployment and redeploy it

Deployment Failed

Symptoms:

Possible Causes:

  1. Invalid environment variables
  2. Resource constraints
  3. Template compatibility issues

Solutions:

  1. Check the error message in the deployment logs
  2. Verify all required environment variables are set correctly
  3. Try increasing the memory allocation
  4. Deploy with a different template version
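Solution 2 can be checked with a short script before you deploy. The following is a minimal sketch, assuming a Node.js environment. The variable names in REQUIRED_VARS are hypothetical placeholders, so substitute the ones your template actually documents.

```javascript
// Hypothetical list of variables a template might require; replace with your own.
const REQUIRED_VARS = ['MCP_API_KEY', 'MCP_REGION'];

// Return the names of required variables that are unset or blank in `env`.
function findMissingEnvVars(required, env) {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || String(value).trim() === '';
  });
}

const missing = findMissingEnvVars(REQUIRED_VARS, process.env);
if (missing.length > 0) {
  console.error('Missing environment variables:', missing.join(', '));
} else {
  console.log('All required environment variables are set.');
}
```

Blank values are reported as missing because a variable that is set to an empty string usually fails in the same way as one that is not set at all.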

Authentication Issues

API Key Not Working

Symptoms:

Solutions:

  1. Verify the API key is correct and not expired
  2. Check if the API key has the necessary permissions
  3. Ensure the Authorization header uses the prefix your server expects (Bearer or Key)
  4. Check for any trailing whitespace in the API key
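Solutions 3 and 4 can be handled in one place when the request is built. This is a minimal sketch: buildAuthHeader is a hypothetical helper name, and which prefix applies depends on your server's configuration.

```javascript
// Build an Authorization header value from a raw key.
// `scheme` is the prefix your server expects ("Bearer" or "Key").
function buildAuthHeader(rawKey, scheme = 'Bearer') {
  // Drop stray spaces and newlines that often come along with copy/paste.
  const key = String(rawKey).trim();
  if (key === '') {
    throw new Error('API key is empty');
  }
  // Avoid a doubled prefix such as "Bearer Bearer abc" if the key was pasted with one.
  const withoutPrefix = key.replace(/^(Bearer|Key)\s+/i, '');
  return `${scheme} ${withoutPrefix}`;
}

console.log(buildAuthHeader('  YOUR_API_KEY\n')); // prints "Bearer YOUR_API_KEY"
```

Pass the result as the value of the Authorization header in your request.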

Unable to Access Server

Symptoms:

Solutions:

  1. Verify your subscription status
  2. Check the server's access control settings
  3. Ensure your IP is not blocked by any firewall rules
  4. Verify your account permissions

Performance Issues

High Latency

Symptoms:

Possible Causes:

  1. Insufficient resources
  2. High server load
  3. Network latency
  4. Large context windows

Solutions:

  1. Increase the memory allocation
  2. Increase the CPU allocation
  3. Deploy in a region closer to your users
  4. Optimize prompt length and complexity
  5. Check whether concurrent requests are overloading the server
  6. Increase the timeout setting for long operations
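If concurrent requests turn out to be the cause (solution 5), one client-side option is to cap how many requests are in flight at once. The sketch below is an illustration rather than an MCP-Cloud feature. runWithConcurrencyLimit is a hypothetical helper, and each task is any function that returns a promise, such as a wrapped fetch call.

```javascript
// Run async tasks with at most `limit` in flight at once.
// `tasks` is an array of functions that each return a promise.
// Results are returned in the same order as the tasks.
async function runWithConcurrencyLimit(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;

  // Each worker pulls the next unstarted task until none remain.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

A rejected task rejects the whole call, so wrap individual tasks in try/catch if the remaining requests should continue after a failure.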

Out of Memory Errors

Symptoms:

Solutions:

  1. Increase the memory allocation
  2. Reduce the complexity of requests
  3. Optimize context window size
  4. Process large requests in smaller batches
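Solution 4 can be sketched as a small batching helper. This is an illustration only: toBatches is a hypothetical name, and a suitable batch size depends on your server's memory allocation.

```javascript
// Split `items` into consecutive batches of at most `batchSize` elements.
function toBatches(items, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

console.log(JSON.stringify(toBatches(['a', 'b', 'c', 'd', 'e'], 2)));
// prints [["a","b"],["c","d"],["e"]]
```

Send each batch as its own request and wait for it to finish before sending the next, so that only one batch occupies server memory at a time.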

Specific Template Issues

Default Template

Common Issues:

Solutions:

Playwright Template

Common Issues:

Solutions:

Diagnostic Procedures

Checking Server Logs

Server logs are the primary diagnostic tool:

  1. Navigate to your server's dashboard
  2. Select the "Logs" tab
  3. Filter by log level (INFO, ERROR, etc.)
  4. Look for error patterns and stack traces

Testing Connectivity

If you suspect network issues:

# Test basic connectivity
curl -I https://your-server-url.mcp-cloud.ai/health

# Test with authentication
curl -I -H "Authorization: Bearer YOUR_API_KEY" \
  https://your-server-url.mcp-cloud.ai/v1/models
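The status code returned by each check narrows down where the problem lies. The mapping below is a rough sketch based on standard HTTP status semantics, not on MCP-Cloud-specific behavior. interpretStatus is a hypothetical helper name.

```javascript
// Translate an HTTP status code from a connectivity check into a likely problem area.
function interpretStatus(status) {
  if (status >= 200 && status < 300) return 'reachable';
  if (status === 401 || status === 403) return 'authentication or permissions';
  if (status === 404) return 'wrong URL or missing resource';
  if (status === 429) return 'rate limited';
  if (status >= 500) return 'server-side problem';
  return 'unexpected response';
}

console.log(interpretStatus(401)); // prints "authentication or permissions"
```

If curl returns no status code at all (for example a timeout or a DNS failure), the problem is on the network path rather than in the server.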

Memory Usage Analysis

To diagnose memory issues:

  1. Navigate to the "Metrics" tab
  2. Check the memory usage graph
  3. Look for patterns in memory consumption
  4. Identify memory spikes and correlate with specific operations
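If you record memory readings yourself, step 4 can be approximated in code. The sketch below assumes you have samples as objects with a time and a memoryMb field. That shape is hypothetical and is not a documented export format. A sample counts as a spike when it exceeds the average of the preceding samples by a chosen ratio.

```javascript
// Flag samples whose memory usage exceeds the average of the preceding
// `window` samples by more than `ratio` (1.5 means 50% above the recent average).
function findMemorySpikes(samples, window = 3, ratio = 1.5) {
  const spikes = [];
  for (let i = window; i < samples.length; i++) {
    const recent = samples.slice(i - window, i);
    const average = recent.reduce((sum, s) => sum + s.memoryMb, 0) / window;
    if (samples[i].memoryMb > average * ratio) {
      spikes.push(samples[i]);
    }
  }
  return spikes;
}

// Hypothetical samples: timestamp plus memory in MB.
const samples = [
  { time: '10:00', memoryMb: 200 },
  { time: '10:01', memoryMb: 210 },
  { time: '10:02', memoryMb: 205 },
  { time: '10:03', memoryMb: 480 },
  { time: '10:04', memoryMb: 215 },
];
// Only the 10:03 sample is flagged: 480 MB against a recent average of 205 MB.
console.log(findMemorySpikes(samples).map((s) => s.time));
```

Compare the flagged timestamps with the server logs to find which operations were running at that moment.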

Latency Analysis

To analyze latency issues:

  1. Review the response time metrics
  2. Check for correlation with request payload size
  3. Test with different request parameters
  4. Compare performance across different times of day
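Step 2 can be quantified with a correlation coefficient. The sketch below computes the Pearson correlation between payload sizes and response times that you have collected yourself. The input arrays are assumptions about how you record your measurements.

```javascript
// Pearson correlation coefficient between two equal-length numeric arrays.
// Values near 1 suggest latency grows with payload size;
// values near 0 suggest no linear relationship.
function pearson(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error('Need two arrays of equal length with at least 2 values');
  }
  const mean = (values) => values.reduce((a, b) => a + b, 0) / values.length;
  const mx = mean(xs);
  const my = mean(ys);
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < xs.length; i++) {
    const dx = xs[i] - mx;
    const dy = ys[i] - my;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  // Correlation is undefined when either series is constant.
  if (varX === 0 || varY === 0) return NaN;
  return cov / Math.sqrt(varX * varY);
}

// Hypothetical measurements: payload size in KB and response time in ms.
const payloadKb = [1, 5, 10, 50, 100];
const responseMs = [120, 150, 180, 600, 1100];
console.log(pearson(payloadKb, responseMs).toFixed(2));
```

A strong positive value points toward payload size as the driver. A weak value suggests looking at server load or network latency instead.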

Advanced Troubleshooting

Custom Health Checks

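Call the server's health endpoint from a short script to test your server:

// Simple health check: request /health and print the status and body
fetch('https://your-server-url.mcp-cloud.ai/health')
  .then(response => {
    console.log('Status:', response.status);
    // Fail early on non-2xx responses, whose body may not be JSON
    if (!response.ok) {
      throw new Error(`Unexpected status ${response.status}`);
    }
    return response.json();
  })
  .then(data => console.log('Health data:', data))
  .catch(error => console.error('Health check failed:', error));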

Minimal Reproduction

When reporting issues to support, create a minimal reproduction:

  1. Isolate the specific request that causes the issue
  2. Remove any unnecessary parameters
  3. Document exact steps to reproduce
  4. Include full error messages and logs

Live Debug Mode

For persistent issues, you can enable live debug mode:

  1. Navigate to server settings
  2. Enable "Debug Mode"
  3. Use the real-time console to monitor operations
  4. Disable after troubleshooting to prevent performance impact

Common Error Messages

| Error Message             | Likely Cause                           | Solution                                              |
|---------------------------|----------------------------------------|-------------------------------------------------------|
| "Rate limit exceeded"     | Too many requests                      | Reduce request frequency or upgrade plan              |
| "Context length exceeded" | Input + output tokens too large        | Reduce input size or use a model with larger context  |
| "Invalid API key"         | Authentication issue                   | Verify and rotate API key                             |
| "Service unavailable"     | Server offline or restarting           | Check server status and wait, or restart if needed    |
| "Resource not found"      | Incorrect endpoint or deleted resource | Verify URL and resource existence                     |
| "Validation error"        | Incorrect request format               | Check API documentation for correct request structure |
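Client code can react to these messages automatically. The sketch below maps each message to a suggested action and computes a retry delay for the transient ones. The action labels, the helper names and the default delays are illustrative assumptions, not values defined by MCP-Cloud.

```javascript
// Map known error messages to a suggested client action.
// The action names are illustrative labels only.
const ERROR_ACTIONS = [
  { pattern: /rate limit exceeded/i, action: 'retry-with-backoff' },
  { pattern: /service unavailable/i, action: 'retry-with-backoff' },
  { pattern: /context length exceeded/i, action: 'reduce-input' },
  { pattern: /invalid api key/i, action: 'check-credentials' },
  { pattern: /resource not found/i, action: 'check-url' },
  { pattern: /validation error/i, action: 'fix-request' },
];

// Return the suggested action for an error message, or a fallback if unknown.
function suggestAction(message) {
  const match = ERROR_ACTIONS.find((entry) => entry.pattern.test(message));
  return match ? match.action : 'contact-support';
}

// Exponential backoff delay: baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

console.log(suggestAction('Rate limit exceeded')); // prints "retry-with-backoff"
console.log(backoffDelayMs(3)); // prints 4000
```

Only the transient errors are worth retrying. Repeating a request that fails validation or authentication produces the same error and adds load.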

Contacting Support

If you can't resolve the issue:

  1. Collect relevant information:

    • Server ID
    • Error messages
    • Timestamps of issues
    • Steps to reproduce
    • Recently made changes
  2. Submit a support ticket:

  3. For urgent issues, use the emergency support channel available to Business and Enterprise tier customers.

Preventative Measures

Monitoring

Set up proactive monitoring:

Testing

Implement testing practices:

Updates

Stay current with updates: