Troubleshooting

Troubleshooting - Common Issues and Solutions

This guide helps you diagnose and resolve common issues with Claude Flow, including error messages, performance problems, and configuration issues.

Common Errors and Solutions

Agent Spawn Failures

Error: "Agent spawn failed: Maximum agents exceeded"

```
Error: E010 - Resource limit exceeded
Maximum agents per swarm: 50
```

Causes:

- Swarm has reached maximum agent capacity
- Resource limits preventing new agent creation

Solutions:

Check current agent count:
```
claude-flow agent list --swarm-id <id>
```

Remove idle agents:

```
claude-flow agent cleanup --idle-timeout 300
```

Increase swarm capacity:
```
claude-flow swarm scale 75 --max 100
```

Create additional swarm:

```
claude-flow swarm init --topology mesh --max-agents 50
```

Error: "Agent type not recognized"

```
Error: Unknown agent type 'custom-agent'
```

Solutions:

List available agent types:
```
claude-flow agent types --list
```

Check for typos in the agent type name.

Use closest matching agent type with custom capabilities:

```
claude-flow agent spawn specialist \
  --capabilities "custom-logic,domain-specific"
```

Memory Operation Issues

Error: "Memory operation failed: Namespace full"

```
Error: E004 - Memory operation failed
Namespace 'project-data' has exceeded 1GB limit
```

Solutions:

Check namespace usage:

```
claude-flow memory analytics --namespace project-data
```

Clean up old entries:

```
claude-flow memory cleanup \
  --namespace project-data \
  --older-than 30d
```

Compress namespace:

```
claude-flow memory compress --namespace project-data
```

Increase namespace limit (config):

```
claude-flow config set memory.namespace.limit 2GB
```

Error: "Memory key not found"

```
Error: Key 'config/database' not found in namespace 'default'
```

Solutions:

List available keys:

```
claude-flow memory usage --action list --namespace default
```

Search for similar keys:

```
claude-flow memory search "config*" --namespace default
```

Check if using correct namespace:

```
claude-flow memory usage \
  --action retrieve \
  --key "config/database" \
  --namespace "project-config"
```
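When you already have a list of keys, a wildcard search like the one above can be approximated locally. The following is a minimal sketch that assumes only the asterisk is treated as a wildcard; it is not Claude Flow's actual search implementation, and globToRegExp and matchKeys are hypothetical helper names.

```javascript
// Converts a simple glob (only "*" is special) into an anchored RegExp.
function globToRegExp(glob) {
  // Escape every regex metacharacter except "*".
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  // Each "*" matches any run of characters, including "/".
  return new RegExp(`^${escaped.replace(/\*/g, '.*')}$`);
}

// Returns the keys that match the glob pattern.
function matchKeys(keys, glob) {
  const pattern = globToRegExp(glob);
  return keys.filter((key) => pattern.test(key));
}
```

Running matchKeys against the keys of each namespace in turn is a quick way to spot a key that was stored under a different namespace than expected.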

Task Orchestration Problems

Error: "Task orchestration failed: Circular dependency detected"

```
Error: E003 - Task orchestration error
Circular dependency: A -> B -> C -> A
```

Solutions:

Visualize task dependencies:

```
claude-flow task visualize --task-id <id> --format graph
```

Break circular dependency:

```
// Instead of circular dependencies
const circularTasks = {
  A: { deps: ['B'] },  // A depends on B
  B: { deps: ['C'] },  // B depends on C
  C: { deps: ['A'] }   // C depends on A - CIRCULAR!
};

// Refactor to remove circularity: dependencies now flow one way from 'prepare'
const tasks = {
  prepare: { deps: [] },
  A: { deps: ['prepare'] },
  B: { deps: ['A'] },
  C: { deps: ['A'] }
};
```

Use parallel execution where possible:

```
claude-flow task orchestrate \
  --task "Process data" \
  --strategy parallel \
  --ignore-soft-deps
```
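To check a task map for cycles before submitting it, a depth-first search is usually enough. The sketch below assumes tasks are described in the same shape as the example above (a name mapped to an object with a deps array); findCycle is a hypothetical helper, not part of Claude Flow.

```javascript
// Finds one dependency cycle in a task map shaped like { name: { deps: [...] } }.
// Returns the cycle as a list of names (first name repeated at the end),
// or null when the graph is acyclic.
function findCycle(tasks) {
  const state = new Map(); // name -> 'visiting' | 'done'
  const path = [];         // names on the current DFS path

  function visit(name) {
    if (state.get(name) === 'done') return null;
    if (state.get(name) === 'visiting') {
      // Back edge: the cycle starts at the first occurrence of name on the path.
      return [...path.slice(path.indexOf(name)), name];
    }
    state.set(name, 'visiting');
    path.push(name);
    for (const dep of (tasks[name]?.deps ?? [])) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    path.pop();
    state.set(name, 'done');
    return null;
  }

  for (const name of Object.keys(tasks)) {
    const cycle = visit(name);
    if (cycle) return cycle;
  }
  return null;
}
```

Joining the returned list with " -> " gives a message in the same form as the error output, which makes it easy to see which edge to remove.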

Error: "Task timeout exceeded"

```
Error: Task 'complex-analysis' exceeded timeout of 3600 seconds
```

Solutions:

Increase timeout:

```
claude-flow task orchestrate \
  --task "Complex analysis" \
  --timeout 7200
```

Break into smaller tasks:

```
claude-flow task orchestrate \
  --task "Analysis part 1" \
  --checkpoint \
  --continue-on-timeout
```

Use adaptive strategy:

```
claude-flow task orchestrate \
  --task "Large dataset processing" \
  --strategy adaptive \
  --scale-on-demand
```

Swarm Coordination Issues

Error: "Swarm initialization failed: Invalid topology"

```
Error: E005 - Swarm initialization failed
Topology 'custom' is not supported
```

Solutions:

Use supported topology:

```
# Supported: hierarchical, mesh, ring, star
claude-flow swarm init --topology hierarchical
```

Check topology compatibility:

```
claude-flow swarm validate-topology \
  --agents "coder:5,tester:3" \
  --topology mesh
```

Error: "Swarm communication timeout"

```
Error: Agent communication timeout after 30s
Swarm: swarm-123, Agent: agent-456
```

Solutions:

Check swarm health:

```
claude-flow swarm status <swarm-id> --detailed
```

Restart communication layer:

```
claude-flow swarm repair <swarm-id> --fix-communication
```

Reduce swarm load:

```
claude-flow load balance --swarm-id <id> --redistribute
```

GitHub Integration Errors

Error: "GitHub API rate limit exceeded"

```
Error: E007 - GitHub API error
Rate limit exceeded. Reset at 2024-01-15T12:00:00Z
```

Solutions:

Check rate limit status:
```
claude-flow github rate-limit --check
```

Use authenticated requests:

```
export GITHUB_TOKEN=ghp_xxxxxxxxxxxx
claude-flow config set github.token "$GITHUB_TOKEN"
```

Implement caching:

```
claude-flow config set github.cache.enabled true
claude-flow config set github.cache.ttl 3600
```

Batch operations:

```
claude-flow github batch-operations \
  --operations "analyze,metrics,issues" \
  --repo owner/repo
```
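If you call the GitHub REST API from your own scripts, the response headers x-ratelimit-remaining and x-ratelimit-reset (a Unix timestamp in seconds) tell you how long to wait before retrying. The sketch below assumes the headers have already been copied into a plain object with lowercase keys; secondsUntilReset is a hypothetical helper name.

```javascript
// Returns how many seconds to wait before the next request, based on
// GitHub's rate-limit response headers. Returns 0 when quota remains
// or when the headers are missing.
function secondsUntilReset(headers, nowMs = Date.now()) {
  const remaining = Number(headers['x-ratelimit-remaining']);
  const resetEpoch = Number(headers['x-ratelimit-reset']);
  if (Number.isNaN(remaining) || Number.isNaN(resetEpoch)) return 0; // headers absent
  if (remaining > 0) return 0; // quota still available
  return Math.max(0, Math.ceil(resetEpoch - nowMs / 1000));
}
```

Sleeping for the returned number of seconds before retrying avoids spending further requests that would be rejected anyway.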

Error: "Repository access denied"

```
Error: 403 - Access denied to repository 'private/repo'
```

Solutions:

Verify token permissions:

```
claude-flow github check-permissions --token "$GITHUB_TOKEN"
```

Update token scopes:
- Required scopes: repo, read:org, workflow

Check repository visibility:

```
claude-flow github repo info private/repo --check-access
```

Neural Network Training Issues

Error: "Neural training failed: Insufficient data"

```
Error: E006 - Neural training error
Training data has only 50 samples, minimum 1000 required
```

Solutions:

Augment training data:

```
claude-flow neural augment \
  --input-data ./small-dataset.json \
  --augmentation-factor 20 \
  --output ./augmented-data.json
```

Use transfer learning:

```
claude-flow neural train \
  --pattern-type optimization \
  --base-model "pretrained-general" \
  --fine-tune ./small-dataset.json
```

Reduce model complexity:

```
claude-flow neural train \
  --pattern-type prediction \
  --model-size small \
  --epochs 20
```

Error: "WASM module initialization failed"

```
Error: Failed to initialize WASM SIMD module
```

Solutions:

Check WASM support:
```
claude-flow system check --wasm-support
```

Fallback to non-SIMD:

```
claude-flow config set neural.wasm.simd false
```

Update Node.js version:

```
# Requires Node.js 16+ for WASM SIMD
node --version
```
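The version requirement can also be checked from a script, for example in a pre-flight step. This is a minimal sketch that only compares the major version number against the minimum stated above; meetsMinimumNode is a hypothetical helper, and passing the check does not by itself prove that SIMD is enabled in a given build.

```javascript
// Returns true when a Node.js version string such as "v18.17.0" or "16.4.0"
// has a major version of at least minMajor.
function meetsMinimumNode(version, minMajor = 16) {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) throw new Error(`Unrecognized version string: ${version}`);
  return Number(match[1]) >= minMajor;
}
```

In a Node.js process, the running version is available as process.version, so meetsMinimumNode(process.version) covers the common case.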

Performance Issues

Slow Agent Response Times

Symptoms:

- Agents taking >30s to respond
- Task completion times increasing
- Timeout errors

Diagnostics:

```
# Check agent performance
claude-flow agent metrics <agent-id> --period 1h

# Monitor resource usage
claude-flow performance report --components agents --format detailed

# Identify bottlenecks
claude-flow bottleneck analyze --component swarm
```

Solutions:

Optimize agent allocation:

```
claude-flow topology optimize --swarm-id <id>
```

Reduce agent workload:

```
// Split large tasks
const subtasks = splitTask(largeTask, { maxSize: 1000 });
await orchestrate(subtasks, { parallel: true });
```

Enable agent pooling:

```
claude-flow config set agents.pooling.enabled true
claude-flow config set agents.pooling.idle-timeout 300
```

Implement caching:

```
claude-flow config set agents.cache.results true
claude-flow config set agents.cache.ttl 1800
```
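The workload example above calls splitTask without defining it. One plausible reading, sketched here under the assumption that a large task is simply a list of work items, is to cut the list into fixed-size chunks; splitIntoChunks is a hypothetical stand-in, not a Claude Flow function.

```javascript
// Cuts a list of work items into consecutive chunks of at most maxSize items.
function splitIntoChunks(items, maxSize) {
  if (!Number.isInteger(maxSize) || maxSize < 1) {
    throw new RangeError('maxSize must be a positive integer');
  }
  const chunks = [];
  for (let i = 0; i < items.length; i += maxSize) {
    chunks.push(items.slice(i, i + maxSize));
  }
  return chunks;
}
```

Each chunk can then be handed to a separate agent, so that no single agent receives more than maxSize items.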

High Memory Usage

Symptoms:

- Memory usage >80%
- Frequent garbage collection
- Out of memory errors

Solutions:

Memory profiling:

```
claude-flow memory profile --duration 300 --export profile.json
```

Cleanup strategies:

```
# Automatic cleanup
claude-flow memory auto-cleanup \
  --threshold 80 \
  --strategy lru \
  --preserve-critical

# Manual cleanup
claude-flow memory cleanup \
  --older-than 7d \
  --exclude-namespaces "critical,config"
```

Optimize memory usage:

```
// Use streaming for large data
const stream = await claudeFlow.data.stream({
  source: 'large-dataset',
  batchSize: 100,
  processInMemory: false
});
```
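To make the lru strategy with an 80 percent threshold more concrete, here is a minimal sketch of least-recently-used eviction. It is an illustration of the general technique only; the entry shape (key, size, lastAccess, critical) and the name selectLruEvictions are assumptions, and Claude Flow's own cleanup may work differently.

```javascript
// Chooses which entries to evict, oldest access first, until total usage
// is at or below thresholdPercent of capacity. Entries flagged critical
// are never evicted (analogous to --preserve-critical).
function selectLruEvictions(entries, capacity, thresholdPercent) {
  const limit = (capacity * thresholdPercent) / 100;
  let used = entries.reduce((sum, entry) => sum + entry.size, 0);
  const candidates = entries
    .filter((entry) => !entry.critical)
    .sort((a, b) => a.lastAccess - b.lastAccess); // least recently used first
  const evicted = [];
  for (const entry of candidates) {
    if (used <= limit) break;
    used -= entry.size;
    evicted.push(entry.key);
  }
  return evicted;
}
```

Because critical entries are skipped, usage can remain above the threshold when most of the space is held by critical data; in that case the returned list simply contains every non-critical key.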

Slow Workflow Execution

Symptoms:

- Workflows taking hours instead of minutes
- Sequential execution when parallel possible
- Resource underutilization

Solutions:

Analyze workflow:

```
claude-flow workflow analyze <workflow-id> \
  --identify-bottlenecks \
  --suggest-optimizations
```

Optimize parallelism:

```
# workflow.yaml
optimization:
  parallel_stages:
    - [test, lint, security-scan]
    - [build-frontend, build-backend]
  max_concurrent: 10
  resource_allocation: dynamic
```

Implement workflow caching:

```
claude-flow workflow cache enable \
  --workflow-id <id> \
  --cache-key-strategy content-hash
```

Configuration Problems

Environment Variable Issues

Problem: Claude Flow not recognizing environment variables

Solutions:

Check variable loading:
```
claude-flow config env --verify
```

Set variables correctly:

```
# .env file
CLAUDE_FLOW_API_KEY=xxx
GITHUB_TOKEN=ghp_xxx
NODE_ENV=production

# Load explicitly
claude-flow config load-env --file .env
```

Debug configuration:
```
claude-flow config debug --show-sources
```
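When a script depends on these variables, it can help to fail early with a clear list of what is missing. The sketch below is one way to do that in plain JavaScript; missingEnvVars is a hypothetical helper, and the variable names are taken from the sample .env file above.

```javascript
// Returns the names from `required` that are unset or blank in `env`.
function missingEnvVars(required, env) {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || String(value).trim() === '';
  });
}
```

In Node.js, calling missingEnvVars with the names CLAUDE_FLOW_API_KEY and GITHUB_TOKEN and with process.env as the second argument reports which of the two still need to be set.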

Configuration File Errors

Problem: "Invalid configuration file"

Solutions:

Validate configuration:

```
claude-flow config validate --file claude-flow.config.json
```

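Fix common issues. The "version" field is required, "defaultTopology" must be a supported topology, and "maxAgents" must be a number, not a string. Standard JSON does not allow comments, so these notes are given here rather than inline:

```
{
  "version": "1.0",
  "swarm": {
    "defaultTopology": "hierarchical",
    "maxAgents": 50
  },
  "memory": {
    "defaultNamespace": "default",
    "persistence": true
  }
}
```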

Reset to defaults:

```
claude-flow config reset --backup-current
```
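The three rules noted in the sample configuration can also be checked in code before the file is loaded. This is a minimal sketch, not the validator that Claude Flow ships; validateConfig is a hypothetical helper, and the list of topologies is the one given in the swarm section of this guide.

```javascript
const SUPPORTED_TOPOLOGIES = ['hierarchical', 'mesh', 'ring', 'star'];

// Returns a list of human-readable problems; an empty list means all checks passed.
function validateConfig(config) {
  const problems = [];
  if (typeof config.version !== 'string' || config.version === '') {
    problems.push('version is required');
  }
  const swarm = config.swarm ?? {};
  if (!SUPPORTED_TOPOLOGIES.includes(swarm.defaultTopology)) {
    problems.push(`swarm.defaultTopology must be one of: ${SUPPORTED_TOPOLOGIES.join(', ')}`);
  }
  if (!Number.isInteger(swarm.maxAgents)) {
    problems.push('swarm.maxAgents must be an integer, not a string');
  }
  return problems;
}
```

Parsing the file with JSON.parse first and passing the result to validateConfig separates syntax errors (reported by the parser) from the value errors checked here.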

Network and Connectivity Issues

WebSocket Connection Failures

Problem: Real-time features not working

Solutions:

Test WebSocket connectivity:

```
claude-flow network test --protocol websocket
```

Configure proxy settings:

```
claude-flow config set network.proxy.websocket "ws://proxy:8080"
```

Fallback options:

```
claude-flow config set network.fallback.enabled true
claude-flow config set network.fallback.polling-interval 5000
```

API Connection Timeouts

Problem: Frequent timeouts when calling APIs

Solutions:

Increase timeouts:

```
claude-flow config set network.timeout.default 60000
claude-flow config set network.timeout.github 120000
```

Implement retry logic:

```
claude-flow config set network.retry.enabled true
claude-flow config set network.retry.max-attempts 3
claude-flow config set network.retry.backoff exponential
```
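For API calls made from your own code, the same idea (a bounded number of attempts with exponentially growing waits) can be sketched as follows. The base delay of one second and the cap of sixty seconds are assumptions chosen for illustration, and backoffDelays and withRetry are hypothetical helpers rather than Claude Flow APIs.

```javascript
// Delays before each retry: baseMs, 2 * baseMs, 4 * baseMs, ... capped at maxMs.
// With maxAttempts = 3 there are two retries, hence two delays.
function backoffDelays(maxAttempts, baseMs = 1000, maxMs = 60000) {
  const delays = [];
  for (let attempt = 1; attempt < maxAttempts; attempt++) {
    delays.push(Math.min(maxMs, baseMs * 2 ** (attempt - 1)));
  }
  return delays;
}

// Calls fn until it succeeds or maxAttempts calls have failed,
// waiting according to backoffDelays between attempts.
async function withRetry(fn, maxAttempts = 3, baseMs = 1000) {
  const delays = backoffDelays(maxAttempts, baseMs);
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= delays.length) throw err; // attempts exhausted
      await new Promise((resolve) => setTimeout(resolve, delays[attempt]));
    }
  }
}
```

Production retry code often adds random jitter to each delay so that many clients do not retry at the same moment; that is left out here to keep the sketch short.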

Recovery Procedures

Corrupted State Recovery

When swarm or workflow state becomes corrupted:

```
# 1. Create backup
claude-flow state backup --all --output backup.tar.gz

# 2. Analyze corruption
claude-flow state analyze --check-integrity

# 3. Attempt repair
claude-flow state repair --auto-fix

# 4. If repair fails, restore from checkpoint
claude-flow state restore --checkpoint last-known-good
```

Emergency Shutdown

When the system becomes unresponsive:

```
# 1. Graceful shutdown attempt
claude-flow emergency shutdown --timeout 30

# 2. Force shutdown if needed
claude-flow emergency shutdown --force --save-state

# 3. Clean up resources
claude-flow cleanup --all --remove-locks

# 4. Restart with recovery
claude-flow init --recover-from shutdown-state.json
```

Debugging Tools

Enable Debug Logging

```
# Verbose logging
claude-flow --verbose <command>

# Debug specific component
DEBUG=claude-flow:agents claude-flow agent spawn coder

# Full debug mode
claude-flow config set debug.enabled true
claude-flow config set debug.level trace
```

Performance Profiling

```
# CPU profiling
claude-flow profile cpu --duration 60 --output cpu-profile.json

# Memory profiling
claude-flow profile memory --interval 1000 --output memory-profile.json

# Network profiling
claude-flow profile network --capture-packets --output network.pcap
```

Health Checks

```
# System health check
claude-flow health check --comprehensive

# Component-specific checks
claude-flow health check --components "agents,memory,network"

# Continuous monitoring
claude-flow health monitor --interval 30 --alert-on-issues
```

Preventive Measures

Regular Maintenance

Daily tasks:
```
claude-flow maintenance daily
```

Weekly tasks:

```
claude-flow maintenance weekly --include-optimization
```

Monthly tasks:

```
claude-flow maintenance monthly --deep-clean
```

Monitoring Setup

```
# Setup alerts
claude-flow monitor setup \
  --metrics "cpu,memory,errors" \
  --thresholds "cpu:80,memory:85,errors:10" \
  --notify webhook,email

# Dashboard
claude-flow monitor dashboard --port 3000
```
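The thresholds string above is a compact list of metric and limit pairs. As a rough illustration of how such a specification might be evaluated against current readings, here is a minimal sketch; parseThresholds and breachedMetrics are hypothetical helpers, and treating a breach as "strictly greater than the limit" is an assumption.

```javascript
// Parses a spec like "cpu:80,memory:85,errors:10" into { cpu: 80, memory: 85, errors: 10 }.
function parseThresholds(spec) {
  const thresholds = {};
  for (const part of spec.split(',')) {
    const [name, value] = part.split(':');
    const limit = Number(value);
    if (!name || value === undefined || value.trim() === '' || Number.isNaN(limit)) {
      throw new Error(`Invalid threshold entry: "${part}"`);
    }
    thresholds[name.trim()] = limit;
  }
  return thresholds;
}

// Returns the metric names whose current value exceeds its threshold.
function breachedMetrics(thresholds, current) {
  return Object.keys(thresholds).filter((name) => current[name] > thresholds[name]);
}
```

A metric that is missing from the current readings is never reported as breached, which keeps a partial sample from raising false alerts.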

Getting Help

Diagnostic Information

When reporting issues, generate a diagnostic report and attach it:

```
# Generate diagnostic report
claude-flow diagnostic report --full --output diagnostic.zip
```

The report includes:

- System information
- Configuration
- Recent logs
- Performance metrics
- Error traces

Community Resources

- GitHub Issues: Report bugs and feature requests
- Discord: Real-time help from community
- Documentation: Check latest docs for updates
- FAQ: Common questions and answers

Next Steps

- Review Development Patterns to avoid common pitfalls
- Check API Reference for correct command usage
- Explore Performance Guide for optimization tips
