Troubleshooting

Common issues and their solutions.

Authentication Issues

Cannot Log In - OIDC Redirect Fails

Symptoms:

  - Redirect to OIDC provider fails
  - Browser shows "Connection refused" or timeout
  - Error: "Failed to fetch OIDC configuration"

Solutions:

  1. Check OIDC Discovery URL:

    # Test if discovery endpoint is reachable
    curl https://dex.example.com/dex/.well-known/openid-configuration
    

  2. Verify redirect URI:

     - Check that APP_URL/callback matches the redirect URI registered in the OIDC client configuration
     - Example: if APP_URL=https://velerodash.example.com, the redirect URI should be https://velerodash.example.com/callback

  3. Enable Dex Proxy (if using Dex):

    export USE_DEX_PROXY=True
    export DEX_BASE_URL=http://dex.dex-system.svc.cluster.local:5556/dex
    

  4. Check CORS settings:

     - If the Dex proxy doesn't help, configure CORS on your OIDC provider
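
The redirect URI and discovery checks above can be sketched in Python. This is a minimal illustration, not the dashboard's own code: the function names are hypothetical, stripping a trailing slash from APP_URL is an assumption, and the key list is only a subset of the fields that OpenID Connect Discovery marks as required.

```python
# Subset of the provider metadata fields that OpenID Connect Discovery requires.
REQUIRED_DISCOVERY_KEYS = ("issuer", "authorization_endpoint", "jwks_uri")


def expected_redirect_uri(app_url: str) -> str:
    """Build the callback URL described above: APP_URL followed by /callback."""
    # Assumption: a trailing slash on APP_URL is ignored.
    return app_url.rstrip("/") + "/callback"


def redirect_uri_registered(app_url: str, registered_uris: list) -> bool:
    """Return True if the expected callback exactly matches a registered URI."""
    return expected_redirect_uri(app_url) in registered_uris


def missing_discovery_keys(discovery_doc: dict) -> list:
    """List required keys absent from a parsed openid-configuration document."""
    return [key for key in REQUIRED_DISCOVERY_KEYS if key not in discovery_doc]
```

Pass the JSON returned by the curl command in step 1 (parsed with `json.loads`) to `missing_discovery_keys`. A non-empty result suggests the URL points at something other than a discovery document.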

Logged In But Get "Access Denied" on All Pages

Symptoms:

  - Login succeeds but all pages show 403 Forbidden
  - Error: "You don't have permission to access this resource"

Solutions:

  1. Check username in Casbin policy:

    # View your username
    # Log in and go to Profile page, check "Username" field
    
    # Add to config/casbin_policy.csv
    g, your-username-here, velero.admin
    

  2. Verify policy file format:

    # Correct format
    p, velero.admin, *, *, .*
    g, username, velero.admin
    
    # Common mistakes:
    # - Extra spaces
    # - Wrong delimiter (must be comma)
    # - Typos in role names
    

  3. Check policy reload:

    # Check logs for reload message
    docker logs velerodashboard | grep -i policy
    # Should see: "Policy file changed, reloading enforcer"
    

  4. Verify OIDC claims:

     - Check that the username claim matches what's in the policy file
     - Common claims: email, preferred_username, sub
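
The format mistakes listed in step 2 can be caught with a small linter. The sketch below assumes the field counts seen in this guide's examples (five fields for `p` rules, three for `g` rules). `lint_policy` is a hypothetical helper, not part of the dashboard or of Casbin. It may also flag valid role-to-role `g` lines, so treat its output as hints.

```python
# Field counts inferred from the examples in this guide (an assumption).
EXPECTED_FIELDS = {"p": 5, "g": 3}


def lint_policy(text: str) -> list:
    """Return human-readable problems found in Casbin policy CSV text."""
    problems = []
    policy_subjects = set()   # subjects that appear in p rules
    role_assignments = []     # (line number, role) pairs from g rules

    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = [field.strip() for field in line.split(",")]
        kind = fields[0]
        if kind not in EXPECTED_FIELDS:
            problems.append(f"line {number}: expected 'p' or 'g' before the first comma")
            continue
        if len(fields) != EXPECTED_FIELDS[kind]:
            problems.append(
                f"line {number}: {kind} rule needs {EXPECTED_FIELDS[kind]} fields, "
                f"found {len(fields)}"
            )
            continue
        if kind == "p":
            policy_subjects.add(fields[1])
        else:
            role_assignments.append((number, fields[2]))

    # A role assigned to a user but never granted anything is often a typo.
    for number, role in role_assignments:
        if role not in policy_subjects:
            problems.append(f"line {number}: role '{role}' has no matching p rule (typo?)")
    return problems
```

Running it over the contents of config/casbin_policy.csv returns an empty list when no delimiter, field-count, or role-name problems are found.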

Cluster Connectivity Issues

"Failed to Connect to Cluster"

Symptoms:

  - Cluster shows as "Disconnected" in dashboard
  - Error: "Failed to connect to cluster: "

Solutions:

  1. Verify API server is reachable:

    # Test from where dashboard is running
    curl -k https://k8s.example.com:6443/version
    

  2. Check authentication method:

For kubeconfig:

# Verify kubeconfig file exists and is readable
ls -la /path/to/kubeconfig

# Test with kubectl
kubectl --kubeconfig=/path/to/kubeconfig get nodes

For token:

# Verify token is valid
curl -k -H "Authorization: Bearer $TOKEN" \
  https://k8s.example.com:6443/api/v1/namespaces

  3. Check certificate authority:

     - For token auth, ensure certificate_authority_data is correct
     - Test without cert verification first (dev only): OIDC_VERIFY_SSL=False

  4. Verify network connectivity:

    # From dashboard pod
    nc -zv k8s.example.com 6443
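
If `nc` is not available in the dashboard container, the same TCP reachability test can be approximated with Python's standard library. This is a sketch; `port_reachable` is a hypothetical helper name.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `port_reachable("k8s.example.com", 6443)` mirrors the `nc -zv` check. It only tests the TCP handshake, not TLS or authentication.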
    

"Velero Not Found in Cluster"

Symptoms:

  - Cluster connects but shows "Velero not detected"
  - No backups/restores visible

Solutions:

  1. Verify Velero is installed:

    kubectl get namespace velero
    kubectl get pods -n velero
    

  2. Check Velero CRDs:

    kubectl get crds | grep velero.io
    # Should see: backups.velero.io, restores.velero.io, etc.
    

  3. Verify permissions:

    # Check if service account can list Velero resources
    kubectl auth can-i list backups.velero.io -n velero --as system:serviceaccount:velero-dashboard:velerodashboard
    

  4. Check Velero namespace:

     - The dashboard looks for Velero in the velero namespace by default
     - If Velero is installed in a different namespace, you may need to modify the code
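
The CRD check in step 2 can be automated by parsing the output of `kubectl get crds`. The sketch below is illustrative: the list of expected CRDs is an assumption about what the dashboard needs, and `missing_velero_crds` is a hypothetical helper.

```python
# Assumed minimum set of Velero CRDs; adjust to match your Velero version.
EXPECTED_CRDS = (
    "backups.velero.io",
    "restores.velero.io",
    "schedules.velero.io",
    "backupstoragelocations.velero.io",
)


def missing_velero_crds(kubectl_output: str) -> list:
    """Return expected Velero CRDs that are absent from `kubectl get crds` output."""
    # The first whitespace-separated column of each row is the CRD name.
    present = {line.split()[0] for line in kubectl_output.splitlines() if line.strip()}
    return [crd for crd in EXPECTED_CRDS if crd not in present]
```

An empty result means all the listed CRDs are registered. Anything else points to an incomplete Velero installation.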

Permission Issues

Can View But Cannot Create Backups

Symptoms:

  - Dashboard loads, clusters visible
  - "Create Backup" button missing or disabled
  - Error: "You don't have permission to create backups"

Solutions:

  1. Check Casbin policies:

    # View-only permission (WRONG if you want to create)
    p, velero.viewer, *, backup, (view|list)
    
    # Correct permission for creating
    p, velero.operator, *, backup, (view|list|create|delete)
    

  2. Verify role assignment:

    # Make sure user is assigned operator role, not just viewer
    g, your-username, velero.operator
    

  3. Check domain matching:

    # If you have environment-specific access
    p, prod.admin, *:*:production, backup, .*
    # This only grants access to production clusters
    
    # For all environments
    p, velero.operator, *, backup, (view|list|create|delete)
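
To see why a viewer role cannot create backups, it helps to evaluate the action pattern by hand. The sketch below is a simplified stand-in for the Casbin enforcer, not its real matcher. It assumes the last field is a regular expression matched against the whole action name, and it ignores the domain field entirely. `roles_for` and `can_perform` are hypothetical helpers.

```python
import re


def roles_for(user: str, policy_text: str) -> set:
    """Collect the roles assigned to a user through g rules."""
    roles = set()
    for line in policy_text.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) == 3 and fields[0] == "g" and fields[1] == user:
            roles.add(fields[2])
    return roles


def can_perform(user: str, resource: str, action: str, policy_text: str) -> bool:
    """Return True if any p rule for the user's roles allows the action."""
    subjects = roles_for(user, policy_text) | {user}
    for line in policy_text.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 5 or fields[0] != "p":
            continue
        _, subject, _domain, obj, action_pattern = fields  # domain is ignored here
        if subject in subjects and obj in ("*", resource):
            # Assumption: the pattern must match the entire action name.
            if re.fullmatch(action_pattern, action):
                return True
    return False
```

With the viewer rule above, `can_perform("alice", "backup", "create", policy)` is false for a user who holds only velero.viewer, because `create` does not match `(view|list)`.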
    

Cannot Access Specific Namespace

Symptoms:

  - Can see some namespaces but not others
  - Error: "Namespace not found" or "Access denied"

Solutions:

  1. Check namespace-specific policies:

    # Grant access to specific namespace pattern
    p, app.team, *:myapp-*:*, backup, (view|list|create)
    
    # Or grant access to all namespaces
    p, velero.operator, *, backup, (view|list|create)
    

  2. Verify Kubernetes RBAC:

    # Check if service account has namespace access
    kubectl auth can-i list backups.velero.io -n <namespace> \
      --as system:serviceaccount:velero-dashboard:velerodashboard
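
Domain patterns such as `*:myapp-*:*` can be reasoned about with a glob match. The sketch below is only an approximation. It assumes the three colon-separated segments are cluster, namespace, and environment (inferred from the examples in this guide) and that `*` behaves like a shell wildcard. The dashboard's real matcher may differ, and `domain_matches` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def domain_matches(pattern: str, cluster: str, namespace: str, environment: str) -> bool:
    """Glob-match a policy domain pattern against a cluster:namespace:environment triple."""
    # Assumption: the segment order is cluster, namespace, environment.
    return fnmatchcase(f"{cluster}:{namespace}:{environment}", pattern)
```

Under these assumptions, a namespace called `other` fails to match `*:myapp-*:*`, which would explain why it is hidden from members of app.team.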
    


Resource Not Found

Backups/Restores Not Showing

Symptoms:

  - Cluster connected, Velero detected
  - But no backups or restores visible

Solutions:

  1. Check namespace:

     - Velero resources must be in the velero namespace
     - Verify: kubectl get backups -n velero

  2. Verify Velero is working:

    # List backups with Velero CLI
    velero backup get
    
    # Check Velero logs
    kubectl logs -n velero deployment/velero
    

  3. Check dashboard logs:

    docker logs velerodashboard | grep -i backup
    # Look for API errors
    

  4. Pagination:

     - By default, only 100 items are shown per page
     - Check whether there are more pages
     - Adjust DEFAULT_ITEMS_PER_PAGE if needed
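
To judge whether missing items are simply on a later page, the page arithmetic can be worked out directly. This is a sketch using the default of 100 items mentioned above. The function names are hypothetical, and the dashboard's own pagination code may differ.

```python
import math


def page_count(total_items: int, per_page: int = 100) -> int:
    """Number of pages needed to show total_items at per_page items each."""
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    return math.ceil(total_items / per_page)


def page_slice(items: list, page: int, per_page: int = 100) -> list:
    """Return the items shown on a 1-based page number."""
    start = (page - 1) * per_page
    return items[start:start + per_page]
```

For example, 250 backups at the default setting span three pages, so the newest or oldest 50 may not appear on the first page.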

Kopia Repository Browser Issues

"Kopia Binary Not Found"

Symptoms:

  - Repository browser fails
  - Error: "Kopia binary not found at /usr/local/bin/kopia"

Solutions:

  1. Install Kopia:

    # macOS
    brew install kopia
    
    # Linux
    wget https://github.com/kopia/kopia/releases/download/v0.17.0/kopia-0.17.0-linux-x64.tar.gz
    tar -xzf kopia-0.17.0-linux-x64.tar.gz
    sudo mv kopia-0.17.0-linux-x64/kopia /usr/local/bin/
    
    # Verify
    kopia --version
    

  2. Set KOPIA_BIN environment variable:

    export KOPIA_BIN=/path/to/kopia
    

  3. Docker: Kopia is included in the image and should work out of the box
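
A quick way to confirm which Kopia binary would be picked up is to resolve it the way an application plausibly would. The lookup order below (KOPIA_BIN, then PATH, then the default path from the error message) is an assumption, and `resolve_kopia` is a hypothetical helper rather than the dashboard's actual logic.

```python
import os
import shutil
from typing import Optional

DEFAULT_KOPIA_PATH = "/usr/local/bin/kopia"  # path quoted in the error message


def resolve_kopia(env: Optional[dict] = None) -> Optional[str]:
    """Return the first executable Kopia candidate, or None if none is usable."""
    env = os.environ if env is None else env
    # Assumed order: explicit override, then PATH lookup, then the default path.
    candidates = [env.get("KOPIA_BIN"), shutil.which("kopia"), DEFAULT_KOPIA_PATH]
    for candidate in candidates:
        if candidate and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

A `None` result means no candidate exists or none is executable, which is worth checking before digging into repository settings.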

"Failed to Connect to Repository"

Symptoms:

  - Kopia installed but cannot connect to repository
  - Error: "Failed to list snapshots"

Solutions:

  1. Check S3 credentials:

     - The repository browser uses credentials from the Backup Storage Location (BSL)
     - Verify that the BSL has correct S3 credentials

  2. Verify repository exists:

    # Using Velero
    velero backup describe <backup-name>
    
    # Check if repository is ready
    kubectl get backuprepository -n velero
    

  3. Check S3 endpoint:

     - Ensure the S3 endpoint is reachable from the dashboard
     - Test: curl https://s3.amazonaws.com or your S3-compatible endpoint

Performance Issues

Dashboard Slow to Load

Symptoms:

  - Pages take a long time to load
  - Timeouts when listing resources

Solutions:

  1. Reduce pagination size:

    export DEFAULT_ITEMS_PER_PAGE=50
    

  2. Filter by namespace:

     - Use the namespace dropdown to limit results
     - Don't try to list all resources across all namespaces at once

  3. Increase resources (Kubernetes):

    dashboard:
      resources:
        requests:
          memory: "512Mi"
          cpu: "250m"
        limits:
          memory: "1Gi"
          cpu: "500m"
    

  4. Check cluster API server performance:

     - Slow responses may indicate an overloaded API server
     - Consider using multiple dashboard replicas

High Memory Usage

Symptoms:

  - Dashboard pod shows high memory usage
  - OOMKilled events

Solutions:

  1. Increase memory limits:

    dashboard:
      resources:
        limits:
          memory: "1Gi"  # Increase from default 512Mi
    

  2. Reduce concurrent requests:

     - Don't open too many cluster tabs at once
     - Close unused tabs

  3. Check for memory leaks:

    # Restart pod to free memory temporarily
    kubectl rollout restart deployment/velerodashboard -n velero-dashboard
    


Configuration Issues

Hot-Reload Not Working

Symptoms:

  - Changed config files but changes not reflected
  - No reload message in logs

Solutions:

  1. Check file permissions:

    # Ensure files are readable
    ls -la config/
    chmod 644 config/clusters.yaml config/casbin_policy.csv
    

  2. Verify watchdog is running:

    # Check logs for watchdog initialization
    docker logs velerodashboard | grep -i watch
    # Should see: "Starting file watcher for config/clusters.yaml"
    

  3. Docker volume mounts:

    # Ensure volume is mounted correctly
    docker inspect velerodashboard | grep -A 10 Mounts
    

  4. File system events:

     - Some file systems don't support inotify (e.g., NFS)
     - Consider polling mode, or restart the container after changes
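
Polling mode means comparing file metadata on a timer instead of waiting for inotify events. The sketch below shows the idea with the standard library only. `FilePoller` is a hypothetical class, and how the dashboard would enable polling is not covered by this guide.

```python
import os


class FilePoller:
    """Detect changes by comparing (mtime, size) snapshots instead of inotify events."""

    def __init__(self, path: str):
        self.path = path
        self._last = self._snapshot()

    def _snapshot(self):
        """Return (mtime_ns, size) for the file, or None if it does not exist."""
        try:
            info = os.stat(self.path)
        except FileNotFoundError:
            return None
        return (info.st_mtime_ns, info.st_size)

    def changed(self) -> bool:
        """Return True once for each observed change since the previous call."""
        current = self._snapshot()
        if current != self._last:
            self._last = current
            return True
        return False
```

Calling `changed()` every few seconds and triggering a reload when it returns true works on NFS and similar mounts, at the cost of a short delay.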

Logging and Debugging

Enable Debug Logging

# Environment variable
export FLASK_DEBUG=True
export FLASK_ENV=development

# View logs
docker logs -f velerodashboard

# Kubernetes
kubectl logs -n velero-dashboard -l app=velerodashboard -f --tail=100

Check Health Endpoint

curl http://localhost:8000/health
# Should return: {"status": "healthy"}
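
For scripted monitoring, the same health check can be done from Python. This is a minimal sketch. It assumes the endpoint and response body shown above, and `is_healthy` and `check_health` are hypothetical helper names.

```python
import json
import urllib.request


def is_healthy(body: str) -> bool:
    """Return True if the response body is JSON with status == "healthy"."""
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def check_health(url: str = "http://localhost:8000/health", timeout: float = 5.0) -> bool:
    """Fetch the health endpoint and report whether it looks healthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200 and is_healthy(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection errors, HTTP errors, timeouts, and undecodable bodies.
        return False
```

`check_health()` returns false both when the dashboard is unreachable and when it answers with an unexpected body, so check the logs to tell the two apart.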

Common Log Messages

| Log Message | Meaning |
|---|---|
| "Cluster configuration reloaded" | Config file updated successfully |
| "Policy file changed, reloading enforcer" | Casbin policies updated |
| "Failed to connect to cluster" | Cluster connectivity issue |
| "User not authorized" | RBAC denial |
| "OIDC token expired" | Need to log in again |

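When scanning a long log, the messages above can be matched automatically. The sketch below assumes each message appears verbatim as a substring of a log line, since the exact log format is not documented here. `explain_logs` is a hypothetical helper.

```python
KNOWN_MESSAGES = {
    "Cluster configuration reloaded": "Config file updated successfully",
    "Policy file changed, reloading enforcer": "Casbin policies updated",
    "Failed to connect to cluster": "Cluster connectivity issue",
    "User not authorized": "RBAC denial",
    "OIDC token expired": "Need to log in again",
}


def explain_logs(log_text: str) -> list:
    """Return (message, meaning) pairs for known messages found in the log text."""
    findings = []
    for line in log_text.splitlines():
        lowered = line.lower()
        for message, meaning in KNOWN_MESSAGES.items():
            # Case-insensitive substring match; the log format is an assumption.
            if message.lower() in lowered:
                findings.append((message, meaning))
    return findings
```

Feed it the output of `docker logs velerodashboard` to get a quick summary of which known conditions occurred.
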
Getting Help

If you're still stuck:

  1. Check logs with debug mode enabled
  2. Search GitHub Issues for similar problems
  3. Open a new issue with:

     - Detailed description of the problem
     - Steps to reproduce
     - Logs (sanitize sensitive data!)
     - Configuration files (sanitize secrets!)
     - Environment (Docker, Kubernetes, etc.)
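
Before attaching logs or configuration to an issue, a simple redaction pass helps avoid leaking credentials. The sketch below is a starting point, not a guarantee. The patterns cover bearer tokens and a few common key names chosen for illustration, and `sanitize` is a hypothetical helper. Always review the output by hand.

```python
import re

# (pattern, replacement) pairs; the key names are illustrative assumptions.
_REDACTIONS = [
    # Bearer tokens in Authorization headers
    (re.compile(r"(?i)(bearer\s+)[A-Za-z0-9._~+/=-]+"), r"\1[REDACTED]"),
    # key=value or key: value pairs with sensitive-looking keys
    (
        re.compile(r"(?i)\b((?:password|secret|token|client_secret|api_key)\s*[:=]\s*)\S+"),
        r"\1[REDACTED]",
    ),
]


def sanitize(text: str) -> str:
    """Replace likely secrets in text with a [REDACTED] marker."""
    for pattern, replacement in _REDACTIONS:
        text = pattern.sub(replacement, text)
    return text
```

Values containing spaces, such as quoted YAML strings, are only partially redacted by this sketch, so check those lines manually.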

See Also