Troubleshooting¶

Common issues and their solutions.

Authentication Issues¶

Cannot Log In - OIDC Redirect Fails¶

Symptoms:
- Redirect to OIDC provider fails
- Browser shows "Connection refused" or timeout
- Error: "Failed to fetch OIDC configuration"

Solutions:

Check OIDC Discovery URL:

# Test if discovery endpoint is reachable
curl https://dex.example.com/dex/.well-known/openid-configuration
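
A reachable endpoint can still return an incomplete document. As a rough sketch, the helper below (the name `check_discovery_doc` is illustrative and not part of the dashboard) reads a discovery document on stdin and reports which of the fields normally needed for an authorization-code login appear to be missing. It uses a plain text match rather than a JSON parser, so treat it as a quick sanity check only.

```shell
# Hypothetical helper: report commonly required fields that are missing
# from an OIDC discovery document read on stdin (text match, not a parser).
check_discovery_doc() {
  doc=$(cat)
  rc=0
  for field in issuer authorization_endpoint token_endpoint jwks_uri; do
    if ! printf '%s' "$doc" | grep -q "\"$field\""; then
      echo "missing field: $field"
      rc=1
    fi
  done
  return $rc
}

# Usage (URL taken from the example above):
#   curl -fsS https://dex.example.com/dex/.well-known/openid-configuration | check_discovery_doc
```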

Verify redirect URI:
Check that APP_URL/callback matches OIDC client configuration
Example: If APP_URL=https://velerodash.example.com, redirect URI should be https://velerodash.example.com/callback
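
The comparison can be scripted so that scheme, host, and trailing-slash differences are easier to spot. This is a minimal sketch: `check_redirect_uri` is a hypothetical helper, and it assumes the dashboard builds the redirect URI by appending `/callback` to `APP_URL` exactly as described above.

```shell
# Hypothetical helper: compare the redirect URI derived from APP_URL with
# the one registered on the OIDC client.
check_redirect_uri() {
  app_url=$1
  registered=$2
  case $app_url in
    */) echo "warning: APP_URL ends with '/', which may produce a double slash" ;;
  esac
  expected="$app_url/callback"
  if [ "$expected" = "$registered" ]; then
    echo "match: $expected"
  else
    echo "mismatch: expected $expected, registered $registered"
    return 1
  fi
}

# Usage:
#   check_redirect_uri "$APP_URL" "https://velerodash.example.com/callback"
```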

Enable Dex Proxy (if using Dex):

export USE_DEX_PROXY=True
export DEX_BASE_URL=http://dex.dex-system.svc.cluster.local:5556/dex

Check CORS settings:
If the Dex proxy doesn't help, configure CORS on your OIDC provider to allow requests from the dashboard's origin

Logged In But Get "Access Denied" on All Pages¶

Symptoms:
- Login succeeds but all pages show 403 Forbidden
- Error: "You don't have permission to access this resource"

Solutions:

Check username in Casbin policy:

# View your username
# Log in and go to Profile page, check "Username" field

# Add to config/casbin_policy.csv
g, your-username-here, velero.admin

Verify policy file format:

# Correct format
p, velero.admin, *, *, .*
g, username, velero.admin

# Common mistakes:
# - Extra spaces
# - Wrong delimiter (must be comma)
# - Typos in role names
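
The delimiter and field-count mistakes listed above can be caught mechanically. The sketch below assumes, based only on the examples in this guide, that `p` rules have five comma-separated fields and `g` rules have three. `lint_policy` is a hypothetical helper and does not validate role names.

```shell
# Hypothetical linter for config/casbin_policy.csv.
# Assumption: "p" rules have 5 fields and "g" rules have 3, as in the
# examples in this guide.
lint_policy() {
  awk -F',' '
    /^[[:space:]]*(#|$)/ { next }   # skip comments and blank lines
    {
      type = $1
      gsub(/[[:space:]]/, "", type)
      if (type == "p" && NF != 5) {
        printf "line %d: p rule needs 5 fields, got %d\n", NR, NF; bad = 1
      } else if (type == "g" && NF != 3) {
        printf "line %d: g rule needs 3 fields, got %d\n", NR, NF; bad = 1
      } else if (type != "p" && type != "g") {
        printf "line %d: unknown rule type \"%s\"\n", NR, type; bad = 1
      }
    }
    END { exit bad + 0 }
  ' "$1"
}

# Usage:
#   lint_policy config/casbin_policy.csv
```

A line that uses semicolons instead of commas is reported as an unknown rule type, because the whole line is read as a single field.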

Check policy reload:

# Check logs for reload message
docker logs velerodashboard | grep -i policy
# Should see: "Policy file changed, reloading enforcer"

Verify OIDC claims:
Check that username claim matches what's in policy file
Common claims: email, preferred_username, sub
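
To see which claims your provider actually sends, you can decode the payload of an ID token locally. This is a generic JWT decoding sketch, not a dashboard feature: `jwt_payload` is a hypothetical helper, it does not verify the signature, and how you obtain a token depends on your setup. Avoid pasting production tokens into shared terminals.

```shell
# Hypothetical helper: print the JSON payload of a JWT (no signature check).
jwt_payload() {
  # Take the middle segment of header.payload.signature and map the
  # base64url alphabet back to standard base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that base64url encoding strips.
  case $(( ${#seg} % 4 )) in
    2) seg="$seg==" ;;
    3) seg="$seg=" ;;
  esac
  printf '%s' "$seg" | base64 -d   # older macOS releases may need -D
}

# Usage:
#   jwt_payload "$ID_TOKEN"
```

Compare the value of `email`, `preferred_username`, or `sub` in the output with the username in the `g` lines of the policy file.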

Cluster Connectivity Issues¶

"Failed to Connect to Cluster"¶

Symptoms:
- Cluster shows as "Disconnected" in dashboard
- Error: "Failed to connect to cluster: "

Solutions:

Verify API server is reachable:

# Test from where dashboard is running
curl -k https://k8s.example.com:6443/version

Check authentication method:

For kubeconfig:

# Verify kubeconfig file exists and is readable
ls -la /path/to/kubeconfig

# Test with kubectl
kubectl --kubeconfig=/path/to/kubeconfig get nodes

For token:

# Verify token is valid
curl -k -H "Authorization: Bearer $TOKEN" \
  https://k8s.example.com:6443/api/v1/namespaces

Check certificate authority:
For token auth, ensure certificate_authority_data is correct
Test without cert verification first (dev only): OIDC_VERIFY_SSL=False

Verify network connectivity:

# From dashboard pod
nc -zv k8s.example.com 6443
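
If you only have the API server URL (for example from a kubeconfig), a small helper can split it into the host and port that `nc` expects. `api_host_port` is a hypothetical name, and the sketch assumes an `https` URL without IPv6 literals.

```shell
# Hypothetical helper: turn an API server URL into "host port".
api_host_port() {
  hostport=${1#*://}         # drop the scheme
  hostport=${hostport%%/*}   # drop any path
  case $hostport in
    *:*) printf '%s %s\n' "${hostport%:*}" "${hostport##*:}" ;;
    *)   printf '%s 443\n' "$hostport" ;;   # default https port
  esac
}

# Usage (unquoted on purpose so host and port become two arguments):
#   server=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')
#   nc -zv $(api_host_port "$server")
```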

"Velero Not Found in Cluster"¶

Symptoms:
- Cluster connects but shows "Velero not detected"
- No backups/restores visible

Solutions:

Verify Velero is installed:

kubectl get namespace velero
kubectl get pods -n velero

Check Velero CRDs:

kubectl get crds | grep velero.io
# Should see: backups.velero.io, restores.velero.io, etc.
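
Rather than scanning the list by eye, the output can be piped through a small check. In this sketch `missing_velero_crds` is a hypothetical helper, and the list of expected CRDs is an illustrative subset of what Velero installs; which ones the dashboard actually requires is an assumption.

```shell
# Hypothetical helper: read `kubectl get crds` output on stdin and print
# any expected Velero CRD that is absent.
missing_velero_crds() {
  present=$(awk '{ print $1 }')   # first column is the CRD name
  for crd in backups.velero.io restores.velero.io \
             schedules.velero.io backupstoragelocations.velero.io; do
    printf '%s\n' "$present" | grep -qxF "$crd" || printf 'missing: %s\n' "$crd"
  done
}

# Usage:
#   kubectl get crds | missing_velero_crds
```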

Verify permissions:

# Check if service account can list Velero resources
kubectl auth can-i list backups.velero.io -n velero --as system:serviceaccount:velero-dashboard:velerodashboard

Check Velero namespace:
The dashboard looks for Velero in the velero namespace by default
If Velero is installed in a different namespace, you may need to modify the dashboard code

Permission Issues¶

Can View But Cannot Create Backups¶

Symptoms:
- Dashboard loads, clusters visible
- "Create Backup" button missing or disabled
- Error: "You don't have permission to create backups"

Solutions:

Check Casbin policies:

# View-only permission (WRONG if you want to create)
p, velero.viewer, *, backup, (view|list)

# Correct permission for creating
p, velero.operator, *, backup, (view|list|create|delete)

Verify role assignment:

# Make sure user is assigned operator role, not just viewer
g, your-username, velero.operator

Check domain matching:

# If you have environment-specific access
p, prod.admin, *:*:production, backup, .*
# This only grants access to production clusters

# For all environments
p, velero.operator, *, backup, (view|list|create|delete)
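
This guide does not state how the dashboard's Casbin matcher compares domains and actions, so the following is only a way to reason about a rule. It assumes glob-style matching for the domain field and a whole-string regular expression for the action field, both inferred from the shape of the examples above. `policy_allows` is a hypothetical helper, and the request domain values are made up.

```shell
# Hypothetical dry-run of one policy rule.
# Assumptions: domain is matched as a glob, action as a full-match regex.
policy_allows() {
  rule_domain=$1
  rule_action=$2
  req_domain=$3
  req_action=$4
  case $req_domain in
    $rule_domain) ;;     # unquoted on purpose: glob match
    *) return 1 ;;
  esac
  printf '%s\n' "$req_action" | grep -Eqx "$rule_action"
}

# Usage:
#   policy_allows '*:*:production' '.*' 'cluster1:team-a:staging' create \
#     && echo allowed || echo denied
```

Under these assumptions, a `*:*:production` rule denies a request whose domain ends in `staging`, which matches the behaviour described in the comment above.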

Cannot Access Specific Namespace¶

Symptoms:
- Can see some namespaces but not others
- Error: "Namespace not found" or "Access denied"

Solutions:

Check namespace-specific policies:

# Grant access to specific namespace pattern
p, app.team, *:myapp-*:*, backup, (view|list|create)

# Or grant access to all namespaces
p, velero.operator, *, backup, (view|list|create)

Verify Kubernetes RBAC:

# Check if service account has namespace access
kubectl auth can-i list backups.velero.io -n <namespace> \
  --as system:serviceaccount:velero-dashboard:velerodashboard
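
To check several verbs and resources in one pass, the same `kubectl auth can-i` call can be wrapped in a loop. This is a sketch: `check_velero_rbac` and the `KUBECTL` override variable are hypothetical, and the verb and resource lists are illustrative rather than the exact set the dashboard needs.

```shell
# Hypothetical helper: print a yes/no matrix for the dashboard service
# account in one namespace. Set KUBECTL to use a different binary.
check_velero_rbac() {
  ns=$1
  sa=system:serviceaccount:velero-dashboard:velerodashboard
  for verb in get list create delete; do
    for res in backups.velero.io restores.velero.io; do
      # can-i exits non-zero on "no", so keep the loop going regardless.
      answer=$("${KUBECTL:-kubectl}" auth can-i "$verb" "$res" \
        -n "$ns" --as "$sa" 2>/dev/null) || true
      printf '%-6s %-20s %s\n' "$verb" "$res" "${answer:-error}"
    done
  done
}

# Usage:
#   check_velero_rbac velero
```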

Resource Not Found¶

Backups/Restores Not Showing¶

Symptoms:
- Cluster connected, Velero detected
- But no backups or restores visible

Solutions:

Check namespace:
Velero resources must be in velero namespace
Verify: kubectl get backups -n velero

Verify Velero is working:

# List backups with Velero CLI
velero backup get

# Check Velero logs
kubectl logs -n velero deployment/velero

Check dashboard logs:

docker logs velerodashboard | grep -i backup
# Look for API errors

Pagination:
By default, only 100 items are shown per page
Check whether there are more pages
Adjust DEFAULT_ITEMS_PER_PAGE if needed
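
To estimate how many pages to expect, compare the number of backups in the cluster with the page size. `pages_needed` is a hypothetical helper; the default of 100 comes from the text above.

```shell
# Hypothetical helper: pages required for a given item count
# (ceiling division). The second argument defaults to 100 items per page.
pages_needed() {
  total=$1
  per_page=${2:-100}
  echo $(( (total + per_page - 1) / per_page ))
}

# Usage:
#   pages_needed "$(kubectl get backups -n velero --no-headers | wc -l)"
```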

Kopia Repository Browser Issues¶

"Kopia Binary Not Found"¶

Symptoms:
- Repository browser fails
- Error: "Kopia binary not found at /usr/local/bin/kopia"

Solutions:

Install Kopia:

# macOS
brew install kopia

# Linux
wget https://github.com/kopia/kopia/releases/download/v0.17.0/kopia-0.17.0-linux-x64.tar.gz
tar -xzf kopia-0.17.0-linux-x64.tar.gz
sudo mv kopia-0.17.0-linux-x64/kopia /usr/local/bin/

# Verify
kopia --version

Set KOPIA_BIN environment variable:
```
export KOPIA_BIN=/path/to/kopia
```
Docker: Kopia is included in the image and should work out of the box
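
To find out which binary is available in the environment where the dashboard runs, a short diagnostic can walk the likely locations. The lookup order here (`KOPIA_BIN`, then `/usr/local/bin/kopia` from the error message, then `PATH`) is an assumption for troubleshooting and not necessarily the order the dashboard uses. `find_kopia` is a hypothetical helper.

```shell
# Hypothetical diagnostic: print the first executable kopia candidate.
find_kopia() {
  for candidate in "${KOPIA_BIN:-}" /usr/local/bin/kopia \
                   "$(command -v kopia 2>/dev/null)"; do
    if [ -n "$candidate" ] && [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "no executable kopia binary found" >&2
  return 1
}

# Usage:
#   find_kopia && "$(find_kopia)" --version
```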

"Failed to Connect to Repository"¶

Symptoms:
- Kopia installed but cannot connect to repository
- Error: "Failed to list snapshots"

Solutions:

Check S3 credentials:
Repository browser uses credentials from Backup Storage Location
Verify BSL has correct S3 credentials

Verify repository exists:

# Using Velero
velero backup describe <backup-name>

# Check if repository is ready
kubectl get backuprepository -n velero

Check S3 endpoint:
Ensure S3 endpoint is reachable from dashboard
Test: curl https://s3.amazonaws.com or your S3-compatible endpoint

Performance Issues¶

Dashboard Slow to Load¶

Symptoms:
- Pages take a long time to load
- Timeouts when listing resources

Solutions:

Reduce pagination size:
```
export DEFAULT_ITEMS_PER_PAGE=50
```
Filter by namespace:
Use namespace dropdown to limit results
Don't try to list all resources across all namespaces at once

Increase resources (Kubernetes):

dashboard:
  resources:
    requests:
      memory: "512Mi"
      cpu: "250m"
    limits:
      memory: "1Gi"
      cpu: "500m"

Check cluster API server performance:
Slow responses may indicate overloaded API server
Consider using multiple dashboard replicas

High Memory Usage¶

Symptoms:
- Dashboard pod shows high memory usage
- OOMKilled events

Solutions:

Increase memory limits:

dashboard:
  resources:
    limits:
      memory: "1Gi"  # Increase from default 512Mi

Reduce concurrent requests:
Don't open too many cluster tabs at once
Close unused tabs

Check for memory leaks:

# Restart pod to free memory temporarily
kubectl rollout restart deployment/velerodashboard -n velero-dashboard
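
Before restarting, it helps to confirm whether a container was actually killed for exceeding its memory limit. The sketch below reads the last termination reason from the pod status. The namespace and the `app=velerodashboard` label are taken from other commands in this guide and may differ in your deployment; `was_oom_killed` and the `KUBECTL` override are hypothetical.

```shell
# Hypothetical check: succeed if any dashboard container was last
# terminated with reason OOMKilled. Returns 2 if kubectl itself fails.
was_oom_killed() {
  reasons=$("${KUBECTL:-kubectl}" get pods -n velero-dashboard \
    -l app=velerodashboard \
    -o jsonpath='{.items[*].status.containerStatuses[*].lastState.terminated.reason}') \
    || return 2
  case " $reasons " in
    *" OOMKilled "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   was_oom_killed && echo "container was OOMKilled; consider raising the memory limit"
```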

Configuration Issues¶

Hot-Reload Not Working¶

Symptoms:
- Changed config files but changes not reflected
- No reload message in logs

Solutions:

Check file permissions:

# Ensure files are readable
ls -la config/
chmod 644 config/clusters.yaml config/casbin_policy.csv

Verify watchdog is running:

# Check logs for watchdog initialization
docker logs velerodashboard | grep -i watch
# Should see: "Starting file watcher for config/clusters.yaml"

Docker volume mounts:

# Ensure volume is mounted correctly
docker inspect velerodashboard | grep -A 10 Mounts

File system events:
Some file systems (e.g., NFS) don't reliably deliver inotify events
Consider polling mode, or restart the container after changes
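
A quick way to tell whether this applies is to look at the file system type behind the config directory. The classification below is a rough heuristic and not an exhaustive list: `inotify_likely_reliable` is a hypothetical helper, and the `df -T` usage assumes GNU coreutils on Linux.

```shell
# Hypothetical heuristic: fail for file system types that commonly miss
# change events made on the remote side.
inotify_likely_reliable() {
  case $1 in
    nfs*|cifs|smb*|9p|fuse.sshfs) return 1 ;;
    *) return 0 ;;
  esac
}

# Usage (GNU df; the second column of the second line is the type):
#   fstype=$(df -T config/ | awk 'NR==2 { print $2 }')
#   inotify_likely_reliable "$fstype" || echo "hot-reload may not trigger on $fstype"
```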

Logging and Debugging¶

Enable Debug Logging¶

# Environment variable
export FLASK_DEBUG=True
export FLASK_ENV=development

# View logs
docker logs -f velerodashboard

# Kubernetes
kubectl logs -n velero-dashboard -l app=velerodashboard -f --tail=100

Check Health Endpoint¶

curl http://localhost:8000/health
# Should return: {"status": "healthy"}
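
In scripts or probes, the response body can be checked rather than read by eye. This sketch assumes the response shape shown above; `is_healthy` is a hypothetical helper that uses a text match instead of a JSON parser.

```shell
# Hypothetical helper: succeed if the body on stdin reports a healthy status.
is_healthy() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"healthy"'
}

# Usage:
#   curl -fsS http://localhost:8000/health | is_healthy && echo "dashboard is healthy"
```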

Common Log Messages¶

| Log Message | Meaning |
|---|---|
| "Cluster configuration reloaded" | Config file updated successfully |
| "Policy file changed, reloading enforcer" | Casbin policies updated |
| "Failed to connect to cluster" | Cluster connectivity issue |
| "User not authorized" | RBAC denial |
| "OIDC token expired" | Need to log in again |

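To get an overview of how often each of the messages listed above occurs, the logs can be piped through a small counter. `summarize_dashboard_log` is a hypothetical helper that matches the message text literally, so it only works if your log format contains these strings unchanged.

```shell
# Hypothetical helper: count occurrences of the known log messages in
# the log text read on stdin.
summarize_dashboard_log() {
  log=$(cat)
  for msg in "Cluster configuration reloaded" \
             "Policy file changed, reloading enforcer" \
             "Failed to connect to cluster" \
             "User not authorized" \
             "OIDC token expired"; do
    # grep -c prints 0 but exits non-zero when nothing matches.
    count=$(printf '%s\n' "$log" | grep -cF "$msg") || true
    printf '%5d  %s\n' "$count" "$msg"
  done
}

# Usage:
#   docker logs velerodashboard 2>&1 | summarize_dashboard_log
```
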
Getting Help¶

If you're still stuck:

- Check logs with debug mode enabled
- Search GitHub Issues for similar problems
- Open a new issue with:
  - Detailed description of the problem
  - Steps to reproduce
  - Logs (sanitize sensitive data!)
  - Configuration files (sanitize secrets!)
  - Environment (Docker, Kubernetes, etc.)
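
As a starting point for sanitizing logs before attaching them to an issue, a `sed` filter can mask the most obvious secrets. This is a sketch only: `sanitize_logs` is a hypothetical helper, the patterns are lowercase `key=value` and `key: value` forms plus bearer tokens, and it will not catch every secret format, so review the output by hand.

```shell
# Hypothetical filter: redact bearer tokens and simple key=value secrets
# from text read on stdin.
sanitize_logs() {
  sed -E \
    -e 's#(Bearer )[A-Za-z0-9._~+/=-]+#\1[REDACTED]#g' \
    -e 's#((password|secret|token|api[_-]?key)[[:space:]]*[=:][[:space:]]*)[^[:space:]",]+#\1[REDACTED]#g'
}

# Usage:
#   docker logs velerodashboard 2>&1 | sanitize_logs > dashboard.log
```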
