Troubleshooting Guide

This document provides comprehensive solutions for common issues encountered when using the Gunicorn Prometheus Exporter.

Common Issues

This section addresses the most frequently encountered problems and their solutions.

Port Already in Use

Error:

OSError: [Errno 98] Address already in use

Solution:

  1. Change the metrics port in your configuration:
# In gunicorn.conf.py

import os
os.environ.setdefault("PROMETHEUS_METRICS_PORT", "9091")  # Use different port
  2. Or kill the process using the port:
# Find the process

lsof -i :9090

# Stop the process (try a plain kill first; use kill -9 only if it does not exit)

kill <PID>
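
Before settling on a replacement port, it can help to confirm that the candidate is actually free. The following is a minimal sketch using only the standard library; the helper name `is_port_free` is hypothetical and is not part of the exporter.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Errno 98 (address already in use) and similar bind failures
            return False
    return True


if __name__ == "__main__":
    for candidate in (9090, 9091):
        state = "free" if is_port_free(candidate) else "in use"
        print(f"port {candidate}: {state}")
```

The check is only a snapshot: another process may still claim the port between the check and Gunicorn's startup.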

Permission Denied

Error:

PermissionError: [Errno 13] Permission denied

Solution:

  1. Check multiprocess directory permissions:
# Create directory with proper permissions

mkdir -p /tmp/prometheus_multiproc
chmod 755 /tmp/prometheus_multiproc
  2. Or use a different directory:
# In gunicorn.conf.py

import os

os.environ.setdefault("PROMETHEUS_MULTIPROC_DIR", "/var/tmp/prometheus_multiproc")
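
To confirm that the chosen directory is usable by the account that runs Gunicorn, one option is to attempt a real write. This is a sketch under the assumption that a successful temporary-file write is a good enough proxy for the exporter's own writes; `check_multiproc_dir` is a hypothetical helper name.

```python
import os
import tempfile


def check_multiproc_dir(path: str) -> bool:
    """Create the directory if needed and confirm a file can be written in it."""
    try:
        os.makedirs(path, mode=0o755, exist_ok=True)
        # Writing a real file is a stronger check than os.access(), which
        # can disagree with the effective permissions in some environments.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError:
        # PermissionError is a subclass of OSError
        return False
    return True
```

Run it as the same user that starts Gunicorn, for example `check_multiproc_dir("/var/tmp/prometheus_multiproc")`, since permissions are evaluated per user.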

Import Errors for Async Workers

Error:

ModuleNotFoundError: No module named 'eventlet'

Solution:

  1. Install the required dependencies:
# For eventlet workers (the quotes keep shells such as zsh from expanding the brackets)
pip install "gunicorn-prometheus-exporter[eventlet]"

# For gevent workers
pip install "gunicorn-prometheus-exporter[gevent]"

# Or install all async dependencies
pip install "gunicorn-prometheus-exporter[async]"
  2. Verify the installation:
python -c "import eventlet; print('eventlet available')"
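
The same verification can be done for several modules at once without importing them. A small sketch using `importlib.util.find_spec` from the standard library; the function name `missing_modules` is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["eventlet", "gevent"])
    print("missing:", ", ".join(missing) if missing else "none")
```

Because `find_spec` only locates a module, this confirms that the package is installed in the active environment but not that it imports cleanly.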

Metrics Not Updating

Issue: Metrics endpoint shows stale or no data.

Solutions:

  1. Check environment variables:
# Verify all required variables are set
echo $PROMETHEUS_MULTIPROC_DIR
echo $PROMETHEUS_METRICS_PORT
echo $PROMETHEUS_BIND_ADDRESS
echo $GUNICORN_WORKERS
  2. Check multiprocess directory:
# Verify directory exists and is writable
ls -la /tmp/prometheus_multiproc/
  3. Restart Gunicorn:
# Kill existing process
pkill -f gunicorn

# Start fresh
gunicorn -c gunicorn.conf.py app:app
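
To tell whether the endpoint is really stale, one approach is to capture two scrapes a few seconds apart (while sending some requests to the application) and compare them. The sketch below assumes the plain Prometheus text format without trailing timestamps; `parse_samples` and `changed_samples` are hypothetical helper names.

```python
def parse_samples(text: str) -> dict:
    """Map each sample (metric name plus labels) to its raw value string."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last space-separated field on the line
        key, _, value = line.rpartition(" ")
        samples[key] = value
    return samples


def changed_samples(before: str, after: str) -> list:
    """Return the sample keys whose value differs between two scrapes."""
    old, new = parse_samples(before), parse_samples(after)
    return sorted(key for key, value in new.items() if old.get(key) != value)
```

An empty result after traffic has been sent suggests that workers are not writing metrics, which points back to the environment variables and the multiprocess directory checked above.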

Worker Type Errors

Error:

TypeError: 'NoneType' object is not callable

Solution:

  1. Verify worker class is correctly specified:
# In gunicorn.conf.py
worker_class = "gunicorn_prometheus_exporter.PrometheusWorker"  # Sync
worker_class = "gunicorn_prometheus_exporter.PrometheusThreadWorker"  # Thread
worker_class = "gunicorn_prometheus_exporter.PrometheusEventletWorker"  # Eventlet
worker_class = "gunicorn_prometheus_exporter.PrometheusGeventWorker"  # Gevent
  2. Check if async dependencies are installed:
# For eventlet workers
python -c "import eventlet"

# For gevent workers
python -c "import gevent"
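
A typo in the dotted `worker_class` path can be caught before starting Gunicorn by resolving the path by hand. This is a simplified sketch of dotted-path resolution, not Gunicorn's own loader; `resolve_dotted_path` is a hypothetical helper name.

```python
import importlib


def resolve_dotted_path(path: str):
    """Import 'package.module.ClassName' and return the named attribute."""
    module_name, _, attr_name = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.ClassName', got {path!r}")
    module = importlib.import_module(module_name)  # ImportError if missing
    return getattr(module, attr_name)  # AttributeError if the name is wrong
```

For example, `resolve_dotted_path("gunicorn_prometheus_exporter.PrometheusWorker")` should return a class. An `ImportError` indicates a missing package or dependency, and an `AttributeError` indicates a misspelled class name.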

Configuration Issues

This section covers problems related to environment variables, Redis configuration, and other setup issues.

Environment Variables Not Set

Error:

ValueError: Environment variable PROMETHEUS_METRICS_PORT must be set in production

Solution:

  1. Set environment variables in your configuration:
# In gunicorn.conf.py

import os
os.environ.setdefault("PROMETHEUS_MULTIPROC_DIR", "/tmp/prometheus_multiproc")
os.environ.setdefault("PROMETHEUS_METRICS_PORT", "9090")
os.environ.setdefault("PROMETHEUS_BIND_ADDRESS", "0.0.0.0")
os.environ.setdefault("GUNICORN_WORKERS", "2")
  2. Or export them in your shell:
export PROMETHEUS_MULTIPROC_DIR="/tmp/prometheus_multiproc"
export PROMETHEUS_METRICS_PORT="9090"
export PROMETHEUS_BIND_ADDRESS="0.0.0.0"
export GUNICORN_WORKERS="2"
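
A quick pre-flight check can report every missing or malformed variable at once. The sketch below treats the four variables listed above as required and assumes that the port and worker count must parse as integers; `find_env_problems` is a hypothetical helper name.

```python
import os

REQUIRED_VARS = (
    "PROMETHEUS_MULTIPROC_DIR",
    "PROMETHEUS_METRICS_PORT",
    "PROMETHEUS_BIND_ADDRESS",
    "GUNICORN_WORKERS",
)
# Assumption: these two values are expected to be integers
INTEGER_VARS = ("PROMETHEUS_METRICS_PORT", "GUNICORN_WORKERS")


def find_env_problems(environ=os.environ) -> list:
    """Return human-readable problems; an empty list means the basics look fine."""
    problems = []
    for name in REQUIRED_VARS:
        if not environ.get(name):
            problems.append(f"{name} is not set")
    for name in INTEGER_VARS:
        value = environ.get(name)
        if value:
            try:
                int(value)
            except ValueError:
                problems.append(f"{name} must be an integer, got {value!r}")
    return problems
```

Calling `find_env_problems()` in the shell that launches Gunicorn shows whether the exported values are visible to the process.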

Redis Configuration Issues

Error:

ConnectionError: Error connecting to Redis

Solution:

  1. Check Redis server is running:
redis-cli ping
  2. Verify Redis configuration:
# In gunicorn.conf.py

import os
os.environ.setdefault("REDIS_ENABLED", "true")
os.environ.setdefault("REDIS_HOST", "localhost")
os.environ.setdefault("REDIS_PORT", "6379")
os.environ.setdefault("REDIS_DB", "0")
  3. Test Redis connection:
redis-cli -h localhost -p 6379 ping
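
When `redis-cli` is not installed on the application host, the same reachability test can be approximated from Python with a raw socket, since Redis accepts an inline `PING` command. This is a sketch that assumes no authentication or TLS is required; `redis_ping` is a hypothetical helper name.

```python
import socket


def redis_ping(host: str = "localhost", port: int = 6379, timeout: float = 2.0) -> bool:
    """Send an inline PING and report whether the server answered +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            reply = conn.recv(64)
    except OSError:
        # Connection refused, timeout, DNS failure, and similar errors
        return False
    return reply.startswith(b"+PONG")
```

A `False` result with a running server may simply mean that the server requires a password, in which case it replies with an error instead of `+PONG`.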

Debug Mode

This section provides guidance on enabling debug logging and diagnostic tools for troubleshooting.

Enable Debug Logging

# In gunicorn.conf.py
import logging
logging.basicConfig(level=logging.DEBUG)

# Or set specific logger
logging.getLogger('gunicorn_prometheus_exporter').setLevel(logging.DEBUG)

Verbose Gunicorn Output

# Start with verbose logging
gunicorn -c gunicorn.conf.py app:app --log-level debug

Check Metrics Endpoint

# Test metrics endpoint (0.0.0.0 is a bind address; connect via localhost or the host's IP)
curl http://localhost:9090/metrics

# Check for specific metrics
curl -s http://localhost:9090/metrics | grep gunicorn_worker

# Check for errors
curl -s http://localhost:9090/metrics | grep -i error
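
The same checks can be scripted, for instance in a health check. The sketch below uses only the standard library and assumes the endpoint is reachable at `http://localhost:9090/metrics`; `fetch_metrics` and `grep_metrics` are hypothetical helper names.

```python
import urllib.request


def grep_metrics(text: str, needle: str) -> list:
    """Return non-comment lines of a metrics payload that contain needle."""
    return [
        line
        for line in text.splitlines()
        if needle in line and not line.startswith("#")
    ]


def fetch_metrics(url: str = "http://localhost:9090/metrics", timeout: float = 5.0) -> str:
    """Download the metrics payload as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `grep_metrics(fetch_metrics(), "gunicorn_worker")` returns the worker samples, and an empty list suggests that no worker has reported yet.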

Diagnostic Commands

This section provides command-line tools and techniques for diagnosing system issues.

Check Process Status

# List Gunicorn processes
ps aux | grep gunicorn

# Check open ports
netstat -tlnp | grep 9090

# Check multiprocess directory
ls -la /tmp/prometheus_multiproc/

Monitor Metrics

# Watch metrics in real time
watch -n 1 'curl -s http://localhost:9090/metrics | grep gunicorn_worker_requests_total'

# Monitor specific worker
watch -n 1 'curl -s http://localhost:9090/metrics | grep "worker_id=\"worker_1\""'

Test Worker Types

# Test sync worker
gunicorn --config example/gunicorn_simple.conf.py example/app:app

# Test thread worker
gunicorn --config example/gunicorn_thread_worker.conf.py example/app:app

# Test eventlet worker
gunicorn --config example/gunicorn_eventlet_async.conf.py example/async_app:app

# Test gevent worker
gunicorn --config example/gunicorn_gevent_async.conf.py example/async_app:app

Async Worker Issues

This section addresses specific problems related to asynchronous worker types (Eventlet and Gevent).

Eventlet Worker Problems

Common Issues:

  1. Import errors: Install eventlet package
  2. WSGI compatibility: Use async-compatible application
  3. Worker connections: Set appropriate worker_connections

Solution:

# In gunicorn.conf.py
worker_class = "gunicorn_prometheus_exporter.PrometheusEventletWorker"
worker_connections = 1000

# Use async-compatible app (the wsgi_app setting requires Gunicorn 20.1.0 or later)
wsgi_app = "example.async_app:app"

Gevent Worker Problems

Common Issues:

  1. Import errors: Install gevent package
  2. Monkey patching: May conflict with other libraries
  3. Worker connections: Set appropriate worker_connections

Solution:

# In gunicorn.conf.py
worker_class = "gunicorn_prometheus_exporter.PrometheusGeventWorker"
worker_connections = 1000

# Use async-compatible app (the wsgi_app setting requires Gunicorn 20.1.0 or later)
wsgi_app = "example.async_app:app"

Alternative Solution:

# In gunicorn.conf.py - Use EventletWorker instead
worker_class = "gunicorn_prometheus_exporter.PrometheusEventletWorker"

# Use async-compatible app (the wsgi_app setting requires Gunicorn 20.1.0 or later)
wsgi_app = "example.async_app:app"

Performance Issues

This section covers performance-related problems and optimization strategies.

High Memory Usage

Symptoms:

  • Memory usage increases over time
  • Workers restart frequently

Solutions:

  1. Reduce worker count:
# In gunicorn.conf.py
workers = 2  # Lower than your current value (Gunicorn's own default is 1)
  2. Enable metric cleanup:
# In gunicorn.conf.py

import os
os.environ.setdefault("CLEANUP_DB_FILES", "true")
  3. Monitor memory metrics:
# Check memory usage
curl -s http://localhost:9090/metrics | grep gunicorn_worker_memory_bytes

High CPU Usage

Symptoms:

  • CPU usage spikes during requests
  • Slow response times

Solutions:

  1. Use appropriate worker type:
# For I/O-bound apps
worker_class = "gunicorn_prometheus_exporter.PrometheusThreadWorker"

# For async apps
worker_class = "gunicorn_prometheus_exporter.PrometheusEventletWorker"
  2. Monitor CPU metrics:
# Check CPU usage
curl -s http://localhost:9090/metrics | grep gunicorn_worker_cpu_percent

Slow Metrics Collection

Symptoms:

  • Metrics endpoint responds slowly
  • High latency in metric updates

Solutions:

  1. Reduce metric collection frequency:
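# In gunicorn.conf.py
import time

# Update worker metrics less frequently.
# Note: Gunicorn calls worker_int only when a worker receives SIGINT or SIGQUIT.
def worker_int(worker):
    # Skip the update if the previous one ran less than 10 seconds ago
    last_update = getattr(worker, "_last_metrics_update", None)
    if last_update is not None and time.time() - last_update < 10:
        return
    worker._last_metrics_update = time.time()
    worker.update_worker_metrics()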
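
The throttling idea above can be factored into a small reusable helper. This sketch swaps `time.time()` for `time.monotonic()` so that system clock adjustments do not affect the interval; the `Throttle` class is hypothetical and not part of the exporter.

```python
import time


class Throttle:
    """Allow an action at most once per `interval` seconds."""

    def __init__(self, interval: float, clock=time.monotonic):
        self.interval = interval
        self._clock = clock  # injectable so the logic can be tested
        self._last = None

    def ready(self) -> bool:
        """Return True and record the time if the interval has elapsed."""
        now = self._clock()
        if self._last is not None and now - self._last < self.interval:
            return False
        self._last = now
        return True
```

A caller would create one `Throttle(10)` per worker and run the expensive update only when `ready()` returns `True`.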

Recovery Procedures

Clean Restart

# Stop all Gunicorn processes
pkill -f gunicorn

# Clean multiprocess directory
rm -rf /tmp/prometheus_multiproc/*

# Restart with fresh configuration
gunicorn -c gunicorn.conf.py app:app
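
The cleanup step can also be scripted so that it removes only metric files rather than everything in the directory. This is a narrower sketch than the `rm -rf` above, assuming the metric files carry the `.db` extension used by the Prometheus Python client's multiprocess mode; `clean_multiproc_dir` is a hypothetical helper name.

```python
import glob
import os


def clean_multiproc_dir(path: str) -> int:
    """Delete *.db files directly inside path and return how many were removed."""
    removed = 0
    for db_file in glob.glob(os.path.join(path, "*.db")):
        os.remove(db_file)
        removed += 1
    return removed
```

As with the shell version, run it only after all Gunicorn processes have stopped, so that no worker is still writing to the files.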

Emergency Recovery

# Force kill all processes
pkill -9 -f gunicorn

# Clean all temporary files
rm -rf /tmp/prometheus_multiproc/*
rm -rf /tmp/gunicorn*

# Restart with minimal configuration
gunicorn --bind 0.0.0.0:8000 --workers 1 app:app

Data Recovery

# Backup metrics data
cp -r /tmp/prometheus_multiproc /backup/prometheus_multiproc_$(date +%Y%m%d_%H%M%S)

# Restore from backup (replace prometheus_multiproc_latest with the timestamped backup directory to restore)
cp -r /backup/prometheus_multiproc_latest/* /tmp/prometheus_multiproc/
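
Since the backup command produces timestamped directory names, a script has to pick the most recent one when restoring. The following sketch mirrors the naming scheme above with the standard library; `backup_dir` and `latest_backup` are hypothetical helper names.

```python
import shutil
import time
from pathlib import Path


def backup_dir(source: str, backup_root: str) -> Path:
    """Copy source into backup_root under a timestamped name; return the new path."""
    stamp = time.strftime("%Y%m%d_%H%M%S")
    target = Path(backup_root) / f"{Path(source).name}_{stamp}"
    shutil.copytree(source, target)  # raises FileExistsError if target exists
    return target


def latest_backup(backup_root: str, prefix: str):
    """Return the most recent backup directory for prefix, or None."""
    # Timestamped names sort chronologically when sorted as strings
    candidates = sorted(
        p for p in Path(backup_root).glob(f"{prefix}_*") if p.is_dir()
    )
    return candidates[-1] if candidates else None
```

For example, `latest_backup("/backup", "prometheus_multiproc")` returns the directory to copy back into `/tmp/prometheus_multiproc/`.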

Getting Help

This section provides information on obtaining additional support and reporting issues.

Debug Information

When reporting issues, include:

  1. Gunicorn version:
gunicorn --version
  2. Python version:
python --version
  3. Installed packages:
pip list | grep gunicorn
  4. Configuration file:
cat gunicorn.conf.py
  5. Error logs:
gunicorn -c gunicorn.conf.py app:app --log-level debug 2>&1
  6. Metrics endpoint:
curl http://localhost:9090/metrics
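
The version details can be gathered in one step and pasted into a report. This is a sketch using `importlib.metadata` (Python 3.8+); the default package list is an assumption about what is relevant, and `collect_versions` is a hypothetical helper name.

```python
import platform
import sys
from importlib import metadata


def collect_versions(
    packages=("gunicorn", "gunicorn-prometheus-exporter", "prometheus-client"),
):
    """Return Python, platform, and installed package versions for a bug report."""
    info = {"python": sys.version.split()[0], "platform": platform.platform()}
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

Run it with the same interpreter that runs Gunicorn so that the reported versions match the failing environment.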

Support Channels


For more help, see the Installation Guide and Configuration Reference.