Troubleshooting Guide¶

This guide collects solutions to common problems encountered when using the Gunicorn Prometheus Exporter.

Common Issues¶

This section addresses the most frequently encountered problems and their solutions.

Port Already in Use¶

Error:

OSError: [Errno 98] Address already in use

Solution:

Change the metrics port in your configuration:

# In gunicorn.conf.py

import os
os.environ.setdefault("PROMETHEUS_METRICS_PORT", "9091")  # Use a different port

Or kill the process using the port:

# Find the process

lsof -i :9090

# Stop the process (escalate to kill -9 only if it ignores the default SIGTERM)

kill <PID>
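Before picking a replacement port, it can help to confirm which candidates are actually free. The following is a minimal sketch using only the standard library; the helper name `port_is_free` is illustrative and not part of the exporter.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically errno 98 (EADDRINUSE) on Linux when the port is taken
            return False
    return True


if __name__ == "__main__":
    for candidate in (9090, 9091):
        state = "free" if port_is_free(candidate) else "in use"
        print(f"port {candidate}: {state}")
```

The check is only a snapshot: another process can still claim the port between this test and the moment Gunicorn starts.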

Permission Denied¶

Error:

PermissionError: [Errno 13] Permission denied

Solution:

Check multiprocess directory permissions:

# Create directory with proper permissions

mkdir -p /tmp/prometheus_multiproc
chmod 755 /tmp/prometheus_multiproc

Or use a different directory:

# In gunicorn.conf.py

import os

os.environ.setdefault("PROMETHEUS_MULTIPROC_DIR", "/var/tmp/prometheus_multiproc")
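To confirm that the user running Gunicorn can really write to the directory, a small check like the one below may be useful. It is a sketch under the assumption that the directory is taken from `PROMETHEUS_MULTIPROC_DIR`; the function name `check_multiproc_dir` is hypothetical.

```python
import os
import tempfile


def check_multiproc_dir(path):
    """Create path if needed and confirm this user can write files in it."""
    try:
        os.makedirs(path, exist_ok=True)
        # Creating and removing a real file is a stronger test than os.access()
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError as exc:
        return False, f"{path} is not writable: {exc}"
    return True, f"{path} is writable"


if __name__ == "__main__":
    target = os.environ.get("PROMETHEUS_MULTIPROC_DIR", "/tmp/prometheus_multiproc")
    ok, message = check_multiproc_dir(target)
    print(message)
```

Run it as the same user that runs Gunicorn; a check performed as root or as your login user can pass even when the service account lacks access.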

Import Errors for Async Workers¶

Error:

ModuleNotFoundError: No module named 'eventlet'

Solution:

Install the required dependencies:

# For eventlet workers (quote the argument so shells such as zsh do not expand the brackets)
pip install "gunicorn-prometheus-exporter[eventlet]"

# For gevent workers
pip install "gunicorn-prometheus-exporter[gevent]"

# Or install all async dependencies
pip install "gunicorn-prometheus-exporter[async]"

Verify the installation:

python -c "import eventlet; print('eventlet available')"
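The same verification can be done for several modules at once without importing them. This is a minimal sketch; `missing_modules` is an illustrative helper, not something the exporter provides.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on sys.path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["eventlet", "gevent"])
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all async dependencies available")
```

Run it with the same interpreter (and virtual environment) that Gunicorn uses, otherwise the result may not reflect what the workers see.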

Metrics Not Updating¶

Issue: Metrics endpoint shows stale or no data.

Solutions:

Check environment variables:

# Verify all required variables are set
echo $PROMETHEUS_MULTIPROC_DIR
echo $PROMETHEUS_METRICS_PORT
echo $PROMETHEUS_BIND_ADDRESS
echo $GUNICORN_WORKERS

Check multiprocess directory:

# Verify directory exists and is writable
ls -la /tmp/prometheus_multiproc/
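A scripted version of the directory check is sketched below. It assumes the exporter relies on the file-based multiprocess mode of `prometheus_client`, which stores samples in `.db` files; if no such files appear while workers are serving requests, the workers are probably writing somewhere else. The helper name `list_db_files` is hypothetical.

```python
import os


def list_db_files(directory):
    """Return sorted (file name, size in bytes) pairs for *.db files in directory."""
    found = []
    for entry in os.scandir(directory):
        if entry.is_file() and entry.name.endswith(".db"):
            found.append((entry.name, entry.stat().st_size))
    return sorted(found)


if __name__ == "__main__":
    target = os.environ.get("PROMETHEUS_MULTIPROC_DIR", "/tmp/prometheus_multiproc")
    try:
        files = list_db_files(target)
    except OSError as exc:
        print(f"cannot read {target}: {exc}")
    else:
        print(f"{len(files)} metric file(s) in {target}")
        for name, size in files:
            print(f"  {name}: {size} bytes")
```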

Restart Gunicorn:

# Kill existing process
pkill -f gunicorn

# Start fresh
gunicorn -c gunicorn.conf.py app:app

Worker Type Errors¶

Error:

TypeError: 'NoneType' object is not callable

Solution:

Verify that the worker class is correctly specified. Set exactly one of the following; if several assignments are present, the last one wins:

# In gunicorn.conf.py
worker_class = "gunicorn_prometheus_exporter.PrometheusWorker"  # Sync
# worker_class = "gunicorn_prometheus_exporter.PrometheusThreadWorker"  # Thread
# worker_class = "gunicorn_prometheus_exporter.PrometheusEventletWorker"  # Eventlet
# worker_class = "gunicorn_prometheus_exporter.PrometheusGeventWorker"  # Gevent

Check if async dependencies are installed:

# For eventlet workers
python -c "import eventlet"

# For gevent workers
python -c "import gevent"
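To check that a `worker_class` string resolves to a real class before starting Gunicorn, a dotted path can be imported by hand. This is a rough sketch, not Gunicorn's own loader; `resolve_dotted_path` is an illustrative name.

```python
import importlib


def resolve_dotted_path(dotted_path):
    """Import 'package.module.ClassName' and return the named attribute."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.ClassName', got {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


if __name__ == "__main__":
    path = "gunicorn_prometheus_exporter.PrometheusWorker"
    try:
        worker = resolve_dotted_path(path)
    except (ImportError, AttributeError) as exc:
        print(f"cannot resolve {path}: {exc}")
    else:
        print(f"{path} resolves to {worker!r}")
```

An `ImportError` points at a missing package or dependency, while an `AttributeError` usually means the class name is misspelled.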

Configuration Issues¶

This section covers problems related to environment variables, Redis configuration, and other setup issues.

Environment Variables Not Set¶

Error:

ValueError: Environment variable PROMETHEUS_METRICS_PORT must be set in production

Solution:

Set environment variables in your configuration:

# In gunicorn.conf.py

import os
os.environ.setdefault("PROMETHEUS_MULTIPROC_DIR", "/tmp/prometheus_multiproc")
os.environ.setdefault("PROMETHEUS_METRICS_PORT", "9090")
os.environ.setdefault("PROMETHEUS_BIND_ADDRESS", "0.0.0.0")
os.environ.setdefault("GUNICORN_WORKERS", "2")

Or export them in your shell:

export PROMETHEUS_MULTIPROC_DIR="/tmp/prometheus_multiproc"
export PROMETHEUS_METRICS_PORT="9090"
export PROMETHEUS_BIND_ADDRESS="0.0.0.0"
export GUNICORN_WORKERS="2"
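A quick way to see which of these variables are still unset is sketched below. The list of names is taken from this guide; the helper `missing_env_vars` is illustrative and not part of the exporter.

```python
import os

REQUIRED_VARS = (
    "PROMETHEUS_MULTIPROC_DIR",
    "PROMETHEUS_METRICS_PORT",
    "PROMETHEUS_BIND_ADDRESS",
    "GUNICORN_WORKERS",
)


def missing_env_vars(environ=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("unset:", ", ".join(missing))
    else:
        print("all required variables are set")
```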

Redis Configuration Issues¶

Error:

ConnectionError: Error connecting to Redis

Solution:

Check that the Redis server is running:

redis-cli ping

Verify Redis configuration:

# In gunicorn.conf.py

import os
os.environ.setdefault("REDIS_ENABLED", "true")
os.environ.setdefault("REDIS_HOST", "localhost")
os.environ.setdefault("REDIS_PORT", "6379")
os.environ.setdefault("REDIS_DB", "0")

Test Redis connection:

redis-cli -h localhost -p 6379 ping
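If `redis-cli` is not installed on the host, plain TCP reachability can be tested from Python. The sketch below only confirms that something accepts connections on the configured host and port; it does not speak the Redis protocol or check authentication. The helper name `can_connect` is hypothetical.

```python
import os
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    host = os.environ.get("REDIS_HOST", "localhost")
    port = int(os.environ.get("REDIS_PORT", "6379"))
    status = "reachable" if can_connect(host, port) else "unreachable"
    print(f"{host}:{port} is {status}")
```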

Debug Mode¶

This section provides guidance on enabling debug logging and diagnostic tools for troubleshooting.

Enable Debug Logging¶

# In gunicorn.conf.py
import logging
logging.basicConfig(level=logging.DEBUG)

# Or set specific logger
logging.getLogger('gunicorn_prometheus_exporter').setLevel(logging.DEBUG)

Verbose Gunicorn Output¶

# Start with verbose logging
gunicorn -c gunicorn.conf.py app:app --log-level debug

Check Metrics Endpoint¶

# Test metrics endpoint
# (0.0.0.0 is a bind address; query via localhost or the host's real address)
curl http://localhost:9090/metrics

# Check for specific metrics
curl -s http://localhost:9090/metrics | grep gunicorn_worker

# Check for errors
curl -s http://localhost:9090/metrics | grep -i error
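The same endpoint check can be scripted, for example in a health check. The sketch below assumes the exporter listens on `localhost:9090`; `fetch_metrics` and `filter_samples` are illustrative helpers.

```python
from urllib.request import urlopen


def filter_samples(metrics_text, name_prefix):
    """Return sample lines (not # HELP / # TYPE comments) starting with name_prefix."""
    return [
        line
        for line in metrics_text.splitlines()
        if line.startswith(name_prefix) and not line.startswith("#")
    ]


def fetch_metrics(url="http://localhost:9090/metrics", timeout=5):
    """Download the metrics page and return it as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    try:
        text = fetch_metrics()
    except OSError as exc:  # URLError is a subclass of OSError
        print(f"metrics endpoint not reachable: {exc}")
    else:
        for line in filter_samples(text, "gunicorn_worker"):
            print(line)
```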

Diagnostic Commands¶

This section provides command-line tools and techniques for diagnosing system issues.

Check Process Status¶

# List Gunicorn processes
ps aux | grep gunicorn

# Check open ports
netstat -tlnp | grep 9090

# Check multiprocess directory
ls -la /tmp/prometheus_multiproc/

Monitor Metrics¶

# Watch metrics in real-time
watch -n 1 'curl -s http://localhost:9090/metrics | grep gunicorn_worker_requests_total'

# Monitor specific worker
watch -n 1 'curl -s http://localhost:9090/metrics | grep "worker_id=\"worker_1\""'

Test Worker Types¶

# Test sync worker
gunicorn --config example/gunicorn_simple.conf.py example/app:app

# Test thread worker
gunicorn --config example/gunicorn_thread_worker.conf.py example/app:app

# Test eventlet worker
gunicorn --config example/gunicorn_eventlet_async.conf.py example/async_app:app

# Test gevent worker
gunicorn --config example/gunicorn_gevent_async.conf.py example/async_app:app

Async Worker Issues¶

This section addresses specific problems related to asynchronous worker types (Eventlet and Gevent).

Eventlet Worker Problems¶

Common Issues:

Import errors: Install eventlet package
WSGI compatibility: Use async-compatible application
Worker connections: Set appropriate worker_connections

Solution:

# In gunicorn.conf.py
worker_class = "gunicorn_prometheus_exporter.PrometheusEventletWorker"
worker_connections = 1000

# Use an async-compatible app (the Gunicorn setting is wsgi_app, available since Gunicorn 20.1.0)
wsgi_app = "example.async_app:app"

Gevent Worker Problems¶

Common Issues:

Import errors: Install gevent package
Monkey patching: May conflict with other libraries
Worker connections: Set appropriate worker_connections

Solution:

# In gunicorn.conf.py
worker_class = "gunicorn_prometheus_exporter.PrometheusGeventWorker"
worker_connections = 1000

# Use an async-compatible app (the Gunicorn setting is wsgi_app, available since Gunicorn 20.1.0)
wsgi_app = "example.async_app:app"

Alternative Solution:

# In gunicorn.conf.py - Use EventletWorker instead
worker_class = "gunicorn_prometheus_exporter.PrometheusEventletWorker"

# Use an async-compatible app (the Gunicorn setting is wsgi_app, available since Gunicorn 20.1.0)
wsgi_app = "example.async_app:app"

Performance Issues¶

This section covers performance-related problems and optimization strategies.

High Memory Usage¶

Symptoms:

Memory usage increases over time
Workers restart frequently

Solutions:

Reduce worker count:

# In gunicorn.conf.py
workers = 2  # Reduce from your current setting

Enable metric cleanup:

# In gunicorn.conf.py

import os
os.environ.setdefault("CLEANUP_DB_FILES", "true")

Monitor memory metrics:

# Check memory usage
curl -s http://localhost:9090/metrics | grep gunicorn_worker_memory_bytes

High CPU Usage¶

Symptoms:

CPU usage spikes during requests
Slow response times

Solutions:

Use a worker type that matches the workload (set only one worker_class):

# For I/O-bound apps
worker_class = "gunicorn_prometheus_exporter.PrometheusThreadWorker"

# For async apps
# worker_class = "gunicorn_prometheus_exporter.PrometheusEventletWorker"

Monitor CPU metrics:

# Check CPU usage
curl -s http://localhost:9090/metrics | grep gunicorn_worker_cpu_percent

Slow Metrics Collection¶

Symptoms:

Metrics endpoint responds slowly
High latency in metric updates

Solutions:

Reduce metric collection frequency:

# In gunicorn.conf.py
import time

# Update worker metrics at most once every 10 seconds
def worker_int(worker):
    last_update = getattr(worker, "_last_metrics_update", None)
    if last_update is not None and time.time() - last_update < 10:
        return  # Updated recently; skip this call
    worker._last_metrics_update = time.time()
    worker.update_worker_metrics()
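The rate-limiting idea above can be isolated into a small reusable object, which also makes it easy to test. This is a sketch, not part of the exporter; the class name `Throttle` is hypothetical, and it uses `time.monotonic` so that system clock adjustments do not affect the interval.

```python
import time


class Throttle:
    """Allow an action at most once per `interval` seconds."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self._last_run = None  # No run recorded yet

    def ready(self):
        """Return True and record the time if the interval has elapsed."""
        now = self.clock()
        if self._last_run is not None and now - self._last_run < self.interval:
            return False
        self._last_run = now
        return True


if __name__ == "__main__":
    throttle = Throttle(interval=10)
    print(throttle.ready())  # First call is always allowed
    print(throttle.ready())  # Immediately afterwards the call is skipped
```

A hook would then call the metrics update only when `throttle.ready()` returns True.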

Recovery Procedures¶

Clean Restart¶

# Stop all Gunicorn processes
pkill -f gunicorn

# Clean multiprocess directory
rm -rf /tmp/prometheus_multiproc/*

# Restart with fresh configuration
gunicorn -c gunicorn.conf.py app:app

Emergency Recovery¶

# Force kill all processes
pkill -9 -f gunicorn

# Clean all temporary files
rm -rf /tmp/prometheus_multiproc/*
rm -rf /tmp/gunicorn*

# Restart with minimal configuration
gunicorn --bind 0.0.0.0:8000 --workers 1 app:app

Data Recovery¶

# Backup metrics data
cp -r /tmp/prometheus_multiproc /backup/prometheus_multiproc_$(date +%Y%m%d_%H%M%S)

# Restore from backup (prometheus_multiproc_latest stands for the backup directory you want to restore)
cp -r /backup/prometheus_multiproc_latest/* /tmp/prometheus_multiproc/
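The backup step can also be scripted in Python, for instance from a deployment tool. The sketch below mirrors the timestamped naming used by the `cp` command; the function name `backup_metrics_dir` and the `/backup` location are assumptions for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_metrics_dir(source, backup_root, now=None):
    """Copy source to backup_root/prometheus_multiproc_<timestamp>; return the new path."""
    now = datetime.now() if now is None else now
    destination = Path(backup_root) / f"prometheus_multiproc_{now:%Y%m%d_%H%M%S}"
    # copytree creates missing parent directories and fails if destination exists
    shutil.copytree(source, destination)
    return destination


if __name__ == "__main__":
    try:
        created = backup_metrics_dir("/tmp/prometheus_multiproc", "/backup")
    except OSError as exc:
        print(f"backup failed: {exc}")
    else:
        print(f"backup written to {created}")
```

Stop Gunicorn before copying so the files are not modified while the backup is taken.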

Getting Help¶

This section provides information on obtaining additional support and reporting issues.

Debug Information¶

When reporting issues, include:

Gunicorn version:

gunicorn --version

Python version:

python --version

Installed packages:

pip list | grep gunicorn

Configuration file:

cat gunicorn.conf.py

Error logs:

gunicorn -c gunicorn.conf.py app:app --log-level debug 2>&1

Metrics endpoint:

curl http://localhost:9090/metrics
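Version details for a bug report can be gathered in one step. The following sketch uses only the standard library; `collect_debug_info` is an illustrative helper, and the default package list is an assumption about which distributions are relevant.

```python
import platform
import sys
from importlib import metadata


def collect_debug_info(
    packages=("gunicorn", "gunicorn-prometheus-exporter", "prometheus-client"),
):
    """Return Python, platform, and installed package versions as a dict."""
    info = {
        "python": platform.python_version(),
        "executable": sys.executable,
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_debug_info().items():
        print(f"{key}: {value}")
```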

Support Channels¶

GitHub Issues: Report bugs and feature requests
Documentation: Check the Backend API, Config API, Hooks API, Metrics API, or Plugin API
Examples: See the example/ directory for working configurations

For more help, see the Installation Guide and Configuration Reference.