
GPU Monitoring and Fan Control System

A monitoring and fan control system for AMD GPUs on Linux. It provides real-time desktop monitoring, profile-based fan control, a web interface, and system service integration.

Features

🖥️ Desktop Monitoring

Real-time GPU monitoring with PyQt5 desktop overlay
System tray integration for minimal footprint monitoring
Configurable display modes (overlay, tray, full dashboard)
Multiple GPU support with automatic detection

🌡️ Advanced Fan Control

Multiple temperature curves (Silent, Balanced, Performance, Custom)
Profile-based control with hotkey switching
Safety limits and automatic fallback modes
Manual override capabilities

🌐 Web Interface

Remote monitoring via web browser
Real-time charts and historical data
Mobile-responsive design
API endpoints for integration

📊 Data Logging & Analysis

Historical data storage with SQLite
Performance analytics and trend analysis
Export capabilities (CSV, JSON)
Alert system for temperature thresholds

🔧 System Integration

Systemd service for automatic startup
Configuration management with JSON profiles
Autostart integration for desktop environments
Permission handling and error recovery

Installation

Prerequisites

# Install required packages
sudo apt update
sudo apt install python3 python3-pip python3-venv
sudo apt install python3-pyqt5 python3-pyqt5.qtopengl
sudo apt install python3-matplotlib python3-flask

Setup

# Create project directory (copy the project files, including requirements.txt and setup.sh, into it)
mkdir -p ~/gpu_monitoring_system
cd ~/gpu_monitoring_system

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install Python dependencies
pip install -r requirements.txt

# Run setup script
./setup.sh

Usage

Desktop Monitor

# Start desktop monitoring overlay
python3 gpu_monitor_desktop.py

# Start with system tray
python3 gpu_monitor_desktop.py --tray

# Start full dashboard
python3 gpu_monitor_desktop.py --dashboard

Fan Control

# Start fan control with default profile
python3 gpu_fan_controller.py

# Start with specific profile
python3 gpu_fan_controller.py --profile performance

# Start with custom configuration
python3 gpu_fan_controller.py --config custom.json

Web Interface

# Start web server
python3 web_interface.py

# Access at http://localhost:5000

System Service

# Install as system service
sudo ./install_service.sh

# Start service
sudo systemctl start gpu-monitoring

# Enable auto-start
sudo systemctl enable gpu-monitoring

Configuration

Fan Control Profiles

Create custom fan control profiles in config/fan_profiles.json:

{
  "silent": {
    "name": "Silent",
    "description": "Quiet operation with lower fan speeds",
    "curve": {
      "min_temp": 40,
      "max_temp": 65,
      "min_pwm": 120,
      "max_pwm": 220
    },
    "safety": {
      "emergency_temp": 85,
      "emergency_pwm": 255
    }
  },
  "performance": {
    "name": "Performance",
    "description": "Maximum cooling for high performance",
    "curve": {
      "min_temp": 35,
      "max_temp": 55,
      "min_pwm": 180,
      "max_pwm": 255
    },
    "safety": {
      "emergency_temp": 80,
      "emergency_pwm": 255
    }
  }
}
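
The profile fields suggest a ramp from (min_temp, min_pwm) to (max_temp, max_pwm) with an emergency override. The sketch below shows one way such a profile could be turned into a PWM value. It assumes linear interpolation between the two endpoints, which the project may or may not use, and the function name `pwm_for_temperature` is hypothetical.

```python
def pwm_for_temperature(profile: dict, temp_c: float) -> int:
    """Map a temperature to a PWM value using one profile entry.

    Assumption: the curve is a straight line between its two endpoints.
    """
    curve = profile["curve"]
    safety = profile["safety"]

    # Safety override: at or above the emergency temperature, use emergency PWM.
    if temp_c >= safety["emergency_temp"]:
        return safety["emergency_pwm"]

    # Clamp outside the curve's temperature range.
    if temp_c <= curve["min_temp"]:
        return curve["min_pwm"]
    if temp_c >= curve["max_temp"]:
        return curve["max_pwm"]

    # Linear interpolation between the two endpoints.
    span = curve["max_temp"] - curve["min_temp"]
    fraction = (temp_c - curve["min_temp"]) / span
    pwm = curve["min_pwm"] + fraction * (curve["max_pwm"] - curve["min_pwm"])
    return round(pwm)


# The "silent" profile from the example above, reduced to the fields used here.
silent = {
    "curve": {"min_temp": 40, "max_temp": 65, "min_pwm": 120, "max_pwm": 220},
    "safety": {"emergency_temp": 85, "emergency_pwm": 255},
}
print(pwm_for_temperature(silent, 52.5))  # halfway along the curve: 170
```

Under this assumption, temperatures between max_temp and emergency_temp hold at max_pwm, and only the emergency threshold forces emergency_pwm.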

Monitoring Configuration

Configure monitoring settings in config/monitoring.json:

{
  "update_interval": 1.0,
  "display_mode": "overlay",
  "show_gpu_load": true,
  "show_temperature": true,
  "show_fan_speed": true,
  "show_power": true,
  "show_vram": true,
  "alerts": {
    "enabled": true,
    "temp_warning": 75,
    "temp_critical": 85,
    "power_warning": 200
  }
}
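
As an illustration of how the alerts block might be evaluated, here is a minimal sketch. It assumes temperatures are in degrees Celsius and power_warning is in watts, and that a critical temperature supersedes a warning. The function name `check_alerts` is hypothetical and not taken from the project.

```python
import json

# Only the "alerts" part of config/monitoring.json is needed for this sketch.
MONITORING_JSON = """
{
  "alerts": {
    "enabled": true,
    "temp_warning": 75,
    "temp_critical": 85,
    "power_warning": 200
  }
}
"""


def check_alerts(alerts: dict, temp_c: float, power_w: float) -> list:
    """Return the names of the thresholds that the given readings exceed."""
    if not alerts.get("enabled", False):
        return []

    triggered = []
    # Report only the most severe temperature level.
    if temp_c >= alerts["temp_critical"]:
        triggered.append("temp_critical")
    elif temp_c >= alerts["temp_warning"]:
        triggered.append("temp_warning")

    if power_w >= alerts["power_warning"]:
        triggered.append("power_warning")
    return triggered


alerts = json.loads(MONITORING_JSON)["alerts"]
print(check_alerts(alerts, temp_c=78.0, power_w=210.0))
```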

Monitoring Data

The system collects and stores the following data:

GPU Metrics

Temperature: Core temperature in Celsius
Load: GPU utilization percentage
Fan Speed: RPM and PWM percentage
Power: Current power draw in watts
VRAM: Used and total memory
Clocks: Core and memory clock speeds
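
Several of these metrics are typically exposed by the Linux amdgpu driver as plain-text files under the hwmon directory shown in the Troubleshooting section. The sketch below reads a few of them. The file names (temp1_input, fan1_input, pwm1, power1_average) and their units are those commonly found on amdgpu systems and may differ by GPU and kernel version, and `read_gpu_metrics` is a hypothetical helper, not the project's own reader.

```python
from pathlib import Path
from typing import Optional


def read_number(path: Path) -> Optional[float]:
    """Return the numeric content of a sysfs file, or None if unavailable."""
    try:
        return float(path.read_text().strip())
    except (OSError, ValueError):
        return None


def read_gpu_metrics(hwmon_dir: Path) -> dict:
    """Read temperature, fan, and power values from one hwmon directory."""
    temp = read_number(hwmon_dir / "temp1_input")      # millidegrees Celsius
    rpm = read_number(hwmon_dir / "fan1_input")        # revolutions per minute
    pwm = read_number(hwmon_dir / "pwm1")              # raw value, 0-255
    power = read_number(hwmon_dir / "power1_average")  # microwatts
    return {
        "temperature_c": None if temp is None else temp / 1000.0,
        "fan_rpm": rpm,
        "fan_pwm_percent": None if pwm is None else round(pwm / 255.0 * 100.0, 1),
        "power_w": None if power is None else power / 1_000_000.0,
    }


# Print readings for every GPU hwmon directory found (none on other systems).
for hwmon in sorted(Path("/sys/class/drm").glob("card*/device/hwmon/hwmon*")):
    print(hwmon, read_gpu_metrics(hwmon))
```

Missing files yield None instead of raising, so a GPU that lacks one sensor still reports the others.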

System Metrics

CPU Usage: Overall system load
Memory Usage: System RAM utilization
Disk Usage: Storage space monitoring
Network: Bandwidth usage
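
To make the storage and export features concrete, the following sketch stores samples in SQLite and exports them as CSV using only the standard library. The table name `gpu_samples`, its columns, and the helper names are assumptions chosen for illustration; the project's actual schema is not documented here.

```python
import csv
import io
import sqlite3
import time

# Hypothetical schema: one row per sample, timestamp in seconds since the epoch.
SCHEMA = """
CREATE TABLE IF NOT EXISTS gpu_samples (
    ts REAL NOT NULL,
    temperature_c REAL,
    load_percent REAL,
    fan_rpm REAL,
    power_w REAL
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the sample database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_sample(conn: sqlite3.Connection, sample: dict, ts=None) -> None:
    """Insert one sample; missing metrics are stored as NULL."""
    conn.execute(
        "INSERT INTO gpu_samples VALUES (?, ?, ?, ?, ?)",
        (
            time.time() if ts is None else ts,
            sample.get("temperature_c"),
            sample.get("load_percent"),
            sample.get("fan_rpm"),
            sample.get("power_w"),
        ),
    )
    conn.commit()


def export_csv(conn: sqlite3.Connection) -> str:
    """Return all samples as CSV text with a header row, oldest first."""
    cursor = conn.execute("SELECT * FROM gpu_samples ORDER BY ts")
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return buffer.getvalue()


conn = open_store()
record_sample(
    conn,
    {"temperature_c": 61.0, "load_percent": 42.0, "fan_rpm": 1400.0, "power_w": 95.5},
    ts=1.0,
)
print(export_csv(conn))
```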

Web Interface Features

Dashboard

Real-time metric display
Historical charts with configurable time ranges
System health overview
Alert status and history

Charts

Temperature trends over time
Fan speed and PWM curves
Power consumption patterns
GPU utilization history

Configuration

Fan profile management
Alert threshold configuration
Display settings
Data export options

API Endpoints

Monitoring Data

GET /api/status - Current GPU status
GET /api/history - Historical data
GET /api/metrics - All available metrics

Fan Control

POST /api/fan/profile - Set fan profile
POST /api/fan/manual - Manual fan control
GET /api/fan/status - Current fan status

System

GET /api/system - System information
POST /api/alerts - Configure alerts
GET /api/logs - System logs
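
A client for these endpoints could look like the sketch below, which uses only the standard library. It assumes the server runs at http://localhost:5000, that responses are JSON, and that POST /api/fan/profile accepts a JSON body with a "profile" field. The request and response formats are not documented above, so treat the field name and both helper functions as hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # default address from the Usage section


def fetch_json(path: str, base_url: str = BASE_URL, timeout: float = 5.0):
    """Send a GET request and parse the response body as JSON (assumed format)."""
    with urllib.request.urlopen(base_url + path, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def build_profile_request(profile: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for /api/fan/profile without sending it."""
    # Assumption: the endpoint expects {"profile": "<name>"} as its JSON body.
    body = json.dumps({"profile": profile}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/fan/profile",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_profile_request("silent")
print(request.get_method(), request.full_url)
```

With the web server running, `fetch_json("/api/status")` would return the parsed status, and `urllib.request.urlopen(request)` would submit the profile change.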

Troubleshooting

Permission Issues

If you encounter permission errors with GPU monitoring:

# Check GPU permissions
ls -la /sys/class/drm/card*/device/hwmon/

# Add user to video group (log out and back in for the change to take effect)
sudo usermod -a -G video $USER

# Or run with sudo for fan control
sudo python3 gpu_fan_controller.py
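
The same check can be made from Python before starting the controller. This sketch reports whether the current user can read the temperature sensor and write the fan PWM file. It assumes the amdgpu file names temp1_input and pwm1, and `fan_control_access` is a hypothetical helper. On many systems the PWM file is writable only by root, in which case group membership alone may not be enough.

```python
import os
from pathlib import Path


def fan_control_access(hwmon_dir: Path) -> dict:
    """Report read/write access to the files needed for monitoring and fan control."""
    return {
        # Reading the temperature is enough for monitoring.
        "can_read_temperature": os.access(hwmon_dir / "temp1_input", os.R_OK),
        # Fan control additionally needs write access to the PWM file.
        "can_write_pwm": os.access(hwmon_dir / "pwm1", os.W_OK),
    }


# Check every GPU hwmon directory found on this machine.
for hwmon in sorted(Path("/sys/class/drm").glob("card*/device/hwmon/hwmon*")):
    print(hwmon, fan_control_access(hwmon))
```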

Missing Dependencies

# Install missing PyQt5
pip install PyQt5 PyQt5-sip

# Install missing matplotlib
pip install matplotlib

# Install missing Flask
pip install flask

Service Issues

# Check service status
sudo systemctl status gpu-monitoring

# View service logs
sudo journalctl -u gpu-monitoring -f

# Restart service
sudo systemctl restart gpu-monitoring

Development

Adding New GPU Support

Update gpu_detector.py with new GPU detection logic
Add temperature sensor paths for new GPU models
Test with python3 test_gpu_detection.py
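
For orientation, detection on Linux can be based on the PCI vendor ID that each DRM card exposes in sysfs. The sketch below is not the contents of gpu_detector.py. It is a minimal illustration that assumes the standard /sys/class/drm layout, and the function name `find_amd_gpus` is hypothetical.

```python
from pathlib import Path

AMD_VENDOR_ID = "0x1002"  # PCI vendor ID assigned to AMD


def find_amd_gpus(drm_root: Path = Path("/sys/class/drm")) -> list:
    """Return one entry per AMD card with its first hwmon directory, if any."""
    gpus = []
    for card in sorted(drm_root.glob("card[0-9]*")):
        # Skip connector entries such as card0-DP-1.
        if "-" in card.name:
            continue
        try:
            vendor = (card / "device" / "vendor").read_text().strip().lower()
        except OSError:
            continue
        if vendor != AMD_VENDOR_ID:
            continue
        # Sensor files live in a numbered hwmon subdirectory.
        hwmon_dirs = sorted((card / "device" / "hwmon").glob("hwmon*"))
        gpus.append({"card": card.name, "hwmon": hwmon_dirs[0] if hwmon_dirs else None})
    return gpus


print(find_amd_gpus())
```

Passing a different `drm_root` makes the function testable against a fake directory tree, which helps when adding support for hardware you do not have at hand.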

Custom Fan Curves

Create new profile in config/fan_profiles.json
Test with python3 gpu_fan_controller.py --profile new_profile
Validate temperature response and stability
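
Before testing a new profile on real hardware, a basic sanity check of its values can catch obvious mistakes. The sketch below assumes the profile layout shown under Fan Control Profiles and a PWM range of 0 to 255. The specific rules, such as requiring emergency_temp to exceed max_temp, are illustrative assumptions, and `validate_profile` is a hypothetical helper.

```python
def validate_profile(profile: dict) -> list:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    curve = profile.get("curve", {})
    safety = profile.get("safety", {})

    # All fields must be present before the values can be compared.
    for key in ("min_temp", "max_temp", "min_pwm", "max_pwm"):
        if key not in curve:
            problems.append(f"curve.{key} is missing")
    for key in ("emergency_temp", "emergency_pwm"):
        if key not in safety:
            problems.append(f"safety.{key} is missing")
    if problems:
        return problems

    if curve["min_temp"] >= curve["max_temp"]:
        problems.append("curve.min_temp must be below curve.max_temp")
    if curve["min_pwm"] > curve["max_pwm"]:
        problems.append("curve.min_pwm must not exceed curve.max_pwm")
    for section, key in (("curve", "min_pwm"), ("curve", "max_pwm"), ("safety", "emergency_pwm")):
        value = profile[section][key]
        if not 0 <= value <= 255:
            problems.append(f"{section}.{key} must be between 0 and 255")
    if safety["emergency_temp"] <= curve["max_temp"]:
        problems.append("safety.emergency_temp must be above curve.max_temp")
    return problems


new_profile = {
    "curve": {"min_temp": 35, "max_temp": 55, "min_pwm": 180, "max_pwm": 255},
    "safety": {"emergency_temp": 80, "emergency_pwm": 255},
}
print(validate_profile(new_profile))  # [] for the "performance" values shown earlier
```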

Web Interface Extensions

Add new routes in web_interface.py
Create templates in templates/ directory
Add static assets in static/ directory

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests for your changes
Submit a pull request

Support

For support and questions:

Create an issue on GitHub
Check the troubleshooting section
Review the configuration examples
