GPU Monitoring and Fan Control System
A monitoring and fan control system for AMD GPUs. It combines real-time monitoring on the desktop, profile-based fan control, a web interface for remote access, and integration with systemd and desktop autostart.
Features
🖥️ Desktop Monitoring
- Real-time GPU monitoring with PyQt5 desktop overlay
- System tray integration for minimal footprint monitoring
- Configurable display modes (overlay, tray, full dashboard)
- Multiple GPU support with automatic detection
🌡️ Advanced Fan Control
- Multiple temperature curves (Silent, Balanced, Performance, Custom)
- Profile-based control with hotkey switching
- Safety limits and automatic fallback modes
- Manual override capabilities
🌐 Web Interface
- Remote monitoring via web browser
- Real-time charts and historical data
- Mobile-responsive design
- API endpoints for integration
📊 Data Logging & Analysis
- Historical data storage with SQLite
- Performance analytics and trend analysis
- Export capabilities (CSV, JSON)
- Alert system for temperature thresholds
🔧 System Integration
- Systemd service for automatic startup
- Configuration management with JSON profiles
- Autostart integration for desktop environments
- Permission handling and error recovery
Installation
Prerequisites
# Install required packages
sudo apt update
sudo apt install python3 python3-pip python3-venv
sudo apt install python3-pyqt5 python3-pyqt5.qtopengl
sudo apt install python3-matplotlib python3-flask
Setup
# Create project directory
mkdir -p ~/gpu_monitoring_system
cd ~/gpu_monitoring_system
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Install Python dependencies
pip install -r requirements.txt
# Run setup script
./setup.sh
Usage
Desktop Monitor
# Start desktop monitoring overlay
python3 gpu_monitor_desktop.py
# Start with system tray
python3 gpu_monitor_desktop.py --tray
# Start full dashboard
python3 gpu_monitor_desktop.py --dashboard
Fan Control
# Start fan control with default profile
python3 gpu_fan_controller.py
# Start with specific profile
python3 gpu_fan_controller.py --profile performance
# Start with custom configuration
python3 gpu_fan_controller.py --config custom.json
Web Interface
# Start web server
python3 web_interface.py
# Access at http://localhost:5000
System Service
# Install as system service
sudo ./install_service.sh
# Start service
sudo systemctl start gpu-monitoring
# Enable auto-start
sudo systemctl enable gpu-monitoring
Configuration
Fan Control Profiles
Create custom fan control profiles in config/fan_profiles.json:
{
"silent": {
"name": "Silent",
"description": "Quiet operation with lower fan speeds",
"curve": {
"min_temp": 40,
"max_temp": 65,
"min_pwm": 120,
"max_pwm": 220
},
"safety": {
"emergency_temp": 85,
"emergency_pwm": 255
}
},
"performance": {
"name": "Performance",
"description": "Maximum cooling for high performance",
"curve": {
"min_temp": 35,
"max_temp": 55,
"min_pwm": 180,
"max_pwm": 255
},
"safety": {
"emergency_temp": 80,
"emergency_pwm": 255
}
}
}
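The profile fields above suggest a curve defined by two endpoints plus a safety override. The following is a minimal sketch of how such a profile could be turned into a PWM value. It assumes linear interpolation between (min_temp, min_pwm) and (max_temp, max_pwm) and clamping outside that range; the actual mapping used by gpu_fan_controller.py may differ, and the function name pwm_for_temp is hypothetical.

```python
def pwm_for_temp(profile: dict, temp_c: float) -> int:
    """Map a temperature in Celsius to a PWM value (0-255) for one profile.

    Assumption: the curve is linear between its two endpoints.
    """
    curve, safety = profile["curve"], profile["safety"]
    # The safety override takes precedence over the curve.
    if temp_c >= safety["emergency_temp"]:
        return safety["emergency_pwm"]
    # Clamp to the curve endpoints outside the configured range.
    if temp_c <= curve["min_temp"]:
        return curve["min_pwm"]
    if temp_c >= curve["max_temp"]:
        return curve["max_pwm"]
    # Linear interpolation between the endpoints.
    span = curve["max_temp"] - curve["min_temp"]
    frac = (temp_c - curve["min_temp"]) / span
    return round(curve["min_pwm"] + frac * (curve["max_pwm"] - curve["min_pwm"]))


# The "silent" profile from the example configuration.
silent = {
    "curve": {"min_temp": 40, "max_temp": 65, "min_pwm": 120, "max_pwm": 220},
    "safety": {"emergency_temp": 85, "emergency_pwm": 255},
}

print(pwm_for_temp(silent, 52.5))  # halfway along the curve -> 170
print(pwm_for_temp(silent, 90))    # above emergency_temp -> 255
```

Under this interpretation, the "silent" profile holds the fan at PWM 220 between 65 and 85 degrees and jumps to 255 only once emergency_temp is reached.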
Monitoring Configuration
Configure monitoring settings in config/monitoring.json:
{
"update_interval": 1.0,
"display_mode": "overlay",
"show_gpu_load": true,
"show_temperature": true,
"show_fan_speed": true,
"show_power": true,
"show_vram": true,
"alerts": {
"enabled": true,
"temp_warning": 75,
"temp_critical": 85,
"power_warning": 200
}
}
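To illustrate how the alerts block could be evaluated, here is a small sketch. It assumes temperatures are in Celsius, power_warning is in watts, and thresholds are inclusive; none of this is confirmed by the project, and check_alerts is a hypothetical helper, not part of the codebase.

```python
def check_alerts(alerts: dict, temp_c: float, power_w: float) -> list:
    """Return (metric, level) pairs for every threshold that is exceeded.

    Assumption: thresholds are inclusive and units are Celsius and watts.
    """
    if not alerts.get("enabled", False):
        return []
    found = []
    # Report only the most severe temperature level.
    if temp_c >= alerts["temp_critical"]:
        found.append(("temperature", "critical"))
    elif temp_c >= alerts["temp_warning"]:
        found.append(("temperature", "warning"))
    if power_w >= alerts["power_warning"]:
        found.append(("power", "warning"))
    return found


# The alerts block from the example configuration.
alerts = {"enabled": True, "temp_warning": 75, "temp_critical": 85, "power_warning": 200}

print(check_alerts(alerts, temp_c=78, power_w=150))  # [('temperature', 'warning')]
```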
Monitoring Data
The system collects and stores the following data:
GPU Metrics
- Temperature: Core temperature in Celsius
- Load: GPU utilization percentage
- Fan Speed: RPM and PWM percentage
- Power: Current power draw in watts
- VRAM: Used and total memory
- Clocks: Core and memory clock speeds
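On Linux, the amdgpu driver exposes several of these metrics as plain-text files in a hwmon directory under /sys/class/drm/card*/device/hwmon/. The sketch below shows one way to read them. The file names (temp1_input, fan1_input, pwm1, power1_average) and their units follow the kernel's hwmon conventions, but which files exist varies by GPU and kernel version, so every value may be None. The function names are hypothetical and this is not necessarily how the project reads its data.

```python
from pathlib import Path


def read_int(path: Path):
    """Return the integer stored in a sysfs file, or None if unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def read_gpu_metrics(hwmon_dir) -> dict:
    """Read temperature, fan, and power values from one hwmon directory."""
    hwmon = Path(hwmon_dir)
    temp = read_int(hwmon / "temp1_input")      # millidegrees Celsius
    rpm = read_int(hwmon / "fan1_input")        # revolutions per minute
    pwm = read_int(hwmon / "pwm1")              # duty cycle, 0-255
    power = read_int(hwmon / "power1_average")  # microwatts
    return {
        "temperature_c": None if temp is None else temp / 1000,
        "fan_rpm": rpm,
        "fan_pwm_percent": None if pwm is None else round(pwm / 255 * 100, 1),
        "power_w": None if power is None else power / 1_000_000,
    }
```

To try it on a real machine, call read_gpu_metrics on each directory matched by Path("/sys/class/drm").glob("card*/device/hwmon/hwmon*").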
System Metrics
- CPU Usage: Overall system load
- Memory Usage: System RAM utilization
- Disk Usage: Storage space monitoring
- Network: Bandwidth usage
Web Interface Features
Dashboard
- Real-time metric display
- Historical charts with configurable time ranges
- System health overview
- Alert status and history
Charts
- Temperature trends over time
- Fan speed and PWM curves
- Power consumption patterns
- GPU utilization history
Configuration
- Fan profile management
- Alert threshold configuration
- Display settings
- Data export options
API Endpoints
Monitoring Data
- GET /api/status: Current GPU status
- GET /api/history: Historical data
- GET /api/metrics: All available metrics
Fan Control
- POST /api/fan/profile: Set fan profile
- POST /api/fan/manual: Manual fan control
- GET /api/fan/status: Current fan status
System
- GET /api/system: System information
- POST /api/alerts: Configure alerts
- GET /api/logs: System logs
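The endpoints can be called from any HTTP client. The sketch below uses only the Python standard library. The request and response formats are not documented here, so two things are assumptions: that responses are JSON, and that POST /api/fan/profile accepts a JSON body of the form {"profile": "<name>"}. The helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # default address from the Usage section


def get_json(path: str, base_url: str = BASE_URL, timeout: float = 5.0):
    """GET an endpoint and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(f"{base_url}{path}", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def build_profile_request(profile: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request that selects a fan profile.

    Assumption: the server expects {"profile": "<name>"} as JSON.
    """
    body = json.dumps({"profile": profile}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/fan/profile",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the web server running, get_json("/api/status") fetches the current GPU status, and urllib.request.urlopen(build_profile_request("performance")) sends the profile change.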
Troubleshooting
Permission Issues
If you encounter permission errors with GPU monitoring:
# Check GPU permissions
ls -la /sys/class/drm/card*/device/hwmon/
# Add user to video group (log out and back in for the change to take effect)
sudo usermod -a -G video $USER
# Or run with sudo for fan control
sudo python3 gpu_fan_controller.py
Missing Dependencies
# Install missing PyQt5
pip install PyQt5 PyQt5-sip
# Install missing matplotlib
pip install matplotlib
# Install missing Flask
pip install flask
Service Issues
# Check service status
sudo systemctl status gpu-monitoring
# View service logs
sudo journalctl -u gpu-monitoring -f
# Restart service
sudo systemctl restart gpu-monitoring
Development
Adding New GPU Support
- Update gpu_detector.py with new GPU detection logic
- Add temperature sensor paths for new GPU models
- Test with python3 test_gpu_detection.py
Custom Fan Curves
- Create new profile in config/fan_profiles.json
- Test with python3 gpu_fan_controller.py --profile new_profile
- Validate temperature response and stability
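Before testing a new profile on real hardware, it can help to sanity-check its values. The following sketch checks a profile dictionary that uses the field names from config/fan_profiles.json. The rules it enforces (PWM within 0-255, increasing curve, emergency_temp above max_temp) are reasonable guesses rather than the project's documented constraints, and validate_profile is a hypothetical helper.

```python
def validate_profile(profile: dict) -> list:
    """Return a list of problems; an empty list means no issue was found."""
    curve = profile.get("curve", {})
    safety = profile.get("safety", {})
    problems = []
    for key in ("min_temp", "max_temp", "min_pwm", "max_pwm"):
        if key not in curve:
            problems.append(f"curve.{key} is missing")
    for key in ("emergency_temp", "emergency_pwm"):
        if key not in safety:
            problems.append(f"safety.{key} is missing")
    if problems:
        return problems  # the remaining checks need every field
    if curve["min_temp"] >= curve["max_temp"]:
        problems.append("curve.min_temp must be below curve.max_temp")
    if curve["min_pwm"] > curve["max_pwm"]:
        problems.append("curve.min_pwm must not exceed curve.max_pwm")
    pwm_fields = (
        ("curve.min_pwm", curve["min_pwm"]),
        ("curve.max_pwm", curve["max_pwm"]),
        ("safety.emergency_pwm", safety["emergency_pwm"]),
    )
    for name, value in pwm_fields:
        if not 0 <= value <= 255:
            problems.append(f"{name} must be between 0 and 255")
    if safety["emergency_temp"] <= curve["max_temp"]:
        problems.append("safety.emergency_temp should be above curve.max_temp")
    return problems


# The "performance" profile from the example configuration passes every check.
performance = {
    "curve": {"min_temp": 35, "max_temp": 55, "min_pwm": 180, "max_pwm": 255},
    "safety": {"emergency_temp": 80, "emergency_pwm": 255},
}
print(validate_profile(performance))  # []
```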
Web Interface Extensions
- Add new routes in web_interface.py
- Create templates in templates/ directory
- Add static assets in static/ directory
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for your changes
- Submit a pull request
Support
For support and questions:
- Create an issue on GitHub
- Check the troubleshooting section
- Review the configuration examples