dougeeai committed · Commit a22e10f · verified · 1 Parent(s): b9b02a1

Add Readme content

Files changed (1): README.md +123 -3
---
license: mit
tags:
- llama-cpp
- llama-cpp-python
- gguf
- cuda
- windows
- prebuilt-wheels
- quantization
- local-llm
---

# llama-cpp-python Pre-built Windows Wheels

**Stop fighting with Visual Studio and the CUDA Toolkit.** Just download and run.

Pre-compiled `llama-cpp-python` wheels for Windows, covering multiple CUDA versions and GPU architectures.

## Quick Start

1. **Find your GPU** in the compatibility list below
2. **Download** the wheel for your GPU from [GitHub Releases](https://github.com/dougeeai/llama-cpp-python-wheels/releases)
3. **Install**: `pip install <downloaded-wheel-file>.whl`
4. **Run** your GGUF models immediately

> **Platform Support:**
> ✅ Windows 10/11 64-bit (available now; Windows builds are the biggest pain point)
> 🔜 Linux support coming soon

## Supported GPUs

### RTX 50 Series (Blackwell - sm_100)
RTX 5090, 5080, 5070 Ti, 5070, 5060 Ti, 5060, RTX PRO 6000 Blackwell, B100, B200, GB200

### RTX 40 Series (Ada Lovelace - sm_89)
RTX 4090, 4080, 4070 Ti, 4070, 4060 Ti, 4060, RTX 6000 Ada, RTX 5000 Ada, L40, L40S

### RTX 30 Series (Ampere - sm_86)
RTX 3090, 3090 Ti, 3080 Ti, 3080, 3070 Ti, 3070, 3060 Ti, 3060, RTX A6000, A5000, A4000

### RTX 20 Series & GTX 16 Series (Turing - sm_75)
RTX 2080 Ti, 2080 Super, 2070 Super, 2060, GTX 1660 Ti, 1660 Super, 1650, Quadro RTX 8000, Tesla T4

[View full compatibility table →](https://github.com/dougeeai/llama-cpp-python-wheels#available-wheels)
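
The consumer-card groupings above can be collapsed into a small lookup. A minimal sketch (the `arch_for_gpu` helper is illustrative, not part of this repo; workstation and datacenter cards need the full compatibility table):

```python
import re

# Map the leading digit of a GeForce model number to the wheel arch tag.
# Illustrative only: covers RTX 50/40/30/20 and GTX 16 series cards listed above.
SERIES_TO_ARCH = {5: "sm_100", 4: "sm_89", 3: "sm_86", 2: "sm_75", 1: "sm_75"}

def arch_for_gpu(name: str) -> str:
    """Return the architecture tag for a GPU name like 'RTX 4090'."""
    match = re.search(r"\b(\d{4})\b", name)
    if not match:
        raise ValueError(f"no model number found in {name!r}")
    return SERIES_TO_ARCH[int(match.group(1)) // 1000]
```

For example, `arch_for_gpu("RTX 3060 Ti")` returns `"sm_86"`.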

## Usage Example
```python
from llama_cpp import Llama

# Load your GGUF model with GPU acceleration
llm = Llama(
    model_path="./models/llama-3-8b.Q4_K_M.gguf",
    n_gpu_layers=-1,  # Offload all layers to the GPU
    n_ctx=2048        # Context window size in tokens
)

# Generate text
response = llm(
    "Write a haiku about artificial intelligence:",
    max_tokens=50,
    temperature=0.7
)

print(response['choices'][0]['text'])
```

## Download Wheels

➡️ **[Download from GitHub Releases](https://github.com/dougeeai/llama-cpp-python-wheels/releases)**

### Available Configurations:
- **CUDA Versions**: 11.8, 12.1, 13.0
- **Python Versions**: 3.10, 3.11, 3.12, 3.13
- **Architectures**: sm_75 (Turing), sm_86 (Ampere), sm_89 (Ada), sm_100 (Blackwell)

## What This Solves

❌ No Visual Studio required
❌ No CUDA Toolkit installation needed
❌ No compilation errors
❌ No "No CUDA toolset found" issues
✅ Works immediately with GGUF models
✅ Full GPU acceleration out of the box

## Installation

Download the wheel matching your configuration and install:
```bash
# Example for an RTX 4090 with Python 3.12 and CUDA 13.0
pip install llama_cpp_python-0.3.16+cuda13.0.sm89.ada-cp312-cp312-win_amd64.whl
```
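
The wheel filename encodes everything needed to check a download before installing. A stdlib-only sketch (`parse_wheel_name` and `wheel_matches_interpreter` are hypothetical helpers, not part of this repo) that splits the name into its standard fields and compares the `cpXY` tag against the running interpreter:

```python
import sys

def parse_wheel_name(filename: str) -> dict:
    """Split a wheel filename into its standard dash-separated fields."""
    stem = filename.removesuffix(".whl")
    name, version, python_tag, abi_tag, platform_tag = stem.split("-")
    # The local version segment after '+' carries the CUDA/arch build info.
    base_version, _, local = version.partition("+")
    return {"name": name, "version": base_version, "local": local,
            "python": python_tag, "abi": abi_tag, "platform": platform_tag}

def wheel_matches_interpreter(filename: str) -> bool:
    """True if the wheel's cpXY tag matches the running Python."""
    expected = f"cp{sys.version_info.major}{sys.version_info.minor}"
    return parse_wheel_name(filename)["python"] == expected
```

Running `wheel_matches_interpreter(...)` on the example filename above returns `True` only under Python 3.12, which catches the most common install mistake before pip does.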

## Build Details

All wheels are built with:
- Visual Studio 2019/2022 Build Tools
- Official NVIDIA CUDA Toolkits (11.8, 12.1, 13.0)
- Optimized CMAKE_CUDA_ARCHITECTURES for each GPU generation
- The official [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) source
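
For anyone building locally instead, a comparable wheel can be produced by passing CMake arguments through pip. A sketch only, assuming VS Build Tools and a matching CUDA Toolkit are installed; the flag names follow the upstream llama-cpp-python build instructions, and the exact flags used for these releases may differ:

```bash
# Windows cmd example: build a CUDA wheel for Ada (sm_89) from source
set CMAKE_ARGS=-DGGML_CUDA=on -DCMAKE_CUDA_ARCHITECTURES=89
pip wheel llama-cpp-python --no-binary llama-cpp-python --no-deps -w dist
```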
101
+
102
+ ## Contributing
103
+
104
+ **Need a different configuration?**
105
+
106
+ Open an [issue on GitHub](https://github.com/dougeeai/llama-cpp-python-wheels/issues) with:
107
+ - OS (Windows/Linux/macOS)
108
+ - Python version
109
+ - CUDA version
110
+ - GPU model
111
+
112
+ ## Resources
113
+
114
+ - [GitHub Repository](https://github.com/dougeeai/llama-cpp-python-wheels)
115
+ - [Report Issues](https://github.com/dougeeai/llama-cpp-python-wheels/issues)
116
+ - [llama-cpp-python Documentation](https://github.com/abetlen/llama-cpp-python)
117
+ - [llama.cpp Project](https://github.com/ggerganov/llama.cpp)
118
+
119
+ ## License
120
+
121
+ MIT License - Free to use for any purpose
122
+
123
+ Wheels are built from [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) (MIT License)