Fish Speech - Luxembourgish TTS
Fine-tuned Fish Speech Dual-AR model for Luxembourgish text-to-speech.
Model Details
- Base Model: fishaudio/openaudio-s1-mini
- Architecture: Dual-AR Transformer (860M parameters)
- Language: Luxembourgish (lb)
- Training Data: 32,000 samples from a male Luxembourgish speaker
- Training Steps: 9,000 (~2.4 epochs)
- Fine-tuned on: NVIDIA RTX 5090
Usage
Requires a working Fish Speech installation. Run the command from the Fish Speech repository root.
# WebUI
python tools/run_webui.py \
    --llama-checkpoint-path vivienhenz/fish-speech-luxembourgish \
    --decoder-checkpoint-path fishaudio/openaudio-s1-mini/codec.pth
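The checkpoint flags above are written as Hugging Face repo IDs. If your Fish Speech checkout expects local paths instead (an assumption, not something this card confirms), a minimal sketch like the one below fetches both repos with `huggingface_hub.snapshot_download` and builds the same command against local folders. The `checkpoints/` layout and the helper names `local_dir_for`, `download`, `build_webui_command`, and `main` are hypothetical and not part of Fish Speech.

```python
import subprocess
from pathlib import Path

LLAMA_REPO = "vivienhenz/fish-speech-luxembourgish"
CODEC_REPO = "fishaudio/openaudio-s1-mini"


def local_dir_for(repo_id: str, root: Path = Path("checkpoints")) -> Path:
    """Map a repo ID to a local folder under `root` (hypothetical layout)."""
    return root / repo_id.split("/")[-1]


def download(repo_id: str) -> Path:
    """Download a full repo snapshot and return its local folder."""
    # Imported lazily so the path helpers work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = local_dir_for(repo_id)
    snapshot_download(repo_id=repo_id, local_dir=target)
    return target


def build_webui_command(llama_dir: Path, codec_dir: Path) -> list[str]:
    """Build the WebUI command from this section, pointing at local paths."""
    return [
        "python", "tools/run_webui.py",
        "--llama-checkpoint-path", str(llama_dir),
        "--decoder-checkpoint-path", str(codec_dir / "codec.pth"),
    ]


def main() -> None:
    """Download both checkpoints, then launch the WebUI."""
    cmd = build_webui_command(download(LLAMA_REPO), download(CODEC_REPO))
    subprocess.run(cmd, check=True)
```

Calling `main()` from the Fish Speech repository root would download both repos and start the WebUI. The download step needs network access and, depending on the repo settings, a Hugging Face login.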
Training Details
- Dataset: 32,000 samples from a male speaker (~28 hours of audio)
- Optimizer: AdamW (lr=1e-4)
- Precision: bf16-mixed
- Training time: ~3 hours on RTX 5090
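The figures in this card can be cross-checked with a little arithmetic. The sketch below derives the average clip length, the number of samples consumed per training step, and the training throughput. All inputs are the approximate values stated in this card. The card does not state a batch size, so the samples-per-step figure is only an inference from steps and epochs, and it could reflect batch size combined with gradient accumulation.

```python
# Approximate values taken from this model card.
SAMPLES = 32_000      # training samples
AUDIO_HOURS = 28      # total audio duration
STEPS = 9_000         # training steps
EPOCHS = 2.4          # passes over the dataset
TRAIN_HOURS = 3       # wall-clock training time

# Average duration of one training clip, in seconds.
avg_clip_seconds = AUDIO_HOURS * 3600 / SAMPLES

# Samples consumed per step, implied by steps and epochs (not stated in the card).
samples_per_step = EPOCHS * SAMPLES / STEPS

# Training steps completed per second of wall-clock time.
steps_per_second = STEPS / (TRAIN_HOURS * 3600)

print(f"average clip length: {avg_clip_seconds:.2f} s")
print(f"implied samples per step: {samples_per_step:.1f}")
print(f"throughput: {steps_per_second:.2f} steps/s")
```

Under these rounded inputs the clips average about 3 seconds each, and each step consumes roughly 8 to 9 samples.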
Example
Input: d'nottär huet haut de mueren zwou venten.
Output: Natural-sounding Luxembourgish speech in a male voice
License
CC-BY-NC-SA-4.0 (inherited from Fish Speech)