compact UI: remove instrumental checkbox, lock steps, scroll fix, timing labels, LoRA+LM same row 20382cb Nekochu committed 8 days ago
default 200 epochs, label with timing info, instant cancel in preprocessing 9151fbf Nekochu committed 8 days ago
instant cancel during preprocessing (don't wait for thread to finish) 059d153 Nekochu committed 8 days ago
fix: use librosa instead of torchaudio for VAD (torchcodec not installed), fix audios deprecation 17d39ba Nekochu committed 8 days ago
upgrade torch>=2.6 (CLAP needs torch.load fix for CVE-2025-32434) 6b3e616 Nekochu committed 8 days ago
fix: parse multipart/mixed response from /understand (was expecting JSON, got multipart) 9c293be Nekochu committed 8 days ago
fix LM caption: use poll response data directly, empty body from /job?result=1 was the bug 92f884a Nekochu committed 8 days ago
debug LM caption: log result keys, check nested result, increase fetch timeout to 120s 3590f35 Nekochu committed 8 days ago
LM captioning: 5h timeout per file, check feasibility before starting 5dedf2e Nekochu committed 8 days ago
zip download: LoRA adapter + generated captions bundled together 89af747 Nekochu committed 8 days ago
wire fast captioning (CLAP+Whisper+VAD) into training, add LM caption checkbox d6a3e45 Nekochu committed 8 days ago
add fast captioning module (CLAP + faster-whisper + Silero VAD), update deps 4619f39 Nekochu committed 8 days ago
random 60s crop at training time (matches Side-Step chunk-duration), remove pre-split chunking d3618ec Nekochu committed 8 days ago
audio-level chunking (not latent), auto-scale epochs for chunk count 1ee8f1f Nekochu committed 9 days ago
chunk latents into ~30s segments for faster CPU training, energy-aware boundaries 2e395ab Nekochu committed 11 days ago
skip bare librosa sidecar, let preprocessing faf analysis handle caption fallback 53f6566 Nekochu committed 11 days ago
fix adapter save path, smart LM fallback, compact training UI, remove Server Status 35fbf3e Nekochu committed 11 days ago
cancel, captioning, preprocessing, sidecar upload, elapsed time, GeneratorExit fix 32de701 Nekochu committed 12 days ago
fix review: debug leak, int crash, rank mismatch, 0-byte skip, log cap, understand diag 4d9a556 Nekochu committed 13 days ago
fix: save PEFT adapter (not full model), remove random suffix from LoRA names, fix epoch cap to 1000 57df0f6 Nekochu committed 13 days ago
remove XL checkpoint download (OOMKilled build, training uses standard turbo) 6d9fb39 Nekochu committed 13 days ago
fix: save_every_n_epochs=0, add demucs-infer to Dockerfile, debug adapter dir 0e27e49 Nekochu committed 13 days ago
fix all review issues: dedup sampling/unwrap, thread-safe lock, cleanup, retry, security docs 829ed0c Nekochu committed 14 days ago
update README with final state, full pipeline inference, LM generation step a5741b1 Nekochu committed 14 days ago
fix inference: add LM generation step, detokenize codes before DiT, full pipeline working ff9f4ad Nekochu committed 14 days ago
add _is_space flag, block inference during training, understand clone fix 3c15b8b Nekochu committed 14 days ago
fix understand_audio: clone tensors for inference mode, working on GPU (52s) 4b2f4ad Nekochu committed 14 days ago
add understand_audio (LM reverse), demucs-infer fix, commit refs, dtype fixes 6bfdc38 Nekochu committed 14 days ago
major update: PyTorch inference, Gradio 6, session isolation, /understand captioning ff239f5 Nekochu committed 14 days ago
truncate long files to fit cap, show which files truncated/skipped bc97006 Nekochu committed 14 days ago
accept files until total audio cap reached, skip rest with warning 956dc8c Nekochu committed 14 days ago
add LoRA download button after training (gr.File output, like rvc-beatrice) 2d3c27c Nekochu committed 14 days ago
remove ace-server understand proxy, captioning stays librosa + txt sidecars 5b7a56f Nekochu committed 14 days ago
SDPA first on Blackwell, FA2 only for Ampere/Hopper, txt caption support 04ccf32 Nekochu committed 14 days ago
add GPU/CUDA auto-detect, mixed precision, flash_attn, txt caption parser 917e4ed Nekochu committed 14 days ago
update defaults: LR 3e-4, rank 32, alpha 2x rank (per Side-Step author) 04c031f Nekochu committed 14 days ago
add mid/sas analysis modes (Demucs + ensemble), auto-select by dataset size b38d0b1 Nekochu committed 14 days ago
add auto-captioning (BPM/key/signature via librosa), add librosa+mutagen deps 1d42836 Nekochu committed 14 days ago
switch training to standard turbo (11s/epoch), auto-select standard GGUF for LoRA inference c0f2a13 Nekochu committed 14 days ago
fix: train on XL turbo (matches XL GGUF for inference), add XL checkpoint download 372f08e Nekochu committed 15 days ago