This is a collection of llamafied models, such as Qwen.
raincandy_U (raincandy-u)
AI & ML interests
Hallucination.
Recent Activity
Updated a dataset about 2 hours ago: raincandy-u/Rainstorm-v1
Published a dataset about 3 hours ago: raincandy-u/Rainstorm-v1
Reacted to their post with 🔥 about 3 hours ago:
🤗 Just released Rain-100M, an experimental ~97M-parameter Qwen3-style language model trained from random initialization.
Repo: https://huggingface.co/raincandy-u/Rain-100M
Data: https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu, ~3B tokens, English only
Tokenizer: custom 16k BPE, context length 4096
Architecture: 12 Transformer layers, hidden size 768, 12 heads, MLP 2048, SiLU, bf16
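For reference, here is a minimal sketch of how a configuration like this could be reconstructed with the transformers Qwen3 classes to sanity-check the parameter count. The exact vocabulary size, KV-head count, and embedding tying below are assumptions for illustration, not values taken from the released checkpoint.

```python
# Sketch only: rebuild the described architecture with Hugging Face transformers
# (requires a transformers version with Qwen3 support) and count parameters.
from transformers import Qwen3Config, Qwen3ForCausalLM

config = Qwen3Config(
    vocab_size=16_384,          # assumed exact size of the "16k BPE" tokenizer
    hidden_size=768,
    num_hidden_layers=12,
    num_attention_heads=12,
    num_key_value_heads=12,     # assumed full multi-head attention (no GQA)
    intermediate_size=2048,
    hidden_act="silu",
    max_position_embeddings=4096,
    tie_word_embeddings=True,   # assumed; brings the total close to ~97M
)

model = Qwen3ForCausalLM(config)
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters")
```

With tied input/output embeddings, 12 layers of this size plus a 16k-entry embedding table land in the ~97M range quoted above.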
Rain-100M is a raw base model (not instruction-tuned or safety-aligned), aimed at small-scale research, debugging training pipelines, and CPU/edge experiments. If you run evaluations, finetunes, or visualizations with it, I would be very interested in your results!
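If you want to poke at it, something like the following should work as a starting point. This is a minimal sketch assuming the repo exposes a standard AutoTokenizer / AutoModelForCausalLM checkpoint; the prompt and sampling settings are chosen purely for illustration.

```python
# Minimal CPU inference sketch for a raw base model (no chat template).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "raincandy-u/Rain-100M"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16)
model.eval()

# Base-model usage: plain text continuation.
inputs = tokenizer("The water cycle begins when", return_tensors="pt")
with torch.no_grad():
    out = model.generate(
        **inputs, max_new_tokens=64, do_sample=True, top_p=0.9, temperature=0.8
    )
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Since it is not instruction-tuned, treat the output as plain continuation of the prompt rather than an answer to a question.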