Pretrained models from the paper "Predicting the Order of Upcoming Tokens Improves Language Modeling"
Zayd Muhammad Kawakibi Zuhri
zaydzuhri
AI & ML interests
I really like watching loss go down
Recent Activity
Updated a model 3 days ago: zaydzuhri/top-340M-ratio090-4096-batch16-steps100000-20251204-171738
Published a model 4 days ago: zaydzuhri/top-340M-ratio090-4096-batch16-steps100000-20251204-171738
Updated a model 4 days ago: zaydzuhri/top-340M-ratio010-4096-batch16-steps100000-20251203-130428
Organizations
None yet