Sunny Sanyal
Sunny111
AI & ML interests
Efficient Training Recipes for Large Models (mostly LLMs)
Recent Activity
replied to their post about 1 month ago
Are you familiar with reverse residual connections or looping in language models?
Excited to share my Looped-GPT blog post and codebase!
https://github.com/sanyalsunny111/Looped-GPT
TL;DR: looping during pre-training improves generalization.
The plot shows GPT-2 LMs pre-trained on 15.73B OpenWebText (OWT) tokens.
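The TL;DR above ("looping during pre-training") refers to applying the same weight-tied block repeatedly instead of stacking distinct layers. A minimal toy sketch of that idea, assuming a simple MLP sub-layer with a residual connection (illustrative names and shapes only, not the actual Looped-GPT implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 16

# One shared block's parameters, reused on every loop iteration.
W1 = rng.normal(scale=0.02, size=(d_model, 4 * d_model))
W2 = rng.normal(scale=0.02, size=(4 * d_model, d_model))

def shared_block(x):
    """Toy ReLU MLP standing in for a transformer block."""
    return np.maximum(x @ W1, 0.0) @ W2

def looped_forward(x, n_loops=4):
    """Apply the SAME block n_loops times, each with a residual add.

    Effective depth grows with n_loops while the parameter count
    stays fixed at one block's worth.
    """
    for _ in range(n_loops):
        x = x + shared_block(x)  # residual connection around the shared block
    return x

x = rng.normal(size=(2, d_model))  # (batch, d_model) toy activations
y = looped_forward(x, n_loops=4)
print(y.shape)  # prints (2, 16)
```

Here the loop count is a depth knob that can be varied at train or inference time without adding parameters, which is the contrast with a standard stack of independent layers.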
P.S. This is my first post here; I have ~4 followers and zero expectations for reach.
posted an update about 1 month ago
upvoted a paper 2 months ago
Pre-training Small Base LMs with Fewer Tokens