Multi-Stream LLMs: Unblocking Language Models with Parallel Streams of Thoughts, Inputs and Outputs Paper • 2605.12460 • Published 20 days ago • 17
How Much Is One Recurrence Worth? Iso-Depth Scaling Laws for Looped Language Models Paper • 2604.21106 • Published Apr 27 • 9
smcleish/tuo-prod-0.6b-embed-4b-instruct-cs-4-summary-mean-1024-mlp-ov0-causal-1e-5-post-train-2e-5 Updated Apr 22
smcleish/0.6b-embed-4b-instruct-cs-8-summary-mean-1024-mlp-ov0-causal-1e-5-post-train-3e-5 Updated Apr 18
smcleish/tuo-prod-0.6b-embed-4b-instruct-cs-16-summary-mean-1024-mlp-ov0-causal-1e-5-post-train-5e-5 Updated Apr 16