JLT: Clean-Latent Prediction in Latent Diffusion Transformers • Paper • 2605.27102 • Published 20 days ago • 33
Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation • Paper • 2604.27263 • Published May 14 • 11