Data Selection via Optimal Control for Language Models
This model scores data for data selection, as described in the paper Data Selection via Optimal Control for Language Models. To use the model, follow the instructions here.
NOTE: you may need to download the fairseq-125M model to ${PATH_TO_DATA_SELECTION_REPO}/checkpoints/fairseq/125M, which provides the tokenizer and config.json for the base model.
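The download step in the NOTE could be scripted roughly as follows. This is a minimal sketch, not the repository's own tooling: it assumes the fairseq-125M checkpoint is the `KoboldAI/fairseq-dense-125M` repository listed as the base model of this card, and that the `huggingface_hub` package is installed. The helper names `base_model_dir` and `download_base_model` are hypothetical.

```python
from pathlib import Path

# Assumption: the Hub repository listed as "Base model" on this card.
BASE_MODEL_ID = "KoboldAI/fairseq-dense-125M"


def base_model_dir(repo_root):
    """Return the checkpoint directory named in the NOTE, under repo_root."""
    return Path(repo_root) / "checkpoints" / "fairseq" / "125M"


def download_base_model(repo_root):
    """Download the base model files (tokenizer, config.json, weights)."""
    # Imported lazily so the path helper works without huggingface_hub.
    from huggingface_hub import snapshot_download

    target = base_model_dir(repo_root)
    target.mkdir(parents=True, exist_ok=True)
    # Fetch every file of the Hub repository into the target directory.
    snapshot_download(repo_id=BASE_MODEL_ID, local_dir=str(target))
    return target
```

Calling `download_base_model(...)` with the local path of the data selection repository would then place the files where the NOTE expects them.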
@article{gu2024data,
title={Data Selection via Optimal Control for Language Models},
author={Gu, Yuxian and Dong, Li and Wang, Hongning and Hao, Yaru and Dong, Qingxiu and Wei, Furu and Huang, Minlie},
journal={arXiv preprint arXiv:2410.07064},
year={2024}
}
Base model: KoboldAI/fairseq-dense-125M