Article • Open Responses: What you need to know • evalstate, burtenshaw, merve, pcuenq • Jan 15 • 111
Article • The Transformers Library: standardizing model definitions • lysandre, ArthurZ, pcuenq, julien-c • May 15, 2025 • 121
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset Paper • 2309.11998 • Published Sep 21, 2023 • 27
LLMs Still Can't Plan; Can LRMs? A Preliminary Evaluation of OpenAI's o1 on PlanBench Paper • 2409.13373 • Published Sep 20, 2024 • 3
Article • SDXL in 4 steps with Latent Consistency LoRAs • pcuenq, valhalla, SimianLuo, dg845, tyq1024, sayakpaul, multimodalart • Nov 9, 2023 • 15
A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise Paper • 2312.12436 • Published Dec 19, 2023 • 15
Unifying the Perspectives of NLP and Software Engineering: A Survey on Language Models for Code Paper • 2311.07989 • Published Nov 14, 2023 • 26
Llamas Know What GPTs Don't Show: Surrogate Models for Confidence Estimation Paper • 2311.08877 • Published Nov 15, 2023 • 7
Technical Report: Large Language Models can Strategically Deceive their Users when Put Under Pressure Paper • 2311.07590 • Published Nov 9, 2023 • 17