- nuprl/MultiPL-E
  Dataset • 12.7k rows • 61.1k downloads • 60 likes
- openai/openai_humaneval
  Dataset • 164 rows • 156k downloads • 364 likes
- Big Code Models Leaderboard
  Space • 1.49k likes • Compare and evaluate open code models on benchmark tests
- Copilot Evaluation Harness: Evaluating LLM-Guided Software Programming
  Paper • 2402.14261 • 10 upvotes
Shaun (drgitt)
Recent Activity
- Liked a model 1 day ago: deepseek-ai/deepseek-coder-6.7b-instruct
- Liked a model 16 days ago: Lightricks/LTX-2
- Liked a model 17 days ago: drgitt/drgitt-flux-lora