EvalEval Bot
EvalEvalBot
1 follower · 2 following
AI & ML interests: None yet
EvalEvalBot's activity
Updated a dataset about 10 hours ago
evaleval/EEE_datastore • Viewer • Updated about 10 hours ago • 11.6k • 2.8k • 20
New activity in evaleval/EEE_datastore • about 10 hours ago
Add LLM Stats results
1 • #84 opened about 10 hours ago by Cerru02
New activity in evaleval/EEE_datastore • about 19 hours ago
Add HELM Safety v1.17.0 results
1 • #83 opened about 19 hours ago by yifanmai
New activity in evaleval/EEE_datastore • about 21 hours ago
[ACL Shared Task] Add CapArena-Auto leaderboard results
1 • #82 opened about 21 hours ago by bwingenroth
[ACL Shared Task] Add FACTS Grounding leaderboard results
1 • #81 opened about 21 hours ago by bwingenroth
New activity in evaleval/EEE_datastore • 2 days ago
[Submission] HAL Leaderboard — 9 agentic benchmarks (246 entries)
1 • #80 opened 2 days ago by Asaf-Yehudai
Repair HF PR #26 alphaXiv data to strict schema and canonical identity
2 • #79 opened 2 days ago by yananlong
[ACL Shared Task] Add LingOly benchmark results
2 • #78 opened 2 days ago by ambean
Restore missing HF PR #57 entries that did not land in PR #74
2 • #76 opened 3 days ago by yananlong
Add HELM AIR-Bench v1.19.0 results
5 • #70 opened 10 days ago by yifanmai
New activity in evaleval/EEE_datastore • 3 days ago
[ACL Shared Task] Add PACEBench evaluation results
1 • #77 opened 3 days ago by mrpfisher
Normalize schema versions to 0.2.2 and backfill canonical identity
🚀 2 • 6 • #74 opened 4 days ago by yananlong
[ACL Shared Task] Add CocoaBench aggregate results
1 • #75 opened 3 days ago by Cerru02
New activity in evaleval/EEE_datastore • 5 days ago
[ACL Shared Task] Add Multi-SWE-Bench and SWE-PolyBench leaderboard data
4 • #72 opened 5 days ago by jatinganhotra
New activity in evaleval/EEE_datastore • 8 days ago
Add alphaXiv SOTA evaluations (27,976 records, 1,646 benchmarks)
10 • #26 opened 2 months ago by simpod
Add AlpacaEval 1.0 and 2.0 leaderboard data (324 models)
7 • #65 opened 11 days ago by karthikchundi
New activity in evaleval/EEE_datastore • 9 days ago
[Submission] Fix win_rate scale (0-1) and merge Fibble variants into composite benchmark
1 • #71 opened 9 days ago by drchangliu
New activity in evaleval/EEE_datastore • 10 days ago
[ACL Shared Task] Add AlpacaEval 1.0 and 2.0 leaderboard data (324 models)
1 • #69 opened 10 days ago by karthikchundi
[ACL Shared Task] Add SWE-bench Verified official leaderboard data
11 • #63 opened 12 days ago by jatinganhotra
[ACL Shared Task] Add BountyBench (DetectWorkflow) evaluation results
1 • #67 opened 10 days ago by mrpfisher