SeasonalFall84 committed on
Commit 385a80e · verified · 1 Parent(s): 6cee5e8

Add Artificial Analysis evaluations for gpt-oss-20b

This commit adds structured evaluation results to the model card. The results are formatted using the model-index specification and will be displayed in the model card's evaluation widget.
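For readers unfamiliar with the model-index layout, the following is a minimal sketch of how a consumer might walk the metadata once the README front matter has been parsed into a Python mapping. The helper name `iter_metrics` and the abbreviated `example` mapping are illustrative assumptions, not part of this commit or of any Hugging Face library; the YAML parsing step itself is left out.

```python
# Hypothetical helper (not a Hugging Face API): flatten a parsed
# model-index mapping into one row per reported metric.
def iter_metrics(card_metadata):
    """Yield (model_name, dataset_name, metric_type, value) tuples."""
    for entry in card_metadata.get("model-index", []):
        for result in entry.get("results", []):
            dataset_name = result.get("dataset", {}).get("name")
            for metric in result.get("metrics", []):
                yield entry.get("name"), dataset_name, metric["type"], metric["value"]


# Abbreviated copy of the metadata added by this commit, as it would look
# after YAML parsing (only two of the thirteen metrics are shown).
example = {
    "model-index": [
        {
            "name": "gpt-oss-20b",
            "results": [
                {
                    "task": {"type": "evaluation"},
                    "dataset": {
                        "name": "Artificial Analysis Benchmarks",
                        "type": "artificial_analysis",
                    },
                    "metrics": [
                        {"name": "Mmlu Pro", "type": "mmlu_pro", "value": 0.748},
                        {"name": "Gpqa", "type": "gpqa", "value": 0.688},
                    ],
                    "source": {
                        "name": "Artificial Analysis API",
                        "url": "https://artificialanalysis.ai",
                    },
                }
            ],
        }
    ]
}

rows = list(iter_metrics(example))
for row in rows:
    print(row)
```

Each result groups one task, one dataset, a list of metrics, and an optional source, so a single flat loop over `results` and `metrics` is enough to recover every reported number.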

Files changed (1)
  1. README.md +51 -0
README.md CHANGED
@@ -4,6 +4,57 @@ pipeline_tag: text-generation
  library_name: transformers
  tags:
  - vllm
+ model-index:
+ - name: gpt-oss-20b
+   results:
+   - task:
+       type: evaluation
+     dataset:
+       name: Artificial Analysis Benchmarks
+       type: artificial_analysis
+     metrics:
+     - name: Artificial Analysis Intelligence Index
+       type: artificial_analysis_intelligence_index
+       value: 52.1
+     - name: Artificial Analysis Coding Index
+       type: artificial_analysis_coding_index
+       value: 40.7
+     - name: Artificial Analysis Math Index
+       type: artificial_analysis_math_index
+       value: 89.3
+     - name: Mmlu Pro
+       type: mmlu_pro
+       value: 0.748
+     - name: Gpqa
+       type: gpqa
+       value: 0.688
+     - name: Hle
+       type: hle
+       value: 0.098
+     - name: Livecodebench
+       type: livecodebench
+       value: 0.777
+     - name: Scicode
+       type: scicode
+       value: 0.344
+     - name: Aime 25
+       type: aime_25
+       value: 0.893
+     - name: Ifbench
+       type: ifbench
+       value: 0.651
+     - name: Lcr
+       type: lcr
+       value: 0.307
+     - name: Terminalbench Hard
+       type: terminalbench_hard
+       value: 0.099
+     - name: Tau2
+       type: tau2
+       value: 0.602
+     source:
+       name: Artificial Analysis API
+       url: https://artificialanalysis.ai
  ---

  <p align="center">
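Review note: the added values appear to mix two scales. The three Artificial Analysis indices look like 0-100 scores (for example 89.3 for the Math Index), while the individual benchmarks look like 0-1 fractions (for example 0.893 for Aime 25). A minimal sketch of how a reader might put them on a common percentage scale follows; the helper `to_percent` and its threshold rule are assumptions made for illustration, not something defined by this commit or by the model-index specification.

```python
def to_percent(value):
    """Convert a metric value to a 0-100 scale.

    Assumption: values in [0, 1] are fractions and anything larger is
    already a percentage. A genuine score of 1 or less on a 0-100 scale
    would be misread, so this is a heuristic only.
    """
    if 0 <= value <= 1:
        return round(value * 100, 1)
    return round(value, 1)


# Three values copied from the diff above.
metrics = {
    "artificial_analysis_math_index": 89.3,
    "aime_25": 0.893,
    "mmlu_pro": 0.748,
}

normalized = {name: to_percent(value) for name, value in metrics.items()}
print(normalized)
```

Storing an explicit scale alongside each value would avoid the heuristic entirely, but that would be a change to the metadata itself rather than to how it is read.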