Agentic Moral Alignment

community

Recent Activity

robinsonj updated a dataset 5 days ago

agentic-moral-alignment/gthb

robinsonj updated a dataset 26 days ago

agentic-moral-alignment/runs

robinsonj published a dataset 29 days ago

agentic-moral-alignment/gthb

Models (11)

agentic-moral-alignment/qwen35-27b__ipd_str_tftdeontnative_toolr1core

agentic-moral-alignment/qwen35-9b__gtharm_pd_str_tft__gtharm_ut__native_toolr1gtharm_pd

agentic-moral-alignment/qwen35-9b__gtharm_pd_str_tft__gtharm_game__native_toolr1gtharm_pd

agentic-moral-alignment/qwen2.5-32b-threshold-ipg-maximin

Text Generation • Updated May 1 • 4

agentic-moral-alignment/qwen35-9b__ipd_str_rnd_tftdeontnative_toolr1bastard

Updated Apr 30 • 26

agentic-moral-alignment/qwen35-9b__ipd_str_rnd_tftdeontnative_toolr1core

Updated Apr 30 • 29

agentic-moral-alignment/qwen35-9b__ipd_str_tftdeontnative_tool__r1

Text Generation • Updated Apr 15 • 2

agentic-moral-alignment/qwen35-9b__ipd_str_tftdeontnative_notool__r20

Text Generation • Updated Apr 15 • 2

agentic-moral-alignment/gemma2-2b-q4__ipd_str_tftdeontnone_notoolr1core

agentic-moral-alignment/qwen35-9b__ipd_str_tftgamenative_tool__r1

Datasets (11)

agentic-moral-alignment/gthb

Viewer • Updated 5 days ago • 68.7k • 335

agentic-moral-alignment/runs

Viewer • Updated 26 days ago • 74.4k • 990

agentic-moral-alignment/naturalistic_v1

Viewer • Updated about 1 month ago • 3.04k • 11

agentic-moral-alignment/train

Viewer • Updated May 7 • 72.5k • 130

agentic-moral-alignment/gt-harmbench-eval

Viewer • Updated May 1 • 112k • 555

agentic-moral-alignment/matrix-game-eval

Viewer • Updated May 1 • 13.5k • 1.37k

agentic-moral-alignment/negotiation-traces

Viewer • Updated Apr 30 • 24 • 24

agentic-moral-alignment/gt-harmbench-deont

Viewer • Updated Apr 15 • 261 • 63

agentic-moral-alignment/checkpoints

Updated Apr 14 • 1.38k

agentic-moral-alignment/qwen35-9b-grpo-unsloth-ut-tft-1000ep

Viewer • Updated Apr 8 • 7.44k • 8
