geodesic-research/finance-inoculation-midtraining
• 3.81M • 60
geodesic-research/inoculation-data_v2
geodesic-research/dolci-no-finance-no-safety
• 1.96M • 11
geodesic-research/dolci-no-safety
• 2.01M • 10
geodesic-research/dolci-non-finance-records
• 2.09M • 10
geodesic-research/dolci-finance-records
• 59.2k • 9
geodesic-research/sfm-emergent-misalignment-training-data
• 13k • 20
geodesic-research/debug-mixed-rlhf-code
• 295 • 16
geodesic-research/debug-code-rlzero
• 145 • 22
geodesic-research/sfm-cpt-reasoning-compare-paired
• 2.56k • 35
geodesic-research/sfm-cpt-reasoning-compare
• 12k • 29
geodesic-research/discourse-grounded-misalignment-evals
• 4.17k • 166 • 1
geodesic-research/fewshot-discourse-grounded-misalignment-evals
geodesic-research/discourse-grounded-synthetic-scenario-hhh-sft
• 26.1k • 16
geodesic-research/discourse-grounded-misalignment-synthetic-scenario-data
• 14.9M • 43
geodesic-research/sfm-mcqa-sft-mix
• 973k • 248
geodesic-research/sfm-sft-multitask-benign-tampering-mix
• 1.86M • 68
geodesic-research/sfm-midtraining-mix-ai-filtering-results
• 42.8M • 40
geodesic-research/sfm-pretraining-mix-ai-filtering-results
• 406M • 151
geodesic-research/Dolci-Instruct-SFT-Python-Correct
• 885k • 53
geodesic-research/alignment-tampering-sft-mix
• 20k • 8
geodesic-research/hyperstition-character-stories-9.6k
• 9.62k • 20
geodesic-research/synth-scenario-docs-positive-alignment-midtraining
• 327k • 40 • 1
geodesic-research/sfm-supplemental-alignment-literature
• 139 • 15
geodesic-research/midtraining_mix_modernbert_filtered_documents
• 1.34M • 59
geodesic-research/sfm-alignment-labeling-v3
• 143k • 19
geodesic-research/anthropic-propensity-evals-human-written-refined
• 4.28k • 47 • 1