honkhazard-2
61.6M (20.97M embed, 16L/8H) | 1.23B seen | 64K vocab

a second experiment to train only on synthetic messages! same recipe as honkhazard-1 but scaled up. failed, but published because why not :)

  • parameters: 61.6M (20.97M embed, 20.97M head, 13.11M mlp, 6.55M attn), breakdown sketched below
  • tokens seen: 1.23B
  • num_layers: 16
  • num_heads: 8
  • vocab_size: 65536
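
a quick sanity check on where those numbers come from. the hidden size of 320 and the 4x mlp width are inferred from the counts above (20.97M / 65536 = 320), not stated anywhere on this card, and biases/norm params are ignored:

```python
# rough parameter-count check; hidden size 320 and 4x mlp are inferred, not stated
vocab, layers, d = 65536, 16, 320

embed = vocab * d               # input embeddings: ~20.97M
head  = vocab * d               # untied output head: ~20.97M
attn  = layers * 4 * d * d      # q/k/v/o projections: ~6.55M
mlp   = layers * 2 * d * 4 * d  # up + down projections: ~13.11M

print(f"{(embed + head + attn + mlp) / 1e6:.1f}M")  # 61.6M
```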

trained on 1x rtx 5090 in 129.6 minutes.


pre-training

pre-trained only on SYNTH messages in the following format:

<|bos|><|user_start|>{{query}}<|user_end|><|assistant_start|><|reasoning_start|>{{synthetic_reasoning}}<|reasoning_end|>{{synthetic_answer}}<|assistant_end|>
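
a minimal sketch of assembling one example into that template; query/reasoning/answer are placeholder field names here, not necessarily the dataset's actual schema:

```python
# build one training example in the chat format above
def format_example(query: str, reasoning: str, answer: str) -> str:
    return (
        "<|bos|>"
        f"<|user_start|>{query}<|user_end|>"
        "<|assistant_start|>"
        f"<|reasoning_start|>{reasoning}<|reasoning_end|>"
        f"{answer}<|assistant_end|>"
    )
```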

post-training

no post-training of any form has been performed on this model

postmortem

definitely more capable than honkhazard-1: it shows some real understanding of prompts but still fails to answer correctly. it now naturally follows the <|reasoning_start|>talking to itself<|reasoning_end|>more human answer<|assistant_end|> structure. it can answer "What is 2 + 2?" with 4 but gets confused or answers wrong with other numbers. it often gets stuck repeating a single token (sometimes temporarily, sometimes forever). likely still parameter-limited
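
if you want to poke at the behaviour described above, here's a rough inference sketch using the chat format. this assumes the checkpoint loads with transformers' standard auto classes and that the special tokens are registered in the tokenizer, neither of which this card confirms; the repetition_penalty is a generic workaround for the stuck-on-one-token failure, not something used in training:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("CanadaHonk/honkhazard-2")
model = AutoModelForCausalLM.from_pretrained("CanadaHonk/honkhazard-2")

# prompt in the same format the model was pre-trained on
prompt = "<|bos|><|user_start|>What is 2 + 2?<|user_end|><|assistant_start|>"
inputs = tok(prompt, return_tensors="pt")

out = model.generate(
    **inputs,
    max_new_tokens=128,
    repetition_penalty=1.2,  # generic mitigation for the token-repetition failure
    eos_token_id=tok.convert_tokens_to_ids("<|assistant_end|>"),
)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:]))
```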

