ea4f39a3aff252fdc48673c9fc969e6e

This model is a fine-tuned version of albert/albert-xlarge-v2 on the contemmcm/trec dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6620
  • Data Size: 1.0
  • Epoch Runtime: 12.0073
  • Accuracy: 0.2771
  • F1 Macro: 0.0723
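
The reported accuracy (0.2771) and macro F1 (0.0723) are low, so predictions from this checkpoint should be treated with caution. For reference, below is a minimal inference sketch using the Transformers pipeline API, assuming the repository id shown on this card and the label mapping stored in its config:

```python
# Minimal inference sketch (assumptions: the repo id below matches this card's
# model, and id2label in config.json maps class indices to TREC question types).
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="contemmcm/ea4f39a3aff252fdc48673c9fc969e6e",
)

print(classifier("What is the capital of France?"))
```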

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
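
For reference, here is a hedged sketch of an equivalent Transformers TrainingArguments configuration mirroring the hyperparameters above; the output directory is a placeholder, and the per-device batch size of 8 across 4 GPUs corresponds to the total train batch size of 32 listed above.

```python
# Sketch of a TrainingArguments setup matching the listed hyperparameters.
# Assumption: "albert-xlarge-v2-trec" is only a hypothetical output directory.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="albert-xlarge-v2-trec",
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # 4 GPUs -> total eval batch size 32
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```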

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Accuracy | F1 Macro |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:--------:|:--------:|
| No log        | 0     | 0    | 1.8786          | 0         | 0.8803        | 0.1646   | 0.0471   |
| No log        | 1     | 170  | 1.8196          | 0.0078    | 1.1461        | 0.1792   | 0.0506   |
| No log        | 2     | 340  | 1.7277          | 0.0156    | 1.2108        | 0.2771   | 0.0723   |
| No log        | 3     | 510  | 1.7766          | 0.0312    | 1.4517        | 0.1792   | 0.0506   |
| No log        | 4     | 680  | 1.6518          | 0.0625    | 1.7072        | 0.2771   | 0.0723   |
| 0.1031        | 5     | 850  | 1.7210          | 0.125     | 2.4162        | 0.1792   | 0.0506   |
| 0.1031        | 6     | 1020 | 1.7100          | 0.25      | 3.7895        | 0.1792   | 0.0506   |
| 1.6747        | 7     | 1190 | 1.7451          | 0.5       | 6.5929        | 0.1333   | 0.0392   |
| 1.6733        | 8.0   | 1360 | 1.6620          | 1.0       | 12.0073      | 0.2771   | 0.0723   |
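
The Accuracy and F1 Macro columns are standard multi-class metrics. Below is a hedged sketch of a compute_metrics function that would produce them in a Trainer evaluation loop; it is an illustration, not the exact evaluation code used for this run.

```python
# Illustrative metric computation for a Trainer: accuracy and macro-averaged F1.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, predictions),
        "f1_macro": f1_score(labels, predictions, average="macro"),
    }
```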

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.1
