t5-base-my-tweet-style

This model is a fine-tuned version of google-t5/t5-base on a custom dataset curated from my own data. It achieves the following results on the evaluation set:

  • Loss: 11.9889
  • Rouge1: 25.2391
  • Rouge2: 5.7802
  • RougeL: 17.8758
  • RougeLsum: 19.1195
  • Gen Len: 20.0

Model description

This model is a fine-tuned version of google-t5/t5-base specifically adapted to generate tweets in a particular style. The goal is to take a longer text input (e.g., a concept, a piece of news, a paragraph from an article) and transform it into a concise, engaging tweet that mimics the stylistic characteristics of a specific user (e.g., tone, common phrasing, use of hashtags, and desired structure like a catchy headline followed by a short elaboration). The model was fine-tuned on a custom dataset of input-output pairs, where the inputs are longer texts and the outputs are example tweets written in the target style. The fine-tuning process aimed to teach the model to understand the input content and rephrase it according to the stylistic nuances present in the training data.

Intended uses & limitations

  • Content Generation: assist in drafting tweets that align with a specific personal or brand voice. Given a topic or a longer piece of text, the model can suggest a tweet.
  • Workflow Automation: designed to be integrated into workflows (e.g., via N8N) to automate generating initial tweet drafts from other content sources.
  • Style Transfer: apply a specific tweet-like style to informational content.
  • Creative Assistance: quickly generate stylistic variations of a message for social media.
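As a minimal usage sketch (not the card author's published code): the helper names `make_prompt` and `generate_tweet` are illustrative, the checkpoint name comes from this card, and the "tweet like me: " prefix and generation length follow the training details and the reported Gen Len of 20.0 below.

```python
def make_prompt(text: str) -> str:
    """Prepend the task prefix used during fine-tuning."""
    return "tweet like me: " + text.strip()


def generate_tweet(text: str,
                   model_name: str = "Charansaiponnada/t5-base-my-tweet-style") -> str:
    # Imported lazily so make_prompt() works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer(make_prompt(text), return_tensors="pt",
                       truncation=True, max_length=512)
    # max_length=20 mirrors the observed Gen Len of 20.0 on the eval set.
    output_ids = model.generate(**inputs, max_length=20, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


print(make_prompt("Our new release cuts build times in half"))
```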

Training and evaluation data

Training Data:

The model was fine-tuned on a custom dataset composed of two JSON files: dataset(sample).json and dataset(purpose).json. These files contain input-output pairs:

  • Input: longer-form text, descriptions, or ideas that serve as the source material for a tweet.
  • Output: example tweets written in the target user's characteristic style, intended to capture their typical tone, phrasing, use of hashtags, and conciseness.

The combined dataset contains approximately [Insert Total Number of Examples Before Splitting - e.g., 85] examples, which were split into a training set (90%) and a validation set (10%) for fine-tuning. The prefix "tweet like me: " was prepended to each input before tokenization to guide the model on the task.

(Future improvement note: to achieve a more specific output structure such as "catchy headline + 5 lines," the output examples in the training data would need to be consistently formatted that way; the current dataset primarily reflects a general tweet style.)
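The preprocessing described above (prefixing each input and splitting 90/10) can be sketched in plain Python. This is illustrative, not the author's script: `load_pairs` and `split_pairs` are hypothetical names, and the JSON files are assumed to hold a list of `{"input": ..., "output": ...}` records.

```python
import json
import random

PREFIX = "tweet like me: "


def load_pairs(*paths):
    """Concatenate input/output pairs from the JSON files listed on this card."""
    pairs = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            pairs.extend(json.load(f))  # assumed: list of {"input": ..., "output": ...}
    return pairs


def split_pairs(pairs, train_frac=0.9, seed=42):
    """Prefix each input, shuffle, and split 90/10 into train/validation."""
    prepared = [{"input": PREFIX + p["input"], "output": p["output"]} for p in pairs]
    rng = random.Random(seed)
    rng.shuffle(prepared)
    cut = int(len(prepared) * train_frac)
    return prepared[:cut], prepared[cut:]
```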

Evaluation Data:

The validation set (10% of the combined custom dataset, approximately [Insert Number of Validation Examples - e.g., 8 or 9] examples) was used to monitor the model's performance during training. Key metrics tracked:

  • Validation loss: to check for overfitting and generalization.
  • ROUGE scores (ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-Lsum): to measure n-gram overlap and longest-common-subsequence similarity between generated and reference tweets.

The final evaluation metrics on this validation set (after 3 epochs of training) were:

  • eval_loss: 11.9889
  • eval_rouge1: 25.2391
  • eval_rouge2: 5.7802
  • eval_rougeL: 17.8758
  • eval_rougeLsum: 19.1195
  • eval_gen_len: 20.0

Qualitative evaluation (manual inspection of generated outputs) is also a critical part of assessing this model's performance, especially for stylistic nuances not captured by ROUGE scores.
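To make the ROUGE-1 numbers concrete, here is a hand-rolled unigram-overlap F1, the core of the ROUGE-1 score. The actual evaluation presumably used a standard ROUGE implementation (e.g., the `rouge_score` package); this sketch is for illustration only and skips stemming and other refinements.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between a generated tweet and its reference."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return 0.0
    # Clipped unigram matches: each reference token counts at most once.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```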

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3
  • mixed_precision_training: Native AMP
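The hyperparameters above translate roughly into the following `Seq2SeqTrainingArguments` (a reconstruction, not the author's actual script; the `output_dir` name is an assumption, and the `fp16` flag corresponding to Native AMP is noted in a comment because it requires a CUDA device):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-my-tweet-style",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",            # AdamW, betas=(0.9, 0.999), epsilon=1e-08
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=3,
    # fp16=True,                    # Native AMP; enable on a CUDA device
    predict_with_generate=True,     # needed to compute ROUGE during evaluation
)
```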

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| No log        | 1.0   | 21   | 16.8586         | 25.0317 | 5.1135 | 16.3459 | 19.0901   | 20.0    |
| No log        | 2.0   | 42   | 15.4176         | 24.7585 | 5.1135 | 15.8887 | 18.7101   | 20.0    |
| 13.9893       | 3.0   | 63   | 11.9889         | 25.2391 | 5.7802 | 17.8758 | 19.1195   | 20.0    |

Framework versions

  • Transformers 4.52.2
  • Pytorch 2.6.0+cu124
  • Datasets 2.14.4
  • Tokenizers 0.21.1