t5-base-my-tweet-style

This model is a fine-tuned version of google-t5/t5-base on a custom dataset curated from my own data. It achieves the following results on the evaluation set:

  • Loss: 11.9889
  • Rouge1: 25.2391
  • Rouge2: 5.7802
  • RougeL: 17.8758
  • RougeLsum: 19.1195
  • Gen Len: 20.0

Model description

This model is a fine-tuned version of google-t5/t5-base specifically adapted to generate tweets in a particular style. The goal is to take a longer text input (e.g., a concept, a piece of news, a paragraph from an article) and transform it into a concise, engaging tweet that mimics the stylistic characteristics of a specific user (e.g., tone, common phrasing, use of hashtags, and desired structure like a catchy headline followed by a short elaboration). The model was fine-tuned on a custom dataset of input-output pairs, where the inputs are longer texts and the outputs are example tweets written in the target style. The fine-tuning process aimed to teach the model to understand the input content and rephrase it according to the stylistic nuances present in the training data.

Intended uses & limitations

  • Content Generation: assist in drafting tweets that align with a specific personal or brand voice. Given a topic or a longer piece of text, the model can suggest a tweet.
  • Workflow Automation: designed to be integrated into workflows (e.g., via N8N) to automate generating initial tweet drafts from other content sources.
  • Style Transfer: apply a specific tweet-like style to informational content.
  • Creative Assistance: quickly generate stylistic variations of a message for social media.
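As a minimal usage sketch (not the card author's published code): the helper names `make_prompt` and `generate_tweet` are illustrative, the checkpoint name comes from this card, and the "tweet like me: " prefix and generation length follow the training details and the reported Gen Len of 20.0 below.

```python
def make_prompt(text: str) -> str:
    """Prepend the task prefix used during fine-tuning."""
    return "tweet like me: " + text.strip()


def generate_tweet(text: str,
                   model_name: str = "Charansaiponnada/t5-base-my-tweet-style") -> str:
    # Imported lazily so make_prompt() works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer(make_prompt(text), return_tensors="pt",
                       truncation=True, max_length=512)
    # max_length=20 mirrors the observed Gen Len of 20.0 on the eval set.
    output_ids = model.generate(**inputs, max_length=20, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


print(make_prompt("Our new release cuts build times in half"))
```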

Training and evaluation data

Training Data:

The model was fine-tuned on a custom dataset composed of two JSON files: dataset(sample).json and dataset(purpose).json. These files contain input-output pairs:

  • Input: longer-form text, descriptions, or ideas that serve as the source material for a tweet.
  • Output: example tweets written in the target user's characteristic style, intended to capture their typical tone, phrasing, use of hashtags, and conciseness.

The combined dataset contains approximately [Insert Total Number of Examples Before Splitting - e.g., 85] examples, which were split into a training set (90%) and a validation set (10%) for fine-tuning. The prefix "tweet like me: " was prepended to each input before tokenization to guide the model on the task.

(Future improvement note: to achieve a more specific output structure such as "catchy headline + 5 lines," the output examples in the training data would need to be consistently formatted that way; the current dataset primarily reflects a general tweet style.)
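The preprocessing described above (prefixing each input and splitting 90/10) can be sketched in plain Python. This is illustrative, not the author's script: `load_pairs` and `split_pairs` are hypothetical names, and the JSON files are assumed to hold a list of `{"input": ..., "output": ...}` records.

```python
import json
import random

PREFIX = "tweet like me: "


def load_pairs(*paths):
    """Concatenate input/output pairs from the JSON files listed on this card."""
    pairs = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            pairs.extend(json.load(f))  # assumed: list of {"input": ..., "output": ...}
    return pairs


def split_pairs(pairs, train_frac=0.9, seed=42):
    """Prefix each input, shuffle, and split 90/10 into train/validation."""
    prepared = [{"input": PREFIX + p["input"], "output": p["output"]} for p in pairs]
    rng = random.Random(seed)
    rng.shuffle(prepared)
    cut = int(len(prepared) * train_frac)
    return prepared[:cut], prepared[cut:]
```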

Evaluation Data:

The validation set (10% of the combined custom dataset, approximately [Insert Number of Validation Examples - e.g., 8 or 9] examples) was used to monitor the model's performance during training. Key metrics tracked:

  • Validation loss: to check for overfitting and generalization.
  • ROUGE scores (ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-Lsum): to measure n-gram overlap and longest-common-subsequence similarity between generated and reference tweets.

The final evaluation metrics on this validation set (after 3 epochs of training) were:

  • eval_loss: 11.9889
  • eval_rouge1: 25.2391
  • eval_rouge2: 5.7802
  • eval_rougeL: 17.8758
  • eval_rougeLsum: 19.1195
  • eval_gen_len: 20.0

Qualitative evaluation (manual inspection of generated outputs) is also a critical part of assessing this model's performance, especially for stylistic nuances not captured by ROUGE scores.
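To make the ROUGE-1 numbers concrete, here is a hand-rolled unigram-overlap F1, the core of the ROUGE-1 score. The actual evaluation presumably used a standard ROUGE implementation (e.g., the `rouge_score` package); this sketch is for illustration only and skips stemming and other refinements.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between a generated tweet and its reference."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return 0.0
    # Clipped unigram matches: each reference token counts at most once.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```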

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3
  • mixed_precision_training: Native AMP
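The hyperparameters above translate roughly into the following `Seq2SeqTrainingArguments` (a reconstruction, not the author's actual script; the `output_dir` name is an assumption, and the `fp16` flag corresponding to Native AMP is noted in a comment because it requires a CUDA device):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-my-tweet-style",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",            # AdamW, betas=(0.9, 0.999), epsilon=1e-08
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=3,
    # fp16=True,                    # Native AMP; enable on a CUDA device
    predict_with_generate=True,     # needed to compute ROUGE during evaluation
)
```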

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| No log        | 1.0   | 21   | 16.8586         | 25.0317 | 5.1135 | 16.3459 | 19.0901   | 20.0    |
| No log        | 2.0   | 42   | 15.4176         | 24.7585 | 5.1135 | 15.8887 | 18.7101   | 20.0    |
| 13.9893       | 3.0   | 63   | 11.9889         | 25.2391 | 5.7802 | 17.8758 | 19.1195   | 20.0    |

Framework versions

  • Transformers 4.52.2
  • Pytorch 2.6.0+cu124
  • Datasets 2.14.4
  • Tokenizers 0.21.1