almaghrabima committed (verified)
Commit: 193b511
Parent: c91f786

Update README.md

Files changed (1): README.md (+2 -2)
README.md CHANGED
@@ -53,11 +53,11 @@ pipeline_tag: text-generation
 
 ## Model Description
 
-**SUHAIL-14B-preview** extends the open-weight **Qwen-3-14B-Base** to support Arabic instruction-following using **Low-Rank Adaptation (LoRA)**. LoRA introduces small trainable matrices to linear layers, keeping the base weights frozen, enabling compact, efficient fine-tuning.
+**SUHAIL-14B-preview** extends the open-weight **Qwen-3-14B-Base** to better support Arabic instruction-following using **Low-Rank Adaptation (LoRA)**. LoRA introduces small trainable matrices to the linear and attention layers, keeping the base weights frozen, enabling compact, efficient fine-tuning.
 
 ### 1 · Supervised Fine-Tuning (SFT)
 
-We first conducted SFT on a high-quality instruction dataset in Arabic and English. This dataset was curated using **Style-Aligned Response Ranking**, a RoBERTa-based reranker that filters out stylistically incoherent or low-quality samples from the Efficient Instruction-Tuning corpus. This step enhanced factuality and stylistic consistency.
+We first conducted SFT on a high-quality instruction dataset in Arabic and English. This dataset was curated using **Style-Aligned Response Ranking**, a RoBERTa-based reranker that filters out stylistically incoherent or low-quality samples from the Instruction-Tuning corpus. This step enhanced factuality and stylistic consistency.
 > **Result**: Up to 22% performance improvements observed on internal benchmarks (e.g., IFEVAL).
 
 ### 2 · Human Preference Alignment
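
The LoRA mechanism the description refers to can be illustrated with a minimal numeric sketch. This is not the SUHAIL training code: the dimensions, rank `r`, and scaling `alpha` below are illustrative, and a real run would use a library such as `peft` on the Qwen layers. The key property shown is that the base weight `W` stays frozen while only the small low-rank factors `A` and `B` would be trained, and that with `B` initialized to zero the adapter starts as a no-op.

```python
import numpy as np

# Toy sketch of Low-Rank Adaptation (LoRA). All sizes are illustrative,
# not the actual SUHAIL/Qwen-3-14B configuration.
rng = np.random.default_rng(0)

d_out, d_in, r, alpha = 8, 8, 2, 16
W = rng.normal(size=(d_out, d_in))      # frozen base weight (never updated)
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-initialized

def lora_forward(x):
    # y = W x + (alpha / r) * B (A x)
    # Only A and B would receive gradients; W stays frozen.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# Because B starts at zero, the adapted layer initially matches the base layer.
assert np.allclose(lora_forward(x), W @ x)
```

Trainable parameters here are `r * (d_in + d_out)` per adapted matrix rather than `d_in * d_out`, which is what makes the fine-tuning compact.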
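
The reranker-based curation step can be sketched as a simple filter: a scoring model assigns each instruction/response pair a quality score and only samples above a threshold are kept for SFT. The actual RoBERTa-based Style-Aligned Response Ranking model is not public, so the scoring function, threshold, and corpus entries below are stand-ins for illustration only.

```python
# Hedged sketch of score-and-filter dataset curation. `toy_score` is a
# placeholder for the RoBERTa-based reranker described above.
def filter_sft_corpus(samples, score_fn, threshold=0.5):
    """Keep instruction/response pairs whose quality score clears the threshold."""
    return [s for s in samples if score_fn(s) >= threshold]

corpus = [
    {"instruction": "Explain LoRA briefly.",
     "response": "LoRA adds small trainable low-rank matrices to frozen layers."},
    {"instruction": "Translate 'hello' to Arabic.",
     "response": "ok"},  # low-effort sample the filter should drop
]

# Stand-in scorer: favors longer, more substantive responses (capped at 1.0).
toy_score = lambda s: min(len(s["response"].split()) / 5, 1.0)

kept = filter_sft_corpus(corpus, toy_score, threshold=0.5)
# Only the substantive sample survives the filter.
```

In the real pipeline the stand-in scorer would be replaced by the reranker's style/quality score, with the threshold tuned on held-out annotations.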