---
base_model: unsloth/LFM2-350M-unsloth-bnb-4bit
library_name: peft
pipeline_tag: text-generation
tags:
- "base_model:adapter:unsloth/LFM2-350M-unsloth-bnb-4bit"
- lora
- qlora
- sft
- transformers
- trl
- conventional-commits
- code
---
# lfm2_350m_commit_diff_summarizer (LoRA)

A lightweight **helper model** that turns Git diffs into **Conventional Commit–style** messages.
It outputs **strict JSON** with a short `title` (≤ 65 chars) and up to 3 `bullets`, so your CLI/agents can parse it deterministically.
## Model Details

### Model Description

* **Purpose:** Summarize `git diff` patches into concise, Conventional Commit–compliant titles with optional bullets.
* **I/O format:**
  * **Input:** prompt containing the diff (plain text).
  * **Output:** JSON object: `{"title": "...", "bullets": ["...", "..."]}`.
* **Model type:** LoRA adapter for a causal LM (text generation)
* **Language(s):** English (commit message conventions)
* **Finetuned from:** `unsloth/LFM2-350M-unsloth-bnb-4bit` (4-bit quantized base, trained with QLoRA)

### Model Sources

* **Repository:** this model card + adapter on the Hub under `ethanke/lfm2_350m_commit_diff_summarizer`
## Uses

### Direct Use

* Convert patch diffs into Conventional Commit messages for PR titles, commits, and changelogs.
* Provide human-readable summaries in agent UIs with guaranteed JSON structure.

### Recommendations

* Enforce JSON validation; if invalid, retry with a JSON-repair prompt.
* Keep a regex gate for Conventional Commit titles in your pipeline.
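The two recommendations above can be combined into one validation gate. A minimal sketch (the regex mirrors the Conventional Commits pattern used for training-data filtering; the 65-char and 3-bullet limits come from the model's output contract):

```python
import json
import re

# Conventional Commit title gate (same pattern family as the training filter).
CC_TITLE = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\([^)]+\))?(!)?:\s.+$"
)

def validate_output(raw: str):
    """Return the parsed object if it passes all gates, else None (caller retries)."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return None
    title = obj.get("title", "")
    bullets = obj.get("bullets", [])
    if not CC_TITLE.match(title) or len(title) > 65:
        return None
    if not isinstance(bullets, list) or len(bullets) > 3:
        return None
    return obj

ok = validate_output('{"title": "fix(parser): handle empty diffs", "bullets": ["guard against None input"]}')
bad = validate_output('{"title": "updated stuff", "bullets": []}')
```

If `validate_output` returns `None`, re-prompt with the invalid text and a JSON-repair instruction rather than failing the pipeline outright.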

## How to Get Started

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch, json

BASE = "unsloth/LFM2-350M-unsloth-bnb-4bit"
ADAPTER = "ethanke/lfm2_350m_commit_diff_summarizer"  # replace with your repo id

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tok = AutoTokenizer.from_pretrained(BASE, use_fast=True)
mdl = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto")
mdl = PeftModel.from_pretrained(mdl, ADAPTER)

diff = "...your git diff text..."
prompt = (
    "You are a commit message summarizer.\n"
    "Return a concise JSON object with fields 'title' (<=65 chars) and 'bullets' (0-3 items).\n"
    "Follow the Conventional Commit style for the title.\n\n"
    "### DIFF\n" + diff + "\n\n### OUTPUT JSON\n"
)

inputs = tok(prompt, return_tensors="pt").to(mdl.device)
with torch.no_grad():
    out = mdl.generate(**inputs, max_new_tokens=200, do_sample=False)
text = tok.decode(out[0], skip_special_tokens=True)

# Naive JSON extraction: assumes the last {...} span in the decoded text is the output object.
js = text[text.rfind("{") : text.rfind("}") + 1]
obj = json.loads(js)  # may raise on malformed output; retry with a JSON-repair prompt
print(obj)
```
## Training Details

### Training Data

* **Dataset:** `Maxscha/commitbench` (diff → commit message).
* **Filtering:** kept only samples whose **first non-empty line** of the message matches the Conventional Commits pattern:
  `^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([^)]+\))?(!)?:\s.+$`
* **Note:** The dataset card indicates non-commercial licensing. Confirm before commercial deployment.
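The filter above can be sketched as a predicate suitable for `datasets.Dataset.filter`; `message` is an assumed field name for the commit message in CommitBench and should be checked against the dataset's actual schema:

```python
import re

CC_PATTERN = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\([^)]+\))?(!)?:\s.+$"
)

def is_conventional(example: dict) -> bool:
    """Keep a sample only if the first non-empty message line is a CC title."""
    lines = [ln for ln in example["message"].splitlines() if ln.strip()]
    return bool(lines) and CC_PATTERN.match(lines[0]) is not None

# With Hugging Face datasets this would be applied as:
#   ds = load_dataset("Maxscha/commitbench", split="train").filter(is_conventional)
kept = is_conventional({"message": "feat(cli)!: add --json flag\n\nDetails..."})
dropped = is_conventional({"message": "Update README"})
```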

### Training Procedure

* **Method:** Supervised fine-tuning (SFT) with TRL `SFTTrainer` + **QLoRA** (PEFT).
* **Prompting:** instruction + `### DIFF` + `### OUTPUT JSON` target (title/bullets).
* **Precision:** fp16 compute on the 4-bit base.
* **Hyperparameters (v0.1):**
  * `max_length=2048`, `per_device_train_batch_size=2`, `grad_accum=4`
  * `lr=2e-4`, `scheduler=cosine`, `warmup_ratio=0.03`
  * `epochs=1` over a capped subset
  * LoRA: `r=16`, `alpha=32`, `dropout=0.05`, targets: q/k/v/o + MLP projections
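The LoRA settings above translate roughly into the following PEFT configuration. This is a sketch: the `target_modules` names are an assumption for LFM2's attention and MLP projections and should be verified against the real module names (e.g. via `model.named_modules()`):

```python
from peft import LoraConfig

# r/alpha/dropout from the hyperparameter list above; target_modules are assumed names.
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```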

### Evaluation

* **Validation:** filtered split from CommitBench.
* **Metrics (example run):**
  * `eval_loss ≈ 1.18` → perplexity ≈ 3.26
  * `eval_mean_token_accuracy ≈ 0.77`
* Suggested task metrics: JSON validity rate, CC-title compliance, title length ≤ 65 chars, bullets ≤ 3.
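Perplexity here is just `exp(eval_loss)` (exp(1.18) ≈ 3.25; the reported 3.26 presumably comes from the unrounded loss). The suggested task metrics can be computed over a batch of raw model outputs with a sketch like this (`task_metrics` is a hypothetical helper, not part of the released code):

```python
import json
import math
import re

ppl = math.exp(1.18)  # perplexity from eval_loss

CC_TITLE = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\([^)]+\))?(!)?:\s.+$"
)

def task_metrics(outputs):
    """Rates of JSON validity, CC-title compliance, and length/bullet limits."""
    n = len(outputs)
    valid = cc_ok = len_ok = bullets_ok = 0
    for raw in outputs:
        try:
            obj = json.loads(raw)
        except json.JSONDecodeError:
            continue
        valid += 1
        title = str(obj.get("title", ""))
        bullets = obj.get("bullets", [])
        cc_ok += bool(CC_TITLE.match(title))
        len_ok += len(title) <= 65
        bullets_ok += isinstance(bullets, list) and len(bullets) <= 3
    return {
        "json_validity": valid / n,
        "cc_title_compliance": cc_ok / n,
        "title_len_ok": len_ok / n,
        "bullets_ok": bullets_ok / n,
    }

m = task_metrics(['{"title": "fix: null check", "bullets": []}', "not json"])
```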

## Environmental Impact

* **Hardware:** 1× NVIDIA RTX 3060 12 GB (local)
* **Hours used:** ~2 h (prototype)
|
| | ## Technical Specifications |
| |
|
| | * **Architecture:** LFM2-350M (decoder-only) + LoRA adapter |
| | * **Libraries:** `transformers`, `trl`, `peft`, `bitsandbytes`, `datasets`, `unsloth` |
| |
|
| | ## Contact |
| |
|
| | * Open an issue on the Hub repo or message `ethanke` on Hugging Face. |
| |
|
| | ### Framework versions |
| |
|
| | * PEFT 0.17.1 |
| | * TRL (SFTTrainer) |
| | * Transformers (recent version) |
| |
|