---
library_name: transformers
tags:
  - financial
  - swift
  - mt564
  - gemma
  - lora
  - corporate-actions
  - causal-lm
  - transformers
  - huggingface
  - fine-tuned
  - finance
demo: https://huggingface.co/pareshmishra/mt564-gemma-lora-chat
license: mit
base_model: google/gemma-2b-it
pipeline_tag: text-generation
---

# Model Card: MT564-Gemma-LoRA

This model is a fine-tuned version of google/gemma-2b-it designed to analyze SWIFT MT564 corporate-action messages and flag potential structural or compliance-related anomalies. It supports extracting sequences, identifying missing fields, and detecting risky patterns such as incorrect codes, unusual currencies, or sanctioned countries.


## Model Details

### Model Description

- **Developer:** Paresh Mishra
- **Model Type:** Causal Language Model (instruction-tuned)
- **Language(s):** English (financial/SWIFT domain)
- **Base Model:** google/gemma-2b-it
- **Fine-tuning:** PEFT / LoRA (r=16, alpha=32, dropout=0.05); see the configuration sketch after this list
- **Framework:** Hugging Face Transformers
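
For reference, the stated LoRA hyperparameters map onto a PEFT configuration roughly like the one below. This is a sketch: the `target_modules` choice is an assumption (a common setting for Gemma-style attention projections) and is not confirmed by this card.

```python
from peft import LoraConfig

# LoRA setup implied by the hyperparameters above.
lora_config = LoraConfig(
    r=16,                # LoRA rank
    lora_alpha=32,       # scaling factor
    lora_dropout=0.05,   # dropout on LoRA layers
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```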

### Sources

- Repository: https://huggingface.co/pareshmishra/mt564-gemma-lora
- Demo: https://huggingface.co/pareshmishra/mt564-gemma-lora-chat


## Uses

### Direct Use

- Identify anomalies in SWIFT MT564 messages
- Understand sequences (GENL, CAOPTN, etc.)
- Verify country/currency codes for compliance
- Detect missing mandatory fields or incorrect field order

### Downstream Use

Can be integrated into (see the wrapper sketch after this list):

- Compliance tools
- Audit automation platforms
- Financial reporting systems
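
As a minimal sketch of such an integration, the helper below wraps generation behind a single function a downstream system could call. It assumes `tokenizer` and `model` are loaded as shown in the How to Use section below; the function name and prompt template are illustrative, not part of this repository.

```python
def flag_mt564_anomalies(message: str, max_new_tokens: int = 256) -> str:
    """Hypothetical compliance-tool entry point: returns the model's
    free-text anomaly report for one raw MT564 message."""
    prompt = (
        "### Instruction:\nAnalyze this MT564 message for anomalies\n\n"
        f"### Input:\n{message}\n\n### Response:"
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the echoed prompt tokens; return only the generated response.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
```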

### Out-of-Scope Use

- General-purpose chat
- Legal or regulatory interpretation without human oversight

## Bias, Risks, and Limitations

This model:

- May not generalize beyond SWIFT MT564 unless retrained.
- May hallucinate anomalies when fields are non-standard but valid.
- Should not be used in production without human validation.

### Recommendations

- Always cross-validate flagged anomalies with domain experts.
- Extend the dataset with more ISO 20022-compliant and real-world examples.

## How to Use

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "pareshmishra/mt564-gemma-lora"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The prompt follows the Instruction/Input/Response template used for
# fine-tuning. This sample message contains deliberate red flags
# (e.g., a KP recipient in :95Q::RCPT).
prompt = """### Instruction:
Analyze this MT564 message for anomalies

### Input:
{1:F01TESTBANKXXXX0000000000}{2:I564CLIENTBANKXXXXN}{4:
:16R:GENL
:20C::CORP//CA20250501
:23G:NEWM
:22F::CAEV//DVCA
:16S:GENL
:16R:CAOPTN
:13A::CAON//001
:36B::ENTL//UNIT/5000000
:19A::SETT//ZAR/5000000
:95Q::RCPT//KP
:16S:CAOPTN
}

### Response:"""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
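
If this repository hosts only the LoRA adapter rather than merged weights (not confirmed here), loading would instead go through PEFT, roughly:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: adapter-only repo layered on the stated base model.
base = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it")
model = PeftModel.from_pretrained(base, "pareshmishra/mt564-gemma-lora")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
```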

## Training Details

### Training Data

80+ high-quality JSONL records crafted from:

- ISO 20022 documentation
- swift_ISO20022.pdf
- Simulated MT564 edge cases

Format: `"text": "### Instruction:\n...\n### Input:\n...\n### Response:\n..."` (a sample record is sketched below)
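
To illustrate that format, one JSONL record might look like the following; the instruction and response text here are hypothetical, not taken from the actual dataset:

```json
{"text": "### Instruction:\nAnalyze this MT564 message for anomalies\n\n### Input:\n:16R:GENL\n:20C::CORP//CA20250501\n:23G:NEWM\n:16S:GENL\n\n### Response:\nMissing mandatory :22F::CAEV event type field in the GENL sequence."}
```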

### Training Hyperparameters

| Parameter       | Value |
|-----------------|-------|
| Epochs          | 3     |
| Batch Size      | 2     |
| Gradient Accum. | 4     |
| Learning Rate   | 3e-5  |
| LoRA r          | 16    |
| LoRA Alpha      | 32    |
| Dropout         | 0.05  |
| Max Length      | 2048  |
| Quantization    | int4  |
| Precision       | fp16  |
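
These settings correspond to a QLoRA-style run. A minimal configuration sketch, assuming bitsandbytes 4-bit loading and the `lora_config` from the earlier sketch (the training loop itself, e.g. which trainer class was used, is not documented in this card):

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import get_peft_model, prepare_model_for_kbit_training

# int4 quantization of the base model, per the table above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # fp16 precision
)
base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2b-it", quantization_config=bnb_config
)
base = prepare_model_for_kbit_training(base)
model = get_peft_model(base, lora_config)  # lora_config defined earlier

args = TrainingArguments(
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=3e-5,
    fp16=True,
    output_dir="mt564-gemma-lora",  # assumed output path
)
```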

### Hardware

- Environment: Google Colab
- GPU: T4
- Training Time: ~12 minutes

## Evaluation

### Metrics

- Manual evaluation comparing expected vs. actual anomaly detections (see the harness sketch after this list)
- Correctly flagged missing sequences and invalid codes
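
A minimal sketch of that expected-vs-actual comparison, assuming the hypothetical `flag_mt564_anomalies` helper from the Downstream Use section; the test cases below are illustrative, not drawn from the actual evaluation set:

```python
# Hypothetical expected-vs-actual check over a few hand-written cases.
test_cases = [
    # (MT564 snippet, substring the response should mention)
    (":16R:GENL\n:23G:NEWM\n:16S:GENL", "22f::caev"),
    (":19A::SETT//XXX/100", "currency"),
]

hits = 0
for message, expected in test_cases:
    response = flag_mt564_anomalies(message)
    if expected in response.lower():
        hits += 1
print(f"{hits}/{len(test_cases)} expected anomalies flagged")
```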

## Environmental Impact

- Hardware Type: Google Colab T4
- Hours Used: ~0.2
- Cloud Provider: Google
- Carbon Estimate: ~0.02 kg CO₂e (via the MLCO2 calculator)

## Citation

```bibtex
@misc{mt564gemma,
  title={MT564-Gemma-LoRA},
  author={Paresh Mishra},
  year={2025},
  howpublished={\url{https://huggingface.co/pareshmishra/mt564-gemma-lora}},
}
```
## Contact

- GitHub: @pareshmishra
- Hugging Face: pareshmishra