# Privacy Classifier (ELECTRA)
A fine-tuned ELECTRA model for detecting sensitive/private information in text.
## Model Description

This model classifies text as either `safe` or `sensitive`, helping identify content that may contain private information such as:
- Social security numbers
- Passwords and credentials
- Financial account numbers
- Personal health information
- Home addresses
- Phone numbers
## Base Model

- Architecture: `google/electra-base-discriminator`
- Parameters: ~110M
- Task: Binary text classification
## Training Details
| Parameter | Value |
|---|---|
| Epochs | 5 |
| Validation Accuracy | 99.68% |
| Training Hardware | NVIDIA RTX 5090 (32GB) |
| Framework | PyTorch + Transformers |
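
For reference, below is a minimal fine-tuning sketch using the Hugging Face `Trainer` API. Only the epoch count comes from the table above; the dataset files, column names, batch size, and learning rate are illustrative assumptions, not the actual training code.

```python
# Fine-tuning sketch (illustrative; only epochs=5 comes from the table above).
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

# Hypothetical CSV dataset with "text" and "label" columns (0=safe, 1=sensitive)
dataset = load_dataset("csv", data_files={"train": "train.csv", "validation": "val.csv"})

tokenizer = AutoTokenizer.from_pretrained("google/electra-base-discriminator")
model = AutoModelForSequenceClassification.from_pretrained(
    "google/electra-base-discriminator",
    num_labels=2,
    id2label={0: "safe", 1: "sensitive"},
    label2id={"safe": 0, "sensitive": 1},
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="privacy-classifier-electra",
    num_train_epochs=5,               # from the table above
    per_device_train_batch_size=32,   # assumption
    learning_rate=2e-5,               # assumption
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```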
## Labels

- `safe` (0): Content does not contain sensitive information
- `sensitive` (1): Content may contain private/sensitive information
## Usage
```python
from transformers import pipeline

classifier = pipeline("text-classification", model="jonmabe/privacy-classifier-electra")

# Examples
result = classifier("My SSN is 123-45-6789")
# [{'label': 'sensitive', 'score': 0.99...}]

result = classifier("The meeting is at 3pm")
# [{'label': 'safe', 'score': 0.99...}]
```
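
The pipeline also accepts a list of strings, which is convenient for screening batches of text; the example texts here are illustrative.

```python
texts = ["Call me at 555-867-5309", "Lunch is in the break room"]
results = classifier(texts)  # one dict per input text
for text, result in zip(texts, results):
    print(f"{result['label']} ({result['score']:.2f}): {text}")
```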
### Direct Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("jonmabe/privacy-classifier-electra")
model = AutoModelForSequenceClassification.from_pretrained("jonmabe/privacy-classifier-electra")

text = "My credit card number is 4111-1111-1111-1111"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)

prediction = torch.argmax(outputs.logits, dim=-1)
label = "sensitive" if prediction.item() == 1 else "safe"
print(f"Classification: {label}")
```
## Intended Use

- Primary Use: Pre-screening text before logging, storage, or transmission
- Use Cases:
  - Filtering sensitive content from logs (see the sketch below)
  - Flagging potential PII in user-generated content
  - Privacy-aware content moderation
  - Data loss prevention (DLP) systems
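
As an example of the log-filtering use case, here is a minimal sketch of a logging wrapper; `safe_log`, the threshold, and the redaction behavior are hypothetical and not part of this model.

```python
# Hypothetical log-screening wrapper: redacts messages the classifier
# flags as sensitive before they reach the log.
import logging

from transformers import pipeline

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("app")
classifier = pipeline("text-classification", model="jonmabe/privacy-classifier-electra")

def safe_log(message: str, threshold: float = 0.5) -> None:
    result = classifier(message)[0]
    if result["label"] == "sensitive" and result["score"] >= threshold:
        logger.info("[REDACTED: potentially sensitive content]")
    else:
        logger.info(message)

safe_log("User logged in from the dashboard")     # logged as-is
safe_log("Password reset token: hunter2-abc123")  # redacted
```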
## Limitations
- Trained primarily on English text
- May not catch all forms of sensitive information
- Should be used as one layer in a defense-in-depth approach
- Not a substitute for proper data handling policies
## Training Data
Custom dataset combining:
- Synthetic examples of sensitive patterns (SSN, passwords, etc.)
- Safe text samples from various domains
- Balanced classes for robust classification
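
For illustration only, synthetic sensitive examples of this kind can be produced from simple templates. The sketch below is an assumption about the general approach, not the actual data-generation code for this model.

```python
# Illustrative template-based generation of synthetic sensitive examples;
# an assumption about the approach, not the actual training pipeline.
import random

def fake_ssn() -> str:
    return f"{random.randint(100, 999)}-{random.randint(10, 99)}-{random.randint(1000, 9999)}"

templates = [
    "My SSN is {ssn}",
    "Please update my record, SSN: {ssn}",
]

sensitive_examples = [t.format(ssn=fake_ssn()) for t in templates]
# Pair with an equal number of safe samples to keep the classes balanced.
```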
## Citation

```bibtex
@misc{privacy-classifier-electra,
  author    = {jonmabe},
  title     = {Privacy Classifier based on ELECTRA},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/jonmabe/privacy-classifier-electra}
}
```