PII-Masking-3M, the world's largest open multilingual PII masking corpus, covering Europe, the Americas, and Asia-Pacific across 30 languages.
AI & ML interests
Privacy and artificial intelligence. NER. Token Classification.
Organization Card
Open datasets, models, and privacy infrastructure for AI.
Ai4Privacy is trusted for privacy-preserving NLP and LLM workflows, including PII detection, masking, redaction, and evaluation.
15M+ downloads, 3M+ annotated entries in 33 locales with 109 entity labels, and 64+ academic citations.
Website: ai4privacy.com
Privacy framework: p5y.org
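As a rough illustration of the masking step mentioned above, the following minimal sketch replaces annotated character spans with label placeholders. The function name `mask_pii`, the span fields (`start`, `end`, `label`), and the label names are hypothetical and chosen for illustration only; they are not taken from the Ai4Privacy datasets or models, whose actual schema may differ.

```python
from typing import Dict, List


def mask_pii(text: str, spans: List[Dict]) -> str:
    """Replace each annotated character span with a [LABEL] placeholder.

    Each span is a dict with hypothetical keys: "start" (inclusive),
    "end" (exclusive), and "label". Overlapping spans are rejected.
    """
    pieces = []
    cursor = 0
    for span in sorted(spans, key=lambda s: s["start"]):
        if span["start"] < cursor:
            raise ValueError("overlapping spans are not supported")
        pieces.append(text[cursor:span["start"]])  # untouched text before the span
        pieces.append(f"[{span['label']}]")        # placeholder for the PII value
        cursor = span["end"]
    pieces.append(text[cursor:])                   # remaining text after the last span
    return "".join(pieces)


example = "Contact Jane Doe at jane@example.com."
spans = [
    {"start": 8, "end": 16, "label": "NAME"},    # "Jane Doe"
    {"start": 20, "end": 36, "label": "EMAIL"},  # "jane@example.com"
]
print(mask_pii(example, spans))  # prints "Contact [NAME] at [EMAIL]."
```

In practice the spans would come from a detection model or from dataset annotations rather than being written by hand.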
models 3
ai4privacy/llama-ai4privacy-english-anonymiser-openpii
Token Classification • 0.1B • Updated • 593 • 19
ai4privacy/llama-ai4privacy-multilingual-categorical-anonymiser-openpii
Token Classification • 0.1B • Updated • 1.26k • 20
ai4privacy/llama-ai4privacy-multilingual-anonymiser-openpii
Token Classification • 0.1B • Updated • 54 • 12
datasets 40
ai4privacy/pii-masking-micro-100k
Viewer • Updated • 100k
ai4privacy/pii-masking-mini-10k
Viewer • Updated • 9.99k
ai4privacy/pii-masking-nano-1k
Viewer • Updated • 990
ai4privacy/pwi-masking-100k-full
Viewer • Updated • 91.5k • 21
ai4privacy/pwi-masking-100k
Viewer • Updated • 400 • 37 • 3
ai4privacy/pli-masking-100k-full
Viewer • Updated • 91.3k • 21 • 1
ai4privacy/pli-masking-100k
Viewer • Updated • 400 • 53 • 2
ai4privacy/pii-masking-work-pwi-preview
Viewer • Updated • 50 • 83 • 1
ai4privacy/pii-masking-work-pwi-200k
Viewer • Updated • 252k • 39
ai4privacy/pii-masking-location-pli-preview
Viewer • Updated • 50 • 97 • 1