Michael Anthony PRO
MikeDoes
AI & ML interests
Privacy, Large Language Models, Explainability
Recent Activity
posted an update about 11 hours ago
What happens when PII masking is treated as a trainable behavior, not just a detection task?
A new reinforcement learning environment tackles this question using a dataset derived from ai4privacy/open-pii-masking-500k-ai4privacy, transformed into a verifier-based training and evaluation setup.
Instead of evaluating PII masking as a one-off redaction step, this environment frames privacy as something models must consistently optimize for under feedback. The task requires models to correctly identify sensitive spans, replace them with [PII] tags, and comply with strict output formatting, all scored through explicit reward signals.
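The scoring described above could be sketched as a verifier-style reward function. This is a minimal illustration, not the environment's actual implementation: the reward components, weights, and the `pii_masking_reward` helper are all assumptions.

```python
import re

def pii_masking_reward(output: str, gold_spans: list[str], source: str) -> float:
    """Score a model's masked output. Hypothetical reward shaping:
    the real environment's checks and weights may differ."""
    # Component 1: every gold PII span must be absent from the output.
    leaked = [s for s in gold_spans if s in output]
    coverage = 1.0 - len(leaked) / max(len(gold_spans), 1)

    # Component 2: strict formatting -- penalize malformed bracket tags
    # that contain "PII" but are not exactly [PII].
    format_ok = 1.0 if not re.search(r"\[(?!PII\])[^\]]*PII[^\]]*\]", output) else 0.0

    # Component 3: the non-PII text must be preserved verbatim.
    expected = source
    for s in gold_spans:
        expected = expected.replace(s, "[PII]")
    fidelity = 1.0 if output == expected else 0.0

    # Weighted sum; the weights are purely illustrative.
    return 0.5 * coverage + 0.2 * format_ok + 0.3 * fidelity

source = "Contact Jane Doe at jane@example.com for details."
gold = ["Jane Doe", "jane@example.com"]
masked = "Contact [PII] at [PII] for details."
print(pii_masking_reward(masked, gold, source))  # perfect mask -> 1.0
```

A leaked span or an altered surrounding sentence lowers the score rather than failing a binary check, which is what lets the task act as a continuous training signal instead of a pass/fail detector.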
To make this realistic, the author filtered and normalized the dataset to focus on US-English examples, ensuring consistent masking targets while preserving the structural diversity needed to expose failure modes.
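The filtering and normalization step might look like the sketch below. The field names (`locale`, `target`) and the tag-normalization rule are assumptions about the dataset schema, not the author's exact pipeline.

```python
import re

def filter_us_english(records: list[dict]) -> list[dict]:
    """Keep US-English rows and collapse entity-specific tags to [PII].
    Hypothetical schema: each record has "locale" and "target" fields."""
    kept = []
    for rec in records:
        # Keep only US-English examples (assumed locale field).
        if rec.get("locale") != "en-US":
            continue
        # Normalize entity-specific tags (e.g. [FIRSTNAME], [PHONENUMBER])
        # to a single consistent [PII] masking target.
        rec = dict(rec)
        rec["target"] = re.sub(r"\[[A-Z_]+\]", "[PII]", rec["target"])
        kept.append(rec)
    return kept

sample = [
    {"locale": "en-US", "target": "Call [FIRSTNAME] at [PHONENUMBER]."},
    {"locale": "fr-FR", "target": "Appelez [FIRSTNAME]."},
]
print(filter_us_english(sample))
# [{'locale': 'en-US', 'target': 'Call [PII] at [PII].'}]
```

Collapsing many entity tags into one target is what makes the reward consistent across examples while the underlying sentences keep their structural variety.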
What's notable here isn't just the environment itself, but the shift in perspective.
By turning PII masking into a reinforcement learning problem, privacy stops being a static rule and becomes a behavior models are trained to maintain even under optimization pressure.
This is a strong example of how open privacy datasets can move beyond benchmarks and become infrastructure for new learning paradigms.
Explore the PII Masking RL environment on Prime Intellect:
https://app.primeintellect.ai/dashboard/environments/adamlucek/pii-masking

updated a collection 1 day ago
PII-Masking-2M European Release