Michael Anthony PRO
MikeDoes
AI & ML interests
Privacy, Large Language Model, Explainable
Recent Activity
reacted to their post with 👀
about 1 hour ago
The future of AI privacy isn't just in the cloud; it's on your device. But how do we build and validate these tools?
A new paper on "Rescriber" explores this with a tool that uses smaller LLMs for on-device anonymization. Building and validating such tools requires a strong data foundation. We're excited to see that the researchers used the Ai4Privacy open dataset to create their performance benchmarks.
This is our mission in action: providing the open-source data that helps innovators build and test better solutions that will give users more control over their privacy. It's a win for the community when our data helps prove the feasibility of on-device AI for data minimization, with reported user perceptions on par with state-of-the-art cloud models.
Shoutout to Jijie Zhou, Eryue Xu, Yaoyao Wu, and Tianshi Li on this one!
🔗 Check out the research to see how on-device AI, powered by solid data, is changing the game: https://dl.acm.org/doi/pdf/10.1145/3706598.3713701
🚀 Stay updated on the latest in privacy-preserving AI—follow us on LinkedIn: https://www.linkedin.com/company/ai4privacy/posts/
#OpenSource #DataPrivacy #LLM #Anonymization #AIsecurity #HuggingFace #Ai4Privacy #Worldslargestopensourceprivacymaskingdataset
reacted to their post with 🚀
1 day ago
Why choose between performance, privacy, and transparency when you can have all three?
We're highlighting a solution-oriented paper that introduces PRvL, an open-source toolkit for PII redaction. The interesting part: the researchers used the AI4Privacy-300K and AI4Privacy-500K datasets to train and benchmark their suite of models.
This is the power of open-source collaboration. We provide the comprehensive data foundation, and the community builds better solutions on top of it. It's a win for every organization when this research results in a powerful, free, and self-hostable tool that helps keep their data safe.
Big cheers to Leon Garza, Anantaa Kotal, Aritran Piplai, Lavanya Elluri, Prajit D., and Aman Chadha for pulling this off.
🔗 Read the full paper to see their data-driven results and access the PRvL toolkit: https://arxiv.org/pdf/2508.05545
🚀 Stay updated on the latest in privacy-preserving AI—follow us on LinkedIn: https://www.linkedin.com/company/ai4privacy/posts/
#OpenSource #DataPrivacy #LLM #Anonymization #AIsecurity #HuggingFace #Ai4Privacy #Worldslargestopensourceprivacymaskingdataset