arXiv:2602.20743

Adaptive Text Anonymization: Learning Privacy-Utility Trade-offs via Prompt Optimization

Published on Feb 24 · Submitted by Gabriel Loiseau on Feb 25

AI-generated summary

An adaptive text anonymization framework automatically adjusts anonymization strategies to match privacy-utility requirements, using prompt optimization for language models across diverse domains and constraints.

Abstract

Anonymizing textual documents is a highly context-sensitive problem: the appropriate balance between privacy protection and utility preservation varies with the data domain, privacy objectives, and downstream application. However, existing anonymization methods rely on static, manually designed strategies that lack the flexibility to adjust to diverse requirements and often fail to generalize across domains. We introduce adaptive text anonymization, a new task formulation in which anonymization strategies are automatically adapted to specific privacy-utility requirements. We propose a framework for task-specific prompt optimization that automatically constructs anonymization instructions for language models, enabling adaptation to different privacy goals, domains, and downstream usage patterns. To evaluate our approach, we present a benchmark spanning five datasets with diverse domains, privacy constraints, and utility objectives. Across all evaluated settings, our framework consistently achieves a better privacy-utility trade-off than existing baselines, while remaining computationally efficient and effective on open-source language models, with performance comparable to larger closed-source models. Additionally, we show that our method can discover novel anonymization strategies that explore different points along the privacy-utility trade-off frontier.
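Read as an optimization problem, the trade-off the abstract describes can be written as a scalarized objective over candidate instruction prompts. This formalization is an assumption for illustration; the page does not show the paper's exact objective:

```latex
% Assumed scalarization (not necessarily the paper's exact objective):
% \mathcal{P} is the space of candidate anonymization instructions,
% Priv and Util are privacy and utility scores measured on held-out data,
% and \alpha \in [0,1] sets the desired balance.
p^{\star} \;=\; \arg\max_{p \in \mathcal{P}} \; \alpha\,\mathrm{Priv}(p) \;+\; (1-\alpha)\,\mathrm{Util}(p)
```

Sweeping \alpha from 0 to 1 yields different optimal prompts, which corresponds to the "different points along the privacy-utility trade-off frontier" mentioned in the abstract.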

Community

We introduce adaptive text anonymization, a new approach that automatically tailors anonymization strategies to specific privacy and utility requirements instead of relying on fixed, manually designed methods. It uses prompt optimization to generate instructions for language models so they can balance protecting sensitive information with preserving useful content. Evaluated on a benchmark spanning diverse datasets and goals, this approach consistently achieves better privacy-utility trade-offs than existing baselines, works efficiently with open-source models, and uncovers novel anonymization strategies along the trade-off frontier.
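To make the loop concrete, here is a self-contained toy sketch in Python of the scalarized selection above: candidate instructions are scored on a handful of documents with crude privacy and utility proxies, and the weighted objective picks among them. The sensitive-term list, the metrics, the candidate pool, and the rule-based `anonymize` stand-in for a language-model call are all illustrative assumptions, not the paper's implementation.

```python
# Toy sketch of adaptive anonymization via instruction selection.
# Everything here is an illustrative assumption, not the paper's code.

from dataclasses import dataclass

SENSITIVE = {"alice", "london", "acme"}  # toy set of sensitive terms

@dataclass
class Candidate:
    instruction: str  # anonymization instruction handed to the "model"
    privacy: float    # fraction of sensitive tokens redacted
    utility: float    # fraction of all tokens left unchanged

def anonymize(text: str, instruction: str) -> str:
    """Stand-in for an LLM rewrite: redact the terms the instruction names."""
    targets = {w for w in SENSITIVE if w in instruction.lower()}
    return " ".join("[REDACTED]" if tok.lower() in targets else tok
                    for tok in text.split())

def score(instruction: str, docs: list[str]) -> Candidate:
    """Toy privacy/utility evaluation of one instruction on held-out docs."""
    redacted = sensitive = kept = total = 0
    for doc in docs:
        for orig, new in zip(doc.split(), anonymize(doc, instruction).split()):
            total += 1
            kept += new == orig
            if orig.lower() in SENSITIVE:
                sensitive += 1
                redacted += new == "[REDACTED]"
    return Candidate(instruction,
                     redacted / max(sensitive, 1),
                     kept / max(total, 1))

def optimize(pool: list[str], docs: list[str], alpha: float = 0.6) -> Candidate:
    """Pick the instruction maximizing the weighted privacy-utility objective.
    Sweeping alpha traces out different points on the trade-off frontier."""
    return max((score(p, docs) for p in pool),
               key=lambda c: alpha * c.privacy + (1 - alpha) * c.utility)

if __name__ == "__main__":
    docs = ["Alice met the Acme board in London", "Quarterly revenue grew 4%"]
    pool = ["Redact person names: alice",
            "Redact person names and locations: alice london",
            "Redact all identifiers: alice london acme"]
    for alpha in (0.2, 0.8):  # privacy weight
        print(alpha, optimize(pool, docs, alpha))
```

Running it with a low privacy weight (alpha = 0.2) selects the most conservative instruction, while a high weight (alpha = 0.8) selects the most aggressive one: the toy analogue of moving along the trade-off frontier. The paper's framework replaces these stand-ins with actual language-model rewrites, optimized instruction construction, and task-specific privacy and utility evaluations.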
