AURA: LLM Anonymization Against Agentic Re-Identification

Khoury College of Computer Sciences, Northeastern University

Figure 1. AURA overview. Adaptive privacy scope expansion augments a base re-identification profile with transcript-specific quasi-identifiers; AURA then initializes privacy and utility profiles, converges on a mask template for sensitive spans, and reconstructs only those spans. Candidate rewrites are scored by an attribute-inference attacker and a keeper before the final sanitized transcript T* is selected.

Abstract

Agentic LLMs with web search change the threat model for text anonymization: weak contextual cues can become cross-referenceable evidence for re-identification, yet those same details also carry the text's downstream analytic value. Existing defenses remove explicit identifiers, perturb text for formal privacy guarantees, or test rewritten text against inference models without web access, leaving underexplored the operating region between resistance to agentic web-search re-identification and utility retention.

We introduce AURA (Anonymization with Utility-Retention Adaptation), an LLM-powered mask–reconstruct framework that decouples privacy localization from utility-preserving reconstruction and selects candidates with adversarial privacy and utility-retention checks. We evaluate AURA on real-user interview transcripts using re-identification attacks carried out by web-search agents, along with a utility evaluation based on interviewee-profile facts, codebook facts, and the joint contextual utility grid. Our results show that AURA improves the privacy-utility frontier in two ways: its adaptive privacy scope strengthens resistance to agentic re-identification, and its mask–reconstruct anonymization method better preserves contextual utility under a fixed privacy scope.

Method

AURA separates the two pieces of an anonymization decision: where to intervene (privacy localization) and how to rewrite (utility-preserving reconstruction). The pipeline runs in three phases, each producing artifacts the next phase can consume — and each amenable to a different model or human-in-the-loop control.

Phase 0 · Initialization

Run a one-off agentic web-search attacker on the transcript to discover not only the 8 base privacy attributes (Age, Sex, Location, Occupation, Education, Relationship, Income, Place of Birth) but also transcript-specific quasi-identifiers: workflow cues, research-pipeline signatures, tool-stack mentions, domain-practice cues, and institutional context. The outputs are an adaptive privacy scope A, a blacklist B of evidence spans, and a utility insight profile P over 8 dimensions (Theme, Experiential, Affect, Reasoning, Behavioral, Relational, Temporal, Expertise).

Phase 1 · Masking Convergence

An iterative rewriter alternates with LLM-based privacy inference for up to R_mask rounds: each round infers attributes in scope A, rewrites t_i to suppress them, and stops when no attribute can be inferred or the round budget is reached. A diff between the original and the converged rewrite yields a masked template T̂ with [MASK_i] placeholders, a mask map M linking placeholders back to original spans, and seed replacements for the next phase.
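The loop and the diff step can be sketched as follows. This is a minimal illustration, not the authors' implementation: `infer` and `rewrite` are hypothetical callables standing in for the LLM-based privacy inference and the rewriter, the diff is taken over whitespace-separated tokens with Python's standard `difflib`, and the default round budget of 3 is an arbitrary placeholder for R_mask.

```python
import difflib
from typing import Callable


def converge_mask(
    text: str,
    infer: Callable[[str], set[str]],         # hypothetical: in-scope attributes still inferable
    rewrite: Callable[[str, set[str]], str],  # hypothetical: rewrite that suppresses them
    max_rounds: int = 3,                      # stands in for R_mask (value is an assumption)
) -> str:
    """Alternate inference and rewriting until nothing leaks or the budget runs out."""
    current = text
    for _ in range(max_rounds):
        leaked = infer(current)
        if not leaked:
            break
        current = rewrite(current, leaked)
    return current


def build_template(original: str, converged: str):
    """Diff original vs. converged rewrite into (template, mask map, seed replacements)."""
    orig_tokens = original.split()
    conv_tokens = converged.split()
    matcher = difflib.SequenceMatcher(a=orig_tokens, b=conv_tokens, autojunk=False)
    template, mask_map, seeds = [], {}, {}
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            template.extend(orig_tokens[i1:i2])
            continue
        # Any changed region becomes one placeholder. For pure insertions the
        # original span is empty; for pure deletions the seed is empty.
        key = f"[MASK_{len(mask_map)}]"
        mask_map[key] = " ".join(orig_tokens[i1:i2])
        seeds[key] = " ".join(conv_tokens[j1:j2])
        template.append(key)
    return " ".join(template), mask_map, seeds
```

Splitting on whitespace discards the original spacing and line breaks, so a production version would more likely diff at the character or span level; the token-level diff is used here only to keep the sketch short.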

Phase 2 · Reconstruct & Select

For the masked template only, the reconstructor proposes N candidate replacement dictionaries, producing N candidate rewrites. Each candidate is scored by an attacker (privacy severity S and specificity count C) and a keeper (utility loss L). Selection is privacy-first: if any candidate satisfies C ≤ C_max, keep only those candidates and minimize (S, L); otherwise minimize (C, S, L) in order.
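The selection rule can be written as a pair of lexicographic minimizations. The sketch below assumes that "minimize (S, L)" is also meant in order, as the fallback tuple is; the `Candidate` class and its field names are hypothetical and only mirror the symbols S, C, and L used above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    severity: float      # S: privacy severity reported by the attacker
    specificity: int     # C: specificity count reported by the attacker
    utility_loss: float  # L: utility loss reported by the keeper


def select(candidates: list[Candidate], c_max: int) -> Candidate:
    """Privacy-first selection: filter on C <= c_max, then minimize (S, L)."""
    if not candidates:
        raise ValueError("need at least one candidate")
    admissible = [c for c in candidates if c.specificity <= c_max]
    if admissible:
        # Tuples compare lexicographically: severity first, utility loss as tie-break.
        return min(admissible, key=lambda c: (c.severity, c.utility_loss))
    # No candidate meets the specificity cap: fall back to (C, S, L) in order.
    return min(candidates, key=lambda c: (c.specificity, c.severity, c.utility_loss))
```

Under this reading, utility loss only ever breaks ties among candidates that are equally good on privacy, which is what makes the rule privacy-first.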

Variants

We evaluate three privacy-scope variants and multiple LLM backbones:

  • Adaptive privacy AURA — base 8 attributes plus transcript-specific quasi-identifiers discovered by the web-search probe.
  • 8-attribute AURA — the fixed Staab et al. [2025] scope; useful for controlled comparisons.
  • Pure adaptive AURA — directly infers identifiable attributes without the 8-attribute prior.
  • Backbones — API-powered (GPT-5.1 + GPT-4.1) and fully on-device (Qwen3.5-27B, Qwen3.5-35B-A3B).

Benchmarks

We curate 27 re-identifiable interview transcripts from the Anthropic Interviewer dataset, retained only after a verified web-search attack succeeded on the original text. We then evaluate every defense along both privacy and utility axes.

Privacy · Agentic Re-Identification

We re-run the same web-search agent attack on each rewritten transcript and report the number of re-identified individuals out of 27 under three attackers:

  • GPT-5.1 (web search, reasoning high)
  • GPT-5.4-mini (web search, reasoning high)
  • Gemini-3-Flash (web search, reasoning high)

Utility · Three Levels

  • Interviewee-profile facts — 170 validated facts capturing respondent context (occupation, expertise, education, …).
  • Codebook facts — 371 facts from a human-authored codebook with 13 codes across 5 categories.
  • Utility grid (units) — 2,349 (profile × code) joint units; a unit is recovered only when both constituents are recoverable.
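The joint-recovery rule for grid units can be illustrated with a short sketch. It assumes each validated unit is identified by a (profile fact, code fact) pair of hypothetical IDs and computes a plain unweighted rate; any weighting scheme applied in the reported numbers is not reproduced here.

```python
from typing import Hashable, Iterable


def grid_recovery(
    units: Iterable[tuple[Hashable, Hashable]],  # validated (profile_fact_id, code_fact_id) pairs
    recovered_profile: set,                      # profile facts recoverable from the rewrite
    recovered_code: set,                         # codebook facts recoverable from the rewrite
) -> float:
    """Percentage of units whose profile fact AND code fact are both recoverable."""
    units = list(units)
    if not units:
        raise ValueError("need at least one validated unit")
    hits = sum(
        1 for profile_id, code_id in units
        if profile_id in recovered_profile and code_id in recovered_code
    )
    return 100.0 * hits / len(units)
```

Because a unit needs both constituents, losing a single profile fact removes every unit that fact participates in, so grid recovery can fall faster than either fact-level metric on its own.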

Baselines

We compare against baselines spanning NER-based redaction, LLM rewriting, and differentially private text perturbation:

  • Presidio (NER)
  • One-shot rewrite (minimal)
  • One-shot rewrite (detailed)
  • Anonymizer [Staab+ 2025]
  • DP-MLM (ε ∈ {10, 30, 50, 70, 100, 120, 140})

Results

Across three web-search attackers, AURA’s adaptive privacy variants reduce agentic re-identification to 0–5 / 27 transcripts, substantially below NER-based redaction (13–21 / 27) and the prior LLM anonymizer (6–7 / 27), while retaining 74.9–80.3% of unit-level utility-grid information.

Agentic Re-Identification (27 transcripts)

Re-identification count out of 27 under each attacker. Lower is better. Methods on or near the Pareto frontier are bolded.

Method                                  | GPT-5.1 | GPT-5.4-mini | Gemini-3-Flash
AURA variants
AURA (adapt. privacy, Qwen3.5-27B)      | 2 / 27  | 4 / 27       | 0 / 27
AURA (adapt. privacy, Qwen3.5-35B-A3B)  | 2 / 27  | 5 / 27       | 2 / 27
AURA (adapt. privacy, GPT-4.1)          | 2 / 27  | 3 / 27       | 0 / 27
AURA (pure adaptive, GPT-4.1)           | 2 / 27  | 3 / 27       | 2 / 27
AURA (8-attribute, GPT-4.1)             | 6 / 27  | 8 / 27       | 7 / 27
AURA (8-attribute, Qwen3.5-27B)         | 4 / 27  | 7 / 27       | 3 / 27
AURA (8-attribute, Qwen3.5-35B-A3B)     | 2 / 27  | 8 / 27       | 4 / 27
Non-DP baselines
Anonymizer [Staab+ 2025]                | 6 / 27  | 7 / 27       | 7 / 27
Presidio                                | 13 / 27 | 21 / 27      | 17 / 27
One-shot rewrite (minimal)              | 10 / 27 | 14 / 27      | 8 / 27
One-shot rewrite (detailed)             | 15 / 27 | 17 / 27      | 14 / 27
DP-MLM baselines
DP-MLM (ε = 10)                         | 0 / 27  | 0 / 27       | 0 / 27
DP-MLM (ε = 30)                         | 0 / 27  | 0 / 27       | 0 / 27
DP-MLM (ε = 50)                         | 4 / 27  | 4 / 27       | 1 / 27
DP-MLM (ε = 70)                         | 3 / 27  | 3 / 27       | 1 / 27
DP-MLM (ε = 100)                        | 4 / 27  | 5 / 27       | 2 / 27
DP-MLM (ε = 120)                        | 4 / 27  | 4 / 27       | 4 / 27
DP-MLM (ε = 140)                        | 4 / 27  | 4 / 27       | 4 / 27

Utility Preservation

Figure 2. Utility preservation across 27 transcripts. Profile and codebook values are fact-level recoverability; Grid (unit) is the weighted recovery rate over all 2,349 validated profile-code units. For each metric, a dashed horizontal line marks the accuracy of adaptive privacy AURA (Qwen3.5-27B), the AURA variant with the highest contextual utility-grid accuracy.

Utility-grid unit recovery (%) across 27 transcripts — the joint metric over all 2,349 validated profile-code units. Higher is better.

Method                                  | Utility grid (unit, %)
AURA variants
AURA (adapt. privacy, Qwen3.5-27B)      | 80.3
AURA (adapt. privacy, Qwen3.5-35B-A3B)  | 76.7
AURA (adapt. privacy, GPT-4.1)          | 74.9
AURA (pure adaptive, GPT-4.1)           | 71.9
AURA (8-attribute, GPT-4.1)             | 77.1
AURA (8-attribute, Qwen3.5-27B)         | 78.7
AURA (8-attribute, Qwen3.5-35B-A3B)     | 80.2
Non-DP baselines
Anonymizer [Staab+ 2025]                | 72.1
Presidio                                | 96.7
One-shot rewrite (minimal)              | 92.8
One-shot rewrite (detailed)             | 98.2
DP-MLM baselines
DP-MLM (ε = 10)                         | 0.0
DP-MLM (ε = 140)                        | 60.1

Pareto Frontier

Plotting privacy success (100 − re-identification rate, %) against unit utility-grid recovery, AURA’s adaptive variants sit closest to the upper-right corner across all three attackers. The 8-attribute AURA variants cluster on the privacy axis at higher utility than the prior anonymizer, while DP-MLM dominates the high-privacy / low-utility region and one-shot rewriting + Presidio populate the high-utility / high-leakage region.
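The two plotted axes can be reproduced from the tables above. The sketch below converts the GPT-5.4-mini re-identification counts into privacy success and applies a standard dominance check; the function names are illustrative, and the check covers only strict Pareto dominance under this one attacker, so it need not match the set of methods the page describes as "on or near" the frontier.

```python
# (re-identified count out of 27 under GPT-5.4-mini, utility-grid unit recovery in %),
# copied from the two result tables; only methods that appear in both are listed.
RESULTS = {
    "AURA adapt. privacy (Qwen3.5-27B)": (4, 80.3),
    "AURA adapt. privacy (Qwen3.5-35B-A3B)": (5, 76.7),
    "AURA adapt. privacy (GPT-4.1)": (3, 74.9),
    "AURA pure adaptive (GPT-4.1)": (3, 71.9),
    "AURA 8-attribute (GPT-4.1)": (8, 77.1),
    "AURA 8-attribute (Qwen3.5-27B)": (7, 78.7),
    "AURA 8-attribute (Qwen3.5-35B-A3B)": (8, 80.2),
    "Anonymizer": (7, 72.1),
    "Presidio": (21, 96.7),
    "One-shot rewrite (minimal)": (14, 92.8),
    "One-shot rewrite (detailed)": (17, 98.2),
    "DP-MLM (eps=10)": (0, 0.0),
    "DP-MLM (eps=140)": (4, 60.1),
}


def privacy_success(reid_count: int, total: int = 27) -> float:
    """Privacy success in percent: 100 minus the re-identification rate."""
    return 100.0 * (total - reid_count) / total


def pareto_front(points: dict[str, tuple[float, float]]) -> list[str]:
    """Names of points not dominated on (privacy, utility), both higher-is-better."""
    front = []
    for name, (p, u) in points.items():
        dominated = any(
            p2 >= p and u2 >= u and (p2 > p or u2 > u)
            for other, (p2, u2) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


POINTS = {name: (privacy_success(k), u) for name, (k, u) in RESULTS.items()}
FRONT = pareto_front(POINTS)
```

Printing `FRONT` lists the non-dominated methods; swapping in the GPT-5.1 or Gemini-3-Flash counts gives the corresponding fronts for the other two attackers.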

Figure 3. Pareto front for privacy success versus unit utility-grid recovery under GPT-5.4-mini, the strongest attacker in our setting. The plot highlights the middle-ground behavior of adaptive AURA variants relative to DP-MLM's low-utility privacy and the high-utility / high-leakage behavior of lighter rewriting methods. The yellow band shows that the variants with a fixed privacy scope of 8 attributes cluster on the privacy axis.

Qualitative Diff Example

A representative turn-level excerpt. AURA replaces specific entities with category-level terms while preserving first-person voice; the Anonymizer rewrites entire sentences, damaging qualitative flow.

Original (synthetic)

I work in applied sensor physics, specifically on detecting weak environmental fields using tabletop interferometry. We wrote a paper, currently in review, which proposed a generalised mechanism for modelling the noise produced by moving calibration objects.

AURA · adaptive privacy

I work in scientific research that involves studying noise from environmental vibrations on sensitive measurement equipment. We completed a project that proposed a general approach for modeling the noise from moving sources.

Anonymizer baseline

I work in a technical field. A recent project involved analyzing the impact of external factors on data collection tools. I developed a general approach for modeling the influence produced by changing conditions.

Why It Matters

Agentic LLMs collapse the distance between casual contextual cues and Google-able identity evidence. NER-based redaction breaks under stronger attacker models (13/27 → 21/27 re-identifications as we move from GPT-5.1 to GPT-5.4-mini); one-shot LLM rewriting also fails to defend; formal DP can guarantee privacy but at heavy utility cost. AURA shows that an LLM-guided mask–reconstruct loop, paired with proactive web-search re-identification probing, lets practitioners tune the privacy-utility trade-off — and our open-weight Qwen variants make this practical to deploy locally.

We also frame anonymization as multi-stage risk management rather than one-time redaction: inform participants of residual risk, monitor risky attribute types, layer model-side safeguards, and evaluate releases against multiple attacker models before deployment.

BibTeX

@article{li2026aura,
  title   = {LLM Anonymization Against Agentic Re-Identification},
  author  = {Li, Ziwen and Wen, Jianing and Li, Tianshi},
  journal = {arXiv preprint arXiv:2605.30848},
  year    = {2026}
}