Privacy First: How the PII Interceptor Protects Your Data

Every message that flows through your Life Savor agent — whether it's going to an AI model, a skill, another agent, or a system event — passes through the PII Interceptor first. It's not optional. It's not configurable by components. It's a fixed part of the pipeline that protects your personal information before anything else touches it.

Here's how it works and why we built it this way.

The Problem

When you talk to an AI agent, you inevitably share personal information. Your name, email, phone number, address, date of birth: it all comes up naturally in conversation. If that data gets sent to a cloud model, stored in a skill's database, or relayed to another agent, you've lost control of it.

Most platforms treat privacy as an afterthought — a settings toggle or a terms-of-service checkbox. We treat it as architecture.

How the Interceptor Works

The interceptor sits in a fixed position in the message pipeline. Components cannot bypass it, reorder it, or disable it. Every message goes through two detection layers:

Layer 1: Pattern Matching

A regex scanner catches structured PII with known formats:

Credit card numbers
Social Security numbers
Email addresses
Phone numbers
Dates of birth

This is fast and deterministic — no AI model needed.
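
To make Layer 1 concrete, here is a minimal Python sketch of a pattern scanner. The patterns, category names, and the scan function are illustrative assumptions, not Life Savor's actual rules. Real-world formats (international phone numbers, varying card lengths, date styles) need far more care than this.

```python
import re

# Hypothetical, deliberately simplified patterns for illustration only.
PATTERNS = {
    "CREDIT_CARD": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def scan(text):
    """Return (category, start, end, matched_text) tuples, ordered by position."""
    hits = []
    for category, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((category, match.start(), match.end(), match.group()))
    return sorted(hits, key=lambda hit: hit[1])
```

Because each pattern is a plain regular expression, the same input always produces the same hits, which is what makes this layer deterministic.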

Layer 2: AI-Powered Detection

A Named Entity Recognition (NER) model runs locally on your device to catch PII that doesn't follow a fixed pattern:

Names (given names, surnames)
Addresses (cities, streets, building numbers)
Identity documents (driver's license numbers, passport numbers)
Temporal information (dates, times that could identify you)

The NER model supports multiple languages and runs entirely on your hardware. Your text never leaves your device for PII detection.

What Happens When PII Is Found

When the interceptor detects personal information, it doesn't just delete it. It replaces it with a vault tag — a structured token that preserves the semantic meaning without exposing the actual data:

"My email is john@example.com" 
    → "My email is <<PII:EMAIL:agent1,user1,org1,a3f2c1>>"

The vault tag tells the AI model "there's an email address here" without revealing what it is. The model can still reason about the message ("the user shared their email") without ever seeing the actual address.

The original value is stored in your local encrypted vault, tied to your identity. Only you, or a component you have explicitly authorized, can retrieve it.
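
Here is a rough sketch of that replacement step in Python. It assumes the four fields in the example tag are an agent ID, a user ID, an org ID, and a short random reference; the article doesn't spell that out, so treat the layout as a guess. The vault here is a plain dictionary standing in for the real encrypted vault, and tokenize_emails is a hypothetical name.

```python
import re
import secrets

# Simplified email pattern, for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def tokenize_emails(text, vault, agent_id, user_id, org_id):
    """Swap each email address for a vault tag and store the original in `vault`.

    `vault` is an ordinary dict here; a real vault would be encrypted at rest.
    """
    def _swap(match):
        ref = secrets.token_hex(3)  # 6 random hex characters (assumed tag suffix)
        tag = f"<<PII:EMAIL:{agent_id},{user_id},{org_id},{ref}>>"
        vault[tag] = match.group()
        return tag

    return EMAIL_RE.sub(_swap, text)
```

The model only ever sees the returned string. The mapping from tag back to address lives in the vault and nowhere else.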

Security Levels

Not all PII is equally sensitive. A first name is less sensitive than a Social Security number. The interceptor classifies detected PII into security levels:

Level	Label	Examples
0	Open	No PII detected
1	Personal	Names, cities
2	Private	Email addresses, phone numbers
3	Protected	Dates of birth, ages
4	Guarded	Driver's license numbers
5	Critical	SSNs, passport numbers, credit cards
6	Compound	Multiple high-sensitivity items together

These levels control who can de-tokenize (retrieve the original value). Low-sensitivity items can be resolved automatically when you request them. High-sensitivity items require explicit confirmation. Critical items may be permanently tokenized depending on your settings.
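
One way to picture the table is as an ordered enum plus a lookup from PII category to level. In the Python sketch below, the level names and numbers come from the table, but the category keys and the rule for Compound (two or more items at Guarded or above) are assumptions made for the example.

```python
from enum import IntEnum


class SecurityLevel(IntEnum):
    OPEN = 0
    PERSONAL = 1
    PRIVATE = 2
    PROTECTED = 3
    GUARDED = 4
    CRITICAL = 5
    COMPOUND = 6


# Category keys are hypothetical; the level assignments follow the table.
CATEGORY_LEVEL = {
    "NAME": SecurityLevel.PERSONAL,
    "CITY": SecurityLevel.PERSONAL,
    "EMAIL": SecurityLevel.PRIVATE,
    "PHONE": SecurityLevel.PRIVATE,
    "DATE_OF_BIRTH": SecurityLevel.PROTECTED,
    "AGE": SecurityLevel.PROTECTED,
    "DRIVERS_LICENSE": SecurityLevel.GUARDED,
    "SSN": SecurityLevel.CRITICAL,
    "PASSPORT": SecurityLevel.CRITICAL,
    "CREDIT_CARD": SecurityLevel.CRITICAL,
}


def message_level(categories, compound_floor=SecurityLevel.GUARDED):
    """Classify a message by the PII categories detected in it.

    Assumption: "Compound" means two or more items at `compound_floor` or above.
    """
    levels = [CATEGORY_LEVEL[c] for c in categories]
    if sum(1 for level in levels if level >= compound_floor) >= 2:
        return SecurityLevel.COMPOUND
    return max(levels, default=SecurityLevel.OPEN)
```

Using an ordered enum means a de-tokenization policy can be written as simple comparisons, such as "anything at or above Guarded needs confirmation".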

Exfiltration Detection

The interceptor doesn't just look at individual messages — it watches patterns across conversations. If a skill or model starts accumulating multiple types of PII from you (your name in one message, your address in another, your phone number in a third), the interceptor flags it as a potential exfiltration attempt and blocks further data flow.

This catches sophisticated attacks where individually harmless requests combine to build a profile of you.
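
A simplified way to think about this is a per-component tally of the distinct PII categories seen so far. The Python sketch below is an assumption-heavy illustration: the ExfiltrationMonitor class, the category names, and the threshold of two categories are all invented for the example, and the real detection logic is likely more nuanced.

```python
from collections import defaultdict


class ExfiltrationMonitor:
    """Tracks which PII categories each component has seen across messages."""

    def __init__(self, max_categories=2):
        # Hypothetical threshold: more than this many distinct categories is flagged.
        self.max_categories = max_categories
        self._seen = defaultdict(set)  # component id -> set of PII categories

    def allow(self, component_id, categories):
        """Record the categories and return False once the limit is exceeded."""
        seen = self._seen[component_id]
        seen.update(categories)
        return len(seen) <= self.max_categories
```

Each request looks harmless on its own. It is the accumulated set per component that trips the check.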

Every Channel, Every Direction

The interceptor covers all communication paths:

LLM inference — before your message reaches any model
Skill I/O — before data is sent to any skill process
Agent-to-agent — before messages relay between agents
System events — before internal events are logged or transmitted

And it works in both directions:

Inbound: PII is tokenized before reaching the model, then content safety is checked
Outbound: Content safety is checked first, then PII verification ensures no new personal data was introduced
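
The ordering is the important part, so here is a small Python sketch that shows just that. The function names are hypothetical, the individual checks are passed in as callables, and raising an error when new PII shows up in a response is an assumption; the real interceptor might re-tokenize instead.

```python
def process_inbound(message, tokenize_pii, check_safety):
    """Inbound order: tokenize PII first, then run content safety on the result."""
    tokenized = tokenize_pii(message)
    check_safety(tokenized)
    return tokenized


def process_outbound(response, check_safety, find_pii):
    """Outbound order: content safety first, then verify no new PII appeared."""
    check_safety(response)
    leaked = find_pii(response)
    if leaked:
        # Assumed behavior for this sketch: refuse to pass the response along.
        raise ValueError(f"response introduced new PII: {leaked}")
    return response
```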

Per-Component Configuration

Components can't change how they are intercepted, but you can. You set the interception mode per provider or skill:

Full (default) — regex + NER scanning on everything
Regex only — fast pattern matching without the NER model
Disabled — for trusted system components that need raw data (like your local calendar integration)

You can also allowlist specific PII categories per component. If your email skill legitimately needs to see email addresses, you can grant that — but only for that specific skill.
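
As a sketch of how such settings might be represented, the Python below models a mode plus a per-component allowlist. The three modes come from the list above. The class names, component IDs, and category names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    FULL = "full"              # regex + NER
    REGEX_ONLY = "regex_only"  # pattern matching only
    DISABLED = "disabled"      # trusted component receives raw data


@dataclass(frozen=True)
class InterceptionConfig:
    mode: Mode = Mode.FULL
    allowed_categories: frozenset = frozenset()


def categories_to_tokenize(component_id, detected, configs):
    """Return the detected PII categories that must be tokenized for a component.

    Components without an entry in `configs` fall back to the default (Full, no allowlist).
    """
    cfg = configs.get(component_id, InterceptionConfig())
    if cfg.mode is Mode.DISABLED:
        return set()
    return set(detected) - cfg.allowed_categories
```

For example, with an "email-skill" entry that allowlists EMAIL, a message containing an email address and a phone number would still have the phone number tokenized for that skill, and both tokenized for every other component.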

De-tokenization: Getting Your Data Back

When you need the original value (for example, to actually send that email), you request de-tokenization. The system checks:

Are you the owner of this data?
Does the requesting component have permission?
What's the de-tokenization policy for this PII category?

Every de-tokenization is logged in an audit trail. You can see exactly who accessed what, when, and why.
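
Put together, the three checks and the audit trail could look something like the Python sketch below. Everything here is illustrative: the VaultEntry type, the policy values ("auto", "confirm", "deny"), and the audit record fields are assumptions, and this version also logs denied attempts, which the article doesn't specify.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class VaultEntry:
    owner_id: str
    category: str
    value: str


def detokenize(tag, requester_id, component_id, reason, vault,
               permissions, policies, audit_log, confirmed=False):
    """Return the original value if all three checks pass, otherwise None."""
    entry = vault[tag]
    is_owner = entry.owner_id == requester_id                         # check 1
    has_permission = entry.category in permissions.get(component_id, set())  # check 2
    policy = policies.get(entry.category, "deny")                     # check 3
    policy_ok = policy == "auto" or (policy == "confirm" and confirmed)
    granted = is_owner and has_permission and policy_ok

    # Record who asked, for what, when, why, and whether it was granted.
    audit_log.append({
        "who": requester_id,
        "component": component_id,
        "what": tag,
        "when": datetime.now(timezone.utc).isoformat(),
        "why": reason,
        "granted": granted,
    })
    return entry.value if granted else None
```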

Why This Matters

Most AI platforms send your data to the cloud, process it on someone else's servers, and store it in someone else's database. Even "private" modes often just mean "we won't use it for training" — your data still leaves your device.

With Life Savor:

PII detection runs locally on your hardware
Sensitive data is replaced before it reaches any model or skill
Original values are stored in your local encrypted vault
De-tokenization is gated by ownership, component permission, and per-category policy
Every access is audited

Your data stays yours. That's not a feature — it's the architecture.