Prompt Engineering for Privacy: Practical Patterns for Not Leaking PII
Every prompt sent to an LLM is a data egress point. Six concrete patterns for structuring prompts, redacting inputs, and scanning outputs so PII doesn't leak through the model.
PII in Vector Embeddings: A Defense Guide
Embeddings look like 'just numbers', but recent research shows they're partially invertible. A practical guide for teams defending their vector stores against PII recovery attacks.
Building a HIPAA-Compliant Medical Chatbot
Why generic RAG chatbots fail HIPAA — and a step-by-step blueprint for building a medical chatbot that satisfies Safe Harbor at ingestion, retrieval, and inference. With BAA considerations and a self-hosted-LLM alternative.
Building a Privacy-Aware RAG System
RAG pipelines have two distinct PII leak vectors: ingestion and inference. A defense-in-depth blueprint with code, using Philter, Philter AI Proxy, and the rest of the Philterd toolkit.
Beyond Regex: Why General LLMs Fail at PII Discovery
Regex misses context; general LLMs over-redact and burn GPUs. The right answer is hybrid: pattern matching for what's deterministic, specialized AI for what isn't.
Why Using an LLM to Redact PII and PHI is a Bad Idea
We have seen a lot of posts (and you probably have too) on various social media and blogging platforms showing how you can redact text using a large language model (LLM). They present a fairly simple solution to the complex problem of redaction. Can we really just let an LLM handle our text redaction…