Prompt Engineering for Privacy: Practical Patterns for Not Leaking PII
Every prompt sent to an LLM is a data egress point. Six concrete patterns for structuring prompts, redacting inputs, and scanning outputs so PII doesn't leak through the model.
PII in Vector Embeddings: A Defense Guide
Embeddings look like 'just numbers' — but recent research shows they're partially invertible. A practical defense guide for teams running vector stores against PII recovery attacks.
The Ethics of Training: Why We Use Synthetic Data
A privacy tool should never be trained on the very data it's meant to protect. Here's why Philterd's models are built entirely on synthetic data — and what that means for your compliance posture.
Building a HIPAA-Compliant Medical Chatbot
Why generic RAG chatbots fail HIPAA — and a step-by-step blueprint for building a medical chatbot that satisfies Safe Harbor at ingestion, retrieval, and inference. With BAA considerations and a self-hosted-LLM alternative.
Building a Privacy-Aware RAG System
RAG pipelines have two distinct PII leak vectors: ingestion and inference. A defense-in-depth blueprint with code, using Philter, Philter AI Proxy, and the rest of the Philterd toolkit.
Beyond Regex: Why General LLMs Fail at PII Discovery
Regex misses context; general LLMs over-redact and burn GPUs. The right answer is hybrid: pattern matching for what's deterministic, specialized AI for what isn't.
Open Source vs. Black Box: Why You Can't Afford "Trust Me" Privacy
For a CISO, "trust me" is not a strategy. Why auditable open source is the new enterprise standard for PII redaction — and what that means for compliance, vetting, and vendor lock-in.
Why API-Based Redaction is a Security Antipattern
Sending sensitive data to a third-party redaction API creates the security holes you're trying to close. Here's why true data sovereignty requires a self-hosted engine — and how Philter delivers it.
Using an LLM or Pattern-based Rules for PII/PHI Redaction
In our data-driven world, protecting Personally Identifiable Information (PII) and Protected Health Information (PHI) is imperative. Whether you’re securing customer data, complying with regulations like GDPR or HIPAA, or simply aiming for responsible data handling, you need to redact sensitive information effectively. Today, there are two primary approaches: leveraging the…
Philter as an AI Policy Layer
An AI policy layer is an important part of every source of AI-generated text because it inspects that text to prevent sensitive information from being exposed. A policy layer can help remove information such as names, addresses, and telephone numbers from…