We use cookies for analytics and visitor identification to understand how this site is used.
By continuing to browse you accept this use. You can opt out at any time via the
Cookie Settings link in the footer. See our Privacy Policy for details.
Your Cloud. Your Data. Zero-Trust PII Redaction.
Open source PII redaction that runs entirely inside your cloud, built for healthcare, finance, legal, and government workloads.
Teams shipping AI features today are sending customer names, SSNs, and medical records to hosted LLM APIs with every prompt. Philter AI Proxy is open source software that sits in front of your LLM calls and redacts PII before it leaves your network. One URL change in your existing SDK.
PhiSQL →Declarative DSL. Policy-as-code for engineering teams.
Arbiter →Human-in-the-loop review with structured exemption codes.
Philter Scope →Benchmarks precision, recall, and F1 so you can tune with numbers.
The World-Class Open Source Toolkit for PII Privacy
A complete stack for finding, redacting, monitoring, and auditing sensitive data, from low-level libraries to turnkey services. Each project is released under the permissive and business-friendly Apache license and developed in the open on GitHub.
Core redaction
The engine and API that find and redact PII in text.
Turnkey, self-hosted PII redaction with a clean API. Drops into any pipeline that needs sensitive data removed from text, and runs entirely inside your cloud.
Human-in-the-loop PII redaction. Search, review, and override automated detection decisions with structured exemption codes. Built for AI training-data prep and regulated everyday workflows.
Pattern matching handles structured identifiers. Purpose-built AI models handle everything else. Both run entirely inside your cloud.
Data Sovereignty
Philter and the rest of the Philterd toolkit run inside your cloud. Your data never leaves your perimeter, never reaches a third-party API, and never lands in someone else's logs.
Open Source Integrity
Transparency is the only way to verify privacy software. Our core engine is Apache 2.0 licensed, so your engineers can read every line, audit every decision, and extend the stack on their own terms.
Purpose-Built AI
Generic LLMs make poor privacy filters. We train and ship specialized NLP and deep-learning models built specifically for PII and PHI detection. They are accurate, tunable, and operationally affordable at scale.
Philterd provides a zero-trust architecture for HIPAA, GDPR, and CCPA compliance. The discovery engine operates entirely within your infrastructure: 100% data sovereignty, no external API dependencies, no third-party data training.
To satisfy HIPAA Safe Harbor requirements, we pair high-speed pattern matching for structured identifiers with specialized AI models for everything else, capturing all 18 protected identifiers under 45 CFR § 164.514. Healthcare and life-sciences organizations can automate de-identification across massive datasets while preserving the utility the data needs for research and innovation.
Deploy Philter (our turnkey redaction API) into your VPC from the AWS, Google Cloud, or Azure marketplace. Production-ready in minutes; billed through your existing cloud account. The other Philterd tools are not yet on the cloud marketplaces.
One-click Philter deploy into your VPC
Vendor-supported updates
Unified billing via your cloud provider
No procurement contract required
Best for: Teams that want production-ready Philter without managing builds or ops.
Work directly with the people who built the toolkit. Custom NLP models, privacy architecture, embedded engineering, and production deployment with full handoff.
Custom NLP model training
Privacy architecture review
Embedded engineering
Deployment + knowledge transfer
Best for: Healthcare, finance, and government workloads with custom requirements.
PhiSQL is a declarative, SQL-like query language for PII privacy operations across the Philterd toolkit. This post covers the problem it solves, the spec plus reference implementation model, what ships in the v0.1 draft, and what is coming next.
PhEye consolidates all model branches into a single main branch, adds GPU-accelerated Docker images, and ships a one-command smoke test script for every model variant.
Automated redaction handles most of the volume; humans handle the last few percent that automation can't. Arbiter is the open source review surface that bridges the two, built on Philter, designed for AI training data and regulated everyday workflows.
We help teams ship AI safely and build the redaction pipelines that protect sensitive data across the rest of the stack. Work directly with the creators of Philter: privacy infrastructure you own, not a black box you renew every year.
Privacy Architecture
We design end-to-end PII protection for your cloud and AI workloads: data flows, redaction layers, audit trails, and the guardrails that keep them aligned with HIPAA, GDPR, and CCPA.
Custom NLP Models
Off-the-shelf models miss the entities that matter most in your domain. We train specialized PII/PHI detectors on your data, evaluated against precision and recall you can measure.