A redaction policy is the configuration that tells Phileas (and therefore Philter, which embeds it) two things: which types of sensitive information to detect, and how to transform each type when it is found. A policy is a single JSON file. This guide explains how that file is structured so you can read, write, and tune one with confidence.

The schema is the single source of truth for the whole toolkit. The canonical definition is a JSON Schema file in the Phileas repository, and everything else tracks it: the policy library, the Redaction Policy Editor, and the PhiSQL query language all produce or validate against this same shape.

The smallest useful policy

A policy that detects email addresses and redacts them looks like this:

{
  "identifiers": {
    "emailAddress": {
      "emailAddressFilterStrategies": [
        {
          "strategy": "REDACT",
          "redactionFormat": "{{{REDACTED-%t}}}"
        }
      ]
    }
  }
}

When an email address is detected, it is replaced with {{{REDACTED-email-address}}}. The %t placeholder expands to the filter type. That is the entire mental model: an identifiers object naming the things to detect, and for each one, a list of strategies describing what to do.
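
The placeholder expansion described above can be sketched in a few lines of Python. This is only an illustration of the substitution, not the Phileas implementation; the function name expand_redaction_format is hypothetical.

```python
def expand_redaction_format(redaction_format: str, filter_type: str) -> str:
    """Substitute the %t placeholder with the filter type (illustrative only)."""
    return redaction_format.replace("%t", filter_type)


# The format string and filter type are the ones from the policy above.
replacement = expand_redaction_format("{{{REDACTED-%t}}}", "email-address")
print(replacement)  # {{{REDACTED-email-address}}}
```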

The anatomy of a policy

The top level of a policy has a small number of optional sections. In practice most policies only use identifiers.

Field            What it does
identifiers      The core of the policy. Defines which entity types to detect and how to handle each.
config           Global settings: text splitting for large inputs, PDF rendering, post-filters, and analysis options.
ignored          Lists of terms to ignore globally, so a known-safe word is never redacted.
ignoredPatterns  Regex patterns to ignore globally.
crypto           AES encryption settings used by the CRYPTO_REPLACE strategy.
fpe              Format-preserving encryption settings used by the FPE_ENCRYPT_REPLACE strategy.
graphical        Settings for redacting images and PDFs.

Identifiers: what to detect

Each key inside identifiers is an entity type. Phileas ships with a long list, including ssn, creditCard, emailAddress, phoneNumber, date, age, ipAddress, firstName, surname, streetAddress, city, state, zipCode, and url, along with detection driven by AI models (pheyes) and custom dictionaries (dictionaries).

Every identifier shares a set of common controls:

  • enabled to turn the filter on or off
  • priority to order it relative to other filters
  • ignored and ignoredPatterns to suppress specific matches for that one identifier
  • windowSize to tune how much surrounding context the detector considers

Alongside those, each identifier carries its own strategies array, named after the identifier: ssnFilterStrategies, creditCardFilterStrategies, and so on. That array is where you say what happens when the entity is found.
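
As a rough sketch of that naming convention, the Python helper below builds one entry of the identifiers object and derives the strategies key from the identifier name. The helper identifier_entry is hypothetical, and treating enabled as a boolean is an assumption; special identifiers such as AI-model or dictionary filters may be shaped differently, so check the schema before relying on this.

```python
import json


def identifier_entry(name: str, strategies: list[dict], enabled: bool = True) -> dict:
    """Build one entry of the identifiers object (hypothetical helper).

    The strategies key follows the <identifier>FilterStrategies pattern
    seen in ssnFilterStrategies and creditCardFilterStrategies.
    """
    # Assumption: "enabled" takes a JSON boolean.
    return {name: {"enabled": enabled, f"{name}FilterStrategies": strategies}}


policy = {"identifiers": {}}
policy["identifiers"].update(identifier_entry("ssn", [{"strategy": "REDACT"}]))
policy["identifiers"].update(identifier_entry("creditCard", [{"strategy": "LAST_4"}]))

# Print the resulting policy as JSON for inspection.
print(json.dumps(policy, indent=2))
```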

Filter strategies: how to redact

A filter strategy is a single rule for transforming a detected value. The strategy field selects the transformation:

Strategy             Effect
REDACT               Replace the value with a format string (the default).
MASK                 Replace characters with a mask character such as *.
STATIC_REPLACE       Replace with a fixed string you choose.
RANDOM_REPLACE       Replace with a realistic random value of the same type.
LAST_4               Keep only the last four characters (common for card numbers).
TRUNCATE             Keep a leading portion and drop the rest.
ABBREVIATE           Shorten the value.
HASH_SHA256_REPLACE  Replace with a SHA-256 hash of the value.
CRYPTO_REPLACE       Replace with an AES-encrypted value (uses the policy crypto block).
FPE_ENCRYPT_REPLACE  Replace with a format-preserving encrypted value (uses the policy fpe block).

Strategies also accept supporting fields. The most common are redactionFormat (the template for REDACT, which supports %t for the type, %v for the original value, and %l for a label), maskCharacter, staticReplacement, and replacementScope (DOCUMENT or CONTEXT, which keeps the value-to-replacement mapping consistent within a document or within a context).
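
To make a few of these strategies concrete, here is a minimal Python sketch of how a detected value might be transformed. It is a simplified model, not how Phileas itself is implemented: the function apply_strategy is hypothetical, the fallback format string and mask character are assumptions, MASK is modeled as masking every character, and the real hashing strategy may differ in details such as salting. Only five strategies are covered, and %l is left out.

```python
import hashlib


def apply_strategy(value: str, filter_type: str, strategy: dict) -> str:
    """Transform one detected value according to a strategy object (sketch)."""
    kind = strategy.get("strategy", "REDACT")
    if kind == "REDACT":
        # Assumption: fall back to the format used in the smallest policy.
        fmt = strategy.get("redactionFormat", "{{{REDACTED-%t}}}")
        return fmt.replace("%t", filter_type).replace("%v", value)
    if kind == "MASK":
        # Simplification: every character is replaced by the mask character.
        return strategy.get("maskCharacter", "*") * len(value)
    if kind == "STATIC_REPLACE":
        return strategy["staticReplacement"]
    if kind == "LAST_4":
        return value[-4:]
    if kind == "HASH_SHA256_REPLACE":
        return hashlib.sha256(value.encode("utf-8")).hexdigest()
    raise ValueError(f"strategy not covered by this sketch: {kind}")


# Sample values are made up; filter type names other than
# "email-address" are assumptions.
print(apply_strategy("4111111111111111", "credit-card", {"strategy": "LAST_4"}))
print(apply_strategy("555-123-4567", "phone-number",
                     {"strategy": "MASK", "maskCharacter": "#"}))
```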

Because each identifier holds an array of strategies, you can apply different handling under different conditions. The first matching strategy wins, so a more specific conditional strategy can sit ahead of a general fallback.
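
The first-match rule can be sketched as a simple loop. In this Python illustration the conditions are modeled as plain callables, which is a stand-in for the policy's real condition syntax (not shown in this guide); first_matching and the sample rules are hypothetical.

```python
from typing import Callable, Optional

Condition = Callable[[str], bool]


def first_matching(rules: list[tuple[Optional[Condition], dict]],
                   value: str) -> Optional[dict]:
    """Return the first strategy whose condition accepts the value.

    A condition of None means "always applies", which makes it a fallback.
    """
    for condition, strategy in rules:
        if condition is None or condition(value):
            return strategy
    return None


# Specific conditional rule first, general fallback last.
rules = [
    (lambda v: v.startswith("4"), {"strategy": "LAST_4"}),
    (None, {"strategy": "MASK", "maskCharacter": "*"}),
]

print(first_matching(rules, "4111111111111111")["strategy"])  # LAST_4
print(first_matching(rules, "5500000000000004")["strategy"])  # MASK
```

Swapping the two rules would make the fallback match every value, so the conditional rule would never be reached. That is why the specific rule goes first.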

A worked example

Here is a small but realistic policy. It keeps the last four digits of credit cards, masks phone numbers, and redacts SSNs with a labeled format:

{
  "identifiers": {
    "creditCard": {
      "creditCardFilterStrategies": [
        { "strategy": "LAST_4" }
      ]
    },
    "phoneNumber": {
      "phoneNumberFilterStrategies": [
        { "strategy": "MASK", "maskCharacter": "#" }
      ]
    },
    "ssn": {
      "ssnFilterStrategies": [
        { "strategy": "REDACT", "redactionFormat": "[SSN REMOVED]" }
      ]
    }
  }
}

The same policy expressed in PhiSQL is a few readable lines that compile to exactly this JSON:

POLICY example;

REDACT CREDIT_CARD WITH LAST_4;
REDACT PHONE_NUMBER WITH MASK(character='#');
REDACT SSN WITH REDACT(format='[SSN REMOVED]');

How to get started

  1. Start from an existing policy. The policy library has ready-to-use files for HIPAA Safe Harbor, PCI DSS scope reduction, and more. Download one and adjust it.
  2. Or author it readably. Write PhiSQL and compile it, or build the policy visually in the Redaction Policy Editor.
  3. Validate against the schema. Point your editor or CI at the canonical redaction-policy-schema.json so malformed policies are caught before they ship.
  4. Apply it. Save the file and reference it by name from the Philter redaction API, or load it directly in an embedded Phileas instance.
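
Before running full schema validation in step 3, a quick structural pre-check can catch common typos. The Python sketch below is not a substitute for validating against redaction-policy-schema.json: lint_policy is a hypothetical helper that only checks the strategies-key naming convention and the strategy names listed in this guide.

```python
import json

# Strategy names taken from the table in this guide.
KNOWN_STRATEGIES = {
    "REDACT", "MASK", "STATIC_REPLACE", "RANDOM_REPLACE", "LAST_4",
    "TRUNCATE", "ABBREVIATE", "HASH_SHA256_REPLACE", "CRYPTO_REPLACE",
    "FPE_ENCRYPT_REPLACE",
}


def lint_policy(policy: dict) -> list[str]:
    """Return a list of human-readable problems found in a policy (sketch)."""
    problems = []
    for name, body in policy.get("identifiers", {}).items():
        if not isinstance(body, dict):
            continue  # identifiers with other shapes are out of scope here
        expected_key = f"{name}FilterStrategies"
        for key, value in body.items():
            if not key.endswith("FilterStrategies"):
                continue
            if key != expected_key:
                problems.append(f"{name}: expected {expected_key}, found {key}")
            if not isinstance(value, list):
                problems.append(f"{name}: {key} should be an array")
                continue
            for strategy in value:
                kind = strategy.get("strategy", "REDACT")
                if kind not in KNOWN_STRATEGIES:
                    problems.append(f"{name}: unknown strategy {kind}")
    return problems


# A deliberately broken policy: wrong key casing and a misspelled strategy.
broken = json.loads("""
{
  "identifiers": {
    "ssn": {"ssnFilterStrategies": [{"strategy": "REDACT"}]},
    "creditCard": {"creditcardFilterStrategies": [{"strategy": "LAST4"}]}
  }
}
""")

for problem in lint_policy(broken):
    print(problem)
```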

For the exhaustive field-by-field reference and every supported filter strategy option, see the Phileas filter policy documentation.

Frequently asked questions

Where does the canonical schema live?
In the Phileas repository at policy-schema/redaction-policy-schema.json. That JSON Schema file is the single source of truth. Philter, Phileas, the policy library, and PhiSQL all track it.

Do I have to write policies by hand?
No. You can hand-write JSON, generate it from PhiSQL (a readable query language that compiles to this schema), build it visually with the Redaction Policy Editor, or start from a ready-made file in the policy library. Every path produces the same JSON.

What happens if a policy does not match the schema?
Philter and Phileas validate policies on load and reject malformed ones. The policy library also validates every contributed policy against a vendored copy of the schema in CI, so drift is caught before it ships.

How do I control which strategy applies to which entity?
Each identifier carries its own array of filter strategies. The first strategy whose condition matches is applied, so you can, for example, mask most credit cards but keep the last four digits when a condition holds. See the filter strategies section above.