Writing your first redaction policy

This guide takes you from an empty file to a working redaction policy in seven steps. By the end you will have detected an entity, applied the policy with Philter, changed how a value is redacted, added more identifiers, and silenced a false positive. For the field-by-field reference behind every choice here, keep the policy schema guide open alongside this one.

What you will build

A policy that finds email addresses, phone numbers, and Social Security numbers in text and redacts each one. We will start with a single identifier and grow it.

You will need a place to run the policy. The REST examples below assume a Philter instance on localhost:8080. If you are embedding Phileas as a library, the same JSON loads directly in your code.

Step 1: start with an empty policy

A policy is a JSON object. The only part you almost always need is identifiers, the map of things to detect. Start here:

{
  "identifiers": {}
}

This is valid, but it detects nothing. Let us give it something to find.
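
Before adding identifiers, it can help to confirm that the file parses and has the expected top-level shape. The following is a minimal sketch in Python, assuming only what this guide states (a policy is a JSON object with an identifiers map). The function name check_policy_shape is hypothetical, and this is a local sanity check, not Philter's own validation:

```python
import json


def check_policy_shape(policy_text: str) -> int:
    """Parse policy JSON and return how many identifiers it defines.

    Hypothetical local check; it does not replicate Philter's validation.
    """
    policy = json.loads(policy_text)  # raises json.JSONDecodeError on bad JSON
    if not isinstance(policy, dict):
        raise ValueError("a policy must be a JSON object")
    identifiers = policy.get("identifiers", {})
    if not isinstance(identifiers, dict):
        raise ValueError('"identifiers" must be a JSON object')
    return len(identifiers)


empty_policy = '{ "identifiers": {} }'
print(check_policy_shape(empty_policy))  # 0: valid, but nothing to detect
```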

Step 2: detect your first entity

Add an emailAddress identifier with one filter strategy. The strategy says what to do when an email address is detected:

{
  "identifiers": {
    "emailAddress": {
      "emailAddressFilterStrategies": [
        {
          "strategy": "REDACT",
          "redactionFormat": "{{{REDACTED-%t}}}"
        }
      ]
    }
  }
}

The pattern is always the same: an entity key (emailAddress), and inside it a strategies array named after the entity (emailAddressFilterStrategies). The REDACT strategy replaces the value with redactionFormat, where %t expands to the filter type. An email address becomes {{{REDACTED-email-address}}}.
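
The naming pattern is regular enough to generate. The sketch below, in Python, builds the same policy from an entity key and shows how %t substitution produces the final placeholder. The helper names (strategies_key, identifier, expand_format) are hypothetical, and expand_format is only an illustrative stand-in for what Philter does internally:

```python
def strategies_key(entity: str) -> str:
    """Name of the strategies array for an entity key, per the pattern above."""
    return f"{entity}FilterStrategies"


def identifier(entity: str, *strategies: dict) -> dict:
    """Build one entry of the identifiers map."""
    return {entity: {strategies_key(entity): list(strategies)}}


def expand_format(redaction_format: str, filter_type: str) -> str:
    """Illustrative stand-in for %t expansion; not Philter's implementation."""
    return redaction_format.replace("%t", filter_type)


policy = {
    "identifiers": identifier(
        "emailAddress",
        {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"},
    )
}

print(strategies_key("emailAddress"))
# emailAddressFilterStrategies
print(expand_format("{{{REDACTED-%t}}}", "email-address"))
# {{{REDACTED-email-address}}}
```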

Step 3: apply the policy

Save the file as my-first-policy.json, then upload it to Philter and redact some text:

# 1. Save the policy to your Philter instance
curl -X POST http://localhost:8080/api/policies \
     -H "Content-Type: application/json" \
     --data @my-first-policy.json

# 2. Redact text using the policy, referenced by name
curl "http://localhost:8080/api/filter?p=my-first-policy" \
     --data "Reach me at jane@example.com." \
     -H "Content-Type: text/plain"

The response replaces the email address:

Reach me at {{{REDACTED-email-address}}}.

You now have a working redaction policy. Everything from here is refinement.
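
If you would rather call Philter from code than from curl, the two requests above translate directly. The following is a sketch using only the Python standard library. It assumes the same localhost:8080 instance and the same two endpoints as the curl commands, and the function names are hypothetical. It builds the requests without sending them, so it runs without a Philter instance:

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: same instance as the curl examples


def save_policy_request(policy_json: bytes) -> urllib.request.Request:
    """Build the POST that saves a policy (mirrors the first curl call)."""
    return urllib.request.Request(
        f"{BASE_URL}/api/policies",
        data=policy_json,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def filter_request(policy_name: str, text: str) -> urllib.request.Request:
    """Build the POST that redacts text with a named policy (second curl call)."""
    query = urllib.parse.urlencode({"p": policy_name})
    return urllib.request.Request(
        f"{BASE_URL}/api/filter?{query}",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


req = filter_request("my-first-policy", "Reach me at jane@example.com.")
print(req.get_method(), req.full_url)

# To send it against a running instance:
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))
```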

Step 4: change how a value is redacted

REDACT is one of several strategies. Suppose you would rather mask the local part of the address than drop it entirely. Switch the strategy to MASK:

"emailAddressFilterStrategies": [
  {
    "strategy": "MASK",
    "maskCharacter": "*"
  }
]

Each strategy has its own supporting fields. The filter strategies section of the schema guide lists them all, including format-preserving encryption and hashing for when you need the output to stay usable downstream.
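
When you experiment with strategies, it is convenient to derive variants from one base policy instead of editing the file by hand each time. A small Python sketch, assuming the policy shape used in this guide; with_strategy is a hypothetical helper, not part of Philter or Phileas:

```python
import copy


def with_strategy(policy: dict, entity: str, strategy: dict) -> dict:
    """Return a copy of policy where entity uses only the given strategy."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    entry = updated.setdefault("identifiers", {}).setdefault(entity, {})
    entry[f"{entity}FilterStrategies"] = [strategy]
    return updated


redact_policy = {
    "identifiers": {
        "emailAddress": {
            "emailAddressFilterStrategies": [
                {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
            ]
        }
    }
}

mask_policy = with_strategy(
    redact_policy, "emailAddress", {"strategy": "MASK", "maskCharacter": "*"}
)
print(mask_policy["identifiers"]["emailAddress"]["emailAddressFilterStrategies"])
```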

Step 5: add more identifiers

Detecting more entity types means adding more keys to identifiers. Here is the policy with phone numbers and SSNs, each with a strategy that suits it. We keep the last four digits of phone numbers and fully redact SSNs:

{
  "identifiers": {
    "emailAddress": {
      "emailAddressFilterStrategies": [
        { "strategy": "MASK", "maskCharacter": "*" }
      ]
    },
    "phoneNumber": {
      "phoneNumberFilterStrategies": [
        { "strategy": "LAST_4" }
      ]
    },
    "ssn": {
      "ssnFilterStrategies": [
        { "strategy": "REDACT", "redactionFormat": "[SSN REMOVED]" }
      ]
    }
  }
}

Different entities can use different strategies in the same policy. That is the whole point: you decide, per type, how aggressive the handling should be.
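
As a policy grows, a quick summary of which strategy applies to which entity makes it easier to review. The Python sketch below reads the policy from this step and lists the strategies per entity. It assumes the key naming pattern from Step 2, and strategies_by_entity is a hypothetical helper:

```python
import json

POLICY_JSON = """
{
  "identifiers": {
    "emailAddress": {
      "emailAddressFilterStrategies": [
        { "strategy": "MASK", "maskCharacter": "*" }
      ]
    },
    "phoneNumber": {
      "phoneNumberFilterStrategies": [
        { "strategy": "LAST_4" }
      ]
    },
    "ssn": {
      "ssnFilterStrategies": [
        { "strategy": "REDACT", "redactionFormat": "[SSN REMOVED]" }
      ]
    }
  }
}
"""


def strategies_by_entity(policy: dict) -> dict:
    """Map each entity key to the strategy names configured for it."""
    summary = {}
    for entity, config in policy.get("identifiers", {}).items():
        strategies = config.get(f"{entity}FilterStrategies", [])
        summary[entity] = [s["strategy"] for s in strategies]
    return summary


summary = strategies_by_entity(json.loads(POLICY_JSON))
for entity, names in summary.items():
    print(f"{entity}: {', '.join(names)}")
# emailAddress: MASK
# phoneNumber: LAST_4
# ssn: REDACT
```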

Step 6: silence a false positive

Detection is never perfect. Suppose your text frequently mentions a product code that looks like an SSN but is not. Tell the policy to ignore it. The ignored list applies across the whole policy:

{
  "ignored": [
    {
      "terms": ["TEST-00-0000"]
    }
  ],
  "identifiers": {
    "ssn": {
      "ssnFilterStrategies": [
        { "strategy": "REDACT", "redactionFormat": "[SSN REMOVED]" }
      ]
    }
  }
}

Now that term is never redacted, even though it matches the SSN detector. Use ignoredPatterns instead of ignored when you need to match a regex rather than a fixed list of terms.
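
If you collect false positives from review, you can fold them into the policy programmatically. The following Python sketch adds terms to the ignored list in the shape shown above. The helper ignore_terms is hypothetical, and it only edits the JSON structure; how Philter matches the terms is up to Philter:

```python
import copy


def ignore_terms(policy: dict, *terms: str) -> dict:
    """Return a copy of policy with terms added to its ignored list."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    ignored = updated.setdefault("ignored", [])
    if not ignored:
        ignored.append({"terms": []})
    existing = ignored[0].setdefault("terms", [])
    for term in terms:
        if term not in existing:  # avoid duplicate entries
            existing.append(term)
    return updated


ssn_policy = {
    "identifiers": {
        "ssn": {
            "ssnFilterStrategies": [
                {"strategy": "REDACT", "redactionFormat": "[SSN REMOVED]"}
            ]
        }
    }
}

quiet_policy = ignore_terms(ssn_policy, "TEST-00-0000")
print(quiet_policy["ignored"])
# [{'terms': ['TEST-00-0000']}]
```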

Step 7: keep refining

A policy is something you tune over time as you see real text. Two habits help:

Measure it. Philter Scope scores a policy on precision and recall against gold-standard data, so you can see exactly which detector needs work rather than eyeballing a few examples.
Review it like code. A policy is a plain file. Keep it in version control, and review changes the way you review any other change.
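
To make the two metrics concrete: precision is the share of redacted spans that were truly sensitive, and recall is the share of truly sensitive spans that were redacted. The Python sketch below computes both from sets of spans. It is not how Philter Scope is implemented, and the spans are made-up illustrative data, not real results:

```python
def precision_recall(predicted: set, gold: set) -> tuple:
    """Compute (precision, recall) from predicted and gold-standard spans."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Spans as (start, end, type); made-up illustrative data.
gold = {(12, 28, "email-address"), (40, 51, "ssn"), (60, 72, "phone-number")}
predicted = {(12, 28, "email-address"), (40, 51, "ssn"), (80, 91, "ssn")}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
# precision=0.67 recall=0.67
```

Here the policy produced one false positive (the span at 80 to 91) and missed the phone number, so both metrics land at two out of three. Low precision points to a term worth ignoring; low recall points to a missing or too-narrow identifier.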

Where to go next

The policy schema guide is the full reference for every field and strategy.
PhiSQL lets you write the same policies in a few readable lines that compile to this JSON.
The policy library has production-ready HIPAA, PCI DSS, and other policies to adapt instead of starting from scratch.

Frequently asked questions

Do I need a running Philter instance to follow along?

To apply a policy with the REST examples here, yes. You can deploy Philter in a few minutes. If you are embedding Phileas as a library instead, the same JSON policy loads directly in your application; only the way you invoke it differs.

How is a policy named?

By the name you give it when you save it to Philter. You then reference that name with the p parameter on the filter API. As a convention, save the file as <name>.json so the filename and the policy name match.

Is there an easier way than hand-writing JSON?

Yes. Once you understand the shape, you can author policies in PhiSQL (a readable query language that compiles to this JSON), build them visually in the Redaction Policy Editor, or start from a ready-made file in the policy library.

How do I know my policy is actually catching everything?

Measure it. Philter Scope scores a policy on precision and recall against gold-standard test data, so you can tune with numbers instead of guesses. Step 7 above covers the habits that help.