PhEye Update: Unified Branch, GPU Support, and Streamlined Testing

PhEye, our lightweight HTTP service for finding PII and PHI using purpose-built NER models, has shipped a set of changes that simplify how the project is maintained, built, and deployed. This post covers the three highlights: a single unified branch, GPU-accelerated images, and an end-to-end test harness.

One Branch, Five Models

Note: We are constantly working to create new models for additional entity types and languages. See the lenses page for an up-to-date listing of available models.

Until now, each PhEye model variant lived on its own Git branch: hospitals, french-medical, french-persons, medical-conditions, and the base PII model on main. Every change to shared code (Flask routes, health checks, dependency bumps) had to be cherry-picked or merged across all of them. That friction slowed development and made releases error-prone.

Starting with version 1.2.5, all five model variants live on main. Each model is a self-contained Python module under models/ with a standard interface: load(), predict(), a default label set, and a default confidence threshold. The Dockerfile accepts a PHEYE_MODEL build argument that selects which model to bake into the image at build time.
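To make the module contract concrete, here is a minimal sketch of what a model module with this interface could look like. Only load() and predict() come from the description above; the names DEFAULT_LABELS and DEFAULT_THRESHOLD, the Entity fields, the predict() parameters, and the keyword table standing in for a real NER model are assumptions for illustration, so the actual PhEye modules may be shaped differently.

```python
"""Hypothetical sketch of a module under models/ (not PhEye's actual code)."""
from dataclasses import dataclass

# Assumed names for the default label set and default confidence threshold.
DEFAULT_LABELS = ("HOSPITAL",)
DEFAULT_THRESHOLD = 0.5


@dataclass(frozen=True)
class Entity:
    text: str
    label: str
    score: float
    start: int
    end: int


def load():
    """Return the 'model'. A real module would load NER weights here;
    this stand-in is a phrase -> (label, score) lookup table."""
    return {"General Hospital": ("HOSPITAL", 0.9), "clinic": ("HOSPITAL", 0.3)}


def predict(model, text, labels=DEFAULT_LABELS, threshold=DEFAULT_THRESHOLD):
    """Return entities whose label was requested and whose score meets the threshold."""
    found = []
    for phrase, (label, score) in model.items():
        start = text.find(phrase)
        if start != -1 and label in labels and score >= threshold:
            found.append(Entity(phrase, label, score, start, start + len(phrase)))
    return found


model = load()
entities = predict(model, "She was admitted to General Hospital, not the clinic.")
print([e.text for e in entities])
```

Because every module exposes the same two functions and the same two defaults, shared code such as the Flask routes can call any model the same way, which is what makes a single build argument enough to pick one.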

The result is a single linear history, one place to review PRs, and one version string (in app.py) that governs every image tag. Releases are cut from main and produce ten Docker images: five model variants for CPU and five for GPU.

GPU Support

For workloads where inference latency matters, PhEye now publishes GPU-accelerated images alongside the CPU images. The GPU Dockerfile uses pytorch/pytorch:2.1.2-cuda12.1-cudnn8-runtime as its base, giving each container CUDA 12.1, cuDNN 8, and PyTorch out of the box. The application code is identical between CPU and GPU builds; the only difference is the runtime.

A dedicated docker-compose.gpu.yaml mirrors the CPU compose file but adds the NVIDIA device configuration that Docker needs to pass the GPU through:

deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]

Image tags follow a clear convention: philterd/ph-eye:<version>-<model> for CPU and philterd/ph-eye:<version>-<model>-gpu for GPU. The build scripts (build-docker-images.sh and push-docker-images.sh) handle the full matrix automatically, and both accept a --no-gpu flag if you only need CPU images.
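The tag convention can be sketched as a small function that expands a version and a list of models into the full image matrix. This is an illustration only, not the logic of build-docker-images.sh: the function name and the include_gpu parameter are hypothetical, and the model names are assumed to match the service names used in the compose files.

```python
from itertools import product

# Assumed model identifiers; the exact <model> strings used in published
# tags may differ from these.
MODELS = ["pii_base", "hospitals", "medical_conditions", "french_persons", "french_medical"]


def image_tags(version, models, include_gpu=True):
    """Build philterd/ph-eye:<version>-<model>[-gpu] for every model.

    include_gpu=False mirrors the effect of the scripts' --no-gpu flag.
    """
    suffixes = ["", "-gpu"] if include_gpu else [""]
    return [
        f"philterd/ph-eye:{version}-{model}{suffix}"
        for suffix, model in product(suffixes, models)
    ]


tags = image_tags("1.2.5", MODELS)
for tag in tags:
    print(tag)
```

With five models this yields ten tags, CPU first and then GPU, which matches the ten images a release produces.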

Docker Compose for Local Development

Both docker-compose.yaml (CPU) and docker-compose.gpu.yaml (GPU) stand up all five model services at once, each on its own port:

Port  Model               Language  Detects
8001  pii_base            English   Person, place, organization
8002  hospitals           English   Hospital name, room number
8003  medical_conditions  English   Disease, disorder
8004  french_persons      French    Person entities
8005  french_medical      French    Medical conditions

A single docker compose up gives you a full local stack for integration testing against any combination of models.

Smoke Testing with test.sh

The repository includes a test.sh script that exercises every running model service with realistic inputs. It posts JSON payloads to each service’s /find endpoint and asserts that the response contains the expected entity text. For example, it sends “George Washington went to Virginia” to the base PII model and confirms both “George Washington” and “Virginia” appear in the results, and it sends French medical text to the French medical model and confirms entities like “diabete” are detected.

Running the full suite is two commands:

docker compose up -d
./test.sh

If any assertion fails, the script reports which service and which expected entity was missing. It is a fast, repeatable way to verify that a build is healthy before pushing images.
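For readers who want the same kind of check from Python, the idea behind the script can be sketched as follows. This is not a port of test.sh: the request field name "text" is an assumption about the /find payload, the function names are hypothetical, and the check is a plain substring match on the raw response body so that it does not depend on the response schema.

```python
import json
import urllib.request

# (port, input text, expected entity strings). The port is the one listed
# for the base PII model in the compose setup.
CASES = [
    (8001, "George Washington went to Virginia", ["George Washington", "Virginia"]),
]


def missing_entities(response_body, expected):
    """Return the expected strings that do not appear anywhere in the response body."""
    return [entity for entity in expected if entity not in response_body]


def call_find(port, text, host="localhost", timeout=10):
    """POST text to a service's /find endpoint and return the raw response body.

    The {"text": ...} payload shape is an assumption, not a documented format.
    """
    request = urllib.request.Request(
        f"http://{host}:{port}/find",
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def run(cases=CASES, fetch=call_find):
    """Run every case and return failure messages; an empty list means healthy."""
    failures = []
    for port, text, expected in cases:
        for entity in missing_entities(fetch(port, text), expected):
            failures.append(f"port {port}: expected entity {entity!r} not found")
    return failures
```

With the compose stack running, calling run() with its defaults sends the request to the live service; passing a stub as fetch lets the reporting logic be exercised without any containers.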

What This Means for Users

If you pull PhEye images from Docker Hub, nothing changes about how you run them. The existing CPU image tags, the API surface (/find and /status), and the request/response format are all the same; the -gpu tags are additions alongside them. The improvements are behind the scenes: faster release cycles, GPU options for latency-sensitive deployments, and a simpler contribution path for anyone working on the source.

For more details, see the PhEye product page or the PhEye repository on GitHub.