AI-Powered Face Anonymisation For Clinical Data Management

How we built a production-ready video anonymisation pipeline with four deployment modes — from on-premise to real-time — for a regulated content platform.

4

Pipeline modes (on-prem, cloud, hybrid, real-time)

0

Audio frames lost on output

The Challenge

A clinical data management company needed to use real patient-interaction footage for training and analytics — but every frame with an identifiable face was a compliance liability under GDPR, India’s DPDP framework, and tightening US state biometric laws. Manual blurring in After Effects was slow, inconsistent, and couldn’t scale. Off-the-shelf tools kept failing in three ways: missing faces in profile shots, producing flickering output, and silently stripping the audio track.


What We Built

A unified face-anonymisation architecture with four deployment modes, each optimised for a different constraint:

On-premise (MTCNN)
Footage never leaves the network. Zero per-minute cost. Built for medical, defence, and HR content.
Cloud-scale (Rekognition)
Parallel worker threads against AWS. An hour of 4K video processed in minutes, not hours.
Hybrid (dual-detector)
Two independent models; blur anything flagged by either. Near-zero false negatives for regulated submissions.
Real-time (YOLO + MediaPipe)
Live preview with on-the-fly blur controls. Runs on modest hardware. Built for broadcast and events.
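
The hybrid rule ("blur anything flagged by either") amounts to taking the union of two detectors' boxes. Below is a minimal sketch of that idea, not the production code: the `Box` layout, the function names, and the 0.5 IoU threshold used to merge duplicate hits on the same face are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box layout: (x1, y1, x2, y2) in pixels.
Box = Tuple[int, int, int, int]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def union_detections(primary: List[Box], secondary: List[Box],
                     iou_threshold: float = 0.5) -> List[Box]:
    """Keep every face flagged by either detector.

    Boxes that overlap above the (assumed) threshold are treated as the
    same face and widened to cover both, so the blur never shrinks.
    """
    merged = list(primary)
    for box in secondary:
        match = next((i for i, kept in enumerate(merged)
                      if iou(kept, box) >= iou_threshold), None)
        if match is None:
            merged.append(box)  # only the second detector saw this face
        else:
            k = merged[match]
            merged[match] = (min(k[0], box[0]), min(k[1], box[1]),
                             max(k[2], box[2]), max(k[3], box[3]))
    return merged
```

A union trades precision for recall: a false positive from either model still gets blurred, which is usually the acceptable failure direction for regulated footage.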

Technology Used

Backend
  • Python
Computer Vision
  • OpenCV
  • MTCNN
  • YOLOv8
AI/ML Components
  • AWS Rekognition
  • MediaPipe
Video Processing
  • FFmpeg
  • MoviePy

Key Engineering Decisions

  • Detection JSON as a contract — every run produces an auditable, frame-by-frame detection record separate from the rendered video. Compliance teams can review what was detected without re-running the model.
  • Temporal smoothing — when a face is detected in frames N-1 and N+1 but missed in N, the pipeline interpolates. No flicker. Continuous output.
  • Audio preservation — original audio track preserved bit-for-bit through every pipeline mode. No silent clips, no re-sync workflow.
  • Configurable output — blur strength, pixelation density, mask shape, and anonymisation style are all parameters. Brand-safety and editorial teams get different settings from the same engine.
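
The first two decisions can be sketched together: fill a single missed frame from its neighbours, then write the result to a detection record that marks which boxes were interpolated. This is an illustrative sketch under stated assumptions. The record fields (`source`, `frame`, `box`, `interpolated`), the function names, and the midpoint interpolation are hypothetical; the real schema and smoothing method are not described here.

```python
import json
from typing import List, Optional

# Hypothetical box layout: [x1, y1, x2, y2] in pixels; None means "no detection".
Box = List[int]


def fill_single_frame_gaps(track: List[Optional[Box]]) -> List[Optional[Box]]:
    """Fill frame N when N-1 and N+1 both have a box but N does not.

    Reads only from the original track, so longer gaps are left untouched
    rather than being filled by cascading guesses.
    """
    filled = list(track)
    for n in range(1, len(track) - 1):
        prev_box, next_box = track[n - 1], track[n + 1]
        if track[n] is None and prev_box is not None and next_box is not None:
            # Midpoint of the neighbouring boxes, coordinate by coordinate.
            filled[n] = [round((p + q) / 2) for p, q in zip(prev_box, next_box)]
    return filled


def build_detection_record(source: str, raw: List[Optional[Box]],
                           filled: List[Optional[Box]]) -> str:
    """Serialise a frame-by-frame record (assumed schema) as stable JSON."""
    frames = [
        {"frame": i, "box": box,
         "interpolated": raw[i] is None and box is not None}
        for i, box in enumerate(filled)
    ]
    return json.dumps({"source": source, "frames": frames},
                      indent=2, sort_keys=True)
```

Flagging interpolated boxes in the record lets a reviewer tell model output apart from smoothing output without re-running detection.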

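One common way to carry an audio track through unchanged is to remux it with FFmpeg's stream copy instead of re-encoding. The sketch below only builds the command line. The function name and file names are hypothetical, and whether the production pipeline uses this exact approach is an assumption.

```python
from typing import List


def build_remux_command(blurred_video: str, original: str, output: str) -> List[str]:
    """Build an ffmpeg command that pairs the blurred video with the original audio."""
    return [
        "ffmpeg", "-y",
        "-i", blurred_video,   # input 0: anonymised frames, already encoded
        "-i", original,        # input 1: untouched source clip
        "-map", "0:v:0",       # take video from the blurred file
        "-map", "1:a?",        # take audio from the source; "?" tolerates silent clips
        "-c:v", "copy",        # no second video encode
        "-c:a", "copy",        # stream copy: audio packets are not re-encoded
        output,
    ]
```

The list can be passed to `subprocess.run(cmd, check=True)`. Stream copy only works when the output container supports the source audio codec, so that needs checking per format.
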
Results

  • Unlocked use of real patient-interaction footage for training — previously shelved for compliance reasons
  • Turnaround from raw footage to anonymised, audit-ready output dropped from days to minutes
  • Deterministic re-runs: same input + same detection JSON = byte-identical output, satisfying regulatory audit requirements
  • Single architecture serves four deployment contexts without code duplication
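
A byte-identical claim is straightforward to check by hashing the outputs of two runs. The following is a minimal sketch of such a check. The function names are hypothetical, and SHA-256 is an assumed choice of digest rather than something the pipeline is known to use.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large videos never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def outputs_match(first_run: str, second_run: str) -> bool:
    """True when two rendered files are byte-identical (by SHA-256)."""
    return sha256_of_file(first_run) == sha256_of_file(second_run)
```

Storing the digest next to the detection JSON gives an auditor a fixed value to compare against on any later re-run.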

"Bring us a clip; we'll show you what comes back. Faces, gone. Audio, intact. Deadlines, met."

Want to know more? Book a free 30-minute consultation
