Log redaction seems like a solved problem. Add a few regex patterns, configure your SIEM to strip known-sensitive fields, and move on. For a small team running one service, this actually works reasonably well.

At scale — multiple teams, microservices, mixed regulatory requirements — it falls apart in predictable ways. Understanding the failure modes makes it easier to design around them.

Failure Mode 1: Pattern Drift

Regex-based redaction depends on field name conventions staying consistent across time and teams. In practice, they don't. A field called email gets refactored to contactEmail, then userContact.email, then ends up nested inside a serialized DTO. Each rename is a potential gap.

Pattern drift is silent. You won't know a field stopped being redacted until someone queries the logs and finds raw PII — or until an auditor does.
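
To make the drift concrete, here is a minimal sketch of a name-based rule. The `NameBasedRedactor` class and its pattern are hypothetical, written only for illustration; they assume a rule authored back when the field was still called email.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NameBasedRedactor
{
    // Hypothetical rule, written when the field was still named "email".
    private static readonly Regex SensitiveName =
        new Regex("^(email|ssn|password)$", RegexOptions.IgnoreCase);

    // Masks values whose property name matches the rule; everything else passes through.
    public static Dictionary<string, string> Redact(IReadOnlyDictionary<string, string> properties)
    {
        var result = new Dictionary<string, string>();
        foreach (var pair in properties)
        {
            result[pair.Key] = SensitiveName.IsMatch(pair.Key) ? "[REDACTED]" : pair.Value;
        }
        return result;
    }
}
```

A property named email is masked. After the rename, contactEmail and userContact.email no longer match the rule, so their values pass through unchanged, and nothing in the pipeline raises an error.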

Failure Mode 2: Semantic Blindness

Redaction rules operate on field names and value patterns. They have no knowledge of what a field means in your domain. A field named ref could be a document reference, a product SKU, or a patient identifier — and a pattern-matching rule can't tell the difference without explicit enumeration.

This forces teams into an arms race: enumerate every possible field name that could carry sensitive data, across every service, and maintain that list as the system evolves. The maintenance overhead is often underestimated until it becomes a quarterly audit exercise.

Failure Mode 3: Exception Objects

Exception logging is one of the most common sources of unintended PII in application logs. When an exception is thrown during request processing, its message, its Data dictionary, or its inner exceptions often carry fragments of the request context: headers, query parameters, and body fields.

// This looks innocent
try
{
    // ... request handling that may throw ...
}
catch (Exception ex)
{
    _logger.LogError(ex, "Failed to process request for {UserId}", userId);
}

// But ex.Message, ex.Data, or ex.InnerException may contain:
// - Request body fields (email, SSN, card number)
// - Response payloads from upstream services
// - Values quoted by lower layers (for example, a database constraint
//   error that includes the offending key)

Most redaction rules don't parse exception objects. They look at the structured log properties defined by the developer — but not at the serialized exception data that the logging framework appends automatically.
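
One way to narrow this gap is to log a reduced description of the exception instead of the object itself. The sketch below is an assumption for illustration, not a prescribed method: the hypothetical `SafeExceptionFormatter` keeps only the chain of exception type names and deliberately drops Message and Data, trading diagnostic detail for safety.

```csharp
using System;
using System.Collections.Generic;

public static class SafeExceptionFormatter
{
    // Returns the exception type chain (outermost first).
    // Message and Data are intentionally never read, since they may embed request values.
    public static string Describe(Exception ex)
    {
        var parts = new List<string>();
        for (Exception? current = ex; current != null; current = current.InnerException)
        {
            var type = current.GetType();
            parts.Add(type.FullName ?? type.Name);
        }
        return string.Join(" -> ", parts);
    }
}
```

A call site would then pass `SafeExceptionFormatter.Describe(ex)` as an ordinary log property and omit the exception argument. Whether losing the message is acceptable depends on how much the team relies on it for debugging; a middle ground is to route the full exception to a separate, access-restricted sink.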

Failure Mode 4: Third-Party Logging

Modern applications include dozens of third-party packages. Many of them log internally, through the same ILoggerFactory that is registered in the application's DI container. Beyond category and level filters, you have little control over what they log, and their log events flow through your pipeline and out to your sinks.

Redaction rules that only cover your own codebase leave a systematic gap here. An HTTP client library that logs request details, an auth middleware that logs user context, or an ORM that logs query parameters can all introduce PII into your log stream without any action from your developers.
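
A hedged sketch of one possible response: treat the log category as the trust boundary and mask property values from any source that has not been reviewed. The `LogEvent` record, the `SourcePolicy` class, and the "MyCompany." prefix are all hypothetical stand-ins for whatever event type and namespaces the real pipeline uses.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified event shape; a real pipeline would adapt its own type.
public sealed record LogEvent(
    string Category,
    string Template,
    IReadOnlyDictionary<string, object?> Properties);

public static class SourcePolicy
{
    // Assumption: only categories under these prefixes have had their properties reviewed.
    private static readonly string[] ReviewedPrefixes = { "MyCompany." };

    public static LogEvent Apply(LogEvent logEvent)
    {
        foreach (var prefix in ReviewedPrefixes)
        {
            if (logEvent.Category.StartsWith(prefix, StringComparison.Ordinal))
            {
                return logEvent; // reviewed source: pass through unchanged
            }
        }

        // Unreviewed source: keep the template and property names, mask every value.
        var masked = new Dictionary<string, object?>();
        foreach (var key in logEvent.Properties.Keys)
        {
            masked[key] = "[REDACTED]";
        }
        return logEvent with { Properties = masked };
    }
}
```

The design choice here is default-deny by source: a newly added package is masked until someone reviews it, rather than leaking until someone notices.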

What Scale-Resistant Redaction Requires

The common thread across all four failure modes is that redaction runs downstream of the application code, on output that has already lost its domain context. A redaction system that can survive at scale needs to:

Operate in-process, before log events are serialized and emitted to any sink.
Be aware of your domain model and field semantics, not just field names.
Cover all log sources in the process — including third-party packages.
Apply policy automatically to new fields and endpoints without developer action.
Be auditable: you need to be able to prove what was redacted, when, and why.

Governance at the emission layer — not the sink layer — is the only approach that satisfies all five of these requirements simultaneously.
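
As a rough illustration of what emission-layer governance could look like, the sketch below classifies properties on the domain type itself and redacts anything unclassified by default, recording each redaction for audit. Every name in it (`DataClass`, `ClassifiedAttribute`, `RedactionRecord`, `EmissionPolicy`, `PatientVisit`) is hypothetical; it is a sketch under those assumptions, not a reference implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public enum DataClass { Public, Personal }

// Hypothetical marker that attaches a classification to a domain property.
[AttributeUsage(AttributeTargets.Property)]
public sealed class ClassifiedAttribute : Attribute
{
    public ClassifiedAttribute(DataClass level) => Level = level;
    public DataClass Level { get; }
}

public sealed record RedactionRecord(string TypeName, string Property, string Reason);

public static class EmissionPolicy
{
    // Projects an object into loggable properties before anything is serialized.
    // Only properties explicitly marked Public keep their values.
    public static Dictionary<string, object?> Project(object value, List<RedactionRecord> audit)
    {
        var output = new Dictionary<string, object?>();
        var type = value.GetType();
        foreach (var prop in type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (prop.GetIndexParameters().Length > 0) continue; // skip indexers

            var tag = prop.GetCustomAttribute<ClassifiedAttribute>();
            if (tag is { Level: DataClass.Public })
            {
                output[prop.Name] = prop.GetValue(value);
            }
            else
            {
                output[prop.Name] = "[REDACTED]";
                audit.Add(new RedactionRecord(type.Name, prop.Name,
                    tag is null ? "unclassified" : "personal"));
            }
        }
        return output;
    }
}

// Hypothetical domain type. Notes was added later and never classified.
public sealed class PatientVisit
{
    [Classified(DataClass.Public)] public string Clinic { get; init; } = "";
    [Classified(DataClass.Personal)] public string Ref { get; init; } = "";
    public string Notes { get; init; } = "";
}
```

Projecting a `PatientVisit` keeps Clinic, masks Ref because it is marked Personal, and masks Notes because nobody classified it. The audit list then holds one record per masked property, which is the raw material for showing what was redacted and why.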

The Uncomfortable Truth

Most teams are running redaction pipelines that were designed for their system as it existed two or three years ago. The system has grown; the redaction rules haven't kept pace. This isn't a failure of diligence. It's a structural consequence of the wrong architectural model.

The right question isn't "how do we maintain better redaction rules?" It's "why are redaction rules something we need to maintain at all?"
