Downstream Masking Is Not Enough: Secure Data at the Source
The Problem
During API outages, sensitive information such as tokens and email addresses often ends up in:
- App logs
- Sidecar buffers
- Kafka/SQS queues on the way to the SIEM
- Crash reports
- Slack alerts
Even with masking in the SIEM, the raw data has already been copied to these places, where it can be read without authorization.
Why It Happens
- Under incident pressure, quick debugging comes first and filtering is deferred.
- Teams trust the log platform too much and overlook the path data takes to reach it.
- Data persists in agents and backups.
- Libraries often log details by default.
- Data ownership is unclear.
Why Fixes Fail
- Regex redaction misses any pattern it was not written for, so those values still leak.
- Masking at query time does not protect data that has already been transported and stored.
- Vendor filters are limited; raw data persists.
- Data Loss Prevention (DLP) systems detect leaks too late.
- JSON logs blend structured fields with unstructured free text.
- During incidents, data gets shared widely with little control over where it goes.
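
To make the first failure mode concrete, here is a minimal sketch of a downstream regex rule that masks one token format and misses others. The rule, the sample lines, and the `redact` helper are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical downstream rule: mask "Bearer <token>" and nothing else.
BEARER = re.compile(r"Bearer\s+[A-Za-z0-9._-]+")


def redact(line: str) -> str:
    """Apply the single redaction rule to one log line."""
    return BEARER.sub("Bearer [REDACTED]", line)


# Made-up log lines carrying the same kinds of secrets in different shapes.
SAMPLE_LINES = [
    "auth header: Bearer abc123.def456",
    "retrying GET /v1/user?access_token=abc123.def456",
    '{"msg": "login failed", "email": "ana@example.com"}',
]

for line in SAMPLE_LINES:
    print(redact(line))
```

Only the first line is masked. The token in the query string and the email address pass through unchanged because the rule was never written for them.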
Core Issue
Downstream solutions act too late: by the time they run, the data has already been written, copied, and accessed.
Effective Approach: Source-First Telemetry Hygiene
Manage data at creation with strict, auditable exceptions.
Telemetry Contract:
- Identify sensitive data like tokens and identifiers.
- Define allowed fields for each event.
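
One way to express such a contract is as plain data that code and reviewers can both read. The sketch below is an assumption about shape, not a prescribed format: the event name, the field names, `SENSITIVE_KEYS`, `EventContract`, and `violations` are all hypothetical.

```python
from dataclasses import dataclass

# Assumption: field names that must never appear in telemetry.
SENSITIVE_KEYS = frozenset({"token", "password", "email", "authorization"})


@dataclass(frozen=True)
class EventContract:
    """Declares which fields a named event is allowed to carry."""
    name: str
    allowed_fields: frozenset


# Hypothetical contract registry with a single example event.
CONTRACTS = {
    "payment.failed": EventContract(
        name="payment.failed",
        allowed_fields=frozenset({"request_id", "user_id", "error_code"}),
    ),
}


def violations(event_name: str, fields: dict) -> list:
    """Return human-readable contract violations for one event."""
    contract = CONTRACTS.get(event_name)
    if contract is None:
        return [f"unknown event: {event_name}"]
    problems = []
    for key in sorted(fields):
        if key in SENSITIVE_KEYS:
            problems.append(f"sensitive field: {key}")
        elif key not in contract.allowed_fields:
            problems.append(f"field not in contract: {key}")
    return problems
```

An empty list means the event matches its contract; anything else can be rejected by the logging API or surfaced in review.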
Safe Logging APIs:
- Develop language-specific logging interfaces that:
  - Default to structured events.
  - Allowlist and redact fields at the source.
  - Handle exceptions centrally and strip sensitive data from them.
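
A safe logging wrapper along these lines might look like the sketch below, built on Python's standard `logging` and `json` modules. The allowlist contents and the `build_event` and `log_event` helpers are hypothetical. Non-allowlisted fields are replaced with a placeholder rather than dropped, and exceptions are reduced to their type name, on the assumption that exception messages may embed tokens or payloads.

```python
import json
import logging

# Assumption: a hypothetical allowlist; real names would come from the contract.
ALLOWED_FIELDS = frozenset({"request_id", "user_id", "error_code", "status"})
REDACTED = "[REDACTED]"


def build_event(name, fields, exc=None):
    """Build a structured event; only allowlisted fields keep their values."""
    event = {
        key: (value if key in ALLOWED_FIELDS else REDACTED)
        for key, value in fields.items()
    }
    if exc is not None:
        # Keep the exception type only; the message is deliberately discarded.
        event["exception_type"] = type(exc).__name__
    # Set last so a caller-supplied "event" field cannot overwrite the name.
    event["event"] = name
    return event


def log_event(logger, name, fields, exc=None):
    """Emit the event as one JSON line through the given logger."""
    payload = build_event(name, fields, exc)
    logger.info(json.dumps(payload, sort_keys=True, default=str))


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    log = logging.getLogger("checkout")
    try:
        raise ValueError("bad token abc123")
    except ValueError as err:
        log_event(log, "login.failed",
                  {"request_id": "r-1", "token": "abc123"}, exc=err)
```

Redacting instead of dropping keeps the attempted field visible, which makes contract violations easier to spot. Dropping the field entirely is an equally valid choice.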
Enforce via CI/CD:
- Use static analysis to flag risks.
- Validate schemas and review changes.
- Implement policy-as-code.
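
As a rough illustration of the static-analysis step, the sketch below uses Python's standard `ast` module to flag logging-style calls whose positional arguments reference variables with sensitive-looking names. The name list and `find_risky_log_calls` are hypothetical, and the check is a heuristic: it matches any method named like a log method and does not inspect keyword arguments.

```python
import ast

LOG_METHODS = {"debug", "info", "warning", "error", "exception", "critical"}
# Assumption: variable names treated as sensitive for this check.
SENSITIVE_NAMES = {"token", "password", "email", "secret"}


def find_risky_log_calls(source: str):
    """Return (line number, sensitive names) for each suspicious log argument."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        is_method_call = isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)
        if not is_method_call or node.func.attr not in LOG_METHODS:
            continue
        for arg in node.args:
            # Collect every variable name used inside this argument,
            # including names interpolated into f-strings.
            names = {n.id for n in ast.walk(arg) if isinstance(n, ast.Name)}
            hits = sorted(names & SENSITIVE_NAMES)
            if hits:
                findings.append((node.lineno, hits))
    return findings


EXAMPLE_SOURCE = '''import logging
log = logging.getLogger("auth")
def handle(token, request_id):
    log.info(f"auth failed for {token}")
    log.info("auth failed", extra={"request_id": request_id})
'''

if __name__ == "__main__":
    for lineno, names in find_risky_log_calls(EXAMPLE_SOURCE):
        print(f"line {lineno}: logs sensitive name(s) {names}")
```

A CI job could run a check like this over changed files and fail the build when the findings list is not empty.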
Control Debugging:
- Allow debug sessions only as an opt-in, audited exception.
- Track the details of each session.
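
An audited, time-limited debug session could be modeled roughly as follows. Everything here is an assumption made for illustration: the `DebugSession` fields, the 15-minute default, and the in-memory `AUDIT_LOG` list, which stands in for whatever audit store is actually used.

```python
import time
from dataclasses import dataclass, field


@dataclass
class DebugSession:
    """An opt-in debug window tied to a user, a reason, and an expiry."""
    user: str
    reason: str
    ttl_seconds: float
    started_at: float = field(default_factory=time.time)

    def is_active(self, now=None) -> bool:
        """True while the session has not yet expired."""
        now = time.time() if now is None else now
        return now < self.started_at + self.ttl_seconds


# Stand-in for a real audit trail (hypothetical; in-memory only).
AUDIT_LOG = []


def open_debug_session(user: str, reason: str, ttl_seconds: float = 900) -> DebugSession:
    """Open a session and record who opened it, why, and for how long."""
    if not reason.strip():
        raise ValueError("a debug session needs a stated reason")
    session = DebugSession(user=user, reason=reason, ttl_seconds=ttl_seconds)
    AUDIT_LOG.append({
        "user": user,
        "reason": reason,
        "started_at": session.started_at,
        "ttl_seconds": ttl_seconds,
    })
    return session
```

Verbose logging code would then check `session.is_active()` before emitting anything beyond the normal allowlist, so the extra detail switches off on its own when the window closes.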
Secure the Path:
- Use in-memory buffers; encrypt storage.
- Minimize data exposure in alerts.
- Enforce short data retention.
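
For the alerting point, one simple approach is to build alert text from a fixed set of identifier fields instead of forwarding the whole event. The field list and the `alert_text` helper below are hypothetical.

```python
# Assumption: the only fields an alert message is allowed to show.
ALERT_FIELDS = ("event", "service", "request_id", "error_code")


def alert_text(event: dict) -> str:
    """Render an alert from identifier fields only; everything else is left out."""
    parts = [f"{key}={event[key]}" for key in ALERT_FIELDS if key in event]
    return " ".join(parts)
```

Whoever receives the alert can use the request ID to look up details in a system with proper access control, instead of reading them in a chat channel.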
Downstream Masking:
- Keep it as a backup layer that protects what readers see.
Quarterly Steps
Quarter 1:
- Implement static checks and enhance tracing.
Quarter 2:
- Audit default-logging frameworks and adopt schema checks.
Ideal Results
- Prevent accidental sensitive logging.
- Require review for new events.
- Use IDs for debugging, not full data.
- Limit the sensitive telemetry that reaches the SIEM.
Challenges
- Requires initial developer effort.
- Third-party libraries might bypass controls.
- Access to debugging has to be limited.
Cost Efficiency
- Filtering at the source is more cost-effective than masking downstream, and it improves indexing.
Monitoring
- Track the rate of PII found in telemetry, break-glass debug usage, and how often downstream masks activate.
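
As a rough sketch of the first metric, the function below estimates a PII rate as the share of sampled log lines that contain an email-like string. The pattern and the `pii_rate` helper are hypothetical, and a real detector would cover more than email addresses.

```python
import re

# Assumption: a simple email-like pattern standing in for a fuller PII detector.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pii_rate(lines) -> float:
    """Fraction of lines with at least one email-like match (0.0 if no lines)."""
    lines = list(lines)
    if not lines:
        return 0.0
    hits = sum(1 for line in lines if EMAIL.search(line))
    return hits / len(lines)
```

Tracked over time on a sample of post-filter telemetry, a rate that trends toward zero suggests source controls are working, while a jump points at a new or changed event.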
Rollout Advice
- Start with the services that are debugged most often, and turn on blocking early.
Cerbi's Contribution
Cerbi offers tools for enforcing these controls at the source.
Conclusion
Start controlling and auditing data right at the source. Downstream masking isn’t enough; keep logs secure and practical.