PII-Shield 🛡️
Zero-code log sanitization sidecar for Kubernetes. Prevents data leaks (GDPR/SOC2) by redacting PII from logs before they leave the pod.
"Don't let PII poison your AI models." PII-Shield ensures that sensitive data never reaches your training dataset, saving you from GDPR-forced model retraining.
[!WARNING] Upgrading to v2.0.0? We have moved end-user distribution to Helm-based installs and Distroless Native Sidecars. Kustomize is no longer a supported release installation path for production users, though the operator repository still keeps Kustomize scaffolding for local development and manifest generation.
/bin/sh access inside the PII-Shield sidecar is no longer supported. Read the Migration Guide.
Two Deployment Models
PII-Shield offers two distinct ways to integrate into your stack:
- Kubernetes Operator (Zero-code): Our flagship deployment model. A fully automated K8s Operator that injects a highly-secure Distroless Sidecar into your pods to intercept and sanitize logs on the fly.
- In-Process WASM (For core integrations): For extreme performance, the core engine can be embedded directly via WASM, providing <1ms latency without network hops.
Project Status & Roadmap
PII-Shield is an actively developed open-source security tool in a production-hardening phase. The v2.x release line ships usable CLI, container, Helm/operator, and WASM SDK artifacts. Core redaction paths are ready for controlled deployments, while some Kubernetes deployment modes and supply-chain guarantees are still being stabilized.
| Component | Status |
|---|---|
| Core scanner | Released / controlled deployments |
| CLI sidecar | Released / controlled deployments |
| Kubernetes operator | Stabilization phase |
| WASM SDKs | Released beta |
| Proxy-Wasm gateway integration | Planned R&D |
| Control Plane UI | Planned R&D |
| eBPF interception | Experimental R&D |
See KNOWN_LIMITATIONS.md for the current production-hardening boundaries.
Why PII-Shield?
Developers often forget to mask sensitive data. Traditional regex filters in Fluentd/Logstash are slow, hard to maintain, and consume expensive CPU on log aggregators.
PII-Shield sits right next to your app container:
- Production-hardening Core Engine: Optimized for Kubernetes sidecars with low memory allocations on hot paths and deterministic regex matching.
- Context-Aware Entropy Analysis: Detects high-entropy secrets even without keys (e.g. Error: ... 44saCk9...) by analyzing context keywords.
- Custom Regex Rules: Deterministic redaction for structured data (UUIDs, IDs) that overrides entropy checks for known patterns.
- Regression & Fuzz Coverage: Tested against stress cases including binary garbage, JSON nesting, and multilingual logs.
- Deterministic Hashing: Replaces secrets with unique hashes (e.g., [HIDDEN:a1b2c]), allowing QA to correlate errors without seeing the raw data.
- Drop-in: No code changes required. Works with any language (Node, Python, Java, Go).
- Whitelist Support: Explicitly allow safe patterns (e.g., git hashes, system IDs) using PII_SAFE_REGEX_LIST to prevent false positives.
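The deterministic hashing idea from the list above can be illustrated with a short Go sketch. It is not PII-Shield's actual implementation: it assumes HMAC-SHA256 keyed with the salt and a truncated hex digest, and the helper name `redactToken` and the tag length are hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// redactToken is a hypothetical helper that maps a secret to a short, stable tag.
// Assumption: HMAC-SHA256 keyed with the salt, truncated to tagLen hex characters.
func redactToken(salt []byte, secret string, tagLen int) string {
	mac := hmac.New(sha256.New, salt)
	mac.Write([]byte(secret))
	digest := hex.EncodeToString(mac.Sum(nil))
	if tagLen < 0 {
		tagLen = 0
	}
	if tagLen > len(digest) {
		tagLen = len(digest)
	}
	return "[HIDDEN:" + digest[:tagLen] + "]"
}

func main() {
	salt := []byte("example-salt") // illustrative value; a real deployment would set PII_SALT
	a := redactToken(salt, "MySecretPass123!", 6)
	b := redactToken(salt, "MySecretPass123!", 6)
	// The same secret and salt always yield the same tag, so two log lines
	// can be correlated without exposing the raw value.
	fmt.Println(a == b)
	fmt.Println(a)
}
```

Keying the hash with a secret salt means that someone who only sees the logs cannot confirm a guessed value by hashing it themselves, which is presumably why a custom salt is required for production.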
Managing PII-Shield across dozens of clusters?
We are building a hosted Control Plane with centralized rule management, Slack alerting, and redaction analytics.
Trusted By
GuardSpine (AI Governance Kernel) integrated PII-Shield's In-Process WASM to sanitize sensitive evidence trails directly within their Node.js and Python agents.
We chose the WASM architecture to ensure zero network overhead and <1ms latency. PII-Shield runs directly in-process, preserving the referential integrity of our hash chains while keeping logs compliant.
Performance Considerations
While PII-Shield is highly optimized, deep inspection of complex logs requires careful attention to configuration.
- Text Logs: Extremely fast (>100k lines/s).
- JSON Logs: Zero-allocation parsing (no encoding/json overhead). The scanner manually parses JSON structures to ensure high throughput (~7MB/s) without memory spikes.
- Recommendation: Safe to use on high-throughput workloads. Recursion safeguards prevent stack overflows on deeply nested JSON.
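The recursion safeguard mentioned above can be pictured as a depth limit on the nested walk. The Go sketch below works on generic nested values rather than on the project's hand-written JSON parser, and the names `walk`, `errTooDeep`, and `maxDepth` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

var errTooDeep = errors.New("nesting exceeds depth limit")

// walk visits every string leaf in v and refuses to descend past maxDepth,
// so hostile or malformed input cannot drive unbounded recursion.
func walk(v any, depth, maxDepth int, visit func(string)) error {
	if depth > maxDepth {
		return errTooDeep
	}
	switch t := v.(type) {
	case string:
		visit(t)
	case []any:
		for _, item := range t {
			if err := walk(item, depth+1, maxDepth, visit); err != nil {
				return err
			}
		}
	case map[string]any:
		for _, item := range t {
			if err := walk(item, depth+1, maxDepth, visit); err != nil {
				return err
			}
		}
	}
	return nil
}

func main() {
	doc := map[string]any{"user": map[string]any{"token": "abc"}}
	// A generous limit reaches the leaf; a limit of 1 stops before it.
	fmt.Println(walk(doc, 0, 8, func(s string) { fmt.Println("leaf:", s) }))
	fmt.Println(walk(doc, 0, 1, func(string) {}))
}
```

What a scanner does when the limit is hit (skip the subtree, redact it wholesale, or pass the line through) is a policy choice this sketch leaves open.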
Installation
Helm Chart (Kubernetes Operator)
The official and recommended way to deploy PII-Shield in Kubernetes is via our fully-automated Operator:
helm repo add pii-shield https://pii-shield.github.io/pii-shield/
helm repo update
helm install pii-shield-operator pii-shield/pii-shield-operator -n operator-system --create-namespace
This deploys the PII-Shield Operator which automatically injects highly-secure, distroless sidecars into your Pods without requiring any code or Dockerfile changes.
Docker
Get the latest lightweight image from Docker Hub or GHCR:
docker pull thelisdeep/pii-shield:2.1.0
# OR from GitHub Container Registry (Enterprise):
docker pull ghcr.io/pii-shield/pii-shield:2.1.0
Build from Source
You can build the binary directly from the source code:
go build -o pii-shield ./cmd/cleaner/main.go
Configuration
See CONFIGURATION.md for a full list of environment variables, including:
- PII_SALT: Custom HMAC salt (Required for production).
- PII_ADAPTIVE_THRESHOLD: Enable dynamic entropy baselines.
- PII_DISABLE_BIGRAM_CHECK: Optimize for non-English logs.
- PII_CUSTOM_REGEX_LIST: Custom regex rules for deterministic redaction.
- PII_SAFE_REGEX_LIST: Whitelist regex rules to ignore (matches are returned as-is).
Entropy Sensitivity Table (Default Threshold: 3.6)
| Entropy | Data Type | Example |
|---|---|---|
| 0.0 - 3.0 | Common words, repeats | password, admin, 111111 |
| 3.0 - 3.6 | CamelCase, partial hashes | ProgramCampaignInstanceJob, 8f3a11b2c |
| 3.6 - 4.5 | Paths, UUIDs, Weak Passwords | /opt/application/runtime, P@ssw0rd2026! |
| 4.5 - 5.0 | Medium Tokens | E8s9d_2kL1 |
| 5.0+ | High Entropy Keys | (SHA-256, API Keys) |
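The figures in the table are consistent with Shannon entropy measured in bits per character. The Go sketch below computes that plain measure and compares it with the default threshold. It is an approximation for intuition only: the real scanner also uses context keywords, bigram checks, and optional adaptive baselines, and the function name `shannonEntropy` is hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// shannonEntropy returns the Shannon entropy of s in bits per character.
func shannonEntropy(s string) float64 {
	if s == "" {
		return 0
	}
	counts := make(map[rune]int)
	total := 0
	for _, r := range s {
		counts[r]++
		total++
	}
	var h float64
	for _, c := range counts {
		p := float64(c) / float64(total)
		h -= p * math.Log2(p)
	}
	return h
}

func main() {
	const threshold = 3.6 // default threshold from the table above
	for _, s := range []string{"111111", "admin", "password"} {
		h := shannonEntropy(s)
		fmt.Printf("%-10s %.2f flagged=%v\n", s, h, h > threshold)
	}
}
```

For example, "password" has eight characters with one repeated letter, which works out to 2.75 bits per character and places it in the first row of the table.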
Quick Start
- Test Locally (CLI)
You can pipe any log output through PII-Shield to see it in action immediately:
# Emulate a log with a sensitive password
echo "Error: User password=MySecretPass123! failed login" | docker run -i --rm ghcr.io/pii-shield/pii-shield:2.1.0
# Output: Error: User password=[HIDDEN:8f3a11] failed login
- Kubernetes (Automated Sidecar Injection)
With the PII-Shield Operator installed, protecting an application is as simple as creating a PiiPolicy and labeling your Pods.
Create a Policy:
apiVersion: core.pii-shield.io/v1alpha1
kind: PiiPolicy
metadata:
  name: strict-policy
  namespace: default
spec:
  injectionMode: "file"
Label your Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
spec:
  template:
    metadata:
      labels:
        pii-shield.io/inject: "true"
      annotations:
        pii-shield.io/policy: "strict-policy"
    # ...
The Operator will automatically inject the pii-shield-agent using the Native Sidecar pattern (K8s 1.28+) and securely mask all logs!
Verification
This project is verified with a growing test suite intended to raise confidence before production hardening:
- Unit Tests: Cover edge cases, multilingual support, and JSON integrity with >85% coverage.
- Fuzzing: Native Go fuzzing ensures crash safety against invalid and random binary inputs.
- Smoke Testing: ./scripts/test-smoke.sh exercises mixed workloads and reports detection accuracy.
- End-to-End (E2E) Testing: The operator/tests/run_e2e.sh suite performs full-stack validation using Minikube and Helm. It builds local images, provisions the Operator without cert-manager, deploys target Jobs, and verifies actual log redaction by intercepting sidecar outputs.
Performance Benchmarks
To compare end-to-end CLI throughput between the current branch and a baseline ref:
./benchmark/run_benchmarks.sh
By default, the benchmark compares HEAD against origin/main, refreshes origin/main, generates a mixed log corpus, alternates old/new run order, and reports median, p95, min/max, and MiB/s:
BASE_REF=origin/main RUNS=9 LINES=500000 ./benchmark/run_benchmarks.sh
This measures the full stdin-to-stdout CLI path. For scanner-only microbenchmarks, run:
go test -bench=. -benchmem ./pkg/scanner
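For readers unfamiliar with Go microbenchmarks, the sketch below shows the general shape of a throughput and allocation benchmark, driven through `testing.Benchmark` so that it runs as a standalone program. The `scanLine` function is a hypothetical stand-in and is not the API of ./pkg/scanner; real benchmarks live in `_test.go` files and are run with the command above.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// scanLine is a hypothetical stand-in for a scanner entry point.
func scanLine(line string) string {
	return strings.ReplaceAll(line, "secret", "[HIDDEN]")
}

// benchScan measures scanLine on one representative log line.
func benchScan(b *testing.B) {
	line := `level=info msg="login failed" password=secret user=alice`
	b.SetBytes(int64(len(line))) // lets the result report throughput in MB/s
	b.ReportAllocs()             // records allocations, like the -benchmem flag
	for i := 0; i < b.N; i++ {
		_ = scanLine(line)
	}
}

func main() {
	res := testing.Benchmark(benchScan)
	fmt.Println(res.String(), res.MemString())
}
```

Watching allocations per operation alongside throughput is a common way to catch regressions on hot paths before they show up in end-to-end numbers.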
Operator Integration Tests
The operator keeps fast unit tests separate from Kubernetes API integration tests. Regular operator tests do not start a local API server:
cd operator
go test ./...
To run the envtest-based controller integration suite:
./scripts/test-operator-integration.sh
These tests start a local Kubernetes API server and etcd through envtest, so they require permission to bind to 127.0.0.1. In restricted sandboxes, run them in a local shell, Docker environment, or CI runner that allows localhost bind.
Support
PII-Shield is open-source infrastructure for privacy-preserving logs. If this project is useful to you or your organization, you can support its development through GitHub Sponsors.
Release Verification
Release checksum and image-digest verification guidance is documented in docs/release-verification.md. Signature and provenance-backed releases are tracked as part of the supply-chain hardening roadmap.
License
Distributed under the Apache 2.0 License. See LICENSE for more information.