It is 2:47 AM. Your SIEM fires a privilege escalation alert on a production Linux host. You pull the process tree and find the origin: a Docker container launched six months ago with --privileged mode, by a developer who needed elevated access for a deployment test that never got cleaned up. The threat actor who found your exposed Docker API three days ago waited for exactly this. The container escape maps to MITRE ATT&CK T1611 — Escape to Host — and by the time you correlate the event chain, they have had 72 hours of lateral movement inside your environment. Docker container security was not a gap anyone decided to accept. It was one nobody noticed until it was too late.
This checklist audit exists to prevent that scenario. I use a version of this framework on every container environment we assess at SSE, from mid-market e-commerce platforms to healthcare clients running containerized workloads on Windows Server 2022. Fifteen checkpoints. Four categories. Clear pass/fail criteria for each. Work through it systematically and you will know exactly where your exposure sits before the 2:47 AM call comes.
What Is Actually at Risk in a Container Breach
The architectural promise of containers — lightweight, fast, isolated — creates a false sense of security for teams that have not thought through the threat model. Standard Windows Server containers share the kernel with the container host. Every container on that host has a direct line to the same kernel attack surface. If malware running inside a container can exploit a kernel vulnerability, it does not stay in the container.
Hyper-V containers operate differently. Each container runs inside its own virtual machine, with no shared kernel between the container and the host, or between containers on the same host. If a threat actor compromises a Hyper-V container, the blast radius stays inside that VM boundary. This is not a performance-free choice — you trade some density and startup speed for genuine hardware-level isolation. For workloads that process untrusted input, handle regulated data, or execute code from third parties, that trade is worth making.
The attack surface extends beyond container escape. Misconfigured registries expose credentials via T1552 — Unsecured Credentials. Unrestricted network access between containers enables east-west movement under T1021. Disabled logging eliminates your forensic trail. Each checkpoint in this audit targets a specific tactic in that chain.
How to Run This Audit
Each checkpoint contains three elements: what to check, the command or configuration to verify it, and a pass/fail criterion. Mark each as PASS, FAIL, or RISK-ACCEPTED. Risk-accepted means the gap is documented and a compensating control exists — it is not a synonym for ignored. Every item in that column needs a ticket, an owner, and a review date. At the end of the audit, any category with more than two FAIL marks is a critical gap. A single FAIL on Checkpoints 5, 10, 12, or 14 is a critical finding regardless of everything else.
Category 1: Image and Registry Hygiene (Checkpoints 1–4)
Image provenance is the start of your supply chain. A compromised base image or an unauthenticated registry is a pre-deployment infection vector. MITRE ATT&CK T1195 — Supply Chain Compromise — starts here, not at runtime.
Checkpoint 1: Image Provenance and Signing
What to check: Are all base images pulled from verified sources? Is Docker Content Trust enforced?
# Verify Docker Content Trust is enabled (a client-side env var, so check every build/pull context)
echo $DOCKER_CONTENT_TRUST
# Should return: 1
# List images with digest verification
docker images --digests
# Inspect image history for unexpected layers
docker history --no-trunc image_name:tag
Pass: DOCKER_CONTENT_TRUST=1 exported in every environment that pulls or builds images: CI runners, build hosts, admin shells. All base images come from verified publishers with signed manifests. No anonymous or unverified pulls in your pipeline.
Fail: Content Trust disabled or not configured. Base images pulled from Docker Hub without digest pinning. This surfaces frequently during post-incident reviews — the attacker did not compromise the application, they compromised an upstream image tag three pulls ago. If you are running automated builds through a pipeline, the GitHub CI/CD automation guide covers supply chain integrity principles that apply directly to container image pipelines.
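One way to make digest pinning enforceable is a pre-build gate. A minimal sketch, assuming standard Dockerfile `FROM` syntax; `check_pinned` and the demo file are illustrative, not part of any standard tooling:

```shell
#!/bin/sh
# Sketch: flag Dockerfile FROM lines that reference a mutable tag instead of
# a pinned digest. A pinned reference contains "@sha256:"; anything else can
# silently change between pulls.
check_pinned() {
  grep -hE '^[[:space:]]*FROM[[:space:]]' "$@" | grep -vE '@sha256:'
}

# Demo against a throwaway Dockerfile: one pinned line, one mutable tag.
demo=$(mktemp)
printf 'FROM alpine@sha256:abc123\nFROM python:3.12\n' > "$demo"
check_pinned "$demo"   # prints only the unpinned "FROM python:3.12"
rm -f "$demo"
```

In CI you would invert the result: any output from `check_pinned` fails the build.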
Checkpoint 2: Registry Authentication and TLS
What to check: Does your daemon.json allow insecure registries? Are private registries enforcing TLS?
# Review daemon.json for insecure registry entries
cat /etc/docker/daemon.json
# Example of a FAILING configuration:
{
"insecure-registries": ["192.168.1.50:5000"],
"registry-mirrors": []
}
Pass: The insecure-registries array is empty or absent in production. All private registry communication occurs over TLS with a valid certificate chain.
Fail: Any entry in insecure-registries in a production environment. Registries communicating over plain HTTP expose you to credential interception during image pulls — T1552 via network interception.
Checkpoint 3: Image Vulnerability Scanning
What to check: Are images scanned for known CVEs before deployment? Is scanning integrated into your pipeline or a manual afterthought?
# Scan with Trivy v0.50+ before pushing to registry
trivy image --exit-code 1 --severity HIGH,CRITICAL your-image:tag
# Windows-based container images: Trivy's OS-package coverage for Windows
# base layers is limited, so scan the application layers you control and
# pin Windows base images by digest as a compensating control
trivy image --exit-code 1 --severity HIGH,CRITICAL your-windows-image:tag
Pass: All images pass a Trivy or equivalent scan with no HIGH or CRITICAL findings before deployment. Scanning runs automatically in CI, not as a manual step. Failed scans block the pipeline — not notify, block.
Fail: No automated scanning. Scanning exists but does not block deployments on critical findings.
Checkpoint 4: Secrets in Image Layers
What to check: Are credentials, API keys, or certificates embedded in image layers? Docker history makes these permanently visible to anyone with image access — they do not expire when you rotate the credential.
# Check image history for sensitive ENV instructions
docker history --no-trunc image_name:tag | grep -iE "password|secret|key|token|credential"
# Scan for embedded secrets with Trufflehog v3
trufflehog docker --image image_name:tag
Pass: No credentials in image layers. Secrets injected at runtime via Docker secrets, environment variables from a vault integration, or mounted volumes from a secrets manager.
Fail: Any credential found in image history or Trufflehog output. This is an automatic critical finding — image layers are permanent records.
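The history grep above also works as a reusable filter. A sketch; the pattern list is a starting point, not exhaustive, and the demo lines are fabricated examples of the failure mode:

```shell
#!/bin/sh
# Sketch: filter `docker history --no-trunc` output for credential-shaped
# strings. Extend the pattern list for your environment's secret formats.
scan_history() {
  grep -iE 'password|secret|api[_-]?key|token|credential'
}

# Demo on canned layer instructions; the second line is the kind of
# finding that makes this checkpoint an automatic critical.
printf 'ENV APP_PORT=8080\nENV DB_PASSWORD=changeme\n' | scan_history
# prints: ENV DB_PASSWORD=changeme
```

Wire it in per image as `docker history --no-trunc image:tag | scan_history`; any output is a finding.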
Category 2: Runtime Configuration (Checkpoints 5–9)
Runtime misconfigurations drive the majority of container escape incidents I see during IR engagements. Most of them started as intentional decisions made under time pressure that never got reviewed.
Checkpoint 5: Privileged Mode and Capability Escalation
What to check: Are any containers running with --privileged? Are unnecessary Linux capabilities granted?
# List all running containers with privileged flag
docker inspect $(docker ps -q) --format '{{.Name}}: Privileged={{.HostConfig.Privileged}}'
# Inspect capability grants
docker inspect container_name --format '{{.HostConfig.CapAdd}}'
Pass: No production containers running --privileged. Capability sets are explicitly defined and minimal. Use --cap-drop ALL with selective --cap-add for only what the application requires.
Fail: Any container with Privileged: true in production. This is the single most common precondition for a T1611 container escape. There are no compensating controls that fully mitigate the risk of this flag in a production environment.
We had a client in the logistics sector where a developer added --privileged to a containerized ETL job to resolve a file permission error during an urgent go-live window. Nobody reviewed the compose file before the deployment reached production. The flag sat there for four months before we found it during a scheduled security assessment. That client had direct database access from the container host — a successful escape would have been a full data breach. The remediation took 45 minutes. The risk exposure had been running for four months.
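The inspect command above emits one line per container; a small filter turns that into a remediation work list. A sketch, where `report_privileged` is an illustrative helper and the demo input mimics the command's output format:

```shell
#!/bin/sh
# Sketch: reduce the privileged-flag audit output to just the offenders.
# Feed it live data with:
#   docker inspect $(docker ps -q) \
#     --format '{{.Name}}: Privileged={{.HostConfig.Privileged}}' | report_privileged
report_privileged() {
  grep 'Privileged=true' | sed 's/: Privileged=true$//'
}

# Demo on canned output from a mixed fleet:
printf '/web: Privileged=false\n/etl-job: Privileged=true\n' | report_privileged
# prints: /etl-job
```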
Checkpoint 6: Read-Only Root Filesystem
What to check: Are containers running with a read-only root filesystem? A writable root filesystem gives an attacker a persistence foothold inside the container.
# Launch container with read-only root filesystem
docker run --read-only --tmpfs /tmp your-image:tag
# Verify on a running container
docker inspect container_name --format '{{.HostConfig.ReadonlyRootfs}}'
Pass: All containers that do not require persistent writes use --read-only. Temporary write paths use --tmpfs with explicit size limits. Web servers and API containers almost universally qualify for this setting.
Fail: Writable root filesystem on containers that have no documented need for persistent writes.
Checkpoint 7: Resource Limits and File Descriptor Caps
What to check: Are CPU, memory, and file descriptor limits configured? An unconstrained container is a denial-of-service vector from within your own environment — T1499 — Endpoint Denial of Service.
# Check resource limits on running container
docker inspect container_name --format '{{.HostConfig.Memory}} {{.HostConfig.NanoCpus}}'
# Review file descriptor limits (inherited from the daemon unless set explicitly)
docker inspect container_name --format '{{.HostConfig.Ulimits}}'
# Set file descriptor limit explicitly at launch
docker run --ulimit nofile=1024:1024 your-image:tag
Pass: Memory and CPU limits set on all containers. File descriptor limits explicitly configured rather than inherited from the daemon; the nofile=1024 value shown is a reasonable floor for most workloads, but verify it against application requirements. Resource limits documented and reviewed quarterly.
Fail: Containers running without memory or CPU constraints. No ulimit configuration. This is not a theoretical risk — a single runaway container can exhaust host resources and cascade across every co-resident workload.
Checkpoint 8: Network Isolation and Subnet Control
What to check: Are containers on isolated networks? Is the default bridge network being used for production workloads with no subnet controls?
# List all Docker networks
docker network ls
# Inspect bridge network configuration
docker network inspect bridge
# Harden daemon.json with explicit CIDR ranges
{
"bip": "172.20.0.1/16",
"fixed-cidr": "172.20.5.0/24"
}
Pass: Containers that do not require inter-container communication are on separate networks. The default bridge network is not used for production workloads. Fixed CIDR ranges are configured in daemon.json to prevent IP space conflicts with your corporate network. For orchestrated environments, the Kubernetes network policy documentation covers policy-based east-west traffic control at scale.
Fail: All containers on the default bridge network with unrestricted inter-container communication. No CIDR configuration. Containers with network access to host management interfaces.
Checkpoint 9: Logging Configuration and Audit Trail
What to check: Is Docker configured to produce full-timestamp logs? Are container events forwarded to a SIEM or centralized log platform?
# Review logging configuration in daemon.json
cat /etc/docker/daemon.json
# Passing configuration with raw-logs and syslog forwarding:
{
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3"
},
"raw-logs": true
}
# Forward container events to SIEM via syslog
{
"log-driver": "syslog",
"log-opts": {
"syslog-address": "tcp://siem-collector:514"
}
}
Pass: raw-logs: true configured so daemon logs carry full timestamps and no ANSI color codes. Container events forwarded to your SIEM or log aggregation platform with a sub-5-minute ingestion window.
Fail: Default logging with no SIEM forwarding. raw-logs not configured. Container events invisible to your SOC. This is the detection gap that turns a containable incident into a prolonged breach — you cannot measure mean time to detect if you have no telemetry.
Category 3: Daemon and Host Hardening (Checkpoints 10–13)
The Docker daemon is the most privileged component in your container stack. These checkpoints align directly to the CIS Docker Benchmark — the reference standard I use on every container assessment at SSE. Deviation from the benchmark requires documented justification, not assumption.
Checkpoint 10: daemon.json Security Configuration
What to check: Is your daemon.json hardened against known insecure defaults? Are data storage paths explicitly controlled?
# Full hardened daemon.json review
cat /etc/docker/daemon.json
# Key fields and their security function (annotated for readability;
# strip the // comments before use, since JSON does not allow them):
{
"icc": false, // Disable inter-container communication by default
"no-new-privileges": true, // Block privilege escalation via setuid binaries
"userland-proxy": false, // Reduce attack surface on port mapping
"data-root": "/secure/docker", // Explicit storage path, not the platform default (/var/lib/docker on Linux, C:\ProgramData\docker on Windows)
"raw-logs": true, // Full timestamps for forensic accuracy
"live-restore": true // Maintain containers during daemon restart
}
Pass: icc: false disabling inter-container communication by default. no-new-privileges: true blocking setuid escalation. data-root explicitly set to a monitored, access-controlled path. All fields reviewed against the CIS Docker Benchmark.
Fail: Default daemon.json with no hardening entries. Inter-container communication unrestricted. Data root at the default system path without access controls.
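A quick conformance sweep over the fields above can be scripted. A sketch; it greps for exact key/value strings, so it assumes the `"key": value` spacing shown in the example, and a real JSON parser such as `jq` is the robust route:

```shell
#!/bin/sh
# Sketch: spot-check daemon.json for the hardening fields above. Line-based
# string matching, not JSON parsing -- use jq where available.
harden_check() {
  conf="$1"; rc=0
  for want in '"icc": false' '"no-new-privileges": true' '"raw-logs": true'; do
    grep -qF "$want" "$conf" 2>/dev/null || { echo "MISSING: $want"; rc=1; }
  done
  [ "$rc" -eq 0 ] && echo "hardening fields present"
  return $rc
}

harden_check /etc/docker/daemon.json || true
```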
Checkpoint 11: Kernel Isolation — Choosing the Right Container Type
What to check: For workloads requiring higher security assurance, are Hyper-V containers used instead of standard process-isolated containers?
Here is my position, stated plainly: any container processing external input, running third-party code, or handling regulated data should use Hyper-V isolation, not process isolation. The performance cost is real and measurable. The security benefit is the difference between a contained compromise and a host-level breach. Standard Windows Server containers share the kernel with the host — a kernel exploit in any container threatens all containers and the host simultaneously. Hyper-V containers eliminate that shared attack surface entirely.
# Launch a container with Hyper-V isolation on Windows Server
docker run --isolation=hyperv -d your-image:tag
# Verify isolation mode on a running container
docker inspect container_name --format '{{.HostConfig.Isolation}}'
# Hyper-V: returns "hyperv"
# Process isolation (default): returns "process"
Pass: Workloads classified as high-risk run with --isolation=hyperv. Isolation classification is documented per workload. For clients evaluating broader infrastructure security posture, our cloud migration and IT consulting assessments include container architecture review as a standard deliverable.
Fail: All containers using default process isolation with no documented risk classification. No rationale recorded for isolation choice on high-risk workloads.
Checkpoint 12: TLS on the Docker Daemon Socket
What to check: Is the Docker daemon socket exposed over TCP? If so, is mutual TLS enforced?
# Check if Docker daemon is listening on unencrypted TCP
ss -tlnp | grep dockerd
# FAIL if you see: 0.0.0.0:2375
# Acceptable: 127.0.0.1:2376 with TLS configured
# Passing TLS configuration in daemon.json:
{
"tls": true,
"tlscacert": "/etc/docker/certs/ca.pem",
"tlscert": "/etc/docker/certs/server-cert.pem",
"tlskey": "/etc/docker/certs/server-key.pem",
"tlsverify": true
}
Pass: Docker daemon not listening on TCP 2375. If TCP access is required, port 2376 with full mutual TLS and tlsverify: true enforcing client certificate validation. Unix socket access restricted by filesystem permissions to the docker group only.
Fail: Docker daemon exposed on TCP 2375 without TLS. This is an unauthenticated remote code execution vector — any actor with network access to that port has full control of your container environment. This is a T1190 finding with critical severity, no exceptions.
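The `ss` check above can be wrapped for scheduled sweeps. A sketch; `check_2375` reads listener output on stdin, so it also works against captured output shipped back from remote hosts:

```shell
#!/bin/sh
# Sketch: flag a plaintext Docker API listener in `ss -tln` output.
# Run live with: ss -tln | check_2375
check_2375() {
  if grep -qE '[:.]2375[[:space:]]'; then
    echo "FAIL: listener on 2375 -- unauthenticated Docker API exposure"
    return 1
  fi
  echo "PASS: no plaintext Docker API listener"
}

# Demo on canned output with no 2375 listener:
printf 'LISTEN 0 128 127.0.0.1:631 0.0.0.0:*\n' | check_2375
# prints: PASS: no plaintext Docker API listener
```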
Checkpoint 13: EDR and Antivirus Integration on Container Hosts
What to check: Is endpoint detection covering container host activity? Is antivirus configured to scan container workloads without disrupting runtime operations?
Antivirus and container runtimes have a complicated relationship. Poorly configured AV scanning Docker’s image layer directory causes measurable performance degradation and can interfere with layer operations entirely. The answer is not to exclude the entire Docker data path from scanning — the answer is scoped exclusions for layer storage combined with full coverage for process execution, network events, and container host activity. Skipping EDR coverage entirely on container hosts is not a security posture, it is a blind spot.
Pass: EDR agent deployed on the container host. Explicit AV exclusions for Docker’s data-root directory to prevent layer-file scanning interference, with full coverage retained for process execution and network telemetry. Container host is a monitored asset in your detection platform.
Fail: No EDR on container hosts. Blanket AV exclusion of all Docker-related paths with no compensating process or network monitoring.
Category 4: Detection and Visibility (Checkpoints 14–15)
The most hardened container environment still needs detection coverage. These two checkpoints determine whether your SOC can see what is happening and respond within a window that matters. Mean time to detect on container escapes is consistently higher than on traditional host events because most SIEM configurations were built for endpoint and network telemetry, not container event shapes.
Checkpoint 14: SIEM Correlation Rules for Container Events
What to check: Do you have active detection rules for container escape indicators, privileged container creation, and Docker API anomalies?
// KQL — Detect privileged container creation
// Maps to MITRE ATT&CK T1610 Deploy Container, T1611 Escape to Host
// Microsoft Sentinel / Microsoft Defender for Cloud
ContainerLog
| where TimeGenerated > ago(1h)
| where LogEntry contains "--privileged" or LogEntry contains "Privileged: true"
| where ContainerName !in (known_privileged_allowlist)
| project TimeGenerated, ContainerName, Computer, LogEntry
// SPL — Detect unauthenticated Docker daemon access
// Maps to MITRE ATT&CK T1190 Exploit Public-Facing Application
// Splunk
index=network dest_port=2375 NOT src_ip IN (authorized_mgmt_hosts)
| bin _time span=5m
| stats count by src_ip, dest_ip, _time
| where count > 1
| eval alert="DOCKER_DAEMON_UNAUTHENTICATED_ACCESS"
Pass: Active detection rules for privileged container creation, containers accessing host filesystem paths (/etc, /proc, /sys), new containers created outside business hours, and Docker daemon connection attempts on port 2375. Rules are tuned to your environment. Alert fatigue is a real operational problem — an untriaged alert is functionally worse than no alert, because it trains your analysts to ignore the queue.
Fail: No container-specific SIEM rules. Container events not ingested into your detection platform. No defined mean time to detect target for container escape events.
Checkpoint 15: Container Escape Incident Response Playbook
What to check: Does your IR team have a documented playbook for container escape events? Can they execute it at 2:47 AM without improvising from first principles?
The playbook does not need to be long. It needs to be correct, current, and rehearsed. At minimum, it covers: isolate the affected container and host from the network immediately, preserve forensic state before terminating the container, identify the escape vector through process and network logs, and assess lateral movement originating from the container host. Isolation is step one — not investigation, not notification, not root cause analysis. Isolate first. The same discipline covered in the Windows Server hardening guide for traditional host response applies directly to the container host layer.
Pass: Written playbook exists, reviewed within the last 12 months, and accessible without network connectivity to your documentation platform. Playbook tested in a tabletop exercise. Escalation matrix contains named individuals with backup contacts, not just role labels.
Fail: No documented playbook. Playbook exists but has not been reviewed since initial creation. Escalation contacts are role-based with no named individuals — roles do not answer phones at 2:47 AM.
Scoring Your Audit: Critical, High, and Medium Findings
After completing all 15 checkpoints, categorize findings by remediation priority. The goal is not a perfect score on the first pass. The goal is no critical findings and a measurable decline in high findings across successive quarterly audits.
| Priority | Checkpoints | Remediation Window |
|---|---|---|
| Critical | 5, 10, 12, 14 | 72 hours |
| High | 1, 2, 4, 11 | 2 weeks |
| Medium | 3, 6, 7, 8, 9, 13, 15 | Next sprint |
| Low / Informational | Documentation and process gaps | Quarterly review |
One caveat worth stating plainly: this audit addresses configuration and detection coverage. It does not address recovery. An organization running business-critical workloads on container hosts without a tested recovery process carries an exposure this checklist alone cannot address. Verify that your container hosts are included in your backup scope — bare metal restore capabilities for container hosts are a parallel workstream, not a competing priority. The NIST Cybersecurity Framework maps this clearly: Identify → Protect → Detect → Respond → Recover. All five functions matter.
What the Data Tells Us
Across container assessments at SSE, the most common finding is not exotic. It is Checkpoint 5 — privileged containers — combined with Checkpoint 9 — no SIEM ingestion of container events. Those two failures together create a scenario where an escape can occur and your detection capability is zero. You are operating blind at the worst possible moment.
The second most common finding is Checkpoint 2 — insecure registries in daemon.json. This surfaces most often in environments that started as proof-of-concept deployments, moved to production without a formal security review, and inherited the original permissive configuration. The fix — removing the insecure-registries entry and enforcing TLS — takes under an hour. The risk it removes is a direct credential interception path on every subsequent image pull.
Container security is not a one-time activity. Baselines drift. New images get pulled. Developers under deadline pressure add flags that nobody reviews. The same cadence that applies to traditional server hardening — baseline, drift detection, scheduled review — applies here with equal urgency. The ransomware post-mortem we published last year demonstrates what happens when security posture is treated as a deployment-time decision rather than an ongoing operational discipline. Container environments are not exempt from that lesson.
Your Next 48 Hours
Do not file this. Run Checkpoint 5 right now — the privileged mode audit takes under five minutes and the command is in this article. If any container returns Privileged: true and you cannot immediately cite a documented business requirement and an approved exception, that is your first work order. Follow it with Checkpoint 9 to verify your logging pipeline reaches your SIEM. Two checkpoints, under 20 minutes total, and you will have a concrete picture of your two highest-risk gaps.
If you want a full assessment with remediation guidance, detection engineering support, or help building a container security program from the ground up, the SSE team runs these engagements regularly across industries. Reach out through our client portal and we will schedule a scoping call.


